# A Fully-Integrated Digital-Intensive Polar Doherty Transmitter

Yiyu Shen<sup>#</sup>, Mohammadreza Mehrpoo<sup>#</sup>, Mohsen Hashemi<sup>#</sup>, Michael Polushkin<sup>#</sup>, Lei Zhou<sup>#\*</sup>, Mustafa Acar<sup>\*</sup>, Rene van Leuken<sup>#</sup>, Morteza S. Alavi<sup>#</sup>, Leo de Vreede<sup>#</sup>

<sup>#</sup>ELCA group, Delft University of Technology, Delft, The Netherlands

<sup>\*</sup>Ampleon, Nijmegen, The Netherlands

Abstract — This paper presents an advanced 2.3–2.8 GHz fully-integrated digital-intensive polar Doherty transmitter realized in 40 nm standard CMOS. The proposed architecture comprises a CORDIC, digital delay aligners, interpolators, and digital pre-distortion (DPD) circuitry, in combination with frequency-agile wideband phase modulators followed by the digital main and peak power amplifiers (PAs), which operate in quasi-load-insensitive class-E and are combined by an on-chip power combiner. At 2.5 GHz, the maximum output power is +21.4 dBm. The drain efficiency is 49.4% at peak power and 33.7% at 6-dB power back-off. Applying DPD for a 20-MHz 64-QAM signal, the measured EVM is better than $-30 \, \text{dB}$ while the average drain efficiency is 24%.

*Index Terms* — Doherty transmitter, digital-intensive polar, efficiency enhancement, DPD, phase modulator, on-chip combiner.

## I. INTRODUCTION

Recently, digital-intensive transmitters (DTX) [1], [2], [3] have been gaining attention due to their excellent hardware scalability with nanoscale CMOS and their great potential to incorporate extensive digital correction circuitry such as digital pre-distortion (DPD). These properties are essential for achieving high system integration, linearity, and efficiency at low cost. However, the large peak-to-average power ratio (PAPR) of modern complex-modulated baseband signals typically leads to severe degradation of the average efficiency. Therefore, various efficiency enhancement techniques, such as Doherty [4], [5], outphasing [6], and supply modulation [7], are now also being adopted in DTX approaches. Among them, the Doherty topology is popular due to its large video and RF bandwidth along with its relatively low hardware complexity.

However, the design of a fully-integrated digital Doherty PA poses several challenges. First, PA topologies that feature linear operation typically do not benefit from switching operation [6]. Second, on-chip matching networks, especially in the main path of the Doherty PA, are lossy, leading to a significant efficiency drop in the power back-off (PBO) region [4].

To address these issues while achieving high overall system efficiency, spectral purity, and video bandwidth, a digital-intensive polar Doherty transmitter is proposed and realized in 40 nm bulk CMOS. To the authors' knowledge, this architecture is the first bits-in RF-out single-chip DTX employing the Doherty topology.

Fig. 1. Block diagram of the proposed digital-intensive TX.

## II. SYSTEM ARCHITECTURE OF PROPOSED TX

The block diagram of the proposed DTX is depicted in Fig. 1. The DTX consists of a digital baseband processing unit, a clock divider, an interpolation filter, wideband phase modulators, and digital PA branches.

The $4 \times f_0$ single-ended off-chip clock, where $f_0$ is the carrier frequency, is applied to an on-chip balun that converts it to a differential signal. This differential clock is then applied to a divide-by-4 circuit to generate the desired multi-phase clock signals at $f_0$. These clock signals are fed to the main and peak phase modulators of the Doherty branches, with the clock signals of the peak branch lagging those of the main branch by 90 degrees, emulating the input $\lambda/4$ transmission line of a conventional Doherty topology.
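The quarter-period lag between the branch clocks can be illustrated with a small behavioral sketch. The paper does not state the divider topology, so the two-flip-flop Johnson counter below is an assumption chosen only because it is a common way to obtain 90-degree-spaced outputs from a $4 \times f_0$ clock; the function names `johnson_div4` and `phase_lag_deg` are hypothetical.

```python
def johnson_div4(n_cycles):
    """Behavioral model of a 2-flip-flop Johnson counter (assumed topology).

    The counter is clocked at 4*f0. Returns (q0, q1), one sample per input
    clock cycle. Both outputs toggle at f0; q1 trails q0 by one input cycle.
    """
    q0, q1 = 0, 0
    out0, out1 = [], []
    for _ in range(n_cycles):
        out0.append(q0)
        out1.append(q1)
        # Ring of two flip-flops with inverted feedback from the last stage.
        q0, q1 = 1 - q1, q0
    return out0, out1


def phase_lag_deg(delay_cycles, divide_ratio=4):
    """Phase lag at f0 caused by a delay of `delay_cycles` input-clock cycles."""
    return 360.0 * delay_cycles / divide_ratio


q0, q1 = johnson_div4(12)
print("main clock:", q0)
print("peak clock:", q1)
print("lag of peak vs. main (deg):", phase_lag_deg(1))
```

One input-clock cycle is a quarter of the output period, which is why taking the peak-branch clock from the second stage yields the 90-degree lag that replaces the input $\lambda/4$ line.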

Fig. 2. (a) Topology of the proposed digital Doherty PA, (b) layout of power combining network.

Fig. 3. Block diagram of the digital baseband.

Employing an on-chip CORDIC, the digital in-phase (I) and quadrature (Q) baseband signals are converted into their envelope (AM) and phase (PM) polar representation. The baseband AM signal is first interpolated and then split into $AM_{main}$ and $AM_{peak}$, which are fed to the main and peak digital PAs (DPA). Note that, preceding the PA stages, the envelope signals are first converted from binary into thermometer code and are then applied to the (digital) up-converting mixers. Meanwhile, the baseband phase signal is first applied to a normalizer unit, which decomposes the phase information into a constant-envelope I/Q representation. These I/Q signals are then interpolated to suppress spectral sampling replicas. The resultant I/Q signals drive two IQ RF-DAC based wideband phase modulators that generate the desired up-converting clock signals for the main and peak PA branches. The two branch signals are finally combined using an on-chip Doherty matching network.
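The Cartesian-to-polar conversion and the normalizer step can be sketched in floating point as follows. This is a generic vectoring-mode CORDIC, not the authors' fixed-point implementation; the iteration count, the function names `cordic_polar` and `normalize`, and the use of `math.cos`/`math.sin` in place of the on-chip normalizer are assumptions made for illustration.

```python
import math


def cordic_polar(i, q, iterations=16):
    """Vectoring-mode CORDIC: returns (magnitude, phase in rad) of i + j*q.

    The vector is rotated step by step onto the positive x-axis using
    shift-and-add style micro-rotations; the applied angles are accumulated.
    """
    x, y = float(i), float(q)
    phase = 0.0
    # Pre-rotate by 180 degrees if the vector lies in the left half-plane,
    # since the micro-rotations only cover roughly +/-99.9 degrees.
    if x < 0:
        x, y = -x, -y
        phase = math.pi if q >= 0 else -math.pi
    gain = 1.0
    for k in range(iterations):
        d = -1.0 if y > 0 else 1.0  # rotate towards y = 0
        x, y = x - d * y * 2.0 ** -k, y + d * x * 2.0 ** -k
        phase -= d * math.atan(2.0 ** -k)
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * k))  # CORDIC scale factor
    return x / gain, phase


def normalize(phase):
    """Constant-envelope I/Q pair that carries only the phase information."""
    return math.cos(phase), math.sin(phase)


am, pm = cordic_polar(3, 4)
i_n, q_n = normalize(pm)
print("AM:", am, "PM (rad):", pm)
print("normalized I/Q:", i_n, q_n)
```

The AM output would feed the amplitude path and the normalized I/Q pair the phase modulators; how the AM word is split between the main and peak branches is not modeled here.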

## III. DESIGN OF MAIN BUILDING BLOCKS

### A. Digital Doherty PA and Power Combining Network

Fig. 2(a) depicts the topology of the proposed digital Doherty PA. It consists of the main and peak DPA branches, each operating in a push-pull configuration. Each DPA is designed as an array of unit-cell NMOS switches to achieve high peak drain efficiency, at the cost of inferior transistor linearity [1], [6]. Note that each DPA branch operates in quasi-load-insensitive class-E, which offers considerable advantages for efficiency enhancement techniques that employ load modulation [6]. Moreover, the switch array configuration of the DPA bank modulates its on-resistance, $R_{ON}$, which, in turn, adjusts the RF output power. Furthermore, to satisfy the stringent out-of-band noise requirements of advanced wireless communications, each branch has an 11-bit input that is split into two segments: a 9-bit MSB thermometer configuration along with a 2-bit LSB binary-weighted structure. Consequently, each branch consists of a push-pull DPA incorporating 511 MSB unary cells together with another 3 LSB unary cells, leading to a 12-bit digital-intensive RF transmitter. Note that the switch $R_{ON}$ in the MSB segment is a quarter of that in the LSB segment. As illustrated in Fig. 2(a), every unary cell contains an address decoder in conjunction with a phase aligner. In each sub-DPA cell, a single NMOS switch is employed to maximize the ratio between $R_{OFF}$ and $R_{ON}$. In a conventional class-E PA, the peak drain voltage can reach approximately 3 times the drain supply voltage ($V_{DD}$). However, for practical (lossy) implementations, this ratio is typically lower. Thus, to avoid transistor breakdown, $V_{DD}$ is set to 0.7 V, while the preceding sub-DPA drivers operate from a 1.1 V supply.

The schematic and the layout of the power matching network are depicted in Fig. 2(a) and (b), respectively. This network utilizes a parallel power-combining configuration. For each branch, an identical 1:3 transformer is implemented to transform the load to the relatively low impedance seen by the DPA bank. Based on rigorous EM simulations, the insertion loss and coupling factor of the transformer are 1 dB and 0.84, respectively. A single-ended C-L-C network is adopted as the impedance inverter. Note that the parasitic capacitances of both transformers are incorporated into the impedance inverter structure.
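The segmentation of the 11-bit branch code can be sketched as below. The mapping of the upper 9 bits to the 511 MSB cells and the lower 2 bits to the 3 LSB cells follows the cell counts given above; the helper names and the idea of expressing the result as a relative conductance (MSB cell weight 4, because its $R_{ON}$ is a quarter of the LSB cell's) are illustrative assumptions, not a description of the on-chip decoder.

```python
N_MSB_CELLS = 511  # 9-bit thermometer segment
N_LSB_CELLS = 3    # 2-bit segment realized with 3 unary cells


def segment_code(code):
    """Split an 11-bit amplitude code into (MSB cells on, LSB cells on)."""
    if not 0 <= code < 2 ** 11:
        raise ValueError("code must fit in 11 bits")
    msb_on = code >> 2    # upper 9 bits: 0..511
    lsb_on = code & 0b11  # lower 2 bits: 0..3
    return msb_on, lsb_on


def thermometer(n_on, n_cells):
    """Thermometer word: the first n_on cells are enabled, the rest are off."""
    return [1] * n_on + [0] * (n_cells - n_on)


def relative_conductance(code):
    """Total switch conductance in units of one LSB cell.

    An MSB cell has a quarter of the LSB on-resistance, i.e. four times
    the conductance, so it carries a weight of 4.
    """
    msb_on, lsb_on = segment_code(code)
    return 4 * msb_on + lsb_on


msb_on, lsb_on = segment_code(1029)
print("MSB cells on:", msb_on, "LSB cells on:", lsb_on)
print("enabled MSB cells:", sum(thermometer(msb_on, N_MSB_CELLS)))
print("relative conductance:", relative_conductance(1029))
```

With these weights the enabled conductance grows by one LSB unit per code step, which is the property the segmentation is meant to preserve.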

### B. Digital Baseband

The block diagram of the digital baseband is depicted in Fig. 3. It comprises a CORDIC, a fractional delay line, DPD look-up tables (LUTs), and normalizers, all operating at a sampling frequency of $f_0/4$. As stated, the CORDIC derives the AM and PM signals from a $2 \times 10$-bit input I/Q signal. The delay mismatch between AM and PM introduced by the main and peak branches is compensated by the fractional delay-line circuitry, whose accuracy is better than 250 ps. Employing the energy-efficient load-insensitive class-E topology for the switch array degrades linearity, i.e., it causes AM-AM and AM-PM distortion, because the relationship between the amplitude code word and the DPA output voltage is inherently nonlinear and the output phase varies with the code word. Thus, two on-chip AM-AM and AM-PM LUT SRAMs are integrated. The AM signal is first pre-distorted by the AM-AM LUT and then fed to the address decoder. Likewise, the AM-PM LUT pre-distorts the PM signal, which is then mapped to the Cartesian domain by a normalizer that is implemented as an on-chip SRAM. This approach enforces a constant-amplitude IQ phase vector (shown in Fig. 3) to drive the phase modulator. Importantly, the normalizer can also perform the IQ-image and LO-leakage calibration of the phase modulator.
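A minimal sketch of LUT-based polar pre-distortion is given below. Everything numeric in it is hypothetical: the LUT depth `N`, the compressive `pa_model`, and the choice to address the AM-PM table with the AM code are assumptions made so that the example is self-contained; the measured characteristics in Fig. 6 would take the place of `pa_model` in practice.

```python
import math

N = 2 ** 10  # hypothetical LUT depth, not taken from the paper


def pa_model(code):
    """Hypothetical PA: returns (normalized output amplitude, phase shift in rad)."""
    x = code / (N - 1)
    return math.sin(0.5 * math.pi * x), 0.3 * x


def build_luts():
    """Invert the monotonic AM-AM curve and tabulate the phase correction."""
    forward = [pa_model(c) for c in range(N)]
    am_lut, pm_lut = [], []
    code = 0
    for target in range(N):
        want = target / (N - 1)
        # The AM-AM curve is monotonic, so the search pointer only moves up.
        while code < N - 1 and forward[code][0] < want:
            code += 1
        am_lut.append(code)
        pm_lut.append(-forward[code][1])  # cancels the phase shift at that drive
    return am_lut, pm_lut


def predistort(am_code, pm, am_lut, pm_lut):
    """Return the corrected AM code and the normalized (I, Q) phase vector."""
    code = am_lut[am_code]
    phase = pm + pm_lut[am_code]
    return code, (math.cos(phase), math.sin(phase))


am_lut, pm_lut = build_luts()
code, (i_n, q_n) = predistort(512, 0.7, am_lut, pm_lut)
print("pre-distorted AM code:", code)
print("phase vector:", i_n, q_n)
```

Driving the hypothetical PA with the corrected code and phase vector returns an amplitude close to `512 / (N - 1)` and the original phase of 0.7 rad, which is the behavior the two LUTs are meant to enforce.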

### C. Wideband Phase Modulator

A wideband phase modulator is implemented in each branch, each consisting of a multi-phase mixer, a low-pass filter (LPF), and a limiter. The phase modulators can operate up to 3 GHz. Compared to conventional phase modulators [1], this wideband phase modulator is frequency-agile and compact, and it has superior phase linearity.

## IV. MEASUREMENT RESULTS

The proposed all-digital TX is fabricated in 40 nm CMOS technology. Fig. 4 shows the chip micrograph. The total area is $8.2 \text{ mm}^2$, with the core occupying $5 \text{ mm}^2$, of which the digital Doherty PA and matching network occupy roughly $2.5 \text{ mm}^2$. For the measurements, the I/Q data are initially loaded through SPI into two on-chip SRAMs. Fig. 5(a) presents the CW measurement results of the digital Doherty PA. The peak continuous RF output power and drain efficiency are 21.4 dBm and 52% to 46%, respectively, within the range of 2.3 to 2.8 GHz. Fig. 5(b) illustrates the drain efficiency versus output power at 2.5 GHz. Note that all following measurements are taken with $f_0 = 2.5$ GHz. Compared to a conventional class-B PA, the efficiency improvement is 9% at 6 dB PBO.
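The 9% figure can be cross-checked with a short calculation. The sketch assumes that the class-B reference shares the measured 49.4% peak drain efficiency and that its efficiency scales linearly with output voltage amplitude, which is the ideal class-B behavior; the text does not state how the reference curve was obtained, so this is an interpretation. The function names are hypothetical.

```python
def class_b_efficiency(peak_eff_pct, pbo_db):
    """Ideal class-B drain efficiency at a given power back-off.

    Efficiency is proportional to output voltage amplitude, and a back-off
    of pbo_db in power corresponds to an amplitude factor of 10^(-pbo_db/20).
    """
    return peak_eff_pct * 10 ** (-pbo_db / 20)


def dbm_to_mw(p_dbm):
    """Convert power in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


reference = class_b_efficiency(49.4, 6.0)  # class-B reference at 6 dB PBO
improvement = 33.7 - reference             # measured Doherty DE minus reference

print(f"class-B reference at 6 dB PBO: {reference:.1f} %")
print(f"improvement: {improvement:.1f} percentage points")
print(f"peak output power: {dbm_to_mw(21.4):.0f} mW")
```

Under these assumptions the reference lands near 24.8%, so the measured 33.7% corresponds to an improvement of roughly 9 percentage points, consistent with the value quoted above.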



Fig. 4. Micrograph of proposed DTX.



Fig. 5. Measured (a) drain efficiency and RF output power vs. $f_0$, (b) drain efficiency vs. RF output power.

In Fig. 6, the AM-AM and AM-PM characteristics with and without the on-chip DPD are presented, demonstrating the effectiveness of the on-chip DPD. The measured IQ image and LO leakage of the phase modulator are presented in Fig. 7. Without IQ calibration, the LO leakage and IQ image are $-56 \, \text{dBc}$ and $-50 \, \text{dBc}$, respectively, with a 150 kHz baseband signal. With IQ calibration, the LO leakage and IQ image reduce to $-58 \, \text{dBc}$ and $-72 \, \text{dBc}$, respectively. Finally, the spectral purity of the DTX is verified for a 20 MHz 64-QAM signal with 7.5 dB PAPR. According to Fig. 8, the measured out-of-band spectral purity is better than $-40 \, \text{dBc}$, while the related average drain efficiency and EVM are 24% and $-30 \, \text{dB}$, respectively. Table I compares the performance with state-of-the-art fully-integrated digital CMOS PAs and TXs. The proposed digital Doherty PA achieves high drain efficiency both at peak power and at 6 dB PBO.

TABLE I

PERFORMANCE SUMMARY AND COMPARISON WITH STATE-OF-THE-ART CMOS DIGITAL PAs AND TXs

| Ref       | Tech (nm) | Freq. (GHz) | Peak Pout (dBm) | Peak DE (%)       | DE (%) @ PBO level | Signal modulation (PAPR) | EVM (dB) | Avg. DE (%)       | TX chain         | DPA architecture |
|-----------|-----------|-------------|-----------------|-------------------|--------------------|--------------------------|----------|-------------------|------------------|------------------|
| [1]       | 65        | 2.2         | 23.3            | 43                | 33@6dB             | 11g20 64QAM (6.5dB)      | -28      | 24.5              | YES <sup>1</sup> | Load Switch      |
| [2]       | 28        | 2.4         | 27.8            | N.A.              | N.A.               | 11n40 (7.7dB)            | -30      | 19.5 <sup>2</sup> | YES              | Class B          |
| [3]       | 40        | 2.4         | 27.0            | N.A.              | N.A.               | 11n40 (8.4dB)            | -30      | 14.5              | YES              | Digital IQ       |
| [4]       | 65        | 3.8         | 27.3            | 30                | 23@5.4dB           | 0.5M 16QAM (5.4dB)       | -25      | 22.1              | NO               | Doherty          |
| [5]       | 65        | 0.9         | 24.0            | 45 <sup>3,4</sup> | 34@6dB             | 11ac40 256QAM (9dB)      | -34.8    | 22                | NO               | Doherty          |
| [6]       | 40        | 5.9         | 22.2            | 49.2              | 20@8dB             | 20M 64QAM (7.2dB)        | -30      | 23.3              | NO               | Outphasing       |
| This work | 40        | 2.5         | 21.4            | 49.4              | 33.7@6dB           | 20M 64QAM (7.5dB)        | -30      | 24                | YES              | Doherty          |

<sup>1</sup> Not including baseband and DPD, <sup>2</sup> Including power consumption of driver stage, <sup>3</sup> Power-added efficiency, <sup>4</sup> Off-chip power combining

Fig. 6. Measured (a) AM-AM and (b) AM-PM characteristics.

Fig. 7. Measured IQ image and LO leakage of phase modulator: (a) before and (b) after calibration.

## V. CONCLUSION

This paper demonstrates a highly efficient fully-integrated digital-intensive Doherty polar transmitter in 40 nm CMOS technology. By employing a class-E topology, this digital Doherty PA achieves 49.4% drain efficiency at a peak power of +21.4 dBm and 33.7% drain efficiency at 6 dB PBO. Employing on-chip DPD, the proposed digital RF transmitter demonstrates linearity better than $-40 \, \text{dBc}$ for a 20 MHz 64-QAM signal at 2.5 GHz, while its average drain efficiency is 24% and its EVM is better than $-30 \, \text{dB}$. The proposed digital transmitter architecture is scalable with advanced CMOS processes.

## ACKNOWLEDGMENT

The authors acknowledge Atef Akhnoukh of TU Delft and the IMEC / Europractice IC service team for their unlimited and high quality support, the people of Ampleon and NXP for their encouragement and advice, and the projects SEEDCOM (STW) and EAST (Catrene) for their financial support.



Fig. 8. Measured spectrum of 20 MHz 64-QAM signal.

## REFERENCES

- [1] L. Ye, et al., "A digitally modulated 2.4GHz WLAN transmitter with integrated phase path and dynamic load modulation in 65nm CMOS," in *ISSCC Tech. Digest*, Feb. 2013, pp. 330-331.
- [2] R. Winoto, et al., "9.4 A  $2 \times 2$  WLAN and Bluetooth combo SoC in 28nm CMOS with on-chip WLAN digital power amplifier, integrated 2G/BT SP3T switch and BT pulling cancelation," in *ISSCC Tech. Digest*, Feb. 2016, pp. 170-171.
- [3] Z. Deng, et al., "9.5 A dual-band digital-WiFi 802.11a/b/g/n transmitter SoC with digital I/Q combining and diamond profile mapping for compact die area and improved efficiency in 40nm CMOS," in *ISSCC Tech. Digest*, Feb. 2016, pp. 172-173.
- [4] S. Hu, et al., "Design of A Transformer-Based Reconfigurable Digital Polar Doherty Power Amplifier Fully Integrated in Bulk CMOS," in *IEEE J. Solid-State Circuits*, vol. 50, no. 5, pp. 1094-1106, May 2015.
- [5] V. Vorapipat, et al., "A wideband voltage mode Doherty power amplifier," in *Proc. of IEEE RFIC Symp.*, May 2016, pp. 266-269.
- [6] Z. Hu, et al., "A 5.9 GHz RFDAC-based outphasing power amplifier in 40-nm CMOS with 49.2% efficiency and 22.2 dBm power," *Proc. of IEEE RFIC Symp.*, May 2016, pp. 206-209.
- [7] G. Hanington, et al., "High-efficiency power amplifier using dynamic power-supply voltage for CDMA applications," *IEEE Trans. Microwave Theory and Techniques*, vol. 47, no. 8, pp. 1471-1476, Aug. 1999.