A Digitally Assisted Non-Linearity Mitigation System for Tunable Channel Select Filters

Gangarajaiah, Rakesh; Abdulaziz, Mohammed; Sjöland, Henrik; Nilsson, Peter; Liu, Liang

Published in:
IEEE Transactions on Circuits and Systems II: Express Briefs

DOI:
10.1109/TCSII.2015.2504272

2016

Total number of authors:
5


Rakesh Gangarajaiah, Student Member, IEEE, Mohammed Abdulaziz, Student Member, IEEE, Henrik Sjöland, Senior Member, IEEE, Peter Nilsson, Senior Member, IEEE, and Liang Liu, Member, IEEE

Abstract—This paper presents a low-complexity system for digitally assisting a channel select filter (CSF) to mitigate both even and odd order non-linearities. The proposed solution is scalable and can be utilized for non-linearity mitigation in different analog transceiver blocks. The system consists of an auxiliary path with a low resolution analog to digital converter (ADC), enabling digital recreation and measurement of the distortion in the main path, and relies on an adaptive digital signal processing algorithm to detect and tune the analog components to their optimal settings. The system provides robustness against process, voltage and temperature (PVT) variations, and the digital part requires an equivalent logic of only 42 k gates in CMOS technology, enabling cost-efficient implementation on integrated circuits. The operation of the system has been verified by using a tunable CSF capable of receiving a 10 MHz baseband signal, interfaced to an external ADC. The results demonstrate that the proposed system is capable of tuning the CSF to its optimal bias voltage, providing a third order intermodulation reduction of 14.5 dB.

Index Terms—Adaptive signal processing, Interference cancellation, Intermodulation distortion, Nonlinear circuits.

I. INTRODUCTION

TECHNOLOGY scaling in modern complementary metal oxide semiconductor (CMOS) processes has resulted in high performance and low power digital signal processing both due to the smaller feature size of transistors and the capability to operate at lower supply voltages. While scaling has immensely benefited digital circuits it has created both advantages and problems in their analog counterparts, problems mainly in terms of linearity. Furthermore, an increase in wireless connections within a limited spectrum has forced many radio devices to operate close in frequency, resulting in powerful interference in close proximity to the signal of interest. This has resulted in increased linearity requirements on the analog building blocks of a wireless transceiver, such as the low noise amplifier (LNA), mixer, and channel select filter (CSF), and these requirements are expected to increase with newer 5G technologies. The straightforward method of implementing high-linearity analog circuits results in increased area and power, both of which have to be kept low for a cost effective wireless device. Consequently an increasing number of receiver operations are performed in the digital baseband where linearity is guaranteed. Nevertheless, some of the analog building blocks are essential and designers are looking at techniques to reduce distortion due to interference.

Several techniques have been reported in literature addressing the intermodulation (IM) mitigation problem and one strategy is to implement radio frequency (RF) components which can be calibrated to operate at their optimum bias points [1]–[4]. Unfortunately, this bias point shifts significantly depending on process, voltage and temperature (PVT) variations, requiring frequent re-calibration to continue operating in the high linearity mode. Recently, the concept of using an auxiliary path to recreate and cancel non-linearities has been proposed to increase the robustness of analog circuits. An adaptive interference cancellation technique operating in the analog domain was introduced in [5]. One of the aspects of this method is the use of analog components to perform calibration, which themselves might suffer from PVT variation. In [6], [7], IM generation is performed in the analog domain and powerful digital signal processing is employed for IM cancellation. In order to minimize the hardware in the analog circuitry and to better generate all the required IM terms for cancellation in an LNA, a mixed domain approach is suggested in [6]. Authors in [8] develop a complete analytical model for different nonlinearities in RF receivers and verify the effectiveness of adaptive algorithms for feed-forward cancellation on a software-defined radio. A method suitable for an analog to digital converter (ADC) using digital downsampling is presented in [9]. In all of these methods, the auxiliary paths need to operate continuously, leading to increased power consumption. A method which can satisfy the conflicting demands of high linearity and low power consumption is needed. In [10], the authors use the advantages of tunable analog circuits which remove IM at the source, and harness the power of adaptive digital signal processing capable of tuning the analog circuits towards optimal operation. 
Unlike the approach in [6]–[9], this method does not need to operate continuously and can be powered down when the optimal bias point has been reached, leading to significant power savings. Furthermore, this method can be applied to RF blocks which can be calibrated to mitigate IM distortion, provided that a low resolution linear alternate path can be realized.

Based on this concept, this paper presents the complete hardware design and implementation of a system for adaptive non-linearity tuning of an analog component. A fully functional system is built around a tunable CSF with a 10 MHz baseband signal bandwidth [11]. The performance of the proposed system, configured to mitigate the third order IM, is evaluated under different input conditions and measurement results are presented. The system is implemented in a 65 nm CMOS process, and post layout simulation results are shown to evaluate the overhead in terms of area and power.

In detail, the digital part of the system performing this calibration consists of an alternate path ADC and a digital least mean squares (LMS) filter. The system has been implemented on a Xilinx Kintex-7 field programmable gate array (FPGA) and interfaced to the CSF using an external ADC. Several experiments have been performed using different resolutions on the auxiliary ADC, and the results show that the proposed method is capable of tuning the CSF towards optimal operation even when operating with a 4 bit ADC. The system is capable of detecting and tuning the CSF for mitigating both even and odd order IM using the same hardware blocks with minimal reconfiguration. A calibration scan can be completed within 20 ms, and the loop can be turned off once the desired level of linearity has been reached, leading to significant power savings.

The remainder of the paper is organized as follows. Section II gives a background of the system, and section III presents the implementation aspects of the digital tuning method. Different test scenarios and performance results are discussed in section IV and conclusions are presented in section V.

II. BACKGROUND

A traditional receiver chain in a wireless device is shown in Fig. 1, with the analog module corresponding to any or all of the building blocks in the RF front end. The analog modules are generally prone to non-linearities and are non-tunable, which, in the presence of strong interference and PVT variations, results in IM distortion that can corrupt the wanted signal. The problems due to non-linearities are especially aggravated in analog front-ends operating in the Frequency Division Duplexing (FDD) mode, when the receive (Rx) and transmit (Tx) bands are closely located in frequency [12]. These problems are expected to be more severe in devices aiming to achieve 5G data rates using wideband signal reception, and hence low IM distortion is a necessity.

Even order non-linearities in the analog domain are usually well suppressed through cancellation by the use of differential signals, whereas cancellation of odd order non-linearities is a challenging task. A method of using parallel CMOS devices operating in the sub-threshold region for canceling third order intermodulation (IM3) is presented in [1]. Another technique, based on cancellation in the transconductance stage, is presented in [11], where the control voltage of the different stages in a CSF can be tuned for optimum cancellation of IM3. The schematic of this CSF, along with the control signal for IM3 tuning, is shown in Fig. 3. However, supply and temperature variations shift the optimal control voltage, so regular re-tuning of the filter is needed. In this paper we present the details of the hardware implementation of a digital loop capable of performing this tuning and demonstrate its effectiveness in different test scenarios. The proposed system, together with a digital to analog converter (DAC), is capable of performing run time tuning of the filter.

III. DIGITALLY ASSISTED NON-LINEARITY TUNING SYSTEM

A. System Overview

Fig. 2 depicts a run-time tuning mechanism capable of calibrating an analog module, in this example a CSF. The main path of the receiver is assumed to consist of different RF components, such as an LNA and a mixer, which produce the amplified and downconverted signal \( x(t) \) that is fed into the tunable CSF. An ADC operating with an oversampling ratio (OSR) of eight is assumed to digitize the CSF output, which is then downsampled by a decimator to produce \( y(n) \) at the required digital baseband rate.

The basic idea of the tuning mechanism is as follows. An auxiliary ADC is used to capture the in band (IB) and out of band (OOB) interferences, enabling the digital re-creation of any \( X^{th} \) order distortion which we aim to cancel. A linear adaptive algorithm, which minimizes the first order error between the main path signal \( y(n) \) and the auxiliary path signal \( x'(n) \), is used to extract the error signal \( e(n) \), which mainly contains the non-linear components of the main path signal. The level of correlation between the digitally recreated IM distortion \( k(n) \) and the error \( e(n) \) is used as a measure of the non-linearity, with a higher correlation value indicating larger non-linearities. The system controller, along with the DAC, can then tune the analog component towards optimal operation. Note that the error signal \( e(n) \) contains both even and odd order non-linear components of the main path signal; hence, by reconfiguring the IM generator to produce either the even or the odd order IM, the corresponding non-linearity can be tuned by minimizing the correlation value.

The procedure for tuning a component is as follows. When calibration is started, the digital loop performs a scan over a predefined range of bias voltages, where the performance of the RF component is evaluated and the correlation values are stored. Once the scan is completed, the system controller chooses the bias voltage which minimizes IM distortion. A new calibration scan is initiated every few minutes or when significant changes in operating temperature or supply voltage are detected. For example, if IM3 and IM5 have to be tuned, the digital loop is first configured to tune for IM3, followed by tuning for IM5. The system controller stores the correlation values for these runs and compares them to determine the strongest IM. The bias voltage which minimizes the total IM distortion is then chosen at the end of the calibration scans. If another module such as the LNA is to be tuned, the auxiliary path reference signal could be obtained from a highly linear but noisy measurement receiver, commonly found in modern day transceiver chips. The system is capable of adaptive detection of blockers and can tune an analog component to its optimal bias region, after which it can be turned off to save power.
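
The scan procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `scan_bias`, `toy_correlation`, the 0.50 V to 0.70 V range and the 0.62 V optimum are hypothetical names and values, not taken from the paper, and the hardware measurement is replaced by a toy function.

```python
from typing import Callable, List, Sequence, Tuple


def scan_bias(measure_correlation: Callable[[float], float],
              bias_voltages: Sequence[float]) -> Tuple[float, List[float]]:
    """Visit every bias step, store its correlation value, return the best step."""
    stored = [abs(measure_correlation(v)) for v in bias_voltages]
    best_index = min(range(len(stored)), key=lambda i: stored[i])
    return bias_voltages[best_index], stored


# Hypothetical stand-in for the hardware loop: the correlation value is
# assumed to be smallest near a bias of 0.62 V.
def toy_correlation(bias_voltage: float) -> float:
    return (bias_voltage - 0.62) ** 2 + 0.01


voltages = [round(0.50 + 0.02 * i, 2) for i in range(11)]  # 0.50 V to 0.70 V
best_voltage, correlation_values = scan_bias(toy_correlation, voltages)
print(best_voltage)
```

Storing all values rather than tracking only the running minimum mirrors the text, where the controller keeps the results of each scan so that runs for different IM orders can be compared afterwards.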

The digital loop customized for IM3 tuning can be mathematically described as,

$$z(n) = F(x'(n) + e(n))$$
$$e(n) = y(n) - z(n)$$
$$k(n) = x'(n)^3$$
$$C = k(n) * e(n),$$

where the function $F()$ describes the operation of the adaptive filter, $x'(n)$ is the reference signal from the auxiliary ADC and $e(n)$ is the error signal which is correlated with the output of the IM generator $k(n)$, to produce the correlation value $C$ for the current tuning step.
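
As a rough illustration of these four relations, the following Python sketch runs a plain LMS filter on a toy two-tone signal and compares the correlation value with and without a cubic term in the main path. Several simplifications are assumptions made here and not part of the paper: the main path is modelled as `0.8*x + a3*x**3`, no decimation is performed, the arithmetic is floating point, the step size `mu = 0.02` is chosen for this toy signal, and $F()$ is read as an FIR filter on $x'(n)$ whose taps are updated from $e(n)$.

```python
import math


def lms_error(x_ref, y_main, taps=17, mu=0.02):
    """e(n) = y(n) - z(n), where z(n) is a linear LMS fit of y(n) from x'(n)."""
    weights = [0.0] * taps
    history = [0.0] * taps
    errors = []
    for x, y in zip(x_ref, y_main):
        history = [x] + history[:-1]
        z = sum(w * h for w, h in zip(weights, history))
        e = y - z
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return errors


def correlation_value(x_ref, errors, skip):
    """C: mean of k(n) * e(n) with k(n) = x'(n)^3, ignoring the first `skip` samples."""
    products = [(x ** 3) * e for x, e in zip(x_ref, errors)][skip:]
    return sum(products) / len(products)


def main_path(x_ref, a3):
    """Toy main path (assumption): linear gain plus a cubic non-linearity."""
    return [0.8 * x + a3 * x ** 3 for x in x_ref]


N = 6000
# Two tones at 49/312.5 and 25/312.5 cycles per sample (toy reference signal).
x_ref = [0.5 * (math.sin(2 * math.pi * 0.1568 * n) +
                math.sin(2 * math.pi * 0.08 * n)) for n in range(N)]

c_linear = correlation_value(x_ref, lms_error(x_ref, main_path(x_ref, 0.0)), N // 2)
c_nonlinear = correlation_value(x_ref, lms_error(x_ref, main_path(x_ref, 0.1)), N // 2)
print(c_linear, c_nonlinear)
```

In this toy setting the linear filter absorbs everything that is linearly related to the reference, so the correlation value stays near zero for the linear main path and becomes clearly positive once the cubic term is present.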

From the hardware resource perspective, the main components of the proposed system are an auxiliary ADC, an IM generator, two decimators to match the sample rates of the auxiliary path and the main path signals, an adaptive filter, a correlation unit and a system controller along with a DAC. In the current implementation, an off the shelf ADC is used for performing the tasks of the main and auxiliary path ADC. The IM generator is implemented using digital multipliers and requires two multiplication units to generate IM3.

B. Low Complexity Decimator Design

Decimation is an essential operation in many signal processing systems with the main goals of providing filtering along with sample rate reduction. An efficient method of choosing between different implementations is presented in [13]. One of the requirements of the proposed method is the presence of a “reference” signal in the auxiliary path, which may have a different sample rate than the signal in the main receiver path. In order to synchronize and enable low power implementation of the adaptive algorithm, a decimation filter chain is employed to match the sample rates of the auxiliary and the main path.

Two decimators are needed in the digital tuning loop, one for the IM generator output and one for the adaptive filter input, regardless of the order of IM being tuned. The same decimators can be re-used when tuning either an even order or an odd order IM, provided that tuning is performed over only a single IM order in any given calibration scan. These filters are implemented using a three stage half band filter (HBF) chain. The stop band attenuation is set to about 40 dB to handle strong OOB interference and to enable performance measurement of the system with auxiliary path ADC resolutions ranging from eight to four bits. Fig. 4 shows the architecture of the proposed decimator chain. The first stage operates at a higher frequency and is implemented with a lower order, whereas the third stage, with a higher order, enables a sharp cutoff. The HBF chain requires 13 non-trivial multiplications, and its response is equivalent to that of a $36^{th}$ order finite impulse response (FIR) decimator with 10 bit coefficients. We choose to implement the decimator with HBFs, as each of the three stages works at half the clock frequency of the previous stage, resulting in an overall lower power consumption. It has to be noted that alternate implementations of the ADCs in a single chip system may result in different decimation filters, depending on the OSR of the auxiliary ADC.
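
To make the structure of such a chain concrete, the sketch below builds three half band stages and decimates by eight in total. It is not the filter design of the paper: the Hamming-windowed-sinc design, the floating point coefficients and the stage lengths 7, 15 and 27 are assumptions, with the lengths picked only so that the number of distinct non-trivial coefficients happens to total 13. The point it illustrates is that every second tap of a half band filter is exactly zero and the centre tap is 0.5, which keeps the multiplier count low.

```python
import math

STAGE_LENGTHS = (7, 15, 27)  # hypothetical per-stage lengths, each of the form 4K+3


def halfband_taps(length):
    """Hamming-windowed ideal half band low-pass (cutoff at a quarter of the sample rate)."""
    assert length % 4 == 3
    mid = length // 2
    taps = []
    for i in range(length):
        n = i - mid
        if n == 0:
            ideal = 0.5                      # centre tap, a trivial shift in hardware
        elif n % 2 == 0:
            ideal = 0.0                      # every other tap is exactly zero
        else:
            ideal = math.sin(math.pi * n / 2) / (math.pi * n)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
        taps.append(ideal * window)
    return taps


def nontrivial_multiplications(length):
    """Distinct non-zero, non-centre coefficients of a symmetric half band filter."""
    return (length + 1) // 4


def decimate_by_2(signal, taps):
    """Zero-delay FIR filtering that only computes every second output sample."""
    mid = len(taps) // 2
    out = []
    for m in range(0, len(signal), 2):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = m + mid - j
            if h != 0.0 and 0 <= idx < len(signal):
                acc += h * signal[idx]
        out.append(acc)
    return out


def decimate_by_8(signal):
    for length in STAGE_LENGTHS:
        signal = decimate_by_2(signal, halfband_taps(length))
    return signal


dc_out = decimate_by_8([1.0] * 256)
total_multiplications = sum(nontrivial_multiplications(n) for n in STAGE_LENGTHS)
print(len(dc_out), total_multiplications)
```

Because each stage only computes every second output, the later and longer stages run at a lower rate, which is the power argument made in the text.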

C. The Adaptive Filter and Correlation Unit

A linear adaptive filter is capable of producing an error signal $e(n)$ in Fig. 2, containing the non-linear terms of the main path signal as detailed in Section III-A. The LMS algorithm is the simplest adaptive algorithm which aims to minimize the mean square error (MSE). Results in [10] show that a normalized LMS algorithm is capable of providing the required IM separation with a filter length of 9. We have chosen to use the standard LMS algorithm to avoid the division operation required by the normalized LMS, and to use a slower update factor $\mu = 0.25$ with a filter length of 17. An unrolled FIR structure with 10 bit coefficients is implemented.
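
For reference, the textbook tap updates of the two algorithms can be written as follows, where $\mathbf{w}(n)$ is the vector of filter taps, $\mathbf{x}'(n)$ is the vector of the most recent reference samples, and $\epsilon$ is a small regularization constant. These symbols are introduced here for illustration and are not notation from the paper.

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu \, e(n) \, \mathbf{x}'(n) \quad \text{(LMS)}$$
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{\mu}{\epsilon + \|\mathbf{x}'(n)\|^2} \, e(n) \, \mathbf{x}'(n) \quad \text{(normalized LMS)}$$

The normalized form divides by the instantaneous input power at every sample, which is the division that the standard LMS update avoids.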

The error signal $e(n)$ and the output of the IM generator $k(n)$ are generally not time synchronized, which can result in incorrect correlation values. To overcome this and better capture the correlation peaks, a bank of 5 correlators is used, as shown in Fig. 5, each performing correlation over different orthogonal frequency division multiplexing (OFDM) symbol samples. The maximum correlation value obtained is then used to control the DAC, which in turn tunes the CSF towards higher linearity. A reset signal is applied every 2048 samples, corresponding to one OFDM symbol period, so that the correlation calculation restarts for the next symbol. In the proposed system we have chosen to average the correlation values of 30 OFDM symbols, corresponding to a period of 2 ms per step, to increase the robustness of the estimated correlation values. The time spent at each tuning voltage step can be programmed by configuring counters, to provide a correspondingly slower or faster scan over a bias voltage range.
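
A minimal sketch of such a correlator bank is given below. It assumes that the five correlators differ by a relative sample offset of -2 to +2 between $k(n)$ and $e(n)$, which is one possible reading of the description and is not confirmed by the paper; the toy signals and the 31.25 MHz sample rate (250 MHz divided by the decimation factor of eight) are likewise assumptions.

```python
import random

SYMBOL_LEN = 2048            # samples per OFDM symbol (reset period)
N_SYMBOLS = 30               # symbols averaged per tuning step
LAGS = (-2, -1, 0, 1, 2)     # assumed offsets of the five correlators


def symbol_correlations(k, e, start):
    """One accumulated value per correlator for the symbol beginning at `start`."""
    values = []
    for lag in LAGS:
        acc = 0.0                            # accumulator is reset for every symbol
        for n in range(start, start + SYMBOL_LEN):
            m = n + lag
            if 0 <= m < len(e):
                acc += k[n] * e[m]
        values.append(abs(acc) / SYMBOL_LEN)
    return values


def averaged_peak(k, e):
    """Average over the symbols of the largest correlator output."""
    peaks = [max(symbol_correlations(k, e, s * SYMBOL_LEN)) for s in range(N_SYMBOLS)]
    return sum(peaks) / N_SYMBOLS


def averaged_single(k, e):
    """Same averaging, but using only the lag-0 correlator."""
    zero = LAGS.index(0)
    values = [symbol_correlations(k, e, s * SYMBOL_LEN)[zero] for s in range(N_SYMBOLS)]
    return sum(values) / N_SYMBOLS


random.seed(1)
k = [random.uniform(-1.0, 1.0) for _ in range(N_SYMBOLS * SYMBOL_LEN)]
e = [0.0] + [0.3 * v for v in k[:-1]]        # toy error: scaled k(n), one sample late

peak = averaged_peak(k, e)
single = averaged_single(k, e)
step_time_ms = N_SYMBOLS * SYMBOL_LEN / 31.25e6 * 1e3   # assumed 31.25 MHz rate
print(peak, single, step_time_ms)
```

With a one-sample misalignment, the lag-0 correlator alone sees almost no correlation while the bank still finds the peak. Under the assumed sample rate, 30 symbols of 2048 samples take just under 2 ms, in line with the step period quoted above.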

Fig. 4: Three stage half band decimation filter

TABLE I: Hardware Resource Utilization

<table>
<thead>
<tr>
<th>Component</th>
<th>Clock (MHz)</th>
<th>FPGA Slices</th>
<th>ASIC Gates</th>
<th>Power (mW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>IM3 Generator</td>
<td>250</td>
<td>195</td>
<td>0.7k</td>
<td>0.18</td>
</tr>
<tr>
<td>Decimator (per instance, two used)</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Stage 1</td>
<td>250</td>
<td>193</td>
<td>1k</td>
<td>0.58</td>
</tr>
<tr>
<td>Stage 2</td>
<td>125</td>
<td>301</td>
<td>1.4k</td>
<td>0.56</td>
</tr>
<tr>
<td>Stage 3</td>
<td>62.5</td>
<td>408</td>
<td>2.4k</td>
<td>0.42</td>
</tr>
<tr>
<td>Adaptive Filter + Correlator</td>
<td>31.25</td>
<td>6665</td>
<td>32k</td>
<td>4.90</td>
</tr>
<tr>
<td>Total</td>
<td></td>
<td>8664</td>
<td>42.3k</td>
<td>8.20</td>
</tr>
</tbody>
</table>

TABLE II: Comparison of Area and Power

<table>
<thead>
<tr>
<th>Component</th>
<th>ASIC area (65 nm)</th>
<th>Power (mW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Tunable CSF [11]</td>
<td>0.19 mm²</td>
<td>4.2</td>
</tr>
<tr>
<td>Digital tuning loop</td>
<td>0.086 mm²</td>
<td>8.2</td>
</tr>
<tr>
<td>ADC [15]</td>
<td>0.018 mm²</td>
<td>2.0</td>
</tr>
</tbody>
</table>

D. Hardware Implementation of the Digital Tuning Loop

In order to evaluate the hardware cost of the proposed system, the digital tuning loop customized to mitigate IM3 of the CSF was implemented on a Xilinx Kintex-7 [14] FPGA. Table I shows the resource utilization on the FPGA, the area required by a corresponding implementation in 65 nm CMOS technology, and the average power consumption of the proposed tuning loop. The digital loop is implemented to cover different PVT corners in the 65 nm CMOS technology, resulting in higher robustness and guaranteed functionally correct operation. Table II gives a broader picture of the area and power cost of the proposed system. We have chosen a state-of-the-art ADC for comparison, but a full chip implementation of the proposed system could have more relaxed requirements than the ADC in [15]. Since the calibration scans are typically initiated once every few minutes and complete within 20 ms, the total power overhead of the digital loop is negligible. The adaptive filter with the correlation unit occupies about 75% of the area required for the digital tuning loop. Further area reductions can be obtained by implementing a folded LMS filter, at the cost of a higher operating frequency. Nevertheless, the total area, which includes one IM3 generation unit, two decimators, and the unfolded adaptive filter with the correlation unit, is about half that of the CSF [11].
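
The figures quoted in this paragraph can be cross-checked against Tables I and II with a short calculation. The sketch assumes that the decimator rows of Table I describe a single decimator instance and that the totals count two such instances, which is what the arithmetic suggests; the variable names are illustrative only.

```python
# (FPGA slices, ASIC kilo-gates, power in mW), copied from Table I
im3_generator = (195, 0.7, 0.18)
decimator_stages = [(193, 1.0, 0.58), (301, 1.4, 0.56), (408, 2.4, 0.42)]
filter_and_correlator = (6665, 32.0, 4.90)
N_DECIMATORS = 2  # assumption: the table rows describe one of two identical decimators


def total(column):
    """Sum one column of Table I, counting the decimator stages N_DECIMATORS times."""
    one_decimator = sum(stage[column] for stage in decimator_stages)
    return (im3_generator[column]
            + N_DECIMATORS * one_decimator
            + filter_and_correlator[column])


slices, kgates, power_mw = total(0), total(1), total(2)
filter_share = filter_and_correlator[1] / kgates   # share of the gate count
area_ratio = 0.086 / 0.19                          # tuning loop area / CSF area (Table II)
print(slices, round(kgates, 1), round(power_mw, 2))
print(round(filter_share, 3), round(area_ratio, 3))
```

Counted this way, the sums reproduce the Total row of Table I, the adaptive filter and correlator account for roughly three quarters of the gates, and the digital loop comes out at a little under half of the CSF area.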

IV. MEASUREMENT RESULTS

A. Verification System Setup

The performance of the digital tuning system was tested by connecting it to the CSF using an external ADC, as shown in Fig. 6. Three Rohde and Schwarz (R&S) SMIQ06B signal generators (SIG) were used to generate one IB and two OOB blockers, with both modulated and non-modulated signals. The output and the input signals of the CSF were fed into two separate channels of a 4DSP FMC125 ADC [16]. The ADC operated at 1.25 GHz, and the samples produced were digitally downsampled by dropping 3 out of every 4 samples to yield an effective sample rate of 312.5 MHz, which is slightly higher than the sample rate considered in [10]. The samples from the ADC were synchronized using FIFOs, and a clock generator module from Xilinx was used to generate the different clock signals. Due to the limited number of external pins on the Kintex-7 FPGA, the DAC was replaced by a display implemented using ChipScope, and the CSF control voltage was tuned manually. The FSEA spectrum analyzer (SA) from R&S was used to measure the actual level of IM3 at the output of the CSF, which was then compared against the values obtained from the ChipScope display.

B. Test Results and Discussion

A two tone test was performed with the setup of Fig. 6, with two OOB continuous wave blockers at 49 MHz ($F_1$) and 25 MHz ($F_2$), resulting in an inband IM3 at 1 MHz ($2F_2 - F_1$). A single tone signal at 2.36 MHz was also provided as an input to check the effectiveness of the adaptive algorithm in rejecting inband signals. A more difficult scenario with an inband blocker was also tested. The full resolution of the main path ADC was used, the OOB blockers were set to -7 dBFS, and a wideband signal at -31 dBFS was used as the IB signal. The screenshots from the spectrum analyzer with this setup are shown in Fig. 7, with the CSF output before and after calibration. It can be seen that the IM3 level is reduced by 14.5 dB, from -42.5 dBm to -57 dBm.
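
The tone arithmetic of this test can be reproduced with a short calculation. The function name and the 10 MHz band edge used to decide what counts as inband are assumptions made for this sketch.

```python
def third_order_products(f1_mhz, f2_mhz):
    """Frequencies (MHz) of the third order intermodulation products of two tones."""
    return {
        "2F1-F2": abs(2 * f1_mhz - f2_mhz),
        "2F2-F1": abs(2 * f2_mhz - f1_mhz),
        "2F1+F2": 2 * f1_mhz + f2_mhz,
        "2F2+F1": 2 * f2_mhz + f1_mhz,
    }


BAND_EDGE_MHZ = 10.0  # assumed edge of the wanted baseband channel

products = third_order_products(49.0, 25.0)
inband = {name: f for name, f in products.items() if f <= BAND_EDGE_MHZ}

im3_before_dbm, im3_after_dbm = -42.5, -57.0
reduction_db = im3_before_dbm - im3_after_dbm
print(inband, reduction_db)
```

Of the four products, only $2F_2 - F_1$ at 1 MHz falls inside the assumed band, and the difference between the two measured levels gives the quoted 14.5 dB.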

The performance and cost of the digital tuning loop are mainly determined by the auxiliary ADC. A high resolution ADC, while providing a better tuning capability, would result in increased area and power consumption, which might overshadow the improvements obtained by tuning the analog component. On the other hand, a low resolution ADC will not provide the required degree of IM detection. Measurements with different resolutions on the auxiliary ADC were performed to determine the minimum required resolution, by simply dropping the least significant bits of the auxiliary ADC samples in the digital domain. Fig. 8 shows the normalized correlation values against the control voltage of the CSF, obtained from the digital control loop when operating with 8, 6 and 4 bit resolutions. The plot for the measured IM3 corresponds to the values read out from the spectrum analyzer and indicates the true level of IM3 distortion. It can be seen that the plot for the 8 bit ADC follows the true performance of the CSF to a high degree, and also that a 4 bit ADC is capable of detecting the control voltage for close to optimum biasing. A performance improvement of about 14 dB can be obtained by using the proposed system, and simulation results for a 16QAM, 10 MHz baseband Long Term Evolution (LTE) signal are shown in Fig. 9. IM3 levels of -26 dB and -40 dB are chosen to highlight the error vector magnitude (EVM) improvement pictorially, with a 14 dB change in IM3 resulting in an EVM improvement of around 5 times. Several low resolution ADCs are presented in [15], [17], [18], and one such solution could be implemented together with the digital control loop for a single chip solution.
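
Two of the steps in this paragraph lend themselves to a small worked example: reducing the resolution by dropping least significant bits, and converting the 14 dB change in IM3 into an amplitude ratio. The helper names are hypothetical, and treating the EVM as proportional to the IM3 amplitude (that is, IM3 being the dominant error term) is an assumption of this sketch.

```python
def drop_lsbs(sample: int, from_bits: int, to_bits: int) -> int:
    """Reduce resolution by discarding the least significant bits.

    The right shift on Python integers is arithmetic, so signed samples keep their sign.
    """
    return sample >> (from_bits - to_bits)


def db_to_amplitude_ratio(delta_db: float) -> float:
    """Amplitude (voltage) ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 20)


four_bit_sample = drop_lsbs(0b10110111, 8, 4)        # keeps the top four bits
four_bit_negative = drop_lsbs(-128, 8, 4)            # most negative 8 bit value
evm_ratio = db_to_amplitude_ratio(-26.0 - (-40.0))   # 14 dB difference in IM3
print(bin(four_bit_sample), four_bit_negative, round(evm_ratio, 2))
```

A 14 dB change corresponds to an amplitude ratio of about 5, which agrees with the EVM improvement of around 5 times mentioned in the text.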

V. CONCLUSION

This paper presents a low-complexity digital system for assisting a tunable CSF. It is capable of detecting and tuning the CSF to its optimal setting for minimum distortion after which it can be shut down resulting in minimum power consumption overhead. The proposed solution has been verified both by simulations and hardware implementation on a Xilinx Kintex-7 FPGA interfaced to the tunable CSF and the results obtained show that the algorithm can be implemented with a low resolution 4 bit ADC. The proposed system requires a total of only 42 k gates and is robust to PVT variations mainly due to the digital nature of the tuning circuit.

ACKNOWLEDGMENT

This work is a part of the DARE project and the authors thank the Swedish Foundation for Strategic Research.

REFERENCES