An Energy Efficiency Perspective on Massive MIMO Quantization

Sarajlic, Muris; Liu, Liang; Edfors, Ove

Published in: Proceedings of the 50th Asilomar Conference on Signals, Systems and Computers

DOI: 10.1109/ACSSC.2016.7869084

2017

Document Version: Peer reviewed version (aka post-print)


Abstract—One of the basic aspects of Massive MIMO (MaMi) that is in the focus of current investigations is its potential of using low-cost and energy-efficient hardware. It is often claimed that MaMi will allow for using analog-to-digital converters (ADCs) with very low resolutions and that this will result in overall improvement of energy efficiency. In this contribution, we perform a parametric energy efficiency analysis of MaMi uplink for the entire base station receiver system with varying ADC resolutions. The analysis shows that, for a wide variety of system parameters, ADCs with intermediate bit resolutions (4 - 10 bits) are optimal in energy efficiency sense, and that using very low bit resolutions results in degradation of energy efficiency.

I. INTRODUCTION

Massive MIMO [1] is an emerging wireless communication technique that promises substantial gains in both spectral and energy efficiency compared to traditional cellular systems. One feature of MaMi is its robustness to hardware imperfections [2], which implies that using MaMi will result in improved cost and energy efficiency. This fact has inspired a flurry of research activity focusing on diverse aspects of hardware impairments in MaMi and specific signaling and signal processing design tailored with the effects of the impairments in mind - all with the common goal of further improving the efficiency of the system.

A significant body of this research aims at analyzing and fine-tuning MaMi-based systems using ADCs with low resolution [3], [4]. The motivation for such investigations is well-established: the power consumption of ADCs grows at least linearly with sampling rate [15] and will prove to be a power consumption bottleneck in systems with very large bandwidths. Therefore, there is a huge interest in reducing the power consumption of ADCs by reducing the bit resolution as much as possible.

However, it is not perfectly clear whether choosing ADCs with extremely low resolutions will be beneficial from an energy efficiency point of view, and analyses of this problem appear scarce. Only some very recent results [5], [6] seem to indicate that very low bit resolutions are not optimal in the energy efficiency sense.

The analysis in this contribution aims at providing insight into how the overall energy efficiency of a MaMi system - specifically, the MaMi uplink - behaves when the ADC resolution is changed. The total power consumption of the MaMi base station is parameterized, so that the analysis covers a wide variety of system architectures - from very economical to very power-hungry. Also, the ADC power consumption model that is used attempts to reflect the functional dependencies that are found in actual ADC designs and as such aims to be close to hardware design reality.

II. ENERGY EFFICIENCY METRIC

Energy efficiency of a base station (BS) receiver in the uplink of a MaMi system is defined as

$$\eta = \frac{C}{P_{\text{tot}}} \text{[bits/Joule]},$$

with $C$ [bits/s] being the uplink sumrate and $P_{\text{tot}}$ [W] total power consumption of the MaMi BS (ADCs together with all other receiver blocks, analog and digital).

Dependences of sumrate and power consumption on ADC bit resolution $b$ need to be resolved separately. To this end, we first turn to finding an appropriate model for the impact of ADCs on system performance.

III. ADC PERFORMANCE MODELING

This analysis assumes ADCs with bit resolution $b$ that perform scalar quantization and are uniform with $N_q = 2^b$ quantization levels. Uniform quantization was chosen because it is both close to hardware implementation reality [7] and allows for simple and tractable modeling. Additionally, sampling is assumed to be performed at Nyquist rate.

Quantizer mapping rule: given the ADC resolution $b$ and a real positive scalar $X_{\text{ol}}$, the quantization step of the quantizer is defined as $\Delta = 2X_{\text{ol}}/2^b$, and the values of $N_q$ quantization levels are assigned as $q_i = i\Delta - (N_q + 1)\Delta/2$, $i = 1, \ldots, N_q$, so that the levels are placed symmetrically around zero. Additionally, the real line segment $[-X_{\text{ol}}, X_{\text{ol}}]$ can be divided into $N_q$ equal subsegments $[-X_{\text{ol}}, T_1], [T_1, T_2], \ldots, [T_{N_q-1}, X_{\text{ol}}]$ with subsegment boundaries (thresholds) $T_i = q_i + \Delta/2$. Given a discrete-time input $x[n]$, the input-output characteristic of the quantizer is defined as

$$Q(x[n]) = \begin{cases} q_1, & x \leq T_1 \\ q_i, & T_{i-1} < x \leq T_i, \quad i = 2, \ldots, N_q - 1 \\ q_{N_q}, & x > T_{N_q-1} \end{cases}$$

Quantization $Q(x[n])$ is a nonlinear mapping of $x[n] \in \mathbb{R}$ to a discrete set that results in additive distortion

$$Q(x[n]) = x[n] + q[n].$$
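The mapping rule can be sketched in a few lines of Python. This is a minimal sketch, assuming levels placed symmetrically around zero at $q_i = (i - (N_q + 1)/2)\Delta$ with thresholds $T_i = q_i + \Delta/2$; the function name `uniform_quantize` and the default $X_{\text{ol}} = 1$ are illustrative choices, not taken from the paper.

```python
import math


def uniform_quantize(x, b, x_ol=1.0):
    """Uniform b-bit quantizer with overload level x_ol (illustrative sketch)."""
    n_q = 2 ** b                      # number of quantization levels
    delta = 2.0 * x_ol / n_q          # quantization step
    # Thresholds sit at T_i = (i - n_q / 2) * delta, so the level index is a
    # ceiling; clamping to [1, n_q] reproduces clipping to the outer levels.
    i = min(max(math.ceil(x / delta + n_q / 2), 1), n_q)
    return (i - (n_q + 1) / 2) * delta


# Distortion q = Q(x) - x for a sample inside the range and for a clipped one.
for x in (0.1, 5.0):
    y = uniform_quantize(x, b=2)
    print(f"x = {x}, Q(x) = {y}, q = {y - x:+.2f}")
```

With $b = 2$ and $X_{\text{ol}} = 1$ the levels are $\pm 0.25$ and $\pm 0.75$, so the input 0.1 maps to 0.25, while the input 5.0 is clipped to the outer level 0.75.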

The nature of distortion $q[n]$ is twofold (in what follows, the time index $n$ is dropped for clarity). If $|x| > X_{\text{ol}}$, $x$ is represented by one of the outer quantization levels $q_1$ or $q_{N_q}$ and we say that the signal is “clipped”. Consequently, $q$ is referred to as clipping or overload distortion with variance $\sigma_{q_{ol}}^2$. On the other hand, if $|x| \leq X_{ol}$, the amplitude of distortion $q$ is bounded by $\Delta/2$; distortion $q$ is then referred to as granular noise.

Fig. 1: Input backoff $\mu$ achieving -13 dB deviation from PQN model; actual deviation and input-distortion correlation $\rho_{xq}$ for the linear fit.

In practical systems, the ADC is usually preceded by an automatic gain control (AGC) variable gain amplifier that is used to conveniently adjust the dynamic range of the signal $x$. The primary purpose of the AGC is to minimize overload distortion. A welcome consequence of a properly controlled AGC is that the remaining distortion can be described by a simple and tractable model.

The model in question is described as follows: it was shown in [8] that, for a uniform quantizer with normally distributed $x$, the distortion $q$ can very well be approximated as being uniformly distributed, uncorrelated with the input and white, with the variance of $q$ being

$$\mathbb{E}\{q^2\} \approx \frac{1}{3} X_{ol}^2 2^{-2b} \triangleq \sigma_{q_{PQN}}^2.$$  \hspace{1cm} (4)

This commonly used model is usually referred to as the pseudoquantization noise (PQN) model. The approximations in the PQN model can be made extremely tight if the dynamic range of $x$ is set properly.

A commonly used design parameter for the AGC is the input backoff $\mu = X_{ol}^2/\mathbb{E}\{x^2\}$. In this work, $\mu$ is set so that the normalized deviation of the distortion variance from $\sigma_{q_{PQN}}^2$ is less than or equal to $-13$ dB (indicating that the main source of distortion is granular noise and that it can be safely modeled by the PQN model).

The resulting $\mu^*(b)$ was then approximated linearly by a chord $\mu_{i}^*(b)$. Deviation $\delta \sigma_{q_{PQN}}^2$ and input-distortion crosscorrelation $\rho_{xq} = \mathbb{E}\{qx\}/\sqrt{\mathbb{E}\{x^2\} \mathbb{E}\{q^2\}}$ were obtained by simulations for $b \in [1, 25]$ and $\mu_{i}^*(b)$. The results are shown in Fig. 1 and show that the PQN model applies well even for very low bit resolutions (1 bit) if the AGC backoff is set properly.
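A check of this kind can be sketched with a short Monte Carlo simulation. The sketch below assumes a unit-variance Gaussian input, a symmetric uniform quantizer, and an illustrative backoff $\mu = 16$ at $b = 6$; these values are assumptions for demonstration and are not the $\mu^*(b)$ values of Fig. 1. It returns the ratio of the measured distortion variance to the PQN prediction together with the correlation $\rho_{xq}$.

```python
import math
import random


def quantize(x, b, x_ol):
    """Symmetric uniform b-bit quantizer with overload level x_ol."""
    n_q = 2 ** b
    delta = 2.0 * x_ol / n_q
    i = min(max(math.ceil(x / delta + n_q / 2), 1), n_q)
    return (i - (n_q + 1) / 2) * delta


def pqn_check(b, mu, n_samples=20000, seed=0):
    """Return (measured variance / PQN variance, rho_xq) for Gaussian input."""
    rng = random.Random(seed)
    x_ol = math.sqrt(mu)              # unit input power, so mu = x_ol^2
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    qs = [quantize(x, b, x_ol) - x for x in xs]
    var_q = sum(q * q for q in qs) / n_samples
    var_x = sum(x * x for x in xs) / n_samples
    rho = sum(q * x for q, x in zip(qs, xs)) / n_samples / math.sqrt(var_x * var_q)
    var_pqn = x_ol ** 2 * 2.0 ** (-2 * b) / 3.0
    return var_q / var_pqn, rho


ratio, rho = pqn_check(b=6, mu=16.0)
print(f"variance ratio = {ratio:.3f}, rho_xq = {rho:+.4f}")
```

A ratio close to 1 and a correlation close to 0 indicate that, for the chosen backoff, granular noise dominates and the PQN approximation is reasonable.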

IV. SYSTEM MODEL AND SUMRATE CALCULATION

By having determined an appropriate model for the effects of ADC, we now employ it in the overall system model for the uplink, which is constructed assuming the following system setup:

- Uplink of a single-cell MaMi system with $M$ antennas and $K$ users;
- i.i.d. Rayleigh block fading over $T$ symbols;
- Least-squares channel estimation is performed using orthogonal pilot sequences of length $\tau$ in the uplink;
- Channel estimates are used for linear receiver processing. Maximum ratio combining (MRC) and zero-forcing (ZF) receivers are considered.

A system model of the uplink, where ADCs are substituted by quantization noise sources following the PQN model and AGCs precede ADCs, is illustrated in Fig. 2.

![Fig. 2: Uplink system model with quantization noise](image-url)

User $k$ sends a data symbol $x_k \in \mathbb{C}$. User symbols are collected in vector $x = (x_1 \ x_2 \ldots \ x_K)^T$, with $\mathbb{E}\{xx^H\} = I_K$. Single-carrier, narrowband transmission is assumed, and thus the propagation channel is represented by the $M \times K$ matrix $G = HD^{1/2}$, with the elements of $M \times K$ matrix $H$ modeling small-scale (SS) fading coefficients. The elements of $H$ are zero mean, circularly symmetric, complex Gaussian random variables with variance 1.

The $K \times K$ matrix $D^{1/2}$ is a diagonal matrix of amplitude path gains and large-scale (LS) fading coefficients taken jointly. The $(m,k)$ element of $G$ can be written as $g_{mk} = h_{mk}\sqrt{\beta_k}$, with $h_{mk}$ being the narrowband small-scale fading coefficient between the $k$th user and $m$th antenna and $\beta_k$ the joint path power gain and large-scale fading coefficient. It should be pointed out that, if some uplink power control is employed, its effects will also be modeled by $\beta$. In the case of ideal uplink power control, all $\beta_k = 1$.

Assuming that every user transmits with equal transmit power $p_u$, the signal at the receive antennas is

$$y = \sqrt{p_u}Gx + n = \sqrt{p_u}HD^{1/2}x + n$$  \hspace{1cm} (5)

where $n = (n_1 \ n_2 \ldots n_M)^T$ is the vector of input-referred thermal noise at each antenna. Thermal noise powers at all antennas are assumed equal to $p_n$.

Received signal $y_i$ will experience variations of average power due to SS and LS fading. This power is averaged over both SS and LS fading and combined with the target backoff $\mu_i$ to find the AGC gains that result in target performance as described in Section III. The AGC gain per I/Q branch of the $i$th receiver chain is found to be

$$\gamma_i = \frac{2}{\mu_i \left( p_u \sum_{k=1}^{K} \beta_k + p_n \right)}$$  \hspace{1cm} (6)

Amplitude AGC gains $\sqrt{\gamma_i}$ can be conveniently collected in a diagonal matrix $\Gamma^{1/2}$.
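As a quick numerical check of (6), the following sketch computes the AGC gain and verifies that the resulting per-branch power yields the requested backoff. It assumes $X_{ol} = 1$ and that the received power splits evenly between the I and Q branches; the numerical values of $\mu$, $p_u$, $p_n$ and $K$ are illustrative only.

```python
def agc_gain(mu, p_u, betas, p_n):
    """AGC power gain per I/Q branch, following (6)."""
    return 2.0 / (mu * (p_u * sum(betas) + p_n))


mu, p_u, p_n = 4.0, 1.0, 1.0          # illustrative values only
betas = [1.0] * 10                    # ideal power control, K = 10 users
gamma = agc_gain(mu, p_u, betas, p_n)

# Average power per I/Q branch after the AGC. With X_ol = 1 its inverse is
# the realized input backoff, which should equal mu.
branch_power = gamma * (p_u * sum(betas) + p_n) / 2.0
realized_backoff = 1.0 / branch_power
print(gamma, realized_backoff)
```

The factor 2 in the numerator of (6) accounts for the complex signal power being shared by two real branches.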

The signal after the AGC is

$$\tilde{y} = \Gamma^{1/2}y = \sqrt{p_u} \Gamma^{1/2}HD^{1/2}x + \Gamma^{1/2}n$$  \hspace{1cm} (7)

Finally, quantization noise is added. Assuming $X_{ol,i} = 1$, the variance of the complex quantization noise in the $i$th chain is

$$p_{q,i} = \mathbb{E} \left[ |q_i|^2 \right] = \frac{2}{3} \ 2^{-2b_i}$$  \hspace{1cm} (8)

and the signal model after the ADC becomes

$$z = \tilde{y} + q = \sqrt{p_u} \tilde{H}x + \tilde{n} + q,$$  \hspace{1cm} (9)

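where $\tilde{H} = \Gamma^{1/2}HD^{1/2}$ is the effective channel after the AGC, $\tilde{n} = \Gamma^{1/2}n$ is the scaled thermal noise, and $q$ holds the complex quantization noise samples from all antennas.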
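Combining (6) and (8) gives a simple per-antenna figure of merit that is not stated explicitly in the text: the average post-AGC signal power is $\gamma_i (p_u \sum_k \beta_k + p_n) = 2/\mu_i$, so the ratio of received power to quantization noise power is $3 \cdot 2^{2b_i}/\mu_i$. The sketch below evaluates it in dB; the backoff $\mu = 4$ is an assumed, fixed value used only for illustration.

```python
import math


def quantization_noise_power(b):
    """Complex quantization noise variance from (8), overload level 1."""
    return (2.0 / 3.0) * 2.0 ** (-2 * b)


def sqnr_db(b, mu):
    """Post-AGC received power (2 / mu) over quantization noise power, in dB."""
    return 10.0 * math.log10((2.0 / mu) / quantization_noise_power(b))


for b in (1, 4, 8):
    print(b, round(sqnr_db(b, mu=4.0), 2))
```

At a fixed backoff each extra bit adds about 6.02 dB; when the backoff is chosen as a function of $b$, as in Section III, the growth per bit differs accordingly.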

Channel estimation in the uplink is performed using pilot sequences that are spatially orthogonal and $\tau$ symbols long. More precisely, pilot sequences for all $K$ users are represented by a $K \times \tau$ matrix $\Phi = \sqrt{p_o}\tau \Psi$, where in turn $\Psi$ is a $K \times \tau$ matrix with orthonormal rows: $\Psi\Psi^H = I_K$. Sequences $\Phi$ are optimal for least-squares pilot-based channel estimation [9].

When a block of pilot symbols $\Phi$ is transmitted, the received signal is

$$Z = \tilde{H}\Phi + \tilde{N} + \Xi,$$  \hspace{1cm} (10)

where the columns of matrices $\tilde{N} = [\tilde{n}_1 \ \tilde{n}_2 \ldots \tilde{n}_\tau]$ and $\Xi = [\ q_1 \ q_2 \ldots q_\tau ]$ are thermal and quantization noise vectors for each channel use (symbol). The least-squares channel estimate is then

$$\hat{H} = Z\Phi^H \left( \Phi\Phi^H \right)^{-1} = \tilde{H} + \left( \tilde{N} + \Xi \right) \Phi^H \left( \Phi\Phi^H \right)^{-1}.$$  \hspace{1cm} (11)
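The pilot-based estimate can be illustrated with a small, self-contained sketch. It assumes DFT-based pilot rows scaled so that $\Phi\Phi^H$ equals a pilot energy times the identity, and it normalizes the estimate by that energy; the scaling, the dimensions, the noise level and all function names are hypothetical choices made for illustration, with a single Gaussian term `w` standing in for the sum of thermal and quantization noise.

```python
import cmath
import math
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def hermitian(a):
    """Conjugate transpose."""
    return [[v.conjugate() for v in col] for col in zip(*a)]


def dft_pilots(k, tau, energy):
    """K x tau pilots with orthogonal rows, scaled so Phi Phi^H = energy * I."""
    scale = math.sqrt(energy / tau)
    return [[scale * cmath.exp(-2j * math.pi * u * t / tau) for t in range(tau)]
            for u in range(k)]


def ls_estimate(z, phi, energy):
    """Z Phi^H (Phi Phi^H)^{-1}, using Phi Phi^H = energy * I."""
    return [[v / energy for v in row] for row in matmul(z, hermitian(phi))]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(1)
m, k, tau, energy = 4, 2, 4, 4.0       # hypothetical small dimensions
h = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2) for _ in range(k)]
     for _ in range(m)]
phi = dft_pilots(k, tau, energy)

sigma = 0.1                            # assumed noise standard deviation
w = [[complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(tau)]
     for _ in range(m)]
z_clean = matmul(h, phi)
z_noisy = [[a + b for a, b in zip(rz, rw)] for rz, rw in zip(z_clean, w)]

err_clean = max_abs_diff(ls_estimate(z_clean, phi, energy), h)
err_noisy = max_abs_diff(ls_estimate(z_noisy, phi, energy), h)
print(err_clean, err_noisy)
```

Without noise the estimate recovers the channel up to rounding error; with noise the estimate is the true channel plus an error term, which is the structure exploited in the decomposition of the processing matrices.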

Linear processing matrices for the uplink are formed using the channel estimates:

- MRC: $\hat{A}_{\text{MRC}} = \hat{H}$,
- ZF: $\hat{A}_{\text{ZF}} = \hat{H} \left( \hat{H}^H \hat{H} \right)^{-1}$.

The MIMO receiver applies the processing matrix to estimate the vector of symbols sent by the users as

$$\hat{x} = \hat{A}^H z = \sqrt{p_u} \hat{A}^H \tilde{H}x + \hat{A}^H \tilde{n} + \hat{A}^H q.$$  \hspace{1cm} (12)

It can be shown that $\hat{A}$ can be split into a sum of two terms, one being the “true” processing matrix (based solely on the actual channel $\tilde{H}$) and the other an error term that is a consequence of channel estimation errors, namely

- MRC: $\hat{A}_{\text{MRC}} = A_{\text{MRC}} + A_{\text{MRC},e} = \tilde{H} + A_{\text{MRC},e}$, and
- ZF: $\hat{A}_{\text{ZF}} = A_{\text{ZF}} + A_{\text{ZF},e} = \tilde{H} \left( \tilde{H}^H \tilde{H} \right)^{-1} + A_{\text{ZF},e}$.

This fact holds for both MRC (follows directly from (11)) and ZF [10].

This simple decomposition allows for splitting the estimate of user data symbol $x_k$, pertaining to the $k$th user, into a wanted signal term and a noise term

$$\hat{x}_k = \sqrt{p_u}a_k^H \tilde{h}_k x_k + w_k,$$  \hspace{1cm} (13)

where $a_k$ is the $k$th column of $A$ and $\tilde{h}_k$ the $k$th column of $\tilde{H}$. Additive noise term $w_k$ contains interuser interference and effects of thermal and quantization noise during channel estimation and data transmission phases.

One important observation (the proof of which is omitted here) is that the constituent terms of $w_k$ are all uncorrelated and Gaussian. This is a consequence of several factors, namely: quantization noise being uncorrelated with the input to the ADC, noise in channel estimation phase being independent from the one in data transmission phase, and a large number of antennas (so that the central limit theorem applies).

The signal-to-interference-thermal-and-quantization-noise ratio for $k$th user is then calculated as

$$\text{SINQR}_k = \frac{\mathbb{E}_{x,n,q} \left[ |\sqrt{p_u} a_k^H \tilde{h}_k x_k|^2 \right]}{\mathbb{E}_{x,n,q} \left[ |w_k|^2 \right]}.$$  \hspace{1cm} (14)

The ergodic sumrate of the system is the sum of achievable rates for each user, averaged over channel realizations:

$$C = B \frac{T - \tau}{T} \sum_{k=1}^{K} \mathbb{E}_{H} \left\{ \log_2(1 + \text{SINQR}_k) \right\},$$  \hspace{1cm} (15)

with $B$ being the bandwidth of the system.
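Expression (15) is straightforward to evaluate once per-user SINQR values are available. In the sketch below the expectation over channel realizations is replaced by an average over a list of realizations; the SINQR numbers and the coherence and pilot lengths are made-up placeholders, not results from the paper.

```python
import math


def ergodic_sumrate(bandwidth_hz, t_coherence, tau, sinqr_realizations):
    """Evaluate (15); sinqr_realizations[r][k] is the SINQR of user k in realization r."""
    prelog = (t_coherence - tau) / t_coherence   # fraction of symbols carrying data
    per_realization = [sum(math.log2(1.0 + s) for s in users)
                       for users in sinqr_realizations]
    mean_rate = sum(per_realization) / len(per_realization)
    return bandwidth_hz * prelog * mean_rate


# Two placeholder realizations with K = 2 users each.
c = ergodic_sumrate(20e6, t_coherence=100, tau=10,
                    sinqr_realizations=[[1.0, 3.0], [1.0, 3.0]])
print(c)
```

The pre-log factor $(T - \tau)/T$ captures the cost of training: longer pilots improve the channel estimate but leave fewer symbols for data.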
Fig. 3: ADC power consumption model, compared with actual ADC designs

V. POWER CONSUMPTION MODEL

In this work, system setup choices and models aim to be as close to hardware reality as possible. To this end, we focus on a particular type of ADC - the pipeline ADC. This type of ADC is typically designed for intermediate bit resolutions, medium to high sampling rates $f_s$ and has power consumption that is comparatively superior to other types of ADCs when observed over a wide range of operating resolutions (very low to very high) [11], [12], [13].

For the power consumption model of the ADC, this work adopts the model described in [14]. It represents a theoretical bound on power dissipation of pipeline ADCs that was nevertheless shown to correctly predict the trends observed in actual designs. As such, it can be of use in a parametric energy efficiency analysis, where the character of functional dependency between $b$ and power consumption is of primary interest.

As shown in Fig. 3, where the model from [14] is compared with selected pipeline ADC designs collected in [15], the functional dependency in the model matches the trend exemplified by state-of-the-art pipeline ADC designs.

To obtain the total power consumption of the BS, the power consumption of the remaining blocks (analog and digital) needs to be taken into account. This proves to be an extremely challenging task due to wide variability of available system architectures and apparent lack of unifying theoretical information. Therefore, this work chooses a parametric approach to modeling of the total power consumption.

Power consumption of the blocks excluding ADCs, denoted by $P_{rest}$, is normalized by the power consumption of the ADCs across all RF chains at an arbitrary reference bit resolution $b_{ref}$, that is, by $2MP_{ADC, ref}$, where $P_{ADC, ref}$ is the power consumption of a single ADC at $b_{ref}$. Total power consumption of the BS in the uplink can therefore be expressed as

$$P_{tot} = 2MP_{ADC} + P_{rest} = 2M(P_{ADC} + \alpha P_{ADC, ref}),$$

where the quantity

$$\alpha = \frac{P_{rest}}{2MP_{ADC, ref}}$$

is referred to as the architecture factor. Introduction of the architecture factor helps tie together the power consumption of the ADCs and remaining receiver blocks; moreover, it allows for a parameterized analysis that covers a wide range of system architectures.

VI. RESULTS

The aim of this contribution was to provide an initial overview of the energy efficiency trends as various system parameters change. To provide this initial insight, system performance simulations have been performed across a wide variety of system parameters.

Alongside primary system parameter $b$, several other important system parameters have been considered, namely $M$, $K$, $T$, $\tau$ and preprocessing $SNR = p_u/p_n$ (defined with large-scale fading normalized to the level of thermal noise). In order to reduce the dimensionality of the analysis, two auxiliary system parameters have been introduced, namely spatial loading $(K/M)$ and temporal loading $(K/T)$.

In addition to all the assumptions on system setup listed before, it was assumed that perfect power control was performed in the uplink (so all $\beta_k = 1$). In all the analyses, reference bit resolution $b_{ref}$ was set to 2.

For the first set of results, $\alpha$ and $SNR$ were swept together with $b$. Additionally, $M = 100$, $\tau = K$, $K/T = 0.01$ [users/coherence time], $K/M = 0.1$ [users/antenna]. Results are shown in Fig. 4. Optimal energy efficiency points are denoted by the circular marker.

Results indicate that, as power consumption of ADCs becomes comparable to power consumption of all the other blocks, from an energy efficiency point of view it is beneficial to use lower bit resolution. However, in practical system designs it is reasonable to expect that ADC power consumption is only a small fraction of the total power consumption when ADC resolution is low.

Just to provide an illustrative example, the BS power model presented in [16] was used with the parameters listed above (additionally, system bandwidth was assumed to be 20 MHz) and yielded $P_{rest} = 43.3$ W. On the other hand, at $b_{ref} = 2$, using a correction factor $\Omega = 100$, the ADC power consumption model described above gave $2\Omega P_{ADC} = 3 \text{ mW}$, resulting in $\alpha = 1.5 \times 10^4$. While this is by no means a definitive power number, it serves to illustrate what are reasonable orders of magnitude for $\alpha$.
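The order of magnitude quoted for $\alpha$ can be reproduced with a few lines of arithmetic. The sketch below assumes that the quoted 3 mW is the total reference ADC power across all chains, i.e. the denominator $2MP_{ADC, ref}$ of the architecture factor; this is an interpretation of the text, adopted because it reproduces the stated value of roughly $1.5 \times 10^4$.

```python
def architecture_factor(p_rest, p_adc_ref_total):
    """alpha = P_rest / (2 M P_ADC,ref), with the denominator given as a total."""
    return p_rest / p_adc_ref_total


def total_power(m, p_adc, alpha, p_adc_ref):
    """P_tot = 2 M (P_ADC + alpha * P_ADC,ref), powers per single ADC."""
    return 2 * m * (p_adc + alpha * p_adc_ref)


p_rest = 43.3                  # W, from the illustrative example in the text
p_adc_ref_total = 3e-3         # W, assumed to be the total over all 2M ADCs
m = 100

alpha = architecture_factor(p_rest, p_adc_ref_total)
p_adc_ref = p_adc_ref_total / (2 * m)

# At b = b_ref the total must equal P_rest plus the total reference ADC power.
p_tot_ref = total_power(m, p_adc_ref, alpha, p_adc_ref)
print(round(alpha), p_tot_ref)
```

The computed factor is about $1.44 \times 10^4$, consistent with the rounded value in the text, and the total power at $b_{ref}$ is dominated almost entirely by $P_{rest}$.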

Some other interesting insights can be drawn from this result. A system using MRC proves to be quite insensitive to changes in SNR and $b$, indicating that the overwhelmingly dominant impairment is interuser interference and that varying the ADC resolution will not have a considerable impact on the energy efficiency. If ZF is used, the dynamics are much more pronounced: going from a system design with a large SNR and large $\alpha$ (a “wasteful” system) to one where SNR and $\alpha$ are low (a more “economical” system) allows for choosing ADCs with smaller resolutions. Nevertheless, all systems with a “reasonable” $\alpha$ (say $10 \cdot 10^3$) should use ADCs with resolutions in the range 4 - 10 bits.

In order to focus more on the improvements and degradations of energy efficiency when using different ADC resolutions, we turn to a different analysis where spatial load $K/M$ and $M$ are swept together with $b$, with $SNR = 0 \text{ dB}$, $K/T = 0.01 \text{ [users/coherence time]}$, $\tau = K$ and $\alpha = 10^4$. Results are shown in Fig. 5.
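The loading parameters determine the remaining system dimensions once $M$ is fixed. As a small worked example, the sketch below derives $K$, $T$ and $\tau$ from $M$, $K/M$ and $K/T$ under the stated choice $\tau = K$; the helper name `system_dimensions` and the rounding to integers are assumptions made for illustration.

```python
def system_dimensions(m, spatial_load, temporal_load):
    """Return (K, T, tau, pre-log factor) for given M, K/M and K/T, with tau = K."""
    k = round(m * spatial_load)        # number of users
    t = round(k / temporal_load)       # coherence interval in symbols
    tau = k                            # minimum-length orthogonal pilots
    return k, t, tau, (t - tau) / t


k, t, tau, prelog = system_dimensions(m=100, spatial_load=0.1, temporal_load=0.01)
print(k, t, tau, prelog)
```

For $M = 100$, $K/M = 0.1$ and $K/T = 0.01$ this gives $K = 10$ users, a coherence interval of $T = 1000$ symbols and a pre-log factor of 0.99, so training overhead is small at this temporal loading.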

What these results show is that going from optimal ADC resolution to a very low one can incur a substantial degradation of the energy efficiency (for ZF processing and with assumed values of system parameters, up to 5.5 times). This is due to sumrate being degraded while the overwhelming power consumption of other blocks “drowns” the coincident savings in power consumption of the ADCs. Another interesting observation is that, in the ZF case, increasing the number of antennas can help recover the energy efficiency lost by going to lower bit resolutions.
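The "drowning" effect follows directly from the total power expression $P_{tot} = 2M(P_{ADC} + \alpha P_{ADC, ref})$. Writing $r = P_{ADC}/P_{ADC, ref}$, the total power is proportional to $r + \alpha$, so the ratio of energy efficiencies between two resolutions depends only on the sumrate ratio and on the two values of $r$. The sketch below uses hypothetical values of $r$ and of the sumrate ratio, since the ADC power model of [14] is not reproduced here.

```python
def max_power_saving(alpha, r=1.0):
    """Largest relative drop in P_tot if ADC power fell from r * P_ref to zero."""
    return r / (r + alpha)


def efficiency_ratio(rate_ratio, alpha, r_from, r_to):
    """eta_to / eta_from, with P_tot proportional to (r + alpha)."""
    return rate_ratio * (r_from + alpha) / (r_to + alpha)


alpha = 1e4
saving = max_power_saving(alpha)                 # starting from b_ref, r = 1
# Hypothetical move to a lower resolution: ADC power ratio 50 -> 1,
# sumrate reduced to half of its original value.
ratio = efficiency_ratio(0.5, alpha, r_from=50.0, r_to=1.0)
print(saving, ratio)
```

With $\alpha = 10^4$, even eliminating the ADC power entirely would reduce the total power by only about 0.01 %, so any noticeable loss in sumrate translates almost one-to-one into a loss of energy efficiency.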

Finally, we take a look at the interplay between the channel estimation length and $b$ in the context of energy efficiency. We analyze a system with $K/M = 0.1 \text{ [users/antenna]}$, while varying $b$, SNR and training length. Architecture parameter $\alpha$ is again fixed to $10^4$. What is plotted is the normalized training length $\tau/T$ that maximizes the energy efficiency; results are shown in Fig. 6. The main takeaways are that ZF is much more sensitive to quantization noise during training; even in the case of high temporal loading (indicating fast fading), when there is little room to spare for channel estimation, it is beneficial to train the system longer than the minimum required time in order to compensate for the effects of quantization. The effect becomes more pronounced as the fading becomes slower and channel estimation is not so costly in terms of time. On the other hand, we see that the system using MRC is so overwhelmed by interuser interference that additional training does little to improve the energy efficiency.

VII. CONCLUSION

A parameterized analysis of energy efficiency in the uplink of a MaMi system with varying ADC bit resolutions at the base station has been performed. System setup and models have been chosen with the aim of being close to practical system implementations. Initial results (obtained by simulations) indicate that using ADCs with very low bit resolutions is not an optimal approach from an energy efficiency point of view, except for highly specific system architectures. Instead, for a wide variety of systems, ADCs with intermediate bit resolutions (4 - 10 bits) are shown to maximize system energy efficiency. Additionally, it was shown that systems using MRC uplink processing are quite insensitive to changes in ADC bit resolution, due to interuser interference being the prime source of impairments in such systems. On the other hand, systems using ZF processing (in addition to showing overall superiority in terms of energy efficiency compared to MRC-based systems) are shown to be rather sensitive to changes in bit resolution.

REFERENCES