Erik G. Larsson
N.B.: When citing this work, cite the original article.
©2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Erik G. Larsson, MIMO Detection Methods: How They Work, 2009, IEEE signal processing magazine, (26), 3, 91-95.
http://dx.doi.org/10.1109/MSP.2009.932126
Postprint available at: Linköping University Electronic Press
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-21997
IEEE SIGNAL PROCESSING MAGAZINE [91] MAY 2009
[lecture NOTES ]
D igital Object Identifier 10.1109/MSP.2009.932126I
n communications, the receiver often observes a linear superposition of separately transmitted informa-tion symbols. From the receiver’s perspective, the problem is then to separate the transmitted symbols. This is basically an inverse problem with a finite-alphabet constraint. This lecture will present an accessible overview of state-of-the-art solutions to this problem. RELEVANCE
The most important motivating applica-tion for the discussion here is receivers for multiple-antenna systems such as multiple-input, multiple-output (MIMO), where several transmit antennas simul-taneously send different data streams. However, essentially the same problem occurs in systems where the channel itself introduces time- or frequency-dis-persion, in multiuser detection, and in cancellation of crosstalk.
PREREQUISITES
General mathematical maturity is required along with knowledge of basic linear algebra and probability.
PROBLEM STATEMENT
Concisely, the problem is to recover the vector s [R n from an observation of the form
y 5Hs 1e , y [R m
, (1)
where H [R m 3n is a known (typically, estimated beforehand) channel matrix and e [R m
represents noise. We assume that e ,N 10, s I 2. The ele-ments of s , say s k , belong to a finite alphabet S of size |S |. Hence there are
|S |n possible vectors s . For simplicity of
our discussion, we assume that all quantities are real-valued. This is most-ly a matter of notation, since C n is iso-morphic to R 2n . We also assume that m $n , that is, (1) is not underdeter-mined, and that H has full column rank. This is so with probability one in most applications. We also assume that H has no specific structure. If H has structure, for example, if it is a Toeplitz matrix, then one should use algorithms that can exploit this structure. We want to detect s in the maxi m um-likelihood (ML) sense. This is equivalent to
The problem: min s [S n
7y 2Hs 7. (2) Problem (2) is a finite-alphabet-con-strained least-squares (LS) problem,
which is known to be nondeterministic
polynomial-time (NP)-hard. The compli-cating factor is of course the constraint s [S
n
, otherwise (2) would be just clas-sical LS regression.
SOLUTIONS
As a preparation, we introduce the
QL-decomposition of H : H 5QL , where
Q [R m 3n is orthonormal (Q T Q 5I ), and L [R n 3n
is lower triangular. Then
7y 2Hs 7257QQ T 1y 2Hs 272
171I 2QQ T 21y 2Hs 272
57Q T y 2Ls 72
171I 2QQ T 2y 72,
where the last term does not depend on s .
It follows that we can reformulate (2) as Equivalent problem: min s [S n
7y |2Ls 7,
where y |!Q T y (3)or, in yet another equivalent form, as
min 5s 1,c , s n 6
s k P S
5f 11s 121f 21s 1, s 22
1c 1f n 1s 1,c , s n 26,
where
f k 1s 1,c , s k 2!a y |
k 2a k
l 51L k , l s l b 2
. (4)Problem (4) can be visualized as a decision tree with n 11 layers, |S | branches emanating from each nonleaf node, and |S |n leaf nodes. See Figure 1. To any branch, we associate a hypotheti-cal decision on s k , and the branch metric f k 1s 1,c , s k 2. Also, to any node (except
the root), we associate the cumulative
m e t r i c f 11s 121c 1f k 1s 1,c , s k 2, which is just the sum of all branch met-rics accumulated when traveling to that node from the root. Finally, to each node, we associate the symbols 5s 1,c , s k 6 it takes to reach there from the root.
Clearly, a naive but valid way of solving (4) would be to traverse the entire tree to find the leaf node with the smallest cumu-lative metric. However, such a brute-force search is extremely inefficient, since there are |S |n leaf nodes to examine. We will now review some efficient, popular, but
approximate solutions to (4). ZERO-FORCING (ZF) DETECTOR
The ZF detector first solves (2), neglect-ing the constraint s [S
n
s |!arg min s [R
n
7y 2Hs 75arg min s [R
n
7y ,2Ls 75L 21y ,. (5) Of course, L 21 does not need to be explic-itly computed. For example, one can do Gaussian elimination: take s |15y |1/L 1,1,
then s |251y |22s |1 L 2,12/L 2,2, and so forth. ZF then approximates (2) by projecting each s |k onto the constellation S
Erik G. Larsson
MIMO Detection Methods: How They Work
1053-5888/09/$25.00©2009IEEE
s^k53s|k 4!arg min
s k[S
|s k2s|k|. (6)
We see that s
|5s1L21Q T e, so s| in (5) is free of intersymbol interference. This is how ZF got its name. However, u nfortunately ZF works poorly unless H is well conditioned. The reason is that the c orrelation between the noises in s|k is neglected in the projection operation (6). This correlation can be very strong, espe-cially if H is ill conditioned (the covariance matrix of s
| is S!s#1L T L2212. There are some variants of the ZF approach. For example, instead of computing s
| as in (5), one can use the MMSE estimate (take s
|5E3s|y4). This can improve perfor-mance somewhat, but it does not overcome the fundamental problem of the approach. ZF DETECTOR WITH
DECISION FEEDBACK (ZF-DF) Consider again ZF, and suppose we use Gaussian elimination to compute s
| in (5). ZF-DF [1] does exactly this, with the modification that it projects the symbols onto the constellation S in each step of the Gaussian elimination, rather than afterwards. More precisely,
1) Detect s1
via s^15arg min
s1[S
f11s12
5c y|1
L1,1
d.
2) Consider s1 known (s15s^1) and
detect s2
via s^25arg min
s2[S
f21s^1, s22
5c
y|22s^1L2,1
L2,2
d.
3)Continue for k53, . . . , n
s^k5arg min
s k[S
f k1s^1,c, s^k21, s k2
5c
y|k2S k21l51 L k, l s^l
L k, k
d.
In the decision-tree perspective,
ZF-DF can be understood as just examin-
ing one single path down from the root.
When deciding on s k, it considers
s1,c, s k21 known and takes the s k that
corresponds to the smallest branch met-
ric. Clearly, after n steps we end up at one
of the leaf nodes, but not necessarily in the
one with the smallest cumulative metric.
In Figure 2(a), ZF-DF first chooses
the left branch originating from the root
(since 1 , 5), then the right branch
(since 2 . 1) and at last the left branch
(because 3 , 4), reaching the leaf node
with cumulative metric 1111355.
The problem with ZF-DF is error
propagation. If, due to noise, an incor-
rect symbol decision is taken in any of
the n steps, then this error will propa-
gate and many of the subsequent deci-
sions are likely to be wrong as well. In
its simplest form (as explained above),
ZF-DF detects s k in the natural order,
but this is not optimal. The detection
order can be optimized to minimize the
effects of error propagation. Not sur-
prisingly, it is best to start with the sym-
bol for which ZF produces the most
reliable result: that is, the symbol s k for
which S k, k is the smallest, and then
proceed to less and less reliable sym-
bols. However, even with the optimal
ordering, error propagation severely
limits the performance.
SPHERE DECODING (SD)
The SD [2], [9] first selects a user
parameter R, called the sphere radius. It
then traverses the entire tree (from left
to right, say). However, once it encoun-
ters a node with cumulative metric
larger than R, then it does not follow
down any branch from this node. Hence,
in effect, SD enumerates all leaf nodes
w h i c h l i e i n s i d e t h e s p h e r e
7y|2Ls72#R. This also explains the
algorithm’s name.
In Figure 2(b), we set the sphere radi-
us to R5 6. The SD algorithm then tra-
verses the tree from left to right. When it
encounters the node “7” in the right sub-
tree, for which 7 . 6 5R, SD does not
follow any branches emanating from it.
Similarly, since 8 . 6, SD does not visit
[FIG1] P roblem (4) as a decision tree, exemplified for binary modulation (S = {–1, +1}, |S| = 2) and n = 3. The branch metrics f k(s1, . . ., s k) are in blue written next to each branch. The cumulative metrics f1(s1)+ . . . + f k(s1, . . . , s k) are written in red in the circles representing each node. The double circle represents the optimal (ML) decision.
IEEE SIGNAL PROCESSING MAGAZINE [92] MAY 2009
IEEE SIGNAL PROCESSING MAGAZINE [93] MAY 2009
any branches below the node “8” in the rightmost subtree.
SD in this basic form can be much improved by a mechanism called prun-ing. The idea is this: Every time we reach a leaf node with cumulative metric M , we know that the solution to (4) must be contained in the sphere 7y |2Ls 72#M . So if M , R , we can set R J M , and con-tinue the algorithm with a smaller sphere radius. Effectively, we will adaptively prune the decision tree, and visit much fewer nodes than those in the original sphere. Figure 2(c) exemplifies the prun-ing. Here the radius is initialized to R 5`, and then updated any time a leaf node is visited. For instance, when visit-ing the leaf node “4,” R will be set to R 54. This means that the algorithm will not follow branches from nodes that have a branch metric larger than four. In particular, the algorithm does not exam-ine any branches stemming from the node “5” in the right subtree.
The SD algorithm can be improved in many other ways, too. The symbols can be sorted in an arbitrary order, and this order can be optimized. Also, when traveling down along the branches from a given node, one may enumerate the branches either in the natural order or in a zigzag fashion (e.g., s k 5525, 23, 21, 21, 3, 56 versus s k 5521, 1, 23, 3, 25, 56). The SD algorithm is easy to implement, although the procedure cannot be directly parallelized. Given large enough initial radi-us R , SD will solve (2). However, depending on H , the time the algorithm takes to finish
will fluctuate, and may occasionally be very long.
FIXED-COMPLEXITY
SPHERE DECODER (FCSD)
FCSD [3] is, strictly speaking, not really sphere decoding, but rather a clever combination of brute-force e numeration and a low-complexity, approximate detec-tor. In view of the decision tree, FCSD visits all |S |r nodes on layer r , where r , 0#r #n is a user parameter. For each node on layer r , the algorithm considers 5s 1, ..., s r 6 fixed and formulates and solves the subproblem
min
5s r 11,c , s n 6s k [S
5f r 111s 1,c , s r 112
1. . .1f n 1s 1,c , s n 26. (7)
[FIG2] I llustration of detection algorithms as a tree search. Solid-line nodes and branches are visited. Dashed nodes and branches are not visited. The double circles represent the ultimate decisions on s . (a) ZF-DF: At each node, the symbol decision is based on choosing the branch with the smallest branch metric. (b) SD, no pruning: Only nodes with S n k 51f k 1s 1, . . . , s k 2#R are visited. (c) SD, pruning: Like SD, but after encountering a leaf node with cumulative metric M , the algorithm will set R :5M . (d) FCSD: Visits all nodes on the r t h layer, and proceeds with ZF-DF from these.
IEEE SIGNAL PROCESSING MAGAZINE [94] MAY 2009
[lecture NOTES ]
c ontinued
In effect, by doing so, FCSD will reach down to |S |r of the |S |n leaves. To form its symbol decisions, FCSD selects the leaf, among the leaves it has visited, which has the smallest cumulative metric f 11s 121c 1f n 1s 1,c , s n 2.
The subproblem (7) must be solved once for each combination 5s 1, ..., s r 6, that is |S |r times. FCSD does this approx-imately, using a low-complexity method (ZF or ZF-DF are good choices). This works well because (7) is overdetermined: there are n observations (y |1,c , y |n ), but only n 2r unknowns (s r 11,c , s n ). More precisely, the equivalent channel matrix when solving (7) will be a tall sub-matrix of H , which is generally much better conditioned than H .
Figure 2(d) illustrates the algorithm. Here r 51. Thus, both nodes “1” and “5” in the layer closest to the root node are visited. Starting from each of these two nodes, a ZF-DF search is performed.
Naturally, the symbol ordering can be optimized. The optimal ordering is the one which renders the problem (7) most well-conditioned. This is achieved by sorting the symbols so that the most “difficult” symbols end up near the tree root. Note that “difficult symbol” is non-trivial to define precisely here, but intui-tively think of it as a symbol s k for which S k ,k is large.
The choice of r offers a tradeoff between complexity and performance. FCSD solves (2) with high probability even for small r , it runs in constant time, and it has a natural parallel structure. Relatives of FCSD that produce soft out-put also exist [4].
SEMIDEFINITE-RELAXATION (SDR) DETECTOR
The idea behind SDR [5], [6] is to relax the finite-alphabet constraint on s into a matrix inequality and then use semidefi-nite programming to solve the resulting problem. We explain how it works, for binary phase-shift keying (BPSK) symbols (s k [5616). Define
s 2!c s 1d , S !s 2s 2T 5c s 1d 3s T 14,
C !c L T L 2L T y
,2y ,T
L
d .Then
7y |2Ls 725s T c s 17y |
725Trace 5C S 617y |72
so solving (3) is the same as finding the
vector s [S n that minimizes Trace {C S }. SDR exploits that the constraint s [S n is equivalent to requiring that rank {S } 5 1, n 1151 and diag {S } 5 {1, . . . , 1}. It then proceeds by minimizing Trace {C S } with respect to S , but relaxes the rank constraint and instead requires that S be positive semidefinite. This relaxed problem is convex, and can be effi-ciently solved using so-called interior point methods. Once the matrix S is found, there are a variety of ways to deter-mine s , for example to take the dominant eigenvector of S (forcing the last element to unity) and then project each element onto S like in (6). The error incurred by the relaxation is generally small.
LATTICE REDUCTION (LR)AIDED DETECTION
The idea behind LR [8], [9] is to trans-form the problem into a domain where the effective channel matrix is better conditioned than the original one. How does it work? If the constellation S is uniform, then S may be extended to a scaled enumeration of all integers, and S
n may be extended to a lattice S
n . For illustration, if S 5523,21, 1, 36, then S
n 55c ,23,21, 1, 3,c 63c 35c ,23,21, 1, 3,c 6. LR decides first on an n 3n matrix T that has inte-ger elements (T k ,l [Z ) and which maps t h e l a t t i c e S
n o n t o i t s e l f : Ts [S
n 4s [S
n . That is, T should be invertible, and its inverse should have integer elements. This happens precisely if its determinant has unit magnitude: |T |561. Naturally, there are many such matrices (T 56I is one trivial example). LR chooses such a matrix T for which, additionally, HT is well condi-tioned. It then computes
s
^r !arg min s 9P S n
||
y 21HT 2s r ||. (8) Problem (8) is comparatively easy, since
HT is well conditioned, and simple
methods like ZF or ZF-DF generally
work well. Once s
^r is found, it is trans-formed back to the original coordinate
system by taking s
^5T 21s ^r . LR contains two critical steps. First, a suitable matrix T must be found. There are good methods for this (e.g., see refer-ences in [8], [9]). This is computationally expensive, but if the channel H stays constant for a long time then the cost of finding T may be shared between many instances of (2) and complexity is less of an issue. The other problem is that while
the solution s ^ always belongs to S 2n
, it
may not belong to S
n . Namely, some of its elements may be beyond the borders of the original constellation. Hence a clipping-type operation is necessary and this will introduce some loss.
SOFT DECISIONS
In practice, each symbol s k typically is composed of information-carrying bits, say 5b k ,1,c , b k , p 6. It is then of interest to take decisions on the individual bits b k ,i , and often, also to quantify how reli-able these decisions are. Such reliability information about a bit is called a “soft decision,” and is typically expressed via the probability ratio
P 1b k ,i 51|
y 2
P 1b k ,i 50|
y 2
5
g s :b k ,i 1s 251 P 1s |
y 2
g s :b k ,i 1s 250 P 1s |
y 2
5
g s :b k ,i 1s 251exp a 21
||
y 2Hs ||2b P 1s 2
g s :b k ,i 1s 250exp a 21 ||
y 2Hs ||2
b P 1s 2
.
(9)
Here “s :b k ,i 1s 25b ” means all s for which the i th bit of s k is equal to b , and P 1s 2 is the probability that the transmitter sent s . To derive (9), use Bayes rule and the Gaussian assumption made on e [4].
Fortunately, (9) can be relatively well approximated by replacing the two sums in (9) with their largest terms. To find these maximum terms is a slightly modified version of (2), at least if all s
are equally likely so that P 1s 251/|S |n . Hence, if (2) can be solved, good approx-imations to (9) are available too. An even better approximation to (9) is obtained if more terms are retained, i.e.,
CONCLUSIONS
The goal of this lecture has been to pro-vide an overview of approaches to (2), in the communications receiver context. Which method is the best in practice? This depends much on the purpose of solving (2): what error rate can be toler-ated, what is the ultimate measure of performance (e.g., frame-error-rate, worst-case complexity, or average com-plexity), and what computational plat-form is used. Additionally, the bits in s may be part of a larger code word and different s vectors in that code word may
either see the same H (slow fading) or
many different realizations of H (fast
f ading). This complicates the picture,
because notions that are important in
slow fading (such as spatial diversity) are
less important in fast fading, where
diversity is provided anyway by time vari-
ations. Detection for MIMO has been an
active field for more than ten years, and
this research will probably continue for
some time.
AUTHOR
Erik G. Larsson (erik.larsson@isy.liu.se)
is a professor and head of division for
communication systems in the EE
department (ISY) of Linköping
University, Sweden. For more informa-
tion, visit www.commsys.isy.liu.se.
REFERENCES
[1] G. D. Golden, G. J. Foschini, R. A. Valenzuela,
and P. W. Wolniansky, “Detection algorithm and
initial laboratory results using V-BLAST space-time
communications architecture,” IEE Ele ctron. Le tt.,
vol. 35, pp. 14–16, Jan. 1999.
[2] E. Viterbo and J. Boutros, “A universal lattice
code decoder for fading channels,” IEEE Trans. In-
form. Theory, vol. 45, pp. 1639–12, July 1999.
[3] L. G. Barbero and J. S. Thompson, “Fixing the
complexity of the sphere decoder for MIMO detec-
tion,” IEEE Trans. Wire le ss Commun., vol. 7, pp.
2131–2142, June 2008.
[4] E. G. Larsson and J. Jaldén, “Soft MIMO detection
at fixed complexity via partial marginalization,” IEEE
Trans. Signal Proce ssing, vol. 56, pp. 3397–3407,
Aug. 2008.
[5] P. Tan and L. Rasmussen, “The application of
semidefinite programming for detection in CDMA,”
IEEE J. Select. Areas Commun., vol. 19, pp. 1442–
1449, Aug. 2001.
[6] B. Steingrimsson, Z.-Q. Luo, and K. Wong, “Soft
quasi-maximum-likelihood detection for multiple-
antenna wireless channels,” IEEE Trans. Signal
Processing, vol. 51, no. 11, pp. 2710–2719, Nov.
2003.
[7] B. M. Hochwald and S. Brink, “Achieving near-
capacity on a multiple-antenna channel,” IEEE
Trans. Commun., vol. 51, no. 3, pp. 3–399, Mar.
2003.
[8] C. Windpassinger and R. Fischer, “Low-complexi-
ty near-maximum-likelihood detection and precoding
for MIMO systems using lattice reduction,” in Proc.
IEEE Information The ory Workshop, 2003, pp.
345–348.
[9] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger,
“Closest point search in lattices,” IEEE Trans.
Inform. The ory, vol. 48, pp. 2201–2214, Aug.
2002.[SP]
[dsp TIPS&TRICKS]c ontinued from page 82 AUTHOR
Maurice Givens (Maurice.Givens@gastechnol-ogy.org) is a specialist in the research and design of digital signal p rocessing at Gas Technology Institute and an adjunct lecturer at Wright College, Chicago. He is a r egistered professional engineer, a Senior Member of the IEEE, and a senior member of NARTE.
REFERENCES
[1] R. Harris, D. Chabries, and P. Bishop, “A variable step (VS) adaptive filter algorithm,” IEEE Trans. Acoust. Speech Signal Processing, vol. ASSP-34, pp. 309–316, Apr. 1986.
[2] J. Evans, P. Xue, and B. Liu, “Analysis and implementa-tion of variable step size adaptive algorithms,” IEEE Trans. Signal Processing, vol. 41, pp. 2517–2535, Aug. 1993.
[3] T. Haweel and P. Clarkson, “A class of order statistic LMS algorithm,” IEEE Trans. Signal Processing, vol. 40, pp. 44–53, Jan. 1992.
[4] T. Aboulnasr and K. Mayyas, “A robust variable step-size LMS-type algorithm: Analysis and simulations,” IEEE Trans. Signal Processing, vol. 45, pp. 631–639, Mar. 1997.
[5] D. Pazaitis and A. Constantinides, “A novel kurtosis driven variable step-size adaptive algorithm,” IEEE Trans. Signal Processing, vol. 47, pp. 8–872, Mar. 1999.
[6] R. Kwong and E. Johnston, “A variable step size LMS algorithm,” IEEE Trans. Signal Proce ssing, vol. 40, pp. 1633–12, July 1992. [SP
]
IEEE SIGNAL PROCESSING MAGAZINE [95] MAY 2009