APPLICATION OF COMPRESSIVE SENSING TOTHE DESIGN OF WIDEBAND SIGNAL ACQUISITION RECEIVERS
John Treichler
Applied Signal Technology, Inc.Sunnyvale, California
Mark Davenport, Richard Baraniuk
Rice UniversityHouston, Texas
ABSTRACT
Compressive sensing (CS) exploits the sparsity present inmany signals to reduce the number of measurements neededfor digital acquisition. With this reduction would come, intheory, commensurate reductions in the size, weight, powerconsumption, and/or monetary cost of both signal sensors andany associated communication links. This paper examines theuse of CS in environments where the input signal takes theform of a sparse combination of narrowband signals of unknown frequencies that appear anywhere in a broad spectralband. We formulate the problem statement for such a receiverand establish a reasonable set of requirements that a receivershould meet to be practically useful. The performance of aCS receiver for this application is then evaluated in two ways:using the applicable (and still evolving) CS theory and usinga set of computer simulations carefully constructed to compare the CS receiver against the performance expected froma conventional implementation. This sets the stage for work in a sequel that will use these results to produce comparisonsof the size, weight, and power consumption of a CS receiveragainst an exemplar of a conventional design.
1. INTRODUCTION
Compressive sensing (CS) [1–3] exploits the sparsity presentin many signals to reduce the number of measurementsneeded for acqusition. It has been shown theoretically that,under the right set of circumstances, CS can dramaticallyreduce the number of measurements needed to detect, characterize, and/or extract signals, and therefore can reduce bythe same factor the storage and/or transmission rate neededto handle the signal at the sensing point. Conversely, signalswith much larger bandwidths could be accepted by existingacquisition systems. If these reductions were found to proportionally reduce the size, weight, and power consumption(SWAP) and cost of operational signal acquisition systems,then the practical impact could be transformative.This paper examines the potential practicality of CS tobuild signal acquisition receivers for the speciﬁc, but important, case where the receiver’s input signal takes the form of a sparse combination of narrowband signals of unknown fre
FrontEndSensor Forwarding Link OutputsBackEndProcessor
Fig. 1
. A wideband signal acquisition receiver.quencies that can appear anywhere in a broad spectral band.Our objective here is to examine the speciﬁc application of the acquisition receiver, thus providing the opportunity to testthe robustness of CS techniques to imperfect match with itsunderlying theoretical assumptions.Section 2 formulates the problem statement for the signalreceiver and establishes a set of requirements that a receivershould meet to be highly attractive for practical use. Section 3 applies the existing CS theory to determine how well aCSbased receiver should work in this case, while Section 4presents the interim results of a set of simulationbased evaluations designed to measure the expected performance of aCSbased receiver in comparison with theory and with a conventional receiver design. These results are discussed in Section 5, and recommendations for additional study and investigation appear in Section 6.
2. PUTATIVE REQUIREMENTS
Our objective in this paper is to explore the attributes and capabilities of CS by examining how it might be applied to meeta speciﬁc set of requirements. The particular application addressed is a wideband radio frequency (RF) signal acquisitionreceiver, a device commonly used in both commercial andmilitary systems to monitor a wide band of radio frequencies for the purposes of
(i)
detecting the presence of signals,
(ii)
characterizing them, and, where appropriate,
(iii)
extracting a speciﬁc signal from the several that might be presentwithin that band. A highlevel system diagram incorporatingsuch a receiver is shown in Figure 1. While many types of acquisition receivers have been designed, built, and sold overthe years, we will choose here a set of putative requirementsfor such a receiver so that comparisons and analysis can bedone. The reader is free to, and in fact invited to, choose1
Table 1
. A putative set of speciﬁcations for an advanced RFsignal acquisition receiver.Instantaneous bandwidth
B
500 MHzInstantaneous dynamic range
D
96 dBSNR degradation/noise ﬁgure
NF
12 dBMaximum signal bandwidth
W
200 kHzother operational parameters and repeat the comparison.The attributes that characterize an acquisition receivertypically fall into two categories: technical speciﬁcations —such as instantaneous bandwidth — and various “costs”—such as SWAP and monetary cost. In this paper we willaddress only the few most important technical speciﬁcations:
•
Instantaneous bandwidth
— the RF range over whichsignals will be accepted by the receiver and handledwith their full ﬁdelity;
•
Instantaneous dynamic range
— the ratio of the maximum to minimum signal power level with which received signals can be handled with full ﬁdelity;
•
SNR degradation
— sometimes termed “noise ﬁgure”,a measure of the tendency of the receiver to lower theinput signaltonoise ratio (SNR) of a received signal,usually measured in dB. The root cause of this degradation has historically depended on the technology usedto build the receiver.
•
Maximum signal bandwidth
— the maximum combined bandwidth of the constituent signals in the acquisition bandwidth of the receiverThese requirements must be met subject to many constraints,including, at least, SWAP and monetary cost. There arealso typically systemlevel constraints, such as the bandwidthavailable for communicating what the receiver has discoveredto other assets or a central processing facility.Historically RF signal acquisition receivers were ﬁrstbuilt using purely analog technology, then, more recently,with analog technology conditioning the signal environmentsufﬁciently to employ a highrate analogtodigital converter(ADC) followed by digital processing, storage, and/or transmission. When and if it can be applied, CS offers the promiseto
(i)
increase the instantaneous input bandwidth,
(ii)
lowerall of the cost attributes, and
(iii)
move the computationallyintensive portions of the acquisition process away from thesensor and toward a central processing facility.For the purposes of the comparisons to be made in thispaper, we will assume a set of requirements, listed in Table 1,for an acquisition system that are quite audacious and wouldstress, at the least, conventional implementations. To meetthe bandwidth and dynamic range requirements, conventionaldesigns would typically be forced to use techniques based on
Sensor OutputState of RandomizedSensor DetectionCharacterizationExtractionGeolocationSupportBackEndProcessor FrontEndCompressiveSensor
Fig. 2
. The “processing asymmetry” assumed in a CS signalacquisition receiver.scanning narrowband receivers across the band. If CSbasedsystems can be shown to work in such cases without the needfor scanning at the receiver, then they would have broad application.We make two last, but important, assumptions:1.
Signal sparsity
— In order to meet the ﬁrstorder assumption of all CS techniques that the input signal besparse in some way, we assume that the sum of thebandwidths of all signals present in the full acquisitionband, which we denote
W
, is no more than 200 kHz.Note that this is signiﬁcantly smaller than the instantaneous bandwidth
B
of 500 MHz. Thus we are assuming that the RF input to the receiver is quite sparse inthe frequency domain (the instantaneous bandwidth isonly
1
/
2500
occupied in this case). While inputs withthis level of spectral sparsity are not common, they exist often enough to make a solution useful if it can befound. To test the impact of the sparsity assumption forthis application, we will evaluate the performance, boththeoretical and in simulation, for both the case wherethe input is noisefree, in which case the input signal istruly sparse, and in the more practical case where theinput is contaminated with additive white noise.2.
Processing asymmetry
— Our presumption is that thereis no cost to the computation done to process the output of the compressive sensor. Our objective is to minimize all costs, including the transmission bandwidthbetween the sensor and the processor, and, for the purposes of this paper, we are prepared to do as muchprocessing as needed to detect, characterize, and/or recover the signal of interest. This assumed separation of functions is illustrated in Figure 2.
3. COMPRESSIVE SENSING AND ITSAPPLICATION TO WIDEBAND RADIO RECEIVERS3.1. CS theory
In laying out the putative requirements in Section 2, we assumed that the realvalued, continuoustime signal of interest,which we will denote by
x
(
t
)
, has instantaneous bandwidth
B
and maximum signal bandwidth
W
. Thus, each segment of 2
x
(
t
)
of duration
T
= 1
seconds has the Fourier representation
x
(
t
) =
Ψ
(
α
) =
2
B
−
1
k
=0
α
k
ψ
k
(
t
)
,
(1)where
ψ
k
(
t
) =
e
j
2
πkt/
(2
B
)
are the Fourier basis functionsand where
α
= [
α
0
,α
1
,...,α
2
B
−
1
]
is a complexvaluedvector of length
2
B
that has
2
W
nonzeros corresponding tothe active frequencies of the constituent signals in the acquisition bandwidth
B
.
1
For the moment, we assume that
α
k
= 0
exactly outside of these
2
W
nonzeros, in which case we saythat the vector
α
is
2
W

sparse
. Finally, note that while wefocus on Fouriersparse signals in this paper, the CS theory isgeneral and extends to signals sparse in arbitrary bases.The ShannonNyquist sampling theorem tells us that
2
B
uniform samples of
x
(
t
)
per
T
= 1
second contain all of the information in
x
(
t
)
. Our goal in CS is to do better: toacquire
x
(
t
)
via a set of
2
B/Q
measurements with
Q
≥
1
aslarge as possible. The
subsampling factor
Q
strongly affectsthe various costs (i.e., SWAP and monetary cost) described inSection 2. Observe that if the locations of the
2
W
nonzerosof
α
are known
a priori
, then by ﬁltering and decimation, wecoulddrive
Q
aslargeas
Q
f
=
B/W
. OuraimistoshowthatCSbased techniques will enable us to acquire
x
(
t
)
via a setof
2
B/Q
ﬁxed, nonadaptive, linearmeasurementsthatrequireno
a priori
knowledge of the locations of the nonzeros of
α
,with
Q
nearly as large as
Q
f
.Towards this end, we take the measurements
y
=
Φ
(
x
(
t
)) +
e
,
(2)where
Φ
isalinear
measurementoperator
thatmapscontinuoustime functions deﬁned on the time interval
t
∈
[0
,
1]
to alength
2
B/Q
vector
y
of measurements, and where
e
is alength
2
B/Q
vector that represents
measurement noise
generated by the acquisition hardware. The central theoreticalquestion in CS is how to design the measurement operator
Φ
to ensure that we will be able to recover the signal
x
(
t
)
fromthe measurements
y
. While there are many approaches tosolving this problem, the most common method is to split thisquestion into two parts:
(i)
What properties of
Φ
will ensurethat there exists
some
algorithm that can recover
x
(
t
)
from
y
? and
(ii)
What algorithms can perform this recovery in anefﬁcient manner?The answer to the ﬁrst question is rather intuitive. Whilealternative properties have been studied, the majority of thework in CS assumes that the measurement operator
Φ
satisﬁes the socalled
restricted isometry property
(RIP) [4]. Inour setting, where
x
(
t
)
has a
2
W
sparse representation withrespect to the basis
Ψ
, the RIP requires that there exists a
1
We set the segmentlength to
T
= 1
second for notational convenience;for snippets of other lengths one can merely replace
B
and
W
below with
BT
and
WT
,
mutatis mutandis
.
constant
δ
∈
(0
,
1)
such that
√
1
−
δ
≤
Φ
(
Ψ
(
α
))
−
Φ
(
Ψ
(
β
))
2
α
−
β
2
≤√
1 +
δ,
(3)holds for all
2
W
sparse
α
and
β
. In words,
ΦΨ
preservesthe Euclidean distance between vectors that are
2
W
sparse.Intuitively, the RIP means that for any particular set of measurements
y
, there exists at most one possible
2
W
sparse signal consistent with these measurements, since if there weretwo distinct
2
W
sparse signals that mapped to the same set of measurements, then this would violate the lower inequality of (3). In principle, this means that it should be possible (in thenoisefree setting at least) to exactly recover any
2
W
sparsesignal. Furthermore, if a small amount of measurement noiseis added to the measurements
y
as in (2), then the RIP provides a guarantee that any
2
W
sparse signal consistent withthe perturbed measurements will be close to the srcinal signal, and so the RIP ensures that the system has a degree of stability and robustness to measurement noise.We now consider how to design an operator
Φ
satisfyingthe RIP. An important result from the CS theory is that for anygiven basis
Ψ
(not just the Fourier basis we focus on here),if we draw a random matrix
R
of size
2
B/Q
×
2
B
whoseentries
r
ij
are independent realizations from a Gaussian,Rademacher (
±
1
valued), or more generally, any bounded,zeromean distribution, then with overwhelmingly high probability
Φ
=
R
Ψ
−
1
will satisfy (3) for
2
W
sparse signalsprovided that
Q
≤
Q
c
=
κ
0
Q
f
ln
Q
f
(4)where
κ
0
<
1
is a constant that depends on
B
and the probability with which (3) holds [5]. From this we concludethat CSbased measurement operators
Φ
pay a small penalty,quantiﬁed by
κ
0
/
ln(
Q
f
)
, for not exploiting any
a priori
knowledge of the locations of the nonzero frequencies.In general, the theoretical analysis is somewhat lacking interms of the precise value of
κ
0
. For instance, an asymptoticanalysis of the results in [5] would suggest that
κ
0
≈
1
/
50
would be sufﬁcient, but this value seems to be far more conservative than what is required in practice. Thus, if a speciﬁcvalue for
κ
0
is required, one must determine this value experimentally. This is typically accomplished via Monte Carlosimulations that identify how many measurements are sufﬁcient to ensure exact recovery in the noisefree setting on atleast, say, 99% of trials. As an example, it is shown in [6]that
Q <
≈
0
.
6
Q
f
/
ln(
Q
f
)
is sufﬁcient to enable exact recovery in the noisefree setting. Note that this is not the sameas demonstrating that
Q <
≈
0
.
6
Q
f
/
ln(
Q
f
)
is sufﬁcient toensure that
Φ
satisﬁes the RIP, but it is highly suggestive thatthe true value of
κ
0
is much greater than the conservative estiamtes provided by the theory. We will observe this phenomenon for ourselves in Section 4.Since the random matrix approach is somewhat impractical to build in hardware, several hardware architectures have3
Pseudorandom Number Generator
Seed
Integrator SampleandHold
y
[
n
]
Quantizer
Fig. 3
. Random demodulator for obtaining compressive measurements of analog signals.been implemented and/or proposed that enable compressivesamples to be acquired in practical settings. Examples include the random demodulator [6], random ﬁltering [7], andrandom convolution [8,9].We brieﬂy describe the random demodulator as an example of such a system; see Figure 3 for a block diagram. Thefour key components are a pseudorandom
±
1
“chipping sequence”
p
c
(
t
)
operating at the Nyquist rate
2
B
or higher, alow pass ﬁlter, represented by an ideal integrator with reset,and a lowrate ADC consisting of a sampleandhold circuitand a quantizer. The input analog signal
x
(
t
)
is modulated bythe chipping sequence and integrated. The output of the integrator is sampled and quantized, and the integrator is reset after each sample. The output measurements from the sampleandhold circuit are then quantized. The subsampling factor
Q
c
determines the sampling rate
C
at which the sampleandhold circuit and quantizer operate. Speciﬁcally, we must have
C
≥
2
B/Q
c
.Mathematically, the linear measurement operator
Φ
of the random demodulator can be decomposed into the product
Φ
=
R
Ψ
−
1
, where
R
is a matrix of size of size
2
B/Q
×
2
B
.The resulting
R
matrix, while randomized, typically containssome degree of structure. For example, the random convolution architecture [8,9] endows
R
with a Toeplitz structure.While theoretical analysis of structured random matrices remains a topic of active study in the CS community, there doexist theoretical guarantees for some architectures [6,8]. Theamount of subsampling possible (
Q
c
) with these constructions is generally consistent with fully random measurementsas given in (4), although the denominator is sometimes raisedto a small power (e.g., 2 or 4) for provability.
3.2. CS recovery algorithms
We now address the question of how to recover the signal
x
(
t
)
from the measurements
y
. The srcinal CS theory proposed
1
minimization as a recovery technique when dealing withnoisefree measurements [1,2]. Noisy measurements as in(2) can be easily handled using similar techniques providedthat the noise
e
is bounded, meaning that
e
2
≤
. In thiscase, assuming that
Φ
=
R
Ψ
−
1
satisﬁes the RIP, in whichcase we can write
y
=
R
Ψ
−
1
(
Ψ
(
α
)) =
Rα
, the convexprogram
α
= argmin
α
α
1
s.t.
Rα
−
y
2
≤
(5)can recover any sparse signal
α
. Thus by setting
x
(
t
) =
Ψ
(
α
)
, we can recover
x
(
t
)
. More speciﬁcally, this is madeprecise in [10] which establishes that for
2
W
sparse signals,the recovery error can be bounded by
x
(
t
)
−
x
(
t
)
2
≤
κ
1
,
(6)where
κ
1
≥
2
is a constant that depends on the subsamplingfactor
Q
. This constant is similarly difﬁcult to determine theoretically, but in practice it should be very close to 2 providedthat
Q < Q
c
. Thus, measurement noise has a controlled impact on the amount of noise in the reconstruction. A similarguarantee can be obtained for approximately sparse, or
com pressible
, signals
2
.While convex optimization techniques like (5) are powerful methods for CS signal recovery, there also exist a varietyof alternative algorithms that are commonly used in practiceand for which performance guarantees comparable to that of (6) can be established. In particular, iterative algorithms suchas CoSaMP, iterative hard thresholding (IHT), and variousother thresholding algorithms are known to satisfy similarguarantees to (6) [11–13]. Most of these algorithms are builton similar techniques and can be easily understood by breaking the recovery problem into two separate subproblems:identifying the locations of the nonzero coefﬁcients of
α
and estimating the values of the nonzero coefﬁcients of
α
.The former problem is clearly somewhat challenging, butonce solved, the latter is relatively straightforward and can besolved using standard techniques like least squares. In particular, again using the factorization of
Φ
=
R
Ψ
−
1
, supposethat we have identiﬁed the indices of the nonzero coefﬁcientsof
α
(its
support
), denoted by the index set
J
, and let
R
J
denote the submatrix of
R
that contains only the columnscorresponding to the index set
J
. Then an optimal recoverystrategy is to solve the problem:
α
= argmin
α
y
−
R
J
α
2
,
(7)which is a standard leastsquares problem that can be solvedvia the pseudoinverse of
R
J
, denoted
R
†
J
:
α
=
R
†
J
y
= (
R
T J
R
J
)
−
1
R
J
y
.
(8)Note that in the noisefree case, if
α
is
2
W
sparse and thesupport estimate
J
is correct , then
y
=
R
J
α
, and so plugging this into (8) yields
α
=
α
. Thus, the central challenge
2
By compressible, we mean signals that are well approximated by asparse signal. The guarantee on the recovery error for compressible signals issimilar to (6) but includes an addititional additive component that quantiﬁesthe error incurred by approximating
x
(
t
)
with a sparse signal. Therefore, if a signal is very close to being sparse, then this error is negligible, while if asignal is not sparse at all, then this error can be quite large.
4
in recovery is to identify the correct locations of the nonzeros. CoSaMP and many related algorithms solve this problemby iterativelyidentifying likely nonzeros, estimating theirvalues, and then improving the estimate of which coefﬁcients arenonzero.
3.3. Noise folding and CS measurements
Most of the CS literature focuses on ensuring the stability of the CS acquisition and recovery process in the face of
measurement noise
e
in (2) [10–13]. In this section, we take acareful look at the effect of
signal noise
that is present in thesignal before acquisition. Speciﬁcally, we now assume thatthe Fourier representation of
x
(
t
)
consists of a
2
W
sparsesignal corrupted by additive Gaussian noise
n
. We will assume that the noise
n
is zeromean and spectrally white withcovariance matrix
Σ
n
=
σ
2
I
2
B
and added across the entire
2
B
dimensional Fourier spectrum
α
and not just the
2
W
nonzeros of
α
. Thus, we acquire the measurements
y
=
Φ
(
Ψ
(
α
+
n
)) =
Rα
+
Rn
.
(9)The noise situation is subtly different from (2), because thenoise in the measurements
y
is now scaled by the matrix
R
.Our chief interest here is to understand how
R
impacts thesignal noise and how it manifests itself in the ﬁnal recoveredsignal.We ﬁrst observe that
Rn
is zero mean with covariancematrix
Σ
Rn
=
σ
2
RR
T
. To further simplify this expression,we make some relatively intuitive assumptions concerning
R
:
(i)
each column of
R
has norm (energy) approximately equalto
1
,
3
(ii)
the rows of
R
have approximately equal norm, and
(iii)
the rows of
R
are orthogonal. The ﬁrst assumption simply means that the columns of
R
are weighted equally, whichmeans that the noise
n
k
on each coefﬁcient
α
k
will contributeroughly an equal amount to the total noise. If the locationsof the nonzeros were known
a priori
, then we could do better by assigning more weight to the columns corresponding to
α
k
= 0
and less weight to columns corresponding to
α
k
= 0
(which are purely noise). However, in the absence of thisknowledge, if we want the impact of the noise to be independent of the location of the nonzeros, then there is no alternative but to weight the columns equally. Similarly, the secondassumptionrequiresthateachmeasurementshouldhaveequalweight. Finally, the third assumption is not strictly necessary,butitseemsreasonablethatifwewishtotakeasfewmeasurements as possible, then each measurement should provide asmuch new information about the signal as possible. Note thatthese assumptions hold for randomly generated
R
matrices,as well the
R
matrices corresponding to more practical architectures such as the random demodulator [6].By combining
(i)
and
(ii)
, we can infer that each row of
R
should have norm of approximately
2
B/
(2
B/Q
) =
√
Q
.
3
Note that this property is implied by the RIP by considering signals withonly one nonzero coefﬁcient.
Combining this with
(iii)
, we obtain the approximation
RR
T
≈
Q
I
2
B/Q
.
(10)Thus, the covariance matrix of
Rn
is approximately
Σ
Rn
≈
RR
T
(
σ
2
I
2
B
) =
σ
2
Q
I
2
B/Q
.
(11)From (11) we see that the noise
Rn
is spectrally white butampliﬁed by the subsampling factor
Q
. This makes senseintuitively, since we are projecting all of the noise in the
2
B
dimensional input signal (
α
or equivalently
x
(
t
)
) downinto the
2
B/Q
dimensional measurements
y
, and all of thenoise power must be preserved. In the literature, this effect isknown as
noise folding
.Noise folding has a signiﬁcant impact on the amount of noise present in CS measurements. Speciﬁcally, when acquiring a
2
W
sparse signal, as we double the subsampling factor
Q
(a one octave increase), the signaltonoise ratio (SNR) of the measurements
y
decreases by 3dB. In words,
for the acquisition of a noisy signal of ﬁxed sparsity, the SNR of the CSmeasurements decreases by 3 dB for every octave increase inthe subsampling factor
.We note that alternative signal acquisition techniques like
bandpass sampling
(sampling a narrowband signal uniformlyat a subNyquist rate to preserve the values but not the locationsofitslargeFouriercoefﬁcients)areaffectedbyanidentical 3dB/octave SNR degradation. However, in practice bandpass sampling suffers from the limitation that it is impossibleto determine the srcinal component center frequencies, andfurthermore, if there are multiple narrowband signals presentthen it causes irreversible aliasing, in which case the components can overlap and will be impossible to separate. Incontrast to bandpass sampling, however, CS acquisition preserves sufﬁcient information to enable the recovery of boththe values and the locations of the large Fourier coefﬁcients.
3.4. Noise folding and CS reconstruction
We now examine the impact of noise folding on CS signalreconstruction. Rather than directly analyzing a particularreconstruction algorithm, we will instead consider the performance of an
oracle
based recovery algorithm that has perfectknowledge of the true location of the
2
W
nonzeros of
α
,which we have denoted by
J
. While an oracle is clearly impractical, it characterizes the best that we can hope to achieveusing any practical algorithm. And, in fact, we ﬁnd thatpractical algorithms like CoSaMP typically perform almostas well as the oraclebased recovery algorithm.Given
J
, the optimal estimate of
α
is given by (8), resulting in
α
=
α
+
R
†
J
Rn
.
(12)Thus, we recover the signal
α
plus the additive noise nowscaled by the matrix
R
†
J
R
. To understand the properties of 5