
Implementation and Testing of a Low-Overhead Network Synchronization Protocol

Daniel R. Kowalski, Timothy M. Christman, Andrew G. Klein
Department of Engineering and Design, Western Washington University
Bellingham, WA 98225
kowalsd2, chris70, [email protected]

Mitchell W.S. Overdick, Joseph E. Canfield
PACCAR Technical Center
Mt. Vernon, WA 98273
mitchell.overdick, [email protected]

D. Richard Brown III
Department of ECE, Worcester Polytechnic Inst.
Worcester, MA
[email protected]

Abstract: This paper describes a radio frequency implementation of precise synchronization between two software defined radios without the use of timestamps. Contemporary synchronization protocols mostly rely on exchanging digital timestamps between nodes. The finite precision of these digital timestamps limits the degree of synchronization achievable, and the additional overhead of sending timestamps reduces data throughput. Previous experimental work demonstrates that synchronization information can instead be conveyed at the physical layer through pairwise message exchanges between nodes. By forgoing the use of digital timestamps, the problem of finite precision error can be avoided and the accuracy of synchronization is limited only by fundamental bounds of delay estimation. Our implementation was successful in synchronizing two Ettus N210 Software Defined Radios to within a standard deviation of 8.18 ns.

TABLE OF CONTENTS

1. INTRODUCTION
2. TIMESTAMP-FREE SYNCHRONIZATION PROTOCOL
3. DELAY ESTIMATION FOR BANDPASS RF SIGNALS
4. PASSBAND RADIO-FREQUENCY IMPLEMENTATION
5. EXPERIMENTAL RESULTS
6. DISCUSSION
7. CONCLUSION
ACKNOWLEDGMENTS
REFERENCES
BIOGRAPHY

1. INTRODUCTION

Recently, there has been great interest in techniques such as distributed MIMO and beamforming [1]. These techniques permit multiple devices to coordinate their communication and pool their antenna resources, thereby forming a virtual antenna array. Such distributed transmission techniques require precise synchronization between devices to permit carrier phase alignment. The past few decades have seen the development of a number of synchronization protocols such as Network Time Protocol [2], the Global Positioning System [3], and several lightweight synchronization protocols for use in resource constrained sensor networks [4]. A common theme with existing protocols is that they rely on exchanging digital timestamps between devices. Finite precision error resulting from digital timestamps inherently limits the accuracy of such synchronization protocols. Furthermore, precise synchronization requires frequent exchanging of timing information. Exchanging digital timestamps at these rates can present a prohibitive amount of overhead to network channels between devices. To eliminate these issues of conventional synchronization for this application, a timestamp-free synchronization protocol was proposed that performs synchronization implicitly at the physical layer through pairwise message exchanges [5]. Theoretical analysis has shown that this leads to precise, accurate clock synchronization between two devices, and results in a lighter load on a network when compared with conventional timestamp-based synchronization approaches.

978-1-5386-2014-4/18/$31.00 ©2018 IEEE

In the past, this protocol has been used to synchronize various DSP platforms with mean errors well below the period of the carrier frequency. Previously, an implementation of this protocol at acoustic frequencies using a Texas Instruments TMS320C6713 DSK [6] and a subsequent implementation at radio frequency (RF) using an Ettus E310 software defined radio [7] were tested. While these experiments were encouraging, the former did not operate at radio frequencies, while the latter Ettus E310 implementation was only able to achieve 70 ns synchronization due to practical limitations in the architecture of the software defined radio (SDR). That is, the delay estimation had to be performed on the baseband signal after downconversion. It is well known from the Cramér-Rao lower bound that the achievable accuracy of delay estimators on such signals is limited when compared to performing delay estimation on sampled RF signals [8].

This paper describes a real-time implementation of the timestamp-free network synchronization protocol on an Ettus N210 software defined radio, in which the delay estimator operates directly on the sampled RF signal. This leads to synchronization performance that is significantly improved with respect to the previous RF implementation of this protocol in Overdick et al. [7]. Experimental results presented in this paper show that the protocol can synchronize two radios to within a standard deviation of 8.18 ns while accounting for the presence of propagation delay. This experimental study serves to demonstrate the improved synchronization performance of the timestamp-free protocol in a software radio architecture that employs RF passband signals rather than baseband signals in performing delay estimation.

Figure 1. Timestamp-free synchronization bidirectional signal exchange. (Diagram: master node and slave node timebases with clock ticks, marking $t_s^{(a)}$, $t^{(b)}$, $t^{(c)}$, $t_s^{(d)}$, the midpoint $(t_s^{(a)} + t_s^{(d)})/2$, the relation $t_s = t + \Delta_s[k]$, and the offset $\delta_s$.)

2. TIMESTAMP-FREE SYNCHRONIZATION PROTOCOL

Figure 1 shows the interactions between a slave node and the master node using the timestamp-free synchronization protocol. The time-varying clock offset at the slave node with respect to the master node is denoted as $\Delta_s[k]$, and local time at the slave node is denoted as

$$t_s = t + \Delta_s[k]$$

where $t$ is the reference time corresponding to the local clock at the master node. The timestamp-free synchronization protocol begins with the slave node transmitting a signal to the master node at arbitrary local time $t_s^{(a)}$. The signal arrives at the master node at local time

$$t^{(b)} = t_s^{(a)} - \Delta_s[k] + \tau_s$$

where $\Delta_s[k]$ is the current clock offset of the slave node with respect to the master node and $\tau_s$ is the propagation delay between the slave node and the master node. The master node then transmits a signal back to the slave node at time $t^{(c)}$, where $t^{(c)}$ is selected such that

$$\frac{t^{(b)} + t^{(c)}}{2} \pmod{T_0} = 0 \tag{1}$$

where $T_0$ is the master node clock tick period. Note that, unlike the usual sender/receiver synchronization protocol, e.g. [9], no timestamps are exchanged between the nodes. Implicit timing information is embedded in the master node’s response to the slave node by selecting $t^{(c)}$ so that a local clock tick of the master node is centered between $t^{(b)}$ and $t^{(c)}$. Assuming a reciprocal channel and transmitter/receiver chain, the slave node receives the reply signal from the master node at local time

$$t_s^{(d)} = t^{(c)} + \Delta_s[k] + \tau_s.$$

The slave node can then estimate its clock tick offset with respect to the master node by calculating

$$\delta_s = \left( \frac{t_s^{(a)} + t_s^{(d)}}{2} \right)_{T_0} \tag{2}$$

where the notation $(z)_{T_0}$ corresponds to wrapping $z$ to the interval $[-T_0/2, T_0/2)$. The offset estimate in (2) can be used directly for immediate clock offset correction at the slave node or as an input to a filtering algorithm to correct both clock offsets and drifts.
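The exchange above can be checked numerically. The following Python sketch is an illustration only, not the authors' C++ implementation: the function names and the `min_turnaround` parameter are assumptions introduced here. It simulates one noise-free exchange and shows that the wrapped midpoint in (2) recovers the slave offset modulo $T_0$, independent of the propagation delay.

```python
import math


def wrap(z, T0):
    """Wrap z to the interval [-T0/2, T0/2), the (z)_{T0} notation of (2)."""
    return (z + T0 / 2) % T0 - T0 / 2


def master_reply_time(t_b, T0, min_turnaround=0.0):
    """Earliest t_c >= t_b + min_turnaround with (t_b + t_c)/2 on a master tick, as in (1)."""
    # The midpoint must equal n*T0 for an integer n, so t_c = 2*n*T0 - t_b.
    n = math.ceil((2 * t_b + min_turnaround) / (2 * T0))
    return 2 * n * T0 - t_b


def estimate_offset(t_a_s, offset, tau, T0, min_turnaround=0.0):
    """Simulate one noise-free exchange and return the slave's estimate from (2)."""
    t_b = t_a_s - offset + tau            # arrival at the master, in master time
    t_c = master_reply_time(t_b, T0, min_turnaround)
    t_d_s = t_c + offset + tau            # arrival at the slave, in slave time
    return wrap((t_a_s + t_d_s) / 2, T0)
```

Because the propagation delay enters $t^{(b)}$ and $t_s^{(d)}$ with the same sign and the reply time depends on $-t^{(b)}$, it cancels in the midpoint. This is why the reciprocity assumption matters.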

3. DELAY ESTIMATION FOR BANDPASS RF SIGNALS

This section provides an overview of the delay estimator used in the timestamp-free synchronization protocol. Previous experimental work that reported on the accuracy of the timestamp-free protocol on RF signals used an Ettus E310 embedded software radio [7]. Due to the front-end architecture of the E310 software radio, estimation on a baseband signal was required and made use of an approach based on quadratic interpolation. Because the Ettus N210 radios employed in this paper permit direct sampling of an RF signal, we assume the synchronization pulse is a bandpass signal of the form

$$s(t) = \cos(\Omega_0 t)\, u(t)$$

where $u(t)$ is a bandlimited signal such that $U(\Omega) = 0$ for all $|\Omega| \ge \Omega_0$. The synchronization pulse in discrete time can be expressed as

$$s[\ell] = \left[ \cos(\Omega_0 t)\, u(t) \right]_{t = \ell T_s} = \cos(\omega_0 \ell)\, u[\ell]$$

where $\omega_0 = \Omega_0 T_s$ is the normalized carrier frequency in radians/sample.

The discrete-time observation with unknown delay $\tau$ can be expressed as

$$y[\ell] = \left[ \cos(\Omega_0 (t - \tau))\, u(t - \tau) \right]_{t = \ell T_s} = \cos(\omega_0 (\ell - \tau f_s))\, u(\ell T_s - \tau)$$

for $\ell = 0, \ldots, L-1$, where $L$ is the number of samples in the observation. Standard cross-correlation techniques with the template waveform $s[\ell]$ can be used to generate a quantized delay estimate $\hat{\ell} \in \mathbb{Z}$. The accuracy of this quantized delay estimate is limited, however, by the sampling rate of the delay estimator.
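As a minimal sketch of the coarse step, assuming a direct (non-FFT) correlation and a noise-free observation, the quantized delay can be taken as the lag that maximizes the magnitude of the cross-correlation with the template. The function names are illustrative and do not come from the paper's source code.

```python
def cross_correlate(y, s):
    """r[d] = sum_l y[d + l] * s[l] for every lag d at which the template fits inside y."""
    n_lags = len(y) - len(s) + 1
    return [sum(y[d + l] * s[l] for l in range(len(s))) for d in range(n_lags)]


def coarse_delay(y, s):
    """Integer-sample delay estimate: the lag with the largest correlation magnitude."""
    r = cross_correlate(y, s)
    return max(range(len(r)), key=lambda d: abs(r[d]))
```

The direct sum costs on the order of `len(y) * len(s)` multiplications per block, which is consistent with the paper's later remark that cross-correlation dominates the computation.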

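To refine the delay estimate, we define

$$s_i[\ell, \hat{\ell}] = \cos(\omega_0 (\ell - \hat{\ell}))\, u[\ell - \hat{\ell}]$$

$$s_q[\ell, \hat{\ell}] = \sin(\omega_0 (\ell - \hat{\ell}))\, u[\ell - \hat{\ell}]$$

for $\ell = 0, \ldots, L-1$, where $\hat{\ell}$ is the quantized delay estimate.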


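We can then compute

$$\begin{aligned}
z_i[\hat{\ell}] &= \sum_{\ell=0}^{K-1} y[\ell]\, s_i[\ell, \hat{\ell}] \\
&= \frac{1}{2} \sum_{\ell=0}^{K-1} \Big( \cos(\omega_0 (2\ell - \tau f_s - \hat{\ell})) + \cos(\omega_0 (\tau f_s - \hat{\ell})) \Big)\, u(\ell T_s - \tau)\, u[\ell - \hat{\ell}] \\
&\approx \frac{1}{2} \cos(\omega_0 (\tau f_s - \hat{\ell})) \sum_{\ell=0}^{K-1} u(\ell T_s - \tau)\, u[\ell - \hat{\ell}]
\end{aligned}$$

where the approximation results from the fact that $\sum_{\ell=0}^{K-1} \cos(\omega_0 (2\ell - \tau f_s - \hat{\ell}))\, u(\ell T_s - \tau)\, u[\ell - \hat{\ell}] \approx 0$.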

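Similarly, we can calculate

$$\begin{aligned}
z_q[\hat{\ell}] &= \sum_{\ell=0}^{K-1} y[\ell]\, s_q[\ell, \hat{\ell}] \\
&\approx \frac{1}{2} \sin(\omega_0 (\tau f_s - \hat{\ell})) \sum_{\ell=0}^{K-1} u(\ell T_s - \tau)\, u[\ell - \hat{\ell}].
\end{aligned}$$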

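We can then compute

$$\theta = \tan^{-1}\!\left( \frac{z_q[\hat{\ell}]}{z_i[\hat{\ell}]} \right) \approx \tan^{-1}\!\left( \frac{\sin(\omega_0 (\tau f_s - \hat{\ell}))}{\cos(\omega_0 (\tau f_s - \hat{\ell}))} \right) = \omega_0 (\tau f_s - \hat{\ell}). \tag{3}$$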

The refined delay estimate can then be computed as

$$\hat{\tau} = \left( \hat{\ell} + \frac{\theta}{\omega_0} \right) T_s$$

where $\theta$ is calculated according to (3). Note that $\hat{\ell}$ represents the integer, sample-level delay, which in the sequel we term the coarse estimate; meanwhile, $\theta / \omega_0$ represents the fractional-sample delay, so we term it the fine estimate. Again, the coarse estimate can be computed with standard cross-correlation techniques.
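The fine step can be sketched as follows, under stated assumptions: the envelope is a raised cosine (chosen here only because it is smooth; the experiments use sinc pulses), the observation is noise-free, and `math.atan2` replaces the plain arctangent of (3) so that the quadrant is resolved automatically. All names are illustrative.

```python
import math


def raised_cosine(x, center, half_width):
    """Assumed smooth envelope u, evaluated at a (possibly fractional) sample index x."""
    d = x - center
    if abs(d) >= half_width:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * d / half_width))


def refine_delay(y, coarse, w0, u):
    """Return the refined delay in samples: coarse + theta / w0, with theta as in (3)."""
    z_i = 0.0
    z_q = 0.0
    for l, sample in enumerate(y):
        env = u(l - coarse)
        z_i += sample * math.cos(w0 * (l - coarse)) * env   # in-phase correlation
        z_q += sample * math.sin(w0 * (l - coarse)) * env   # quadrature correlation
    theta = math.atan2(z_q, z_i)  # four-quadrant form of tan^{-1}(z_q / z_i)
    return coarse + theta / w0
```

Multiplying the returned value by the sample period $T_s$ gives the delay in seconds. The estimate is only as good as the approximation that the double-frequency term sums to nearly zero, which holds when the envelope varies slowly relative to the carrier.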

4. PASSBAND RADIO-FREQUENCY IMPLEMENTATION

To demonstrate the superiority of passband over baseband signals with respect to delay estimation, the timestamp-free synchronization protocol was implemented with radio-frequency signals on Ettus N210 SDRs with wired connections between devices. Unlike the Ettus E310 SDRs, which perform analog downconversion before sampling, the Ettus N210 SDRs are able to sample the incoming RF signal directly, which, according to the Cramér-Rao lower bound [8], leads to increased synchronization accuracy. To allow sufficient time for computation, a relatively low sampling frequency of 250 kHz was selected for this implementation. The USRP devices operate in block fashion: each time a receive command is issued, a block of samples is returned to the calling routine. Similarly, transmission also works in block fashion, where transmissions are scheduled in advance by passing blocks of data with a transmit command. In addition, the desired time of transmission is provided to the transmit command, and it needs to be sufficiently far in the future to prevent sample underruns. In this implementation, we used the default block size of 1000 samples and a transmission delay of fifteen blocks.
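The block timing implied by these parameters is simple arithmetic. The sketch below is not UHD code; the constant and function names are assumptions used only to make the numbers explicit.

```python
SAMPLE_RATE_HZ = 250_000   # sampling frequency stated in the text
BLOCK_SAMPLES = 1000       # default block size stated in the text
TX_DELAY_BLOCKS = 15       # transmission delay stated in the text


def block_duration_s():
    """Duration of one block of samples, in seconds."""
    return BLOCK_SAMPLES / SAMPLE_RATE_HZ


def scheduled_tx_time_s(now_s, delay_blocks=TX_DELAY_BLOCKS):
    """Transmit time obtained by scheduling a whole number of blocks into the future."""
    return now_s + delay_blocks * block_duration_s()
```

With these values a block lasts 4 ms and a fifteen-block delay is 60 ms, matching the figures quoted in the experimental section.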

Two Ettus USRP N210s were used to implement the protocol: one as the master node and a second as the slave node. In addition, a third N210 was used as a measurement device to determine the resulting clock offset. Each node’s software was implemented in C++ using the USRP Hardware Driver (UHD) provided by Ettus, and the USRPs were connected to host computers with Core i7-6700K processors. As the N210s are part of Ettus’ “networked” series of SDRs, the signal processing and logic needed to implement the synchronization protocol were run in real time on the host computers, and not on the N210 SDRs themselves. The result is a processing latency that is significantly larger than that of the previously considered stand-alone embedded E310 platform, which performs all computation on the device itself [7].

Figure 2. Diagram of the connections between the N210s used for the master node, slave node, and measurement device. Each N210 has an independent local oscillator. (Diagram labels: RX and TX daughterboards with channels RX-A, RX-B, TX-A, TX-B on each device; initiating pulse from slave; response from master; constant master clock; adjusted slave clock.)

A diagram of the timestamp-free synchronization test setup is shown in Fig. 2. Each of the three USRPs utilized two Ettus LF daughterboards, which provide access to the 14-bit ADC and 16-bit DAC on the N210s. One daughterboard was set to receive incoming signals while the other was used to transmit. Each daughterboard contains two channels; for instance, the LFTX transmitting daughterboard has channels TX-A and TX-B, while the LFRX receiving daughterboard has channels RX-A and RX-B. The basic operation is that the slave node transmits an initial pulse, which is received by the master node. Once the master node detects that it has received a pulse, it transmits a time-reversed copy of the received pulse, which is received by the slave node. In order to externally measure the accuracy of synchronization achieved, each node also transmits a clock signal on its remaining TX channel. The master node transmits its clock signal as driven by its internal oscillator, while the slave node adjusts its clock in an attempt to transmit simultaneously with the master node. In addition to being sent to the measurement node, the pulses are displayed on an oscilloscope for qualitative evaluation and debugging purposes.

Signals used in this implementation were modulated sinc pulses as described in Section 3, with a carrier frequency equivalent to half of the Nyquist sampling rate of the master and slave nodes, and with baseband bandwidth chosen to utilize a large proportion of the available spectrum while avoiding aliasing. For simplicity, the same signals were used for exchanges between the master and slave nodes and for clock pulses sent to the measurement device. Figure 3 shows a 4 ms recording of typical synchronized clock pulses transmitted by the slave and master nodes.
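A modulated sinc pulse of this kind might be generated as in the following sketch. Two values are assumptions rather than facts from the paper: the carrier is placed at a quarter of the sampling rate ($\omega_0 = \pi/2$ radians/sample, one reading of "half of the Nyquist sampling rate"), and the one-sided baseband bandwidth `b` is set to 0.2 cycles/sample so that the band stays between DC and the Nyquist frequency.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def modulated_sinc(n, w0, b):
    """s[l] = cos(w0 * l) * u[l], with u a sinc of one-sided bandwidth b (cycles/sample)
    centered in the n-sample window."""
    center = (n - 1) / 2
    return [math.cos(w0 * l) * sinc(2 * b * (l - center)) for l in range(n)]


# Example with the assumed values: a 101-sample pulse with carrier at fs/4.
pulse = modulated_sinc(101, math.pi / 2, 0.2)
```

Truncating the sinc to a finite window spreads its spectrum slightly, so a practical implementation would likely leave some guard band or apply a window. The paper does not state which, if either, was used.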

Figure 3. Example recording of clock pulses sent by master and slave nodes. (Plot: amplitude versus time in ms, from 0 to 4 ms; traces labeled Slave and Master.)

Figure 4. Experimental test setup.

A state diagram of the master node is shown in Fig. 5. The TX-B and RX-A signal paths at the master node facilitate all communication of sync pulses to and from the slave node, while TX-A facilitates the clock output. Each signal path on the master node is controlled synchronously within the software, but is displayed on the diagram asynchronously for simplicity. Cross-correlation is used to detect the presence of a pulse. When the magnitude of the cross-correlation exceeds a predefined threshold, the master node schedules the time-reversed transmission of the received waveform to the slave node. The transmissions at the master node are scheduled 4 blocks in advance to accommodate the latency between the host computer and the radio.
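The master's detect-and-reply step can be sketched as below. This is an interpretation under stated assumptions, not the authors' code: the function names and threshold handling are illustrative, and the closing remark about why time reversal helps is our reading of condition (1) rather than a statement from the paper.

```python
def detect_pulse(block, template, threshold):
    """Return the lag of the largest correlation magnitude above threshold, or None."""
    best_lag, best_mag = None, threshold
    for d in range(len(block) - len(template) + 1):
        mag = abs(sum(block[d + l] * template[l] for l in range(len(template))))
        if mag > best_mag:
            best_lag, best_mag = d, mag
    return best_lag


def reply_block(block):
    """Time-reversed copy of the received block, to be scheduled for transmission."""
    return block[::-1]
```

If a pulse is centered at sample p of an N-sample block, it is centered at sample N - 1 - p of the reversed block. When the reply is played a whole number of blocks later, the midpoint between reception and retransmission therefore does not depend on p, which is one way to satisfy (1) without the master explicitly computing a reply time.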

Figure 5. Flow diagram of master node operation. (Receive chain, sync channel (A): get next received block; search for sync pulse; pulse found? If yes, flip received pulse. Transmit chain, clock channel (A): send clock pulse. Transmit chain, sync channel (B): wait for sync pulse; pulse received? If yes, play flipped received pulse.)

A state diagram of the slave node is shown in Fig. 6. TX-A and RX-B facilitate all communication of sync pulses to and from the master node, and TX-B outputs the synchronized clock. Because the slave node adjusts its clock based on the master’s timed response, the implementation complexity of the slave node is significantly higher than that of the master node. When searching, the slave node performs cross-correlation on every receive block in order to locate the sample at which the peak of the sinc pulse was received. As with the master node, the slave node computes the magnitude of the cross-correlation when searching for a pulse to confirm that the threshold is surpassed. When the detection threshold is exceeded and a pulse is detected, the slave node sets a flag for calculation and saves both the position of the peak in the most recent block of received samples and the complex value resulting from the cross-correlation. The calculation step uses the peak position as the coarse delay estimate, and the fine delay estimator in equation (3) uses the peak cross-correlation value to calculate the fractional delay.

The resulting total delay (coarse and fine) is used at the slave node to calculate an estimate of the master node’s clock offset and drift using a smoothing filter. The slave node maintains estimates of the master node’s clock offset and drift that are updated every block and are subsequently used to generate a new delayed pulse template, which is transmitted on channel TX-B of the slave node. When a new delay offset observation is available at the slave node, the slave node’s prediction of the master clock offset and drift is refined. The smoothing filter makes use of the standard two-state (offset and drift) clock model [10] and is implemented as a pseudo-Kalman filter with a static Kalman gain. This results in a performance penalty with respect to a true Kalman filter; however, the static gain also reduces complexity by avoiding computation of the covariance matrix that updates the gains to appropriate magnitudes as the process approaches steady state. Instead, an optimized set of gains is chosen experimentally using the approach described in [6], and ample time is given for the system to approach steady state.

Figure 6. Flow diagram of slave node operation. (Receive chain, sync channel (B): get next received block; search block for pulse; pulse found? If yes, calculate offset from master. Transmit chain, clock channel (B): predict master clock pulse; new offset? If yes, update clock estimate; send clock pulse. Transmit chain, sync channel (A): send sync pulse.)

The slave node’s predictions of the clock offset and drift at time $k$ are denoted $\Delta_{k|k-1}$ and $D_{k|k-1}$, and are initially set to zero and one, respectively. In each block, the pseudo-Kalman prediction is computed via

$$\begin{bmatrix} \Delta_{k|k-1} \\ D_{k|k-1} \end{bmatrix} = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \Delta_{k-1|k-1} \\ D_{k-1|k-1} \end{bmatrix} \tag{4}$$

where $T$ is the time step, equal to 1000 samples in our case. We denote $z[k]$ as the new clock offset observation. When new observations arrive, we compute the innovations $y[k]$, which correspond to the difference between the predicted clock offset from the Kalman filter and the new clock offset observation at the time of the new clock offset observation, or

$$y[k] = z[k] - \Delta_{k|k-1}.$$

This innovation process is subsequently multiplied by the static Kalman filter gains $K_1$ and $K_2$, and used to adjust the slave node’s prediction of the master node’s clock offset and drift via the update equation

$$\begin{bmatrix} \Delta_{k|k} \\ D_{k|k} \end{bmatrix} = \begin{bmatrix} \Delta_{k|k-1} \\ D_{k|k-1} \end{bmatrix} + \begin{bmatrix} K_1 \\ K_2 \end{bmatrix} y[k]. \tag{5}$$
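Equations (4) and (5) translate almost directly into code. The class below is a minimal sketch, not the paper's C++ implementation. The class name and the example gains are assumptions, and the gains actually used in the experiments are not given in the text.

```python
class PseudoKalman:
    """Two-state (offset, drift) filter with static gains, following (4) and (5)."""

    def __init__(self, k1, k2, T, offset=0.0, drift=1.0):
        self.k1 = k1          # static gain applied to the offset state
        self.k2 = k2          # static gain applied to the drift state
        self.T = T            # time step between predictions, in samples
        self.offset = offset  # initial offset of zero, as in the text
        self.drift = drift    # initial drift of one, as in the text

    def predict(self):
        """Prediction step (4): advance the offset by T times the drift."""
        self.offset += self.T * self.drift
        return self.offset

    def update(self, z):
        """Update step (5): correct both states with the innovation y = z - offset."""
        y = z - self.offset
        self.offset += self.k1 * y
        self.drift += self.k2 * y
        return y
```

One design point worth noting: because the drift is multiplied by T in (4), the product K2·T sets the loop gain of the drift path. A standard stability check of the closed-loop matrix gives the conditions 0 < K1 < 2 and 0 < K2·T < 4 − 2·K1, so with T = 1000 samples the drift gain must be small (on the order of 1e-3 or less).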

5. EXPERIMENTAL RESULTS

This section summarizes our experimental results in implementing the timestamp-free synchronization protocol using bandpass RF signals. First, we investigate the N210’s ability to estimate offsets between signals by synthesizing clock signals with known offsets and comparing these known offsets with the measurement device’s estimate. Second, we test the N210 when measuring two signals from the same source, isolating the measurement integrity from the transmitter’s induced error. Third, we report on the process of calibrating the pseudo-Kalman filter used to refine the output of the delay estimator. Finally, we report on the synchronization data collected. The clock pulses sent out by the master and slave nodes were recorded by the measurement device, while the offsets between these signals were calculated after runtime in MATLAB.

Due to constraints mentioned in the discussion section below, the master and slave nodes were not able to operate at a rate higher than 250 kilosamples per second without running out of time to complete each operation before the next block. The measurement device was able to operate at 5 megasamples per second because processing of the data recorded by the measurement device was not done in real time, but instead was conducted in MATLAB after recording each session. Clock pulses were sent from each node to the measurement device in 1000-sample blocks (equivalent to 4 ms in duration) back-to-back. Synchronization signals exchanged between nodes were sent at a period of 15 blocks (60 ms) to allow enough time to compute cross-correlation and to allow for transmission delays between nodes.

Characterizing the Precision of the Transmission and Measurement Device

In order to validate the results derived from sampled data, it is imperative to know the precision and accuracy of the transmission and measurement devices. To accomplish this, a single N210 was configured to send clock pulses on both of its TX channels, but with an adjustable, intentional offset between the signals. The measurement device recorded these signals, and offsets between pulses were calculated in MATLAB. These calculated offsets were compared to the true offsets prescribed by the test. Signals were transmitted with delay offsets ranging from −2 µs to +2 µs in increments of 400 ns, corresponding to −0.5 to +0.5 sample periods in increments of 0.1 samples. Signals were transmitted at each offset 250 times per session, and results were averaged over 5 sessions. Figure 7 shows a plot of the error statistics of a run of this test. The standard deviation of the error over all offsets from −0.5 to +0.5 fractional samples was found to be 5.97 ns. This was a limiting factor in our experimentation, as the coarseness of this measurement determined the minimum delay offset the receiver node was able to accurately detect.

Characterizing the Precision of Received Pulses with a T-Junction

Following a procedure similar to the preceding precision test, a radio was configured to send simultaneous pulses to the device under test (DUT) along a T-junction connector. This way, the transmission is duplicated along two wires, effectively transmitting two synchronized pulses and eliminating potential error introduced by the transmission device. With this process, the error from each measurement device (labeled A, B, and C) is isolated and quantified for each radio. The results of this experiment are reported in Table 1.

Figure 7. Statistics showing measurement device accuracy. (Plot: estimation error in ns, from −10 to 10, versus true offset in fractional samples, from −0.6 to 0.6.)

DUT    mean[y[k]]    std[y[k]]
A      -0.51 ns      8.54 ns
B       0.55 ns      7.16 ns
C      -0.82 ns      6.77 ns

Table 1. Summary of T-Junction precision results

Selection of Pseudo-Kalman Filter Gains

As described earlier in Section 4, and by equations (4) and (5), we implemented a simplified pseudo-Kalman filter to smooth and correct for measurement error in delay offset estimates. Due to the use of a reduced-complexity pseudo-Kalman filter without an adaptive Kalman filter gain, it was necessary to determine static filter gains that provided the best smoothing performance and minimized prediction error. The statistics of prediction errors calculated by the slave node during runtime can serve as an indicator of performance in the absence of ground truth. To take advantage of this, we configured the slave node to record delays between sent and received signals calculated during runtime. We then used MATLAB post-processing to apply pseudo-Kalman filters with different gains and simulate the resulting prediction error statistics as if the data had been filtered in real time. By doing this, we were able to determine which filtering coefficients minimized the resulting prediction error. Filtering coefficients were chosen to minimize error over the last minute of a 5-minute recording, allowing the filter to take advantage of prediction error memory accrued during the beginning of the session.
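The offline gain search described above was done in MATLAB. The following Python sketch shows the same idea under simplifying assumptions: one observation per filter step, a small list of candidate gain pairs, and the RMS innovation over the final fifth of the recording (mirroring the last minute of five) as the score. The function names and candidate values are hypothetical.

```python
import math


def replay(observations, k1, k2, T):
    """Run the static-gain filter of (4) and (5) over recorded data; return the innovations."""
    offset, drift = 0.0, 1.0
    innovations = []
    for z in observations:
        offset += T * drift        # prediction (4)
        y = z - offset             # innovation
        offset += k1 * y           # update (5)
        drift += k2 * y
        innovations.append(y)
    return innovations


def rms(values):
    """Root-mean-square of a non-empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def select_gains(observations, candidates, T, tail_fraction=0.2):
    """Pick the (k1, k2) pair with the smallest RMS innovation over the tail of the data."""
    start = int(len(observations) * (1 - tail_fraction))
    return min(candidates, key=lambda g: rms(replay(observations, g[0], g[1], T)[start:]))
```

Scoring only the tail lets each candidate settle before it is judged, at the cost of ignoring how quickly it converges.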

Synchronization Accuracy Test

To evaluate the effectiveness of the timestamp-free synchronization protocol, we implemented the procedure described in the implementation section above. To control for process parameter variation across devices, the accuracy of synchronization was tested with each SDR configured to operate in each role in the protocol. A total of 6 tests were conducted, with each of the three different SDRs serving alternately as the measurement device, master node, or slave node. During each test, the master and slave nodes ran the protocol for 3 minutes prior to recording to allow the nodes to synchronize. The measurement device then recorded two minutes of clock signals (about 30,000 pulses) sent from each node, and this recording was used to evaluate the accuracy of synchronization between the nodes. Prediction error statistics and master/slave clock pulses were recorded in each test. The results of these tests are summarized in Table 2. The mean standard deviation across all configurations was 13.2 ns. In the next section, we discuss the implications and limitations of these results.

M/S    mean[y[k]]    std[y[k]]
A/B     4.66 ns       8.18 ns
B/A    -3.85 ns       8.41 ns
C/A     2.72 ns      16.3 ns
A/C     1.77 ns      13.4 ns
B/C    -6.95 ns      15.9 ns
C/B    -3.76 ns      16.7 ns

Table 2. Summary of clock pulse synchronization results

6. DISCUSSION

This work represents a significant improvement with respect to the previous RF implementation of the protocol on Ettus E310 SDRs [7]. Results indicate a worst-case synchronization error with a standard deviation of 16.7 ns, corresponding to about 0.1% of the period of the carrier frequency. It is worth mentioning that the inaccuracy of our measurement device has been shown to be comparable to the degree of synchronization achieved. Without the ability to measure delay offsets between signals to a finer resolution, our reported results will continue to be dominated by the measurement error in our estimation device. There were several other key factors that limited the degree of synchronization, including network latency, computational complexity, and error compounded from finite-precision effects, all of which we now explore in more detail.
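The 0.1% figure can be reproduced with a short calculation, assuming that "half of the Nyquist sampling rate" in Section 4 means a carrier at one quarter of the 250 kHz sampling rate. This reading is an assumption, but it is consistent with the stated percentage:

$$f_c = \frac{f_s}{4} = \frac{250\ \text{kHz}}{4} = 62.5\ \text{kHz}, \qquad T_c = \frac{1}{f_c} = 16\ \mu\text{s}, \qquad \frac{16.7\ \text{ns}}{16\ \mu\text{s}} \approx 1.04 \times 10^{-3} \approx 0.1\%.$$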

In our implementation, the data rate of the receiver is limited by the processing delay between the radio and the computer. In addition, due to the nature of the USRPs, all transmissions must be scheduled in advance. Accounting for the clock drift that occurs between the time when the transmission is scheduled and the time when it actually transmits is simple, but the need to accommodate this additional latency does add potential error to the system.

Additionally, the complexity of the computation and the requirement to operate in real time limit the possible sampling rate of the USRP N210s. By far the most computationally intensive operation in the protocol is the repeated cross-correlation (i.e., receive filtering) that is performed at both the master and slave nodes. If cross-correlation does not finish executing before the next block of samples is ready, then samples are lost and the device reports underruns. This limits the degree of synchronization achievable, as lower sampling frequencies require lower bandwidth signals, which limit the delay estimation accuracy as given by the Cramér-Rao lower bound [8].


One more item that negatively impacted synchronization performance was the sensitivity of the arctangent-based fine delay estimator in (3) when fractional delays approached a half sample. Specifically, we observed that small errors from finite precision effects sourced from the ADC created small errors in the cross-correlation, which subsequently accumulated to larger errors in the fine delay estimator.

7. CONCLUSION

This work demonstrates the synchronization accuracy and implementation challenges of implementing the timestamp-free synchronization protocol at RF using sampled passband signals instead of modulated baseband signals. An implementation of the protocol using such signals is presented along with experimental results and analysis. Experimental results show that the protocol is able to achieve synchronization to within a tenth of a percent of the period of the carrier frequency over 4 ms prediction intervals.

Future work could improve the accuracy of the timestamp-free synchronization protocol by addressing issues mentioned in Section 6. Specifically, the sampling rate of our devices is currently bottlenecked by the time it takes the CPU to compute cross-correlation between template and received waveforms. This latency could be reduced by offloading computation from the CPU of the host computer to the GPU using NVIDIA’s CUDA API. Alternatively, when the development of Ettus’ RFNoC technology is sufficiently mature, much of the computation (i.e., the cross-correlation) can be offloaded to the FPGA on the Ettus radios. Future work could also extend this implementation to operate in a wireless setting with multipath and additional noise, or to work with more than two synchronizing nodes. In a fully wireless, multiple-node scenario, one could investigate use of the timestamp-free network synchronization protocol in combination with consensus-based techniques [11] for operation in a distributed setting without the need for a centralized master/slave approach.

All source code for this experiment is available at https://github.com/agklein1/tsfreesync-n210

ACKNOWLEDGMENTS

This work was supported by a Research Experience for Undergraduates (REU) Supplement to National Science Foundation award CCF-1319458.

REFERENCES

[1] R. D. Preuss and D. R. Brown III, “Two-way synchronization for coordinated multicell retrodirective downlink beamforming,” IEEE Transactions on Signal Processing, vol. 59, no. 11, pp. 5415–5427, Nov. 2011.

[2] D. L. Mills, “Internet time synchronization: the network time protocol,” IEEE Trans. Commun., vol. 39, no. 10, pp. 1482–1493, Oct. 1991.

[3] W. Lewandowski, J. Azoubib, and W. J. Klepczynski, “GPS: primary tool for time transfer,” Proceedings of the IEEE, vol. 87, no. 1, pp. 163–172, Jan. 1999.

[4] F. Sivrikaya and B. Yener, “Time synchronization in sensor networks: a survey,” IEEE Netw., vol. 18, no. 4, pp. 45–50, Jul.–Aug. 2004.

[5] D. R. Brown III and A. G. Klein, “Precise timestamp-free network synchronization,” in Information Sciences and Systems (CISS), 2013 47th Annual Conference on. IEEE, 2013, pp. 1–6.

[6] M. Li, S. Gvozdenovic, A. Ryan, R. David, D. R. Brown, and A. G. Klein, “A real-time implementation of precise timestamp-free network synchronization,” in 2015 49th Asilomar Conference on Signals, Systems and Computers, Nov. 2015, pp. 1214–1218.

[7] M. Overdick, J. Canfield, A. G. Klein, and D. R. Brown, “A software-defined radio implementation of timestamp-free network synchronization,” in IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Mar. 2017, pp. 1193–1197.

[8] A. Weiss and E. Weinstein, “Fundamental limitations in passive time delay estimation–Part I: Narrow-band systems,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 31, no. 2, pp. 472–486, Apr. 1983.

[9] S. Ganeriwal, R. Kumar, and M. B. Srivastava, “Timing-sync protocol for sensor networks,” in Proceedings ACM SenSys 2003. ACM New York, NY, USA, Nov. 2003, pp. 138–149.

[10] L. Galleani, “A tutorial on the two-state model of the atomic clock noise,” Metrologia, vol. 45, no. 6, p. S175, 2008.

[11] D. R. Brown III, A. G. Klein, and R. Wang, “Monotonic mean-squared convergence conditions for random pairwise consensus synchronization in wireless networks,” IEEE Transactions on Signal Processing, vol. 63, no. 4, pp. 988–1000, Feb. 2015.


BIOGRAPHY

Daniel R. Kowalski is currently pursuing a B.S. in electrical engineering from WWU and expects to graduate in June 2018. His current research activities include working with software-defined radios and clock synchronization. He wishes to continue his studies in embedded and cyberphysical systems in application of IoT sensor networks.

Timothy M. Christman is currently pursuing a B.S. in electrical engineering from WWU and expects to graduate in June 2018. His current research activities include software-defined radios, synchronization, and fault line detection in power systems. He hopes to study or work with applications in machine learning in the future.

Andrew G. Klein received the B.S. degree from Cornell University, Ithaca, NY, USA, the M.S. degree from the University of California, Berkeley, CA, USA, and the Ph.D. degree from Cornell University, all in electrical engineering. Previously, he was an Assistant Professor with the Worcester Polytechnic Institute, Worcester, MA, USA, from 2007 to 2014, and he was a Post-Doctoral Researcher with Supélec/LSS, Paris, France, from 2006 to 2007. He joined the Department of Engineering and Design, Western Washington University, Bellingham, WA, USA, in 2014, where he is currently an Associate Professor.

Mitchell W.S. Overdick received his B.S. degree in Electrical Engineering at WWU in 2017, and was a member of the ASPECT Lab. Currently, he works as an embedded systems validation engineer at PACCAR in Mount Vernon, WA. His research interests include Embedded Systems, Signal Processing, Machine Learning, and Automotive.

Joseph E. Canfield works as an electrical hardware validation engineer at the PACCAR Technical Center in Mount Vernon, WA. His research interests include radio communications and analog audio technology.

D. Richard Brown III received the B.S. and M.S. degrees from the University of Connecticut in 1992 and 1996, respectively, and the Ph.D. degree from Cornell University in 2000, all in electrical engineering. From 1992 to 1997, he was with General Electric Electrical Distribution and Control. He was a Faculty Member with the Worcester Polytechnic Institute, Worcester, MA, USA, in 2000. He was a Visiting Associate Professor with Princeton University from 2007 to 2008. Since 2016, he has been with the Computing and Communication Foundations Division, National Science Foundation, as a Program Director.
