Sensing-Transmission Edifice Using Bayesian Nonparametric Traffic Clustering in Cognitive Radio Networks

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 13, NO. 9, SEPTEMBER 2014 2141

M. Ejaz Ahmed, Ju Bin Song, Zhu Han, Fellow, IEEE, and Doug Young Suh

Abstract—In cognitive radio networks, the main objective of spectrum sensing is to exploit spectrum holes left by the primary users (PUs). Different PU traffic patterns may provide different opportunities for secondary user (SU) spectrum access. In this paper, we identify the PUs’ traffic patterns and then maximize SU transmission accordingly. First, a theoretical framework is developed to cluster PU traffic patterns based on a Bayesian nonparametric inference model, in which the number of traffic types is unknown. Second, in order to exploit the spectrum holes, we study a sensing-transmission structure to optimize the SU transmission strategy. Specifically, we exploit the short and long transmission opportunities based on the PU traffic pattern and channel idle time distribution. Finally, we propose a threshold-based sensing-transmission method that optimizes the SU utility while protecting PU transmissions. Both sensing and transmission errors are considered, for perfect sensing with/without acknowledgement-based transmission and for imperfect sensing, respectively. Simulation results show that the proposed technique outperforms the nonparametric mean shift clustering algorithm. Furthermore, we utilize these clustering results to optimize the SU’s transmission strategy with perfect and imperfect sensing. We compare our proposed technique with the probabilistic sensing-transmission structure and show the performance gain in terms of throughput.

Index Terms—Cognitive radio, payload identification, Bayesian nonparametric identification, sensing-transmission trade-off

1 INTRODUCTION

In cognitive radio (CR) networks, it is very important for secondary users (SUs) to be aware of primary users’ (PUs’) activities (such as traffic patterns) in order to efficiently utilize PUs’ channels. For example, two PU channels with different traffic patterns can have the same channel occupancy probability while their ON/OFF frequencies still differ. Therefore, the SUs should optimize their channel access strategies differently for those two channels. Since the channel idle time distributions/traffic patterns are application-specific [1]–[4], the SU spectral access opportunities vary for different PU traffic patterns. Traditionally, the SUs employ spectrum learning techniques with a predefined PU channel idle time distribution assumption [6], [7]. However, in practice, the channel idle time distribution is specific to the traffic patterns [4], [6], [8]. The estimation of the PU idle time distribution is challenging for the SUs. For example, the SU learning-based approaches need considerable time to adapt to a specific traffic pattern. In our preliminary work [4], [5], we have shown statistics on the length of the time domain opportunities for various traffic patterns, and studied transmission optimization based on the obtained statistics. Based on the information of PUs’ activities, dynamic spectrum access (DSA) is studied in this paper, and we focus on exploiting the time domain spectral opportunities, i.e., the idle times between successive PU packets, which are specific to traffic patterns. The SU can exploit these time domain opportunities to utilize the spectrum efficiently [8]–[13].

• M. E. Ahmed, J. B. Song, and D. Y. Suh are with the Department of Electronics and Radio Engineering, Kyung Hee University, Yongin 446-701, South Korea. E-mail: [email protected]; {jsong, suh}@khu.ac.kr.

• Z. Han is with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX 77004 USA. E-mail: [email protected].

Manuscript received 26 July 2013; revised 12 Nov. 2013; accepted 20 Nov. 2013. Date of publication 5 Dec. 2013; date of current version 22 July 2014. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. Digital Object Identifier 10.1109/TMC.2013.156

Motivated by the above issues, we study the following questions:

• What is the PU channel idle time distribution, and what are its parameters, for each traffic pattern (application) in the CR network, and how fast should the SU adapt to the PU traffic pattern?

• How much extra throughput gain is achieved by utilizing the PU traffic pattern?

• What is the optimal sensing-transmission strategy for a specific traffic pattern with the corresponding channel idle time distribution?

To answer the first question, we propose a Bayesian nonparametric clustering approach for identifying traffic patterns. For questions two and three, a utility maximization problem is formulated to maximize the average utility over time. Our contributions in this study are as follows: 1) We employ Bayesian nonparametric inference to cluster the PU traffic patterns, so that the SUs can use the cluster parameters to obtain the PU channel idle time distribution and optimize their transmission strategies accordingly. The proposed nonparametric variational inference approach can identify and cluster an unbounded number of traffic patterns in an unsupervised manner. 2) We exploit the traffic-specific short time opportunities (STO) between successive PU packet arrivals and the long time opportunities (LTO) when the PU is idle, so as to efficiently utilize PU spectral opportunities. 3) We propose a threshold-based optimal spectrum access policy under practical constraints, which is applicable to both perfect and imperfect sensing. Moreover, a collision penalty is imposed on the SUs to mitigate their aggressiveness. In summary, we revisit and improve the current SU sensing-transmission edifice.

1536-1233 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

The rest of the paper is organized as follows. In Section 2, we discuss background knowledge and recent work in the domain of SU transmission. In Section 3, we present our traffic pattern model. In Section 4, we propose the feature-based Bayesian nonparametric inference for various traffic sources. In Section 5, we explain the SU transmission framework and present SU utility maximization based on the clustering results obtained from Section 4. In Section 6, we show the simulation and experimental results. In Section 7, we conclude the study. For convenience, we list the most important symbols in Table 1.

2 RELATED WORK

The current literature on SU channel access techniques focuses on observation-learning decision theories [6]–[9], [11], i.e., the SU observes the PU actions and then learns the spectrum usage accordingly. However, such approaches adapt slowly to frequently changing traffic patterns, where each traffic application follows a unique traffic pattern [4], [8], [11].

In [6], a machine learning framework was proposed for SUs to utilize transmission opportunities between packet bursts of the PU traffic. The algorithm uses a hidden Markov model (HMM), and a gradient method was constructed for finding the underlying PU traffic pattern. The goal was to model the time-varying characteristics of the PU channel and infer the PU traffic pattern. However, when PU traffic patterns vary frequently, the delay for learning each traffic pattern becomes significant. In [4], a Bayesian nonparametric clustering approach is proposed to cluster various traffic patterns. Since the PU channel access opportunities are specific to the PU traffic pattern, an SU transmission strategy is proposed in this study to efficiently utilize those opportunities. Similarly, in [14], a nonparametric robust kernel density estimation is proposed to effectively decrease the number of PU channel switches. However, the PU’s channel behavior is largely dependent on the traffic application; therefore, SUs are required to devise different transmission strategies for different traffic applications. In contrast, in our proposed technique, we first cluster traffic applications in a nonparametric way to obtain the cluster (traffic) parameters, and then use the cluster parameters to maximize the SU utility.

In [7], a threshold-based approach was proposed to maximize the SU transmission over time. The authors considered a general distribution for the PU idle time and introduced a reward-cost based SU transmission strategy. The observation-based history is maintained and updated over time, without considering the traffic pattern. However, history-based decisions for SU transmission often adapt slowly when the traffic pattern changes frequently. This slow adaptation results in small SU utility gains.

In [9], [10], heuristic algorithms were proposed to exploit the idle channel time between bursty transmissions in a multi-access channel scenario. A statistical model derived from empirical data is presented for multi-access strategies. The IEEE 802.11 MAC protocol is exploited to create opportunities for SUs. For example, in the continuous transmission of VoIP packets, the gap between two packets is large enough for the SUs to transmit. However, the challenge is how to enable the SU to utilize those white spaces. Since each traffic application shows distinctive behavior [4], the SU needs to identify the traffic application (pattern) and exploit its transmission opportunities.

In [11], the approach is based on detecting frequency bands that are not utilized by the PUs, and then providing a flexible facility for the SUs to access those resources for their communication while satisfying technical constraints. Moreover, it identifies the technical constraints required for the smooth functioning of primary-secondary systems. Similarly, in [8], an analytical framework was proposed for dynamic spectrum access. First, the proposed components capture the dynamics of the channel occupancy using a Markov chain formulation. The traffic characteristics of the PUs are incorporated to obtain the PU channel idle probability. Second, the spectrum utilization is formulated as a maximization of the transmissions.

Different from the literature, we model the number of traffic applications (patterns) with the Dirichlet process mixture model [4], [15], [16], which results in traffic clustering. This cluster knowledge enables quick adaptation to traffic patterns. Then, based on the obtained traffic pattern, the SU transmission strategy is optimized accordingly.

3 TRAFFIC PATTERN MODEL

In this section, we discuss the traffic pattern model for our proposed sensing-transmission edifice in CR networks. Here, the SUs exploit the PU traffic patterns from various applications and optimize their transmission strategy correspondingly. We divide spectrum access opportunities into two categories, as shown in Fig. 1. First, STOs can be utilized by studying the behavior of the PU traffic applications’ patterns. Second, LTOs are the opportunities that arise when the PU is idle. The individual STO and LTO periods are independent and follow their respective distribution functions. In Fig. 1 (left), the PU inter-arrival time with the VoIP traffic pattern is shown, using the real wireless traces in [17], [18].

The goal is to optimize the SU transmissions based on the traffic statistics. But first, we need to obtain the statistics from Bayesian nonparametric clustering, which will be discussed in the sequel. The SUs’ objective is to cluster various traffic patterns, obtain the distribution parameters for each cluster, and optimize the transmission strategy accordingly.

The clustering is based on the features of the different PU traffic types, which are the characteristics unique to each application’s traffic pattern. In the proposed approach, we consider the following three features for clustering traffic patterns:

TABLE 1. List of Symbols

3.0.1 Packet Length

Packet lengths for different traffic payloads are likely to be different. For example, the UDP packet size is longer, the gaming packet size may vary depending on game dynamics, and the VoIP packet has smaller packet lengths to minimize jitter. So we use the packet length as one feature point for identifying different traffic applications. Let $\vec{pl}$ be the vector of observed packet lengths. We define

$$\vec{pl} = [pl_1, pl_2, \ldots, pl_N]^{T}, \tag{1}$$


Fig. 1. PU packet arrival (left) based on data from [17], [18], and short time opportunities between successive PU packets (right), where dotted lines represent the transmission opportunity for SUs.

where $N$ is the number of observations and $pl_i$ is the packet length of the $i$th packet.

3.0.2 Packet Inter-Arrival Time

Packet inter-arrival times for different applications also vary depending upon the requirements of the applications. For example, in VoIP, the inter-arrival time is small to avoid annoying effects caused by jitter. We define the vector of packet inter-arrival times as

$$\vec{pint} = [pint_1, pint_2, \ldots, pint_N], \tag{2}$$

where $pint_i$ is the $i$th packet inter-arrival time for the PU traffic pattern.

3.0.3 Variance in Packet Length

The packet lengths may change in every connection. For example, by investigating the real wireless traces in [17], [18], we have observed that, for gaming data, packet lengths vary significantly during the communication. Let $\vec{\sigma}$ be the vector of the variances of packet lengths, given as

$$\vec{\sigma} = \big[ [\mathrm{var}(pl)]_{\aleph_1}, [\mathrm{var}(pl)]_{\aleph_2}, \ldots, [\mathrm{var}(pl)]_{\aleph_N} \big]^{T} = [\sigma_1, \sigma_2, \ldots, \sigma_N]^{T}, \tag{3}$$

where $[\mathrm{var}(pl)]_{\aleph_i}$ is the variance of packet lengths in a window $\aleph_i$.¹

Our proposed scheme can be extended beyond the above features. Let $y_i$ be the feature vector containing the features of the PUs’ traffic pattern for the $i$th observation. With the three features discussed above, we have

$$y_i = [pl_i, pint_i, \sigma_i]^{T}. \tag{4}$$

The matrix representation is given by

$$Y = [y_1, y_2, \ldots, y_N]. \tag{5}$$

In Section 4, we will describe the Bayesian nonparametric variational inference technique to cluster traffic patterns from $Y$.
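As an illustration of how the feature vectors in (4) and the matrix in (5) might be assembled from a packet trace, the following is a minimal sketch. The function name `build_feature_matrix`, the toy trace, and the choice of a zero inter-arrival time for the first packet are assumptions made for the example; the window is assumed to start at the first packet and grow with each observation, in line with footnote 1.

```python
from statistics import pvariance


def build_feature_matrix(packet_lengths, arrival_times):
    """Return a list of feature vectors y_i = [pl_i, pint_i, var_i].

    pint_i is the gap between packet i and the previous packet (0.0 for the
    first packet, an assumption of this sketch). var_i is the variance of the
    packet lengths in a window that grows with the observations (pl_1..pl_i).
    """
    if len(packet_lengths) != len(arrival_times):
        raise ValueError("need exactly one arrival time per packet")
    features = []
    for i, length in enumerate(packet_lengths):
        gap = 0.0 if i == 0 else arrival_times[i] - arrival_times[i - 1]
        window = packet_lengths[: i + 1]   # growing window of packet lengths
        var = pvariance(window)            # population variance, 0 for one sample
        features.append([float(length), float(gap), float(var)])
    return features


# Hypothetical trace: packet lengths in bytes and arrival times in seconds.
lengths = [160, 160, 1200, 160]
times = [0.00, 0.02, 0.04, 0.06]
Y = build_feature_matrix(lengths, times)
for y in Y:
    print(y)
```

Each row of `Y` corresponds to one column $y_i$ of the matrix in (5). In practice the three features have very different scales, so some normalization before clustering may be advisable.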

1. The size of the window increases with the number of observations.

The clusters obtained from the Bayesian nonparametric clustering have their own respective distribution parameters $\vec{\theta}_i$, i.e., the parameters of the distribution of the features $y$. Based on the clusters’ distribution parameters, the SUs devise their transmission strategy accordingly. Although the channel is continuously used for packet transmissions by the PU, there are large gaps between consecutive PU packets, as shown in Fig. 1 (right). This situation is similar for various traffic applications [8], [11]. To explore the STO and LTO for sensing and transmission, we make the following assumptions:

• The SU has a fixed unit sensing time $\Delta_S$ and a fixed unit transmission time $\Delta_T$.

• $\Delta_S$ and $\Delta_T$ are relatively short.

• Each successful SU transmission earns a benefit $B$, and a penalty $L$ is imposed for a collision.²

Due to noise and the limited number of observations, the sensing may not be 100% accurate [7]. As a consequence, it may result in false alarms, with probability $P_f$, in which the PU channel is declared busy when it is actually idle. Here, the detection probability $P_d$ refers to the probability of estimating the PU channel state correctly. Similarly, in an ideal scenario for detecting collisions, a negative acknowledgement (NACK) from the SU receiver signifies that a collision has occurred with the PU. However, the SU receiver may successfully decode the SU transmitted packet even when the PU is transmitting. Moreover, a NACK may result from SU channel fading or interference from other users. Therefore, two probabilities are defined as follows:

$$\gamma_0 = p[\mathrm{NACK} \mid \text{No collision with PU}], \qquad \gamma_1 = p[\mathrm{NACK} \mid \text{Collision with PU}].$$

In the next section, we discuss the basics of the techniques used to cluster the traffic patterns. Then, in Section 5, we propose the sensing-transmission edifice.

2. As shown later, the SU applies its transmission strategy by maximizing the utility function (reward gained minus penalty), and the collision penalty L can be adjusted to control the aggressiveness of the SU access. Therefore, the proposed spectrum sharing can achieve the required trade-off between SU access and PU protection by adjusting L.


4 BAYESIAN NONPARAMETRIC VARIATIONAL INFERENCE FOR TRAFFIC CLUSTERING

In this section, we propose a nonparametric variational inference approach for PU traffic pattern clustering, and estimate the parameters of each cluster distribution so as to optimize SU transmission. Given $Y$, our goals are to find 1) how many patterns generated $Y$, and 2) which traffic pattern each data point $y_i$ belongs to. To answer these questions, we need to obtain a posterior distribution given the set of observations, where the posterior distribution captures the total number of traffic patterns (clusters) in the data set $Y$. We use the generative model [4] to model the posterior distribution. In the generative approach, models are generated to describe the different clusters, and these models are then used to classify data points, e.g., clustering via a Dirichlet process mixture model [4], [15], [16], [19]. The Dirichlet process mixture model is a cornerstone of Bayesian nonparametric inference. In [4], we used an infinite Gaussian mixture model (IGMM) based on the stick-breaking construction with Gibbs sampling, where the IGMM assumes that the data come from a mixture of an infinite number of distributions. There exist two candidate approaches for nonparametric inference in the generative model. First, a sampling-based approach uses Markov chain Monte Carlo (MCMC) to cluster the data set $Y$. However, the MCMC-based sampling approach has limitations, including slow convergence [19] and convergence that is difficult to diagnose [15]. Therefore, we rely on the second approach, which is relatively faster: a variational method that converts the inference problem into an optimization problem. The main idea behind variational inference is to formulate the computation of a marginal or conditional probability as an optimization problem over a number of free parameters, i.e., the variational parameters. The optimized variational parameters then yield approximations to the marginal or conditional probabilities of interest.

In the following, we first discuss the Dirichlet basics. Then, we study the inference based on the model. Next, we further improve the inference with some variations. Finally, we propose the optimization solution.

4.1 Dirichlet Distribution and Dirichlet Process

When the number of traffic patterns is unknown or varies over time, we can model the feature space using an instance of the nonparametric Bayesian models, which are widely used for data clustering [4], [15], [20]–[22].

4.1.1 Dirichlet Distribution

We use the Dirichlet distribution, which is the extension of the Beta distribution to the multivariate case and the conjugate prior of the multinomial distribution in Bayesian statistics. The Dirichlet density represents the belief that the probabilities of $K$ events are $(x_1, x_2, \ldots, x_K)$, given that event $i$ has been observed $\alpha_i - 1$ times, $i = 1, \ldots, K$.

Definition 1. The Dirichlet distribution of order $K \ge 2$ with parameters $\alpha_1, \ldots, \alpha_K > 0$ has a probability density function with respect to the Lebesgue measure on the Euclidean space $\mathbb{R}^{K-1}$ given by

$$\mathrm{Dir}(\alpha_1, \alpha_2, \ldots, \alpha_K) = f(x_1, x_2, \ldots, x_K; \alpha_1, \alpha_2, \ldots, \alpha_K) = \frac{\Gamma\left(\sum_{i} \alpha_i\right)}{\prod_{i} \Gamma(\alpha_i)} \prod_{i=1}^{K} x_i^{\alpha_i - 1}, \tag{6}$$

for all $x_1, x_2, \ldots, x_K > 0$ satisfying $\sum_{i=1}^{K} x_i = 1$. The first term in (6) is a normalization constant, namely the reciprocal of the multinomial beta function.
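The density in (6) can be evaluated numerically as in the short sketch below. The function name `dirichlet_pdf` and the input checks are assumptions of this example. The computation is done in the log domain with `math.lgamma` so that large parameters do not overflow the gamma function.

```python
import math


def dirichlet_pdf(x, alpha):
    """Evaluate the Dirichlet density of (6) at a point x on the simplex."""
    if len(x) != len(alpha) or len(x) < 2:
        raise ValueError("x and alpha must have the same length K >= 2")
    if any(a <= 0 for a in alpha) or any(xi <= 0 for xi in x):
        raise ValueError("alpha_i and x_i must be strictly positive")
    if abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("x must sum to one")
    # Log of the normalization constant Gamma(sum alpha) / prod Gamma(alpha_i).
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # Log of the kernel prod x_i^(alpha_i - 1).
    log_kernel = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    return math.exp(log_norm + log_kernel)


# With all alpha_i = 1 the density is flat over the simplex.
print(dirichlet_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0]))
# For K = 2 the Dirichlet reduces to a Beta(2, 2) density.
print(dirichlet_pdf([0.5, 0.5], [2.0, 2.0]))
```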

4.1.2 Dirichlet Process (DP)

A Dirichlet process is a distribution over probability measures. Therefore, each draw from a Dirichlet process is itself a random distribution. Let $G_o$ be a base probability distribution over $Y$, and let $\alpha$ be a positive real-valued scalar. The Dirichlet process is defined as follows.

Definition 2. A random measure $G$ is distributed according to the Dirichlet process $DP(\alpha, G_o)$ with base distribution $G_o$ and concentration parameter $\alpha$, i.e., $G \sim DP(\alpha, G_o)$, if

$$(G(A_1), \ldots, G(A_k)) \sim \mathrm{Dir}(\alpha G_o(A_1), \ldots, \alpha G_o(A_k)) \tag{7}$$

for every finite measurable partition $\{A_1, \ldots, A_k\}$ of $Y$, where $\mathrm{Dir}(\cdot)$ is the Dirichlet distribution.

The parameters $G_o$ and $\alpha$ play an important role in the DP. The base distribution $G_o$ is the mean of the DP, i.e., $E[G(A)] = G_o(A)$, where $A \subset Y$. The concentration parameter $\alpha$ acts as an inverse variance, i.e., the larger $\alpha$ is, the more the DP is concentrated around its mean.

Since $G$ is a random distribution ($G \sim DP(\alpha, G_o)$), it is possible to draw samples from $G$. Integrating out $G$, the joint distribution of the variables $\{y_1, \ldots, y_n\}$ exhibits a clustering effect. Suppose that $n - 1$ samples $y_1, \ldots, y_{n-1}$ have already been obtained. Then, from [19], the predictive distribution for $y_n$, conditioned on $y_1, y_2, \ldots, y_{n-1}$ and with $G$ marginalized out, is

$$y_n \mid y_1, \ldots, y_{n-1} \sim \frac{1}{\alpha + n - 1} \left( \alpha G_o + \sum_{i=1}^{n-1} \delta_{y_i} \right), \tag{8}$$

where $\delta_{y_i}$ is an atomic distribution centered on $y_i$.

The discreteness property of draws from a DP in (8) implies the clustering behavior. Since values $y_i$ are repeated, let $\vec{\theta} = \{\vec{\theta}_k\}_{k=1}^{K}$ denote the distinct values of $\{y_1, \ldots, y_n\}$. Let $z = \{z_1, \ldots, z_n\}$ be the cluster assignment variables such that $y_i = \vec{\theta}_{z_i}$, and let $|z|$ be the number of cells in the partition. The distribution of $y_n$ is given by

$$y_n = \begin{cases} \vec{\theta}_i, & \text{with prob. } \dfrac{|\{j : z_j = i\}|}{n - 1 + \alpha}, \\[2mm] y, \; y \sim G_o, & \text{with prob. } \dfrac{\alpha}{n - 1 + \alpha}, \end{cases} \tag{9}$$

where $|\{j : z_j = i\}|$ is the number of times the value $\vec{\theta}_i$ occurs in $y_1, \ldots, y_{n-1}$. The unique values of $y_i$ partition the data set into clusters such that, within each cluster $k$, all data points take on the same value $\vec{\theta}_k$. From (8), the following two properties can be given [19]:

(1) There is a positive probability that two samples will have exactly the same value, due to the discreteness property of the DP [16], [19].

(2) In the DP, the more often a given value has been observed in the past, the higher its chance of being observed again in the future.
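The two properties above can be observed by simulating the predictive rule in (9) directly. The following is a minimal sketch; the function names, the seed, and the value $\alpha = 1$ are assumptions chosen for illustration. Only the cluster labels $z_i$ are sampled, since the clustering behavior does not depend on the particular form of $G_o$.

```python
import random


def crp_probabilities(counts, alpha):
    """Probabilities from (9): join existing cluster i, or open a new one (last entry)."""
    denom = sum(counts) + alpha          # equals n - 1 + alpha
    return [c / denom for c in counts] + [alpha / denom]


def sample_partition(n, alpha, rng):
    """Sequentially assign n observations to clusters using (9)."""
    counts = []                          # counts[i] = |{j : z_j = i}|
    assignments = []
    for _ in range(n):
        probs = crp_probabilities(counts, alpha)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(counts):
            counts.append(1)             # a new value drawn from G_o
        else:
            counts[k] += 1               # repeat of an existing value
        assignments.append(k)
    return assignments, counts


rng = random.Random(0)
z, sizes = sample_partition(200, alpha=1.0, rng=rng)
print("number of clusters:", len(sizes))
print("cluster sizes:", sizes)
```

Running the sketch typically yields a few large clusters and several small ones, which reflects property (2): values that have been observed often are more likely to be observed again.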


4.1.3 Dirichlet Process Mixture Model (DPMM)

Traditional parametric models using a fixed and finite number of parameters can suffer from over- or under-fitting of the data when there is a mismatch between the complexity of the model and the amount of data available. Therefore, we use the DPMM, since the number of traffic patterns can be dynamic. The nonparametric nature of the DPMM translates to mixture models with a countably infinite number of clusters. The DP is used as a nonparametric prior in a hierarchical Bayesian specification [19]. The observations $\{y_1, \ldots, y_n\}$ can be modeled with the latent parameters $\{\vec{\theta}_1, \ldots, \vec{\theta}_n\}$. Each $\vec{\theta}_n$ is drawn independently and identically from $G$, while each $y_n$ has distribution $p(\cdot \mid \vec{\theta}_n)$ parameterized by $\vec{\theta}_n$, i.e.,

$$G \mid \{\alpha, G_o\} \sim DP(\alpha, G_o), \qquad \vec{\theta}_n \mid G \sim G, \qquad y_n \mid \vec{\theta}_n \sim p(y_n \mid \vec{\theta}_n). \tag{10}$$

Recall the first property of $G$ derived from (8): $G$ is discrete, so multiple $\vec{\theta}_i$ can take on the same value, which produces the clustering effect. Data generated from the model in (10) can be partitioned based on the distinct values of the parameters $\vec{\theta}_n$. This property leads to a flexible mixture model, in which the number of traffic patterns (clusters) grows as more data are observed.

4.1.4 DPMM Representation via Stick-Breaking Analogy

The Dirichlet process mixture model can be represented by the stick-breaking construction. In the stick-breaking analogy, the latent parameter $\vec{\theta}_k$ is the model parameter of cluster $k$. The variable $z_n$ is a cluster assignment variable, which takes on value $k$ with probability $\pi_k$. Considering random variables $v_i \sim \mathrm{Beta}(1, \alpha)$ and $\vec{\theta}_i \sim G_o$, where $i \in \{1, 2, \ldots\}$, the stick-breaking representation of $G$ is

$$G = \sum_{i=1}^{\infty} \pi_i(v) \, \delta_{\vec{\theta}_i}, \tag{11}$$

where

$$\pi_i(v) = v_i \prod_{j=1}^{i-1} (1 - v_j). \tag{12}$$

Here, $V = \{v_i\}_{i=1}^{\infty}$ are the stick lengths and $\delta_{\vec{\theta}_i}$ is an atomic distribution centered on $\vec{\theta}_i$. The discreteness of the DP is evident from (11): $G$ consists of a countably infinite set of model parameters (atoms) $\vec{\theta}_i$ drawn independently from $G_o$. The mixing proportions $\pi_i(v)$ are obtained by successively breaking a unit-length stick into an infinite number of pieces. The size of each successive piece depends on $\alpha$ through $\mathrm{Beta}(1, \alpha)$. It is also observed that the number of clusters implied by the prior grows as a logarithmic function of the number of observations [16].

The DP mixture consists of a vector $\pi(v)$, which is an infinite vector of mixing proportions, and $\{\vec{\theta}_1, \vec{\theta}_2, \ldots\}$, which are the model parameters representing the mixture components. Recall that $z_i$ is the assignment variable of the mixture component with which the data point $y_i$ is associated. The whole procedure can be presented by the following steps.

1) Draw $v_i \mid \alpha \sim \mathrm{Beta}(1, \alpha)$, $i \in \{1, 2, \ldots\}$.

2) Draw $\vec{\theta}_i \mid G_o \sim G_o$, $i \in \{1, 2, \ldots\}$.

3) For the $n$th data point:

(a) Draw $z_n \mid \{v_1, v_2, \ldots\} \sim \mathrm{Multinomial}(\pi(v))$.

(b) Draw $y_n \mid z_n \sim p(y_n \mid \vec{\theta}_{z_n})$.

Fig. 2. Graphical model of an exponential family DP. Nodes are random variables, edges are dependencies, and plates are replications.
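The generative steps 1) to 3) can be sketched in a few lines of code. Since an infinite number of sticks cannot be drawn, the sketch truncates the construction at $K$ sticks and lets the last stick take all the remaining mass; this truncation, the one-dimensional Gaussian base distribution $G_o = N(0, 10^2)$, the unit-variance Gaussian likelihood, and all function names are assumptions made only for illustration.

```python
import random


def stick_breaking_weights(alpha, K, rng):
    """Mixing proportions pi_i(v) of (12), truncated so that the K-th stick is v_K = 1."""
    weights, remaining = [], 1.0
    for k in range(K):
        v = 1.0 if k == K - 1 else rng.betavariate(1.0, alpha)   # step 1
        weights.append(v * remaining)        # v_k * prod_{j<k} (1 - v_j)
        remaining *= 1.0 - v
    return weights


def generate(n, alpha, K, rng):
    """Draw n observations from the truncated stick-breaking mixture."""
    weights = stick_breaking_weights(alpha, K, rng)
    # Step 2: atoms theta_k ~ G_o, here an assumed N(0, 10^2) base distribution.
    atoms = [rng.gauss(0.0, 10.0) for _ in range(K)]
    data = []
    for _ in range(n):
        z = rng.choices(range(K), weights=weights)[0]   # step 3(a)
        y = rng.gauss(atoms[z], 1.0)                    # step 3(b), assumed likelihood
        data.append((z, y))
    return weights, atoms, data


rng = random.Random(1)
weights, atoms, data = generate(n=50, alpha=2.0, K=5, rng=rng)
print("mixing proportions:", [round(w, 3) for w in weights])
print("first observations:", [(z, round(y, 2)) for z, y in data[:3]])
```

A smaller $\alpha$ concentrates most of the mass on the first few sticks, while a larger $\alpha$ spreads it over more components.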

In this paper, we consider DP mixtures for which the observable data are drawn from an exponential family distribution and the base distribution is the corresponding conjugate prior. The stick-breaking construction of the DP is presented in the form of a Bayesian network in Fig. 2, which shows the conditional dependencies of $v_k$ and $z_n$. The distribution of $y_n$, conditioned on $z_n$ and $\{\vec{\theta}_1, \vec{\theta}_2, \ldots\}$, is given as

$$p(y_n \mid z_n, \vec{\theta}_1, \vec{\theta}_2, \ldots) = \prod_{i=1}^{\infty} \left( h(y_n) \exp\{\vec{\theta}_i^{T} y_n - a(\vec{\theta}_i)\} \right)^{1[z_n = i]}, \tag{13}$$

where $a(\cdot)$ is the appropriate cumulant generating function and $h(\cdot)$ is a nonnegative function. Here, $y$ is the sufficient statistic for the parameter $\vec{\theta}$, since for exponential families the sufficient statistic is a function that fully summarizes the data within the density function. The vector of sufficient statistics of the corresponding conjugate family is $(\vec{\theta}^{T}, -a(\vec{\theta}))^{T}$. The base distribution $G_o$ can be written as

$$p(\vec{\theta} \mid \lambda) = h(\vec{\theta}) \exp\{\lambda_1^{T} \vec{\theta} + \lambda_2(-a(\vec{\theta})) - a(\lambda)\}, \tag{14}$$

where we decompose the hyperparameter $\lambda$ such that $\lambda_1$ contains the first $\dim(\vec{\theta})$ components and $\lambda_2$ is a scalar. Owing to its discreteness property, the DP can be used as a prior in infinite mixture models. To calculate the posterior probability, we employ the Dirichlet process as a prior and proceed to estimate the posterior using variational inference.

4.2 Inference for DPMM

Since we have a prior in the form of a DP, we rely on variational inference for the posterior estimation of the DP.³ In this subsection, we employ variational inference to estimate the posterior distribution over a set of latent variables given the observations $y_i$. Our goal is to find the indicators $z_i$ that cluster the data points. Consider a model with hyperparameters $H$ and observations $y$. Then the posterior distribution of the latent variables $w$ can be written as

$$p(w \mid y, H) = \frac{p(w, y \mid H)}{p(y \mid H)} = \exp\{\log p(y, w \mid H) - \log p(y \mid H)\}. \tag{15}$$

3. We use variational inference to pick a family of distributions over the latent variables $w$, with its own variational parameters $\omega$, and then set $\omega$ so as to make $q_\omega(\cdot)$ close to the posterior of interest.

Working directly with the posterior in (15) is typically intractable due to the need to compute the normalization constant $p(y \mid H)$ [19]. Let $q_\omega(w)$ be a family of distributions over the latent variables indexed by the variational parameter $\omega$. The evidence lower bound (ELBO) is a lower bound on the logarithm of the marginal probability of the observations, $\log p(y \mid H)$. Using Jensen’s inequality, which implies that the logarithm of an expected value is greater than or equal to the expected value of the logarithm, the ELBO can be derived as

$$\begin{aligned} \log p(y \mid H) &= \log \int_{w} p(w, y \mid H) \, dw \\ &= \log \int_{w} p(w, y \mid H) \frac{q(w)}{q(w)} \, dw \\ &= \log \left( E_q \left[ \frac{p(y, w \mid H)}{q(w)} \right] \right) \\ &\ge E_q[\log p(y, w \mid H)] - E_q[\log q(w)] = \mathcal{L}(q). \end{aligned} \tag{16}$$

We restrict $q(w)$ to be in a family that is tractable, i.e., for which the expectations can be efficiently computed, and we try to find the member of the family that maximizes the bound. Solving this maximization problem is equivalent to finding the member of the family that is closest in Kullback-Leibler (KL) divergence to the posterior [19]. We utilize mean-field variational inference for posterior estimation using optimization. The mean-field methods are based on optimizing the KL divergence with respect to a variational distribution. Maximizing the ELBO is the same as minimizing the gap between the lower bound and $\log p(y \mid H)$, and this gap is the KL divergence between the variational distribution and the true posterior. Therefore, our goal is to minimize the KL divergence between the two distributions $q_\omega(w)$ and $p(w \mid y, H)$, i.e.,

$$\begin{aligned} KL(q_\omega(w) \,\|\, p(w \mid y, H)) &= E_q \left[ \log \frac{q_\omega(w)}{p(w \mid y, H)} \right] \\ &= E_q[\log q_\omega(w)] - E_q[\log p(w \mid y, H)] \\ &= -\left( E_q[\log p(w, y \mid H)] - E_q[\log q_\omega(w)] \right) + \log p(y \mid H) \\ &= -\mathcal{L}(q) + \log p(y \mid H). \end{aligned} \tag{17}$$

The term $\log p(y \mid H)$ does not depend on $\omega$. Hence, minimizing the KL divergence in (17) is the same as maximizing the ELBO, and doing so yields the approximate posterior over the latent variables. Since the KL divergence is nonnegative, (17) can be written as a lower bound (ELBO) on the log marginal likelihood,

$$\log p(y \mid H) \ge E_q[\log p(w, y \mid H)] - E_q[\log q_\omega(w)]. \tag{18}$$

The gap is the KL divergence between the variational distribution $q_\omega(\cdot)$ and the true posterior $p(\cdot \mid y, H)$. Since the mean-field variational methods are based on the KL divergence measure, it is necessary to choose a family of distributions $q_\omega(w)$ such that the optimization in (17) is tractable. By doing so, one can break some of the dependencies between the latent variables that make the true posterior difficult to compute. Therefore, we consider fully factorized variational distributions that break all the dependencies.
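The relation between the ELBO, the KL divergence, and the log marginal likelihood in (16) to (18) can be checked numerically on a toy model. In the sketch below, the latent variable $w$ takes three values and the joint probabilities $p(w, y \mid H)$ for one fixed observation are made-up numbers; the variational distribution `q` is also arbitrary. Both are assumptions used only to verify the identity.

```python
import math

# Hypothetical joint p(w, y | H) for a fixed observation y and w in {0, 1, 2}.
joint = [0.10, 0.25, 0.05]
evidence = sum(joint)                        # p(y | H), the normalization constant
posterior = [p / evidence for p in joint]    # p(w | y, H) as in (15)


def elbo(q):
    """E_q[log p(w, y | H)] - E_q[log q(w)], the bound in (16)."""
    return sum(qw * (math.log(pw) - math.log(qw))
               for qw, pw in zip(q, joint) if qw > 0)


def kl(q, p):
    """KL(q || p) for two discrete distributions on the same support."""
    return sum(qw * math.log(qw / pw) for qw, pw in zip(q, p) if qw > 0)


q = [0.5, 0.3, 0.2]                          # an arbitrary variational distribution
gap = math.log(evidence) - elbo(q)
print("log evidence:", math.log(evidence))
print("ELBO:", elbo(q))
print("gap:", gap, " KL:", kl(q, posterior))
```

The printed gap and KL divergence coincide, and the gap vanishes when `q` equals the exact posterior, which is the sense in which maximizing the ELBO minimizes the KL divergence.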

4.3 Variational Inference for DPMM

Here, we propose a mean-field variational algorithm, which is a particular class of variational methods. In this subsection, we instantiate the latent variables and hyperparameters for the DPMM. The proposed algorithm is based on the stick-breaking representation of the DP mixture (Fig. 2). Therefore, the latent variables are the stick lengths, the model parameters (atoms), and the cluster assignment indicators for the data points, i.e., $w = \{V, \Theta, Z\}$, and the hyperparameters are the scaling parameter and the parameters of the conjugate base distribution, $H = \{\alpha, \lambda\}$. From (18), the variational bound on the log marginal probability of the data can be given as

$$\begin{aligned} \log p(y \mid H) \ge{} & E_q[\log p(V \mid \alpha)] + E_q[\log p(\vec{\theta} \mid \lambda)] \\ & + \sum_{n=1}^{N} \left( E_q[\log p(Z_n \mid V)] + E_q[\log p(y_n \mid Z_n)] \right) - E_q[\log q(V, \vec{\theta}, Z)]. \end{aligned} \tag{19}$$

For the above bound, we need to find a family of variational distributions that approximates the distribution of $G$, which is expressed in terms of $V = \{V_1, V_2, \ldots\}$ and $\Theta = \{\vec{\theta}_1, \vec{\theta}_2, \ldots\}$. The truncated stick-breaking representation [23] is used here. Therefore, the value of $K$ is fixed. Let $q(v_K = 1) = 1$, which implies that the mixture proportions $\pi_k(v)$ are equal to zero for $k > K$ in (12). In [23], truncated stick-breaking is used for sampling-based inference. In our case, however, the DP itself is not truncated; only the variational distribution is. Therefore, $K$ is a free variational parameter that can be set manually.

The factorized family of variational distributions for mean-field inference can be written as

$$q(v, \vec{\theta}, z) = \prod_{k=1}^{K-1} q_{\eta_k}(v_k) \prod_{k=1}^{K} q_{\tau_k}(\vec{\theta}_k) \prod_{n=1}^{N} q_{\rho_n}(z_n). \tag{20}$$

In the above equation, $q_{\eta_k}(v_k)$ are beta distributions, $q_{\tau_k}(\vec{\theta}_k)$ are exponential family distributions, and $q_{\rho_n}(z_n)$ are multinomial distributions. The variational parameters are given by

$$\omega = \{\eta_1, \ldots, \eta_{K-1}, \tau_1, \ldots, \tau_K, \rho_1, \ldots, \rho_N\}. \tag{21}$$

Each latent variable thus has its own variational parameters under the variational distribution. Therefore, we need to optimize the bound in (19) with respect to the variational parameters in (21).

4.4 Optimizing With Coordinate Ascent Method

Here, a coordinate ascent algorithm is presented, which optimizes the bound in (19) with respect to the variational parameters. In (19), all the terms involve standard computations; however, the summation term needs to be rewritten in terms of indicator random variables. Rewriting the third term in (19) using the indicator random variables, we have

$$E_q[\log p(Z_n \mid V)] = E_q\!\left[\log\!\left(\prod_{i=1}^{\infty} (1 - V_i)^{\mathbf{1}[z_n > i]}\, V_i^{\mathbf{1}[z_n = i]}\right)\right], \tag{22}$$

where we know that E_q[log(1 − V_K)] = 0 and q(z_n > K) = 0. By introducing the truncation, we can truncate the summation at i = K, which gives

$$E_q[\log p(Z_n \mid V)] = \sum_{i=1}^{K} \Big( q(z_n > i)\, E_q[\log(1 - V_i)] + q(z_n = i)\, E_q[\log V_i] \Big), \tag{23}$$

where the terms of (23) are given by

$$q(z_n = i) = \rho_{n,i}, \qquad q(z_n > i) = \sum_{j=i+1}^{K} \rho_{n,j}, \tag{24}$$

$$E_q[\log V_i] = \psi(\eta_{i,1}) - \psi(\eta_{i,1} + \eta_{i,2}), \qquad E_q[\log(1 - V_i)] = \psi(\eta_{i,2}) - \psi(\eta_{i,1} + \eta_{i,2}). \tag{25}$$

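The function ψ(·) is the digamma function, which arises from the derivative of the log normalization factor of the beta distribution. By using the mean-field coordinate ascent method, we have

$$\eta_{k,1} = 1 + \sum_{n} \rho_{n,k}, \qquad \eta_{k,2} = \alpha + \sum_{n} \sum_{j=k+1}^{K} \rho_{n,j}, \tag{26}$$

$$\tau_{k,1} = \lambda_1 + \sum_{n} \rho_{n,k}\, y_n, \qquad \tau_{k,2} = \lambda_2 + \sum_{n} \rho_{n,k}, \qquad \rho_{n,k} \propto \exp(S_k), \tag{27}$$

where, for k ∈ {1, ..., K} and n ∈ {1, ..., N},

$$S_k = E_q[\log V_k] + \sum_{i=1}^{k-1} E_q[\log(1 - V_i)] + E_q[\vec{\theta}_k]^{\top} y_n - E_q[a(\vec{\theta}_k)]. \tag{28}$$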

Iterating over these updates, we optimize (19) with respect to the variational parameters defined in (21). The variational distributions need to be initialized according to the specific application, since a poor choice of initialization may lead to a local maximum [19]. Here, the variational distribution is initialized by incrementally updating the parameters based on the data set Y. The algorithm is run multiple times, and the best parameters found are used for clustering.
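To make the beta-parameter update in (26) and the expectations in (25) concrete, the following Python sketch computes them from a matrix of responsibilities. It is only an illustration under stated assumptions: the function names, the toy responsibility matrix, and the series-based digamma approximation are ours and are not taken from the paper.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0.

    Uses the recurrence psi(x) = psi(x + 1) - 1/x to shift x upward,
    then a standard asymptotic series (an implementation choice).
    """
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def update_beta_params(rho, alpha):
    """Beta parameters of the first K-1 sticks, following (26).

    rho[n][k] is the responsibility of cluster k for data point n.
    """
    n_clusters = len(rho[0])
    eta = []
    for k in range(n_clusters - 1):
        eta1 = 1.0 + sum(row[k] for row in rho)
        eta2 = alpha + sum(sum(row[k + 1:]) for row in rho)
        eta.append((eta1, eta2))
    return eta


def expected_log_sticks(eta):
    """Pairs (E[log V_k], E[log(1 - V_k)]), following (25)."""
    return [(digamma(a) - digamma(a + b), digamma(b) - digamma(a + b))
            for a, b in eta]


# Toy responsibilities for N = 3 points and K = 3 clusters (hypothetical).
rho = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.5, 0.5]]
eta = update_beta_params(rho, alpha=1.0)
log_sticks = expected_log_sticks(eta)
```

In a full coordinate ascent loop, these quantities would be recomputed after every update of the responsibilities; the updates in (27) and (28) depend on the chosen exponential family and are omitted here.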

The predictive distribution is given as

$$p(y_{N+1} \mid y, H) = \int \left( \sum_{k=1}^{\infty} \pi_k(v)\, p(y_{N+1} \mid \vec{\theta}_k) \right) dp(v, \Theta \mid y, H). \tag{29}$$

Under the factorized variational approximation, the distributions of the latent variables θ_k and of the stick lengths are separated, and the infinite sum is truncated [19]. Therefore, we can approximate the predictive distribution by a sum of products of expectations, which is easy to compute with the variational approximation,

$$p(y_{N+1} \mid y, H) \approx \sum_{k=1}^{K} E_q[\pi_k(V)]\, E_q[p(y_{N+1} \mid \vec{\theta}_k)], \tag{30}$$

where q depends on y, α, and λ.

A possible extension is to integrate over a diffuse prior on the scaling parameter α. The final result is obtained in terms of the latent variables z_i and θ_k. Recall that z_i is the cluster assignment variable that assigns observations to a cluster (traffic pattern). In the next section, we optimize the SU transmission strategy based on the clustering results for each traffic pattern.
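The approximation in (30) can be sketched in a few lines of Python. Under the factorized variational distribution the stick fractions are independent beta variables, so the expected mixture weight of component k is the mean of its own fraction times the means of the complements of the earlier fractions. The function names and the constant component densities below are hypothetical placeholders used only to exercise the code.

```python
def expected_mixture_weights(eta):
    """Expected weights E_q[pi_k(V)] for K components.

    eta holds the beta parameters (eta_1, eta_2) of the first K-1
    sticks; the K-th stick fraction is fixed to 1 by the truncation.
    """
    weights = []
    remaining = 1.0
    for eta1, eta2 in eta:
        mean_v = eta1 / (eta1 + eta2)  # mean of a Beta(eta1, eta2) variable
        weights.append(mean_v * remaining)
        remaining *= 1.0 - mean_v
    weights.append(remaining)
    return weights


def predictive_density(y_new, eta, component_densities):
    """Approximate predictive density as in (30).

    component_densities[k] is a callable standing in for
    E_q[p(y | theta_k)]; its form depends on the chosen model.
    """
    weights = expected_mixture_weights(eta)
    return sum(w * f(y_new) for w, f in zip(weights, component_densities))


# Hypothetical beta parameters and constant densities, for illustration only.
weights = expected_mixture_weights([(1.0, 1.0), (1.0, 1.0)])  # [0.5, 0.25, 0.25]
density = predictive_density(
    0.0, [(1.0, 1.0), (1.0, 1.0)],
    [lambda y: 1.0, lambda y: 2.0, lambda y: 4.0],
)
```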

5 SU LEARNING FRAMEWORK

Given the clustering results and the traffic pattern statistics from Section 4, we propose a sensing-transmission strategy for the SU. From Bayesian nonparametric inference, the SU has knowledge of the following statistics: i) the number of traffic patterns (K) with their corresponding parameters θ_i = {p_int, p_l, Σ}, where i ∈ {1, ..., K}; and ii) the probability density function and its parameters for each feature point, e.g., the PU packet inter-arrival time p_int, which follows the Pareto long-tail distribution with scale parameter ζ_k^C and shape parameter ζ_k^H for traffic pattern k.

In this section, we utilize these PU traffic pattern statistics for the sensing-transmission edifice. We propose a utility function that rewards the SU for successful packet transmissions and penalizes it for colliding with the PU. We show how the SU can adapt its transmission schedule in the STO and LTO states based on the traffic patterns.

5.1 Idle Probability-Based PU State Transition

Let t be the time elapsed since the PU's last state transition from busy to idle. Given that the PU is idle at time t under traffic pattern k, the probabilities Ω_t that the PU will remain idle during the SU actions (sensing or transmission) are

$$\Omega_t^{S} = \frac{1 - F_X(t + \Delta_S)}{1 - F_X(t)}, \qquad \Omega_t^{T} = \frac{1 - F_X(t + \Delta_T)}{1 - F_X(t)}, \tag{31}$$

where the superscripts S and T denote SU sensing and transmission, respectively, Δ_S and Δ_T are the corresponding durations, and F_X(·) is the cumulative distribution function of the PU idle time.
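The conditional probability in (31) can be evaluated for any idle-time distribution whose cumulative distribution function is available. The sketch below is a minimal Python illustration; the exponential idle-time distribution and its rate are hypothetical choices made only to show the memoryless special case, and they are not the distribution used in the paper.

```python
import math


def prob_remain_idle(cdf, t, duration):
    """Probability that the PU stays idle for `duration` more time units,
    given that it has been idle for t time units, as in (31)."""
    survival_now = 1.0 - cdf(t)
    if survival_now <= 0.0:
        return 0.0  # the idle period has ended with certainty
    return (1.0 - cdf(t + duration)) / survival_now


# Hypothetical exponential idle time with rate 0.1 (illustration only).
def exp_cdf(x):
    return 1.0 - math.exp(-0.1 * x)


omega_sense = prob_remain_idle(exp_cdf, t=3.0, duration=1.0)
omega_transmit = prob_remain_idle(exp_cdf, t=3.0, duration=5.0)
```

For the exponential case the result does not depend on t, whereas for heavy-tailed idle times it generally does, which is why the elapsed idle time enters the SU decision.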

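5.2 SU Learning State

We denote the SU action at time t by a_t ∈ {1: transmit, 0: sense}. The SU obtains an observation o_t after each action a_t. The observation after sensing is o_t^S ∈ {IDLE, BUSY}, and the observation after transmission is o_t^T ∈ {ACK, NACK}.

Let φ_t denote the conditional probability that, at time t, the PU is idle given the action-observation (a_t, o_t) history for a traffic pattern k. The observation o_t, obtained from the action a_t, is used to update the SU estimate of the PU channel idle probability, φ_{(t+Δ_S^{o^S})} or φ_{(t+Δ_T^{o^T})}, which is given as follows: when a_t = 0,

$$\phi_{(t+\Delta_S^{o^S})} = \begin{cases} \dfrac{\phi_t \Omega_t^{S} (1 - P_f)}{\phi_t \Omega_t^{S} (1 - P_f) + (1 - \phi_t \Omega_t^{S})(1 - P_d)}, & \text{if } o^S = \text{IDLE}, \\[2ex] \dfrac{\phi_t \Omega_t^{S} P_f}{\phi_t \Omega_t^{S} P_f + (1 - \phi_t \Omega_t^{S}) P_d}, & \text{if } o^S = \text{BUSY}; \end{cases} \tag{32}$$

when a_t = 1,

$$\phi_{(t+\Delta_T^{o^T})} = \begin{cases} \dfrac{\phi_t \Omega_t^{T} (1 - \gamma_0)}{\phi_t \Omega_t^{T} (1 - \gamma_0) + (1 - \phi_t \Omega_t^{T})(1 - \gamma_1)}, & \text{if } o^T = \text{ACK}, \\[2ex] \dfrac{\phi_t \Omega_t^{T} \gamma_0}{\phi_t \Omega_t^{T} \gamma_0 + (1 - \phi_t \Omega_t^{T}) \gamma_1}, & \text{if } o^T = \text{NACK}. \end{cases} \tag{33}$$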

Recall from Section 3 that a detection probability of 100% cannot be guaranteed; therefore, the sensing error is incorporated in (32). Similarly, a successful packet transmission depends on two conditions: 1) the channel must be idle for the SU packet duration, which occurs with probability φ_t Ω_t^T; and 2) the ACK must be received from the SU receiver. The ACK depends on γ_0 and γ_1, both of which are modeled in (33). In the next subsections, we propose the SU's benefit model in STO and LTO by using (31), (32), and (33). We utilize the generic model for STO and LTO with their respective channel idle time distribution parameters.
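The belief updates in (32) and (33) are Bayes-rule updates of the idle probability after a sensing or a transmission outcome. A minimal Python sketch follows; the function names, the argument names, and the fallback used when an observation has zero probability are assumptions made for illustration.

```python
def _posterior(prior_idle, like_idle, like_busy):
    """Bayes rule for the idle belief given one observation."""
    numerator = prior_idle * like_idle
    evidence = numerator + (1.0 - prior_idle) * like_busy
    if evidence <= 0.0:
        # Assumption: if the observation is impossible under the model,
        # keep the predicted belief instead of dividing by zero.
        return prior_idle
    return numerator / evidence


def update_after_sensing(phi, omega_s, p_f, p_d, observation):
    """Belief update after a sensing action, as in (32)."""
    prior_idle = phi * omega_s  # idle now and still idle after sensing
    if observation == "IDLE":
        return _posterior(prior_idle, 1.0 - p_f, 1.0 - p_d)
    return _posterior(prior_idle, p_f, p_d)  # observation == "BUSY"


def update_after_transmission(phi, omega_t, gamma0, gamma1, observation):
    """Belief update after a transmission action, as in (33)."""
    prior_idle = phi * omega_t
    if observation == "ACK":
        return _posterior(prior_idle, 1.0 - gamma0, 1.0 - gamma1)
    return _posterior(prior_idle, gamma0, gamma1)  # observation == "NACK"


# Hypothetical values: false-alarm probability 0.1, detection probability 0.9.
belief_idle = update_after_sensing(0.8, 1.0, 0.1, 0.9, "IDLE")
belief_busy = update_after_sensing(0.8, 1.0, 0.1, 0.9, "BUSY")
```

With these illustrative numbers, an IDLE observation raises the belief above the prior of 0.8 and a BUSY observation lowers it, which is the qualitative behavior the update is meant to capture.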

5.3 SU Benefit in STO

The SU gains benefit B after each successful transmission and is penalized with loss L if its transmission collides with the PU's transmission. The benefit gained by the SU depends on the current traffic pattern k. The proposed utility function maximizes the benefit obtained in the STO and LTO states. In STO, the opportunity for the SU lies in the time between two successive PU packet arrivals. The length of this period, however, depends on the traffic pattern [4].

From Section 4, the SU has knowledge about the number of traffic patterns and the distribution of each feature point y_i. This knowledge is used to obtain the channel idle probability for a traffic pattern. The packet inter-arrival time, the packet length, and the variance in packet length of the PU traffic affect the channel occupancy/availability statistics [1], [4], [6], [14]. In the sequel, we derive the PU channel idle probabilities during the SU action a_t given a traffic pattern. Let p_k(p_int), p_k(p_l), and p_k(Σ) be the probability density functions of the packet inter-arrival time, the packet length, and the variance in packet length, respectively, for traffic pattern k, which can be obtained by marginalizing θ_k, i.e.,

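$$p_k(p_l) = \int_{p_{\mathrm{int}}} \int_{\Sigma} \vec{\theta}_k \, dp_{\mathrm{int}} \, d\Sigma, \tag{34}$$

$$p_k(p_{\mathrm{int}}) = \int_{p_l} \int_{\Sigma} \vec{\theta}_k \, dp_l \, d\Sigma, \tag{35}$$

$$p_k(\Sigma) = \int_{p_{\mathrm{int}}} \int_{p_l} \vec{\theta}_k \, dp_{\mathrm{int}} \, dp_l. \tag{36}$$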

Let D_k = E[p_l] be the expected PU packet duration and Σ_k = E[Σ] be the expected variance in packet length for traffic pattern k. The PU packet inter-arrival times follow a Pareto long-tail distribution p(p_int) with scale and shape parameters ζ_k^C and ζ_k^H, respectively [17], [18]. From the measurement results, D_k is constant for most of the traffic patterns. However, traffic patterns such as gaming show variable packet lengths.

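Given that the PU is idle at time t and the traffic pattern is k, the probability from (31) that the PU remains idle during the SU action is given as

$$\Omega_t^{S} = \left( \frac{\zeta_k^{C}}{\zeta_k^{C} + \Delta_S} \right)^{\zeta_k^{H}}, \qquad \Omega_t^{T} = \left( \frac{\zeta_k^{C}}{\zeta_k^{C} + \Delta_T} \right)^{\zeta_k^{H}}. \tag{37}$$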
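The closed form in (37) is straightforward to evaluate. The Python sketch below does so for a hypothetical traffic pattern; the scale, shape, and duration values are illustrative assumptions and are not taken from Table 2.

```python
def pareto_remain_idle(scale, shape, duration):
    """Probability that the PU remains idle during an SU action of the
    given duration, using the closed form in (37)."""
    return (scale / (scale + duration)) ** shape


# Hypothetical parameters: scale 10, shape 2, sensing time 1, packet time 5.
omega_sense = pareto_remain_idle(scale=10.0, shape=2.0, duration=1.0)
omega_transmit = pareto_remain_idle(scale=10.0, shape=2.0, duration=5.0)
```

As expected, the longer transmission action is less likely to complete before the PU returns than the shorter sensing action.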

Let φ_{t,sto}^k denote the conditional probability that the PU channel is idle at time t given a traffic pattern k. We model the PU's STO state as a renewal reward process, in which a cycle completes with the arrival of a PU packet. The packet inter-arrival times are assumed to be ergodic and independent and identically distributed (i.i.d.). Therefore, the probability that the PU is idle at time t, given ζ_k^C, D_k, and Σ_k, can be written as

$$\phi_{t,\mathrm{sto}}^{k} = 1 - \frac{E[\text{PU packet duration of traffic } k]}{E[\text{PU packet inter-arrival time of traffic } k]} = 1 - \frac{D_k (\zeta_k^{H} - 1)\,[1 - Q(\Sigma_k)]}{\zeta_k^{C} \zeta_k^{H}}, \tag{38}$$

where Q(Σ_k) is the Q function for the variance in packet length, given as p(p_l > Σ) = Q((Σ − μ)/σ), which is the probability that Σ_k of traffic pattern k is away from its mean μ by at least Σ − μ.
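As a worked illustration of (38), the following Python sketch evaluates the STO idle probability for given packet duration, Pareto parameters, and a value of the Q function. The Gaussian tail helper uses the standard identity Q(x) = erfc(x/√2)/2. All numeric values and function names are hypothetical, and the Q-function value is passed in directly rather than derived from measured data.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def idle_probability_sto(packet_duration, scale, shape, q_value):
    """Idle probability in the STO state, following (38).

    packet_duration is the expected PU packet duration, scale and shape
    are the Pareto parameters, and q_value is the Q-function term.
    """
    busy_fraction = (packet_duration * (shape - 1.0) * (1.0 - q_value)
                     / (scale * shape))
    return 1.0 - busy_fraction


# Hypothetical numbers: packet duration 2, scale 10, shape 2, Q term 0.
phi_sto = idle_probability_sto(2.0, 10.0, 2.0, q_value=0.0)  # about 0.9
```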

From (37) and (38), the probability that the PU is idle at time t and remains idle during the SU's packet transmission of length Δ_T is φ_t^k Ω_t^T. Therefore, the probability that the SU successfully transmits a packet and receives an ACK from the SU receiver is φ_t Ω_t^T (1 − γ_0) + (1 − φ_t Ω_t^T)(1 − γ_1). The immediate benefit gained by the SU is then given by

$$b_t(\phi_{t,\mathrm{sto}}^{k}) = \left[ \phi_t \Omega_{t,\mathrm{sto}}^{T} (1 - \gamma_0) + (1 - \phi_t \Omega_{t,\mathrm{sto}}^{T})(1 - \gamma_1) \right] B \Delta_T, \tag{39}$$

where B is the benefit that the SU gets after a successful transmission, i.e., extra time granted to the SU for transmission. However, the SU is penalized with loss L when it collides with the PU. The expected immediate penalty is given by

$$c_t(\phi_{t,\mathrm{sto}}^{k}) = (1 - \phi_{t,\mathrm{sto}}^{k} \Omega_{t,\mathrm{sto}}^{T})\, L \Delta_T. \tag{40}$$

Therefore, the expected immediate utility that the SU can obtain at time t with information state φ_t^k is given by

$$r_t(\phi_{t,\mathrm{sto}}^{k}, 1) = b_t(\phi_{t,\mathrm{sto}}^{k}) - c_t(\phi_{t,\mathrm{sto}}^{k}), \tag{41}$$

$$r_t(\phi_{t,\mathrm{sto}}^{k}, 0) = 0. \tag{42}$$
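The immediate utility in (39) to (42) can be computed directly once the idle belief and the probability of remaining idle are known. The Python sketch below is a minimal illustration; the function and argument names and the numeric values are assumptions.

```python
def immediate_utility(phi, omega_t, gamma0, gamma1,
                      benefit, penalty, delta_t, action):
    """Expected immediate utility of one SU action, as in (39)-(42).

    action is 1 for transmit and 0 for sense; sensing yields zero utility.
    """
    if action == 0:
        return 0.0
    idle = phi * omega_t  # PU idle now and throughout the transmission
    expected_benefit = (idle * (1.0 - gamma0)
                        + (1.0 - idle) * (1.0 - gamma1)) * benefit * delta_t
    expected_penalty = (1.0 - idle) * penalty * delta_t
    return expected_benefit - expected_penalty


# Hypothetical setting: B = 1, L = 2, packet duration 5, gamma0 = 0.1, gamma1 = 1.
reward = immediate_utility(0.9, 0.8, 0.1, 1.0,
                           benefit=1.0, penalty=2.0, delta_t=5.0, action=1)
```

In this example the expected benefit is 3.24 and the expected penalty is 2.8, so transmitting has a small positive expected utility; a larger penalty L would make it negative.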

5.4 SU Benefit in LTO

LTO is the state in which the PU is idle. Extensive measurements of real traffic traces show that the channel idle duration follows a heavy-tail distribution [9], [10], [17], [18]. In LTO, the channel history record for action a_t is updated in the same way as in STO, using (32) and (33). Let φ_{t,lto}^k be the conditional probability that the PU is idle at time t. Following (39) and (40), the probability of receiving an ACK for a successful SU transmission is φ_t Ω_{t,lto}^T (1 − γ_0) + (1 − φ_t Ω_{t,lto}^T)(1 − γ_1). The immediate benefit obtained by the SU from a successful packet transmission is given by

$$m_t(\phi_{t,\mathrm{lto}}^{k}) = \left[ \phi_t \Omega_t^{T} (1 - \gamma_0) + (1 - \phi_t \Omega_t^{T})(1 - \gamma_1) \right] B \Delta_T. \tag{43}$$

Similarly, the SU is penalized with the immediate expected penalty

$$c_t(\phi_{t,\mathrm{lto}}^{k}) = (1 - \phi_{t,\mathrm{lto}}^{k} \Omega_{t,\mathrm{lto}}^{T})\, L \Delta_T. \tag{44}$$

Therefore, the expected utility becomes

$$r_t(\phi_{t,\mathrm{lto}}^{k}, 1) = m_t(\phi_{t,\mathrm{lto}}^{k}) - c_t(\phi_{t,\mathrm{lto}}^{k}), \tag{45}$$

$$r_t(\phi_{t,\mathrm{lto}}^{k}, 0) = 0. \tag{46}$$

5.5 Maximizing Utility for the SU Sensing-Transmission

Let ID_f/BS_f be the length of the f-th idle/busy period. Since ID_f/BS_f are i.i.d. random variables, the SU spectrum access in the long run consists of repeated trials of the access policy in a PU cycle. An access policy of the SU is denoted by ϒ, which is a function that selects an action from the action space {0, 1} at time t. Our objective is to find the sensing-transmission policy that maximizes the expected utility per unit time, given by

$$\max_{\Upsilon}\ \lim_{F \to \infty} \frac{\sum_{f=1}^{F} \left[ \sum_{j=1}^{L_k} \sum_{i=1}^{M_k} r_i^{j}(\phi_{i,\mathrm{sto}}^{k}, a_i) + \sum_{i=1}^{P_k} r_i^{j}(\phi_{i,\mathrm{lto}}^{k}, a_i) \right] / F}{\sum_{f=1}^{F} (ID_f + BS_f) / F}, \tag{47}$$

where the sum of r_i^j(φ_{i,sto}^k, a_i) over i = 1, ..., M_k is the total reward obtainable by the SU in the j-th inter-arrival time of the PU packets under traffic pattern k. Here, M_k is the number of short transmission opportunities for the SU in the j-th STO, as shown in Fig. 1 (right), and L_k is the expected number of STOs (SU opportunities), as shown in Fig. 1 (left). Similarly, P_k is the expected number of packet transmission opportunities for the SU in LTO. Our goal is to maximize the average utility per unit time given in (47).

The total utility obtained in each idle-busy cycle is independently and identically distributed. Therefore, by the law of large numbers, the limit in (47) becomes

$$\frac{E\left[ \sum_{j=1}^{L_k} \sum_{i=1}^{M_k} r_i^{j}(\phi_{i,\mathrm{sto}}^{k}, a_i) + \sum_{i=1}^{P_k} r_i^{j}(\phi_{i,\mathrm{lto}}^{k}, a_i) \right]}{E[ID] + E[BS]}. \tag{48}$$

Since E[ID] + E[BS] is fixed, the average-utility optimization problem reduces to maximizing the total expected utility in an ID/BS period. The optimal spectrum access policy attains the maximum utility function

$$U(0, \phi) = \sup_{\Upsilon} U_{\Upsilon}(0, \phi), \tag{49}$$

$$U_{\Upsilon}(0, \phi) = E_{\Upsilon}\!\left[ \left( \sum_{j=1}^{L_k} \sum_{i=1}^{M_k} r_i^{j}(\phi_{i,\mathrm{sto}}^{k}, a_i) + \sum_{i=1}^{P_k} r_i^{j}(\phi_{i,\mathrm{lto}}^{k}, a_i) \right) \,\middle|\, \phi_0 = \phi \right], \tag{50}$$

where U_ϒ(0, φ) is the utility achieved by policy ϒ. Here, it is assumed that the SU can detect the beginning of the PU idle period, i.e., the value of t is perfectly known.

5.6 Optimality Formulation

Let U(t, φ) be the maximum expected utility of the SU at time t. We rely on the Bellman equation [24] to write the value of the decision problem at a certain point in time. The Bellman equation suits our case for the following reasons: 1) it is a dynamic programming equation associated with discrete-time optimization; 2) it breaks a multi-period (STO/LTO) planning problem into simpler steps at different points in time; and 3) it optimizes the objective function. Therefore, we maximize the expected utility function in (48) as

$$U(t, \phi) = \max_{\{0,1\}} \left\{ S(t, \phi),\ M(t, \phi) \right\}, \tag{51}$$

where S(t, φ) and M(t, φ) are the expected utilities obtained by the SU when sensing the channel and when transmitting a packet, respectively. S(t, φ) and M(t, φ) are depicted in Fig. 3. From Fig. 3, for a fixed collision probability, the SU transmission opportunity is reached earlier with the proposed approach than with the optimal sensing-transmission structure (OSTS) approach [7]. Here, S(t, φ) and M(t, φ) are given in (52) and (53), respectively.

Fig. 3. S(t, φ) and M(t, φ) for the proposed and OSTS approaches.

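$$S(t, \phi) = \begin{cases} \displaystyle\sum_{i \in \{ID, BS\}} p\left[o_{t+\Delta_S}^{S} = i\right] U\!\left(t + \Delta_S,\ \phi_{t+\Delta_S,\mathrm{sto}}^{k}(i)\right), & \text{if STO}, \\[2ex] \displaystyle\sum_{i \in \{ID, BS\}} p\left[o_{t+\Delta_S}^{S} = i\right] U\!\left(t + \Delta_S,\ \phi_{t+\Delta_S,\mathrm{lto}}^{k}(i)\right), & \text{if LTO}, \end{cases} \tag{52}$$

$$M(t, \phi) = \begin{cases} \left[ \phi_{t,\mathrm{sto}}^{k} \Omega_{t,\mathrm{sto}}^{T} (1 - \gamma_0) B + (1 - \phi_{t,\mathrm{sto}}^{k} \Omega_{t,\mathrm{sto}}^{T}) \big( (1 - \gamma_1) B - L \big) \right] \Delta_T \\ \quad + \displaystyle\sum_{j \in \{A, N\}} p\left[o_{t+\Delta_T}^{T} = j\right] U\!\left(t + \Delta_T,\ \phi_{t+\Delta_T}(j)\right), & \text{if STO}, \\[2ex] \left[ \phi_{t,\mathrm{lto}}^{k} \Omega_{t,\mathrm{lto}}^{T} (1 - \gamma_0) B + (1 - \phi_{t,\mathrm{lto}}^{k} \Omega_{t,\mathrm{lto}}^{T}) \big( (1 - \gamma_1) B - L \big) \right] \Delta_T \\ \quad + \displaystyle\sum_{j \in \{A, N\}} p\left[o_{t+\Delta_T}^{T} = j\right] U\!\left(t + \Delta_T,\ \phi_{t+\Delta_T}(j)\right), & \text{if LTO}. \end{cases} \tag{53}$$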
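The recursion in (51) to (53) can be illustrated with a small finite-horizon Python sketch. It is not the authors' implementation: the terminal condition (zero utility at or beyond a fixed horizon), the rounding of beliefs for memoization, and all names and parameter values are assumptions, and the STO and LTO cases are represented only through the functions that supply the probability of remaining idle.

```python
from functools import lru_cache


def build_value_function(omega_s, omega_t, delta_s, delta_t, horizon,
                         benefit, penalty, p_f, p_d, gamma0, gamma1):
    """Return (value, sense, transmit) for a truncated version of (51)-(53).

    omega_s(t) and omega_t(t) give the probability that the PU stays idle
    during a sensing or a transmission action that starts at time t.
    """

    def branch(prior_idle, like_idle, like_busy):
        # Observation probability and posterior idle belief (Bayes rule).
        evidence = prior_idle * like_idle + (1.0 - prior_idle) * like_busy
        if evidence <= 0.0:
            return 0.0, 0.0
        return evidence, prior_idle * like_idle / evidence

    def expected_future(t_next, prior_idle, likelihoods):
        total = 0.0
        for like_idle, like_busy in likelihoods:
            prob, belief = branch(prior_idle, like_idle, like_busy)
            if prob > 0.0:
                # Beliefs are rounded so that memoization can reuse states.
                total += prob * value(t_next, round(belief, 6))
        return total

    def sense(t, phi):
        idle = phi * omega_s(t)
        outcomes = ((1.0 - p_f, 1.0 - p_d), (p_f, p_d))  # IDLE, BUSY
        return expected_future(t + delta_s, idle, outcomes)

    def transmit(t, phi):
        idle = phi * omega_t(t)
        reward = (idle * (1.0 - gamma0) * benefit
                  + (1.0 - idle) * ((1.0 - gamma1) * benefit - penalty)) * delta_t
        outcomes = ((1.0 - gamma0, 1.0 - gamma1), (gamma0, gamma1))  # ACK, NACK
        return reward + expected_future(t + delta_t, idle, outcomes)

    @lru_cache(maxsize=None)
    def value(t, phi):
        if t >= horizon:  # assumed terminal condition of the sketch
            return 0.0
        return max(sense(t, phi), transmit(t, phi))

    return value, sense, transmit


# Hypothetical setting with constant probabilities of remaining idle.
value, sense, transmit = build_value_function(
    omega_s=lambda t: 0.95, omega_t=lambda t: 0.9,
    delta_s=1, delta_t=2, horizon=8,
    benefit=1.0, penalty=2.0, p_f=0.1, p_d=0.9, gamma0=0.1, gamma1=1.0)
action = "sense" if sense(0, 0.9) > transmit(0, 0.9) else "transmit"
```

The sketch enumerates observation branches explicitly, so it is only practical for short horizons; a threshold structure such as the one derived below avoids this enumeration.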

From (52) and (53), the total expected utility obtained by the SU at time t in STO and LTO is given by (51). If S(·) is greater than M(·), the SU senses; otherwise, it transmits. We assume that the sensing result is idle if the PU channel is idle for the whole sensing duration Δ_S. The SU should keep sensing until the PU channel becomes idle. Next, the following lemma establishes a property of the utility.


Lemma 1. U(t, φ) is a convex function of φ for given t.

Proof. Let 0 ≤ p ≤ 1 and 0 ≤ φ_1, φ_2 ≤ 1. We show that U(t, pφ_1 + (1 − p)φ_2) ≤ pU(t, φ_1) + (1 − p)U(t, φ_2), following the argument in [24]. Suppose that the initial state φ is determined by flipping a biased coin that shows heads with probability p. If the coin shows heads, the initial state becomes φ = φ_1; otherwise, φ = φ_2. If the outcome of the coin flip is known, the maximum reward obtained is pU(t, φ_1) + (1 − p)U(t, φ_2). If the outcome of the coin flip is not known, the maximum reward obtained is U(t, pφ_1 + (1 − p)φ_2). Since knowing the outcome cannot reduce the maximum reward, the result holds.

Furthermore, we define a technical condition under which the optimal threshold-based policy is defined as

$$T^{*} = \min\left\{ \tau : \Omega_t^{T} < \frac{L - (1 - \psi_1) B}{(\psi_1 - \psi_0) B + L},\ \forall t > \tau \right\}. \tag{54}$$

When ψ_0 = 0 and ψ_1 = 1, (54) can be written as

$$T^{*} = \min\left\{ \tau : \Omega_t^{T} < L/(B + L),\ \forall t > \tau \right\}. \tag{55}$$

The physical meaning of T∗ is that the SU should remain silent and not transmit after T∗. Recall that Ω_t^T is the conditional probability that the PU will not return during an SU transmission of length Δ_T, given that the PU is idle at time t. Relying on (54) and the immediate utility r_t(·) in (41) and (45), we have ∀t > T∗, r_t(1, 1) < 0, i.e., all transmissions after T∗ result in a negative benefit even when the channel is idle at time t. Therefore, U(t, 1) = 0, ∀t > T∗. Moreover, U(t, φ) is a decreasing function of t [7].
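The cutoff in (55) can be located numerically on a time grid, as in the Python sketch below. The grid search, the function name, and the example (a uniform idle time on [0, 20] with an SU packet duration of 5, for which the probability of remaining idle is (15 − t)/(20 − t)) are assumptions made for illustration.

```python
def transmission_cutoff(omega_t, benefit, penalty, grid):
    """Smallest grid time tau such that omega_t(t) < L / (B + L) holds for
    every later grid time, following (55).

    Returns None if the condition still fails at the last grid point,
    in which case the cutoff may lie beyond the grid or be infinite.
    """
    bound = penalty / (benefit + penalty)
    last_violation = None
    for t in grid:
        if omega_t(t) >= bound:
            last_violation = t
    if last_violation is None:
        return grid[0]  # the condition already holds on the whole grid
    if last_violation == grid[-1]:
        return None
    return last_violation


# Hypothetical uniform idle time on [0, 20] and packet duration 5.
def omega(t):
    return (15.0 - t) / (20.0 - t)


t_star = transmission_cutoff(omega, benefit=1.0, penalty=1.0,
                             grid=list(range(15)))
print(t_star)  # 10
```

With B = L = 1 the bound is 0.5, which the probability of remaining idle reaches at t = 10 and stays below afterwards, so transmissions attempted later than that have negative expected utility in this example.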

We consider the case when T∗ < ∞. In this case, an optimal threshold-based policy is presented. The physical meaning of T∗ < ∞ is that, for large t, the probability that the PU returns during an SU transmission is higher, and thus the potential loss becomes greater than the benefit if L is chosen appropriately. Many distributions satisfy the above condition (see footnote 4). To show that the proposed SU access strategy is optimal, we need to show that the following lemma holds.

Lemma 2. U(t, φ) increases in φ for a given t.

Proof. By the convexity of U(·), we have U(t, φ) ≤ φU(t, 1), and therefore U(t, φ) = 0, ∀φ ∈ [0, 1] when t ≥ T∗. Consequently, ∀t ≥ T∗, U(t, φ) is an increasing function of φ. Suppose that U(t, φ) increases in φ for t > T∗ − i. Then, for t ≥ T∗ − i − 1, the first term on the right side of (51), i.e., S(t, φ), increases in φ with a constant slope ≥ 0, as shown in Fig. 3. The second term in (51), i.e., M(t, φ), increases by the backward induction hypothesis, as shown in Fig. 3. Therefore, U(T∗ − i − 1, φ) increases in φ.

From Lemma 1 and Lemma 2, the proposed spectrum access edifice is optimal [7]. For a distribution that satisfies the technical constraint T∗ < ∞, we find t such that the condition in (54) is satisfied. Furthermore, we use backward induction to find the optimal threshold φ_t^∗ that maximizes the utility U(t, φ).

4. Uniform distribution, Gaussian distribution, Rayleigh distribution, Weibull distributions with shape parameter β > 1, and a class of beta distributions with parameter α ≥ 1 [7]. The condition T∗ < ∞ does not hold for all distributions; for example, for the Weibull distribution with β < 1 [7], T∗ may be infinite because Ω_t^T is an increasing function of t.

TABLE 2. Clusters Parameter Estimation Results for Traffic Sources from Nonparametric Variational Inference

6 EXPERIMENTAL RESULTS

To study the effectiveness of the proposed approach, we consider the simulation results in three aspects:

1) Traffic pattern detection and clustering: the hit rate for assigning every single data point to its correct cluster, which is defined as the number of data points assigned to their original clusters over the total number of feature points.

2) Traffic-based threshold for sensing-transmission: We show the sensing-transmission threshold for various traffic patterns, and demonstrate the effect of the sensing time Δ_S on the sensing-transmission threshold.

3) Throughput analysis: In this metric, we show the performance gain obtained from the proposed algorithm by exploiting STO and LTO states to maximize the SU utility. We compare the throughput obtained by the proposed approach with that of the OSTS [7]. The OSTS approach optimizes the SU's transmission strategy in the PU idle time based on the SU's action-observation history.
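The hit rate defined in item 1) can be computed as in the following Python sketch. It assumes that the cluster labels produced by the clustering step have already been matched to the names of the original traffic sources; the function name and the toy labels are illustrative.

```python
def hit_rate(true_labels, assigned_labels):
    """Fraction of data points assigned to their original cluster."""
    if len(true_labels) != len(assigned_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(1 for truth, assigned in zip(true_labels, assigned_labels)
               if truth == assigned)
    return hits / len(true_labels)


# Toy example: one of four points is placed in the wrong cluster.
rate = hit_rate(["UDP", "VoIP", "Game", "VoIP"],
                ["UDP", "VoIP", "VoIP", "VoIP"])
print(rate)  # 0.75
```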

6.1 Traffic Detection and Clustering

In this subsection, we describe and illustrate the effectiveness of the proposed non-parametric clustering approach for the detection and clustering of different traffic patterns. In this set of experiments, we use real wireless traces of the 3G network that are available online [17], [18]. We utilize three sources (UDP, VoIP, Game) in our data set. Since the proposed clustering approach is non-parametric, an unbounded number of traffic patterns can be clustered. For a different number of traffic patterns in the data set, the hyper-parameters are set accordingly to achieve higher accuracy. The cluster parameters estimated by the approach in Section 4 are shown in Table 2.

6.1.1 Clustering Accuracy

In this set of experiments, the number of traffic patterns varies from 1 to 4. For a specific number of clusters, the set of hyper-parameters is tuned accordingly, as shown in Fig. 4. In Fig. 4 (left and middle), the hit accuracy is shown for different data set sizes. We compared the accuracy of the proposed approach with that of a non-parametric mean shift (MS) algorithm. From the clustering results, the proposed approach outperforms the MS algorithm, achieving 90% to 100% hit accuracy, whereas the MS algorithm does not reach this level of accuracy.

Fig. 4. Hit accuracy with 150 data points (left) and hit accuracy with 100 data points (middle) for the proposed and MS methods. Convergence time for the proposed and MCMC based clustering approaches with corresponding parameters (right).

Fig. 5. Sensing-transmission thresholds for STO (left) and LTO (right) with corresponding cluster parameters.

6.1.2 Convergence Time

We compared the convergence time of the proposed algorithm with that of the Markov chain Monte Carlo (MCMC) method [16]. Fig. 4 (right) shows the convergence time for varying numbers of data points. The clustering results are sensitive to the hyper-parameters; therefore, the respective hyper-parameters for both algorithms are stated in Fig. 4 (right). It is evident from the results that the proposed clustering algorithm converges faster than the MCMC algorithm. Therefore, an SU can make a decision quickly to exploit transmission opportunities.

6.2 Traffic-Based Threshold for Sensing-Transmission

The PU idle time distribution, which follows the Pareto heavy-tail distribution, is obtained from the clustering results in Table 2. It is evident from Table 2 that the parameters of the Pareto heavy-tail distribution are traffic-specific. Therefore, for SU transmission optimization, the parameters for the current traffic application are plugged from the clustering results into (37) and (38). The per time unit benefit is B = 1. Since we exploit both STO and LTO states, the spectrum opportunity is greater. The SU packet duration is Δ_T = 5. The impact of a variable sensing time on the performance is studied in the simulation by setting Δ_S = 1, 2, 4.

Our performance metric is the SU throughput, which refers to the SU successful transmission time normalized by the PU idle-busy cycle. The upper bound on the SU's throughput is more than the PU idle time, due to the transmission opportunities in PU STOs. Therefore, the upper bound can go above 50% spectrum utilization. We consider both perfect sensing and imperfect sensing with the capture effect.

6.2.1 Traffic-Specific Sensing-Transmission Thresholds

Table 2 shows that the parameters of the traffic patterns vary significantly. Consequently, the SU needs to adapt accordingly to optimize its transmission strategy efficiently. Fig. 5 shows the sensing-transmission threshold (φ) in STO and LTO. The STO and LTO state transmission thresholds are different, as the mean idle time distributions are different for the two cases. This is evident from Fig. 5. The optimal threshold φ varies between 0.92 and 0.99, with a different exponential rise in φ for each traffic pattern.

6.2.2 Varying SU Sensing Time

The simulation results show that φ changes with the sensing time Δ_S. It is observed that φ is smaller for larger Δ_S. In Fig. 6 (left), the threshold for STO fluctuates significantly due to the short arrivals of PU packets. Since each inter-arrival time p_int in STO is smaller than in LTO, the threshold varies significantly in STO. We also demonstrate the effect of Δ_S on the threshold φ for various traffic patterns, as shown in Fig. 6 (right). The cluster parameters play a significant role in the adaptation of the SU to the PU traffic pattern, which results in a throughput gain for the SU.

Fig. 6 shows the threshold (φ) for the two states. The threshold for the STO state fluctuates significantly due to the shorter transmission opportunities for the SU. It is evident from Fig. 6 that the SU's transmission threshold changes as the sensing time varies.

Fig. 6. Sensing-transmission thresholds for both STO and LTO states with varying sensing time Δ_S, corresponding to the cluster parameters (left). Sensing-transmission thresholds for the proposed approach with varying sensing time Δ_S (right).

Fig. 7. Throughput analysis under perfect sensing for different traffic patterns (left), and throughput comparison with OSTS approach for VoIP traffic pattern (right).

6.3 Throughput Analysis

In this scenario, we present the throughput of the proposed algorithm and then compare it with that of the OSTS approach [7]. We also demonstrate the impact of perfect and imperfect sensing on throughput.

6.3.1 Perfect Sensing Without ACK

We show the throughput analysis with varying collision penalty L. The performance of the proposed scheme is shown in Fig. 7. When there is no collision penalty, i.e., L = 0, the SU can fully utilize the spectrum opportunities. The throughput decreases as the collision penalty L increases. Therefore, the PU can adjust L to achieve the desired protection against the SUs, and thus aggressive access from the SUs can be effectively controlled. In Fig. 7 (right), the proposed algorithm is compared with the OSTS approach for the VoIP traffic pattern, which shows performance gains in terms of throughput.

In Fig. 8, the throughput for the game and P2P traffic patterns shows approximately similar behavior for the proposed approach. The reason for this similarity lies in the STO/LTO periods: from the clustering results obtained from the proposed variational inference in Table 2, both of these traffic patterns show approximately similar packet inter-arrival times, p_int. However, the variance in packet length, Σ, reduces the overall throughput for the game traffic pattern compared to the UDP traffic pattern, which transmits fixed-size packets. Moreover, the overall throughput for the UDP traffic pattern is less than that for VoIP, game, and P2P. This is due to the smaller packet inter-arrival times, p_int, which result in smaller STO/LTO periods, as shown in Table 2. The performance of the OSTS approach in Fig. 8 for the UDP, game, P2P, and VoIP traffic patterns remains lower because it ignores the STO periods. From Fig. 8, we observe a significant performance gain from exploiting the STOs corresponding to each traffic pattern.


Fig. 8. Throughput comparison of the proposed and OSTS approaches for game (left), P2P (middle), and UDP traffic patterns (right).

Fig. 9. Throughput results of the proposed approach with varying γ1 (left). Throughput results of the proposed and OSTS approaches with γ1 = 0.1,γ1 = 0.5, and γ1 = 1 (middle). Imperfect sensing results for the proposed method (right).

6.3.2 Perfect Sensing with ACK

Recall that γ_0 is the packet error rate due to channel fading or noise, whereas γ_1 is the capture effect at the SU receiver. As shown in Fig. 9, with SU-Rx acknowledgement, we consider imperfect PU/SU collision detection by setting γ_0 = 0.1 and γ_1 = 0.1, 0.5, and 1.0, respectively. We observe that when the collision cost is small, the SU obtains higher throughput with a higher capture effect. The impact of a higher capture effect is demonstrated in Fig. 9. We also compared our proposed approach with the OSTS approach, and show the performance gain with the capture effect γ_1. By adjusting the collision cost, the SU throughput and PU protection can be effectively balanced.

6.3.3 Imperfect Sensing

To test the impact of imperfect detection, we set P_f = 0.1 and the detection probability P_d at various values. From Fig. 9 (right), we observe that the effective throughput obtained by the SU decreases as P_d decreases. This decrease in throughput is due to the increase in the collision penalty L. We also observe from Fig. 9 (right) that the overall throughput decreases as the collision penalty L increases.

7 CONCLUSION

In this paper, we study two main concepts to facilitate the SU in efficiently utilizing PU spectrum opportunities. First, the SU clusters traffic patterns in a nonparametric way by using variational Bayesian inference. Second, based on the obtained parameters, a sensing-transmission strategy is presented for the SU. An SU obtains a reward for each successful transmission and a penalty for each collision. Each clustered traffic pattern has its own unique set of parameters. Those traffic-specific parameters help the SU to optimize its transmission strategies differently for various traffic patterns. From the experimental and simulation results, we have shown that the proposed algorithm has superior performance in detecting various traffic patterns and in maximizing SU throughput. We have also demonstrated that short transmission opportunities can increase spectrum utilization by exploiting traffic-specific characteristics.

ACKNOWLEDGMENTS

This work is partially supported by National Research Foundation of Korea (No. 20090075107) and the Ministry of Science, ICT, and Future Planning, ICT R&D Program 2013. The corresponding author is J. B. Song.

REFERENCES

[1] M. Mellia, A. Pescape, and L. Salgarelli, “Traffic classificationand its applications to modern networks,” Comput. Netw. Int. J.Comput. Telecommun. Netw., vol. 53, no. 5, pp. 759–760, Apr. 2009.

[2] C. Gu, S. Zhang, X. Xue, and H. Huang, “Online wireless meshnetwork traffic classification using machine learning,” J. Comput.Inf. Syst., vol. 7, no. 5, pp. 1524–1532, Aug. 2011.

[3] D. Bonfiglio, M. Mellia, M. Meo, D. Rossi, and P. Tofanelli,“Revealing Skype traffic: When randomness plays with you,”ACM Comput. Commun. Rev., vol. 37, no. 4, pp. 37–48, Oct. 2007.

[4] M. E. Ahmed, J. B. Song, N. T. Nguyen, and Z. Han,“Nonparametric Bayesian identification of primary users’ pay-loads in cognitive radio networks,” in Proc. IEEE Int. Conf.Communication, Ottawa, ON, Canada, Jun. 2012.

[5] M. E. Ahmed, J. B. Song, and Z. Han, “Traffic pattern-basedreward maximization for secondary user in dynamic spectrumaccess,” in Proc. IEEE WCNC, Shanghai, China, Apr. 2013.

[6] K. W. Choi and E. Hossain, “Opportunistic access to spectrumholes between packet bursts: A learning-based approach,” IEEETrans. Wireless Commun., vol. 10, no. 8, pp. 2497–2509, Aug. 2011.

[7] S. Huang, X. Liu, and Z. Ding, “Optimal sensing-transmissionstructure for dynamic spectrum access,” in Proc. IEEE Int. Conf.INFOCOM, Rio de Janeiro, Brazil, Apr. 2009, pp. 2295–2303.


[8] Q. Zhao, L. Tong, and A. Swami, “Decentralized cognitive MACfor dynamic spectrum access,” in Proc. 1st IEEE Int. Symp. NewFrontiers Dynamic Spectrum Access Networks, Baltimore, MD, USA,Nov. 2005, pp. 224–232.

[9] S. Geirhofer, L. Tong, and B. M. Sadler, “Cognitive radios fordynamic spectrum access—Dynamic spectrum access in the timedomain: Modeling and exploiting white space,” IEEE Commun.Mag., vol. 45, no. 5, pp. 66–72, May 2007.

[10] S. Geirhofer, L. Tong, and B. M. Sadler, “A measurement-basedmodel for dynamic spectrum access in WLAN channels,” inProc. IEEE Military Communications Conf., Washington, DC, USA,Oct. 2006.

[11] P. Papadimitratos, S. Sanjaranaryanan, and A. Mishra, “Abandwidth sharing approach to improve licensed spectrumutilization,” IEEE Commun. Mag., vol. 43, no. 12, pp. 10–14,Dec. 2005.

[12] L. Gao, X. Wang, Y. Xu, and Q. Zhang, “Spectrum trad-ing in cognitive radio networks: A contract-theoretic modelingapproach,” IEEE J. Sel. Areas Commun., vol. 29, no. 4, pp. 843–855,Apr. 2011.

[13] X. Wang et al., “Spectrum sharing in cognitive radio networks—An auction based approach,” IEEE Trans. Syst., Man Cybern. B,Cybern., vol. 40, no. 3, pp. 587–596, Jun. 2010.

[14] M. E. Ahmed, J. S. Kim, R. Mao, J. B. Song, and H. Li, “Distributedchannel allocation using kernel density estimation in cognitiveradio networks,” Electron. Telecommun. Res. Inst. J., vol. 34, no. 5,pp. 771–774, Oct. 2012.

[15] F. Wood and M. J. Black, “A nonparametric Bayesian alternativeto spike sorting,” J. Neurosci. Methods, vol. 173, no. 1, pp. 1–12,Jun. 2008.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, “Hierarchical Dirichletprocesses,” J. Amer. Statist. Assoc., vol. 101, no. 476, pp. 1566–1581,Dec. 2006.

[17] [Online]. Available: http://crawdad.cs.dartmouth.edu/meta.php?name=snu/wow_via_wimax

[18] [Online]. Available: http://crawdad.cs.dartmouth.edu/meta.php?name=kaist/wibro

[19] D. Blei and M. Jordan, “Variational inference for Dirichlet processmixtures,” Bayesian Anal., vol. 1, no. 1, pp. 121–144, Aug. 2006.

[20] N. T. Nguyen, G. Zheng, Z. Han, and R. Zheng, “Device fin-gerprinting to enhance wireless security using nonparametricBayesian method,” in Proc. IEEE Int. Conf. INFOCOM, Shanghai,China, Apr. 2011, pp. 1404–1412.

[21] N. T. Nguyen, R. Zheng, and Z. Han, “On identifying primaryuser emulation attacks incognitive radio systems using nonpara-metric Bayesian classification,” IEEE Trans. Signal Process., vol. 60,no. 3, pp. 1432–1445, Mar. 2012.

[22] W. Saad, Z. Han, H. V. Poor, T. Basar, and J. B. Song, “Acooperative Bayesian nonparametric framework for primary useractivity monitoring in cognitive radio network,” IEEE J. Sel. AreasCommun., vol. 30, no. 9, pp. 88–99, Jun. 2012.

[23] H. Ishwaran and L. F. James, “Gibbs sampling methods forstick-breaking priors,” J. Amer. Statist. Assoc., vol. 96, no. 453,pp. 161–173, Mar. 2001.

[24] S. M. Ross, Introduction to Stochastic Dynamic Programming. Orlando, FL, USA: Academic, 1995.

M. Ejaz Ahmed received his MCS degree in Computer Science from the Kohat University of Sciences and Technology, Kohat, Pakistan, and the M.S(I.T) degree in Information Technology from the National University of Sciences and Technology (NUST), Islamabad, Pakistan, in 2006 and 2011, respectively. Currently, he is pursuing a Ph.D. degree in the Electronics and Radio Engineering Department at Kyung Hee University, South Korea. His research interests include Bayesian nonparametric inference and cooperative communications in cognitive radios.

Ju Bin Song received the Ph.D. degree from the Department of Electronic and Electrical Engineering, University College London (UCL), London, U.K., in 2001, as well as the B.Sc. and M.Sc. degrees in 1987 and 1989, respectively. From 1992 to 1997, he was a Senior Researcher in the Electronics and Telecommunications Research Institute (ETRI), South Korea. He was a Research Fellow in the Department of Electronic and Electrical Engineering at UCL in 2001. During 2002-2003, he was an Assistant Professor in the School of Information and Computer Engineering at Hanbat National University, South Korea. He has been a Professor in the Department of Electronics and Radio Engineering, Kyung Hee University, South Korea, since 2003. He was appointed Head of the Department of Radio Engineering, Kyung Hee University, in 2005. From 2009 to 2010, he was a Visiting Professor in the Department of Electrical Engineering and Computer Science at the University of Houston, Texas, USA. His current research interests include resource allocation in communication systems and networks, cooperative communications, game theory, optimization, cognitive radio networks, and smart grid. Dr. Song serves as a member of the technical program committee for international conferences on communications and networks and is an editor for international journals on communications and networks. Dr. Song was a Kyung Hee University Best Teaching Award recipient in 2004 and 2012.

Zhu Han received the B.S. degree in electronic engineering from Tsinghua University, in 1997, and the M.S. and Ph.D. degrees in electrical engineering from the University of Maryland, College Park, in 1999 and 2003, respectively. From 2000 to 2002, he was an R&D Engineer at JDSU, Germantown, Maryland. From 2003 to 2006, he was a Research Associate at the University of Maryland. From 2006 to 2008, he was an assistant professor at Boise State University, Idaho. Currently, he is an Assistant Professor in the Electrical and Computer Engineering Department at the University of Houston, Texas. His research interests include wireless resource allocation and management, wireless communications and networking, game theory, wireless multimedia, security, and smart grid communication. Dr. Han has been an Associate Editor of IEEE Transactions on Wireless Communications since 2010. He won the IEEE Fred W. Ellersick Prize in 2011 and was a U.S. NSF CAREER Award recipient in 2010. He is a coauthor of several papers that won best paper awards at IEEE conferences. He has been an IEEE Fellow since 2014.

Doug Young Suh received the B.S. degree from the Department of Nuclear Engineering, Seoul University, Seoul, Korea, in 1980, and the M.S. and Ph.D. degrees from the Department of Electrical Engineering at the Georgia Institute of Technology, Atlanta, GA, USA, in 1986 and 1990, respectively. In September 1990, he joined the Korea Academy of Industry and Technology and conducted research on HDTV until 1992. Since February 1992, he has been a professor in the College of Electronics and Information at Kyunghee University. His research interests include networked video and video compression. He has been working as a Korean delegate for the ISO/IEC MPEG Forum since 1996.

