
On Backup Battery Data in Base Stations of Mobile Networks: Measurement, Analysis, and Optimization

Xiaoyi Fan
School of Computing Science
Simon Fraser University
Burnaby, BC, Canada
[email protected]

Feng Wang
Department of Computer and Information Science
The University of Mississippi
University, MS
[email protected]

Jiangchuan Liu
School of Computing Science
Simon Fraser University
Burnaby, BC, Canada
[email protected]

ABSTRACT
Base stations have been massively deployed to meet the explosive demand for infrastructure-based mobile networking services, including both cellular networks and commercial WiFi access points. To maintain high service availability, backup battery groups are usually installed at base stations and serve as the only power source during power outages, which can be prevalent in rural areas or during severe weather such as hurricanes or snowstorms. Being able to understand and predict the working condition of a battery group is therefore of great technical and commercial importance, as it is the first step towards cost-effective battery maintenance that minimizes service interruptions.

In this paper, we conduct a systematic analysis of a real-world dataset collected from the battery groups installed at the base stations of China Mobile, with 1,550,032,984 records in total from July 28th, 2014 to February 17th, 2016. We find that the degradation of a battery group's working condition may be accelerated under various situations and can cause premature failures of batteries in the group, which can hardly be captured by current maintenance procedures and can easily lead to a power-outage-triggered service interruption at a base station. To this end, we propose BatPro, a battery profiling framework, to precisely extract the features that cause the working condition degradation of the battery group. We formulate prediction models for both battery voltage and lifetime and develop a series of solutions to yield accurate outputs. Through real-world trace-driven evaluations, we demonstrate that BatPro can predict the battery voltage and lifetime with an RMS error of less than 0.01 V.

CCS Concepts
• Information systems → Data stream mining; • Hardware → Batteries; • Applied computing → Enterprise applications; • Computer systems organization → Reliability;


CIKM'16, October 24-28, 2016, Indianapolis, IN, USA
© 2016 Copyright held by the owner/author(s). Publication rights licensed to ACM.

ISBN 978-1-4503-4073-1/16/10...$15.00

DOI: http://dx.doi.org/10.1145/2983323.2983734

Keywords
Backup power system; battery aging profiling; remaining lifetime prediction; multi-instance multi-label learning

1. INTRODUCTION
The global mobile service market has expanded drastically due to recent advances in wireless telecommunication technologies, particularly 3G/4G and the emerging 5G, which have also made mobile network bandwidth higher than ever. Meanwhile, a large and growing number of wireless access points are being deployed by network operators, e.g., ShawOpen¹ and TELUS Wi-Fi hotspots². To keep up with these changes, more and more base stations have been strategically constructed and deployed in the mobile network infrastructure to satisfy service coverage and quality (e.g., bandwidth) requirements. The base stations, which range from large towers to small towers for local coverage, are aggregated into the carrier network through the backhaul infrastructure by fiber lines or microwave links. As a consequence, network downtime cascades to all the dependent base stations if any part experiences a power outage [1]. Such outages are arguably rare in urban areas but may often happen during bad weather such as hurricanes, tornadoes, or snowstorms, especially in rural areas.

Besides being connected to the utility grid, each base station is also equipped with a backup battery group to improve service availability. When a power outage happens in the utility grid, the battery group discharges to support the communication equipment, avoiding any service interruption, until an emergency generator is delivered to provide sufficient power. The battery group and its working condition therefore play a critical role during a power outage.

However, while the number of base stations is expanding fast for service coverage, in practice only a limited number of engineers can be dispatched for regular examination of the backup power systems, and the interval between two consecutive checks can exceed three months. During this period, if the battery working condition deteriorates, a power outage can easily interrupt the service for an extended time, especially in rural areas or during bad weather, where emergency repairs can take days or even weeks.

Therefore, being able to predict the working condition of the battery group is of great technical and commercial importance for system maintenance, as it is the first step towards cost-effective battery maintenance that minimizes service interruptions. Yet a successful prediction requires not only knowledge of the aging processes that lead to a loss of performance within a battery, but also a good understanding of the stress conditions that may induce and accelerate the aging process, as well as the relationships between the stress conditions and the aging process.

¹ https://www.shaw.ca/wifi/
² https://wififinder.telus.com/

To this end, we collaborated with China Mobile, the mobile phone operator with the largest number of subscribers (currently 806 million users), and collected data for 46,913 pieces of equipment, with 1,550,032,984 rows in total from July 28th, 2014 to February 17th, 2016, covering 105 categories of status. The data is further processed and analyzed on our cloud computing platform with Hadoop 1.2.1 and Hive 1.2.1. We find that the degradation of a battery group's working condition may be accelerated under various situations and can cause premature failures of batteries in the group, which can hardly be captured by current maintenance procedures and can easily lead to a power-outage-triggered service interruption at a base station.

To address this issue, we propose BatPro, a battery profiling framework, to precisely predict the working conditions of base station battery groups by extracting the features that cause the working condition degradation. In particular, we decompose the voltage time series into aging and fluctuation terms. Based on the voltage aging term and the status in the logs, we analyze the battery aging trend to profile its evolution, including the slope and the variance that indicate the working condition of the battery. We formulate the battery aging problem as a multi-instance multi-label problem with the objective of minimizing the root-mean-square (RMS) error between the predicted value and the real-world value, so as to ensure accurate battery voltage profiling. Based on the framework, we further formulate prediction models for both battery voltage and lifetime and propose a series of solutions to yield accurate outputs. Through real-world trace-driven evaluations, we demonstrate that BatPro can predict the battery voltage and lifetime with an RMS error of less than 0.01 V.

The rest of the paper is organized as follows. Section 2 provides our detailed observations on the battery properties. Section 3 presents the BatPro framework. Section 4 discusses the performance evaluation results of our BatPro approach. We provide a literature review in Section 5 and conclude the paper in Section 6.

2. OVERVIEW AND DATA ANALYSIS
In this section, we describe the background and motivation of our work, and then discuss the dataset collected from China Mobile, together with our observations, to better understand the battery working condition and its deterioration process.

2.1 Backup Battery in Base Stations
We first illustrate a generic backup power system in the base stations of mobile networks. The equipment in a base station is powered by the utility grid, and a battery group is installed as backup power. If the utility grid is interrupted, the battery discharges to support the communication switching equipment during the power outage. As Fig. 1(a) shows, there are two battery groups in the mobile network base station, and each battery group contains 24 cell batteries. In Fig. 1(b), the monitoring system connects to each cell of the battery group and periodically records the voltage and status in both normal and abnormal situations. When the monitoring system reports an alert status, e.g., a power outage, an emergency repair service is scheduled depending on the severity of the incident. Since few base stations have diesel generators permanently installed on site, engineers have to drive an emergency diesel generator to the site to provide power, which can take time. Power outages can be frequent and severe in rural areas and developing countries due to unstable utility grids. To make matters worse, base stations are often built in places that are difficult to reach, e.g., along slippery rock trails in the mountains, where workers have to carry the heavy generators to the site by hand.

Figure 1: The monitoring systems. (a) Two battery groups; (b) The monitoring systems

As a result, mobile network carriers may have to trade off cost against quality of service, and may even abandon base stations in harsh surroundings until the utility grid is restored. On the other hand, given the labor costs for mobile telecom carriers, periodic maintenance of battery groups is generally performed at long intervals, which further increases the likelihood of battery failures during a utility grid outage. If the repair engineers cannot arrive at the site before the battery group is exhausted, the availability of the base station cannot be guaranteed. Predicting the lifetime of a battery group is thus valuable for service availability, as it helps maintenance engineers resolve potential issues in advance during periodic maintenance.

Although Li-ion and NiCd batteries represent more recent developments in battery technology, offering smaller size, lower weight, and better storage efficiency, their major drawback is high cost. Lead-acid batteries, such as those in Fig. 1(a), have large capacities and are thus widely used for storage in backup power supplies at base stations. The aging mechanism of Li-ion batteries has attracted much research effort [21], as the frequent activity of Li-ion batteries produces many logs and makes it possible to measure battery working conditions. In contrast, the lead-acid batteries in base stations normally stay in the float-charging state, in which a fully charged battery maintains its capacity by compensating for self-discharge. The monitoring system collects the float voltage from the float-charging batteries twice per day, which makes the dataset considerably sparse. Extracting features from such a sparse data source and predicting the working conditions of lead-acid batteries therefore pose many challenges, and here we make the first attempt to tackle these issues.

historybattery
equipmentid  recordtime          floatvalue  signalseverity
12405850     2015-01-28 9:37 AM  1.38341     0
12405850     2015-01-28 3:36 PM  2.22239     255
12405850     2015-01-28 4:51 PM  0.97471     0

historystatus
equipmentid  starttime            endtime              meanings
12405850     2015-01-28 9:31 AM   2015-01-28 10:06 AM  voltage low
12405850     2015-01-28 12:01 PM  2015-01-28 12:09 PM  voltage low
12405850     2015-01-28 4:49 PM   2015-01-28 6:24 PM   voltage low

Figure 2: A piece of a real log file

2.2 Battery Data and Status
The log data that we collected spans July 28th, 2014 to February 17th, 2016. In our dataset, we identified 105 categories of status codes and obtained data for 46,913 pieces of equipment, with 531 tables and 1,550,032,984 rows in total. As Fig. 2 shows, the voltage data is recorded in the table named historybattery with an equipment id, record time, float value of the voltage, and signal severity, while a status in the table named historystatus consists of an equipment id, start timestamp, end timestamp, and message text describing the status.
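The two tables can be modelled with a pair of record types. The following Python sketch is illustrative only: the names `VoltageRecord`, `StatusRecord`, and `statuses_at` are assumptions rather than part of the monitoring system, and the sample rows are the ones shown in Fig. 2. It looks up which status intervals were active when a voltage reading was taken.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class VoltageRecord:
    """One row of historybattery (field names are hypothetical)."""
    equipment_id: int
    record_time: datetime
    float_value: float  # cell voltage in volts
    signal_severity: int


@dataclass
class StatusRecord:
    """One row of historystatus (field names are hypothetical)."""
    equipment_id: int
    start_time: datetime
    end_time: datetime
    meaning: str


def statuses_at(reading, statuses):
    """Return the status messages whose interval covers the reading time."""
    return [
        s.meaning
        for s in statuses
        if s.equipment_id == reading.equipment_id
        and s.start_time <= reading.record_time <= s.end_time
    ]


# Sample rows from Fig. 2, with times converted to 24-hour form.
readings = [
    VoltageRecord(12405850, datetime(2015, 1, 28, 9, 37), 1.38341, 0),
    VoltageRecord(12405850, datetime(2015, 1, 28, 15, 36), 2.22239, 255),
    VoltageRecord(12405850, datetime(2015, 1, 28, 16, 51), 0.97471, 0),
]
statuses = [
    StatusRecord(12405850, datetime(2015, 1, 28, 9, 31),
                 datetime(2015, 1, 28, 10, 6), "voltage low"),
    StatusRecord(12405850, datetime(2015, 1, 28, 12, 1),
                 datetime(2015, 1, 28, 12, 9), "voltage low"),
    StatusRecord(12405850, datetime(2015, 1, 28, 16, 49),
                 datetime(2015, 1, 28, 18, 24), "voltage low"),
]
```

In this sample, the two low readings (1.38341 and 0.97471) fall inside "voltage low" intervals, while the 2.22239 reading falls outside all of them.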

2.2.1 Voltage Readings

Figure 3: Mean voltage versus battery status (x-axis: date, July 2014 to Jan 2016; y-axis: the mean of daily voltage (v); series: Poor, Good)

The voltage of each cell battery is the most important feature that we measured, as it reflects the power output pattern of the battery. We observed two representative categories of cell batteries: based on the repair records, we manually chose 1578 batteries as the newly-installed group and 1459 batteries as the nearly-dead group. The rated voltage of a cell is around 2.23 V and the rated voltage of a battery group is 53.5 V, as 24 cell batteries are connected in series to form one battery group. Based on this, we further analyze the typical voltage patterns within the two representative categories. Fig. 3 shows significant differences in mean voltage between the newly-installed and nearly-dead batteries. The blue solid line plots the mean voltage of the newly-installed batteries, which fluctuates between 2.14 V and 2.24 V. The red dotted line shows the decaying mean voltage of the nearly-dead batteries. There is a clear downward trend close to the failure date, where the battery voltage frequently drops and the battery becomes quickly exhausted, causing many issues and alerts in the mobile network base station.

Figure 4: Correlation between the remaining life and mean voltage (x-axis: mean voltage (v); y-axis: remaining lifetime (days))

Fig. 4 further plots how the mean voltage and the remaining lifetime correlate with each other, indicating that the mean voltage is strongly correlated with battery life. Fig. 5 shows the results for the voltage variance. The blue solid line shows that the newly-installed batteries output steady power, with a voltage variance that stays very close to zero. The red dotted line illustrates that the variance of the nearly-dead batteries increases much faster than that of the newly-installed batteries.

Figure 5: Voltage variances versus battery status (x-axis: date, July 2014 to Jan 2016; y-axis: the variance of daily voltage (v); series: Poor, Good)

Fig. 6 illustrates that the voltage variance is correlated with the remaining lifetime, indicating that the variance of the battery output voltage over time also reflects the aging trend of battery quality degradation. These observations motivate us to predict battery working conditions based on the historical battery voltages.
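A minimal sketch of how per-day voltage statistics of this kind could be computed from raw readings is given below. It assumes the readings of one group of cells are available as (day, voltage) pairs and uses the population variance; the function name `daily_stats` and these choices are assumptions, not the paper's processing pipeline.

```python
from collections import defaultdict
from statistics import mean, pvariance


def daily_stats(readings):
    """Group (day, voltage) pairs by day and return {day: (mean, variance)}."""
    by_day = defaultdict(list)
    for day, voltage in readings:
        by_day[day].append(voltage)
    # Population variance is used here; a sample variance would also be reasonable.
    return {day: (mean(vs), pvariance(vs)) for day, vs in sorted(by_day.items())}
```

A steadily working group would show a mean near the rated cell voltage and a variance near zero, while a degrading group would show a falling mean and a growing variance.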

Figure 6: Correlation between the remaining life and voltage variance (x-axis: voltage variance (v); y-axis: remaining lifetime (days))

Figure 7: Example of time series showing battery aging phenomenon with status (two panels; x-axis: date, July 2014 to Jan 2016; y-axis: voltage (V); legend: Trend, Error, High Temp, Low Voltage, Discharge)

2.2.2 Battery Status
We analyze the battery status in the logs to explore their potential relationships with the working conditions of the batteries. Fig. 7 shows an example of an eighteen-month-long voltage time series collected from one cell battery. This cell suffered a deep discharge at the beginning of July 2015, and another starting in August. This led to the degradation of the whole battery pack, yet such aging-related status records are too messy and numerous to be captured easily. Tab. 1 lists the count and percentage of some selected status categories, and Fig. 8 shows the frequency distribution among all 105 categories. The distribution is highly skewed: the most frequent category is Alert, at about 28.09%; the second is Faulty cell, at about 20.42%; and the third is Discharge, at about 10.70%.

We take three status categories as examples to further investigate the correlations between status and remaining battery lifetime: low float voltage, discharge, and faulty cell, shown in Fig. 9(a), (b), and (c), respectively. We count the occurrences of each specific status for each battery until the battery is replaced, and pick the top 30 batteries with the largest number of status records. Our dataset covers 576 days, from July 28th, 2014 to February 17th, 2016, and the remaining lifetime of most batteries in our dataset is longer than 576 days; the dashed lines therefore mark batteries whose remaining lifetime exceeds 576 days. Fig. 9(a) and (b) plot the remaining lifetime against the number of low float voltage and Discharge status records, respectively. They demonstrate a strong correlation between remaining battery lifetime and low float voltage, as well as between remaining battery lifetime and Discharge. We further plot the remaining lifetime against the number of faulty cell status records in Fig. 9(c), which does not show any noticeable correlation. This implies that the remaining lifetime is affected only by some specific status categories. These observations suggest that different status categories influence the battery working conditions differently, so it is necessary to differentiate them for accurate prediction.

Rank  Category             Count    Percentage
1     Alert                1479914  28.09%
2     Faulty cell          1075533  20.42%
3     Discharge            563469   10.70%
4     Low float voltage    457767   8.69%
5     Too high             292966   5.56%
6     Low                  263625   5.00%
14    Power outage         48605    0.92%
26    Failure              9100     0.17%
27    Voltage too low      7060     0.13%
30    Low cell voltage     2830     0.05%
34    Generator on charge  960      0.02%
66    No power             23       4.36E-06
69    AC out               10       1.89E-06
70    Generator running    10       1.89E-06

Table 1: LIST OF BATTERY STATUS CATEGORIES

Figure 8: Distribution of battery status categories (x-axis: categories; y-axis: frequency, ×10^5)

3. BATPRO: DESIGN SKETCH
Based on our measurement and analysis, the battery working conditions are correlated with the historical voltage and status logs. Yet several challenges remain in precisely predicting the battery lifetime, especially given the massive number of noisy status records in the logs. In this section, we propose BatPro, a battery profiling framework, to predict the battery working conditions.

Figure 9: Correlation between the remaining life and the number of different status. (a) Low float voltage status; (b) Discharge status; (c) Fault cell status (x-axis: average number of status per day, ×10^4 in panel (b); y-axis: remaining lifetime)

3.1 Problem Statement
Our observations suggest that the working condition of a battery can be predicted from its historical voltage and status, which serves as the basis for predicting the remaining battery lifetime. Before we proceed with the detailed solutions for the BatPro framework, we first summarize the key notations in Tab. 2. The monitoring systems collect raw messages containing the voltage and status from all batteries, and we select the float voltage from the table named historybattery, where $v_i^t$ is the voltage value of battery $i$ at time $t$. We then have the voltage set $V = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ and the status set $X = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$, where $N$ is the number of batteries. Each $\mathbf{v}_i = \{v_i^1, v_i^2, \ldots, v_i^T\}$ is a time series of voltage records for battery $i$, where $v_i^T$ is the voltage value at time $T$. Our goal is to extract and profile the aging trends of the working conditions of batteries, and to predict their remaining lifetime. Since the remaining lifetime is predicted based on the voltage, the voltage aging extraction problem can be defined as follows:

Given the battery data with voltage $V$ and historical status $X$ from time slot $1$ to $T$, we formulate the problem as predicting the remaining lifetime of batteries based on the predicted voltage. Let $\mathbf{v}'_i = \{{v'}_i^{T+1}, \ldots, {v'}_i^{T+K}\}$ denote the predicted voltage in the following $K$ time slots, which is compared with the real-world voltage $\mathbf{v}_i = \{v_i^{T+1}, v_i^{T+2}, \ldots, v_i^{T+K}\}$. The overall objective function can be written as follows:

$$\min \sum_{i=1}^{N} \sum_{j=T+1}^{T+K} \left(v_i^{j} - {v'}_i^{j}\right)^2 \qquad (1)$$

With the domain knowledge and our observation that the voltage is the criterion of the battery condition, the remaining lifetime can then be predicted by comparing the predicted voltage against a pre-defined battery failure threshold $\theta$.
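As a minimal sketch of Eq. 1 and of the threshold test, the following Python functions compute the summed squared error, the corresponding RMS error, and the first future slot at which a predicted voltage reaches the threshold. The function names and the list-of-lists data layout are assumptions made for illustration; the optimization itself is not shown.

```python
import math


def objective(actual, predicted):
    """Eq. 1: squared error summed over batteries i and future slots j."""
    return sum(
        (a - p) ** 2
        for a_series, p_series in zip(actual, predicted)
        for a, p in zip(a_series, p_series)
    )


def rms_error(actual, predicted):
    """Root-mean-square error over all (battery, slot) pairs."""
    count = sum(len(series) for series in actual)
    return math.sqrt(objective(actual, predicted) / count)


def first_failure_slot(predicted_series, theta):
    """Offset of the first slot whose predicted voltage is at or below theta.

    Returns None if the prediction never reaches the threshold.
    """
    for offset, voltage in enumerate(predicted_series):
        if voltage <= theta:
            return offset
    return None
```

Here `actual[i][j]` and `predicted[i][j]` hold the real and predicted voltage of battery i in the j-th future slot, so an offset of 0 corresponds to slot T+1.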

3.2 BatPro Framework
To solve the problem, we propose the BatPro framework, shown in Fig. 10, to predict the working conditions of batteries. It consists of the following stages: data preparation, voltage decomposition, voltage prediction, and remaining lifetime prediction. In the BatPro framework, we first filter out a large amount of noise in the historical voltage and use only the float voltage, to reduce the interference from other battery activities, e.g., battery discharging. We also pull the relevant textual status from the database, as mentioned in the prior section. While the historical status records are correlated with the working conditions of batteries, a single status record is not a reliable factor for prediction. Therefore, the BatPro framework decomposes the historical float voltage time series into aging and fluctuation terms. For the aging term, we propose the concept of a penalty to describe the influence of a group of status records on the voltage aging trend, which is detailed later. Based on the voltage aging term, the framework applies a learning algorithm to the status data to predict the future aging trend, which is then combined with the voltage fluctuation term predicted by ARIMA. With the predicted voltage for the coming months, the BatPro framework can then estimate the remaining lifetime of batteries.

3.3 Profiling Strategy

Symbol      Meaning
$N$         the number of batteries
$V$         voltage space
$v_i$       the voltage of battery $i$
$v_{ia}$    the aging term of battery $i$
$v_{if}$    the fluctuation term of battery $i$
$v_i^t$     the voltage of battery $i$ at time $t$
$l$         the length of a time segment
$v_{ia}^k$  the aging term of battery $i$ on time segment $k$
$s_i^k$     the voltage slope of battery $i$ on time segment $k$
$y_i^k$     the voltage penalty of battery $i$ on time segment $k$
$X$         status space
$x_i$       a status set for battery $i$
$x_{i,j}$   the $j$-th status in the $i$-th battery group
$m_i$       the number of status for battery group $i$
$Y$         penalty label space
$y_i$       a set of labels for battery $i$
$y_{i,j}$   the $j$-th penalty label in the $i$-th battery group
$n_i$       the number of penalty labels for battery group $i$
$\theta$    pre-defined battery failure threshold
$r_i$       remaining lifetime for battery $i$

Table 2: Summary of Notations

As shown in Fig. 10, we decompose a given time series $\mathbf{v}$ into the aging term $\mathbf{v}_a$ and the fluctuation term $\mathbf{v}_f$:

$$\mathbf{v} = \mathbf{v}_a + \mathbf{v}_f \qquad (2)$$

To ensure that the trend $\mathbf{v}_a$ is monotonic, we add the following constraint to the objective function:

$$v_a^{t} \ge v_a^{t+1} \qquad (3)$$

Figure 10: Overview of the BatPro framework (log files are split into battery status data and battery voltage data; after data preprocessing, aging component profiling, which uses multi-instance multi-label learning with each battery as an object, each status as an instance, and each penalty as a label, and fluctuation component profiling feed battery voltage prediction and battery lifetime prediction). Sample log records shown in the figure:
28-Jan-2015, 512002818,12405850,24100660, 2015-01-28 00:05:35.0, 1.24985,4,Low,0,3
29-Jan-2015, 512002818,12405850,24100260, 2015-01-29 08:51:00.0, 2.22702,5,null,255,3

Let $v_f^t$ describe the fluctuation of the voltage, whose variance increases as $t$ grows. We further extend the objective function of the optimization problem in Eq. 1 to minimize the reconstruction error $\|\mathbf{v} - \mathbf{v}'_a - \mathbf{v}'_f\|^2$ while keeping the fluctuation term flat over time, subject to the monotonicity constraint in Eq. 3. In the next subsections, we predict the aging term and the fluctuation term of the voltage separately, and combine them to estimate the remaining battery lifetime.

3.3.1 Aging Term
To extract the voltage aging term and make the fluctuation term stationary, we segment the float voltage time series, keeping the mean value difference of each segment as small as possible. More specifically, given the voltage $\mathbf{v}_i$ as a time series, we partition it into segments as shown below:

$$\mathbf{v}_i = \{\underbrace{v_i^{1}, v_i^{2}, \ldots, v_i^{l}}_{\text{1st seg}},\ \underbrace{v_i^{l+1}, \ldots, v_i^{2l}}_{\text{2nd seg}},\ \ldots,\ \underbrace{v_i^{(k-1)l+1}, \ldots, v_i^{kl}}_{k\text{th seg}},\ \ldots\} \qquad (4)$$

where $l$ is the length of a segment and each segment $k$ has an initial value $v_{ia}^k$. We use polynomial regression to fit each segment and use the slope value $s_i^k$ to define the voltage aging term in time segment $k$. The aging term can be defined as:

$$\mathbf{v}_{ia} = \{(v_{ia}^{1}, s_i^{1}), (v_{ia}^{2}, s_i^{2}), \ldots, (v_{ia}^{k}, s_i^{k}), \ldots\} \qquad (5)$$

We then introduce the voltage penalty $y_i^k$ to represent the acceleration of the aging term:

$$y_i^{k} = s_i^{k} - s_i^{k-1} \qquad (6)$$
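The segmentation, slope, and penalty steps of Eqs. 4 to 6 can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: each segment is fitted with a degree-1 least-squares line (the text only says polynomial regression), the fitted intercept stands in for the segment's initial value, trailing readings that do not fill a segment are dropped, and all function names are hypothetical.

```python
def segments(series, length):
    """Eq. 4: split a series into consecutive full segments of `length` readings."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]


def fit_line(segment):
    """Least-squares line through (0, v0), (1, v1), ...; needs at least 2 points.

    Returns (initial value, slope), where the initial value is the fitted
    intercept at the start of the segment.
    """
    n = len(segment)
    t_mean = (n - 1) / 2
    v_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(segment))
    slope = sxy / sxx
    return v_mean - slope * t_mean, slope


def aging_term(series, length):
    """Eq. 5: list of (initial value, slope) pairs, one per segment."""
    return [fit_line(seg) for seg in segments(series, length)]


def penalties(aging):
    """Eq. 6: change in slope between consecutive segments."""
    slopes = [slope for _, slope in aging]
    return [slopes[k] - slopes[k - 1] for k in range(1, len(slopes))]
```

For a series that is flat in its first segment and then drops by 0.01 V per reading in the second, the penalty for the second segment is about -0.01, reflecting the accelerated decline.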

For battery $i$ in time segment $k$, we have the aging term $(v_{ia}^k, s_i^k)$ as well as $y_i^k$. We utilize the historical battery status in the logs to predict the penalty $y_i^{k+1}$ for the aging term $\mathbf{v}_{ia}$. Given training examples consisting of historical battery status and voltage penalties, we formulate the prediction problem as constructing a status classifier that predicts the voltage penalty. One major challenge is that the monitoring system collects the status per battery group, as mentioned in Section 2: all cell batteries in a group share exactly the same status log, even though each cell battery has its own voltage data. We therefore predict the voltage penalty for a battery group rather than for every cell battery, and estimate the remaining lifetime of each cell battery based on the voltage penalty of its group.

The problem can be transformed into a multi-instance multi-label (MIML) problem [24], where each training example is associated with not only multiple instances but also multiple labels. Instead of receiving a set of independently labeled instances as in standard classification, our MIML model, shown in Fig. 10, receives a set of bags, each of which is simultaneously associated with multiple labels. In our problem, each status is regarded as an instance and a voltage penalty is defined as a label. A bag denotes a set of status $\mathbf{x}_i$, where each bag $\mathbf{x}_i$ has $m_i$ instances, $\mathbf{x}_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,m_i}\}$. Given the bags obtained from different batteries, the goal is to build a classifier that correctly labels unseen bags with a voltage penalty.

Formally, let $\mathcal{X}$ denote the instance space and $\mathcal{Y}$ the set of labels. We denote the training data by $\{(\mathbf{x}_1, \mathbf{y}_1), (\mathbf{x}_2, \mathbf{y}_2), \ldots, (\mathbf{x}_N, \mathbf{y}_N)\}$, consisting of $N$ examples, where $\mathbf{y}_i$ is the set of penalty labels associated with $\mathbf{x}_i$. As mentioned, each battery group has $n_i$ cells with different voltage aging trends, so $\mathbf{y}_i = \{y_{i,1}, y_{i,2}, \ldots, y_{i,n_i}\}$ is a subset of all possible labels, i.e., $\mathbf{y}_i \subseteq \mathcal{Y}$. The objective is to learn a function $f_{MIML}: 2^{\mathcal{X}} \to 2^{\mathcal{Y}}$ from the training data that accurately predicts the penalty labels $\mathbf{y}_i$ for each bag $\mathbf{x}_i$.

Based on MIMLfast [11], we propose a modified approach with an effective approximation of the original MIML problem. Specifically, to exploit the relations among multiple labels, MIMLfast first learns a shared space for all the labels from the original features, and then trains label-specific linear models in the shared space. To identify the key instance that represents a bag for a specific label, the classification model is trained at the instance level and then selects the instance with the maximum prediction. To make learning efficient, stochastic gradient descent is used to optimize an approximated ranking loss. In the testing phase, unlike MIMLfast, which returns a subset of all possible labels with their prediction values, the BatPro framework obtains the penalty label for each bag by selecting the one with the largest prediction value. This modification lets us find the unique penalty label for each battery with the highest confidence score.
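The testing-phase rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the MIMLfast or BatPro code: it assumes the shared-space projection and one linear weight vector per label have already been trained, omits training entirely, and uses hypothetical names (`bag_scores`, `predict_penalty`) and hypothetical penalty values.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project(shared, instance):
    """Map an instance's original features into the shared space."""
    return [dot(row, instance) for row in shared]


def bag_scores(bag, shared, label_weights):
    """Score each label by its best-scoring (key) instance in the bag."""
    projected = [project(shared, instance) for instance in bag]
    return [max(dot(w, z) for z in projected) for w in label_weights]


def predict_penalty(bag, shared, label_weights, labels):
    """Return the single penalty label with the largest prediction value."""
    scores = bag_scores(bag, shared, label_weights)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]
```

Each bag is a list of status feature vectors for one battery group; taking the maximum over instances picks the key instance per label, and the final argmax yields one penalty label per bag instead of a label subset.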

3.3.2 Fluctuation Term
Voltage variance is also a significant feature in predicting battery life. The fluctuation of the float voltage is not easy to discern by eye from the original time series. From domain knowledge and regular maintenance policies, we know that most batteries stay in the float-charging state for long periods, during which they periodically charge and discharge with a weak current at fixed intervals. A time series approach is suitable for describing this kind of stochastic process and makes it easy to build a model that forecasts the fluctuation term of the voltage. ARIMA [2] is one of the most popular time series models for predicting future values, and has been widely used in finance, economics, and the social sciences. An ARIMA model can be represented as a linear combination of past observations and past errors, and consists of three parts: an Autoregressive (AR) model, a Moving Average (MA) model, and an integrated part. The ARIMA(p, q) model is written as follows:

$$Y(t) = \sum_{i=1}^{p} \beta_i Y(t-i) + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_t \qquad (7)$$

where $Y(t)$ is the value of the time series at time $t$, $\beta_1, \ldots, \beta_p$ are the parameters of the Autoregressive (AR) model, and $\theta_1, \ldots, \theta_q$ are the parameters of the Moving Average (MA) model. The terms $\varepsilon_1, \ldots, \varepsilon_t$ are white noise errors, generally assumed to be Gaussian random variables with zero mean and constant variance. In our BatPro framework, we use a partial ARIMA model, keeping only the moving-average part, to represent the fluctuation term:

$$v_f^{t} = \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_t \qquad (8)$$

Given the time series of float voltage $\mathbf{v}_i$ over the past several days, the model can make a fine-grained prediction of the voltage's future evolution by leveraging the trend, periodicity, and autocorrelation exhibited in the historical data. The BatPro framework can thus use it to extract and forecast the fluctuation term $\mathbf{v}_{if}$ from the voltage $\mathbf{v}_i$.
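To make Eq. 8 concrete, the sketch below simulates a moving-average process and computes its one-step forecast from recent error terms. It is a toy illustration only: the coefficients, noise level, and function names are assumptions, and fitting the coefficients to real voltage data is not shown.

```python
import random


def simulate_ma(thetas, n, sigma=1.0, seed=0):
    """Generate n values of v_f^t = sum_i theta_i * eps_{t-i} + eps_t (Eq. 8)."""
    rng = random.Random(seed)
    q = len(thetas)
    # Draw q extra noise terms so the first value has a full error history.
    eps = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    return [
        eps[t] + sum(thetas[i] * eps[t - 1 - i] for i in range(q))
        for t in range(q, n + q)
    ]


def forecast_next(thetas, recent_errors):
    """One-step forecast: the expected value of Eq. 8, where eps_t has mean zero.

    recent_errors[-1] is eps_{t-1}, recent_errors[-2] is eps_{t-2}, and so on;
    it must hold at least len(thetas) values.
    """
    return sum(theta * recent_errors[-1 - i] for i, theta in enumerate(thetas))
```

Because the current noise term has zero mean, the forecast depends only on the q most recent errors, which is why a moving-average model produces a stationary fluctuation term.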

3.3.3 Remaining Lifetime
With the voltage aging term $\mathbf{v}_a$ and fluctuation term $\mathbf{v}_f$, we can predict the future voltage $\mathbf{v}'$ by Eq. 2. With the domain knowledge and our observation that the voltage is the criterion of the battery condition, we say the working condition of battery $i$ is good when the float voltage is higher than the pre-defined threshold $\theta$:

$$v_{ia} + v_{if} > \theta \qquad (9)$$

Given the voltage $v_a^t + v_f^t$, the slope $s^t$ of the aging term, and the penalty $y^t$ at time $t$, the remaining lifetime $r_i$ of battery $i$ can be calculated as follows:

$$r_i = \frac{v_{ia} + v_{if} - \theta}{s_i + y_i} \qquad (10)$$
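A small Python sketch of Eq. 10 follows. Since the aging trend is non-increasing (Eq. 3), the denominator is normally negative while the numerator is positive for a healthy battery, so this sketch assumes the intended lifetime is the magnitude of the ratio. That sign convention, the handling of the edge cases, the function name, and the sample numbers are all assumptions.

```python
import math


def remaining_lifetime(voltage, theta, slope, penalty):
    """Time slots until the voltage reaches theta at the current decline rate.

    voltage: current predicted voltage (aging term plus fluctuation term)
    theta:   pre-defined battery failure threshold
    slope:   slope of the aging term (volts per time slot, usually negative)
    penalty: predicted change of the slope (Eq. 6)
    """
    if voltage <= theta:
        return 0.0  # Eq. 9 is already violated
    rate = slope + penalty
    if rate >= 0:
        return math.inf  # voltage is not declining, threshold is never reached
    return (voltage - theta) / -rate
```

For example, a cell at 2.20 V with a threshold of 2.00 V, a slope of -0.0004 V per slot, and a penalty of -0.0001 V per slot would have roughly 400 slots left under these assumptions.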

4. EXPERIMENTS AND DISCUSSION
This section presents the evaluation of our BatPro framework. BatPro's prediction of battery working conditions is evaluated against three well-known time series prediction approaches, i.e., Linear Regression [3], ARIMA [2], and Wavelet [7].

4.1 Experiment Setup

Figure 11: Clusters for Hive-based data processing

The massive data analysis requires considerable time and capacity. We therefore move the original 323 GB of data from the Sybase ASE Database³ onto our cloud platform to make querying over the archived data efficient, as shown in Fig. 11. Our platform consists of eight Dell PowerEdge R430 servers, each equipped with two 8-core Xeon E5-2630 v3 2.4 GHz CPUs, 256 GB of DDR3 RAM and a 10 Gbit/s Network Interface Card (NIC). Hyper-Threading is enabled so that each CPU core can support two threads (virtual CPU cores). All the physical servers are interconnected through a NETGEAR 16-port 10-gigabit switch. We use a dedicated physical machine, rather than a virtual machine, to host the master node, so as to ensure fast response times with minimal resource contention. The other physical machines, which host the virtual slave nodes, run the latest Xen Hypervisor (version 4.1.3). For the operating systems running on the VMs (both Domain0 and DomainU), we use Ubuntu 12.04 LTS 64-bit (kernel version 3.11.0-12). All VMs are allocated 8 virtual CPU cores and 4 GB of memory. We run Hive 1.2.1 [19], a data warehouse infrastructure built on top of Apache Hadoop 1.2.1 that is suitable for data management and analytical queries with HiveQL, an SQL-based query language.

³ www.sybaseproducts.com/

4.2 Results

In this section, we evaluate the prediction accuracy of the BatPro framework using a real-world trace collected from China Mobile and demonstrate its performance in estimating the remaining battery lifetime. To evaluate the prediction quality, we run the experiment on real-world data from 1691 batteries, with 1282 examples in the training set and 409 examples in the test set.

| Method \ Day | 40 | 60 | 80 | 100 | 120 |
|---|---|---|---|---|---|
| BatPro | 0.0030 | 0.0083 | 0.0053 | 0.0098 | 0.0060 |
| ARIMA | 0.0154 | 0.0222 | 0.0290 | 0.0390 | 0.0451 |
| LR | 0.0372 | 0.0386 | 0.0387 | 0.0398 | 0.0400 |
| Wavelet | 0.0870 | 0.0868 | 0.0787 | 0.0794 | 0.0798 |

Table 3: RMS error (V) between actual and predicted values

Figure 12: The distribution of penalties on aging term (x-axis: penalty on aging term, ×10^−4, from −5 to 0; y-axis: percentage, %)

We calculate the difference between the slopes of the prior 365 days and the latter 120 days to generate a penalty label for each battery aging term. Fig. 12 summarizes the distribution of the penalty trends extracted from the dataset, where the range is from 0 to −5 × 10^−4 and most batteries are in the slow deterioration status. In Tab. 3, we compare the BatPro approach with Linear Regression (LR) [3], ARIMA [2] and Wavelet [7] and compute the root-mean-square (RMS) error between the predicted voltage and the actual data. Our BatPro approach predicts the battery voltage with an RMS error of less than 0.01 V. As Fig. 13 shows, our method performs best among the compared schemes, with the smallest error between the extracted trend and the ground-truth trend. Especially when the slope is small and the aging phenomenon is subtle, our method performs much better than LR, ARIMA and Wavelet. Moreover, the aging trend extracted by our scheme is monotonic, which matches the nature of the aging behavior, whereas LR, ARIMA and Wavelet place no monotonic constraint on the extracted trend.
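The penalty label and the RMS error described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the slope is estimated by ordinary least squares against the day index, the penalty is taken as the later slope minus the prior slope (so that faster deterioration yields a negative value, in line with the range in Fig. 12), and the function names `ls_slope`, `penalty_label` and `rmse` are hypothetical.

```python
import math


def ls_slope(values):
    """Least-squares slope of values against day index 0..n-1 (n >= 2)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def penalty_label(voltages, prior_days=365, later_days=120):
    """Slope of the latter window minus slope of the prior window.

    Assumption: the sign convention (later minus prior) is chosen so that
    accelerated aging gives a negative penalty.
    """
    prior = voltages[:prior_days]
    later = voltages[prior_days:prior_days + later_days]
    return ls_slope(later) - ls_slope(prior)


def rmse(predicted, actual):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Synthetic series: 0.1 mV/day decline for 365 days, then 0.3 mV/day.
    prior = [2.25 - 1e-4 * d for d in range(365)]
    later = [prior[-1] - 3e-4 * (d + 1) for d in range(120)]
    print(penalty_label(prior + later))
    print(rmse([2.230, 2.229], [2.231, 2.227]))
```

On the synthetic series in the usage example, the penalty is approximately −2 × 10^−4, which falls inside the range reported for the dataset.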

Figure 13: Root-mean-square errors (x-axis: dates, 20 to 120; y-axis: RMS error, V; curves: ARIMA, BatPro, Linear, Wavelet)

As battery replacement is the consequence of a battery failure [18], we choose 112 batteries with replacement records in the log to verify the prediction accuracy for the remaining lifetime. We count the survival days of each battery after the failure alert raised by each approach, i.e., BatPro, ARIMA, Wavelet and Linear Regression. Fig. 14 plots the relation between the percentage of batteries and the survival days after the failure alert. We can see that 87% of the batteries are replaced within three months after BatPro's alerts, which demonstrates that the BatPro framework is a strong predictor of battery working conditions. We also observe that a small number of batteries are still not replaced after our failure alerts. Taking a close look at those batteries, we find that there are redundant battery groups in their base stations. Repair engineers therefore postponed the maintenance service for those batteries, which also demonstrates that the BatPro framework can help maintenance engineers detect potential issues in battery groups.
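The survival-day count behind this evaluation can be sketched as follows. The snippet is only an illustration: the function names, the 90-day reading of "three months" and the sample dates are assumptions, and none of the numbers come from the China Mobile dataset.

```python
from datetime import date


def survival_days(alert_date, replacement_date):
    """Days a battery stays in service after its failure alert."""
    return (replacement_date - alert_date).days


def fraction_replaced_within(survival, horizon_days=90):
    """Share of batteries replaced no later than horizon_days after the alert.

    Assumption: three months is approximated as 90 days.
    """
    replaced = sum(1 for d in survival if d <= horizon_days)
    return replaced / len(survival)


if __name__ == "__main__":
    # Hypothetical (alert, replacement) pairs for four batteries.
    records = [
        (date(2015, 1, 1), date(2015, 1, 11)),
        (date(2015, 1, 1), date(2015, 3, 1)),
        (date(2015, 2, 1), date(2015, 6, 1)),
        (date(2015, 3, 1), date(2015, 9, 17)),
    ]
    days = [survival_days(a, r) for a, r in records]
    print(days, fraction_replaced_within(days))
```

Repeating the same count with the alert dates of each compared approach would give one curve per approach, in the manner of Fig. 14.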

Figure 14: Survival days after our failure alert (x-axis: days; y-axis: percentage, %; curves: ARIMA, BatPro, Linear, Wavelet)

4.3 Case Study: Insight into a Battery

In this section, we select one instance, i.e., one battery, from the large-scale experiments as a case study to intuitively demonstrate the effectiveness of the BatPro framework. We applied BatPro and ARIMA to 365 days of battery data to predict the voltage and the battery working condition over the following 120 days. The result is shown in Fig. 15 and is consistent with our assumption that status can have a permanent impact on battery working conditions. ARIMA only uses a single battery's float voltage and cannot model the effects of historical status. By taking the effects of status into account, our BatPro framework is able to respond quickly to those status and provides an accurate predictive trend. In Fig. 15, we can clearly see that the BatPro framework successfully predicts the decreasing trend of the battery voltage and the working condition of this battery over the next three months. A known major drawback of the ARIMA model is that it needs a relatively long period of history for prediction. In the experiment, ARIMA predicts the fluctuation terms well based on the voltage data of the prior 365 days, although it cannot estimate the aging terms accurately. With a pre-defined threshold for the battery, we can then easily predict the battery's remaining lifetime, as demonstrated in the previous subsection.

Figure 15: Voltage prediction (x-axis: days, 20 to 120; y-axis: voltage, V, 2.22 to 2.238; curves: Real Data, ARIMA, BatPro)

5. RELATED WORK

In this section, we discuss two categories of research that are closely related to our work, namely, prediction in time series and the analysis of log status.

5.1 Time Series Analysis

Time series research has attracted significant effort in recent decades, due to its importance to all technical systems. Here, we discuss only some recent and relevant works. One of the simplest approaches is smoothing or filtering, which can be done using different techniques, including wavelets [7] and moving average processes [9]. Another intuitive approach for trend extraction is fitting a linear or higher-order model to the time series data, where the common way of fitting a model is the method of least squares [3]. AutoRegressive Integrated Moving Average (ARIMA) [2] is a model-based technique that allows extracting trend and seasonal components from the time series. This approach can be used for both modeling and forecasting, and it is especially useful when there is a certain seasonal component. The limitation of this model is that the parameters must be provided by the user, which may be very subjective and lead to inadequate results. Singular Spectrum Analysis (SSA) [5] is a parameter-free decomposition technique for time series, which allows the extraction of the trend, seasonal and noise components from a time series.

Djurdjanovic et al. [4] proposed a comprehensive framework named Watchdog agent for the analysis and monitoring of systems based on time series analysis. Their focus was more on the methodology of data collection and analysis than on the methods of decomposing time series and extracting necessary features. Luo et al. [15] presented an approach to find the correlation between actual status and time series in order to diagnose incidents. Their approach matched the status with certain subsequences of the time series in order to give a real explanation of the time series shape. Herodotou et al. [10] proposed a probabilistic method for failure monitoring in data centers, which is domain specific and initially builds a network model. Recently, Ulanova et al. [20] proposed a novel time series analysis technique that decomposes the time series of complex physical systems into trend and fluctuation components.

5.2 Status in Logs

There are many performance diagnosis techniques that rely on specific types of data, where logs can be used to further understand the system conditions and analyze common patterns. Li et al. [14] aimed at discovering and understanding common patterns, but did not address the proactive prevention of errors. Gu et al. [8] proposed an online failure forecast system that observes the stream system's status, uses an ensemble of decision tree classifiers and is applicable in an online setting. Fu et al. [6] used finite state automata to model sequential dependencies between messages and detect anomalies. Makanju et al. [16] proposed IPLoM, which creates status descriptions by clustering text messages in the logs, but interpreting them requires domain knowledge that is unavailable. Xu et al. [23] [22] combined textual messages in console logs to construct performance features and applied Principal Component Analysis (PCA) [12] to detect anomalies. Similarly, Nagaraj et al. [17] modeled status logs and state logs into performance features to infer the anomalies between components. Kimura et al. [13] proposed a modeling and status extraction method for network log data using a tensor factorization approach, yet they did not focus on the detection of anomalies. Most recently, Sipos et al. [18] considered log-based predictive analysis in order to monitor the conditions of operating equipment, yet they ignored the possibility of fine temporal patterns to simplify the model.

Unlike the prior approaches, the BatPro framework focuses on a large number of batteries and utilizes not only the signal value but also the status data to comprehensively predict the battery working conditions.

6. CONCLUSION

In this paper, we proposed BatPro, a battery profiling framework, to precisely extract the features that cause the working condition degradation of the battery group. Based on the framework, we formulated the prediction models for both battery voltage and lifetime and proposed a series of solutions to yield accurate outputs. Through real-world trace-driven evaluations, we demonstrated that our BatPro approach achieves much higher prediction accuracy on the battery voltage and lifetime, which can serve as a basis for cost-effective battery maintenance that minimizes service interruptions in mobile network base stations.

7. ACKNOWLEDGMENTS

We would like to thank the anonymous reviewers for valuable and insightful comments. This research is supported in part by an NSERC Discovery Grant, a Strategic Project Grant, an RTI grant, and an E.W.R. Steacie Memorial Fellowship. Feng's work is partly supported by the Start-up Grant from the University of Mississippi and NSF I/UCRC 1539990.

8. REFERENCES

[1] Balshe, W. Power system considerations for cell tower applications. Cummins Power Generation (2011).

[2] Box, G. E., Jenkins, G. M., and Reinsel, G. C. Linear nonstationary models. Time Series Analysis, Fourth Edition (1976), 93–136.

[3] Charnes, A., Frome, E., and Yu, P.-L. The equivalence of generalized least squares and maximum likelihood estimates in the exponential family. Journal of the American Statistical Association 71, 353 (1976), 169–171.

[4] Djurdjanovic, D., Lee, J., and Ni, J. Watchdog agent–an infotronics-based prognostics approach for product performance degradation assessment and prediction. Advanced Engineering Informatics 17, 3 (2003), 109–125.

[5] Elsner, J. B., and Tsonis, A. A. Singular spectrum analysis: a new tool in time series analysis. Springer Science & Business Media, 2013.

[6] Fu, Q., Lou, J.-G., Wang, Y., and Li, J. Execution anomaly detection in distributed systems through unstructured log analysis. In Data Mining, 2009. ICDM’09. Ninth IEEE International Conference on (2009), IEEE, pp. 149–158.

[7] Grossmann, A., and Morlet, J. Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal on Mathematical Analysis 15, 4 (1984), 723–736.

[8] Gu, X., Papadimitriou, S., Yu, P. S., and Chang, S.-P. Online failure forecast for fault-tolerant data stream processing. In Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on (2008), IEEE, pp. 1388–1390.

[9] Hamilton, J. D. Time series analysis, vol. 2. Princeton University Press, Princeton, 1994.

[10] Herodotou, H., Ding, B., Balakrishnan, S., Outhred, G., and Fitter, P. Scalable near real-time failure localization of data center networks. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (2014), ACM, pp. 1689–1698.

[11] Huang, S.-J., and Zhou, Z.-H. Fast multi-instance multi-label learning. arXiv preprint arXiv:1310.2049 (2013).

[12] Jolliffe, I. Principal component analysis. Wiley Online Library, 2002.

[13] Kimura, T., Ishibashi, K., Mori, T., Sawada, H., Toyono, T., Nishimatsu, K., Watanabe, A., Shimoda, A., and Shiomoto, K. Spatio-temporal factorization of log data for understanding network events. In INFOCOM, 2014 Proceedings IEEE (2014), IEEE, pp. 610–618.

[14] Li, T., Liang, F., Ma, S., and Peng, W. An integrated framework on mining logs files for computing system management. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining (2005), ACM, pp. 776–781.

[15] Luo, C., Lou, J.-G., Lin, Q., Fu, Q., Ding, R., Zhang, D., and Wang, Z. Correlating events with time series for incident diagnosis. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (2014), ACM, pp. 1583–1592.

[16] Makanju, A. A., Zincir-Heywood, A. N., and Milios, E. E. Clustering event logs using iterative partitioning. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (2009), ACM, pp. 1255–1264.

[17] Nagaraj, K., Killian, C., and Neville, J. Structured comparative analysis of systems logs to diagnose performance problems. In Presented as part of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12) (2012), pp. 353–366.

[18] Sipos, R., Fradkin, D., Moerchen, F., and Wang, Z. Log-based predictive maintenance. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (2014), ACM, pp. 1867–1876.

[19] Thusoo, A., Sarma, J. S., Jain, N., Shao, Z., Chakka, P., Anthony, S., Liu, H., Wyckoff, P., and Murthy, R. Hive: a warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment 2, 2 (2009), 1626–1629.

[20] Ulanova, L., Yan, T., Chen, H., Jiang, G., Keogh, E., and Zhang, K. Efficient long-term degradation profiling in time series for complex physical systems. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015), ACM, pp. 2167–2176.

[21] Vetter, J., Novak, P., Wagner, M., Veit, C., Moller, K.-C., Besenhard, J., Winter, M., Wohlfahrt-Mehrens, M., Vogler, C., and Hammouche, A. Ageing mechanisms in lithium-ion batteries. Journal of Power Sources 147, 1 (2005), 269–281.

[22] Xu, W., Huang, L., Fox, A., Patterson, D., and Jordan, M. Online system problem detection by mining patterns of console logs. In Data Mining, 2009. ICDM’09. Ninth IEEE International Conference on (2009), IEEE, pp. 588–597.

[23] Xu, W., Huang, L., Fox, A., Patterson, D. A., and Jordan, M. I. Mining console logs for large-scale system problem detection. SysML 8 (2008), 4–4.

[24] Zhou, Z.-H., Zhang, M.-L., Huang, S.-J., and Li, Y.-F. Multi-instance multi-label learning. arXiv preprint arXiv:0808.3231 (2008).
