15
A Systematic Approach for Dynamic Targeted Monitoring of KPIs Alejandro Mat´ e 1 Kostas Zoumpatianos 1 Themis Palpanas 3 Juan Trujillo 2 John Mylopoulos 1 Elvis Koci 1 1 Department of Computer Science, University of Trento, Italy [email protected] {zoumpatianos,jm}@disi.unitn.it [email protected] 2 Software and Computing Systems, University of Alicante, Spain [email protected] 3 Paris Descartes University, France [email protected] Abstract There has been growing interest for more than a decade in Business Analytics as a means for improving business performance. One of the most popular Business Analytics technique in- volves monitoring performance by means of Key Performance Indicators (KPIs). A KPI is a pow- erful tool that relates enterprise data to business goals, thereby enabling managers to guide the ana- lytic process and identify deviations in their strate- gic plan. Nevertheless, monitoring KPIs requires that they are evaluated at multiple levels of detail, in order to identify potential problems earlier in- stead of being noted after the fact. Unfortunately, there are obstacles to the generation and enactment of such monitoring processes. In particular, there is no systematic, tool-supported process for defining what is to be monitored given a strategic plan, nor are there tools for automatically generating moni- toring queries. As a result, monitoring consists of a manual process whereby queries are generated for high level indicators across a few scorecards and dashboards. In this paper we present a system- atic semi-automatic approach that covers the entire monitoring process. Our approach performs a par- Copyright c 2014 Alejandro Mat´ e, Kostas Zoumpatianos, Themis Palpanas, Juan Trujillo, John Mylopoulos, Elvis Koci. Permission to copy is hereby granted provided the original copy- right notice is reproduced in copies made. tial search guided by the KPIs of the company, gen- erating queries required during the monitoring pro- cess. Thanks to our approach, users become aware of the existence of problems and where they are lo- cated, without requiring a priori information about the nature of the problem being searched. More- over, we have implemented our approach in our CASE tool HERMES and evaluated the results on a case study using real data. Keywords: Monitoring, Business Intelligence, in- telligent search, KPI, business analytics 1 Introduction Business monitoring is a critical activity that looks for problems in business performance and alerts of their existence together with its source. It is one of the key tasks for companies because it allows decision makers to take corrective actions sooner rather than later, but it is also a challenging task because of the large volumes and high velocity of data that need to be processed. Business monitoring is typically supported by an information system that provides information about several Key Performance Indicators (KPIs) [18], spread across one or more Balanced Score- cards [13] and multiple dashboards. Traditionally, business monitoring has been based on the eval- uation of the aggregated values of KPIs by users

A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

Embed Size (px)

Citation preview

Page 1: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

A Systematic Approach for Dynamic Targeted Monitoring ofKPIs

Alejandro Mate1 Kostas Zoumpatianos1 Themis Palpanas3 Juan Trujillo2

John Mylopoulos1 Elvis Koci1

1 Department of Computer Science, University of Trento, [email protected] zoumpatianos,[email protected]

[email protected] and Computing Systems, University of Alicante, Spain

[email protected] Paris Descartes University, France

[email protected]

AbstractThere has been growing interest for more thana decade in Business Analytics as a means forimproving business performance. One of themost popular Business Analytics technique in-volves monitoring performance by means of KeyPerformance Indicators (KPIs). A KPI is a pow-erful tool that relates enterprise data to businessgoals, thereby enabling managers to guide the ana-lytic process and identify deviations in their strate-gic plan. Nevertheless, monitoring KPIs requiresthat they are evaluated at multiple levels of detail,in order to identify potential problems earlier in-stead of being noted after the fact. Unfortunately,there are obstacles to the generation and enactmentof such monitoring processes. In particular, there isno systematic, tool-supported process for definingwhat is to be monitored given a strategic plan, norare there tools for automatically generating moni-toring queries. As a result, monitoring consists of amanual process whereby queries are generated forhigh level indicators across a few scorecards anddashboards. In this paper we present a system-atic semi-automatic approach that covers the entiremonitoring process. Our approach performs a par-

Copyright c© 2014 Alejandro Mate, Kostas Zoumpatianos,Themis Palpanas, Juan Trujillo, John Mylopoulos, Elvis Koci.Permission to copy is hereby granted provided the original copy-right notice is reproduced in copies made.

tial search guided by the KPIs of the company, gen-erating queries required during the monitoring pro-cess. Thanks to our approach, users become awareof the existence of problems and where they are lo-cated, without requiring a priori information aboutthe nature of the problem being searched. More-over, we have implemented our approach in ourCASE tool HERMES and evaluated the results ona case study using real data.Keywords: Monitoring, Business Intelligence, in-telligent search, KPI, business analytics

1 IntroductionBusiness monitoring is a critical activity that looksfor problems in business performance and alerts oftheir existence together with its source. It is oneof the key tasks for companies because it allowsdecision makers to take corrective actions soonerrather than later, but it is also a challenging taskbecause of the large volumes and high velocity ofdata that need to be processed.

Business monitoring is typically supported byan information system that provides informationabout several Key Performance Indicators (KPIs)[18], spread across one or more Balanced Score-cards [13] and multiple dashboards. Traditionally,business monitoring has been based on the eval-uation of the aggregated values of KPIs by users

Page 2: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

who regularly check the scorecard in order to en-sure that everything is in order. For example, theKPI “Customer retention increased by 3%” consid-ers the customers retained throughout all our storesin the current year. In order to provide a more com-plete view for KPI monitoring, dashboards providedetailed information. For example, if the above-mentioned KPI is failing, what has been its trendfor the past months?

However, dashboards are not flexible enough toquickly and accurately identify potential problemsthat may arise. What if we could not expect toachieve our target by the end of the year becauseour shops do not retain customers in certain re-gions? On the one hand, adopting a bird’s eye viewof the data we are monitoring over makes it dif-ficult to pinpoint a problem. On the other hand,adding all available information or subKPIs wouldrender the dashboards useless due to their complex-ity. As there are no clear criteria on what informa-tion should be provided and how, such approach iscurrently based on the experience of the businessanalyst, and does not allow users to easily discoverpotential deviations in order to apply corrections.Therefore, the main objective of this paper is topropose a systematic process for monitoring KPIsfrom multiple perspectives by automatically gener-ating monitoring queries.

Previous research has tackled this problem byeither (i) providing the user a cube that containsall the discovered deviations in the data identifiedwith respect to current trends [21, 19, 20], or (ii) byspecifying a certain set of targets to be monitoredwithin a subset of the data [23, 24]. Nevertheless,the first approach does not take into account thegoals and expectations of the users, thus potentiallyproviding information to the user that is irrelevant.For example, during an economic downturn we ex-pect our sales trends to change. Additionally, thesecond approach requires the user to specify a sub-set of the data to be monitored. While it takes intoaccount the expectations of the user, it also requiresa priori knowledge of the problem being monitoredand does not provide detailed information about thesource of the problem, since it does not perform anin-depth analysis. For example, using the above-mentioned KPI, we would need to know that wehave a significant problem at stores levels insteadof in a certain region or country.

In this paper we propose a systematic targetedmonitoring approach based on user KPIs, that pro-

vides insights for the user about existing deviationsin subKPIs. First, our approach takes each KPIdefined and performs a guided search looking forthe most significant subKPI segments contributingto the deviation of the KPI. Then, once these seg-ments have been identified, it provides the user aresult tree that can be navigated by the user in or-der to visualize how the deviation is spread acrossinstances. As a result, our approach helps the userto pinpoint the root of the problem. The advan-tages of our approach are that it (i) dynamicallylocates the most relevant contributors to the devi-ation, whether they are general problems or veryspecific instances, by drilling deeper only when itis relevant, (ii) requires no a priori knowledge ofthe problem, (iii) shows the user the most importantinformation and allows him to navigate the resultsin order to understand the structure of the problem,and (iv) allows the user to visualize the results forthe overall business plan, allowing him to decidewhat problems should be prioritized and what othertargets will be affected by the corrections.

The remainder of this paper is structured as fol-lows. Section 2, presents an overview of tech-niques proposed for finding deviations and monitordatasets. Section 3 describes the main componentsinvolved in our approach and how they interrelateto each other. Section 4 provides details about theimplementation of our CASE tool for supportingour approach. Section 5 presents the results of ap-plying our approach to a case study using syntheticdata. Finally. Section 6 presents the conclusionsand sketches the future work to be done.

2 Related Work2.1 Strategy modelingInitial works to represent business strategies andmonitor them were proposed during the 90’s andearly 2000 [13, 12]. The Balanced Scorecard wasproposed in [13], which quickly became one of themost widespread monitoring techniques for enter-prises. A scorecard presents a high level list ofthe most important KPIs for monitoring businessperformance from different perspectives. However,the simplicity of the Balanced Scorecard made itimpossible to perform detailed analysis on KPIsnor reason about the suitability of the businessstrategy and business initiatives. Dashboards [8]and Strategy maps were proposed [12] to tackle

Page 3: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

these problem. Dashboards provide context in-formation on one or more KPIs, thereby enablingmore detailed analysis, while Strategy Maps cap-ture the relationships between business initiativesand goals. However, dashboards are not flexible,they offer a pre-set view of the data and are notmeant for exhaustive analysis. Furthermore, Strat-egy maps are not completely formal and do not in-tegrate a view of the status of the business. thusoffering limited reasoning support.

More recently, the Business Motivation Model(BMM) [16] and the Business Intelligence Model(BIM) [22, 11] have appeared in order to formalizebusiness strategies. The former proposes a meta-model that captures concepts used in textual busi-ness plans, whereas the second proposes a domainspecific language for modeling business strategiesusing a goal oriented approach.

2.2 Data cube mining techniquesIn the last twenty years there has been extensiveresearch [5, 9, 21, 19, 20, 10], as well as a greatamount of applications developed from businessvendors (SSAS, Hyperion, Cognos, etc.) on OLAPtechnology. An OLAP system [5] allows a user toquickly acquire all the interesting information in alarge data warehouse by querying on aggregates atdifferent levels of abstraction. Additionally, he isgiven the ability to drill down on specific areas ofthe database, in order to get more detailed infor-mation about the parts that look most interesting.In such systems it is expected to have a high num-ber of dimensions and measures, one for each at-tribute of the original tuples. Each dimension isused by the analyst for grouping and filtering andit is common that they are part of hierarchies fora top down data analysis. An analyst can queryan OLAP system for aggregates of different types(SUM, AVG, COUNT) in respect to a grouping hehas defined and a specified measure. All the pos-sible groupings that an analyst can query to thesystem form a multi-dimensional data-cube. Thisstructure contains a set of cells, one for each mostdetailed grouping, which are contained by parentcubes at multiple different levels. Each level sum-marizes the data aggregates in the levels below it.

Various techniques have been proposed for ex-ploring a data-cube efficiently [17]. But asSarawagi et al. [21] have noted, by simply ob-serving the aggregates it becomes very tedious to

identify on which regions one has to drill down inorder to get information of interest. To solve thisproblem they have proposed a method that guidesthe user to the “most anomalous” areas by present-ing some indicators of exceptions based on devia-tions from a statistical model. Additionally, using aDIFF operator [19] the user can explore the reasonsthat explain these anomalous segments.

When having to deal with large amounts of dy-namic data coming into streams though, simple ag-gregations are not enough. As Chen et al. [4]have noted, it is important to be able to do regres-sion and trend analysis and while traditional datacube technology is good for static aggregates it failsto explain trends in the multi-dimensional space.For this reason they have introduced the streamcube architecture [10] which allows for the multi-dimensional monitoring of regression lines. Subse-quent work [1] enabled the monitoring of trend ex-ceptions, by capturing regression lines with slopesabove an exception threshold.

While both previous methods were able to iden-tify exceptional behaviors in a data warehouse, theywere tailored for untargeted analytics. This meansthat deviations were calculated in regards to therest of the data warehouse. While this is impor-tant when the analyst looks for unexpected results,it is not helpful when a formal strategy has to beevaluated for success or failure. In such cases, ag-gregations in OLAP cubes are used as input to KPIcalculations, which comprise not only of a valuebut also a target that has to be met at a specific timepoint. This target can be different for every subKPIand also vary across time. Our target is to guide theanalyst to the most failing - in regards to a targetvalue - subKPIs that mostly explain why a formallyspecified goal is going to be successful or not. Thatsaid the target of our algorithm is to find the sub-KPIs with that contribute most to a global failurebased always on their local target values.

2.3 ForecastingNormally, KPIs are defined together with strategicplans in order to help analysts measure the successof their goals. While it is important to know whya past goal has failed or succeeded, it is even moreimportant to monitor what the trend is for a currentgoal. Is the goal going to be successful or failedeventually? Therefore, formally defined goals referalways to the future. This means that target values

Page 4: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

for KPIs associated with the measurement of goal,are always defined for the future. There has beena lot of work done in this direction for forecastingsequential data. This work is mostly within the do-main of statistics. Models such as ARIMA, simplemoving averages or even trend estimation can beused to predict a future value. The interested readercan study in detail the variety of methods in [3].

2.4 Data Series MiningKPIs are inherently sequential data, most of thetimes involving time. A lot of methods have beendeveloped for finding similarities in sets of data-series [25, 2]. Such methods can be used com-plimentary to our approach for performing outlieranalysis and to assist analysts search for clusters ordata that are deviating from similar past behavior.These techniques can be extended in the case of un-certain data-series [6, 7].

3 A Systematic Approach forTargeted Monitoring

Our approach is based on an architecture with fourclearly differentiated components. We first presentthe architecture and then discuss each component.

3.1 System ArchitectureOur architecture for targeted monitoring is basedon the idea of exploiting existing domain knowl-edge captured in the business strategy and present-ing the results in the same terms. To this aim, thefirst component in the architecture, shown in Figure1, is the Business Strategy Model. The BusinessStrategy Model is responsible for capturing busi-ness goals and their corresponding KPIs. These el-ements contain all the relevant information for themonitoring process such as target values that theusers wish to achieve for each KPI, the worst valueexpected, time to target available for each KPI, etc.

The business strategy provides information tothe monitoring module, that is responsible for themonitoring process according to parameters speci-fied by the user. First, the monitoring module pre-processes the data in order to calculate all the val-ues required for the search. Then, the monitoringprocess starts for each KPI. First, with the aid of thequery generator, it poses an initial query to the datawarehouse. Afterwards, the module starts the mon-

Business modeling

Monitoring Engine

DW

KPI Definitions

Query Generator

Query Results

ResultsBase Query + Candidate

Findings

Result reporter

Forecasting

KPI Definition & Target Time

Figure 1: Architecture of our monitoring solution

itoring process, interacting on each step with thequery generator in order to retrieve all the neces-sary values from the information system. The ini-tial query created by the query generator is basedon the definition of each KPI. Then, as further de-tailed in Sections 3.4 and 3.5, during each step itgenerates a new query by constraining the previousone to a subset of the instances and to a certain di-mension level. These queries allow the monitoringmodule to store the most relevant deviations foundduring each iteration.

Finally, the result set obtained during the moni-toring process is sent to the result reporter in orderto be analyzed by the user. The result reporter fo-cuses on highlighting the main sources of deviationfor each KPI found in the data. To this aim, the re-sult reporter shows the elements found on each it-eration by the monitoring module in a hierarchicalmanner, as detailed in Section 3.6.

3.2 Business Strategy Modeling andProcess Setup

Business Strategy Modeling plays a key role in thetargeted monitoring process. It constitutes the start-ing point of the process, where users define busi-ness goals, the relationships between them, andspecify KPIs that will monitor these goals, amongother elements. The combination of these elementsserves a double purpose: (i) it allows users to iden-

Page 5: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

tify what goals and KPIs are more important forthem and how often they should be monitored, and(ii) it allows users to understand, in their own do-main, the results of the monitoring process.

The most important construct in the businessstrategy for our approach is the KPI. KPIs provideinformation about how to calculate the values to bemonitored, which is the basis of the process. How-ever, they also provide more important data, suchas the target value to be met by the KPI, the timeto target, which determines when the KPI shouldmeet its mark, the threshold to separate acceptableperformance from failure, and a worst case thresh-old to identify severely failing KPIs.

In order to support the business strategy model-ing step we make use of the Business IntelligenceModel (BIM) framework [22, 11], which was ex-tended using the Semantic Vocabulary and Busi-ness Rules (SBVR) proposal for supporting KPIdefinition [15]. BIM provides the basic constructsto capture business goals, indicators that monitorthem and represent KPIs, and situations, that repre-sent external and internal factors affecting businessgoals and help users understand why their expecta-tions may not be met or be exceeded. Additionally,the SBVR extension allows users to formalize KPIsand derive data warehouse queries from their defi-nition. These definitions are supported by a Busi-ness Dictionary that represents a business ontology,and allows users to specify specialized instances ofeach KPI, for example to specify customized tar-get values for certain subKPIs. For a more detailedexplanation we refer the reader to [15].

Once we have modeled the business strategy andhave a clear view of the important KPIs to monitorwe need to prepare the monitoring process. Beforewe can start, we need to perform two tasks. First,we have to decide if we are going to use forecastedvalues. Usually, when KPIs are used for monitor-ing, it is often the case that they are failing untilthe time to target is close to 0, because they actu-ally represent lag information [14]. For example, aKPI “Increase sales by 3%” will be failing duringmost part of the fiscal year until we are near its con-clusion. In these cases, we need to either forecastthe actual values of the KPIs to the point in timewhen they will be evaluated or we need to interpo-late their target values to the current period of time.Second, we need to calculate the target values forthe different subKPIs according to the dimensionlevels available in the data warehouse. This task

can be performed in different ways, as mentionedin [24]. First, users can manually define a hierarchyof target values for each KPI. This is the case ofEurope 2020, where multiple dimensions and lev-els, such as country, province, population age, andso on have their own manually defined target val-ues for each KPI. Second, we can apply interpola-tion techniques like proportional projection, usinga size attribute for each instance (e.g. population),or direct projection, where we get the values of theprevious period for each subKPI and add the corre-sponding increment. For example, “Increase salesby 3%” would imply a 3% increase across all in-stances contributing to the KPI. Given that this canbe a time consuming process, the calculus of targetvalues is performed during the setup step, beforethe monitoring process starts. Then, the same val-ues can be reused during the whole period that weare actively monitoring. Due to paper constraintswe do not discuss these techniques any further.

After the process setup has been completed, weneed to proceed to the monitoring step, where werequire an approach to perform an intelligent searchof the data space.

3.3 KPI Monitoring: Performing anIntelligent Search

Ideally, we would like to know all relevant devia-tions in the data so that appropriate actions can betaken, and, if possible, targeted specifically towardsproblematic instances. In order to perform a mean-ingful identification of instances, we can take ad-vantage of the fact that most Business Intelligence(BI) systems are supported by a data warehouse,structuring the data into facts (center of analysis)and dimensions (context of analysis). Dimensionsallow us to aggregate the results at different gran-ularity using multiple axis depending on the per-spective we wish to adopt.

Therefore, we can reformulate our problem sothat, for each KPI Ki that has an aggregated value,we want to know if any instance j, defined by a setof dimension levels D = l1, ..., ln, and contribut-ing to the value of Ki, presents an anomaly, suchthat Kij is deviating. A formal definition of thisdeviation is given below.

Definition 1 (KPI-Deviation) A KPI-Deviation isthe difference between the target and the actualvalue of the KPI at a given time point. It can be

Page 6: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

mathematically formalized as:

DK = TargetK −ActualK (1)

As data warehouses typically present multiplelevels of aggregation for each dimension, the prob-lem becomes recursive thus requiring to repeat thesame steps for Kij until the finest level of detailhas been reached in order to ensure that no devi-ation goes unnoticed. Unfortunately, such exhaus-tive approach has three important drawbacks. First,it is time consuming, as we always drill into allthe potential combinations of dimensions and weneed to calculate the value of the subKPI Kij forall instances at each point. Second, it extends thesearch space through all dimension levels withoutguaranteeing that they will be useful for pinpoint-ing the problem, increasing the number of oper-ations performed as the number of individual in-stances increases. Third, it overloads the user withinformation, making him unable to understand theresults, mixing important findings with irrelevantdata, thereby rendering the process pointless.

An alternative to this problem is to select a cer-tain set of dimensions and levels to monitor, andconstantly monitor them for each indicator [24].This way, the most relevant set of instances forthe user can be monitored automatically while thenumber of queries generated is controlled.

However, a more interesting approach is to directthe search across several dimensions and levels, byproviding criteria that drills into the dimensionscontaining instances that are contributing most tothe deviation, and report only the most significantfindings. This approach allows the user to betterunderstand the problems arising, by focusing on themain contributors to the deviation and gaining in-sight about how the deviation is spread across in-stances and levels.

In order to apply this approach, we adapt theconcept of Information Gain (IG) used in decisiontrees. IG measures the change in entropy betweenthe initial set of instances available and the set clas-sified according to a candidate attribute a. Sim-ilarly, we can consider the deviation in a KPI asa classification problem, where we wish to clas-sify the instances according to their contribution,whether they are responsible for the failure or foran outstanding performance. Then, the set of avail-able dimensions D act as attributes that allow us toseparate the data into meaningful instances whichare failing or succeeding.

Initially, we could consider using KPI values(current or forecasted) to classify each KPI intosucceeding, struggling, or failing. However, not allsubKPIs have the same weight with regards to theparent KPI. For example, North America and Eu-rope can both be failing by 2% for a KPI “Increasesales by 5%”. Nevertheless, North America may beaccounting for 70% of the sales while Europe onlymakes up for 30%. While both subKPIs are failing,North America is a much more relevant instancethan Europe in this case. In order to avoid this pit-fall, we redefine the concept of IG as follows:

CKj = ωj ∗ DKj (2)

IGl =(max(Cl)−min(Cl))

|l|∑j=1

CKj

(3)

Where DK represents the deviation of a KPI K,CKj represents the weighted (ω) contribution of in-stance j to the deviation DK , where ω is calcu-lated according to the size of j compared to otherinstances at the same level for weighted functionssuch as avg, and IGl denotes the IG provided bya candidate dimension level. This definition ofIG prioritizes dimension levels with instances thatpresent more extreme deviation values with regardsto the target as opposed to those that spread the de-viation evenly across all instances. Therefore, thisallows us to direct the search through the data spacedrilling always into the dimension with the biggestdiscrepancy. The denominator will be constant forall dimensions whose members aggregate the value

of the parent KPI. In other words,|l|∑j=1

CKj = DK .

Thus, a higher IG in one dimension implies thatthe deviation is more concentrated into a single in-stance compared to other dimensions.

However, although it is interesting to drill intodimension levels with more extreme values, wemay be interested in slicing the data in such a waythat we find not one instance, but rather sets such asthe top-10 contributors to the deviation. It is easyto see that these are specific cases of a more generalapproach, IG provided by top-K vs bottom-K con-tributors, which redefines Equation 3 as follows:

Page 7: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

Data: K-value : N, KPIs: K, Target Value Projections: P , Dimension Set: D, IL-threshold : ILThResult: Set of deviations found

1 R = ∅;2 foreach k ∈ K do3 append(R, k); Kp = k; Ks = k; IG = 0; IL = 0;4 while IL < ILTh do5 Kp = Ks;6 foreach d in D do7 l = GetCandidateDimensionLevel(d);8 Kd,l = RestrictToDimensionLevel(Kp, l);9 IGd = CalculateIG(Kd,l, P,N);

10 end11 d = GetMaxIGDimension(IGd);12 DrillDownDimension(d);13 l = GetCurrentDimensionLevel(d);14 Ks = selectedInstances(Ks, l);15 IL = IL + CalculateIL(Ks,Kp);16 append(R, Ks);17 end18 end19 return R;

Algorithm 1: Top-K search algorithm

IGl =

(k∑

i=1

asc(Cl))− (k∑

i=1

des(Cl))

|l|∑j=1

CKj

(4)

3.4 An Algorithm for KPI MonitoringSo far we have defined criteria to guide the search.Nevertheless, this criteria does not provide a clearstopping condition. Indeed, although at every stepwe choose the most relevant perspective of the dataaccording to the available dimensions, we wouldpotentially keep drilling down until we reached themost detailed view possible. Such case would onlybe interesting for the user if there were few detailedinstances accounting for most of the total devia-tion, but completely irrelevant otherwise. In orderto avoid this situation we could set a threshold forIG below which we would stop the drilling process.However it is worth noting that the IG of each levelvaries depending on what other levels have beenpreviously selected. Therefore, a threshold cannotbe set and dimensions cannot be discarded from thesearch space just because they initially present alow IG.

As we wish to drill down only as long as thesize of the problem we are looking at is still sig-nificant for the user, we introduce the concept ofInformation Loss (IL). IL represents the intuitivenotion that we are setting aside part of the problem,when we drill down into the most relevant descen-dants of a set of instances. IL captures this value bycalculating the difference between the total devia-tion of the parent KPI K and the contribution to thedeviation (CKj) of the candidate instances selectedfor further search at level l. IL is aggregated as theprocess drills in through multiple levels, eventuallygoing above a threshold specified by the user andmarking the process as finished for the current KPI.

ILKl = DK −|l|∑i=1

CKj ∗ isSelected(i) (5)

Now that we have described our approach forguiding and stopping the drilling process, we onlyneed to define clear criteria as to what instancesshould be selected for further search. The first op-tion is to maintain the same criteria we used forselecting a dimension level, and drill into the top-K candidates. However, as we do not have a prioriknowledge of how the deviation is spread across in-

Page 8: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

stances, it may be the case that less (or more) thanK instances should be analyzed. An alternative is tospecify a minimum coverage threshold Th insteadof the parameter K. Using this criterion, the userspecifies the minimum deviation threshold Th withregards to the parent KPI that he wants to cover oneach iteration:

n∑i=1

CKj

DK> Th (6)

As a result, applying this criterion, the algorithmselects the top-N instances required to meet thethreshold stated by the user, allowing for a flexibleselection on each iteration.

Finally, after defining all the necessary elementsfor our search algorithm, we proceed to describethe process followed, as shown in Algorithm 1. Thealgorithm takes as input the following parameters:number N representing K candidates, the KPIs tobe monitored, the set of target value projections,the set of dimensions available, and the IL thresh-old. Then, for each KPI, it starts the search process.First, it retrieves the candidate level of each dimen-sion, one at a time (line 7). The candidate levelis used to further restrict the current KPI (line 8).The restricted KPI Kd,l is used to calculate the IGprovided by the candidate level l using Equation 4(line 9). Then, the algorithm retrieves the dimen-sion whose candidate provided the highest IG andrestricts the KPI according to the level specified(lines 11, 12). Afterwards, we select the most rel-evant instances according to their contribution andappend them to the relevant results (lines 13, 14,16). Finally, we calculate the IL by comparing thedeviation within the selected subset of instance andthe parent KPI (line 15). After we have finishedthe current iteration, we start the cycle again withthe new subset of instances obtained until we finishwith the current KPI and move to the next one. Wecan see a graphical representation of the process inFigure 2 1.

In this figure, we are drilling through the top-1 instances. First, we drill into the Time dimen-sion to analyze the IG provided by the level Quar-ter (step 1). As the employment is relatively sta-ble across quarters, the IG provided by the Timedimension is very low. Next, we drill into theGeographic dimension to analyze the IG provided

1Example is based on the Europe 2020 dataset, further de-scribed in Section 5.

(xg1) Employment Increase

(xg1) Employment in Greece

(xg1) Employment in Germany(xg1) Employment

in Q1(xg1) Employment in Q4

11 2

K=1

Geographic dimensionCountries

Time dimensionQuarters

IG = 0.3IG=0.8

(xg1) Employment in Greece-Q1

(xg1) Employment in Greece-Q4

3

Time dimensionQuarters 4

Geographic dimensionRegions

(xg1) Employment in Greece-C.Greece

(xg1) Employment in Greece-W.Macedonia

IG = 0.4 IG = 0.6

Figure 2: Example of the multidimensional searchprocess performed by the algorithm

by the level Country (step 2). As certain coun-tries present higher absolute deviations than oth-ers, the Country level provides significantly higherIG. Therefore, we select the Country level as ourchoice to drill in, and drill into the most significantinstance, Greece in our example. After drilling intoGreece we repeat the process again. We drill intothe Time dimension to analyze if Greece presentsany anomaly across quarters (step 3). Since the em-ployment rate is still stable (even if it is failing), theIG is still low. Finally, we analyze the Geographi-cal dimension, this time drilling into the level Re-gion (step 4). As some regions are bigger than oth-ers, the algorithm ranks Central Greece as the re-gion to drill in, even if West Macedonia presents ahigher deviation from its goal.

After presenting the algorithm, we highlight twofinal considerations that affect its implementation.First, it can be the case that a dimension presentsmultiple aggregation paths, i.e. multiple hierar-chies. In this case, the algorithm can be easilyadapted simply by calculating the IG of each can-didate level from a each hierarchy as if they weretwo separate dimensions. Then, once a hierarchyhas been selected by the algorithm, the remaininghierarchies within the same dimension will be ig-nored, as it is not possible to restrict a KPI usingmultiple hierarchies from the same dimension si-multaneously. Second, in order to reduce the num-ber of queries posed to the engine, the queries canbe optimized and the results of each restriction canbe cached. As we will see in Section ??, we cancreate a single query restricted to the subset of in-stances selected on each iteration, thus allowing us

Page 9: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

to drill in parallel across all the relevant instances.As a result, we reduce the number of a queries to asingle query per level analyzed.

A directed search approach using these criteriahas several advantages. First, it generates a limitednumber of queries, Ω(d ∗ l). Second it does not re-quire a priori knowledge of the relevant dimensionsand levels. Third, it allows the user to regulate thesearch, choosing whether the algorithm should stopearlier or drill deeper. Fourth, it allows us to re-port the most relevant instances found during thesearch in a hierarchic manner, thus, avoids flood-ing the user with information, as we will discuss inthe next section.

3.5 Query GeneratorIn order to apply the previous algorithm, we requireto generate a set of queries that retrieve the neces-sary data on each iteration. Query generation is theresponsibility of the query generator module, start-ing from the query generated from the definitionof the KPI and every subsequent query requiredduring the search process. Due to implementationdetails, our approach generates the initial queryin the MDX (MultiDimensional eXpressions) lan-guage (standard input for OLAP engines), whereasthe subsequent queries are generated in SQL. How-ever, this aspect does not alter the logic behind theprocess and it could be performed using either ofthe query languages on its own.

The initial query is generated from the KPIdefinition, following the approach presented in[15]. For example, given the KPI K defined assum employed divided by sum active populationrepresenting employment rate, the followingMDX query is generated (the set of unrestricteddimensions is not outputted unless requested):

WITH MEMBER [Measures].[result1] ASSUM([Measures].[EmployedPopulation])MEMBER [Measures].[result2] ASSUM([Measures].[TotalPopulation])MEMBER [Measures].[result3] AS[Measures].[result1] /[Measures].[result2]SELECT [Measures].[result3] ON 0FROM [Europe2020]

This initial MDX query is translated into SQL bythe OLAP engine, and sliced according to the time

period selected by the user for the monitoring pro-cess. As a result we obtain the initial SQL query forour monitoring process QK0 = M,D,C, whereM specifies a set of measures to be projected intocube cells, D contains the set of dimension levelsthat define the axis of the cube, and C specifies aset of constraints over the dimensions. QK0 acts asthe baseline query for the rest of the queries dur-ing the search process and, thus, all the followingqueries will only add or specialize levels in D andadd additional constraints to C.

At each step i, we need to choose which di-mension dj ∈ D we will drill in next. In orderto make this choice, use given QKi−1, we gener-ate one IG evaluation query QKid for each candi-date dimension level ldj , resulting in a query withDi = Di−1∪ ldj . If Di−1 already includes another,more general level from dj , then this level is substi-tuted by the current more specific candidate ldj . Forexample, given the previous query, the IG evalua-tion query generated for the region level within thecountry dimension is:

SELECT country.Region,(SUM(EmployedPopulation) * 100 /SUM(TotalPopulation)),year.Year_codFROM growth,country,year WHEREcountry.Country_ID=growth.CountryAND year.Year_ID=growth.YearAND year.Year_cod in (2012)GROUP BY country.Region,year.Year_cod

Then, QKid is used to calculate the IG of thedimension dj . Once we have selected the dimen-sion providing the highest IG, a query QKi is gen-erated with Ci = Ci−1 ∪ c, where c constraintsfurther search to the top-K instances contributingto the deviation at level ldj (or the set of instancesrequired for the specified coverage threshold).

Once we have obtained Qi we start the processagain until the stopping criteria is met. Followingthese steps, we have generated all the necessaryqueries needed for our search algorithm. There-fore, the only issue left to tackle is how to presentthe results obtained by the process in an under-standable way.

3.6 Result ReporterAs a result of the process described so far, thereare multiple findings that are relevant to the user.

Page 10: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

Figure 3: Most failing drill-down exploration

However, we need to visualize them in order to helpthe user understand the nature of the problem.

During each step, the first set of elements thatwe identify are the top-K instances that contributemost to the deviation. It is noteworthy that theseinstances are the result of combining current andprevious constraints on each iteration of the algo-rithm. For example, if we first drill into the Quarterlevel and then we drill into the country level for thefirst quarter, then, the instances will be Q1-Canada,Q1-Italy, and so on. Therefore, in order to providedetailed information about the nature of the devi-ation, we organize the instances in a hierarchicalway. At each level we have to provide the nameof the dimension d and the level ld selected by thealgorithm for drilling in, along with with the KPIsof the top-K instances, showing the relationshipswith their corresponding parent KPIs in the previ-ous level. For each instance, we show the % ofcontribution to the global deviation as well as howmuch it is deviating from its particular target.

Additionally, we can also highlight the top-K instances outperforming their expected targets.These instances are represented by the deviatinginstances that present negative deviation values, asTargetk < Actualk. This can help users to iden-tify success factors that contribute to the achieve-ment of their goals. A guided search that drillsinto these elements can easily be derived from theproposed algorithm, simply by selecting the tail-Kcontributors on each step.

Finally, some users may consider relevant to vi-sualize the top-K instances that are deviating mostfrom their own targets (even if they are not the ones

Figure 4: KPI definitions within the system

contributing most to the deviation). As in the pre-vious case, this data can again be easily obtainedsimply by ordering the instances according to theirnormalized deviation instead of their absolute con-tribution during the drill down process. Extractingthis information does not require introducing addi-tional modifications to the algorithm.

The final result is an interactive tree that canbe set to visualize different aspects of the datafound during the search. Therefore, it allows usersto analyze the most relevant findings according tothe expected performance of the different instancesstored in the data. Furthermore, it helps them un-derstand where most of the deviation is located and,with the aid of the dimension levels and instancevalues, aids them in finding why. We can see anexample of this visualization technique in figure 3.

Now that we have described the different partsinvolved in our approach, we proceed to show thedetails of our implementation and our case study inthe following sections.

4 ImplementationWe have implemented a web based prototype formonitoring KPIs as part of a Business Strategy.

It allows for the definition of KPIs which com-prise of a measure definition (the measured quan-tity), a target column definition (the target values),and a market magnitude column definition (how thesize of each sub-market is calculated). The magni-

Page 11: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

tude definition allows us to calculate the weightsfor our algorithm, and when it is not defined, acount of the instances in each dimension is used asa measure of magnitude. Additionally, these KPIscan be attached to Goals or Situations in order tomeasure the degree to which they are happening.All the target values are inserted in the database atthe lowest level of aggregation, and are then aggre-gated as averages for capturing the target of largersub-markets. For example if we have defined percountry targets, the target for a group of countrieswill be the weighted among all targets. Differenttypes of aggregations for the targets can be definedby the user if they fit the KPI definition.

All of these definitions are given by the user asaggregation functions that are evaluated above aROLAP schema. An example can be seen in thebottom part of Figure 4, where two KPIs are de-fined, one that measures the employment in Europeand another one that measures the early leaversfrom education. They contain a measure definedas a function evaluated above a snowflake ROLAPschema, a target column, and a magnitude columnwhich are columns of the fact table. This schema isalso defined as input in a setup step, where dimen-sions can be clustered by the user in sequential andnon-sequential ones. Sequential dimensions, suchas time, are used to measure how a KPI value fluc-tuates as they increase, for example trends can bemeasured along the time dimension, and forecast-ing models can applied in the resulting time-series.The system continuously generates such sequencesat the lowest level of aggregation, and uses themto perform forecasting. The resulting forecastedvalues are then stored again in the database, onlyto be overwritten when the real values are avail-able. Non-sequential dimensions on the other hand,can be used to drill in the most interesting sub-markets. The system automatically proposes a setof instances and dimensions to drill in based on ourKPI monitoring algorithm.

The user can browse the data warehouse infor-mation, and monitor a set of KPIs (using the Mon-itor buttons shown in Figure 4). The KPI valuein regards to the target is depicted in this screen,along with the current trend that it is following.By clicking the Monitor button, the user gets a de-tailed graphical break-down (Figure 3) of the re-sults of the algorithm. Each sub-tree representsa dimension along which the analyst can drill in.The sub-trees are ordered based on their informa-

Figure 5: Most successful drill-down exploration

tion gain, and the information gain of each dimen-sion is depicted next to each node. In Figure 3, thefirst dimension that the algorithm chooses to ex-plore (highest information gain) is the region di-mension. It then expands the tree and presentsthe top-4 failing regions (Southern, Eastern, West-ern, Northern), in the next level of the tree for theSouthern dimension for example, the country it-self is chosen as a next drill operation. This givesus a next sub-tree with the countries Spain, Italy,Greece, Portugal, ranked by the degree to whichthey are failing to meet their corresponding targets(either aggregate targets, or individual targets).

In the same way the user can also view the topmost succeeding sub-markets and explore them in-tuitively (Figure 5), with green color representingsuccess and red color representing failure to meettarget. In this case the top-4 regions are Northern,Western, Eastern, Southern, while for the Northernregion (best performing region) the top-4 perform-ers are Norway, UK, Iceland and Estonia.

Finally, if the user is further interested to exam-ine the trend of a sub-market she can get more in-formation through a screen like the one in Figure 6.The overall trend of this sub-market is displayedalong with the target values for every year.

4.1 Implementation DetailsThe backend of the system has been implementedin Python, while the user interface is making exten-sive use of JavaScript, HTML5 and AJAX. More-over, a custom made KPI definition API has beendeveloped in order to support our application. ThisAPI allows for the intuitive definition of complexKPIs and the automated generation of SQL queries,which can be directly evaluated on a Relational

Page 12: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

Figure 6: Trend examination

Database. For our experiments, we have used theMySQL database for storing the data, and the KPImonitoring API has been optimized to generatecompatible queries.

5 EvaluationFor our experiments, we have implemented a datawarehouse in MySQL with information about in-dicators for the European 2020 strategic planningframework, extracted from Eurostat. This scenariois an ideal candidate since it involves real data andwe are aware of the conclusions reached by ana-lysts after analyzing the data.

The data warehouse uses a snowflake schemaand has two dimensions, Country and Year. First,Country dimension presents two hierarchies thatcombine three different levels: Country name →Region→ All, and Country name→ Euro→ All.Second, Year dimension contains the actual year,specified as being a sequential dimension, and cho-sen for calculating all the trends.

Our fact table stores information including totalpopulation, employed population, GDP, and more,as well as target values for employment.

Figure 7: The employment KPI definition

According to this schema we have defined a KPIcalled Employment, as the ratio of the sum of theemployed population over the sum of the total pop-ulation. For the target value of this KPI, we definedan aggregation function that calculates the averageper sub-market target values. Our raw target val-ues were defined per year, per country. For calcu-lating the aggregate target values, we summed theweighted sub-targets. In our case the query thatwas generated for projecting targets to higher-levelsub-markets, weighted each country towards the to-tal population of Europe and used this weight tocalculate the target of each Region. In a recursiveway all the targets were calculated from the rootlevel to the overall Europe target. We chose the lastyear available in the database, 2012, as pivot forperforming the analysis. This latest year can eithercome from forecasting, or using the latest valuesexisting in the system.

In the general case any time point can be usedfor restricting the values in time and running the al-gorithm, resulting in the algorithm showing to theuser why a target at the selected time point is fail-ing. If a forecasted year is used, the user can iden-tify why something “will” fail, if a past year is useda user can identify why something “has failed” andif the present time is used, the user can identify whysomething “is failing”. The same analysis can alsobe done for successful KPIs for identifying the top-k most succeeding sub-markets. Finally, for cal-culating the weights we used the sum of the total

Page 13: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

population of each sub-market for calculating theratio over its immediate parent-market.

Considering this setup, and according to ana-lysts, the algorithm should pinpoint the Region hi-erarchy as providing more interesting informationthan the Euro/non-Euro hierarchy, and rank thecountries according to their contribution to the de-viation of Europe 2020 employment.

5.1 ResultsOur current tool implementation uses the recursivenon-optimised version of the algorithm, and there-fore, it generated 1 query for each instance to getits target value, 1 query to get its current value,and 1 query for getting the magnitude of each in-stance. However, the quick query processing speedallowed us to perform the process iteratively, as theuser decided to drill down more. We display the in-stances of every dimension that is chosen for a drillin, along with the weighted contribution of each in-stance to the higher level failure.

We set the algorithm to recommend the top-4reasons behind a failure or success. Indeed, our al-gorithm chose to split over the Region hierarchy asthe most informative one, and subsequently rankedthe most failing regions.

These regions were: Southern, Eastern, Westernand Northern. The first region (Southern), was ad-ditionally split to its next hierarchy level (countryname) and returned the top failing countries as fol-lows: Spain, Italy, Greece and Portugal (shown inFigure 3. The same split was performed showingthe corresponding countries in other regions.

Finally, we also instructed the algorithm to re-turn the top-4 most successful sub-markets. Thealgorithm again chose the Region dimension (as weexpected since it maximized the information gain)and ranked the regions in the reverse order: North-ern, Western, Eastern and Southern. For the North-ern region (highest performing), the top countrieswere: Norway, UK, Iceland and Estonia.

As we can see, positive contributions to the de-viation mean a failure, while negative contributionsmean that the target is succeeding, and as a resultis compensating for the total loss. This is a veryrelevant aspect for the user, because there can ex-ist instances that are deviating more than the totaldeviation, but are being compensated by other in-stances. The output screen for the top-4 worst per-formers can be seen in Figure 3 and the screen for

the top-4 best performers can be seen in Figure 5.

6 Conclusions and Future workCurrent monitoring techniques lack the detail andsimplicity required by users to be able to pinpointproblems in their business strategies and take cor-rective actions. In this paper we have presented asystematic approach that takes into account users’goals and targets to be met, finds the main elementsthreatening or contributing to them, and presentsthis information back to the user in a simple way.Our approach performs a dynamic, guided searchof the data space, drilling deeper into the levels thatprovide more information in order to pinpoint theproblem and stopping when the size of the problembecomes irrelevant for the user. Additionally, wehave discussed how this criteria can provide differ-ent information to the user. Furthermore, we haveshown how query generation can be automated inorder to perform the search without user interven-tion. Finally, we have presented how our approachis implemented in our CASE tool and have appliedit to a case study using real data.

The lack of a priori knowledge required in ourapproach, as well as the possibility of automatingquery generation, allow the user to identify signifi-cant findings with few mouse clicks. Compared toour approach, traditional monitoring would requiretedious OLAP navigation, chart elaboration, andanalysis, without ensuring that the effort is well-directed. Furthermore, as we focus on the biggestcontributors to the deviation of each target, this al-lows the user to get a better understanding of thereasons behind the deviation and how corrective ac-tions could be taken.

Nevertheless there is still much work to be donein the area. Although in this paper we haveexplored an information measure to successfullyguide the search process, other information mea-sures could be defined in order to look for differentaspects in the data. Additionally, Targeted monitor-ing can only be applied as long as we have avail-able a clear target criteria. In other cases, such aswhen monitoring changes in trends, we require un-targeted monitoring techniques. Furthermore, busi-ness strategy models capture factors that affect theachievement of business goals and could be used toimprove the monitoring process. Finally, all the ef-ficiency of these monitoring techniques needs to betested on very large datasets, such as those stored

Page 14: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

in the data warehouses of international companies.To this aim, our future work is focused on (i) eval-uating and improving the effectiveness of the algo-rithm presented, and (ii) defining and evaluating theapplicability of a monitoring process that empow-ers current SWOT (Strengths, Weaknesses, Oppor-tunities, and Threats) analysis making use of thedata available.

AcknowledgementsThis research was partially funded by the FP7EU ERC Advanced Investigator project Lucretius(grant agreement no. 267856), and developed inthe context of the project GEODAS-BI (TIN2012-37493-C03-03) funded by the Ministry of Edu-cation and the Ministry Economy and Competi-tiveness of Spain (MINECO). Alejandro Mate isfunded by the Generalitat Valenciana under thepostdoctoral grant APOSTD/2014/064.

About the AuthorAlejandro Mate is postdoc researcher at the Uni-versity of Trento, Italy. He received a BS and aMSc Degree in Computer Science from the Uni-versity of Alicante. He earned his PhD degree inComputer Science from the University of Alicantein 2013. He has published several papers in inter-national conferences and journals such as CAiSE,ER, RE, JSS, and IS. His research involves con-ceptual modeling, data warehouses, model drivendevelopment, and requirements engineering.

Kostas Zoumpatianos is a PhD student at thedbTrento group, University of Trento, Italy. Hisresearch involves data warehouses, business intel-ligence and data series management. He holdsa MSc Degree in Information Management anda BSc degree in Information and CommunicationSystems Engineering, both from the University ofthe Aegean, Greece.

Themis Palpanas Themis Palpanas is a profes-sor of computer science at the Paris Descartes Uni-versity (France). He received a BS degree from theNational Technical University of Athens (Greece),and the MSc and PhD degrees from the Universityof Toronto (Canada). He has worked at the Univer-sity of California at Riverside (USA), the IBM T.J.Watson Research Center (USA), and the Universityof Trento (Italy). His solutions have been imple-mented in world leading data management prod-

ucts and he is the author of eight US patents. He isthe recipient of three Best Paper awards, and found-ing member of the Event Processing Technical So-ciety. He has been General Chair for VLDB 2013.

Juan Trujillo is a Full-time Professor at the Uni-versity of Alicante (Spain) and leader of the Lu-centia Research Group. His main research topicsinclude BI applications, Big Data, KPIs, data ware-houses, UML, and MDA. He has also registeredseveral tools related to data warehouse modelling.He has advised 11 PhD students and publishedmore than 150 papers in highly impact conferencessuch as ER, UML or CAiSE, and 38 papers in JCRindexed jounals such as the DKE, DSS, ISOFT, IS,or JDM. He has also been co-editor of nine spe-cial issues in different JCR journals (e.g. DKE).He has also been PC member of different eventsand journals such as ER, DOLAP, DSS, JDM, orDKE, and PC Chair of DOLAP’05, DAWAK’05-’06, FP-UML’05-’09, MoDIC’12 and ER’13.

John Mylopoulos holds a distinguished pro-fessor position (chiara fama) at the University ofTrento, and a professor emeritus position at theUniversity of Toronto. He earned a PhD degreefrom Princeton University in 1970 and joined theDepartment of Computer Science at the Univer-sity of Toronto that year. His research interestsinclude conceptual modeling, requirements engi-neering, data semantics and knowledge manage-ment. Mylopoulos is a fellow of the Association forthe Advancement of Artificial Intelligence (AAAI)and the Royal Society of Canada (Academy ofSciences). He has served as general chair of in-ternational conferences in Artificial Intelligence,Databases and Software Engineering.

References[1] Y Dora Cai, David Clutter, Greg Pape, Ji-

awei Han, Michael Welge, and Loretta Auvil.MAIDS: mining alarming incidents from datastreams. In SIGMOD. ACM, 2004.

[2] Alessandro Camerra, Jin Shieh, Themis Pal-panas, Thanawin Rakthanmanon, and Ea-monn J. Keogh. Beyond one billion timeseries: indexing and mining very large timeseries collections with i sax2+. Knowl. Inf.Syst., 39(1):123–151, 2014.

[3] Chris Chatfield. Time-series forecasting.CRC Press, 2002.

Page 15: A Systematic Approach for Dynamic Targeted Monitoring of KPIsthemisp/publications/cascon14.pdf · A Systematic Approach for Dynamic Targeted Monitoring of KPIs ... pect our sales

[4] Yixin Chen, Guozhu Dong, Jiawei Han, Ben-jamin W Wah, and Jianyong Wang. Multi-dimensional regression analysis of time-series data streams. VLDB, 2002.

[5] Edgar F Codd, Sharon B Codd, and Clynch TSalley. Providing OLAP (on-line analyticalprocessing) to user-analysts: An IT mandate,volume 32. E. F. Codd & Associates, 1993.

[6] Michele Dallachiesa, Besmira Nushi, Kat-siaryna Mirylenka, and Themis Palpanas. Un-certain time-series similarity: Return to thebasics. PVLDB, 5(11):1662–1673, 2012.

[7] Michele Dallachiesa, Themis Palpanas, andIhab F. Ilyas. Top-k nearest neighbor searchin uncertain data series. PVLDB, 8(1), 2015.

[8] Wayne W Eckerson. Performance dash-boards: measuring, monitoring, and manag-ing your business. Wiley, 2010.

[9] Jim Gray, Surajit Chaudhuri, AdamBosworth, Andrew Layman, Don Re-ichart, Murali Venkatrao, Frank Pellow, andHamid Pirahesh. Data cube: A relationalaggregation operator generalizing group-by,cross-tab, and sub-totals. Data Min. Knowl.Discov., 1(1):29–53, January 1997.

[10] Jiawei Han, Yixin Chen, Guozhu Dong, JianPei, Benjamin W Wah, Jianyong Wang, andY Dora Cai. Stream Cube: An Architec-ture for Multi-Dimensional Analysis of DataStreams. Distributed and Parallel Databases,18(2):173–197, 2005.

[11] Jennifer Horkoff, Daniele Barone, Lei Jiang,Eric Yu, Daniel Amyot, Alex Borgida, andJohn Mylopoulos. Strategic business model-ing: representation and reasoning. Software& Systems Modeling, pages 1–27, 2012.

[12] Robert S. Kaplan and David P. Norton. Strat-egy maps: Converting intangible assets intotangible outcomes. Harvard Business Press,2004.

[13] Robert S. Kaplan, David P. Norton, RC Dorf,and M Raitanen. The balanced scorecard:translating strategy into action, volume 4.Harvard Business school press Boston, 1996.

[14] Gert HN Laursen and Jesper Thorlund. Busi-ness analytics for managers: Taking businessintelligence beyond reporting, volume 40.John Wiley & Sons, 2010.

[15] Alejandro Mate, Juan Trujillo, and John My-lopoulos. Conceptualizing and specifying keyperformance indicators in business strategymodels. In CASCON. IBM Corp., 2012.

[16] Object Management Group (OMG). Businessmotivation model 1.2, 2014.

[17] Themistoklis Palpanas and Nick Koudas. En-tropy based approximate querying and explo-ration of datacubes. SSDBM, 2001.

[18] David Parmenter. Key performance indicators(KPI): developing, implementing, and usingwinning KPIs. Wiley, 2010.

[19] Sunita Sarawagi. Explaining differences inmultidimensional aggregates. VLDB, 1999.

[20] Sunita Sarawagi. User-adaptive explorationof multidimensional data. VLDB, 2000.

[21] Sunita Sarawagi, Rakesh Agrawal, and Nim-rod Megiddo. Discovery-Driven Explorationof OLAP Data Cubes. Advances in DatabaseTechnology, 1377, 1998.

[22] Eric Yu, Jennifer Horkoff, Daniel Amyot,Greg Richards, and John Mylopoulos. Busi-ness modeling for business intelligence.Synthesis Lectures on Data Management,5(1):19–32, 2013.

[23] Konstantinos Zoumpatianos, Themis Pal-panas, and John Mylopoulos. Strategic man-agement for real-time business intelligence.In BIRTE, 2012.

[24] Konstantinos Zoumpatianos, Themis Pal-panas, John Mylopoulos, Alejandro Mate,and Juan Trujillo. Monitoring and diagnos-ing indicators for business analytics. In CAS-CON. IBM Corp., 2013.

[25] Kostas Zoumpatianos, Stratos Idreos, andThemis Palpanas. Indexing for interactiveexploration of big data series. In SIGMOD,2014.