
A Conceptual Development of Quench Prediction App Built on LSTM and ELQA Framework

Matej Mertik (a), Maciej Wielgosz (b), Andrzej Skoczeń (c)

(a) The European Organization for Nuclear Research – CERN, CH-1211 Geneva 23, Switzerland
(b) Faculty of Computer Science, Electronics and Telecommunications, AGH University of Science and Technology, Kraków, Poland
(c) Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, Kraków, Poland

Abstract

This article presents the development of a web application for quench prediction in the Technology Department, Machine Protection and Electrical Integrity group, Electrical Engineering section (TE-MPE-EE) at CERN. The authors describe the ELectrical Quality Assurance (ELQA) framework, a platform designed for the rapid development of web-integrated data analysis applications for the different analyses needed during the hardware commissioning of the Large Hadron Collider (LHC). In the second part, the article describes research carried out with data collected from the Quench Detection System by means of an LSTM recurrent neural network. The article discusses and presents conceptual work on implementing a quench prediction application for TE-MPE-EE based on ELQA and the quench prediction algorithm.

Keywords: Software development, Data analysis, Web applications, Electrical Quality Assurance (ELQA), Large Hadron Collider, Deep Learning, Recurrent neural networks

1. INTRODUCTION

The Large Hadron Collider (LHC) is the largest and most complex experimental facility ever created. It was built by the European Organization for Nuclear Research (CERN) between 1998 and 2008 in collaboration with over 10,000 scientists and engineers from over 100 countries, as well as hundreds of universities and laboratories. Its 27 kilometre circumference incorporates (among others) 1232 superconducting dipole magnets that lie beneath the France–Switzerland border near Geneva.

Starting up this facility requires many teams of professionals and complex procedures, owing to its superconducting nature, spanning the domains of cryogenics, electrical engineering, electronics, computer science, material science, physics and others. TE-MPE-EE at CERN performs the important task of

Email addresses: [email protected] (Matej Mertik), [email protected] (Maciej Wielgosz), [email protected] (Andrzej Skoczeń)

Preprint submitted to ArXiv April 2, 2018

arXiv:1610.09201v1 [cs.LG] 25 Oct 2016


hardware commissioning of the LHC machine, where all circuits of the machine are verified with many electrical measurements under different conditions during this complex procedure. The signals from these measurements are stored in different systems at CERN, where the data are available for analysis. The ELQA team started to develop data analytic applications to look for patterns in the data from the hardware commissioning campaigns of 2008 and 2014. To this end, a framework for prototyping data analysis apps was developed in TE-MPE-EE.

The ELQA framework provides building blocks for designing web applications for data analysis. In this paper we present the conceptual development of a Quench Prediction Application based on research work on the prediction of quenches and its preliminary results. We show the concept of integrating

(a) research on a deep learning algorithm for quench prediction, and
(b) research in software design for prototyping data analysis applications for TE-MPE-EE.

The paper is structured as follows. The first sections describe the ELQA framework and the software libraries used in its design; then the LSTM concept for predicting quenches with deep learning is described on real data. In the last part we present a concept for implementing the Quench Prediction App for TE-MPE-EE at CERN.

2. ELQA FRAMEWORK

The ELQA framework was developed for the rapid design of data analysis applications at TE-MPE-EE. It was designed for the ELQA group, which needs to address the vast amount of data collected during the hardware commissioning of the Large Hadron Collider (LHC). In the commissioning procedure, the condition of the circuits of the machine is verified by different measurements through all 8 sectors of the machine. During software design the following requirements were identified:

(a) searching for signal similarities over a vast number of tests,
(b) visualizing trends of measurements in the time domain,
(c) developing corresponding data models for different flavours of ELQA measurements such as TP4 (Test Procedure 4), DOC (Dipole Orbit Correctors) and MIC (Magnet Instrumentation Check),
(d) developing different scenarios of visualization for different groups of measurements.

Two additional software characteristics were specifically addressed:

(a) the application should run in a web environment, and
(b) the framework should support high user interaction when exploring the data.

As such, the ELQA framework provides different components for building web applications. Three main libraries were used, on the basis of which the concept was defined:


Figure 1: The components of the framework for rapid design of data applications

(a) Bokeh, a Python interactive visualization library that targets modern web browsers for presentation [1]; the library was selected after testing visualization libraries against the requirements [2],
(b) Django, a high-level Python web framework providing an object-oriented paradigm for accessing the relational dataset of ELQA [3], and
(c) Scikit-learn, a machine learning library used for the implementation of the KDD algorithms [4].

Alongside these three main libraries, other dependent libraries and components of the CERN software infrastructure were integrated. The relationships between them are shown in Fig. 1.

The following subsections explain some detailed characteristics of the framework.

2.1. ELQA FRAMEWORK CHARACTERISTICS

2.1.1. Data models

Access to the data is handled with Object-Relational Mapping (ORM). The Django framework provides this mapping, offering all the functionality of the Structured Query Language (SQL). The architecture of Django is layered, with

(a) the bottom layer, which is the ELQA database, followed by
(b) the access library responsible for communication between Python and the database through SQL statements, here the Python library cx_Oracle [5], and


(c) the specific Django database back-end; any database supported by Django (PostgreSQL, MySQL, SQLite and Oracle) can be used [3].

On top of Django's layers, a data model is defined for accessing the ELQA data. Data models can be efficiently programmed using Django tools such as inspectdb, which automatically derives model definitions from existing database tables; relationships between the tables are defined separately.
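As an illustration, a minimal sketch of such a Django model is shown below; the table and field names are hypothetical, since the actual ELQA schema is not reproduced in this paper. In practice the classes would be generated with inspectdb and then adjusted by hand.

```python
# Hypothetical ELQA data model classes; real names would come from
# `python manage.py inspectdb` run against the ELQA database.
from django.db import models

class Test(models.Model):
    name = models.CharField(max_length=64)   # e.g. "TP4", "DOC", "MIC"
    sector = models.IntegerField()           # LHC sector 1..8
    executed_at = models.DateTimeField()

    class Meta:
        managed = False                      # table already exists
        db_table = 'ELQA_TEST'

class Signal(models.Model):
    test = models.ForeignKey(Test, on_delete=models.DO_NOTHING)
    channel = models.CharField(max_length=32)
    samples = models.BinaryField()           # raw measurement samples

    class Meta:
        managed = False
        db_table = 'ELQA_SIGNAL'
```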

2.1.2. Preprocessors

Preprocessors are classes for analytical preprocessing of the signals, within operators defined by the ELQA engineers. For feature extraction from the signals, the following basic statistics are currently used as extractors (and can be extended): average, minimal value, maximal value, skewness, kurtosis and properties of the linear regression applied to each signal, such as slope and standard error. The SciPy and NumPy libraries [6, 7] are used in the preprocessor implementation.
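For illustration, such an extractor could be sketched with SciPy and NumPy as follows; the class name and the returned feature names are illustrative, not the actual ELQA preprocessor interface.

```python
# A minimal sketch of a preprocessor extracting the basic statistics
# listed above from one 1-D signal; the interface is illustrative.
import numpy as np
from scipy import stats

class BasicStatsPreprocessor:
    def extract(self, signal):
        """Return a fixed-length feature dictionary for one signal."""
        x = np.arange(len(signal))
        slope, intercept, r, p, stderr = stats.linregress(x, signal)
        return {
            'average': np.mean(signal),
            'min': np.min(signal),
            'max': np.max(signal),
            'skewness': stats.skew(signal),
            'kurtosis': stats.kurtosis(signal),
            'slope': slope,
            'stderr': stderr,
        }
```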

2.1.3. Data-miners

Scikit-learn, a machine learning library written in Python, was selected for the data miners in the ELQA framework. It is built on NumPy, SciPy and matplotlib, and it is licensed under the BSD open source licence [4]. Although many solutions and tools for data analysis are available as open-source or commercial packages (some of the most widely used tools from the domains of Data Mining, Analytics, Big Data and Data Science are listed on the KDnuggets network maintained by a group of professionals and researchers [8]), Scikit-learn was selected as the most appropriate because it is written in Python and, importantly, has a strong community around the library.

The data-miner layer defines a general analyzer class. Currently two different clustering algorithms are integrated from scikit-learn: K-means clustering [9] and DBSCAN, a density-based spatial clustering of applications with noise proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996 [10].

The analyzer class can be extended with other techniques from the scikit-learn library, or with other libraries and algorithms in the framework; a minimal sketch of how such a wrapper might look is shown below. In the following section we present a concept for using an LSTM-based solution for detecting and predicting quenches.
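The sketch assumes a bare Analyzer base class; the actual ELQA class definitions are not reproduced here.

```python
# A minimal sketch of the data-miner layer wrapping scikit-learn
# clustering algorithms behind a common analyzer interface.
from sklearn.cluster import KMeans, DBSCAN

class Analyzer:
    """Illustrative stand-in for the general ELQA analyzer class."""
    def fit(self, features):
        raise NotImplementedError

class KMeansAnalyzer(Analyzer):
    def __init__(self, n_clusters=8):
        self.model = KMeans(n_clusters=n_clusters)

    def fit(self, features):
        return self.model.fit_predict(features)   # one label per signal

class DBScanAnalyzer(Analyzer):
    def __init__(self, eps=0.5, min_samples=5):
        self.model = DBSCAN(eps=eps, min_samples=min_samples)

    def fit(self, features):
        return self.model.fit_predict(features)   # label -1 marks noise
```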

2.1.4. Visualizers

Visualizers serve user interaction and visualization of the data. When searching for appropriate tools, many frameworks in different programming languages were considered. Given the required architecture (see the other evaluated libraries in Fig. 1), Bokeh was selected after studying the available solutions. The main reason was Bokeh's ability to act as a web server, exposing a fully-fledged data visualization and analysis tool to multiple users [2].
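For illustration, a minimal Bokeh server application might look as follows (run with `bokeh serve app.py`); the plot and data source are placeholders.

```python
# A minimal sketch of a Bokeh server app exposing one interactive plot
# to multiple users; the signal columns are placeholders.
from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource(data=dict(t=[], u=[]))

fig = figure(title="Signal viewer", x_axis_label="t [s]",
             y_axis_label="U [V]")
fig.line(x='t', y='u', source=source)

curdoc().add_root(fig)
```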

3. CONCEPT OF USING THE LSTM-BASED SOLUTION

The idea behind using the LSTM algorithm for quench detection and prediction is based on the unique and very useful properties of this deep learning algorithm in time series modelling. It turns out that in some cases the currently used quench protection


solution [11, 12] can be supplemented with an additional layer which monitors the performance of the Quench Protection System.

One of the biggest challenges in implementing the LSTM concept is the construction of the network architecture by choosing the right set of hyper-parameters, such as the number of layers, the number of neurons within each layer and the input size. This is done in a trial-and-error fashion, which can be substantially aided by an appropriate framework. Such a framework allows visualization of the performance of a given network instance as well as easy manipulation of its macro-parameter values.

3.1. RECURRENT NEURAL NETWORKS

Phenomena occurring in the real world may be perceived in spatial and temporal terms. Since neural models are supposed to mimic the real world, two different branches of neural networks evolved to address applications belonging to these categories. Feed-forward and CNN networks are most often used in applications in the spatial domain, whereas various kinds of recurrent models, such as RNN, LSTM and GRU, are especially useful in temporal modelling tasks.

Unlike traditional models, which are mostly based on hand-crafted features, deep learning neural networks can operate directly on raw data. This makes them especially useful in applications where extracting features is very hard or even impossible. There are many fields of application where no experts exist who can handle feature extraction, or where the area is simply uncharted and we do not know whether the data contain latent patterns worth exploring [13–15].

The foundations of most currently used neural network architectures were laid between 1960 and 1990. For almost two decades afterwards, researchers were not able to take full advantage of these powerful models, and there were claims that they were inherently ineffective. The whole machine learning landscape changed in the early 2010s, when deep learning algorithms started to achieve state-of-the-art results in a wide range of learning tasks. The breakthrough was brought about by several factors, among which computing power, a deluge of widely available data and affordable storage are considered the critical ones. It is worth noting that in the presence of large amounts of data, conventional simple linear models tend to under-fit or under-utilize computing resources.

Deep learning models can operate in both the spatial and the temporal domain. However, standard feed-forward and CNN neural networks are well suited to the spatial domain [16–18], in contrast to recurrent neural networks, which perform much better in applications in the temporal domain [19–21]. CNNs and feed-forward networks rely on the assumption of independence of the data within the training and testing sets, as presented in Fig. 2. This means that after each training item is presented to the model, the current state of the network is lost, i.e. temporal information is not taken into account in training the model.

For independent data this is not an issue, but for data containing crucial time or space relationships it may lead to the loss of most of the information located between steps. Additionally, feed-forward models expect training vectors of fixed length, which is not always available, especially when dealing with time-domain data.


Figure 2: Architecture of a standard feed-forward neural network (input, hidden and output layers)

Figure 3: General overview of recurrent neural networks (shown unrolled in time, with inputs x(t), hidden states h(t) and outputs y(t))


Figure 4: Mapping RNN outputs (input, recurrent hidden layer, feed-forward layer, output)

Recurrent neural networks (RNNs) are models with the ability to process sequential data one element at a time. Thus they can simultaneously model sequential and time dependencies on multiple scales. Unfortunately, the range of practical applications of standard RNN architectures is quite limited, because the influence of a given input on the hidden and output layers either decays or blows up exponentially as it moves across the recurrent connections during training. This effect is known as the vanishing gradient problem [22]. There were many unsuccessful attempts to address the problem before the LSTM, eventually introduced by Hochreiter and Schmidhuber [23], ultimately solved it.

Recurrent neural networks may be visualized either as looped-back architectures of interconnected neurons or unrolled in time, as presented in Fig. 3. Originally RNNs were meant to be used with single-variable signals, but they were also adapted for multiple input streams [22].

It is common practice to use feed-forward and recurrent layers together in order to map outputs from an RNN or LSTM to the result space, as presented in Fig. 4.

In practical applications, the LSTM has shown an extraordinary ability to learn long-range dependencies compared to standard RNNs. Therefore, most state-of-the-art applications use the LSTM model [24].

The internal structure of the LSTM is based on a set of connected memory blocks, as presented in Fig. 5. The memory blocks contain memory cells with looped-back connections storing the temporal state of the network, in addition to special units called gates which serve as an interface for information propagation within the network.

There are three different gates in each memory block:

(a) the input gate, which controls input activations into the memory cell,
(b) the output gate, which controls the outflow of cell activations into the rest of the network, and
(c) the forget gate, which scales the internal state of the cell before summing it with the input through the self-recurrent connection of the cell, enabling gradual forgetting of the cell memory state.


Figure 5: Architecture of an LSTM memory block containing one memory cell (inputs x(t) and h(t−1); input gate i_c(t), forget gate f_c(t), output gate o_c(t); cell input g_c(t); internal state s_c(t); output h_c(t))



Modern LSTM architectures may also contain peephole connections [25]. Since we do not use them in our experiment, they are not depicted in Fig. 5 or addressed in this description.

The output of each LSTM block is calculated according to the following set of equations:

$$g^{(t)} = \phi\left(W_{gx} x^{(t)} + W_{gh} h^{(t-1)} + b_g\right) \tag{1}$$

$$i^{(t)} = \sigma\left(W_{ix} x^{(t)} + W_{ih} h^{(t-1)} + b_i\right) \tag{2}$$

$$f^{(t)} = \sigma\left(W_{fx} x^{(t)} + W_{fh} h^{(t-1)} + b_f\right) \tag{3}$$

$$o^{(t)} = \sigma\left(W_{ox} x^{(t)} + W_{oh} h^{(t-1)} + b_o\right) \tag{4}$$

$$s^{(t)} = g^{(t)} \odot i^{(t)} + s^{(t-1)} \odot f^{(t)} \tag{5}$$

$$h^{(t)} = \phi\left(s^{(t)}\right) \odot o^{(t)} \tag{6}$$

While examining Eqs. 1–5 it may be noticed that values for the current and previous time steps are used for the hidden layer vector h as well as for the internal state s. Consequently, $h^{(t)}$ denotes the value of the hidden state vector at the current time step, whereas $h^{(t-1)}$ refers to the previous step. It is also worth noting that the equations use vector notation, which means that they address the whole set of LSTM cells. To address a single cell, a subscript c is used, as presented in Fig. 5, where for instance $h_c^{(t)}$ refers to the value of the hidden state within this particular cell.

An LSTM-based network learns when to let activation into the internal states of its cells and when to let the activation out. This is implemented with a gating mechanism in which all the gates are considered separate constituents of the LSTM block with their own learning capability. This means that they adapt in the training process as separate units to preserve a proper information flow. When both gates are closed, the activation is cut off from the outside connections, neither growing nor shrinking, nor affecting the output at intermediate time steps. To make this possible, a hard sigmoid function σ is used, which can output exactly 0 and 1, as given by Eq. 7. This means that the gates can be fully open or fully closed.

$$\sigma(x) = \begin{cases} 0 & \text{if } x \le t_l \\ ax + b & \text{if } x \in (t_l, t_h) \\ 1 & \text{if } x \ge t_h \end{cases} \tag{7}$$

In terms of the backward pass, the so-called constant error carousel enables the gradient to propagate back through many time steps, neither exploding nor vanishing [23, 24].
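To make the equations concrete, below is a minimal NumPy sketch of a single forward step of an LSTM layer following Eqs. 1–7; the weight layout and the hard-sigmoid thresholds t_l, t_h are illustrative assumptions, and φ is taken to be tanh, as in Fig. 5.

```python
# A minimal NumPy sketch of an LSTM forward step following Eqs. 1-7.
# Weight layout and hard-sigmoid thresholds (t_l, t_h) are assumptions.
import numpy as np

def hard_sigmoid(x, t_l=-2.5, t_h=2.5):
    # Eq. 7 with a = 1/(t_h - t_l) and b = -t_l/(t_h - t_l):
    # linear between the thresholds, exactly 0 or 1 outside them.
    a = 1.0 / (t_h - t_l)
    return np.clip(a * (x - t_l), 0.0, 1.0)

def lstm_step(x_t, h_prev, s_prev, W, b):
    """One time step for a whole LSTM layer (vector notation).

    W maps gate name -> (W_x, W_h) input/recurrent weight matrices,
    b maps gate name -> bias vector, as in Eqs. 1-4.
    """
    g = np.tanh(W['g'][0] @ x_t + W['g'][1] @ h_prev + b['g'])       # Eq. 1
    i = hard_sigmoid(W['i'][0] @ x_t + W['i'][1] @ h_prev + b['i'])  # Eq. 2
    f = hard_sigmoid(W['f'][0] @ x_t + W['f'][1] @ h_prev + b['f'])  # Eq. 3
    o = hard_sigmoid(W['o'][0] @ x_t + W['o'][1] @ h_prev + b['o'])  # Eq. 4
    s = g * i + s_prev * f                                           # Eq. 5
    h = np.tanh(s) * o                                               # Eq. 6
    return h, s
```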


Table 1: Data sets used for training and testing the model

    Data set    Size [MB]
    small       22
    medium      111
    large       5000

3.2. EXPERIMENTAL SETUP

The Timber database stores a record of many years of magnet activity. This is a huge amount of data with relatively few quench events. The first research question we had to address concerned the time span surrounding a quench event that should be taken into account for training the LSTM model. A one-day record of a single voltage time series for one magnet occupies roughly 100 MB, and there are several voltage time series associated with a single magnet [26], but ultimately we decided to use only one in our experiments, namely U_res.

Various kinds of magnets are used in the LHC, and they generate different numbers of quench events. From the perspective of our research it is highly beneficial to pick a magnet type for which the largest possible number of quenches is available. The longest history of quenches in the Timber database was available for the 600 A magnets. Therefore, we decided to focus our initial research on the 600 A magnets. Unfortunately, the data covering the quench periods of the 600 A magnets is very large, i.e. of the order of several gigabytes. However, as mentioned before, the activity record of the superconducting magnets during the operational time of the LHC is composed mostly of sections of normal work and only partially of quench events. Furthermore, Timber does not enable automated extraction of quench periods, despite having many useful features for data preprocessing and information extraction.

Training deep learning models takes a long time, even when fast GPUs are employed. Therefore, the choice of the size of the training data and of the set of hyper-parameters of a model is critical for the application performance. In the course of the experiments we discovered that it is faster to first train a model on a small data set to preliminarily adjust the hyper-parameters, and then tweak them on bigger data sets.

Consequently, we created three different data sets: small, medium and large, as presented in Tab. 1.

The tests are in progress, with the main goal of validating the capability of the LSTM-based setup to model superconducting magnet behavior. The resolution of the data acquired from the Timber database is far too low to determine its ability to predict or detect quenches. This will require using more data of higher resolution, such as post-mortem records, and is planned as future work.
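As an illustration of the staged procedure described above, a minimal sketch using Keras is given below; the layer sizes, window length and file names are assumptions rather than the configuration used in our experiments.

```python
# A minimal Keras sketch of the staged training procedure: adjust
# hyper-parameters on the small set first, then retrain on larger sets.
# Layer sizes, window length and file names are illustrative assumptions.
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

WINDOW = 128  # number of past voltage samples fed to the network

def make_model(units=32):
    model = Sequential([
        LSTM(units, input_shape=(WINDOW, 1)),  # one U_res channel
        Dense(1),                              # predict the next sample
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

def windows(series):
    """Slice a 1-D voltage series into (input window, next value) pairs."""
    X = np.stack([series[i:i + WINDOW] for i in range(len(series) - WINDOW)])
    y = series[WINDOW:]
    return X[..., np.newaxis], y

model = make_model()
for path in ['small.npy', 'medium.npy', 'large.npy']:  # Tab. 1 data sets
    X, y = windows(np.load(path))
    model.fit(X, y, epochs=5, batch_size=64)
```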

4. INTEGRATING LSTM MODEL INTO ELQA FRAMEWORK

The described LSTM concept developed at TE-MPE-EE can be efficiently used within the ELQA framework to implement a quench prediction application for the end user.


Figure 6: Integration of framework and LSTM

The following characteristics and key benefits were identified for joining these two pieces of research:

(a) efficient application of the research conducted on quench detection and prediction,
(b) incorporation of the quench detection application into a web environment,
(c) provision of an efficient graphical user interface for the end user, and
(d) availability of a highly interactive user experience in the developed quench detection application.

Fig. 6 presents a possible integration of the LSTM into the framework. It can be seen that the LSTM concept should be wrapped by a new LSTM analyzer class and then used in the ELQA framework. The graphical user interface is the next step, to be designed through the definition of the Quench Prediction App Dashboard, a key part of any web app developed for data analysis based on ELQA.

The following subsection provides an overview of the integration process for the deep learning LSTM.

4.1. INTEGRATING LSTM-BASED SOLUTION

The Analyser class in ELQA is the general class into which data-miner algorithms from the Scikit-learn library are integrated. The LSTM concept for quench prediction is implemented in Python at TE-MPE-EE and can be logically integrated into ELQA with an LSTM class extending the Analyser.
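A minimal sketch of such a wrapper is shown below; the Analyser base class and the windowing interface are illustrative stand-ins, since the actual ELQA class definitions are not reproduced here.

```python
# A minimal sketch of wrapping the LSTM quench prediction model as an
# ELQA analyzer; base class and windowing interface are illustrative.
import numpy as np

WINDOW = 128  # past samples per prediction, as in the training sketch

class Analyser:
    """Illustrative stand-in for the general ELQA analyser class."""
    def fit(self, data):
        raise NotImplementedError

class LSTMAnalyser(Analyser):
    def __init__(self, model):
        self.model = model  # e.g. the Keras LSTM from the training sketch

    @staticmethod
    def _windows(series):
        X = np.stack([series[i:i + WINDOW]
                      for i in range(len(series) - WINDOW)])
        return X[..., np.newaxis], series[WINDOW:]

    def fit(self, series):
        """Train (or fine-tune) the wrapped LSTM on voltage windows."""
        X, y = self._windows(series)
        self.model.fit(X, y)
        return self

    def predict(self, series):
        """Return the model's next-step predictions for a voltage series."""
        X, _ = self._windows(series)
        return self.model.predict(X)
```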


Figure 7: Integration of the framework and LSTM; grey entities represent the LSTM quench prediction parts in the ELQA framework

Access to the data in the Timber database will be provided by a new data model. The ELQA framework allows a new stack to be defined for integrating access to various datasets through the Django library. Fig. 7 presents a possible integration.

Once the model and algorithm are incorporated into the ELQA classes, full development of the dashboard can then proceed through the framework's user interface and interaction facilities, as developed for the Quench Prediction App.

4.2. IMPLEMENTING THE QUENCH PREDICTION DASHBOARD - APP DEVELOPMENT

With the data model for accessing the Timber dataset incorporated and the LSTM algorithm of the quench prediction software wrapped within the Analyser class, efficient development of the web application for quench prediction can proceed by programming an inherited LSTM dashboard. In the LSTM Dashboard, widgets for the graphical user interface are defined and then integrated through the Bokeh library (input forms, plots), and the data are fed to the LSTM algorithm. Fig. 8 presents the concept of the Dashboard implementation with ELQA.

It can be seen (Fig. 8) that the conceptual building blocks when programming the Dashboard for the quench prediction application follow the objects that need to be defined in the Dashboard (a minimal sketch of wiring them together follows the list):

(a) input widgets for user interaction (lists, input boxes, sliders, etc.),


Figure 8: Dashboard design with ELQA for Quench Prediction app

(b) plot widgets for plotting series,
(c) data sources from the data model (the Django model for accessing Timber), and
(d) the LSTM algorithm as the data-miner for the quench prediction application.
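The sketch below wires these four building blocks together with Bokeh (run with `bokeh serve dashboard.py`); the magnet names, the load_series() helper and the analyser object are placeholders, not the actual application code.

```python
# A minimal sketch of the Quench Prediction App dashboard wiring the
# four building blocks above; data loader and analyser are stand-ins.
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Select
from bokeh.plotting import figure

def load_series(magnet):
    # (c) data source: stands in for the Django model accessing Timber
    return np.sin(np.linspace(0, 20, 500))

class StubAnalyser:
    # (d) stands in for the LSTMAnalyser sketched in Sec. 4.1
    def predict(self, series):
        return series

analyser = StubAnalyser()
source = ColumnDataSource(data=dict(t=[], u=[], pred=[]))

select = Select(title="Magnet", value="600A-1",
                options=["600A-1", "600A-2"])          # (a) input widget

fig = figure(title="U_res and LSTM prediction",        # (b) plot widget
             x_axis_label="t", y_axis_label="U [V]")
fig.line('t', 'u', source=source, legend_label="measured")
fig.line('t', 'pred', source=source, legend_label="predicted",
         line_dash="dashed")

def update(attr, old, new):
    u = load_series(new)
    pred = analyser.predict(u)
    source.data = dict(t=np.arange(len(u)), u=u, pred=pred)

select.on_change('value', update)
update('value', None, select.value)                    # initial draw
curdoc().add_root(column(select, fig))
```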

5. CONCLUSIONS

We presented a concept for integrating LSTM techniques for quench detection into a web application using the ELQA framework for prototyping data analysis applications, developed at TE-MPE-EE at CERN. The main characteristics of the framework were presented, as was the LSTM concept for quench detection. We showed the mechanism and key benefits of joining these two pieces of work to develop a quench prediction app for TE-MPE-EE at CERN. As explained, the LSTM concept can be integrated efficiently to provide an application that can be used by engineers during the hardware commissioning procedures of the LHC at CERN.

Acknowledgments

The framework is being tested for prototyping data applications by the ELQA team. We would like to thank the entire ELQA team and TE-MPE-EE at CERN for their invaluable help throughout the development process.


References

[1] Bokeh Development Team, Bokeh: Python library for interactive visualization (2014). URL http://www.bokeh.pydata.org

[2] L. Barnard, M. Mertik, Usability of visualization libraries for web browsers for use in scientific analysis, International Journal of Computer Applications 121 (1). URL http://research.ijcaonline.org/volume121/number1/pxc3904225.pdf

[3] Django Software Foundation, Django documentation [online] (2015).

[4] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al., Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12 (Oct) (2011) 2825–2830.

[5] A. Tuininga, cx_Oracle [online] (2015).

[6] E. Jones, T. Oliphant, P. Peterson, et al., SciPy: Open source scientific tools for Python [online] (2001–).

[7] S. Van Der Walt, S. C. Colbert, G. Varoquaux, The NumPy array: a structure for efficient numerical computation, Computing in Science & Engineering 13 (2) (2011) 22–30. doi:10.1109/MCSE.2011.37.

[8] G. Piatetsky-Shapiro, KDnuggets [online] (2015).

[9] J. A. Hartigan, M. A. Wong, Algorithm AS 136: A K-Means Clustering Algorithm, Journal of the Royal Statistical Society, Series C (Applied Statistics) 28 (1) (1979) 100–108. doi:10.2307/2346830. URL http://www.jstor.org/stable/2346830

[10] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: KDD, Vol. 96, AAAI Press, 1996, pp. 226–231.

[11] L. Coull, D. Hagedorn, V. Remondino, F. Rodriguez-Mateos, LHC magnet quench protection system, IEEE Transactions on Magnetics 30 (4) (1994) 1742–1745. doi:10.1109/20.305593.

[12] R. Filippini, B. Dehning, G. Guaglio, F. Rodriguez-Mateos, R. Schmidt, B. Todd, J. Uythoven, A. Vergara-Fernandez, M. Zerlauth, Reliability assessment of the LHC machine protection system, in: Proceedings of the 2005 Particle Accelerator Conference, 2005, pp. 1257–1259. doi:10.1109/PAC.2005.1590726.

[13] C. Y. Chang, J. J. Li, Application of deep learning for recognizing infant cries, in: 2016 IEEE International Conference on Consumer Electronics-Taiwan (ICCE-TW), 2016, pp. 1–2. doi:10.1109/ICCE-TW.2016.7520947.

[14] M. Cai, Y. Shi, J. Liu, Deep maxout neural networks for speech recognition, in: Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on, 2013, pp. 291–296. doi:10.1109/ASRU.2013.6707745.

[15] L. Tóth, Phone recognition with deep sparse rectifier neural networks, in: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013, pp. 6985–6989. doi:10.1109/ICASSP.2013.6639016.

[16] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436–444. doi:10.1038/nature14539.

[17] C. Farabet, C. Couprie, L. Najman, Y. LeCun, Learning hierarchical features for scene labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (8) (2013) 1915–1929. doi:10.1109/TPAMI.2012.231.

[18] Y. LeCun, Deep learning convolutional networks, in: 2015 IEEE Hot Chips 27 Symposium (HCS), 2015, pp. 1–95. doi:10.1109/HOTCHIPS.2015.7477328.

[19] J. Morton, T. A. Wheeler, M. J. Kochenderfer, Analysis of recurrent neural networks for probabilistic modeling of driver behavior, IEEE Transactions on Intelligent Transportation Systems PP (99) (2016) 1–10. doi:10.1109/TITS.2016.2603007.

[20] F. Pouladi, H. Salehinejad, A. M. Gilani, Recurrent neural networks for sequential phenotype prediction in genomics, in: 2015 International Conference on Developments of E-Systems Engineering (DeSE), 2015, pp. 225–230. doi:10.1109/DeSE.2015.52.

[21] X. Chen, X. Liu, Y. Wang, M. J. F. Gales, P. C. Woodland, Efficient training and evaluation of recurrent neural network language models for automatic speech recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing 24 (11) (2016) 2146–2157. doi:10.1109/TASLP.2016.2598304.

[22] A. Graves, Supervised Sequence Labelling with Recurrent Neural Networks, Springer Berlin Heidelberg, 2012. doi:10.1007/978-3-642-24797-2.

[23] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735–1780. doi:10.1162/neco.1997.9.8.1735.

[24] Z. C. Lipton, J. Berkowitz, C. Elkan, A critical review of recurrent neural networks for sequence learning (2015). arXiv:1506.00019.

[25] K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, J. Schmidhuber, LSTM: A search space odyssey (2015). arXiv:1503.04069.

[26] C. Roderick, L. Burdzanowski, G. Kruk, The CERN accelerator logging service – 10 years in operation: a look at the past, present, and future, Proceedings of ICALEPCS2013 (2013) 612–614.
