Pattern Matching in DAME using AURA technology

Jim Austin, Robert Davis, Bojian Liang, Andy Pasley

University of York

Distributed Aircraft Maintenance Environment - DAME

Overview

• Context
• AURA technology
• DAME pattern matching problem
• AURA solution
• Search performance
• Next steps


Context

• Vibration data from all engines in flight
• Detection of unusual vibration patterns
  – Novelties, anomalies
  – Automatic or manual
• Search for similar vibration behaviour
  – Need to search large volumes of historical vibration data
• Investigate search results and associated data
  – Service data records
  – CBR tools: Sheffield


AURA technology

• AURA
  – Proven technology for searching large data sets
  – Ability to scale and maintain performance
  – Easily parallelised
• Examples
  – Address matcher
  – Molecular matcher
• Operation
  – Vectors compared to stored examples
  – Uses bit level comparison methods
  – Correlation Matrix Memory operations


AURA architecture

[Figure: AURA architecture diagram. Labels: Data Adaptor; Store / Search; Indexer; Input pattern (binary); AURA Search Engine; Output pattern; Candidate Selector; Candidate Engine (Back check); Result Store; Indexes or Data; Results.]


AURA storage & recall

[Figure: AURA storage and recall diagram. Labels: Input pattern (binary); AURA Search Engine; Correlation Matrix Memories; Output pattern.]
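The storage and recall steps can be illustrated with a toy binary correlation matrix memory. This is a minimal sketch under assumptions, not the AURA library API: the class name `BinaryCMM`, its methods, and the fixed recall threshold are hypothetical and chosen only for illustration.

```python
class BinaryCMM:
    """Toy binary correlation matrix memory (illustrative only, not the AURA API)."""

    def __init__(self, n_in, n_out):
        # One row per output bit, one column per input bit; all weights start at 0.
        self.weights = [[0] * n_in for _ in range(n_out)]

    def store(self, in_bits, out_bits):
        """Set a weight to 1 wherever an input bit and an output bit are both 1."""
        for r, out_bit in enumerate(out_bits):
            if out_bit:
                row = self.weights[r]
                for c, in_bit in enumerate(in_bits):
                    if in_bit:
                        row[c] = 1

    def recall_sums(self, in_bits):
        """For each output bit, count the set input bits that meet a set weight."""
        return [sum(w & i for w, i in zip(row, in_bits)) for row in self.weights]

    def recall(self, in_bits, threshold):
        """Threshold the integer sums back to a binary output pattern."""
        return [1 if s >= threshold else 0 for s in self.recall_sums(in_bits)]


cmm = BinaryCMM(n_in=4, n_out=3)
cmm.store([1, 1, 0, 0], [1, 0, 0])
cmm.store([0, 0, 1, 1], [0, 1, 0])

exact = cmm.recall_sums([1, 1, 0, 0])         # [2, 0, 0]
noisy = cmm.recall([1, 0, 0, 0], threshold=1)  # [1, 0, 0]
```

Lowering the threshold below the number of set input bits is what allows a partial or noisy input to still recall the stored association.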


AURA software

• AURA re-designed
  – To improve performance of the AURA library in terms of both memory usage and search times
    • 3-fold reduction in memory
    • 3-fold reduction in search time
  – To make the library easy to use
    • Simple API
    • Typically only 4 or 5 API calls used
    • Enable implementation as an OGSI GT3 service
  – To engineer the library to commercial software standards
    • Comprehensive user guide and reference manual


Pattern matching problem

• Vibration data from sensors forms Z-mod data.
• Tracked orders extracted from Z-mod data

[Figure: two plots. Z-mod data (axes: Frequency, Time) with a tracked order marked; extracted tracked order (axes: Amplitude, Time).]


Pattern matching problem

• Novelty or anomaly identified in tracked order data by feature detectors
  – Forms the query sub-sequence


Pattern matching problem

• Search for sub-sequences similar to the query in a large volume of tracked order data.
  – Need to investigate all possible alignments
  – Benchmark method is sequential scan
  – Noisy data: imprecise matching required
  – Various possible similarity measures
    • Euclidean distance
    • Correlation
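The benchmark method can be sketched as a sequential scan that scores every alignment of the query against the stored series with Euclidean distance. This is an illustrative sketch only: the function names and the tiny example series are assumptions, not the DAME implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequential_scan(series, query, k=3):
    """Score every alignment of query in series; return the k best (distance, offset) pairs."""
    m = len(query)
    scored = []
    for offset in range(len(series) - m + 1):
        window = series[offset:offset + m]
        scored.append((euclidean(window, query), offset))
    scored.sort()  # smallest distance first
    return scored[:k]


# Hypothetical data: the query shape occurs exactly at offset 3 and noisily at offset 8.
series = [0.0, 0.1, 0.0, 1.0, 2.0, 1.0, 0.0, 0.1, 0.9, 2.1, 1.1, 0.0]
query = [1.0, 2.0, 1.0]
best = sequential_scan(series, query, k=2)
```

The cost grows with the number of alignments times the query length, which is why the scan becomes the bottleneck on large archives.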


AURA solution

[Figure: AURA solution diagram. Labels: Stored Time Series; Encoded Time Series; Query Time Series; Encoded Query; AURA Search Engine; Candidate Matches; AURA Backcheck; Results.]


AURA solution

• Encoding: reduction in dimensionality
  – e.g. from 100 points to 10 values.
• Approximate search
  – From ~1,000,000s of alignments down to ~1000s of candidate matches
• Backcheck
  – From ~1000s of candidate matches to 100 or fewer results
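The three stages above follow a filter-and-refine pattern, which can be sketched as follows. Note the substitution: the approximate stage here ranks encoded windows by plain distance rather than using AURA's correlation matrix memories, and all function and parameter names are hypothetical.

```python
import math


def paa(values, n_segments):
    """Mean of each equal-length segment (len(values) must divide evenly)."""
    seg = len(values) // n_segments
    return [sum(values[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(series, query, n_segments, n_candidates, n_results):
    m = len(query)
    # Stage 1: encoding, each window is reduced to n_segments values.
    enc_query = paa(query, n_segments)
    encoded = [(paa(series[o:o + m], n_segments), o)
               for o in range(len(series) - m + 1)]
    # Stage 2: approximate search on the short encoded vectors
    # (stand-in for the AURA search; keeps only the closest candidates).
    ranked = sorted(encoded, key=lambda e: dist(e[0], enc_query))
    candidates = [o for _, o in ranked[:n_candidates]]
    # Stage 3: back check, exact distance on the surviving candidates only.
    checked = sorted((dist(series[o:o + m], query), o) for o in candidates)
    return checked[:n_results]


# Hypothetical data: the query is cut from the series at offset 20.
series = [math.sin(i / 3.0) for i in range(60)]
query = series[20:28]
results = search(series, query, n_segments=4, n_candidates=10, n_results=3)
```

Because the exact comparison runs only on the candidate list, its cost depends on `n_candidates` rather than on the total number of alignments.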


Encoding technique

• Piecewise Aggregate Approximation
• Values encoded using integer bins

[Figure: plot illustrating the encoding (axes: X-Axis, Y-Axis).]
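A minimal sketch of this encoding, assuming equal-length segments and equal-width bins over a known value range. The slides do not specify the bin boundaries, so `lo`, `hi`, `n_bins` and the function names are assumptions made for illustration.

```python
import math


def paa(values, n_segments):
    """Piecewise Aggregate Approximation: mean of each equal-length segment."""
    seg = len(values) // n_segments  # assumes len(values) divides evenly
    return [sum(values[i * seg:(i + 1) * seg]) / seg for i in range(n_segments)]


def to_bins(means, lo, hi, n_bins):
    """Map each mean to an integer bin 0..n_bins-1; out-of-range values are clamped."""
    width = (hi - lo) / n_bins
    bins = []
    for v in means:
        b = math.floor((v - lo) / width)
        bins.append(min(max(b, 0), n_bins - 1))
    return bins


# 100 points reduced to 10 integer codes, matching the "100 points to 10 values" example.
signal = [i / 10 for i in range(100)]  # hypothetical ramp from 0.0 to 9.9
means = paa(signal, 10)
codes = to_bins(means, lo=0.0, hi=10.0, n_bins=10)  # [0, 1, 2, ..., 9]
```

Integer codes of this kind are straightforward to turn into binary patterns, which is the form a bit-level matcher needs.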


Search efficiency

• Approximate search using AURA
  – Fast method of discarding poor matches
  – AURA search typically an order of magnitude or more faster than sequential scan.
  – Candidate matches typically <1% of total.
  – Back check stage very efficient due to reduction in volume of data
    • typically 1% or less of processing time for full sequential scan.


Data size

• Assume
  – Fleet of 100 aircraft, 4 engines each
  – Flying 10 hours per day
  – 5 data points per tracked order per second
  – 4 bytes per data point
• Totals
  – approx. 100 gigabytes per year per tracked order
  – Roughly 10 tracked orders of interest so…
    • Total approx. 1 terabyte per year
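The totals can be checked with a few lines of arithmetic. The one figure not stated on the slide is the number of flying days per year; 365 is assumed here.

```python
AIRCRAFT = 100
ENGINES_PER_AIRCRAFT = 4
FLIGHT_HOURS_PER_DAY = 10
POINTS_PER_SECOND = 5   # per tracked order
BYTES_PER_POINT = 4
TRACKED_ORDERS = 10
DAYS_PER_YEAR = 365     # assumption: aircraft fly every day of the year

points_per_year = (AIRCRAFT * ENGINES_PER_AIRCRAFT * FLIGHT_HOURS_PER_DAY
                   * 3600 * POINTS_PER_SECOND * DAYS_PER_YEAR)
bytes_per_order = points_per_year * BYTES_PER_POINT
total_bytes = bytes_per_order * TRACKED_ORDERS

print(f"{bytes_per_order / 1e9:.1f} GB per year per tracked order")  # 105.1
print(f"{total_bytes / 1e12:.2f} TB per year in total")              # 1.05
```

Under these assumptions there are roughly 26 billion data points per tracked order per year, the same order as the 25,000,000,000 alignments per year quoted for a search.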


Search performance

• Deployed system assumptions
  – 100 CPUs, 2 GHz each with 1 GB RAM.
    • One per aircraft
  – Each search needs to check 25,000,000,000 alignments of the query per year of tracked order data.
• Sequential scan
  – Measured at approx. 2 seconds for 5,000,000 alignments of a 100 data point query (one CPU).
  – Extrapolates to approx. 500 seconds to search 5 years of data assuming 1 CPU per aircraft
  – This is too slow! Need to support multiple searches and searches on more than one tracked order.
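The 500 second figure follows from the numbers on this slide, assuming the alignments are split evenly across the 100 CPUs and that scan time scales linearly with the number of alignments (both are assumptions of this sketch).

```python
ALIGNMENTS_PER_YEAR = 25_000_000_000  # whole fleet, one tracked order
YEARS = 5
CPUS = 100                            # one per aircraft
MEASURED_ALIGNMENTS = 5_000_000       # benchmark workload on one CPU
MEASURED_SECONDS = 2.0                # benchmark time on one CPU

alignments_per_cpu = ALIGNMENTS_PER_YEAR * YEARS / CPUS
scan_seconds = alignments_per_cpu / MEASURED_ALIGNMENTS * MEASURED_SECONDS

print(f"{alignments_per_cpu:.3g} alignments per CPU")   # 1.25e+09
print(f"{scan_seconds:.0f} s for a sequential scan")    # 500
```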


Search performance

• Using AURA and PAA based approach
  – Search time reduced by approx. an order of magnitude.
  – Can search 5 years of data for 100 aircraft in approx. 50 seconds
  – Believe this to be a workable solution
  – But response times potentially slower than this
    • Need to handle a number of searches in parallel
    • Communications and other overheads


Next steps

• Technology
  – Refine similarity measures and encoding methods.
• Architecture
  – Develop additional services to distribute and organise the search
  – Support multiple searches in parallel
• Measurement
  – Perform scaling trials on engine data
  – Obtain better estimates of overall performance
    • Multiple searches
    • Overheads