Modeling and Mining Sequential Data
Machine Learning and Data Mining
Philipp Singer
CC image courtesy of user puliarfanita on Flickr
What is sequential data?
Stock share price (Bitcoin)
Screenshot from bitcoinwisdom.com
Daily degrees in Cologne
Screenshot from google.com (data from weather.com)
Human mobility
Screenshot from maps.google.com
Web navigation
[Diagram: example navigation between the pages Austria, Germany, and C.F. Gauss]
Song listening sequences
Screenshots from youtube.com
Let us distinguish two types of sequence data
Continuous time series
  Stock share price
  Daily degrees in Cologne
Categorical (discrete) sequences (focus)
  Sunny/Rainy weather sequence
  Human mobility
  Web navigation
  Song listening sequences
This lecture is about...
Modeling
Predicting
Pattern Mining
Markov Chains
[Diagram: states S1, S2, S3 with transition probabilities 1/2, 1/2, 1/3, 2/3, 1]
Markov Chain Model
Stochastic Model
Transitions between states
[Diagram: states S1, S2, S3 with transition probabilities 1/2, 1/2, 1/3, 2/3, 1]
Markovian property: the next state in a sequence depends only on the current state, not on the sequence of states that preceded it
Classic weather example
[Diagram: states Sunny and Rainy; Sunny→Sunny 0.9, Sunny→Rainy 0.1, Rainy→Sunny 0.5, Rainy→Rainy 0.5]
Formal definition
State space S = {s_1, ..., s_n}
A sequence amounts to a sequence of random variables X_1, X_2, ..., X_t
Markovian (memoryless) property: P(X_{t+1} = s_j | X_1, ..., X_t) = P(X_{t+1} = s_j | X_t)
Transition matrix P with single transition probabilities p_ij = P(X_{t+1} = s_j | X_t = s_i); rows sum to 1
Example
Transition matrix:
         Sunny  Rainy
Sunny     0.9    0.1
Rainy     0.5    0.5
Likelihood
Transition probabilities p_ij are the model parameters
Likelihood: L = prod_{i,j} p_ij^{n_ij}, where n_ij is the count of transitions from s_i to s_j
Maximum Likelihood (MLE)
Given some sequence data, how can we determine parameters?
MLE estimation: maximize the likelihood
This yields p_ij = n_ij / sum_k n_ik
See ref [1]
[1] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0102070
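A minimal sketch of MLE estimation for a first-order Markov chain; the states and the sequence below are illustrative, not taken from the slides:

```python
from collections import defaultdict

def mle_transition_matrix(sequence):
    """Estimate first-order transition probabilities by MLE:
    p_ij = n_ij / sum_k n_ik (transition counts, row-normalized)."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    probs = {}
    for state, row in counts.items():
        total = sum(row.values())  # total transitions out of this state
        probs[state] = {nxt: c / total for nxt, c in row.items()}
    return probs

# Illustrative sequence of Sunny (S) / Rainy (R) observations
seq = ["S", "S", "S", "S", "R", "R", "S", "S", "S", "R", "S"]
P = mle_transition_matrix(seq)
print(P["S"]["S"])  # 5 of the 7 transitions out of S go to S: 5/7
```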
Prediction
The prediction of the next state is simply derived from the transition probabilities of the current state
One option: take the state with maximum probability
Prediction
What about t+3? Multi-step predictions follow from powers of the transition matrix
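Multi-step predictions can be read off powers of the transition matrix; a sketch using the values from the classic weather example above:

```python
import numpy as np

# Transition matrix of the Sunny/Rainy example: rows sum to 1
P = np.array([[0.9, 0.1],   # Sunny -> Sunny, Sunny -> Rainy
              [0.5, 0.5]])  # Rainy -> Sunny, Rainy -> Rainy

# Distribution over states three steps ahead, starting from Sunny
start = np.array([1.0, 0.0])
dist_t3 = start @ np.linalg.matrix_power(P, 3)
print(dist_t3)  # [0.844 0.156]: still most likely Sunny at t+3
```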
Pattern mining
Simply derived from the (non-normalized) transition count matrix
[Table: example transition counts 90, 2, 2, 1]
The most common transition is a sequential pattern
Full example
Training sequence: [figure: a sequence of Sunny/Rainy days]
Transition counts:
         Sunny  Rainy
Sunny      5      2
Rainy      2      1

Transition matrix (MLE):
         Sunny  Rainy
Sunny     5/7    2/7
Rainy     2/3    1/3
Likelihood of given sequence
We calculate the probability of the sequence, assuming that we start with Sunny.
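The likelihood computation can be sketched as follows; the training sequence itself is not legible in the extracted slides, so a short hypothetical sequence starting with Sunny is used:

```python
# MLE transition matrix from the full example
P = {"S": {"S": 5/7, "R": 2/7},
     "R": {"S": 2/3, "R": 1/3}}

def sequence_likelihood(seq, P):
    """Product of transition probabilities, conditioning on the first state."""
    prob = 1.0
    for cur, nxt in zip(seq, seq[1:]):
        prob *= P[cur][nxt]
    return prob

# Hypothetical sequence: Sunny, Sunny, Rainy, Sunny
print(sequence_likelihood(["S", "S", "R", "S"], P))  # (5/7)*(2/7)*(2/3)
```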
Prediction: from the current state, predict the next state using the corresponding row of the MLE transition matrix
Higher order Markov Chain models
Drop the memoryless assumption?
Models of increasing order: 2nd order MC model
3rd order MC model
...
2nd order example
The next state depends on the two preceding states: P(X_{t+1} | X_t, X_{t-1})
Higher order to first order transformation
Transform state space
2nd order example: new compound states (pairs of original states)
2nd order example
[Table: transition counts between compound states
SS: →SS 3, →SR 1
SR: →RS 1, →RR 1
RS: →SS 1, →SR 0
RR: →RS 1, →RR 1
and the resulting MLE probabilities 3/4, 1/4, 1/2, 1/2, 1/1, 0, 1/2, 1/2]
Reset states
[Diagram: sequences padded with a reset state R at the start and end]
Marking start and end of sequences
Makes the transformation easier (same number of transitions)
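The transformation of a padded sequence into compound (pair) states might look like this sketch; the reset symbol is written as "*" here to avoid clashing with Rainy "R":

```python
def to_second_order(seq, reset="*"):
    """Pad with a reset state and turn a sequence into compound (pair) states,
    so a first-order chain over pairs captures second-order dependencies.
    The padded sequence and the compound sequence have the same number of
    transitions."""
    padded = [reset] + list(seq) + [reset]
    return [(a, b) for a, b in zip(padded, padded[1:])]

states = to_second_order(["S", "S", "R"])
print(states)  # [('*', 'S'), ('S', 'S'), ('S', 'R'), ('R', '*')]
```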
Comparing models
1st vs. 2nd order
Statistical model comparison necessary
Nested models: the higher-order model always fits at least as well
Account for potential overfitting
Model comparison
Likelihood ratio test: ratio between likelihoods for order m and k
Follows a Chi² distribution; the degrees of freedom equal the difference in the number of free parameters
Only for nested models
Akaike Information Criterion (AIC): AIC = -2 ln(L) + 2k for k free parameters
The lower the better
Bayesian Information Criterion (BIC): BIC = -2 ln(L) + k ln(n)
Bayes factors: ratio of evidences (marginal likelihoods)
Cross validation
See http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0102070
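A sketch of comparing model orders by AIC; the sequence is illustrative (not the slide's exact data), and parameters are counted as (number of states - 1) free probabilities per context:

```python
import math
from collections import Counter

def log_likelihood_and_params(seq, order):
    """Fit a Markov chain of the given order by MLE and return
    (log-likelihood, number of free parameters)."""
    ctx_counts = Counter()
    trans_counts = Counter()
    for i in range(order, len(seq)):
        ctx = tuple(seq[i - order:i])
        ctx_counts[ctx] += 1
        trans_counts[(ctx, seq[i])] += 1
    ll = sum(c * math.log(c / ctx_counts[ctx])
             for (ctx, nxt), c in trans_counts.items())
    n_states = len(set(seq))
    k = (n_states ** order) * (n_states - 1)  # free parameters
    return ll, k

seq = ["S", "S", "S", "S", "R", "R", "S", "S", "S", "R", "S"]
for order in (1, 2):
    ll, k = log_likelihood_and_params(seq, order)
    print(order, round(-2 * ll + 2 * k, 3))  # lower AIC is preferred
```

On this short sequence the extra parameters of the 2nd order model are not worth the better fit, so the 1st order model wins.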
AIC example
[Tables: first-order transition matrix with reset state R (entries such as 5/8, 2/8, 1/8 and 2/3, 1/3, 0/3) and the corresponding, mostly sparse second-order matrix over compound states; the number of 1st order parameters is compared with the number of 2nd order parameters]
Example on blackboard
Markov Chain applications
Google's PageRank
DNA sequence modeling
Web navigation
Mobility
Hidden Markov Models
Hidden Markov Models
Extends Markov chain model
Hidden state sequence
Observed emissions
What is the weather like?
Forward-Backward algorithm
Given emission sequence
Probability of emission sequence?
Probable sequence of hidden states?
[Diagram: hidden state sequence aligned with the observed emission sequence]
Check out YouTube tutorial: https://www.youtube.com/watch?v=7zDARfKVm7s
Further material: cs229.stanford.edu/section/cs229-hmm.pdf
Setup
[Diagram: hidden-state transition probabilities 0.7/0.3 and 0.6/0.4; emission probabilities 0.9/0.1 and 0.2/0.8; reset state R with probabilities 0.5/0.5]
Note: Literature usually uses a start probability and uniform end probability for the forward-backward algorithm.
Forward
What is the probability of going to each possible state at t2 given t1?
[Diagrams: forward variables computed step by step:
t1: 0.4, 0.1
t2: 0.034, 0.144
t3: 0.011, 0.061
t4: 0.035, 0.006
with a final reset transition back to R (0.5/0.5)]
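A sketch of the forward recursion: the transition matrix is taken from the setup slide, while the emission layout, observation coding, and uniform start probabilities are assumptions for illustration:

```python
import numpy as np

# Assumed two-state model (e.g. Sunny, Rainy)
A = np.array([[0.7, 0.3],   # hidden-state transitions (from the setup slide)
              [0.6, 0.4]])
B = np.array([[0.9, 0.1],   # emission probabilities P(obs | state);
              [0.2, 0.8]])  # columns: observation symbols 0 and 1 (assumed)
pi = np.array([0.5, 0.5])   # uniform start probabilities (assumed)

def forward(obs):
    """Forward algorithm: alpha[i] = P(o_1..o_t, X_t = i) after each step."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha  # summing over states gives P(observation sequence)

alpha = forward([0, 1])
print(alpha.sum())  # probability of the emission sequence: 0.1775
```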
Backwards
What is the probability of arriving at t4 given each possible state at t3?
[Diagrams: backward variables computed step by step:
t3: 0.31, 0.28
t2: 0.097, 0.12
t1: 0.039, 0.049
using the reset transition and the emission at the end of the sequence]
Forward-Backward
Most likely state at t2
[Diagram: forward variables (0.4, 0.1), (0.034, 0.144), (0.011, 0.061), (0.035, 0.006) combined with backward variables (0.039, 0.049), (0.097, 0.12), (0.31, 0.28); the most likely state at t2 maximizes forward × backward]
Forward-Backward
Posterior decoding
Most likely state at each t
For most likely sequence: Viterbi algorithm
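For the most likely hidden sequence, a minimal Viterbi sketch using the same assumed two-state model as in the forward example (parameters illustrative, not verified against the slide):

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.6, 0.4]])   # transitions (setup slide)
B = np.array([[0.9, 0.1], [0.2, 0.8]])   # emissions (assumed layout)
pi = np.array([0.5, 0.5])                # start probabilities (assumed)

def viterbi(obs):
    """Most likely hidden state sequence via dynamic programming."""
    delta = pi * B[:, obs[0]]
    backptr = []
    for o in obs[1:]:
        scores = delta[:, None] * A        # scores[i, j]: come from i, go to j
        backptr.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) * B[:, o]
    # Trace back from the best final state
    path = [int(delta.argmax())]
    for bp in reversed(backptr):
        path.append(int(bp[path[-1]]))
    return path[::-1]

print(viterbi([0, 0, 1]))  # [0, 0, 1]: state 0 twice, then state 1
```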
Learning parameters
Train parameters of HMM
No closed-form solution for the MLE is known
Baum-Welch algorithm: a special case of the EM algorithm
Uses Forward-Backward in its E-step
HMM applications
Speech recognition
POS tagging
Translation
Gene prediction
Other related methods
Sequential Pattern Mining
PrefixSpan
Apriori Algorithm
GSP Algorithm
SPADE
Reference: rakesh.agrawal-family.com/papers/icde95seq.pdf
Graphical models
Bayesian networks: random variables, conditional dependence, directed acyclic graph
Markov random fields: random variables, Markov property, undirected graph
Questions?
Philipp Singer, philipp.singer@gesis.org