SeqStream: Mining Closed Sequential Patterns over Stream Sliding Windows
Lei Chang Tengjiao Wang Dongqing Yang Hua Luan
ICDM’08
112/04/21 1
Outline.
Preliminary.
Algorithm.
Experimental results.
Conclusion.
Preliminary.
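The inverse sequence of a sequence s, denoted s’, lists the items of s in reverse order.
Example: s = <abae>, s’ = <eaba>
The s-projected database Ds: for each sequence in D that contains s, the suffix that follows the first occurrence of s.
The <b>-projected database is {<da>,<ae>,<cda>,<cdae>,<cda>}.
The size of Ds, denoted R(Ds), is the total number of items in its sequences.
The size of the <b>-projected database is 14.
The <e>-projected database is {φ,φ,<bcda>,<ac>}.
The size of the <e>-projected database is 6.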
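The projection and size definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes every element holds a single item, so a sequence can be written as a string, and it represents φ as the empty string. The database D is not listed in the text, so the D below is an assumed reconstruction chosen to reproduce the projected databases quoted above. The names project, projected_db and size are hypothetical.

```python
from typing import List, Optional


def project(seq: str, prefix: str) -> Optional[str]:
    """Return the suffix of seq after the leftmost match of prefix, or None if prefix does not occur."""
    pos = 0
    for item in prefix:
        pos = seq.find(item, pos)
        if pos == -1:
            return None  # prefix is not a subsequence of seq
        pos += 1
    return seq[pos:]


def projected_db(db: List[str], prefix: str) -> List[str]:
    """Collect the suffixes of all sequences that contain prefix ("" stands for φ)."""
    suffixes = (project(seq, prefix) for seq in db)
    return [s for s in suffixes if s is not None]


def size(pdb: List[str]) -> int:
    """R(Ds): total number of items over all sequences of the projected database."""
    return sum(len(s) for s in pdb)


# ASSUMPTION: reconstructed database, not given in the slides.
D = ["fbda", "abae", "fbcda", "bcdae", "ebcda", "aeac"]

Db = projected_db(D, "b")
De = projected_db(D, "e")
print(Db, size(Db))
print(De, size(De))
```

Under these assumptions the sizes come out as 14 for <b> and 6 for <e>, in agreement with the figures above.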
The inverse database of D, denoted D’, consists of the inverse sequences of all sequences in D.
The database in the current sliding window after inserting the new element (but before removing the expired one) is denoted D^.
D^ : {<fbda>,<abaec>,<fbcdac>,<bcdae>,<ebcdaf>,<aeac>}
In the inverse database of D^, the set of sequences from users that appear in the current window is called the insertion database, denoted D+.
The set of sequences from users that appear in the removal window is called the removal database, denoted D-.
D^ : {<fbda>,<abaec>,<fbcdac>,<bcdae>,<ebcdaf>,<aeac>}
D^’: {<adbf>,<ceaba>,<cadcbf>,<eadcb>,<fadcbe>,<caea>}
D+ : {<ceaba>,<cadcbf>,<fadcbe>}
D- : {<cadcbf>,<eadcb>,<fadcbe>}
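As a quick check of the example above, the inverse database can be computed by reversing every sequence. The sketch below assumes single-item elements written as strings; the function names are hypothetical. D+ and D- are copied from the example rather than derived, because the user information that determines them is not part of the text.

```python
from typing import List


def inverse(seq: str) -> str:
    """Inverse sequence s': the items of s in reverse order."""
    return seq[::-1]


def inverse_db(db: List[str]) -> List[str]:
    """Inverse database: the inverse of every sequence, in the same order."""
    return [inverse(s) for s in db]


D_hat = ["fbda", "abaec", "fbcdac", "bcdae", "ebcdaf", "aeac"]
D_hat_inv = inverse_db(D_hat)
print(D_hat_inv)

# Copied from the example, not derived here.
D_plus = ["ceaba", "cadcbf", "fadcbe"]
D_minus = ["cadcbf", "eadcb", "fadcbe"]

# Both are subsets of the inverse database of D^.
print(set(D_plus) <= set(D_hat_inv), set(D_minus) <= set(D_hat_inv))
```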
Closed patterns: {<a>:6,<ae>:3,<c>:4,<ba>:5,<bda>:4,<bcda>:3,<e>:4}
Closed patterns of the inverse database: {<a>:6,<ab>:5,<adb>:4,<adcb>:3,<c>:4,<e>:4,<ea>:3}
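A pattern is closed when no proper super-sequence has the same support. The brute-force sketch below makes that definition concrete; it is not the SeqStream algorithm, which maintains the result incrementally. Two things are assumptions and do not appear in the text: the database D (a reconstruction with single-item elements) and the minimum support of 3. All function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def is_subsequence(p: str, s: str) -> bool:
    """True if the items of p occur in s in the same order (not necessarily adjacent)."""
    it = iter(s)
    return all(ch in it for ch in p)


def support(p: str, db: List[str]) -> int:
    """Number of sequences in db that contain p."""
    return sum(is_subsequence(p, s) for s in db)


def closed_patterns(db: List[str], min_sup: int) -> Dict[str, int]:
    # Every frequent pattern is a subsequence of some sequence in db.
    candidates = set()
    for s in db:
        for k in range(1, len(s) + 1):
            for idx in combinations(range(len(s)), k):
                candidates.add("".join(s[i] for i in idx))
    frequent = {p: support(p, db) for p in candidates}
    frequent = {p: c for p, c in frequent.items() if c >= min_sup}
    # Keep p only if no longer frequent pattern contains it with equal support.
    closed = {}
    for p, c in frequent.items():
        absorbed = any(
            len(q) > len(p) and cq == c and is_subsequence(p, q)
            for q, cq in frequent.items()
        )
        if not absorbed:
            closed[p] = c
    return closed


# ASSUMPTIONS: reconstructed database and minimum support, neither given in the slides.
D = ["fbda", "abae", "fbcda", "bcdae", "ebcda", "aeac"]
MIN_SUP = 3

print(closed_patterns(D, MIN_SUP))
print(closed_patterns([s[::-1] for s in D], MIN_SUP))
```

With these assumptions, the two calls return the two pattern sets listed above; the second set is the first with every pattern reversed, since reversing all sequences preserves the subsequence relation.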
sn : each node n of an IST corresponds to the sequence of items on the path from the root node to n; this sequence is denoted sn.
c-node : n is a c-node if sn is a closed sequential pattern in D’.
t-node : n is a t-node if sn is not a closed sequential pattern in D’ and n does not have any t-node ancestor.
i-node : n is an i-node if it is neither a c-node nor a t-node.
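One possible in-memory layout for such a tree is sketched below. It only illustrates how a node maps to its sequence sn and where the node type and attributes could live. The class and field names are hypothetical, and the support and PDBSize attributes are included because the later theorems refer to them. The example path a, d, b uses the supports of <a> and <adb> from the closed-pattern list; the intermediate node is deliberately left unclassified.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class NodeType(Enum):
    C = "c-node"
    T = "t-node"
    I = "i-node"


@dataclass(eq=False)
class ISTNode:
    item: Optional[str] = None  # None marks the root
    parent: Optional["ISTNode"] = field(default=None, repr=False)
    support: int = 0
    pdb_size: int = 0
    kind: Optional[NodeType] = None
    children: Dict[str, "ISTNode"] = field(default_factory=dict)

    def add_child(self, item: str, support: int = 0, pdb_size: int = 0,
                  kind: Optional[NodeType] = None) -> "ISTNode":
        """Attach a child labelled with item and return it."""
        child = ISTNode(item, self, support, pdb_size, kind)
        self.children[item] = child
        return child

    def sequence(self) -> str:
        """sn: the items on the path from the root down to this node."""
        items = []
        node: Optional[ISTNode] = self
        while node is not None and node.item is not None:
            items.append(node.item)
            node = node.parent
        return "".join(reversed(items))


root = ISTNode()
a = root.add_child("a", support=6, kind=NodeType.C)
d = a.add_child("d")  # type left unset in this sketch
b = d.add_child("b", support=4, kind=NodeType.C)
print(b.sequence())
```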
Algorithm.
Element insertion
Element removal
State update
Element insertion
Theorem 2 : If the item of a depth-1 node does not occur in the newly arriving element, the nodes under that node do not change their attribute values, and no t-node under it changes its type after the element is inserted.
Theorem 3 : After a new element is inserted, if the PDBSize and support of a t-node do not change, it remains a t-node.
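Theorem 2 suggests a simple filter at insertion time: only the depth-1 subtrees whose item occurs in the new element need to be revisited. The sketch below is an illustration under stated assumptions, not the authors' procedure. It assumes single-item elements written as strings, one depth-1 node per distinct item, and a new element that contributes the items c and f (an assumption, not stated in the text). The inverse database is the D^’ of the earlier example, and the helper names are hypothetical.

```python
from typing import Dict, List, Optional


def project(seq: str, prefix: str) -> Optional[str]:
    """Suffix of seq after the leftmost match of prefix, or None if prefix does not occur."""
    pos = 0
    for item in prefix:
        pos = seq.find(item, pos)
        if pos == -1:
            return None
        pos += 1
    return seq[pos:]


def projected_db(db: List[str], prefix: str) -> List[str]:
    """Suffixes of all sequences that contain prefix ("" stands for φ)."""
    suffixes = (project(seq, prefix) for seq in db)
    return [s for s in suffixes if s is not None]


D_hat_inv = ["adbf", "ceaba", "cadcbf", "eadcb", "fadcbe", "caea"]
new_items = {"c", "f"}  # ASSUMPTION: items of the newly arriving element

depth1_items = sorted(set("".join(D_hat_inv)))
to_update = [x for x in depth1_items if x in new_items]    # subtrees to revisit
skipped = [x for x in depth1_items if x not in new_items]  # untouched by Theorem 2

pdbs: Dict[str, List[str]] = {x: projected_db(D_hat_inv, x) for x in to_update}
print(to_update, skipped)
print(pdbs)
```

In this toy run four of the six depth-1 subtrees are skipped, which is where the saving comes from.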
Dc^’ : {<eaba>,<adcbf>,<b>,<be>,<aea>}
Df^’ : {φ,φ,<adcbe>}
c : {<eaba>,<ab>,<b>,<be>,<aea>}
ca : {<ba>,<b>,<ea>}
cb : {<a>, φ,<e>}
ce : {<aba>, φ,<a>}
Element removal
Theorem 5 : After the removal of element e_{tc−w}, a t-node may be deleted, but it never changes to a c-node or an i-node.
For each child node t of n, the algorithm computes the st-projected database in the removal database D−.
D− : {<cadcbf>,<eadcb>,<fadcbe>}
Da− : {<dcbf>,<dcb>,<dcbe>}
Db− : {<f>,φ,<e>}
Dc− : {<adcbf>,<b>,<be>}
……
Df− : {φ,<adcbe>}
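The per-child projection step of element removal can be sketched as follows. It assumes single-item elements written as strings and represents φ as the empty string. The function child_projections and its parameters are hypothetical names, and the sketch covers only the computation of the st-projected databases in D−, not the subsequent node updates.

```python
from typing import Dict, Iterable, List, Optional


def project(seq: str, prefix: str) -> Optional[str]:
    """Suffix of seq after the leftmost match of prefix, or None if prefix does not occur."""
    pos = 0
    for item in prefix:
        pos = seq.find(item, pos)
        if pos == -1:
            return None
        pos += 1
    return seq[pos:]


def child_projections(s_n: str, child_items: Iterable[str],
                      removal_db: List[str]) -> Dict[str, List[str]]:
    """For each child t of node n (s_t = s_n + item), the s_t-projected database in D-."""
    result: Dict[str, List[str]] = {}
    for item in child_items:
        s_t = s_n + item
        suffixes = (project(seq, s_t) for seq in removal_db)
        result[s_t] = [s for s in suffixes if s is not None]
    return result


D_minus = ["cadcbf", "eadcb", "fadcbe"]

# Children of the root (s_n is the empty sequence): one per item.
level1 = child_projections("", "abcdef", D_minus)
for s_t, pdb in level1.items():
    print(s_t, pdb)

# One level deeper, under node a.
level2 = child_projections("a", "d", D_minus)
print(level2)
```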
State update
Theorem 6 : Given a t-node n in an IST for the inverse database D, there must exist an i-node or a c-node t in the IST.
i-node => c-node
c-node => t-node
Experimental results.
Conclusion.
This paper proposes SeqStream, an algorithm for mining closed sequential patterns over stream sliding windows.
Designed for multi-stream?