Seeing Things in the Clouds
browsing semi-structured data over concept lattices with tag clouds

Bernd Fischer
Stellenbosch Computer Science

Keywords: object, attribute, context table, relation, Galois connection, knowledge discovery, mining, software repositories, join, focus, visualization, information retrieval, meet, navigation

How do you find stuff on the Internet?

query: concept-based browsing

Yikes! 3 370 000 results!

query: concept-based browsing lattice

How do you find stuff you didn’t look for?

Retrieval: extract objects that satisfy a pre-defined criterion
• query describes criterion
• main operation is matching: check satisfaction against query
• main goal is precision: show only relevant objects

Browsing: spontaneously explore a collection
• focus describes current position and selection
• main operation is navigation: change the focus
• main goal is recall: show all relevant objects
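The two goals named above have standard set-based definitions. The following is a minimal Python sketch of them; the document identifiers and the function names are illustrative assumptions and do not come from the talk.

```python
def precision(shown: set, relevant: set) -> float:
    """Fraction of the shown objects that are relevant."""
    # Convention (assumption): an empty result list counts as perfectly precise.
    return len(shown & relevant) / len(shown) if shown else 1.0


def recall(shown: set, relevant: set) -> float:
    """Fraction of the relevant objects that are shown."""
    return len(shown & relevant) / len(relevant) if relevant else 1.0


# Hypothetical collection; the ids are made up for illustration.
relevant = {"d1", "d2", "d3", "d4"}
narrow_result = {"d1", "d2"}                          # retrieval-style answer
broad_result = {"d1", "d2", "d3", "d4", "d5", "d6"}   # browsing-style view

print("narrow:", precision(narrow_result, relevant), recall(narrow_result, relevant))
print("broad: ", precision(broad_result, relevant), recall(broad_result, relevant))
```

The narrow result contains only relevant objects but misses half of them, while the broad view contains every relevant object at the cost of some noise.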

How do you browse?

(hierarchical) navigation structure

focus

selection

How do you browse semi-structured data?

What is semi-structured data?

What is structured data?

Structured data has...
• ... a very high degree of regularity
• ... an explicit, tight format (schema)

Typical examples:
• spreadsheets
• relational databases (SQL: structured query language)

What is semi-structured data?

Semi-structured data ...
• ... contains both free-text and formatted fields
• ... has large structural variance
• ... is implicitly formatted

Typical examples:
• product reviews
• newspaper articles
  + meta-data
• revision control logs

How do you browse semi-structured data?

Approach:
• find a suitable abstract data representation
  – bag-of-words, graphs, binary relations, RDF triples, XML, ...
• find a suitable hierarchy
  – metric spaces, graphs, concept lattices, ...
• find a suitable visual representation
  – lists, graphs, tag clouds, city scapes, ...
• find a navigation algorithm

How do you represent data?

Structured data is represented by n-ary relations or tables:
• each object becomes a row
• each column represents an attribute type
• text remains unstructured
• set-valued attributes require normalization

author    title                             year  venue
Fischer   Specification-based browsing...   2000  J. ASE
van Zijl  Supernondeterministic finite...   2001  CIAA
Greene    ConceptCloud: A Tag-cloud...      2014  FSE
Fischer   ConceptCloud: A Tag-cloud...      2014  FSE

Semi-structured data can be represented by binary relations:
• text is split into words
• each occurring value and word becomes an attribute
• build context table: add cross if attribute applies to object
  – word appears in document, meta-data, references ...

id  title                             year  venue
08  Specification-based browsing...   2000  J. ASE
15  Supernondeterministic finite...   2001  CIAA
42  ConceptCloud: A Tag-cloud...      2014  FSE

id  author
08  Fischer
15  van Zijl
42  Greene
42  Fischer

      Fischer  Greene  van Zijl  browsing  tag  2000  2001  2014
08       ×                          ×            ×
15                        ×                            ×
42       ×       ×                  ×       ×                ×
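The construction of the context table can be sketched in Python as follows. This is only an illustration under stated assumptions: the `text` field is a hypothetical stand-in for each document's words (the slides show truncated titles only), the regular-expression tokenizer is a deliberately crude choice, and the names `records`, `attributes_of`, and `context` are invented for this sketch.

```python
import re

# "text" is a hypothetical stand-in for each document's words; the snippets
# are assumptions chosen so that the result matches the crosses on the slide.
records = [
    {"id": "08", "authors": ["Fischer"], "year": "2000",
     "text": "Specification-based browsing"},
    {"id": "15", "authors": ["van Zijl"], "year": "2001",
     "text": "Supernondeterministic finite"},
    {"id": "42", "authors": ["Greene", "Fischer"], "year": "2014",
     "text": "ConceptCloud: a tag cloud for browsing"},
]


def attributes_of(record: dict) -> set:
    """Attribute set of one object: field values plus the words of its text."""
    words = re.findall(r"[a-z]+", record["text"].lower())  # crude tokenizer
    return set(record["authors"]) | {record["year"]} | set(words)


# Binary relation stored as: object id -> set of attributes that apply to it.
context = {r["id"]: attributes_of(r) for r in records}

# Print a cross table restricted to the columns shown on the slide.
columns = ["Fischer", "Greene", "van Zijl", "browsing", "tag", "2000", "2001", "2014"]
for obj, attrs in context.items():
    print(obj, " ".join("×" if c in attrs else "." for c in columns))
```

Storing the relation as a mapping from objects to attribute sets keeps set-valued fields such as the author list without any normalization step.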

How do you find hierarchy in relations?

Formal concept analysis:
• formal context: (O, A, ~ₓ)
• common attributes:
  α(O) = { a ∈ A | ∀o ∈ O : o ~ₓ a }

      Fischer  Greene  van Zijl  browsing  tag  2000  2001  2014
08       ×                          ×            ×
15                        ×                            ×
42       ×       ×                  ×       ×                ×

α({08, 42}) = {Fischer, browsing}

• common objects:
  ω(A) = { o ∈ O | ∀a ∈ A : o ~ₓ a }
• concept:
  (O, A) s.t. α(O) = A ∧ ω(A) = O, with extent O and intent A

ω({Fischer, browsing}) = {08, 42}
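The two derivation operators and the concept condition can be tried out on the example context. The sketch below is a direct Python transcription of the definitions; the names `CONTEXT`, `alpha`, `omega`, and `is_concept` are chosen for this illustration and are not taken from the talk or from any tool.

```python
# Context table from the slides: object id -> attributes marked with a cross.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def alpha(objects):
    """Common attributes: those shared by every object in `objects`."""
    result = set(ALL_ATTRIBUTES)
    for o in objects:
        result &= CONTEXT[o]
    return result


def omega(attributes):
    """Common objects: those that carry every attribute in `attributes`."""
    return {o for o, attrs in CONTEXT.items() if set(attributes) <= attrs}


def is_concept(objects, attributes):
    """(O, A) is a concept iff alpha(O) == A and omega(A) == O."""
    return alpha(objects) == set(attributes) and omega(attributes) == set(objects)


print(sorted(alpha({"08", "42"})))
print(sorted(omega({"Fischer", "browsing"})))
print(is_concept({"08", "42"}, {"Fischer", "browsing"}))
```

Note that `alpha` of the empty object set yields all attributes, which matches the universally quantified definition.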

extent     intent
{08}       {Fischer, browsing, 2000}
{42}       {Fischer, Greene, browsing, tag, 2014}
{08, 42}   {Fischer, browsing}
{42}       {tag}

• sub-concept ordering:
  (O₁, A₁) ≤ (O₂, A₂) iff O₁ ⊆ O₂ iff A₁ ⊇ A₂
• concept lattice: concepts of a context form a complete lattice
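For a context as small as the running example, all concepts and their ordering can be computed by brute force. The following Python sketch closes every subset of objects; this is exponential and meant for toy contexts only, not as the algorithm used by any tool from the talk. All function and variable names are assumptions made for this illustration.

```python
from itertools import combinations

# Context table from the slides: object id -> attributes marked with a cross.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}
OBJECTS = sorted(CONTEXT)
ALL_ATTRIBUTES = frozenset().union(*CONTEXT.values())


def alpha(objects):
    """Common attributes of a set of objects."""
    result = ALL_ATTRIBUTES
    for o in objects:
        result = result & CONTEXT[o]
    return frozenset(result)


def omega(attributes):
    """Common objects of a set of attributes."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def all_concepts():
    """Close every subset of objects; each closure (omega(A), A) is a concept."""
    found = set()
    for k in range(len(OBJECTS) + 1):
        for subset in combinations(OBJECTS, k):
            intent = alpha(subset)
            found.add((omega(intent), intent))
    return found


def is_subconcept(c1, c2):
    """(O1, A1) <= (O2, A2) iff O1 is a subset of O2."""
    return c1[0] <= c2[0]


concepts = all_concepts()
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Besides the three concepts with non-empty, proper extents shown earlier, the enumeration also yields the top concept (all objects, no common attribute) and the bottom concept (no object, all attributes).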

Are we there yet?

Nope.

Concept lattices induce
• enough structure for navigation...
• ... but too much to show directly!

How do you visualize concept lattices?

Approach:
• don’t show the lattice
• use concepts as focus
• visualize only focus concept
  – but in relation to lattice

How do you visualize concepts?

• use extent to derive tag cloud

How do you build tag clouds for concepts?

What is a tag cloud?
• visual representation of text data
  – summarize large data set
  – emphasize important tags
• single words or short phrases
• importance reflected as size
  – frequency in document
  – number of tagged items
  – number of page hits
• different layout methods

• intent looks like tag cloud...
• ... but is common to all objects
  ⇒ all tags same size
• instead: collect all attributes from all objects in extent
  – can be expressed in concept lattice
  – also add extent via object identifiers
• intent shown as largest tags
  – smaller tags are related information

Focus concept: ({08, 42}, {Fischer, browsing})

      Fischer  Greene  van Zijl  browsing  tag  2000  2001  2014
08       ×                          ×            ×
15                        ×                            ×
42       ×       ×                  ×       ×                ×
count    2       1        -         2       1    1     -     1

Tag cloud: 08 42 2000 2014 browsing Fischer Greene tag
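The count row above can be reproduced with a few lines of Python. This is a sketch under assumptions: `cloud_weights` is a name invented here, and using the raw count as the tag size is only one possible reading of "importance reflected as size".

```python
from collections import Counter

# Context table from the slides: object id -> attributes marked with a cross.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}


def cloud_weights(extent):
    """For each attribute, count how many objects of the extent carry it."""
    counts = Counter()
    for obj in extent:
        counts.update(CONTEXT[obj])
    return counts


extent = {"08", "42"}                      # extent of the focus concept
weights = cloud_weights(extent)

# Tags with the maximal count are common to all objects, i.e. the intent.
intent = {tag for tag, n in weights.items() if n == len(extent)}

print("object tags:", sorted(extent))
for tag in sorted(weights, key=str.lower):
    marker = "(intent)" if tag in intent else ""
    print(tag, weights[tag], marker)
```

Attributes that occur in no object of the extent, such as van Zijl and 2001, get no entry at all and therefore do not appear in the cloud.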

The ConceptCloud Browser

by: Gillian Greene, US

Screenshot callouts: file, message, date, author, controls

Screenshot callout: most prolific contributor

How do you navigate with tag clouds?

Navigation modes:
• refinement: narrow the selection
  – select a new tag
• widening: extend the selection
  – remove a selected tag

How do you navigate with concept lattices?

Navigation modes:
• refinement: narrow the selection
  – select a new tag: f’ = f ∧ δ(t)
• widening: extend the selection
  – remove a selected tag: f’ = ⋀ { δ(i) | i ∈ π(f) \ {t} }
  – join-based widening, f’ = f ∨ δ(t), can be useful as well

Tag concept of a tag t:
δ(t) = (ω({t}), α(ω({t})))   if t ∈ A
δ(t) = (ω(α({t})), α({t}))   if t ∈ O

Diagram labels: focus concept, tag concept
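The navigation operations can be sketched on the example context as follows. This is a minimal Python illustration under several assumptions: concepts are represented as (extent, intent) pairs, the list `selected` plays the role of π(f) (read here as the set of currently selected tags, which the slides do not spell out), and all function names are invented for this sketch.

```python
# Context table from the slides: object id -> attributes marked with a cross.
CONTEXT = {
    "08": {"Fischer", "browsing", "2000"},
    "15": {"van Zijl", "2001"},
    "42": {"Fischer", "Greene", "browsing", "tag", "2014"},
}
ALL_OBJECTS = frozenset(CONTEXT)
ALL_ATTRIBUTES = frozenset().union(*CONTEXT.values())


def alpha(objects):
    """Common attributes of a set of objects."""
    result = ALL_ATTRIBUTES
    for o in objects:
        result = result & CONTEXT[o]
    return frozenset(result)


def omega(attributes):
    """Common objects of a set of attributes."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def delta(tag):
    """Tag concept: object concept for an object id, attribute concept otherwise."""
    if tag in CONTEXT:
        intent = alpha({tag})
        return (omega(intent), intent)
    extent = omega(frozenset({tag}))
    return (extent, alpha(extent))


def meet(c1, c2):
    """Greatest common sub-concept: intersect the extents, recompute the intent."""
    extent = c1[0] & c2[0]
    return (extent, alpha(extent))


def join(c1, c2):
    """Least common super-concept: intersect the intents, recompute the extent."""
    intent = c1[1] & c2[1]
    return (omega(intent), intent)


TOP = (ALL_OBJECTS, alpha(ALL_OBJECTS))    # meet over an empty selection


def refine(focus, tag):
    """Refinement: f' = f meet delta(t)."""
    return meet(focus, delta(tag))


def widen(selected, tag):
    """Widening: meet of the tag concepts of all selected tags except `tag`."""
    focus = TOP
    for t in selected:
        if t != tag:
            focus = meet(focus, delta(t))
    return focus


focus = refine(refine(TOP, "Fischer"), "tag")
print(sorted(focus[0]), sorted(focus[1]))
wider = widen(["Fischer", "tag"], "tag")
print(sorted(wider[0]), sorted(wider[1]))
either = join(delta("2000"), delta("2014"))
print(sorted(either[0]), sorted(either[1]))
```

In this example, selecting Fischer and then tag narrows the focus to object 42, and removing tag again returns to the concept with extent {08, 42}. The last call illustrates join-based widening with two tags that share no object: the join of the tag concepts for 2000 and 2014 is the smallest concept covering both 08 and 42.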


Navigation in the ConceptCloud Browser



The Percept Browser

by: Carl Kritzinger, Fireworks

Conclusions & Future Work

• Semi-structured data is common but hard to analyze
• Tag clouds are a good visualization approach...
• ... and the combination with concept lattices makes it easy to navigate and find related information
• Flexible approach, generic tool
  – different data sets
  – different types of contexts (⇒ different types of analysis)
• Scalability
  – DBLP, IMDb, Wikipedia?
• Customizability
  – context extraction
  – tool scripting