Learning Object Metadata Mining Masoud Makrehchi Supervisor: Prof. Mohamed Kamel

Page 1: Learning Object Metadata Mining

Learning Object Metadata Mining

Masoud Makrehchi
Supervisor: Prof. Mohamed Kamel

Page 2: Learning Object Metadata Mining

Makrehchi & Kamel

Outline

• Metadata Mining
• Metadata Representation Model
• Class-Term Matrix
• Case Study
• Concluding Remarks

Page 3: Learning Object Metadata Mining

Metadata Mining

• Metadata Definition
  – Data about data, for example a library catalogue
• Metadata Applications:
  – Cataloging (Items and Collections)
  – Resource Discovery
  – Electronic Commerce and Digital Signatures
  – Intelligent Software Agents
  – Content Rating
  – Intellectual Property Rights
  – Semantic Web
  – Learning Objects
• LOM Standards: IEEE LOM, DC, SCORM, CANCORE

Page 4: Learning Object Metadata Mining

Metadata Mining

• Definition
  – Extraction of implicit, previously unknown, and potentially useful information from metadata.
• Methods
  – Classification, clustering, summarization, association rule mining, ontology extraction, information integration, keyword extraction, automatic title generation.

Page 5: Learning Object Metadata Mining

Metadata Mining

• Why metadata mining?
  – No access to the data itself, or a lack of raw data,
  – The data is not convenient for mining (heterogeneous or non-text formats),
  – Diversity of metadata standards, and the need to merge different metadata repositories,
  – Ontology extraction is much easier at the metadata level.

Page 6: Learning Object Metadata Mining

Metadata Mining

[Figure: conceptual data architecture, showing content-based data mining and context-based data mining]

Page 7: Learning Object Metadata Mining

Metadata Mining

• Applications
  – Metadata mining instead of raw data mining,
  – Metadata enrichment (keyword extraction),
  – (Semi-)automatic ontology extraction,
  – Mining RDF, OWL, and other semantically tagged scripts,
  – Information integration (aggregation and integration of LOs).

Page 8: Learning Object Metadata Mining

Metadata Mining

• Statistical methods based on word frequency analysis,

• Syntactic methods based on linguistic parsing and pattern matching,

• Structural methods studying the outline of the document,

• Conceptual (semantic) methods based on the use of a knowledge base to interpret the meaning.

Page 9: Learning Object Metadata Mining

Metadata Mining

• We don't use
  – Natural Language Processing (NLP),
  – Semantic analysis and processing,
  – Graphs, trees, and other sophisticated data structures and models,
  – Dictionaries, thesauruses, or any other global vocabularies (only a simple Porter stemmer).

Page 10: Learning Object Metadata Mining

Outline

• Metadata Mining
• Metadata Representation Model
• Class-Term Matrix
• Case Study
• Concluding Remarks

Page 11: Learning Object Metadata Mining

Metadata Representation Model

• We treat metadata as a text document (semi-structured format),
• The only measures are
  – statistical measures (like frequency),
  – geometric features (like the location of a specific term, or the order of words in a term or phrase).

Page 12: Learning Object Metadata Mining

Metadata Representation Model

• Vector Space Model

[Figure: vector space model; labels: T, d_i, Vocabulary]
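
As a minimal sketch of the vector space model, a document can be mapped onto a fixed vocabulary as a vector of term weights. Raw term counts are assumed here as the weights, and the vocabulary, the sample text, and the helper name `to_vector` are made up for illustration.

```python
from collections import Counter

def to_vector(tokens, vocabulary):
    """Map a tokenized document onto a fixed vocabulary as raw term counts."""
    counts = Counter(tokens)
    # A Counter returns 0 for terms that never occur in the document.
    return [counts[term] for term in vocabulary]

vocabulary = ["metadata", "mining", "learning", "object"]  # hypothetical vocabulary
doc = "learning object metadata mining of learning object repositories".split()
vector = to_vector(doc, vocabulary)
print(vector)
```

Terms outside the vocabulary ("of", "repositories") are simply dropped, which is why the size of the vocabulary fixes the dimensionality of every document vector.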

Page 13: Learning Object Metadata Mining

Metadata Representation Model

• Multi-Partition Vector Space Model

[Figure: multi-partition vector space model; labels: T, d_i, Vocabulary]

Page 14: Learning Object Metadata Mining

Metadata Representation Model

• Multi-Partition Vector Space Model

Page 15: Learning Object Metadata Mining

Metadata Representation Model

• Converting to standard vector model

Page 16: Learning Object Metadata Mining

Metadata Representation Model

• Weight of each partition
  – To be determined by an expert, for example: W_abstract = 1.0, W_title = 1.5.
• Membership degree of each term in every partition
  – By an expert,
  – Frequency-based measures (tf-idf),
  – Geometric measures (location of each term in the partition).
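
One plausible reading of the conversion to a standard vector is a weighted sum over partitions. The sketch below assumes that reading, uses raw counts as the membership degree, and takes the two partition weights from the example above; the record, the vocabulary, and the function name `combine_partitions` are hypothetical.

```python
from collections import Counter

# Partition weights from the example above; any other partition would
# need its own expert-chosen weight.
PARTITION_WEIGHTS = {"title": 1.5, "abstract": 1.0}

def combine_partitions(record, vocabulary, weights=PARTITION_WEIGHTS):
    """Collapse per-partition term counts into one vector (assumed: weighted sum)."""
    vector = [0.0] * len(vocabulary)
    for partition, text in record.items():
        counts = Counter(text.lower().split())
        w = weights.get(partition, 0.0)  # partitions without a weight are ignored
        for i, term in enumerate(vocabulary):
            vector[i] += w * counts[term]
    return vector

record = {  # hypothetical metadata record
    "title": "metadata mining",
    "abstract": "mining implicit information from metadata and from text",
}
vocabulary = ["metadata", "mining", "text"]
combined = combine_partitions(record, vocabulary)
print(combined)
```

A term that appears in the title contributes 1.5 per occurrence, while the same term in the abstract contributes 1.0, so title terms count for more in the combined vector.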

Page 17: Learning Object Metadata Mining

Outline

• Metadata Mining
• Metadata Representation Model
• Class-Term Matrix
• Case Study
• Concluding Remarks

Page 18: Learning Object Metadata Mining

Class-Term Matrix

• Document-Term Matrix (Collection × Vocabulary)
  – The matrix is very large (thousands of documents in the collection and millions of terms in the vocabulary),
  – The matrix is sparse: usually only a small number of its elements are nonzero (Zipf's law),
  – The matrix is dual with respect to terms and documents.

Page 19: Learning Object Metadata Mining

Class-Term Matrix

• Class-Term Matrix (Class × Vocabulary)
  – The matrix is large (tens of classes and millions of terms in the vocabulary),
  – The matrix is less sparse,
  – The matrix is still dual with respect to terms and classes.
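
A class-term matrix can be obtained from labelled documents by summing term counts over all documents of each class. The sketch below assumes plain class-term frequencies as the entries; the sample documents and the function name `class_term_matrix` are illustrative only.

```python
from collections import Counter, defaultdict

def class_term_matrix(labelled_docs):
    """Sum term counts over all documents of each class.

    Returns (classes, vocabulary, matrix), where matrix[i][j] is the
    frequency of vocabulary[j] within classes[i].
    """
    per_class = defaultdict(Counter)
    for label, tokens in labelled_docs:
        per_class[label].update(tokens)
    classes = sorted(per_class)
    vocabulary = sorted({t for counts in per_class.values() for t in counts})
    matrix = [[per_class[c][t] for t in vocabulary] for c in classes]
    return classes, vocabulary, matrix

docs = [  # hypothetical (class label, stemmed tokens) pairs
    ("AI", ["learn", "agent", "learn"]),
    ("AI", ["plan", "agent"]),
    ("DB", ["queri", "index"]),
    ("DB", ["queri", "learn"]),
]
classes, vocabulary, matrix = class_term_matrix(docs)
for c, row in zip(classes, matrix):
    print(c, row)
```

Four document rows collapse into two class rows, which illustrates why the class-term matrix is both smaller and less sparse than the document-term matrix.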

Page 20: Learning Object Metadata Mining

Class-Term Matrix

[Equations: class-term frequency; term significance measure; normalized term significance measure]

Page 21: Learning Object Metadata Mining

Class-Term Matrix

Sig         Term 1     Term 2     ...   Term m
Class 1     Sig(1,1)   Sig(1,2)   ...   Sig(1,m)
Class 2     Sig(2,1)   Sig(2,2)   ...   Sig(2,m)
...         ...        ...        ...   ...
Class q     Sig(q,1)   Sig(q,2)   ...   Sig(q,m)

Page 22: Learning Object Metadata Mining

Class-Term Matrix

[Table: class-term matrix; rows Class 1 to Class q, columns Term 1 to Term m, entries Sig(class, term)]

• Terminology
  – All terms which occur in a class (or concept), i.e. one row of the matrix
  – A fuzzy set of all terms in the vocabulary

Page 23: Learning Object Metadata Mining

Class-Term Matrix

[Table: class-term matrix; rows Class 1 to Class q, columns Term 1 to Term m, entries Sig(class, term)]

• Definition
  – All concepts (classes) which the term belongs to, i.e. one column of the matrix
  – A fuzzy set of all concepts (classes)
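
The two fuzzy-set views of the class-term matrix can be sketched as a row lookup (the terminology of a class) and a column lookup (the definition of a term). Membership degrees are assumed here to be the entries scaled by the largest value in the row or column; that scaling, the sample values, and both function names are assumptions for illustration.

```python
def terminology(sig, classes, vocabulary, cls):
    """Fuzzy set over terms for one class: a matrix row scaled to [0, 1]."""
    row = sig[classes.index(cls)]
    peak = max(row) or 1.0  # avoid dividing by zero for an all-zero row
    return {t: v / peak for t, v in zip(vocabulary, row) if v > 0}

def definition(sig, classes, vocabulary, term):
    """Fuzzy set over classes for one term: a matrix column scaled to [0, 1]."""
    j = vocabulary.index(term)
    column = [row[j] for row in sig]
    peak = max(column) or 1.0
    return {c: v / peak for c, v in zip(classes, column) if v > 0}

classes = ["AI", "DB"]
vocabulary = ["agent", "learn", "queri"]
sig = [[2.0, 4.0, 0.0],   # hypothetical Sig(class, term) values
       [0.0, 1.0, 2.0]]

print(terminology(sig, classes, vocabulary, "AI"))
print(definition(sig, classes, vocabulary, "learn"))
```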

Page 24: Learning Object Metadata Mining

Outline

• Metadata Mining
• Metadata Representation Model
• Class-Term Matrix
• Case Study
• Concluding Remarks

Page 25: Learning Object Metadata Mining

Case Study

• Data set
  – There is no available LO metadata repository
  – Citeseer computer science directory (http://citeseer.ist.psu.edu/directory.html)
  – ~400,000 terms (vocabulary size)
  – 17 classes
  – 2,912 documents
  – Instead of data (in PDF or PS), we collected BibTeX data (a kind of metadata or catalogue) and abstracts of the articles.

Page 26: Learning Object Metadata Mining

Case Study

Page 27: Learning Object Metadata Mining

Case Study

Page 28: Learning Object Metadata Mining

Case Study

• Types of Frequency Measures
  – Within document: by document-term frequency (like tf-idf)
  – Within class: by class-term frequency (like term significance)
  – Within collection: by collection-term frequency (like the mean of term significances)
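
For the within-document measure, a minimal tf-idf sketch is given below. It assumes the common variant tf × log(N / df) with term frequency normalized by document length; the slides do not say which variant was used, and the sample documents are made up.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights

docs = [["metadata", "mine", "metadata"], ["data", "mine"], ["metadata", "standard"]]
w = tfidf(docs)
print(w[1])
```

A term that occurs in every document gets weight 0, which is the within-document counterpart of treating collection-wide frequent terms as stopwords.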

Page 29: Learning Object Metadata Mining

Case Study

• Term Clustering: categorizing all terms into three main groups
  – Features: more frequent terms within a class
  – Keywords: more frequent terms within some documents belonging to a given class
  – Stopwords: more frequent terms in all classes
• Introducing the Class-Collection Map
  – To visualize the location of each category
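
The three groups can be read as regions of the class-collection map. The sketch below is only one possible decision rule: the two thresholds, the use of the mean over classes as the within-collection frequency, and the function name `categorize` are assumptions, not values or rules taken from the slides.

```python
def categorize(freq_by_class, cls, class_threshold=0.05, collection_threshold=0.03):
    """Label a term for class `cls` from its per-class relative frequencies.

    freq_by_class maps each class to the term's within-class frequency.
    Both thresholds are illustrative assumptions.
    """
    within_class = freq_by_class[cls]
    within_collection = sum(freq_by_class.values()) / len(freq_by_class)
    if within_collection >= collection_threshold:
        return "stopword"  # frequent across the whole collection
    if within_class >= class_threshold:
        return "feature"   # frequent in this class only
    return "keyword"       # rare at class level; may still dominate single documents

print(categorize({"AI": 0.06, "DB": 0.05, "OS": 0.07}, "AI"))
print(categorize({"AI": 0.08, "DB": 0.00, "OS": 0.00}, "AI"))
print(categorize({"AI": 0.01, "DB": 0.00, "OS": 0.00}, "AI"))
```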

Page 30: Learning Object Metadata Mining

Case Study

[Figure: class-collection map with regions labelled Features, Keywords, and Stopwords; x-axis: within-collection frequency; y-axis: within-class frequency]

Page 31: Learning Object Metadata Mining

Case Study

[Figure: class-collection map with regions labelled Features, Keywords, and Stopwords; x-axis: within-collection frequency; y-axis: within-class frequency]

Page 32: Learning Object Metadata Mining

Case Study

[Figure: class-collection map with regions labelled Features, Keywords, and Stopwords; x-axis: within-collection frequency; y-axis: within-class frequency]

Page 33: Learning Object Metadata Mining

Case Study

• Extraction of stopwords (terms that do not contribute to the meaning of the document)
  – General stopwords (a, an, the, in, …)
  – Domain-specific stopwords
    • Politics: Government, State
    • Medicine: Patient
    • Education: Learner, Instructor
    • Social sciences: Society
    • Anthropology: Human

Page 34: Learning Object Metadata Mining

Case Study

• Why do we need to remove domain-specific stopwords?
  – Dimensionality reduction,
  – More accurate feature selection (information gain has the drawback of selecting noise as features),
  – Stopwords let us find and separate phrases (by our definition, a phrase is a set of words between two stopwords).
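
The phrase definition above can be sketched as splitting a token stream at stopwords. The stopword list here is a tiny illustrative one, the function name `split_phrases` is hypothetical, and the start and end of the text are treated as delimiters as well, which is an added assumption.

```python
def split_phrases(tokens, stopwords):
    """Return maximal runs of non-stopword tokens, in order of appearance."""
    phrases, current = [], []
    for token in tokens:
        if token.lower() in stopwords:
            if current:            # a stopword closes the phrase collected so far
                phrases.append(current)
            current = []
        else:
            current.append(token)
    if current:                    # text may end without a closing stopword
        phrases.append(current)
    return phrases

stopwords = {"a", "an", "the", "in", "of", "is", "for"}  # tiny illustrative list
text = "the vector space model is a representation for learning object metadata"
phrases = split_phrases(text.split(), stopwords)
print(phrases)
```

Adding domain-specific stopwords to the set changes where phrases are cut, so the quality of the stopword list directly affects the extracted phrases.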

Page 35: Learning Object Metadata Mining

[Figure: class-collection map for the AI class. Each point is a stemmed term (for example algorithm, learn, plan, network), plotted by within-collection frequency (x-axis, 0 to 0.06) against within-class frequency (y-axis, 0 to 0.18). The labels also include general stopwords such as only, its, both, however, and thus.]

Page 36: Learning Object Metadata Mining

[Figure: class-collection map for the DB class. Each point is a stemmed term (for example data, databas, queri, schema), plotted by within-collection frequency (x-axis, 0 to 0.06) against within-class frequency (y-axis, 0 to 0.7). The labels also include general stopwords such as only, its, both, however, and thus.]

Page 37: Learning Object Metadata Mining

[Figure: class-collection map for a further class (class label not recoverable). Each point is a stemmed term (for example cach, kernel, latenc, internet), plotted by within-collection frequency (x-axis, 0 to 0.06) against within-class frequency (y-axis, ticks up to 0.16).]

manualmapmarshalmaskmaster matchmathemat

matrixmatter maximmbac

mbitmean

meaning

measur

mechan

meet

meiko

membermembership

memori

merchantmerg

mesh

messag

metacomputmetasystem methodmethodologmicrocodmicrokernel

microprocessor

middlew ar

migratmigratorimillionmimic

minim

minimamissmistakmitml mobilmode

model

modernmodifi

modulmodulamodularmolecularmomentmonitormontz moreovmoss motivmotorola

move

movement

mpmpimppmud

multi

multicast

multilay

multimedia multipl

multiprocess

multiprocessor

multirmultithread

municipmuninmurphimwmyriadnaglnakamura name

nation naturnavig

ncube

necessarinecessarilinectar

need

needlessli

negneithernestnetcash

netw ork

neural

new

new linext

nicenightnii nodenomin

nonblocknonprehensilnonuniformnotablnote notion

novel

noveltinsnumer

obj

object

observobtainobviouoccur

offer

oftenoldolder

olymp

onoodb

oper

oppos

optic

optim

orca

order

organorganis orientoriginorthogon

os

outperform

overalovercomoverhead

overlapow n packetpagepalmpam

parallel

parametparctabpariti

partpartialparticip

particular

partit

pass

passiv

pastpathpathw ai

pattern

paymentpcpencilperceivperceptperfect

perform

periodpermitperpag personpgaphilosophi

physic

pin placeplacementplai planplatformplayer point

pointerpolicipolit

polygon

poorli

popularportabl

portion

positpossess

possiblpost

potenti

pow erpractic

pre precispredict

preemptibl

preferpreferenti

prefetchprefix

preliminariprepar

present

prevent

previou

primariprimarili

primit

princip

principl

privatekeiprobabilist

probe

problemproceprocedur process

processor

produc

productprofessorprofil

profilem

program

programm

progress projectproliferpromisproperti

propos

protect

protocol

prove

provid

providingaprovidingmodularprovis

proxi

pseudopublic

pure

purpospvm

qo qualitiquantif iquantit queri

questionqueue

quickquicklirabinraid

ramifrandi random

rang

rapidrapidlirare

rasp

rate

ratiorawrc rereachabl

reactiv

read

reader

real

realist

realitirealtim reason recent

receptrecord

recordlist

recoveri

recoveryrecord

recurrrecursredistribut

reduc

reductrefer

refinreflectreformulregardregim

regionregistregularrel relatrelax

releas

relireliablreliancremainremaind

remedi

remotreplic

reportrepres

represent

requestrequir

requisitrerout researchresili

resourc

respectrespons

restrict

result

retailretessel

retriev

returnreusabl

reviewrigidrigorriserithmrobinson

robotrobust

rolerout

routerrsa rule

run

runtim

safesameh

samplsandboxsatisfi

save

scalabl

scale

scanscarcscatterscenario

sceneschedul schemescheurichscienc

scientif

scientist

scout

searchsec secondsecrecisecretsection

secur

seek semantsemisendsepar

sequenc

sequentiseri

serial

serv

serverservic

session

setsever

sgishamir

share

shepherdshiftshort show

show nsidesignsignatur

signif ic

signif icantlisimilarsimilarli simplsimpler

simpli

simplicsimplif i simulsimultan

sinc

singl

site

situat

sizeslidesliderslight

slow

small

smoothsnoopiso

social

softw ar

solut

solv

sometimsomew hat

sonhetimsoon

sophist sourcspacespars specialspecialpurpos

specif

specif ispectralspectrum

speed

spispinsplicespuriousquid

stabilstall standardstanford

start statestatechartstatemstaticstatist

steerstem step

stevenstm

storestrategi

stream

strengthstress

strictstrike

stripe

strongstrongli

structurstudi

stylesubjectsubmitsubsequsubsumpt

subsystemsuccesssuccessfullisuffer suff ici

suit

summer

sun

suno

supercomput

suppli

support

supportingonlisurfacsurvei

svcsvrsw

sw itch

symbol

symmetrsympo

synchronsynergi

syntax

synthesisynthetsystemattabl

tacctackltailor take

tallitampertarget task

tcatcp

techniqu

technologtelecommun

temportendenctensor term

terminolog

test

textualtexturthank

themselvtheorem

theoret

theori

thesi

thingthirteenth

thoroughthoughtthousand

thread

threatthroughout

throughput

ti

tiger

time

tmtmctodaitoken

toler

tooltopologtow ard

tow n

trace

trade

tradeoff

tradit

tradition traff ic

transacttransendtransfer

transformtransitransistor translattransmisstranspartransport

treetritriangltriangultrytune

turn

type

typechecktypic

ubiquitouscomputucuhligunabl

underli

understand

unexpectunfortununif i

uniform

uniformli

uninterpretuniqu

universunixunknow nunnecessariunobtrus

unoccludunpredictunregistunresolvunsecur

untrustuntypupdatupperusag

userusual

utilvalid valuvaluablvari variablvarieti

variouvectorvendor

verif

verif iversionversuvi via videoviewview erview pointvigor

virtual

virtualiz

visibl

visionvisit visualvlsivote

voter

w aiw aitw arehousw astw aveletw eak

w ebw fw gw henevw hite

w idew idespreadw indoww ire

w iredvil

w ord w orkw orkload

w orkstat

w orldw ormholw orst

w rapw rite

w riterw ritervers

w ritten

w row t

w w w

year

zebra

zero might

only

its

both

other

either

thereforebecome threethough

although

how everw ay w ellunderaround

othersznontogether

ratherstillthus among

must

becomesletssay

onceitself

HW

within-collection frequency

with

in-c

lass

fre

quen

cy

Page 38: Learning Object Metadata Mining


[Figure, panel MI: stemmed terms plotted by within-collection frequency (x-axis, 0 to 0.06) against within-class frequency (y-axis, 0 to 0.45).]

[Figure, panel IR: stemmed terms plotted by within-collection frequency (x-axis, 0 to 0.06) against within-class frequency (y-axis, 0 to 0.4).]

Page 39: Learning Object Metadata Mining


[Figure, panel www: stemmed terms plotted by within-collection frequency (x-axis, 0 to 0.06) against within-class frequency (y-axis, 0 to 0.7).]

Page 40: Learning Object Metadata Mining


[Figure: scatter plot of stemmed vocabulary terms (e.g. netw ork, packet, protocol, rout, tcp); x-axis: within-collection frequency (0 to 0.06), y-axis: within-class frequency (0 to 0.7).]

Page 41: Learning Object Metadata Mining

Case Study

• Dimensionality reduction process
– ~400,000 → 15,971
– Stemming: 15,971 → 12,044
– Multi-partition document vector space model (using metadata): 12,044 → 5,605
– Fuzzy-based term clustering: 5,605 → 507 stopwords, 4,872 keywords, 226 features
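The counts on this slide can be sanity-checked with a few lines of arithmetic. The sketch below assumes the stages apply in the order the numbers appear on the slide; the stage labels and variable names are illustrative and are not taken from the original deck.

```python
# Vocabulary sizes reported on the slide.
# Assumption: each stage feeds the next, in this order.
stages = [
    ("before stemming", 15_971),
    ("after stemming", 12_044),
    ("after multi-partition vector space model", 5_605),
]

# Final split produced by the fuzzy-based term clustering step.
clusters = {"stopwords": 507, "keywords": 4_872, "features": 226}


def reduction_ratios(stages):
    """Return (label, fraction of the previous stage's terms that are kept)."""
    return [
        (label, size / prev_size)
        for (_, prev_size), (label, size) in zip(stages, stages[1:])
    ]


ratios = reduction_ratios(stages)
clusters_total = sum(clusters.values())

for label, kept in ratios:
    print(f"{label}: {kept:.1%} of the previous stage kept")
print(f"clusters sum to {clusters_total} terms")
```

The three clusters add up to exactly 5,605 terms, so the fuzzy clustering step partitions the reduced vocabulary rather than discarding any of it.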

Page 42: Learning Object Metadata Mining

Outlines

• Metadata Mining
• Metadata Representation Model
• Class-Term Matrix
• Case Study
• Conclusion Remarks

Page 43: Learning Object Metadata Mining

Conclusion Remarks

• Most statistical data mining methods do not use domain knowledge.

• Metadata (semi-structured data) mining uses the domain knowledge embedded in tags and partitions.

• We introduced a multi-partition document vector space model.

• We mine the class-term matrix in addition to the document-term matrix.
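To make the last point concrete, here is a minimal sketch of one way a class-term matrix could be derived from a document-term matrix: sum the term counts over the documents of each class. The aggregation rule, the toy data, and the function name `class_term_matrix` are assumptions for illustration; this part of the deck does not spell out the exact construction.

```python
from collections import defaultdict


def class_term_matrix(doc_term, labels):
    """Aggregate a document-term matrix into a class-term matrix.

    doc_term: list of {term: count} dicts, one per document.
    labels:   class label of each document, in the same order.
    Returns {class: {term: total count over that class's documents}}.
    """
    if len(doc_term) != len(labels):
        raise ValueError("doc_term and labels must have the same length")
    matrix = defaultdict(lambda: defaultdict(int))
    for counts, label in zip(doc_term, labels):
        for term, count in counts.items():
            matrix[label][term] += count
    # Convert nested defaultdicts to plain dicts for a clean result.
    return {label: dict(terms) for label, terms in matrix.items()}


# Toy example (hypothetical data, stemmed terms).
docs = [
    {"packet": 3, "rout": 2, "data": 1},
    {"packet": 1, "protocol": 2},
    {"search": 4, "data": 2},
]
labels = ["networking", "networking", "web"]
ctm = class_term_matrix(docs, labels)
print(ctm)
```

Each row of the result describes a class rather than a single document, which is what allows terms to be compared across classes.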

Page 44: Learning Object Metadata Mining

Conclusion Remarks

• Based on the visualization model (class-collection map) and fuzzy inference, we can cluster the vocabulary of each class and extract three essential categories:
– Features: to classify unknown documents,
– Keywords: for indexing and access to specific documents in IR applications,
– Stopwords: for dimensionality reduction and noise removal.
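As a rough illustration of how a term's position on the class-collection map could drive this three-way split, the sketch below applies a small fuzzy rule base to a term's within-class and within-collection frequencies. The membership functions, the thresholds, and the three rules are hypothetical and chosen only for illustration; they are not the authors' actual inference system.

```python
def ramp(x, lo, hi):
    """Piecewise-linear membership in 'high': 0 below lo, 1 above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def categorize(class_freq, coll_freq):
    """Assign a term to the category with the strongest rule activation.

    Hypothetical rules (min is used as the fuzzy AND):
      stopword: within-collection frequency is high
      feature:  within-class frequency is high AND collection frequency is low
      keyword:  within-class frequency is low  AND collection frequency is low
    """
    high_class = ramp(class_freq, 0.1, 0.3)   # assumed thresholds
    high_coll = ramp(coll_freq, 0.01, 0.03)   # assumed thresholds
    degrees = {
        "stopword": high_coll,
        "feature": min(high_class, 1.0 - high_coll),
        "keyword": min(1.0 - high_class, 1.0 - high_coll),
    }
    best = max(degrees, key=degrees.get)
    return best, degrees


for name, cf, kf in [("t1", 0.5, 0.005), ("t2", 0.05, 0.005), ("t3", 0.2, 0.05)]:
    print(name, categorize(cf, kf)[0])
```

A real system would tune the membership functions per class and would likely use a richer rule base; the point here is only that each term receives a degree of membership in every category before a final label is chosen.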

Page 45: Learning Object Metadata Mining

Conclusion Remarks

• Based on the class-term matrix, we defined
– Terminologies as fuzzy sets of all terms in the vocabulary
– Definitions as fuzzy sets of all concepts
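One way to read these two definitions is as row-wise and column-wise views of the class-term matrix. The sketch below assumes that concepts correspond to classes and that membership degrees come from max normalization; both choices, along with the function names and toy data, are assumptions made for illustration.

```python
def terminologies(ctm):
    """For each class, a fuzzy set over terms (row-wise max normalization)."""
    result = {}
    for cls, terms in ctm.items():
        peak = max(terms.values(), default=0)
        result[cls] = {t: (c / peak if peak else 0.0) for t, c in terms.items()}
    return result


def definitions(ctm):
    """For each term, a fuzzy set over classes (column-wise max normalization)."""
    # Transpose the class-term matrix into term -> {class: count}.
    columns = {}
    for cls, terms in ctm.items():
        for t, c in terms.items():
            columns.setdefault(t, {})[cls] = c
    result = {}
    for t, col in columns.items():
        peak = max(col.values())
        result[t] = {cls: (c / peak if peak else 0.0) for cls, c in col.items()}
    return result


# Toy class-term matrix (hypothetical counts, stemmed terms).
ctm = {
    "networking": {"packet": 4, "rout": 2, "data": 1},
    "web": {"search": 4, "data": 2},
}
term_sets = terminologies(ctm)
def_sets = definitions(ctm)
print(term_sets["networking"])
print(def_sets["data"])
```

In this toy example the term "data" belongs to both classes, more strongly to "web", while "packet" is fully characteristic of "networking".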

Page 46: Learning Object Metadata Mining

Conclusion Remarks

• Future Work
– Collecting LO metadata and constructing an LO metadata repository,
– A keyword recall method to test and validate extracted keywords,
– Implementing an average classifier (KNN or fuzzy classifier) to test and validate selected features,
– Applying a multi-classifier architecture to the metadata mining problem.