How well does your Instance Matching system perform? Experimental evaluation with LANCE


Tzanina Saveta, Evangelia Daskalaki, Giorgos Flouris, Irini Fundulaki
Institute of Computer Science – FORTH, Greece

Axel-Cyrille Ngonga Ngomo
IFI/AKSW, University of Leipzig, Germany

10/31/16 ISWC 2016: How well does your Instance Matching system perform? Experimental evaluation with LANCE

Why Instance Matching?

Different sources contain different descriptions of the same real-world entity

*Adapted from Suchanek & Weikum tutorial @ SIGMOD 2013

Instance Matching for Linked Data


•  A set of RDF triples constitutes an RDF graph
•  Sparse data
•  Rich semantics expressed in terms of ontologies
•  Large number of sources to integrate
•  Value, structure and semantics heterogeneities

*Adapted from Suchanek & Weikum tutorial @ SIGMOD 2013

Benchmarking


Instance matching has led to the development of a number of matching techniques and tools

•  How to compare those?
•  How to assess their performance (efficiency and effectiveness)?
•  How to “push” systems into becoming better?

•  Benchmark your systems!

Instance Matching Benchmark Components

•  Datasets
   –  The source and target datasets that will be matched together to find the entities that refer to the same real-world object
•  Ground truth / Gold standard / Reference alignment
   –  The “correct answer sheet” used to judge the completeness and soundness of the results produced by the system under test (SUT)
•  Organized into test cases, each addressing a different kind of instance matching requirement
•  Metrics
   –  The performance metric(s) that determine the systems’ efficiency and effectiveness
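As an illustration of how a gold standard is used to judge soundness and completeness, the sketch below computes precision, recall and F-measure for a set of matches. It assumes these standard metrics and a gold standard given as a set of (source, target) pairs; the function name `evaluate` and the `s:`/`t:` identifiers are hypothetical and not part of LANCE.

```python
def evaluate(system_matches, gold_standard):
    """Return (precision, recall, f_measure) for sets of (source, target) pairs."""
    system_matches = set(system_matches)
    gold_standard = set(gold_standard)
    true_positives = system_matches & gold_standard

    # Precision reflects soundness: how many reported matches are correct.
    precision = len(true_positives) / len(system_matches) if system_matches else 0.0
    # Recall reflects completeness: how many correct matches were reported.
    recall = len(true_positives) / len(gold_standard) if gold_standard else 0.0

    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical toy data: four correct pairs, the system reports three.
gold = {("s:1", "t:1"), ("s:2", "t:2"), ("s:3", "t:3"), ("s:4", "t:4")}
found = {("s:1", "t:1"), ("s:2", "t:2"), ("s:3", "t:9")}

precision, recall, f_measure = evaluate(found, gold)
print(precision, recall, f_measure)
```

In this toy case two of the three reported pairs are correct, and two of the four gold standard pairs are found.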


LANCE

•  A novel instance matching benchmark generator
•  Domain-independent
•  Highly configurable and scalable
•  Standard value-based and structure-based test cases
•  Advanced semantics-aware test cases considering OWL 2 expressive constructs
•  Rich weighted gold standard
•  Additional metrics: similarity score metric



LANCE Architecture

[Figure: LANCE architecture diagram. Labelled elements: Source Data, Data Ingestion Module, RDF Repository, Test Case Generation Parameters, Initialization Module, Resource Generator, SPARQL Queries (Schema Stats), SPARQL Queries (IR), Test Case Generator, Resource Transformation Module, Weight Computation Module, RESCAL [NT12], MATCHER, SAMPLER, Matched Instances, Target Data, Weighted Gold Standard.]

Test Cases

Test cases are built using a variety of transformations

•  Value-based test cases
   –  Transformations of values of datatype properties
•  Structure-based test cases
   –  Transformations of structure of object and datatype properties
•  Semantics-aware test cases
   –  Transformations at the instance level considering the schema
•  Simple and complex combinations of the first three categories
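To make the idea of a value-based test case concrete, here is a minimal sketch that perturbs the value of a datatype property and records the resulting source/target pair in a gold standard. The typo-style transformation (`swap_adjacent`), the triple layout and the `src:`/`tgt:` identifiers are assumptions made for illustration; they are not taken from LANCE itself.

```python
def swap_adjacent(value, position):
    """Swap the characters at position and position + 1 (a typo-like change)."""
    if position < 0 or position + 1 >= len(value):
        return value  # nothing to swap, leave the value unchanged
    chars = list(value)
    chars[position], chars[position + 1] = chars[position + 1], chars[position]
    return "".join(chars)


# Hypothetical source instances as (subject, datatype property, literal value).
source = [
    ("src:1", "title", "Instance Matching"),
    ("src:2", "title", "Linked Data"),
]

target = []
gold_standard = []
for index, (subject, prop, value) in enumerate(source):
    new_subject = f"tgt:{index + 1}"  # the target instance gets a new identifier
    target.append((new_subject, prop, swap_adjacent(value, 2)))
    gold_standard.append((subject, new_subject))  # record the correct match

print(target)
print(gold_standard)
```

A matcher that compares literal values exactly would miss both pairs, while one using an edit-distance style similarity would likely still find them.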


LANCE Performance Metrics

•  Average similarity score: average difficulty of the matched instances
   –  Benchmark with high average similarity score: matched instances are easier to find
•  Standard deviation: spread of similarity scores for the matched instances
   –  Benchmark with high standard deviation:
      •  scores are spread out from the average
      •  more heterogeneity of matched instances


Obtain a more fine-grained understanding of the IM system’s performance by comparing the average similarity score and the standard deviation of the system’s matches with those of the benchmark
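The comparison described above can be sketched as follows, assuming the weighted gold standard maps each correct pair to a similarity score and that the system is profiled on its true positives only. The names, the toy scores and the use of the population standard deviation are assumptions for illustration.

```python
from statistics import mean, pstdev


def score_profile(scores):
    """Return (average, standard deviation) of a collection of similarity scores."""
    scores = list(scores)
    return mean(scores), pstdev(scores)


# Hypothetical weighted gold standard: correct pair -> similarity score.
weighted_gold = {
    ("s:1", "t:1"): 0.95,
    ("s:2", "t:2"): 0.90,
    ("s:3", "t:3"): 0.75,
    ("s:4", "t:4"): 0.70,
}
# Hypothetical system output; the last pair is a false positive.
system_matches = {("s:1", "t:1"), ("s:2", "t:2"), ("s:9", "t:9")}

benchmark_avg, benchmark_std = score_profile(weighted_gold.values())

# Only true positives have a score in the weighted gold standard.
tp_scores = [weighted_gold[pair] for pair in system_matches if pair in weighted_gold]
system_avg, system_std = score_profile(tp_scores)

print(benchmark_avg, benchmark_std)
print(system_avg, system_std)
```

Here the system's average is higher and its spread smaller than the benchmark's, which would suggest it found mainly the easier matches.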

Experiments

•  Efficiency and effectiveness of IM systems using LANCE benchmarks
   –  Systems:
      •  LogMap Version 2.4 [JG11] (MORe Reasoner [RG13])
      •  OtO [DP12]
      •  LIMES (EAGLE IM algorithm [NL12])
   –  Datasets
      •  LDBC’s SPIMBENCH Generator (Semantic Publishing Benchmark)
      •  UOBM
   –  Matching Task
      •  All 5 categories introduced previously
      •  All instances were transformed


SPIMBENCH: Standard Metrics

•  LogMap
   –  Responds well to the value-based test cases
   –  Reduced performance when semantics-aware test cases were also applied

SPIMBENCH: Standard Metrics

•  OtO and EAGLE
   –  Give good results for the value-based transformations
   –  Reduced performance in the remaining categories
      •  EAGLE is non-deterministic and uses unsupervised learning

UOBM: Standard Metrics

•  LogMap
   1. Does not perform well in any of the categories
   2. Performance not affected by the dataset size
•  OtO
   1. Performs better
   2. Reduced performance when increasing dataset size


SPIMBENCH: Additional Metrics

[Figure: Distribution of similarity scores for LANCE and true positive matches from IM systems for semantics-aware test cases, in the case of the 10K triples dataset. x-axis: similarity scores (0.7 to 1); y-axis: log(# of mappings); series: OtO, EAGLE, LogMap, LANCE.]

•  LogMap can address difficult test cases
•  EAGLE & OtO can address mostly value-based test cases
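A distribution like the one plotted can be approximated by counting scores per bin. The sketch below assumes scores with two decimals and a bin width of 0.02, matching the tick spacing of the plot; the function name and the toy scores are hypothetical, and the log scaling of the y-axis is left to the plotting step.

```python
from collections import Counter


def histogram(scores, bin_width_hundredths=2):
    """Count scores per bin; bins are keyed by their lower edge."""
    counts = Counter()
    for score in scores:
        # Work in integer hundredths to avoid floating-point binning errors.
        hundredths = round(score * 100)
        lower = (hundredths // bin_width_hundredths) * bin_width_hundredths
        counts[lower / 100] += 1
    return dict(sorted(counts.items()))


# Hypothetical similarity scores of a system's true positive matches.
scores = [0.70, 0.71, 0.72, 0.90, 0.90, 0.91, 0.99]
bins = histogram(scores)
print(bins)
```

Running the same function on the benchmark's scores and on each system's true positives gives the per-series counts that such a plot compares.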

UOBM: Additional Metrics

[Figure: Distribution of similarity scores for LANCE and true positive matches from IM systems for structure-based test cases, in the case of the 10K triples dataset. x-axis: similarity (0.6 to 0.9); y-axis: log(# of mappings); series: OtO, LogMap, LANCE.]

[Figure: Standard Deviation. y-axis: 0 to 0.08; series: OtO, LogMap, LANCE.]

•  LogMap cannot handle the change of URIs in the instances well

Lessons Learned

•  Different types of transformations affect IM systems’ performance
•  The characteristics of source datasets affect the behavior of IM systems


Questions?


Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688227.


References

[JG11] E. Jimenez-Ruiz and B. C. Grau. LogMap: Logic-based and scalable ontology matching. In ISWC, 2011.
[RG13] A. A. Romero, B. C. Grau, et al. MORe: a Modular OWL Reasoner for Ontology Classification. In ORE, pages 61-67, 2013.
[DP12] E. Daskalaki and D. Plexousakis. OtO Matching System: A Multi-strategy Approach to Instance Matching. In CAiSE, 2012.
[NL12] A.-C. Ngonga Ngomo and K. Lyko. EAGLE: Efficient Active Learning of Link Specifications using Genetic Programming. In ESWC, 2012.
