Graph Analysis of Candidate GQL Features
Graph Query Language Project
Existing Languages Working Group
Thomas Frisendal
[email protected], @VizDataModeler
2019-02-26
The “Existing Languages Working Group”
• In preparation for the commencement of planning for GQL, interested parties drawn from industry (Neo4j, Oracle, Redis Labs and TigerGraph), the community (a noted data modelling expert and published technical author), and academia (the University of Talca in Chile) formed an informal working group called the “Existing Languages Working Group”.
• We have worked in an incremental fashion on systematically identifying, surveying, analysing and comparing graph query language features, drawn from the following existing query languages:
  • Cypher
  • PGQL
  • GSQL
  • SQL PGQ [Framework:2020, Foundation:2020, SQL/PGQ IWD, ERF-035]
  • G-CORE
• We hope to compile a catalogue of:
  • the groups of features
  • the extent to which (if at all) these are supported in each language
  • exemplar syntax
  • supplementary artifacts to aid in the understanding of the underlying semantics
  • grammar constructs
  • any additional details of interest.
• The idea is to have this landscape of existing query languages in order to inform the design and development of GQL by way of a well-informed workplan, helping to lead to a more robust outcome; i.e. this will help us to have clear and meaningful discussions on scope and priorities, and will facilitate clear and unambiguous design choices. Moreover, this will help us to identify areas of consolidation, innovation and opportunities for language interoperation in GQL (for example, with SPARQL).
Combatting Complexity: The ELWG Graph Database
• Establishing an analytical graph database for all 5 languages across all 212 features
• Down to the keyword level for each feature of each language across 5 descriptive (text/syntax) dimensions
• Now in its 3rd edition
• Methodology:
  • Consolidate all sheets into one
  • Generate MERGE commands for the features tree and the 5 languages (by way of Excel formulas)
  • Some manual intervention (remove CRs and change every ';' to '§')
  • Load into Neo4j
  • Connect all components
  • Build tags for Descriptors, Grammar Tags and Syntax Tags
  • Build a Keyword tag tree based on all of the 3 above
  • Do some reporting (this ppt and some Excel sheets)
• Will be made available to phase 2 and in the GQL design work (for analysis)
• Ambition: Pragmatic, analytical support tool, not a normative source
• Errare humanum est: report errors and omissions, please (a few known issues already)
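The MERGE-generation and clean-up steps in the methodology above were done with Excel formulas and some manual editing. As a rough illustration only, the same idea might look like this in Python. The node labels FeatureArea, FeatureGroup and Feature come from the meta model statistics; everything else (the row keys 'area', 'group' and 'feature', the 'name' property, the relationship types HAS_GROUP and HAS_FEATURE, and the sample row) is a hypothetical choice made for this sketch and is not taken from the ELWG database.

```python
def clean(text: str) -> str:
    """Mirror the manual clean-up step: swap ';' for '§' and remove CR/LF.

    Splitting on whitespace also collapses repeated spaces, which is a
    simplification made for this sketch.
    """
    return " ".join(text.replace(";", "§").split())


def cypher_string(text: str) -> str:
    """Quote a cleaned value as a single-quoted Cypher string literal."""
    escaped = clean(text).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def merge_statement(row: dict) -> str:
    """Build one MERGE statement for a row of the consolidated sheet.

    The keys 'area', 'group', 'feature', the 'name' property and the
    relationship types HAS_GROUP / HAS_FEATURE are hypothetical.
    """
    return (
        f"MERGE (a:FeatureArea {{name: {cypher_string(row['area'])}}}) "
        f"MERGE (g:FeatureGroup {{name: {cypher_string(row['group'])}}}) "
        f"MERGE (f:Feature {{name: {cypher_string(row['feature'])}}}) "
        f"MERGE (a)-[:HAS_GROUP]->(g) "
        f"MERGE (g)-[:HAS_FEATURE]->(f);"
    )


# Hypothetical sample row: the area and group names are placeholders.
row = {
    "area": "Example area",
    "group": "Example group",
    "feature": "Flattening a list (transform a list into a series of rows; transpose)",
}
print(merge_statement(row))
```

Because the only ';' left in the generated text is the statement terminator, a script made of such lines can be split into statements unambiguously, which is presumably why the semicolons inside the descriptions were replaced.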
Current Meta Model
Statistics

Node types          Count   Min rels   Max rels
Feature               212          6         14
FeatureArea             6          1         17
FeatureGroup           30          2         27
InclDoc                 5         80        549
InclLang             1306          4          4
Language                5        208        311
GCOREFeature          212          2         18
GSQLFeature           212          2         30
OpenCypherFeature     212          2         29
PGQLFeature           212          1         25
SQLFeature            212          2         29
DescriptorTag         401          1         22
GrammarTag            299          1        424
KeywordTag            659          1        247
SyntaxTag             214          1        247
The Features Tree
Comparison of Planned or Implemented Features
[Figure panels: GCORE, GSQL, OpenCypher, PGQL, SQL]
Implementation Status (Not = 'X')
GCORE: 72, GSQL: 152, Cypher: 168, PGQL: 113, SQL: 140
Implementation Status: Not Supported ('X')
GCORE: 118, GSQL: 54, Cypher: 43, PGQL: 99, SQL: 71
The Descriptor Tags
The Grammar Tags
Function Invocation (Cypher)
Not Defined (SQL)
The Syntax Graph
Part of the Syntax Graph
Zooming in on a “Word” in the Syntax Graph
Even More Tags in the Keyword Graph
Essentially the Syntax Tags enhanced with keywords extracted from the Descriptor and Grammar Tags
Collected Keywords per Feature and Language
Using a Graph Algorithm to Measure Similarity of Expression (Jaccard)
Feature Name                                            AvgSim
And                                                     1,00
Comparing values (equality)                             1,00
Equality                                                1,00
Greater than                                            1,00
Greater than or equal to                                1,00
Inequality                                              1,00
Less than                                               1,00
Less than or equal to                                   1,00
Negation                                                1,00
Or                                                      1,00
Type coercions (i.e. implicit type conversions)         1,00
approximate 32-bit binary decimal number                1,00
approximate 64-bit binary decimal number                1,00
Edge directions: l-to-r                                 0,87
Specifying a conditional value                          0,87
date                                                    0,83
localtime                                               0,83
Check if a property exists on a node or an edge         0,80
Edge directions: r-to-l                                 0,79
Edge pattern with disjunction of labels                 0,79
MATCH with more than one node/edge/path pattern (i.e. allowing for 'star'-shaped patterns etc). Essentially this can also be used to obtain a cross product  0,75
Edge pattern with direction                             0,75
Subtraction                                             0,74
Edge directions: any direction                          0,73
Feature Name                                            AvgSim
Dynamic property access (accessing a property of a node or edge by using a dynamically-computed string value as the key§ e.g. allowing for the key to be passed in as a parameter)  -
Escaping characters                                     -
Flattening a list (transform a list into a series of rows§ transpose)  -
Get all the elements of a list/collection/array excluding the first element  -
Get all the labels for a node                           -
Get the identifier of a node or edge                    -
Node pattern with label negation                        -
interval                                                -
multidimensional array                                  -
Obtain the current date/time                            0,06
Get all the nodes in a path                             0,07
List/collection/array concatenation                     0,07
Get all the edges in a path                             0,08
Determine whether or not a value is a member of a multiset  0,08
Input graph specification                               0,08
List equality                                           0,08
Create an edge                                          0,09
Get the edge label as a string                          0,09
Subtraction operator for temporal types and durations   0,11
Create a node                                           0,11
Get the first element in a list/collection/array        0,11
Replace                                                 0,11
Checking if a pattern exists                            0,12
Amalgamate multiple values into a single list           0,13
[Chart: AvgSim per feature; value axis from 0,00 to 1,20]
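The slides do not spell out how AvgSim is derived, beyond naming Jaccard similarity over the collected keywords per feature and language. A minimal sketch of one plausible reading follows: for a given feature, take each language's keyword set and average the Jaccard similarity over all pairs of languages. The averaging rule, the treatment of empty sets, the function names and the sample keyword sets are all assumptions made for illustration; the original work used a graph algorithm on the Neo4j database, not this code.

```python
from itertools import combinations
from typing import Dict, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| (taken as 0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def avg_sim(keywords_by_language: Dict[str, Set[str]]) -> Optional[float]:
    """Average pairwise Jaccard similarity for one feature.

    Assumption: languages with no keywords for the feature are skipped,
    and None is returned when fewer than two languages remain.
    """
    sets = [kw for kw in keywords_by_language.values() if kw]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return None
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Made-up keyword sets for placeholder languages, not ELWG data.
identical = {"lang_a": {"AND"}, "lang_b": {"AND"}, "lang_c": {"AND"}}
mixed = {
    "lang_a": {"CREATE"},
    "lang_b": {"INSERT", "INTO"},
    "lang_c": {"INSERT", "EDGE"},
}

print(avg_sim(identical))
print(avg_sim(mixed))
```

Under this reading, a feature scores 1 only when every language expresses it with exactly the same keywords, and a single shared keyword among otherwise different spellings yields a low score.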
10 Data Extracts in Excel (ELWG_reports_20190228.zip)
• CandidateFeatures_20190228
• DescriptorTags_20190228
• FeaturesNotSupported_20190228
• FeatureSyntaxSimilarity_20190228
• GrammarTags_20190228
• KeywordTagsAcrossLanguages_20190228
• KeyWordTagsCollections_20190228
• SyntaxSummary_20190228
• SyntaxTags_20190228
• SyntaxXref_20190228
Contact information:
Thomas Frisendal (Copenhagen, Denmark)
[email protected]
@VizDataModeler
linkedin.com/in/thomas-frisendal-19a56a