Graph4Scala-JSON-UserGuide.pdf


  • 7/27/2019 Graph4Scala-JSON-UserGuide.pdf

    1/8

    scalax.collection.Graph JSON Import/Export

    Peter Empen

    Contents

    1 Introduction
    2 Exporting Graphs
    3 Working with Descriptors
    4 Importing JSON texts
    5 Working with Custom Edge Types
    6 Note on Inversion
    7 Grammar
      7.1 Notes on the Grammar Notation
      7.2 Notes on Specific Grammar Elements
      7.3 Further Examples for Valid JSON Texts


    1 Introduction

    This document introduces users of Graph for Scala to exporting Graph instances to JSON text and to populating graphs from JSON text. It may thus be viewed as a supplement to the User Guide.

    JSON texts may embed node/edge sections at any point. These sections must adhere to the Graph for Scala JSON Grammar to enable data retrieval. The Graph for Scala JSON Grammar, an extended JSON grammar, has been designed to be flexible in the following ways:

    • An arbitrary number of node/edge sections within the same JSON text will be processed to support different node and edge types within the same Graph instance.
    • JSON texts to be imported may include any non-graph related data, which will be discarded.
    • All identifiers within the JSON text marking node/edge sections or node/edge types are configurable.
    • The user has full control over the JSON formats representing nodes/edges.
    • The user also has fine-grained control over each phase of the import/export process.

    With the exception of serializers, Graph for Scala JSON import/export is transparently implemented on top of Lift-Json.

    Graph for Scala JSON is supplied as an extra module (jar). graph-json_XXX.jar depends on graph-core_XXX, lift-json_YYY and paranamer-ZZZ, all of which must be available at run-time. For the latest release numbers see project/GraphBuild.scala.

    Most examples in the following chapters are based on a partial¹ academic library application backed by a graph. In this library graph, books and authors are represented by nodes, authorship by edges:

        // node types: Book, Author
        sealed trait Library
        case class Book  (val title: String,
                          val isbn: String) extends Library
        case class Author(val surName: String,
                          val firstName: String) extends Library

        // node data: 2 books, 4 authors
        val (programming, inDepth) = (
          Book("Programming in Scala", "978-0-9815316-2-5"),
          Book("Scala in Depth",       "978-1-9351827-0-2")
        )
        val (martin, lex, bill, josh) = (
          Author("Odersky", "Martin"),
          Author("Spoon",   "Lex"),
          Author("Venners", "Bill"),
          Author("Suereth", "Joshua D.")
        )
        // graph with 2 authorship relations
        val library = Graph[Library,HyperEdge](
          programming ~> martin ~> lex ~> bill,
          inDepth ~> josh
        )

    The complete source is incorporated in the test TJsonDemo.scala.

    ¹ We could also represent a complete academic library application by a graph containing different edge types for authorship, lectorship, etc.


    2 Exporting Graphs

    To export a graph instance to JSON text, call toJson:

        import scalax.collection.io.json._
        val exported = library.toJson(descriptor)

    Alternatively, you can control the export phases one by one:

        import scalax.collection.io.json.exp.Export
        val export = new Export[N,E](library, descriptor)
        import export._
        val (nodesToExport, edgesToExport) = (jsonASTNodes, jsonASTEdges)
        val astToExport = jsonAST(nodesToExport ++ edgesToExport)
        val exported = jsonText(astToExport)

    Clearly, exported, of type String, will contain the JSON text. But what about the descriptor argument?

    3 Working with Descriptors

    Fine-grained control over JSON import/export is achieved by means of Graph JSON descriptors, a kind of export/import configuration made up of

    • node descriptors for each node type (see arguments defaultNodeDescriptor and namedNodeDescriptors),
    • edge descriptors for each edge type (see arguments defaultEdgeDescriptor and namedEdgeDescriptors) and
    • node/edge section identifiers (see argument sectionIds).

    Prior to calling toJson, you need to consider which node/edge types your graph contains and how you want to serialize them in terms of Lift-Json serialization. For our academic library example you may start with

        val bookDescriptor = new NodeDescriptor[Book](typeId = "Books") {
          def id(node: Any) = node match {
            case Book(_, isbn) => isbn
          }
        }
        val authorDescriptor = new NodeDescriptor[Author](typeId = "Authors") {
          def id(node: Any) = node match {
            case Author(surName, firstName) => "" + surName(0) + firstName(0)
          }
        }

        import scalax.collection.io.json.descriptor.predefined.DiHyper
        val quickJson = new Descriptor[Library](
          defaultNodeDescriptor = authorDescriptor,
          defaultEdgeDescriptor = DiHyper.descriptor[Library]()
        )

    Now we have a node descriptor for each of the node types Book and Author:

    • The typeId argument will be used to denote the node type in the JSON node sections, like

          {"nodes":{
             "Books":[{"title":"Programming in Scala","isbn":"978-0-9815316-2-5"}, …]}}

    • id, the only abstract method in NodeDescriptor, is required for referencing nodes in the JSON representation of edges, like

          {"edges":{
             "DiEdge":[["978-1-9351827-0-2","SJ"], …]}}

      where "SJ" references the node josh. Without such references, JSON edge entries would have to contain all node data, which would make JSON texts explode in length in proportion to the complexity of the nodes and the order of the graph. Please take great care to design the id method so that it returns unique keys.
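To make the uniqueness requirement concrete, the following minimal sketch checks an id function for collisions before any descriptor is built. It is plain Scala and not part of Graph for Scala JSON: the object IdUniquenessSketch, the helper collisions and the author "Vance, Beth" are hypothetical names introduced for illustration only. The initials scheme is the one used by authorDescriptor above.

```scala
object IdUniquenessSketch {
  final case class Author(surName: String, firstName: String)

  // Same scheme as authorDescriptor: first letter of the surname followed by
  // the first letter of the first name. Assumes both strings are non-empty.
  def initialsId(a: Author): String = "" + a.surName(0) + a.firstName(0)

  // Hypothetical helper: groups distinct nodes by their id and keeps only
  // those ids that are shared by more than one node.
  def collisions[N](nodes: Iterable[N])(id: N => String): Map[String, List[N]] =
    nodes.toList.distinct.groupBy(id).filter { case (_, ns) => ns.size > 1 }

  // The four authors of the library example.
  val authors: List[Author] = List(
    Author("Odersky", "Martin"),
    Author("Spoon", "Lex"),
    Author("Venners", "Bill"),
    Author("Suereth", "Joshua D."))

  // A made-up fifth author whose initials clash with "Venners, Bill".
  val extended: List[Author] = Author("Vance", "Beth") :: authors
}
```

With the four authors of the example the initials are unique, but the scheme stops being safe as soon as a second author with the same initials is added, so a key such as the ISBN used for books is the more robust kind of choice.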

    Thereafter, we assembled a Descriptor with the type argument Library and the constructor arguments authorDescriptor along with the predefined edge descriptor DiHyper. Predefined edge descriptors have a typeId equal to their name and are type-safe with respect to the corresponding predefined edge types, which bear the name of the edge descriptor suffixed with Edge; in our example, DiHyperEdge. Predefined edge descriptors are merely shortcuts for individually configurable instances of EdgeDescriptor, which we do not cover in this introduction.

    At this point you would like to inspect the resulting JSON text, but instead you get a run-time exception stating No 'NodeDescriptor' capable of processing type "demo.Book" found. You were right to wonder about the completeness of quickJson: Graph JSON descriptors must cover all node/edge types to be exported/imported. For a partial export, simply filter your graph instance prior to exporting.

    Here is a complete descriptor that suffices for our academic library graph (the argument names may be omitted; we spell them out just for better readability):

        import scalax.collection.io.json.descriptor.predefined.{Di, DiHyper}
        val descriptor = new Descriptor[Library](
          defaultNodeDescriptor = authorDescriptor,
          defaultEdgeDescriptor = DiHyper.descriptor[Library](),
          namedNodeDescriptors  = Seq(bookDescriptor),
          namedEdgeDescriptors  = Seq(Di.descriptor[Library]())
        )

    Passing the above descriptor to toJson finally yields the following JSON text (prettified afterwards):

        { "nodes": {
            "Books": [{"title":"Programming in Scala","isbn":"978-0-9815316-2-5"},
                      {"title":"Scala in Depth","isbn":"978-1-9351827-0-2"}]
          },
          "nodes": [
            {"surName":"Odersky","firstName":"Martin"},
            {"surName":"Spoon","firstName":"Lex"},
            {"surName":"Suereth","firstName":"Joshua D."},
            {"surName":"Venners","firstName":"Bill"}
          ],
          "edges": {
            "DiEdge": [{"n1":"978-1-9351827-0-2","n2":"SJ"}]
          },
          "edges": [{"nodeIds":["978-0-9815316-2-5","OM","SL","VB"]}]
        }

    Let's analyse this JSON text in more detail:

    You can easily identify the two node sections and the two edge sections denoted by the field names "nodes" and "edges" respectively. These names are defaults which may be altered by supplying a fifth argument to the constructor of Descriptor.

    It may be unclear why the typeId "Authors" is missing while "Books" is present. By design, there is always exactly one default descriptor and zero or more named descriptors for nodes and for edges. The typeId is deliberately omitted in the node/edge sections corresponding to the default descriptors. These flat sections (node/edge lists without type indicators) allow JSON texts to be compatible with JSON texts formatted by third-party libraries such as JSDot.

    The above JSON text may be criticised for being polluted with the repeated field names "surName", "firstName" etc. Pointing to the Lift-Json default serialization, which does include parameter names, may not be an acceptable excuse for this lengthy output. In such cases you may opt for what we call positional JSON, meaning that JSON values are matched to node/edge class fields by their position. Letting the export generate positional JSON requires a little programming, namely the definition of appropriate Lift-Json custom serializers:


        object PositionedNodeDescriptor {
          import net.liftweb.json._
          final class AuthorSerializer extends CustomSerializer[Author](fmts => (
            { case JArray(JString(surName) :: JString(firstName) :: Nil) =>
                Author(surName, firstName)
            },
            { case Author(surName, firstName) =>
                JArray(JString(surName) :: JString(firstName) :: Nil)
            }))
          val author = new NodeDescriptor[Author](
                       typeId            = "Authors",
                       customSerializers = Seq(new AuthorSerializer)) {
            def id(node: Any) = node match {
              case Author(surName, firstName) => "" + surName(0) + firstName(0)
            }
          }
        }

    For each node type we need to extend net.liftweb.json.Serializer (here through CustomSerializer), which is straightforward. Then we pass an instance of the custom serializer AuthorSerializer to the node descriptor author. We have hidden implementation details by wrapping AuthorSerializer and the new NodeDescriptor author in the object PositionedNodeDescriptor, which should also contain a custom serializer for Book (left out here).

    Now we are able to assemble a descriptor that uses positional JSON texts. As the Graph for Scala JSON package also contains predefined serializers for predefined edges, we do not need to implement them separately:

        import scalax.collection.io.json.serializer.{HyperEdgeSerializer, EdgeSerializer}

        val descriptor = new Descriptor[Library](
          defaultNodeDescriptor = PositionedNodeDescriptor.author,
          defaultEdgeDescriptor = DiHyper.descriptor[Library](
                                    Some(new HyperEdgeSerializer)),
          namedNodeDescriptors  = Seq(PositionedNodeDescriptor.book),
          namedEdgeDescriptors  = Seq(Di.descriptor[Library](
                                    Some(new EdgeSerializer)))
        )

    Armed with the above descriptor we then call

    val exported = library.toJson(descriptor)

    and verify the resulting, condensed JSON text:

        { "nodes": {
            "Books": [
              ["Programming in Scala","978-0-9815316-2-5"],
              ["Scala in Depth","978-1-9351827-0-2"]
            ]
          },
          "nodes": [
            ["Odersky","Martin"],
            ["Spoon","Lex"],
            ["Suereth","Joshua D."],
            ["Venners","Bill"]
          ],
          "edges": {
            "DiEdge": [["978-1-9351827-0-2","SJ"]]
          },
          "edges": [["978-0-9815316-2-5","OM","SL","VB"]]
        }
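The difference between the named and the positional representation can be sketched without any JSON library. The following plain Scala example is only an illustration, not how Lift-Json or Graph for Scala JSON render nodes internally: the object PositionalSketch and its helpers named, positional and quote are hypothetical, and productElementNames assumes Scala 2.13 or later.

```scala
object PositionalSketch {
  final case class Author(surName: String, firstName: String)

  // Naive quoting for illustration; it does not escape special characters.
  private def quote(s: String): String = "\"" + s + "\""

  // Named form: every value is preceded by the name of its constructor parameter.
  def named(a: Author): String =
    a.productElementNames.zip(a.productIterator)
      .map { case (name, value) => quote(name) + ":" + quote(value.toString) }
      .mkString("{", ",", "}")

  // Positional form: only the values, in constructor order.
  def positional(a: Author): String =
    a.productIterator.map(value => quote(value.toString)).mkString("[", ",", "]")
}
```

For Author("Odersky", "Martin") the two helpers produce the same entries as shown in the two JSON texts above. The positional entry is shorter, but it can only be read back correctly when importer and exporter agree on the field order.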


    4 Importing JSON texts

    Once you are well versed in the design of Graph for Scala JSON descriptors, there is virtually nothing left to learn in order to populate Graph instances from JSON texts. To process JSON texts, call fromJson:

        import scalax.collection.io.json._
        val library = Graph.fromJson[Library,HyperEdge](jsonTextLibrary, descriptor)

    library, of type Graph[Library,HyperEdge], will contain all nodes/edges derived from the node/edge sections of the JSON text jsonTextLibrary. The descriptor argument will generally be the same value as used for the export, unless you intend to alter node/edge types, which would amount to mapping one graph to another.

    Note that the compiler can infer the type arguments, but the result of this inference will be unsatisfactory, so you are strongly advised to state the correct type arguments explicitly.

    Alternatively, you can control the import phases one by one:

        import scalax.collection.io.json.imp.Parser._
        val parsed = parse(jsonText, descriptor)
        val result = Graph.fromJson[...](parsed)  // state the type arguments explicitly, as above

    5 Working with Custom Edge Types

    As in the following example, custom edge types must mix in Attributes, and their companion objects must extend CEdgeCompanion to meet JSON descriptor requirements. Let's examine the custom edge type Transition, which could serve as a transition between program states depending on keys. For the sake of simplicity we disregard the key modifiers Alt, Ctrl and Shift:

        class Transition[N](from: N, to: N, val key: Char)
          extends DiEdge[N](NodeProduct(from, to))
          with    ExtendedKey[N]
          with    EdgeCopy[Transition]
          with    EdgeIn[N,Transition]
          with    Attributes[N] {

          def keyAttributes = Seq(key)
          override protected def attributesToString = " (" + key + ")"

          type P = Transition.P
          override def attributes: P = new Tuple1(key)
          override def copy[NN](newNodes: Product): Transition[NN] =
            Transition.newEdge[NN](newNodes, attributes)
        }

        object Transition extends CEdgeCompanion[Transition] {
          /** nodes are of type String. */
          def apply(from: String, to: String, key: Char) =
            new Transition[String](from, to, key)
          def unapply[N](e: Transition[String]): Option[(String,String,Char)] =
            if (e eq null) None
            else Some(e.from, e.to, e.key)

          type P = Tuple1[Char]
          override protected def newEdge[N](nodes: Product, attributes: P) =
            nodes match {
              case (from: N, to: N) =>
                new Transition[N](from, to, attributes._1)
            }
        }


    Most notably, attributes must be overridden to return a Product containing all custom fields. The companion object must extend CEdgeCompanion and define newEdge.

    Given the above definition of Transition, we can instantiate a custom edge descriptor as follows:

        new CEdgeDescriptor[String, Transition, Transition.type, Transition.P](
          edgeCompanion    = Transition,
          sampleAttributes = Tuple1('A'))

    6 Note on Inversion

        val expLibrary = library.toJson(descriptor)
        Graph.fromJson[Library,HyperEdge](expLibrary, descriptor) should equal (library)

    Thinking of the JSON export as the inverse function of the JSON import, the following rules apply:

    • Import(Export(graph)) == graph, as demonstrated above.
    • Export(Import(JSON-text)) ≠ JSON-text in most cases.

    The second relation should be obvious because a (JSON) text is an ordered collection of characters while a graph contains unordered sets of nodes and edges.
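The asymmetry can be reproduced with a toy model that involves no JSON at all. In the following minimal sketch, importNodes and exportNodes are hypothetical stand-ins for the real import and export: the "text" is just a comma-separated list of node names, the "graph" is a Set, and the export is assumed to write the names in sorted order.

```scala
object InversionSketch {
  // Hypothetical "import": an ordered text becomes an unordered set of nodes.
  def importNodes(text: String): Set[String] =
    text.split(",").map(_.trim).filter(_.nonEmpty).toSet

  // Hypothetical "export": the set has to be written in some order; this
  // sketch simply sorts the names.
  def exportNodes(nodes: Set[String]): String =
    nodes.toSeq.sorted.mkString(",")

  val graph: Set[String] = Set("Odersky", "Spoon")
  val text: String = "Spoon,Odersky"
}
```

Here importNodes(exportNodes(graph)) equals graph, whereas exportNodes(importNodes(text)) yields "Odersky,Spoon", which differs from the original text although both texts describe the same set of nodes.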

    7 Grammar

    Multiplicities are written after a caret, e.g. ^0..*.

        nodeSection^0..* ::= JsonField( nodeSectionId : nodeValues )
        nodeValues       ::= nodeList
                           | JsonObject( nodeTypeId : nodeList )^0-1
        nodeList         ::= JsonArray( JsonObject( nodeFieldId : nodeField )^1..* )^0-1
                           | JsonArray( JsonArray( nodeField )^1..* )^0-1
        nodeField        ::= JsonValue

        edgeSection^0..* ::= JsonField( edgeSectionId : edgeValues )
        edgeValues       ::= edgeList
                           | JsonObject( edgeTypeId : edgeList )^0-1
        edgeList         ::= JsonArray( JsonObject( edgeIdFields )^2..* )^0-1
                           | JsonArray( JsonArray( edgeFields )^2..* )^0-1
        edgeIdFields     ::= (edgeFieldId : edgeField)^1..*
        edgeFields       ::= (edgeField)^1..*
        edgeField        ::= JsonValue

    7.1 Notes on the Grammar Notation

    (1) Entries with the prefix Json refer to JSON values as defined in RFC 4627. The parentheses following such a Json entry are not part of the syntax. For instance,

            JsonArray( JsonObject( edgeIdFields ))

        reads "a JSON array containing JSON objects containing edgeIdFields".

    (2) If the multiplicity of a repetitive JSON element is restricted, the allowed multiplicity is given in superscript notation, written here after a caret. For instance,

            JsonObject( edgeTypeId : edgeList )^0-1

        translates to

            { } | { edgeTypeId : edgeList }

        with zero or one field in the JSON object. Thus it reads "a JSON object containing zero or one field".

    7.2 Notes on Specific Grammar Elements

    (1) nodeSection / edgeSection JSON fields

        The JSON text passed to the Graph conversion method fromJson will be parsed for an arbitrary number of nodeSections and edgeSections, both described in the above grammar.

    (2) *Id JSON strings

        Grammar elements suffixed with Id, such as nodeSectionId, nodeTypeId, edgeSectionId or edgeTypeId, are always JSON strings. In general, they allow using custom JSON constants. For instance, JSON objects containing edges will be found in the JSON text based on edgeSectionId, which defaults to edges but may be altered to any other name such as vertices. The caller of a Graph conversion method then passes the appropriate value for edgeSectionId in the jsonDescriptor argument.

    (3) nodeTypeId / edgeTypeId JSON strings

        These Ids provide a means of selecting the appropriate node/edge descriptor.

    (4) nodeList / edgeList JSON arrays

        Nodes/edges listed in nodeList / edgeList may be represented either by JSON objects (named fields) or by JSON arrays (positioned field values).
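The two alternatives allowed for list entries can be modelled as a small algebraic data type. The sketch below is a hypothetical illustration in plain Scala and does not reflect the library's internal representation: NodeEntry, Named, Positioned and toNamed are names introduced here, and field values are simplified to strings.

```scala
object NodeListSketch {
  // A list entry is either a JSON object (named fields) or a JSON array
  // (positioned field values).
  sealed trait NodeEntry
  final case class Named(fields: Map[String, String]) extends NodeEntry
  final case class Positioned(values: List[String]) extends NodeEntry

  // Resolves a positioned entry against the expected field ids. The result
  // is None when the number of values does not match the number of ids.
  def toNamed(fieldIds: List[String], entry: NodeEntry): Option[Named] =
    entry match {
      case n: Named => Some(n)
      case Positioned(values) if values.length == fieldIds.length =>
        Some(Named(fieldIds.zip(values).toMap))
      case _ => None
    }
}
```

A positioned entry carries no field names of its own, so it can only be interpreted together with an agreed field order, which in the library is the job of the serializer attached to the descriptor.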

    7.3 Further Examples for Valid JSON Texts

    (1) Default node type with named node fields, default edge type with positioned edge fields

            {"nodes": [{"id":"n1", "name":"John"},
                       {"id":"n2", "name":"Mary"},
                       {"id":"n3", "name":"Alex"}
                      ],
             "edges": [["n2", "n1", 1998],
                       ["n2", "n3", 2003]]
            }

    (2) Default node type with positioned node fields, multiple named edge types with positioned edge fields

            {"nodes": [["n1", "John"],
                       ["n2", "Mary"],
                       ["n3", "Alex"]
                      ],
             "edges": {"married": [["n2", "n1", 1998],
                                   ["n2", "n3", 2003]]
                      },
             "edges": {"knows": [["n2", "n1", 1998, "school"],
                                 ["n2", "n3", 2000, "work"],
                                 ["n1", "n3", 2003, "party"]]
                      }
            }