XJ: Facilitating XML Processing in Java

Written by: Matthew Harren, Mukund Raghavachari, Oded Shmueli, Michael Burke, Rajesh Bordawekar, Igor Pechtchanski, Vivek Sarkar

Conference: The 14th International World Wide Web Conference (WWW2005), Chiba, Japan, May 10-14, 2005

Karawan Shahla, Seminar Lecture 236803



Agenda

• Some files
• Main Idea
• Introduction to XJ
• XJ Type System
• XJ Expressions
• XJ Updates
• XJ Problems
• Conclusion


Schema file (file: technioncatalog.xsd)

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="catalog">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="course" maxOccurs="unbounded">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="points" type="xs:int"/>
                  <xs:element name="number" type="xs:int"/>
                  <xs:element name="name" type="xs:string"/>
                  <xs:element name="teacher" type="xs:string"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>


XML document (file: short.xml)

    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
      <course>
        <points>3</points>
        <number>234319</number>
        <name>Programming Languages</name>
        <teacher>Ron Pinter</teacher>
      </course>
      <course>
        <points>3</points>
        <number>234141</number>
        <name>Combinatorics for CS</name>
        <teacher>Ran El-Yaniv</teacher>
      </course>
    </catalog>


XJ Program file

    import java.io.*;
    import technioncatalog.*;

    public class Demo1 {
        public static void main(String[] args) throws Throwable {
            // Build a typed catalog object from the XML document
            catalog cat = new catalog(new File("short.xml"));
            // Typed XPath: select the second <course> element
            catalog.course c = cat [| /course[2] |];
            printCourse(c);
        }

        private static void printCourse(catalog.course c) {
            String name = c [| /name |];
            String teacher = c [| /teacher |];
            int points = c [| /points |];
            int id = c [| /number |];
            System.out.println(name + " (" + id + ") by " + teacher
                + ", " + points + " credit points");
        }
    }

Output: “Combinatorics for CS (234141) by Ran El-Yaniv, 3 credit points”


Main Idea

• XML is becoming increasingly popular.
• High-level languages should provide adequate support for manipulating XML.
• Let’s go through the existing APIs.


Traditional XML processing (DOM, XPath APIs)

    public static void main(String[] args) throws Throwable {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new java.io.File("short.xml"));
        XPath xp = XPathFactory.newInstance().newXPath();
        // NodeList is the public DOM interface; DTMNodeList is an
        // implementation class and should not appear in client code
        NodeList nodes = (NodeList) xp.evaluate("//course", doc, XPathConstants.NODESET);
        printCourse(nodes.item(1));
    }

• The XPath is a plain string. It may be:
  – Syntactically incorrect
  – Incompatible with the document
• The types of the XML objects (Node, Document) do not reflect the schema
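The "plain string" problem can be reproduced with the standard JAXP classes alone. The following is a minimal sketch, not part of the original slides: the class name `StringXPathDemo`, the helper `count`, and the inline one-course catalog are all made up for illustration. It shows that a misspelled query and a malformed query both compile as Java and are only detected (if at all) when the program runs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class StringXPathDemo {
    // Inline one-course catalog (illustrative), so the sketch needs no file.
    static final String XML =
        "<catalog><course><points>3</points><number>234319</number>"
        + "<name>Programming Languages</name><teacher>Ron Pinter</teacher>"
        + "</course></catalog>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Number of nodes matched by the query, or -1 if the query string is rejected at run time. */
    static int count(String query) {
        XPath xp = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xp.evaluate(query, parse(XML), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//course"));   // well-formed and compatible with the document
        System.out.println(count("//cuorse"));   // misspelled: compiles, silently matches nothing
        System.out.println(count("//course["));  // malformed: compiles, fails only at run time
    }
}
```

Neither mistake is visible to the Java compiler, which is the gap a typed XPath construct is meant to close.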


Traditional XML processing (DOM APIs)

    private static void printCourse(Node n) {
        NodeList nodes = n.getChildNodes();
        System.out.println(nodes.item(5).getTextContent()
            + " (" + nodes.item(3).getTextContent()
            + ") by " + nodes.item(7).getTextContent()
            + ", " + nodes.item(1).getTextContent() + " credit points");
    }

• Assumption: Four child nodes must exist
• Assumption: 3rd child is the course number
• Assumption: 2nd child has no child elements
• These assumptions will not hold if the schema is changed
  – => run-time errors
  – Problems remain, even if we identify nodes by name
• Possible Schema changes:
  – Allowing a new optional <students> sub-element
  – Changing the order of the sub-elements
• What about reading the numeric value of an element?
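How fragile positional access is can be seen without changing the schema at all: the same course, serialized with and without indentation, yields different child indices because whitespace becomes text nodes. The sketch below is an illustration under stated assumptions (the class `ChildAccessDemo` and its helpers are hypothetical names, not from the slides). It also shows lookup by name and the manual string-to-int conversion needed to read a numeric value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildAccessDemo {
    // The same course twice: once indented, once on a single line.
    static final String PRETTY =
        "<course>\n  <points>3</points>\n  <number>234141</number>\n"
        + "  <name>Combinatorics for CS</name>\n  <teacher>Ran El-Yaniv</teacher>\n</course>";
    static final String COMPACT =
        "<course><points>3</points><number>234141</number>"
        + "<name>Combinatorics for CS</name><teacher>Ran El-Yaniv</teacher></course>";

    static Element parseCourse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Positional access as on the slide: the index counts whitespace text nodes too. */
    static String byPosition(Element course, int index) {
        Node child = course.getChildNodes().item(index);
        return child == null ? null : child.getTextContent();
    }

    /** Access by name: robust to whitespace and reordering, but still untyped.
     *  Note that getElementsByTagName searches all descendants, not only children. */
    static String byName(Element course, String tag) {
        NodeList matches = course.getElementsByTagName(tag);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    /** Reading a numeric value means converting the text by hand. */
    static int points(Element course) {
        return Integer.parseInt(byName(course, "points").trim());
    }

    public static void main(String[] args) {
        System.out.println(byPosition(parseCourse(PRETTY), 3));   // the <number> element
        System.out.println(byPosition(parseCourse(COMPACT), 3));  // the <teacher> element
        System.out.println(byName(parseCourse(COMPACT), "number"));
        System.out.println(points(parseCourse(PRETTY)));
    }
}
```

Lookup by name removes the dependence on position, but a renamed or missing element is still discovered only at run time, and the result is still a string.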


Shaping the future

• What XML-related facilities do we want?
  – Typed XML objects
  – Seamless translation of a Schema/DTD into a Java type
  – Two composition techniques
    • XML notation
    • Java’s object creation syntax
  – Two decomposition techniques
    • Typed XPath
    • Typed, named methods/fields
  – XPath expressions as first-class values


XJ: offered solution

Java XJ.

We will give an overview of the constructs offered by XJ.

Available at: http://www.research.ibm.com/xj


XJ Type System


Integration with Schema

• The rationale:
  1. An OO program is a collection of class definitions
  2. A Schema file is a collection of type definitions
• => let’s integrate these definitions
• Any Schema type is also an XJ type
  – The XJ compiler generates a “logical class” for each such type
  – Schema file == package name
  – Using a schema == import schema_file_name;


    import technioncatalog.*;

    public class Demo2 {
        public static void main(String[] args) throws Throwable {
            String x = "Algorithms 1";
            int y = 234247;
            catalog cat = buildCatalog(new catalog.course(
                <course>
                    <points>3</points>
                    <number>{y}</number>
                    <name>{x}</name>
                    <teacher>Shlomo Moran</teacher>
                </course>));
        }

        private static catalog buildCatalog(catalog.course c) {
            return new catalog(<catalog>{c}</catalog>);
        }
    }

XML literal in XJ code
• Invalid XML content triggers a compile-time error
• Resulting elements are typed!


An ill-typed program

    ...
    // Wrong <course> element
    course c = new course(<course> <teacher>Shlomo Moran</teacher></course>);
    buildCatalog(c);

    // An XMLObject cannot be passed as a course element
    XMLObject x = new course.teacher(<teacher>Shlomo Moran</teacher>);
    buildCatalog(x);
    ...

    private static catalog buildCatalog(catalog.course c) {
        return new catalog(<catalog>{c}</catalog>);
    }


Embedding XPath Queries in XJ

• Syntax: XmlExpr [| XPathQuery |]
• Requires a context provider:
  – An XML element over which the XPath query is invoked
  – (see the cat variable in the sample)

    course doSomething(catalog cat, int courseNum) {
        return cat [| /course[./number = $courseNum] |];
    }


XPath Semantics

• Problem: the resulting type is sometimes not so clear
• Two options
  – Sequence<T>
    • If the compiler determines that all result elements are of type T
  – Sequence<XMLObject>
    • (Otherwise)
• Automatic conversion from a singleton sequence
• Static check of XPath queries
  – If the result is always empty => compile-time error


XJ Updates (Introduction)

• XJ provides three kinds of updates:
  1) Simple assignment
  2) Bulk assignment
  3) Structural updates
• XJ updates are designed to be consistent with Java’s reference semantics.


XJ Updates (syntax and semantics)

Simple Assignment

The XPath expression returns a reference to the existing element to be updated.

    public static void changePoint(catalog.course c, int p) {
        c [| /points |] = p;
    }

Bulk Assignment

The XPath expression denotes a sequence, so a bulk assignment performs multiple assignments at once. Here we double the credit points of each course.

    public static void doublePoints(catalog cat) {
        cat [| //points |] *:= 2;
    }


XJ Updates (syntax and semantics)

Structural updates

    public static void addCourse(catalog cat) {
        course c = new course(
            <course>
                <points>4</points>
                <number>234111</number>
                <name>Introduction to CS</name>
                <teacher>Roy Friedman</teacher>
            </course>);
        cat.insertAsLast(c);
    }

Class XMLObject also defines methods such as:
• insertAfter()
• insertBefore()
• insertAsFirst()
• insertAsLast()
• detach()


XJ Updates Problems: Cycles

• Updates may create cycles, e.g. an element that has more than one parent.
• This raises a run-time exception.
• The check ensures that a root is never inserted into one of its descendants.
• Why are cycles bad?
• Can you think of a solution?
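The standard DOM API faces the same problem, and looking at how it reacts may help when thinking about the questions above. This is a sketch of plain DOM behaviour, not of XJ's implementation; the class name `CycleCheckDemo` and its two helper methods are invented for the illustration. Inserting an ancestor under its own descendant is refused with a `DOMException`, and appending an already attached node moves it rather than giving it a second parent.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CycleCheckDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Tries to insert an ancestor under its own descendant; true if DOM refuses. */
    static boolean ancestorInsertRejected() {
        Document doc = newDocument();
        Element catalog = doc.createElement("catalog");
        Element course = doc.createElement("course");
        doc.appendChild(catalog);
        catalog.appendChild(course);
        try {
            course.appendChild(catalog);  // would make <catalog> its own descendant
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.HIERARCHY_REQUEST_ERR;
        }
    }

    /** Appending an attached node moves it, so it never has two parents. */
    static boolean moveKeepsSingleParent() {
        Document doc = newDocument();
        Element first = doc.createElement("catalog");
        Element second = doc.createElement("catalog");
        Element course = doc.createElement("course");
        first.appendChild(course);
        second.appendChild(course);       // implicitly removes course from 'first'
        return course.getParentNode() == second && !first.hasChildNodes();
    }

    public static void main(String[] args) {
        System.out.println(ancestorInsertRejected());
        System.out.println(moveKeepsSingleParent());
    }
}
```

Both outcomes are run-time behaviour: as with XJ, nothing in the static types prevents the offending call from compiling.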


XJ Updates Problems: Type Consistency

• Definitions
  1. An XML update operation, u, is a mapping over XML values
    • u: T1 -> T2
  2. An update is consistent if T1 = T2
• Ideally, a compile-time error should be triggered for each inconsistent update in the program
• Unfortunately, this cannot be promised
• The solution: an additional run-time check
• Can you think of an example?


XJ Updates Problems: Covariant subtyping (the problem)

• Covariance: the change of type in a signature is in the same direction as that of the inheritance

    class X { }
    class A { public void m(X x) { } }

    class X1 extends X { }
    class A1 extends A { public void m(X1 x) { } }
    ...
    A a = new A1();
    a.m(new X());   // Which method should be invoked: A.m() or A1.m()?

• A1.m() is “spoiled”: it requires only X1 objects
• Java favors type-safety: a method with covariant arguments is considered to be an overloading rather than an overriding
  – The same approach is taken by C++, C#
• But covariance is allowed for arrays
  – Array assignments may fail at run-time
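Both points can be checked in ordinary Java. The sketch below mirrors the classes on the slide, except that `m` returns a string so the chosen method is observable (that change, and the class name `CovarianceDemo`, are assumptions made for the illustration).

```java
final class CovarianceDemo {
    static class X { }
    static class X1 extends X { }

    static class A {
        String m(X x) { return "A.m(X)"; }
    }

    static class A1 extends A {
        // Different parameter type: this overloads m, it does not override A.m(X).
        String m(X1 x) { return "A1.m(X1)"; }
    }

    /** The call from the slide: static type A, dynamic type A1, argument of type X. */
    static String callThroughA() {
        A a = new A1();
        return a.m(new X());
    }

    /** Same receiver, but the static type A1 makes the overload visible. */
    static String callThroughA1() {
        A1 a1 = new A1();
        return a1.m(new X1());
    }

    /** Arrays are covariant, so the type check moves to run time. */
    static boolean arrayStoreFails() {
        X[] xs = new X1[1];      // legal: X1[] is a subtype of X[]
        try {
            xs[0] = new X();     // compiles, but the array really holds X1 only
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughA());    // A.m(X)
        System.out.println(callThroughA1());   // A1.m(X1)
        System.out.println(arrayStoreFails()); // true
    }
}
```

The array case is the closest analogue to the XJ situation: the program type-checks, and the mismatch surfaces as a run-time exception.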


XJ Updates Problems: Covariant subtyping (example)

(Now let us get back to our technioncatalog schema…)

• A <course> value is also spoiled
  – It requires unique children: <points>, <name>, etc.
• But it also has an unspoiled super-class: XMLObject
  – All updates to XMLObject are legal at compile-time
• The following code compiles successfully:

    public static void trick(course c) {
        XMLObject x = c;
        points p = new points(<points>4</points>);
        x.insertAsLast(p);   // Run-time error is here!
    }


Shaping the future (revisited)

• Language constructs seen so far
  – Typed XML objects
  – Seamless translation of a Schema/DTD into a Java type
  – Two composition techniques
    • XML notation
    • Java’s object creation syntax
  – Two decomposition techniques
    • Typed XPath
    • Typed, named methods/fields
  – XPath expressions as first-class values


XPath expressions as first-class values

• What is a first-class value?
  – A value that can be used “naturally” in the program
    • Passed as an argument
    • Stored in a variable/field
    • Returned from a method
    • Created
• In XJ, XPath expressions do not meet these conditions
  – The main obstacle: the XPath part of the expression cannot be separated from its context provider


XPath expressions as first-class values

• Operators on XPath values
  – Composition
  – Conjunction
  – Disjunction
• These operators will allow the developer to easily create a rich array of safe XPath values
• The compiler must keep track of the type of each such value
  – Basically an XPath value is a function T -> R, where both T, R are subclasses of XMLObject
  – When two XPath values are composed, the result type is deduced from the types of the operands
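The "function T -> R" view can be approximated in plain Java generics. The sketch below is only an analogy and not an XJ feature: `Path`, `then`, `Catalog`, and `Course` are hypothetical names, and the result types are ordinary Java classes rather than subclasses of XMLObject. It shows a path value being stored in a field, separated from any context provider, and composed so that the compiler deduces the result type from the operands.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class TypedPathDemo {
    // Hand-written stand-ins for schema-derived types (illustrative only).
    static final class Course {
        final String name;
        final int points;
        Course(String name, int points) { this.name = name; this.points = points; }
    }

    static final class Catalog {
        final List<Course> courses;
        Catalog(List<Course> courses) { this.courses = courses; }
    }

    /** A path value: maps a context of type T to a sequence of R. */
    interface Path<T, R> {
        List<R> eval(T context);

        /** Composition: Path<T,R> followed by Path<R,S> yields Path<T,S>. */
        default <S> Path<T, S> then(Path<R, S> next) {
            return ctx -> eval(ctx).stream()
                    .flatMap(r -> next.eval(r).stream())
                    .collect(Collectors.toList());
        }
    }

    // Path values live in fields, independent of any particular catalog.
    static final Path<Catalog, Course> COURSES = cat -> cat.courses;
    static final Path<Course, Integer> POINTS = c -> Collections.singletonList(c.points);

    // The composed type Path<Catalog, Integer> is checked by the compiler;
    // POINTS.then(COURSES) would be rejected at compile time.
    static final Path<Catalog, Integer> ALL_POINTS = COURSES.then(POINTS);

    static List<Integer> demo() {
        Catalog cat = new Catalog(Arrays.asList(
                new Course("Programming Languages", 3),
                new Course("Combinatorics for CS", 3)));
        return ALL_POINTS.eval(cat);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Conjunction and disjunction could be added in the same style as further default methods; they are left out to keep the sketch short.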


Typed, named methods/fields

• Usually, values aggregated by a Java object are accessed by fields/methods
  – Can we access XML sub-elements this way?
  – (The following code IS NOT a legal XJ program)

    import technioncatalog.*;

    void printTeachers(catalog cat) {
        for (int i = 0; i < cat.courses.length; ++i) {
            catalog.course c = cat.courses[i];
            System.out.println(c.teacher);
        }
    }


Typed, named methods/fields

• Some of the difficulties:
  – Sub-elements are not always named
  – Schema supports optional types: <xsd:choice>
    • How can Java express an “optional” field?
• Observation: Java’s typing mechanisms cannot capture the wealth of Schema/DTD types
  – Missing features: virtual fields, inheritance without polymorphism
  – Other features can be found in functional languages
    • E.g.: variant types, immutability, structural conformance
    • But their popularity lags behind


Conclusion

• XJ is a Java extension that has built-in support for XML
  – Type safety: many things are checked at compile time
  – Ease of use
• OO languages are not powerful enough (in terms of typing)
  – Some type information is lost in the transition Schema -> Java