The connection between Context-free grammars and XML
Guðmundur Freyr Jónasson
19.04.23 Algorithms, Logic and Complexity
Context-Free Grammars
Variables: A, B
Terminals: 0, 1, #
Number of rules: 3
Derivation of the string 000#111:
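The grammar and the derivation shown on this slide did not survive in the text. As a sketch only, assume the grammar is G1 from Sipser's textbook (rules A → 0A1, A → B, B → #, with start variable A), which is consistent with the variables, terminals and rule count listed above. Under that assumption, the derivation can be replayed in Python; the function name `derive` and the rule numbering are illustrative, not taken from the slides.

```python
# Assumed grammar (Sipser's G1); the slide's own rules are not available.
RULES = {
    1: ("A", "0A1"),
    2: ("A", "B"),
    3: ("B", "#"),
}

def derive(start, rule_sequence):
    """Apply the numbered rules in order, each to the leftmost occurrence
    of its left-hand variable, and return every sentential form."""
    forms = [start]
    current = start
    for number in rule_sequence:
        variable, replacement = RULES[number]
        if variable not in current:
            raise ValueError(f"rule {number} is not applicable to {current!r}")
        current = current.replace(variable, replacement, 1)
        forms.append(current)
    return forms

steps = derive("A", [1, 1, 1, 2, 3])
print(" => ".join(steps))
```

Applying rule 1 three times, then rule 2, then rule 3 ends in the string 000#111; each intermediate entry in `steps` is one sentential form of the derivation.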
Document Type Definition (DTD)
• DTD defines the legal building blocks of an XML document with a list of legal elements and attributes.
• XML validity requires that a document follows the constraints expressed in its document type definition, which provides a rough equivalence to a context-free grammar for a document type.
• A DTD is specified using an “Extended Context-Free Grammar”.
Extended Context-Free Grammar
• Allows regular expressions over terminals and non-terminals on the right-hand side of productions.
• The regular expression operators are ∪, * and concatenation.
• We could write grammar G as follows:
Or
Where S is the start symbol in both cases.
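Extended productions add convenience rather than expressive power: a starred symbol can be replaced by a fresh variable with two ordinary rules. The following is a minimal sketch under stated assumptions: a right-hand side is encoded as a list of symbols, a trailing `*` on a single symbol marks Kleene star, and the function `eliminate_star` and the `_star` naming scheme are hypothetical. The small grammar used here is an illustration, not the grammar G from the slide.

```python
def eliminate_star(rules):
    """Rewrite rules whose right-hand side contains symbols like 'X*'
    into plain context-free rules.

    rules: list of (lhs, rhs) pairs, where rhs is a list of symbols.
    An empty rhs list stands for the empty string (epsilon).
    Assumes no existing variable is already named '<X>_star'.
    """
    plain = []
    fresh = {}  # starred symbol -> name of the variable that replaces it
    for lhs, rhs in rules:
        new_rhs = []
        for symbol in rhs:
            if symbol.endswith("*") and len(symbol) > 1:
                base = symbol[:-1]
                if base not in fresh:
                    fresh[base] = f"{base}_star"
                    # X_star -> X X_star | epsilon generates exactly X*
                    plain.append((fresh[base], [base, fresh[base]]))
                    plain.append((fresh[base], []))
                new_rhs.append(fresh[base])
            else:
                new_rhs.append(symbol)
        plain.append((lhs, new_rhs))
    return plain

extended = [("S", ["a", "B*", "c"]), ("B", ["b"])]
plain_rules = eliminate_star(extended)
for lhs, rhs in plain_rules:
    print(lhs, "->", " ".join(rhs) or "ε")
```

Union can be handled the same way by splitting one rule into several alternatives, so every extended grammar of this kind has an equivalent ordinary context-free grammar.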
DTD Syntax
• <!ELEMENT name (model)>
  – ELEMENT is a keyword
  – name is the name of the element being declared
  – model is the element's content model
• The content model is specified using a regular expression over element names.
• We now have sequences of element names rather than sequences of symbols (strings) as in a CFG.
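Because a content model is a regular expression over element names, checking a list of child elements against it can be sketched with Python's `re` module. This is an illustration under assumptions: it handles only models built from element names, `,`, `|`, `*`, `+`, `?` and parentheses (not `#PCDATA`, `EMPTY` or `ANY`), and the helper names `content_model_to_regex` and `children_match` as well as the example element names are hypothetical.

```python
import re

# Simplified pattern for element names (an assumption, narrower than XML's).
NAME = re.compile(r"[A-Za-z_][\w.\-]*")

def content_model_to_regex(model):
    """Translate a DTD content model over element names into a Python regex.

    Each element name becomes a group matching that name plus a trailing
    space; ',' (sequence) becomes plain concatenation; '|', '*', '+', '?'
    and parentheses are used unchanged as regex operators.
    """
    compact = "".join(model.split())  # drop all whitespace
    grouped = NAME.sub(lambda m: f"(?:{re.escape(m.group(0))} )", compact)
    return re.compile(grouped.replace(",", ""))

def children_match(model, child_names):
    """Check whether a sequence of child element names fits the model."""
    text = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(text) is not None

model = "(title, author+, chapter*)"
print(children_match(model, ["title", "author", "author", "chapter"]))
print(children_match(model, ["author", "title"]))
```

The first call fits the model (one title, at least one author, any number of chapters); the second does not, because the sequence operator `,` fixes the order of the children.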
Regular Expressions vs. DTD Syntax
CFG and DTD example
Conclusion
• Both DTD for XML and CFG describe languages with certain rules and restrictions, and thereby declare what’s legal and what’s not in a given language.
• An XML document is considered valid if it’s well formed and has been validated against a DTD.
• A string is a valid string in a given Context-free language if the Context-free grammar for that language can generate it.
References
• Introduction to the Theory of Computation, 2nd edition, M. Sipser, Course Technology, 2005.
• W3Schools – Learn DTD: http://w3schools.com/dtd/default.asp
• Extensible Markup Language (XML) Fourth Edition: http://www.w3.org/TR/REC-xml/
• Languages, Grammars and DTDs: http://www.dcs.bbk.ac.uk/~ptw/teaching/dtd-new/notes.html