XQuery’s Enigmatic Information Architecture Role
MarkLogic User Conference 2011
Peter O’Kelly
Agenda
• Background
• Why XQuery is awesome
• The XQuery enigma: why it’s not yet mainstream
• Projections and recommendations
• Q&A
4/28/2011 © 2011 O’Kelly Associates
Background
• Where I’m coming from
  – Industry analyst/consultant working in information management and collaboration domains for ~30 years
  – Background
    • Application developer and data architect
    • Product management and strategy roles at Lotus, IBM, Groove Networks, Macromedia, and Microsoft
    • Industry analyst/consultant with the Patricia Seybold Group and Burton Group
  – pbokelly.blogspot.com
Background
• My high-level XQuery perspective
  – XQuery truly is awesome…
    • A very well-designed language and standards initiative, optimized for un- or semi-structured information
  – But XQuery appears to be somewhat stalled, in terms of overall market momentum
  – It’s important to understand and address the reasons for the stall
    • Because a vibrant XQuery standard, along with related techniques and tools, is important for the evolution of information management
Why XQuery is Awesome
• A purpose-built XML content manipulation language
  – Gracefully applying the joy of sets to XML content
  – Offering a sustainably complementary fit with SQL
    • Designed by experts including SQL co-author Don Chamberlin
  – Evolving to go far beyond queries
    • With search, conditional expressions, function libraries, and more
  – Can replace a kitchen sink of earlier technologies
    • Fewer moving parts means more simplicity and less maintenance
• A W3C Recommendation, building on XML Schema, XPath, and other standards
Other Evidence of Non-Mainstreaming
• Vendor uncertainty or other hesitation
  – E.g., at Gilbane Boston 2010, few of the exhibitors I spoke with had even heard of XQuery
    • Not a statistically significant survey, but still surprising
  – Few of the current market-leading collaboration/content platforms are based on XQuery
    • Tangent: this suggests there is a compelling market opportunity for new collaboration/content entrants that are XQuery-based
Exploring the XQuery Enigma
• Some issues that have probably limited XQuery market momentum
  – Lack of a big-picture framework
  – Installed base inertia
  – Standards and politics
  – The Internet ethos
  – Limited techniques and tools
• [Don’t panic! We’ll return to the future-optimistic themes in a few minutes...]
A Big-picture Framework
• A digital information item dichotomy
  – Resources
    • Digital artifacts optimized for human comprehension
      – Organized in terms of narrative, hierarchy, and sequence
    • Examples: books, magazines, documents (e.g., PDF, Word), Web pages, XBRL documents…
  – Relations
    • Application-independent descriptions of real-world things and relationships
    • Examples: business domain databases, e.g., customer, sales, HR…
The Resource/Relation Continuum
[Figure: a continuum running from Resource to Relation, labeled with Word docs, DITA docs, XBRL docs, PDF docs, operational db, desktop db, and streaming db]
A Big-picture Framework
• Complementary levels of modeling abstraction
  – Conceptual
    • Technology-neutral
    • Used to establish contextual consensus
    • Also very useful, when done well, for creating logical models
  – Logical
    • Captures conceptual models in a technology rendering
    • Examples: (beyond-the-basics) hypertext and relational
    • Information workers and app developers ideally work at this level of abstraction
  – Physical
    • Includes implementation-level details
    • Ideally, activity at this level is limited to system architects and administrators
Conceptual Model Examples
A Big-picture Framework
              Resources                  Relations
Conceptual    Resources and links        Entities, attributes, relationships, and identifiers
Logical       Model: hypertext           Model: extended relational
              Language: XQuery           Language: SQL
Physical      (both columns) Indexing (e.g., scalar data types, XML, full-text), locking and
              isolation levels, federation, replication, in-memory databases, columnar storage,
              table spaces, caching, and much more
The Lack of a Big-picture Framework
• Without a framework, there’s likely to be
  – Uncertainty about what to use when
  – Conflict based more on miscommunication and/or misunderstanding than real issues
  – Insufficient focus on
    • Application/data independence
    • Conceptual/logical/physical model independence
  – Low probability of appreciation for the sustainable and complementary fit between XQuery and SQL
Installed Base Inertia
• Incumbent vendors
  – DBMS vendors
  – Application vendors
• Large organizations usually have distinct “content” and “data” management groups, often with little collaboration between them
  – Content-focused people are often more instance-oriented and care a lot about schema flexibility
  – Database-focused people are generally more type-oriented and care a lot about schema precision
A Content-centric View
[Figure: the Resource/Relation continuum, with endpoints labeled Resource and Relation]
A Database-centric View
[Figure: the Resource/Relation continuum, with endpoints labeled Resource and Relation]
Installed Base Inertia
• Programmer preferences can be pernicious
  – Object-oriented frameworks have a lot in common with resources (e.g., hierarchy, sequence, and positional navigation)
  – The object/relational “impedance mismatch” is still irksome, in some tools/frameworks
  – But that does not mean it’s reasonable to default to resource-oriented approaches for all domains, even if the application is XML-centric, because not all XML content is resource-centric
    • Doing so can dumb down DBMS usage patterns, with significant consequences
Standards and Politics
• “The nice thing about standards is that you have so many to choose from” – Andrew Tanenbaum
• As in the development of SQL, there are complex challenges at the intersection of standards groups, vendor agendas, and academic priorities
• The Open XML/ODF debate is another recent, relevant, and revealing case study
Standards and Politics
• NoSQL
  – “A rhetorically clever and manipulative name … Saying ‘NoSQL’ says what you’re against, not what you’re for” (Joe Maguire)
  – As with the largely failed “object database” wave 20+ years ago, NoSQL extremists appear to underestimate the expressive power and utility of what they propose to displace
  – While there is ample room for database-related innovation, polarizing the debate is unhelpful
The Internet Ethos
• Lots to like
  – Open, community-driven, vendor-independent…
• But also some risks; e.g., the Internet
  – Doesn’t complain if your system is inefficient or ineffective
  – Is culturally conducive to cyber-polarization
    • E.g., there are probably still lively Internet forum debates about the relative merits of DTD, Schematron, RNG, and XML Schema
      – And xBASE versus SQL, and RPG versus COBOL…
• This creates a key challenge: it’s difficult to get vitality readings on standards and technology alternatives
  – Including major initiatives such as XQuery and XHTML 2.0
Limited Techniques and Tools
• Some SQL reality checks
  – Relatively few people work directly with SQL
    • The vast majority of information workers and developers who benefit from using SQL do so indirectly, through tools ranging from IDEs to query/reporting applications
  – The development of ODBC was pivotal for software vendors and application developers working with RDBMSs
    • Making it possible for them to use a single interface model for multiple products
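The single-interface idea can be illustrated by analogy. The sketch below uses Python's DB-API (PEP 249) rather than ODBC itself; both let application code follow one call pattern across database products. The `sales` table, its columns, and the `totals_by_customer` and `demo_connection` helpers are hypothetical, and only the standard-library `sqlite3` driver is exercised here.

```python
import sqlite3


def totals_by_customer(conn):
    """Return (name, total) rows using only DB-API calls: cursor, execute, fetchall."""
    cur = conn.cursor()
    # No bind parameters are used, because placeholder styles ("?" vs "%s")
    # differ between DB-API drivers.
    cur.execute(
        "SELECT name, SUM(amount) AS total FROM sales "
        "GROUP BY name ORDER BY total DESC"
    )
    return cur.fetchall()


def demo_connection():
    """Build a throwaway in-memory database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    # conn.execute / conn.executemany are sqlite3 conveniences, used for setup only.
    conn.execute("CREATE TABLE sales (name TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("Ann", 40), ("Bob", 25), ("Ann", 10), ("Cy", 30)],
    )
    return conn


result = totals_by_customer(demo_connection())
print(result)
```

Under this assumption, `totals_by_customer` should work unchanged with a connection object from any other DB-API driver, which is the property the slide credits ODBC with providing for RDBMS tools.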
Limited Techniques and Tools
• XQuery’s market uptake has been constrained by the small number of XQuery-based tools and applications
  – Which is in turn limited in part by the lack of a successful ODBC equivalent for XQuery
    • Which, in turn, is partly a function of Microsoft’s apparent XQuery ambivalence
  – Many XML-focused developers believe they get most of what they need from XPath
    • Without tools to promote effective use of XQuery, it’s a difficult value proposition to make
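To make the XPath-versus-XQuery point concrete, here is a minimal sketch in Python (chosen only because the slides contain no code of their own). The XML documents, element names, and the `orders_with_customer_names` helper are invented for illustration. The standard-library `xml.etree.ElementTree` module supports a limited XPath subset, which is enough for path selection; the join, filter, and sort across two documents have to be hand-written in the host language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents.
CUSTOMERS = ET.fromstring("""
<customers>
  <customer id="c1"><name>Ann</name></customer>
  <customer id="c2"><name>Bob</name></customer>
</customers>""")

ORDERS = ET.fromstring("""
<orders>
  <order customer="c1" total="40"/>
  <order customer="c2" total="25"/>
  <order customer="c1" total="10"/>
</orders>""")

# Path selection alone: the XPath subset handles this directly.
ann_orders = ORDERS.findall("./order[@customer='c1']")


def orders_with_customer_names(customers, orders, minimum):
    """Join orders to customer names, keep totals >= minimum, sort descending."""
    names = {c.get("id"): c.findtext("name") for c in customers.findall("customer")}
    rows = [
        (names[o.get("customer")], int(o.get("total")))
        for o in orders.findall("order")
        if int(o.get("total")) >= minimum
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# A rough, untested XQuery counterpart (file names are assumptions):
#   for $o in doc("orders.xml")//order,
#       $c in doc("customers.xml")//customer[@id = $o/@customer]
#   where xs:integer($o/@total) >= 20
#   order by xs:integer($o/@total) descending
#   return <row name="{$c/name}" total="{$o/@total}"/>
joined = orders_with_customer_names(CUSTOMERS, ORDERS, 20)
print(len(ann_orders), joined)
```

The selection step is a one-line path expression, while the join needs a lookup table, a loop, and a sort. That gap is one plausible reading of what XPath-only developers give up.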
Limited Techniques and Tools
• Modeling techniques and tools are also pivotal
  – There are some good options today for physical database modeling
    • But few choices for logical modeling
  – And almost a complete lack of conceptual modeling tools
  – For XML information modeling, there are even fewer modeling technique/tool options today
• It’s also a cultural and incentive system challenge
  – If developers are paid to primarily focus on physical models, that’s what most of them will do
Limited Techniques and Tools
• Many XML-focused developers appear to believe they don’t need to invest time and attention in modeling
  – In part because XML-focused application development often starts with existing XML schemas and/or documents rather than “green field” modeling
• But modeling is equally applicable to resource and relation domains, for
  – Establishing contextual consensus
  – Helping to promote
    • Application/information independence
    • Conceptual/logical/physical model independence
  – Fostering the effective application of set theory and maximizing the use of declarative expressions
Projections and Recommendations
• XQuery is going to be a mainstream success
• RDBMSs aren’t going away anytime soon
• The standards scene is evolving in subtly significant ways
• More and better modeling
• MarkLogic is very well positioned
XQuery Will be a Mainstream Success
• And already is a success, for many progressive IT organizations and software vendors
• The next wave of XQuery momentum will likely come more from content management than traditional database management
  – Providing significant opportunities to have fewer information architecture moving parts
    • E.g., to spend less on specialized enterprise content management, records management, and Web content management servers and tools
XQuery Will be a Mainstream Success
• Recommendations
  – Learn and fully leverage XQuery
    • Go beyond the basics to master the full XQuery language
    • “Querying XML,” by Jim Melton and Stephen Buxton, is a useful resource in this context
  – Seek to simplify and consolidate, e.g.,
    • To do less scripting/programming and more declarative development using XQuery
    • To migrate content and apps from legacy ECM systems
RDBMSs Aren’t Going Away
• Resources and relations are complementary
  – And XQuery and SQL offer very strong synergy
  – Systems such as Google’s Megastore are important leading indicators, as hybrid models
• “NoSQL” will rapidly evolve
  – Initially implied “Just say ‘no’ to SQL”
  – Later quietly redefined as “Not Only SQL”
  – What may be next: “New Opportunities for SQL”
    • I.e., some developers may reconsider the value of SQL and RDBMSs, after hitting NoSQL limitations
RDBMSs Aren’t Going Away
• Recommendations
  – Develop expertise in both (beyond-the-basics) hypertext and relational models
    • And explore the information flows between them
  – Provide clear customer requirements and feedback to your RDBMS, application, and tool vendors
    • Encourage them to fully exploit resource/relation synergy
  – Establish clear developer criteria on what to use when, e.g., for NoSQL alternatives
    • Consider applying the framework presented earlier
Subtly Significant Standards Evolution
• The industry is a very different place compared to when SQL was standardized in the mid-1980s
  – The Internet ethos is pervasive, and key vendors have learned to productively play the standards game together
• There have been some major standards changes recently, e.g., the discontinuation of XHTML 2.0
• But there is also clear market momentum consolidation around standards including XML Schema, XPath 2.0, XSLT, and HTML5
  – And, although not always obviously, XQuery
Subtly Significant Standards Evolution
• Recommendations
  – Place well-informed standards bets, regularly check assumptions, and be willing to make course corrections
  – Get involved
    • Make your standards-related requirements clear to your strategic vendors
    • Actively participate in standards activities
More and Better Modeling
• Conceptual, logical, and physical modeling are critical success factors for both resources and relations
• Organizations that under-invest in modeling are essentially reverting to the obsolete programs-have-files approach, limiting
  – Application/data (and content) independence
  – Conceptual/logical/physical model independence
More and Better Modeling
• Recommendations
  – Develop modeling expertise
    • Explore resources such as “Mastering Data Modeling” (Carlis/Maguire)
  – Apply the big-picture framework for consensus on (resources + relations) * (conceptual/logical/physical)
  – Build and consistently use model repositories
    • Also ensure modeling and reuse are supported by developer incentive systems
  – Provide clear modeling-related requirements to your tool and server vendors
MarkLogic is Very Well Positioned
• MarkLogic
  – Placed an early bet on XQuery, and continued to focus on XQuery while many other vendors balked
  – Has insights from XML information management market leadership in key domains including media, government, and finance
  – Is led by a deeply experienced and strong team
• Recommendations
  – Share your experiences this week and consider proposing a customer case study for MLUC 2012