Libwww, the W3C protocol library 29.06.2004
libwww - The W3C Protocol Library
"Großes Schwerpunktseminar WI"
University of Applied Sciences Gießen-Friedberg
Stefan Sabatzki
Contents
1. Introduction
2. Structure of libwww
3. Programming with libwww
4. Conclusion
Contents
1. Introduction
  – What is libwww?
  – Why libwww?
2. Structure of libwww
3. Programming with libwww
4. Conclusion
What is libwww?
• Generic framework for building web applications
• Written in C
• Pluggable, modular design
• Provides the most common Internet access methods
• Transmits data in many different media formats
• Handles data flow to and from the server
What is libwww? (2)
• First version implemented in 1992 by Tim Berners-Lee
• Developed at CERN
• 1994: libwww moved from CERN to the W3C
• 1998: released as open source
• As of September 2003, the W3C stopped work on libwww
• As of January 2004, libwww officially belongs to the "Open Source Community"
Why libwww?
• Experimenting and prototyping
• Performance, modularity and extensibility
• Free and open source code
• Mailing lists and active community
Contents
1. Introduction
2. Structure of libwww
  – Design Model
  – Request/Response Paradigm
  – Data Flow
  – Threads, Event Loops and Filters
  – Modules as State Machines
3. Programming with libwww
4. Conclusion
Design Model
• Layering as design model
Design Model (2)
• A more illustrative view
Request/Response Paradigm
• The application issues a request
• libwww fulfills the request
• The result is presented to the application on arrival
• Simultaneous requests are handled by the library core
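The request/response flow above can be sketched in plain C. This is only an illustration under stated assumptions: `Core`, `core_issue`, `core_deliver` and `count_response` are hypothetical names, not libwww functions, and the network is replaced by an explicit "data has arrived" call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the libwww API. */
#define MAX_PENDING 8

/* Called by the core once the response for a request has arrived. */
typedef void (*ResponseHandler)(const char *uri, const char *body, void *ctx);

typedef struct {
    const char     *uri;
    ResponseHandler on_response;
    void           *ctx;
} Request;

/* The core keeps track of all requests that are currently in flight. */
typedef struct {
    Request pending[MAX_PENDING];
    size_t  count;
} Core;

/* The application issues a request; the core only records it and returns. */
static int core_issue(Core *core, const char *uri, ResponseHandler cb, void *ctx)
{
    assert(core != NULL && uri != NULL && cb != NULL);
    if (core->count == MAX_PENDING) return -1;
    core->pending[core->count].uri = uri;
    core->pending[core->count].on_response = cb;
    core->pending[core->count].ctx = ctx;
    core->count++;
    return 0;
}

/* Stand-in for the network: the response for `uri` has arrived.
   The matching request is removed and its handler is called. */
static int core_deliver(Core *core, const char *uri, const char *body)
{
    for (size_t i = 0; i < core->count; i++) {
        if (strcmp(core->pending[i].uri, uri) == 0) {
            Request done = core->pending[i];
            core->pending[i] = core->pending[--core->count];
            done.on_response(done.uri, body, done.ctx);
            return 0;
        }
    }
    return -1;  /* no pending request for this URI */
}

/* Example handler: counts the responses the application has seen. */
static void count_response(const char *uri, const char *body, void *ctx)
{
    (void)uri;
    (void)body;
    ++*(int *)ctx;
}
```

Several requests can be pending at once, and responses may arrive in any order; the application only ever sees its own handler being called.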
Data Flow
• Streams are used to transport data
• Stream types derived from the generic stream:
  – Protocol streams
  – Converters
  – Presenters
  – I/O streams
  – Basic streams
Data Flow (2)
• Structured streams
  – Derived from the generic stream
  – Accept structured documents
  – Ordered, tree-structured arrangement of data
  – Each instance is associated with an SGML parser
  – Each instance is associated with the corresponding DTD
Data Flow (3)
• Cascaded streams
  – Stream chains
  – Setup before data arrives
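A stream chain that is set up before any data arrives might look roughly like the sketch below. The `Stream` struct, `BufferSink`, and the `*_put` and `*_init` functions are assumptions made for illustration; the real libwww stream class has more methods and different names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types; the real libwww stream class has more methods. */
typedef struct Stream Stream;
struct Stream {
    int   (*put_block)(Stream *me, const char *data, size_t len);
    Stream *target;              /* next stream in the chain, or NULL */
};

/* A sink that collects everything it receives into a fixed buffer. */
typedef struct {
    Stream base;                 /* first member, so Stream* can be cast back */
    char   data[64];
    size_t used;
} BufferSink;

static int sink_put(Stream *me, const char *data, size_t len)
{
    BufferSink *sink = (BufferSink *)me;
    if (len >= sizeof sink->data - sink->used) return -1;  /* keep room for '\0' */
    memcpy(sink->data + sink->used, data, len);
    sink->used += len;
    sink->data[sink->used] = '\0';
    return 0;
}

/* A converter that upper-cases each block and forwards it to its target. */
static int upper_put(Stream *me, const char *data, size_t len)
{
    char chunk[16];
    assert(me->target != NULL);
    while (len > 0) {
        size_t n = len < sizeof chunk ? len : sizeof chunk;
        for (size_t i = 0; i < n; i++)
            chunk[i] = (char)toupper((unsigned char)data[i]);
        if (me->target->put_block(me->target, chunk, n) != 0) return -1;
        data += n;
        len  -= n;
    }
    return 0;
}

static void sink_init(BufferSink *sink)
{
    sink->base.put_block = sink_put;
    sink->base.target = NULL;
    sink->used = 0;
    sink->data[0] = '\0';
}

/* Link the converter in front of `target`; done before any data is pushed. */
static void upper_init(Stream *conv, Stream *target)
{
    conv->put_block = upper_put;
    conv->target = target;
}
```

The producer only knows the first stream in the chain. Each stream passes its output on to its target, so converters can be added or removed without touching the producer or the sink.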
Data Flow (4)
• Cascaded streams (continued)
  – Setup after data arrives
Threads, Event Loops and Filters
• Not thread-safe
• Implements a pseudo-thread model
  – Uses non-blocking sockets
  – Based on callback functions
• Before/after filters
  – Global and local filters
  – Registered at runtime
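The before/after filter idea can be sketched as two lists of callbacks that are filled at runtime and run around the actual load. All names below (`FilterList`, `filter_add`, `filter_run`, `handle_request`, `only_http`, `count_done`) are hypothetical, and the distinction between global and local filters is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; libwww's own filter API differs in detail. */
#define FILTER_OK    0
#define FILTER_STOP  1
#define MAX_FILTERS  4

typedef int (*Filter)(const char *uri, void *param);

typedef struct {
    Filter fn[MAX_FILTERS];
    void  *param[MAX_FILTERS];
    size_t count;
} FilterList;

/* Register a filter at runtime. */
static int filter_add(FilterList *list, Filter fn, void *param)
{
    assert(list != NULL && fn != NULL);
    if (list->count == MAX_FILTERS) return -1;
    list->fn[list->count] = fn;
    list->param[list->count] = param;
    list->count++;
    return 0;
}

/* Run filters in registration order; stop at the first non-OK status. */
static int filter_run(const FilterList *list, const char *uri)
{
    for (size_t i = 0; i < list->count; i++) {
        int status = list->fn[i](uri, list->param[i]);
        if (status != FILTER_OK) return status;
    }
    return FILTER_OK;
}

/* Handle one request: before filters, the simulated load, after filters.
   Returns 1 if the load ran, 0 if a before filter stopped it. */
static int handle_request(const FilterList *before, const FilterList *after,
                          const char *uri, int *loads)
{
    if (filter_run(before, uri) != FILTER_OK) return 0;
    ++*loads;                    /* stand-in for the protocol module */
    filter_run(after, uri);
    return 1;
}

/* Example before filter: reject URIs that are not HTTP. */
static int only_http(const char *uri, void *param)
{
    (void)param;
    return strncmp(uri, "http://", 7) == 0 ? FILTER_OK : FILTER_STOP;
}

/* Example after filter: count finished requests. */
static int count_done(const char *uri, void *param)
{
    (void)uri;
    ++*(int *)param;
    return FILTER_OK;
}
```

Because the filters are plain callbacks, the application can change request handling (access checks, logging, redirects) without modifying the code that performs the load.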
Modules as State Machines
• Since libwww 3.0
• Protocol modules are implemented as state machines
• Part of the thread model
• Keep track of the current state in the communication interface
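A protocol module written as a state machine could look roughly like this. The states, events and the `HttpModule` struct are invented for illustration and are not taken from the libwww source; the point is only that all progress is stored in the module's state, so the event loop can switch between requests after every step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states and events; not taken from libwww's source. */
typedef enum { ST_BEGIN, ST_CONNECTING, ST_SENDING, ST_READING, ST_DONE, ST_ERROR } State;
typedef enum { EV_START, EV_CONNECTED, EV_WRITABLE, EV_READABLE, EV_CLOSED } Event;

typedef struct {
    State state;        /* where this request currently is */
    int   bytes_read;   /* progress that survives between calls */
} HttpModule;

/* Advance the module by exactly one event and return the new state.
   The function never blocks; it is called again when the next event occurs. */
static State http_step(HttpModule *m, Event ev)
{
    assert(m != NULL);
    switch (m->state) {
    case ST_BEGIN:
        m->state = (ev == EV_START) ? ST_CONNECTING : ST_ERROR;
        break;
    case ST_CONNECTING:
        m->state = (ev == EV_CONNECTED) ? ST_SENDING : ST_ERROR;
        break;
    case ST_SENDING:
        m->state = (ev == EV_WRITABLE) ? ST_READING : ST_ERROR;
        break;
    case ST_READING:
        if (ev == EV_READABLE)    m->bytes_read += 512;  /* pretend one block arrived */
        else if (ev == EV_CLOSED) m->state = ST_DONE;
        else                      m->state = ST_ERROR;
        break;
    case ST_DONE:
    case ST_ERROR:
        break;          /* terminal states ignore further events */
    }
    return m->state;
}
```

Two `HttpModule` instances can be stepped alternately from a single thread, which is the essence of the pseudo-thread model: the state variable takes the place of a per-thread call stack.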
Contents
1. Introduction
2. Structure of libwww
3. Programming with libwww
  – C++ Simulation
  – APIs and Library Interfaces
  – Simple Example
  – More Complex Example
4. Conclusion
C++ Simulation
• Construction/destruction
  – *_new / *_delete (HTRequest_new / HTRequest_delete)
• Data hiding
• Inheritance
  – Explicit pointer casting
• PRIVATE, PUBLIC macros
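The object-oriented conventions listed above can be imitated in plain C as in the following sketch. `Anchor`, `ChildAnchor` and their functions are hypothetical stand-ins rather than the libwww classes, and the macro definitions are an assumption (`PRIVATE` as `static`, `PUBLIC` as empty).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed macro definitions, for illustration only. */
#define PUBLIC            /* visible outside this file */
#define PRIVATE static    /* file-local: the C stand-in for "private" */

/* "Base class". */
typedef struct {
    char *address;
} Anchor;

/* "Derived class": the base must be the first member so that a
   ChildAnchor* can be cast explicitly to an Anchor*. */
typedef struct {
    Anchor  base;
    Anchor *parent;
} ChildAnchor;

/* Private helper, invisible to other files. */
PRIVATE char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL) strcpy(copy, s);
    return copy;
}

/* "Constructor": allocates and initializes the object. */
PUBLIC ChildAnchor *ChildAnchor_new(Anchor *parent, const char *address)
{
    ChildAnchor *me;
    assert(address != NULL);
    me = calloc(1, sizeof *me);
    if (me == NULL) return NULL;
    me->base.address = copy_string(address);
    if (me->base.address == NULL) { free(me); return NULL; }
    me->parent = parent;
    return me;
}

/* "Destructor": releases everything the constructor allocated. */
PUBLIC void ChildAnchor_delete(ChildAnchor *me)
{
    if (me == NULL) return;
    free(me->base.address);
    free(me);
}

/* "Base class method": works on any object that starts with an Anchor. */
PUBLIC const char *Anchor_address(const Anchor *me)
{
    return me->address;
}
```

A caller would write `Anchor_address((Anchor *) child)` to use the base-class method on a derived object. For real data hiding, the struct definitions would live in the .c file and the header would expose only the typedef names and the PUBLIC functions.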
APIs and Library Interfaces
• Set of APIs called packages
• Win32: DLLs
• Unix: separate static libraries
• Package interface exported via a single include file: WWW*.h
• Some important packages
  – Basic Utility Packages
  – Core Packages
  – Initialization Packages
  – Transport Packages
  – Protocol Packages
  – Parser Packages
Simple Example
• Displays all links in a document
• Applicable to text, HTML/XML tags, etc.
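    // snippet
    ...
    /* register the callback that is invoked for every link found */
    HText_registerLinkCallback(foundLink);
    ...
    /* enter the event loop; the request is processed from here */
    HTEventList_loop(request);
    ...
    foundLink (...) {
        /* resolve the link's destination anchor and print its address */
        HTAnchor * dest = HTAnchor_followMainLink(...);
        char * address = HTAnchor_address(dest);
        HTPrint("Found link `%s\'\n", address);
        HT_FREE(address);
    }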
More Complex Example
• Rudimentary command-line browser
• See project www.dsw
Contents
1. Introduction
2. Structure of libwww
3. Programming with libwww
4. Conclusion
  – What's missing?
  – Facts about libwww
  – Personal Opinion
What's missing?
• Not thread-safe
• No cookie jar, only parsing/generation
• Consistent usage of RegEx
• C++ representation
Facts about libwww
• Who uses libwww? No one?
• Sample applications on the project homepage
• No reviews, benchmarks, comparisons
• Not 'bug free'
• 'Competitors' (mostly UNIX)
  – WinInet
  – Libghttp
  – Libcurl
  – Libhttp
  – Neon
Personal Opinion
• Typical open source project
• Tricky installation
• 'Feels' old <-> IS old
• Desperate attempt at OOP
• Non-trivial to use, but very flexible and powerful
Thank you for your attention
Questions?