A Snapshot of Public Web Services
Prof: Dr. Jainguo Lu
03-60-569
Presenting Group:
Aktar-uz-zaman
Mohit Sud
Objective
- Find out the number of public web services
- Complexity
- Composability
- Meaningful documentation
- Future research trends
Introduction
Research directions in this area conflict, depending on assumptions about:
- Current status of web services
- Future evolution
To gauge the relative relevance of current research, the authors took a snapshot of public web services, describe the results of the study, and discuss their implications.
For example, the primary applications will be either:
- public web
- intra-corporate
How
- Describe how web services were crawled from a large number of registries, duplicates were removed, and the services were validated.
- Describe a variety of automated and manual analyses of the resulting web services.
- Describe the implications and lessons of these analyses for research.
Overview of Current Research Direction in Web Services
Web services are software services distributed on the internet.
Standards formalize web services at several levels:
- SOAP (Simple Object Access Protocol) for message communication
- WSDL (Web Services Description Language) for description
- BPEL4WS (Business Process Execution Language for Web Services) for composition
- OWL-S (Web Ontology Language for Services) for describing web services in an unambiguous, computer-interpretable form
- UDDI (Universal Description, Discovery and Integration) for publishing and discovering web services
Discovery and Composition
Two approaches:
1. Build on the syntax of WSDL and use BPEL4WS for composition.
   Underlying problem: search is mostly keyword-based over English text descriptions, which are not machine-interpretable.
   Research possibility: extract higher-level descriptions from WSDL.
2. Use a language like OWL-S to add more semantics to web services, so that their meaning and functionality are unambiguous and machine-interpretable.
Relevant Approach
It depends on what type of application web services will support in the near future.
Two ideas:
1. Intra-corporate scenario: services are annotated by the service provider using a consistent ontology.
2. Public web: a consistent ontology is a dream and less feasible.
Snapshot of Current Web Services
What public web services are available?
UDDI registries are not a good source:
- a large portion of entries are “hello-world” style
- many entries do not have a valid WSDL file URL
Therefore, the authors:
- first crawled the registries
- processed the collected data to remove invalid entries and duplicates
- analyzed the text descriptions according to their properties and functionalities
Crawling the Registry
A crawler, also known as a "spider" or a "bot", is a program that visits Web sites and reads their pages and other information in order to create entries for a search engine index.
Crawlers are typically programmed to visit sites that have been submitted by their owners as new or updated. Entire sites or specific pages can be selectively visited and indexed. Crawlers apparently gained the name because they crawl through a site a page at a time, following the links to other pages on the site until all pages have been read.
The crawler for the AltaVista search engine and its Web site is called Scooter.
Crawling the Registry cont
The following registries were crawled to collect the information:
www.bindingpoint.com
www.salcentral.com
www.xmethod.com
www.webservicex.com
www.webservicelist.com
Crawling the Registry cont
The crawler found 2432 registry entries in total.
After filtering out invalid entries, 1544 remained.
Crawling the Registry cont
The following information was saved in a local database:
- Service name
- Provider
- Text description
- Content of the WSDL file
Crawling the Registry cont
Invalid entries:
- WSDL is not well-formed or does not conform to the WSDL standard
- Duplicate registrations
Removing invalid entries:
1. Parsed every fetched WSDL file to check that it is a valid XML document
2. Ran a simple check against the WSDL standard by verifying the existence of several necessary tags
Removing duplicates:
- Used the combination of service name and provider name as a key
Next: automated clustering to classify the collected web services in terms of their functionalities.
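The validation and de-duplication steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' actual code: the slides do not say which tags were checked, so the `REQUIRED_TAGS` list, the entry dictionary layout (`name`, `provider`, `wsdl`), the case-insensitive key, and the WSDL 1.1 namespace are all assumptions.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace (assumed version)
# Assumed set of "necessary" tags; the slides do not list which ones were checked.
REQUIRED_TAGS = ("portType", "binding", "service")


def is_valid_wsdl(text):
    """Return True if text is well-formed XML and passes a simple WSDL tag check."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False  # not a well-formed XML document
    if root.tag != f"{{{WSDL_NS}}}definitions":
        return False  # root element is not a WSDL <definitions>
    # Every required tag must appear as a direct child of <definitions>.
    return all(root.find(f"{{{WSDL_NS}}}{tag}") is not None for tag in REQUIRED_TAGS)


def filter_entries(entries):
    """Drop invalid entries, then duplicates keyed on (service name, provider name)."""
    seen = set()
    kept = []
    for entry in entries:
        if not is_valid_wsdl(entry["wsdl"]):
            continue
        # Lowercasing and stripping the key is an assumption made for this sketch.
        key = (entry["name"].strip().lower(), entry["provider"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(entry)
    return kept
```

Keying on the (name, provider) pair keeps the first copy of a service that was registered in several registries and discards the rest.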
Clustering the Services cont
Why clustering?
- It would help the retrieval of services.
- The hypothesis was that automatically generated clusters would be able to suggest similar services.
How?
Text-based clustering techniques, applied to three parts of the service description:
- The text description given when the service is registered
- The documentation field of the service in its WSDL file
- The documentation field of the individual operations of the service in its WSDL file
Automated Clustering cont
Two techniques were used:
- Hierarchical Agglomerative Clustering (HAC)
- Jaccard similarity as the distance measure
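To make the combination of HAC and Jaccard similarity concrete, here is a small pure-Python sketch. It is only an illustration under stated assumptions: the slides do not give the linkage criterion, stopping rule, or tokenization, so average linkage, the `threshold` parameter, and whitespace tokenization are all choices made for this example.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words (assumed tokenization)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    if not union:
        return 0.0  # two empty descriptions carry no evidence of similarity
    return len(a & b) / len(union)


def hac(docs, threshold=0.3):
    """Agglomerative clustering; returns clusters as lists of document indices.

    Average linkage and the similarity threshold are assumptions of this sketch.
    """
    sets = [tokens(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        # Average pairwise Jaccard similarity between the two clusters.
        total = sum(jaccard(sets[i], sets[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

For example, `hac(["weather forecast by zip code", "weather forecast by city", "stock quote lookup", "stock quote service"])` groups the two weather descriptions together and the two stock descriptions together. A service with an empty description never merges with anything under this definition.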
Noise in clustering
- Noise arises when a service does not have enough information to differentiate itself from others during the clustering.
- Many services do not have any documentation in their WSDL files.
Manual Analysis
Type of Web Services Publicly
Complexity of the Web Service
How many individual operations are involved in each web service? (640 services analyzed)
Complexity of the Web Service
Manual Analysis
- 77% of services have fewer than 5 operations
- 36% have only one operation
- Most of the operations are related to each other
- No more than two operations are compatible among the services
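The slides describe this analysis as manual, but operation counts like the ones above could also be gathered automatically. The sketch below is an assumption-laden illustration: it assumes WSDL 1.1 and counts `operation` elements under `portType` only, so that the matching entries under `binding` are not counted twice. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace (assumed version)


def count_operations(wsdl_text):
    """Count abstract operations declared under <portType> in a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    path = f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation"
    return len(root.findall(path))


def share_with_fewer_than(counts, limit):
    """Fraction of services whose operation count is below limit."""
    return sum(1 for c in counts if c < limit) / len(counts)
```

Applied to the per-service counts, `share_with_fewer_than(counts, 5)` gives the kind of figure reported on the slide, and `share_with_fewer_than(counts, 2)` gives the share of single-operation services.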
Result and Motivation
At the current stage, there is no large number of public web services that are both very complicated and have the potential to be composed with other services.
The motivation for research on composing complicated web services therefore comes from intra-corporate scenarios.
Complicacy of Service Compositions
Quality of WSDL service descriptions:
- Are services ready to use or compose?
- Are service providers seriously using the WSDL files as the way to convey the correct interpretation to developers who will use them?
Analysis on Length of Text Description on 640 Services
- More than 80% have fewer than 50 words
- More than 52% have fewer than 20 words
Analysis on Text Description of Operations
- 80% have fewer than 10 words
- 50% have no documentation
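Percentages of this kind come from a simple word count over each description. A minimal sketch, assuming words are separated by whitespace and that an empty string stands for missing documentation (the helper name `share_below` is hypothetical):

```python
def share_below(descriptions, limit):
    """Fraction of descriptions with fewer than limit whitespace-separated words."""
    if not descriptions:
        return 0.0
    short = sum(1 for d in descriptions if len(d.split()) < limit)
    return short / len(descriptions)
```

With `limit=1` the function returns the share of descriptions that are empty, which corresponds to the "no documentation" figure.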
Population, Distribution and Structure
67% of registered web services are not valid, according to six months of data collection in another survey
Population, Distribution and Structure
63% of web services are hosted in the US
SOAP Message Size
SOAP message size = HTTP header + essential tag + payload tag
SOAP messages are generally smaller than current web objects:
92% of SOAP messages are < 2 KB, while only 45% of existing web objects are < 2 KB
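The size breakdown above can be illustrated by measuring each part of a small request. Everything in this sketch is made up for illustration: the host, the `GetQuote` payload, and the header set are hypothetical, and a real request would also carry a `Content-Length` header.

```python
# Hypothetical HTTP header for a SOAP 1.1 request (illustrative only).
HTTP_HEADER = (
    "POST /StockQuote HTTP/1.1\r\n"
    "Host: example.com\r\n"
    'Content-Type: text/xml; charset="utf-8"\r\n'
    'SOAPAction: ""\r\n'
    "\r\n"
)
# "Essential" tags: the SOAP envelope and body wrappers.
ENVELOPE_OPEN = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
)
ENVELOPE_CLOSE = "</soap:Body></soap:Envelope>"
# Hypothetical application payload.
PAYLOAD = '<GetQuote xmlns="urn:example"><symbol>IBM</symbol></GetQuote>'


def soap_message_size(http_header, envelope_open, payload, envelope_close):
    """Return the byte size of each part of the message and the total."""
    sizes = {
        "http_header": len(http_header.encode("utf-8")),
        "essential_tags": len((envelope_open + envelope_close).encode("utf-8")),
        "payload": len(payload.encode("utf-8")),
    }
    sizes["total"] = sum(sizes.values())
    return sizes


sizes = soap_message_size(HTTP_HEADER, ENVELOPE_OPEN, PAYLOAD, ENVELOPE_CLOSE)
print(sizes)
```

For a small request like this one, the header and the envelope tags together outweigh the payload, and the whole message stays well under 2 KB.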
Analysis
Since WSDL and registration information are the only sources from which a user can understand the functionality of a service, it is questionable whether currently available public web services are ready for composition.
Type: Most public web services are simply data sources that use SOAP.
Analysis
Retrieval: To meet the quality and performance challenges of retrieval/discovery as web services evolve, an advanced system of registries is needed, one that structures the entries and makes retrieval and discovery easier.
Composition: There are very few ways of composing web services because of the lack of services and of relations among them. If the WSDL file contains a proper XML description, composing is not a pressing problem.
Conclusion
- The hope is that this analysis provides useful information about fruitful future research directions for web service technology, including modeling, specification, discovery, composition, and verification.
- There is more opportunity to research and do similar studies on intra-corporate web services.
- The hope is that machine-interpretable annotation may well be feasible for more complex composition and conversion frameworks.
Reference
1. Jianchun Fan & Subbarao Kambhampati, Department of Computer Science and Engineering, Arizona State University
2. Su Myeon Kim, KAIST, EECS Dept., Korea, and Marcel-Catalin Rosu, IBM T.J. Watson Research Center, USA