A Snapshot of Public Web Services

Prof: Dr. Jainguo Lu

03-60-569

Presenting Group:

Aktar-uz-zaman

Mohit Sud

Objective

Find out about public web services:

- Number of public web services
- Complexity
- Composability
- Meaningful documentation
- Future research trends

Introduction

Research directions in the area conflict, depending on:

- Current status of web services
- Future evolution

To gauge the relative relevance of current research, the authors took a snapshot of public web services, describe the results of the study, and discuss their implications.

For example, will the primary applications be:

- on the public web
- intra-corporate

How

- Describe how web services were crawled from a large number of registries, duplicates were removed, and the services were validated.
- Describe a variety of automated and manual analyses of the resulting web services.
- Describe the implications and lessons of these analyses for research.

Overview of Current Research Direction in Web Services

Web services are software services distributed on the internet.

Standards formalize web services at several levels:

- SOAP (Simple Object Access Protocol) for message communication
- WSDL (Web Services Description Language) for description
- BPEL4WS (Business Process Execution Language for Web Services) for composition
- OWL-S (Web Ontology Language for Services) for describing web services in an unambiguous, computer-interpretable form
- UDDI (Universal Description, Discovery and Integration) for publishing and discovering web services

Discovery and Composition

Two approaches:

1. Promote the syntax of WSDL and use BPEL4WS for composition.
   Underlying problem: search is mostly keyword-based over English text descriptions, which are not machine-interpretable.
   Research possibility: extract higher-level semantics from WSDL.

2. Use a language like OWL-S to put more semantics into web services, so that their meaning and functionality are unambiguous and machine-interpretable.

Relevant Approach

Which approach is relevant depends on what type of applications web services will support in the near future.

Two ideas:

1. Intra-corporate scenario: services are annotated by the service provider using a consistent ontology.
2. Public web: a consistent ontology is a dream and less feasible.

Snapshot of Current Web Services

What public web services are available? UDDI registries are not a good source:

- a large portion of entries are “hello-world” style
- many do not have a valid WSDL file URL

Therefore, the authors:

- first crawled the registries
- processed the collected data to remove invalid entries and duplicates
- analyzed the text descriptions according to their properties and functionalities

Crawling the Registry

A crawler is a program that visits Web sites and reads their pages and other information in order to create entries for a search engine index. It is also known as a "spider" or a "bot."

Crawlers are typically programmed to visit sites that have been submitted by their owners as new or updated. Entire sites or specific pages can be selectively visited and indexed. Crawlers apparently gained the name because they crawl through a site a page at a time, following the links to other pages on the site until all pages have been read.

The crawler for the AltaVista search engine and its Web site is called Scooter.

Crawling the Registry cont

The following registries were crawled to collect the information:

- www.bindingpoint.com
- www.salcentral.com
- www.xmethod.com
- www.webservicex.com
- www.webservicelist.com

Crawling the Registry cont

The crawler found 2432 registry entries in total.

After filtering out invalid entries, 1544 remained.

The following information was saved in a local database:

- Service name
- Provider
- Text description
- Content of the WSDL file


Invalid entries:

- WSDL is not well-formed or does not conform to the WSDL standard
- Duplicate registrations

Removing invalid entries:

1. Parsed every fetched WSDL file to check that it is a valid XML document
2. Made a simple check against the WSDL standard by checking for the existence of several necessary tags

Removing duplicates:

- Used the combination of service name and provider name as a key

Next: automated clustering to classify the collected web services in terms of their functionalities.
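The validation and de-duplication steps above might look roughly like the following Python sketch. It is only an illustration under stated assumptions: the slides do not say which tags were checked, so the `REQUIRED_TAGS` list, the WSDL 1.1 namespace check, the case-insensitive key, and the `entries` dictionary layout are all hypothetical choices, not the authors' implementation.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace. Which "necessary tags" the study checked is not stated,
# so this list is an assumption made for illustration.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
REQUIRED_TAGS = ("portType", "binding", "service")


def is_valid_wsdl(text):
    """Step 1: the file must parse as XML. Step 2: necessary tags must exist."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False  # not a well-formed XML document
    if root.tag != f"{{{WSDL_NS}}}definitions":
        return False  # root element is not a WSDL <definitions>
    return all(root.find(f"{{{WSDL_NS}}}{tag}") is not None
               for tag in REQUIRED_TAGS)


def clean_entries(entries):
    """Drop invalid entries, then duplicates keyed on (service name, provider)."""
    seen = set()
    kept = []
    for entry in entries:
        if not is_valid_wsdl(entry["wsdl"]):
            continue
        # Lower-casing and stripping the key is an assumption of this sketch.
        key = (entry["name"].strip().lower(), entry["provider"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(entry)
    return kept
```

Keying on the pair rather than on the service name alone keeps same-named services from different providers as separate entries.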

Clustering the Services cont

Why clustering?

- It would help the retrieval of services
- Hypothesis: automatically generated clusters will be able to suggest similar services

How?

Text-based clustering techniques, using three parts of the service description:

- The text description given when the service is registered
- The documentation field of the service in its WSDL file
- The documentation field of the individual operations of the service in its WSDL file

Automated Clustering cont

Two techniques used:

- Hierarchical Agglomerative Clustering (HAC)
- Jaccard similarity as the distance measure
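As a rough illustration of how HAC with Jaccard similarity can group service descriptions, here is a minimal Python sketch. The slides do not give the linkage criterion, the tokenization, or the stopping rule, so average linkage, whitespace tokenization, and the `threshold` parameter are assumptions of this example.

```python
def tokenize(text):
    """Lower-case bag of words as a set (a simplifying assumption)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; the distance is 1 minus this."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def hac_cluster(docs, threshold=0.2):
    """Merge the two most similar clusters until none reach the threshold."""
    token_sets = [tokenize(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def similarity(c1, c2):
        # Average linkage: mean pairwise similarity between the two clusters.
        total = sum(jaccard(token_sets[i], token_sets[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = similarity(clusters[x], clusters[y])
                if s > best:
                    best, best_pair = s, (x, y)
        if best < threshold:
            break  # remaining clusters are too dissimilar to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


docs = [
    "get weather forecast by zip code",
    "weather forecast by city",
    "convert currency exchange rate",
    "currency exchange rate lookup",
]
print(hac_cluster(docs))  # [[0, 1], [2, 3]]
```

A description with no text at all shares no words with any other, so it never merges on evidence, which is one way the noise described in these slides shows up.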

Noise in clustering

- Noise arises when a service does not have enough information to differentiate itself from the others during clustering
- Many services do not have any documentation in their WSDL files

Manual Analysis

Types of Publicly Available Web Services

Complexity of the Web Service

How many individual operations are involved in each web service? (640 services)

Complexity of the Web Service: Manual Analysis

- 77% have fewer than 5 operations
- 36% have only one operation
- Most of the operations are related to each other
- No more than two operations are compatible among the services

Result and Motivation

At the current stage there is no large number of public web services available that are both very complicated and have the potential to be composed with other services.

The research motivation for composing complicated web services comes from intra-corporate scenarios.

Complicacy of Service Compositions

Quality of WSDL service descriptions:

- Are services ready to use or compose?
- Are service providers seriously using the WSDL files as the way to convey the correct interpretation to the developers who will use them?

Analysis on Length of Text Description on 640 Services

- More than 80% have fewer than 50 words
- More than 52% have fewer than 20 words

Analysis on Text Description of Operations

- 80% have fewer than 10 words
- 50% have zero documentation
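Length statistics of this kind can be computed with a few lines of Python. The sketch below is illustrative only: it assumes words are separated by whitespace and that the descriptions are already available as plain strings, and the sample data is made up rather than taken from the study.

```python
def share_below(descriptions, max_words):
    """Fraction of descriptions with fewer than max_words words."""
    if not descriptions:
        return 0.0
    short = sum(1 for d in descriptions if len(d.split()) < max_words)
    return short / len(descriptions)


def share_empty(descriptions):
    """Fraction of descriptions with no documentation text at all."""
    if not descriptions:
        return 0.0
    return sum(1 for d in descriptions if not d.strip()) / len(descriptions)


# Made-up sample: empty, 4 words, 30 words, 60 words.
sample = ["", "returns a stock quote", "word " * 30, "word " * 60]
print(share_below(sample, 20))  # 0.5
print(share_below(sample, 50))  # 0.75
print(share_empty(sample))      # 0.25
```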

Population, Distribution and Structure

67% of registered web services are not valid (from another survey, based on 6 months of data collection)

Population, Distribution and Structure

63% of web services are hosted in the US

SOAP Message Size

SOAP message size = HTTP header + essential tags + payload

SOAP messages are typically smaller than current web objects

92% of SOAP messages are < 2 KB, while only 45% of existing web objects are < 2 KB
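The size breakdown on this slide can be illustrated with a small Python sketch that measures each part of a message in bytes. It assumes a SOAP 1.1 envelope with an empty header and UTF-8 encoding; the function name, the request line, and the `getQuote` payload are hypothetical and not taken from the surveyed services.

```python
def soap_message_size(http_header, payload_xml):
    """Split a SOAP message's size into the three terms named on the slide."""
    # "Essential tags": the envelope and body wrappers every message carries.
    envelope_open = (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<soap:Envelope '
        'xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        '<soap:Body>'
    )
    envelope_close = '</soap:Body></soap:Envelope>'
    sizes = {
        "http_header": len(http_header.encode("utf-8")),
        "essential_tags": len((envelope_open + envelope_close).encode("utf-8")),
        "payload": len(payload_xml.encode("utf-8")),
    }
    sizes["total"] = sum(sizes.values())  # summed before "total" is added
    return sizes


sizes = soap_message_size(
    "POST /quote HTTP/1.1\r\nContent-Type: text/xml\r\n\r\n",
    "<getQuote>IBM</getQuote>",
)
print(sizes)
```

For a small request like this one, the fixed wrapper tags outweigh the payload, which is consistent with most messages staying under 2 KB.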

Analysis

Since WSDL and registration information are the only sources from which the user can understand the functionality of a service, it is questionable whether currently available public web services are ready for composition.

Type: Most public web services are simply data sources that use SOAP.

Analysis

Retrieval: Given the quality and performance challenges of retrieval/discovery and the evolution of web services, an advanced system of registries is needed, one that structures the entries and makes retrieval and discovery easier.

Composition: There are very few ways of composing web services because of the lack of services and of relations among them. If there is a proper XML description in the WSDL file, composing is not a pressing problem.

Conclusion

- It is hoped that this analysis provides useful information about fruitful future research directions in web service technology, including modeling, specification, discovery, composition and verification.
- There is more opportunity to research and do a similar study on intra-corporate web services.
- It is hoped that machine-interpretable annotation may well be feasible for more complex composition and conversion frameworks.

Reference

1. Jianchun Fan & Subbarao Kambhampati, Department of Computer Science and Engineering, Arizona State University

2. Su Myeon Kin, KAIST EECS Dept, Korea, and Marcel-Catalin Rosu, IBM T.J. Watson Research Center, USA