DESCRIPTION
The heterogeneous and dynamic nature of the components making up a Web Application, the lack of effective programming mechanisms for implementing basic software engineering principles in it, and undisciplined development processes induced by the high pressure of a very short time-to-market make Web Application maintenance a challenging problem. A relevant issue consists of reusing the methodological and technological experience gained in traditional software maintenance, and exploring the opportunity of using Reverse Engineering to support effective Web Application maintenance. The Ph.D. Thesis presents an approach for Reverse Engineering Web Applications. The approach includes the definition of Reverse Engineering methods and supporting software tools that help to understand existing undocumented Web Applications to be maintained or evolved, through the reconstruction of UML diagrams. Some validation experiments have been carried out; they showed the usefulness of the proposed approach and highlighted possible areas for improving its effectiveness.
Ph.D. Dissertation Forum – ICSM 2005
Ph.D. Dissertation
Reverse Engineering Web Applications
Porfirio Tramontana
University of Naples “Federico II”
Web Applications: open problems
In recent years, demand for Web Applications has grown considerably, driven by the spread of the World Wide Web, which makes many services available all over the world
Web Applications have been developed with immature design methodologies and technologies
Today, there are a number of legacy Web Applications that need maintenance and re-engineering
Ph.D. Thesis Goals
• To propose models, methods and tools supporting Reverse Engineering and Comprehension of Web Applications
• Reverse Engineering and comprehension are fundamental tasks needed to efficiently support maintenance, testing and quality assessment of Web Applications
Peculiarities of script-based Web Applications
• Page-based Client-Server Architecture
• Interpreted languages
• Client pages may be generated “on the fly”
• Client pages are executed in a browser (and the designer doesn’t know what kind of browser will be used)
• HTML interpreters are fault tolerant
• ... and so on ...
A process for the Reverse Engineering of Web Applications
[Figure: the Reverse Engineering process. Extraction phase: Static Analysis of the WA source code and Dynamic Analysis of the WA execution. Abstraction phase: identification of cloned components, identification of Interaction Design Patterns, assignment of concepts, functional clustering, business level UML diagram abstractions, maintainability assessment. Resulting artifacts: cloned components, Interaction Design Patterns, concepts describing Reverse Engineering artifacts, groups of pages realizing Web Application use cases, structural and business level UML diagrams.]
G.A. Di Lucca, A.R. Fasolino, P. Tramontana, “Reverse Engineering Web Application: the WARE approach”, Journal of Software Maintenance and Evolution: Research and Practice, Volume 16, Issue 1-2, Date: January - April 2004, Pages: 71-101
Analysis of Web Applications
1) Static analysis of the source code
A multi-language parser that analyses the source code of Server pages, Client pages and Script modules has been implemented.
During the analysis of server pages, facts about the client pages built by those server pages are also recorded.
Static analysis results are stored in an intermediate form and are used to populate a relational database
2) Dynamic Analysis
Built Client pages are analysed in order to add to the database facts observed by executing the application
The reference model adopted is an extension of the one proposed by Conallen for the forward engineering of Web Applications
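The static analysis step can be illustrated with a minimal Python sketch. It is not the WARE parser: it handles HTML only, and the class `PageFactExtractor`, the table name `relation`, its three columns and the content of the sample page are all assumptions made for illustration. It shows the general idea of extracting relationship facts (hyperlinks, form submissions, frames) from a page and storing them in a relational database.

```python
import sqlite3
from html.parser import HTMLParser


class PageFactExtractor(HTMLParser):
    """Collects (kind, target) facts from the HTML of one client page."""

    def __init__(self):
        super().__init__()
        self.facts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.facts.append(("hyperlink", attrs["href"]))
        elif tag == "form":
            self.facts.append(("submit", attrs.get("action") or ""))
        elif tag in ("frame", "iframe") and attrs.get("src"):
            self.facts.append(("frame", attrs["src"]))


def extract_facts(page_name, html_source):
    """Return (source page, relationship kind, target) triples."""
    parser = PageFactExtractor()
    parser.feed(html_source)
    parser.close()
    return [(page_name, kind, target) for kind, target in parser.facts]


def store_facts(conn, facts):
    """Store the triples in a hypothetical single-table schema."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relation (source TEXT, kind TEXT, target TEXT)"
    )
    conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", facts)
    conn.commit()


# Invented page content; only the page names echo the slides.
SAMPLE_PAGE = """
<html><body>
<a href="/areadocente.html">Teacher area</a>
<form action="/check.asp" method="post"><input name="code"></form>
</body></html>
"""

conn = sqlite3.connect(":memory:")
store_facts(conn, extract_facts("/autenticazionedocente.html", SAMPLE_PAGE))
rows = conn.execute("SELECT kind, target FROM relation ORDER BY kind").fetchall()
```

Once the facts are in a database, later abstraction steps can be expressed as queries over the stored relationships rather than as further passes over the source code.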
Model of Web Applications
[Figure: class diagram of the Web Application reference model. Classes: Web Page, Server Page, Client Page, Static Page, Built Page, Web Object, HTML Tag, Frameset, Frame, Form, Field, Textarea, Select, Button, Hyperlink, Anchor, Parameter, Download, Generic File, Media, Flash Object, Java Applet, Mail Address, Other Object, Server Script, Server Function, Server Class, Server Cookie, Session Variable, Interface Object, DB Interface, Mail Interface, Server File Interface, Other Interface, Client Script, Client Function, Client Module. Relationships: submits, include, source, redirect, event, modify.]
WARE (Web Application Reverse Engineering) tool
[Figure: WARE Architecture. Interface layer: WARE GUI and Graphical Visualizer (Dotty, VCG, RIGI). Service layer: Extractor (parsers for HTML, ASP, VBS, PHP, JS, ….) and Abstractor (IRF Translator, Query Executor, UML Diagrams Abstractor). Repository: IRF, DBR, Diagrams. Input: WA Source Files.]
[Figure: detail class diagram abstracted by WARE, relating the pages /autenticazionedocente.html, /check.asp and /areadocente.html through Submit, Builds and Redirect relationships.]
G. A. Di Lucca, A.R. Fasolino, U. De Carlini, F. Pace, P. Tramontana, “WARE: a tool for the Reverse Engineering of web Applications”, Proc. of 6th IEEE European Conference on Software Maintenance and Reengineering, CSMR 2002, IEEE CS Press, Los Alamitos, CA, Pages:241 - 250
Functional Clustering of Web Pages
• Goal:
To cluster together subsets of components realizing Web Application functionalities
• Proposed Technique:
Hierarchical clustering algorithm, grouping Web Application pages in subsets, maximizing the cohesion and minimizing the coupling between them
G. A. Di Lucca, A.R. Fasolino, U. De Carlini, F. Pace, P. Tramontana, “Comprehending Web Applications by a Clustering Based Approach”, Proc. of 10th IEEE Workshop on Program Comprehension, IWPC 2002, Pages:261 - 270
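A minimal sketch of hierarchical (agglomerative) clustering of pages is given below. It is not the algorithm of the cited paper: the coupling measure (links between two clusters divided by the number of page pairs), the stopping rule and the page names are assumptions chosen for illustration. At each step, the two most strongly coupled clusters are merged, so links that used to cross cluster boundaries become internal to a cluster.

```python
from itertools import combinations


def coupling(cluster_a, cluster_b, links):
    """Average number of links per page pair between two clusters (assumed measure)."""
    crossing = sum(
        1
        for src, dst in links
        if (src in cluster_a and dst in cluster_b)
        or (src in cluster_b and dst in cluster_a)
    )
    return crossing / (len(cluster_a) * len(cluster_b))


def cluster_pages(pages, links, target_clusters):
    """Merge the most coupled pair of clusters until the target count is reached."""
    clusters = [frozenset([page]) for page in pages]
    while len(clusters) > max(target_clusters, 1):
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: coupling(pair[0], pair[1], links),
        )
        if coupling(a, b, links) == 0:
            break  # the remaining clusters are not connected at all
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Hypothetical application: an authentication area and a news area.
PAGES = ["/login.html", "/check.asp", "/area.html", "/news.html", "/news_detail.asp"]
LINKS = [
    ("/login.html", "/check.asp"),
    ("/check.asp", "/login.html"),
    ("/check.asp", "/area.html"),
    ("/news.html", "/news_detail.asp"),
    ("/news_detail.asp", "/news.html"),
]
clusters = cluster_pages(PAGES, LINKS, target_clusters=2)
```

In this example the three authentication pages end up in one cluster and the two news pages in the other, which is the kind of grouping a maintainer would then inspect as a candidate functionality.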
Concept Assignment
• Goal:
To identify the most relevant concepts in client pages, in order to suggest a semantic description of client pages and of functional clusters of pages
• Proposed Technique:
Heuristic Algorithms based on Information Retrieval
Candidate concepts are searched in the textual content of client pages
Single common words and short word sequences are candidate concepts
[Figure: class diagram of the model used for concept assignment. Labels: Web Page, Client Page, Static Client Page, Built Client Page, Server Page, <<builds>>, Control Component, Data Component, Word, StopWord, Concept, File name, Text, TagName, AttributeName, Weight, nested in, has synonym, has stem.]
G.A. Di Lucca, A.R.Fasolino, P.Tramontana, U.De Carlini, “Supporting Concept Assignment in the Comprehension of Web Applications”, Proceedings of the 28th IEEE Annual International Computer Software and Applications Conference, COMPSAC 2004
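The Information Retrieval idea behind concept assignment can be sketched as follows. This is only an illustration under stated assumptions: the tag weights in `TAG_WEIGHTS`, the stop word list and the sample page are invented, only single words are scored, and stemming and synonym handling are left out. Each word found in the text of a client page is scored with the weight of the heaviest tag that encloses it, and stop words are discarded.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumed weights: text in prominent tags counts more than plain text.
TAG_WEIGHTS = {"title": 5.0, "h1": 4.0, "h2": 3.0, "b": 2.0}
DEFAULT_WEIGHT = 1.0
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "for", "in"}


class WeightedTextCollector(HTMLParser):
    """Accumulates a weighted score for every non-stop word in a page."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = Counter()

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Drop the innermost matching tag and anything left unclosed inside it.
            index = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[index:]

    def handle_data(self, data):
        weight = max(
            (TAG_WEIGHTS.get(tag, DEFAULT_WEIGHT) for tag in self.open_tags),
            default=DEFAULT_WEIGHT,
        )
        for word in re.findall(r"[a-z]+", data.lower()):
            if word not in STOP_WORDS:
                self.scores[word] += weight


def candidate_concepts(html_source, top_n):
    """Return the top_n (word, score) pairs of a client page."""
    collector = WeightedTextCollector()
    collector.feed(html_source)
    collector.close()
    return collector.scores.most_common(top_n)


# Invented client page used only to exercise the sketch.
SAMPLE_PAGE = """<html><head><title>Exam reservation</title></head>
<body><h1>Reserve an exam</h1>
<p>Select the course and the exam date.</p></body></html>"""

top_concepts = candidate_concepts(SAMPLE_PAGE, 3)
```

Here "exam" ranks first because it occurs in the title, in the heading and in the body text, so it would be proposed as the main concept describing the page.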
Interaction Design Patterns Identification
• Goal:
To identify repetitive structures in Web Client pages
These structures can be related to known Programming Patterns
• Proposed Technique:
Statistical methodology based on features extracted from the source code of client pages.
Presence, quantity and dimension of forms, tables, input fields, frames, common keywords and so on.
G.A. Di Lucca, A.R.Fasolino, P.Tramontana, “Recovering Interaction Design Patterns in Web Applications”, submitted to 9th IEEE European Conference on Software Maintenance and Reengineering, CSMR 2005
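The feature extraction that such a statistical method starts from can be sketched in a few lines of Python. The feature set below (counts of four tags plus the presence of three keywords), the names `FeatureCounter` and `page_features`, and the sample page are assumptions for illustration; the statistical step that maps feature vectors to patterns is not shown.

```python
from html.parser import HTMLParser

# Assumed keyword list; a real study would derive it from known patterns.
KEYWORDS = ("login", "password", "search")


class FeatureCounter(HTMLParser):
    """Counts selected tags and collects the visible text of a client page."""

    COUNTED_TAGS = ("form", "table", "input", "frame")

    def __init__(self):
        super().__init__()
        self.counts = {tag: 0 for tag in self.COUNTED_TAGS}
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.counts:
            self.counts[tag] += 1

    def handle_data(self, data):
        self.text.append(data.lower())


def page_features(html_source):
    """Return a feature dictionary: tag counts plus 0/1 keyword indicators."""
    counter = FeatureCounter()
    counter.feed(html_source)
    counter.close()
    text = " ".join(counter.text)
    features = dict(counter.counts)
    for keyword in KEYWORDS:
        features["kw_" + keyword] = int(keyword in text)
    return features


# Invented login page used only to exercise the sketch.
LOGIN_PAGE = """<form action="/check.asp">
Login: <input name="user"> Password: <input type="password" name="pwd">
<input type="submit" value="Enter">
</form>"""

login_features = page_features(LOGIN_PAGE)
```

A page with one form, a few input fields and the keywords "login" and "password" would be a plausible candidate for a login pattern; deciding that reliably is the job of the statistical analysis.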
Identification of cloned components
• Goals:
Re-Engineering of cloned components via code transformations
Classification of Built Client Pages
Identification of reusable Programming Patterns
• Proposed Techniques:
Extraction of features from the structure of Client pages and from the source code of server pages
Computation of distance measures between pages (Euclidean distance, Levenshtein edit distance)
G.A. Di Lucca, A.R. Fasolino, P. Tramontana, U. De Carlini, “Identifying Reusable Components in Web Applications”, IASTED International Conference on Software Engineering, SE 2004, pp.526-531
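The two distance measures named above can be sketched on a simple structural feature, the sequence of HTML tags of a page. Choosing the tag sequence as the feature, and the three sample pages, are assumptions for illustration. Levenshtein distance compares the ordered tag sequences, while Euclidean distance compares the tag frequency vectors; a distance of zero marks two pages as clone candidates.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Records the opening tags of a page in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html_source):
    parser = TagSequence()
    parser.feed(html_source)
    parser.close()
    return parser.tags


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def euclidean(a, b):
    """Euclidean distance between the tag frequency vectors of two sequences."""
    freq_a, freq_b = Counter(a), Counter(b)
    return math.sqrt(
        sum((freq_a[t] - freq_b[t]) ** 2 for t in set(freq_a) | set(freq_b))
    )


# Invented pages: A and B share the same structure, C does not.
seq_a = tag_sequence("<table><tr><td>Exam</td><td>Date</td></tr></table>")
seq_b = tag_sequence("<table><tr><td>Course</td><td>Year</td></tr></table>")
seq_c = tag_sequence("<form><input><input></form>")
```

Pages A and B differ only in their text, so both distances between them are zero, whereas page C is far from both. In practice a threshold on the distance, rather than exact equality, would be used to group near clones.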
Abstraction of Business Level Models
• Goals:
To abstract object oriented business level models of Web Applications
• Proposed Techniques:
Classes and attributes are identified by analysing the data that are exchanged between user, Web pages and databases.
Class methods are identified by analysing the functions implemented by clusters of pages
Relationships between classes are identified by analysing data structures and data flow among pages
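[Figure: example of a recovered business level class diagram. Classes and attributes:
Tutoring request: Date
Teacher: Name, Surname, E-mail, Phone number, Password, Code
Tutoring: Date, Start time, End time
News: Number, Date, Text
Student: Name, Surname, E-mail, Password, Code, Phone number
Exam: Date, Time, Classroom
Course: Academic year, Code, Name
Exam Reservation: Date]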
G.A. Di Lucca, A.R.Fasolino, U.De Carlini, P.Tramontana, “Recovering a Business Object Model from Web Applications”, Proceedings of the 27th IEEE Annual International Computer Software and Applications Conference, COMPSAC 2003, Pages: 348 - 353
Maintainability Model
• Goals:
To propose models and methods for the assessment of the maintainability of Web Applications
• Proposed Models and Techniques:
Adapting to Web Applications the Oman model (conceived for traditional applications)
Selection of a set of product metrics and proposal of a maintainability index that can be calculated with negligible effort and time
G.A. Di Lucca, A.R.Fasolino, P.Tramontana, C.A.Visaggio, “Towards the definition of a maintainability model for web applications”, Proceedings of the Eighth IEEE European Conference on Software Maintenance and Reengineering, CSMR 2004, pages:279 - 287
Current and future work
• Techniques for the dynamic analysis of Web Applications
• Accessibility assessment of Client pages
• Migration from Web Applications to Web Services
• Testing of Web Applications: Mutation Testing techniques
• Maintainability assessment: definition of ageing measures for Web Applications
G.A. Di Lucca, M. Di Penta, A.R. Fasolino, P. Tramontana, “Supporting Web Application Evolution by Dynamic Analysis”, IWPSE 2005
G.A. Di Lucca, A.R. Fasolino, P. Tramontana, “Web Site Accessibility: Identifying and Fixing of Accessibility Problems in Client Page Code”, WSE 2005