REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Page 1: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

REVERE Recovering Legacy Requirements

an EPSRC-SEBPC project

Page 2: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

REFSQ’99

Paul Rayson, Roger Garside, Pete Sawyer

Positioning

[Figure: positioning diagram. Labels: User/Customer; Requirements Engineer; Software Architect; Environment; Needs; Specification; Design / Architecture]

Page 3: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Acronyms

REVERE: Reverse engineering of requirements to support business process change

SEBPC: Systems Engineering for Business Process Change

CSEG: Co-operative Systems Engineering Group

UCREL: University Centre for Computer Corpus Research on Language

Page 4: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Who?

Supervised by Roger Garside and Pete Sawyer

A joint CSEG & UCREL project

Adelard consultancy providing technical advice, documentary data, evaluation of the integrated method and piloting of the toolset resulting from the project.

Page 5: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

What?

Improve the requirements analysis for legacy system evolution where underlying BP has already changed.

[Figure: organisation and software change diagram. Labels: Pre-change organisation; Post-change organisation; Existing operational software; Target operational software; De-facto organisation change; Required software change; Motivating requirements; New requirements]

Page 6: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Proposal

Reverse engineering of requirements documents by the novel integration of techniques for the textual analysis of documentation; modelling of business processes; and modelling the organisational structures serving the business processes.

Project started in May 1998

Review other applications of NLP

Rule-based (Goldin and Berry, 1997: AbstFinder) or sub-language examples (Cyre 1995)

Page 7: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Why?

BP change means redesigning support systems, operating procedures and documentation.

High cost of recovering the motivating requirements.

Key people who possess the knowledge may be unavailable.

Information is often implicit in documents such as requirements specifications, operating manuals and data models.

[Figure: relationship diagram. Node labels: Business processes; Models; Organisational structures; Support systems; Requirements; Roles; Documents. Edge labels: have; enact; support; described in; have stakes in; impose; define]

Page 8: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

What next?

UCREL tools provide robust analysis over unrestricted domains

Mainly statistically based with template analysis components

Layered: POS, lemmatisation, anaphor resolution, semantic analysis

Corpus annotation is a fast and accurate way of improving information extraction from text

Porting from UNIX to Linux & PC

Integrate with Adelard’s Claviar

Page 9: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

CLAWS POS tagging

Grammatical tagging, is the commonest form of corpus annotation, and was the first form of annotation to be developed by UCREL at Lancaster. Our POS tagging software for English text, CLAWS (the Constituent Likelihood Automatic Word-tagging System), has been continuously developed since the early 1980s. The latest version of the tagger, CLAWS4, was used to POS tag c.100 million words of the British National Corpus (BNC).

Page 10: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

CLAWS POS tagging

Grammatical_JJ tagging_NN1@ ,_, is_VBZ the_AT commonest_JJT form_NN1 of_IO corpus_NN1 annotation_NN1 ,_, and_CC was_VBDZ the_AT first_MD form_NN1 of_IO annotation_NN1 to_TO be_VBI developed_VVN by_II UCREL_NP1 at_II Lancaster_NP1 ._. Our_APPGE POS_NN2 tagging_VVG software_NN1 for_IF English_JJ text_NN1 ,_, CLAWS_NN2 (_( the_AT Constituent_NN1 Likelihood_NN1 Automatic_JJ Word-tagging_JJ System_NN1 )_) ,_, has_VHZ been_VBN continuously_RR developed_VVN since_CS the_AT early_JJ 1980s_MC2 ._. The_AT latest_JJT version_NN1 of_IO the_AT tagger_NN1 ,_, CLAWS4_FO ,_, was_VBDZ used_JJ to_II POS_NN2 tag_VV0 c.100_FO million_NNO words_NN2 of_IO the_AT British_JJ National_JJ Corpus_NN1 ._.
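
The word_TAG output on this slide can be read back into (word, tag) pairs with very little code. The sketch below is illustrative only and is not part of the CLAWS toolset: the function name parse_tagged is hypothetical, and it assumes whitespace-separated tokens in which the last underscore separates the word from its tag. Any trailing marker such as "@" is kept as part of the tag string rather than interpreted.

```python
from collections import Counter


def parse_tagged(text):
    """Split whitespace-separated word_TAG tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the LAST underscore so punctuation tokens such as ",_," work.
        word, sep, tag = token.rpartition("_")
        if sep:  # skip anything that carries no tag at all
            pairs.append((word, tag))
    return pairs


# First clause of the tagged sample shown on the slide.
sample = ("Grammatical_JJ tagging_NN1@ ,_, is_VBZ the_AT commonest_JJT "
          "form_NN1 of_IO corpus_NN1 annotation_NN1 ,_,")

pairs = parse_tagged(sample)
tag_counts = Counter(tag for _, tag in pairs)
print(tag_counts.most_common(3))
```

Counting tags in this way is the starting point for the frequency profiles discussed on the later slides.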

Page 11: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Semantic tagging

Grammatical_Q3 tagging_Z99 ,_PUNC is_A3+ the_Z5 commonest_A6.2+++ form_A4.1 of_Z5 corpus_Q3 annotation_Q1.2 ,_PUNC and_Z5 was_A3+ the_Z5 first_P1c[i1.2.1 form_P1c[i1.2.2 of_Z5 annotation_Q1.2 to_Z5 be_Z5 developed_A2.1+ by_Z5 UCREL_Z99 at_Z5 Lancaster_Z2 ._PUNC Our_Z8 POS_I2.2 tagging_Q1.1 software_Y2 for_Z5 English_Z2 text_Q1.2 ,_PUNC CLAWS_L2 (_PUNC the_Z5 Constituent_G1.2/S2mf Likelihood_A7 Automatic_A1.1.1 Word-tagging_Z99 System_X4.2 )_PUNC ,_PUNC has_Z5 been_Z5 continuously_T2++ developed_A2.1+ since_Z5 the_Z5 early_T1.3[i2.2.1 1980s_T1.3[i2.2.2 ._PUNC The_Z5 latest_T3--- version_A4.1 of_Z5 the_Z5 tagger_Z99 ,_PUNC CLAWS4_Z99 ,_PUNC was_A3+ used_T1.1.1[i3.2.1 to_T1.1.1[i3.2.2 POS_I2.2 tag_Q1.1 c.100_Z99 million_N1 words_Q3 of_Z5 the_Z5 British_Z2 National_Z3c Corpus_Q3 ._PUNC
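
Some tokens in the semantically tagged sample carry a bracketed suffix, for example early_T1.3[i2.2.1 followed by 1980s_T1.3[i2.2.2. Judging only from the sample, the suffix appears to have the shape [i<unit>.<length>.<position> and to mark the members of a multi-word unit. The sketch below rests on that reading, which is an assumption rather than documented tagger behaviour; parse_semantic and MWE_SUFFIX are hypothetical names, and the code also assumes that the members of a unit are adjacent.

```python
import re

# Assumed suffix shape: TAG[i<unit>.<length>.<position>
MWE_SUFFIX = re.compile(
    r"^(?P<tag>[^\[]+)\[i(?P<unit>\d+)\.(?P<length>\d+)\.(?P<pos>\d+)$"
)


def parse_semantic(text):
    """Return (text, tag) pairs, merging adjacent members of a multi-word unit."""
    units = []  # each entry is [list_of_words, tag]
    for token in text.split():
        word, sep, tag = token.rpartition("_")
        if not sep:
            continue  # token carries no tag
        match = MWE_SUFFIX.match(tag)
        if match and int(match["pos"]) > 1 and units:
            # Later member of a unit: attach it to the unit opened before it.
            units[-1][0].append(word)
        else:
            units.append([[word], match["tag"] if match else tag])
    return [(" ".join(words), tag) for words, tag in units]


sample = "since_Z5 the_Z5 early_T1.3[i2.2.1 1980s_T1.3[i2.2.2 ._PUNC"
result = parse_semantic(sample)
print(result)
```

Under the stated assumption, "early" and "1980s" come back as one item tagged T1.3, so a frequency count by semantic tag treats the phrase as a single unit.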

Page 12: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Statistical analysis

Build training & test corpus and separate normative corpus for vocabulary norms:

requirements documents, operating manuals

IBM manuals corpus (800K)

Subcorpus of BNC (applied science 11 million words)

CSEG technical reports?

Transcripts of ethnographic studies of technical workplaces

Public domain IT standards documents

Retrain CLAWS probability matrix, vocabulary and idiom usage and investigate frequency distributions for text norms
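
The slide does not say how the CLAWS probability matrix is stored or retrained. As a rough illustration of what re-estimating tag probabilities from a domain corpus can look like, the sketch below computes plain maximum-likelihood tag-to-tag transition probabilities from already tagged sentences. It is a generic bigram estimate, not the CLAWS training procedure; the function name and the toy tag sequences are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(tag_sequences):
    """Estimate P(next_tag | previous_tag) from lists of tags, one list per sentence."""
    counts = defaultdict(Counter)
    for tags in tag_sequences:
        # Count each adjacent pair of tags within a sentence.
        for previous, following in zip(tags, tags[1:]):
            counts[previous][following] += 1
    return {
        previous: {tag: n / sum(followers.values()) for tag, n in followers.items()}
        for previous, followers in counts.items()
    }


# Toy tag sequences, invented for illustration.
sequences = [
    ["AT", "NN1", "IO", "NN1"],
    ["AT", "JJ", "NN1"],
]
probabilities = transition_probabilities(sequences)
print(probabilities["AT"])
```

Comparing such estimates between a general corpus and a corpus of requirements documents is one way to see where domain text departs from general usage.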

Page 13: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Preliminary results

Semantic comparison of LIBSYS and BNC IT corpus

Semantic tag  Semantic category            Example items                      LIBSYS relative frequency  Log-likelihood  BNC IT relative frequency
Q1.2          Paper documents and writing  documents, records, prints         4.74                       717.5           1.02
T1.1.3        Time future                  will, shall                        3.70                       483.8           0.91
A1.5.1        Using                        user, end-user                     2.64                       260.1           0.81
I2.1          Business                     agents, commercial                 0.12                       208.8           1.44
S7.1+         Power, organising            administrator, management, order   2.31                       159.5           0.89
Q4.1          The media                    author, catalogues, librarian      0.98                       144.6           0.22
X9.1+         Ability, intelligence        be-able-to                         0.75                       129.9           0.14
X2.4          Investigate                  search                             0.04                       119.0           0.74
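
The slide gives neither the formula nor the raw counts behind the Log-likelihood column. The sketch below shows a two-corpus log-likelihood statistic that is widely used for frequency comparison, on the assumption that a calculation of this kind produced the column. The function name is hypothetical, and the counts and corpus sizes in the usage lines are invented, so the printed value is not expected to match any row of the table.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood for one item observed freq_a times in corpus A and freq_b times in corpus B."""
    total = freq_a + freq_b
    # Expected counts if the item were equally common in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Invented counts: a tag seen 50 times in 1,000 tokens against 10 times in 1,000 tokens.
skewed = log_likelihood(50, 1000, 10, 1000)
# The same relative frequency in both corpora should score zero.
balanced = log_likelihood(10, 1000, 20, 2000)
print(round(skewed, 2), round(balanced, 2))
```

A larger score means a stronger departure from equal relative frequency, which is why the table is sorted by this column; the direction of the difference has to be read from the two relative-frequency columns.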

Page 14: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Objects and operations

Page 15: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Discussion for paper 1 (Carroll & Swatman)

Which quality features are addressed by the paper?
Quality management of the early phases of the RE process in order to target business problems correctly.

What is the main novelty/contribution of the paper?
“RE is opportunistic not deterministic”.

How will this novelty/contribution improve RE practice/research?
Avoid focussing on one methodology.

What are the main problems with the novelty/contribution and/or paper?
Case study may be unrepresentative in terms of composition of the team.

Can the proposed approach be expected to scale to real-life problems?
If a company has invested and trained in one methodology then they will probably use it, whether it fits the problem or not.

Page 16: REVERE Recovering Legacy Requirements an EPSRC-SEBPC project

Discussion for paper 3 (Claus et al)

Which quality features are addressed by the paper?
Quality assurance: establishing organisational procedures and standards

What is the main novelty/contribution of the paper?
Demonstrates practical ‘management’ problems of introducing requirements management.

How will this novelty/contribution improve RE practice/research?
Emphasise involvement of stakeholders from an early stage.

What are the main problems with the novelty/contribution and/or paper?
Technical problems were trivial in this case study.

Can the proposed approach be expected to scale to real-life problems?
Don’t underestimate the difficulty of making the change happen.