Managing large amounts of natural language requirements through natural language processing and information retrieval support

Master's Thesis
Tobias Karlsson

Supervisors:
Johan Natt och Dag, LTH
Björn Regnell, LTH
Micael Åkesson, Sony Ericsson

Department of Communication Systems
CODEN: LUTEDX(TETS-5506)/1-65/(2004) & local 5




Abstract

Software development engineering is a rather new subject, and companies that develop software products often have some sort of problem with their software development process. It is difficult for a company to define a well-working software development process for its organization due to the complexity of software development. If a company develops products for an open market with many stakeholders, problems will probably arise in the requirements engineering process due to the huge number of requirements these stakeholders contribute. Some consumer products, like mobile phones, have a short life cycle, and mobile phone producers therefore have to keep their development cycles short.

The objective of this Master's Thesis is to investigate whether natural language processing and information retrieval techniques may assist in the management of natural language requirements. The requirement sources treated in this thesis are so-called requests for information (RFI), which are sent in by mobile phone operators.

A literature study was done to investigate various techniques within the fields of natural language processing and information retrieval and to gain knowledge within these areas.

A program, mostly implemented by Johan Natt och Dag, was used to test the proposed methods on different requirement sets. Three natural language processing methods (lexical analysis, morphological analysis and the use of stoplists) were tested with two information retrieval similarity measure models, the Cosine model and the Normalized Matching model. Additional tests were conducted to find out which of the following weighting models gives the best result: binary weighting, term frequency weighting or logarithmic term frequency weighting. These tests were done with the similarity measure model that proved best in the previous tests, i.e. the Cosine model.
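The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: the stoplist, the crude suffix-stripping step and all names are simplified stand-ins (a real system would use a proper stemmer, such as Porter's, for the morphological analysis).

```python
import math
import re
from collections import Counter

# Illustrative stoplist only; the thesis uses a fuller one.
STOPLIST = {"the", "a", "an", "shall", "be", "to", "of", "and"}

def preprocess(text):
    # Lexical analysis: lowercase and split into word tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Stopword removal, followed by a crude morphological step
    # (naive plural stripping as a stand-in for real stemming).
    return [t.rstrip("s") for t in tokens if t not in STOPLIST]

def weights(tokens, scheme="log_tf"):
    # The three weighting models compared in the thesis.
    tf = Counter(tokens)
    if scheme == "binary":
        return {t: 1.0 for t in tf}
    if scheme == "tf":
        return {t: float(f) for t, f in tf.items()}
    # Logarithmic term frequency weighting: 1 + ln(tf).
    return {t: 1.0 + math.log(f) for t, f in tf.items()}

def cosine(req1, req2, scheme="log_tf"):
    # Cosine similarity between the two weighted term vectors.
    w1 = weights(preprocess(req1), scheme)
    w2 = weights(preprocess(req2), scheme)
    dot = sum(w1[t] * w2[t] for t in w1.keys() & w2.keys())
    norm = (math.sqrt(sum(v * v for v in w1.values()))
            * math.sqrt(sum(v * v for v in w2.values())))
    return dot / norm if norm else 0.0
```

Identical requirements score 1.0, requirements with no shared terms score 0.0, and partial overlaps fall in between, which is what makes a threshold-based duplicate search possible.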

The tests with requirements originating from one operator gave very good results, i.e. recall of approximately 100%. When testing was conducted with requirements from different operators, the recall rate dropped to approximately 45%. The conclusion is that the Cosine model with either binary or logarithmic term frequency weighting, combined with the proposed natural language processing techniques, gave the best results. The time and effort needed to complete a compliance process for an RFI can be reduced if this semi-automatic method is used together with the manual one; therefore the implemented models are considered an aid in Sony Ericsson's process for managing the operators' requests for information.
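Recall, the evaluation measure quoted above, is the fraction of the known relevant requirement pairs (e.g. pairs manually judged similar) that the model actually retrieves. A minimal sketch, using hypothetical pair data rather than the thesis's actual test sets:

```python
def recall(retrieved_pairs, relevant_pairs):
    """Fraction of the known-similar requirement pairs that the
    similarity model retrieved."""
    relevant = set(relevant_pairs)
    if not relevant:
        return 1.0
    return len(set(retrieved_pairs) & relevant) / len(relevant)

# Hypothetical illustration: the model retrieves 9 of 10 true pairs
# plus one false positive (which recall, by design, ignores).
relevant = {(i, i + 100) for i in range(10)}
retrieved = {(i, i + 100) for i in range(9)} | {(50, 60)}
print(recall(retrieved, relevant))  # → 0.9
```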


Sammanfattning

Software development is a relatively new subject, and companies that develop software products often have problems with their software development process. Creating a well-functioning software development process is a difficult task for a company, since software development is a very complex field. When companies develop products for an open market, problems often arise in requirements management, because large amounts of requirements arrive from many different sources. Some products, such as mobile phones, have very short life cycles, which pushes mobile phone manufacturers toward ever shorter development times.

The purpose of this Master's Thesis is to investigate whether natural language processing and information retrieval techniques can assist in the management of natural language requirements. The requirements in this thesis were taken from several so-called requests for information (RFI) that various operators have sent to Sony Ericsson.

A literature study was conducted to raise the level of knowledge within the fields of natural language processing and information retrieval.

To test the proposed methods, a program implemented largely by Johan Natt och Dag was used. Three natural language processing techniques were tested: lexical analysis, morphological analysis, and removal of so-called stopwords. In the first tests, two information retrieval techniques were used: the Cosine model and a normalized matching model. Further testing was performed with the model that gave the best results, the Cosine model, to see which weighting method was best. Binary, term frequency, and logarithmic term frequency weighting were the weighting models tested.

The tests with requirements from a single operator gave very good results, with recall of approximately 100%. When tests were conducted with requirements from different operators, the recall rate was approximately 45%. The conclusion is that the Cosine model, together with binary or logarithmic term frequency weighting and the proposed natural language processing methods, gave the best results. The time and resources required to complete a compliance process for an RFI can be reduced if the tested semi-automatic method is used together with the manual method. In view of these results, the implemented models are considered a support in Sony Ericsson's process for managing the operators' requests for information.


Preface

This Master's Thesis, entitled "Managing large amounts of natural language requirements through natural language processing and information retrieval support", was done as a mandatory part of the Master of Science education in Electrical Engineering at Lund Institute of Technology.

The work has been carried out at Sony Ericsson Mobile Communications AB in Lund between September 2003 and April 2004.

Several people have assisted me during my work and those I would especially like to thank are:

Johan Natt och Dag, my supervisor at Lund Institute of Technology, Lund University, who contributed the software used during testing, as well as his expertise in software engineering, natural language processing and information retrieval.

Micael Åkesson, my supervisor at Sony Ericsson Mobile Communications AB, who always had time to answer my questions and has given me lots of support during my thesis work.

Michael Kindlein, for all his help with the operator requirements and the process of handling these.

Niclas Forsberg, for the help with the verification of the results.

Fredrik Wendel and his staff at the division of Application Development, for contributing data from the manual compliance process.

Staffan Björklund, for giving me a better comprehension of requirements engineering at Sony Ericsson.

The staff at the division of System Requirements & Roadmaps, for making my stay at Sony Ericsson a pleasant one.

Sara Ekholm, for making delicious lunch boxes and putting up with me.

Thank you!

Tobias Karlsson

Lund, May 2004


Table of Contents

1 Introduction .......... 10
1.1 Background .......... 10
1.2 Problem .......... 10
1.3 Objective .......... 11
1.4 Delimitations .......... 12
2 Theory .......... 14
2.1 Software Engineering .......... 14
2.1.1 Requirements Engineering .......... 21
2.2 Information Retrieval and Natural Language Processing .......... 27
2.2.1 Natural Language Processing Techniques .......... 28
2.2.2 Information Retrieval Similarity Measure Models .......... 30
2.2.3 Evaluation Measures .......... 34
3 Software Engineering at SEMC .......... 36
4 Method .......... 40
4.1 Retrieval of Data .......... 40
4.2 Lexical Analysis .......... 40
4.3 Morphological Analysis .......... 41
4.4 Stopword Removal .......... 41
4.5 Similarity Measure Models .......... 42
4.6 Output and Manual Analysis .......... 43
4.7 Evaluation Measure .......... 45
5 Validation .......... 46
6 Material .......... 48
7 Results .......... 50
7.1 Test 1, Operator A revision 1 and 2 .......... 51
7.1.1 Cosine Model .......... 52
7.1.2 Normalized Matching Model .......... 52
7.1.3 Manual Analysis versus Semi Automatic Analysis .......... 53
7.2 Test 2, SMS requirements from Operator A and other Operators .......... 53
7.2.1 Cosine Model .......... 53
7.2.2 Normalized Matching Model .......... 54
7.3 Similarity Measure Model Summary .......... 54
7.4 Test 3, Operator B revision 1 and 2 .......... 55
7.5 Test 4, Operator B revision 2 and Operator A revision 2 .......... 55
7.6 Weighting Algorithm Summary .......... 56
8 Conclusions .......... 58
8.1 Limitations & Usability .......... 58
8.2 Recommendations of Further Development .......... 59
9 References .......... 60


1 Introduction

This section states the background of the thesis. The original problem and objectives are described, as are the delimitations of the work.

1.1 Background

Mobile phone operators submit so-called "Requests For Information" (RFIs) to SEMC (Sony Ericsson Mobile Communications AB) for a number of reasons. The RFIs are meant to give operators more knowledge of the phones' technical capabilities. Two main types of RFI can be identified. The first type requests comprehensive information on all of the manufacturer's products, or on specific telephones and their functionality. The second type requests statements of compliance with specific requirements; these are called "Statements of Compliance" (SoC) and are the most common. SoCs are answered with simple, standardized statements of whether compliance exists or not. The first type of RFI demands a more comprehensive reply; "White Papers"1, product catalogues or even specific answers are used in these cases. [1]

RFIs play a large role in operators' strategy planning. Since it is a buyers' market, a good relationship with the operators is a necessity for SEMC, which must therefore show a great deal of professionalism and commitment when dealing with RFIs. A less obvious aspect of RFIs is that SEMC also gains a great deal of vital business intelligence from them: features and functionality that are prioritized by mobile phone operators may be used as a guideline when developing future phones. [1]

This thesis was carried out at Sony Ericsson Mobile Communications AB in Lund in co-operation with the Department of Communication Systems, the Software Engineering Research Group, at Lund Institute of Technology, Lund University.

1.2 Problem

When a market-driven company develops software, the requirements are captured from many different sources. A market-driven company has to cope with issues such as the need for a short time-to-market and the ability to meet demands from the market. [15] If it does not succeed with these matters it will most certainly be outmaneuvered by companies that do. As a majority of the requirements need extensive analysis before they are inserted into a database, congestion is likely to appear in this part of the software development process. If the requirements are written in natural language, the task becomes even more challenging, and when the requirements affect different parts of a product or different projects, the analysis process becomes rather complex. [17]

1 A White Paper is designed to be a product offer document, which is sent to the operators. It is also supposed to give the reader a deeper technical understanding of how a product is designed, and of how it interacts with other media. This document makes it easier to integrate the product with the IT and communication solutions of a company or organization.


Requirements are often related to each other in some way, and there is a high probability that different stakeholders2 contribute similar and/or contradictory requirements. Finding duplicate, similar and contradictory requirements takes a lot of time because of all the manual work needed, and requires extensive comprehension of the requirements. When the number of requirements exceeds one thousand, this is almost certainly impossible to do manually. [17]

Sony Ericsson receives documents in several different formats (DOC, XLS, PDF, PPT, etc.) from several different operators, in which the operators' wishes and requirements are documented. These wishes and requirements are commonly written in natural language3. Sony Ericsson currently has a process for handling the operators' requirement documents, the RFIs, but it does not work properly and results in inefficient use of resources and time. An improved process is therefore being defined. One of its goals is to establish a better-structured method for managing operator requirements. This method will need a tool that can extract requirements from the documents retrieved from the operators and insert them into a specified template. The tool must then compare these requirements with older ones, found in previous RFIs or in some kind of database, in order to find similar and/or contradictory requirements.
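The comparison step described above could, in outline, look like the following sketch. It is illustrative only: the function and variable names are invented, and a simple Jaccard token-overlap measure stands in for the Cosine and Normalized Matching models actually evaluated in this thesis.

```python
def flag_candidates(new_reqs, old_reqs, similarity, threshold=0.5):
    """For each new requirement, list the old requirements whose
    similarity score reaches the threshold, best match first.
    These candidate matches are then inspected manually, which is
    what makes the method semi-automatic rather than automatic."""
    candidates = {}
    for new_id, new_text in new_reqs.items():
        hits = [(old_id, similarity(new_text, old_text))
                for old_id, old_text in old_reqs.items()]
        candidates[new_id] = sorted(
            [(oid, s) for oid, s in hits if s >= threshold],
            key=lambda hit: -hit[1])
    return candidates

# Stand-in similarity: Jaccard overlap of word sets.
def jaccard(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

new = {"N1": "the phone shall support concatenated sms"}
old = {"O1": "the phone shall support concatenated sms messages",
       "O2": "the camera shall support digital zoom"}
```

A candidate list produced this way would then feed the manual compliance analysis, keeping the analyst in the loop for the final similarity/contradiction judgment.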

1.3 Objective

The objective of this thesis is to study how well natural language processing (NLP) and information retrieval (IR) techniques can be used to manage large amounts of natural language requirements. The requirements dealt with in this thesis are written in natural language and are sent in by several operators.

This study will be used by SEMC to decide whether to continue using its manual method of handling operator requirements or to start using a semi-automatic method based on NLP and IR. The semi-automatic method will be investigated to see if it can be used to compare two requirement sets in order to find similar and/or contradictory requirements. If SEMC decides not to use the semi-automatic method, it will probably try to find another solution to the problem.

To determine whether the proposed NLP and IR techniques can aid the requirements engineering process, a program will be developed. If testing gives good results, the program may be incorporated into a future requirements management tool.

2 The term stakeholder is used to refer to anyone who has any knowledge about the system and therefore should have some kind of influence on the system requirements.

3 "A language spoken or written by humans, as opposed to a language used to program or communicate with computers. Natural language understanding is one of the hardest problems of artificial intelligence due to the complexity, irregularity and diversity of human language and the philosophical problems of meaning." – http://dictionary.reference.com [18]


1.4 Delimitations

The program will only be able to extract information from text files. Since SEMC does not have a well-functioning database of unique operator requirements, the testing process will use requirements obtained from one function group4 (FG), Messaging, whose local database contains requirements from several different operators. Requirements will also be elicited from RFIs received from two operators, referred to in this thesis as Operator A and Operator B.

4 A function group is a division within SEMC which works with a specific area, such as messaging, acoustics or mechanics, with competence from different areas such as design, system engineering and test.


2 Theory

This section contains a brief overview of software engineering and a more detailed one of requirements engineering, a subject within software engineering. Different natural language processing techniques applicable in an information retrieval system, as well as information retrieval similarity measure models, are described.

2.1 Software Engineering

Every day we are affected by some kind of software. Almost all electrical devices contain one or more software programs, which control how the equipment works and how it reacts to different internal or external signals. Software is not solely the programs; it is also all the associated documentation. [4]

Note that the theories presented in chapters 2.1 and 2.1.1 often do not correspond to how work is done in companies. Since software engineering is a rather new and complex subject, companies developing software have not yet reached a high level of maturity in their software development processes.

A description of the way software is developed is called a software process model. There are several different models, or life cycles. The first one to be used was the waterfall model, which is visualized in Figure 2.1.

Figure 2.1 – Waterfall model

The advantage of this model is that it is simple and straightforward. Its simplicity is also one of its major disadvantages, since real software projects rarely follow the sequential flow. [11] It is often hard to state all the requirements at the beginning of a project. When problems found in the last stages generate new requirements, the development team has to go back to the requirements activity and work through all the activities again. The waterfall model should therefore only be used when the requirements are well understood. [4]


A model based on prototyping may be used when the requirements are not well stated. The customer sets general guidelines and a prototype is quickly produced. After the customer has evaluated the prototype, it is refined into a new prototype, and this iteration continues until the product meets the customer's needs. An advantage of this model is that the purchaser quickly receives a prototype of the system. The main problem is that the process is difficult to visualize, and therefore difficult to manage. [4] [11]

The spiral model was developed to combine the best parts of the waterfall and prototyping models, with a new element, risk analysis, added. The model is shown in Figure 2.2.

Figure 2.2 – The Spiral model [22]

As can be seen, this model has four quadrants which represent the four major activities: setting of objectives, risk evaluation and reduction, development and testing, and finally planning. The process starts in the middle and works its way out through the quadrants. [4] [10] This model uses prototyping to minimize the risks but still keeps the stepwise approach from the waterfall model. The major drawback of this method is that it needs experienced risk evaluation personnel and has to rely on their knowledge. [11]


Other interesting models are the incremental development model and the iterative development model; both are examples of phased development models. In incremental development, the system is divided into subsystems. The first release of the software is a small but functional subsystem, and as time goes on new functionality is added until the system has achieved full functionality. When using incremental development, the parts should be quite small but still contain some system functionality. Iterative development, on the other hand, is based on the development of a full software system. When the system has been delivered, it is updated with new releases until it has reached complete functionality. The advantages of phased development are that an early release can be delivered quickly and the risk of project failure is rather small. [4] [10]

Sommerville [4] states four basic activities that exist in all software processes and these are:

1. Software specification. Here the constraints and functionality of the software are handled.

2. Software development. The software is produced according to what is stated in the specification.

3. Software validation. After the software has been developed, it must be validated through extensive testing, which ensures that the software does what the specification states.

4. Software evolution. As the customer's needs change, the software must evolve or it will become obsolete.

This division is somewhat narrow compared to the ones described by Leach [9] and Pfleeger [10].

• “Requirements analysis and definition.

• System design.

• Program design.

• Writing the programs (program implementation).

• Unit testing.

• Integration testing.

• System testing.

• System delivery.

• Maintenance.”

- Shari Lawrence Pfleeger [10]

• “Analysis of the problem.

• Determination of the requirements.


• Design of the software.

• Coding of the software.

• Testing and integration of the code.

• Installation and delivery of the code.

• Documentation.

• Maintenance.

• Quality assurance.

• Training.

• Resource estimation.

• Project management.”

- Ronald J. Leach [9]

These three activity descriptions have certain similarities and differences. For example, Pfleeger has three testing activities, which both Sommerville and Leach trim down to one; this seems logical, since the testing stages do not each have to be an activity of their own but can be sub-activities under a test/validation activity. Similarly, Pfleeger has two design stages that may be merged into one design activity like Leach's. One thing the descriptions have in common is that they all contain an activity dealing with requirement determination and analysis. Requirements are documented in a requirements specification, which describes the services and constraints.

Sommerville [4] has divided requirements into three categories.

The first category is user requirements, which describes, from a general point of view, what services the system is expected to provide. This is the least specified level.

The second category is system requirements; they state the services in detail.

The last and most detailed level is called software design specifications; these act as the foundation for the detailed design and implementation.

These three levels may be divided further into functional, non-functional and domain requirements. Functional requirements describe services that the system must provide; they also state how the system should react to certain inputs and how it should behave in certain situations. Sometimes they describe what the system must not do. Non-functional requirements are constraints on particular services that the system provides; these constraints can be related to the development process or to standards that the system must comply with. Domain requirements, however, are requirements that come from an application domain. They reflect the characteristics of that domain and can be functional or non-functional. [4] Non-functional domain requirements are sometimes called quality requirements since they are used to uphold the quality of the system. [13]


The non-functional requirements [4] are further divided into:

1. Product requirements. These specify the behavior of the product, such as usability, reliability and portability.

2. Organizational requirements. These describe procedures and policies in the organization, such as standards, implementation methods (e.g. programming language and design methods) and delivery concerns.

3. External requirements. These state how external factors should be handled, for example interoperability, legislative rules and ethical issues.

Lauesen [13] and Robertson & Robertson [12] define requirements differently. Instead of splitting up requirements according to their level of detail, they start by dividing the requirements into functional and non-functional ones. Robertson & Robertson do not specify the functional requirements any further, as Lauesen does. Instead of stating different requirement types, Lauesen uses what he calls “styles”. A style is a way to state a requirement. In Software Requirements – Styles and Techniques [13] the following functional requirement styles are stated:

1. Context diagram. In this kind of requirement the product is represented by a black box. It is surrounded by user groups and other systems with which it may communicate. The data flow between the product and its surroundings is denoted by arrows.

2. Data model. This style uses an entity/relationship model in order to state the relations amongst the different sets of data which are stored in the system. Each entity is a box and the relationships are connectors.

3. Event List & Function List. This style uses events to describe function requests to a system. The events often contain information which tells the system what to do. The different events may be stated in an event list and the system functions in a function list.

4. Dataflow Diagram, Domain-level. The dataflow diagram demonstrates activities. It also visualizes the data these create and consume. Each activity is represented by a bubble. Arrows symbolize the flow of data. Data storages appear as two parallel lines.

5. Domain Activities, Second Level. This is a more detailed description of the requirements in the Dataflow Diagram.

6. Dataflow Diagram, Physical Model. This style may be used to describe what the product should do. Each activity from style number 5 is divided into two bubbles; the first represents the user and the second the product. Arrows between them represent the interaction.

7. Dataflow, Product Level. If the product bubbles are extracted from the previous model, a dataflow model on the product level is obtained. This model presents the functions which the product provides.


8. Use Cases, UML Notation. An activity carried out by a user in cooperation with the system is called a use case. Using the Unified Modeling Language (UML), use cases can be defined on the domain and product levels. The business use cases (domain level) consist of diagrams that list the activities, together with an actor who initiates them. The product use cases list product events, also with an actor who starts them.

9. Use Cases, Task Notation. These describe domain activities and also state some background information about them. This kind of use case is text based.

10. Use Cases with Solutions. This kind of requirement description is used when the software already exists and is to be updated. The use case is built up of two columns: the left one states the sub-tasks and the problems the old software version has, while the right column describes, in general terms, how each problem should be solved.

11. Virtual Windows. A virtual window is a screen picture without any menus or functions. Its purposes are to let the customer review the data model, to construct data screens that support use cases and to let the customer review the design of the screens.

12. Feature Style. This involves writing product descriptions in plain text. Feature style is the most common way to create requirements.

13. Process Descriptions. This style uses either requirements based on product-level event or function lists or dataflow diagrams. The product-level ones should contain a process description for every event or function. The dataflow diagram should contain process descriptions for each low-level event or function.

14. State-transition diagrams. A state-transition diagram visualizes how entities change states due to some event happening.

15. Design Style. Certain requirements actually state how certain things should be designed. Such requirements may sometimes be premature.

16. Standards Style. These kinds of requirements state that certain standards should be followed.

17. Process Style. Process requirements define the development process.

Robertson & Robertson [12] state the following non-functional requirements:

1. Look & Feel requirements. These are high-level interface design desires.

2. Usability requirements. A usability requirement is used to highlight certain usability aspects.

3. Performance requirements. A requirement that defines how fast a task should be solved, or the level of accuracy for an application, is called a performance requirement.


4. Operational requirements. These kinds of requirements may describe the environment in which the product is to be used.

5. Maintainability & Portability requirements. Maintainability and portability are about defining future maintenance factors and interoperability issues.

6. Security requirements. This involves stating confidentiality, availability and integrity aspects.

7. Cultural & Political requirements. These requirements should reflect things like human customs, preferences and prejudices, depending on the country in which the product is to be used.

8. Legal requirements. The product must comply with the laws of the customer’s legal system.

Lauesen [13] has a rather similar but slimmer definition of non-functional requirements, where each category has its own subcategories, and states the following requirements:

1. Efficiency requirements. These are the most straightforward ones, comparable to the performance requirements in Mastering the Requirements Process [12].

a. Timing requirements. The name says it all: these requirements deal with timing issues.

b. Capacity requirements. A capacity requirement may state how much of a computer or database resource may be used.

c. Range & Accuracy requirements. These requirements deal with how large or small data values may be and how accurate they must be.

2. Usability requirements. For the description see Robertson & Robertson’s non-functional requirement number 2 above.

a. Performance style. This style is used to state how fast the users can learn different tasks and how fast they can carry out tasks after they have learnt how to do them.

b. Defect style. This style is used on requirements concerning failure ratio.

c. Process style. When describing the development process, process style requirements are used.

d. Subjective style. Subjective style requirements are based on the users’ opinions.

e. Design style. This style is used to describe the user interface (UI).

f. Guideline style. This concerns the general characteristics of the UI. This style and the design style often seem to turn into functional requirements.


3. Maintainability requirements. These requirements are used to define maintenance factors.

a. Performance style. This style can be used when specifying factors concerning the customer’s essential needs.

b. Support process style. This style defines things that should happen during maintenance activities.

c. Development process style. This style is used to maintain a maintenance-friendly development process.

d. Product structure style. Certain product structures may be enforced to obtain maintainability.

e. Product feature style. Special feature requirements may be created in order to supply methods for error analysis.

Dividing requirements into different categories works well when everybody, developers and customers alike, has extensive knowledge about the categories. If they do not, there is a risk of confusion when stating requirements. Sony Ericsson has not partitioned its requirements according to any of the suggestions above. Lauesen’s styles seem very interesting, but since there are so many of them, there is a risk of inconsistent use. If a software development company decides to use them, the staff who specify the requirements, and all other employees who come into contact with requirements, will need thorough training.

Several problems may occur when writing requirements in natural language. Requirements can be written in completely different ways yet have the same purpose, which leads to consistency problems in the requirements specification.

The process of eliciting, analyzing and documenting requirements is called requirements engineering. [4]

2.1.1 Requirements Engineering

In addition to the process activities mentioned before, the requirements process also contains a system feasibility study, the specification of the requirements and a validation of the proposed requirements. [4]

Feasibility Study

The main objective of the feasibility study is to find out whether or not it is worth carrying on with the requirements engineering process and the software development process. The input to this study comes from an outline description of the system and a description of how the system is intended to be used within one or several organizations. [4]


Elicitation and Analysis

If the feasibility study shows that the development process should proceed, the next step is elicitation and analysis of requirements. During this stage, staff with technical software development experience use different sources to find out what kind of services the system must provide. Other important aspects are required performance and hardware constraints. Due to the diversity of people involved in this stage, the term stakeholder is used to refer to anyone who has any knowledge about the system and therefore should have some kind of influence on the system requirements. [4]

There are numerous problems with eliciting and analyzing requirements. For example, stakeholders often use very general terms concerning the system’s services and performance, which may lead to misinterpretations by the engineers. Other problems are contradictions among the requirements, as there can be many stakeholders who contribute with desires. The stakeholders may also make unrealistic demands, since they may be unaware of how much it would cost to implement their requirements. [4] Very often, stakeholders have problems stating what they really need; they often present solutions instead of requirements. Customers usually have an underlying resistance towards change and therefore find it hard to adopt new ideas. Stakeholders also tend to create an overwhelming number of requirements, many of which are just things that would be nice to have. [13]

Sommerville states the following activities for elicitation and analysis:

1. Domain understanding. The engineers must develop an awareness of the domain in which the application will be used.

2. Requirements collection. To elicit requirements, engineers must interact with the stakeholders.

3. Classification. When requirements have been elicited they ought to be classified and organized into coherent clusters.

4. Conflict resolution. If there are multiple stakeholders involved, it is likely that there will be conflicts between the requirements, and these must be resolved.

5. Prioritization. This activity involves interaction with the stakeholders in order to draw up a priority list for the requirements.

6. Requirements checking. Finally, the requirements must be checked to see if they are consistent and complete.

The elicitation and analysis of the requirements is an iterative process, which means that even if you start with domain understanding and end with requirements checking, you may go back and forth between the activities. [4]

Lauesen describes a number of work products that have to be produced, iteratively, in an elicitation process:

1. Develop an understanding of the existing work in the domain.

2. Find out what the problems in the domain are.


3. Specify a list of important goals and issues.

4. Come up with a model for the system to be produced.

5. State the feasible possibilities.

6. Get the stakeholders involved.

7. Resolve disagreements between the stakeholders.

8. Specify the requirements. Create a requirements specification.

9. Create a prioritization amongst the requirements.

10. Verify the requirements to see if they are complete.

These two lists of process activities are rather similar, with only minor differences. Pfleeger, on the other hand, has narrowed it down to three points:

1. Problem analysis. Identify the processes, people and resources that are involved and find out the relationships between them.

2. Problem description. Investigate if the right techniques and views are used.

3. Prototyping and testing. Find out if the requirements can be implemented (i.e. are they feasible).

There are different ways to go about eliciting requirements. One is the viewpoint-oriented method. Even for a small system there can be several different viewpoints to consider. Separate viewpoints may see a problem in different ways. Though they may find different requirements, many of the viewpoints tend to find similar ones. The viewpoint-oriented approach is used to find different viewpoints and use them to organize and structure the elicitation process and the requirements. [4]

Another way of eliciting requirements is to use a scenario-based method. Engineers set up scenarios made up of real-life examples and formulate requirements from the information gained when different stakeholders ‘walk’ through the scenarios and discuss them. A scenario contains one or a small number of possible interactions. In its simplest version a scenario may include the following:

1. “a system state description at the beginning of the scenario;

2. a description of the normal flow of events in the scenario;

3. a description of what can go wrong and how this is handled;

4. information about other activities which might be going on at the same time;

5. a description of the state of the system after completion of the scenario.”

- Ian Sommerville [4]


An elicitation process based on scenarios may be carried out in a rather informal fashion, but there are more structured approaches, for example event scenarios or use cases. When engineers use event scenarios, they document every interaction in a separate event scenario. The event scenario contains a description of the data flows, the action of the system and the exceptions which may occur. [4]

The use-case based variant of scenarios is rather similar to event scenarios and is often used when the system is represented by an object-oriented model. The simplest form of use case only identifies the participants and gives a name to each of the different kinds of interaction. To extend the amount of information in a use case, sequence diagrams can be added. Sequence diagrams visualize the participants, the objects which the system contains and the operations associated with the system’s objects. [4] [12]

Another elicitation method is ethnography5. As software systems often exist in a far from isolated environment, it is not strange that they are affected by the social and organizational context of that environment. Requirements that originate from the surroundings are often critical for the success of the system; if these requirements are not met, there is a high risk that the system will never be used. Engineering teams who use this method must immerse themselves in the environment in which the system is intended to be used. While immersed, they make observations and take notes on the day-to-day work. The information captured this way is then used to formulate requirements. [4]

Interviewing is a good way to gain knowledge about the domain, and through it ideas for the future system might emerge. There must be a good mixture of stakeholders in the interview group(s). If, for example, end-users are left out, it will be hard to elicit any relevant needs. It is important to ask about risks and tasks that are critical. When the most important issues have been identified, questions about specific matters may be asked. [12] [13]

Interviewing users might not always be a good method, since it is often hard for them to state what they do and how they do it. A solution to this problem is to observe them instead. An analyst can spend some time in the users’ daily work environment and study how they work. This method is rather equivalent to the ethnography method. [13]

A combination of the interviewing and observation methods is called task demonstration or apprenticing. As users may find it hard to explain how they execute a certain task or how they conduct their day-to-day work, it might be easier for them to show it practically. The analyst can ask them to perform infrequent and critical tasks, and can also ask the user questions during the observation. [13]

Brainstorming can also be used as a method for generating ideas, which can be transformed into requirements. A group of people (e.g. end-users) is gathered and asked to come up with ideas. They must be informed that all ideas are accepted; one idea will surely generate another. After the brainstorming session, the best ideas may be turned into requirements. [12] [13]

5 “That branch of knowledge which has for its subject the characteristics of the human family, developing the details with which ethnology as a comparative science deals; descriptive ethnology”. – http://dictionary.reference.com [18]


There are several additional methods of eliciting requirements; these can be found in Mastering the Requirements Process [12] and Software Requirements – Styles and Techniques [13].

Validation

After the requirements have been elicited, analyzed and documented in a specification, a validation process must take place. Validation is used to find out whether the requirements really meet the customers’ needs. It is closely related to requirements analysis, but validation uses complete requirements specifications whereas analysis uses incomplete ones. If the validation process is skipped, critical faults will probably remain, and this will lead to large costs as the design, implementation and testing processes must be redone due to the necessary additions and changes to substantial requirements. Sommerville states that the validation process should consist of the following activities:

1. Validity checks. Stakeholders may have an idea of how the system should work but further thought may bring up new features in addition to the ones originally stated by the users.

2. Consistency checks. As there often are a lot of stakeholders involved, it is likely that they will come up with contradictory requirements. This activity is used to find such contradictions.

3. Completeness checks. This involves controlling that the requirements cover all the functions and constraints regarding the system.

4. Realism checks. Ensure that the suggested requirements can actually be implemented given the existing technology and budget.

5. Verifiability. To be able to verify the requirements, they must be testable.

Lauesen refers to IEEE Std 830-1993 when he specifies the quality aspects with which a requirements specification should comply:

1. Correctness. This means that every requirement should state a user need.

2. Completeness. See point 3 above.

3. Consistency. See point 2 above.

4. Modifiability. A requirement specification should be easy to modify without corrupting the consistency.

5. Verifiability. It should be possible to check whether the system is compliant with every requirement.

6. Prioritization and stability. Every requirement should have some sort of ranking based on its importance and stability.

7. Traceability. The specification must state where the requirements originate from and where they are used in the design stage.

8. Unambiguity. Both the developers and the customer must have the same understanding of each and every requirement.


There are numerous ways of validating requirements, and one of them is the review. [4] [13] This process is rather manual: the personnel involved read the requirements specification and look for ambiguities and omissions. An informal review simply involves the contractors discussing the requirements with as many stakeholders as possible. In the more formal variant the requirements engineers ‘walk’ through the specification with the client and explain the meaning of each and every requirement. The team of reviewers may use the following checkpoints, stated by Sommerville:

• Consistency

• Completeness

• Verifiability

• Comprehensibility

• Traceability

• Adaptability

When anomalies, conflicts, faults and contradictions are found, they must be recorded. After the review it is up to the contractor and the client to negotiate solutions to the discovered problems. [4]

Other ways of validating the requirements are prototyping, test-case generation and automated consistency analysis. As the name reveals, prototyping involves creating an executable prototype of the software and letting the customers experiment with it to see if it meets their needs. [4] [13] Test-case generation is about determining whether the requirements are realistically testable. If a test case for a specific requirement is impossible to implement, it usually means that the requirement will also be difficult to implement. In automated consistency analysis a CASE tool is used to inspect the consistency among the requirements. This method is only applicable when the requirements are stated as a system model in a structured or formal notation. [4]

Requirements Management

From time to time it seems as if the requirements process has a clear beginning and end, but that is not the case. [12] Requirements are likely to appear and change during the software process, as the developers’ comprehension of the problem changes. These changes of understanding may engender new requirements or changes to already existing ones. Another reason for having a well-working requirements management process is that software systems must evolve, due to changes in their environment, or they will become obsolete. [4] Sommerville states that new requirements may also emerge for the following reasons:

1. Substantial software systems often have a great number of users, and diverse users contribute different requirements. As their requirements may have been contradictory, a compromise has been made. Very often it is found that this compromise needs to change.

2. If the customer and the end-user are not the same, it is likely that the customer comes up with requirements related to the organization and budget. These requirements may interfere with requirements originating from the end-users.


3. The future environment of the system may change during the software process. These changes can affect the requirements.

When problems have emerged they must be analyzed. If changes are proposed, these must also be analyzed to see how they will affect different parts of the system. An analysis of the cost of the proposed changes must also be made. After these stages the changes are implemented in the software system. [4]

When the software needs to be updated, for whatever reason, there must be a way to see which requirements will be affected. It must also be possible to trace a change in one requirement to other requirements that will be affected. In other words, it is important to keep track of which requirements are used to implement different parts of the system. [12]

There are several requirements management tools on the market. These may aid in tasks such as “tracing requirements, linking test results to requirements, change management, semantic analysis and so on”. – Suzanne Robertson & James Robertson [12] The tools are often useful, but they should be considered an aid in requirements management and nothing else; the tool vendors often exaggerate their tools’ abilities.

2.2 Information Retrieval and Natural Language Processing

As stated in chapter 1.3, the purpose of this thesis is to see whether a semi-automatic method may be of assistance in the process of handling operator requirements. This method is supposed to compare incoming requirements with “old” sets of requirements in order to find similar or contradictory requirements. Since the requirements are written in natural language, IR in combination with NLP seemed useful.

The term information retrieval was coined in 1952, and the subject achieved acknowledgement from 1961 onwards, mainly in the area of research. Its first purpose was to aid in the management of the large amount of scientific literature which had been produced since the 1940s. Libraries adopted the new technique very quickly; IR became a great aid for them as they had grown from storage places for books into places where information was catalogued and indexed. Nowadays IR is widely spread in areas such as business, law and medicine. IR reached a wider audience with the birth of the Internet: due to the massive amount of information on the World Wide Web, the need for search algorithms grew and IR became handy. Research in IR has escalated because of this. [2] [3] [6]

As the name reveals, information retrieval deals with the retrieval of information. The information may be in any format, such as text, images, video and speech, and is often stored in a database. The input to the retrieval process consists of queries, often formulated by users. [6]

Three major subjects make up an IR system: items of information, users’ queries and the matching of the two. An extensive analysis of the information source and of the user query is needed before the matching process begins. The following vital functions are a must in an IR system:

1. Identify the information sources pertinent to user’s queries.

2. Analyze and represent the information sources in such a way that they will be suitable for matching.


3. Analyze and represent the user’s queries in such a way that they will be suitable for matching.

4. Match the user’s queries with the information sources.

5. Retrieve relevant information with assistance from the ranking which was created during the matching process.

The goal of the analysis process is to identify keywords which will be used in the matching. It is in this process that NLP techniques come in handy. [3] [6]
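The five functions above can be sketched as a toy keyword-matching pipeline. This is an illustrative example only, not code from the thesis: the requirement texts, the stoplist and the overlap-based ranking are all invented for the illustration.

```python
def analyze(text, stoplist):
    """Functions 2-3: represent a text (document or query)
    as a set of lower-case keywords, minus stopwords."""
    return {w.strip(".,").lower() for w in text.split()} - stoplist

def retrieve(query, documents, stoplist):
    """Functions 4-5: rank the documents by keyword overlap
    with the query and return them, most relevant first."""
    q = analyze(query, stoplist)
    return sorted(documents,
                  key=lambda d: len(q & analyze(d, stoplist)),
                  reverse=True)

# Hypothetical requirement texts used as the document collection.
STOPLIST = {"the", "a", "to", "shall"}
docs = [
    "The phone shall store a phone book",
    "The display shall show the battery level",
]
print(retrieve("phone book search", docs, STOPLIST)[0])
# -> The phone shall store a phone book
```

A real system would replace the simple overlap count with a weighted similarity measure, but the division of labour between analysis and matching is the same.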

Making a product which deals with natural language components is called natural language engineering. There are many different techniques that a natural language engineer can use; the engineer must find out which of them are best suited to solving the particular problem. [23]

2.2.1 Natural Language Processing Techniques

Different natural language processing techniques, which are applicable in an IR system, are described in this subchapter.

Lexical Analysis

The process of transforming an input stream of characters into a stream of words or tokens6 is one kind of lexical analysis, often called tokenization. Lexical analysis is the first stage in query processing and automatic indexing. The first decision that has to be made in this analysis is what will count as a word or token. Terms that consist solely of letters are of course tokens, but there are other terms which demand some consideration. Digits, for example, are often not well suited to be tokens. In databases containing documents of a technical type, however, it is common that terms contain digits, and it is not a good idea to exclude the digits if the words in which they are found make up good index terms. [3]

Hyphens and other punctuation marks also need to be considered. If hyphenated words like “voice-mail” are broken up into “voice mail”, problems with inconsistent usage are solved, but the hyphenated phrase loses its original meaning. Other punctuation marks like the dot and the slash may be used in words like “M-Send.req” and “text/x-iMelody”, which can be found in technical documents. The dot is also used to end a sentence, and it is a difficult problem to decide when such marks should be kept. It is up to the person who is making the lexical analyzer to formulate the lexical rules. The case of the letters is often not important, and the letters are often converted to lower or upper case since the matching process is case sensitive. The queries and the information sources may use the same lexical analyzer, since the rules have to be the same for both of them. [6] [7]

6 “Tokens are groups of characters with collective significance.” – Christopher Fox [3]


Removal of Stopwords

Words that occur frequently make bad index terms. If an IR system uses words like "the", "and", "so" and "take" in a search, it will probably retrieve a large number of documents that have little or no relevance. The ten most frequently occurring words make up 20 to 30 percent of the words in a document; if such words are removed, the IR system will work a lot faster with almost no negative effect on the retrieval result. The list of words to be excluded is called a stoplist. A common list of 250 stopwords was created by C.J. van Rijsbergen in 1975. Another stoplist, containing 425 words, was created from the Brown corpus of 1,014,000 words drawn from a wide range of English literature. [3]
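Stopword removal is essentially a set lookup. The sketch below uses a tiny illustrative stoplist rather than van Rijsbergen's 250-word list or the 425-word Brown corpus list:

```python
# A tiny illustrative stoplist; real stoplists hold hundreds of words.
STOPLIST = {"the", "and", "so", "take", "of", "a", "is"}

def remove_stopwords(tokens):
    """Drop every token that appears in the stoplist."""
    return [t for t in tokens if t not in STOPLIST]

print(remove_stopwords(["the", "phone", "and", "the", "charger"]))
# → ['phone', 'charger']
```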

Frequency Lists

A frequency list may be produced to enhance the filtering of words that are of little or no use. The frequency list consists of all the words in the documents and their frequencies of occurrence. After a frequency threshold has been selected, all words with a frequency above the threshold are eliminated. [2]
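The frequency-threshold idea can be sketched as follows (the toy documents and the threshold are hypothetical):

```python
from collections import Counter

def eliminate_frequent(documents, threshold):
    """Build a frequency list over all documents and drop every
    word whose total frequency exceeds the threshold."""
    freq = Counter(w for doc in documents for w in doc)
    return [[w for w in doc if freq[w] <= threshold] for doc in documents]

docs = [["send", "mms", "message"],
        ["send", "sms", "message"],
        ["send", "email"]]
# "send" occurs 3 times and is eliminated with threshold 2
print(eliminate_frequent(docs, 2))
# → [['mms', 'message'], ['sms', 'message'], ['email']]
```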

Thesaurus

A thesaurus is a collection of words and their synonyms. It may be applied in several ways. One way is to limit the number of terms used in a document. For example, the words "accumulation", "aggregating", "anthology", "assembling", "assortment", "collecting", "collection", "collocating", "combining", "compiling", "consolidating", "garner", "garnering", "gathering", "incorporating", "joining", "treasury" and "unifying" may all be exchanged for the word "compilation". The synonyms for this word were found with http://thesaurus.reference.com [19]. Another way is to use the thesaurus in the similarity measure calculation phase, which is described later: when two words are to be compared, the thesaurus is used to elicit all the synonyms of the first word, and the synonyms are then compared with the second word. Both methods lead to an increased recall ratio, but the first method also corrupts the content. [2]
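Both uses can be sketched as below; the three-synonym mini-thesaurus is of course hypothetical:

```python
# Hypothetical mini-thesaurus mapping a head word to its synonyms.
THESAURUS = {"compilation": {"collection", "anthology", "gathering"}}

def canonicalize(tokens):
    """Use 1: replace each synonym with its head word, limiting the
    number of distinct terms (at the cost of corrupting the content)."""
    head = {syn: h for h, syns in THESAURUS.items() for syn in syns}
    return [head.get(t, t) for t in tokens]

def synonym_match(word_a, word_b):
    """Use 2: during similarity calculation, compare word_b against
    word_a and all of word_a's synonyms."""
    return word_b == word_a or word_b in THESAURUS.get(word_a, set())

print(canonicalize(["a", "fine", "anthology"]))   # → ['a', 'fine', 'compilation']
print(synonym_match("compilation", "gathering"))  # → True
```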

Morphological Analysis

Another way of increasing the performance of an IR system is to integrate a morphological analyzer into the system. Morphological analysis is used to find different morphological variants of search terms, commonly their ground form. For example, a user who includes the term "stemming" in a query will also be interested in variants such as "stem" and "stemmed". [2] One method, called stemming, heuristically abbreviates word forms by removing affixes and suffixes. Another method, called lemmatization, also reduces words to their ground form by applying certain lemmas, but this method, unlike stemming, handles ambiguities; one such ambiguity is that the ground form of "lying" is "lie". The latter method covers the inflectional7 morphology of English nouns and verbs. There are several different stemming algorithms (e.g. Porter's and Lovins'). [16] A third method uses a word list to find the ground form of a word. This method is the most accurate one, but it would take a lot of time to implement due to the huge number of words in a language.

7 "Inflections are the systematic modifications of a root form by means of prefixes and suffixes to indicate grammatical distinctions like singular and plural." – Christopher D. Manning & Hinrich Schütze [7]


This analysis may be done manually or automatically, the automatic way being done by a program. Besides the increase in retrieval performance, morphological analysis also reduces the size of the information by up to 50 percent, since the analyzed words contain fewer letters than those that are not analyzed. Two errors may occur when using a stemming algorithm: overstemming and understemming. Overstemming occurs when terms that are not morphologically related are conflated into the same form; understemming occurs when related words are not conflated into the same form. Morphological analysis can be performed at the stage of retrieving index terms, but can also be done when calculating similarities. The major drawback of morphological analysis during index term retrieval is that the full terms are lost or have to be stored alongside the analyzed terms, thus demanding additional storage space. [2] [3] [6] Another problem arises when the information to which the analyzer is applied is morphology-rich [7]; the outcome from the analyzer will then be rather uniform.
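The difference between heuristic stemming and lemmatization can be illustrated with a deliberately naive suffix stripper. This is not Porter's or Lovins' algorithm, just a sketch:

```python
def naive_stem(word):
    """Strip one of a few common suffixes, heuristically."""
    for suffix in ("ming", "med", "ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[:-len(suffix)]
    return word

print(naive_stem("stemming"), naive_stem("stemmed"))  # → stem stem
# The ambiguity mentioned above: the heuristic yields "ly",
# whereas a lemmatizer would produce the ground form "lie".
print(naive_stem("lying"))  # → ly
```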

2.2.2 Information Retrieval Similarity Measure Models

When the query and the documents have been processed by the chosen NLP techniques, the query must be compared to the documents in order to find out which documents correspond best to the query. In an IR system this is done with a similarity measure algorithm.

The classic IR system is an ad-hoc retrieval system. In ad-hoc retrieval the user enters a query and the system returns a number of documents, which the system states as relevant. [7]

The following characterization of an IR model is taken from "Modern Information Retrieval" [5]. An IR model is a quadruple [D, Q, F, R(qi,dj)], where D is a set of logical representations of the documents, Q is the corresponding set of representations of the user's information demands (queries), F is a framework for modelling the document and query representations and their relationships, and R(qi,dj) is a function that returns a similarity measure between the query $q_i \in Q$ and the document $d_j \in D$. The ranking, which is based on the similarity measures, imposes an ordering amongst the documents according to the query.

If t is the number of index terms in the system, then K = {k1,...,kt} is the set of all index terms. Each term ki occurring in a document is associated with a weight $w_{i,j} > 0$; if an index term does not appear in a document, its weight is 0. A document dj has an associated index term vector $\vec{d_j} = (w_{1,j}, w_{2,j}, \ldots, w_{t,j})$. The function $g_i$ returns the weight associated with the term ki: $g_i(\vec{d_j}) = w_{i,j}$.

There are many different weighting algorithms. The simplest one assigns all terms the value one, and identical terms are counted only once, so frequent occurrence does not affect the weight of a term: $w_{i,j} = 0$ or $w_{i,j} = 1$. This weighting method is called binary weighting. Another simple way of allotting weights to terms is to use term frequency ($tf_{i,j}$): every term is given a weight equal to its number of occurrences in document dj or query qi.

Document frequency, dfi, is the number of documents in which a term ki appears.

N is the number of documents in the collection.


There are several additional weighting algorithms, some of which are listed here:

1. $w_{i,j} = 1 + \log_2(tf_{i,j})$

2. $w_{i,j} = (1 + \log(tf_{i,j})) \cdot \log\left(\frac{N}{df_i}\right)$

3. $w_{i,j} = \dfrac{tf_{i,j}}{df_i}$

4. $w_{i,j} = tf_{i,j} \cdot \log\left(\frac{N}{df_i}\right)$
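As a minimal Python sketch (the thesis tooling itself is written in Perl), these schemes translate directly into functions of the term frequency tf, the document frequency df and the collection size N:

```python
import math

def w_log_tf(tf):
    return 1 + math.log2(tf)                       # scheme 1

def w_log_tf_idf(tf, df, n):
    return (1 + math.log(tf)) * math.log(n / df)   # scheme 2

def w_tf_over_df(tf, df):
    return tf / df                                 # scheme 3

def w_tf_idf(tf, df, n):
    return tf * math.log(n / df)                   # scheme 4

print(w_log_tf(4))         # → 3.0
print(w_tf_over_df(6, 3))  # → 2.0
```

Note that schemes 2 and 4 give weight 0 to a term that appears in every document (df = N), which is exactly the stopword intuition.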

After the index terms have been assigned weights, it is time to compute similarity measures of the query and the documents. Similarity measures are computed with a similarity measure model and some will now be presented.

The Boolean Model

The Boolean similarity measure model uses binary weights (i.e. $w_{i,j} \in \{0, 1\}$). The similarity measure between a query q and a document dj is written $sim(q, d_j)$, which is equivalent to the R(qi,dj) function, and can be 1 or 0. $sim(q, d_j) = 1$ if there exists a conjunctive component $\vec{q}_{cc}$ of the disjunctive normal form8 $\vec{q}_{dnf}$ of the query q such that $g_i(\vec{d_j}) = g_i(\vec{q}_{cc})$ for all index terms ki; otherwise $sim(q, d_j) = 0$. This can also be expressed as

$$sim(q, d_j) = \begin{cases} 1 & \text{if } \exists\, \vec{q}_{cc} \mid (\vec{q}_{cc} \in \vec{q}_{dnf}) \wedge (\forall k_i : g_i(\vec{d_j}) = g_i(\vec{q}_{cc})) \\ 0 & \text{otherwise} \end{cases}$$ [5]

For example, the disjunctive normal form of $p \oplus q$ is $(p \wedge \neg q) \vee (\neg p \wedge q)$. [20]

An IR system using the Boolean model is called an exact match system, since it returns only documents that exactly match the query. [7] Even though this model has been widely used in many IR systems, it has its limitations. If many documents get a similarity measure equal to one, there is no way to rank them. The model may also miss many relevant documents if the search statement in the query is too narrow. [2] If the Boolean model is applied to a large system consisting of many heterogeneous documents, the result is either empty or contains a huge number of documents. [7]

8 "A disjunctive normal form is a disjunction of conjunctions where every variable or its negation is represented once in each conjunction." – Ribarsky, W [20]


An example will now be shown. If a query q, containing only the term "yellow", is matched against document d1, "The cabs in New York are often yellow", and document d2, "Strawberries are red", the IR system will return only document d1, since it contains the term "yellow", and not document d2, which does not.
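For a query that is a single conjunction of terms (one conjunctive component), this example can be reproduced with a short sketch using binary weights:

```python
def boolean_sim(query_terms, document):
    """Exact match: 1 if every query term occurs in the document."""
    doc_terms = set(document.lower().split())
    return 1 if set(query_terms) <= doc_terms else 0

d1 = "The cabs in New York are often yellow"
d2 = "Strawberries are red"
print(boolean_sim(["yellow"], d1))  # → 1
print(boolean_sim(["yellow"], d2))  # → 0
```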

The Vector Space Model

Another widely used model in ad-hoc retrieval systems is the vector space model. In this model, the queries and the documents are represented by vectors. Each dimension in the vectors corresponds to a term weight, which can be computed with any of the weighting models described above. The query q has an index term vector $\vec{q} = (w_{1,q}, w_{2,q}, \ldots, w_{t,q})$, and the document dj likewise has $\vec{d_j} = (w_{1,j}, w_{2,j}, \ldots, w_{u,j})$.

Similarity measures may be computed with one of the following algorithms:

Cosine coefficient

$$sim(q, d_j) = \frac{\vec{q} \bullet \vec{d_j}}{|\vec{q}| \times |\vec{d_j}|} = \frac{\sum_{i=1}^{t} w_{i,q} \cdot w_{i,j}}{\sqrt{\sum_{i=1}^{t} w_{i,q}^2} \cdot \sqrt{\sum_{i=1}^{u} w_{i,j}^2}} \;\xrightarrow{\text{binary weights}}\; \frac{C}{\sqrt{A \cdot B}}$$

Dice coefficient

$$sim(q, d_j) = \frac{2 \cdot \sum_{i=1}^{t} w_{i,q} \cdot w_{i,j}}{\sum_{i=1}^{t} w_{i,q} + \sum_{i=1}^{u} w_{i,j}} \;\xrightarrow{\text{binary weights}}\; \frac{2C}{A + B}$$

Jaccard coefficient

$$sim(q, d_j) = \frac{\sum_{i=1}^{t} w_{i,q} \cdot w_{i,j}}{\sum_{i=1}^{t} w_{i,q} + \sum_{i=1}^{u} w_{i,j} - \sum_{i=1}^{t} w_{i,q} \cdot w_{i,j}} \;\xrightarrow{\text{binary weights}}\; \frac{C}{A + B - C}$$

A = number of terms that exist in the query. B = number of terms that exist in the document. C = number of terms that exist in both the query and the document.
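With binary weights the three coefficients reduce to expressions in A, B and C, which a short sketch makes concrete (the term sets are hypothetical):

```python
import math

def coefficients(query_terms, doc_terms):
    """Cosine, Dice and Jaccard under binary weighting,
    expressed directly in A, B and C."""
    q, d = set(query_terms), set(doc_terms)
    a, b, c = len(q), len(d), len(q & d)
    cosine = c / math.sqrt(a * b)
    dice = 2 * c / (a + b)
    jaccard = c / (a + b - c)
    return cosine, dice, jaccard

cos, dice, jac = coefficients({"voice", "mail", "alert"},
                              {"voice", "mail", "tone", "alert"})
print(round(cos, 3), round(dice, 3), round(jac, 3))  # → 0.866 0.857 0.75
```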

The Cosine coefficient measures the angle between the t-dimensional query vector and the u-dimensional document vector in a space of t dimensions. This is visualized in two dimensions in Figure 2.3.


Figure 2.3 – Cosine coefficient visualization

The calculation can also be described as follows: the numerator computes how well the terms in the query correlate with the terms in the document, while the denominator consists of the Euclidean lengths of the query vector and the document vector. [7]

The Dice coefficient relates the overlap of query vector q and document dj to their sizes.

The Jaccard coefficient expresses the degree of overlap between the query q and the document dj as the proportion of the overlap from the whole:

$$\frac{|q \cap d_j|}{|q \cup d_j|}$$

With binary term weights, the Jaccard coefficient elements are visualized in Figure 2.4.

Figure 2.4 – Venn diagram visualizing the Jaccard coefficient algorithm - [17]

The author of this thesis also wanted to test a model based on probability theory. An attempt to define such a similarity measure model resulted in the following definition:

$$sim(q, d_j) = \frac{\sum_{i=1}^{t} w_{i,q} \cdot w_{i,j}}{\sum_{d=1}^{N} \left( \sum_{i=1}^{t} w_{i,q} \cdot w_{i,d} \right)}$$


In time it became clear that this model did not have anything to do with probability theory; it was rather a version of the matching coefficient,

$$sim(q, d_j) = \sum_{i=1}^{t} w_{i,q} \cdot w_{i,j},$$

i.e. a vector space model. The model was therefore named the "Normalized Matching model", since it is in fact a normalized version of the Matching coefficient. It relates the overlap of query vector q and document dj to the sum of the overlaps between the query and all the documents.

The main difference between this model and the Jaccard, Dice and Cosine models is that it does not include the lengths of the vectors in the similarity measure calculation, but instead normalizes the numerator with the sum of the overlaps.

2.2.3 Evaluation Measures

Two measures have mainly been used when evaluating the performance of an IR system: "precision" and "recall". Recall relates the number of relevant documents retrieved to the total number of relevant documents. Precision relates the number of relevant documents retrieved to the total number of retrieved documents.

The recall and precision rates are defined by Chowdhury in "Introduction to modern information retrieval" [2]:

$$recall = \frac{a}{a+c} \cdot 100\%$$

$$precision = \frac{a}{a+b} \cdot 100\%$$

a = number of relevant documents retrieved, b = number of irrelevant documents retrieved and c = number of relevant documents not retrieved. a + c = total number of relevant documents and a + b = total number of retrieved documents.
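The two definitions translate directly into code (the counts in the usage line are illustrative):

```python
def recall(a, c):
    """a = relevant retrieved, c = relevant not retrieved."""
    return a / (a + c) * 100

def precision(a, b):
    """a = relevant retrieved, b = irrelevant retrieved."""
    return a / (a + b) * 100

# 8 relevant retrieved, 2 irrelevant retrieved, 2 relevant missed:
print(recall(8, 2), precision(8, 2))  # → 80.0 80.0
```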

The recall model is used with either a similarity measure or a ranking list size threshold. When applying a similarity measure threshold, the number of retrieved relevant documents is the number of relevant documents that got a similarity measure above the threshold. The number of relevant documents that were not retrieved is the number of relevant documents that got a similarity measure below the threshold.

If instead a ranking list size threshold is used, the number of retrieved relevant documents is the number of relevant documents that made the ranking list, and the number of relevant documents not retrieved is the number that did not make the list. A problem occurs when the total number of relevant documents exceeds the threshold: all relevant documents can then not be retrieved by the system, and recall cannot reach 100%.

The precision model can be applied with a similarity measure threshold. When doing so the number of retrieved relevant documents is the number of relevant documents that got a similarity measure above the threshold. The number of retrieved irrelevant documents is the number of irrelevant documents that got a similarity value above the threshold.


The precision model should not be applied with a ranking list size threshold since the total number of retrieved documents will be controlled by the threshold and not by the performance of the system.

If no threshold is chosen, recall will be 100% and precision will equal the ratio of the total number of relevant documents to the total number of documents.


3 Software Engineering at SEMC

This chapter contains an outline of the software engineering process (SWEP) used when developing products at SEMC. The information in this chapter was gathered from an internal webpage within SEMC [21], an internal document within SEMC [1], and through discussions with three SEMC employees: Micael Åkesson, Staffan Björklund and Michael Kindlein.

Figure 3.1 – Software Engineering Process - [21]

As seen in Figure 3.1, the first phase in the SWEP is the pre-study. The main objective of this phase is to analyze the requirements in the Product Requirements Specification (PRS), which is created by the Product & Application Planning (PAP) division. Other inputs, besides the PRS, are "lessons learned" and a concept study report. This phase results in a first version of the System Requirement Specification (SRS). An SRS contains both the hardware and the software requirements. The analysis is used to eliminate requirements that are thought to be too expensive to implement and to clarify those that need clarification. Along with the requirements analysis comes a System Impact Analysis (SIA), which is performed to see how the existing software modules are affected. The aspect of user friendliness, User Centric Design (UCD), is also treated at this early stage. This phase ends with a milestone report.

A product’s SWEP has a number of milestones. The formal output of a milestone (MS) is often a report of some kind. Milestones are used to keep track of the process.

The second phase is the feasibility study, whose main purposes are studying the technical solutions and planning the execution. The planning involves allocation of resources to the project. The work with the SRS and the UCD continues, System Software Design (SSD) documents are created, and so is the PRS SoC.

The milestone, which ends this phase, is a project plan. If the project receives “a green light”, the process advances to the next stage.

The next four stages are accomplished with incremental development. Each project increment is divided into many "bubbles", and each bubble is a function which is developed and tested by a FG. Implementation and function testing work iteratively within every bubble. The different increments are gradually integrated with each other. This workflow is used to minimize hazards and to keep track of dependencies when merging many different increments.


A final software product verification must be done before the release of the product. The project is then almost finished, and conclusions are drawn in lessons learned documents and in the final report. The last phase is High Volume Management (HVM), where the product is released to the market.

Four main software engineering activities are described in chapter 2.1. The software specification activity includes the pre-study, the feasibility study and a part of the design phase. Design, implementation and possibly function test are included in the software implementation activity. Software verification is, in SEMC's SWEP, represented by function test and by early and final software product verification. The conclusion and HVM phases roughly correspond to the software evolution activity.

A complex company like Sony Ericsson has to deal with many stakeholders when developing a new product. As seen in Figure 3.2 requirements are captured from several different sources.

Figure 3.2 – Requirement sources and requirement documents

The PRS, which is managed by PAP, contains mostly high level requirements but also a great deal of detailed ones. The requirements in the PRS reflect the requirements received from the operators. The Global Product Management (GPM) has the responsibility to conduct communication with the operators.

A Development Unit (DU) is responsible for the SRS. Technical requirements concerning the platform are specified in the Technical Requirement Specification (TRS), which is managed by the Chief Technology Officer (CTO). The CTO is responsible for the communication with the platform vendors.

Each project is developed at a DU and is very much affected by the TRS, since every phone development project is based on a platform.

Requirements also come from end-users through consumer research, from the market through information gathered by sales & marketing and intelligence on competing companies, from concept studies, and from application planning.


In the product's SRS, the requirements are divided according to the function groups. There is no separation into functional and nonfunctional requirements. The requirements are mostly written in feature style (i.e. plain text), but use case based requirements are also used in some cases, especially for new functionality. The use case requirements are a combination of UML and task notation. To elicit requirements, SEMC sets up a vision, from which certain goals are derived. The next step is to create several user scenarios from the goals, and in the last step use cases are generated from the user scenarios. Requirements are then created from these scenarios.

A very important issue is the prioritization of the requirements. The priority of a requirement is used to decide which requirements are to be left out if problems occur during the development process. The prioritization is set from a market point of view, but also from a risk perspective; the risk is divided into three levels. If a requirement which has arrived late in a product's life cycle gets a high risk value, it will not be implemented in that product. A requirement gets a high priority if it has a high value, see Figure 3.3, and can be implemented at a small cost, and vice versa for requirements with low priority.

Figure 3.3 – Prioritization

As stated above, the operators have a great deal of influence over the requirements for a product. The process of handling operator requirement documents, or RFIs, is described in Figure 3.4.

Figure 3.4 – Handling of RFIs


As stated above, it is the GPM that communicates with the operators. There are several Key Account Managers (KAM) within the GPM, and each KAM is responsible for the contact with one major operator. The KAM receives RFIs from the operators and passes them on to the Bid Support Specialist, who reviews the documents from a market point of view and finds out which products are to be treated by the RFI in question. The Bid Support Specialist then passes the documents on to the coordinator, who analyzes the documents and the instructions that accompany them. The RFI is then distributed to the Areas of Expertise (AoE). An AoE consists of a FG and a TWG (Technical Work Group). The TWG works with roadmaps (i.e. future functions) and the FG works with implementation and testing. When the AoEs have stated the compliance for every requirement, they send the RFI reply back to the coordinator, who reviews the answers and sends them on to the Bid Support Specialist, who also checks the answers. If the RFI originates from a major operator, a meeting is held with the GPM, the coordinator and experts from the AoE in order to discuss the answers that are to be submitted back to the operator. The RFI reply is then sent back to the operator by the KAM.

The operators send RFIs to SEMC a couple of times each year. RFIs from different operators contain many similar requirements, and two different RFI releases from a single operator have even more identical requirements. The AoEs have a lot of work with development and testing, and they feel that they do not have the time to state the compliance of every RFI that arrives. Especially frustrating is that they have to state the compliance of similar or identical requirements several times. In order to decrease the workload on the AoEs, an evaluation is conducted to see if a semi-automatic requirements management method may be of assistance; this is where this thesis comes in. Another suggestion has been to standardize an RFI template which can be used by the operators.


4 Method

The techniques that were chosen are described in this chapter, along with the order in which they are used. The data flow and the parts of the program are shown in Figure 4.1. The material used for testing is described at the end of this chapter. The program parts are lexical analysis, morphological analysis, removal of stopwords, and similarity measure calculation and ranking. All the program parts, except the Normalized Matching model in the similarity measure calculation stage, were implemented by Johan Natt och Dag, who also chose which techniques to implement. The Normalized Matching model was implemented by the author of this thesis. All the program parts were programmed in Perl, since it is a good programming language to use when dealing with large amounts of text-based data.

Figure 4.1 – Data flow

The broken lines symbolize paths that may be used in the future but were not used in this thesis.

4.1 Retrieval of Data

The first step in this process is to extract relevant data/information. RFIs may come in many different formats, like PDF, Excel etc. After an RFI has been received, the requirements and their identifiers are exported to a txt-file. Other information, like requirement names, may also be exported to the txt-file. Data from a database or an older RFI is likewise extracted and converted into an applicable format (i.e. txt format).

4.2 Lexical Analysis

As shown in Figure 4.1, lexical analysis was chosen as the first step in the natural language processing phase. This was natural, since methods like morphological analysis and removal of stopwords need "clean" words in order to perform well. The lexical analyzer is programmed in flex (fast lexical analyzer generator). The analyzer converts all letters to lower case and deletes stand-alone digits and special characters such as " # % & /, but keeps standards and chapters formulated as "9.1" and "IEEE.123". References, e.g. "abc12-345", are also kept. \t and \n, which represent "tab" and "new line", are not removed, since \t is used to separate the requirement identifier from the requirement description and \n is used to separate the requirements from each other. The flex rules file is located in Appendix A.
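A rough Python approximation of these rules could look like the sketch below; the real analyzer is the flex specification in Appendix A, so this only illustrates the behaviour described above:

```python
import re

def lex(line):
    """Lowercase, keep dotted/hyphenated identifiers such as '9.1',
    'ieee.123' and 'abc12-345', drop stand-alone digits, and strip
    special characters from everything else."""
    tokens = []
    for tok in line.lower().split():
        if re.fullmatch(r"[a-z0-9]+([.-][a-z0-9]+)+", tok):
            tokens.append(tok)            # standard/chapter/reference kept
        elif re.fullmatch(r"\d+", tok):
            continue                      # stand-alone digits dropped
        else:
            word = re.sub(r"[^a-z0-9]", "", tok)
            if word:
                tokens.append(word)       # special characters stripped
    return tokens

print(lex('Support IEEE.123 & chapter 9.1 with 4 "features"'))
# → ['support', 'ieee.123', 'chapter', '9.1', 'with', 'features']
```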


Flex is a tool for generating scanners. A scanner is a program which can be used to recognize lexical patterns in a text. The scanner is created from a description file containing rules, in the form of pairs of regular expressions and C code. Flex is described in "LEX – A lexical analyzer generator" [14].

Flex was chosen since it is easier to generate a deterministic finite automaton (DFA) with it than to program one by hand in a high level programming language such as C. The lex structure is compatible with UNIX, which is the environment the programs are executed in. A generated lexer is also often faster than a hand-coded DFA. Another good reason is that flex is free.

4.3 Morphological Analysis

Morphological analysis was chosen as the second step in the NLP phase. This method is an important step in many IR systems. As described in chapter 2.2.1, it transforms words to their ground form. Morphological analysis is regarded as an indispensable method, since it often enhances the ranking success ratio.

A small Perl program, which invokes a program called "Morpha", was used to conduct the morphological analysis of the input data. Instead of using a stemming algorithm (e.g. Porter's), which heuristically abbreviates word forms, "Morpha" is based on lemmatization. "Morpha" is built up from approximately 1400 flex rules, and its documentation can be found in "Applied morphological processing of English" [16].

Morpha was chosen since, in "Applied morphological processing of English" [16], it was tested and achieved a high type accuracy of 99.0% and a high speed, with a throughput of 240,000 words per second. It is also free and compatible with UNIX.

The common place for morphological analysis is after the stopword removal, since the stopword removal then significantly reduces the amount of data coming into the morphological analysis, leading to a shorter execution time. However, the morphological analysis may itself create stopwords, which is why it is here placed before the stopword removal process.

Figure 4.2 shows some examples of how the algorithm works:

Figure 4.2 – Morpha examples

4.4 Stopword Removal

The last step in the natural language processing phase is the removal of stopwords. This technique uses a stoplist derived from the Brown corpus, containing 425 frequently occurring words such as "not", "the", "of" and "such". A complete list of the stopwords is found in Appendix B. This method was chosen since it significantly reduces the number of boosted similarity measures.


4.5 Similarity Measure Models

Two similarity measure algorithms were chosen amongst the ones described in chapter 2.2.2: the Cosine model and the Normalized Matching model. The Boolean model was considered too simple for these kinds of measurements (i.e. it cannot be used to rank requirements, since it only states match or no match) and was excluded. The Jaccard model and the Dice model were left out since they produce similarity measurements comparable to the Cosine model. The Normalized Matching model was implemented and used in tests before it was identified as a normalized version of the Matching coefficient. When the true nature of this model was discovered, it was decided that the results from the tests with this model should still be included in this report, since its ranking may differ from the one achieved with the Cosine model.

The following example shows a case where the Cosine model and the Normalized Matching model give similarity measures that result in different rankings. Let the query q contain 10 terms and the documents d1 and d2 contain 5 and 30 terms respectively, with overlaps |q ∩ d1| = 5 and |q ∩ d2| = 6. The Cosine model, using binary weighting, calculates the following similarity measures:

sim(q, d1) = 5 / √(10 · 5) ≈ 0,71

and

sim(q, d2) = 6 / √(10 · 30) ≈ 0,35

which gives document d1 a higher ranking than document d2. The Normalized Matching model, using binary weighting, calculates the following similarity measures:

sim(q, d1) = 5 / 11 ≈ 0,45

and

sim(q, d2) = 6 / 11 ≈ 0,55

which gives document d1 a lower ranking than document d2.
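The two models can be checked with a short script. This is an illustrative sketch, not the thesis implementation; in particular, the normalization used for the Matching model (dividing each overlap by the total overlap across the compared documents) is inferred from the numbers in the example above:

```python
import math

def cosine_binary(q_size, d_size, overlap):
    """Binary-weighted Cosine similarity: overlap / sqrt(|q| * |d|)."""
    return overlap / math.sqrt(q_size * d_size)

def normalized_matching(overlaps):
    """Matching coefficient normalized over all compared documents,
    as in the worked example: overlap_i / sum(overlaps).
    This normalization is an assumption inferred from the example."""
    total = sum(overlaps)
    return [o / total for o in overlaps]

print(round(cosine_binary(10, 5, 5), 2))   # 0.71
print(round(cosine_binary(10, 30, 6), 2))  # 0.35
print([round(s, 2) for s in normalized_matching([5, 6])])  # [0.45, 0.55]
```

Note how the Cosine measure penalizes the long document d2, while the normalized matching measure rewards its larger overlap.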

The first tests were conducted with the Cosine model and the Normalized Matching model, both with binary weighting. The remaining tests were conducted with the model that performed best in the first tests, using three different weighting models.

The similarity measure models were implemented with three programs. The first contained the Cosine model and the Normalized Matching model, both with binary term weighting (BTW). The second program used term frequency weighting (TFW) with the Cosine model, and the last program calculated similarity measures with the Cosine model using logarithmic term frequency weighting (LTFW):

w_{i,j} = 1 + log₂(tf_{i,j})

The Cosine models and the Normalized Matching model give the documents similarity measures between 0 and 1, where 1 is best and 0 worst.
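As a sketch of the three weighting schemes (hypothetical helper functions, not the thesis programs), each maps a raw term frequency tf to a weight:

```python
import math

def binary_weight(tf):
    # BTW: 1 if the term occurs at all, else 0
    return 1 if tf > 0 else 0

def tf_weight(tf):
    # TFW: the raw term frequency itself
    return tf

def log_tf_weight(tf):
    # LTFW: w = 1 + log2(tf) for tf > 0, else 0
    return 1 + math.log2(tf) if tf > 0 else 0

# A term occurring 4 times gets weight 1, 4 or 3.0 respectively.
print(binary_weight(4), tf_weight(4), log_tf_weight(4))  # 1 4 3.0
```

The logarithm dampens the influence of terms that occur many times in one requirement, which is why LTFW sits between binary and raw term frequency weighting.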


Note that the query q and the document dj in the similarity measure algorithms are represented by requirements. The query is a requirement from an incoming RFI and the document is a requirement from an old RFI or from a database.

To visualize how the NLP techniques work and how a similarity measure is calculated, an example will now be shown. The Cosine model using binary weighting will be used.

The query is “Support for sending and receiving UTF-8 character set encoding” and the document is represented by “Support for sending and receiving GIF87a”. Table 4.1 shows how the query and the document are transformed as they are processed by the NLP techniques. The grey marked words under stopword removal show which words the query and the document have in common after the NLP stage.

Query

  Original:          Support for sending and receiving UTF-8 character set encoding
  Tokenization:      support for sending and receiving utf-8 character set encoding
  Lemmatization:     support for send and receive utf-8 character set encode
  Stopword removal:  send receive utf-8 character set encode

Document

  Original:          Support for sending and receiving GIF87a
  Tokenization:      support for sending and receiving gif87a
  Lemmatization:     support for send and receive gif87a
  Stopword removal:  send receive gif87a

Table 4.1 – Example of how the NLP techniques work

The terms remaining after stopword removal are what enter the similarity measure process. The similarity measure is now computed with the Cosine model, using binary weighting:

sim(q, d_j) = (q • d_j) / (|q| × |d_j|)
            = Σ_{i=1..t} w_{q,i} · w_{i,j} / (√(Σ_{i=1..t} w²_{q,i}) · √(Σ_{i=1..t} w²_{i,j}))
            = C / √(A · B)            (binary weights)
            = 2 / √(6 · 3) ≈ 0,47

where C = 2 is the number of terms the query and the document have in common, A = 6 is the number of terms in the query and B = 3 is the number of terms in the document.
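The calculation can be reproduced directly from the term sets in Table 4.1. A minimal sketch, not the thesis program:

```python
import math

# Term sets after the NLP stage (tokenization, lemmatization,
# stopword removal), as listed in Table 4.1.
query = {"send", "receive", "utf-8", "character", "set", "encode"}
doc = {"send", "receive", "gif87a"}

def cosine_binary(q, d):
    # Binary-weighted Cosine: |q ∩ d| / (sqrt(|q|) * sqrt(|d|))
    return len(q & d) / (math.sqrt(len(q)) * math.sqrt(len(d)))

print(round(cosine_binary(query, doc), 2))  # 0.47
```

With binary weights the whole computation reduces to set sizes and the size of the intersection, which is why the NLP stage (which decides those sets) matters so much for the final similarity value.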

4.6 Output and Manual Analysis

The ranking results are presented in an html file, as seen in Figure 4.3. Note that this figure only visualizes the result for one requirement, while a result file contains the results for every incoming requirement.


Figure 4.3 – Output visualization

The first requirement comes from Operator A revision 2 (OA rev. 2) and the five following ones originate from Operator A revision 1 (OA rev. 1). The requirements from OA rev. 1 are arranged in order of similarity according to their similarity measures; the highest ranked requirement has the highest similarity measure. When the output in Figure 4.3 is analyzed it becomes clear that the requirement from OA rev. 2 and the top ranked requirement from OA rev. 1 are duplicates: they have the same identifier and the same content. The analysis also shows that the first runner up is rather similar to the requirement from OA rev. 2 but not a duplicate, since it handles UTF-16 character set encoding. Note that it is possible to change the number of shown requirements in the html file. The number of shown ranked requirements was always set to five, on the assumption that a wanted requirement should be amongst the five top ranked requirements; otherwise the program was considered not to have found it.

This manual analysis has to be done for all the results. When dealing with RFIs from one operator, the result of the analysis should be a list that states whether the requirements from the latest release are new, changed, old or, hopefully not, contradicting. When the information comes from one RFI and a requirement database, the list should state whether the requirements from the RFI are new, similar, duplicate or contradicting.

The first similarity measure calculation program, which contained the binary weighting Cosine and Normalized Matching models, can also return an output file in txt format. This file contains two matrices. The first has the requirements from the recently received RFI in a column to the left and all requirements with a similarity value above zero to the right, in rows. Requirements with equal similarity values are separated by a “,” and requirements with different values are separated by tabs. The second matrix also has the new requirements in a column to the left, but the old requirements are now in a row at the top of the matrix. Between the requirements is a matrix which states the similarity measurements.

The second and third program can, as the first program, send back a similarity matrix but can also create top list files. If the top list option is used a top list file will be created for each and every requirement.


The html result file in Figure 4.3 was created with the Cosine model. The ranking visualization became slightly different when the Normalized Matching model was used, as the visualization was optimized for the Cosine model. Hence, the dots marking the level of similarity, which are based on the similarity measures, cannot be seen as a good indicator of the level of similarity. This did not affect the results that were retrieved with the Normalized Matching model.

4.7 Evaluation Measure

To be able to evaluate the different weighting algorithms and similarity measure models, one or more evaluation measures were needed. For each test and similarity measure model or weighting algorithm, a recall rate was computed. Since the purpose of the tests was to see whether the proposed methods can retrieve relevant requirements regardless of the similarity measure values the requirements got, a ranking list size threshold was used instead of a similarity threshold. The recall was calculated with a ranking list size threshold of five, which was considered a reasonable amount of requirements to examine for every incoming requirement; in some cases this could limit the number of retrieved requirements and hence also the recall. Because a ranking list size threshold was used, the precision measure was considered unsuitable, since the number of retrieved requirements, and hence also the precision, would be controlled by the threshold and not by the performance of the system.
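A recall computation under a ranking list size threshold can be sketched as follows (hypothetical function and example data, not the thesis evaluation code):

```python
def recall_at_k(rankings, relevant, k=5):
    """Fraction of relevant documents retrieved within the top-k
    ranking list produced for each query.

    rankings: dict mapping query id -> ranked list of document ids
    relevant: dict mapping query id -> set of relevant document ids
    """
    retrieved = total = 0
    for q, ranked in rankings.items():
        rel = relevant.get(q, set())
        retrieved += len(rel & set(ranked[:k]))
        total += len(rel)
    return retrieved / total if total else 0.0

# Illustrative data: one incoming requirement with two relevant matches,
# one of which is ranked sixth and therefore falls outside the top five.
rankings = {"R1": ["d3", "d7", "d1", "d9", "d2", "d4"]}
relevant = {"R1": {"d3", "d4"}}
print(recall_at_k(rankings, relevant))  # 0.5
```

This also illustrates why precision is uninformative under a fixed-size list: the denominator of precision would always be k regardless of how well the system ranks.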

When comparing two RFI releases from one operator, a relevant requirement is an old requirement or an original version of a changed requirement. If the requirements come from different operators, a relevant requirement is a requirement with a somewhat equivalent purpose to the one it was matched against.


5 Validation

In order to confirm that the implemented similarity measure algorithms calculate correct similarity measures according to their definitions, a validation was conducted by comparing similarity measures computed manually with similarity measures computed by the program. The Cosine models and the Normalized Matching model were validated with a selection of requirements from Operator A.

These requirements were treated by the lexical analyzer, the morphological analyzer and the stopword removal process, and were then used to compute similarity measures with the Cosine models and the Normalized Matching model. The results from these calculations are found in Appendix C. Table A to Table D contain manually calculated similarity measures for the Cosine model with binary weighting, the Normalized Matching model with binary weighting, the Cosine model with term frequency weighting and finally the Cosine model with logarithmic term frequency weighting. Table E to Table H contain similarity measures from the same models, computed by the programs.

A comparison between Table A and Table E clearly shows that the implemented Cosine model computes correct similarity measures. An observation of Table B and Table F reveals that the implemented Normalized Matching model calculates accurate similarity measures. Table C and Table D compared with Table G and Table H reveal that the Cosine model using term frequency weighting and logarithmic term frequency weighting computes the correct results.


6 Material

Test 1

The first test was conducted with messaging requirements from Operator A revision 1 and 2. These requirements were chosen to make it possible to compare the semi automatic requirements management method with a manual method for handling operator requirements. The manual compliance process was performed by the Messaging function group.

Test 2

Test number two was done with requirements extracted from Messaging’s requirement database. Requirements from Operator A were compared with requirements from all the other operators in order to find out if the algorithms are able to find duplicate, similar and contradicting requirements.

Test 3

Testing was also done with requirements extracted from two Operator B RFIs since their RFIs are not as well structured as Operator A’s and their requirements are not as well formulated.

Test 4

The fourth test contained requirements from Operator A revision 2 and Operator B revision 2. The RFI from Operator B was compared with the RFI from Operator A to see if they formulate their requirements differently and if the program could find similar or contradicting requirements.


7 Results

This chapter contains the results that were achieved during the testing processes. For each test a recall rate was computed. The recall is calculated with a ranking list size threshold of five. Five was chosen since it seems to be a reasonable amount of requirements to examine for every incoming requirement.

Table 7.1 contains information about the different requirement sets:

           RFI        # requirements  RFI formats  Extraction and conversion
                                                   into txt format (min)
  Test 1   OA rev. 2  434             pdf, excel   195
           OA rev. 1  242             pdf, excel   110
  Test 2   OA         36              excel        5
           Others     192             excel        5
  Test 3   OB rev. 2  63              word         65
           OB rev. 1  58              word         60
  Test 4   OB rev. 2  63              word         -
           OA rev. 2  434             pdf, excel   -

Table 7.1 – Requirement data

For example, OA rev. 2 had 434 messaging requirements; these were found in PDF and Excel format and it took 195 minutes to extract and convert them into applicable format, i.e. txt. For test 4 there is no time for extraction since the requirements had been used in previous tests.

The time it takes to extract and convert a requirement from Word or PDF format to txt is roughly 1 minute. For requirements in Excel format the time for extraction and conversion does not depend on the amount of requirements, and therefore it is not possible to calculate a mean extraction and conversion time for these requirements.

Table 7.2 holds data from the four tests:

           Manual        Semi         Manual       Recall rate (%)
           analysis of   automatic    compliance   Cosine model        Norm. Matching model
           result (h)    method (h)   process (h)  Binary  TFW   LTFW  Binary  TFW   LTFW
  Test 1   8             13,1         20           99,6    -     -     99,6    -     -
  Test 2   1,5           1,7          4            52,3    -     -     45,5    -     -
  Test 3   1             3,1          1            100     100   100   -       -     -
  Test 4   2             2            3            42,9    40    45,7  -       -     -

Table 7.2 – Test data

For test 1, the total time it took to complete the semi automatic method, for each similarity measure model, was approximately 13,1 hours. The function group Messaging worked for 20 hours to complete the compliance process. The Cosine model got a recall rate of 99,6% and the Normalized Matching model also got a recall rate of 99,6%. For this test the similarity measure models only used binary weighting. The Messaging function group only took part in test session 1. Equivalent data are stated for the other tests, similarity measure models and weighting algorithms.

Page 52: Managing large amounts of natural language requirements ...fileadmin.cs.lth.se/serg/old-serg-dok/docs-master... · Managing large amounts of natural language requirements through

Managing large amounts of natural language requirements through natural language processing and information retrieval support

51

The manual compliance process, in test 2, was done by the author of this thesis and it was not equivalent to the one made by the Messaging FG. This process only involved stating the compliance for those requirements that had corresponding requirements in the set they were compared to.

For test 3 and 4 the manual compliance process was done by Niclas Forsberg. For test 3 only the old requirements got a statement of compliance, since he could not state compliance for new and changed requirements, and for test 4 the compliance process was conducted in the same way as for test 2.

It takes approximately 1 minute to manually analyze the result for one requirement when comparing RFIs from one operator. Since different operators formulate their requirements differently it takes more time, ~2,3 minutes, to do the same analysis when comparing requirements from two or more operators.

The time to complete the semi automatic method consists of the extraction and conversion time, execution time for the programs and the time it takes to manually analyze the output from the programs. The execution time is neglected since it is very small compared to the other times.

Note that all time data in Table 7.1 and Table 7.2 are approximate and will be used as indicators when deciding whether the semi automatic method is to be used or not.

7.1 Test 1, Operator A revision 1 and 2

In order to investigate whether the program was able to find changed and old requirements, 434 messaging requirements were extracted from Operator A revision 2 and 242 messaging requirements were extracted from Operator A revision 1. The time it took to extract these requirements and convert them into text format can be seen in Table 7.1. These sets were run through the NLP analysis and ranking process. The RFIs’ document histories were to be used as a key when verifying the result. After reviewing the document history for OA rev. 2, looking for requirements that were new or changed, it was discovered that the document history contained a great deal of anomalies, i.e. the requirements often had a false document history.

A manual examination of the requirements was done due to the lack of a reliable key. In this examination each requirement from OA rev. 2 was compared with the requirements from OA rev. 1. If a requirement from OA rev. 2 had a duplicate amongst the requirements in OA rev. 1, it was an old requirement. When the OA rev. 2 requirement was an updated version of an OA rev. 1 requirement, the requirement was listed as changed. A requirement from OA rev. 2 received the status new if it did not have any equivalent requirement in the OA rev. 1 set. The manual examination showed that 48,2% (209) of the requirements in OA rev. 2 were new, 11,5% (50) were changed and 40,3% (175) were old ones. The document history for OA rev. 2 stated that 50,5% (219) were new, 24,4% (106) were changed and 25,1% (109) were old requirements. A comparison between the document history and the manual examination made it clear that 27,2% (118) of the requirements in OA rev. 2 had an incorrect document history.

The manual examination also gave that 55 of the requirements in OA rev. 2 were old requirements but their identifier had changed. 11 requirements had changed but their identifiers were the same as in OA rev. 1.


7.1.1 Cosine Model

The manual analysis of the program’s results showed that OA rev. 2 contained 49 changed and 175 old requirements, and it took 8 hours as stated in Table 7.2. The recall for the Cosine model, using binary weighting, became 99,6%:

recall = (49 + 175) / (50 + 175) ≈ 99,6%

see Table 7.2, since the program’s result and the manual examination only differ for one requirement. There are no relevant requirements in OA rev. 1 for the new requirements in OA rev. 2, and for that reason they are not used when calculating the recall.

The requirement OAD-MMS_CONT-012-2 from OA rev. 2 was a changed requirement but the original requirement, which had the same identifier, OAD-MMS_CONT-012-2 from OA rev. 1 was not found. When the input to the similarity measure algorithm was studied, it was found that only one word was identical in both requirements, which resulted in a similarity measure equal to 0,1768. The fifth ranked requirement had a similarity value that was 0,2357. It had two words that existed in the requirement from OA rev. 2 and this resulted in a higher similarity value than OAD-MMS_CONT-012-2 even though it contained more words.

Now OA rev. 1 was compared with OA rev. 2 in order to see if the program would get a match between the above mentioned requirements. Unfortunately the result was negative, i.e. no match.

When the results were studied further it could be seen that four of the wanted requirements did not end up in first place. There are three reasons why a wanted requirement does not end up in first place: first, it may have too few words that are identical with the ones from the requirement it was compared to; second, it may contain too many words, resulting in a high denominator value which gives a low similarity value; third, a requirement may end up in a position below first place even though it has the same similarity value as the requirement(s) above, because its place in the data file is below the top ranked requirement.

7.1.2 Normalized Matching Model

The Normalized Matching model also got a recall of 99,6%, which was calculated in the same way as for the Cosine model. This result can be seen in Table 7.2. The requirement that was not found was the same requirement that troubled the Cosine model. When the result file was studied, it was seen that all the top ranked requirements contained two words identical with the ones from OAD-MMS-CONT-012-2 rev. 2, while OAD-MMS-CONT-012-2 from OA rev. 1 only had one identical word.

In the result file which contained the results of a comparison between OA rev. 1 and OA rev. 2, it was seen that the Normalized Matching model could not find OAD-MMS-CONT-012-2.

A further analysis of the results showed that there were 10 requirements that did not make it to the top of the list. The main reason, for 8 out of 10, was that the requirements were placed after the highest ranked ones in the data file. The others did not have enough words identical with the requirements from OA rev. 2.


7.1.3 Manual Analysis versus Semi Automatic Analysis

Alongside the semi automatic analysis, a manual analysis was conducted by the Messaging function group. It took them approximately 20 hours to complete the manual compliance process. They estimated that OA rev. 2 consisted of 65% identical requirements. Their “identical” requirements probably contained some requirements that had changed a little since OA rev. 1.

It took roughly 13,1 hours to do the semi automatic compliance process for one similarity measure model. Note that this process included a manual analysis of the results from one execution but did not include stating the compliance of new and changed requirements. It is difficult to estimate the time needed for stating the compliance of new and changed requirements since the complexity varies a lot. The time it can take to state the compliance of a single requirement varies between seconds up to hours since it depends on the complexity of the requirement and on how experienced the requirement engineer is.

7.2 Test 2, SMS requirements from Operator A and other Operators

SMS requirements were extracted from Messaging’s database and divided into requirements from Operator A and requirements from other operators. There were 36 requirements from Operator A and 192 requirements from other operators. In Table 7.1 it can be seen that the extraction and conversion time was very short, approximately 5 minutes for each set. The time it takes to extract requirements from Excel sheets does not depend on the amount of requirements, since it is easy to extract requirements from these. The requirement sets were chosen to see if the program could get a good match between requirements that have the same meaning but are stated differently. The Cosine model and the Normalized Matching model, with binary weighting, were used.

7.2.1 Cosine Model

The key for this test can be found in Appendix D. The requirements to the left are OA requirements. To the right of them are the corresponding requirements, which are written by other operators. The requirements marked with grey are the ones that the program was not able to find. Those OA requirements that did not have any equivalent requirements, amongst the other requirements, were left out. If Appendix D is studied further it can be seen that there are six relevant requirements for requirement SMS160 resulting in a maximum recall rate value, for the first execution, equivalent to 97,7%.

The recall after the first execution, where OA requirements were compared with requirements from other operators, became 52,3% as stated in Table 7.2. A second execution, a comparison between requirements from other operators and OA requirements, was conducted in an attempt to increase the recall. This attempt was successful, as 8 new requirements were found by the program. The two executions together managed to get a recall of 70,5%.

Low similarity measures, due to small amounts of identical words together with too many words in total, were the reason why certain requirements were not found by the similarity measure algorithm.


7.2.2 Normalized Matching Model

The first test compared OA requirements with requirements from other operators. The corresponding key for this test is placed in Appendix E. The OA requirements are placed to the left; requirements from other operators are placed to the right of the OA requirements. Requirements that were not found are colored red. OA requirements with no equivalent requirements are not stated. As for the Cosine model, the recall for the first execution has a maximum limit of 97,7%.

The first execution managed to get a recall of 45,5%, which is stated in Table 7.2. A second execution was performed to see if this ratio could be increased. Requirements written by other operators were set up against requirements from OA in this execution. 11 new requirements got a match resulting in a recall of 70,5%.

Those requirements that were not found by the program often had the same similarity measures as those that received a high ranking, but they were placed after them in the data file. Another reason was that some requirements contained too few words identical with the requirement they were compared to.

7.3 Similarity Measure Model Summary

Table 7.2 contains the recall rates for the two similarity measure models in the two tests. The recall rates come from comparing OA rev. 2 and OA rev. 1 and from a comparison between SMS requirements from Operator A and SMS requirements from other operators. For test 1 the two models got the same recall rate, but the Cosine model had more top ranked requirements amongst the retrieved ones. They probably got the same recall rate because the RFIs had a high degree of dependence between them. The Cosine model got a better result than the Normalized Matching model in test 2.

The results from test 2 are compared with each other in Appendix F. The requirements with a thick box border come from Operator A and the ones listed beneath come from other operators. The grey marked requirements are the ones not found by the algorithm. The models often retrieved and missed the same requirements, but the Cosine model got better results for SMS159, SMS164 and SMS187 and the Normalized Matching model got a better result for SMS167. Since the Cosine model got the best overall results, it was decided that this algorithm would be used in the following tests; operator requirements seemed to fit the Cosine model better. The following tests were conducted to see if the Cosine model with other weighting algorithms could manage to find similar and/or contradictory requirements better.


7.4 Test 3, Operator B revision 1 and 2

Since Operator B states their requirements very differently compared to Operator A, a test was conducted with messaging requirements from two Operator B RFI releases. 63 requirements were extracted from Operator B revision 2 and 58 requirements were extracted from Operator B revision 1. Both requirement sets were treated by the NLP programs. Finally, RFI 2 was compared with RFI 1. The manual examination of the requirement sets and the manual analysis of the results were done by Niclas Forsberg. The time it took to complete these activities is stated in Table 7.2. As for the test with Operator A, the goal was to see if the Cosine model, using three different weighting models, was able to find old, changed and contradictory requirements. The requirement names were added to aid in the manual analysis but did not take part in the similarity measure calculations. It took approximately one hour to extract the requirements from the Operator B revision 2 RFI and one hour to extract the requirements from the Operator B revision 1 RFI. These data can be found in Table 7.1.

The recall rates for the three different weighting algorithms are found in Table 7.2.

Binary Weighting

The Cosine model, using binary weighting, was able to retrieve all wanted requirements, giving a recall rate of 100% and all wanted requirements were top ranked.

Term Frequency Weighting

The recall for this test became 100% since all requirements were found by the algorithm. All requirements were also top ranked in the result file.

Logarithmic Term Frequency Weighting

The result for the Cosine model, using logarithmic term frequency weighting, was as good as the results from the previous tests with these requirements sets. In other words the recall became equal to 100% and all requirements received the highest ranking possible.

7.5 Test 4, Operator B revision 2 and Operator A revision 2

In these test rounds the requirement sets from Operator B revision 2 and Operator A revision 2 were used. The amount of requirements was 434 from Operator A and 62 from Operator B. These extracted requirements were messaging requirements. As can be seen in Table 7.1, it did not take any time to extract and convert these requirement sets since this had been done in previous tests.

Operator B requirements were compared with Operator A requirements. Here too, Niclas Forsberg performed the manual examination of the requirement sets and the manual analysis of the results. The time it took to complete these activities is found in Table 7.2. The main purpose was to see if any of the three different weighting algorithms received a better recall rate than the others, since the test was conducted with the Cosine model and the three different types of weighting. The result files contained, besides the requirement description, the requirement name, product name, project name, compliance and comment. The requirement name was stated before the description, and the product name, project name, compliance and comment come after the description. Only the descriptions were used when matching; the other information was there to aid in the manual analysis.


Binary Weighting

The key for this test is located in Appendix G. When it is studied, it can be seen that there are 8 relevant requirements for requirement R5, which limited the highest achievable recall to 91,4%. The requirements with a thick box border around them originate from Operator B, and those listed beneath are Operator A requirements. Requirements marked with grey are those that were not found by the program. Note that requirements that did not have any similar or contradictory relatives were left out. The recall for this test was 42,9%, as stated in Table 7.2.

Term Frequency Weighting

The Cosine model with term frequency weighting only managed a recall rate of 40,0% (Table 7.2), slightly worse than binary weighting. The recall was calculated from the key file in Appendix H and had the same upper limit as for binary weighting. The Operator B requirements have thick box borders around them, and the requirements that were not found by the program are colored red.

Logarithmic Term Frequency Weighting

The recall for this test, found in Table 7.2, was 45,7%, the best result of the three weighting algorithms. The key file is located in Appendix I. As stated earlier, the recall could not exceed 91,4%. The requirements that were not found by the program are marked with grey, and the Operator B requirements have a thick box border around them.

7.6 Weighting Algorithm Summary

Table 7.2 contains the recall rates for the three weighting algorithms in tests 3 and 4. The recall rates come from comparing Operator B revision 2 with Operator B revision 1, and Operator B revision 2 with Operator A revision 2. In test 3 the three algorithms achieved the same recall rate, 100%, which was good but gave no help in deciding which weighting algorithm is best. The LTFW algorithm achieved the best result in test 4.

In Appendix J, the results from the three weighting algorithms in test 4 are compared with each other. They often retrieved the same requirements, but the results vary for requirements R5, R8, R25, R53 and R62. The largest variation is for R5, which had eight similar requirements in the Operator A RFI. Despite this, the TFW model only managed to find one of them, and the LTFW model found three. When the deviant results were studied and compared, it became clear that the TFW model favored requirements with many occurrences of frequent Operator B terms, which caused many "false" requirements to receive a high ranking. The binary weighting algorithm also got a better result for R62; the TFW and LTFW algorithms got better results for the other requirements.
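Recall throughout these tests is the fraction of the manually identified relevant pairs (the key) that the program also retrieved, summed over all requirements. A sketch of the computation, using a hypothetical miniature key with shortened identifiers:

```python
def recall(key, retrieved):
    """Fraction of relevant pairs in the manual key that were retrieved.

    Both arguments map a requirement id to a set of matching ids.
    """
    relevant = sum(len(ids) for ids in key.values())
    found = sum(len(ids & retrieved.get(req, set()))
                for req, ids in key.items())
    return found / relevant if relevant else 1.0

# Hypothetical miniature key: R5 has several relevant matches, of which
# only one was retrieved, mirroring the situation in test 4.
key = {"R5": {"Tran-010", "Tran-024", "Tran-021"}, "R8": {"PLG-020"}}
retrieved = {"R5": {"Tran-010"}, "R8": {"PLG-020"}}
print(recall(key, retrieved))  # 0.5
```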


8 Conclusions

The proposed methods worked well (recall rate ~100%) when comparing two RFI releases from a single operator. The results from tests with different operators (recall rate ~45%) were not as good. The high recall rates from the single-operator tests were more or less expected, since RFIs originating from one operator are strongly interdependent. The recall rates from the tests with requirements from different operators were expected to be significantly lower, since different operators formulate their requirements differently.

The outcome is that the Cosine model with logarithmic term frequency weighting gave the best overall results, with the binary weighting Cosine model a close runner-up. Note that the results are not statistically validated and should be regarded as an indication only.

After studying the time data in Table 7.2, it is believed that time and effort can be reduced if the semi-automatic method is used instead of the strictly manual one when dealing with large amounts of natural language requirements. Exactly how much is impossible to say, but with an experienced requirements engineer perhaps up to 20-25%.

Sony Ericsson should, with the method in its current version, only use the semi-automatic approach to compare requirements from a single operator, since those tests gave good results both in terms of time and recall rate. At the moment it is not recommended to use the method to compare requirement sets originating from two or more operators: given the time the method takes and the recall rates achieved in these tests, the time and effort put into the compliance process would probably not be reduced.

If SEMC decides to use the semi-automatic method to compare new and old RFIs from one operator, it will be able to sort out old requirements so that the AoEs do not have to state their compliance again. SEMC will also be able to match old requirements with their new versions, which may help the AoEs state the compliance for these requirements.

8.1 Limitations & Usability

One problem with this semi-automatic method is that it takes quite some time to complete. RFIs that are not in Excel format take even longer, since extracting the requirements from formats such as PDF or Word is much more time consuming.

The lack of a well-functioning operator requirements database reduces the usability significantly.

The major problem is that similar requirements can be written with different words; the program has difficulty matching such requirements. Misspellings in incoming RFIs are also a problem, because the program cannot handle them.


Since the program has technical limitations, it misses some requirements when comparing requirement sets from two operators. To find the missed requirements, a complete manual examination of the sets must be made. If the manual analysis of the program's results indicates that there are no equivalent requirements, the person performing the analysis cannot be sure that this is the case and has to examine the set against which the incoming requirements were compared.

The program may, in its current version, aid the process of handling RFIs, but it cannot be trusted completely due to the complex setting in which it is intended to be used.

8.2 Recommendations of Further Development

Further testing is needed to statistically validate the results.

It would be interesting to examine how well other similarity measure models would succeed and if additional NLP methods (e.g. thesaurus and frequency lists) could help increase the recall rate.

The separate program parts should be merged into a single, more user-friendly program.

In the future, the program should be integrated with a requirements database or a requirements engineering tool.


9 References

Literature

[1] Kindlein, M (2003)

Handling of Requests for Information (RFI) – Experiences and Proposals, Internal document within Sony Ericsson Mobile Communications AB

[2] Chowdhury, G. G. (1999)

Introduction to modern information retrieval, London: Library Association Publishing

ISBN: 1-85604-318-5

[3] Frakes, W. B. & Baeza-Yates, R. (1992)

Information retrieval, data structures & algorithms, New Jersey: Prentice Hall

ISBN: 0-13-463837-9

[4] Sommerville, I. (2001)

Software engineering, 6th ed., Essex: Pearson Education Limited

ISBN: 0-201-39815-X

[5] Baeza-Yates, R. & Ribeiro-Neto, B. (1999)

Modern information retrieval, Harlow: Addison Wesley

ISBN: 0-201-39829-X

[6] Mitkov, R. (2003)

The Oxford handbook of computational linguistics, Oxford: Oxford University Press

ISBN: 0-19-823882-7

[7] Manning, C. D. & Schütze, H. (2001)

Foundations of statistical natural language processing, Cambridge: MIT Press

ISBN: 0-262-13360-1

[8] Karlsson, E.-A. (2004)

SWEP – SW engineering process basic course, Internal document within Sony Ericsson Mobile Communications AB


[9] Leach R. J. (2000)

Introduction to software engineering, Boca Raton: CRC Press LLC

ISBN: 0-8493-1445-3

[10] Pfleeger, S. L. (1998)

Software engineering, theory and practice, New Jersey: Prentice Hall

ISBN: 0-13-624842-X

[11] Pressman, R. S. (1992)

Software engineering, a practitioner’s approach, Singapore: McGraw-Hill

ISBN: 0-07-050-814-3

[12] Robertson, S. & Robertson, J. (1999)

Mastering the requirements process, Addison-Wesley

ISBN: 0-201-36046-2

[13] Lauesen, S. (2000)

Software requirements, styles and techniques, Frederiksberg: Samfundslitteratur

ISBN: 87-593-0794-3

[14] Lesk, M. E. & Schmidt, E. (1975)

LEX – A lexical analyzer generator, Computer Science Technical Report, volume 39, Murray Hill: AT&T Bell Laboratories

[15] Natt och Dag, J., Regnell, B., Carlshamre, P., Andersson, M. & Karlsson, J. (2002)

A feasibility study of automated natural language requirements analysis in market-driven development, Requirements Engineering, volume 7, London: Springer-Verlag London Limited

[16] Minnen, G., Carroll, J. & Pearce, D. (2001)

Applied morphological processing of English, Natural Language Engineering, volume 7(3), Cambridge: Cambridge University Press

Electronic Resources

[17] Natt och Dag, J. (2002)

Automated requirements analysis, http://serg.telecom.lth.se/personnel/johannod/demo_reqanalysis/nattochdag_LUCAS_demo_1_req_analysis.pdf


[18] http://dictionary.reference.com

[19] http://thesaurus.reference.com

[20] Ribarsky, W. (2003)

http://www.cc.gatech.edu/classes/AY2004/cs1050b_fall/3_DNF.ppt

[21] http://seldweb01.sonyericsson.net/business_viewer/sw/

Internal webpage within Sony Ericsson Mobile Communications AB

[22] Boehm, B. & Hansen, W. J. (2001)

The spiral model as a tool for evolutionary acquisition, http://www.stsc.hill.af.mil/crosstalk/2001/05/boehm.html

[23] Garigliano, R. (2003)

JNLE Editorial, http://www.dur.ac.uk/~dcs0www3/lnle/editorial.htm


Glossary

CTO Chief Technology Officer. Handles the communication with platform developers.

DU Development Unit.

FG Function Group. A division within SEMC, which works with a specific area such as messaging, acoustics or mechanics etc.

GPM Global Product Management. Handles the communication with large operators.

IR Information Retrieval.

Lexicology The science of the derivation and signification of words.

Morphology The study of the structure and form of words in language or a language, including inflection, derivation, and the formation of compounds.

MS Milestone. A milestone is an end-point of a software process activity. The formal output should be a report of some kind. Milestones are used to keep track of the process.

NLP Natural Language Processing.

PAP Product & Application Planning. A division within SEMC that plans and manages product development.

PRS Product Requirement Specification. The PRS contains high-level requirements and is produced by the PAP division.

Query A formal statement of an information need.

SIA Software Impact Analysis. An investigation to see how the requirement modules are affected.

SMS Short Message Service. An application in mobile phones that lets the user send short text messages to other mobile phone users.

SRS System Requirements Specification. Contains both software and hardware requirements. The requirements in this specification can be very detailed.

SSD Software System Design. A document which describes the design of the software system.

Stakeholder Refers to anyone who has any knowledge about a system and therefore should have some kind of influence on the system requirements.

Token A character string with collective significance.


TRS Technical Requirements Specification. The TRS contains platform specific requirements.

TWG Technical Work Group. Works with roadmaps (i.e. future functions).

UCD User Centric Design. User interface design with focus on user friendliness.


Appendix

Appendix A – Flex Rules File

DIGIT [0-9]

CHAR [a-zA-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ]

FLOAT {DIGIT}+"."{DIGIT}+

STANDARD {CHAR}+({CHAR}|{DIGIT})+"."({CHAR}|{DIGIT})+("."({CHAR}|{DIGIT})+)*

REFERENCE (({CHAR}|{DIGIT})+"-"{DIGIT}+)+

TOKEN {CHAR}|{CHAR}+({CHAR}|{DIGIT})+|{FLOAT}|{STANDARD}|{REFERENCE}

DELIMITER [^0-9a-zA-ZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ\n\t\"]

%x row

%x field

%x fieldst

%{ int i; int quote; int quotelast; %}

%%

[\"] ; /* Remove quotation marks */

[\n] ECHO; BEGIN(row); /* Ignore/output first row containing labels */

<row>[\"] BEGIN(fieldst);

<row>[\t] for(i=0; yytext[i]!=0; i++) { putchar(tolower(yytext[i])); }; BEGIN(fieldst); /* output identifier */

<fieldst>[\"] quote=1; quotelast = 1; BEGIN(field); /* Increase quotation counter and ignore quote */
<fieldst>{DELIMITER}+ ; /* remove initial delimiter characters */
<fieldst>{TOKEN} quotelast = 0; for(i=0; yytext[i]!=0; i++) { putchar(tolower(yytext[i])); }; BEGIN(field);
<fieldst>[\n] printf("\n"); BEGIN(row); /* If last field is empty! */
<fieldst>[\t] ECHO; BEGIN(row); /* If field is empty! */

<field>[\"] if (quotelast) { quote++; quotelast = 0; } else { quote--; quotelast = 1; }
<field>{DIGIT}+ ; /* Remove digit tokens */
<field>{TOKEN} quotelast = 0; for(i=0; yytext[i]!=0; i++) { putchar(tolower(yytext[i])); }
<field>{DELIMITER}+[\n]{DELIMITER}+ quotelast = 0; if (quote>0) { printf(" "); } else { printf("\n"); BEGIN(row); }
<field>{DELIMITER}+[\t] quotelast = 0; if (quote>0) { printf(" "); } else { printf("\t"); BEGIN(fieldst); }
<field>{DELIMITER}+[\n] quotelast = 0; if (quote>0) { printf(" "); } else { printf("\n"); BEGIN(row); }
<field>{DELIMITER}+ quotelast = 0; printf(" ");
<field>[\t] quotelast = 0; if (quote>0) { printf(" "); } else { ECHO; BEGIN(fieldst); }
<field>[\n] quotelast = 0; if (quote>0) { printf(" "); } else { ECHO; BEGIN(row); }

%%
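The rules above lower-case every field, strip quotation marks, drop purely numeric tokens, and keep words, floats, standard numbers and references as single tokens. A rough Python analogue of that tokenization (the regular expression only approximates the Flex character classes above, which are normative):

```python
import re

# One token: a float like "3.0", or a word starting with a letter that
# may contain digits, hyphens, underscores and internal dots (e.g. the
# requirement id "oaf-mms_40-4"); purely numeric tokens are never
# matched, mirroring the <field>{DIGIT}+ rule.
TOKEN = re.compile(r"\d+\.\d+|[^\W\d_][\w-]*(?:\.[\w-]+)*")

def tokenize(field: str) -> list[str]:
    """Lower-case a field, drop quotation marks and return its tokens."""
    return TOKEN.findall(field.replace('"', "").lower())

print(tokenize('Send "MMS" version 3.0 now.'))
# ['send', 'mms', 'version', '3.0', 'now']
```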


Appendix B – List of Stopwords

a behind everybody has lets numbers puts such uses

about being everyone have like o q support v

above beings everything having likely of quite sure w

across best everywhere he long off r t want

after better f her longer often rather take wanted

again between face here longest old really taken wanting

against big faces herself m older right than wants

all both fact high made oldest right that was

almost but facts higher make on room the way

alone by far highest making once rooms their ways

along c felt him man one s them we

already came few himself many only said then well

also can find his may open same there wells

although cannot finds how me opened saw therefore went

always case first however member opening say these were

among cases for i members opens says they very

an certain four if men or second thing what

and certainly from important might order seconds things when

another clear full in more ordered see think where

any clearly fully interest most ordering seem thinks whether

anybody come further interested mostly orders seemed this which

anyone could furthered interesting mr other seeming those while

anything d furthering interests mrs others seems though who

anywhere did furthers into much our sees thought whole

are differ g is must out several thoughts whose

area different gave it my over shall three why

areas differently general its myself p she through will

around do generally itself n part should thus with

as does get j necessary parted show to within

ask done gets just need parting showed today without

asked down give k needed parts showing together work

asking downed given keep needing per shows too worked

asks downing gives keeps needs perhaps side took working

at downs go kind new place sides toward works

away during going knew never places since turn would

b e good know newer point small turned x

back each goods known newest pointed smaller turning y

backed early got knows next pointing smallest turns year

backing either great l no points so two years

backs end greater large nobody possible some u yet

be ended greatest largely non present somebody under you

became ending group last none presented someone until young

because ends grouped later not presenting something up younger

become enough grouping latest nothing presents somewhere upon youngest

becomes even groups least now problem state us your

been evenly h less nowhere problems states use yours

before ever had let number put still used z

began every


Appendix C - Validation data

Cosine model, using binary weighting, similarity measures by hand

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,3780  0,1925  0  0,1179  0       0,1054  0  0,1260
OAM-SL_MS_4-1      0,2182  0,3333  0  0,2041  0       0,1826  0  0,2182
OAD-MMS_PLG-004-2  0       0,1021  0  0       0,1336  0       0  0
OAF-IM_7-1         0       0,1491  0  0       0,0976  0,0816  0  0
OAD-MMS_Conf-0012  0       0,0801  0  0       0       0       0  0
OAD-PCS_MF-003     0,1952  0,2981  0  0,0913  0       0,0816  0  0,0976
OAD-MMS_Tran-003   0       0,0913  0  0       0,2390  0       0  0

Table A – Cosine model, binary weighting, similarity measures computed manually

Normalized Matching model, using binary weighting, similarity measures by hand

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,375   0,25    0  0,125   0       0,125   0  0,125
OAM-SL_MS_4-1      0,1667  0,3333  0  0,1667  0       0,1667  0  0,1667
OAD-MMS_PLG-004-2  0       0,5     0  0       0,5     0       0  0
OAF-IM_7-1         0       0,5     0  0       0,25    0,25    0  0
OAD-MMS_Conf-0012  0       1       0  0       0       0       0  0
OAD-PCS_MF-003     0,2222  0,4444  0  0,1111  0       0,1111  0  0,1111
OAD-MMS_Tran-003   0       0,3333  0  0       0,6667  0       0  0

Table B – Normalized Matching model, binary weighting, similarity measures computed manually


Cosine model, using term frequency weighting, similarity measures by hand

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,3904  0,2     0  0,1826  0       0,1633  0  0,0976
OAM-SL_MS_4-1      0,2182  0,2981  0  0,2041  0       0,1826  0  0,2182
OAD-MMS_PLG-004-2  0       0,1826  0  0       0,4472  0       0  0
OAF-IM_7-1         0       0,1217  0  0       0,1491  0,1491  0  0
OAD-MMS_Conf-0012  0       0,1291  0  0       0       0       0  0
OAD-PCS_MF-003     0,1974  0,2697  0  0,1231  0       0,1101  0  0,0658
OAD-MMS_Tran-003   0       0,1826  0  0       0,5217  0       0  0

Table C – Cosine model, term frequency weighting, similarity measures computed manually

Cosine model, using logarithmic term frequency weighting, similarity measures by hand

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,3904  0,2     0  0,1826  0       0,1633  0  0,0976
OAM-SL_MS_4-1      0,2182  0,2981  0  0,2041  0       0,1826  0  0,2182
OAD-MMS_PLG-004-2  0       0,1549  0  0       0,3795  0       0  0
OAF-IM_7-1         0       0,1217  0  0       0,1491  0,1491  0  0
OAD-MMS_Conf-0012  0       0,1291  0  0       0       0       0  0
OAD-PCS_MF-003     0,1974  0,2697  0  0,1231  0       0,1101  0  0,0658
OAD-MMS_Tran-003   0       0,1685  0  0       0,4927  0       0  0

Table D – Cosine model, logarithmic term frequency weighting, similarity measures computed manually


Cosine model, using binary weighting, similarity measures by program

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,3780  0,1925  0  0,1179  0       0,1054  0  0,1260
OAM-SL_MS_4-1      0,2182  0,3333  0  0,2041  0       0,1826  0  0,2182
OAD-MMS_PLG-004-2  0       0,1021  0  0       0,1336  0       0  0
OAF-IM_7-1         0       0,1491  0  0       0,0976  0,0817  0  0
OAD-MMS_Conf-0012  0       0,0801  0  0       0       0       0  0
OAD-PCS_MF-003     0,1952  0,2981  0  0,0913  0       0,0817  0  0,0976
OAD-MMS_Tran-003   0       0,0913  0  0       0,2390  0       0  0

Table E – Cosine model, binary weighting, similarity measures computed by the program

Normalized Matching model, using binary weighting, similarity measures by program

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,375   0,25    0  0,125   0       0,125   0  0,125
OAM-SL_MS_4-1      0,1667  0,3333  0  0,1667  0       0,1667  0  0,1667
OAD-MMS_PLG-004-2  0       0,5     0  0       0,5     0       0  0
OAF-IM_7-1         0       0,5     0  0       0,25    0,25    0  0
OAD-MMS_Conf-0012  0       1       0  0       0       0       0  0
OAD-PCS_MF-003     0,2222  0,4444  0  0,1111  0       0,1111  0  0,1111
OAD-MMS_Tran-003   0       0,3333  0  0       0,6667  0       0  0

Table F – Normalized Matching model, binary weighting, similarity measures computed by the program


Cosine model, using term frequency weighting, similarity measures by program

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,3904  0,2     0  0,1826  0       0,1633  0  0,0976
OAM-SL_MS_4-1      0,2182  0,2981  0  0,2041  0       0,1826  0  0,2182
OAD-MMS_PLG-004-2  0       0,1826  0  0       0,4472  0       0  0
OAF-IM_7-1         0       0,1217  0  0       0,1491  0,1491  0  0
OAD-MMS_Conf-0012  0       0,1291  0  0       0       0       0  0
OAD-PCS_MF-003     0,1974  0,2697  0  0,1231  0       0,1101  0  0,0658
OAD-MMS_Tran-003   0       0,1826  0  0       0,5217  0       0  0

Table G – Cosine model, term frequency weighting, similarity measures computed by the program

Cosine model, using logarithmic term frequency weighting, similarity measures by program

Columns, in order: OAF_MMS_3-1, OAF_IT_2-1, OAD-MMS_Cont-013, OAD-MMS_MsgHd-022, OAD-MMS_Usa-031, OAD-MMS_Tran-016, OAD-EF-0008, OAF_IM_2-2

OAF-MMS_1-2        0,3904  0,2     0  0,1826  0       0,1633  0  0,0976
OAM-SL_MS_4-1      0,2182  0,2981  0  0,2041  0       0,1826  0  0,2182
OAD-MMS_PLG-004-2  0       0,1549  0  0       0,3795  0       0  0
OAF-IM_7-1         0       0,1217  0  0       0,1491  0,1491  0  0
OAD-MMS_Conf-0012  0       0,1291  0  0       0       0       0  0
OAD-PCS_MF-003     0,1974  0,2697  0  0,1231  0       0,1101  0  0,0658
OAD-MMS_Tran-003   0       0,1685  0  0       0,4927  0       0  0

Table H – Cosine model, logarithmic term frequency weighting, similarity measures computed by the program
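The validation in Tables A–H amounts to checking that the program reproduces the hand-computed similarity values for every requirement pair; the only deviations are rounding differences in the fourth decimal (0,0816 versus 0,0817). A sketch of such a check (the pair key shown is illustrative):

```python
def agree(manual, program, decimals=3):
    """True if every manually computed value matches the program's value
    for the same requirement pair, after rounding to `decimals` places."""
    return all(round(value, decimals) == round(program[pair], decimals)
               for pair, value in manual.items())

# Illustrative pair: hand computation and program output differ only in
# the fourth decimal, which the rounding tolerance accepts.
manual  = {("OAF-IM_7-1", "OAD-MMS_Tran-016"): 0.0816}
program = {("OAF-IM_7-1", "OAD-MMS_Tran-016"): 0.0817}
print(agree(manual, program))  # True
```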


Appendix D – Operator A vs other Operators, Key for Cosine Model

Operator A requirement, followed by the relevant requirements from the other operators:

SMS157: SMS045, SMS231
SMS158: SMS045
SMS159: SMS047, SMS105, SMS193, SMS049
SMS160: SMS211, SMS074, SMS134, SMS046, SMS123, SMS004
SMS161: SMS001, SMS002
SMS164: SMS205
SMS165: SMS050
SMS166: SMS095
SMS167: SMS096, SMS097, SMS037
SMS168: SMS051
SMS169: SMS049, SMS105, SMS193
SMS171: SMS096, SMS206, SMS037, SMS097
SMS172: SMS082, SMS085, SMS018, SMS030, SMS062
SMS173: SMS193
SMS175: SMS089
SMS183X: SMS197, SMS200
SMS183: SMS052, SMS054
SMS185: SMS096, SMS146
SMS187: SMS142, SMS107


Appendix E – Operator A vs other Operators, Key for Normalized Matching Model

Operator A requirement, followed by the relevant requirements from the other operators:

SMS157: SMS045, SMS231
SMS158: SMS045
SMS159: SMS047, SMS105, SMS193, SMS049
SMS160: SMS211, SMS074, SMS134, SMS046, SMS123, SMS004
SMS161: SMS001, SMS002
SMS164: SMS205
SMS165: SMS050
SMS166: SMS095
SMS167: SMS096, SMS097, SMS037
SMS168: SMS051
SMS169: SMS049, SMS105, SMS193
SMS171: SMS096, SMS206, SMS037, SMS097
SMS172: SMS082, SMS085, SMS018, SMS030, SMS062
SMS173: SMS193
SMS175: SMS089
SMS183X: SMS197, SMS200
SMS183: SMS052, SMS054
SMS185: SMS096, SMS146
SMS187: SMS142, SMS107


Appendix F – Comparison of Similarity Measure Models

Cosine model Normalized Matching model SMS157 SMS157

SMS045 SMS045

SMS231 SMS231

SMS158 SMS158

SMS045 SMS045

SMS159 SMS159

SMS047 SMS047

SMS105 SMS105

SMS193 SMS193

SMS049 SMS049

SMS160 SMS160

SMS211 SMS211

SMS074 SMS074

SMS134 SMS134

SMS046 SMS046

SMS123 SMS123

SMS004 SMS004

SMS161 SMS161

SMS001 SMS001

SMS002 SMS002

SMS164 SMS164

SMS205 SMS205

SMS165 SMS165

SMS050 SMS050

SMS166 SMS166

SMS095 SMS095

SMS167 SMS167

SMS096 SMS096

SMS097 SMS097

SMS037 SMS037

SMS168 SMS168

SMS051 SMS051

SMS169 SMS169

SMS049 SMS049

SMS105 SMS105

SMS193 SMS193


SMS171 SMS171

SMS096 SMS096

SMS206 SMS206

SMS037 SMS037

SMS097 SMS097

SMS172 SMS172

SMS082 SMS082

SMS085 SMS085

SMS018 SMS018

SMS030 SMS030

SMS062 SMS062

SMS173 SMS173

SMS193 SMS193

SMS175 SMS175

SMS089 SMS089

SMS183X SMS183X

SMS197 SMS197

SMS200 SMS200

SMS183 SMS183

SMS052 SMS052

SMS054 SMS054

SMS185 SMS185

SMS096 SMS096

SMS146 SMS146

SMS187 SMS187

SMS142 SMS142

SMS107 SMS107


Appendix G – Operator B vs Operator A, Key for Cosine Model, Binary Weighting

Operator B requirement, followed by the relevant Operator A requirements:

R1: OAD-MMS_Conf-0003-2
R4: OAD-IM_PROT-005
R5: OAD-MMS_Tran-010-2, OAD-MMS_Tran-024-2, OAD-MMS_Tran-021-2, OAD-MMS_Tran-022-2, OAF-MMS_40-4, OAF-MMS-88-1, OAF-MMS-89-1, OAF-MMS-90-1
R6: OAD-MMS_Cont-027, OAD-MMS_Cont-028
R7: OAD-MMS_Usa-043B-2
R8: OAD-MMS_PLG-020-2, OAD-MMS_PLG-021-2
R14: OAF-MMS_19-3
R15: OAF-MMS_19-3
R16: OAD-MMS_Tran-011
R19: OAD-MMS_Usa-031, OAD-MMS_Usa-032-2
R25: OAF-IT_2-1, OAD-EF-0009, OAD-EF-0010
R38: OAF-UM_5-1
R39: OAF-UM_13-1
R46: OAF-UM_13-1
R49: OAD-IM_CONF-001
R53: OAF-IM_5-2
R55: OAF-IM_4-2
R57: OAD-IM_PROT-001
R59: OAF-IM_10-2, OAD-IM_PROT-003
R60: OAD-IM_PROT-012
R61: OAF-IM_14-3
R62: OAD-IM_PROT-005


Appendix H – Operator B vs Operator A, Key for Cosine Model, Term Frequency Weighting

Operator B requirement, followed by the relevant Operator A requirements:

R1: OAD-MMS_Conf-0003-2
R4: OAD-IM_PROT-005
R5: OAD-MMS_Tran-010-2, OAD-MMS_Tran-024-2, OAD-MMS_Tran-021-2, OAD-MMS_Tran-022-2, OAF-MMS_40-4, OAF-MMS-88-1, OAF-MMS-89-1, OAF-MMS-90-1
R6: OAD-MMS_Cont-027, OAD-MMS_Cont-028
R7: OAD-MMS_Usa-043B-2
R8: OAD-MMS_PLG-020-2, OAD-MMS_PLG-021-2
R14: OAF-MMS_19-3
R15: OAF-MMS_19-3
R16: OAD-MMS_Tran-011
R19: OAD-MMS_Usa-031, OAD-MMS_Usa-032-2
R25: OAF-IT_2-1, OAD-EF-0009, OAD-EF-0010
R38: OAF-UM_5-1
R39: OAF-UM_13-1
R46: OAF-UM_13-1
R49: OAD-IM_CONF-001
R53: OAF-IM_5-2
R55: OAF-IM_4-2
R57: OAD-IM_PROT-001
R59: OAF-IM_10-2, OAD-IM_PROT-003
R60: OAD-IM_PROT-012
R61: OAF-IM_14-3
R62: OAD-IM_PROT-005


Appendix I – Operator B vs Operator A, Key for Cosine Model, Logarithmic Term Frequency Weighting

R1: OAD-MMS_Conf-0003-2
R4: OAD-IM_PROT-005
R5: OAD-MMS_Tran-010-2, OAD-MMS_Tran-024-2, OAD-MMS_Tran-021-2, OAD-MMS_Tran-022-2, OAF-MMS_40-4, OAF-MMS-88-1, OAF-MMS-89-1, OAF-MMS-90-1
R6: OAD-MMS_Cont-027, OAD-MMS_Cont-028
R7: OAD-MMS_Usa-043B-2
R8: OAD-MMS_PLG-020-2, OAD-MMS_PLG-021-2
R14: OAF-MMS_19-3
R15: OAF-MMS_19-3
R16: OAD-MMS_Tran-011
R19: OAD-MMS_Usa-031, OAD-MMS_Usa-032-2
R25: OAF-IT_2-1, OAD-EF-0009, OAD-EF-0010
R38: OAF-UM_5-1
R39: OAF-UM_13-1
R46: OAF-UM_13-1
R49: OAD-IM_CONF-001
R53: OAF-IM_5-2
R55: OAF-IM_4-2
R57: OAD-IM_PROT-001
R59: OAF-IM_10-2, OAD-IM_PROT-003
R60: OAD-IM_PROT-012
R61: OAF-IM_14-3
R62: OAD-IM_PROT-005


Appendix J – Comparison of Weighting Algorithms

All three weighting algorithms (binary, term frequency, and logarithmic term frequency) produced an identical key:

R1: OAD-MMS_Conf-0003-2
R4: OAD-IM_PROT-005
R5: OAD-MMS_Tran-010-2, OAD-MMS_Tran-024-2, OAD-MMS_Tran-021-2, OAD-MMS_Tran-022-2, OAF-MMS_40-4, OAF-MMS-88-1, OAF-MMS-89-1, OAF-MMS-90-1
R6: OAD-MMS_Cont-027, OAD-MMS_Cont-028
R7: OAD-MMS_Usa-043B-2
R8: OAD-MMS_PLG-020-2, OAD-MMS_PLG-021-2
R14: OAF-MMS_19-3
R15: OAF-MMS_19-3
R16: OAD-MMS_Tran-011
R19: OAD-MMS_Usa-031, OAD-MMS_Usa-032-2
R25: OAF-IT_2-1, OAD-EF-0009, OAD-EF-0010
R38: OAF-UM_5-1
R39: OAF-UM_13-1
R46: OAF-UM_13-1
R49: OAD-IM_CONF-001
R53: OAF-IM_5-2
R55: OAF-IM_4-2
R57: OAD-IM_PROT-001
R59: OAF-IM_10-2, OAD-IM_PROT-003
R60: OAD-IM_PROT-012
R61: OAF-IM_14-3
R62: OAD-IM_PROT-005
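For reference, the three term-weighting schemes compared in this appendix can be sketched as below. This is a minimal illustration, not the tool used in the thesis: the tokenisation (whitespace splitting), the example sentences, and the use of the natural logarithm in the logarithmic variant are assumptions.

```python
import math
from collections import Counter

def weight(tf, scheme):
    """Weight of a term with raw frequency tf (tf >= 1) under one scheme."""
    if scheme == "binary":
        return 1.0                 # term is present or not
    if scheme == "tf":
        return float(tf)           # raw term frequency
    if scheme == "log_tf":
        return 1.0 + math.log(tf)  # dampened frequency; log base is an assumption
    raise ValueError(f"unknown scheme: {scheme}")

def cosine(doc_a, doc_b, scheme):
    """Cosine similarity between two token lists under a weighting scheme."""
    va = {t: weight(f, scheme) for t, f in Counter(doc_a).items()}
    vb = {t: weight(f, scheme) for t, f in Counter(doc_b).items()}
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical requirement texts, only to show the schemes diverging
# once a term repeats within a requirement.
a = "send mms message and store mms message".split()
b = "send and receive mms message".split()
for scheme in ("binary", "tf", "log_tf"):
    print(scheme, round(cosine(a, b, scheme), 3))
```

With no repeated terms the three schemes coincide; they only rank pairs differently when term frequencies within a requirement exceed one, which may explain why the keys in this appendix are identical across all three weightings.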