A Recommendation System for Software Function Discovery
Naoki Ohsugi
Software Engineering Laboratory,
Graduate School of Information Science, Nara Institute of Science and Technology
Tuesday, 16 December 2003. International Workshop on Community-Driven Evolution of Knowledge Artifacts
2 of 14
Growth of Software Functions
• Application software is becoming more complicated and providing more functions.
• Total number of menu items (Microsoft Office):
  • Word 2000: 660
  • Word 2002: 772
  • Excel 2000: 705
  • Excel 2002: 792
  • PowerPoint 2000: 565
  • PowerPoint 2002: 646
Screenshot of MS-Word 2002
Users can’t find useful functions among too many functions.
3 of 14
Users Could Not Find Some Useful Functions!
Subjects: 32 users in our lab.
Period: 22 months
[Chart: Number of Functions (y-axis, 0 to 900) per application; percentages are relative to the total number of different functions]

Application   Total different functions   Maximum used   Minimum used   Average used
Excel 2000    705                         75 (10.6%)     10 (1.4%)      38 (5.4%)
Excel 2002    792                         83 (10.5%)     12 (1.5%)      26 (3.3%)
PPT 2000      565                         189 (33.5%)    18 (3.2%)      80 (14.2%)
PPT 2002      646                         147 (22.8%)    31 (4.8%)      67 (10.4%)
Word 2000     660                         143 (21.7%)    22 (3.3%)      66 (10.0%)
Word 2002     772                         120 (15.5%)    11 (1.4%)      32 (4.1%)
4 of 14
A Recommendation System for Software Function Discovery
• The system recommends to each user a set of candidate functions that may be useful.
• Our solution is a Collaborative Filtering approach.
Here’s my recommendation:
Tools -> Word Count… 21 pts
Insert -> Date Time… 20 pts
Tools -> Thesaurus… 18 pts
Insert -> Footnote… 18 pts
Tools -> Spelling… 17 pts
5 of 14
What is Collaborative Filtering (CF)?
• “Collaborative” means using some users’ knowledge for filtering.
• “Filtering” means selecting useful items from a large number of items.
[Figure: a large amount of items (A to T); using some users’ knowledge (“F is good!”, “K is cool!”), the useful items F and K are selected for other users.]
6 of 14
Voting-based Recommendation Systems with CF
• The systems collect explicit votes as users’ knowledge.
Amazon.com (Book recommendation system)
http://www.amazon.com
MovieLens (Movie recommendation system)
http://www.movielens.umn.edu
7 of 14
Logging Usage as Users’ Knowledge
• The proposed system automatically collects the records of executed functions (usage logs) as users’ knowledge.
• Usage logs are collected from some users via the Internet.
[Figure: User’s application software (e.g. MS-Word, Excel) with the Log Collector VBA Plug-In -> The Internet -> Server of the System]
Usage log as shown below:
2002/02/03 18:50:41 Formatting->Font…
2002/02/03 18:50:45 File->Save As…
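The usage log format above can be turned into per-function counts with a few lines of code. The following is a minimal sketch, assuming each log line is "date time function" separated by single spaces, as in the two sample lines; the names parse_log_line and count_functions are hypothetical and are not part of the Log Collector plug-in.

```python
from collections import Counter
from datetime import datetime


def parse_log_line(line):
    """Split one usage-log line into (timestamp, function name).

    Assumed layout: 'YYYY/MM/DD HH:MM:SS Menu->Item'.
    The function name may itself contain spaces (e.g. 'File->Save As…').
    """
    date_part, time_part, function = line.strip().split(" ", 2)
    timestamp = datetime.strptime(f"{date_part} {time_part}", "%Y/%m/%d %H:%M:%S")
    return timestamp, function


def count_functions(lines):
    """Count how often each function appears in an iterable of log lines."""
    return Counter(parse_log_line(line)[1] for line in lines if line.strip())


# The two sample lines shown on the slide.
sample_log = [
    "2002/02/03 18:50:41 Formatting->Font…",
    "2002/02/03 18:50:45 File->Save As…",
]
usage = count_functions(sample_log)
for function, count in usage.items():
    print(function, count)
```

Counts like these, collected per user, are the kind of input the similarity computations on the following slides operate on.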
8 of 14
Step 1: Computing Similarities
• Computing similarities between the target user and the other users.
[Figure: the target user and Users 1 to 4, each shown with the list of functions they use (Functions A to K); the other users are grouped into similar users and dissimilar users.]
9 of 14
Step 2: Delivering Knowledge
• Delivering candidate useful functions that were frequently used by the similar users.
[Figure: the target user and Users 1 to 4 with their function lists, grouped into similar users and dissimilar users; functions used by the similar users (Function B and Function D in the example) are delivered to the target user.]
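Steps 1 and 2 can be sketched together as a small user-based collaborative filtering routine. This is an illustration under stated assumptions, not the system's actual implementation: the usage counts are hypothetical, similarity is a plain Pearson correlation, "similar users" are taken to be the top_k users with positive similarity, and candidates are scored by similarity-weighted frequency.

```python
from math import sqrt


def pearson(a, b, keys):
    """Pearson correlation of two usage dicts over a shared list of functions."""
    xs = [a.get(k, 0) for k in keys]
    ys = [b.get(k, 0) for k in keys]
    n = len(keys)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(var_x * var_y)


def recommend(target, others, top_k=2, n=5):
    """Step 1: find similar users. Step 2: deliver functions they use."""
    keys = sorted(set(target).union(*others.values()))
    sims = {name: pearson(target, usage, keys) for name, usage in others.items()}
    # Keep only positively correlated users, most similar first (assumption).
    similar = sorted((u for u in sims if sims[u] > 0), key=lambda u: -sims[u])[:top_k]
    scores = {}
    for user in similar:
        for function, freq in others[user].items():
            if function not in target:  # only functions the target has not used
                scores[function] = scores.get(function, 0.0) + sims[user] * freq
    return sorted(scores, key=lambda f: (-scores[f], f))[:n]


# Hypothetical usage counts, loosely following the slide's figure.
target = {"Function A": 10, "Function C": 5}
others = {
    "User 1": {"Function A": 8, "Function B": 4, "Function C": 6, "Function D": 2},
    "User 2": {"Function A": 9, "Function B": 3, "Function C": 4, "Function D": 1},
    "User 3": {"Function E": 5, "Function F": 3, "Function G": 2},
    "User 4": {"Function H": 6, "Function I": 4, "Function J": 2, "Function K": 1},
}
print(recommend(target, others))
```

With these counts, Users 1 and 2 come out as the similar users, so the functions they use that the target user has not used (Function B and Function D) are the candidates.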
10 of 14
Conventional Similarity Calculation
• Calculating similarities by correlation coefficient
• The dominant frequencies (e.g., “Undo” or “Save”) over-affect similarity computations.

Rank   Target user    User 2        User 3
1      Undo   60%     Save   55%    Undo   60%
2      Save   20%     Undo   25%    Save   20%
3      Redo   10%     Redo   10%    Clear   6%
4      Copy    4%     Copy    4%    Cut     5%
5      Paste   3%     Paste   3%    Copy    4%
6      Cut     2%     Cut     2%    Paste   3%
7      Clear   1%     Clear   1%    Redo    2%

Correlation based similarity to the target user: User 2 +0.41, User 3 +0.97
(Range of value [-1.00, +1.00])
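The effect of dominant frequencies can be checked with a short Pearson correlation sketch over the percentages shown on this slide. It is an illustration only: the exact coefficients depend on which functions enter the computation, so the printed values are not expected to reproduce the slide's +0.41 and +0.97. The function name pearson_similarity is hypothetical.

```python
from math import sqrt

FUNCTIONS = ["Undo", "Save", "Redo", "Copy", "Paste", "Cut", "Clear"]

# Usage frequencies (%) as listed on the slide.
target = {"Undo": 60, "Save": 20, "Redo": 10, "Copy": 4, "Paste": 3, "Cut": 2, "Clear": 1}
user2 = {"Save": 55, "Undo": 25, "Redo": 10, "Copy": 4, "Paste": 3, "Cut": 2, "Clear": 1}
user3 = {"Undo": 60, "Save": 20, "Clear": 6, "Cut": 5, "Copy": 4, "Paste": 3, "Redo": 2}


def pearson_similarity(a, b, keys=FUNCTIONS):
    """Pearson correlation coefficient of two frequency profiles."""
    xs = [a.get(k, 0) for k in keys]
    ys = [b.get(k, 0) for k in keys]
    n = len(keys)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant profile; treated as no similarity
    return cov / sqrt(var_x * var_y)


print("User 2:", round(pearson_similarity(target, user2), 2))
print("User 3:", round(pearson_similarity(target, user3), 2))
```

User 3 scores higher than User 2 because “Undo” and “Save” carry almost all of the variance, even though User 3's ordering of the remaining functions differs from the target user's.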
11 of 14
Better Similarity Calculation
• Calculating similarities by rank correlation
• The dominant frequencies (“Undo” and “Save”) do not affect similarity computations.
(Usage profiles of the target user, User 2, and User 3 are the same as on slide 10.)

Similarity to the target user         User 2    User 3
Correlation based similarity          +0.41     +0.97
Rank correlation based similarity     +0.90     +0.05
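Rank correlation can be sketched as a Pearson correlation computed on ranks instead of raw frequencies (Spearman's coefficient). This is a minimal sketch using the percentages from the slide; as with the correlation-based values, the output is not expected to reproduce the slide's +0.90 and +0.05 exactly. The helper names average_ranks and rank_correlation are hypothetical.

```python
from math import sqrt

FUNCTIONS = ["Undo", "Save", "Redo", "Copy", "Paste", "Cut", "Clear"]

# Usage frequencies (%) as listed on the slide.
target = {"Undo": 60, "Save": 20, "Redo": 10, "Copy": 4, "Paste": 3, "Cut": 2, "Clear": 1}
user2 = {"Save": 55, "Undo": 25, "Redo": 10, "Copy": 4, "Paste": 3, "Cut": 2, "Clear": 1}
user3 = {"Undo": 60, "Save": 20, "Clear": 6, "Cut": 5, "Copy": 4, "Paste": 3, "Redo": 2}


def average_ranks(values):
    """Rank 1 is the largest value; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_correlation(a, b, keys=FUNCTIONS):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    xs = average_ranks([a.get(k, 0) for k in keys])
    ys = average_ranks([b.get(k, 0) for k in keys])
    return pearson(xs, ys)


print("User 2:", round(rank_correlation(target, user2), 2))
print("User 3:", round(rank_correlation(target, user3), 2))
```

Because only the ordering matters, swapping the top two functions (User 2) costs little, while reordering the five less frequent functions (User 3) lowers the similarity noticeably.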
12 of 14
Evaluating Accuracy of Recommendation
• Yao’s ndpm measure*
  • Range of ndpm: [0.0, 1.0]; 0.0 is the best, 1.0 is the worst.
* Y.Y. Yao, “Measuring Retrieval Effectiveness Based on User Preference of Documents”, J. of American Society for Information Science, 46, 2, 1995, pp.133-145.
[Figure: the user’s ideal recommendation (a ranked list of functions obtained by interviewing the user) is compared with the system’s recommendation (a ranked list computed by the system from usage logs of 6 users over 22 months); the comparison yields ndpm.]
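The ndpm comparison can be sketched as a pairwise check of the two rankings. The sketch assumes the commonly cited form ndpm = (2 * C_minus + C_u) / (2 * C), where C is the number of function pairs the user ranks differently, C_minus the number of those pairs the system orders the opposite way, and C_u the number of those pairs the system leaves tied. The function name and the example rankings are hypothetical.

```python
from itertools import combinations


def ndpm(user_rank, system_rank):
    """Normalized distance-based performance measure between two rankings.

    Both arguments map item -> rank (1 = most preferred); ties are allowed.
    Returns 0.0 for perfect agreement and 1.0 for a fully reversed ranking.
    """
    contradictory = 0  # pairs the system orders opposite to the user
    compatible = 0     # pairs the user orders but the system ties
    total = 0          # pairs for which the user has a strict preference
    for a, b in combinations(user_rank, 2):
        user_diff = user_rank[a] - user_rank[b]
        if user_diff == 0:
            continue  # the user expresses no preference for this pair
        total += 1
        system_diff = system_rank[a] - system_rank[b]
        if system_diff == 0:
            compatible += 1
        elif (user_diff > 0) != (system_diff > 0):
            contradictory += 1
    if total == 0:
        raise ValueError("the user ranking contains no strict preferences")
    return (2 * contradictory + compatible) / (2 * total)


# Hypothetical rankings of four functions.
ideal = {"Function A": 1, "Function B": 2, "Function C": 3, "Function D": 4}
system = {"Function A": 1, "Function B": 3, "Function C": 2, "Function D": 4}
print(ndpm(ideal, system))
```

A system that ties every pair scores 0.5 under this definition, which gives a natural reference level between the best (0.0) and worst (1.0) values.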
13 of 14
Experimental Result
Collected usage logs of MS-Word 2000
Subjects: 6 users in our lab.
Period: 22 months
[Chart: ndpm (y-axis, 0.2 to 0.6) for each algorithm (x-axis); legend: each user’s ndpm, average of ndpm, 0.5 of ndpm]
Algorithms: User Count, Random, Base Case, Correlation based Similarity, Rank Correlation based Similarity
Values labeled in the chart: 0.514, 0.404, 0.396, 0.383, 0.355
14 of 14
Conclusion
• I proposed a recommendation system to help users discover useful functions.
• I evaluated the accuracy of recommendation.
• The results suggest that the proposed system has the potential to provide useful recommendations for software function discovery.