Enriching Literature Reviews with Text Mining Tools
Case: Group Support Systems
Ph.D. Johanna Bragge / HSE / Business Technology / ISS
http://www.hse.fi/EN/HKI/B/Johanna_Bragge
The presentation is based on Bragge J., Relander S., Sunikka A. and Mannonen P. (2007) “Enriching Literature Reviews with Computer-Assisted Research Mining. Case: Profiling Group Support Systems Research”, Proceedings of 40th HICSS Conference, Hawaii, USA.
Structure of presentation
• Objectives of our HICSS´07 paper
• What is ”research profiling”?
• Profiling research on Group Support Systems
• Dimensions of research profiling
– From undergrad class papers and ”one-nighters” to dissertation or other major research projects
– Possibilities of ISI WoS Analysis tool
• Conclusions
Objectives of our HICSS´07 paper
• Extend the notion of a traditional literature review into the domain of research profiling
• Present a practical case study
– Group (Decision) Support Systems research
• Briefly overview the capabilities of VantagePoint
– Text-mining software originally developed at the Georgia Institute of Technology, USA (currently SearchTech Inc.)
”Can we improve on the traditional ’literature review’?”
• Science and technology abstracts are literally at our fingertips in R&D databases
• Search engines enable rapid and effective collection of records relating to one’s research interests
• Analytical software helps elicit useful information from the searches (even from thousands of abstracts) to gain perspective on one’s research context.
• ”This enhanced literature review – ’research profiling’ – should become standard research practice.”
Source: Porter, Kongthon and Lu (2002), ”From Traditional Literature Reviews to ’Research Profiling’”, Scientometrics, Vol. 53, No. 3, 351-370. Available in the SpringerLink database.
Augmenting, not replacing!
• Research profiling aims to augment, not replace, the traditional literature review
– helping to fulfill purposes of understanding the structure of the subject, important variables, pertinent methods, and key needs
• These aims can be better served by analyzing the whole, rather than just a few parts of the research milieu.
– students often limit their perspective to specialized slices of the literature
– research streams often lose connection to other research activities
Porter et al. (2002, p. 351).
Comparison of traditional literature reviews and research profiling
Old (Traditional Literature Review) | New (Research Profiling)
Micro focus (paper-by-paper) | Macro focus (patterns in the literature as a body)
Narrow range (~20 references) | Wide range (~20 – 20,000 references)
Tightly restricted to the topic | Encompassing the topic + related areas
Text discussion | Text, numerical, and graphical depiction
Porter et al. (2002, p. 353).
The research profiling process based on Herbert Simon’s Decision Phases
Phase A: Intelligence
(1) Issue Identification
(2) Selection of Information Sources
(3) Search Refinement and Data Retrieval
(4) Data Cleaning
Phase B: Analysis & Design
(5) Basic Analyses
(6) Advanced Analyses
Phase C: Choice
(7) Representation
(8) Interpretation
(9) Utilization
Source: Adapted from Porter and Cunningham (2005).
Fundamental research
Commercial application
- Science Citation Index, ISI
- MEDLINE
- Chem abstracts
- INSPEC (by IEE)
- EI Compendex
- Derwent World Patent Index
- ABI Inform (ProQuest)
- Lexis Nexis
Source: Porter & Cunningham (2005, p. 83)
Selection of databases
Case: Profiling Group Support Systems Research
• Research question: Past, present and future of GSS
– Level of maturity / saturation?
– Who? What? When?
– What’s hot? Are there any emerging themes?
– Etc.
• Search words used:
– group support system(s), group decision support system(s), electronic meeting system(s)
• Database used: INSPEC by IEE
– Covers more than 3,850 journals and 2,200 conference proceedings
• Final sample: 2,000 publications from 1982-2005
– The sample was collected in April 2006
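As a rough illustration of the search step, the three search phrases can be matched against downloaded titles or abstracts. The sketch below is only an assumption about how such a filter might look in Python: the record fields (`title`, `abstract`, `year`) and the function names are hypothetical, the example records are invented, and INSPEC's own query syntax is not shown.

```python
import re

# Search phrases from the slide; "(s)" is modelled as an optional plural "s".
SEARCH_PHRASES = [
    "group support system",
    "group decision support system",
    "electronic meeting system",
]

PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in SEARCH_PHRASES) + r")s?\b",
    re.IGNORECASE,
)


def matches_search(text):
    """Return True if any search phrase (singular or plural) occurs in text."""
    return PATTERN.search(text) is not None


def select_sample(records, first_year=1982, last_year=2005):
    """Keep records inside the year range whose title or abstract matches."""
    selected = []
    for record in records:
        text = record.get("title", "") + " " + record.get("abstract", "")
        if first_year <= record["year"] <= last_year and matches_search(text):
            selected.append(record)
    return selected


# Hypothetical records, for illustration only.
records = [
    {"title": "Electronic Meeting Systems to support group work", "year": 1991},
    {"title": "A survey of decision support systems", "year": 1995},
    {"title": "Group support systems in the classroom", "year": 2006},
]
sample = select_sample(records)
```

Only the first record is kept here: the second does not contain any of the three phrases, and the third falls outside the 1982-2005 range.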
G(D)SS Publications yearly in INSPEC
[Chart: number of publications per year; year axis labelled 1982 to 2004, count axis 0 to 180]
Top-12 Institutions of GSS research
Key affiliations # %
University of Arizona, AZ, USA 93 27 %
National Univ. of Singapore, Singapore 37 11 %
City University of Hong Kong, Hong Kong 32 9 %
New Jersey Inst. of Technology, NJ, USA 32 9 %
University of Mississippi, MS, USA 26 7 %
University of Georgia, GA, USA 25 7 %
Indiana University, IN, USA 19 5 %
University of Minnesota, MN, USA 18 5 %
University of Baltimore, MD, USA 17 5 %
Delft Univ. of Technology, Netherlands 16 5 %
University of Calgary, Canada 16 5 %
Naval Postgraduate School, CA, USA 16 5 %
Total 347 100%
Top-12 Outlets of GSS research
Outlet # %
Proceedings of Hawaii International Conf. on System Sciences, IEEE (HICSS) 317 41.8
Decision Support Systems (DSS) 84 11.1
Journal of Management Information Systems (JMIS) 69 9.1
Information and Management (I&M) 57 7.5
European J. of Operational Res. (EJOR) 43 5.7
The International Conference on Systems, Man and Cybernetics, IEEE (SMC) 34 4.5
Proceedings of the Information Resources Management Association International Conference (IRMA) 31 4.1
Journal of Organizational Computing and Electronic Commerce (JOC&EC) 29 3.8
IFIP Working Groups Conference Proc’s 28 3.7
MIS Quarterly (MISQ) 25 3.3
Proceedings of Decision Sciences Institute Annual Meeting (DSI) 22 2.9
IFIP Transactions A: Computer Science and Technology 20 2.6
Total 759 100%
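The percentage column of the outlet table can be recomputed from the counts. The following is a minimal sketch, assuming each share is the outlet's count divided by the total of the twelve listed outlets and rounded to one decimal; the function name and the shortened outlet labels are hypothetical.

```python
def percentage_shares(counts, decimals=1):
    """Return each item's share of the total as a rounded percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total, decimals) for name, n in counts.items()}


# Publication counts taken from the outlet table (labels shortened).
outlet_counts = {
    "HICSS": 317,
    "DSS": 84,
    "JMIS": 69,
    "I&M": 57,
    "EJOR": 43,
    "SMC": 34,
    "IRMA": 31,
    "JOC&EC": 29,
    "IFIP WG Conf. Proc.": 28,
    "MISQ": 25,
    "DSI": 22,
    "IFIP Transactions A": 20,
}

shares = percentage_shares(outlet_counts)
for name, share in shares.items():
    print(f"{name:22s} {outlet_counts[name]:4d} {share:5.1f}")
```

Because each share is rounded separately, the rounded column does not have to add up to exactly 100.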
Top-8 authors and selected details
Columns: Author [# of publ.] and affiliation(s); 5 main outlets; 5 main descriptors (other than GDSS, DSS or groupware); temporal publication activity (chart, not shown)
Nunamaker, J.F., Jr [85], University of Arizona
– Outlets: Proc. of HICSS [36], JMIS [16], DSS [8], I & M [3], MISQ [3]
– Descriptors: Teleconferencing [15], Human factors [11], Systems analysis [9], Military computing [6], Social aspects of automation [6]
Vogel, D. R. [51], University of Arizona (current affiliation City University of Hong Kong)
– Outlets: Proc. of HICSS [18], JMIS [7], DSS [4], MISQ [3], I & M [2]
– Descriptors: Teleconferencing [12], Human factors [7], Social aspects of automation [7], Computer-aided instruction [3], DP-management [3]
Aiken, M. [42], University of Mississippi
– Outlets: I & M [10], DSS [4], Proc. of Dec. Sci. Inst. [3], J. of Int. Inf. Mgmt [2], SIGOIS Bulletin [2]
– Descriptors: Human factors [11], Teleconferencing [7], User interfaces [7], Expert systems [6], Language-translation [5]
Briggs, R. O. [36], University of Arizona (current affiliation University of Alaska)
– Outlets: Proc. of HICSS [22], JMIS [8], CACM [1], J. of Educ. Tech. Syst. [1], J. of End-User Comp. [1]
– Descriptors: Human factors [9], Social aspects of automation [8], Business data processing [5], Computer-aided instruction [4], Military computing [4]
Dennis, A. R. [36], University of Georgia (current affiliation Indiana University)
– Outlets: Proc. of HICSS [9], JMIS [7], MISQ [7], DSS [2], ISR [2]
– Descriptors: Human factors [10], Teleconferencing [8], Systems analysis [5], Social aspects of automation [4], Idea processors [3]
Wei, K. K. [29], National University of Singapore (current affil. City U. of Hong Kong)
– Outlets: Proc. of HICSS [5], DSS [3], EJOR [3], IEEE T. on SCM [3], I & M [3]
– Descriptors: Human factors [12], User interfaces [6], Social aspects of automation [5], Task analysis [3], Business data processing [2]
Vreede, G. J. [28], Delft University of Technology (current affil. University of Omaha)
– Outlets: Proc. of HICSS [12], JMIS [5], Proc. of ACM SIGCPR [2], Data Base for Adv. in IS [1], IT for Development [1]
– Descriptors: Police data processing [6], Business data processing [5], Teleconferencing [4], Human factors [3], Information Technology [3]
Hiltz, S. R. [26], New Jersey Institute of Technology
– Outlets: Proc. of HICSS [15], JMIS [5], DSS [2], J. of Org. Comp.& EC [2], MISQ [1]
– Descriptors: Teleconferencing [7], Human factors [6], Social aspects of automation [5], Office automation [2], Systems analysis [2]
Key figures of GSS research in 2000-2005 (from 688 out of 2000 publications)
2000-2005 Top-13 Authors (position in the whole sample presented in parentheses)
de Vreede, G. J. [15] (7.); Briggs, R. O. [14] (4.); Nunamaker, J.F. [11] (1.); Dennis, A. R. [8] (4.); Ma, J. [8] (18.); Vogel, D. R. [8] (2.); Huang, W. W. [7] (18.); Kwok, R. C. W. [7] (18.); Reinig, B. A. [7] (18.); Tuominen, M. [7] (27.); Aiken, M. [6] (3.); Hiltz, S. R. [6] (8.); Ito, T. [6] (29.)
2000-2005 Top-13 Countries *)
USA [234]; China [102]; UK [38]; Japan [34]; Australia [24]; Taiwan [24]; Netherlands [21]; Canada [14]; Finland [14]; Brazil [12]; France [12]; Germany [11]; Portugal [10]
*) Country of publication is determined based on the first author
2000-2005 Top-10 Outlets
Proc. of HICSS [82]; DSS [26]; EJOR [22]; Int. Conf. on CSCW in Design [18]; JMIS [17]; Proc. of IRMA [15]; I & M [14]; Int. Conf. on Systems, Man and Cybernetics [12]; Journal of the OR Society [11]; Proc. of World Multiconference on Systemics, Cybernetics and Informatics [10]
2000-2005 Top-14 Descriptors
Group Decision Support Systems [544]; Groupware [126]; Decision making [92]; Internet [79]; Human factors [50]; Business data processing [42]; Fuzzy set theory [39]; Negotiation Support Systems [38]; Multi-agent syst. [35]; Teleconferencing [33]; Social aspects of automation [31]; User-interfaces [31]; Military comp. [26]; Knowledge Management [25]
2000-2005 Top-10 Classifications
C7102-Decision-Support-systems [506]; C6130G-Groupware [499]; C6150N-Distributed-systems-software [100]; C7210N-Information-networks [79]; C7100-Business-and-administration [63]; C6170-Expert-systems-and-other-AI-software-and-techniques [62]; C6170K-Knowledge-engineering-techniques [60]; C7810C-Computer-aided-instruction [42]; C6180-User-interfaces [39]; C0230-Economic,-social-and-political-aspects-of-computing [38]
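The whole-sample positions given in parentheses for the authors contain ties (for example, two authors at 4. and several at 18.). One way such positions can be derived from publication counts is standard competition ranking, where tied authors share a position and the following position is skipped. The sketch below assumes that convention; it is not taken from the paper, and the function name is hypothetical. The counts are the whole-sample publication counts of the eight most productive authors.

```python
def competition_ranks(counts):
    """Rank items by descending count; ties share the best position (1, 2, 2, 4, ...)."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    previous_count = None
    current_rank = 0
    for position, (name, count) in enumerate(ordered, start=1):
        if count != previous_count:
            # A new (lower) count starts a new rank equal to the list position.
            current_rank = position
            previous_count = count
        ranks[name] = current_rank
    return ranks


# Whole-sample publication counts of the eight most productive authors.
whole_sample = {
    "Nunamaker": 85,
    "Vogel": 51,
    "Aiken": 42,
    "Briggs": 36,
    "Dennis": 36,
    "Wei": 29,
    "Vreede": 28,
    "Hiltz": 26,
}

ranks = competition_ranks(whole_sample)
```

Under this assumption Briggs and Dennis share position 4 and Wei follows at position 6, which agrees with the positions quoted for these authors.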
Trends in authors’ keywords (non-technology)
Keyword 1986-1990 1991-1995 1996-2000 2001-2005
distributed-* 15 65 62 63
group-decision-making 10 30 57 58
decision-making 9 28 35 38
face-to-face* 9 23 28 27
consensus-* 6 31 31 16
virtual-* 0 7 32 35
information-technology 5 25 18 15
idea-generation* 7 10 25 13
web-based-* 0 0 14 33
Internet 0 4 25 18
anonymity 5 10 21 8
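A keyword trend table of this kind can be approximated by counting, for each 5-year period, the records that carry a given keyword. The sketch below makes several assumptions that are not stated on the slide: a trailing `*` is treated as a prefix wildcard, each record is counted at most once per term, and the record fields (`year`, `keywords`) and function names are hypothetical. The example records are invented.

```python
PERIODS = [(1986, 1990), (1991, 1995), (1996, 2000), (2001, 2005)]


def period_of(year):
    """Return the label of the 5-year period containing year, or None."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None


def term_matches(pattern, keyword):
    """'virtual-*' matches keywords starting with 'virtual-'; otherwise exact match."""
    if pattern.endswith("*"):
        return keyword.startswith(pattern[:-1])
    return keyword == pattern


def keyword_trend(records, pattern):
    """Count records per period that have at least one keyword matching pattern."""
    pattern = pattern.lower()
    counts = {f"{start}-{end}": 0 for start, end in PERIODS}
    for record in records:
        period = period_of(record["year"])
        if period is None:
            continue  # outside the periods shown in the table
        if any(term_matches(pattern, kw.lower()) for kw in record["keywords"]):
            counts[period] += 1
    return counts


# Invented records, for illustration only.
records = [
    {"year": 1989, "keywords": ["anonymity", "idea-generation"]},
    {"year": 1998, "keywords": ["virtual-teams", "Internet"]},
    {"year": 2003, "keywords": ["virtual-meetings", "web-based-GSS"]},
    {"year": 2003, "keywords": ["virtual-teams", "virtual-organizations"]},
]
virtual_trend = keyword_trend(records, "virtual-*")
```

Counting records rather than keyword occurrences means that the last example record contributes one to the 2001-2005 count even though two of its keywords match.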
5-year trends of various terms
[Chart: values per 5-year period (1982-1985, 1986-1990, 1991-1995, 1996-2000, 2001-2005) for Countries, Cited references (avg.), Journal papers, Conference papers, Outlets, Classifications, Publications, Descriptors and Authors; value axis 0 to 300]
Dimensions of research profiling
Dimension: Less >>> More
Data availability: Counts only / Restricted download / Single rich dataset / Multiple datasets
Time & resources: ”One-nighter” / Limited / Rich
Tool availability: Search engine / Text-mining software
Text-mining expertise: None / Limited / Extensive
Subject expertise: Novice / Knowledgeable / Multiple experts
Source: Porter et al. (2002, p. 366)
Problems occur because the same author's name may be entered in different forms in the database!
With VantagePoint these variants can be cleaned for the purposes of advanced analyses; for simple frequency counts the ISI tool works well.
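To illustrate the name-variant problem, the sketch below merges author strings by reducing each to a lowercased surname plus initials. This is only an assumed, simplified cleaning rule for illustration; it is not how VantagePoint or the ISI tool works, the function names are hypothetical, and the rule can wrongly merge different people who share a surname and initials. It assumes names are non-empty and written as "Surname, Given names".

```python
import re
from collections import Counter

# Generational suffixes that should not be treated as given names.
SUFFIXES = {"jr", "sr", "ii", "iii"}


def author_key(name):
    """Reduce 'Surname, Given names[, Suffix]' to 'surname initials' in lower case."""
    parts = [p.strip() for p in name.split(",") if p.strip()]
    surname = parts[0].lower()
    given = " ".join(
        p for p in parts[1:] if p.lower().rstrip(".") not in SUFFIXES
    )
    # Take the first letter of every run of letters in the given names.
    initials = "".join(tok[0].lower() for tok in re.findall(r"[^\W\d_]+", given))
    return (surname + " " + initials).strip()


def merged_counts(author_fields):
    """Count publications per cleaned author key."""
    return Counter(author_key(name) for name in author_fields)


# Variant spellings of the kind that can appear in a database export.
variants = [
    "Nunamaker, J.F., Jr",
    "Nunamaker, J. F.",
    "Nunamaker, Jay F.",
    "Vogel, D. R.",
    "Vogel, D.R.",
]
counts = merged_counts(variants)
```

With this rule the three Nunamaker variants collapse into one key and the two Vogel variants into another. Surname prefixes are not handled, so "de Vreede, G. J." and "Vreede, G. J." would still be counted separately.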
More options can be found via the ”Analyze Results: Analyze” button (in the right-hand column of the main search results page). It is also possible to save the records to a file for producing graphs.
Conclusions
• Increasing availability and amount of information
– Modern search engines developed
• Research profiling uses sophisticated text mining tools for structured science information resources
– i.e. abstracts from ISI WoS, Ebsco, ProQuest, INSPEC etc.
• Emphasis on content - uncovering research gaps and new scientific domains
– Emphasis not on co-citation analysis (SNA tools better for that)
• Does not replace traditional literature reviews!
– Database limitations, e.g. publication delays and non-standardized contents of different databases
• Questions or comments?
More references on the foundations and applications of research profiling
• Porter, A. L., Kongthon, A. and Lu, J.-C. (2002) “Research Profiling: Improving the Literature Review”, Scientometrics, vol. 53, no. 3, pp. 351-370.
• Porter, A. L. and Cunningham, S. W. (2005) Tech Mining: Exploiting New Technologies for Competitive Advantage, Wiley Series in System Engineering and Management, New Jersey: John Wiley & Sons, Inc.
• Bragge, J. and Storgårds, J. (2007) “Profiling Academic Research on Digital Games Using Text Mining Tools”, Proceedings of the Digital Games Research Association’s DIGRA Conference, Tokyo, Japan.
• Bragge, J. and Storgårds, J. (2007) “Utilizing Text-Mining Tools to Enrich Traditional Literature Reviews. Case: Digital Games”, Proceedings of the 30th Information Systems Research Seminar in Scandinavia IRIS, Tampere, Finland.
• Sunikka, A. and Bragge, J. (2008) “What, Who and Where: Insight into Personalization”, forthcoming in the Proceedings of HICSS-41, Hawaii, USA, January 2008.