Enterprise & Intranet Search
• How Enterprise is different from Web search
• What to think about when evaluating Enterprise Search
• How Intranet use is different from Web use
  - And what that means for search
Intranet Differences
• Intranet = content inside the organization
• Learning from content, not for commerce
• Smaller content collections
  - Smaller content subjects
  - Smaller number of possible tasks or queries
• More document types than the Web (supports)
• Filtering could be more applicable
• Taxonomies may be present (& understandable)
  - Work groups, locations, departments, projects
• Content managed in some way (culture or policy)
• Is the goal to discover tacit knowledge?
Differences in Intranet use
• Bandwidth
  - Wireless too
• Security
  - Financial work, Access policies
• New technology
  - Mobile, high-resolution displays, …
• Legal
  - Regulation (Sarbanes-Oxley)
  - Privacy
• Cultural
  - Adoption
  - Revolution
One week of corporate search
• What are the patterns of search in a corporation?
  - Big company 70K
  - 740K documents
    • 80% HTML, ~15% PDF
  - Ultraseek engine
• 11-15 minute search sessions
• Small drop in Friday searching
• 71% of the 5644 users only active on one work-week day
• User sessions
  - 1.2 with 2.47 activities = infrequent
  - 3.03 with 9.7 activities/day = frequent
• What interfaces & tools could increase use?
  - Is increased searching a net good for knowledge workers?
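The log statistics on this slide (users active on only one day, sessions and activities per user) can be sketched in a few lines. This is a minimal illustration, not the method used in the study: the log format `(user_id, weekday, session_id)` and the sample records are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical log format: one (user_id, weekday, session_id) tuple per search activity.
log = [
    ("u1", "Mon", "s1"), ("u1", "Mon", "s1"),
    ("u2", "Mon", "s2"), ("u2", "Tue", "s3"), ("u2", "Tue", "s3"), ("u2", "Tue", "s4"),
    ("u3", "Fri", "s5"),
]

def summarize(log):
    """Return per-user counts of active days, sessions, and activities."""
    days = defaultdict(set)
    sessions = defaultdict(set)
    activities = defaultdict(int)
    for user, weekday, session in log:
        days[user].add(weekday)
        sessions[user].add(session)
        activities[user] += 1
    return {
        user: {
            "active_days": len(days[user]),
            "sessions": len(sessions[user]),
            "activities": activities[user],
        }
        for user in days
    }

def single_day_share(summary):
    """Fraction of users who searched on exactly one weekday."""
    one_day = sum(1 for s in summary.values() if s["active_days"] == 1)
    return one_day / len(summary)

summary = summarize(log)
share = single_day_share(summary)
```

Splitting users into "frequent" and "infrequent" would then be a threshold on these counts; where to put that threshold is a design choice the slide does not specify.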
Is Enterprise IR different?
• Application Design
  - Webify – Front Ends
  - Web Services
  - Application Service Providing
  - (More) Database Integration
• (Even More) Integration Issues
  - Content (CMS and Politics)
  - Quality & Quantity
• Existing Design Guidelines
• More Specific Users
  - One corporation
  - The accounting department
• More Definable Goals
  - Dictated by management
  - Interaction with (all?) potential users
• Must Use
  - Use Data
  - Feedback for Verification
Enterprise Search
• Centralized & Measurable
  - More Return on Investment
• Work tasks
• Easier to develop than Web-wide search
• Clarified
  - Consistent
  - Accurate
• Simplified Technology Platform
  - More Open to Information Sharing
  - IA Structures Help Define Organization (Goals)
• Extendable IA System
Intranet Search & Info Extraction
• Building a system specifically for knowledge workers, vertical markets & types of users
• Do you think intranet search is different?
• IT workers spend 15-35% of work time searching for information
• We need more than relevance as a measure of specific tasks
  - Question Answering: specific answers, not keyword matching
  - Categorizing user needs
    • By user? Department? Job? Task?
• Satisficing results vs. the right answers
Information Desk
• Tasks
  - Term definitions
  - Homepages for (internal) groups or topics
  - Experts
  - Employee contact (personal) info
• Categorization of need
  - Query text itself
  - Resulting documents
  - Selected documents
• Developed a hierarchy
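Categorizing a need from the query text itself can be sketched with a few rules, one per task type listed above. This is only an illustration: the patterns, labels, and the "first match wins" ordering are assumptions, and it ignores the other two signals on the slide (resulting and selected documents).

```python
import re

# Hypothetical patterns, one per task type named on the slide.
RULES = [
    ("definition", re.compile(r"^(what is|define|meaning of)\b", re.I)),
    ("expert", re.compile(r"\b(expert|who knows|specialist)\b", re.I)),
    ("contact", re.compile(r"\b(phone|email|contact|office)\b", re.I)),
    ("homepage", re.compile(r"\b(home ?page|team site|portal)\b", re.I)),
]

def categorize(query):
    """Return the label of the first rule that matches, else 'other'."""
    for label, pattern in RULES:
        if pattern.search(query):
            return label
    return "other"

examples = ["What is SOX", "phone number for Jane Doe", "accounting homepage"]
labels = [categorize(q) for q in examples]
```

A real hierarchy would likely need more categories and evidence from clicks, since many short queries match no rule at all.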
Categories of Search Needs
Analysis of Search Needs
• Query logs
  - Information & navigational needs
  - Home pages & Relevance (content)
• Survey
  - How to’s & Downloads
  - Technologies, products, services, groups, projects, people
• More in-depth analysis possible (logs, more questionnaire surveys)
• How different are these needs from Web search?
Challenges in Enterprise Search
• Google (Web) is the worst enemy of Enterprise search
• Content complexity: dbms, non-linked docs, email, CMS content, access levels, servers/locations
• Ranking becomes more difficult with different document types, metadata, systems
• Do we need Enterprise Metasearch?
  - Enterprise, Federated, Web content ++ ?
  - Corporate Web site, intranet, email, company directory, forms, templates, reports…
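One way to picture enterprise metasearch is merging ranked lists from separate sources whose scores are not comparable. The sketch below uses reciprocal rank fusion, a general rank-merging technique chosen here for illustration (the slides do not name a method); the source names, document ids, and the constant `k=60` are assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists: each document scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical per-source rankings for one query.
intranet = ["hr-policy", "travel-form", "org-chart"]
email = ["travel-form", "trip-approval"]
directory = ["org-chart", "travel-form"]

merged = reciprocal_rank_fusion([intranet, email, directory])
```

Because only ranks are used, the sources do not need to share a scoring scale, which is the difficulty the ranking bullet above points at. Access levels would still have to be enforced per source before merging.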
Key IR research for Enterprises
• Defining an appropriate enterprise search test collection
• Effective ranking over heterogeneous collections that are characteristic of enterprise environments
• Portals for knowledge workers (intranet & internet?)
• Email search
• PageRank, relevance measures for internal documents
• Understanding search context
• Future considerations for linked, internal media
• Multimedia*
• Web 2.0 features & document types*
• Crawling & updating strategies*
Solutions to Enterprise IR
• Designing linking mechanisms
  - Based on use or (user generated) metadata
  - Derive metadata & evaluate automation (e.g. email)
• Navigation in intranets (saves searching)
• APIs & open access
• Part of records management activities
• Intense focus on user evaluation & development cycles
Final Projects & Papers
• Use class readings to prove your points
• Be daring with your ideas & state why you think they’re right or interesting
• Cite any non-obvious facts
• Proofread your writing
• Be conscious of writing style & grammar
• Use APA or ACM style guidelines
• This should be a good contribution to your portfolio of graduate work.