An Evaluation of the Drupal.org Search System.
Enterprise Search Engine Survey
Isriya Paireepairit
Drupal.org Case
Drupal
• Software
• Content Management System
• Web-based
• PHP
Drupal.org
• Home of Drupal the CMS
• For Drupal users, downloaders, developers
• Runs on Drupal itself as its CMS
• Also uses Drupal's built-in search function
Drupal.org Content Types
• Projects
• Modules
• Themes
• Translations
• Forums (Support, Discussion, Chit-chat)
• Documents (Manual, Howto)
• Issues (Bugs, Feature Requests)
• API Documents (for Developers)
• User page
• News/Announcement
Drupal.org Content Size
• As of mid-April 2008
• Content: 250,000 nodes
• Registered users: 280,000
• Page visits: ~1M/day (Compete.com)
Drupal Search Function
• Indexing
• Minimum word length is configurable
• CJK Handling
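The two indexing options above can be illustrated with a minimal sketch. It is not Drupal's actual indexer: the function names, the character ranges used to detect CJK text, and the choice to split CJK runs into overlapping tokens of the minimum word length are assumptions made for illustration.

```python
import re

# Hypothetical setting standing in for the configurable minimum word length.
MIN_WORD_LENGTH = 3

# Rough CJK ranges (kana, CJK ideographs, Hangul syllables); an assumption,
# not an exhaustive definition of CJK text.
CJK_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]+")


def expand_cjk(run: str, n: int) -> list[str]:
    """Split a CJK run into overlapping n-character tokens.

    CJK text has no spaces between words, so overlapping tokens let a
    substring query still match. Runs no longer than n are kept whole.
    """
    if len(run) <= n:
        return [run]
    return [run[i:i + n] for i in range(len(run) - n + 1)]


def index_terms(text: str, min_len: int = MIN_WORD_LENGTH) -> list[str]:
    """Return the terms that would be indexed for a piece of text."""
    terms: list[str] = []
    for chunk in re.findall(r"\w+", text.lower()):
        pos = 0
        for match in CJK_RUN.finditer(chunk):
            before = chunk[pos:match.start()]
            if len(before) >= min_len:  # drop words below the minimum length
                terms.append(before)
            terms.extend(expand_cjk(match.group(), min_len))
            pos = match.end()
        rest = chunk[pos:]
        if len(rest) >= min_len:
            terms.append(rest)
    return terms
```

With the minimum length at 3, `index_terms("An API for Drupal search")` drops "an" and keeps the remaining four words.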
Drupal Search Function
• Search result ranking
• Weightable
• 3 default factors
• Keyword relevance
• Recency
• Number of comments
Drupal.org Implementation
• Keyword relevance: 10
• Recency: 5
• Number of comments: 1
Source: http://www.civicactions.com/blog/search/part_1
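A minimal sketch of how weighted ranking factors could combine into one score, using the Drupal.org weights listed above (10, 5, 1). The `Hit` class, the assumption that each factor is already normalized to the range 0 to 1, and the division by the weight total are illustrative choices; the slides do not say how Drupal normalizes or combines the factors internally.

```python
from dataclasses import dataclass

# Weights taken from the Drupal.org settings listed above.
WEIGHTS = {"relevance": 10, "recency": 5, "comments": 1}


@dataclass
class Hit:
    """Hypothetical search hit; each factor is assumed normalized to [0, 1]."""
    title: str
    relevance: float
    recency: float
    comments: float


def score(hit: Hit, weights: dict[str, int] = WEIGHTS) -> float:
    """Weighted average of the three factors."""
    total = sum(weights.values())
    weighted = (
        weights["relevance"] * hit.relevance
        + weights["recency"] * hit.recency
        + weights["comments"] * hit.comments
    )
    return weighted / total


def rank(hits: list[Hit]) -> list[Hit]:
    """Order hits from highest to lowest combined score."""
    return sorted(hits, key=score, reverse=True)
```

Under these weights, a perfectly relevant but old and uncommented page (10/16) still outranks a brand-new, heavily commented page with no keyword relevance (6/16).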
Good
• Simplicity
• Advanced Search
• (Some) Specific content type search
• Detailed result
Some Problems
[Figure: three numbered problem examples]
Improvement Ideas
• Give higher priority to some content types
• Projects > Documents > Forums
• Add sorting option
• By type
• Also by date, number of comments
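The two ideas above, boosting by content type and offering sort options, could look roughly like the following sketch. The boost values, the `Result` fields, and the sort-key names are hypothetical; only the ordering Projects > Documents > Forums comes from the slide.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical multipliers that encode Projects > Documents > Forums.
TYPE_BOOST = {"project": 3.0, "document": 2.0, "forum": 1.0}


@dataclass
class Result:
    """Hypothetical search result with the fields needed for sorting."""
    title: str
    type: str
    base_score: float
    created: date
    comments: int


def boosted(result: Result) -> float:
    """Multiply the base score by the content-type boost (1.0 if unknown)."""
    return result.base_score * TYPE_BOOST.get(result.type, 1.0)


# Each key sorts ascending, so values are negated to get "best first".
SORT_KEYS = {
    "relevance": lambda r: -boosted(r),
    "type": lambda r: (-TYPE_BOOST.get(r.type, 1.0), -boosted(r)),
    "date": lambda r: -r.created.toordinal(),
    "comments": lambda r: -r.comments,
}


def sort_results(results: list[Result], by: str = "relevance") -> list[Result]:
    """Return results ordered by the chosen sort option."""
    return sorted(results, key=SORT_KEYS[by])
```

A multiplier is only one way to express type priority; an additive bonus or a strict grouping by type would be equally valid readings of the slide.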
More Ideas
• Weight by
• Number of incoming links (like PageRank)
• Tag/Category/Taxonomy
• Misspelling Handler
• Synonym Handler
• e.g. “Category” = “Taxonomy”
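As a rough illustration of the misspelling and synonym handlers, the sketch below corrects each query term against a known vocabulary and then expands it with synonyms. The vocabulary list, the 0.8 similarity cutoff, and the function names are assumptions; only the "Category" = "Taxonomy" pair comes from the slide. Fuzzy matching uses Python's standard `difflib`, which is a stand-in for whatever a production search engine would use.

```python
import difflib

# Synonym pair from the slide; a real table would be much larger.
SYNONYMS = {"category": {"taxonomy"}, "taxonomy": {"category"}}

# Hypothetical vocabulary, e.g. the distinct terms in the search index.
VOCABULARY = ["taxonomy", "category", "module", "theme", "translation"]


def correct(term: str, vocabulary: list[str] = VOCABULARY) -> str:
    """Return the closest vocabulary term, or the term itself if none is close."""
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else term


def expand_query(query: str) -> list[set[str]]:
    """For each query term, return the set of acceptable terms.

    A document matches if it contains at least one term from every set.
    """
    expanded = []
    for term in query.lower().split():
        term = correct(term)
        expanded.append({term} | SYNONYMS.get(term, set()))
    return expanded
```

For example, `expand_query("taxonmy")` first corrects the typo to "taxonomy" and then adds "category" as an accepted alternative.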
More Experimental Ideas
• Faceted Search
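Faceted search shows, next to the results, how many hits fall under each value of a field (for example content type) and lets the user narrow the results by picking one. A minimal sketch, assuming results are plain dictionaries with hypothetical `title` and `type` fields:

```python
from collections import Counter


def facet_counts(results: list[dict], field: str) -> Counter:
    """Count how many results carry each value of the given field."""
    return Counter(r[field] for r in results)


def apply_facet(results: list[dict], field: str, value: str) -> list[dict]:
    """Keep only the results whose field equals the selected facet value."""
    return [r for r in results if r[field] == value]


# Hypothetical result set for a single query.
results = [
    {"title": "Views", "type": "module"},
    {"title": "Views not showing", "type": "forum"},
    {"title": "CCK", "type": "module"},
]
counts = facet_counts(results, "type")          # module: 2, forum: 1
modules_only = apply_facet(results, "type", "module")
```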
Further Issues
• Overall site performance
• Indexing and searching are resource-intensive
• Solution
• “Outsource” search function to dedicated search software?
• Google Box
• Apache Solr