EthicShare.org (Mostly Solr)

1. Twin Cities Drupal Users Group - October 22, 2008. EthicShare: Solr + Drupal Under the Hood Tour

2. EthicShare?

  • Who: University of Minnesota's Center for Bioethics, the University of Minnesota Libraries, and the University of Minnesota Department of Computer Science and Engineering
    • EthicShare's pilot implementation builds on a recent planning phase that was a collaboration with the University of Virginia, Georgetown University, Indiana University-Bloomington, and Indiana University-Purdue University, Indianapolis.
  • What: A sustainable aggregation of bioethics research and a platform for scholarship
  • When: Pilot phase runs from January 2008 to June 2009
  • How: Funded by the Andrew W. Mellon Foundation

3. The Platform

  • Drupal
    • Community Development Framework
  • Solr
    • Faceted Search Appliance

4. The Process 5. 6.

  • Origin: Created by CNET and released January 2006
    • Became an Apache Software Foundation project shortly thereafter
  • Builds on the Lucene search engine library
    • Comes with Lucene's search syntax and features
  • Provides a simple HTTP/XML API
  • Strongly typed field definitions
  • Noteworthy implementations: Netflix, CNET Reviews, GameSpot, Digg
    • More: http://wiki.apache.org/solr/PublicServers

7. Behind the Scenes - Indexing

  • HTTP/XML API
  • http://localhost:8983/solr/update
  • http://localhost:8983/solr/select
  • Indexing = POSTing XML Records to /update
  • Commands:
    • 101
    • 2
    • Solr Search is Simply Great
    • Solr and Drupal are like PB And J
    • 1224707462
    • 4
    • libsys
    • 10297
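
The field values listed above appear without their XML markup. As a rough sketch of what POSTing a record to /update can look like, the Python below builds an `<add>` message and a `<commit/>` message in Solr's XML update format. The field names (`id`, `title`, `body`) are hypothetical placeholders, since the slide's own field names are not shown; only the values and the /update URL come from the slide.

```python
import urllib.request
import xml.etree.ElementTree as ET

UPDATE_URL = "http://localhost:8983/solr/update"  # URL from the slide


def build_add_xml(fields):
    """Return an <add><doc>...</doc></add> message for one record."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_xml(xml_body, url=UPDATE_URL):
    """POST an XML message to Solr; requires a running Solr instance."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Field names are hypothetical; the values are taken from the slide.
record = [
    ("id", 101),
    ("title", "Solr Search is Simply Great"),
    ("body", "Solr and Drupal are like PB And J"),
]
add_message = build_add_xml(record)
commit_message = "<commit/>"  # makes previously added documents searchable
```

Against a running Solr, calling `post_xml(add_message)` followed by `post_xml(commit_message)` would index the record and then make it visible to searches.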

8. Behind the Scenes - Searching

  • Get contents of /select URL: cURL, file_get_contents($url)
  • ApacheSolr makes use of a Solr PHP Client Abstraction Layer
    • http://wiki.apache.org/solr/SolPHP
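
Searching amounts to fetching a /select URL and reading the response. The sketch below shows the two halves in Python rather than PHP: building the URL, and pulling documents out of an XML response. The `title` field, the `rows` value, and the sample response are illustrative assumptions, not output from the EthicShare index.

```python
import urllib.parse
import xml.etree.ElementTree as ET

SELECT_URL = "http://localhost:8983/solr/select"  # URL from the Indexing slide


def build_select_url(query, rows=10, base=SELECT_URL):
    """Encode the query parameters onto the /select URL."""
    params = urllib.parse.urlencode({"q": query, "rows": rows})
    return f"{base}?{params}"


def parse_results(xml_text):
    """Return (numFound, list of docs) from a Solr XML response."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    docs = []
    for doc in result.findall("doc"):
        docs.append({child.get("name"): child.text for child in doc})
    return int(result.get("numFound")), docs


# Hand-written sample in the shape of a Solr XML response; not real output.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">101</str>
      <str name="title">Solr Search is Simply Great</str>
    </doc>
  </result>
</response>"""

url = build_select_url("title:solr")
found, docs = parse_results(SAMPLE_RESPONSE)
```

With a live server, `urllib.request.urlopen(url).read()` would play the role that `file_get_contents($url)` plays in PHP.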

9. Setup - Solr Directory Layout

    • Tomcat Files:
    • /tomcat/webapps/solr_ethicshare.war (cp solr.war from example dir)
    • /tomcat/conf/Catalina/localhost/solr_ethicshare.xml
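
The contents of the context file are not shown here. As a hedged sketch, the Python below generates a `<Context>` fragment in the shape commonly documented for running Solr under Tomcat: a `docBase` pointing at the WAR and a `solr/home` environment entry. The Solr home path is a hypothetical placeholder, and the actual EthicShare file may have differed.

```python
import xml.etree.ElementTree as ET


def build_context_xml(war_path, solr_home):
    """Return a Tomcat <Context> fragment pointing at a Solr WAR and home."""
    context = ET.Element(
        "Context", docBase=war_path, debug="0", crossContext="true"
    )
    ET.SubElement(
        context,
        "Environment",
        name="solr/home",
        type="java.lang.String",
        value=solr_home,
        override="true",
    )
    return ET.tostring(context, encoding="unicode")


context_xml = build_context_xml(
    "/tomcat/webapps/solr_ethicshare.war",  # WAR path from the slide
    "/path/to/solr_ethicshare_home",        # hypothetical Solr home directory
)
```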

solr_ethicshare.xml - Tell Tomcat About Solr

10. Solr Schema - Fields and Types

  • Starter schema:
    • ../drupaldir/sites/all/modules/apachesolr/schema.xml
  • ex:
    • string=solr.StrField
    • boolean=solr.BoolField
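
The `name=class` pairs above correspond to `<fieldType>` declarations in schema.xml. A minimal sketch, assuming a hand-written fragment rather than the actual apachesolr starter schema, of how those declarations map type names to Solr classes:

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the style of a schema.xml <types> section;
# real declarations usually carry extra attributes that are omitted here.
SCHEMA_TYPES = """<types>
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="boolean" class="solr.BoolField"/>
</types>"""


def field_type_map(types_xml):
    """Map each declared field type name to its implementing class."""
    root = ET.fromstring(types_xml)
    return {ft.get("name"): ft.get("class") for ft in root.findall("fieldType")}


type_map = field_type_map(SCHEMA_TYPES)
```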

11. Solr Schema - Analyzers

  • Tokenize on whitespace, then remove any common words (StopFilterFactory)
  • Remove any duplicates (RemoveDuplicatesTokenFilterFactory)
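
The analyzer chain above can be imitated in plain Python to see what each step does. This is only an illustration, not Solr's implementation: the stop word list and the case-insensitive matching are assumptions. The duplicate filter here drops a token only when the same term was already seen at the same position, which is the situation Solr's filter is meant for (for example after stemming or synonym expansion, neither of which appears on the slide).

```python
STOP_WORDS = {"a", "an", "and", "are", "is", "the"}  # illustrative list only


def whitespace_tokenize(text):
    """Split on whitespace and return (position, term) pairs."""
    return list(enumerate(text.split()))


def stop_filter(tokens, stop_words=STOP_WORDS):
    """Drop tokens whose lowercased term is a stop word."""
    return [(pos, term) for pos, term in tokens if term.lower() not in stop_words]


def remove_duplicates(tokens):
    """Drop a token if the same term was already seen at the same position."""
    seen = set()
    kept = []
    for token in tokens:
        if token not in seen:
            seen.add(token)
            kept.append(token)
    return kept


def analyze(text):
    """Run the three steps in the order the slide lists them."""
    return remove_duplicates(stop_filter(whitespace_tokenize(text)))


tokens = analyze("Solr and Drupal are like PB And J")
```

Applied to the sample body text from the Indexing slide, the stop filter removes "and", "are", and "And", leaving the remaining terms with their original positions.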

12. Solr Schema - Dynamic Fields

13. Solr Schema - Some Example Options

  • uniqueKey
      • nid
  • defaultSearchField
      • text
  • solrQueryParser
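
These options are single elements in schema.xml. The sketch below reads them from a hand-written fragment. The `nid` and `text` values come from the slide; the schema name and the `defaultOperator="AND"` value are assumptions, since the slide does not show the solrQueryParser setting.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment; the schema name and defaultOperator are assumptions.
SCHEMA_OPTIONS = """<schema name="example" version="1.1">
  <uniqueKey>nid</uniqueKey>
  <defaultSearchField>text</defaultSearchField>
  <solrQueryParser defaultOperator="AND"/>
</schema>"""


def read_schema_options(schema_xml):
    """Collect the three schema-level options into a dictionary."""
    root = ET.fromstring(schema_xml)
    return {
        "uniqueKey": root.findtext("uniqueKey"),
        "defaultSearchField": root.findtext("defaultSearchField"),
        "defaultOperator": root.find("solrQueryParser").get("defaultOperator"),
    }


options = read_schema_options(SCHEMA_OPTIONS)
```

In practice, `uniqueKey` means that re-adding a document with an existing `nid` replaces the earlier copy, and `defaultSearchField` is the field searched when a query names no field.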