Apache Solr: Beyond the Search Page. Rupert Jabelman drupal.org/IRC: rupertj Twitter: @rupertjabelman

Apache Solr: beyond the search page, from DrupalCamp London 2014


A talk on how to modify a default install of Drupal's Apache Solr integration module to provide a customised search experience. http://2014.drupalcamplondon.co.uk/drupalcamp-london-2014/session/apache-solr-beyond-search-page



Who are you again?

I work for Aroq - an online publisher.

We provide daily information in 4 industries: Auto, Food, Drink, Clothing.

Auto is where I spend most of my time, working on QUBE: a site for research and intelligence, rather than news. QUBE uses Solr extensively, with about 65,000 documents in its index.


What’s this about then?

How to use Solr for things that aren’t just the bog standard search page:

1. Content admin page

2. Data visualisation

You can use these techniques to enhance normal search pages too.


A (mostly) bog standard search page.


An admin-focused search


Data visualisation


The Lie.


Differentiating your search pages.


Specialised image search


So what is Solr anyway?


Solr is a web service

And a request looks like this:

start=0&rows=20&&spellcheck=true&q=&fl=id%2Centity_id%2Centity_type%2Cbundle%2Cbundle_name%2Clabel%2Css_language%2Cis_comment_count%2Cds_created%2Cds_changed%2Cscore%2Cpath%2Curl%2Cis_uid%2Ctos_name%2Cteaser%2Czm_parent_entity%2Css_filemime%2Css_file_entity_title%2Css_file_entity_url%2Csm_vid_Segment%2Cim_segment%2Cds_production_dates%2Cds_production_dates_end%2Css_platform_reference%2Css_platform_name%2Css_group_reference%2Css_group_name%2Css_make_reference%2Css_make_name%2Csm_model_name_reference%2Csm_model_name_name%2Csm_parent_model_reference%2Csm_parent_model_name%2Css_codename%2Csm_plant_reference%2Csm_plant_name%2Csm_vid_Production_status%2Cds_next_product_action_date%2Cim_next_product_action%2Csm_vid_Product_action&mm=1&mm=100%25&pf=content%5E2.0&ps=15&hl=true&hl.fl=content&hl.snippets=3&hl.mergeContigious=true&f.content.hl.alternateField=teaser&f.content.hl.maxAlternateFieldLength=256&qf=content%5E40&qf=label%5E5.0&qf=tags_h2_h3%5E3.0&qf=tags_h4_h5_h6%5E2.0&qf=tags_inline%5E1.0&qf=taxonomy_names%5E2.0&qf=tos_name%5E3.0&qf=ts_comments%5E20&facet=true&facet.sort=count&facet.mincount=1&facet.field=im_segment&facet.field=im_country&facet.field=im_production_status&facet.field=sm_plant_reference&facet.field=sm_group_reference&facet.field=%7B%21ex%3Dsm_make_reference%7Dsm_make_reference&facet.field=sm_model_name_reference&facet.field=ss_platform_reference&facet.field=is_eop_year&facet.field=is_sop_year&facet.field=sm_revision_name_formatted&facet.field=is_owner&facet.field=im_field_priority&facet.field=im_field_company_reference&facet.field=im_field_article_type&facet.field=is_uid&facet.field=im_field_intelligence_sector&facet.field=im_field_theme&facet.field=bundle&f.im_segment.facet.limit=-1&f.im_segment.facet.mincount=1&f.im_country.facet.limit=-1&f.im_country.facet.mincount=1&f.im_production_status.facet.limit=50&f.im_production_status.facet.mincount=1&f.sm_plant_reference.facet.limit=100&f.sm_plant_reference.facet.mincount=1&f.sm_group_reference.facet.limit=50&f.sm_group_reference.facet.mincount=1&f.sm_make_reference.facet.limit=50&f.sm_make_reference.facet.mincount=1&f.sm_model_name_reference.facet.limit=50&f.sm_model_name_reference.facet.mincount=1&f.ss_platform_reference.facet.limit=100&f.ss_platform_reference.facet.mincount=1&f.is_eop_year.facet.limit=50&f.is_eop_year.facet.mincount=1&f.is_sop_year.facet.limit=50&f.is_sop_year.facet.mincount=1&f.sm_revision_name_formatted.facet.limit=50&f.sm_revision_name_formatted.facet.mincount=1&f.is_owner.facet.limit=50&f.is_owner.facet.mincount=1&f.im_field_priority.facet.limit=50&f.im_field_priority.facet.mincount=1&f.im_field_company_reference.facet.limit=50&f.im_field_company_reference.facet.mincount=1&f.im_field_article_type.facet.limit=50&f.im_field_article_type.facet.mincount=1&facet.date=ds_created&facet.date=ds_changed&f.ds_created.facet.date.start=1997-01-01T00%3A00%3A00Z%2FYEAR&f.ds_created.facet.date.end=2014-01-01T00%3A00%3A00Z%2B1YEAR%2FYEAR&f.ds_created.facet.date.gap=%2B1YEAR&f.ds_created.facet.limit=50&f.is_uid.facet.limit=50&f.is_uid.facet.mincount=1&f.ds_changed.facet.date.start=2009-01-01T00%3A00%3A00Z%2FYEAR&f.ds_changed.facet.date.end=2014-01-01T00%3A00%3A00Z%2B1YEAR%2FYEAR&f.ds_changed.facet.date.gap=%2B1YEAR&f.ds_changed.facet.limit=50&f.im_field_intelligence_sector.facet.limit=50&f.im_field_intelligence_sector.facet.mincount=1&f.im_field_theme.facet.limit=50&f.im_field_theme.facet.mincount=1&f.bundle.facet.limit=50&f.bundle.facet.mincount=1&sort=ds_changed%20desc&q.alt=%28entity_type%3Aqube_entity%29%20%28bundle%3Aproduction_run%29%20%28im_production_status%3A1167%29&wt=json&json.nl=map


Solr is a web service

And a (helpfully formatted) request looks like this:

q = Toyota
fq = entity_type:qube_entity
fq = bundle:production_run
fq = ss_platform_reference:"qube_entity:3505"
fl = id, entity_id, entity_type, bundle, bundle_name, label, ss_language, is_comment_count, ds_created, ds_changed, score, path, url, is_uid, tos_name, teaser, zm_parent_entity, ss_filemime, ss_file_entity_title, ss_file_entity_url, sm_vid_Segment, im_segment, ds_production_dates, ds_production_dates_end, ss_platform_reference, ss_platform_name, ss_group_reference, ss_group_name, ss_make_reference, ss_make_name, sm_model_name_reference, sm_model_name_name, sm_parent_model_reference, sm_parent_model_name, ss_codename, sm_plant_reference, sm_plant_name, sm_vid_Production_status, ds_next_product_action_date, im_next_product_action, sm_vid_Product_action
mm = 100%
start = 0
rows = 20
spellcheck = true
pf = content^2.0
ps = 15
hl = true
hl.fl = content
hl.snippets = 3
hl.mergeContigious = true
qf = content^40
qf = label^5.0
qf = tags_h2_h3^3.0
qf = tags_h4_h5_h6^2.0
qf = tags_inline^1.0
qf = taxonomy_names^2.0
qf = tos_name^3.0
qf = ts_comments^20
sort = ds_changed desc
wt = json
json.nl = map
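To make the link between the formatted request and the raw one concrete, here is a minimal sketch in plain PHP (no Drupal needed) that turns a parameter list into a Solr-style query string. The function name example_solr_query_string() is made up for illustration and is not part of the apachesolr module. The point it shows is that multi-valued parameters such as fq and qf are sent as repeated keys, which PHP's http_build_query() would not do on its own.

```php
<?php
/**
 * Builds a Solr-style query string from a parameter map.
 *
 * Hypothetical helper for illustration only. A value may be a scalar or a
 * list; a list becomes a repeated key (fq=a&fq=b) rather than the
 * fq[0]=a&fq[1]=b form that http_build_query() produces.
 */
function example_solr_query_string(array $params) {
  $pairs = array();
  foreach ($params as $name => $values) {
    // Casting to array lets scalars and lists share one code path.
    foreach ((array) $values as $value) {
      $pairs[] = rawurlencode($name) . '=' . rawurlencode((string) $value);
    }
  }
  return implode('&', $pairs);
}

$params = array(
  'q' => 'Toyota',
  'fq' => array('entity_type:qube_entity', 'bundle:production_run'),
  'qf' => array('content^40', 'label^5.0'),
  'rows' => 20,
  'wt' => 'json',
);

echo example_solr_query_string($params), "\n";
```

The values used here are a small subset of the request on the slide; the full request simply has many more keys of the same shape.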


Dynamic Fields

Dynamic fields are used so that any field you've added through Field API can be indexed in Solr without changing the schema. Their properties are defined by their prefix.

E.g.:

• im_foo => integer, multi-valued. E.g. taxonomy term IDs.

• sm_bar => string, multi-valued. E.g. taxonomy term names.

• bs_grill => boolean, single. E.g. checkbox fields.

Most fields are added for you, but you can always add your own. In fact, you’ll probably have to.
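The prefix convention can be read mechanically. The sketch below is plain PHP and covers only the prefix letters that appear in this talk (i, s, b, plus d, which is assumed to mean date going by field names such as ds_created); the module's schema defines more. The function name example_describe_dynamic_field() is hypothetical.

```php
<?php
/**
 * Describes a dynamic field from its two-letter prefix.
 *
 * Illustration only: the type map is deliberately incomplete, and
 * 'd' => 'date' is an assumption based on names like ds_created.
 */
function example_describe_dynamic_field($field_name) {
  $types = array(
    'i' => 'integer',
    's' => 'string',
    'b' => 'boolean',
    'd' => 'date',
  );
  $cardinalities = array('s' => 'single', 'm' => 'multi-valued');

  // Expected shape: <type letter><s or m>_<name>, e.g. "im_foo".
  if (!preg_match('/^([a-z])([sm])_(.+)$/', $field_name, $matches)) {
    return NULL;
  }
  list(, $type, $cardinality, $name) = $matches;
  if (!isset($types[$type])) {
    // A prefix letter this sketch does not know about.
    return NULL;
  }
  return array(
    'type' => $types[$type],
    'cardinality' => $cardinalities[$cardinality],
    'name' => $name,
  );
}

print_r(example_describe_dynamic_field('im_foo'));
print_r(example_describe_dynamic_field('bs_grill'));
```

Fixed schema fields such as label or bundle do not follow the pattern, so the helper returns NULL for them.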


You’ll need to feed Solr more data


Adding extra data.

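// Replace "hook" with your module's name.
function hook_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type, $env_id) {
  // Take the first document generated for this entity.
  $document = reset($documents);
  // ss_ prefix: a single-valued string dynamic field.
  $document->setField('ss_foo', $entity->field_foo…);
}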

Most field data is included for you already (with a few exceptions). Data you add here can be anything you like though.
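As a slightly fuller sketch of the same hook, the example below indexes term IDs into a multi-valued integer field. The module name "example", the field field_example_tags and the Solr field im_example_tags are all hypothetical, and the code assumes the document object offers an addField() method that appends a value (as the Solr document class bundled with the module is understood to do). The 'und' default stands in for Drupal 7's LANGUAGE_NONE so the helper runs outside Drupal.

```php
<?php
/**
 * Collects term IDs from a Drupal 7 style field value.
 *
 * Expects the usual $entity->field_name[langcode][delta]['tid'] layout.
 */
function example_collect_tids($field_value, $langcode = 'und') {
  $tids = array();
  if (!empty($field_value[$langcode])) {
    foreach ($field_value[$langcode] as $item) {
      if (isset($item['tid'])) {
        $tids[] = (int) $item['tid'];
      }
    }
  }
  return $tids;
}

/**
 * Implements hook_apachesolr_index_documents_alter().
 */
function example_apachesolr_index_documents_alter(array &$documents, $entity, $entity_type, $env_id) {
  // field_example_tags is a hypothetical term reference field.
  if (empty($entity->field_example_tags)) {
    return;
  }
  $tids = example_collect_tids($entity->field_example_tags);
  foreach ($documents as $document) {
    foreach ($tids as $tid) {
      // Assumed to append, so repeated calls build a multi-valued field.
      $document->addField('im_example_tags', $tid);
    }
  }
}
```

Content has to be re-indexed before the new field shows up in Solr.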


Checking your work: Solr admin.


Modifying queries.

Add extra fields to the results:

function hook_apachesolr_query_alter($query) {
  $query->addParam('fl', 'sm_foo');
}

Changing the query fields:

function hook_apachesolr_query_alter($query) {
  $query->replaceParam('qf', array(
    'label^2.0',
    'content^1.0',
  ));
}


Modifying queries.

Changing the number of results to return:

function hook_apachesolr_query_alter($query) {
  // Solr doesn't have a value for "unlimited"
  // so we'll pass in a Very Large Number.
  $query->addParam('rows', 999999);
}

Changing the sort order (setting a default in this case):

function hook_apachesolr_query_alter($query) {
  // Only apply when the user hasn't picked a sort and there are no keywords.
  if (!isset($_GET['solrsort']) && ($query->getParam('q') == '')) {
    $query->setSolrSort('ds_changed', 'desc');
  }
}


Adding sorts

Override SolrBaseQuery. The apachesolr_query_class variable controls which class is instantiated.

class MySolrBaseQuery extends SolrBaseQuery {

  protected function defaultSorts() {
    $sorts = parent::defaultSorts();
    // Add in core changed property. Missing by default.
    $sorts['ds_changed'] = array('title' => t('Updated Date'), 'default' => 'desc');
    // Return the extended list, otherwise the override yields no sorts at all.
    return $sorts;
  }

}

apachesolr_sort module can be used to manage sorts:

• Choose a default sort & order.

• Enable / Disable the available sorts.


Bringing it all together.

Create a search page.

Apply the limits you need. Eg entity type, bundles, etc.

Add the data you need to solr documents.

Alter the query to add any additional fields you want to display.

Place any facet blocks you want.

Theme it.
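The query-side steps above could be combined in a single alter hook, roughly as sketched here. The module name "example" is hypothetical, the bundle and field names are borrowed from earlier slides, and addFilter() is assumed to be available on the query object alongside the addParam(), getParam() and setSolrSort() calls already shown. As written the hook would affect every search page, so a real module would first check which page the query belongs to.

```php
<?php
/**
 * Implements hook_apachesolr_query_alter().
 *
 * A sketch only: applies a limit, requests an extra field and sets a
 * default sort in one place.
 */
function example_apachesolr_query_alter($query) {
  // Apply the limits you need, here a single bundle.
  $query->addFilter('bundle', 'production_run');

  // Ask Solr to return an additional field for display.
  $query->addParam('fl', 'sm_foo');

  // With no keywords and no user-chosen sort, show recent changes first.
  if (!isset($_GET['solrsort']) && $query->getParam('q') == '') {
    $query->setSolrSort('ds_changed', 'desc');
  }
}
```

Limits can also be set in the search page's own configuration, which keeps them out of code; the hook is the more flexible option when the limit depends on context.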


An admin-focused search


Data visualisation


Swopping out theme functions.

function hook_apachesolr_search_page_alter(&$build, $search_page) {
  if ($search_page['page_id'] == 'pldb') {
    // If a timeseries view was requested, switch the theme function to one we define that draws timeseries.
    if (qube_pldb_search_results_view() == 'timeseries') {
      $env_id = 'solr';
      if (apachesolr_has_searched($env_id)) {
        $query = apachesolr_current_query($env_id);
        if (qube_pldb_search_results_view_timeseries_allowed($query)) {
          $build['search_results']['#theme'] = 'search_results_timeseries';
        }
        ….
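For the swapped-in theme hook to work, it has to be registered and given an implementation. The slides don't show that part, so the following is only a guess at its shape: a hook_theme() entry in a hypothetical "example" module, plus a stand-in theme function that counts results per year instead of drawing a real chart. It assumes the render array passes the results as a 'results' variable and that each result carries its Solr fields under a 'fields' key, including ds_changed as an ISO date string.

```php
<?php
/**
 * Implements hook_theme().
 */
function example_theme() {
  return array(
    'search_results_timeseries' => array(
      // Assumed to mirror the variables of the normal results theme hook.
      'variables' => array('results' => NULL, 'module' => NULL, 'search_page' => NULL),
    ),
  );
}

/**
 * Returns HTML for a very simple per-year summary of the results.
 *
 * Stand-in for a real chart; the result structure is an assumption.
 */
function theme_search_results_timeseries($variables) {
  $per_year = array();
  foreach ((array) $variables['results'] as $result) {
    if (isset($result['fields']['ds_changed'])) {
      // ISO dates start with the four-digit year.
      $year = substr($result['fields']['ds_changed'], 0, 4);
      $per_year[$year] = isset($per_year[$year]) ? $per_year[$year] + 1 : 1;
    }
  }
  ksort($per_year);

  $items = '';
  foreach ($per_year as $year => $count) {
    $items .= '<li>' . htmlspecialchars((string) $year) . ': ' . $count . '</li>';
  }
  return '<ul class="timeseries">' . $items . '</ul>';
}
```

In practice the theme function would hand the aggregated numbers to whatever charting library the site uses.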


Fin.

Any questions?