DON'T LIKE YOUR GOOGLE SEARCH INTERFACE? MAKE YOUR OWN!
C. Daniel Chase — @cdchase
The University of Tennessee at Chattanooga
#aim7 #heweb14
PUZZLE PARTS
Comparisons
Search Form
Search Request Processing
Search API
Result Processing
Customizing Output
Integrating into Website
Page Not Found (404) Handling
GOOGLE CUSTOM SEARCH ENGINE
Free
Cannot Customize Results
Ads on results pages (can be disabled for non-profit)
Google Branded
GOOGLE SITE SEARCH
Formerly Google Custom Search Business Edition
NOT Free!
Licensed by Number of Searches
Indexes 13 file formats
Our Search count 934,000+/year = Over $2,000 for license
Larger license is for off-line engine
XML Results Query Reference
GOOGLE SEARCH APPLIANCE (GSA)
Hardware
Licensed by Document Count
Indexes over 220 file formats
Can index sites requiring authentication
MAKING A SEARCH QUERY
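// Build the search request only when a non-empty query was submitted.
if (isset($_POST['q']) && $_POST['q'] != '') {
    // Request URL for the search appliance, one parameter per line.
    $url = "http://google.tennessee.edu/search?"
        . "client=utk_frontend&"
        . "output=xml_no_dtd&"
        . "sort=date:D:L:d1&"
        . "entqr=3&"
        . "ie=UTF-8&"
        . "ud=1&"
        . "site=Chattanooga&"
        . "start=0&"
        // URL-encode the user's query before appending it.
        . "q=" . urlencode(stripslashes($_POST['q']));
    // Copy of the query with tags stripped and entities decoded;
    // escape it with htmlspecialchars() before echoing it into a page.
    $q = html_entity_decode(strip_tags($_POST['q']));
}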
http://google.tennessee.edu/search?client=utk_frontend&output=xml_no_dtd&sort=date:D:L:d1&entqr=3&ie=UTF-8&ud=1&site=Chattanooga&start=0&q=university%20web%20services
SEARCH QUERY - RESPONSE
<gsp ver="3.2">
<tm>0.256643</tm>
<q>university web services</q>
<param name="client" value="utk_frontend" original_value="utk_frontend">
<param name="output" value="xml_no_dtd" original_value="xml_no_dtd">
<param name="sort" value="date:D:L:d1" original_value="date:D:L:d1">
<param name="entqr" value="3" original_value="3">
<param name="ie" value="UTF-8" original_value="UTF-8">
<param name="ud" value="1" original_value="1">
<param name="site" value="Chattanooga" original_value="Chattanooga">
<param name="start" value="0" original_value="0">
<param name="q" value="university web services" original_value="university+web+services">
<param name="ulang" value="en" original_value="en">
<param name="ip" value="150.182.252.13" original_value="150.182.252.13">
<param name="access" value="p" original_value="p">
<param name="entqrm" value="0" original_value="0">
<param name="entsp" value="a__urlpattern_policy" original_value="a__urlpattern_policy">
<param name="wc" value="200" original_value="200">
<param name="wc_mc" value="1" original_value="1">
<res sn="1" en="10">
<m>1940</m>
<fi>
<wxt>
<nb>
<nu>/search?q=university+web+services&site=Chattanooga&lr=&ie=UTF-8&output=xml_no_dtd&client=utk_frontend&access=p&sort=date:D:L:d1&start=10&sa=N</nu>
</nb>
<r n="1">
<u>http://www.utc.edu/university-web-services/</u>
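A minimal sketch of reading a response like the one above with PHP's SimpleXML extension. The helper name `parse_search_response` and the trimmed sample document are assumptions for illustration. The element names follow the lowercase form shown on the slide; SimpleXML is case-sensitive, so adjust them if your appliance returns different casing.

```php
<?php
// Trimmed, well-formed sample modelled on the slide's response (an assumption,
// not a complete appliance response).
$sampleXml = <<<'XML'
<gsp ver="3.2">
  <tm>0.256643</tm>
  <q>university web services</q>
  <res sn="1" en="10">
    <m>1940</m>
    <r n="1">
      <u>http://www.utc.edu/university-web-services/</u>
    </r>
  </res>
</gsp>
XML;

// Hypothetical helper: pull the query, counts, and result URLs out of a response.
function parse_search_response(string $xml): array
{
    // Collect parse errors quietly instead of emitting warnings.
    libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);

    $parsed = ['query' => '', 'estimated_total' => 0, 'first' => 0, 'last' => 0, 'urls' => []];
    if ($doc === false) {
        return $parsed; // Unparseable response: return the empty structure.
    }

    $parsed['query'] = (string) $doc->q;
    if (isset($doc->res)) {
        $parsed['estimated_total'] = (int) $doc->res->m;   // <m>: estimated result count
        $parsed['first'] = (int) $doc->res['sn'];          // sn: first result on this page
        $parsed['last'] = (int) $doc->res['en'];           // en: last result on this page
        foreach ($doc->res->r as $result) {
            $parsed['urls'][] = (string) $result->u;       // <u>: result URL
        }
    }
    return $parsed;
}

$response = parse_search_response($sampleXml);
```

Returning an empty structure on a parse failure lets the results page fall back to a "no results" message rather than an error.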
SEARCH API
Bookmark the Reference documentation!
https://support.google.com/gsa/answer/3890846?hl=en&ref_topic=2709671
More specifically, the Search Protocol Reference:
http://www.google.com/support/enterprise/static/gsa/docs/admin/72/gsa_doc_set/xml_reference/
REQUIRED SEARCH PARAMETERS
site
Limits search results to the contents of the specified collection.
client
A string that indicates a valid front end and the policies defined for it, including KeyMatches, related queries, filters, remove URLs, and OneBox Modules.
output
Selects the format of the search results.
q
Search query as entered by the user.
site=Chattanooga
client=utk_frontend
output=xml_no_dtd
q=university%20web%20services
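One way to make sure the four required parameters are always present is to check them before building the URL. This is only a sketch: `build_search_url` is a hypothetical helper, not something from the appliance or from our production code.

```php
<?php
// Hypothetical helper: refuse to build a request that lacks a required parameter.
function build_search_url(string $baseUrl, array $params): string
{
    $required = ['site', 'client', 'output', 'q'];
    foreach ($required as $name) {
        if (!isset($params[$name]) || $params[$name] === '') {
            throw new InvalidArgumentException("Missing required search parameter: $name");
        }
    }
    // PHP_QUERY_RFC3986 encodes spaces as %20, matching the example URL on the slide.
    return $baseUrl . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

$url = build_search_url('http://google.tennessee.edu/search', [
    'site'   => 'Chattanooga',
    'client' => 'utk_frontend',
    'output' => 'xml_no_dtd',
    'q'      => 'university web services',
]);
```

Failing early on a missing parameter is a design choice for this sketch; a production page might prefer to fall back to an empty results view.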
PREFERENCE SEARCH PARAMETERS
sort
Results can be sorted by relevance, date or metadata.
entqr
This parameter sets the query expansion policy. 3 is Full: Uses both standard and local synonym files.
ie
Sets the character encoding that is used to interpret the query.
ud
Specifies whether results include ud tags. A ud tag contains internationalized domain name (IDN) encoding for a result URL.
sort=date:D:L:d1
entqr=3
ie=UTF-8
ud=1
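The preference values above rarely change between requests, so they can live in one place as defaults. A small sketch, assuming a hypothetical `with_preferences` helper and constant name; the default values are the ones shown on the slide.

```php
<?php
// Preference defaults taken from the slide.
const DEFAULT_PREFERENCES = [
    'sort'  => 'date:D:L:d1',
    'entqr' => '3',
    'ie'    => 'UTF-8',
    'ud'    => '1',
];

// Hypothetical helper: layer per-request overrides and the required
// parameters on top of the defaults.
function with_preferences(array $required, array $overrides = []): array
{
    // For string keys, later arrays win in array_merge, so an override
    // replaces a default and a required value is never clobbered.
    return array_merge(DEFAULT_PREFERENCES, $overrides, $required);
}

// Example: keep every default except the query expansion policy
// (the override value here is only for illustration).
$params = with_preferences(['q' => 'campus maps'], ['entqr' => '0']);
```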
MORE SEARCH PARAMETERS
start
Specifies the index number of the first entry in the result set that is to be returned. (Use with num.)
start=0
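Because `start` is a zero-based index, paging is simple arithmetic: page 1 starts at 0, page 2 at 10, and so on. A sketch with a hypothetical `start_for_page` helper, assuming 10 results per page as in the sample response (results 1 to 10, next link at `start=10`).

```php
<?php
// Hypothetical helper: convert a 1-based page number into the start parameter.
function start_for_page(int $page, int $perPage = 10): int
{
    if ($page < 1 || $perPage < 1) {
        throw new InvalidArgumentException('Page and page size must be 1 or greater.');
    }
    // start is zero-based, so the first page begins at index 0.
    return ($page - 1) * $perPage;
}

$firstPage  = start_for_page(1);      // start for results 1-10
$secondPage = start_for_page(2);      // start for results 11-20
$widePage   = start_for_page(3, 20);  // third page when num=20
```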
RESULT PROCESSING
We base our search result handling on a customized version of the template provided with the GSA, but you can build your own.
Remove SERP <head> content to wrap with your template.
Replace references to search in links and form action to point at your new page.
Review the settings at the top of the GSA default XSL for configurable options.
Remove or fine-tune page top & bottom content.
Remove conflicting CSS.
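Pointing links at your new page can be sketched as below, using the next-page link (`<nu>`) from the sample response. The helper `local_page_link` and the page path `/search-results.php` are hypothetical. The sketch keeps only `q` and `start`, on the assumption that your page supplies the other parameters itself.

```php
<?php
// Hypothetical helper: turn an appliance link such as the <nu> value into
// a link to your own results page, keeping only the query and offset.
function local_page_link(string $applianceLink, string $localPage): string
{
    // Take everything after the first "?" as the query string.
    $pos = strpos($applianceLink, '?');
    $query = $pos === false ? '' : substr($applianceLink, $pos + 1);

    $params = [];
    parse_str($query, $params); // Decodes "+" and %XX escapes.

    $keep = [
        'q'     => $params['q'] ?? '',
        'start' => $params['start'] ?? '0',
    ];
    return $localPage . '?' . http_build_query($keep);
}

$next = local_page_link(
    '/search?q=university+web+services&site=Chattanooga&lr=&ie=UTF-8&output=xml_no_dtd'
        . '&client=utk_frontend&access=p&sort=date:D:L:d1&start=10&sa=N',
    '/search-results.php'
);
```

Escape the resulting link with `htmlspecialchars()` when writing it into an `href` attribute.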
CUSTOMIZING OUTPUT
Start with built-in options
Replaced the Google logo
Added the header used on the organization's web site.
Changed search button text
Changed the advanced search anchor text...
Review output for other changes
Don't be afraid of (do not customize)
INTEGRATING INTO WEBSITE
Every page should have a search form!
Customize page content to improve search (SEO)
Add standard description & keyword meta tags
Add custom meta tags
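If your pages come from a template, the meta tags can be generated from page data. A rough sketch: `meta_tags` is a hypothetical helper, and the custom tag name `department` is only an example, not a name the search engine requires.

```php
<?php
// Hypothetical helper: render name/content pairs as escaped <meta> tags.
function meta_tags(array $meta): string
{
    $lines = [];
    foreach ($meta as $name => $content) {
        $lines[] = sprintf(
            '<meta name="%s" content="%s">',
            htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $content, ENT_QUOTES, 'UTF-8')
        );
    }
    return implode("\n", $lines);
}

$html = meta_tags([
    'description' => 'Web services for the university community',
    'keywords'    => 'web, search, services',
    'department'  => 'University Web Services', // custom tag; the name is an assumption
]);
```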
PAGE NOT FOUND (404) HANDLING
Don't redirect directly to search page!
Must send 404 Error to search engine crawlers
Be nice to people: do a search for them!
Historic page redirects
Parse requested URL and use it to search!
The Trick: Plain HTML 404 page with JavaScript redirect
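A sketch of the trick, under stated assumptions: both helper names and the `/search-results.php` path are hypothetical, and the way words are split out of the URL is one simple choice among many. The page is still served with a 404 status, so crawlers see the error while visitors with JavaScript are sent on to a search.

```php
<?php
// Hypothetical helper: turn a missing page's path into search terms.
function search_terms_from_path(string $requestUri): string
{
    // Drop any query string, then decode %XX escapes.
    $path = rawurldecode(explode('?', $requestUri, 2)[0]);
    // Drop a trailing file extension such as ".html" or ".php".
    $path = preg_replace('/\.[A-Za-z0-9]+$/', '', $path);
    // Treat slashes, hyphens, underscores, and other punctuation as word breaks.
    $terms = preg_replace('/[^A-Za-z0-9]+/', ' ', $path);
    return trim($terms);
}

// Hypothetical helper: plain HTML 404 body with a JavaScript redirect.
function not_found_page(string $requestUri, string $searchPage): string
{
    $target = $searchPage . '?'
        . http_build_query(['q' => search_terms_from_path($requestUri)]);
    // json_encode yields a quoted JavaScript string literal; the HEX flags
    // keep "<", ">", and "&" out of the inline script.
    $jsTarget = json_encode($target, JSON_HEX_TAG | JSON_HEX_AMP);
    return "<!DOCTYPE html>\n<html><head><title>Page Not Found</title>\n"
        . "<script>window.location.replace($jsTarget);</script>\n"
        . "</head><body><p>Page not found. Searching for it...</p></body></html>\n";
}

// In a real 404 handler, send the status first so crawlers see the error:
//   http_response_code(404);
//   echo not_found_page($_SERVER['REQUEST_URI'], '/search-results.php');
$terms = search_terms_from_path('/academics/old-page.html?ref=1');
$page  = not_found_page('/academics/old-page.html?ref=1', '/search-results.php');
```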
QUESTIONS?