Thanks
• CGA
• Ben Lewis
• Dave Strohschein
Layer Level Search Of Spatial Resources
A Spatial Resource
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Harvard</name>
      <description>You Are Here</description>
      <Point>
        <coordinates>-71.1169,42.3774,0</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>
Not A Spatial Resource
Web Services
• Individual Layer Level Search
• OGC - Get Capabilities, WMS
• ESRI REST
Anchor Link Signatures
• Anchor Links To Spatial Resources
• ?request=GetCapabilities
• /ArcGIS/rest/service
• *.kml and *.kmz
• */shape/*.zip
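The signatures above can be checked with a few string tests on each anchor link. The following is a minimal sketch, not the crawler's actual code: the function name `classify_link`, the labels it returns, and the choice to match case-insensitively are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def classify_link(href):
    """Return a label if href matches one of the anchor link signatures, else None.

    Labels and case-insensitive matching are illustrative assumptions.
    """
    parts = urlsplit(href)
    path = parts.path.lower()
    query = parts.query.lower()

    if "request=getcapabilities" in query:      # ?request=GetCapabilities
        return "ogc"
    if "/arcgis/rest/service" in path:          # /ArcGIS/rest/service
        return "esri_rest"
    if path.endswith((".kml", ".kmz")):         # *.kml and *.kmz
        return "kml"
    if "/shape/" in path and path.endswith(".zip"):  # */shape/*.zip
        return "shapefile"
    return None
```

Parsing the URL first keeps the query-string test away from the path tests, so a page whose path merely mentions "GetCapabilities" is not flagged.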
Not JavaScript Code
<script>
L.esri.tiledMapLayer(
"http://basemap.nationalmap.gov/ArcGIS/rest/services/USGSTopo/MapServer",
{opacity: 0.50, zIndex:2}).addTo(map);
</script>
Not HTML Tags
<body>
Please use my base layer:
<blink>
http://basemap.nationalmap.gov/ArcGIS/rest/services/USGSTopo/MapServer
</blink>
</body>
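The last two slides say that a service URL inside a script or inside ordinary page text does not count; only anchor links do. A minimal sketch of that rule, using Python's standard `html.parser` in place of jsoup (the class and function names are hypothetical):

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values from <a> tags only (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def anchor_links(html):
    """Return the hrefs of all anchor tags in an HTML string."""
    collector = AnchorCollector()
    collector.feed(html)
    return collector.hrefs
```

Text inside `<script>` and `<blink>` reaches the parser as character data, never as a start tag, so URLs mentioned there are skipped.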
Google Advanced Search
• What Will A Crawl Discover?
• allinanchor:, allinurl:, filetype:
• Follow Terms Of Service
Limited Crawl
• Crawl A Couple Sites
• JCrawler: Provide Two Functions
• Follow This Link?
• Process Page
• Run On Localhost, Obey robots.txt
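The "provide two functions" idea can be sketched as a small breadth-first loop that takes a follow-this-link test and a process-page callback. This is not JCrawler's API: every name below is hypothetical, and page fetching and link extraction are passed in as functions so the sketch stays independent of any HTTP library. The robots.txt check uses the standard `urllib.robotparser`.

```python
from collections import deque
from urllib.robotparser import RobotFileParser


def crawl(seeds, fetch, extract_links, should_follow, process_page,
          robots, max_pages=100, agent="*"):
    """Breadth-first crawl driven by two user callbacks.

    should_follow(source_url, link) -> bool   ("Follow This Link?")
    process_page(url, page)                   ("Process Page")
    robots is a RobotFileParser that has already been loaded.
    Returns the number of pages processed.
    """
    queue = deque(seeds)
    seen = set(seeds)
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(agent, url):   # obey robots.txt
            continue
        page = fetch(url)
        visited += 1
        process_page(url, page)
        for link in extract_links(page):
            if link not in seen and should_follow(url, link):
                seen.add(link)
                queue.append(link)
    return visited
```

A single `RobotFileParser` only covers one host, which fits a crawl of a couple of sites; a wider crawl would need one parser per host.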
Find ALL Spatial Resources
• Not With A Cluster Running Nutch
• Too Hard!
CommonCrawl.org
• Monthly Crawl, 2-3 Billion Web Pages
• 55,000 WARC Files On Amazon East
• Hadoop Sample Code
• Add jsoup And Several Hundred Lines Of Code
CommonCrawl Blog
Easier Hadoop
One Complete “Crawl”
• 25 Slaves, 3 Full Days
• $1400
• R3.XLarge - Lots Of Memory
Sample Crawl Output
http://www.ga.gov.au/gis/services/earth_science/GA_Surface_Geology_of_Australia/MapServer/WMSServer?request=GetCapabilities&service=WMS 306
http://maps.ngdc.noaa.gov/soap/web_mercator/dem_extents/MapServer/WMSServer?request=GetCapabilities%26service=WMS 169
http://www.ga.gov.au/gis/services/earth_science/Geoscience_Australia_Seismic_Surveys/MapServer/WMSServer?request=GetCapabilities&service=WMS 144
http://gis.ngdc.noaa.gov/arcgis/services/dem_hillshades/ImageServer/WMSServer?request=GetCapabilities%26service=WMS 132
http://www.ga.gov.au/gis/services/topography/Australian_Topography/MapServer/WMSServer?request=GetCapabilities&service=WMS 108
http://www.ga.gov.au/gis/services/earth_science/Crustal_Elements_of_Australia/MapServer/WMSServer?request=GetCapabilities&service=WMS 108
Sample Crawl Output
http://www.ga.gov.au/data-pubs/web-services/replacement-services-for-the-national-geoscience-datasets-wms|||http://www.ga.gov.au/gis/services/earth_science/GA_Surface_Geology_of_Australia/MapServer/WMSServer?request=GetCapabilities&service=WMS -306
http://www.ga.gov.au/data-pubs/web-services/replacement-services-for-the-national-geoscience-datasets-wms|||http://www.ga.gov.au/gis/services/earth_science/Geoscience_Australia_Seismic_Surveys/MapServer/WMSServer?request=GetCapabilities&service=WMS -144
http://www.ga.gov.au/data-pubs/web-services/replacement-services-for-the-national-geoscience-datasets-wms|||http://www.ga.gov.au/gis/services/earth_science/Crustal_Elements_of_Australia/MapServer/WMSServer?request=GetCapabilities&service=WMS -108
http://www.ga.gov.au/data-pubs/web-services/replacement-services-for-the-national-geoscience-datasets-wms|||http://www.ga.gov.au/gis/services/topography/Australian_Topography/MapServer/WMSServer?request=GetCapabilities&service=WMS -108
http://www.ga.gov.au/data-pubs/web-services/replacement-services-for-the-national-geoscience-datasets-wms|||http://www.ga.gov.au/gis/services/earth_science/Geoscience_Australia_Airborne_Geophysics/MapServer/WMSServer?request=GetCapabilities&service=WMS -90
Key / Value Pairs
• String / Integer Pairs
• Value > 0
• URL To Resource / Frequency Count
• Value < 0
• URL To Resource + “|||” + Page Found On
• Use Unix Commands To Split, Sort File
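The slide suggests Unix commands for this step; as an alternative illustration, the same split can be sketched in Python. Assumptions: key and value are separated by whitespace, and the part before "|||" is the page the link was found on, which is the order the sample output shows. The function name is hypothetical.

```python
def parse_crawl_output(lines):
    """Split crawl output into frequency counts and page/resource pairs.

    Returns (counts, found_on):
      counts   - dict mapping resource URL to its frequency count (value > 0)
      found_on - list of (page_url, resource_url, count) tuples (value < 0)
    """
    counts = {}
    found_on = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, value = line.rsplit(None, 1)   # the value is the last field
        n = int(value)
        if n > 0:
            counts[key] = n
        elif n < 0:
            page, resource = key.split("|||", 1)
            found_on.append((page, resource, -n))
    return counts, found_on
```

Sorting `counts.items()` by value in descending order gives the same ranking as the first sample output.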
Harvester
• Input: List Of Spatial Resources
• Processing:
• Obtain Metadata On Each Layer
• Periodically Re-visit
• Output: Solr Records, Report
Layer Level Search Of Spatial Resources
Search
• People Expect Good Results
• Always Too Many Results For Human Review
• Ranking / Scoring Results Is Key
Some Layers Not Relevant
Layer Within Map
Similar Center
Similar Area
Spatial Solr
• Old Style: Floats
• New Style: Rectangle, Polygons
Solr Schema
• Define Fields To Support Search
• Pre-compute Intermediate Result
• Data Type = Search Options
• Or Schema-less
Solr Schema
MinX, MaxX, CenterX
MinY, MaxY, CenterY
HalfWidth
HalfHeight
Area
tdouble Field Types
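The derived fields can be computed once, at index time, from the four bounding-box values. A minimal sketch, assuming Area is width times height in square degrees (which agrees with the numbers in the old-style query on a later slide); the function name is hypothetical.

```python
def bbox_fields(min_x, min_y, max_x, max_y):
    """Pre-compute the intermediate schema fields for one layer's bounding box."""
    width = max_x - min_x
    height = max_y - min_y
    return {
        "MinX": min_x, "MaxX": max_x, "CenterX": (min_x + max_x) / 2.0,
        "MinY": min_y, "MaxY": max_y, "CenterY": (min_y + max_y) / 2.0,
        "HalfWidth": width / 2.0,
        "HalfHeight": height / 2.0,
        "Area": width * height,   # square degrees (assumption)
    }
```

Storing these values means the query only has to compare numbers rather than recompute them for every document.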
Old Style Solr Query
http://geodata.tufts.edu/solr/select?q=_val_:%22product(10.0,map(sum(map(MinX,-71.143160023987,-71.096038976013,1,0),map(MaxX,-71.143160023987,-71.096038976013,1,0),map(MinY,42.385170824958,42.428266055761,1,0),map(MaxY,42.385170824958,42.428266055761,1,0)),4,4,1,0)))%22_val_:%22product(15.0,recip(sum(abs(sub(Area,0.002030692438118123)),.01),1,1000,1000))%22_val_:%22product(3.0,recip(abs(sub(product(sum(MaxX,MinX),.5),-71.11959949999999)),1,1000,1000))%22_val_:%22product(3.0,recip(abs(sub(product(sum(MaxY,MinY),.5),42.4067184403595)),1,1000,1000))%22+AND+%28LayerDisplayName:water^3+OR+ThemeKeywords:water^2+OR+PlaceKeywords:water^2%29+AND+%28ThemeKeywords:geoscientificinformation^4%29&&fq={!frange+l%3D1+u%3D10}product(2.0,map(sum(map(sub(abs(sub(-71.11959949999999,CenterX)),sum(0.023560523986994042,HalfWidth)),0,400000,1,0),map(sub(abs(sub(42.4067184403595,CenterY)),sum(0.021547615401498632,HalfHeight)),0,400000,1,0)),0,0,1,0))&wt=json&fl=Name,CollectionId,Institution,Access,DataType,Availability,LayerDisplayName,Publisher,GeoReferenced,Originator,Location,MinX,MaxX,MinY,MaxY,ContentDate,LayerId,score,WorkspaceName,SrsProjectionCode&rows=27&start=0&sort=score+desc&fq=ContentDate:[1950-01-01T01:01:01Z+TO+2012-01-01T01:01:01Z]&fq=DataType%3APoint&fq=Institution%3ATufts+OR+Institution%3AHarvard&fq=Institution:Tufts+OR+Access:Public&json.wrf=jQuery16408675794449108286_1331937717696&_=1331941365233
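The four `_val_` function queries in that URL correspond to the earlier ranking slides: layer within map (weight 10), similar area (weight 15), and similar center in X and Y (weight 3 each). The sketch below restates them in Python to make the arithmetic readable. It assumes the four terms are simply added and it leaves out the keyword boosts; `solr_map` and `solr_recip` mirror Solr's `map(x,min,max,target,default)` and `recip(x,m,a,b)` functions, and the remaining names are hypothetical.

```python
def solr_map(x, lo, hi, target, default):
    """Solr map(): target when lo <= x <= hi, otherwise default."""
    return target if lo <= x <= hi else default


def solr_recip(x, m, a, b):
    """Solr recip(): a / (m * x + b)."""
    return a / (m * x + b)


def area(box):
    return (box["MaxX"] - box["MinX"]) * (box["MaxY"] - box["MinY"])


def center(box):
    return ((box["MinX"] + box["MaxX"]) * 0.5, (box["MinY"] + box["MaxY"]) * 0.5)


def spatial_score(layer, view):
    """Sum of the four spatial terms; layer and view hold MinX, MaxX, MinY, MaxY."""
    edges_inside = (
        solr_map(layer["MinX"], view["MinX"], view["MaxX"], 1, 0)
        + solr_map(layer["MaxX"], view["MinX"], view["MaxX"], 1, 0)
        + solr_map(layer["MinY"], view["MinY"], view["MaxY"], 1, 0)
        + solr_map(layer["MaxY"], view["MinY"], view["MaxY"], 1, 0)
    )
    # Layer within map: all four edges must fall inside the view.
    within = 10.0 * solr_map(edges_inside, 4, 4, 1, 0)
    # Similar area: largest when the two areas match.
    similar_area = 15.0 * solr_recip(abs(area(layer) - area(view)) + 0.01, 1, 1000, 1000)
    # Similar center: largest when the two centers coincide.
    (lx, ly), (vx, vy) = center(layer), center(view)
    similar_cx = 3.0 * solr_recip(abs(lx - vx), 1, 1000, 1000)
    similar_cy = 3.0 * solr_recip(abs(ly - vy), 1, 1000, 1000)
    return within + similar_area + similar_cx + similar_cy
```

With the 1000 constants in `recip`, differences measured in degrees move the area and center terms only slightly, so the within-map term tends to dominate the spatial part of the score.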
New Style Spatial
• A Lat-Lon rectangle: minX minY maxX maxY
• <field name="geo">-74.093 41.042 -69.347 44.558</field>
• Units: Degrees
• Distance Calc: Haversine or Euclidean, etc.
Spatial Functions
• fq=geo:"Intersects(-74.093 41.042 -69.347 44.558)"
• fq=geo:"IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))) distErrPct=0"
• HeatMaps: Coming Soon
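A new-style filter is short enough to build with plain string formatting. The sketch below composes the Intersects clause shown above and URL-encodes it into a select request. The helper names and the base URL `http://localhost:8983/solr/select` are assumptions; `geo` is the field name used on the slide.

```python
from urllib.parse import urlencode


def intersects_filter(field, min_x, min_y, max_x, max_y):
    """Build an Intersects filter in minX minY maxX maxY order."""
    return f'{field}:"Intersects({min_x} {min_y} {max_x} {max_y})"'


def build_query(base_url, keyword, bbox, field="geo"):
    """Return a select URL with a keyword query and one spatial filter."""
    params = [
        ("q", keyword),
        ("fq", intersects_filter(field, *bbox)),
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance, for illustration only.
url = build_query("http://localhost:8983/solr/select", "water",
                  (-74.093, 41.042, -69.347, 44.558))
```

Compared with the old-style query, the spatial logic moves out of the URL and into the field type.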
Also Search By
Date
Keywords
DataType
Institution
Solr Filter Clause
Future Is Browser Centric
• Client-Side Rendering
• Canvas, GPU, Actual Data
• Client-Side Analysis
• GPU, BYOD
• Apps With PhoneGap
Web Mapping Terms
• Map Servers
• ESRI REST, OGC / GetCapabilities
• Convert Spatial Data To Map Tiles
Separating Axis
Diff CenterXs > Sum Half Widths
Half Width
Center X
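The separating-axis rule on this slide says two axis-aligned rectangles cannot overlap if, on either axis, the distance between their centers exceeds the sum of their half extents. A minimal sketch using the schema's field names (function names are hypothetical):

```python
def separated_on_axis(center_a, half_a, center_b, half_b):
    """True if the two intervals are apart: center distance > sum of half extents."""
    return abs(center_a - center_b) > half_a + half_b


def rectangles_intersect(a, b):
    """a and b hold CenterX, CenterY, HalfWidth, HalfHeight."""
    apart_x = separated_on_axis(a["CenterX"], a["HalfWidth"], b["CenterX"], b["HalfWidth"])
    apart_y = separated_on_axis(a["CenterY"], a["HalfHeight"], b["CenterY"], b["HalfHeight"])
    return not (apart_x or apart_y)
```

The strict `>` follows the slide, so rectangles that only touch along an edge count as intersecting here.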