SharePoint 2013 Search Architecture with Russ Houberg
- 1. SharePoint 2010 → SharePoint 2013 terminology:
  - Managed Property → (Multiple) Search Schemas
  - Best Bets → Promoted Results (Query Rule)
  - Scope and Federated Location → Result Source
  - Content By Query → Content By Search
  - Incremental Crawl → Continuous Crawl
  - MCM → MCSM
- 2. Continuous crawl
  - Benefits:
    - No more waiting for index merge
    - Does not wait for other crawls to complete
    - Can have multiple continuous crawls running simultaneously
  - Facts:
    - Runs every 15 minutes by default
    - Default interval can be changed with PowerShell
    - Should be used instead of incremental crawls for SharePoint content sources
    - Continuous crawls ignore errors
- 3. [Diagram: search architecture. Labels: content sources (HTTP, file share, user profile, SharePoint, other sources); end-user or process-initiated query; crawl component; content processing component; index component with index partition(s); query processing component; analytics processing component; crawl database(s); link database; event store; analytics database.]
- 4. Crawl component
  - What it does:
    - Crawls content sources to populate the index
    - Delivers crawl items (binary) and metadata to the content processor
    - Invokes connectors or protocol handlers to interact with content sources to retrieve data
    - Uses one or more crawl databases to store info about crawl items and crawl history
  - Important facts:
    - We can have multiple crawl components
    - MS recommends: 2 crawl components per Search Service Application
    - MS recommends: 8 (4 VM) CPU / 8 GB RAM per crawl component
- 5. Content processing component
  - What it does:
    - Processes crawl items and feeds them to the index component
    - Transforms crawl items into artifacts that can be included in the search index (performs document parsing and property mapping)
    - Writes information about links and URLs in the link database (which is analyzed by analytics to calculate relevance and currency; results are written back to the search index by the content processing component)
    - Generates phonetic name variations to improve people search
  - Important facts:
    - We must only have one (1) content processing component per server; more will hurt, not help, crawl performance
    - Max of 2 per search service application
    - Feeding sessions are scaled based on CPU cores using a default coefficient of 3:
      - 8 (cores) * 3 = 24 feeding sessions
      - 4 (cores) * 3 = 12 feeding sessions
    - MS recommends: 8 (4 VM) CPU / 8 GB RAM per content processing component
    - Feeding sessions require RAM. More RAM is necessary when more cores are present; monitoring required
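The feeding-session arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration of the "cores times coefficient" rule quoted above; the function name `feeding_sessions` and the constant name are hypothetical and are not part of any SharePoint API.

```python
# Default coefficient quoted on the slide (sessions per CPU core).
DEFAULT_FEEDING_COEFFICIENT = 3


def feeding_sessions(cpu_cores: int, coefficient: int = DEFAULT_FEEDING_COEFFICIENT) -> int:
    """Estimate feeding sessions for one content processing component.

    Hypothetical helper: it only reproduces the slide's arithmetic
    (CPU cores multiplied by the coefficient).
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if coefficient < 1:
        raise ValueError("coefficient must be at least 1")
    return cpu_cores * coefficient


if __name__ == "__main__":
    # The two cases listed on the slide: 4 cores and 8 cores.
    for cores in (4, 8):
        print(f"{cores} cores -> {feeding_sessions(cores)} feeding sessions")
```

Because each feeding session needs RAM, a helper like this is mainly useful as a reminder that doubling the cores doubles the sessions, and the memory to monitor along with them.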
- 6. Analytics processing component
  - What it does:
    - Runs analytics jobs that analyze crawl items and user interaction with search results to perform both search analytics and usage analytics
    - Analyzes: link and anchor text, click distance, search clicks, deep links, social tags, social distance, search reports, recommendations, usage counts, activity ranking
    - Improves search relevance and creates search results
    - Output is included in the search index by the content processor
  - Important facts:
    - Maximum of 6 per search service application
    - Add more analytics processing components to improve analytics performance
    - MS recommends: 8 (4 VM) CPU / 8 GB RAM / 300 GB disk space per analytics processing component
    - Interacts with Analytics Reporting to store statistical information
    - Interacts with the link database to store information about searches and crawled documents
- 7. Index component
  - What it does:
    - Receives processed items from the content processing component and writes the items to the index file
    - Receives queries from the query processing component and returns result sets
    - Redistributes content among index partitions when the index architecture is changed by the Search Administration Component
  - Important facts:
    - Maximum of 60 index components (20 index partitions × 3 index replicas) per search service application
    - Must provision one index component for each index replica
    - MS recommends: 8 (4 VM) CPU / 16 GB RAM / 500 GB disk space per index component
- 8. Index architecture
  - An index partition is a logical portion of the entire search index (same as before)
  - An index partition is served by one or more index components
  - Index components can be a primary replica or a secondary replica
  - The primary replica is contacted by the content processing component to write new data in the index
  - A secondary replica is a read-only copy that gets updated with the data
  - Adding replicas improves query performance under load
  - Add partitions to handle an increased content corpus
  - Can't remove a partition after it has been added
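The relationship between partitions, replicas, and index components described above (one index component per replica, with the limits quoted on the index component slide) can be sketched as a small planning helper. The function and constant names here are hypothetical, and the limits are simply the figures from the slides, not values read from a farm.

```python
# Limits as quoted on the slides, per search service application.
MAX_PARTITIONS = 20
MAX_REPLICAS_PER_PARTITION = 3


def index_components_needed(partitions: int, replicas_per_partition: int) -> int:
    """Return how many index components a planned index layout requires.

    Hypothetical helper: one index component is provisioned for each
    replica of each partition, so the count is a simple product.
    """
    if not 1 <= partitions <= MAX_PARTITIONS:
        raise ValueError(f"partitions must be between 1 and {MAX_PARTITIONS}")
    if not 1 <= replicas_per_partition <= MAX_REPLICAS_PER_PARTITION:
        raise ValueError(
            f"replicas_per_partition must be between 1 and {MAX_REPLICAS_PER_PARTITION}"
        )
    return partitions * replicas_per_partition


if __name__ == "__main__":
    # Illustrative layouts only: (partitions, replicas per partition).
    for layout in ((1, 2), (4, 2), (20, 3)):
        print(layout, "->", index_components_needed(*layout), "index components")
```

Since a partition cannot be removed once added, running the numbers before changing the topology is cheaper than discovering the component count afterwards.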
- 9. Query processing component
  - What it does:
    - Analyzes and processes queries and results
    - After receiving a query, it analyzes and processes the query to optimize precision, recall, and relevance
    - Submits processed queries to the index component
    - Processes the result set returned by the index component before returning it to the querying entity
  - Important facts:
    - Maximum of 1 per server
    - MS recommends: 8 (4 VM) CPU / 8 GB RAM per query processing component
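For planning, the per-component hardware recommendations quoted across the component slides can be collected into one lookup. The sketch below is a hypothetical helper, not a Microsoft tool. It reads "8(4vm)" as 8 cores on physical hardware and 4 on a virtual machine, which is an assumption about the slide notation, and the naive RAM sum for co-located components is likewise an assumption, not a rule stated in the slides.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Recommendation:
    cpu_cores: int           # the "8" in "8(4vm)" on the slides
    cpu_cores_vm: int        # the "4vm" part, read here as the VM figure (assumption)
    ram_gb: int
    disk_gb: Optional[int]   # None where the slides give no disk figure


# Figures copied from the component slides.
RECOMMENDED = {
    "crawl": Recommendation(8, 4, 8, None),
    "content processing": Recommendation(8, 4, 8, None),
    "analytics processing": Recommendation(8, 4, 8, 300),
    "index": Recommendation(8, 4, 16, 500),
    "query processing": Recommendation(8, 4, 8, None),
}


def naive_ram_gb(components: Iterable[str]) -> int:
    """Sum the recommended RAM of components placed on one host.

    Assumption for illustration only: recommendations are treated as
    additive when components share a server.
    """
    return sum(RECOMMENDED[name].ram_gb for name in components)


if __name__ == "__main__":
    host = ["query processing", "index"]
    print(host, "->", naive_ram_gb(host), "GB RAM (naive sum)")
```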
- 10. Search PowerShell cmdlets:
  - Backup-SPEnterpriseSearchServiceApplicationIndex
  - Export-SPEnterpriseSearchTopology
  - Get-SPEnterpriseSearchComponent
  - Get-SPEnterpriseSearchContentEnrichmentConfiguration
  - Get-SPEnterpriseSearchCrawlContentSource
  - Get-SPEnterpriseSearchCrawlCustomConnector
  - Get-SPEnterpriseSearchCrawlDatabase
  - Get-SPEnterpriseSearchCrawlExtension
  - Get-SPEnterpriseSearchCrawlMapping
  - Get-SPEnterpriseSearchCrawlRule
  - Get-SPEnterpriseSearchFileFormat
  - Get-SPEnterpriseSearchHostController
  - Get-SPEnterpriseSearchLanguageResourcePhrase
  - Get-SPEnterpriseSearchLinguisticComponentsStatus
  - Get-SPEnterpriseSearchLinksDatabase
  - Get-SPEnterpriseSearchMetadataCategory
  - Get-SPEnterpriseSearchMetadataCrawledProperty
  - Get-SPEnterpriseSearchMetadataManagedProperty
  - Get-SPEnterpriseSearchMetadataMapping
  - Get-SPEnterpriseSearchOwner
  - Get-SPEnterpriseSearchQueryAndSiteSettingsService
  - Get-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance
  - Get-SPEnterpriseSearchQueryAndSiteSettingsServiceProxy
  - Get-SPEnterpriseSearchQueryAuthority
  - Get-SPEnterpriseSearchQueryDemoted
  - Get-SPEnterpriseSearchQueryKeyword
  - Get-SPEnterpriseSearchQueryScope
  - Get-SPEnterpriseSearchQueryScopeRule
  - Get-SPEnterpriseSearchQuerySuggestionCandidates
  - Get-SPEnterpriseSearchRankingModel
  - Get-SPEnterpriseSearchSecurityTrimmer
  - Get-SPEnterpriseSearchService
  - Get-SPEnterpriseSearchServiceApplication
  - Get-SPEnterpriseSearchServiceApplicationBackupStore
  - Get-SPEnterpriseSearchServiceApplicationProxy
  - Get-SPEnterpriseSearchServiceInstance
  - Get-SPEnterpriseSearchSiteHitRule
  - Get-SPEnterpriseSearchStatus
  - Get-SPEnterpriseSearchVssDataPath
  - Import-SPEnterpriseSearchCustomExtractionDictionary
  - Import-SPEnterpriseSearchPopularQueries
  - Import-SPEnterpriseSearchTopology
  - Move-SPEnterpriseSearchLinksDatabases
  - New-SPEnterpriseSearchAdminComponent
  - New-SPEnterpriseSearchAnalyticsProcessingComponent
  - New-SPEnterpriseSearchContentEnrichmentConfiguration
  - New-SPEnterpriseSearchCrawlComponent
  - New-SPEnterpriseSearchCrawlContentSource
  - New-SPEnterpriseSearchCrawlCustomConnector
  - New-SPEnterpriseSearchCrawlDatabase
  - New-SPEnterpriseSearchCrawlExtension
  - New-SPEnterpriseSearchCrawlMapping
  - New-SPEnterpriseSearchCrawlRule
  - New-SPEnterpriseSearchFileFormat
  - New-SPEnterpriseSearchIndexComponent
  - New-SPEnterpriseSearchLanguageResourcePhrase
  - New-SPEnterpriseSearchLinksDatabase
  - New-SPEnterpriseSearchMetadataCategory
  - New-SPEnterpriseSearchMetadataCrawledProperty
  - New-SPEnterpriseSearchMetadataManagedProperty
  - New-SPEnterpriseSearchMetadataMapping
  - New-SPEnterpriseSearchQueryAuthority
  - New-SPEnterpriseSearchQueryDemoted
  - New-SPEnterpriseSearchQueryKeyword
  - New-SPEnterpriseSearchQueryProcessingComponent
  - New-SPEnterpriseSearchQueryScope
  - New-SPEnterpriseSearchQueryScopeRule
  - New-SPEnterpriseSearchRankingModel
  - New-SPEnterpriseSearchSecurityTrimmer
  - New-SPEnterpriseSearchServiceApplication
  - New-SPEnterpriseSearchServiceApplicationProxy
  - New-SPEnterpriseSearchSiteHitRule
  - New-SPEnterpriseSearchTopology
  - Remove-SPEnterpriseSearchComponent
  - Remove-SPEnterpriseSearchContentEnrichmentConfiguration
  - Remove-SPEnterpriseSearchCrawlContentSource
  - Remove-SPEnterpriseSearchCrawlCustomConnector
  - Remove-SPEnterpriseSearchCrawlDatabase
  - Remove-SPEnterpriseSearchCrawlExtension
  - Remove-SPEnterpriseSearchCrawlLogReadPermission
  - Remove-SPEnterpriseSearchCrawlMapping
  - Remove-SPEnterpriseSearchCrawlRule
  - Remove-SPEnterpriseSearchFileFormat
  - Remove-SPEnterpriseSearchLanguageResourcePhrase
  - Remove-SPEnterpriseSearchLinksDatabase
  - Remove-SPEnterpriseSearchMetadataCategory
  - Remove-SPEnterpriseSearchMetadataManagedProperty
  - Remove-SPEnterpriseSearchMetadataMapping
  - Remove-SPEnterpriseSearchQueryAuthority
  - Remove-SPEnterpriseSearchQueryDemoted
  - Remove-SPEnterpriseSearchQueryKeyword
  - Remove-SPEnterpriseSearchQueryScope
  - Remove-SPEnterpriseSearchQueryScopeRule
  - Remove-SPEnterpriseSearchRankingModel
  - Remove-SPEnterpriseSearchSecurityTrimmer
  - Remove-SPEnterpriseSearchServiceApplication
  - Remove-SPEnterpriseSearchServiceApplicationProxy
  - Remove-SPEnterpriseSearchServiceApplicationSiteSettings
  - Remove-SPEnterpriseSearchSiteHitRule
  - Remove-SPEnterpriseSearchTenantConfiguration
  - Remove-SPEnterpriseSearchTenantSchema
  - Remove-SPEnterpriseSearchTopology
  - Repartition-SPEnterpriseSearchLinksDatabases
  - Restore-SPEnterpriseSearchServiceApplication
  - Restore-SPEnterpriseSearchServiceApplicationIndex
  - Resume-SPEnterpriseSearchServiceApplication
  - Set-SPEnterpriseSearchContentEnrichmentConfiguration
  - Set-SPEnterpriseSearchCrawlContentSource
  - Set-SPEnterpriseSearchCrawlDatabase
  - Set-SPEnterpriseSearchCrawlLogReadPermission
  - Set-SPEnterpriseSearchCrawlRule
  - Set-SPEnterpriseSearchFileFormatState
  - Set-SPEnterpriseSearchLinguisticComponentsStatus
  - Set-SPEnterpriseSearchLinksDatabase
  - Set-SPEnterpriseSearchMetadataCategory
  - Set-SPEnterpriseSearchMetadataCrawledProperty
  - Set-SPEnterpriseSearchMetadataManagedProperty
  - Set-SPEnterpriseSearchMetadataMapping
  - Set-SPEnterpriseSearchPrimaryHostController
  - Set-SPEnterpriseSearchQueryAuthority
  - Set-SPEnterpriseSearchQueryKeyword
  - Set-SPEnterpriseSearchQueryScope
  - Set-SPEnterpriseSearchQueryScopeRule
  - Set-SPEnterpriseSearchQuerySpellingCorrection
  - Set-SPEnterpriseSearchRankingModel
  - Set-SPEnterpriseSearchResultItemType
  - Set-SPEnterpriseSearchService
  - Set-SPEnterpriseSearchServiceApplication
  - Set-SPEnterpriseSearchServiceApplicationProxy
  - Set-SPEnterpriseSearchServiceInstance
  - Set-SPEnterpriseSearchTopology
  - Start-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance
  - Start-SPEnterpriseSearchServiceInstance
  - Stop-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance
  - Stop-SPEnterpriseSearchServiceInstance
  - Suspend-SPEnterpriseSearchServiceApplication
  - Upgrade-SPEnterpriseSearchServiceApplication
  - Upgrade-SPEnterpriseSearchServiceApplicationSiteSettings
- 11. [Diagram: search farm topology, Hosts 1 to 6. Labels: web servers; Office Web Apps Server; application servers running query processing and replicas of index partition 0; application servers running crawl, admin, analytics, and content processing; database hosts holding the search admin db, link db, crawl db, analytics db, SharePoint config db, and all other SharePoint databases. Redundant copies of all databases using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
- 12. [Diagram: search farm topology, Hosts A to H. Labels: application servers running query processing, analytics, content processing, admin, and crawl; replicas of index partitions 0 to 3; SharePoint database hosts holding the crawl dbs, search admin db, link db, and analytics db. Redundant copies of all databases using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
- 13. [Diagram: search farm topology, Hosts A to R. Labels: application servers running query processing, analytics, content processing, crawl, and admin; replicas of index partitions 0 to 9; SharePoint database hosts holding the search admin db, link db, crawl dbs, and analytics dbs. Redundant copies of all databases using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
- 14. Summary
  - Schema can be managed by site admins, reducing the load on the search administrator
  - Schema can be configured to allow more granularity (query, retrieve, refine, sort, etc.); this affects content index size
  - Remote result sources can be crawled locally and then queried by remote farms. Huge impact on geo-distributed search. KL may be able to help!
  - Individual items can be re-crawled easily
  - Automatic URL balancing in crawl databases minimizes host name restrictions for large archive repositories
  - Scalability limit changes will have a big impact on farm design for large archive content repositories in the near future