NLM Digital Repository Server Architecture
January 18, 2011
Design Considerations
Consistency with NLM architecture and processes
Remove single points of failure
Data redundancy for preservation
Availability
Scalability
Ingest ease, speed
Single Server Architecture
Components:
– NWU BookViewer
– Flash Video Player with Search
– Muradora 1.4b
– Fedora 3.2.1
– Solr, GSearch
– Djatoka
– MySQL 5.0
– Tomcat
Storage: Fedora managed storage, external storage, Solr index, Resource Index
Server roles: application server, database server, file server
OS: CentOS
HW: virtual server, 3 CPU, 24 GB RAM
Content and code
Fedora managed content
Fedora database
Fedora Resource Index
Solr Index
External content
Application code
Can and should these items be shared across Fedora servers?
Data Center Environment
Two locations with two virtual servers each
– Primary: NLM data center
– Backup: Contingency operations data center
– Active/Active – both locations always in use
– Each virtual server has 3 CPU, 24 GB RAM
System tools
– 3DNS – wide-area load balancing
– BIG-IP – local load balancing
– Server monitoring, automatic failover
– SnapMirror – NetApp filesystem replication
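The two load-balancing tiers and automatic failover can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Server`, `DataCenter`, and `WideAreaBalancer` classes are hypothetical, round-robin selection is an assumed policy, and nothing here reflects how 3DNS or BIG-IP are actually configured.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True  # would be set by server monitoring in a real system


@dataclass
class DataCenter:
    name: str
    servers: list
    _next: int = 0  # round-robin cursor for the local tier

    def healthy_servers(self):
        return [s for s in self.servers if s.healthy]

    def pick_server(self):
        """Local tier: round-robin over the healthy servers in this data center."""
        pool = self.healthy_servers()
        if not pool:
            return None
        server = pool[self._next % len(pool)]
        self._next += 1
        return server


class WideAreaBalancer:
    """Wide-area tier: round-robin over data centers that still have a healthy server."""

    def __init__(self, data_centers):
        self.data_centers = data_centers
        self._next = 0

    def route(self):
        live = [dc for dc in self.data_centers if dc.healthy_servers()]
        if not live:
            raise RuntimeError("no healthy servers in any data center")
        dc = live[self._next % len(live)]
        self._next += 1
        return dc.name, dc.pick_server().name
```

With all four servers healthy, successive calls to `route()` alternate between the two data centers (active/active). Marking every server in one data center unhealthy sends all traffic to the other, which is also the state used while one data center is being updated.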
System Architecture
Browsers → 3DNS → BIG-IP in each data center
Primary Data Center
– BIG-IP
– Fedora Primary #1: Fedora DB, Managed Storage, Solr Index, Resource Index
– Fedora Primary #2: Fedora DB, Managed Storage, Solr Index, Resource Index
– External Storage
Backup Data Center
– BIG-IP
– Fedora Backup #1: Fedora DB, Managed Storage, Solr Index, Resource Index
– Fedora Backup #2: Fedora DB, Managed Storage, Solr Index, Resource Index
– External Storage
Ingest considerations
Our Fedora system is read-only with controlled periodic batch content updates
System is available during updates – use one data center while updating the other
Code and content should be identical across servers
Reduce the time needed to ingest to all servers in the system. A full re-ingest takes approx. 10 hours.
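One way to check that content is identical across servers is to compare checksum manifests of their storage directories. The following is a minimal sketch, not the procedure used at NLM; the function names `sha256_of`, `build_manifest`, and `compare_manifests` and the idea of comparing directory trees are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large objects are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(reference, other):
    """Report files that are missing, extra, or different in `other`."""
    ref_paths, other_paths = set(reference), set(other)
    return {
        "missing": sorted(ref_paths - other_paths),
        "extra": sorted(other_paths - ref_paths),
        "changed": sorted(
            p for p in ref_paths & other_paths if reference[p] != other[p]
        ),
    }
```

If all three lists come back empty, the two trees hold the same files with the same bytes. The manifest built on the server that received the ingest would serve as the reference for the others.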
Content replication
Content replication strategies
1. Fedora journaling (ingest to master, master-slave, messaging)
2. Ingest to master, copy managed content to slave, rebuild slave DB and resource index from managed content (rebuild is faster than full ingest)
3. Ingest to master, use system tools (NetApp SnapMirror) to copy all resources to slaves
4. Ingest to each server independently
Our approach
– Turn off primary data center, use backup data center to serve public
– Ingest to primary 1, copy managed content to primary 2, rebuild primary 2 ...
– Turn off backup data center, use primary data center to serve public
– Use SnapMirror to copy all resources from primary 1,2 to backup 1,2
– Turn on backup data center, both data centers available to serve public
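The sequence above can be sketched as a small simulation that checks, at every step, that at least one data center is still serving the public. This is an illustrative model only: `rolling_update` is a hypothetical function, the step descriptions paraphrase the list above, and no real ingest, rebuild, or SnapMirror commands are invoked.

```python
def rolling_update():
    """Walk through the update sequence and record which data centers serve each step."""
    serving = {"primary": True, "backup": True}
    steps = []

    def record(action):
        # The key invariant: the public site never goes fully offline.
        assert any(serving.values()), "public service would be interrupted"
        in_service = tuple(dc for dc, up in serving.items() if up)
        steps.append((action, in_service))

    # Phase 1: update the primary data center while backup serves the public.
    serving["primary"] = False
    record("turn off primary data center")
    record("ingest to primary 1")
    record("copy managed content from primary 1 to primary 2")
    record("rebuild primary 2")

    # Phase 2: bring primary back before taking backup out of service.
    serving["primary"] = True
    serving["backup"] = False
    record("turn off backup data center, serve from primary")
    record("SnapMirror all resources from primary 1,2 to backup 1,2")

    # Phase 3: both data centers serve again.
    serving["backup"] = True
    record("turn on backup data center")
    return steps


if __name__ == "__main__":
    for action, in_service in rolling_update():
        print(f"{action:60s} serving: {', '.join(in_service)}")
```

The ordering in phase 2 matters: primary is returned to service before backup is taken out, so the assertion in `record` never fails.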
NLM Content Replication
[Diagram: the primary and backup data center layout from System Architecture, annotated with three replication steps]
– Ingest: into Fedora Primary #1
– Rebuild: Fedora Primary #2
– SnapMirror: primary data center to backup data center