Clustering Liferay in Amazon Cloud
Alex Kim, Bryan Littlefield
Overview
2013 - Website of the Year
Relaunched with Liferay Enterprise Portal in Summer 2012
Currently supports:
• 112 countries
• 58,000 trained teachers
• 24,000 schools
• 1.5 million students
Original Hardware Environment
Hosting – The Globe Rack:
Rack (42RU)
• Dual Large APC PDUs with A/B power
• Dell KVM over IP for remote management
Dell R710 Servers (3 for production, 2 for Staging)
• Each w/Dual Quad Core CPUs & 96 GB RAM
• Redundant Power, Disks, Network & Fibre
SAN Storage (8Gb Fibre SAN)
• NexSAN Storage array w/dual controllers
• 20TB+ of Storage with SAS & SATA Disks
• Dual Qlogic SANBox Fibre Switches (Stackable)
Load Balancer
• Dual F5 Big/IP 1600 Load Balancer
Network Switches
• Dual Dell 48-port network switches (Stackable)
System Software
• VMware vSphere 5.1
• VMware Virtual Center - Windows
• CentOS 5 for VM Operating Systems
• Liferay, PostgreSQL, PostGIS, Oracle & Solr
Backup Server
• Veeam Backup Software – Windows Server
• Low-cost server with 20TB local storage
Cloud vs New Hardware
Transition Plan
• Why move GLOBE?
• Disaster Recovery Environment
• Reduce Costs
• ~120,000 page views per month
• NASA requested a plan to transition data centers
Physical Hardware
• Pros:
  • Own the hardware
  • Can be hosted anywhere
  • Fixed costs over time
  • No change from current environment
• Cons:
  • Limitations based on hardware. Not as scalable
  • Hardware support and maintenance costs
  • Subject to the data center's power and network issues
Cloud Computing
• Pros:
  • Low initial costs. No long-term commitments in cost.
  • No limitation to the number of servers
  • Flexibility to increase or decrease CPUs, memory, and size of storage. Not tied to physical limits.
  • Reduced support costs for hardware and data center
  • Built-in redundancy across multiple locations
• Cons:
  • Limited to the offerings of the cloud provider
  • Ongoing costs for usage
  • Extra charges for bandwidth, monitoring, and storage
  • Need to validate that the environment will work in the cloud
What does the cloud buy
• Physical labor costs of maintaining a data center or hosting in an off-site data center
• Hardware maintenance deferred
• Hardware replenishment costs deferred
• Environment control costs deferred
• Ability to scale up or down
• Multiple site replication
• More choices in products
Cloud Providers
Windows Azure can be used to build a web application that runs and stores its data in Microsoft datacenters. It can connect on-premises applications with each other or map between different sets of identity information.
‐ Unsure how Liferay will perform on the platform
The Rackspace Cloud is a set of cloud computing products and services billed on a utility computing basis from the US-based company Rackspace.
‐ Product fits requirements; infrastructure has limited offerings in comparison
Amazon Web Services (abbreviated AWS) is a collection of remote computing services (also called web services) that together make up a cloud computing platform, offered over the Internet by Amazon.com.
‐ Has the most to offer and is considered the industry leader in cloud technology
Cloud Cost Estimates
Amazon provides a Simple Monthly Pricing Calculator that helps break the costs down:
http://calculator.s3.amazonaws.com/calc5.html
Cost
            Year 1      Year 2     Year 3     Total
Hardware    $190,000    $5,000     $5,000     $200,000
Software    $71,000     $49,000    $49,000    $169,000
Hosting     $16,400     $9,600     $9,600     $35,600
Physical Estimated Costs: $404,600

            Year 1      Year 2     Year 3     Total
Hosting     $40,000     $40,000    $40,000    $120,000
Software    $43,000     $43,000    $43,000    $129,000
Cloud Estimated Costs: $249,000

Savings: $155,600
*Would take over 10 years to recover the delta in cost for physical hardware
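The totals in the cost comparison can be re-derived from the yearly figures. The short Python sketch below only repeats the arithmetic from the slide; the dictionary layout and the function name are illustrative choices, not part of the original presentation.

```python
# Three-year cost estimates copied from the slide (USD).
physical = {
    "Hardware": [190_000, 5_000, 5_000],
    "Software": [71_000, 49_000, 49_000],
    "Hosting": [16_400, 9_600, 9_600],
}
cloud = {
    "Hosting": [40_000, 40_000, 40_000],
    "Software": [43_000, 43_000, 43_000],
}


def three_year_total(costs):
    """Sum every yearly figure across all cost categories."""
    return sum(sum(years) for years in costs.values())


physical_total = three_year_total(physical)
cloud_total = three_year_total(cloud)
savings = physical_total - cloud_total

print(f"Physical: ${physical_total:,}")
print(f"Cloud:    ${cloud_total:,}")
print(f"Savings:  ${savings:,}")
```

Running it reproduces the slide's figures: $404,600 for physical hardware, $249,000 for cloud, and $155,600 in savings over three years.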
Cost Today
Amazon is continually adjusting its pricing structure. We estimate that the cost of operations will go down.
            Year 1      Savings
Hosting     ~$24,000    -$16,000
Software    $43,000
The Prototype
Will it work in the cloud?
What does Amazon Provide?
AWS Services Deployed
EC2 (Elastic Compute Cloud)
On Demand Instances on Dedicated Reserved Instances
S3
Provides Web Service based storage
VPC (Virtual Private Cloud)
Enables you to launch Amazon Web Services (AWS) resources into a virtual network that you've defined
EBS (Elastic Block Store)
Provides persistent block level storage volumes for use with Amazon EC2 instances in the AWS Cloud
AMI (Amazon Machine Images)
Snapshot templates defining a launchable EC2 server instance
ELB (Elastic Load Balancer)
Automatically distributes incoming application traffic across multiple Amazon EC2 instances
CloudWatch
Tools to monitor and check your EC2 instances
Architecture – Amazon AWS
[Architecture diagram. Components shown: Elastic Load Balancing (ELB) in front of www.globe.gov, vis.globe.gov and data.globe.gov; security groups containing the WEB1 and WEB2 web server instances, the APP1 and APP2 Liferay server instances, the PG1 and PG-T instances with data volumes, the SOLR-M and SOLR-S1 SOLR server instances, the VIZ app server instance, the FS file server instance with data volumes, and the DE-1, DE-2 ... DE-x Ruby server instances in an Auto Scaling group; Availability Zone #1; an Amazon S3 bucket and Amazon EBS snapshots; training.globe.gov and assets.globe.gov.]
Multicast vs. Unicast
GLOBE was using multicast.
Amazon blocks multicast.
Multicast is the delivery of a message or information to a group of destination computers simultaneously in a single transmission from the source.
Unicast transmission is the sending of messages to a single network destination identified by a unique address.
The Problem
[Diagram: App Server 1 and App Server 2 each run Liferay with Ehcache, Quartz and web sessions. Labels on the flow between them: "Change is made", "Message to replicate", "Cache Updated".]
If the user is being bounced around from server to server, they will lose the work they are doing, because both machines are not in sync.
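The failure mode described here can be illustrated with a toy model in Python. This is a sketch only: the `Node` class is a hypothetical stand-in for one app server's local cache and does not use the Ehcache or Liferay APIs. It shows that when the replication message never reaches the peer (as when multicast is blocked), the second server serves stale data.

```python
class Node:
    """Toy stand-in for one app server's local cache (not the Ehcache API)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}
        self.peers = []

    def put(self, key, value, replicate=True):
        """Store locally and, if replication works, push the change to peers."""
        self.cache[key] = value
        if replicate:
            for peer in self.peers:
                peer.cache[key] = value  # the "message to replicate"

    def get(self, key):
        return self.cache.get(key)


def make_cluster():
    """Build two nodes that know about each other."""
    app1, app2 = Node("app1"), Node("app2")
    app1.peers, app2.peers = [app2], [app1]
    return app1, app2


# Replication blocked: the change made on app1 never reaches app2.
app1, app2 = make_cluster()
app1.put("page-layout", "portlet added", replicate=False)
stale = app2.get("page-layout")

# Replication working: both servers see the same value.
app1, app2 = make_cluster()
app1.put("page-layout", "portlet added")
synced = app2.get("page-layout")

print("without replication:", stale)
print("with replication:   ", synced)
```

In the first case a user whose next request lands on the second server would not see the change they just made, which is the out-of-sync behavior the slide warns about.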
Working around no multicast
Liferay Cluster Link uses multicast by default, so we need to convert Liferay to use JGroups.
With JGroups, there are multiple ways to work around multicast:
• TCP_Ping
• JDBC_Ping
• S3_Ping – Amazon ONLY
• RACKSPACE_Ping – Rackspace ONLY
TCP_Ping
• Simplicity: the easiest to set up.
• S3_PING was not supported by the current version of JGroups; it needs minimum version 2.12.2.
• Downside: TCP_Ping does not autoscale.
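Because TCP_Ping works from a fixed list of peers, every added or removed node means regenerating that list on each server, which is why it does not autoscale. The Python sketch below builds the `host[port]` list that JGroups expects for the `jgroups.tcpping.initial_hosts` system property. The function name and the example IP addresses are hypothetical; the port 7800 matches the JGroups TCP configuration used in this deck.

```python
def tcpping_initial_hosts(hosts, port=7800):
    """Build the value for -Djgroups.tcpping.initial_hosts from a fixed peer list."""
    if not hosts:
        raise ValueError("TCP_PING needs at least one known peer")
    return ",".join(f"{host}[{port}]" for host in hosts)


# Hypothetical private addresses of the two Liferay app servers.
peers = ["10.0.0.1", "10.0.0.2"]
jvm_option = "-Djgroups.tcpping.initial_hosts=" + tcpping_initial_hosts(peers)
print(jvm_option)
```

The printed option can be appended to JAVA_OPTS on every node. Adding a third server means updating `peers` and restarting the existing nodes with the new value.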
The Solution
Update these properties in portal-ext.properties:

# Set this to true to use Cluster Link for cache propagation. By default,
# Ehcache uses RMI which does not scale for very large sites. This property
# does not take effect unless the Ehcache Cluster web application is also
# deployed.
ehcache.cluster.link.replication.enabled=true

# Set this to true to enable the cluster link. This is required if you want
# to cluster indexing and other features that depend on the cluster link.
cluster.link.enabled=true

# Set the JGroups properties for each channel, we support up to 10 transport
# channels and 1 single required control channel. Use as few transport
# channels as possible for best performance. By default, only one UDP
# control channel and one UDP transport channel are enabled. Channels can be
# configured by XML files that are located in the class path or by inline
# properties. (fix LPS-20689)
cluster.link.channel.properties.control=globe_ehcache/tcp.xml
cluster.link.channel.properties.transport.0=globe_ehcache/tcp.xml

Optional:
cluster.link.autodetect.address=DB_IP_ADDRESS:5432
Configure JGroups for Unicast
For TCP_PING: add the peer IP addresses to your JVM system properties. For example, you can add them to your setenv.sh under Tomcat, as below, or under ehcache.

JAVA_OPTS="$JAVA_OPTS JVM settings etc etc"
JAVA_OPTS="$JAVA_OPTS -Djava.net.preferIPv4Stack=true"
JAVA_OPTS="$JAVA_OPTS \
-Djgroups.tcpping.initial_hosts=10.x.x.1[7800],10.x.x.2[7800]"
The tcp.xml file is in the root directory of jgroups.jar (the default setting), so the XML file does not need to be placed in the classes folder.
If you need to customize it, extract the file and then point portal-ext.properties to the customized copy.
Make these changes in tcp.xml at minimum:

<TCP bind_port="7800"
     loopback="true"
     singleton_name="LiferayClusterLink"

Optional (on the TCPPING element):
initial_hosts="${jgroups.tcpping.initial_hosts:10.x.x.1[7800],10.x.x.2[7800]}"
num_initial_members="2"
Ehcache
Ehcache is an open source, standards-based cache for boosting performance, offloading your database, and simplifying scalability.
It's the most widely-used Java-based cache because it's robust, proven, and full-featured. Ehcache scales from in-process, with one or more nodes, all the way to mixed in-process/out-of-process configurations with terabyte-sized caches.
Commercial option is BigMemory from Terracotta, which offloads the memory across multiple servers without performance hits from the Java garbage collector.
Deploy the ehcache-cluster-web.war*
# Set this to true to use Cluster Link for cache propagation. By default,
# Ehcache uses RMI which does not scale for very large sites. This property
# does not take effect unless the Ehcache Cluster web application is also
# deployed.
ehcache.cluster.link.replication.enabled=true

# Set this to true to enable the cluster link. This is required if you want
# to cluster indexing and other features that depend on the cluster link.
cluster.link.enabled=true
Configure Tomcat for Unicast
Add peer IP address info to your Tomcat system settings in "server.xml".

<!-- You should set jvmRoute to support load-balancing via AJP ie : -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm1">
  <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
           channelSendOptions="6" channelStartOptions="3">
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
             expireSessionsOnShutdown="false" notifyListenersOnReplication="true"/>
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
      <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                address="10.x.x.1" port="5001" autoBind="9" selectorTimeout="5000" maxThreads="6"/>
      <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
        <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
                   timeout="30000" keepAliveTime="10" keepAliveCount="0"/>
      </Sender>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpPingInterceptor"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
      <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
        <Member className="org.apache.catalina.tribes.membership.StaticMember"
                host="10.x.x.2" port="5001"
                domain="delta-static" uniqueId="{3,4,5,6,7,8,9,10,11,12,13,14,15,0,1,2}"/>
      </Interceptor>
    </Channel>
    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
           filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
    <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
  </Cluster>
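With static membership, each node's server.xml lists every other node as a Member, and each Member needs its own 16-value uniqueId. The Python sketch below generates those Member elements for one node. It is an illustration under assumptions: the function names and example addresses are hypothetical, and deriving the uniqueId from a hash of host and port is just one way to get distinct, repeatable values (each value is kept in the 0 to 127 range so it fits a signed Java byte). It is not how the presenters produced their configuration.

```python
import hashlib
from xml.sax.saxutils import quoteattr


def unique_id(host, port):
    """Derive 16 small integers from host:port, formatted like {1,2,...}."""
    digest = hashlib.sha256(f"{host}:{port}".encode("utf-8")).digest()[:16]
    return "{" + ",".join(str(b & 0x7F) for b in digest) + "}"


def static_members(all_nodes, local_host, port=5001, domain="delta-static"):
    """Return one <Member> element for every node except the local one."""
    members = []
    for host in all_nodes:
        if host == local_host:
            continue  # a node does not list itself as a static member
        members.append(
            '<Member className="org.apache.catalina.tribes.membership.StaticMember" '
            f"host={quoteattr(host)} port={quoteattr(str(port))} "
            f"domain={quoteattr(domain)} uniqueId={quoteattr(unique_id(host, port))}/>"
        )
    return members


# Hypothetical three-node cluster; generate the entries for the first node.
nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
for line in static_members(nodes, local_host="10.0.0.1"):
    print(line)
```

The same script would be run once per node with a different `local_host`, which makes the manual effort of growing a unicast cluster visible: every server's file changes whenever a node is added.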
Validation
• Tested:
‐ Adding a Portlet
‐ Moving a Portlet
‐ Removing a Portlet
‐ Redundancy and session replication
Liferay 6.2 EE – What's Changed
• Ability to use newer JGroups versions to improve clustering
• Ability to use newer versions of SOLR and other Marketplace plugins
• Apache Tomcat 7.x offers additional feature improvements over Tomcat 6.x
Future Amazon Services
• S3_Ping can be used with Liferay 6.1+
• S3 for Liferay Data Folder
• ElastiCache – In-Memory Cache
• Elastic Beanstalk – AWS Application Container
Lessons Learned
Multicast vs. Unicast
Multicast:
‐ Can easily add Liferay servers with minimal configuration
‐ Still not supported in AWS
Unicast:
‐ Servers must be configured manually with TCP_PING
‐ Other unicast options allow for autoscaling but require additional configuration
What is GLOBE like Today?
• NASA demands reduction in costs
• Cloud hosting continues to go down in price
• GLOBE is looking to move the Disaster Recovery site to the cloud
Q&A
Alex Kim
Twitter: @alexykim
Email: [email protected]
Bryan Littlefield
Twitter: @bruinbryan
Email: [email protected]
Questions/Answers
If Liferay already ran as non-clustered, you'll need to drop the Quartz tables from the database and let it generate them again.
Here are the Quartz cluster settings needed in portal-ext.properties:
##
## QUARTZ SCHEDULER
##

#
# Not part of portal.properties. Used for Clustering
#
org.quartz.jobStore.isClustered=true

#
# Quartz
#
memory.cluster.scheduler.lock.cache.enabled=true