Presentation at 3rd Iranian JUG Meeting. Introduces the whys behind scalable architecture, space based architecture and case-study in Gigaspaces.
An Introduction to Space Based Architecture
Amin Abbaspour
MAGFA IT Development Center
twitter.com/abbaspour
Agenda
Title                                    Slides    Time (min)
Scalability, whys and hows               15        10
Space Based Architecture                 6         5
Java Spaces                              4         5
GigaSpaces                               5         10
Migrating Spring Apps to GigaSpaces      6         5
Case Study                               2         10
Conclusion                               2         1
Q/A                                      -         -
Total                                    40        45
Innovation Comes From The Need
Application workload increases every day. This is inevitable.
We expect fast and reliable software even as the workload grows.
Speed and reliability can mean life or death for a business.
But why so much Workload?
Today's software is not limited to operators and a small community. It interacts directly with millions of people and thousands of other software systems.
● Large scale community sites, like Facebook, hi5, Twitter
● Prepaid Telecoms
● Banking/XTP
● Online Gaming
● Online Fraud/Risk Management
Need For Speed
A brokerage can lose up to $4 million per millisecond of latency.
- The Tabb Group
An additional 500 ms latency resulted in -20% traffic.
An additional 100 ms in latency resulted in -1% sales.
- Amazon
Cost of Downtime
According to a 2004 Forrester survey of 235 companies, the hourly cost of downtime was:
Percent of Companies    Hourly Cost
33%                     $10K-100K
25%                     $100K-500K
13%                     $500K-1M
4%                      >$1M
25%                     Didn't Know
Unpredictable work load
● How do you design and build applications that cost-effectively scale in such conditions?
● Without compromising reliability, performance and time-to-market?
Scalability
The solution is to build scalable software. Scalability is how we get speed and reliability.
– Vertical scalability: more powerful machines lead to faster software.
– Horizontal scalability: more boxes lead to faster and more reliable software.
– Linear scalability: the overall throughput = (number of processing units) * (throughput per unit).
– Dynamic scalability: scale on demand (usually using some sort of provisioning and monitoring capabilities).
We usually mean horizontal scalability, since it is more widely applicable and more cost effective. Budget is a great excuse.
Amdahl's Law
If, for example, your program has only 10% of a given function synchronized, then:
if the throughput of that function on a single CPU is 100 messages per second,
to increase performance by a factor of 10 (to 1,000 msg/sec)
we would need to increase our CPU resources by roughly a factor of 100 (100 CPUs give a speedup of about 9.2x; with 10% serial code the speedup approaches 10x but never reaches it).
This is about 10 times more than would have been required if the application had no synchronization blocks in its code.
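The numbers in this example can be checked with a small calculation. The sketch below assumes the standard form of Amdahl's Law, speedup(N) = 1 / (s + (1 - s) / N), where s is the synchronized fraction and N the number of CPUs. The class and method names are illustrative and not taken from the slides.

```java
// Sketch of Amdahl's Law: speedup(N) = 1 / (s + (1 - s) / N),
// where s is the serial (synchronized) fraction and N the number of CPUs.
final class AmdahlSketch {

    /** Ideal speedup on `cpus` processors when `serialFraction` of the work cannot run in parallel. */
    static double speedup(double serialFraction, int cpus) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / cpus);
    }

    public static void main(String[] args) {
        double serial = 0.10;          // 10% synchronized, as in the slide
        double baseThroughput = 100.0; // msg/sec on a single CPU, as in the slide
        for (int cpus : new int[] {1, 10, 100, 1000}) {
            double s = speedup(serial, cpus);
            System.out.printf("%4d CPUs -> speedup %.2f, about %.0f msg/sec%n",
                    cpus, s, s * baseThroughput);
        }
    }
}
```

With a 10% serial fraction the speedup is bounded by 1 / 0.10 = 10, which is why even 1,000 CPUs stay just below 1,000 msg/sec.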
Scalability Wall
Non-Scalable applications are expensive and risky.
At some point the application will hit a wall:
● Application crashes
● Re-architecting the application every few months/years
Chart assumptions: server cost 20,000; server throughput 1,000 tx/sec; contention 15%.
Amdahl's Law Consequences
To build scalable software, we should eliminate synchronized blocks. This means eliminating bottlenecks and contention.
Do We build Scalable Software?
Order Management Example
Need Availability? Things Get Worse
Tier Based Car-wash
● Total CPH is the minimum CPH among the warehouses.
● A failure in any one warehouse makes the whole business fail.
● Increasing performance requires budget for all three warehouses.
● Personnel with specialized capabilities.
All in One Car-wash
● To increase CPH, simply add new warehouses.
● Better resource utilization.
● Each warehouse is independent.
● Fewer steps.
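The difference between the two car-washes comes down to simple arithmetic: a tiered pipeline is capped by its slowest stage, while independent all-in-one warehouses add up. The sketch below illustrates this. The CPH figures are made-up values for illustration and do not come from the slides.

```java
import java.util.Arrays;

// Sketch comparing throughput (CPH) of the two car-wash layouts.
// All numbers used in main() are hypothetical.
final class CarWashSketch {

    /** Tier-based: every car passes through every stage, so the slowest stage caps throughput. */
    static int tierBasedCph(int... stageCph) {
        return Arrays.stream(stageCph).min().orElse(0);
    }

    /** All-in-one: each warehouse washes whole cars independently, so throughput adds up. */
    static int allInOneCph(int... warehouseCph) {
        return Arrays.stream(warehouseCph).sum();
    }

    public static void main(String[] args) {
        // Three specialized stages; the 10 CPH stage is the bottleneck.
        System.out.println("Tier based: " + tierBasedCph(30, 10, 20) + " CPH");
        // Three identical all-in-one warehouses at 8 CPH each.
        System.out.println("All in one: " + allInOneCph(8, 8, 8) + " CPH");
        // Adding a fourth warehouse raises total throughput directly.
        System.out.println("All in one, 4 units: " + allInOneCph(8, 8, 8, 8) + " CPH");
    }
}
```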
Scaling Made Simple – Process Unit Design
Space-Based Architecture
● Based on the Object Space computational model.
● Processing Unit
– Self-sufficient unit of scale
– Combination of Data, Processing and Messaging
● Principles of Partitioning
● Content Based Routing
● Interaction Model Abstractions
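Partitioning with content based routing can be pictured as a function from a field of the data to a partition number, so that everything with the same routing key lands in the same processing unit. The sketch below uses a plain hash-modulo scheme, which is an assumption for illustration only; the slides do not say how any particular product computes its routing.

```java
// Sketch of content-based routing: a routing key taken from the data itself
// decides which partition (processing unit) handles it.
// Hash-modulo is an illustrative assumption, not a documented product algorithm.
final class RoutingSketch {

    /** Maps a routing key to a partition index in the range [0, partitions). */
    static int partitionFor(Object routingKey, int partitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(routingKey.hashCode(), partitions);
    }

    public static void main(String[] args) {
        int partitions = 4; // hypothetical number of processing units
        for (String accountId : new String[] {"acc-1", "acc-2", "acc-3", "acc-1"}) {
            System.out.println(accountId + " -> partition " + partitionFor(accountId, partitions));
        }
    }
}
```

Because the same key always maps to the same partition, the data, processing and messaging for one key stay inside a single unit and never need cross-partition coordination.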
Inside A Processing Unit
Closer Look at PU
Parallel PUs – Bring Linear Scalability
The Ideal Scenario - “Write Once Scale Anywhere”
● Scale-out to get more processing power when volume increases.
● Through caching
● Parallelizing of TX
● Low-cost commodity resources
● Better Utilization
Space Based Architecture – Theory Basics
● Object Spaces is a paradigm for developing distributed computing applications.
● Spaces can be used to achieve scalability through parallel processing.
● Objects deposited in an Object Space are passive, i.e., their methods cannot be invoked while the objects are in the Object Space.
● This paradigm inherently provides mutual exclusion.
● The Linda coordination language was developed at Yale.
● An Object Space is usually called a Tuple Space, since it consists of tuples that are unrelated to each other.
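To make the mutual-exclusion point concrete, here is a minimal in-memory sketch of a tuple space. It is not Linda or JavaSpaces; the class `MiniTupleSpace` and its methods are hypothetical, it assumes a single JVM, and it uses `null` template fields as wildcards. Because objects in the space are passive, a worker must take a tuple out before acting on it, so only one worker can ever hold a given tuple.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Minimal single-JVM tuple space sketch (hypothetical, for illustration only).
final class MiniTupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    /** Deposits a copy of the tuple into the space. */
    synchronized void write(Object... tuple) {
        tuples.add(tuple.clone());
    }

    /** Returns a copy of the first matching tuple without removing it, or null if none matches. */
    synchronized Object[] readIfExists(Object... template) {
        for (Object[] t : tuples) {
            if (matches(template, t)) return t.clone();
        }
        return null;
    }

    /** Removes and returns the first matching tuple, or null if none matches. */
    synchronized Object[] takeIfExists(Object... template) {
        for (Iterator<Object[]> it = tuples.iterator(); it.hasNext(); ) {
            Object[] t = it.next();
            if (matches(template, t)) {
                it.remove();
                return t;
            }
        }
        return null;
    }

    /** A null template field is a wildcard; other fields must be equal. */
    private static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) return false;
        for (int i = 0; i < template.length; i++) {
            if (template[i] != null && !template[i].equals(tuple[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        MiniTupleSpace space = new MiniTupleSpace();
        space.write("task", 42);
        Object[] first = space.takeIfExists("task", null);  // first worker gets the task
        Object[] second = space.takeIfExists("task", null); // second worker finds nothing
        System.out.println("first worker got a task: " + (first != null));
        System.out.println("second worker got a task: " + (second != null));
    }
}
```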
SBA Paradigm in Java; Sun Didn't (Re)invent The Wheel
● Linda is a language and platform built on tuple spaces.
● The space model called for a plug-n-play infrastructure.
– Jini was there
● So JavaSpaces was invented, based on the Tuple Spaces paradigm and on top of the Jini platform.
● By the way, Java is not the only language to adopt the concept. Tuple spaces have been ported to many other languages and platforms, such as Python, Ruby, Scala, C, .NET, ...
Tuple/Java Spaces Basic Operations
    import net.jini.core.entry.Entry;
    import net.jini.core.lease.Lease;
    import net.jini.space.JavaSpace;

    // An Entry class
    public class SpaceEntry implements Entry {
        public Integer count = 0;
        public String toString() { return "Count: " + count; }
    }

    public class Server {
        public static void main(String[] args) throws Exception {
            SpaceEntry entry = new SpaceEntry();
            // space() is the lookup helper used on the slide; its body is not shown
            JavaSpace space = (JavaSpace) space();

            // Register and write the Entry into the Space
            space.write(entry, null, Lease.FOREVER);

            // Retrieve the Entry and check its state; read() returns Entry, hence the cast
            SpaceEntry e = (SpaceEntry) space.read(new SpaceEntry(), null, Long.MAX_VALUE);
        }
    }
JavaSpaces Standard API
Java Spaces Implementations
● Sun RI (now River Project)
● Orbitz (running orbitz.com)
● Blitz (open source)
● Openwings (?)
● Semispaces
● TSpaces (IBM's implementation)
● GigaSpaces
About GigaSpaces Technologies
● Provides an Application Platform product (XAP) for applications characterized by:
– High volume transaction processing
– Very low latency requirements
– Large data volumes
● Scaled-Out Application Server – GigaSpaces XAP
– In-Memory Data Grid
– Service Grid
– Java, .NET and C++
● Customer Base – Financial Services, Retail, Banking, Gaming
XAP – eXtreme Application Platform
XAP (pronounced "zap") is a new class of application server focusing on cloud computing and scale-out architectures.
Used in two main domains:
● Data intensive/EDG (write-behind cache)
● Compute intensive
GigaSpaces Architecture – Sub Systems
GigaSpaces Architecture - Runtime
GigaSpaces Architecture - SLA
How Can GigaSpaces Help Me
● Too much data and the DB is slow. My application has too many interactions with the database.
● Application does not scale well. We have (strong) hardware but throughput does not increase anymore (a symptom of tier based architecture).
● We develop an XTP platform, e.g. billing, banking, finance.
● Not pleased with my HTTP session clustering solution.
● Want a scalable SOA/ESB platform.
● Need in-memory indexing and searching.
● Want to deploy my application in the cloud (pay-as-you-go).
● Want CBR over my MQ.
● We don't use Java. Want to stay in C++ or .NET.
● Want an SLA in my application/data-partition.
Expectations From Application Server
● Data access
● Messaging / Event Processing
● Remoting
● TX management
● Web
Migration to GigaSpaces
● Messaging / Event processing
– Replace MDBs with GigaSpaces event listeners
● Remoting
– Replace SLSBs with GigaSpaces SVF (Remoting/Executors)
● Data access
– Use GigaSpaces as the 2nd level cache for Hibernate
– Convert your DAOs to use GigaSpaces, use mirror to persist
● TX management
– Use Spring…
● Web
– Use GigaSpaces web processing unit
– Use GS HTTP session replication
Migration in Practice
Converted Layer    Code change                          Config change    Effort (3 is biggest)
Messaging          Minor to none                        Yes              1
Remoting           No                                   Yes              1
Data Access        ORM 2nd level cache: No; DAO: Yes    Yes              2-3
Http Session       No                                   No               1
XAP and Integration to Other (JEE) Platforms
● Spring
● ORM
● Lucene/Compass
● Mule ESB
● JGroovy, JRuby, and hopefully Scala
● C++
● .NET
XAP Alternatives
● Data Grid
– GemFire, Coherence
● Shared Memory
– Terracotta/NAM, Memcached, Tokyo Cabinet, Infinispan
● Computation Grids
– IBM Extreme Scale Platform
● Cloud/Grid
– Google AppEngine, GridGain
● Map Reduce Engines
– Hadoop, Disco, Skynet
Case Study – MAGFA SMPP Gateway
Case Study – Let's See it in Action
Other Useful Results
● Use everything in its place. Memcached helped us a lot as a fast, simple and centralized Key/Value store.
● BASE-like transactions in favor of full XA. Memcached as a transaction-memory.
● Tried on both Linux and Solaris. No tangible difference.
● Don't design for a specific platform. Use platforms as tools.
– Easily switched to ActiveMQ, RabbitMQ, Coherence
● Spring greatly helps in applying the above rule.
● Love immutable. It prevents bugs before they happen.
● Reduce contention as much as possible.
References
● Migrating JEE Apps to GigaSpaces, Uri Cohen
● Scaling Out Tier Based Applications, Nati Shalom
● Cloud Computing; Designing Applications for Efficiency, Geva Perry
● Characteristics of The Next Generation Application Servers, Guy Nirpaz
● GigaSpaces Wiki
● Wikipedia
Thanks for Your Attention
Q/A