Upload
kurtvm
View
1.772
Download
4
Tags:
Embed Size (px)
DESCRIPTION
While many industries move towards centralized IT solutions, most retailers hold on to their complex distributed architecture with their typical store databases and store front POS. The presentation discusses a centralized retailing platform based on Oracle's Advanced Queuing technology using one clustered database and many Oracle XE's as a POS. The case study will demonstrate its effectiveness with 400 stores, 1500 POS, 1200 handscanners and 1500 back-office users. Attendees will learn about how to setup a centralized, scalable and HA architecture with topics as database and application server clustering, virtualisation and advanced queuing replication with 1500 Oracle XE databases as POS which is unique in any industry
Citation preview
Adventures inBuilding a Centralized and Scalable Retailing Platform Retailing Platform using Advanced QueueingKurt Van Meerbeeck
24/7 ABCSRPUAQ - AGENDA
� Agenda
� The Ego part
- All about AXI & me
� The Business part
- Retailing and IT- Retailing and IT
� The Infrastructure part
- Diving into the architecture
� The Geeky part
- The adventures
� Questions
The Ego PartAll about AXI & me
24/7 INTRODUCTION – ALL ABOUT ME
� Kurt Van Meerbeeck
� Oracle DBA - AXI NV/BV
- Backup & recovery internals (jDUL/DUDE)
- Oracle IAS architectures
� Working with
- Oracle related products since ’97- Oracle related products since ’97
- Java since ‘96 (jdk 1.0.1)
24/7 INTRODUCTION – ALL ABOUT AXI
� AXI NV founded in 1970 – AXI BV in 1989
� Long term Oracle partner (20+ years)
- Partner of the Year 2008 (The Netherlands)
� Hitting all cilinders of the IT technology stack
TRADE, PUBLICHEALTHRETAIL
TECHNOLOGY PARTNERS
CUSTOMERS
TRADE,
SERVICE
& INDUSTRY
PUBLICHEALTHRETAIL
Discovery Suite for financial and administrative management
24/7 Integrated Technology Services
ICT Systems - infrastructureTECHNOLOGY PARTNERS
SHAREHOLDERS
PERSONNEL
Sector software and software projects
The business partRetailing & IT
24/7 RETAILING & IT –RETAIL BUSINESS IS A SCALABLE BUSINESS
� Simple local store
� Front-office – POS
� Backoffice
24/7
� Expanding local store – scale up
� Front-office – POS
RETAILING & IT –RETAIL BUSINESS IS A SCALABLE BUSINESS
� Backoffice
24/7
� Expanding local store
� Front-office – POS
RETAILING & IT –RETAIL BUSINESS IS A SCALABLE SCALABLE
� Backoffice
24/7 RETAILING & IT –RETAIL BUSINESS IS A SCALABLE BUSINESS
� Typical POS solution
� Decentralised
� Scalable
� However ...
� Hard to manage� Hard to manage� Backup/recovery
� Failures
� Software updates
� Business Reporting� KPI
� Replication
� High TCO
24/7 RETAILING & IT –RETAIL BUSINESS IS A SCALABLE BUSINESS
� Imagine hundreds of stores
� Using their own data silo’s
� ... Yet it is still the most common store architecture
24/7 RETAILING & IT –CHANGES IN THE IT LANDSCAPE
� Trends in the IT landscape
- Consolidation & virtualisation
- Decentral to central computing to cloud computing
- Service Oriented Infrastructure
- Character-based to C/S to 3tier to grid
- Affordable communication lines
� Trends in retailing
- Big players competing each other (Netherlands)
- Profit margins under pressure
- (near) real-time information needs
24/7 RETAILING & IT –CHANGES IN THE IT LANDSCAPE
� The retailing industry is catching-up !
� And is moving towards centralised and integrated store solutions
� New challenges
24/7 RETAILING & IT –RETAIL BUSINESS IS A SCALABLE BUSINESS
� The obvious counterpart
� Centralised datastore
� POS
� Backoffice
� Solution needs to be� Solution needs to be
� Highly scalable
� Highly available
� i.e. flexible
24/7 RETAILING & IT – CUSTOMER CASE - INTERGAMMA
� Case study – Intergamma
� GAMMA & KARWEI stores
� DIY market leader in the Benelux
� Number of stores : 350 � Number of stores : 350
� Number of POS : 1500
� Number of backoffice users : 600
� Number of portable scan devices : 1200
The Infrastructure partDiving into the architecture
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� RS solution relies heavely on message-oriented middleware (MOM)
� Allows applications to connect by distributing messages
� Typically built around a queueing infrastructure
- IBM MQSeries, MSMQ, Tibco, Oracle AQ
� Decoupling in time
- Sender (producer) and receiver (consumer) do not need to interact with the queue at the same time
� Receive/store/send and keep track of messages
- guaranteed delivery
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Oracle Advanced Queueing (AQ)
� Oracle’s implementation of message-oriented middleware
- But within a database
� Persistent storage – IOT
� Aynchronous communication
IOT
Q’s
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Oracle Advanced Queueing (AQ)
- Point-to-point
- Publish/subscribe – broadcast - multicast
enqueueenqueuedequeue
dequeue
- Publish/subscribe – broadcast - multicast
publish
publishsubscribe
publish
subscribe
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Oracle Advanced Queueing (AQ)
- Message Propagation
Fan-out
Funnel-in
Sqlnet (dblinks)
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Oracle features using Advanced Queueing (AQ)
- Oracle Streams
- CDC (change data capture)
enqueue dequeue enqueuedequeue
archives
parse
dequeue
Redo generation
transactionstransactions
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Replication using Advanced Queueing (AQ)
- No log mining needed
- Optimize payload for network
enqueue dequeue enqueuedequeue
Optimise for bandwidth
appdequeue
Redo generation
transactionstransactions
app
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
Private
neworkOracle IAS - INF
Oracle IAS - MID
Oracle IAS - MID
Oracle IAS - MID
Management Node
Ora
cle
Gri
d C
on
tro
l
F5
Big
IP
LB
HttpBrowser
Web-based
Windows EmbeddedOracle XE – AQ – WS - .Net
sqlnet
Oracle 10/11g
RDBMS EE
Ora
cle
Gri
d C
on
tro
l
Symbol HHTerminalPocketBrowser - APEX
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
Store <n>
HQ
DR
The obvious question is
What if the central site is unavailable ?
(network failure)
DR
HQ
DR
Oracle 10g/11g EE RDBMS Oracle 10g XE RDBMS
AQ
AQ
sqlnet
� Frontoffice must be able to run stand-alone !
� Selling of items must not stop !
Store <n>
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
Store <n>
HQ
DR
The obvious question is
What if the central site is unavailable?
(network failure)
• Backoffice offline (http)
• HHT offline (http)
• POS Web Service offline (http)
• POS available
• Oracle XE – stores all items and prices DR
• Oracle XE – stores all items and prices
• AQ stores messages until network is available
HQ
DR
Oracle 10g/11g EE RDBMS Oracle 10g XE RDBMS
AQ
AQ Store <n>
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Architectural choices - Scale-out vs scale-up
Oracle RDBMS
Oracle IAS - INF
Oracle RDBMS
Oracle IAS - INF� Scale-up
� Capacity-on-demand- Add cpu’s
- Add memory
Oracle RDBMS Oracle RDBMS Oracle RDBMS
Scale-out : add nodes
- Add memory
� Oracle EM 10gR3� Provisioning pack
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Architectural choices
� Scale-up - example
� IBM pSeries
- Hardware based virtualisation
- (Dynamic) LPAR
8cpu/32Gb (on demand)
LPAR 0.5cpu/2Gb LPAR
LPAR 0.5cpu/2Gb LPAR
- (Dynamic) LPAR
- Capacity-On-demand
- Oracle licenses (!)
- initial cost might be high
LPAR 8cpu/16Gb LPAR
LPAR 2cpu/2Gb LPAR
LPAR 1cpu/2Gb LPAR
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Architectural choices
� Enterprise hardware vs commodity hardware
- Commodity hardware does not equal cheap hardware
- Healthy mix
� Oracle VM – adds a new dimention to scalable architectures� Oracle VM – adds a new dimention to scalable architectures
- scale up and out on commodity hardware almost transparently
- Allows hard partitioning
CPU04cores
CPU14cores
RAC0
RAC1
AS0 AS1
RAC0
RAC1
AS0 AS1
RAC2
AS2
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Testing AXI*RS on (ML 464754.1)
- Commodity hardware
- Oracle RDBMS 10g Real Application Cluster
- Oracle Enterprise Linux
- Oracle VM – Xen based virtualisation
- Oracle VM – supports RAC
- Linux based LB (VIPS+ldirector)
- Lower initial costs - Lower initial costs
� Oracle Unbreakable Linux support program
- Enterprise-class support for the whole stack
Oracle RDBMS Oracle RDBMS Oracle RDBMS
Scale-out : add nodes
Scale-up : Oracle VM
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� Application will scale near-linear because of Advanced Queueing
- Bind queues to specific RAC nodes
Oracle RDBMS Oracle RDBMS Oracle RDBMS
24/7 A DIVE INTO THE ARCHITECTURE – AXI RETAIL SOLUTIONS
� The challenge
- Know when to scale – measure – capacity planning
- Be prepared - provisioning
- Make your solution scale with the hardware - tuning
� Good tools
- Grid Control – Automatic Workload Repository – Active Session History- Grid Control – Automatic Workload Repository – Active Session History
- Management – tuning/troubleshooting
- Hobbit (Big Brother)
- Alerting
- Easy customization
- Capacity Planner
Service Desk Automatisation
24/7 SCALING THE SYSTEM – KNOWING WHEN TO SCALE - ADVANCED QUEUEING
0
5
10
15
20
25
30
35
OS Queues : 01/11/2006 - 01/09/2008
Runqueue
Waitqueue
Paging : 01/11/2006 - 01/09/2008
� Measure, so you can manage
0
50
100
150
200
250
300
350
400
450
PI
PO
0
2
4
6
8
10
12
# C
PU
s
Physical CPU's used by LPAR : 01/11/2006 - 01/09/2008
PHYS CPU
24/7 SCALING THE SYSTEM – KNOWING WHEN TO SCALE - ADVANCED QUEUEING
0
1000
2000
3000
4000
5000
6000
7000
Waits ms/s : IGDBP 01/11/2006 00:00 - 01/09/2008 00:00
Waits ms/s
0
1000
2000
3000
4000
5000
6000
7000
Sessions : IGDBP 01/11/2006 00:00 - 01/09/2008 00:00
Tot. Sessions
Act. Sessions
0
5000000
10000000
15000000
20000000
25000000
30000000
logical IO/min : IGDBP 01/11/2006 00:00 - 01/09/2008 00:00
Logical IO/min
0
1000
2000
3000
4000
5000
6000
# Users: IGDBP 01/11/2006 00:00 - 01/09/2008 00:00
WAP
IRSII_WAP_FO
HTMLDB_PUBLIC_USER
The Geeky part(the adventures)
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Store to central site
- R = W + S
- Response time = Wait time + Service time
POS App
Oracle J0x
process
Oracle
dedicated
process
Oracle
dedicated
process
network10g EE RDBMS 10g XE RDBMS
Response time
S Wdq Wq Weq Wnet Wdq Wq Weq
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� R < 1min
600
800
1000
1200
enqueue/min
0
200
400
0:0
0
0:5
6
1:5
2
2:4
8
3:4
4
4:4
0
5:3
6
6:3
2
7:2
8
8:2
4
9:2
0
10
:16
11
:12
12
:08
13
:04
14
:00
14
:56
15
:52
16
:48
17
:44
18
:40
19
:36
20
:32
21
:28
22
:24
23
:20
processed/min
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Wdq / Weq – time spend enqueueing/dequeueing
- Contention on queue table
- Waits on ITL slots (TX enq)
- Hot spots/blocks
� Spread load over multiple queues
- Max 255 POS/queue
- Lowers arrival rate
- Lowers Wq
� Increase ITL slots
- Initrans/maxtrans
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Wdq / Weq – time spend enqueueing/dequeueing
- More dequeueing/processing procs -> high OS runqueue
- Bind queue on RAC instance
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Wdq / Weq – time spend enqueueing/dequeueing
- Contention on AQ metadata tables
- TX enqueue locks on AQ$_PROPAGATION_STATUS
- Update AQ$_PROPAGATION_STATUS same record over and over
- Record identified by data objectid of remote queue (XE)
- Make sure objectid of all remote queues are unique (drop/recreate)
- RAC : alter table rebuild minimize records_per_block
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Wdq / Weq – time spend enqueueing/dequeueing
- QMON space management on ASSM tablespaces
- Issue propagating from central database to POS
- Manual coalesce/shrink IOT
- Serious impact on queue operations (LIO->CPU)
24/7 SCALING THE SYSTEM – KNOWING WHEN TO SCALE - ADVANCED QUEUEING
� Weq/Wdq/S - Logfile sync waits
- LGWR can lose CPU before it has exhausted its fair time slice
- Bind to CPU – or – renice – or – make non-preemptive
- Use RAC
Redo
Redo copy/allocation latch
commit
commitRedo
Log
buffer
Redo logfile
lgwr
commit
commit
LFS wait
SemaphoreQueue
P/W
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Tune service time - Partitioning – divide and conquer
- For managability – ILM
- Range partition / List subpartition
- Problem – timestamp not part of PK (global index from hell)
- 50 tables x 7y x 4Q x 350 stores = 490000 (sub)partitions
- SQL plan – partition iterator – impact LIO
- For performance – partition pruning – contention elimination (RAC)- For performance – partition pruning – contention elimination (RAC)
- List partition on store number – part of PK – local partitioned indexes
- 50 tables x 350 stores = 17500 partitions – partition manager
- ILM
- SQL plan – partition pruning
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Oracle dedicated and shared server architecture
- Healthy mix shared and dedicated server
- 11g Database Resident Connection Pool (however with 10g XE?)
DispatcherShared server
SERVER=DEDICATED
SERVER=SHARED
User process
Dedicated server
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Tuning service time - SQL Plan stability
- ‘we’ve changed nothing – the system is slow now’
� Optimizer trends
- Oracle 7-8 – CBO gaining grounds – plan stability
- Oracle 8i
- Optimizer_index_cost_adj + optimizer_index_caching
- Default parallel query
- Oracle 9i
- CPU costing (dbms_stats.gather_system_stats)
- Oracle 10g
- automatic statistics (stale/ for all columns auto/for all columns repeat)
- bind peeking/histograms
� Oracle 11g – SQL Plan Management
� 11g Intelligent Cursor Sharing
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Tuning service time - SQL Plan stability
- ‘we’ve changed nothing – the system is slow now’
� Oracle 11g – SQL Plan Management (SPM)
GB
Compile Execute
Plan Acceptable
Compile
GB
SQL log
Plan history
HJ
GB
Plan baseline
NL
GB
NL
Plan AcceptableHJ
GB
NL
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Tuning service time - SQL Plan stability
- ‘we’ve changed nothing – the system is slow now’
� Bind variables and partitioned tables
Table (store)
p2 p4
Partitions = Natural histogram
p1
p2 p3
p4
Partitions = Natural histogram
� Bind variable is peeked on hard parse
� Plan for p4 might not be ideal for p1
� 10g : disable bind peeking
� 11g : Adaptive Intelligent Cursor Sharing
24/7 THE ADVENTURES – TUNING RESPONSE TIME
� Tuning service time – make backup really fast
- RMAN incremental updated backups
- Apply incremental backupset to backup copy
- 10g Block Change Tracking (BCT)
CTWR
backupset
0
0,2
0,4
0,6
0,8
1
1,2
1,4
1,6
01
-11
-20
06
06
26
-11
-20
06
18
16
-12
-20
06
12
06
-01
-20
07
00
25
-01
-20
07
18
14
-02
-20
07
12
06
-03
-20
07
06
26
-03
-20
07
00
14
-04
-20
07
18
04
-05
-20
07
12
24
-05
-20
07
06
13
-06
-20
07
00
02
-07
-20
07
18
22
-07
-20
07
12
11
-08
-20
07
06
31
-08
-20
07
00
20
-09
-20
07
00
09
-10
-20
07
18
29
-10
-20
07
12
18
-11
-20
07
06
08
-12
-20
07
00
27
-12
-20
07
18
16
-01
-20
08
12
05
-02
-20
08
06
25
-02
-20
08
00
15
-03
-20
08
18
04
-04
-20
08
12
24
-04
-20
08
06
14
-05
-20
08
00
02
-06
-20
08
18
22
-06
-20
08
12
12
-07
-20
08
06
01
-08
-20
08
00
20
-08
-20
08
18
Database growth
Tot.Size [Tb]
Bctr.dbf RMAN backupset RMAN
24/7 CUSTOMER RESPONSE
� In the end ...
� Simon Vreeke, CTO, IG
[ AXI*RS offers our stores a 100% available solution – able to process all our transactions and more. For our customers, the process is quick, correct and secure...]
� PlusRetail
Questions
www.axi.bewww.axi.nl