Upload
kuijven
View
223
Download
0
Embed Size (px)
Citation preview
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 1/46
Practical Distributed
ProcessingUsing MySQL Built-In Functionality
Bob Burgess – radian6 Technologies
MySQL User Conference 2010
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 2/46
Who Am I
Database Analyst, radian6
Several Years Experience:
◦ Sybase, Oracle, MySQL
◦ AIX, Linux
◦ programming
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 3/46
Scope
MySQL concepts used in this system
How I solved our particular problem
Practical set-up of our distributed
system
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 4/46
Scope – NOT
Complete course on distributedprocessing
Complete coverage of related MySQL
functionality
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 5/46
The Problem
“Influencer” calculation: complex andgetting more complex
More and more data
More and more customers and profiles
Not scalable to run the stored
procedure on just the main database
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 6/46
Design Goals
Move calculation off of main database
Keep existing stored procedure
No single controller
Horizontally scalable!
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 7/46
“Influencer” calculation
Who’s most influential? (For this topic.)
subset ofdatabase
sphinx index
do stuffinfluencer data
(two MyISAM tables)
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 8/46
MySQL Functionality Used
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 9/46
MySQL Functionality Used
Replication
Blackhole Storage Engine
Federated Storage Engine
Public-Key Authentication
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 10/46
Replication
MasterDatabase
ReplicaDatabase
binlogs
INSERT...UPDATE...INSERT...DELETE...
relaylogs
INSERT...UPDATE...INSERT...
DELETE...
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 11/46
Replication
MasterDatabase
binlogs
INSERT...UPDATE...INSERT...DELETE...
server_id = 1log_bin = /store/log/mysql-bin
grant replication slave on *.*
to 'repl'@'%' identified by 'secret';
/store/log/mysql-bin.000001
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 12/46
Replication
ReplicaDatabase
relaylogs
INSERT...UPDATE...INSERT...
DELETE...
server_id = 2relaylog = /store/relaylog/relaylog
change master tomaster_host="masterserver",master_port=3306,master_user="repl",master_password="secret",master_log_file="mysql-bin.000001",master_log_pos=0;
start slave; /store/relaylog/relaylog.000001
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 13/46
Black Hole Tables
> create table t (a int) engine=blackhole;Query OK, 0 rows affected (0.02 sec)
> insert into t values (1);Query OK, 1 row affected (0.00 sec)
> select * from t;Empty set (0.00 sec)
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 14/46
real table
Black Hole Tables
Before-Insert triggercreate trigger t_bibefore insert on tfor each rowinsert into real_tablevalues (new.a);
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 15/46
server2
Federated Tables
Points to a table in another database
federatedtable `tm`
server1
table `t`
create table tm (a int) engine=federatedconnection="mysql://user:pass@server1/schemaname/t"
create table t (a int)
Not enabled by default in 5.1!
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 16/46
server2
Federated Tables
Points to a table in another database
federatedtable `tm`
server1
table `t`
create table tm (a int) engine=federatedconnection="mysql://user:pass@server1/schemaname/t"
create table t (a int)
server3
federatedtable `tm`
server4
federatedtable `tm`
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 17/46
Public-Key Authentication
Secure shell (ssh) or secure copy(scp) without a password
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 18/46
worker
Public-Key Authentication
main db
# scp file mysql@main:/...
# cd /root/.ssh
# ssh-keygen# cat id_rsa.pub
# vi /home/mysql/.ssh/authorized_keys
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 19/46
Distributed System –
Why It Works For Us
All involved database data fits on the
worker database
Calculation produces a self-contained
data set
MyISAM can be used for the resultingdata
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 20/46
worker 1
Distributed System –
General Architecture
main db server
worker db
main db
MYD & MYI
MYD & MYI
r e
pl i c a t i on
f e d er a t e d
s c p f i l e c o p y
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 21/46
Distributed System –
General Architecture
main db server
worker 1
worker 2
worker 3
main db
worker db
worker db
worker db MYD & MYI
MYD & MYI
MYD & MYI
MYD & MYI MYD & MYIMYD & MYI
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 22/46
The Non-Queue... Queue
--- Topics ---No. Name
101 – Apples102 – Oranges103 – Grapes104 – Bananas
105–Pears
--- To-do List ---101104105
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 23/46
The Non-Queue... Queue
--- Topics ---101 – Apples102 – Oranges103 – Grapes104 – Bananas105 – Pears
--- Topics ---Recalc Recalc
No. Name -Reqd -Date Priority Running
101 – Apples Full 10:00am high 101102 – Oranges No 8:00am 0
103–Grapes No 7:30am 0
104 – Bananas Partial 10:10am high 302105 – Pears Full 10:05am low 0
• not running• highest priority• earliest recalc date
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 24/46
Processing Cycle
Get nextTopic to do
is thereany?
sleepno
yes
Get work...
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 25/46
Processing Cycle
replicationstopped?
sleep 10 min.yes
replicationbehind?
yessleep 1 min.
Is data current? Let replication catch up!
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 26/46
Processing Cycle
waited forreplication?
yes get a newTopic to do
Replace stale Topic Profile – someone else likely has it...
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 27/46
Processing Cycle
markTopic with
my ID
sleep shortrandom time
Topic stillhas my
ID?
nosleep 1 min.
begin again
Playing Nice with Others
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 28/46
Processing Cycle
run the stored proc
success?
nolog the error
sleep 1 min.
begin again
yes
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 29/46
Processing Cycle
flush the data table forthis Profile (locally)
copy to main dbserver
flush the data table for
this Profile (on main db)
update Profile tableon main: not running
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 30/46
Processing Cycle
back to the top
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 31/46
Code – main loop
trap 'STOP=1' TERMSTOP=0
while [ $STOP -eq 0 ] ; do
done
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 32/46
Code – who am I?
NODEID_QUERY="select @@server_id-1000;"INSTANCE=$2
NODEID=`$MYSQL -e"$NODEID_QUERY"`UNIQUE_INSTANCEID=$((NODEID*100+INSTANCE))
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 33/46
Code – who’s next?
NEEDY=`$MYSQL_MASTER -e"$NEEDY_TP_QUERY" \2>/tmp/mysql.err`
set -- $NEEDY
TPID=$1; RECALC_SCOPE=$2; RECALC_DATE=$3
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 34/46
Code – who’s next, the SQL
NEEDY_TP_QUERY="select topicFilterId,influencerRecalcReqd,unix_timestamp(influencerRecalcDate) RecalcDatefrom TopicFilterwhere influencerRecalcReqd>0
and influencerRunning=0and influencerRecalcDate<=now()and active in (1,2)and influencerPriority>=$MINPRIORITY
order by influencerPriority desc,influencerRecalcDate asc,topicFilterId
limit 1;"
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 35/46
Code – inner loop
while [ ! -z "${TPID}" -a $STOP -eq 0 ]; do
done
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 36/46
Code – how far behind?
REPLICATION_DELAY=`$MYSQL -e'show slave status\G'| grep "Seconds_Behind_Master"
| cut -f2 -d':'`
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 37/46
Code – call the proc
$MYSQL -e"call proc(${TPID},${RECALC_SCOPE})">${INFLUENCER_LOG}${TPID}.${INSTANCE}.log2>&1
RUN_SUCCESS=`tail -1${INFLUENCER_LOG}${TPID}.${INSTANCE}.log| grep -c SUMMARY`
if [ $RUN_SUCCESS -eq 1 ] ; then copy tables
else report error
fi
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 38/46
Code – send the result back
$MYSQL -e"use influencer;flush tables Dynamics_${TPID};"
if scp -p -B ${TABLE_SOURCE}/Dynamics_${TPID}.*${TABLE_MASTER_PUSH_LOC} 2>&1 && \
scp -p -B ${TABLE_SOURCE}/Post_${TPID}.*${TABLE_MASTER_PUSH_LOC} 2>&1then
copy files to backup
remove tables from worker
else report error
fi
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 39/46
Code – was I killed?
if [ $STOP -eq 1 ]; thenecho "---- Script stopped by Kill
(TERM signal) ----"fi
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 40/46
A Blackhole Table Trigger
create table BlogPost (...)engine=blackhole;
create table PostAuthor (blogPostId bigint not null primary key,author varchar(189))
default character set=utf8;
create trigger BlogPost_birbefore insert on BlogPost for each rowbegin
if new.mediaProviderId>1 theninsert ignore into PostAuthor (blogPostId,author)values (new.blogPostId, new.author);
end if;end;
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 41/46
Launch Script
Shell script to launch workers On
◦ Starts several workers
Off◦ sends a signal to each worker
◦ stops after this stored proc call
Kill◦ abort and clean up database
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 42/46
Monitor Scripts
Replication Disk space
Errors from the worker script
◦ Procedure fails
◦ scp fails
◦ gzip fails
◦ replication is behind
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 43/46
Design Decisions
MyISAM vs. InnoDB◦ Size
◦ Tables easily copied
◦ Table locking / Replication: SmallerQueries Required
SSD Storage
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 44/46
Development Directions
Migrate stored proc to Java More workers
Problems in Code – not atomic:
◦ flush table / copy table
◦ copy result tables to main db
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 45/46
Take Away...
Replication and Federated tablesaren't hard.
Thousands of MyISAM tables isn't
crazy.Distributed processing
isn't rocket surgery.
8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation
http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 46/46
Thank you!
Bob Burgess
Slides on the conference site.
Email me for sample scripts.