CERN IT Department
CH-1211 Genève 23
Switzerland
www.cern.ch/it

Multithreading in CASTOR
How to use pthreads without seeing them (almost…)

Giuseppe Lo Presti
DM technical meeting – July 1st, 2008



Outline

• Overview
  – Architecture and requirements
• A C++ framework for multithreading
  – Design and implementation
  – Some user code samples
  – The internals
• The framework in action
• Conclusions


Castor Architecture Overview

• Database centric
  – Stateless redundant software components
  – State stored in a central database for scalability and fault resiliency purposes
• Technology choices
  – A number of multithreaded daemons perform all needed tasks to serve user requests
  – Each operation is reflected in the database => tasks are inherently I/O bound or, better, "latency bound"
    • Dominated by db/network latency
  – Concurrency issues resolved in the database by using locks


High Level Requirements

• Multithreading to achieve better overall throughput in terms of #requests/sec
  – System inherently superlinear because of I/O bound tasks
• Need for supporting thread pools
  – Each one dedicated to a different task
• Lightweight multithreading infrastructure
  – Limit memory footprint of the daemons
• Seamless integration with C++
• Very limited issues with synchronization and data sharing across different threads
  – Context data is always in the db
  – Each thread deals with a different request: typical case of embarrassing parallelism


A Framework for Multithreading

• Choices
  – Usage of Linux POSIX threads
  – C++ package to hide pthreads complexity and provide a Java-like interface
    • IThread abstract class (cf. Java Runnable interface)
    • Specialized thread pools to implement different functionalities (e.g. request handling)
    • Very high reusability across all software components
  – Ability to have thread-safe and thread-shared variables
  – Daemon mode with embedded signal handling
    • Support for graceful stop and restart of daemons


Framework Implementation

• Usage of an existing OS abstraction layer: the Cthread API
  – Replicates the whole pthread API, and additionally provides thread-safe global variables
  – One of the most mature (read old…) parts in the Castor codebase, shared by different projects in IT
• C++ code
  – Clean interface for the user: generic methods to compose daemons out of user classes
  – Cthread / pthread / system calls are kept hidden from user code, but still accessible for special cases
    • E.g. mutexes
• Typical use cases
  – Listening to a port and accepting requests
  – Polling the database for the next operation to perform


Simplified Class Diagram

[Figure: simplified class diagram, with a "Programmer's interface" label]


Main Classes

• Thread pools
  – ListenerThreadPool: generic socket connection dispatcher à la Apache
    • Specialized classes for TCP, UDP, … sockets
  – SignalThreadPool: pool manager for backend activities that need to run periodically or upon external signalling
    • The signalling mechanism is based on condition variables
  – ForkedProcessPool: pool manager based on fork(), not on pthreads, supporting pipes for IPC
• Classes for servers
  – BaseServer: basic generic server providing daemon mode (detach from shell) and logging initialization
  – BaseDaemon: more sophisticated base class for daemons, supporting system signal handling and any combination of the implemented thread pools


Code Samples

• Excerpt from the Monitoring daemon's main()
  – Different thread pools are mixed together
  – The start() method from BaseDaemon spawns all the requested threads

    RmMasterDaemon daemon;
    ...
    // db threadpool: runs DbActuatorThread every updateInterval
    daemon.addThreadPool(
      new castor::server::SignalThreadPool(
        "DatabaseActuator",
        new DbActuatorThread(daemon.clusterStatus()),
        updateInterval));
    daemon.getThreadPool('D')->setNbThreads(1);
    // update threadpool: runs UpdateThread on data received on listenPort
    daemon.addThreadPool(
      new castor::server::UDPListenerThreadPool(
        "Update",
        new UpdateThread(daemon.clusterStatus()),
        listenPort));
    ...
    // Start daemon
    daemon.parseCommandLine(argc, argv);
    daemon.start();
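The composition pattern in this excerpt can be tried out without the CASTOR libraries. The following is only a rough sketch under stated assumptions: `MiniDaemon`, `Pool`, `addPool()`, `getPool()` and `startAndWait()` are hypothetical names, `std::thread` stands in for the Cthread layer, and looking a pool up by the first letter of its name is a guess based on the `getThreadPool('D')` call above. Unlike a real daemon, it simply joins its threads instead of waiting for signals.

```cpp
#include <atomic>
#include <functional>
#include <map>
#include <string>
#include <thread>
#include <vector>

// Hypothetical miniature of a daemon that owns several named thread pools.
class MiniDaemon {
 public:
  struct Pool {
    std::string name;
    std::function<void()> body;  // user code run by every thread of the pool
    int nbThreads;
  };

  // Registers a pool. Assumption: pools are keyed by the first letter of their name.
  void addPool(const std::string& name, std::function<void()> body,
               int nbThreads = 1) {
    m_pools[name.at(0)] = Pool{name, body, nbThreads};
  }

  // Returns the pool registered under `key`; throws std::out_of_range if absent.
  Pool& getPool(char key) { return m_pools.at(key); }

  // Spawns all threads of all pools, then waits for them to finish.
  void startAndWait() {
    std::vector<std::thread> threads;
    for (auto& entry : m_pools) {
      for (int i = 0; i < entry.second.nbThreads; i++) {
        threads.emplace_back(entry.second.body);
      }
    }
    for (std::thread& t : threads) t.join();
  }

 private:
  std::map<char, Pool> m_pools;
};

// Builds a daemon with two pools and returns how many threads actually ran.
int countSpawnedThreads() {
  std::atomic<int> ran(0);
  MiniDaemon daemon;
  daemon.addPool("DatabaseActuator", [&ran] { ++ran; }, 4);
  daemon.getPool('D').nbThreads = 1;  // resize the pool after registration
  daemon.addPool("Update", [&ran] { ++ran; }, 3);
  daemon.startAndWait();
  return ran.load();
}
```

Here `countSpawnedThreads()` runs one thread for the first pool and three for the second.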


Code Samples

• User threads
  – As easy as inheriting from IThread:

    class UpdateThread : public castor::server::IThread {
     public:
      // user code executed by each thread of the pool
      virtual void run(void *param) throw();
      // called when the daemon is asked to stop
      virtual void stop();
    };

• Typical pitfall: code is shared among all threads in each given pool
  – Mutex sections and synchronization to be explicitly implemented – no synchronized methods like in Java
• Consequence: class variables are thread-shared, only local variables are thread-safe
  – But you may need thread-safe singletons…
• Our solution (provided by Cthreads): for each thread-safe global variable, keep a hash map indexed by TID
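The idea of a per-thread variable backed by a map indexed by thread ID can be sketched as follows. This is not the Cthread implementation: `PerThreadValue` and `perThreadCountersAreIsolated()` are hypothetical names, and the sketch uses a mutex-protected `std::map` keyed by `std::thread::id` rather than a hash map.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Cthread): one private copy of T per thread,
// stored in a map indexed by the calling thread's ID.
template <typename T>
class PerThreadValue {
 public:
  // Returns the calling thread's own copy, value-initializing it on first use.
  T& get() {
    std::lock_guard<std::mutex> lock(m_mutex);  // the map itself is shared
    return m_values[std::this_thread::get_id()];
  }

  // Number of threads that have touched the variable so far.
  std::size_t size() {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_values.size();
  }

 private:
  std::mutex m_mutex;
  std::map<std::thread::id, T> m_values;
};

// Each of nbThreads threads increments its own copy `loops` times.
// Returns true if every thread ended up with exactly `loops` in its copy.
bool perThreadCountersAreIsolated(int nbThreads, int loops) {
  PerThreadValue<int> counter;
  std::vector<int> seen(nbThreads, 0);
  std::vector<std::thread> threads;
  for (int i = 0; i < nbThreads; i++) {
    threads.emplace_back([&counter, &seen, i, loops] {
      for (int k = 0; k < loops; k++) {
        ++counter.get();  // only this thread ever touches this entry
      }
      seen[i] = counter.get();
    });
  }
  for (std::thread& t : threads) t.join();
  for (int v : seen) {
    if (v != loops) return false;
  }
  return counter.size() == static_cast<std::size_t>(nbThreads);
}
```

Only the map structure needs locking; each entry is then read and written by a single thread, which is what makes the variable behave as thread-safe from the user's point of view.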


The Internals

• …So, where are the (p)threads?
• BaseThreadPool serves as basic infrastructure
  – A friend function _thread_run() is the thread entry point, which runs the user code
  – All specialized thread pools use this function when spawning threads

    void* castor::server::_thread_run(void* param) {
      struct threadArgs *args = (struct threadArgs*)param;
      castor::server::BaseThreadPool* pool =
        dynamic_cast<castor::server::BaseThreadPool*>(args->handler);
      // Executes the thread
      try {
        pool->m_thread->run(args->param);
      } catch(castor::exception::Exception& any) {
        // error handling
      }
      return NULL;  // a void* entry point must return a value
    }
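The entry-point pattern can be reproduced in isolation with plain pthreads. In this minimal sketch all names (`Runnable`, `EntryArgs`, `threadEntry`, `Incrementer`, `runInThread`) are hypothetical stand-ins, not CASTOR classes; the point is only to show a C-style function receiving a `void*`, recovering the object and calling its virtual `run()`.

```cpp
#include <pthread.h>

// Hypothetical stand-in for an IThread-like interface: user code lives in run().
class Runnable {
 public:
  virtual ~Runnable() {}
  virtual void run(void* param) = 0;
};

// Arguments handed to the entry point through pthread_create()'s void* slot.
struct EntryArgs {
  Runnable* handler;
  void* param;
};

// pthread_create() needs a plain function: it recovers the object and calls run().
extern "C" void* threadEntry(void* raw) {
  EntryArgs* args = static_cast<EntryArgs*>(raw);
  try {
    args->handler->run(args->param);
  } catch (...) {
    // never let an exception escape from a thread entry point
  }
  return nullptr;
}

// Example user class: adds 1 to the int passed as parameter.
class Incrementer : public Runnable {
 public:
  void run(void* param) override { ++*static_cast<int*>(param); }
};

// Spawns one thread running `handler` on `param` and waits for it.
// Returns 0 on success, a pthread error code otherwise.
int runInThread(Runnable& handler, void* param) {
  EntryArgs args = {&handler, param};  // stays alive until the join below
  pthread_t tid;
  int rc = pthread_create(&tid, nullptr, threadEntry, &args);
  if (rc != 0) return rc;
  return pthread_join(tid, nullptr);
}
```

The sketch joins its thread so that `args` can live on the stack; a pool of detached threads would have to keep the arguments alive for as long as the threads run.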


The Internals

• SignalThreadPool encapsulates pthread_create() calls and condition variables
• Threads wait until a condition variable gets notified, or until a timeout has passed
  – pthread_cond_wait() and pthread_cond_signal()
  – One (or more) thread in the pool is woken up and executes the user code
  – Pool keeps track of current # of busy threads

    void castor::server::SignalThreadPool::run() throw (...) {
      ...
      // create pool of detached threads
      for (int i = 0; i < m_nbThreads; i++) {
        if (Cthread_create_detached(
              castor::server::_thread_run, args) >= 0) {
          ++n;  // for later error handling
        }
      }
      ...
    }
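The "wait for a notification or a timeout" behaviour can be sketched with the standard pthread calls. This is an illustration, not the SignalThreadPool code: the class name `TimedSignal` and its methods are hypothetical, and the timed variant `pthread_cond_timedwait()` is used here to implement the timeout.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical helper: a one-slot notification with a timed wait.
class TimedSignal {
 public:
  TimedSignal() : m_pending(false) {
    pthread_mutex_init(&m_mutex, nullptr);
    pthread_cond_init(&m_cond, nullptr);
  }
  ~TimedSignal() {
    pthread_cond_destroy(&m_cond);
    pthread_mutex_destroy(&m_mutex);
  }
  TimedSignal(const TimedSignal&) = delete;
  TimedSignal& operator=(const TimedSignal&) = delete;

  // Records a notification and wakes up one waiting thread.
  void notify() {
    pthread_mutex_lock(&m_mutex);
    m_pending = true;
    pthread_cond_signal(&m_cond);
    pthread_mutex_unlock(&m_mutex);
  }

  // Blocks until notify() has been called or timeoutMs milliseconds elapse.
  // Returns true if a notification was consumed, false on timeout or error.
  bool waitFor(int timeoutMs) {
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);  // default clock of condition variables
    deadline.tv_sec += timeoutMs / 1000;
    deadline.tv_nsec += (timeoutMs % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
      deadline.tv_sec += 1;
      deadline.tv_nsec -= 1000000000L;
    }
    pthread_mutex_lock(&m_mutex);
    int rc = 0;
    while (!m_pending && rc == 0) {  // the loop guards against spurious wakeups
      rc = pthread_cond_timedwait(&m_cond, &m_mutex, &deadline);
    }
    bool notified = m_pending;
    m_pending = false;
    pthread_mutex_unlock(&m_mutex);
    return notified;
  }

 private:
  pthread_mutex_t m_mutex;
  pthread_cond_t m_cond;
  bool m_pending;
};
```

A pool thread built on this would loop on `waitFor(interval)` and run the user code after each return, whether it was woken by a notification or by the timeout.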


The Internals

• ForkedProcessPool encapsulates fork() calls, with children dispatch handled via select()

    void castor::server::ForkedProcessPool::init() throw (...) {
      // create pool of forked processes
      // we do it here so it is done before daemonization
      m_childPid = new int[m_nbThreads];
      for (int i = 0; i < m_nbThreads; i++) {
        ...
        castor::io::PipeSocket* ps = new castor::io::PipeSocket();
        m_childPid[i] = 0;
        int pid = fork();
        if(pid < 0) {
          ...  // error
        } else if(pid == 0) {
          // child
          ...
          childRun(ps);  // this is a blocking call to the user code
          exit(EXIT_SUCCESS);
        } else {
          // parent: save pipe and pid
          m_childPid[i] = pid;
          m_childPipe.push_back(ps);
          ps->closeRead();
          int fd = ps->getFdWrite();
          FD_SET(fd, &m_writePipes);  // prepare mask for select()
        }
      }
    }
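A stripped-down version of a fork()-based pool with one pipe per child might look like the sketch below. It is a simplification under stated assumptions: `childLoop()` and `runForkedPool()` are hypothetical names, tasks are plain integers, and the parent dispatches round-robin instead of using select() to pick an idle child as ForkedProcessPool does. Error paths are kept minimal and do not clean up children that were already forked.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Child body: reads ints from its pipe until EOF, then exits with
// their sum (truncated to 8 bits) as exit status.
static void childLoop(int readFd) {
  int sum = 0;
  int value = 0;
  while (read(readFd, &value, sizeof(value)) ==
         static_cast<ssize_t>(sizeof(value))) {
    sum += value;
  }
  _exit(sum & 0xff);
}

// Forks nbChildren workers, sends each task to a child in round-robin order,
// and returns the sum of the children's exit statuses (-1 on error).
int runForkedPool(int nbChildren, const std::vector<int>& tasks) {
  if (nbChildren <= 0) return -1;
  std::vector<pid_t> pids;
  std::vector<int> writeFds;
  for (int i = 0; i < nbChildren; i++) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
      // child: keep only the read end of its own pipe
      close(fds[1]);
      for (int fd : writeFds) close(fd);  // write ends inherited from the parent
      childLoop(fds[0]);                  // never returns
    }
    // parent: save pid and write end
    close(fds[0]);
    pids.push_back(pid);
    writeFds.push_back(fds[1]);
  }
  for (std::size_t t = 0; t < tasks.size(); t++) {
    int fd = writeFds[t % writeFds.size()];
    if (write(fd, &tasks[t], sizeof(int)) != static_cast<ssize_t>(sizeof(int))) {
      return -1;
    }
  }
  for (int fd : writeFds) close(fd);  // EOF tells the children to stop
  int total = 0;
  for (pid_t pid : pids) {
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    total += WEXITSTATUS(status);
  }
  return total;
}
```

Each child closes the write ends it inherited for its earlier siblings; otherwise those siblings would never see EOF on their pipes and would not terminate.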


The Internals

• BaseDaemon manages all threads and encapsulates the system signal handling
  – To avoid unpredictable behaviours, all threads need to be protected from signals via:
      pthread_sigmask(SIG_BLOCK, &signalSet, NULL)
    where signalSet includes all usual system signals
  – Yet another pthread performs a customized system signal handling by looping on sigwait()
  – After spawning all user threads, the main thread waits for a notification from the dedicated signal handling thread, and broadcasts an appropriate message to all running threads
    • E.g. on SIGTERM, all user threads' stop() methods are called; after # of busy threads goes to 0, exit() is called.
    • Forked children are told to stop via SIGTERM too
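The combination of a blocked signal mask and a dedicated sigwait() thread can be sketched as follows. This is only an illustration of the mechanism, not the BaseDaemon code: `SignalWaiter`, `signalThread()` and `catchWithDedicatedThread()` are hypothetical names, and the thread waits for a single signal instead of looping.

```cpp
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// State shared with the signal thread: written before the thread exits,
// read by the caller only after joining it.
struct SignalWaiter {
  sigset_t set;
  int received;
};

// Dedicated thread: synchronously waits for one of the blocked signals.
extern "C" void* signalThread(void* raw) {
  SignalWaiter* w = static_cast<SignalWaiter*>(raw);
  int sig = 0;
  if (sigwait(&w->set, &sig) == 0) {
    w->received = sig;
  }
  return nullptr;
}

// Blocks `signo` in the calling thread (threads created afterwards inherit
// the mask), starts the dedicated thread, sends the signal to the process and
// returns the signal number seen by the dedicated thread (0 on failure).
int catchWithDedicatedThread(int signo) {
  SignalWaiter w;
  w.received = 0;
  sigemptyset(&w.set);
  sigaddset(&w.set, signo);
  if (pthread_sigmask(SIG_BLOCK, &w.set, nullptr) != 0) return 0;
  pthread_t tid;
  if (pthread_create(&tid, nullptr, signalThread, &w) != 0) return 0;
  kill(getpid(), signo);  // stays pending until sigwait() picks it up
  pthread_join(tid, nullptr);
  return w.received;
}
```

The mask has to be set before any other thread is created: a thread that leaves the signal unblocked could receive it asynchronously, which is the unpredictable behaviour the slide warns about.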


The Internals

• Additional facilities in the framework
  – BaseDbThread implements the IThread interface and provides a graceful termination of a thread-specific database connection upon stop()
  – Mutex wraps common pthread functions to handle mutexes on integer variables
    • wait() and signal() methods provided
    • Generic mutexes on variables of any type left to the user code
  – PipeSocket wraps a Unix pipe and allows object streaming between different processes (e.g. parent and children)
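A wrapper of the kind described for Mutex, an integer protected by a pthread mutex and paired with a condition variable, could look roughly like this. The interface is an assumption made for illustration: `IntMutex`, `get()`, `setAndSignal()`, `waitFor()` and `setToZero()` are hypothetical names and do not reproduce the actual CASTOR class.

```cpp
#include <pthread.h>

// Hypothetical wrapper: an int guarded by a mutex, with a condition
// variable to wait for a given value.
class IntMutex {
 public:
  explicit IntMutex(int initial) : m_value(initial) {
    pthread_mutex_init(&m_mutex, nullptr);
    pthread_cond_init(&m_cond, nullptr);
  }
  ~IntMutex() {
    pthread_cond_destroy(&m_cond);
    pthread_mutex_destroy(&m_mutex);
  }
  IntMutex(const IntMutex&) = delete;
  IntMutex& operator=(const IntMutex&) = delete;

  // Returns the current value.
  int get() {
    pthread_mutex_lock(&m_mutex);
    int v = m_value;
    pthread_mutex_unlock(&m_mutex);
    return v;
  }

  // Sets the value and wakes up all waiting threads.
  void setAndSignal(int v) {
    pthread_mutex_lock(&m_mutex);
    m_value = v;
    pthread_cond_broadcast(&m_cond);
    pthread_mutex_unlock(&m_mutex);
  }

  // Blocks until the value equals `expected`.
  void waitFor(int expected) {
    pthread_mutex_lock(&m_mutex);
    while (m_value != expected) {  // re-check after every wakeup
      pthread_cond_wait(&m_cond, &m_mutex);
    }
    pthread_mutex_unlock(&m_mutex);
  }

 private:
  pthread_mutex_t m_mutex;
  pthread_cond_t m_cond;
  int m_value;
};

// Thread entry point used in the usage example: sets the shared value to 0.
extern "C" void* setToZero(void* raw) {
  static_cast<IntMutex*>(raw)->setAndSignal(0);
  return nullptr;
}
```

A typical use is a counter of busy threads: one thread calls `waitFor(0)` while a worker started with `pthread_create(&tid, nullptr, setToZero, &counter)` resets the counter and wakes it up.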


The Internals: full Class Diagram


The Framework in Action

• Class Diagram from Castor doxygen documentation
  – Most Castor daemons inherit from BaseDaemon
  – They all support graceful stop, e.g.:

    DATE=20080522175726.156834 HOST=lxb1952.cern.ch LVL=System FACILITY=Stager PID=11439 […] MESSAGE="GRACEFUL STOP [SIGTERM] - Shutting down the service"
    DATE=20080522175728.857292 HOST=lxb1952.cern.ch LVL=System FACILITY=Stager PID=11439 […] MESSAGE="GRACEFUL STOP [SIGTERM] - Shut down successfully completed"


The Framework in Action

• Typical load on a node
  – 8 cores run a total of ~90 threads, each owning a db connection, using only a fraction of the total available CPU and memory resources even during high load peaks
    • The stager daemon alone runs 53 threads
  – This is the current deployment of a production Castor instance

    top - 16:17:53 up 115 days, 11:00,  4 users,  load average: 1.06, 0.78, 0.59
    Tasks: 173 total,   2 running, 171 sleeping,   0 stopped,   0 zombie
    Cpu(s):  6.5% us,  1.9% sy,  0.0% ni, 91.3% id,  0.0% wa,  0.0% hi,  0.3% si
    Mem:  16414780k total,  7548712k used,  8866068k free,   634696k buffers
    Swap:  4192880k total,      220k used,  4192660k free,  5285996k cached

      PID USER   PR NI %CPU    TIME+ %MEM VIRT  RES  SHR S COMMAND
    31110 stage  16  0   20  3:07.56  0.2 183m  32m  11m S migrator
     3107 root   16  0    4  4:48.46  0.5 237m  76m 5972 S dlfserver
    31107 stage  15  0    3  0:38.23  0.2 183m  32m  11m S migrator
     3309 root   15  0    2 22:11.80  0.7 741m 109m 9.8m S stagerDaemon
     3315 root   16  0    2 21:28.97  0.7 741m 109m 9.8m S stagerDaemon
     3314 root   16  0    2 21:58.28  0.7 741m 109m 9.8m S stagerDaemon
     3238 root   16  0    1 40:37.70  0.2 372m  29m 8380 S rhserver
    ...


Conclusions

• We have shown how the pthread API can be powerful enough to support many high level multithreaded tasks
  – But don't forget that we started with an embarrassing parallelism scenario…
• The CASTOR service moved from 6 dual CPU nodes to one 8-core node
  – No way out of multithreading
    • I know, that's become pretty obvious by now…
• Comments, questions? www.cern.ch/castor