MeDiCI - How to Withstand a Research Data Tsunami

UQ’s MeDiCI: Data in the right place, at the right time.

Jake Carroll, Senior ICT Manager, Research, The Queensland Brain Institute, The University of Queensland, Australia

[email protected]

This is a story of data locality, performance, namespace and financial complexity.

QBI, CAI, IMB, AIBN

100s of TB of data generated per day: an eclectic mixture of life sciences, engineering, physics and nanotech data.

Every man, woman and child seems to build a (little) supercomputer, to deal with their problems…

Compute + Storage are tightly connected, in each building.

Instrument outputs + scientific endeavors grow - budgets for storage and compute do not.

To add another complexity…

The MeDiCI Journey

Thus, the problem (or question) definition:

“How do we provide parallel access to scientific data through a multitude of protocols, and give the illusion that the data is ‘next to’ the applications, keeping the right data near the right type of computational infrastructure, all within our budgetary constraints?”

SpectrumScale AFM (cache)

{Parallel IO via NSD protocol}

SpectrumScale AFM (home)

Back at UQ

uqjcarr1

Scale cluster “A” using UQ creds

Scale cluster “B” using other creds

Out at Polaris

someOtherName

mmname2uuid / mmuuid2Name

Turns out, all that code was missing from SpectrumScale.
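
To make the identity problem concrete, here is a minimal sketch of the name-to-global-ID round trip that two clusters with different credentials would need. It is not the SpectrumScale code: the table, the global identifier "person-0001" and the function names are hypothetical, and it assumes for illustration that uqjcarr1 on cluster “A” and someOtherName on cluster “B” are the same person.

```python
from typing import Dict, Tuple

# Hypothetical mapping table: (cluster, local name) -> globally unique ID.
# Assumption for illustration: both local names belong to the same person.
GLOBAL_IDS: Dict[Tuple[str, str], str] = {
    ("A", "uqjcarr1"): "person-0001",
    ("B", "someOtherName"): "person-0001",
}


def name_to_global(cluster: str, local_name: str) -> str:
    """Translate a cluster-local user name into the shared global ID."""
    return GLOBAL_IDS[(cluster, local_name)]


def global_to_name(cluster: str, global_id: str) -> str:
    """Translate a shared global ID back into a name valid on `cluster`."""
    for (known_cluster, local_name), known_id in GLOBAL_IDS.items():
        if known_cluster == cluster and known_id == global_id:
            return local_name
    raise KeyError(f"no local name for {global_id!r} on cluster {cluster!r}")


def translate(src_cluster: str, local_name: str, dst_cluster: str) -> str:
    """Map a user on one cluster to the matching user on another cluster."""
    return global_to_name(dst_cluster, name_to_global(src_cluster, local_name))


if __name__ == "__main__":
    print(translate("A", "uqjcarr1", "B"))
```

Going through a shared global ID, rather than mapping names pairwise, means each new cluster only needs its own two translation functions.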

Network stumbles…

• We had, at best, 10GbE between our buildings and around the campus.

• Not made for the parallel IO aggression of SpectrumScale AFM over the NSD protocol.

• Needed to spawn an entire mini-project to upgrade campus networks for big storage IO to 40/100G around the “ring” of nodes.
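
A back-of-the-envelope calculation shows why 10GbE was not enough. The sketch below assumes 100 TB must move in a day (the low end of the "100s of TB per day" figure), decimal units, and a single link running at full line rate with no protocol overhead, so real transfers would be slower.

```python
def transfer_hours(terabytes: float, link_gbps: float) -> float:
    """Hours to move `terabytes` over one link at ideal line rate."""
    bits = terabytes * 1e12 * 8          # decimal terabytes to bits
    seconds = bits / (link_gbps * 1e9)   # gigabits per second to bits per second
    return seconds / 3600


if __name__ == "__main__":
    # Assumed daily volume for illustration only.
    daily_tb = 100
    for gbps in (10, 40, 100):
        hours = transfer_hours(daily_tb, gbps)
        print(f"{gbps} Gb/s: {hours:.1f} h to move {daily_tb} TB")
```

Under these assumptions, 100 TB takes roughly 22 hours at 10 Gb/s, leaving almost no headroom in a 24-hour day, against about 2 hours at 100 Gb/s.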

Recovery storms - AFM is a work in progress

• When you’re trying to recover tens of millions of files, AFM doesn’t always keep up.

• IBM is working on it, for us (and others, globally).

• Sync/push/recovery needs to scale to hundreds of millions of files, if not billions, in a single file-set (or across several).

Things we assumed users would do, as per our mental model.

User puts data in cache from instruments to send to a supercomputer at a remote site.

User processes data out at the remote site on said supercomputer.

Things people actually did, breaking our mental model.

User puts data in cache from instruments. They start processing on a supercomputer locally.

Simultaneously, they start using the storage fabric to process other “bits” of the outputs of the run on the other supercomputer for an additive workflow [culminating in the fabric becoming a means for both supercomputers to work on the same tasks at the same time].

Same data namespace ended up everywhere.

That much was intentional.

As a result, users could leverage *every bit of the compute* everywhere simultaneously, if their workflow is smart enough…

IMB QBI

RCC

Turns out, we’re onto something

Thank you.

• UQ RCC, David Abramson for mentorship and a true sense of adventure.

• The Queensland Cyber Infrastructure Foundation (QCIF)

• My colleagues at UQ QBI, IMB, CAI, AIBN, ITS

• AIIA, ACS

• Justin Glen @ DDN
