Using Cassandra in Cloudian, an S3 Cloud Storage System August 8, 2012 Gary Ogasawara Cloudian, Inc. Cassandra Summit 2012 (#cassandra12) Page 1 Copyright © 2012 Cloudian Inc. & KK All Rights Reserved.



What is Cloudian?

Cloudian = S3 Cloud Storage as Packaged Software

Cloudian Features

1. Full Amazon S3 API compatibility, including error codes.
2. Multi-datacenter, peer-to-peer architecture. No single point of failure.
3. Multi-tenant: QoS controls, billing, and reporting for each user and each group.
4. Public and private clouds.
5. Elastic capacity: start small and scale out as needed.
6. System, group, and user management through the Management Console or REST API.
7. Easy-to-use packaged software, backed by 24x7 carrier-grade support.

Cloudian Objectives

1. Fully packaged software
   • Hide NoSQL complexity
   • Easy install/upgrade
   • HyperStore: best-fit store
   • Easy to deploy on existing hardware/network.
   • Flexible for different customer types.
   • Scalable. Start small and grow.
2. S3 API full compatibility
   • Use S3 ecosystem applications "as is".
   • API already designed.
3. Complete service platform
   • User/Group Provisioning
   • Cluster Management
   • Reporting
   • Billing
   • Turnkey system.
   • Can choose integration points with existing systems.

Object vs. File vs. Block Storage

Storage type   Access             Abstraction level
OBJECTS        HTTP               Application level
FILES          NAS (NFS, CIFS)    OS user level
BLOCKS         SAN (iSCSI)        OS kernel level

S3 Ecosystem

Libraries, applications, gateways, etc. that use Amazon S3 can simply be re-pointed to Cloudian.

[Diagram labels: Public, Private, Hybrid]

S3 Functions

• HTTP REST API. PUT, POST, GET, DELETE, HEAD.
• Objects organized into buckets.
• Security. Requests authenticated using keyed HMAC with symmetric keys. Also an HTTPS option, client-side encryption, and server-side encryption.
• Access control lists (ACLs) define access rights to buckets and objects.
• Accounting of bytes inbound, outbound, and stored, and of HTTP request counts. Billing by tiered rating plans per accounting type, per region.
• Multi-part uploads. Allows uploading large objects in multiple parts.
• Versioning. Multiple versions of the same object.
• Location constraint. Buckets can be assigned to a specific region. Each region has its own domain.
• …
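As a rough illustration of the keyed-HMAC authentication listed above, the sketch below signs and verifies a request with a shared secret. It assumes HMAC-SHA1 over a simplified string-to-sign, in the style of Amazon's S3 signature version 2; the key values, bucket, and object path are made up, and the slides do not describe Cloudian's actual implementation.

```python
import base64
import hashlib
import hmac


def sign_request(secret_key: str, string_to_sign: str) -> str:
    """Return the base64-encoded HMAC-SHA1 of the string-to-sign."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(secret_key: str, string_to_sign: str, signature: str) -> bool:
    """Server side: recompute the signature with the same secret and compare."""
    expected = sign_request(secret_key, string_to_sign)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


# Hypothetical request: verb, two empty header fields, date, and resource path.
string_to_sign = "\n".join(
    ["GET", "", "", "Wed, 08 Aug 2012 12:00:00 GMT", "/mybucket/photo.jpg"]
)
sig = sign_request("example-secret-key", string_to_sign)
auth_header = f"AWS EXAMPLEACCESSKEY:{sig}"
```

Because the key is symmetric, the server needs access to the same secret as the client, which is one reason a credentials store appears in the architecture slides that follow.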

Works with Leading Cloud Compute Platforms

• Cloudian-Citrix CloudStack (May 9, 2012)
• Cloudian-OpenStack (October 21, 2011)

© 2010-2012 Gemini Mobile Technologies Inc. & KK

Cloudian Customers

[Customer logos grouped as Private, Hybrid, and Public]

Channel Partners:

Why Cassandra?

• Scalable
  - Add capacity by adding nodes to a running system.
  - Distributed (P2P architecture), no single point of failure.
• Reliable
  - Resilient to network or hardware failures.
  - Multi-datacenter replication.
  - Tunable data consistency level.
• Features
  - TTL, secondary indexes, counters, compression, encryption, …
• Fast
  - Write path especially fast.

Cassandra in Cloudian

• v1.0.7 in use (started at 0.7.x)
• Forked to add customizations
• Hector client
• Data stored includes:
  - Object metadata
  - Reports/logs
  - Counters for rate control
  - …

Cloudian: Logical Architecture

[Diagram. Recoverable labels:]
• Clients: Web UI (HTTPS), Applications (HTTP or HTTPS, S3)
• Management Console servlets: Login, Account profile / Security keys, Reports, Data Explorer
• Servers: Admin Server (servlets), S3 Server (servlets), reached over HTTP
• Data Servers: Credentials DB, AccountInfo & QoS DB (Cassandra), UserData DB (Cassandra), Reports DB (Cassandra)

Minimum Redundant Configuration

[Diagram. Recoverable labels:]
• A load balancer (LB) receives browser requests for the UI (HTTPS, sticky sessions) and application requests for S3 (HTTP/HTTPS).
• Two nodes behind the LB, each with an HTTP/S server running servlets, Cassandra, and a Credentials DB.

Multi-Datacenter Example

• 2 datacenters / 4 nodes per datacenter.
• Storage objects, reports, and profiles are replicated across DCs by Cassandra.
• Credentials DB (Redis) has a local slave in each DC and a single global master.

[Diagram: DC1 and DC2, each with four nodes running S3/Admin/HyperStore, CMC, Cassandra, and Redis. One DC1 node hosts the Redis master (M); every other node hosts a Redis slave (S).]

Network Scaling Example

[Diagram. Recoverable labels:]
• Region 1: DC 1-1, DC 1-2
• Region 2: DC 2-1
• Region 3: DC 3-1, DC 3-2

Cassandra for Object Store

• Dynamically decide how to store each object (Cassandra or file system).
• Cassandra is better for small objects.
• Large objects are split into multiple parts and chunks.
• Row key: object name + version + part info + timestamp
• Column name: unused

[Diagram: a column family under the Random Partitioner; each row has a row key and one column (column name, value).]
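The storage decision and row key layout described above can be sketched as follows. This is a minimal illustration only: the size threshold, chunk size, key delimiter, and the choice to encode part info as part and chunk numbers are all assumptions, since the slides give only the list of key components.

```python
from typing import Iterator, Tuple

SMALL_OBJECT_MAX = 64 * 1024  # hypothetical cutoff for "small" objects, in bytes
CHUNK_SIZE = 1024 * 1024      # hypothetical chunk size, in bytes


def choose_store(size: int) -> str:
    """Pick a store per object: Cassandra for small objects, file system otherwise."""
    return "cassandra" if size <= SMALL_OBJECT_MAX else "filesystem"


def row_key(name: str, version: str, part: int, chunk: int, timestamp_ms: int) -> str:
    """Build a row key from object name + version + part info + timestamp."""
    # The "|" delimiter and zero padding are illustrative choices.
    return f"{name}|{version}|{part:05d}.{chunk:05d}|{timestamp_ms}"


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (chunk index, chunk bytes) pairs covering the whole payload."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield index, data[offset:offset + chunk_size]


# Usage: one row per chunk of a single uploaded part.
payload = b"x" * 2500
keys = [
    row_key("photo.jpg", "v1", 1, index, 1344384000000)
    for index, _ in split_chunks(payload, chunk_size=1000)
]
```

With the Random Partitioner, each such row key hashes to an independent position on the ring, so the chunks of one large object spread across nodes.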

Cassandra for Object Metadata

• Size, ETag, MD5, timestamp, ACL, part info, version, etc.
• Old versions of the metadata format are supported.
• Row key: group + user + bucket
• Column names: object name + version + part info + timestamp
• Wide rows. Column sorting is used for bucket listing.

[Diagram: a column family under the Random Partitioner; each row key maps to many (column name, value) pairs, sorted by column name.]
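To show why sorted column names make bucket listing cheap, the sketch below models one wide row in memory and answers a prefix listing with a marker, similar in spirit to a Cassandra column slice. The class and method names are hypothetical, and a plain Python list stands in for Cassandra's on-disk column ordering.

```python
import bisect
from typing import Dict, List


class MetadataRow:
    """One wide row (group + user + bucket): sorted column names -> metadata."""

    def __init__(self) -> None:
        self._names: List[str] = []          # kept sorted, like Cassandra columns
        self._values: Dict[str, dict] = {}

    def put(self, column_name: str, metadata: dict) -> None:
        if column_name not in self._values:
            bisect.insort(self._names, column_name)
        self._values[column_name] = metadata

    def list_objects(self, prefix: str = "", marker: str = "", limit: int = 1000) -> List[str]:
        """Return up to `limit` column names with `prefix`, strictly after `marker`."""
        start = bisect.bisect_left(self._names, prefix)
        if marker:
            start = max(start, bisect.bisect_right(self._names, marker))
        result: List[str] = []
        for name in self._names[start:]:
            # Sorted order means the first non-matching name ends the prefix range.
            if not name.startswith(prefix) or len(result) >= limit:
                break
            result.append(name)
        return result


row = MetadataRow()
for name in ["photos/b.jpg", "docs/a.txt", "photos/a.jpg", "videos/x.mp4"]:
    row.put(name, {"size": 0})

first_page = row.list_objects(prefix="photos/", limit=1)
next_page = row.list_objects(prefix="photos/", marker=first_page[-1])
```

The listing never scans unrelated objects: it seeks to the start of the prefix range and reads forward, which is the access pattern a sorted wide row supports well.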

Cassandra for Account Info

DATA MODEL
• User
  - ID, name, contact info, etc.
• Group
  - ID, name, contact info, etc.
• Rating Plan
• Security Credentials
• QoS Counters

NOTES
• "Static" data. Fixed number of columns.
• Secondary index in the User CF on groupID. Allows a query to get all userIDs for a specified groupID.
• Could be put in a relational DB like MySQL, but there is no need to add another component.

Quality of Service / SLA Management

• Configurable maximum limits per region at the per-user, per-group, and system levels:
  - Requests/minute
  - Storage bytes
  - Storage objects
  - Data bytes inbound
  - Data bytes outbound
• Once a limit is reached, further requests are rejected.
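A requests-per-minute limit of the kind listed above can be sketched with a fixed one-minute window counter. This is an illustration under stated assumptions: an in-memory dictionary stands in for the Cassandra counters mentioned earlier in the deck, and the class name, window scheme, and per-user granularity are hypothetical.

```python
import time
from collections import defaultdict
from typing import Dict, Optional, Tuple


class RequestRateLimiter:
    """Fixed-window limiter: at most `max_per_minute` requests per user per minute."""

    def __init__(self, max_per_minute: int) -> None:
        self.max_per_minute = max_per_minute
        # (user ID, minute window) -> request count; stands in for a counter column.
        self._counters: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        """Count the request and return True, or return False if the limit is reached."""
        now = time.time() if now is None else now
        key = (user_id, int(now // 60))
        if self._counters[key] >= self.max_per_minute:
            return False  # the server would reject the request here
        self._counters[key] += 1
        return True


limiter = RequestRateLimiter(max_per_minute=2)
decisions = [limiter.allow("user-1", now=t) for t in (0.0, 1.0, 2.0, 61.0)]
```

The same pattern extends to group and system levels by keying the counter on a group ID or a fixed system key instead of the user ID.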

Cassandra for Reports

DATA MODEL
• "Raw" column family
  - User, Group, System
  - Transaction type (HTTP GET, PUT, DELETE)
  - Object path
  - Size
  - …
• "Rollup" column families
  - RollupHour. Summarizes data for each hour using Raw data.
  - RollupDay. Summarizes data for each day using RollupHour data.
  - RollupMonth. Summarizes data for each month using RollupDay data.

NOTES
• High write rate. Low read rate.
• Rollup tables used for direct queries.
• Automatic deletion using Cassandra TTL (time-to-live).
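The hour, day, and month rollup chain described above can be sketched as a cascade in which each level is computed only from the level below it. The record layout (timestamp, user, size), the UTC period keys, and the two aggregated fields are assumptions chosen for illustration; the slides do not specify the rollup schema.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple

# (user, period label) -> {"requests": count, "bytes": total size}
Totals = Dict[Tuple[str, str], Dict[str, int]]


def rollup_hour(raw: Iterable[Tuple[int, str, int]]) -> Totals:
    """Summarize raw (epoch seconds, user, size) records per user and hour."""
    totals: Totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for timestamp, user, size in raw:
        hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%dT%H")
        entry = totals[(user, hour)]
        entry["requests"] += 1
        entry["bytes"] += size
    return dict(totals)


def rollup_coarser(finer: Totals, label_length: int) -> Totals:
    """Summarize a finer rollup by truncating its period label.

    label_length=10 turns hours into days ("2012-08-08");
    label_length=7 turns days into months ("2012-08").
    """
    totals: Totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for (user, period), counts in finer.items():
        entry = totals[(user, period[:label_length])]
        entry["requests"] += counts["requests"]
        entry["bytes"] += counts["bytes"]
    return dict(totals)


base = int(datetime(2012, 8, 8, 13, 0, tzinfo=timezone.utc).timestamp())
raw_rows = [(base, "alice", 100), (base + 10, "alice", 50),
            (base + 3600, "alice", 25), (base + 86400, "alice", 5)]
by_hour = rollup_hour(raw_rows)
by_day = rollup_coarser(by_hour, 10)
by_month = rollup_coarser(by_day, 7)
```

Because each level reads only the one beneath it, the raw rows can expire by TTL once the hourly rollup has been written without affecting daily or monthly reports.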

Cassandra: Wish List

1. Repair
   • Slow, impact on performance, difficult to monitor progress, manual operator action required.
2. Compaction
   • Heavy performance impact. Hard to tune. Capacity planning difficult.
3. Schema changes
   • Fixed in 1.1.
4. Large column slices.
5. Caches (row and key) not useful. Slower performance, large memory use.
6. JMX too slow. Need to directly use and expose Java interfaces.

HyperStore™

HyperStore: Management policies tailored for different object types.
• Object metadata is still stored in Cassandra.
• Use Cassandra's distributed systems methods for data partitioning, replication, and node health detection.
• Fork Cassandra source for customizations.

Benefits:
• Better performance
• More capacity per node
• Higher disk utilization
• Storage layer flexibility

[Diagram: Cloudian S3 Storage Server. Labels: S3 REST API, NFS, Admin, Credentials, Accounting (Cassandra), Reporting (Cassandra), HyperStore Manager, Data Store (Cassandra), Data Store (File System).]

HyperStore: Hybrid Storage Example

• The optimal solution is to choose the storage method that minimizes latency.
• Generally, you want to maximize/minimize U, a performance metric, based on random variables X using a mixture of N storage layers.
• In a simple case:
  - U: average latency
  - X = {object size}
  - N = {cassandra, ext4 fs}

[Chart: U plotted against X for Storage 1 and Storage 2, with the optimal choice marked.]
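The simple case above (U = average latency, X = object size, N = {cassandra, ext4 fs}) amounts to picking, for each object, the layer whose expected latency is lowest. The sketch below does that with two linear latency models. The coefficients are invented purely for illustration and are not measurements from the deck.

```python
from typing import Callable, Dict

LatencyModel = Callable[[int], float]  # object size in bytes -> expected latency in ms

# Hypothetical models: low fixed cost but steep growth for Cassandra,
# higher fixed cost but shallow growth for the ext4 file system.
LATENCY_MODELS: Dict[str, LatencyModel] = {
    "cassandra": lambda size: 2.0 + size / 20_000,
    "ext4": lambda size: 8.0 + size / 100_000,
}


def best_store(size: int, models: Dict[str, LatencyModel] = LATENCY_MODELS) -> str:
    """Return the storage layer that minimizes U (expected latency) for this size."""
    return min(models, key=lambda name: models[name](size))


small_choice = best_store(1_000)      # Cassandra's low fixed cost dominates
large_choice = best_store(1_000_000)  # ext4's shallow growth dominates
```

With these made-up coefficients the two curves cross at 150,000 bytes; in practice the models would be fitted from measurements such as the latency charts on the next slide.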

HyperStore: Faster Reads & Writes

[Two charts of latency (ms) against object size (0.5, 5, 50, 500 KB): PUT-Cass versus PUT-HS, and GET-Cass versus GET-HS. Annotations: ">30% faster" and ">400% faster".]

HyperStore: Less Compaction

Test conditions: 20 tps, 10 threads, 2 MB data.

No HyperStore
                  PUT      GET      LIST     DELETE
Operations        50478    1679     3642     422
Latency (msec)    149.78   314.80   41.60    34.50

With HyperStore
                  PUT      GET      LIST     DELETE
Operations        50559    9195     3575     2224
Latency (msec)    96.64    35.63    28.14    23.93

[Charts for each configuration: iostat % utilization and I/O read/write (MB).]

Finally

Cassandra and other enabling technologies have allowed "leveling the playing field" for cloud storage providers.

Info: www.cloudian.com
• Download trial version.
• Coming soon:
• #1 best seller in "Database" category on amazon.co.jp.

BACKUP

Cloudian S3 API Compliance

[Diagram comparing Amazon S3 and Cloudian S3. Recoverable labels:]
• Basic Object Store: RESTful API, objects in buckets with metadata
• Basic S3 Compatibility: Put, Get, Head, Delete, etc.
• Distributed & Replicated
• Multi-Datacenter Support
• Versioning
• Content Sharing
• Multi-part Upload
• Tiered Storage
• Location constraint
• Client Library & Error Code Compatible
• Multi-Tenant Service (Dashboard, QoS, Monitoring, Admin, Reports, Billing)
• Packaged & Supported
• Integration Ready with Turnkey Installation

Complete Service Platform

• Admin & User Dashboard
• Billing
• Reporting
• Quality of Service & SLA Management
• Monitoring

Monitoring

• Open-source network management system
• Used for node and application monitoring
• Gemini provides a template file for Cloudian-specific monitoring
• Cloudian monitoring uses JMX statistics that are output by the Cassandra and Cloudian servers

Accounting, Usage Data and Billing

• Per-group, per-user, and global
  - Accounting data is maintained per group, per user within a group, and globally.
  - Separate for each region. The Admin API can retrieve data for one region or all regions.
• Attributes
  - Storage. The number of bytes (in GB-months) and objects stored.
  - Data transfer. The number of bytes transmitted, for both inbound (writes) and outbound (reads) data.
  - Requests. The total number of requests, with HTTP type and URI.
• Billing rules
  - Billing rules, as a weighted sum of the accounting attributes, can be configured per group.
  - Example:
    - Storage bytes: $0.10/GB-month for the first 5 GB-months, then $0.08/GB-month.
    - PUT requests: $0.01 per 1,000 requests.
    - GET requests: $0.001 per 1,000 requests.
    - Data transfer in: $0.00 per GB.
    - Data transfer out: $0.10/GB for the first 1 GB, $0.05/GB for the next 9 GB, then $0.03/GB.
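The example billing rules above can be worked through in code. The sketch below applies the tiered prices from the slide to a hypothetical month of usage (12 GB-months stored, 25 GB transferred out, 20,000 PUTs, 100,000 GETs). The function names and usage figures are made up, and `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal
from typing import List, Optional, Tuple

# (tier width in units, price per unit); a width of None means "all remaining usage".
Tier = Tuple[Optional[Decimal], Decimal]

STORAGE_TIERS: List[Tier] = [(Decimal("5"), Decimal("0.10")), (None, Decimal("0.08"))]
TRANSFER_OUT_TIERS: List[Tier] = [
    (Decimal("1"), Decimal("0.10")),
    (Decimal("9"), Decimal("0.05")),
    (None, Decimal("0.03")),
]
PUT_PRICE_PER_1000 = Decimal("0.01")
GET_PRICE_PER_1000 = Decimal("0.001")
# Data transfer in is $0.00 per GB in the example, so it adds nothing to the bill.


def tiered_charge(quantity: Decimal, tiers: List[Tier]) -> Decimal:
    """Charge each slice of usage at the price of the tier it falls into."""
    total = Decimal("0")
    remaining = quantity
    for width, price in tiers:
        if remaining <= 0:
            break
        used = remaining if width is None else min(remaining, width)
        total += used * price
        remaining -= used
    return total


def monthly_bill(storage_gb_months: Decimal, transfer_out_gb: Decimal,
                 put_requests: int, get_requests: int) -> Decimal:
    """Weighted sum of the accounting attributes under the example rules."""
    return (
        tiered_charge(storage_gb_months, STORAGE_TIERS)
        + tiered_charge(transfer_out_gb, TRANSFER_OUT_TIERS)
        + Decimal(put_requests) / 1000 * PUT_PRICE_PER_1000
        + Decimal(get_requests) / 1000 * GET_PRICE_PER_1000
    )


bill = monthly_bill(Decimal("12"), Decimal("25"), 20_000, 100_000)
```

For this usage the parts are $1.06 for storage (5 × 0.10 + 7 × 0.08), $1.00 for transfer out (1 × 0.10 + 9 × 0.05 + 15 × 0.03), $0.20 for PUTs, and $0.10 for GETs, for a total of $2.36.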

Cloudian Management Console (CMC)


CloudStack Integration

[Diagram: CloudStack zones, each containing pods, clusters, hosts, primary storage, secondary storage, and a CloudStack System VM. Other labels: Snapshot, Backup, ISO; NFS; Secondary Storage.]