1. Introduction
2. Persistence needs of an API PaaS
3. Selecting DataStax Enterprise Search
4. Main challenges and solutions
5. Conclusion
6. Q&A
Agenda
Introduction
● Jérôme Louvel
  ○ founder & CTO of Restlet, Web API platform vendor
  ○ created Restlet Framework, first REST framework in 2004
  ○ contributor to “RESTful Web Services” (O’Reilly, 2007)
  ○ member of the JAX-RS 1.0 expert group (2007 - 2009)
  ○ co-author of “Restlet in Action” (Manning, 2012)
  ○ InfoQ editor covering Web APIs since 2014
● Guillaume Blondeau
  ○ DevOps engineer at Restlet
  ○ working on APISpark cloud platform
  ○ Cassandra Administrator certified by DataStax
About the Speakers
● Key features
  ○ visual creation & deployment of data APIs
  ○ operation of APIs & their local data sources
  ○ management of any API
● Benefits
  ○ accessible via web browser, no technical expertise required
  ○ companies of any size can become API providers
  ○ get started for free, then pay when the API generates traffic
About APISpark
Persistence Needs of an API PaaS
High Availability of APIs and their Data Stores
Low Latency for Users Across the Globe
Rugby World Cup Data
High Scalability & Elasticity
● For API traffic
  ○ concurrent calls
  ○ workload types
  ○ peaks handling
● For data storage
  ○ number of stores
  ○ volume of data
● Filtering on properties
● Pagination
● Sorting
Rich Query Capabilities
High Multi-tenant Density
● Balance between
  ○ data isolation
  ○ low cost
● Many customers & projects
  ○ sharing persistence infrastructure
  ○ isolated data stores
● Many users & groups
  ○ personal data
  ○ shared group data
Selecting DataStax Enterprise Search
Step 1: Prototyping with AWS NoSQL
● Started with SimpleDB
  ○ zero ops, highly available & low latency
  ○ mono-region & limited query capabilities
● Upgraded to DynamoDB
  ○ better scalability & predictability
  ○ not really for multi-tenant use cases (soft limits)
  ○ not very elastic (provisioned throughput)
● Other limitations
  ○ unable to develop and test locally (MySQL mode)
  ○ strong AWS lock-in
Step 2: Moving to Apache Cassandra
● For APISpark beta version
  ○ increasing multi-tenancy needs
  ○ increasing cost concerns
● Benefits
  ○ fully open source & free (vendor support)
  ○ on-premise deployments possible
  ○ proven scalability on AWS (Netflix)
  ○ richer query capabilities
  ○ natively multi-region
Step 3: Upgrading to DataStax Enterprise
● For APISpark GA
  ○ DataStax certified stack
  ○ production support
● Improved capabilities
  ○ much richer query capabilities with Solr integration
  ○ administration console
  ○ command line tooling
  ○ comprehensive documentation
● Still open source foundation
  ○ limited vendor lock-in
  ○ mature open source components
Current Persistence Design
Entity Store
Entity
Property
Primary Key
7 Main Challenges & Solutions
DataStax Enterprise Search 4.6.7 (Cassandra 2.0.14, Solr 4.6.0)
● Using Ec2MultiRegionSnitch
● 1 Entity Store = 1 Keyspace
  ○ each keyspace can set its own replication policy
I. Deploying Across Multiple Regions
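A minimal sketch of what a per-keyspace replication policy could look like. It only builds the CQL text; the keyspace names and replica counts are hypothetical, and the datacenter names assume the region-based naming that Ec2MultiRegionSnitch derives (for example `us-east`). The slides do not show the actual statements used by APISpark.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a datacenter name to its replication factor.
    """
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in sorted(replicas_per_dc.items()):
        options.append("'%s': %d" % (dc, int(rf)))
    body = ", ".join(options)
    return "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {%s};" % (keyspace, body)


# Hypothetical stores: one kept in a single region, one replicated to two regions.
us_only = create_keyspace_cql("store_a", {"us-east": 3})
two_regions = create_keyspace_cql("store_b", {"us-east": 3, "eu-west": 3})
print(us_only)
print(two_regions)
```

Because each Entity Store maps to its own keyspace, the regions and replica counts can differ from one store to the next without affecting other tenants.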
● 1 Entity Store = 1 Keyspace
  ○ data isolated in file system and memory
● Complementary benefit
  ○ ACL per keyspace
II. Isolating Customer Data & Keeping Cost Low
Keyspace
Table
Composite property
List property
III. Supporting Complex Data Models
IV. Dealing with Dynamic Schema Changes (1/3)
ALTER TABLE DROP
ALTER TABLE ADD
IV. Dealing with Dynamic Schema Changes (2/3)
User Action on Entity Store | Action performed in DB
Create Entity | CQL: “CREATE TABLE <tableName>” + Solr Core creation
Delete Entity | CQL: “DROP TABLE <tableName>”
Create Property | CQL: “ALTER TABLE <tableName> ADD <columnName> <type>” + Solr Core schema update
Delete Property | CQL: “ALTER TABLE <tableName> DROP <columnName>” + Solr Core schema update
Add Property in composite | Java: Alter JSON for all rows
Delete Property in composite | Java: Alter JSON for all rows
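The mapping from user actions to CQL can be sketched as a small function. This is an illustration only: the action names, the `id text PRIMARY KEY` column, and the example identifiers are assumptions, and the Solr core creation and schema updates mentioned in the slides are not covered here.

```python
def schema_change_cql(action, keyspace, table, column=None, cql_type=None):
    """Return the CQL statement for a user action on an Entity Store."""
    target = "%s.%s" % (keyspace, table)
    if action == "create_entity":
        # Assumed minimal table: a single text primary key.
        return "CREATE TABLE %s (id text PRIMARY KEY);" % target
    if action == "delete_entity":
        return "DROP TABLE %s;" % target
    if action == "create_property":
        if not column or not cql_type:
            raise ValueError("column and cql_type are required")
        return "ALTER TABLE %s ADD %s %s;" % (target, column, cql_type)
    if action == "delete_property":
        if not column:
            raise ValueError("column is required")
        return "ALTER TABLE %s DROP %s;" % (target, column)
    raise ValueError("unsupported action: %s" % action)


# Hypothetical store "store_a" with an entity "contact".
add_email = schema_change_cql("create_property", "store_a", "contact", "email", "text")
drop_email = schema_change_cql("delete_property", "store_a", "contact", "email")
print(add_email)
print(drop_email)
```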
● Advantages
  ○ flexibility compared to RDBMS
    ■ no lock
  ○ available actions
    ■ add / drop / rename column
    ■ change type of column
● Limitations
  ○ schema deployment can take time
  ○ in some edge cases, columns can’t be recreated
IV. Dealing with Dynamic Schema Changes (3/3)
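For composite properties, the schema-change table describes the action as "Java: Alter JSON for all rows". The sketch below illustrates that idea in Python on in-memory rows, assuming each composite is stored as a JSON object in a text column. The row layout, column name, and field names are hypothetical; a real implementation would read the rows from Cassandra and write them back.

```python
import json


def add_composite_field(rows, column, field, default=None):
    """Return copies of rows with `field` added to the JSON object in `column`."""
    updated = []
    for row in rows:
        raw = row.get(column)
        if raw is None:
            updated.append(dict(row))  # no composite stored, nothing to alter
            continue
        composite = json.loads(raw)
        composite.setdefault(field, default)
        new_row = dict(row)
        new_row[column] = json.dumps(composite, sort_keys=True)
        updated.append(new_row)
    return updated


def drop_composite_field(rows, column, field):
    """Return copies of rows with `field` removed from the JSON object in `column`."""
    updated = []
    for row in rows:
        raw = row.get(column)
        if raw is None:
            updated.append(dict(row))
            continue
        composite = json.loads(raw)
        composite.pop(field, None)
        new_row = dict(row)
        new_row[column] = json.dumps(composite, sort_keys=True)
        updated.append(new_row)
    return updated


rows = [
    {"id": "1", "address": json.dumps({"city": "Nantes"})},
    {"id": "2", "address": None},
]
with_zip = add_composite_field(rows, "address", "zip")
without_city = drop_composite_field(with_zip, "address", "city")
```

Since every row has to be rewritten, the cost of such a change grows with the number of rows, unlike the `ALTER TABLE` actions on top-level properties.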
V. High Multi-tenant Density (1/2)
Schema deployment time with growing # of tables
● Challenge
  ○ large number of C* tables & Solr cores
  ○ memory usage (e.g. 1 C* table takes more than 1 MB of heap)
● Solutions
  ○ adjust JVM memory settings
  ○ need to create additional clusters
  ○ deprovision unused Entity Stores
V. High Multi-tenant Density (2/2)
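The "more than 1 MB of heap per C* table" figure gives a rough lower bound for capacity planning. The arithmetic can be sketched as follows; the table counts and heap budget are hypothetical inputs, and Solr cores and other heap consumers are ignored.

```python
def min_heap_for_tables_mb(table_count, mb_per_table=1.0):
    """Lower bound on heap (in MB) consumed by table metadata alone."""
    return table_count * mb_per_table


def max_tables_for_heap(heap_budget_mb, mb_per_table=1.0):
    """Upper bound on the number of tables that fit in a given heap budget."""
    return int(heap_budget_mb // mb_per_table)


# Hypothetical figures: 5,000 tables, or a 2 GB share of the heap for tables.
print(min_heap_for_tables_mb(5000))   # at least this many MB of heap
print(max_tables_for_heap(2048))      # at most this many tables
```

Once the bound exceeds what the JVM heap can reasonably hold, the remaining options are the ones listed in the slide: additional clusters or deprovisioning unused Entity Stores.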
VI. Query Capabilities (1/2)
Search queries
Upsert / Delete / “Get by id” queries
● Filtering on a property
● Pagination
● Sorting
VI. Query Capabilities (2/2)
Solr Queries
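Filtering, pagination, and sorting map naturally onto the standard Solr query parameters `q`, `start`, `rows`, and `sort`. The sketch below builds a select URL from them. The host, keyspace, table, and field names are hypothetical, the `keyspace.table` core naming follows the DSE Search convention, and the slides do not say whether APISpark issues these queries over HTTP or through CQL.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, keyspace, table, filters=None, sort=None,
                    page=0, page_size=10):
    """Build a Solr select URL with property filters, sorting and pagination."""
    clauses = []
    for field, value in (filters or {}).items():
        # Quote the value and escape characters that would break the phrase.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        clauses.append('%s:"%s"' % (field, escaped))
    params = {
        "q": " AND ".join(clauses) or "*:*",  # match everything if no filter
        "start": page * page_size,            # offset of the first result
        "rows": page_size,                    # number of results per page
        "wt": "json",
    }
    if sort:
        params["sort"] = sort                 # e.g. "points desc"
    return "%s/solr/%s.%s/select?%s" % (base_url, keyspace, table, urlencode(params))


# Third page of 20 teams in pool A, best score first (hypothetical data model).
url = solr_select_url("http://localhost:8983", "rugby", "team",
                      filters={"pool": "A"}, sort="points desc",
                      page=2, page_size=20)
print(url)
```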
VII. Analytics (1/2)
Provide analytics about API calls
VII. Analytics (2/2)
used for latest API calls
issue with wide rows (heavily used APIs)
1 table per report
use of C* counters
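A sketch of how "1 table per report" with C* counters could be expressed in CQL. The keyspace, report name, and column names (`api_id`, `bucket`, `calls`) are hypothetical; the slides do not show the actual analytics schema.

```python
def counter_table_cql(keyspace, report):
    """CREATE TABLE statement for one report backed by a counter column."""
    return (
        "CREATE TABLE IF NOT EXISTS %s.%s ("
        "api_id text, bucket text, calls counter, "
        "PRIMARY KEY (api_id, bucket));" % (keyspace, report)
    )


def increment_cql(keyspace, report):
    """Prepared-statement text that adds one call to a report bucket."""
    return (
        "UPDATE %s.%s SET calls = calls + 1 "
        "WHERE api_id = ? AND bucket = ?;" % (keyspace, report)
    )


# Hypothetical report counting API calls per day.
create_stmt = counter_table_cql("analytics", "calls_per_day")
update_stmt = increment_cql("analytics", "calls_per_day")
print(create_stmt)
print(update_stmt)
```

In a counter table every non-key column must be a counter, which is one reason a separate table per report is a natural fit.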
Conclusion
● Special use case of DataStax Enterprise
  ○ not a lot of shared knowledge about it
  ○ great support from DataStax
  ○ DSE is a good fit despite some challenges
● Looking forward to DSE 4.8!
  ○ User Defined Types with Solr indexing
  ○ live indexing of C* data into Solr
  ○ improved overall performance
Questions?