Not only SQL - Database Choices

  • Published on
    26-Jan-2015

DESCRIPTION

Deck from a talk at StartupCodeCamp at House of Devs in the OC, Jan 2014

Transcript

<ul>
<li> 1. Database Choices. Lynn Langit, Jan 2014, Startup Code Camp in the OC</li>
<li> 2. Data Expertise / Lynn Langit. Industry awards: Microsoft MVP for SQL Server; Google GDE for Cloud Platform; 10Gen Master for MongoDB. Practicing Architect. Technical author / trainer: Pluralsight Google Cloud Series; DevelopMentor SQL Server 2012 Series; 2 books on SQL Server BI; Cloudera trainer (certified). Former MSFT FTE, 4 years</li>
<li> 3. Databases: Now a Menu of Choices</li>
<li> 4. Data Pipeline: Process All; Acquire New; Clean Existing; Store Some; Query &amp; Mine</li>
<li> 5. Is Big Data = NoSQL and just Hadoop? HUGE Hype factor since 2011. Apache Hadoop: a software framework that supports data-intensive distributed applications under a free license; enables applications to work with thousands of nodes and petabytes of data; was inspired by Google's MapReduce and Google File System (GFS) papers</li>
<li> 6. Hadoop in the Enterprise</li>
<li> 7. How you get Hadoop. Open source: roll your own. Commercial distribution: Cloudera, MapR, Hortonworks, more. Rent it via the cloud: AWS, HDInsight</li>
<li> 8. Demo: AWS MapReduce</li>
<li> 9. Working with Hadoop</li>
<li> 10. About Hadoop MapReduce. Image from https://developers.google.com/appengine/docs/python/images/mapreduce_mapshuffle.png</li>
<li> 11. The Hadoop on-premises Market Leader Is Cloudera</li>
<li> 12. Example Comparison: RDBMS vs. Hadoop
<table>
<tr><th></th><th>Traditional RDBMS</th><th>Hadoop / MapReduce</th></tr>
<tr><td>Data Size</td><td>Gigabytes (Terabytes)</td><td>Petabytes and greater</td></tr>
<tr><td>Access</td><td>Interactive and Batch</td><td>Batch, NOT Interactive</td></tr>
<tr><td>Updates</td><td>Read / Write many times</td><td>Write once, Read many times</td></tr>
<tr><td>Structure</td><td>Static Schema</td><td>Dynamic Schema</td></tr>
<tr><td>Integrity</td><td>High (ACID)</td><td>Low</td></tr>
<tr><td>Scaling</td><td>Nonlinear</td><td>Linear</td></tr>
<tr><td>Query Response Time</td><td>Can be near immediate</td><td>Has latency (due to batch processing)</td></tr>
</table>
</li>
<li> 13. Small BigData vs. Big BigData. On Premises: Hadoop, NoSQL, RDBMS. In the Cloud: Hadoop, NoSQL, RDBMS</li>
<li> 14. But wait, is there a relational database that scales, that is cheap, that runs in the cloud?</li>
<li> 15. DEMO - AWS Redshift. About $1k per terabyte per year, relational</li>
<li> 16. Cloud-hosted NoSQL: up to 50x CHEAPER</li>
<li> 17. So many NoSQL options. More than just the Elephant in the room; 150+ types of NoSQL databases</li>
<li> 18. Flavors of NoSQL: Key/Value (volatile); Key/Value (persistent); Wide-Column; Document; Graph</li>
<li> 19. Key / Value Database. Just keys and values; no schema; persistent or volatile. Examples: AWS DynamoDB, Riak</li>
<li> 20. DEMO - AWS DynamoDB. Key/Value store on the AWS cloud</li>
<li> 21. File (BLOB) Storage. Buckets in the Cloud: Amazon S3 or Glacier; Google Cloud Storage; Microsoft Azure BLOBS</li>
<li> 22. DEMO - Battle of the Buckets. Google Cloud Storage vs. Windows Azure BLOBS vs. AWS S3 (archiving) into AWS Glacier</li>
<li> 23. Column Database. Wide, sparse column sets; schema-light. Examples: HBase w/Hadoop; Google Cloud Datastore; SQL Server Columnstore Indexes or SSAS Tabular Models</li>
<li> 24. Types of Column Databases. Column-families: non-relational; sparse; examples: HBase, Cassandra, xVelocity (SQL 2012 Tabular). Column-stores: relational; dense; example: SQL Server 2012 Columnstore index</li>
<li> 25. DEMO: Google Cloud Datastore</li>
<li> 26. DEMO: SQL Server NoSQL. SQL Server 2012 Columnstore Index; SQL Server 2012 Tabular Model (SSAS)</li>
<li> 27. Document Database (MongoDB). Document-oriented (collection of JSON documents) w/semi-structured data. Encodings include BSON, JSON, XML; binary forms (PDF, Microsoft Office documents - Word, Excel). Examples: MongoDB, Couchbase</li>
<li> 28. Demo - MongoDB</li>
<li> 29. Graph Databases. A lot of many-to-many relationships; recursive self-joins; when your primary objective is quickly finding connections, patterns and relationships between the objects within lots of data. Examples: Neo4J, Google Freebase</li>
<li> 30. DEMO: Neo4J</li>
<li> 31. Small BigData vs. Big BigData: Hadoop; Key/Value or Column; Document or Graph; RDBMS; On Premise or In the Cloud</li>
<li> 32. Cloud-hosted RDBMS. AWS RDS (SQL Server, MySQL, Oracle): medium cost; solid feature set, e.g. backup, snapshot; use existing tooling. Google MySQL: lowest cost; most limited RDBMS functionality. Microsoft SQL Azure: highest cost</li>
<li> 33. DEMO - AWS RDS. SQL Server, MySQL or Oracle. Essential to understand pricing models</li>
<li> 34. Image - http://blog.outsourcing-partners.com/wp-content/uploads/2012/10/performance.png</li>
<li> 35. NoSQL Applied (diagram labels): Document (MongoDB); Graph (Neo4j); RDBMS (SQL Server); Line-of-Business; DynamoDB; Social aggregators; Key/Value; Social Games; HBase; Product Catalogs; Columnstore; Log Files</li>
<li> 36. Cloud Offerings: RDBMS AND NoSQL
<table>
<tr><th></th><th>AWS</th><th>Google</th><th>Microsoft</th></tr>
<tr><td>RDBMS</td><td>RDS (all major)</td><td>MySQL</td><td>SQL Azure</td></tr>
<tr><td>NoSQL buckets</td><td>S3 or Glacier</td><td>Cloud Storage</td><td>Azure Blobs</td></tr>
<tr><td>NoSQL Key-Value</td><td>DynamoDB</td><td>Cloud Datastore</td><td>Azure Tables</td></tr>
<tr><td>Streaming ML or (Mahout)</td><td>Custom EC2</td><td>Prospective Search &amp; Prediction API</td><td>StreamInsight</td></tr>
<tr><td>NoSQL Document or Graph</td><td>MongoDB on EC2</td><td>Freebase</td><td>MongoDB on Windows Azure</td></tr>
<tr><td>NoSQL Column, Hadoop (HBase)</td><td>Elastic MapReduce using S3 &amp; EC2</td><td>none</td><td>HDInsight</td></tr>
<tr><td>Dremel/Warehousing</td><td>RedShift</td><td>BigQuery</td><td>none</td></tr>
</table>
</li>
<li> 37. But wait, how do I query NoSQL data?</li>
<li> 38. Always MapReduce?</li>
<li> 39. Can Excel help? Connector to Hadoop; Data Explorer; Data Quality Services; Master Data Services; Integration with Azure Data Market; Visualize with PowerView; Data Mining w/Predixion</li>
<li> 40. Demo - Hadoop Connector to Excel</li>
<li> 41. Other types of cloud data services. Hosting public datasets: pay to read; earn revenue by offering for read. Cleaning / matching (your) data: ETL (Microsoft Data Explorer, Google Refine); Data Quality (Windows Azure Data Market, InfoChimps, DataMarket.com)</li>
<li> 42. Collecting for BigData. Sensors everywhere; structured, semi-structured, unstructured vs. data standards; M2M. Public datasets: Freebase; Azure DataMarket; Hilary Mason's list</li>
<li> 43. NoSQL To-Do List. Understand types of NoSQL databases: use NoSQL when business needs designate; use the right type of NoSQL for your business problem. Try out NoSQL on the cloud: quick and cheap for behavioral data; mashup cloud datasets; good for specialized use cases, e.g. dev, test, training environments. Learn NoSQL access technologies &amp; services: new query languages, e.g. MapReduce, R, Infer.NET; new query tools (vendor-specific), e.g. Google Refine, Amazon Karmasphere, Microsoft Excel connectors; Windows Azure Data Market, other public data markets</li>
<li> 44. recipes) www.TeachingKidsProgramming.org. Free Courseware (Java, Small Basic or C# [on Pluralsight]). Do a Recipe; Teach a Kid (Ages 10++)</li>
<li> 45. Keep Learning. Twitter: @LynnLangit. YouTube: http://www.youtube.com/user/SoCalDevGal. Hire me: to help build your BI/Big Data solution; to teach your team next gen BI; to learn more about using NoSQL solutions</li>
</ul>
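Slides 10 and 38 name MapReduce but the transcript carries no code, so here is a minimal single-process sketch of the map, shuffle, and reduce phases as a word count. It is an illustration only and does not use Hadoop or any Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample records are assumptions made for this example and do not come from the deck.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in order over an in-memory list of records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the talk.
    sample = ["NoSQL is not only SQL", "Hadoop is not only MapReduce"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls would run on many nodes and the shuffle would move data across the network, which is one source of the batch latency listed in the slide 12 comparison.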
