

For Startup Saturday : Jan 2016

Big Data for Startups

– An introductory session about What, Why, When & When Not of Big Data for Startups

Yes, you may not like it, but it's really my Twitter ID

Dhruv Gohil @yourFriendDhruv


- Why should you care to hear my opinion?

- What is this “big data”?

- Why should “startups” care about it?

- When to “do big data”?

- When “NOT to do big data”?


Seems too serious?

Now, this is much better!

So, let's change the font!


OK... Why should you care to hear this from me?

Meet me after the session, to compare favorites


OK... So what questions will I try to answer?

- Big is not only ‘big’.
- Why does a startup need 'Big data'?
- What is 'Big data' NOT?
- Fear of Big data?
- Kick it off!
- Big Data for “small startups”?


Let me tell you a story..


If you were thinking about RDBMS just now... then

Everything you were taught in academia about databases is ALL WRONG.


Big Data is...


Big Data is not only ‘big’

- Volume, Velocity, Variety
- GB/TB vs PB/EB
- Centralized vs Distributed
- Structured vs Semi-Structured/Unstructured
- Data Model vs Schema
- Known relationships vs Flexible associations


What 'Big data' is NOT?

Hadoop exists because there is Big data; Big data does not exist because there is Hadoop!


What 'Big data' is NOT?

Applying for funding here?

Anything less than Hadoop is as good as an insult!


What 'Big data' is NOT?

- Why do Hadoop and other technologies always come to mind with big data?
- What else should we know?
- Tools vs Methodologies
- Being too futuristic vs. being practical/economical


Big Data in your startup

Cost of tools/software decreases, but cost of knowledge increases

Being agile is the only way to deal with competition

Are you working with...

- Social networking and media
- Mobile devices
- Internet transactions
- Networked devices and sensors


Big Data in your product/service

- You have to change your thinking to the perspective of access vs. storage
- Design based on when/where data is used vs. when/where data is produced
- Use redundancy, at the expense of storage cost
- Understand NoSQL = Not Only SQL
- Streams
- In-memory analytics
- Massively parallel processing (data crunching)
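To make "streams" and "in-memory analytics" concrete, here is a minimal sketch that keeps a rolling count over the most recent events of a stream, entirely in memory. The function name `rolling_counts`, the window size, and the sample click data are all assumptions made up for illustration, not something from the talk.

```python
from collections import Counter, deque
from typing import Iterable, Iterator, Tuple


def rolling_counts(events: Iterable[str], window: int) -> Iterator[Tuple[str, int]]:
    """For each incoming event, yield how often it occurred in the last `window` events."""
    recent = deque()    # the last `window` events, oldest first
    counts = Counter()  # in-memory aggregate over `recent`
    for event in events:
        recent.append(event)
        counts[event] += 1
        if len(recent) > window:
            # Evict the oldest event so memory use stays bounded by the window.
            oldest = recent.popleft()
            counts[oldest] -= 1
        yield event, counts[event]


# Hypothetical click stream; in a real product this would come from a queue or socket.
clicks = ["home", "cart", "home", "pay", "home", "cart"]
results = list(rolling_counts(clicks, window=3))
```

The point of the sketch is the shape of the computation: each event is processed once as it arrives, and only a bounded window is held in memory, so the same pattern works whether the stream has six events or six billion.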


Big Data in your startup

Random research says... 99% of the clients of Big Data startups ended up with fewer paying customers than you have fingers.

A startup hits business scalability limits much, much earlier than technical scalability limits.


Big Data for your clients

Business first, technology second

Current reality for client projects:

- Use big data tools that also work at small scale :-)
- Design with the domain in mind, not the database the client suggests.
- Always design with read optimization in mind (the golden rule)
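One way to read the golden rule is sketched below: pay a little extra on every write to keep a redundant, precomputed summary, so that the common read becomes a single lookup instead of a scan. The class `OrderStore`, its method names, and the sample orders are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


class OrderStore:
    """Toy store that keeps raw orders plus a redundant, read-optimized summary."""

    def __init__(self):
        self.orders = []                            # source of truth: (customer, amount)
        self.totals_by_customer = defaultdict(int)  # redundant copy, shaped for reads

    def add_order(self, customer, amount):
        # Write path: do slightly more work now so reads stay cheap later.
        self.orders.append((customer, amount))
        self.totals_by_customer[customer] += amount

    def total_for(self, customer):
        # Read path: one dictionary lookup, independent of the number of orders.
        return self.totals_by_customer.get(customer, 0)

    def total_for_by_scan(self, customer):
        # The unoptimized alternative: scan every order on every read.
        return sum(amount for name, amount in self.orders if name == customer)


store = OrderStore()
store.add_order("asha", 100)
store.add_order("ravi", 40)
store.add_order("asha", 250)
```

Both read methods return the same answer; the difference is that `total_for` stays fast as the order list grows, at the price of storing the totals twice.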


Big Data project for small data startups

If you can do it in PostgreSQL, then do it in PostgreSQL (the blue elephant rule)


For tech-centric startups - The CAP theorem

Read a lot about the design of a database before using any non-traditional database. Or read good critical posts to learn when NOT to use it.

e.g. :


And Now... Quick Tips

Why Big Data?

- Data == VALUE, MONEY, $$$!
- It's a buzzword, but ride on it like you mean it.
- Your competitors do it, or claim to do it.
- Think of your growth/exit strategy, again!

Yes, I have never owned or worked at a startup, and I'm still advising you!


And Now... Quick Tips

When to actually do Big Data?

- The purpose of the data your startup has should change == to PIVOT
- Do it for “unfair advantage”, not for UVP www://

See, I did it again.


And Now... Quick Tips

How to do Big Data?

Big Data Storage

- Use Big Data patterns, but don't use Big Data tools/technologies (yet)
- Fact/event-based system design
- CQRS (Command Query Responsibility Segregation)
- An easy RDBMS, but with a NON-relational design
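The storage tips above can be sketched on a plain relational database. The example below uses Python's built-in `sqlite3` as a stand-in for any easy RDBMS: facts are appended to an `events` table as JSON (the non-relational part), while a separate `cart_view` table serves reads (the query side of CQRS). The table names, event kinds, and helper functions are all assumptions invented for this sketch.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Write side: an append-only log of facts, with a schemaless JSON payload.
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, kind TEXT NOT NULL, payload TEXT NOT NULL)"
)
# Read side: a small table shaped for exactly one query.
conn.execute("CREATE TABLE cart_view (user_id TEXT PRIMARY KEY, item_count INTEGER NOT NULL)")

DELTAS = {"item_added": 1, "item_removed": -1}  # hypothetical event kinds


def record(kind, user_id, **details):
    """Command: append the fact, then update the read model. Events are never edited."""
    payload = {"user_id": user_id, **details}
    conn.execute("INSERT INTO events (kind, payload) VALUES (?, ?)", (kind, json.dumps(payload)))
    conn.execute("INSERT OR IGNORE INTO cart_view (user_id, item_count) VALUES (?, 0)", (user_id,))
    conn.execute(
        "UPDATE cart_view SET item_count = item_count + ? WHERE user_id = ?",
        (DELTAS.get(kind, 0), user_id),
    )
    conn.commit()


def cart_size(user_id):
    """Query: read only from the read model, never from the event log."""
    row = conn.execute("SELECT item_count FROM cart_view WHERE user_id = ?", (user_id,)).fetchone()
    return row[0] if row else 0


def replay_cart_size(user_id):
    """Recompute the same answer from the raw facts, e.g. to rebuild a lost view."""
    total = 0
    for kind, payload in conn.execute("SELECT kind, payload FROM events ORDER BY id"):
        if json.loads(payload)["user_id"] == user_id:
            total += DELTAS.get(kind, 0)
    return total


record("item_added", "u1", sku="A")
record("item_added", "u1", sku="B")
record("item_removed", "u1", sku="A")
record("item_added", "u2", sku="C")
```

Because the event log is the source of truth, the read model can be thrown away and rebuilt, or a new one added later for a question nobody thought of at design time, without touching the stored facts.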

Big Data Analytics

- Until you hit 1K customers, use analytics-as-a-service
- IBM Watson

Even more! I am liking it; not sure about you, though.


And Now... Quick Tips

How NOT to do Big Data?

- If you are not selling your startup in the NEXT 6 months
- Don't start with technology; start with a business case on NON-BIGDATA-TECHNOLOGY
- If you have not pivoted even once!

Even more! I am liking it; not sure about you, though.


Few references used AND this is not the last slide

- Basic Hadoop introductory material :
- Evaluate Hadoop without installation :
- PostgreSQL good parts :
- PostgreSQL as NoSQL column store :
- PostgreSQL as Elasticsearch basic functionality :
- Good big data compatible OSS software :
- Practical HBase usage :
- Why Big Data technologies are on Linux :
- Using Cassandra for write-heavy applications :
- Online analytics in Storm :
- E-commerce domain-specific use case :
- Good use case of selecting a data store based on proper understanding of the CAP theorem :
- Recommendation engine in Big Data scenarios :
- High volume log processing :
- Open source alternatives :


Yes, it's unreadable and even incomplete, and has irrelevant YouTube video links!


Questions & Answers

Ask the question now if your question:

- Is a one-liner
- Is not personal, from either side.

Ask it after the session today:

- If your context is specific
- If it's personal and you don't wanna be humiliated in public
- If it's technical, then attend the next 2 sessions

Ask in any café @Prahladnagar

- We advise startups and individuals for free (not joking this time)

Don't ask

- Why is Melody so chocolaty? (“Melody itni chocolaty kyun hai?”)

“No question unanswered” is not copyrighted by me, yet.