hiren-tamhane
7/30/2019 Research Report Hiren Final Printout 3
In-Memory Analytics using SAP HANA
Index
1. Introduction to In-Memory Analytics
2. Need of In-Memory Analytics
3. Advantages of SAP HANA for In-Memory Analytics
4. Types of Analytics with SAP HANA
5. Design of SAP HANA
6. How SAP HANA Works
7. Limitations of SAP HANA
8. Conclusion
9. References
1. Introduction to In-Memory Analytics
Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making.

Analyzing a few thousand rows of data is manageable, but when the volume grows to millions or billions of rows, the analysis can take hours or even days. A second problem is disk read speed. Over time, RAM capacity and processing speed have increased considerably, but the speed at which data can be read from a spinning hard disk has improved very little.

Together, these two problems mean that analyzing large amounts of data, which is now common in large companies, takes a very long time. Time is something organizations today cannot afford: they want instant access to information in the moment, whether that is a moment of risk or a moment of opportunity. If the moment has passed and the business has not taken the right action, it has failed.

In-Memory Analytics evolved to overcome this problem. In In-Memory Analytics, the data is pulled from the traditional disk-based RDBMS and loaded into main memory. Because RAM (main memory) is extremely fast compared to hard disks, this can boost the speed of analytics significantly and make real-time analytics possible.

With multi-core processors, servers can have up to 2 TB (2 terabytes = 2048 GB) of RAM, which is enough to meet the requirements of In-Memory Analytics.
2. Need of In-Memory Analytics
Vishal Sikka (head of the SAP HANA development team) once said that if you ask a question today and get the answer three days later, you will have forgotten what the question originally was.

Today, organizations face the challenge not only of processing huge amounts of data that changes at an exponential rate and comes from different data sources, but also of analyzing it in different ways, and doing so in seconds.

From a retail shop's perspective, real-time data means that the POS (point of sale) data is available for analytics even before the customer leaves the store.

Analytical projects currently face two big problems, the two big Vs that affect the performance of an analytical project:
1) Volume and 2) Velocity.
1) Volume:-
Years ago, if you had a 1 GB pen drive in your hand, you walked like a king. Those golden days are gone now.
Every year a company creates huge amounts of data, and how fast the business reacts to important information determines whether it wins or fails. This is a big problem, and it is getting bigger.
2) Velocity (Speed):-
To be successful in business, an organization has to make decisions in the moment. It could be a moment of risk or a moment of opportunity. If the moment passes and the organization does not react to it, the organization fails. Thus every organization today wants results in a fraction of a second.

Organizations today want quick answers. They want answers accurate enough to rely on, and they want them instantly, without a long wait. They also want them available anywhere, 24x7.

Gone are the days when companies relied on quarterly reviews and annual budgets for their decision making. Now they want instant responses. Companies want to know current market conditions and trends so that they can decide how to change their policies and supply chains to gain a competitive advantage over rivals.
RDBMS fail to achieve both
RDBMS were designed for transactional processing (inserts and updates). It is hard to find a database that is good at transaction processing and, at the same time, good at the aggregations and joins typical of analytical solutions. In addition, Structured Query Language (SQL) is designed to fetch rows of data efficiently, while BI queries usually fetch partial rows of data and involve heavy calculations.

Complex queries can be written in SQL, but it has been observed that such queries take a very long time to complete and also degrade the performance of concurrent transactional processes. To get fast results, multidimensional databases, or cubes, were often built. This approach is also called MOLAP, which stands for multidimensional online analytical processing.

Designing a cube is a complex and elaborate process, and IT staff have to devote a significant amount of time to it. Changing a cube's structure to adapt to dynamically changing business needs is cumbersome. The cubes are then populated with pre-calculated data to answer particular queries. This reduces query time significantly, but cubes cannot answer ad hoc queries efficiently.
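The trade-off between a pre-calculated cube and ad hoc querying can be illustrated with a small Python sketch. It is only a toy model: the sales records, the choice of (region, product) as the cube's dimensions, and the function names are all assumptions made for illustration, not part of any MOLAP product.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, month, amount).
SALES = [
    ("North", "Laptop", "Jan", 1200),
    ("North", "Phone", "Jan", 800),
    ("South", "Laptop", "Feb", 1500),
    ("South", "Phone", "Feb", 700),
    ("North", "Laptop", "Feb", 900),
]


def build_cube(rows):
    """Pre-calculate totals for the one grouping chosen at design time."""
    cube = defaultdict(int)
    for region, product, _month, amount in rows:
        cube[(region, product)] += amount
    return dict(cube)


def ad_hoc_total(rows, **filters):
    """Answer an arbitrary question by scanning the raw rows."""
    fields = ("region", "product", "month")
    total = 0
    for row in rows:
        record = dict(zip(fields, row[:3]))
        if all(record[name] == value for name, value in filters.items()):
            total += row[3]
    return total


cube = build_cube(SALES)

# A question the cube was designed for is a single dictionary lookup.
planned = cube[("North", "Laptop")]

# The cube has no month dimension, so this question needs the raw data.
unplanned = ad_hoc_total(SALES, month="Feb")
```

The lookup is fast only because the grouping was fixed in advance; any question outside that grouping falls back to a scan of the detailed rows, which is the case in-memory analytics tries to make cheap.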
3. Advantages of SAP HANA for In-Memory Analytics
SAP HANA, the in-memory analytics technology from SAP AG that was launched late last year, is winning rave reviews around the world and promises to do for SAP what R/3, its powerful ERP software, did for it in the late 1990s.
Helped by HANA, SAP AG's revenue growth outpaced that of competitor Oracle last quarter for the first time in two and a half years.
As defined on SAP's official website, "SAP HANA is an in-memory data platform that is deployable as an on-premise appliance, or in the cloud. It is a revolutionary platform that's best suited for performing real-time analytics, and developing and deploying real-time applications. At the core of this real-time data platform is the SAP HANA database which is fundamentally different than any other database engine in the market today."
Whenever organizations have to go deep within their databases to ask complex and interactive queries, and at the same time go broad (which means working with enormous data sets of different types and from different data sources), SAP HANA is well suited. Increasingly there is a need for this data to be recent, preferably real-time. Add to that the need for high speed (very fast response time and true interactivity) and the need to do all this without any pre-fabrication (no data preparation, no pre-aggregates, no tuning), and you have a unique combination of requirements that only SAP HANA can address effectively. When this set of needs, or any subset thereof, has to be addressed (in any combination), SAP HANA is in its element.
4. Types of Analytics with SAP HANA
There are three main types of in-memory analytics that can be accomplished with SAP HANA. They are as follows:-
a. Operational Reporting
b. Data Warehousing
c. Predictive and Text analysis on Big Data
a. Operational Reporting:-
In this type of analytics, companies do the things they do on a day-to-day basis. That is what BASF, for example, is using it for. Some people need to do profitability analysis quickly, to understand which products they earn money on; these are complicated calculations, because the cost has to be allocated to each product they make. For very large companies with complex product structures, such a report could take up to three hours to generate. With HANA, it takes a couple of seconds.
b. Data Warehousing:-
With in-memory computing, the need for a complex data warehouse is minimized. Data warehouses pre-compute data into various aggregates in order to have the answer ready when the question comes. With in-memory computing, you don't need to build the aggregates; you can just calculate on the fly. That means less cost for infrastructure to run large-scale analytics systems. The third use case is to solve problems that could not be solved before.
c. Predictive and Text analysis on Big Data:-
Companies have to think beyond just delivering the best and most effective products to people: they need to uncover customer, employee, vendor, and partner trends and insights, anticipate behavior, and take proactive action. SAP HANA provides the ability to perform predictive and text analysis on large volumes of data in real time. It does this through the power of its in-database predictive algorithms and its R integration capability. With its text search and analysis capabilities, SAP HANA also provides a robust way to leverage unstructured data.
5. Design of SAP HANA
SOFTWARE DESIGN:
Instead of storing data row-wise, as the majority of databases do today, HANA stores the data in a columnar manner for fast computing.
For example: suppose the system wants to find the aggregate of the second column, i.e. 10+35+2+40+12.
In Row wise: The system has to jump across memory locations to collect subsequent values for aggregation. Data records are available as complete tuples in one read, which makes accessing only a few attributes an expensive operation.
In Column wise: A single scan would fetch the results much faster.
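The difference between the two layouts can be sketched in a few lines of Python. Only the second-column values (10, 35, 2, 40, 12) come from the example above; the other columns and all function names are invented for illustration, and Python lists only approximate how a database lays data out in memory.

```python
# Hypothetical table: only the second-column values come from the text.
ROWS = [
    ("A", 10, "EUR"),
    ("B", 35, "USD"),
    ("C", 2, "EUR"),
    ("D", 40, "USD"),
    ("E", 12, "EUR"),
]


def to_columns(rows):
    """Transpose row tuples into one contiguous list per column."""
    return [list(column) for column in zip(*rows)]


def sum_row_store(rows, index):
    # Every complete tuple is visited just to pick out one attribute.
    return sum(row[index] for row in rows)


def sum_column_store(columns, index):
    # The values of one column sit together, so a single scan is enough.
    return sum(columns[index])


COLUMNS = to_columns(ROWS)
row_total = sum_row_store(ROWS, 1)
column_total = sum_column_store(COLUMNS, 1)
```

Both functions return the same total; the point is that the column layout touches only the values being aggregated, while the row layout has to step through every record.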
Another important aspect of a column-based RDBMS is data compression. Since
all values in a column are stored together, there is the possibility of storing the value
only once, alongside the number of occurrences. So in the example table we've just
seen, the last column might be stored as follows:
EUR, USD, 2: EUR, USD
This might not seem important, but in a table that contains several million lines, the space savings are potentially huge. SAP indicates that data can be compressed to between 10 percent and 25 percent of its original size. Of course, this means less data for the system to scan through, and since the data is in memory, it means more data (between 4 and 10 times as much) can be kept in memory at once.
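Run-length encoding is one simple scheme that matches the description of storing a value once alongside its number of occurrences. The sketch below is an illustration under that assumption, not a statement of how HANA compresses data internally; the currency column is made up. The helper `capacity_factor` shows where the "4 to 10 times" figure comes from: data compressed to 25 percent of its size fits 1/0.25 = 4 times over, and data compressed to 10 percent fits 1/0.10 = 10 times over.

```python
from itertools import groupby


def run_length_encode(values):
    """Store each run of equal values once, with its number of occurrences."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


def capacity_factor(compressed_fraction):
    """How many times more data fits in the same memory after compression."""
    return 1 / compressed_fraction


# Hypothetical currency column with long runs of repeated values.
currency = ["EUR"] * 4 + ["USD"] * 3 + ["EUR"] * 2
encoded = run_length_encode(currency)
restored = run_length_decode(encoded)
```

Nine stored values shrink to three (value, count) pairs here, and the encoding is lossless because decoding restores the original column.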
HARDWARE DESIGN:
A large amount of data is divided into multiple sets, which are then crunched individually by the blades as shown below.
Data is divided into 4 blades with 2 standby blades
Each blade is composed of multiple CPUs, and each CPU has multiple cores. For example, with 8 cores per CPU and 4 such CPUs per blade, just 4 blades will have 128 cores crunching data in parallel.
The SAP HANA box itself is a massively multi-core, multi-CPU server with a great deal of memory, up to several terabytes. For example, on May 16, 2012, IBM announced that, in collaboration with SAP, it had built a machine with 100 TB of main memory.
At the time, SAP indicated that this machine would be sufficient to run the eight largest clients of SAP ERP, all at the same time!
One of the main strong points of SAP HANA is its ability to process data in parallel, cutting the initial (large) amount of data into small chunks and then giving each chunk to a separate CPU to work on, hence the need for the large number of CPU cores.
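The chunk-and-combine pattern can be sketched with Python's standard library. This shows the structure only: the function names are hypothetical, and a thread pool is used to keep the example short, which in standard CPython does not give true multi-core speedup for pure-Python arithmetic. The constants reproduce the core count from the blade example (8 x 4 x 4 = 128).

```python
from concurrent.futures import ThreadPoolExecutor

# Figures from the blade example in the text.
CORES_PER_CPU = 8
CPUS_PER_BLADE = 4
BLADES = 4
TOTAL_CORES = CORES_PER_CPU * CPUS_PER_BLADE * BLADES


def split_into_chunks(data, n_chunks):
    """Cut the data into at most n_chunks contiguous pieces of similar size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum(data, workers):
    """Sum each chunk in its own worker, then combine the partial results."""
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


values = list(range(1, 1001))
total = parallel_sum(values, workers=8)
```

An aggregate such as a sum suits this pattern because the partial results can be combined in any order; the same split would not work unchanged for operations that depend on neighbouring rows.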
One other aspect of the system is that wherever possible, data is kept in memory,
in order to speed up access time. Where a traditional database system might set aside a
gigabyte or two of memory as a cache, SAP HANA takes this to the next level, using
nearly all the server's memory for the data, making access times nearly instantaneous.
6. How SAP HANA Works
To implement SAP HANA, an organization does not need to make drastic changes to its existing IT investments (infrastructure). The diagram above shows that data in the database can be replicated in real time into HANA and used for reporting with a number of Business Intelligence tools sitting directly on top of HANA.
There are 3 major steps for using HANA:-
1) Loading data into HANA from an existing data source.
2) Modelling data in HANA, using HANA Studio, to facilitate data analysis.
3) Analysing the data in HANA using BI tools.
SAP HANA is designed to replicate and ingest structured data from SAP and non-SAP relational databases, applications, and other systems quickly. One of three styles of data replication (trigger-based, ETL-based, or log-based) is used, depending on the source system and the desired use case. The replicated data is then stored in RAM rather than loaded onto disk, the traditional form of application data storage. Because the data is stored in memory, it can be accessed in near real time by analytic and transactional applications that sit on top of HANA.
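The load, model, and analyse steps can be mimicked with Python's built-in sqlite3 module using an in-memory database. This is a stand-in for illustration only: SQLite is not HANA, the table, view, and function names are hypothetical, and the source rows are invented. A one-off bulk insert plays the role of replication here; none of the trigger-based, ETL-based, or log-based mechanisms is implemented.

```python
import sqlite3

# Hypothetical source rows standing in for an existing system of record.
SOURCE_ROWS = [
    ("2012-11-01", "Laptop", 2, 1200.0),
    ("2012-11-01", "Phone", 5, 300.0),
    ("2012-11-02", "Laptop", 1, 1200.0),
]


def load(conn, rows):
    """Step 1: copy data from the source into the in-memory database."""
    conn.execute(
        "CREATE TABLE sales "
        "(day TEXT, product TEXT, quantity INTEGER, unit_price REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)


def model(conn):
    """Step 2: define a view that exposes the data in analysis-friendly form."""
    conn.execute(
        "CREATE VIEW revenue_by_product AS "
        "SELECT product, SUM(quantity * unit_price) AS revenue "
        "FROM sales GROUP BY product"
    )


def analyse(conn):
    """Step 3: query the model, as a BI tool would."""
    cursor = conn.execute(
        "SELECT product, revenue FROM revenue_by_product ORDER BY product"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")  # the data lives in RAM, not on disk
load(conn, SOURCE_ROWS)
model(conn)
report = analyse(conn)
```

The view holds no pre-computed data: the revenue figures are calculated from the detailed rows each time the report is requested, which mirrors the calculate-on-the-fly idea behind in-memory analytics.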
7. Limitations of SAP HANA
SAP HANA is not well suited to analyzing very large amounts of data (in petabytes), commonly known as big data. Therefore, HANA is not suited for social networking and social media data analytics. For such use cases, enterprises are better off looking to open-source big-data approaches such as Apache Hadoop or LexisNexis HPCC Systems.
While SAP has announced a slew of new HANA-optimized applications, currently only a few are on the market. It is incumbent upon SAP to follow through on its commitment with practical applications that address real-world business problems.
Also, SAP HANA is not made to support non-SAP applications, and supporting such applications requires significant application re-engineering on the part of enterprise IT groups.
You cannot expect a great change just by replacing your current database infrastructure with HANA. You also need to redesign the application to some extent to get the best out of HANA.
8. Conclusion
SAP HANA is a great in-memory analytics tool that can analyze huge amounts of data in real time, which is a great advantage to many large companies around the world. It is still too new to have been adopted extensively by the enterprise BI community, but its usage is increasing day by day, and it is seen as a revolutionary technology for the future.
9. References
Site:-
http://sapignite.com/what-is-sap-hana/ (Date: 15 Sep 2012)
http://sapignite.com/why-sap-hana-database-applianc/ (Date: 15 Sep 2012)
http://en.wikipedia.org/wiki/In-Memory_Processing (Date: 15 Sep 2012)
http://www.bluefinsolutions.com/insights/blog/the_sap_hana_faq_answering_key_sap_in_memory_questions/ (Date: 20 Sep 2012)
http://www.saphana.com/docs/DOC-2272 (Date: 20 Sep 2012)
http://articles.timesofindia.indiatimes.com/2011-09-21/strategy/30183979_1_hana-jim-hagemann-snabe-data (Date: 26 Sep 2012)
http://archive.ciol.com/cgi-bin/printernew.asp?id=114860 (Date: 14 Oct 2012)
http://wikibon.org/wiki/v/Primer_on_SAP_HANA (Date: 6 Nov 2012)
http://articles.businessinsider.com/2012-01-25/news/30662175_1_new-database-oracle-hasso-plattner (Date: 8 Nov 2012)
http://www.cio.com.au/article/373945/in-memory_computing/ (Date: 15 Nov 2012)
http://en.wikipedia.org/wiki/In-memory_database (Date: 15 Nov 2012)
http://en.wikipedia.org/wiki/SAP_HANA (Date: 15 Nov 2012)
http://whatis.techtarget.com/definition/in-memory-database (Date: 16 Nov 2012)
http://slashdot.org/topic/datacenter/the-rise-of-in-memory-databases/ (Date: 17 Nov 2012)
http://www.webopedia.com/TERM/I/in_memory_analytics.html (Date: 17 Nov 2012)
http://www.saphana.com/docs/DOC-1085 (Date: 20 Nov 2012)
http://www.clivemargolis.com/articles-about-bi/qlikview-sap-hana-cognos-tm1-%E2%80%93-in-memory-analytics-what%E2%80%99s-it-all-about/ (Date: 22 Nov 2012)
https://scn.sap.com/thread/2065611 (Date: 23 Nov 2012)
Book:-
SAP HANA Master Guide By SAP Press (Date: Oct-Nov 2012)