TDWI BEST PRACTICES REPORT

TDWI RESEARCH FOURTH QUARTER 2015

tdwi.org

Emerging Technologies For Business Intelligence, Analytics, and Data Warehousing

By Philip Russom, David Stodder, and Fern Halper

Research Sponsors

HP Security Voltage

HP Vertica

MicroStrategy

Qlik

Snowflake Computing

Striim

Trifacta

Table of Contents

Research Methodology and Demographics
Executive Summary
Introduction to Emerging Technologies and Methods
  Defining and Categorizing ETMs
  User Perspectives on ETMs
  Why Are Emerging Technologies So Important?
The State of Emerging Technologies and Methods
  Are ETMs a Problem or an Opportunity?
  Benefits of ETMs for BI, Analytics, and DW
  Barriers to Success with ETMs
ETMs for Business Intelligence
  New Technology Adoption and Satisfaction
  Self-Service BI: Power to the Users
  Self-Service Data Preparation ETMs
  Hadoop Access from BI and Analytics Tools
  Mobile and Embedded BI: Potential for BI ETMs
  Importance of Embedded BI and Analytics
  Using BI ETMs to Monetize Data Assets
ETMs for Analytics
  Evolving Data Sets and Analytics
  Disparate Data Types Continue to Build Momentum
  Analytics Hits the Mainstream
  The Internet of Things
  The Cloud for Data Management and Analytics
ETMs for Data Warehousing
  Trends in ETMs Relative to Data Warehousing and Data Management
  Data-Centric Security
A Sample of Relevant Vendor Platforms and Tools
Top 10 Priorities for Emerging Technologies in BI, Analytics, and Data Warehousing

© 2015 by TDWI, a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. E-mail requests or feedback to [email protected].

Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

About the Authors

PHILIP RUSSOM is a well-known figure in data warehousing and business intelligence (BI), having published more than 500 research reports, magazine articles, opinion columns, speeches, Webinars, and more. Today, as the director of TDWI Research for data management, he oversees many of the company’s research-oriented publications, services, and events. Prior to joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research and Giga Information Group. He also ran his own business as an independent industry analyst and BI consultant and was a contributing editor with leading IT magazines. You can reach him by e-mail, on Twitter, and on LinkedIn.

DAVID STODDER is the director of TDWI Research for business intelligence (BI). He focuses on providing research-based insight and best practices for organizations implementing BI, analytics, performance management, data discovery, data visualization, and related technologies and methods. He has authored TDWI Best Practices Reports and Checklist Reports on customer analytics in the age of social media, BI/data warehouse agility, mobile BI, and information management. He has chaired TDWI conferences on BI agility and big data analytics. Stodder has provided thought leadership on BI, information management, and IT management for more than two decades, and he was the founding chief editor of Intelligent Enterprise, where he served as editorial director for nine years. You can reach him by e-mail, on Twitter, and on LinkedIn.

FERN HALPER is the director of TDWI Research for advanced analytics, focusing on predictive analytics, social media analysis, text analytics, cloud computing, and other “big data” analytics approaches. She has more than 20 years of experience in data and business analysis, and she has published numerous articles on data mining and information technology. Halper is co-author of “Dummies” books on cloud computing, hybrid cloud, service-oriented architecture, service management, and big data. Her Ph.D. is from Texas A&M University. You can reach her by e-mail, on Twitter, and on LinkedIn.

About TDWI

TDWI, a division of 1105 Media, Inc., is the premier provider of in-depth, high-quality education and research in the business intelligence and data warehousing industry. TDWI is dedicated to educating business and information technology professionals about the best practices, strategies, techniques, and tools required to successfully design, build, maintain, and enhance business intelligence and data warehousing solutions. TDWI also fosters the advancement of business intelligence and data warehousing research and contributes to knowledge transfer and the professional development of its members. TDWI offers a worldwide membership program, five major educational conferences, topical educational seminars, role-based training, onsite courses, certification, solution provider partnerships, an awards program for best practices, live Webinars, resourceful publications, an in-depth research program, and a comprehensive website: tdwi.org.

About the TDWI Best Practices Reports Series

This series is designed to educate technical and business professionals about new business intelligence technologies, concepts, or approaches that address a significant problem or issue. Research for the reports is conducted via interviews with industry experts and leading-edge user companies and is supplemented by surveys of business intelligence professionals.

To support the program, TDWI seeks vendors that collectively wish to evangelize a new approach to solving business intelligence problems or an emerging technology discipline. By banding together, sponsors can validate a new market niche and educate organizations about alternative solutions to critical business intelligence issues. To suggest a topic that meets these requirements, please contact TDWI Research Directors Philip Russom, David Stodder, and Fern Halper.

Research Methodology and Demographics

Report Purpose. This report educates organizations worldwide about the inventory of currently available emerging technologies and methods (ETMs) as they apply directly to business intelligence (BI), analytics, and data warehousing (DW). TDWI assumes that the innovations and excitement of ETMs can make BI, DW, and analytics more appealing, pervasive, insightful, and actionable.

Terminology. In this report, ETMs are defined multiple ways, depending on a technology’s type, functionality, age, and adoption. There are many ETM types; this report focuses on those that currently enable real-world use cases in BI, analytics, and DW. Some of the ETMs highlighted include visualization, clouds, the Internet of Things, Hadoop, NoSQL, mobile BI, advanced analytics, and multi-structured data.

Survey Methodology. In May 2015, TDWI sent an invitation via e-mail to the BI and data professionals in its database, asking them to complete an Internet-based survey. The invitation was also posted on Web pages and in publications from TDWI and other firms. The survey collected responses from 403 respondents. Of these, 303 respondents completed every question. Survey branching limited the number of respondents for some questions. All responses are valuable and so are included in this report’s data sample. This explains why the number of respondents varies per question.

Research Methods. In addition to the survey, TDWI conducted telephone interviews with technical users, business sponsors, and data management experts. TDWI also received briefings from vendors that offer products and services related to ETMs for data, BI, and analytics.

Survey Demographics. The majority of survey respondents are IT professionals (64%), followed by business sponsors or users (16%) and consultants (12%). We asked consultants to fill out the survey with a recent client in mind.

The financial services industry (17%) leads the respondent population, followed by consulting and professional services (13%), software/Internet (12%), healthcare (8%), education (7%), and other industries. Most survey respondents reside in the U.S. (53%) or Europe (15%). Respondents come from enterprises of all sizes, with a slight concentration in small ones.

Acknowledgments

TDWI would like to thank the many people who contributed to this report. First, we appreciate the many users who responded to our survey, especially those who agreed to our requests for phone interviews. Second, we thank our report sponsors, who diligently reviewed outlines, survey questions, and report drafts. Finally, we would like to recognize TDWI’s production team: Michael Boyda, Marie Gipson, James Haley, and Denelle Hanlon.

Sponsors

HP Security Voltage, HP Vertica, MicroStrategy, Qlik, Snowflake Computing, Striim, and Trifacta sponsored the research and writing of this report.

Position
Corporate IT professionals 64%
Business sponsors/users 16%
Consultants 12%
Vendor representatives 5%
Academics (professor or student) 3%

Industry
Financial services 17%
Consulting/professional services 13%
Software/Internet 12%
Healthcare 8%
Education 7%
Telecommunications 6%
Government (state and local) 5%
Insurance 5%
Retail/wholesale/distribution 4%
Manufacturing (non-computers) 3%
Transportation/logistics 3%
Other 17%
(“Other” consists of multiple industries, each represented by 2% or less of respondents.)

Geography
United States 53%
Europe 15%
Asia 11%
Canada 9%
Mexico, Central or South America 4%
Australia/New Zealand 3%
Middle East 3%
Africa 2%

Company Size by Revenue
Less than $100 million 21%
$100–500 million 15%
$500 million–$1 billion 8%
$1–5 billion 16%
$5–10 billion 7%
More than $10 billion 16%
Don’t know 17%

Based on 303 respondents who completed every question in the survey.

Executive Summary

Part of the fun of being in business intelligence (BI), analytics, data warehousing (DW), and data management (DM) is the constant stream of new and exciting technologies, vendor tools, team structures, development methods, user best practices, and new sources of big data. TDWI refers to these collectively as emerging technologies and methods (ETMs).

For example, tools for data visualization have been the most hotly adopted ETMs in BI in recent years. In addition to visualization, most of these tools also support other emerging techniques, including data exploration and discovery, data preparation, analytics, and storytelling. ETMs for analytics involve advanced techniques, such as predictive analytics, stream mining, graph analytics, and text analytics, which are progressively applied to emerging data sources such as social media sites, machines, clouds, and the Internet of Things. A number of emerging data platforms have entered DW environments, namely Hadoop, MapReduce, columnar database management systems (DBMSs), and real-time platforms for event and stream data. The most influential emerging methods are based on agile development or collaborative team structures (e.g., competency centers).

According to this report’s survey, the leading general benefits of ETMs (in survey order) are improvements in competitiveness, decision making, responses to business change, business performance, and innovation. These benefits are being realized today because two-thirds of organizations surveyed are already using ETMs and the vast majority consider ETMs an opportunity.

Despite the benefits, a number of barriers stand in the way of adopting ETMs. Many people feel held back by their IT team’s lack of skills, staffing, infrastructure, and buy-in. Others have trouble seeing the business value of leading-edge technologies. Some work in risk-averse organizations that lack a culture of innovation for either IT or the business. Yet both business and technical respondents report working through these issues to adopt ETMs.

Some ETMs are more like tool features that are emerging in a variety of tool types. The most pervasive is self-service functionality, which is found in tools for reporting, analytics, data prep, and so on. The point is to give certain classes of users tools so easy, intuitive, and integrated with common data sources that they can use them with little or no setup or assistance from IT. More than half of users surveyed consider themselves successful with IT-free self-service. Other BI ETMs are also progressing, namely mobile BI and agile BI.

Hadoop (whether from Apache or a software vendor), tools and frameworks associated with it (MapReduce, Spark, Hive, HBase), and similar data platforms (NoSQL databases) have emerged from their Internet-company roots and are now being adopted by mainstream enterprises. These ETMs are examples of how influential open source software (OSS) has become for innovative products. Interfaces to these platforms’ data are also common emerging features in vendor-supplied tools for data integration, data prep, data exploration, reporting, and analytics. Hadoop is infamous for weak security, which a new class of ETMs is addressing.

All these OSS-based or OSS-inspired ETMs are now entering data warehouse environments, along with slightly older ETMs such as DW appliances, analytic DBMSs, and columnar DBMSs. This emergence has driven a trend toward multi-platform data warehouse environments, where the core relational warehouse is joined by a long list of standalone data platforms, most of them ETMs.

This report informs the reader about the ETMs that apply to BI, analytics, and DW, plus how ETMs can support innovation in business processes, customer management, competitiveness, IT best practices, and the business leverage of data.

There are many ETMs to consider for BI, analytics, and DW

ETMs assist with competitiveness, decisions, business change, and innovation

Open source software has become an important wellspring for innovation

DW environments nowadays include multiple ETMs, many based on open source

Introduction to Emerging Technologies and Methods

Defining and Categorizing ETMs

The modern user organization relies on IT systems, along with best practices for these, to automate a wide range of operations. This is not just to survive but to thrive in a fast-paced, competitive, and economically challenged world. To thrive now and in the future, organizations need to continually update IT systems by adopting emerging technologies and methods (ETMs).

Keeping pace with ETMs is important—especially those with direct applications in business intelligence (BI), analytics, and data warehousing (DW)—because they can satisfy new technical requirements for data-driven applications, enable new business practices, and modernize teams and their methods. On a subjective level, the innovations and excitement of ETMs can make BI, analytics, and DW more appealing, pervasive, insightful, and actionable.

This report discusses ETMs in three categories based on their functions within common technology stacks for BI, analytics, and DW—or combinations of these. Later sections of this report dive into the details of the ETM categories introduced below. [1]

• ETMs for BI. In recent years, TDWI has seen aggressive adoption of the newest generation of BI tools for data visualization. However, these tools do far more than “viz,” including ETMs for data exploration and discovery, data preparation, and analysis of diverse data sources through data-driven storytelling. Older ETMs continue to gain ground, such as mobile BI, mashups, and dashboards. Some of the most influential innovations have come from emerging development methods for agile BI and collaborative BI.

• ETMs for analytics. Mature organizations have invested in OLAP and other traditional approaches to analytics. The trend is toward ETMs for advanced analytics, which turn a business into a data-driven operation so it can compete on analytics and achieve a higher level of operational excellence. Many of the innovations in tools for advanced analytics are geared to wringing analytic value from emerging data sources, such as social media data, the Internet of Things (IoT), streaming data, and a wide range of machine data (from sensors, devices, and vehicles).

• ETMs for DW. ETMs that are disrupting data management (DM) and warehousing programs include Hadoop, Apache Spark, NoSQL databases, columnar databases, cloud databases, in-database analytics, stream processing, and a new generation of easy-but-effective data preparation tools. As users adopt these technologies, they typically make substantial revisions to their work methods for ETL/ELT, data modeling, database design, team collaboration (via competency centers or centers of excellence), and architectures for data warehouses and other complex data environments.

• ETMs that reach into multiple layers of the BI/DW/analytics technology stack. These include variations of clouds and software-as-a-service (SaaS), as well as advancements in development best practices based on agile and lean methods. Across the board, open source software for BI, analytics, and DM has recently achieved a new level of maturity that makes it fit for broad enterprise use. Finally, many ETMs span multiple layers of a technology stack. For example, in-database analytics combines innovations in both analytics and database management. Similarly, real-time BI or stream analytics gets its speed from a combination of ETMs, such as in-memory functions, solid-state drives, and columnar databases.

ETMs help an organization innovate and excel

Dozens of ETMs can be applied in a tech stack for BI, analytics, and DW

[1] Nowadays there are emerging technologies advancing almost every field of human endeavor, not just IT. For example, hundreds are presented in the article “List of Emerging Technologies” on en.Wikipedia.org. Note that this report focuses on ETMs that are high profile today in BI, analytics, and DW.

User Perspectives on ETMs

The survey for this report asked respondents to provide the terms and definitions for emerging technologies used where they work. Their responses corroborate that there are many viewpoints and terms, as seen in the following representative quotes:

In your own words, what terms or slang expressions do you and your colleagues use when referring to emerging technologies and methods?

• “Technique or method that leverages new principles to promise unique capabilities, efficiency, or other benefits not achievable before.” —BI service owner, financial services, Canada

• “We always use ‘big data’ as the single term to describe ETMs.” —Senior director of customer intelligence, nonprofit, United States

• “Big data is a very common catchall for ETMs, especially among the uninitiated (lay people) within the business.” —Farm resources manager, agriculture, United States

• “SMAC = Social, Mobile, Analytics, and Cloud.” —IT professional, Internet, Asia

• “Methods are the new emerging tech, such as predictive analytics and artificial intelligence.” —Head of data science and engineering, Internet, United States

• “Advanced analytics and Hadoop, to be able to analyze structured and unstructured data and to provide additional value to the business.” —Head of BI, healthcare, United States

• “Future of BI is advanced analytics in the hands of [the] user.” —BI consultant and developer, software, Asia

• “Cutting-edge solutions, innovative technologies, agile solutions, next-generation technologies.” —Director of partner development, financial services, United States

• “You will often hear the term ‘bleeding edge’ instead of ‘leading edge,’ especially when raising the notion of technology risk with larger, more risk-averse organizations.” —BI and analytics program manager, Internet, Europe

• “Emerging technologies. We don’t use a TLA [three-letter acronym] for this one. :-)” —Enterprise BI team lead, government, Canada

Figure 1. Based on 353 respondents.

Many users equate ETMs with big data, analytics, bleeding-edge tech, and Hadoop

Why Are Emerging Technologies So Important?

Anecdotally, TDWI has seen users’ needs for ETMs increase noticeably since the new millennium began. To gauge the urgency of these needs, this report’s survey asked: “How important do you think it is to embrace ETMs?” (See Figure 2.)

Very few respondents (8%) question the importance of ETMs. A business user from a U.S. telecommunications firm said: “ETMs are not important, as the reliability and relevance cannot be determined early on. We try to use tried and proven technologies and methodologies.” A similar contrarian view came from an information platform manager in the Australian government: “Awareness of ETMs is more important than adopting them early. Awareness allows the idea to be built upon without a headlong rush down the pathway of the untested.”

The vast majority of respondents (92%) recognize the importance of ETMs. Over half feel that ETMs are “very important,” while more than a third see them as “somewhat important.”

How important do you think it is to embrace ETMs?

Very important 53%

Somewhat important 39%

Neutral 6%

Not very important 1%

Not important at all 1%

Figure 2. Based on 399 respondents.

Most users consider emerging technologies to be important

As we just saw, users are aggressively gung ho for ETMs. But why? To get their individual opinions, the survey asked the open-ended question: “In your own words, why are ETMs important (or not) to your organization?” The respondents’ comments are overwhelmingly favorable, as seen in the representative statements reproduced below:

In your own words, why are ETMs important (or not) to your organization?

• “Disruption in the financial industry is happening increasingly. ETMs can help us react to any disruption, can enable us to disrupt ourselves or our industry, and can create the required mindset in the organization to allow us to react quickly and painlessly.” —Head of data innovation, financial services, Europe

• “In order to stay competitive, we need to be in tune with the new technologies in the market that are going to help us move our business forward.” —IT director, education, United States

• “If we are not doing it, one of our competitors is doing it. It is better to try and fail—and at least gain an understanding—than not try at all. The rewards for success greatly outweigh the risks of failure.” —Senior technical architect, pharmaceuticals, Canada

• “ETMs keep you competitive. You have to ‘one-up’ your competitors on a regular basis, along with providing the ‘wow’ factor to your clients on a regular basis.” —Senior director of business intelligence, advertising, United States

• “Keeps IT in the mix of critical business initiatives.” —Senior director of a big data analytics center of excellence, pharmaceuticals, Europe

• “There is a fine line of chasing too many new tech/methodologies. But DW is taking a major turn as an industry, and companies will be left behind if they do not embrace [ETMs].” —Team lead for enterprise analytics, insurance, United States

• “Our organization needs to embrace ETMs because they complement or improve the value that a traditional data warehouse can provide.” —Senior business systems analyst, retail, United States

• “Technology advances in big steps, and in order to be on top of the game, whatever your industry is, you must be always researching what’s new, but most importantly, what can bring something useful or revolutionary to your business.” —Chief data scientist, consulting firm, Mexico

• “Human actions are being captured like never before, [and] machine-generated data is only adding different dimensions to the same. It is important to visualize trends and derive meaningful insights for the business.” —IT specialist, Internet, Asia

• “Sound evaluation of ETMs assists in maintaining a competitive edge and learning what may be on the horizon to assist in differentiation.” —CIO, healthcare, United States

Users’ priorities for ETMs involve disruption, competitiveness, DW modernization, and supporting new business requirements

• “New business operating models are emerging, and they cannot be implemented without new technologies. The more data you can access and the faster you can process it, the better services you can offer [for customers].” —IT specialist, insurance, Europe

• “They are important because they fulfill a need not provided by other tools.” —Digital transformation executive, logistics, United States

Figure 3. Based on 364 respondents.

EXPERT COMMENT

BEYOND REPORTS, OLAP, AND DASHBOARDS: USERS WANT ADDITIONAL VALUE FROM DATA

Mark Madsen, president of Third Nature, and Steve Dine, president and founder of Datasource Consulting LLC, have for several years developed and delivered TDWI training courses on the use of emerging technologies and methods (ETMs) for BI, analytics, and data warehousing. Madsen and Dine explain their position on ETMs this way:

Many BI organizations have already delivered and mastered standard reporting, OLAP analysis, and dashboards. Most BI data is used for basic monitoring, drill-down, and reporting purposes. Now, organizations are looking to more advanced analytic methods to increase the value they are getting from their data. This often requires a greater variety, velocity, and volume of data than is being processed today.

In addition, technology advances are changing the economics of information management and creating new ways to deal with old problems. Advances in hardware and software are reinventing BI and data management, allowing us to alter the cost structure and approaches for deploying information and insights to end users.

Hence, there is a growing awareness among business leaders that technical improvements have led to an explosion of new capabilities, and the BI group is the natural focal point for those requests. Statistical analytics, data visualization, and textual data mining are a few of the areas that organizations are starting to focus on to obtain more value from their data. Others involve agile development methods, governance, and self-service functions for data exploration and data preparation.

The State of Emerging Technologies and Methods

Are ETMs a Problem or an Opportunity?

Users wishing to adopt ETMs for BI, analytics, and DW face potential problems as well as potential opportunities. To get a sense of the balance between the two, this report’s survey asked: “In your organization, are ETMs considered mostly a problem or mostly an opportunity?” (See Figure 4.)

The vast majority of users surveyed (79%) consider ETMs an opportunity. As seen elsewhere in this report, experienced users say that ETMs are an opportunity to compete, evolve, perform, and address new business requirements.

Relatively few (21%) consider ETMs a problem. Common challenges (such as limited skills, budget, and executive support) don’t seem to keep many organizations from considering ETMs.

In your organization, are ETMs considered mostly a problem or mostly an opportunity?

Problem 21%

Opportunity 79%

Figure 4. Based on 374 respondents.

Emerging technologies and methods for BI, analytics, and DW are getting significant attention, but are user organizations actually using them today? To confirm the existence of ETMs, this report’s survey asked: “Does your organization currently have business application(s) in production or in development that you consider to be based on ETMs?” Roughly two-thirds of respondents answered yes (64%, Figure 5).

Does your organization currently have business application(s) in production or in development that you consider to be based on ETMs?

Yes 64%

No 36%

Figure 5. Based on 403 respondents.

Emerging technologies can be a business opportunity, but are rarely a problem

Two-thirds of users surveyed already have ETMs in development or production

Benefits of ETMs for BI, Analytics, and DW

Survey respondents are aware of the many benefits that ETMs for BI, analytics, and DW can offer an organization. (See Figure 6.)

ETMs help meet or beat the competition. In fact, competitive advantage (56%) is the benefit chosen more than any other in Figure 6. Users whom TDWI interviewed for this report regularly spoke of how having ETMs that competing firms lack can provide a competitive edge. For example, using real-time ETMs to review and approve new loans or new insurance policies faster than competitors helps to both get new customers and retain old ones. As another example, knowledge workers from logistics companies have spoken at TDWI conferences about how adding more sensors and devices to vehicles, shipping pallets, and other mobile assets (including employees) has enabled them to innovate with geospatial and near-time data, thereby remaining competitive.

ETMs extend BI, analytics, and DW. Advanced analytics is one of the hottest categories of ETMs today. Analytics drives better decision making, strategic (52%) as well as operational. ETMs that involve new customer channels (e.g., self-service apps, monitoring, social media) help organizations understand customers (23%), which contributes to customer retention and customer-base growth. Similarly, ETMs that are new data platforms (such as Hadoop, NoSQL databases, and columnar databases) can extend the life of a data warehouse, while the new generation of data visualization tools (which also support new approaches to data exploration and data preparation) has already taken many BI programs to a higher level of insight and agility.

ETM automation can improve process outcomes. When ETMs focus on BI, analytics, and DW practices, they provide new analytic ways to improve business performance (40%), drive operational efficiency (37%), and improve productivity (32%). In particular, ETMs that generate or handle data in real time or near time (e.g., stream analytics, in-database analytics, and in-memory functions) help organizations innovate by monitoring and responding quickly to a wide range of business entities, including customers, competitors, facilities, plants, business processes, traffic, the weather, and so on.

ETMs can be a positive response to change. Many emerging technologies, and most emerging methods, provide greater agility in development than previous generations. This is true of ETMs that support self-service, data exploration, and on-the-fly data preparation. These agile ETMs in turn accelerate an organization’s response to business change (43%).

ETMs spur innovation. From the above examples, you can see that ETMs promote innovation in general (37%), address new business needs (37%), and help people imagine new applications (16%).

Miscellaneous. A few respondents described additional benefits for ETMs, including how they help retain valuable employees by offering new learning experiences. A number of ETMs extend security with features for detecting and blocking unauthorized access to data, encrypting or masking data, and detecting fraud as revealed in data. Several respondents pointed out that ETMs help with “new stuff,” like finding new customers, designing new products, and developing new services that generate new revenue streams.

The top beneficiaries of ETMs for BI, analytics, and DW are competitiveness, decision support, business performance, and innovation

What are the leading benefits of embracing ETMs? Select five answers max

Competitive advantage 56%

Drive better strategic decision making 52%

Faster response to business change 43%

Improve business performance 40%

Address new business needs 37%

Drive operational efficiency 37%

Innovation, in general 37%

Improve productivity 32%

Drive IT and business collaboration 25%

Understand customers 23%

Learn new skills 20%

Reinvigorate both business and technology processes 20%

Monetize our data 17%

Imagine new applications 16%

More options to tap 8%

Other 3%

Figure 6. Based on 1,741 responses from 374 respondents; 4.7 responses per respondent, on average.
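The per-respondent average in the Figure 6 caption can be cross-checked from numbers printed in the report. The following is a minimal sketch in Python; the variable names are illustrative only, and the percentages are copied from Figure 6.

```python
# Cross-check of the Figure 6 caption using only numbers printed in the report.
responses = 1741
respondents = 374

# Average number of benefits selected per respondent.
avg_per_respondent = responses / respondents

# Each percentage is a share of respondents, so the percentages should
# sum to roughly 100 times the per-respondent average.
figure6_percentages = [56, 52, 43, 40, 37, 37, 37, 32, 25, 23, 20, 20, 17, 16, 8, 3]
total_percent = sum(figure6_percentages)

print(round(avg_per_respondent, 1))  # 4.7
print(total_percent)                 # 466
```

The small gap between 466% and the exact ratio comes from rounding each bar to a whole percentage.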

Barriers to Success with ETMs

ETMs have benefits, as we just saw. Yet, they also have barriers. (See Figure 7.)

The leading barrier to ETM adoption is the state of IT. The barrier may be IT’s inadequate technical skills or staffing (56%, Figure 7), or it could be IT’s failure to provide self-service tools for business users (21%). A number of respondents selected “Other” and mentioned similar complaints, citing a lack of “IT support and buy-in” and “IT infrastructure.” One respondent was diplomatic, pointing out that “IT and the analytics organization have completely different cultures.”

Budgetary issues are the second most common barrier to ETMs. More specifically, there is a “lack of budget” for ETMs (50%). Because ETMs tend to be new, there’s rarely a line item in a budget for them, but another reason may be that the people who want ETMs suffer an “inability to articulate value to budgetary decision makers” (36%). In response to that multiple-choice answer, one respondent selected “Other” and restated the problem as an “inability of budgetary decision makers to recognize the value of ETMs.” Another respondent said, “My industry is in financial crisis, so budget is always the first barrier.”

With new data and new tools, it takes time to establish best practices. This is especially true with risk, security, and compliance issues (43%). In a related vein, most users interviewed for this report spoke of the importance of adjusting older programs for data stewardship and data governance to accommodate new tools, practices, and data resulting from ETMs. Though this is hard work, interviewees didn’t see the adjustments as actual barriers.

The barriers to ETMs are IT itself and governance issues, plus a lack of skills, budgets, business value, and innovation

Like anything in IT, ETM adoption is unlikely without a business case. For example, “unclear business value” (41%) is a fairly common barrier. If there is “no business need” (18%), there will be “no executive support” (34%).

Innovation culture isn’t as common as it should be. Many organizations suffer from “stodgy mindsets” (32%), such that there is “no culture of innovation” (33%) for either IT or the business. Stodgy thinking aside, risk aversion by management is also lethal to ETM adoption and other innovations. One respondent said that, among his senior managers, “no one takes a risk for business results.”

What are the barriers to embracing ETMs? Select five answers max

Inadequate technical skills or staffing 56%

Lack of budget 50%

Risk, security, compliance issues 43%

Lack of time 42%

Unclear business value 41%

Inability to articulate value to budgetary decision makers 36%

No executive support 34%

No culture of innovation 33%

Stodgy mindsets 32%

Data integration 31%

Lack of self-service tools for business users 21%

Vendor lock-in 21%

No business need 18%

Other 8%

Figure 7. Based on 1,342 responses from 374 respondents; 3.6 responses per respondent, on average.

ETMs for Business Intelligence

Delivering “the right data to the right users at the right time” has been the mantra for BI since its early days. It remains the paramount objective for most organizations as they evaluate and adopt ETMs for BI. Just getting the right data is often the primary initial focus as organizations build up the infrastructure to extract, transform, and deliver data to users and try to improve data quality. Data delivery challenges are never entirely solved, especially as data grows more diverse and organizations seek to move toward real-time views of data. However, even as they continue to work on data delivery, many organizations are as much (if not more) focused on the user experience, that is, applying BI and visual analytics tools, applications, and methods to enhance users’ interactions with data.

Certainly, users themselves are highly interested in improving their experience, as evidenced by business-driven growth in the deployment of self-service BI and analytics tools and applications. Line-of-business (LOB) units and departmental functions such as marketing are showing greater willingness to look for solutions outside the officially sanctioned enterprise BI standard, if the organization has one. This is contributing to turbulence in the BI marketplace, with newer self-service data discovery and visual analytics technologies vying with more established enterprise BI solutions. In our research, we found that only a little over a third (36%) of research participants said that their organizations have standardized on one vendor or solution for BI, analytics, and data warehousing (64% have not standardized).

Thus, improving users’ BI experiences toward greater relevance, more actionable presentation, and easier analytics functionality is an imperative in many organizations, and not just for business analysts, data analysts, and power users. Many organizations want to “democratize” BI so that more users can benefit. Firms want to move away from guesswork and toward data-driven decision making at all levels, from corporate leadership and LOB management to frontline operations and functions such as marketing, sales, service, and fulfillment, where employee actions are critical to attracting and satisfying customers. Static reports that are out of sync with the types of business decisions and challenges users face will no longer suffice, if they ever did. As users grow more dependent on data for daily decisions, they need tools and applications that support continuous data interaction plus the ability to share insights easily with colleagues.

BI applications and systems therefore need ETMs that enable them to be flexible to satisfy a broad spectrum of users and new use cases. This stretches from those who need to quickly consume reports, dashboards, and scorecards to track and lightly analyze key performance indicators to those who can apply more advanced analytics to gain higher value from data—and make more informed decisions. Their success hinges on engaging in deeper, more personalized exploratory and root cause analysis. Flexibility must extend to platform diversity as well; more users are growing accustomed to working on mobile devices while on the go. Mobile device deployment must be part of the plan, not merely an afterthought for many BI and analytics applications.

Only 36% have standardized on one vendor or solution for BI, analytics, and DW

New Technology Adoption and Satisfaction

As users’ data requirements grow, they can become less satisfied with existing BI and analytics capabilities. This puts pressure on organizations to adopt new technologies and upgrade to the most current versions of their software. Our research finds that users are moderately satisfied with the rate of new technology adoption for their organizations’ BI, analytics, and data warehousing (DW) systems. Just 10% of research participants said that users in their organizations were very satisfied while 46% said they were somewhat satisfied; 41% are unsatisfied (3% didn’t know). It’s clear that a sizeable percentage of users would like to see faster adoption of new technologies.

Figure 8 shows research participant organizations’ level of currency with installed releases for their BI, analytics, data integration, and DW systems. We can see that most participants are not current with any of the applications and systems listed; the highest percentage (31%) is current with BI tools, followed by data security systems (27%). This circumstance is not uncommon, particularly with large, complex, and expensive systems such as DW database platforms, where 23% are at the current release but 57% are at least one release behind. Moving to new releases of these systems can require significant work and planning.

However, we can see that most research participants are also at least one release behind with users’ software such as visual data discovery tools (23% at the most current release and 50% at least one release behind). Many organizations prefer to wait to upgrade or will engage in lengthy testing cycles to ensure that new releases perform well and significant bugs are remedied. The downside of waiting is that organizations do not immediately gain the benefits of emerging technologies offered in the latest releases.
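The “at least one release behind” figures quoted above combine two Figure 8 categories. The following is a minimal sketch of that arithmetic in Python; the dictionary keys and the function name are illustrative assumptions, and the percentages are taken from Figure 8.

```python
# Shares of respondents (percent) from Figure 8 for two of the listed systems.
releases = {
    "Data warehouse database platform": {
        "current": 23, "one_behind": 31, "more_than_one_behind": 26,
    },
    "Visual data discovery tools": {
        "current": 23, "one_behind": 25, "more_than_one_behind": 25,
    },
}

def at_least_one_behind(row):
    """Combine the two 'behind' categories into a single share."""
    return row["one_behind"] + row["more_than_one_behind"]

lagging = {name: at_least_one_behind(row) for name, row in releases.items()}
print(lagging)
```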

How current are the installed releases for the following software used by your organization for its BI, analytics, data integration, and data warehousing systems? Select one answer per row

Most current release | One release behind | More than one release behind | SaaS provider manages releases | Not applicable | Don’t know

BI tools 31% 32% 22% 2% 5% 8%

Data security 27% 22% 15% 2% 6% 28%

Data warehouse database platform 23% 31% 26% 2% 7% 11%

ETL and data integration systems 23% 26% 29% 2% 7% 13%

Visual data discovery tools 23% 25% 25% 2% 15% 10%

Analytics (e.g., data mining) tools 19% 26% 28% 2% 12% 13%

Self-service data preparation tools 16% 22% 24% 2% 21% 15%

Data quality and profiling tools 15% 18% 26% 2% 23% 16%

Packaged analytic or data warehouse appliance 14% 17% 21% 1% 29% 18%

Complex event processing software 7% 13% 16% 3% 33% 28%

Figure 8. Based on 394 respondents.

A sizeable percentage of users would like to see faster adoption of new technologies

Organizations are seeking to accelerate the pace of development. Users would also like to see more rapid development and deployment of new BI and analytics applications as well as activation of new features for their existing applications, according to our research. More than half (55%) of research participants said that users in their organizations are dissatisfied with the amount of time development and deployment is taking, with 23% not satisfied at all. Significantly fewer (42%) are satisfied (3% don’t know). The results speak both to the length of time it traditionally takes to gather requirements and develop finished BI and analytics applications (as well as underlying DW systems) and to users’ impatience with the process.

To increase development speed and deliver value sooner, many organizations are implementing agile software development methods for BI, analytics, and DW projects. These methods offer an alternative to traditional waterfall methods and cycles. Agile methods aim at closer collaboration between users and IT developers; they propose iterative cycles to deliver value to the business incrementally rather than only at the end of full waterfall cycles. Although agile methods are not the answer for every type of project, our interview research finds that many organizations that have implemented agile methods (or even less strict, “agile-inspired” methods) exhibit closer ongoing collaboration between business users and developers that results in greater satisfaction with project results.

Self-Service BI: Power to the Users

In recent years, we have witnessed a steady advance in technologies that enable users to do more data access, analysis, transformation, and sharing without having to wait for IT developers to do it for them. Our past research has consistently found that reducing dependence on IT and increasing users’ self-reliance with BI and analytics are top priorities for most organizations. ETMs that support self-service BI, visual analytics, and data discovery provide users with capabilities for moving beyond canned reports and spreadsheets to explore and interact with data and build visualizations mostly on their own.

Research participants for this report indicated that users in their organizations are moderately successful with performing BI and analytics functions in a self-directed fashion, without close IT support; 45% are somewhat successful, 9% are very successful, and 40% are either somewhat unsuccessful or not at all successful (6% don’t know). Having adequate training is a key factor in increasing the level of success. Users need to learn both how to use the tool or application and how to work more extensively with data and visualizations than they may have previously. IT can play a vital role in managing users’ transition to self-service BI and analytics by facilitating training in these areas.

Self-Service Data Preparation ETMs

Although users are taking greater control of front-end BI and data discovery, TDWI research finds that in most organizations, IT remains primarily responsible for back-end data preparation, which can include data cleansing and enrichment, consolidation of records, and the creation of calculated fields, aggregations, and dimensions. IT also remains primarily responsible for data extraction, transformation, and loading (ETL). Yet, as users seek to go beyond standard BI reports and dashboards and want to perform more self-service data discovery and analytics, traditional processes for preparing data are under pressure.
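To make the back-end preparation steps concrete, the following is a minimal sketch in plain Python of cleansing, a calculated field, and consolidation with aggregation. The records, field names, and helper functions are hypothetical and chosen only for illustration; production data preparation would normally run in an ETL or data preparation tool.

```python
from collections import defaultdict

# Hypothetical raw records, as a business user might receive them.
raw = [
    {"customer": " Acme Corp ", "region": "east", "units": "10", "unit_price": "2.50"},
    {"customer": "acme corp", "region": "East", "units": "5", "unit_price": "2.50"},
    {"customer": "Globex", "region": "WEST", "units": "3", "unit_price": "4.00"},
]

def cleanse(record):
    """Data cleansing: trim whitespace, normalize case, convert types."""
    return {
        "customer": record["customer"].strip().title(),
        "region": record["region"].strip().title(),
        "units": int(record["units"]),
        "unit_price": float(record["unit_price"]),
    }

def add_revenue(record):
    """Calculated field: revenue = units * unit_price."""
    return {**record, "revenue": record["units"] * record["unit_price"]}

prepared = [add_revenue(cleanse(r)) for r in raw]

# Consolidation and aggregation: one row per (customer, region), measures summed.
totals = defaultdict(lambda: {"units": 0, "revenue": 0.0})
for row in prepared:
    key = (row["customer"], row["region"])
    totals[key]["units"] += row["units"]
    totals[key]["revenue"] += row["revenue"]

for (customer, region), measures in sorted(totals.items()):
    print(customer, region, measures["units"], measures["revenue"])
```

The two differently formatted Acme records collapse into one consolidated row only because cleansing runs before grouping.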

Agile method adoption is one of the strongest trends today in BI, analytics, and DW

Users are moderately successful with self-service for BI and analytics, with minimal IT intervention

One of the most important emerging technologies today is self-service data preparation. These tools (or cloud-based services) essentially aim at building IT’s data preparation and integration intelligence into software. Vendors are applying analytics and machine learning to automate data preparation steps and enable systems to learn about the data sources (including finding relevant data relationships and anomalies) as well as learn users’ data preferences over time. Some self-service BI, visual analytics, and data discovery tools are beginning to provide these capabilities themselves or are embedding those provided by specialized self-service data preparation vendors.

Many of these emerging technologies are geared to handle more ad hoc, on-the-fly data preparation needs for analytics than traditional ETL for BI reporting. Many also specialize in accessing, integrating, and preparing unstructured data stored in Hadoop clusters (e.g., in data lakes).

New terms in the industry are being applied to describe self-service data preparation and integration, including data blending, data munging, and data wrangling. Although vendors use the terms somewhat differently, they generally signify easier and faster data preparation and integration of a wide range of sources, usually through automated processes driven by advanced analytics. To hide the complexity of selecting, blending, and accessing data sources, the tools enable users to work with graphical icons rather than code to perform data mashups, set filters, or create custom data blends for their immediate analytics needs.
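As an illustration of what a data blend or mashup does behind the graphical icons, the following is a minimal sketch that joins two differently formatted sources on a shared key and then applies a filter. The sample data, field names, and the choice of a left join are assumptions made for this example, not a description of any vendor’s tool.

```python
import csv
import io
import json

# Hypothetical source 1: a CSV export from a relational system.
crm_csv = "customer_id,name,segment\n1,Acme Corp,Enterprise\n2,Globex,Midmarket\n"

# Hypothetical source 2: JSON events, such as might land in a data lake.
events_json = (
    '[{"customer_id": 1, "event": "login"},'
    ' {"customer_id": 1, "event": "purchase"},'
    ' {"customer_id": 3, "event": "login"}]'
)

customers = {
    int(row["customer_id"]): row for row in csv.DictReader(io.StringIO(crm_csv))
}
events = json.loads(events_json)

# Blend: attach customer attributes to each event (a left join on customer_id).
blended = []
for event in events:
    customer = customers.get(event["customer_id"])
    blended.append({
        "customer_id": event["customer_id"],
        "event": event["event"],
        "name": customer["name"] if customer else None,
        "segment": customer["segment"] if customer else None,
    })

# Filter: keep only events that matched a known customer.
matched = [row for row in blended if row["name"] is not None]
print(len(blended), len(matched))
```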

Our research finds that users are already having some success with formatting and cleaning data for BI and analytics in a self-directed fashion without close IT support. Half (50%) of research participants said that users in their organizations are successful with data preparation steps, although only 7% are very successful; 44% are unsuccessful (6% didn’t know). Given the growing number of data sources and the complexity of preparing them for BI and analytics, it is unlikely that users will ever be able to do all of their own data preparation; however, the technologies could help by offloading tasks to users, reducing IT’s backlog, and giving users the capabilities to prepare unusual data sources on their own.

USER STORY BUNCOMBE COUNTY REDUCES DATA CHAOS AND CAPTURES INSTITUTIONAL KNOWLEDGE WITH BI

Like many state and local governments, Buncombe County, North Carolina, is focused on becoming more data-driven in its decision making. With data and users spread across 21 departments and services, the county is implementing BI and data visualization tools and a BI/data warehousing infrastructure to reduce data chaos and establish single views of key types of information so that users can tap quality data for their decisions.

There is also another, more human urgency behind the effort: “We are in urgent need of effective planning for the workforce of the future; they tend to be more data-driven,” said Michael Greene, business intelligence manager with the county, which is seated in Asheville. “We would like to ensure any institutional knowledge is captured and allow for the data to enforce and inform strategic decision making for both the current and new generation of the workforce and their staff well into the future,” he said.

Although the county is “not on the bleeding edge,” Greene said, his BI team “has an active eye out at all times” for ETMs that could help drive greater operational efficiency, improve productivity and performance, and spur creative ideas for data-driven applications. Expansion in self-service BI and data visualization has been a plus, enabling users to share charts easily on Microsoft SharePoint to communicate data points. “It’s much better than trying to read a big report,” observed Greene. “Users can look at something and they all understand it.”

Modern tools improve the user experience of data professionals, with graphical icons instead of heavy coding or modeling

Self-service data discovery, visualization, and dashboard authoring are most important. In Figure 9, we can see how research participants view the relative importance of various technology systems or capabilities for enabling users in their organizations to achieve objectives with BI and analytics in a self-directed fashion without close IT oversight. The largest percentages indicated that the most important were self-service data discovery (47% said very important and 32% said somewhat important), visualization for query and analysis (44% and 34%), and self-service dashboard authoring (43% and 33%). These capabilities are core to self-service BI and data discovery. Somewhat fewer participants said self-service data preparation was important (28% and 40%), which may indicate IT’s current preeminence in this area.

How important are the following technology systems or capabilities to enabling users in your organization to achieve their objectives with BI and analytics in a self-directed fashion, without close IT oversight? Select one answer per row

Very important | Somewhat important | Somewhat unimportant | Not important | Don’t know

Self-service data discovery 47% 32% 6% 4% 11%

Visualization for query and analysis 44% 34% 6% 6% 10%

Self-service dashboard authoring 43% 33% 9% 5% 10%

Search for exploring BI reports and tables 38% 36% 9% 6% 11%

Analytic application platform 35% 34% 9% 8% 14%

Self-service data preparation (e.g., data blending) 28% 40% 13% 7% 12%

Analytic appliances 27% 32% 9% 16% 16%

Self-service data mapping and transformation 27% 35% 15% 10% 13%

In-memory computing/data grids 23% 28% 17% 13% 19%

Storytelling and collaboration 23% 36% 14% 12% 15%

Cloud-based BI and analytics 21% 25% 18% 20% 16%

Software-as-a-service tools 21% 29% 17% 17% 16%

Figure 9. Based on 320 respondents.

Figure 9 shows that the smallest percentages of research participants regard cloud-based BI and analytics and software-as-a-service (SaaS) as very or somewhat important. This reflects both immaturity in deploying these types of offerings for BI and analytics as well as continuing concern about security and governance of sensitive data and analytics. Cloud and SaaS alternatives for BI and analytics can provide flexibility to meet dynamic needs, support mobile or widely distributed users, and scale up without having to build infrastructure on premises. However, organizations need to be assured that key data governance and security priorities are met before they adopt cloud and SaaS more fully.
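The ranking described here can be reproduced by adding the “very important” and “somewhat important” shares from Figure 9. The following is a minimal sketch in Python; the variable names are illustrative, and the numbers are copied from the figure.

```python
# (very important, somewhat important) percentages from Figure 9.
figure9 = {
    "Self-service data discovery": (47, 32),
    "Visualization for query and analysis": (44, 34),
    "Self-service dashboard authoring": (43, 33),
    "Search for exploring BI reports and tables": (38, 36),
    "Analytic application platform": (35, 34),
    "Self-service data preparation": (28, 40),
    "Analytic appliances": (27, 32),
    "Self-service data mapping and transformation": (27, 35),
    "In-memory computing/data grids": (23, 28),
    "Storytelling and collaboration": (23, 36),
    "Cloud-based BI and analytics": (21, 25),
    "Software-as-a-service tools": (21, 29),
}

# Combined importance = very important + somewhat important.
combined = {name: very + somewhat for name, (very, somewhat) in figure9.items()}
ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)

print(ranked[0])    # ('Self-service data discovery', 79)
print(ranked[-2:])  # [('Software-as-a-service tools', 50), ('Cloud-based BI and analytics', 46)]
```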

Hadoop Access from BI and Analytics Tools

Open source technologies are a major phenomenon in data management and the source of many ETMs. Apache Hadoop, an open source framework for distributed storage and processing, is the center of an ecosystem that has spawned many related and sometimes competitive technologies. Some of the largest data lakes built on Hadoop clusters store petabytes of data, usually of multiple types. NoSQL (loosely defined as “Not only SQL”) is an umbrella term for technologies that specialize in non-relational storage and retrieval of data. These include key-value stores, graph databases, content or document database systems, and non-relational columnar databases. With these technologies, users can gain new perspectives on data relationships, examine context around BI reports and key performance indicators (KPIs), and search and analyze unstructured big data generated by customer behavior, social media, and more.

Although Hadoop and NoSQL ETMs are important for organizations to consider deploying to store and analyze huge volumes of multi-structured data, they have presented challenges in terms of accessing them from BI, data discovery, and visual analytics tools and applications that are built to work with relational data structures. Connecting these tools and applications to Hadoop and NoSQL data sources has typically required specialized development of custom MapReduce code and scripts and, even if commercial software is used, customization for queries as well as the ETL routines needed for each source.

These difficulties are beginning to lift, however, with recent releases of BI, data discovery, and visual analytics tools, plus the vendors’ partnerships with Hadoop distribution providers. Together, they are offering more standardized, built-in connectivity, most often through Apache Hive ODBC drivers and the distribution providers’ commercial SQL-on-Hadoop tools and connectors.

Alternatives such as the Spark SQL library, which is part of the core Apache Spark framework, are enabling developers to write interactive SQL queries against Hadoop as well as streaming, real-time data feeds and other sources. Plus, new BI and analytics tool capabilities are maturing in the marketplace that let users be less restricted to relational structures and more able to use search, text analytics, and other querying strategies to gain insight into multi-structured data in Hadoop and NoSQL systems.
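The contrast between hand-written MapReduce code and a SQL interface can be sketched with a small aggregation written both ways. This is an illustration under loud assumptions: the data is invented, the map and reduce steps run in a single Python process rather than on a cluster, and Python’s built-in sqlite3 module merely stands in for a SQL-on-Hadoop endpoint such as Hive or Spark SQL.

```python
import sqlite3
from collections import defaultdict

# Hypothetical click events; in practice these would be files in a Hadoop cluster.
events = [
    ("home", 1), ("pricing", 1), ("home", 1),
    ("docs", 1), ("home", 1), ("pricing", 1),
]

# Approach 1: hand-written map and reduce steps.
def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for page, count in records:
        yield page, count

def reduce_phase(pairs):
    """Group pairs by key and sum the values for each key."""
    grouped = defaultdict(int)
    for key, value in pairs:
        grouped[key] += value
    return dict(grouped)

mapreduce_result = reduce_phase(map_phase(events))

# Approach 2: the same aggregation as a declarative SQL query.
# sqlite3 is only a stand-in here for a SQL-on-Hadoop engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (page TEXT, hits INTEGER)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)", events)
sql_result = dict(conn.execute("SELECT page, SUM(hits) FROM clicks GROUP BY page"))
conn.close()

print(mapreduce_result == sql_result)  # True
```

The SQL version states what result is wanted and leaves the execution plan to the engine, which is what lets BI tools built for relational sources reuse their existing query generation.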

Figure 10 offers insight into which technologies research participants’ users are currently implementing or planning to implement to access Hadoop files from BI and analytics tools. We can see that for many of the technologies, current implementation is low; this is likely because despite the hype around Hadoop, many organizations do not have Hadoop files, do not see a reason to connect those they have to BI and analytics tools, or are not mature in connecting them.

Commercial SQL DBMS access tools are the most prevalent (39% currently implementing and 15% planning to implement), which is not surprising because most organizations have existing investments in this technology and may expect that reaching out to these new sources is best accomplished with these tools. Columnar or other new relational DBMS type is next highest (22% currently implementing and 20% planning to implement), which indicates that some organizations could be using (or planning to use) Hadoop as a data landing zone for ETL and then employing a columnar database to support more efficient analytics.
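One reason a columnar database can support more efficient analytics is that an aggregate over one field reads only that field’s values. The following is a simplified sketch in Python with invented data; real columnar engines add compression, vectorized execution, and on-disk layouts that are not modeled here.

```python
# Hypothetical sales records in a row-oriented layout: one tuple per record.
rows = [
    ("2015-10-01", "east", 120.0),
    ("2015-10-01", "west", 80.0),
    ("2015-10-02", "east", 200.0),
]

# The same data in a column-oriented layout: one list per column.
columns = {
    "date": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

# Row layout: summing one field still walks every full record.
total_from_rows = sum(record[2] for record in rows)

# Column layout: the query touches only the single column it needs.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)  # 400.0 400.0
```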

Hadoop and NoSQL data sources have presented challenges in terms of accessing them from tools and applications built for relational data structures

Common starter use cases for Hadoop include ETL data staging and SQL-based analytics with a columnar database

Which of the following technologies are currently being implemented or planned to be implemented by users in your organization for accessing Hadoop files from BI and analytics tools? Select one answer per row

Current Planned No plans Don’t know

Apache Drill 3% 11% 47% 39%

Apache Flume 5% 11% 45% 39%

Apache Hive 12% 17% 40% 31%

Apache Spark (Spark SQL) 9% 18% 38% 35%

Apache Storm 4% 13% 43% 40%

Apache Tez (Hive on Tez) 3% 10% 45% 42%

Commercial BI- or SQL-on-Hadoop engine 16% 21% 35% 28%

Commercial SQL DBMS access tool 39% 15% 24% 22%

Data virtualization server 19% 19% 33% 29%

Kafka 4% 11% 42% 43%

MapReduce 17% 20% 35% 28%

Presto open source SQL query engine 5% 8% 46% 41%

Homegrown tool 18% 8% 42% 32%

Search tool 19% 20% 31% 30%

Self-service data preparation tool 19% 27% 27% 27%

Columnar or other new relational DBMS type 22% 20% 33% 25%

NoSQL DBMS 16% 20% 36% 28%

Figure 10. Based on 320 respondents.

The highest planned technology implementation shown in Figure 10 is self-service data preparation (27%), which shows the importance of this emerging technology discussed earlier. The highest “no plans” percentages generally were for newer Apache open source projects such as Drill (47%), Flume (45%), and Tez (45%), as well as Presto (46%), an open source distributed SQL query engine developed at Facebook. Nonetheless, all of these technologies bear examination by organizations that are investing, or planning to invest, in BI querying and analytics on Hadoop systems, because they are part of the rapid evolution of emerging technologies and could offer better solutions than existing means depending on the data architecture and specific querying or analytics needs.

Of course, many are so new that unless organizations have access to developers experienced in programming with these open source frameworks and technologies, they will likely need to wait for tooling that can automate tasks and allow developers and users to work with the frameworks and code at a higher level.

New tools and methods for data prep and interactive queries are coming on strong

USER STORY PRIMATICS SEEKS ETMS FOR CREATING NEW SPEED AND FLEXIBILITY ADVANTAGES

Financial institutions eat, sleep, and breathe risk. It is crucial to know the difference between loans that will perform well and contribute positively to the business versus loans that might default and cause a negative impact. Yet, there’s not much time for consideration because to gain and keep the best loans, time is of the essence: loans must be approved and processed quickly, no matter what sort of volume or complexity the institution is facing. At the same time, firms must adhere to a nest of complex and changing regulatory rules and reporting practices throughout their risk and finance processes.

Primatics provides EVOLV, a software platform and cloud offering with integrated solutions for automating and managing financial institutions’ risk and finance processes as well as supporting a variety of reporting and analytics. EVOLV uses sophisticated risk analytics. Cary Moore, director of BI at Primatics, said the firm actively scouts for ETMs that “provide a fundamentally better way of achieving desired goals, especially to meet emerging customer needs.” Of high interest is technology to reduce delays and increase flexibility in support of data flows that fuel complex analytics for evaluating loan variables and applying them to models.

The company is already using self-service BI and data preparation for dashboards, scorecards, and operational intelligence activities such as continuous alerting. It is looking at greater use of in-memory computing for analytics, including platforms that provide real-time processing in the cloud to support Primatics’ growing cloud-based offerings. Moore said the company is also looking at exploiting Hadoop data lakes for storing raw data and keeping each data source’s structure intact so that it is easier to continue receiving data from those sources. To further support real-time computation and faster transformation, the company is also looking at ETMs that employ Apache Pig for ETL and Storm for rules processing.

Mobile and Embedded BI: Potential for BI ETMs

The continuing spread of mobile devices is breaking open new use cases and new technologies for innovation in BI and analytics. According to our research, tablets are the dominant device type that organizations plan to support for BI functionality; 82% of research participants indicated that this was the preferred platform, with 74% indicating smartphones (note that participants could select more than one answer). In our interview research, we found that many organizations would prefer to make mobile BI and analytics part of their overall BI strategy rather than have users download apps in a haphazard fashion. This is in part because organizations would like to provide users with a seamless experience across PCs and mobile devices.

Challenges in providing BI and analytics on mobile devices continue to hold back broader deployment, with the top challenges being concerns about data security and authentication. Organizations need to be able to determine who is accessing the data and be able to shut down access if a device is in the wrong hands or is located outside the allowable area (e.g., within a store or warehouse) for its use.
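The access rule described above (know who is using the device, and cut off access if the device is compromised or outside its allowable area) can be sketched as a simple policy check. Everything in this example is a hypothetical illustration: the `Device` fields, the `may_access` function, the coordinates, and the radius are assumptions, and a real deployment would rely on a mobile device management or identity platform rather than hand-rolled code.

```python
import math
from dataclasses import dataclass

@dataclass
class Device:
    device_id: str
    user: str
    revoked: bool      # set when a device is reported lost or stolen
    latitude: float
    longitude: float

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

def may_access(device, site_lat, site_lon, allowed_radius_km):
    """Allow access only for non-revoked devices inside the allowed area."""
    if device.revoked:
        return False
    distance = distance_km(device.latitude, device.longitude, site_lat, site_lon)
    return distance <= allowed_radius_km

# Hypothetical warehouse location and a 0.5 km allowable area.
SITE = (35.5951, -82.5515)
in_store = Device("d1", "alice", False, 35.5951, -82.5515)
lost = Device("d2", "bob", True, 35.5951, -82.5515)
far_away = Device("d3", "carol", False, 36.5951, -82.5515)

for device in (in_store, lost, far_away):
    print(device.device_id, may_access(device, SITE[0], SITE[1], 0.5))
```

Checking the revocation flag before the location keeps a lost device locked out even if it is still physically on site.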

From a design perspective, challenges include adapting to smaller viewing spaces and dealing with limits on what works in terms of data interaction. Designers and developers also need to take advantage of native device functionality, while at the same time effectively using HTML5 and other Web application standards to enable a “develop once, deploy anywhere” strategy. Finally, query performance can be a challenge because IT cannot control variable bandwidth and transmission strength for remote users. Expectations for query performance and frequency of data updates must be set and communicated clearly.

Organizations need to be able to determine who is accessing the data and be able to shut down access if a mobile device is in the wrong hands

In Figure 11, we can see the types of BI functionality that users at research participants’ organizations are currently implementing or planning to implement. The highest percentage of participants (45%) said that dashboards are currently being implemented; 34% said they plan to implement them. Dashboards, typically the first visualization that organizations deploy onto mobile devices, are also prevalent on PCs, which means that the challenge for organizations is to give users a cohesive experience as they move between dashboards on the two platforms. Metrics and KPIs had the second-highest percentage, with 38% of participants currently implementing them and 33% planning to do so. This indicates that performance management and monitoring of KPIs is a priority use case to fulfill on mobile devices.

More than a quarter (28%) of participants said that users are currently performing report authoring, filtering, and distribution on mobile devices (27% plan to do this), which shows advancement in maturity. Not many participants (16%) said that their users are employing mobile devices for in-memory BI and OLAP, however, which could indicate that form factors, security concerns, and the kinds of use cases for mobile are collectively not very conducive to deeper analysis on devices.

Which of the following types of BI functionality are users in your organization currently implementing on mobile devices, and which ones are planned for implementation, if any? Select one answer per row

| BI functionality | Current | Planned | No plans | Don’t know |
|---|---|---|---|---|
| Dashboards | 45% | 34% | 11% | 10% |
| Metrics and KPIs | 38% | 33% | 15% | 14% |
| Data visualization (e.g., graphs, heat maps, etc.) | 35% | 37% | 16% | 12% |
| Alerts and activity monitoring | 32% | 36% | 18% | 14% |
| Ad hoc querying and reporting | 30% | 25% | 32% | 13% |
| Spreadsheet functionality | 30% | 19% | 31% | 20% |
| Drill-down and slice-and-dice | 29% | 33% | 22% | 16% |
| Report authoring, filtering, distribution | 28% | 27% | 30% | 15% |
| Security protocol enforcement | 23% | 31% | 25% | 21% |
| Data exploration and discovery | 22% | 30% | 34% | 14% |
| Native mobile device functionality | 21% | 24% | 31% | 24% |
| BI embedded in other apps | 17% | 32% | 32% | 19% |
| In-memory BI/OLAP | 16% | 23% | 39% | 22% |
| Storytelling features | 15% | 26% | 34% | 25% |

Figure 11. Based on 311 respondents.

Importance of Embedded BI and Analytics

Often the best way for users to interact with data, whether on mobile devices or PCs, is not through dedicated BI and analytics applications but through other applications or services that they use for managing operations and business processes or inside customer relationship management and engagement systems. Rather than switch to separate applications, users find value in having BI and analytics tightly integrated with point applications that are dedicated to their roles and responsibilities. Our research bears this out. Overwhelmingly, research participants said that it is important to their organization’s BI and analytics strategy to embed dashboards, reports, or other functionality within existing applications, portals, or processes; 45% said it was very important and 40% said it was somewhat important.

Objectives for real-time, actionable intelligence can be furthered by having dashboards and analytics functionality embedded in operational processes and applications. If these are not enhanced by BI and analytics, organizations run the risk of having operational processes that are uncoordinated with KPIs and other metrics and that lack a steady supply of the latest information and insights gleaned from the organization’s data.

Using BI ETMs to Monetize Data Assets

A strategic objective for many organizations is not only to improve management and execution through use of leading-edge BI and analytics technology and methods, but also to develop revenue-generating, information-based products and services. Both business-to-business (B2B) and business-to-customer (B2C) relationships can be enhanced when organizations tap their data assets to develop dashboards, visualizations, and analytics that communicate helpful insights. Financial services firms, for example, have been successful in packaging information for clients or even providing access to selected data sources for clients to run their own analytics. Analytics that correlate sales data with location data can enable retailers to work more effectively with partners to ship the right quantities of products to stores where demand is greatest. Mobile apps provide a myriad of opportunities for organizations to package data insights for consumers and business partners.

Our research finds that monetizing data assets or creating data-as-a-service solutions for customers, partners, or clients is not yet an active strategy for most research participants’ organizations. (See Figure 12.) Just 18% said that their organizations currently monetize some data assets, with 28% indicating that they plan to do so. The largest percentage (42%) said that their organizations have no plans to monetize data assets, and 12% didn’t know.

Does your organization currently employ BI and analytics to “monetize” data assets or create data-as-a-service solutions for customers, partners, or clients? Does your organization plan to do so?

Yes, we currently monetize some data assets 18%

We plan to do so 28%

No plans 42%

Don’t know 12%

Figure 12. Based on 311 respondents.

For most organizations, monetization of data assets is not something they are considering if they are focused more inwardly on how they can use BI and analytics to improve employees’ ability to manage costs, operations, supply chains, and customer relationships. However, with information now a competitive asset, if not a weapon, organizations should consider how ETMs for BI and analytics could open up opportunities for revenue generation through information-based products and services.

ETMs for Analytics

Evolving Data Sets and Analytics

As more kinds of data are created (some with emerging technologies!), it is only natural to expect that emerging technologies will also evolve to help analyze and utilize this data. Although some “emerging” analytics technologies have been available for years, others are newer in scope. There have been advances in anomaly detection, pattern learning, signal analysis, text analysis, and more. Some of the older algorithms are being refactored, for example, to deal with big data. These analytics technologies are being used to extract insights and gain value from unstructured text data, as well as structured data, real-time data, and other multi-structured data types. There are exciting times ahead for organizations using emerging analytics techniques and technologies. Here is a sampling of some of these technologies:

Text analytics. Text analytics is the process of analyzing unstructured text, extracting relevant information, and transforming it into structured information that can be leveraged in various ways. Text analytics can be used on a range of text from e-mail messages to social media to understand the “why” behind the “what.” For instance, if a customer discontinues a service, text analytics can help to understand the reasons for the action. Were they unhappy? Why? Text analytics often uses some form of natural language processing (NLP), statistics, or other math-based techniques. It has been available for a number of years but is gaining more attention as organizations realize the value of the insight text can provide.

Operational intelligence (OI). This analytics technique involves using query analysis or other more advanced analytics against continuous, potentially real-time or near-real-time data to gain visibility into operations. OI can go hand in hand with the Internet of Things (IoT). OI could be used to evaluate complex events in oil well operations, for instance, or in a manufacturing operation. In some ways, OI is an evolution of event processing and complex event processing techniques.

Machine learning. The term machine learning was coined by Arthur Samuel back in the 1950s in referring to computers that have the “ability to learn without being explicitly programmed.” The essence of machine learning is a machine learning from examples. The machine can be given data in either a supervised approach (where a person is involved, giving the machine a set of data with known outcomes) or an unsupervised approach. Machine learning is becoming popular, and the algorithms are evolving as organizations want to find patterns in data that is more multi-structured, large, and complex.

Stream analytics. Stream analytics, or mining, is used to analyze data that arrives continuously, often as a sequence of instances. This might include sensor data (or other machine-generated IoT data), social media data, traffic feeds, or financial data. Often, this data needs to be processed immediately or stored for offline analysis and model development. The models are then used against the data stream to monitor and score new data as it flows through the models. Newer technologies allow for processing of data quality and/or model development within the stream as opposed to offline.

Cognitive computing. Cognitive computing is a problem-solving approach that uses hardware or software to approximate the form or function of natural cognitive processes.2 It often works from a large corpus of text data and uses machine learning and NLP techniques. It typically involves some sort of human-machine interaction. Cognitive computing is being used in healthcare to help oncologists make recommendations for treatment. It is being used to predict the likelihood of terrorist attacks from social media chatter. The use cases are wide and growing.

Deep learning. Deep learning uses sophisticated layers of abstractions and parallelism to learn and model data similar to how the brain functions. It is being used in many of the emerging applications where there is complex data from many different sources that needs to be analyzed at great speed. This would include applications in security and drug discovery.
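The stream analytics pattern described in this list (develop a model offline, then score new records as they arrive) can be sketched as follows. This is a deliberately simple stand-in rather than a technique named in the report: the "model" is only a mean and standard deviation fit on historical sensor readings, and the names `fit_baseline` and `score_stream`, the sample readings, and the threshold of 3 standard deviations are assumptions for illustration.

```python
import statistics


def fit_baseline(history):
    """Offline step: summarize historical readings as (mean, std dev)."""
    return statistics.mean(history), statistics.stdev(history)


def score_stream(readings, model, threshold=3.0):
    """Online step: score each reading as it arrives.

    Yields (reading, z_score, is_anomaly) one record at a time, so the
    input can be an unbounded generator rather than a list.
    """
    mean, std = model
    for value in readings:
        z = (value - mean) / std
        yield value, z, abs(z) > threshold


# Assumed historical sensor data used for offline model development.
history = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7]
model = fit_baseline(history)

# Simulated live stream; in practice this might be a message queue or socket.
live = iter([20.0, 20.4, 25.0, 19.9])
for value, z, is_anomaly in score_stream(live, model):
    print(f"{value:5.1f}  z={z:6.2f}  anomaly={is_anomaly}")
```

Because `score_stream` is a generator, each record is scored as soon as it is read, which mirrors the "score new data as it flows through the models" step; only the third reading (25.0) is flagged here.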

Disparate Data Types Continue to Build Momentum

New and diverse data is helping to drive emerging analytics technologies. Is this data being used in organizations today? What is it being used for? We asked respondents what kinds of data they are using with ETMs. The results fell into a few categories, including traditional structured data, emerging data types such as text data, and newer forms of data such as that generated by the IoT.

Traditional data still rules. Currently, 84% of respondents stated that they are using structured data, such as that found in relational tables or in flat file records, with their ETMs. (See Figure 13.) Eighty-two percent are using data from their transactional systems. Using data from transactional systems, such as order data or credit card payment data, can be useful for a range of emerging analytics such as building models for recommendation engines or operationalizing retention analytics for use by a call center. This is what leading-edge firms have been doing for years. Others are starting to follow suit. However, different kinds of data can provide even more options for organizations looking to better understand their customers or operations or to become more competitive or innovative.

Different, emerging data types are also in use. Sources such as internal text data, log data, geospatial data, and event data from applications are already being used by a large percentage of respondents. Forty-three percent of respondents cited their use of internal text data with their ETMs. Some use text data from call center notes or e-mails to understand the voice of the customer. Others are predicting fraudulent activity using text data in claims forms. Thirty-five percent of respondents are using external social media data for brand reputation management, competitive intelligence, and other use cases. Respondents are also making use of log data (53%) and event data from applications (51%). Often this kind of event data can be used for operational intelligence. Thirty-nine percent are using geospatial data today. This data can be used in conjunction with other data, for instance, to understand insurance risk or to predict areas of crime. The point is that additional kinds of data, aside from the traditional data found in a transactional system or in a data warehouse, are being used by forward-looking companies today.

Newer sources of data poised to grow. Other, newer sources of data are set to double or even triple in usage over the next three years if respondents stick to their plans. Machine-generated data from sensors and devices is being used by 27% of respondents today; an additional 33% plan to use it in the next three years. Likewise, IoT data is used by fewer than 20% of respondents today, but another 40% are expecting to use it in the next three years. Data from sensors can be used to track and monitor assets (think sensors on construction material, expensive equipment, or animals) or even determine when crops need to be watered. Real-time event streaming, which goes hand-in-hand with IoT data, is also set to grow.
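The "double or even triple" projection follows from adding current usage to planned adoption. As a quick check, the sketch below uses the percentages reported in Figure 13 and assumes that every respondent who plans to adopt a data source actually does so; the helper name `projected_growth` is ours, not the report's.

```python
# Percentages from Figure 13: (already using today, will use within 3 years).
figure_13 = {
    "Machine-generated data": (27, 33),
    "Real-time event streams": (23, 37),
    "Internet of Things (IoT) data": (18, 40),
}


def projected_growth(current, planned):
    """Return (projected usage %, growth multiple) if all plans are carried out."""
    projected = current + planned
    return projected, projected / current


for source, (current, planned) in figure_13.items():
    projected, multiple = projected_growth(current, planned)
    print(f"{source}: {current}% -> {projected}% ({multiple:.1f}x)")
```

Under that assumption, machine-generated data would slightly more than double (27% to 60%) and IoT data would slightly more than triple (18% to 58%).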

2 For more on cognitive computing, see Judith S. Hurwitz, Marcia Kaufman, and Adrian Bowles, Cognitive Computing and Big Data Analytics (Hoboken, NJ: Wiley, 2015).

Which of the following data types or sources is your organization using with ETMs? Select one answer for each row

| Data type or source | Already using today | Not using today, but will within 3 years | No plans for using |
|---|---|---|---|
| Structured data (relational, tables, records) | 84% | 10% | 6% |
| Transactional data | 82% | 9% | 9% |
| Demographic data | 59% | 22% | 19% |
| Log data | 53% | 21% | 26% |
| Time series data | 53% | 22% | 25% |
| Event data from application | 51% | 25% | 24% |
| Internal text data (i.e., e-mails, call center, claims) | 43% | 33% | 24% |
| Geospatial data | 39% | 33% | 28% |
| External text data (i.e., social media, news, etc.) | 35% | 37% | 28% |
| Clickstream data | 29% | 32% | 39% |
| Network component data | 28% | 26% | 46% |
| Machine-generated data (e.g., sensors, devices, vehicles) | 27% | 33% | 40% |
| Data stored in Hadoop | 27% | 33% | 40% |
| Data available from public clouds | 27% | 35% | 38% |
| Real-time event streams | 23% | 37% | 40% |
| Internet of Things (IoT) data | 18% | 40% | 42% |
| Image still data | 16% | 29% | 55% |
| Audio data | 13% | 28% | 59% |
| Video data | 10% | 28% | 62% |

Figure 13. Based on 332 respondents.

As organizations collect ever increasing amounts of disparate data, they will need to consider how to manage this data. They will also need to think about how much of this data they really want to capture. (Does it make sense to capture data from a smart building every second? Probably not.) Data frequency considerations will become more important. Organizations will also need to determine where they want to keep their data. (On premises? In the cloud?) There is a lot of talk in the market about organizations capturing as much data as they possibly can and dumping it into a Hadoop data lake or onto an appliance for later preprocessing and analysis. Lakes and dumps are great for data exploration, but they are not a viable long-term strategy, which generally entails a bit of data restructuring for recurring analytic operations (such as refreshing an analytic model). Organizations will need a data plan and an analytics plan.

Analytics Hits the Mainstream

What good is data if you don’t analyze it? Aside from regulatory and compliance rules that may mandate data storage, analytics is where the action is, so we asked respondents about a range of ETMs used for BI and analytics. (See Figure 14.) Although the vast majority (81%) are using dashboards and scorecards, other kinds of analytics are emerging for use in organizations. Many are even becoming mainstream.

Predictive analytics and Web analytics are becoming entrenched in organizations. Often, as organizations move past reporting and dashboards, one of the first kinds of more advanced analytics techniques they implement is predictive analytics. The technology can have significant value in terms of predicting potential outcomes of interest, such as: Who will churn? Who will be readmitted to a hospital? Who will pay their bills on time? In this group of respondents, 49% stated that they are using predictive analytics in their organization today. Another 37% claim that they will be using it in the next three years. This percentage is higher than TDWI has previously recorded in surveys.

Of course, this survey may have appealed to those organizations that are further along the maturity curve in terms of analytics. For instance, some of the respondents are using predictive analytics and do not consider it to be an emerging technology. However, this points out that predictive analytics is becoming firmly entrenched in organizations. In fact, previous TDWI research indicates that the majority of organizations believe that business users (not statisticians!) will be some of the main users and builders of predictive models because the technology is getting easier to use. This was also the case with Web analytics. Of course, although it is encouraging to see some of these advanced analytics in use by more organizations today, the reality is that unless an organization takes action on the analytics—such as applying its epiphanies to a business process—the full value of the technology will not be realized.

Operational intelligence provides action for analytics. Forty-one percent of respondents stated that they are currently using some form of OI today, and another 30% plan to use it in the next three years. As stated above, OI can be used to gain real-time visibility and understanding into what is happening in operational systems. These can be IT systems or systems that support any kind of business operations, such as telecommunications network management or manufacturing lines. OI is important because, by its very nature, it is making analytics part of a process. This is when analytics becomes actionable and when it starts to provide real value.3

Other analytics are still relatively immature. Although a number of analytics ETMs (e.g., optimization, geospatial, big data visualization, and in-memory analytics) were used by at least 30% of the respondents, others are only now truly emerging, including such technologies as cognitive computing and deep learning. Interestingly, those who have ETMs in place tend to be more likely to be planning to embrace some of these newer analytics technologies (such as cognitive computing, stream mining, and deep learning) than those who do not have ETMs already in place. This makes sense: if organizations are gaining value from ETMs for analytics, they would be more likely to try new techniques that might also provide value.

3 For more on operationalizing analytics, see the 2015 TDWI Best Practices Report Next-Generation Analytics and Platforms For Business Success, online at www.tdwi.org/bpreports.

Which of the following ETMs is your organization using for BI and analytics? Select one answer for each row

| ETM | Already using today | Not using today, but will within 3 years | No plans for using |
|---|---|---|---|
| Dashboards and scorecards | 81% | 14% | 5% |
| Self-service BI | 55% | 29% | 16% |
| Web analytics | 50% | 25% | 25% |
| Predictive analytics | 49% | 37% | 14% |
| Optimization | 41% | 32% | 27% |
| Operational intelligence | 41% | 30% | 29% |
| In-memory analytics | 39% | 33% | 28% |
| Mobile BI | 39% | 43% | 18% |
| Geospatial analysis | 36% | 30% | 34% |
| Big data visualization | 33% | 41% | 26% |
| Social media analytics | 31% | 38% | 31% |
| Network/graph analysis | 31% | 33% | 36% |
| Continuous alerting | 28% | 33% | 39% |
| Visual storytelling tools | 28% | 33% | 39% |
| Prescriptive analytics | 28% | 37% | 35% |
| Text analytics | 27% | 39% | 34% |
| Simulation | 23% | 31% | 46% |
| Machine learning | 19% | 36% | 45% |
| Link analysis | 18% | 30% | 52% |
| Internet of Things (IoT) | 15% | 38% | 47% |
| Neural networks | 14% | 27% | 59% |
| Natural language processing (NLP) | 14% | 33% | 53% |
| Stream mining | 11% | 36% | 53% |
| Video analytics | 9% | 30% | 61% |
| Cognitive computing | 9% | 32% | 59% |
| Voice analysis | 9% | 27% | 64% |
| Deep learning | 8% | 34% | 58% |

Figure 14. Based on 344 respondents.

USER STORY: PUTTING THE SKILLS IN PLACE FOR ADVANCED ANALYTICS

“I have been involved with and seen many data science and data-driven projects fail and some succeed,” said one senior data scientist from a technology company. Of the ones that failed, it was rarely the data science part that failed. Technology is rarely the issue. Many forward-looking organizations realize that to be innovative and gain deeper insight, they have to invest in people. To that end, this company has put a process in place to educate talented employees. They identify those employees who are big thinkers, with a can-do attitude, who want to grow and who have real business needs for analytics. Depending on the person and the job requirements, the company offers different levels of training. These include lunch-and-learns and training in centers of excellence. If employees are very interested in analytics, they can get further education, sometimes at longer outside programs. People who are trained in analytics become the next level of influencers.

The Internet of Things

The Internet of Things (IoT), a network of connected devices that can send and receive data over the Internet, is a hot market topic. These devices might be cell phones or wearable devices or sensors on components on airplanes and on machines in oil rigs. It is predicted that there will be tens of billions of these devices connected over the Internet in the next few years. The idea behind the IoT has been around for years, but the combination of cheap compute, advances in microprocessors, and more advanced software is making this a reality. This network is a trend in and of itself, but the analytics that can be performed on this data is where the value is. Analytics will play a big role in the IoT, from the simple to the complex.

Although only 16% of respondents are analyzing IoT data in their organizations today, more than double that amount said they are thinking about it. (See Figure 15.) The use cases for the IoT are wide, varied, and growing across virtually every industry. We asked respondents and subject matter experts to provide examples of how the IoT is being used in organizations today. These include:

Quantified self. This is the movement to gather information about a person’s daily life. Several healthcare organizations reported using or planning to use wearable medical devices to monitor patients. Some organizations utilize wearable fitness devices in conjunction with loyalty programs. Other businesses promote wearable fitness devices to encourage healthy lifestyles. For instance, one respondent mentioned that her company organized teams that are using wearable fitness devices for company activity challenges between business units. The organization feels that this promotes fitness and team building.

Preventive maintenance. The idea behind preventive maintenance is to identify and fix problems with equipment and other assets before they occur. The data associated with past failures is used to predict the probability of potential future problems. One respondent from a financial institution stated that they are using the IoT to gather information at branch and ATM networks to understand behavior and prevent equipment failures. Others mentioned that IT is using it to predict data center failures. Respondents in other industries said that they are using the IoT to monitor systems in remote locations (such as in the utility industry) or to monitor complex, expensive equipment such as on an oil rig.

Asset monitoring and tracking. Tracking assets (especially expensive assets) can help maintain the bottom line. Some respondents in the transportation industry are using radio-frequency identification (RFID) to track assets. Others cited using RFID to track items such as produce to promote freshness.

Can’t disclose. Some respondents felt that their IoT implementations were too sensitive to discuss because they provide competitive advantage.

Does your organization analyze data from the IoT?

Yes 16%

No, but we are thinking about it 33%

No, and we have no plans 33%

Don’t know 18%

Figure 15. Based on 303 respondents.

Of course, there are many other use cases for the IoT. This is an emerging technology that is in its infancy. However, TDWI expects to hear much more about the IoT in the coming few years. Some important areas will be around data management, data security, data connectivity, and data analysis for the IoT.

USER STORY: COMMUNICATION AND CONTROL TESTBED FOR MICROGRID APPLICATIONS

The Industrial Internet Consortium is an international nonprofit consortium formed to accelerate the development, adoption, and widespread use of interconnected machines and devices, intelligent analytics, and people at work. One of its projects is the Microgrid Communication and Control Testbed, being built through a collaboration with National Instruments (NI), Cisco, and Real-Time Innovations. The testbed addresses the need for real-time analytics and control to increase efficiencies in the energy grid process, helping to increase reliability while decreasing operating cost. Traditionally, grids operate using a central architecture model. With the growth of renewable energy such as wind and solar, the grid is becoming more dynamic, meaning there is a need for more data and analytics to monitor and control these systems for optimal energy distribution. Additionally, aging assets also necessitate better quality control and a more intelligent approach to maintenance.

The Microgrid Communication and Control Testbed, which uses NI’s CompactRIO and Grid Automation Systems technology, focuses on machine-to-machine communication, interoperability, and security. It is a collection of hardware and software technology that makes it easier for vendors and utilities to test out new microgrid technologies. The microgrid utilizes small, networked controllers that provide command and control functions. They communicate via a databus using DDS, an open data communication standard. Rather than a central architecture, intelligence is pushed to the edge in these local microgrids for faster response times and more automated control.

The project consists of three phases. Phase 1 is a proof of concept phase in an NI lab that utilizes multiple controllers performing command/response to ensure basic security and performance. A few application use cases are currently demonstrated including demand-side management, island detection, and a mode that prepares the microgrid for an impending storm. In phase 2, planned for 2016, the microgrid will work in a simulated lab environment to ensure reliable and safe operation. In phase 3, a small controlled microgrid will be deployed in a live environment. “We are excited to have industry leaders lending their expertise to this project,” said Brett Burger, principal marketing manager for smart grid applications at NI. “By working together, we can help create the grid of the future.”

The Cloud for Data Management and Analytics

Although different deployment and delivery models for the cloud have been available for years, recently there has been more talk about the cloud as an emerging platform for both data management and analytics. Part of this is related to the reality of ever increasing amounts of disparate data. Some organizations feel that the cloud (and often the public cloud) provides the flexibility and scalability for managing and analyzing data, especially data that requires iterative analysis. In fact, flexibility and scalability are among the top reasons organizations embrace the cloud (not shown).

Cloud offerings are expanding. Vendors are providing data warehouses in the cloud. Analytics vendors are also providing cloud options for their products and services. The cloud is becoming an important platform for many organizations as they plan to extend their environment past the on-premises data warehouse.

We asked respondents if they are using the cloud now for data management or analytics activities (results not shown). Seventeen percent stated that they would never use the cloud for data management or analytics; 35% are thinking about it. Another 35% are already using the cloud in some way. The rest (about 13%) are not sure. TDWI has been tracking the progress of cloud uptake for the past few years. It appears that resistance to cloud adoption for analytics may be diminishing. In 2013, when we asked similar questions about adopting the cloud for analytics or BI, approximately 25% of respondents stated they would never use the cloud for this, but that number has been slowly declining over the past few years.

Data warehouses in the cloud are gaining momentum. For those organizations using the cloud or thinking about using it, the cloud is becoming more popular for data management (Figure 16). We asked respondents how they were using the cloud. The top response is utilizing a data warehouse in the cloud; 35% of respondents using or planning to use the cloud are deploying (or planning to deploy) a data warehouse in the cloud (with most probably planning this; see the next section). Interestingly, back in 2011 when we asked this question of a smaller group of respondents, fewer than 10% were using a data warehouse in the cloud. The data warehouse isn’t the only data management platform in the cloud; 19% of respondents are using or planning to use Hadoop in the cloud to manage their data. Hadoop can be effective in helping to manage multi-structured data. Hadoop and the cloud (and Hadoop in the cloud) are all becoming part of a forward-looking organization’s hybrid data architecture.

Others are using the cloud for analytics. Approximately 31% of respondents stated that they are using an analytics platform in the cloud; 29% are using the cloud as a sandbox for analytics. There are a number of use cases that respondents cited in terms of what they are doing in the cloud for analytics. First, if organizations are generating large volumes of data in other cloud applications, such as CRM or ERP, then they often consider analyzing this data in the cloud. Many respondents stated, “[We are] analyzing public cloud data in the cloud.” This makes intuitive sense. Additionally, sometimes organizations supplement this data with other public cloud data, such as demographic data or publicly available data.

Some organizations perform data reduction in the cloud. For instance, an organization might amass a big data set with thousands of attributes. The organization doesn’t want to send all of that data on premises to the data warehouse. They only want to send the important variables that might be used for reporting, so they do some data preparation and transformation in the cloud and perhaps even explore the data in the cloud to determine what the important attributes are. They then send only that data on premises.
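The data-reduction pattern above can be sketched as an attribute-selection step that runs in the cloud before any transfer. This is a minimal illustration under stated assumptions: the sample records, the attribute names, and the rule for deciding which attributes are "important" (here, dropping attributes that are mostly missing or never vary) are hypothetical, and a real project would choose attributes based on its own reporting needs.

```python
def select_attributes(rows, max_missing=0.5):
    """Keep attributes that are mostly populated and take more than one value."""
    columns = {key for row in rows for key in row}
    keep = []
    for col in sorted(columns):
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        missing_ratio = 1 - len(present) / len(rows)
        if missing_ratio <= max_missing and len(set(present)) > 1:
            keep.append(col)
    return keep


def reduce_rows(rows, keep):
    """Project each record onto the kept attributes only."""
    return [{col: row.get(col) for col in keep} for row in rows]


# Assumed sample of a wide cloud data set (real ones may have thousands of attributes).
cloud_rows = [
    {"customer_id": 1, "region": "east", "spend": 120.0, "debug_flag": 0, "legacy_code": None},
    {"customer_id": 2, "region": "west", "spend": 75.5, "debug_flag": 0, "legacy_code": None},
    {"customer_id": 3, "region": "east", "spend": 210.0, "debug_flag": 0, "legacy_code": "x"},
]

keep = select_attributes(cloud_rows)
to_send_on_premises = reduce_rows(cloud_rows, keep)
print(keep)
```

In this sample, `debug_flag` is dropped because it never varies and `legacy_code` because it is mostly missing, so only three of the five attributes would be sent on premises.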

Of course, organizations are also using the cloud for big data analytics projects. Many of these are in the exploratory phase. For instance, if an insurance company is collecting telematics data from cars, and that data arrives over the Internet, the data might be analyzed in the cloud. This is true for other big data projects, too.

Some organizations have a mandate to use the cloud. Others have large on-premises investments. Oftentimes, if a company has a large on-premises investment, it will do exploration with the goal to bring the data back on premises. Of course, the cloud is not for everyone. Companies need to do their homework and measure the risk and reward of using the cloud model. TDWI research indicates, however, that those companies that are more advanced in their analytics tend to be more likely to use the cloud for analytics. Perhaps they are seeing good returns.

What are you doing in the cloud for data management and analytics? Please select all that apply

Using a data warehouse in the cloud 35%

Using an analytics platform in the cloud 31%

Using the cloud as a sandbox for analytics in the cloud 29%

Using a platform-as-a-service model 25%

Using a data integration tool in the cloud 24%

Using the cloud in a SaaS model for analytics 24%

Using Hadoop in the cloud 19%

Using data security to protect cloud data 17%

Using a data quality tool in the cloud 14%

Using the cloud for IoT connectivity 11%

Using the cloud in an IaaS model for analytics 10%

Figure 16. Based on 518 responses from 216 respondents who are either using or planning to use the cloud; 2.4 responses per respondent, on average.

ETMs for Data Warehousing

Trends in ETMs Relative to Data Warehousing and Data Management

This report’s survey presented respondents with a list of ETMs that could be used in a data warehouse environment. (See Figure 17.) The survey asked: “Which of the following ETMs is your organization using for data warehousing (DW) and other data management disciplines?” For each ETM on the list, a respondent selected one of the following answers:

• No plans for using

• Already using today

• Not using today but will within three years

Based on the responses, we can see trends in ETM adoption for DW and other data management practices. We can get a sense of which emerging technologies are of minimal interest, which are being applied to use cases in DW today, and which will see greater use in the future. In turn, the information is useful for business and technical users who consider such trends in their planning and tool adoption.

No Plans for Using

Substantial numbers of user organizations have no plans to adopt ETMs. This is often the case with new features and platforms. Despite the passion and commitment of early adopters, many organizations (1) are not open to new options, (2) simply don’t need them because old ones still suffice, or (3) haven’t evolved into modern business practices that would require modern ETMs.

Clouds and open source are not for everyone. Ironically, many of the ETMs that ranked strongly for “no plans” also ranked strongly for “will use within three years.” This is the case with clouds for DW (44% no plans; 36% within three years), software-as-a-service (SaaS) for DW (44%; 36%), NoSQL DBMSs (43%; 34%), streaming data processing (40%; 37%), and Hadoop (36%; 36%). All the ETMs just listed are further off the mainstream than most (being based on clouds and/or open source), which may explain why they are so unpalatable to conservative organizations, yet an excellent fit for progressive ones.

Will Use within Three Years

Real-time data warehousing is set for aggressive adoption. In fact, it tops the list of data-oriented ETMs in Figure 17. Twenty-two percent of respondents are doing real-time DW today, with an additional 39% coming in three years.[4] One of the most common tasks in DW modernization is to retrofit real-time technologies onto the core warehouse or another data platform within the broader DW environment. Real-time ETMs for the DW typically support time-sensitive, data-driven business practices such as operational BI, operational analytics, business performance management, and management dashboards. These practices are enabled by capabilities built into the DW platform, as well as by real-time ETMs that fared well in this report’s survey, namely data virtualization (40% using today; 32% within three years) and data federation (31%; 30%).

ETMs are not universally embraced

The most hotly adopted ETMs involve real time or high performance

4 Real-time DW has fared well in other TDWI studies. For example, see the discussion around Figure 17 in the 2014 TDWI Best Practices Report Real-Time Data, BI, and Analytics, online at www.tdwi.org/bpreports.

Events and streams are now established in DWs and set for growth. Most use cases for real-time data warehousing are better described as “near time,” with latency up to several minutes or hours, whereas use cases for events and streams usually demand responses within a few seconds or milliseconds. To achieve true real time, organizations are progressively turning to streaming data processing (23% using today; 37% within three years) and event processing (34%; 28%), which capture and process data that is generated frequently or constantly by applications, machines, devices, or social media. These ETMs are typically applied in ultrafast applications that monitor customer behaviors (to present e-commerce recommendations or discounts that halt churn), facilities (to optimize network performance or production yield), and business processes (to spot fraud or compliance infractions as they occur).

The trend is to correlate real-time information (from events and streams) with historic information (from a DW or similar database). Hence, truly modern organizations deploy ETMs for events and streaming data in tandem with a real-time DW.
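The correlation of real-time and historic information can be sketched in a few lines of Python. In this illustration, each incoming event is enriched with a per-customer baseline that stands in for data held in a DW, and an event is flagged when it departs sharply from that baseline. The dictionary `historic_avg_spend`, the function `flag_unusual`, the threshold factor, and all sample values are hypothetical; a production deployment would use a streaming platform and a real warehouse lookup.

```python
from typing import Iterable, Iterator

# Stand-in for historic information held in a DW: average purchase
# amount per customer. The values are made up for the example.
historic_avg_spend = {"c1": 40.0, "c2": 300.0}


def flag_unusual(
    events: Iterable[dict],
    history: dict[str, float],
    factor: float = 3.0,
) -> Iterator[dict]:
    """Yield each event enriched with its baseline and an `unusual` flag.

    An event is flagged when its amount exceeds `factor` times the
    customer's historic average (a hypothetical rule). Customers with
    no history are passed through unflagged.
    """
    for event in events:
        baseline = history.get(event["customer"])
        unusual = baseline is not None and event["amount"] > factor * baseline
        yield {**event, "baseline": baseline, "unusual": unusual}


# A list stands in for the live stream; any iterable of events would work.
stream = [
    {"customer": "c1", "amount": 35.0},
    {"customer": "c1", "amount": 500.0},
    {"customer": "c3", "amount": 20.0},
]
results = list(flag_unusual(stream, historic_avg_spend))
```

Because `flag_unusual` is a generator, each event is evaluated as it arrives rather than after a batch completes, which is the property that separates event processing from near-time warehouse loads.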

Data management demands ever greater performance. One of the strongest trends TDWI has observed among its members is the proliferation of various types of in-memory databases (42% using today; 29% within three years). Keeping certain data sets in large server memory spaces avoids time-consuming disk input/output; this is true for many use cases that need speed, from metrics and KPIs in a table pinned in memory (for fast refresh of management dashboards) to zero-landing ETL pipes (to reduce the latency of batch processing, as in microbatches). Similarly, more enterprise servers now include solid-state drives (37% using today; 26% within three years) based on high-speed flash memory. “Hot” data that is subject to frequent queries is kept on solid-state drives, whereas most data (which is “cool”) still resides on less expensive traditional drives.

Open source platforms are becoming established in data warehouse environments. Oddly enough, 56% of respondents claim “no plans” for open source software (OSS) for data warehousing, whereas substantial percentages claim that within three years they will use open-source-based ETMs specifically for DW, namely Hadoop (36% within three years), NoSQL DBMSs (34%), and MapReduce (31%). Regardless of this contradiction (a common vagary with survey data), other studies from TDWI show growing adoption of OSS in general, especially anything involving Hadoop.[5]

In TDWI’s view, the adoption of OSS for BI, DW, and analytics picked up during the recession of 2008, driven by the low cost and easy availability of most OSS. Since then, the steadily increasing maturity of most OSS product types has made them more enterprise-grade. Today, much of the innovation coming from the software industry is based on or inspired by OSS, which makes OSS even more appealing for general enterprise use.

As Hadoop usage spreads into more use cases, the need for SQL support spreads, too. For example, many professionals in BI, analytics, and DW have skills for SQL and SQL-based tools that they wish to leverage when using Hadoop. So it’s no surprise that (in this report’s survey) users anticipate within three years using both SQL on Hadoop (38%) and SQL off Hadoop (33%).

Open source software is now ensconced in DW environments, and it is spreading

5 For example, see Figure 4 in the 2015 TDWI Best Practices Report Hadoop for the Enterprise, online at www.tdwi.org/bpreports. In that report’s survey, 60% of respondents anticipate having Hadoop clusters in production by early 2016.

Data integration tools and practices are evolving due to new user requirements. Extract, transform, and load (ETL) continues to be a time-consuming task performed by personnel with special skills. For the complex transformations and data structures required of a true data warehouse, ETL will continue to be required. However, a growing number and diversity of users need “ETL light” for data exploration and preparing data for analytics. Many vendors are responding to this need by providing high ease of use through ETM features such as drag-and-drop data access (37% using today; 33% within three years) and self-service data preparation (36%; 36%). These emerging capabilities are today built into many types of tools, including those for data integration, data quality, reporting, analytics, and data visualization.

A modern data warehouse environment includes multiple ETMs. That’s because the modern DW is an environment that includes many tools and (especially) many data platforms. There is almost always a core warehouse atop a relational DBMS, but it’s accompanied by standalone servers for data marts, operational data stores, data staging, and specialized functions (e.g., for capturing and processing unstructured or real-time data). The point of having multiple platforms is so technical users can match a given data type or workload with a platform optimized for it.

The multiple platforms of the modern DW environment regularly include data warehouse appliances (45% using today), columnar DBMSs (32%), and other analytic DBMSs (40%). New platforms entering the DW environment include Hadoop (28% using today), MapReduce (26%), and real-time platforms for event processing (34%) and streaming data processing (23%). A trend across all these platforms is users’ growing preference for those based on massively parallel processing (33% using today), a computing architecture that provides speed and scale for data-driven use cases.

Already Using Today

Some ETMs have emerged so much as to become common. A few DW and DM technologies and methods have proliferated so successfully that most of the organizations that need such ETMs have already deployed them. This seems to be the case with service-oriented architecture (51% using today; 21% within three years), DW appliances (45%; 23%), and in-memory databases (42%; 29%). Others have proliferated to a degree but still have ample room for growth, as with data virtualization (40% using today; 32% within three years), analytic DBMSs (40%; 31%), in-database analytics (37%; 34%), and solid-state drives (37%; 26%).

Some ETMs have a minimal presence in DW environments today. This includes roughly a third of the ETMs charted in Figure 17, namely SQL off Hadoop, NoSQL DBMSs, streaming data processing, SQL on Hadoop, real-time DW, clouds for DW, and SaaS for DW. Of respondents, 20% to 25% are using these today; yet an additional 33% to 39% say they will adopt them within three years. Therefore, these ETMs are a bit rare today but will soon be more widely adopted.
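The ranges quoted for these low-presence ETMs can be checked directly against the Figure 17 percentages, as in the short Python sketch below. The final step, which adds "already using today" to "will within three years," is an illustrative projection that assumes every stated plan is carried out; the survey itself makes no such forecast.

```python
# (already using today %, will use within three years %) from Figure 17
low_presence_etms = {
    "SQL off Hadoop": (25, 33),
    "NoSQL DBMSs": (23, 34),
    "Streaming data processing": (23, 37),
    "SQL on Hadoop": (22, 38),
    "Real-time DW": (22, 39),
    "Clouds for DW": (20, 36),
    "Software-as-a-service for DW": (20, 36),
}

using_today = [today for today, _ in low_presence_etms.values()]
planned = [later for _, later in low_presence_etms.values()]

# Ranges quoted in the text: 20% to 25% today, 33% to 39% within three years.
today_range = (min(using_today), max(using_today))
planned_range = (min(planned), max(planned))

# Simple projection (an assumption): every stated plan is carried out.
projected = {
    name: today + later for name, (today, later) in low_presence_etms.items()
}
highest = max(projected, key=projected.get)
```

Under that optimistic assumption, each of the seven ETMs would be in use at more than half of the surveyed organizations within three years, with real-time DW projected highest.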

Security has become more important for DWs. Note that data security (69%) is the highest-ranking function in use today among the multiple choices charted in Figure 17. No doubt, most DW environments have some kind of security controlling access to the DW and tools that interface with it. However, anecdotal evidence suggests that this is almost exclusively security in the form of user-centric authorization, whereas there is a compelling need for more DWs to support data-centric security (e.g., data de-identification via encryption, masking, tokenization). These and other ETMs related to security are explained in the ensuing section of this report.

Many ETMs take the form of data platforms found in DW environments

The good news is that ETMs, in general, have proved their usefulness to data warehousing

Which of the following ETMs is your organization using for data warehousing (DW) and other data management disciplines? Select one answer for each row.

| ETM | No plans for using | Already using today | Not using today, but will within 3 years |
| --- | --- | --- | --- |
| Real-time DW | 39% | 22% | 39% |
| SQL on Hadoop | 40% | 22% | 38% |
| Streaming data processing | 40% | 23% | 37% |
| Self-service data preparation | 28% | 36% | 36% |
| Hadoop | 36% | 28% | 36% |
| Clouds for DW | 44% | 20% | 36% |
| Software-as-a-service for DW | 44% | 20% | 36% |
| In-database analytics | 29% | 37% | 34% |
| NoSQL DBMSs | 43% | 23% | 34% |
| Drag-and-drop data access | 30% | 37% | 33% |
| SQL off Hadoop | 42% | 25% | 33% |
| Data virtualization | 28% | 40% | 32% |
| Analytic DBMSs | 29% | 40% | 31% |
| MapReduce | 43% | 26% | 31% |
| Data federation | 39% | 31% | 30% |
| In-memory database or data cache | 29% | 42% | 29% |
| Columnar DBMSs | 39% | 32% | 29% |
| Event processing | 38% | 34% | 28% |
| Solid-state drives | 37% | 37% | 26% |
| Massively parallel processing (MPP) | 42% | 33% | 25% |
| Data warehouse appliances | 32% | 45% | 23% |
| Open source software (OSS) for DW | 56% | 21% | 23% |
| Service-oriented architectures (SOA) | 28% | 51% | 21% |
| Data security | 15% | 69% | 16% |

Figure 17. Based on 344 respondents. Sorted by “Not using today, but will within three years.”

Data-Centric Security

Today, the news is loaded with stories of high-profile security breaches where the IT systems of large corporations and government agencies are hacked and large volumes of sensitive data are stolen. Due to the current wave of cybercrime, many organizations are rethinking their strategies for securing applications and data. Both the vendor and open source communities are responding to the need by developing new and deeper approaches. Hence, a number of emerging technologies and methods involve the next generation of digital security.

To start, let’s review the four basic categories of security controls:

Authentication: This function identifies a user and confirms that the user is who he or she claims to be. This could be as simple as verifying a username and password or as leading edge as a fingerprint reader or retina scanner.

Authorization: This maps a known user to the applications and data the user is requesting, thereby corroborating (or denying) that the user’s role and security profile are appropriate to the request. Depending on the systems in question, authorization may involve granular control of data access down to the row, field, or cell.

Auditing: This control involves recording an audit trail of who accesses which applications and data, ideally across multiple systems and access methods. The record of data accesses can be studied to discover violations but also to enlighten capacity planning, data archiving procedures, and charge-back accounting. In some tools, auditing is coupled with real-time rules and analytics to spot or predict violations as they occur.

Data-centric security: Unlike the user and application orientation of the previous security techniques, this category is truly data-centric in that it operates on the data or very close to the data. The assumption is that even the best systems can be hacked by intruders or simply misused by employees. When data-centric security is in place, unauthorized users may see data, but the data has been cleansed, blocked, and made anonymous—or de-identified—to the point that the intruder gets nothing of value. Because no usable data is extracted, the organization is saved from the onerous measures that typically follow the theft of sensitive data.
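To make the user-centric controls concrete, here is a minimal Python sketch that combines authorization at the field level with an audit trail. It assumes authentication has already happened, so the user and role are known. The role table `ROLE_FIELDS`, the function `read_fields`, and the sample record are hypothetical and do not reflect any particular product.

```python
from datetime import datetime, timezone

# Hypothetical role-to-field permissions (field-level authorization).
ROLE_FIELDS = {
    "analyst": {"patient_id", "diagnosis"},
    "billing": {"patient_id", "balance"},
}

# In-memory audit trail; a real system would write to durable storage.
audit_trail: list[dict] = []


def read_fields(user: str, role: str, record: dict, fields: list[str]) -> dict:
    """Return only the requested fields that the role is allowed to see.

    Every request is appended to `audit_trail`, including the fields
    that were denied, so that accesses can be reviewed later.
    """
    allowed = ROLE_FIELDS.get(role, set())
    granted = [f for f in fields if f in allowed and f in record]
    denied = [f for f in fields if f not in allowed]
    audit_trail.append(
        {
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "granted": granted,
            "denied": denied,
        }
    )
    return {f: record[f] for f in granted}


record = {"patient_id": "p-17", "diagnosis": "asthma", "balance": 250.0}
visible = read_fields("dana", "analyst", record, ["diagnosis", "balance"])
```

Note what this sketch does not do: the record itself is stored in the clear, so anyone who bypasses `read_fields` sees everything. That gap is the motivation for the data-centric techniques described in this section.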

Emerging techniques for data-centric security apply various forms of data de-identification (and re-identification) based on new and improved approaches to older techniques, such as encryption, masking, and tokenization. Let’s take a look at each of these ETMs:

Data-at-rest encryption: This renders data unreadable for intruders who bypass controls at the application and operating system levels to access data directly. Data is encrypted during application write operations and decrypted during reads, and this overhead has traditionally degraded system performance. New algorithms involve less overhead for better performance.

Data-in-motion encryption: Also called wire encryption, this technique is applied within transport-level protocols to protect data as it travels a network, which is a common point of data interception. Note that all modern approaches to encryption should retain data’s original format.

Masking: This technique has been around for years as a latent method of generating anonymous test data. This replaces sensitive data elements with usable data that’s equivalent to original data in form, behavior, and meaning, although transformed to hide the identity of people and other entities. This way data in its de-identified state is usable for most applications (especially analytics) without fear of privacy or other compliance violations.

Among the four categories of security, only one is truly focused on data

Today’s innovations in security are mostly down at the data level

The goal of data-centric security is to de-identify and re-identify sensitive data elements, on the fly, per user and per field, in all systems including Hadoop

Masking has had problems, which are corrected by the latest generation of masking techniques. In utility tools for data management, masking typically runs as an offline process, whereas the new generation is online and running in real time, so masking and related tasks can be done on the fly. In the past, masking was always irreversible, whereas new approaches mask data for some users but unmask it for others with a higher security clearance. For example, a data analyst can develop an accurate list of patients that fit an analytic model while working from masked data so no one’s privacy is violated. Later, a physician (or other healthcare provider with proper clearance) can re-identify the data in order to test, treat, or otherwise assist patients on the list.

Tokenization: This process replaces live data with a random surrogate. Old approaches involved slow and non-scalable mappings to tables of surrogate values (which required backup and other database maintenance), whereas new approaches involve stateless pre-generated tokens for greater speed at runtime and less maintenance.
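The de-identification and re-identification cycle described above can be illustrated with a small Python sketch. Note that it uses the older mapping-table style of tokenization, chosen because it is easy to follow; it is not the stateless approach that newer products use. The class `TokenVault`, the role names, and the sample patient names are hypothetical.

```python
import secrets


class TokenVault:
    """Mapping-table tokenization: random surrogates plus a lookup table."""

    def __init__(self, cleared_roles: set[str]):
        self._token_by_value: dict[str, str] = {}
        self._value_by_token: dict[str, str] = {}
        self._cleared_roles = cleared_roles

    def tokenize(self, value: str) -> str:
        """Return a random surrogate; the same value always maps to the same token."""
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            # Collisions are extremely unlikely, but regenerate if one occurs.
            while token in self._value_by_token:
                token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str, role: str) -> str:
        """Re-identify a token, but only for roles with clearance."""
        if role not in self._cleared_roles:
            raise PermissionError(f"role {role!r} may not re-identify data")
        return self._value_by_token[token]


vault = TokenVault(cleared_roles={"physician"})
patients = ["Alice Smith", "Bob Jones", "Alice Smith"]

# An analyst works only with surrogates; repeated values stay consistent,
# so counts and joins on the tokenized field still behave correctly.
deidentified = [vault.tokenize(name) for name in patients]

# A cleared user can later recover the original value.
original = vault.detokenize(deidentified[0], role="physician")
```

The two dictionaries are exactly the kind of state that must be backed up and maintained, which is the drawback the text attributes to older approaches. A stateless design would remove that table while keeping the same `tokenize` and `detokenize` contract.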

Many Types of IT Systems Need Better Data-Centric Security

Note that many of the innovations in data-centric security are today intended to improve the Hadoop ecosystem of tools, which is notoriously lacking in enterprise-grade security functions. In its pure open source versions, Hadoop supports little more than Kerberos-based authentication. Hence, new ETMs for data-centric security are key to making Hadoop better suited to a wide range of use cases in a wide range of mainstream enterprises.

Hadoop’s need for better security aside, these same ETMs for data-centric security apply to other systems, too, especially DWs and most enterprise applications. In particular, TDWI has noted for years that most DWs and integrated tools rely on user-centric authorization almost exclusively, with little or no use of data-centric security, as described here. Auditing user activity is almost as negligible. Given the escalating war on cybercrime, security upgrades are strongly recommended for DWs. As organizations migrate data from DW platforms to Hadoop, functions for data-centric security become even more paramount.

USER STORY IT’S BEST TO PREPARE CAREFULLY BEFORE IMPLEMENTING A MAJOR EMERGING TECHNOLOGY LIKE HADOOP

“Due to mergers and acquisitions, we have five firms that need to consolidate data for sharing and decision support,” said a senior manager of advanced analytics at an insurance conglomerate based in the U.S. “I worked with Hadoop in my prior job, so I feel that Hadoop is perfect for our consolidation project in my current job. The catch is that we’re not yet ready for Hadoop. The biggest gap is our lack of programming skills for the languages that Hadoop requires. The second gap is our weak relationship between business and IT. We need more buy-in from IT and better definition of direction and requirements from the business. The third issue is that our data governance policies need clearer definitions in terms of which data can be accessed and transformed in which ways, but with compliance. We have a lot of work to do before we can credibly implement Hadoop, but I feel confident we’ll get there.”

Hadoop, data warehouses, and many enterprise applications today lack adequate data-centric security

A Sample of Relevant Vendor Platforms and Tools

Because the firms that sponsored this report are all good examples of vendors that offer emerging technologies for various kinds of BI, analytics, and data, let’s take a brief look at the product portfolio of each. The sponsors form a representative sample of the vendor community, although their offerings illustrate different approaches.[6]

HP Security Voltage

HP Security Voltage provides multiple security solutions for IT systems in both traditional enterprise uses and new big data analytics. Its specialty, however, is the ETM data-centric security. HP Security Voltage is known for providing data-centric encryption, tokenization, and key management solutions that are fast, scalable, and reliable. Furthermore, HP Security Voltage solutions excel with techniques for de-identification, which encrypt or mask sensitive data elements (down to the field level) by replacing values with new ones of the same format and integrity. That way, data content is securely cleansed and yet fully functional for analytics.

We all know that security is a major weakness or omission for Apache Hadoop and some Hadoop distributions. For those environments, HP Security Voltage provides much-needed data-centric security that operates natively, during data import, storage, processing, and export. From a single console, HP Security Voltage users can secure data across many systems, data types, and vintages of IT infrastructure. This reduces the effort of development and maintenance, improves consistent security standards enterprise-wide, and reduces both risk exposure and the scope of compliance.

HP Vertica

HP Vertica provides solutions to big data challenges. The HP Vertica Analytics Platform was purpose-built for advanced analytics against big data. It consists of a massively parallel columnar database, plus an extensible analytics framework optimized for the real-time analysis of data. It is known for high performance with very complex analytic queries against multi-terabyte data sets.

HP Vertica for SQL on Hadoop is a new offering that provides a high-performance, enterprise-ready way to perform ANSI-standard SQL queries on Hadoop data. It integrates with any Hadoop distribution, but HP has partnered with the major distribution providers—Hortonworks, Cloudera, and MapR—to ensure consistent performance and standards across all distributions.

Although SQL is the primary query and analysis language, Vertica also supports Java, Python, R, and C. Furthermore, the HP Vertica Flex Zone feature enables users to define and apply schema during query and analysis, thereby handling exotic data types that are unstructured or schema-free. To simplify and accelerate the deployment of an analytics solution, HP offers reference architectures and hardware that are optimized for Vertica, although special hardware is not required.

6 The vendors and products mentioned here are representative, and the list is not intended to be comprehensive.

MicroStrategy

Founded in 1989, MicroStrategy delivered one of its most significant releases, MicroStrategy 10, in 2015. The release offers capabilities for increasing user agility with BI and analytics, faster development, and self-service data discovery with built-in preparation (data wrangling, profiling, and cleansing), while ensuring centralized governance, scalability, and security. To enable organizations to broaden their data architectures beyond just relational databases and have faster loading as multi-structured data volumes grow, the release offers straightforward access to Hadoop, Salesforce.com, and other cloud sources, as well as access to other, less traditional data sources through native connectors. Special options include Web crawlers as well as common ODBC connectors. Revamped data discovery tools, a redesigned HTML5-conforming interface, and a more mature parallel relational in-memory engine (PRIME) have given MicroStrategy 10 a new look and feel along with optimized performance for all types of data sources. Governance and security are important parts of the release; MicroStrategy Usher enables organizations to define security around each user and guarantee that only that person can see relevant data, including data in the cloud.

Qlik

Established in 1993, Qlik has long been focused on enabling users to engage in intuitive, visual data discovery and analytics without having to confront underlying complexity. The company approaches technology from the perspective that decision making is a human endeavor, and that visualization and visual stimulus are central to how humans identify patterns and look at information. Both the QlikView data discovery platform and Qlik Sense, introduced in 2014, use an underlying in-memory associative data indexing engine that dynamically calculates and reveals data associations across multiple sources as users interact with analytics through selections or searches. In 2015, Qlik introduced Smart Data Load for visual data profiling and self-service data preparation. Qlik Branch, an open-source-inspired online community, provides a place for developers to share visualizations and collaborate on Qlik Sense projects. The Qlik Sense Enterprise edition provides tools for IT to centrally govern visual analytics and data discovery, including capabilities for managing secure data libraries for user access.

Snowflake Computing

Rapid evolution in data and its use has strained the limits of conventional data warehousing. Complexity, inflexibility, and cost have become major barriers to gaining insights from the deluge of data available today.

Snowflake delivers a new data warehouse designed for today’s data and its use. The Snowflake Elastic Data Warehouse is an SQL data warehouse delivered as a software service running in the cloud. It is designed to deliver the performance of data warehousing, the flexibility of big data platforms, and the ease of use of a true software service.

Snowflake’s multi-cluster, shared data architecture decouples data storage from query processing, making it possible to bring together data in a single location while independently scaling computing horsepower on the fly. Because of this architecture, any scale of workloads and users can run concurrently without performance degradation. In addition, Snowflake natively understands and optimizes diverse data, allowing semi-structured data to be loaded and used without transformation. Finally, Snowflake’s data warehousing service provides performance optimization, resiliency, monitoring, and security built into the service, significantly reducing cost and complexity.

Striim

Developed by the core team that introduced GoldenGate Software, the Striim platform provides high-volume streaming data integration and analysis. With its tagline “Make data useful the instant it’s born,” the company’s goal is to provide a quick-to-deploy, complete, and nonintrusive solution for high-speed data movement and real-time operational intelligence.

WebAction’s Striim platform enables the continuous extraction of a wide variety of data, including transaction/change data via real-time change data capture, log and sensor data via parallel data collection, and continuous event collection. It deploys easy-to-use streaming transformations such as filtering, outlier detection, enrichment, and correlation to help make sense of data—in real time. Relevant data is then moved to a range of big data, cloud, and database targets.

Trifacta

Most data management professionals realize that traditional ETL has its place with the complex transformations and data structures required of a dimensional DW. However, ETL is too expensive and time-consuming for the varied and more agile data preparation that a growing number of data analysts, business analysts, and other mildly technical business users need to do.

Trifacta addresses the need for straightforward and effective self-service data preparation—or data wrangling, as they call it—by providing visual interfaces with high ease of use. These interfaces present data in a drag-and-drop, interactive development environment that is quickly learned and used. Trifacta’s innovation is to focus on the user experience, thereby giving data management professionals a lift in productivity.

For the greatest level of self-service and productivity, the Trifacta Data Transformation Platform is architected so that data prep seamlessly flows from data exploration, profiling, and data set development to analysis and visualization. The end result is a wide range of user constituencies getting business value from new big data sources in conjunction with more traditional data sources, while exploring data directly in an agile and productive fashion with little or no need for IT intervention.

Top 10 Priorities for Emerging Technologies in BI, Analytics, and Data Warehousing

In closing, let’s summarize the report by listing the top 10 priorities for emerging technologies (as applied to BI, analytics, and data warehousing), with a few comments about why each is important. Think of the priorities as requirements, rules, or recommendations that can guide organizations into successful implementations of emerging technologies and methods (ETMs).

1. Adopt ETMs for the business benefits. This report’s survey shows that ETMs applied to BI, analytics, and DW can lead to improvements in competitiveness, decision support, business performance, and innovation. Interviewees talked about using ETMs for data warehouse modernization and the positive disruption of an organization stuck in a rut.

2. Understand your organization’s goals and map them to potential ETMs. If you can do the mapping credibly, you stand a better chance of gaining appropriate business sponsorship and funding. If you can’t do this mapping, it probably means you don’t have a compelling business case for ETMs, and so you may prefer to turn to more traditional technology solutions.

3. Know the hurdles so you can leap over them. According to our survey, your IT department may be your biggest barrier. If so, you may need to help IT get the human, budgetary, and infrastructure resources it needs to handle various ETMs. You may also need to help IT abandon its risk-averse posture in favor of an innovation culture. Other likely hurdles include inadequate data governance, skills, and business support.

4. Consider the organizational implications of emerging technologies. Often, cultural issues are the hardest to overcome when deploying new technologies. Successful organizations seek out executive sponsorship, develop a proof of concept to illustrate the value of the technology, and make a point of getting everyone on board. They highlight accomplishments, push the innovation message, and continue to evangelize.

5. Keep an open mind. TDWI research indicates that as companies become more advanced with new data-related technology, they are more likely to consider other new data technologies. These forward-looking companies are also measuring real, tangible impact. At the end of the day, emerging technologies are helping organizations become more data-driven, which has been shown to make companies more profitable (and more competitive).

6. Focus on how ETMs can improve agility with data and analytics. Self-service BI and analytics tools and applications, software automation for data preparation, and agile development methods are coming together to enable organizations to move more quickly to gain value from data so that they can effectively address dynamic business needs or initiatives. Organizations should take advantage of this ETM “perfect storm” by instituting agile methods as part of their deployment of self-service BI and analytics tools.

7. Give users the freedom to personalize their BI and analytics experiences Visual analytics tools and applications are popular ETMs in part because they give users capabilities for shaping their own data exploration and creating visualizations that further understanding and inquiry. IT must play a critical role in governing and managing self-service BI and analytics but must not go too far in restricting what users can do. Users and IT should formally communicate to establish the right balance.

Embrace emerging technologies for organizational advantage

Prepare for successful ETM use by first evolving organizational culture toward innovation

Don’t forget important emerging methods, such as agility and personalization


8. Evaluate whether ETMs for BI and analytics could create opportunities to monetize data assets. Technologies and methods that make it easier to build dashboards and other visualizations based on creative analytics could do more than help organizations internally. They could facilitate the development of information-based products and services that bring in revenue, or at least enhance the value of partner and customer relationships. Devote time to defining potential monetization of data assets and analytics.

9. New skills may be needed. Although vendors are making data-related products simpler to use than ever before, it is still critical to think through the skills you need in your organization to make emerging technologies viable. For instance, even though more advanced analytics software solutions might be easy to use (plug the data in and get a model), it is still important to understand how to interpret the output, determine whether it makes sense, and be able to defend it. This may require training in new tools and techniques. It can also include mentoring others in the organization and even holding office hours for people to ask questions.

10. Don’t expect the new ETMs to replace many older systems. In recent years, TDWI has seen tremendous growth and diversification in users’ portfolios of software tools and platforms for BI, analytics, and data warehousing. This process is usually accretive, in that more new tool types are introduced than are decommissioned. For example, the explosive adoption of data visualization tools in recent years has been a godsend for data exploration and analytics, especially at the departmental level. Yet, viz tools don’t replace enterprise business intelligence platforms, which are still required for the thousands of refreshed reports that thousands of users demand every day. Likewise, the data warehouse has evolved into a multi-platform environment that includes many new ETMs, such as Hadoop, NoSQL DBMSs, columnar DBMSs, and platforms for events and streaming data. Although the ETMs satisfy new requirements (especially for big data and analytics), older relational DBMSs and SQL-based platforms are just as important as ever to the overall success of the data warehouse (which is still mostly about provisioning data for standard reports, OLAP, and performance management).

Find a place for ETMs in your toolkit, alongside older tools and techniques


Research Sponsors

HP Security Voltage
www.voltage.com

HP Security Voltage is a world leader in data-centric encryption and tokenization. HP Security Voltage provides trusted data security that scales to deliver cost-effective PCI compliance, scope reduction, and collaboration security. HP Security Voltage solutions are used by leading enterprises worldwide, reducing risk and protecting brand while enabling business. For more information, go to www.voltage.com.

HP Vertica
www.vertica.com

HP Vertica provides a complete, purpose-built platform for big data analytics. It consists of a massively parallel, columnar database in an open platform that supports SQL, Java, Python, and R-based predictive analytics. No matter if you’re accessing data in Hadoop or in our high-performance columnar database, HP Vertica is meant to scale up to petabytes of data and offers advanced analytical functions for your toughest data warehouse workloads. Get a free trial of HP Vertica today.

MicroStrategy
www.microstrategy.com

Founded in 1989, MicroStrategy (Nasdaq: MSTR) is a leading worldwide provider of enterprise software platforms. The company’s mission is to provide enterprise analytics, mobility, and security platforms that are flexible, powerful, scalable, and user-friendly. To learn more, visit MicroStrategy online, and follow us on Facebook and Twitter.

MicroStrategy, MicroStrategy 10, MicroStrategy Secure Cloud, MicroStrategy 10 Secure Enterprise, and MicroStrategy Analytics Platform are either trademarks or registered trademarks of MicroStrategy Incorporated in the United States and certain other countries. Other product and company names mentioned herein may be the trademarks of their respective owners.

Qlik
www.qlik.com

Qlik® (NASDAQ: QLIK) is a leader in visual analytics. Its platform-based approach meets customers’ growing needs, from reporting and self-service data discovery to centrally deployed guided analytics and embedded analytics. Approximately 36,000 customers worldwide rely on Qlik solutions to gain meaning out of information from varied sources, exploring the hidden connections within their data that lead to insights and ignite good ideas. What sets Qlik apart is its associative model, which empowers everyone in an organization to see the whole story that lives within their data, simply. For more information, visit www.qlik.com.

Snowflake Computing
www.snowflake.net

Snowflake Computing, the cloud data warehousing company, has reinvented the data warehouse for the cloud and today’s data. The Snowflake Elastic Data Warehouse is built from the cloud up with a patent-pending new architecture that delivers the power of data warehousing, the flexibility of big data platforms, and the elasticity of the cloud, all at a fraction of the cost of traditional solutions. The company is backed by leading investors including Altimeter Capital, Redpoint Ventures, Sutter Hill Ventures, and Wing Ventures. Snowflake is headquartered in Silicon Valley and can be found online at snowflake.net.

Striim
www.striim.com

Striim is an end-to-end streaming data integration and operational intelligence platform enabling continuous query/processing and streaming analytics. With Striim, you can get to know your data, and sort out what’s important, the instant it’s born.

Striim specializes in multi-stream integration from a wide variety of data sources including transaction/change data, events, log files, application, and Internet of Things sensor data.

Add structure, logic, and rules to streaming data. Define time windows for analysis. Detect outliers, visualize events of interest, and trigger alerts and automated workflows, all within milliseconds.

Respond faster to your customers, make better decisions, and grow your business with Striim.

Trifacta
www.trifacta.com

Trifacta, the pioneer in data transformation, significantly enhances the value of an enterprise’s big data by enabling users to easily transform raw, complex data into clean and structured inputs for analysis. Leveraging decades of innovative work in human-computer interaction, scalable data management, and machine learning, Trifacta’s unique technology creates a bidirectional partnership between user and machine, with each component learning from the other and becoming smarter through use. Trifacta is backed by venture capital firms Greylock and Accel and is headquartered in San Francisco. Its founders and technical advisers include global leaders in data science, interaction design, and big data.

TDWI RESEARCH

TDWI Research provides research and advice for data professionals worldwide. TDWI Research focuses exclusively on business intelligence, data warehousing, and analytics issues and teams up with industry thought leaders and practitioners to deliver both broad and deep understanding of the business and technical challenges surrounding the deployment and use of business intelligence, data warehousing, and analytics solutions. TDWI Research offers in-depth research reports, commentary, inquiry services, and topical conferences as well as strategic planning services to user and vendor organizations.

555 S Renton Village Place, Ste. 700

Renton, WA 98057-3295

T 425.277.9126

F 425.687.2842

E [email protected]

tdwi.org