
Big Data Analytics Predictions for 2016 With an introduction by Bernard Marr

Sponsored By:

Global Technology Solutions Provider


Table of Contents:

2015: A Transformative Year for Big Data

Cloud Computing Moves to the Edge in 2016

Hadoop, Open Source Adoption and Growth to Accelerate in 2016

Future State: Where Data Integration Will Move in 2016

Maturing IoT Will Drive Innovation Across Industry in 2016

Cognitive Analytics Becomes Essential for Business Users in 2016

Data Governance in 2016: Modeling the Enterprise

2016: A Make or Break Year for Your Data Analytics Initiative

Comprehensive Analytic Maturity Assessment


2015: A Transformative Year for Big Data

By Bernard Marr

Over the past year, big data continued to be big news. Across every industry and government sector, data analytics involving ever-increasing amounts of structured and unstructured data is changing the world.

At the start of the year, pundits were predicting one of two things. Either 2015 would be the year that the “big data fad” finally fizzled, or it would be the year that big data went truly mainstream. Having spent the year working with companies of all shapes and sizes, and speaking to probably hundreds of people involved in analytics projects, I have seen nothing to suggest that the buzz is fizzling out, but plenty that shows mainstream applications. Wherever you look, big data and analytics are at work – across healthcare, crime fighting, finance, insurance, travel, transport, science and entertainment. 2015 was the year big data broke out of the IT lab and became something that everyone wanted to be part of.

So here’s a roundup of some of the milestones and landmarks of 2015 for big data, the highs and – to keep things balanced – the lows as well.

The most obvious is, of course, the ever-increasing size. Data continued to grow at a phenomenal rate, with just one example being the estimated 1 trillion pictures taken with digital cameras. Every day, 315 million of these are uploaded to Facebook. In fact, the number of users logging into Facebook in one day exceeded one billion for the first time in August – that’s one seventh of the world’s population logging into a single network in a single day!

And that’s just a small amount of the data which is being generated by people. In fact, far larger volumes are being generated automatically by machines. Increasingly, due to the Internet of Things, machinery, tools, and vehicles are capable of talking to each other, logging and recording everything that their sensors can measure. This year, driverless cars from many of the major automobile manufacturers took to the roads for trials. And giants of industry such as GE and Rolls Royce pressed on with developing and refining the “industrial Internet.”

2015 was also the year big data went mobile in a big way. We entered the year with, for the first time ever, more people using mobile devices than fixed-line broadband to connect to the Internet. This seismic shift in the way we consume data has led to a widespread “repurposing” of the Internet toward serving up data on the go. More than 14 billion cell phones were shipped this year. On top of that, mainstream consumers began to take wearable technology seriously for the first time, with Apple selling more than 3 million watches and Fitbit shipping more than 4 million wearable fitness trackers.

In homes, too, the Internet of Things continued to grow in popularity. Devices like Nest’s smart thermostat have become an increasingly common sight, often thanks to deals with power companies that see the devices fitted for free when customers sign up for a contract. Devices like these have shown that the efficiency they bring is a benefit to all parties.

Machine learning was another hot topic in 2015. The rise in “cognitive computing,” which can be summed up as the development of computers that learn for themselves rather than having to be programmed, has been one of the most frequent subjects of discussion. This year, IBM continued developing and refining its Watson Analytics engine, which puts artificial intelligence-driven big data analytics in the hands of businesses and individuals around the world.

2015 also may be remembered as the year that legislators began to catch up with the big data revolution. In one court case that is likely to have far-reaching implications, courts in Europe sided with Austrian citizen Max Schrems, who complained that American businesses transmitting his data from the EU to the United States were not taking adequate care that his information would be protected from unauthorized use. This brought to an end the so-called “safe harbor” agreement, under which it was effectively taken for granted that giant U.S. data-driven companies such as Google and Facebook could be trusted to look after our personal data. (In years to come, I predict we will look back on that sort of thinking the way we now look at people who used to think the Earth was flat!) Adapting to the hurdles that this ruling puts in their way undoubtedly will be a priority for data-driven businesses moving into 2016.

The dark side of big data was also brought into focus by the ongoing trend of large-scale data breaches and thefts. 2014 saw a greater than 50 percent increase in the number of people falling victim to these new hazards of the big data age, and although the figures aren’t in yet, it looks like 2015 will smash all records once again. Customers of U.S. pharmacy chain CVS, UK phone retailer Carphone Warehouse, dating site Ashley Madison, crowdfunding service Patreon, password management service Lastpass, and credit reference agency Experian, all experienced the fun and excitement of being informed their highly personal and often financially sensitive data had fallen into the hands of persons unknown, affecting many millions of customers.

Not long ago, some people were quite open about their suspicion of the term “big data.” There were those who considered it a buzzword or fad that would soon die out, and said that those hawking it would move onto whatever became fashionable next.

Meanwhile another group of detractors, these ones harboring no doubts over the ability of big data to change the world, warned it was more likely to become a tool of surveillance and oppression rather than positive economic and social change.


This year has certainly proven the first group to be mistaken. Investment in big data and analytical technology continues to grow at an ever-increasing rate. And the results are apparent in the flood of companies coming forward to talk about how their analytics projects are driving success and growth.

And the second group? Well, their fears are not as easy to dismiss. Nobody knows what tomorrow will bring. What I do know is that, of the hundreds of people I have spoken with about big data this year, the vast majority are confident that its potential for good outweighs its potential for bad. But that doesn’t mean that we shouldn’t continue to be vigilant. Governments as well as corporations must always be held to account to ensure they have our best interests at heart when we let them use our data.

With that in mind, 2016 is likely to be another boundary-shattering year for those of us lucky enough to be part of the big data revolution, and I, for one, am very excited about what it will bring.

Bernard Marr is a bestselling author, keynote speaker, strategic performance consultant, and analytics, KPI, and big data guru. In addition, he is a member of the Data Informed Board of Advisers. He helps companies to better manage, measure, report, and analyze performance. His leading-edge work with major companies, organizations, and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant, and teacher.


Cloud Computing Moves to the Edge in 2016

By Andy Thurai and Mac Devine

The year 2016 will be exciting in terms of applied technologies. We see a lot of technologies maturing and moving from lab exercises to real-world business technologies that solve real-life customer problems – especially in the areas of digital transformation, API, cloud, analytics, and the Internet of Things (IoT). In particular, we see the following areas evolving faster than others.

Year of the Edge (Decentralization of Cloud)

Cloud has become the ubiquitous solution for many enterprises in their quest to provide a single unified digital platform. Integrating core IT with shadow IT has been the main focus for the last few years, but in 2016 we anticipate the next step in this process. We have started seeing companies move from central cloud platforms toward the edge, or toward decentralizing the cloud. This is partly because, with the proliferation of IoT devices, operations technologies (OT) and decision intelligence need to be closer to the field than to the central platform.

Cloud has become the massive centralized infrastructure that is the control point for compute power, storage, process, integration, and decision making for many corporations. But as we move toward IoT proliferation, we need not only to account for billions of devices sitting at the edge, but also to provide quicker processing and decision-making capabilities that will enable the OT. Areas of low or no Internet connectivity need to be self-sufficient to enable faster decision making based on localized and/or regionalized data intelligence.

An IDC study estimates that, by 2020, we will have 40+ zettabytes of data. IDC also predicts that, by 2020, about 10 percent of the world’s data will be produced by edge devices. Unprecedented and massive data collection, storage, and intelligence needs will drive a major demand for speed at the edge. Services need to be connected to clients, whether human or machine, with very low latency, yet must retain the ability to provide a holistic intelligence. In 2016, the expansion of the cloud – moving a part of cloud capabilities to the edge – will happen.

Because of the advent of microservices, containers, and APIs, it is easy to run smaller, self-contained, purpose-driven services that target only the specific functions needed at the edge. The ability to use containers for mobility, together with the massive adoption of Linux, will enable much thicker, monolithic services that previously ran centrally to be “re-shaped” into collections of smaller, purpose-driven microservices. Each of these can be deployed and run at the edge as needed and on demand. Spark is an excellent example of this because it is focused on real-time streaming analytics, which is a natural “edge service.”
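To make this concrete, below is a minimal sketch of what one such purpose-driven edge service could look like, written as a Spark Structured Streaming job in Python. It is illustrative only: it assumes pyspark is available and that sensor readings arrive as “sensor-id,value” lines on a local socket, and the host, port, field names, and window sizes are invented for the example rather than taken from the article.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# A small, self-contained "edge service": aggregate local sensor readings
# in near real time instead of shipping every raw message to a central cloud.
spark = SparkSession.builder.appName("edge-aggregation-service").getOrCreate()

# Hypothetical feed: each line is "sensor-id,value" pushed to a local socket.
raw = (spark.readStream
       .format("socket")
       .option("host", "localhost")
       .option("port", 9999)
       .load())

readings = raw.select(
    F.split("value", ",").getItem(0).alias("sensor_id"),
    F.split("value", ",").getItem(1).cast("double").alias("reading"),
    F.current_timestamp().alias("event_time"),
)

# Localized intelligence: rolling 30-second averages per sensor, computed at the edge.
averages = (readings
            .withWatermark("event_time", "1 minute")
            .groupBy(F.window("event_time", "30 seconds"), "sensor_id")
            .agg(F.avg("reading").alias("avg_reading")))

# In a real deployment the sink might be a local actuator or an upstream queue;
# the console sink keeps the sketch self-contained.
query = (averages.writeStream
         .outputMode("update")
         .format("console")
         .start())
query.awaitTermination()
```

Packaged in a container, a job like this can be shipped to edge hardware and started on demand, which is the kind of “re-shaping” described above.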

M2M Communications Will Move to the Main Stage

The proliferation of billions of smart devices around the edge will drive direct machine-to-machine (M2M) communications instead of the centralized communication model. The majority of the IoT interactions are still about humans (such as the concept of the Quantified Self), and a human element also is involved in the decision making somewhere, even if it is not about the Quantified Self.

We predict that the authoritative decision-making source will begin moving slowly toward machines. This will be enabled by M2M interactions. The emergence of cognitive intelligence platforms (such as IBM Watson) and machine-learning services (such as BigML) will drive this adoption. Currently, trust and security are the major factors preventing this from happening on a large scale. By enabling a confidence-score-based authoritative source, we can eliminate the human involvement and the ambiguity in decision making. This will enable autonomous M2M communication, interaction, decision making, intelligence, and data sharing, which will lead to replication of intelligence for quicker localized decisions. In addition, when there is a dispute, the central authoritative source, with cognitive powers, can step in to resolve the issue and make the process smoother – without the need for human intervention.
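As a rough illustration of the confidence-score idea, the toy Python sketch below weights device readings by a trust score and only acts autonomously when the trusted evidence clears a threshold, escalating to a central authority otherwise. The thresholds, field names, and decision labels are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    device_id: str
    value: float
    trust: float  # 0.0 (untrusted) to 1.0 (fully trusted), maintained by the central authority

TRUST_FLOOR = 0.5           # readings below this are treated as rogue and ignored
CONFIDENCE_THRESHOLD = 0.8  # minimum share of trusted evidence for an autonomous decision

def decide_locally(readings, limit=75.0):
    """Act autonomously when trusted evidence is strong enough, otherwise escalate."""
    trusted = [r for r in readings if r.trust >= TRUST_FLOOR]
    if not trusted:
        return "escalate-to-central-authority"
    weight = sum(r.trust for r in trusted)
    weighted_avg = sum(r.value * r.trust for r in trusted) / weight
    confidence = weight / len(readings)  # rogue or low-trust devices drag confidence down
    if confidence >= CONFIDENCE_THRESHOLD:
        return "throttle-pump" if weighted_avg > limit else "no-action"
    return "escalate-to-central-authority"

if __name__ == "__main__":
    sample = [Reading("pump-1", 80.2, 0.90),
              Reading("pump-2", 79.8, 0.85),
              Reading("pump-3", 10.0, 0.20)]  # looks like a rogue or breached device
    print(decide_locally(sample))             # low overall confidence -> escalate
```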

This centralized cognitive intelligence can also manage the devices, secure them, and maintain their trust. It can help eliminate rogue devices from the mix, give a lower rating to untrusted devices, eliminate the data sent by breached devices, and give a lower score to the devices in the wild versus a higher score to the devices maintained by trusted parties.

Smart Contracts to Enable Smarter Commerce

Another trend that is gaining a lot of traction is smart, automated commerce. Even though edge devices are growing into the billions, the monetization of those devices is still sporadic – there is no consistent way to commercialize those edge IoT devices. This is where the Blockchain concept can help. Edge IoT devices can create smart contracts and publish their details – such as pricing, terms, length, delivery mechanisms, and payment terms – to the Blockchain network. The data consumer can browse the list of published smart contracts, choose a good match, and auto-negotiate the contract.

Once the terms are agreed upon and the electronic agreement is signed, the data supplier can start delivering the goods to the consumer and get paid from the source automatically. Removing the need for human intervention will make commerce faster and smarter. This automation also gives the data consumer the option to evaluate the value of the data being received on a constant basis. The ability to re-negotiate or cancel the contract at any time, without a long-term commitment, makes smart contracts more attractive.

On the flip side, the data provider also can choose to cancel or re-negotiate the contract, based on contract violation, market demand, deemed usage, etc.
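A minimal, purely illustrative sketch of that publish/browse/auto-negotiate flow is below. It uses an in-memory list as a stand-in for the Blockchain network, and all contract fields, prices, and device names are hypothetical.

```python
# Toy stand-in for a Blockchain contract registry: a shared, append-only list.
ledger = []

def publish_contract(device_id, price_per_mb, term_days, delivery, payment):
    """Supplier side: an edge device publishes the terms under which it sells its data."""
    contract = {
        "device": device_id,
        "price_per_mb": price_per_mb,
        "term_days": term_days,
        "delivery": delivery,   # e.g., "mqtt" or "https"
        "payment": payment,     # e.g., "per-message" or "monthly"
        "status": "published",
    }
    ledger.append(contract)
    return contract

def choose_and_sign(max_price, preferred_delivery):
    """Consumer side: browse published contracts and auto-accept the cheapest match."""
    candidates = [c for c in ledger
                  if c["status"] == "published"
                  and c["price_per_mb"] <= max_price
                  and c["delivery"] == preferred_delivery]
    if not candidates:
        return None
    best = min(candidates, key=lambda c: c["price_per_mb"])
    best["status"] = "signed"   # on a real chain this would be a signed transaction
    return best

publish_contract("weather-station-17", 0.02, 30, "mqtt", "per-message")
publish_contract("traffic-cam-04", 0.05, 90, "https", "monthly")
print(choose_and_sign(max_price=0.03, preferred_delivery="mqtt"))
```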

Another important aspect of edge IT and edge processing, which includes IoT and fog computing, is monetization and commercialization. Currently, most IoT companies are popularizing their IoT gadgets and solution sets based on how innovative they are. The commercialization aspect of the gadgets themselves is very limited, however, and will not by itself deliver on the true IoT concept. Once companies figure out the value of their data, offering Data as a Service or even Data Insights as a Service will become more popular. Once this happens, we predict that companies will rush to build and maintain the infrastructure for an open source economy, in which data and data-based insights can be easily produced and sold.

FACTS-based Smarter Systems Finally Come to Fruition

IoT helps bridge the gap between IT and operations technologies (OT). Currently, most of the core IT decisions about OT are based either on old data (data that is more than seconds old) or on some estimation. The current decisions in the field regarding OT are made based on isolated data sets that are partial in nature and delayed. This leads to subjective decisions.

Going forward, based on the growing decentralization of cloud and M2M communications, as well as real-time interaction of the OT data set with core IT, decisions will be made on close-to-complete, ecosystem-wide, real-time data. This will lead to objective decisions. These fast, accurate, complete, trusted, scalable (FACTS) real-time systems will make core IT business decisions in real time and enforce them at the OT level.


As discussed above, Apache Spark allows necessary services such as analytics, data intelligence, security, and privacy all to be containerized and moved closer to the edge rather than processed centrally. This allows the edges not only to make decisions based on events happening elsewhere in the enterprise, but also to make decisions that are faster, more complete, and more accurate, all the time.

Andy Thurai is Program Director for API, IoT, and Connected Cloud with IBM, where he is responsible for solutionizing, strategizing, evangelizing, and providing thought leadership for those technologies. Prior to this role, he held technology, architecture leadership, and executive positions with Intel, Nortel, BMC, CSC, and L-1 Identity Solutions. You can find more of his thoughts at www.thurai.net/blog or follow him on Twitter @AndyThurai.

Mac Devine has 25 years of experience with networking and virtualization. He became an IBM Master Inventor in 2006, an IBM Distinguished Engineer in 2008, and has been a Member of the IBM Academy of Technology since 2009. Mac currently serves as Vice President of SDN Cloud Services, CTO of IBM Cloud Services Division, and as a faculty member for the Cloud and Internet-of-Things expos. He also is a member of the Data Informed Board of Advisers.


Hadoop, Open Source Adoption and Growth to Accelerate in 2016

By Elliott Cordo, Chief Architect, Caserta Concepts

The past few years have seen major advances in open-source technology in the data analytics space. These advancements are impressive both from a technology perspective and for what they say about the overall acceptance of the open-source model.

Now that 2015 has drawn to a close, let’s review some of the main themes of this past year’s data analytics evolution and use them to predict the future of open source for 2016 and beyond.

The open-source movement itself is just as exciting as the technological innovations that fostered it. The beauty of open-source software, as we all know, is that it allows for collaborative development and gives license holders the ability to modify and distribute the software as they choose. Sounds great.

But just a few years ago, most mature IT organizations were terrified of open-source software. This applied not just to systems and frameworks, but also to programming languages. There was an inherent fear of not having commercial support available and a general lack of trust in community-driven development.

Over these past few years, the innovation of the big data revolution gave businesses and IT organizations a forcing factor to reconsider their stance on open-source software. To benefit from software such as Apache Hadoop, Spark, and Cassandra, they had to ease the mandate for commercially available software. And to address the desire for value-added features and traditional support capabilities, a rich ecosystem of commercial support organizations emerged. This included major “distributions” of Hadoop as well as distribution and commercial support for just about every mainstream open-source project.

A look at the technology environment today shows that the tables have totally turned. Even in large, conservative organizations, we see a preference for open-source software over even the tried-and-true commercial, closed-source options. After emerging from their bubble, organizations have begun objectively questioning whether licensing and support costs are providing true ROI. They are becoming increasingly concerned about vendor lock-in and see the strategic limitations of being unable to fix bugs or enhance these commercial offerings. If companies want to change something in an open-source project, or accelerate a bug fix, they now have the opportunity simply to do it themselves.

Due to the pressure to provide an open environment, several traditional commercial software companies have chosen to open-source their software. Perhaps one of the most significant examples is Pivotal. The company began open sourcing its Hadoop projects, including its flagship SQL-on-Hadoop engine HAWQ, earlier this year, and just recently open-sourced its traditional MPP database, Greenplum. Another good example is Intel. The company recently open-sourced its Trusted Analytics Platform (TAP), a significant move because it provides a full-featured ecosystem that makes it easier for organizations to create big data applications. Even traditional commercial tech giants like Microsoft are dabbling in open source, committing to open-source platforms that compete against their customary offerings.

We see this open-source movement as long-term and anticipate that it will accelerate greatly in 2016. Due to increasing interest, we predict additional closed-source projects will be converted to open source, as well as continued growth of the open-source distribution and commercial support ecosystem. It’s hard to guess what’s next, but we expect some conversions in the MPP/data warehousing space. This is mainly due to competition from Apache Hadoop, Spark, and cloud-based commercial offerings such as Amazon Web Services’ Redshift.

Hadoop: Open-Source Catalyst

For many, Apache Hadoop is the poster child of the open-source revolution. Since its mainstream adoption began four years ago, Hadoop has been the single biggest catalyst for open-source adoption in data analytics.

Hadoop has matured over the past few years and, at the same time, rapidly evolved. It is now more of a data operating system than simply an application or platform. Allowing all types of distributed applications to “plug in” and take advantage of distributed storage (HDFS) and robust resource management (YARN), Hadoop now finds itself the home for all sorts of applications, including distributed databases, data processing, and even data warehousing.

However, this new open-source world is very disruptive. Advancements in data analytics technology are happening all the time, and there is competition for Hadoop as a platform as well as for the applications that call Hadoop home.

Apache Spark is the leader in this challenge. Born as a research project at UC Berkeley, this open-source platform has skyrocketed in popularity and adoption. 2015 was definitely the year of Spark, and we predict a similar theme in 2016. For now, the majority of Spark implementations reside on Hadoop, leveraging the core services of YARN and HDFS. However, these services are pluggable, and there are other resource managers, such as Mesos, and other distributed file systems, such as Lustre and AWS S3. For 2016, we predict that Hadoop remains Spark’s favorite home. But in the long term, this may change.

Upheaval in the market comes not just from open source, but also from the cloud. Disruptive services from Amazon, such as S3, Lambda, Kinesis, EMR, and DynamoDB, provide fierce competition for many of Hadoop’s core applications. Even Elastic MapReduce (EMR), AWS’s on-demand Hadoop service, challenges Hadoop’s value proposition. Instead of presenting Hadoop as a persistent, central, solve-for-all analytics platform, it promotes using Hadoop on demand, as a pluggable solution applied where it best fits the problem.
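The sketch below shows roughly what “Hadoop on demand” looks like in practice: a transient EMR cluster that spins up, runs one Spark step, and tears itself down. It is a sketch only, assuming boto3 and AWS credentials are configured; the bucket path, job script, instance types, and release label are placeholders.

```python
import boto3

# Launch a transient EMR cluster: it exists only long enough to run one Spark step.
emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="on-demand-analytics",
    ReleaseLabel="emr-4.2.0",                        # placeholder release label
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "InstanceType": "m4.large", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE",
             "InstanceType": "m4.large", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": False,         # tear down once the step completes
    },
    Steps=[{
        "Name": "spark-etl",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://example-bucket/jobs/etl.py"],  # hypothetical job
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)

print("Started transient cluster:", response["JobFlowId"])
```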

We predict that Hadoop’s growth and adoption will continue to increase in 2016. This adoption will be fueled by exciting new use cases, such as the Internet of Things, as well as traditional use cases, such as data warehousing and ETL (extract, transform, load) processing. As Hadoop and Spark are “friends” for the near future, continued adoption of Spark also will fuel Hadoop’s growth.

Elliott Cordo is a big data, data warehouse, and information management expert with a passion for helping transform data into powerful information. He has more than a decade of experience in implementing big data and data warehouse solutions with hands-on experience in every component of the data warehouse software development lifecycle. As chief architect at Caserta Concepts, Elliott oversees large-scale major technology projects, including those involving business intelligence, data analytics, big data, and data warehousing. Elliott is recognized for his many successful big data projects ranging from big data warehousing and machine learning to his personal favorite, recommendation engines. His passion is helping people understand the true potential in their data, working hand-in-hand with clients and partners to learn and develop cutting edge platforms to truly enable their organizations.


Future State: Where Data Integration Will Move in 2016

By Nidhi Aggarwal

Data integration technology has evolved slowly since the rise of data warehousing in the 1990s. At the time, systems emerged to manage several data sources, typically fewer than 20, within the warehouse with what database pioneer Mike Stonebraker has labeled “first-generation” extract, transform, and load (ETL). Not long after, second-generation systems provided incremental cleaning and preparation capabilities. We employed this basic design for most of the next 20 years.

More recently, the “big data era” and the explosion of available data have strained ETL architectures built to handle a small fraction of the thousands of diverse and often siloed sources within today’s enterprise. Over the last few years, this has led to waves of data integration innovation, including self-service and machine-learning solutions.

So what in the world of accelerating innovation should we look for next? Here are a few thoughts, based on what we have been seeing in the market this year at Tamr:

The Strategic Embrace of Data Variety

More and more enterprises are recognizing the massive untapped analytic value of “long-tail” data – information often dispersed in datasets across the organization and, as a result, difficult to see, much less locate. Consider the simple-to-ask but difficult-to-answer question, “Are we getting the best price for any given product we purchase at our company?” Our natural tendency is to play “grab and go” with the procurement data we know and see. As such, we end up ignoring sourcing data in other divisions that other people own – and that could be just the data needed to optimize the procurement system.

Think you are embracing all this variety and value by moving everything into a data lake? Think again. Yes, in theory data lakes are inexpensive and can hold a ton of data in diverse formats from disparate sources. In practice, lakes can get so large and loose that they risk becoming unusable.

Smart companies get this, knowing that incremental “spray and pray” investments in data warehouses and other traditional infrastructure aren’t enough to harness variety and harvest its full value. In 2016, this incremental approach will give way to much more strategic investments in data systems that embrace all enterprise silos through the following:

• Distributed, shared-nothing, multipurpose hardware infrastructure (which is what the big Internet companies began building years ago, launching the Hadoop and NoSQL movements)

• Software that can handle data variety at scale by utilizing automated integration approaches that also tap into human expertise

• Modern data storage systems that include both access via a declarative language (SQL) as well as other more direct methods (JSON primary among them)
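As a small illustration of the last point, the sketch below keeps the same supplier records available two ways: through a relational table for declarative SQL questions and as raw JSON documents for direct, programmatic access. It uses only the Python standard library, and all table names, fields, and values are invented.

```python
import json
import sqlite3

# One store, two access paths: a relational table for SQL and the raw JSON
# documents for direct access to nested fields.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suppliers (id TEXT PRIMARY KEY, region TEXT, unit_price REAL, doc TEXT)")

records = [
    {"id": "sup-001", "region": "EMEA", "unit_price": 4.10, "contact": {"email": "a@example.com"}},
    {"id": "sup-002", "region": "APAC", "unit_price": 3.85, "contact": {"email": "b@example.com"}},
]
for r in records:
    conn.execute("INSERT INTO suppliers VALUES (?, ?, ?, ?)",
                 (r["id"], r["region"], r["unit_price"], json.dumps(r)))

# Declarative access: SQL answers the "best price" style of question directly.
best_price = conn.execute("SELECT MIN(unit_price) FROM suppliers").fetchone()[0]

# Direct access: pull the full JSON document and walk its nested structure.
doc = json.loads(conn.execute("SELECT doc FROM suppliers WHERE id = ?",
                              ("sup-002",)).fetchone()[0])

print(best_price, doc["contact"]["email"])
```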

The Rewiring of the Analytics Process

A recent survey of 316 large global company executives by Forbes Insights found that 47 percent “do not think that their companies’ big data and analytics capabilities are above par or best of breed.”

Given the accelerating investments in big data analytics, this is too big a gap to ascribe to overheated expectations. I believe the problem has more to do with the analytics process itself. Specifically, too many analytics projects:

• Start from the wrong place. Too often, we launch analytics with the data that we believe is available as opposed to the questions we want to answer – which, consequently, limits the number and types of problems we can solve to the data that we can locate in the first place.

• Take too long. Conventional industry wisdom holds that 80 percent of analytics time is spent on preparing the data, and only 20 percent on actually analyzing the data. With massive reserves of enterprise data scattered across hundreds of disparate silos, manually integrating information for analysis significantly delays attempts to answer mission-critical questions.

• And still fall short. Manual integration can also significantly diminish the quality and accuracy of the answers, with incomplete data potentially delivering incorrect insights and decisions.

In 2016, enterprises will move toward human-machine preparation and analytics solutions designed specifically to get businesses more and better answers, faster and continuously. In other words:

• Speed/Quantity. Data preparation platforms will become faster, nimbler, and more lightweight than traditional ETL and Master Data Management solutions, allowing enterprises to get more answers faster by spending less time preparing data and more time analyzing it.


• Quality. Advanced cataloging software will identify much more of the data that are relevant for analysis, allowing enterprises to get better answers to questions by finding and using more relevant data in analysis – not just what’s most obvious/familiar.
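One way to picture this “human plus machine” preparation is the toy matcher below: it unifies supplier names from two silos, auto-merging obvious matches and routing borderline ones to a data steward. It uses only the Python standard library, and the names and thresholds are invented for the example.

```python
from difflib import SequenceMatcher

# Supplier names from two silos that need to be unified before analysis.
procurement = ["Acme Industrial Supply", "Globex Corporation", "Initech LLC"]
field_ops   = ["ACME Industrial Supply Inc", "Globex Corp", "Umbrella Holdings"]

AUTO_ACCEPT = 0.85   # the machine handles obvious matches on its own
ASK_A_HUMAN = 0.60   # borderline pairs are routed to a data steward

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for left in procurement:
    best = max(field_ops, key=lambda right: similarity(left, right))
    score = similarity(left, best)
    if score >= AUTO_ACCEPT:
        print(f"AUTO-MERGE  {left!r} <-> {best!r} ({score:.2f})")
    elif score >= ASK_A_HUMAN:
        print(f"REVIEW      {left!r} <-> {best!r} ({score:.2f})")
    else:
        print(f"NO MATCH    {left!r} ({score:.2f})")
```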

The Adoption of ‘DataOps’

DevOps – which combines software engineering, quality assurance, and technology operations – has emerged over the last 10 years because traditional systems management couldn’t meet the needs of modern, web-based application development and deployment.

Next year will see the emergence of another management method to meet the needs of the modern era: DataOps for the big data era.

Organizations are starting to realize two critically important things as they continue to democratize analytics. One, they can’t maximize ROI without optimizing data management to make clean, unified data accessible to every analyst. And two, the infrastructure required to support the volume, velocity, and variety of data available in the enterprise today is radically different than what traditional data management approaches can provide.

DataOps comprises four processes – data engineering, integration, quality, and security/privacy – working together in a successful workflow aimed at helping the enterprise rapidly deliver data that accelerates analytics and enables analytics that previously were impossible.

To adopt this revolutionary data management method, enterprises will need two basic components. The first is cultural – an environment of communication and cooperation among data analytics teams. The second is technical – workflow automation technologies such as machine learning to recommend, collect, and organize information. This groundwork will help radically reduce administrative debt and vastly improve the ability to manage data as it arrives.
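A toy sketch of the four DataOps processes wired into one workflow is shown below; each stage is deliberately trivial, and every function, field, and rule is hypothetical.

```python
# A toy DataOps-style workflow: each stage is a small, testable function and the
# pipeline wires them together so data moves from raw to analytics-ready.
def engineer(raw_rows):
    """Data engineering: parse and type the raw feed."""
    return [{"user": r[0], "spend": float(r[1])} for r in raw_rows]

def integrate(rows, reference):
    """Integration: enrich with a reference dataset (e.g., a region lookup)."""
    return [{**r, "region": reference.get(r["user"], "unknown")} for r in rows]

def check_quality(rows):
    """Quality: drop records that violate basic expectations."""
    return [r for r in rows if r["spend"] >= 0]

def protect(rows):
    """Security/privacy: mask direct identifiers before wide sharing."""
    return [{**r, "user": "user-" + str(abs(hash(r["user"])) % 10_000)} for r in rows]

def pipeline(raw_rows, reference):
    for stage in (engineer, lambda x: integrate(x, reference), check_quality, protect):
        raw_rows = stage(raw_rows)
    return raw_rows

print(pipeline([("alice", "12.50"), ("bob", "-3.00")], {"alice": "EMEA", "bob": "APAC"}))
```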

I see innovation in the area of data integration continuing to accelerate in 2016, especially with the evolution of machine learning solutions to automate and DataOps to manage more and more of the process. This will have an intensely positive impact for enterprises that want to embrace the full variety and value of their available data.

Nidhi Aggarwal is Global Head of Operations, Strategy, and Marketing at Tamr.


Maturing IoT Will Drive Innovation Across Industry in 2016

By Don DeLoach

From an Internet of Things perspective, what can we expect in 2016? The short answer is: A lot. There is so much effort now going into the Internet of Things that the only certain prediction is that we will see more IoT progress in 2016 than we have ever seen before. I would go so far as to suggest that 2016 will bring about more progress in IoT than we have seen cumulatively to date. So I don’t believe there is any one specific prediction that stands alone. Here are a few of my thoughts about what we can expect for the Internet of Things in 2016:

TeleHealth/mHealth will continue to gain traction and attention. The world of Fitbits and smart phone apps for health-related considerations is cute, fun, and potentially really cool. What is becoming much clearer is that this is not merely a fun little initiative that can show you how many stairs you climbed today, but a major force in changing the way healthcare is delivered (and empowered) on a global basis. This includes a number of elements. First, the extreme sophistication of certain wearable devices is creating digital health signatures in ways never really contemplated (at scale) before. The result is an ability to do much more granular analysis – and at a lower cost. And the more this accelerates, the more good ideas iterate and the more we continue to accelerate.

This dovetails into the world of TeleHealth. Arguably, this has been evolving for many years, but the increase in sophistication of TeleHealth technology and the linkages with wearables creates a phenomenal new level of insight into patient care. Granted, the proliferation of wearables is still limited, but the capabilities of TeleHealth are now much more widely understood, and allow the doctor to do just about everything but touch the patient. The quality of the data being received and analyzed just keeps getting better. And the results are an increase in the quality of outcomes and a reduction in the cost of delivery. Think about that combination. The economic impact alone is massive, not to mention increased quality of life for so many, many people. No wonder this is picking up steam.


Insurance (on an ever broader scale) does IoT like the Oakland A’s do scouting. By now, most of us have heard about usage-based auto insurance, in which customers’ driving habits are recorded for the insurance company so that they ostensibly get better rates for better driving behavior. The technology is available through the OBD-II (On-Board Diagnostics) units in cars and the associated data logging that records driving behavior (the earliest versions of this date back to 1968, when Volkswagen delivered the first, primitive implementation). Insurance companies traditionally have assessed risk based on a profile built from known statistical information about the likelihood that, for example, a 28-year-old male living in Atlanta with a wife and no kids, and one prior speeding ticket three years ago, will be in an accident. Now the industry can assess risk based on how a specific, individual driver fitting that profile actually drives.
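A toy version of that usage-based scoring is sketched below: trip-level telemetry of the kind an OBD-II dongle might log is reduced to a risk score and then to a premium adjustment. The fields, weights, and caps are invented purely for illustration.

```python
# Each trip record is a simplified stand-in for what an OBD-II dongle might log.
trips = [
    {"miles": 12.4, "hard_brakes": 1, "max_speed_over_limit": 4,  "night": False},
    {"miles": 30.1, "hard_brakes": 5, "max_speed_over_limit": 18, "night": True},
    {"miles": 8.0,  "hard_brakes": 0, "max_speed_over_limit": 0,  "night": False},
]

def trip_risk(trip):
    """Crude per-trip risk score; the weights are made up for illustration."""
    score = 0.0
    score += 2.0 * trip["hard_brakes"] / max(trip["miles"], 1.0)   # braking per mile
    score += 0.1 * trip["max_speed_over_limit"]                    # speeding severity
    score += 0.5 if trip["night"] else 0.0                         # night driving
    return score

def premium_adjustment(trips, base_premium=100.0):
    """Map the average trip risk to a discount or surcharge on the base premium."""
    avg = sum(trip_risk(t) for t in trips) / len(trips)
    factor = min(max(0.8 + 0.1 * avg, 0.8), 1.3)   # cap between -20% and +30%
    return round(base_premium * factor, 2)

print(premium_adjustment(trips))
```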

But this is where the story starts, not stops. With the Internet of Things, there is an increasing amount of data about how you live (wearables), and even how you live inside your house (smart homes, security, energy, etc.). So what we see happening for auto insurance is something we should expect to see for health insurance and homeowners insurance, and certainly that will extend to businesses as well. The cool thing about this is that it illustrates the utility value of IoT data. The data collected by the car is not primarily used for insurance; it is used by the auto manufacturers for the servicing and enhancement of their products. Your Fitbit is not primarily an insurance tool – it is a health tool. Your IoT-enabled door locks, Nest Thermostat, Big Ass fan, smart refrigerator, and Dropcams are not for insurance – but the data they generate can be. With that, the insurance industry illustrates the big idea underneath all of the Internet of Things – the utility value of the data.

Retail pilots increasingly showcase the power of IoT, paving the way to a dramatically different world by 2018. Like healthcare, retail is an area where huge changes are coming. From innovative growth companies like Kairos to global giants like Accenture, there is a wealth of insight and innovation going into the retail space. What has been learned from commerce and the technology advances behind the Internet of Things come together to create altogether new shopping experiences. This means facial recognition detecting emotion and heat mapping to understand the flow of traffic through stores. It also means digital dressing rooms that “know” who you are based on a good amount of your past preferences. For example, it’s a fact that if you leave the dressing room because you did not like what you tried on, you are unlikely to find something else and come back. In this case, the recommendation engine kicks in and the store clerk is there with new, high-probability offerings that you can try on without having to change again to go back out into the store. This is very cool, but requires a bit of a cultural shift to adapt to the new experience. Moreover, wholesale upgrading of retail infrastructure will take some time, so my gut feeling is that this is a serious trend and 2016 will make that evident, but widespread proliferation will be a little further out. One interesting example of retail initiatives can be found at the Remodista Project.


North America gets serious about IoT. Many of the early advances in IoT took place outside of North America, due in part to Europe’s greater focus on sustainability. Among other things, this resulted in the formation of a number of public-private initiatives, notably in Spain (Barcelona, Zaragoza, Santander) and the Netherlands (Smart Amsterdam, Eindhoven Living Lab, etc.). This is changing. Where technology advances, so do Silicon Valley, Boston, and other tech hubs in North America. We have seen a number of one-off projects in many cities throughout North America, and we are now seeing deliberate regional efforts, not unlike those in Spain and the Netherlands, where public-private initiatives have engaged. Most notably, the ITA Midwest IoT Council, based in Chicago, was announced in April of 2015 with the goal of promoting and embracing IoT in the Midwest. The Council formed with 17 inaugural Board members (now 18) and, a mere seven months later, has five working committees and participation from 130 different organizations. Other cities and regions are mobilizing to follow this example. The Midwest effort showcases IoT initiatives ranging from GE/SmartSignal to the Array of Things project driven by the Urban Center for Computation and Data of the Computation Institute, a joint initiative of Argonne National Laboratory and the University of Chicago. The outlook for regional initiatives is promising, and I expect 2016 to be the year when there is clear evidence that North America has embraced IoT.

Maturity breeds understanding: The role of governance and architecture begins to have a wide impact on deployment strategies. Until recently, most IoT projects have been deployed as closed loop, message-response systems. This is beginning to change and will only pick up momentum. The reason for this is the data. Just like in the insurance examples, it will become increasingly clear that the real value for IoT is realized in the data, as the underlying data will be used by a variety of consuming applications. However, to facilitate this, deployment architectures must contemplate how messages travel from ingestion to use in consuming applications, including the cleansing and enriching process – especially at the point of first receiver. This, in turn, raises the question of ownership and stewardship of the data, which speaks to security and privacy issues. To provide ideal leverage of IoT data for all consuming constituencies, these architectural and governance issues will have to be addressed. Organizations that do this with a well-thought-out deployment architecture likely will be big winners, and those that do not will lose out. But there is a growing amount of anecdotal evidence to suggest that more and more organizations want access to, if not outright ownership of, the data created by their deployed IoT subsystems. That will increasingly force this issue, which I believe will begin to play out in earnest in 2016.

Edge computing will become a mainstream consideration. Right now, the default IoT deployment model is the IPv6-based sensor that sends a message into the machine cloud. More and more, this will be called into question, due in part to the governance and architectural reasons expressed above. Additionally, the notion of pushing every single message into a central repository will not always make sense because, in many instances, the vast majority of the messages (temperature readings, carbon dioxide readings, etc.) are individually inconsequential, and pushing every one of them into the cloud costs more money and dilutes the value of the payload. There are many other reasons why edge computing will become increasingly important but, in a nutshell, we will see this become a mainstream consideration in 2016.

Don DeLoach is CEO and president of Infobright and a member of the Data Informed Board of Advisers. Don has more than 30 years of software industry experience, with demonstrated success building software companies with extensive sales, marketing, and international experience. Don joined Infobright after serving as CEO of Aleri, the complex event processing company, which was acquired by Sybase in February 2010. Prior to Aleri, Don served as President and CEO of YOUcentric, a CRM software company, where he led the growth of the company’s revenue from $2.8M to $25M in three years, before the company was acquired by JD Edwards. Don also spent five years in senior sales management, culminating in the role of Vice President of North American Geographic Sales, Telesales, Channels, and Field Marketing. He has also served as a Director at Broadbeam Corporation and Apropos Inc.


Cognitive Analytics Becomes Essential for Business Users in 2016

By Suman Mukherjee

Business users across functions and company sizes are quickly graduating from being just consumers of analytics to actually doing analytics in a self-service mode with minimal dependencies. Over the past decade, the clout of line-of-business users in the analytics purchase decision-making process has increased, too. As business users make critical business decisions on a daily basis, their expectations of analytics have evolved, and so have their ways of interacting with the tools. In the face of all these exciting changes with business users at their center, I see cognitive analytics becoming essential for business users going forward.

I don’t mean that business users will have to equip themselves with technical expertise to leverage cognitive analytics solutions. I mean that providers will deliver cognitive analytics products in a self-service flavor, focused on the needs of regular business users and on maximizing their day-to-day productivity.

Let’s set the context. Data today is characterized by its sheer volume and abundance, yet much of it remains under-utilized.

Machine learning systems, with training, detect patterns from deep within that data – or those related and even unrelated data sets – and surface insights often invisible to the naked eye. Cognitive capabilities built on top of such machine-learning systems transcend pre-defined rules and structured queries. Instead, they dynamically modify these resultant insights and their relevance to reflect the user’s query and the context. And they keep learning to perfect this over time.

Although it might sound a bit far-fetched for cognitive analytics to become an essential technology for a regular business user at this point in time, progressive technology companies seem committed to making it possible.

So what would be an ideal cognitive analytics solution?

In my opinion, it will be about more than just the ease of use. It will be about more than just leveraging different data sources (both on-premise and in the cloud) across multiple data types. It will be about more than just the breadth of features and the ability to share and collaborate from/within the same user interface. It will be about more than just the low cost of ownership and a faster time-to-insight, along with the flexibility to scale per need for such a solution. It will be about more than just integration of such systems with the existing analytic platforms and installed applications. It will be about more than just the ease of consumption across different form-factors and, finally, it will be about more than just those pretty visualizations.

For cognitive analytics to become essential and pervade the farthest corners of a business, it needs to embody all of the above and then add some tailored customizations on top of it. While that might seem a tall order, keep in mind that, two decades ago, so did BI, predictive, and self-service analytics.

The following are some interesting trends to watch for in self-service style cognitive analytics products for business users in 2016:

Solution Providers Will Be Committed to Business Users

• Cognitive analytics solution providers will invest heavily to cater to the specific needs and style of working of a business user.

• Flexible and economical deployment models will become available.

• Such solutions will extend beyond in-house historical structured data and flat files, providing the ability to leverage data on weather, social media, and industry to help users and organizations stay on top of critical micro-trends.

• Innovative apps will focus on specific industry-function combinations and provide out-of-the-box solutions with the ability to utilize users’ own data.

Cognitive Solutions Will Maximize Productivity

• Guidance-based features and an easy-to-use self-service interface will ensure the fastest time-to-insight.

• Automated features will smartly mask the complexities so that the business user does not need any prerequisite expertise in the BI, predictive, or cognitive technologies that run in the background.

• Cognitive starting points and context-based insights both will lead to brilliant “aha” moments and validate users’ existing understanding with statistical confidence.

• Natural-language querying ability will generate recommended insights, sorted in the order of their relevance, along with the ability to further modify them as necessary.
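As a toy illustration of the natural-language ranking idea in the last bullet, the sketch below scores candidate insights by simple word overlap with the question. Real products use far richer semantics; the question and insights here are invented.

```python
# Candidate system-generated insights and a toy relevance ranking for a
# natural-language question, based on simple token overlap.
insights = [
    "Revenue in the Northeast region grew 12 percent quarter over quarter",
    "Customer churn is most strongly driven by support ticket volume",
    "Marketing spend and revenue are weakly correlated this quarter",
    "Average order value is highest among returning customers",
]

def relevance(question, insight):
    q = set(question.lower().split())
    i = set(insight.lower().split())
    return len(q & i) / len(q)   # share of question words covered by the insight

question = "what is driving revenue this quarter"
ranked = sorted(insights, key=lambda s: relevance(question, s), reverse=True)
for s in ranked:
    print(f"{relevance(question, s):.2f}  {s}")
```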


Convergence of Self-Service Features in a Single User Interface

• Out-of-the-box data management capabilities will be automated, ensuring users’ access to a broad range of data connectors (across cloud and on-premise sources); support for data shaping, joining, and quality enhancement; and the creation of reusable data operations like groups, hierarchies, and calculations.

• Exploration and dashboarding will feature natural language enabled ad-hoc analysis along with the ability to create interactive, multi-tabbed dashboards to focus on KPIs.

• Automated predictive modeling will help users understand the drivers of their KPIs as well as provide top recommendations to improve on them. In addition, automated forecasting capabilities will work in tandem with the predictive piece, giving users the added ability to perform scenario modeling to establish an ideal forecast (apart from just viewing dotted line forecasts for the next period).

• Means to collaborate with others, reusability of insights across features, sharing of objects, and presentation of findings from/within the same user interface will enhance efficiency and productivity.

Deployment

• Flexible deployment models will facilitate deeper and faster adoption while being able to scale at will.

• Architectures will easily integrate with and leverage established analytics systems while ensuring security.

• Consumption across form-factors will include native app support for tablets and (I love to imagine) voice-based natural language ad-hoc queries through mobile devices.

• Existing cognitive systems that are industry/function/data agnostic could be customized and tweaked to serve a specific industry/function/role combination.

Cognitive analytics is disruptive, truly a shift of the curve capable of transforming functions and industries alike. I believe that the innovators and early adopters of cognitive solutions will have a decided edge over their competition.

Suman Mukherjee works with the IBM Watson Analytics Product Experience and Design team. He creates demonstrations and works on customer POCs and internal enablements.


Data Governance in 2016: Modeling the Enterprise

By Jelani Harper

A number of changes in the contemporary data landscape have affected the implementation of data governance. The normalization of big data has resulted in a situation in which such deployments are so common that they are merely considered a standard part of data management. The confluence of technologies largely predicated on big data – cloud, mobile, and social – is gaining similar prominence, transforming the expectations of not only customers but also business consumers of data.

Consequently, the demands for big data governance are greater than ever, as organizations attempt to implement policies to reflect their corporate values and sate customer needs in a world in which increased regulatory consequences and security breaches are not aberrations.

The most pressing developments for big data governance in 2016 include three dominant themes. Organizations need to enforce it outside the corporate firewalls via the cloud, democratize the level of data stewardship requisite for the burgeoning self-service movement, and provide metadata and semantic consistency to negate the impact of silos while promoting sharing of data across the enterprise.

These objectives are best achieved with a degree of foresight and stringency that provides a renewed emphasis on modeling in its myriad forms. According to TopQuadrant co-founder, executive VP, and director of TopBraid Technologies Ralph Hodgson, “What you find is the meaning of data governance is shifting. I sometimes get criticized for saying this, but it’s shifting toward a sense of modeling the enterprise.”

In the Cloud

Perhaps the single most formidable challenge facing big data governance is accounting for the plethora of use cases involving the cloud, which appears tailored for the storage and availability demands of big data deployments. These factors, in conjunction with the analytics options available from third-party providers, make utilizing the cloud more attractive than ever. However, cloud architecture challenges data governance in a number of ways:

• Semantic modeling. Each cloud application has its own semantic model. Without dedicated governance measures on the part of an organization, integrating those different models can hinder data’s meaning and its reusability.


• Service provider models. Additionally, each cloud service provider has its own model, which may or may not be congruent with enterprise models for data. Organizations have to account for these models as well as those at the application level.

• Metadata. Applications and cloud providers also have disparate metadata standards that need to be reconciled. According to Tamr Global Head of Strategy, Operations, and Marketing Nidhi Aggarwal, “Seeing the metadata is important from a governance standpoint because you don’t want the data available to anybody. You want the metadata about the data transparent.” Vendor lock-in in the form of proprietary metadata issued by providers and their applications can be a problem too, especially because such metadata can encompass an organization’s data so that it effectively belongs to the provider.

Rectifying these issues requires a substantial degree of planning prior to entering into service-level agreements. Organizations should consider both current and future integration plans and their ramifications for semantics and metadata, which is part of the basic needs assessment that accompanies any competent governance program. Business input is vital to this process. Methods for addressing these cloud-based points of inconsistency include transformation and writing code, or adopting enterprise-wide semantic models via ontologies, taxonomies, and RDF graphs. The critical element is doing so in a way that involves the provider prior to establishing service.
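For the ontology route, the sketch below uses the rdflib library to define a tiny shared model and query it with SPARQL. It is a minimal sketch assuming rdflib is installed; the namespace, classes, and instance data are hypothetical.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

# A hypothetical enterprise namespace; the URIs are illustrative only.
EX = Namespace("http://example.com/enterprise#")

g = Graph()
g.bind("ex", EX)

# A tiny shared ontology: a Customer class and an "email" property with a label.
g.add((EX.Customer, RDF.type, RDFS.Class))
g.add((EX.email, RDF.type, RDF.Property))
g.add((EX.email, RDFS.domain, EX.Customer))
g.add((EX.email, RDFS.label, Literal("primary email address")))

# Data from two different cloud applications mapped onto the same model.
g.add((EX.cust42, RDF.type, EX.Customer))
g.add((EX.cust42, EX.email, Literal("jane@example.com")))

# Any application can now ask the same question the same way, via SPARQL.
results = g.query(
    "SELECT ?c ?e WHERE { ?c a ex:Customer ; ex:email ?e }",
    initNs={"ex": EX},
)
for row in results:
    print(row.c, row.e)
```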

The Democratization of Data Stewardship

The democratization of big data is responsible for the emergence of what Gartner refers to as “citizen stewardship” in two principal ways. The popularity of data lakes and the availability of data preparation tools with cognitive computing capabilities are empowering end users to assert more control over their data. The result is a shift from the centralized model of data stewardship (which typically encompassed stewards from both the business and IT, the former aligned to domains) to a decentralized one in which virtually everyone actually using data plays a role in its stewardship.

Both preparation tools and data lakes herald this movement by giving end users the opportunity to perform data integration. Machine learning technologies inform the former and can identify which data is best integrated with other data on an application or domain-wide basis. The speed of this self-service access to and integration of data means that the onus of integrating data in accordance with governance policy falls on the end user. Preparation tools can augment that process by facilitating ETL and other operations with machine-learning algorithms, which can maintain semantic consistency.

Data lakes equipped with semantic capabilities can facilitate a number of preparation functions, from initial data discovery to integration, while ensuring the sort of metadata and semantic consistency needed for proper data governance. Regardless, “If you put data in a data lake, there still has to be some metadata associated with it,” MapR Chief Marketing Officer Jack Norris explained. “You need some sort of schema that’s defined so you can accomplish self-service.”


Metadata and Semantic Consistency

No matter what type of architecture is employed (either cloud or on-premise), consistent metadata and semantics represent the foundation of secure governance once enterprise-wide policies based on business objectives are formulated. As noted by Franz CEO Jans Aasman, “That’s usually how people define data governance: all the processes that enable you to have more consistent data.” Perhaps the most thorough means of ensuring consistency in these two aspects of governance involves leveraging a data lake or single repository enriched with semantic technologies. The visual representation of data elements on a resource description framework (RDF) graph is accessible for end-user consumption, while semantic models based on ontological descriptions of data elements clarify their individual meanings. These models can be mapped to metadata to grant uniformity in this vital aspect of governance and provide semantic consistency on diverse sets of big data.

Alternatively, it is possible to achieve metadata consistency via processes instead of technologies. Doing so is more tenuous, yet perhaps preferable for organizations still utilizing a siloed approach among different business domains. Sharing and integrating that data is possible through an enterprise-wide governance council with business membership across those domains, one that rigorously defines and monitors metadata attributes so that there is still a common semantic model across units. This approach might suit less technologically savvy organizations, although sustaining such councils can become difficult. Still, it results in consistent metadata and semantic models across disparate sets of big data.

Enterprise Modeling

The emphasis on modeling reflected in all of these trends substantiates the view that effective big data governance requires rigorous modeling. Moreover, it is important to implement governance at a granular level so that data can be reused and maintain its meaning across different technologies, applications, business units, and personnel changes. The degree of foresight and planning required to successfully model the enterprise so that governance objectives are met will be at the forefront of governance concerns in 2016, whether organizations are seeking new data management solutions or refining established ones. In this respect, governance is actually the foundation upon which data management rests. According to Cambridge Semantics president Alok Prasad, “Even if you are the CEO, you will not go against your IT department in terms of security and governance. Even if you can get a huge ROI, if the governance and security are not there, you will not adopt a solution.”

Jelani Harper has written extensively about numerous facets of data management for the past several years. His many areas of specialization include semantics, big data, and data governance.

2016: A Make or Break Year for Your Data Analytics Initiative

By Scott Etkin

Prolifics sees 2016 as a year in which IT and data analytics will be the differentiator that enables industry leaders to pull away from companies that are late to adopt analytics, that lack an accurate assessment of their analytics capabilities and needs, or that do not know what they must do to compete.

Data Informed spoke with Dr. Michael Gonzales, Director of Research and Advanced Analytics at global technology solutions provider Prolifics, about the data analytics experience and skills gap, what lagging organizations must do to develop an honest and precise understanding of their big data analytics know-how, and what the company sees in the data analytics space as it looks ahead to 2016.

Data Informed: Why do you see 2016 as a difference maker for companies’ analytics capabilities?

Dr. Michael Gonzales: No business strategy is implemented today without an IT component. Anyone looking at getting into data analytics now is late. Most competitors already have looked at their core capability. Others are reacting to the disruption that those early movers have created.

Data Informed: How difficult is it for an organization to understand how its analytics capability stacks up with the competition? What factors make accurate self-assessment a challenge?

Gonzales: Many organizations simply do not have a complete understanding of the scope of analytics available, the role of the various analytics techniques in an overall big data analytics program, and the value each of those techniques brings to the organization. Without that understanding, it becomes difficult to identify what analytics are being conducted in your organization and how you compare to industry best practices and the leading companies in your space.

Data Informed: What steps can companies take to honestly and accurately assess their analytics capability relative to that of their competitors?

Gonzales: If a company wants an accurate and thorough inventory of its analytics capabilities relative to its competitors, the company needs to engage a firm that has a formal methodology and proven techniques to conduct the assessment. Moreover, the firm conducting the assessment must be willing to provide a knowledge transfer and leave the tools and techniques so that the company can conduct its own internal assessment periodically.

Data Informed: How can companies determine what they need to do in 2016 to catch up?

Gonzales: Companies can conduct broad research, examining the leading analytic trends in the private, public, and academic communities. Or they can engage a firm whose primary purpose is to keep abreast of analytics trends, the role and value each brings to a company, and how these trends can be incorporated into short-term and long-term road maps. The figure below outlines a few distinct analytic functions within the context of three categories: Traditional BI, Analytics, and Cognitive.

Key elements of a successful analytics program include the following:

• Establish a flexible and adaptable internal analytic organization

• Embed analytics within solutions so that user communities can benefit from the value of analytics without having to deal with any of the attending complexity

• Plan for extending/complementing internal analytic resources with a proven partner

• Prepare a conscious approach for engaging in cloud-based solutions

Data Informed: What is the most difficult challenge that companies face in terms of closing the analytics gap and catching up with their competitors?

Gonzales: The primary challenge that companies face in closing the analytics gap is that the scope of data and the relevant analytics continue to evolve without respite. This is the new reality that organizations face. It is well known that data generates more innovation. The resulting innovation generates more data that must be analyzed, and this new data goes on to generate even more innovation. It is cyclical and self-perpetuating, without any lull to allow firms that have fallen behind to catch up or react. It is simply a moving target.

Scott Etkin is the editor of Data Informed. Email him at [email protected]. Follow him on Twitter: @Scott_WIS.

Comprehensive Analytic Maturity Assessment
A Guide to the Approach and Execution of an Analytic Maturity Assessment

By Michael L. Gonzales, Ph.D., Director of Research and Advanced Analytics
Foreword by Wayne Eckerson, Eckerson Group

Foreword

I’ve conducted analytic assessments for many years and have seen first-hand their power to help companies understand their strengths and weaknesses and develop a strategy for improvement.

More recently, I’ve teamed up with Michael Gonzales, Ph.D., of Prolifics to conduct analytic assessments with joint clients. Michael also has a long history with assessments, conducting them for many large corporations, including The Gap, U.S. Postal Service, General Motors, Canada Post, and Dell. Together, we plan to take the art of assessing a client’s business intelligence (BI), analytics, and data management capabilities to a new level using state of the art software, cloud services, and statistical techniques. This document represents the heart and soul of this effort.

Assessments to Benchmarks. The real value of an assessment is to give executive leaders a quick and visual representation of how the company compares to its peer competitors. In other words, the best analytic assessments are industry benchmarks.

Michael and I, with support from the Eckerson Group, are embarking on a journey to create the world’s biggest database of analytical assessment data, which subscribers can access and analyze using a self-service cloud application. This will enable organizations to continually evaluate their progress against an industry maturity model and a community of peers.

Open and Repeatable. This open and online approach represents a new, and refreshing, departure from the analytic assessments offered by most consulting firms, which are typically black-box projects. You pay a lot of money and get results, but you cannot repeat the assessment yourself, and therefore cannot evaluate your progress over time, unless, of course, you pay the consultancy to repeat the assessment.

The assessment described in this document uses a white-box approach with knowledge transfer at the heart of the process. We urge clients to learn to conduct the assessment themselves, either online or on paper, so that they can continue to get value out of the assessment long after we are gone.

We hope you find this document valuable and that it will kick start an assessment process in your organization.

- Wayne Eckerson, Eckerson Group

Introduction

Strategy is best created by establishing synergy among a company’s activities. The success of a strategy depends largely on integrating many activities well, as opposed to excelling at only one. Without synergy among activities, no distinctive or sustainable strategy is possible (Porter, 1996).

Strategic fit among several activities is, therefore, the cornerstone of creating and sustaining a competitive advantage. Interlocking and synchronizing several activities is simply more difficult for competitors to emulate. A competitive position established on a system of multiple activities is more sustainable than one built on a single capability or activity.

Establishing a sustainable competitive advantage in analytics, therefore, requires a network of interrelated analytic-centric activities, including: Business Intelligence (BI), Visualization, Data Warehousing (DW), data integration, statistics, and other relevant activities. Companies that know how to leverage their IT resources gain an analytic-enabled competitive advantage (Porter, 1980; Sambamurthy, 2000), which is the basis of Prolifics’ analytic-enabled competitive advantage research. For the purpose of this paper, the term analytics will represent the comprehensive view that encompasses concepts such as predictive and exploratory analysis, BI, Visualization, DW, and Big Data.

The challenge, when creating an analytic strategy, is to identify which activities to focus on. To that end, our research identifies factors of analytic-centric initiatives that significantly contribute to the overall maturity and success of a program (Gonzales, 2012). Building on this research, coupled with extensive practical application of maturity assessments for leading companies, Prolifics’ Comprehensive Analytic Maturity Assessment (CAMA) creates an index that measures the analytic-enabled competitive maturity of an organization. The constructs of this index and the metrics on which the constructs are quantified are outlined in Table 1.

The objective of CAMA is not only to estimate the overall level of analytic maturity within an organization, but also to provide guidance and a roadmap for evolving to higher levels of maturity. Once the metrics are calculated and the competitive advantage index is quantified, it is then evaluated against a maturity model. We recommend leveraging the maturity model shown in Figure 1.¹

Table 1. Assessment Measures

Therefore, Prolifics’ approach to measuring the analytic maturity level of an organization is based on two models:

1. The first model provides the means to quantify the analytic-enabled competitive advantage index.

2. The second model then applies that index to a best-practice maturity model in order to categorize the maturity level within widely accepted, prescriptive, maturity stages.

Table 2 provides context for each of the intersections between the assessment measures and maturity levels.

Figure 1. Analytic Maturity Level Model

Table 2. Maturity Model

¹ The Maturity Level Model is based on a TDWI model originally developed by Wayne Eckerson and updated by Michael L. Gonzales, PhD, in 2012.

The Value of Assessment

Many organizations that invest in an analytic-centric assessment do so in order to establish an unbiased measurement of their organization and, more specifically, of their analytic program. And while this is an important task for any company focused on an analytic-enabled competitive advantage, it is only a fraction of the value that can be mined from this investment. There are four other success factors that should be leveraged in order to maximize your return on investment (ROI), including:

1. Establish performance metrics to measure and monitor your program

2. Periodically conduct the same assessment to measure and monitor progress

3. Create a roadmap for your analytic program improvement and evolution to higher levels of maturity

4. Ensure both business and IT are involved

Each is discussed below.

Performance Metrics

An effective maturity study will measure multiple dimensions of your organization and its analytic program, including leadership, organizational structure, user competencies, and other factors. Each provides key metrics to both measure your current maturity level and monitor the program’s progress.

Figure 2. Sample Dimensions Measured

Establish a Repeatable Process

Organizations embark on analytic maturity assessments for several reasons. Whatever your motivation for conducting this type of study, you should not treat the assessment results as merely a one-time snapshot and then shelve the expensive report. Instead, the results should be used as a starting point, a baseline for your analytic strategy.

A baseline assumes there are subsequent measurements that will be conducted. To that end, you should establish an assessment schedule. Depending on the volatility of your ecosystem/organization, you may want to conduct the same assessment, internally, once every six to 12 months. Doing so achieves the following:

• The ability to demonstrate real gains across quantitative measures.

• Contribution to budget elements. If you can demonstrate significant maturity increases over several key metrics, the results will support your argument for budget increases in order to secure more staff, software, hardware, etc.

However, if you are going to conduct the same assessment periodically, you must insist on retaining the instruments used and the methodology applied to arrive at the gauged maturity level. Some assessment services simply will not comply. It is this author’s recommendation that you should not invest in any assessment process that contains black-box calculations or weights that are proprietary to the service provider. Frankly, if you do not have a white-box assessment, one that provides visibility into all aspects of how the assessment is derived, then it is not worth the price you will be asked to pay. Real value from these initiatives is derived when you can internalize the assessment instruments and processes, enabling your organization to conduct the assessment itself periodically.

Create a Roadmap

Assessments of value will expose a list of opportunities for improvement. But it is important that the opportunities are identified in actionable terms. For example, if the assessment tells you that the program is weak, but does not specify which aspects of the program are weak and what can be done to improve them, then the assessment is of little value because nothing actionable has been identified. Actionable insights with clear objectives for how to improve your program should be the objective of a roadmap.

A roadmap will provide clear steps to improve your analytic initiatives. For example, if the assessment identified that your organization lacks technology standards, prohibits data exploration, and has no consistent definitions for key reference data, such as customers or products, then the roadmap could specify the creation of a data governance program, technical architecture standards, an exploratory (sandbox) platform, and the implementation of a customer Master Data Management (MDM) program, including the steps necessary to achieve each objective.

Gain Organizational Collaboration and Buy-In

An effective assessment project provides an excellent opportunity for an organization to foster collaboration between business and IT. When selecting members for the assessment team, there should be some members from the assessment firm and others from your organization. And of those representing your organization, you should select individuals from both the business and IT sides of your company. This means that not only is your company actively involved in the assessment process, but you have also ensured that business and IT are focused on an unbiased assessment of your company, from both perspectives. Some assessments are sponsored by the business, often because they feel IT has not been responsive. And sometimes an assessment is sponsored by IT in order to measure its internal capability to deliver BI or to build the case for more funding. This author has found that the most effective method for conducting enterprise assessments that provide clear, unbiased findings is to involve both business and IT.

Read more about the assessment team in the Assessment Team section of this paper.

The Assessment Team

When conducting a maturity assessment, it is important to keep in mind the following:

1. Your organization must not only actively participate in the initial assessment process, but must also learn how to conduct the assessment for subsequent progress reports.

2. The assessment represents a great opportunity for collaboration between business and IT.

These two points dictate what you should expect to contribute to the process and the role the consulting firm that is providing the assessment process is to play.

From the numerous assessments this author has conducted, there are at least eight key groups that can contribute to the overall effort as defined in Table 3. The number of groups you have involved in the effort will be determined by the scope of your study.

Enterprise efforts whose goal is to accurately measure level of maturity for the organization will likely leverage all the groups defined in Table 3. Smaller or more focused assessments may choose to use only those groups most relevant.

Assessment Design

You should insist that the assessment consulting company you hire provides an assessment design, including methodology and instruments. However, you can create your own assessment as outlined in this section, or at the very least use it to evaluate the type of assessment your consulting team plans to execute.

Understand the Scope of the Assessment

Not all analytic assessments are the same. This author has worked on assessments where sponsors would only allow a few executives to be interviewed and no end users were to participate. Other clients have opened their entire organization with the objective of gaining an unbiased, enterprise-wide perspective of the analytic program. The scope, however, will dictate the questions to ask and the type of instruments to be created.

Assessment Information Channels

Depending on the scope of your study, there are several channels of information that you can draw on when calculating the final results. These channels are consistent with the groups you have chosen to include in the study, as defined in Table 3 of the Assessment Team section of this paper.

Table 3. Team Roles and Responsibilities

For brevity, outlined below are those key channel participants, each representing a particular dimension of the user experience of your analytic program, including:

• Executive Interviews. These represent a key channel of information for your assessment. Executives’ unique perspectives must be collected and recorded in a manner consistent with their position. Consequently, structured interviews are the only effective way to document their perspective, as described in the Executive Interviews section of this paper.

• Key Subject Matter Expert Survey. There are a few experts that we invite to participate in the assessment. Specifically, these groups are associated with the following:

• Business. Include senior analysts and data scientists.

• IT. Include both technical and data architecture.

• 3rd Party. Experts in the analytic space.

• Non-managerial User Survey. For any comprehensive study, end users must be included. They represent the front-line of analytic-centric information consumption.

In addition to the interview and survey participants, there are other channels of necessary information to gather for a composite view of the BI program. These include:

• Application Footprint Inventory

• Data Architecture Inventory

• Technology Inventory

Combined, these multiple channels of information provide the assessment team with the broadest perspective of analytic-centric activity being implemented and experienced throughout the organization.

Identify Survey Questions

Many assessments offered by leading consulting companies are based only on anecdotal experience. This means that some of the firm’s subject matter experts have decided which questions are significant when measuring maturity. The problem is that many of these questions, while they may seem relevant, may not actually be statistically significant for assessing maturity. For example, this author reviewed 40 questions used by a major provider of BI/DW maturity assessments and found that only 18 of the questions were actually statistically significant. This means the provider asks 22 questions that are basically irrelevant.

The best questions to include in an assessment are grounded. That means questions that have been proven to be statistically significant in assessing maturity. With relatively simple research you can identify many questions that have been used and proven important in assessing maturity in previous, relevant studies. Since vendors typically do not publish their question pools, you can research academic studies that do. The key is to build a pool of grounded questions that can be leveraged in the subsequent instruments.
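As a rough illustration of what “grounded” can mean in practice, the sketch below screens candidate questions by testing whether their answers correlate with an overall maturity score in historical data. The file name, column names, choice of a Spearman correlation, and the 0.05 cutoff are all illustrative assumptions; the assessment itself does not prescribe a specific test.

```python
# Minimal sketch: screening candidate survey questions for statistical significance
# against an overall maturity score. File and column names are hypothetical.
import pandas as pd
from scipy.stats import spearmanr

# One row per past respondent: Likert answers (Q1, Q2, ...) plus a maturity index.
responses = pd.read_csv("historical_responses.csv")           # hypothetical data set
question_cols = [c for c in responses.columns if c.startswith("Q")]

grounded, not_significant = [], []
for q in question_cols:
    rho, p_value = spearmanr(responses[q], responses["maturity_index"])
    bucket = grounded if p_value < 0.05 else not_significant   # illustrative cutoff
    bucket.append((q, round(rho, 2), round(p_value, 3)))

print("Candidate grounded questions:", grounded)
print("Questions to reconsider:", not_significant)
```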

Refer to Appendix A for more information on conducting and analyzing surveys.

Segment Questions to Match Participants

Once you have a database of grounded questions, you want to select the questions you plan to include in surveys designed for specific participating groups. As defined in the Assessment Information Channels section, potential participants responding to survey questions include: Executives, BI Business SMEs, BI Technical SMEs, Non-Managerial Users, and 3rd Party Experts.

While there will be some questions that are relevant to only a specific group, there are other questions that should be asked of each participating group. For example, a statement like, “The Strategy of our BI program is well aligned with our corporate strategy,” is best answered by executives and business SMEs. But asking non-managerial users to respond to this statement is likely not productive for the simple reason that they may not know what the corporate or BI program strategies are. However, a statement like, “Employees are given access to the information they need to perform their duties,” could be asked to each participating group. For non-managerial users you may want to consider changing the wording slightly. Instead of starting with “Employees are given…,” you may consider using, “I am given….”

Repeating key questions between groups gives the research team a means to compare and contrast the perspectives of different organizational communities. In the previous example, asking executives, IT SMEs, and non-managerial users to respond to the statement can provide important insight. It is entirely possible that the IT SMEs believe employees get access to all the information they need, whereas executives might believe that access to data is limited without involving IT. This type of insight guides the assessment team in gauging the maturity level and demonstrates either 1) IT’s lack of understanding of business needs, or 2) executives’ lack of knowledge of how to access the data. Either way, it represents a disconnect between IT and business, which always implies a lower level of maturity.

Executive Interviews

Conducted correctly, executive interviews provide valuable insight into the vision and direction of your organization and the impact that information and IT-centric applications have on its ability to compete. The operative word is ‘correctly.’ Many assessment efforts, including those from high-end business consulting firms, conduct executive interviews almost as ad hoc, information-trolling efforts. Once complete, all the notes scribed during the interviews are consolidated (assuming a scribe was dedicated to the effort), and the interviewers review their notes and provide their perspective on rating issues within the company in a black-box manner. This approach reduces executive interviews to anecdotal guidance regarding the challenges facing the organization, but little else. Structured executive interviews are the only professional approach.

Structured Interview Instruments

The worst use of an executive’s time is conducting an ad hoc interview. Executive interviews must be planned and scripted. To that end, there are three types of questions that must be crafted in a single interview to extract the maximum value from the executive’s time and insight.

Shown in Table 4 are the three types of questions that should be part of any executive interview: Short Answer, Open Response, and Single Answer. The questions give executives an opportunity to share information, but in a structured, guided format. If your assessment consulting firm suggests that a few leading questions are all that is needed to get an interview session started, and that the executive should then be allowed to ramble on and share whatever is on their mind, fire them!

Our objective is to gain the insight and perspective of the executive office with regard to BI-centric issues and their impact on the company’s goals and objectives. Moreover, our interview should be conducted in a style that ensures a means to quantify comparisons between executives as well as contrast the executive office with other important user communities.

Table 4. Executive Structured Interview Questions

To that end, this author has successfully created a structured interview form that is followed to maximize the time with an executive. Refer to Appendix B for an example. The format of any interview follows the question types outlined in Table 4. We start with pertinent, short answer responses to specific questions. The questions will vary between companies based on industry, culture, geographic region, etc., but they are all focused on BI-enabled competitive advantage. After 10 to 15 short answer questions, an open response question is offered. The strategy of asking short answer questions before an open-ended question is simple: if your short answer questions are well crafted, the executive will have little to add when given the opportunity to share anything they believe we have missed or that they want to re-emphasize in the open-ended question. This is an important measure of the effectiveness of the structured interview: How many notes are taken when an executive is offered the opportunity to respond to an open-ended question? If few notes are taken, you can infer that you are asking a good combination of short answer questions. If the open-ended invitation generates significant notes, it means you need to adjust the short response questions. Your goal should be to ask the right combination of short answer questions that minimizes open-ended response notes. The end of your interview should be a series of single response questions; in other words, questions that require one response, typically on a Likert scale, e.g., Strongly Disagree, Disagree, Uncertain, Agree, or Strongly Agree. This list of questions must be similar to questions on the surveys for SMEs and end users. This will allow the assessment team to quantify, compare, and contrast how executives respond versus other key user communities.

Selecting and Scheduling Interviews

Interviews are the most time-consuming information channel of an assessment, not because of the interviews themselves, but because of scheduling. Since an executive’s time is in demand, getting on their calendar is a challenge in itself, which stretches out the time it will likely take to interview all participating executives. This is exacerbated when assessments are conducted around holidays or during the summer vacation season. Do not underestimate the scheduling challenges.

In order to minimize the scheduling issue, some assessment teams attempt to conduct workshops in which several executives are asked to participate. This author believes workshops are ill-conceived and a poor substitute when attempting to gather insight from the executive office. Not only do the executives have competing agendas that must be properly documented, but each represents a particular aspect of the organization, functional and business unit alike. Their perspectives must be accorded the attention and documentation warranted by such a unique, single individual.

Knowing the potential problem of scheduling time with executives, the assessment team must emphasize the following:

1. Have a properly built structured interview instrument to maximize the limited time in front of an executive.

2. Be judicious in selecting the executives to be interviewed. Driven by company politics, non-C-level individuals are often added to the interview list. Consequently, assessment teams interview far too many ‘managers’ as opposed to strictly executives. As long as the number of interviews does not adversely affect the assessment timeline, adding non-C-level individuals may not be a problem, but be selective. Teams should limit face-to-face interviews to C-level executives, who are few even in the largest organizations.

Conducting an Interview

There are at least two individuals necessary from the team for each interview: an interviewer and a scribe. Armed with your structured interview, the interviewer sets the pace and guides the executive through the questions and the executive’s responses. If the interviewer asks a question that requires a short response, the interviewer must reserve the right to guide the executive back to the question at hand if they veer off topic. The interviewer must attempt to get responses to all questions in the structured instrument.

It is the scribe’s role to properly document responses for each question and, where possible, to document relevant follow-up questions/answers between the interviewer and executive.

Techniques for Analyzing Short Answer and Open Ended Questions

Most executive interviews are conducted using open-ended or short answer questions that create free-form text. Generally, a scribe is assigned to the interview team to take notes, or the interviews are recorded for later transcription. Put simply: interviews generate significant text. And when interviews are conducted without a means of quantifying the results, this author believes they are essentially a waste of time. Any insight gleaned will be anecdotal at best and will require a black-box process to score.

It is always recommended that you conduct executive interviews with a clear understanding of how the free-form responses will be scored and incorporated into the overall assessment. From this author’s perspective, interviews should be designed and information collected with the following method in mind: Text Mining for key words and phrases.

Text mining provides an effective means of quantifying free-form responses. Text mining processes can range from simple word clouds (see Figure 3) to semantic examination. This type of mining technique allows the assessment team to quantify the following:

• An individual executive’s key concerns across all subjects queried

• Compare and contrast executives and their functional and business units

• Overall executive perspective per short answer question. For example, Figure 3 shows a word cloud regarding the technology used to analyze data, aggregated across all executives interviewed.

Figure 3. Executive Survey Technology Word Cloud
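For readers who want to see what even the simplest version of this looks like, the sketch below counts key words across interview transcripts, which is enough to produce per-executive and overall frequency lists (the raw material for a word cloud such as Figure 3). The notes directory, file naming, and stop-word list are illustrative assumptions.

```python
# Minimal sketch: quantifying free-form interview notes by key-word frequency.
# Directory layout, file names, and the stop-word list are hypothetical.
import re
from collections import Counter
from pathlib import Path

STOP_WORDS = {"the", "and", "of", "to", "a", "in", "that", "we", "our", "is", "for", "it", "are"}

def keyword_counts(text: str) -> Counter:
    """Tokenize, lowercase, and count words that are not stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)

# One transcript per executive, e.g. notes/cfo.txt, notes/cio.txt (hypothetical).
per_executive = {p.stem: keyword_counts(p.read_text()) for p in Path("notes").glob("*.txt")}

# An individual executive's top concerns, and the aggregate across all interviews.
for name, counts in per_executive.items():
    print(name, counts.most_common(5))
print("Overall:", sum(per_executive.values(), Counter()).most_common(10))
```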

Techniques for Analyzing Executive Survey Responses

The single response survey questions in a structured executive interview should be designed similarly to the survey questions on the SME and end user surveys. Doing so allows your assessment team to identify a pool of questions to be asked across all participating survey communities, including executives, SMEs, and end users. Asking similar questions across the communities provides a unique assessment perspective; specifically, are there differences in responses between participating communities? For example, do executives believe end users have access to all the data they need, and does the end user perspective agree?

Single response questions should be scored and compared between the following respondent groups:

• Individually between executives and their functional and business units

• As an aggregate of ‘executive’ scores compared to SMEs and end user communities

For more information about survey design and analysis, please refer to Appendix A.
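The sketch below shows one way such a group comparison might be scored; the combined results file, column names, and the one-point “disconnect” threshold are illustrative assumptions rather than part of the CAMA methodology.

```python
# Minimal sketch: comparing how respondent communities answered the same
# single-response (Likert, 1-5) question. File and column names are hypothetical.
import pandas as pd

# One row per respondent: their community ("Executive", "SME", "End User")
# and their answer to a shared question about data access.
df = pd.read_csv("combined_single_response.csv")

by_group = df.groupby("community")["data_access_score"].agg(["mean", "count"])
print(by_group)

# Flag a likely business/IT disconnect when group means differ by a full point or more.
spread = by_group["mean"].max() - by_group["mean"].min()
if spread >= 1.0:                                    # illustrative threshold
    print(f"Possible disconnect between communities: spread of {spread:.1f} points")
```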

SME Survey

The SME survey will always be the most comprehensive. This user community represents those with the most intimate knowledge of the analytic ecosystem, the data, and the scope of analysis. Consequently, surveys crafted for this group should be comprehensive, covering a broad range of topics.

Creating the Survey

This author recommends executing online surveys. Refer to Appendix A for more information about how to create and execute surveys. Appendix C provides an example of a detailed SME survey.

The SME survey will contain the entire scope of single response and short answer questions. From this survey, subsets of questions are duplicated and reworded for the other survey audiences. For example, single response questions about leadership and executive perspective of the BI program for SMEs are duplicated in the structured interview for executives. And questions asked about users getting the information and support they need are duplicated in the end user surveys.

Consequently, the SME survey must represent the scope of relevant questions that are correlated between all survey participants. It not only represents the entire pool of single and short response questions, but serves as the point of rationalization of questions to ask across all channels of the assessment.

Identify SME Participants

The assessment team should select the participants for the SME survey. Since this is a comprehensive questionnaire, it can only be submitted to individuals intimate with many aspects of the BI program. Consequently, the audience will be small, fewer than 20 people. Many of the assessment team members from the organization will likely be candidates for this survey.

Techniques for Analyzing Survey Responses for SMEs

Since the SMEs participating in the survey are considered your organization’s experts regarding BI, data warehousing, and analytics, they are uniquely qualified to respond to more detailed types of questions and longer surveys. An SME survey often covers the entire scope of dimensions being assessed (refer to Table 1), including Leadership, Value & Risk, Infrastructure, and Skill.

While the detail and number of questions asked of SMEs differ from the single response questions included in the executive and end user surveys, how the questions are constructed and scored is the same. For more information regarding the design and analysis of survey questions, refer to Appendix A.

End User Survey

End user surveys are often referred to as non-management or front-end user surveys. These participants represent the consumers of the analytic program. This user community can help assess the value of the reports they receive (in terms of content, quality, and timeliness) as well as the provided training and support.

Creating the Survey

This author recommends executing online surveys; refer to Appendix A for more information about how to create and execute surveys. Appendix D provides an example of an end user survey.

The content of this survey must be reduced to a handful of questions that you expect end user communities to be able to answer. While these questions are duplicates of specific questions in the SME survey, they must be selected and reworded for easy understanding by any end user consumer of BI-centric reporting or analysis.

There are unique considerations when crafting end user surveys, specifically regional diversity, language and word choices, and legal constraints. To start, this author highly recommends that non-management surveys be approved by the legal or compliance department. Next, it is important for the assessment team to consider language translation and what particular word choices might evoke when crafting the final questions.

Identify End User Participants Using Sampling

This is unique to non-management surveys. Since end users potentially represent the largest group of consumers of analytic-centric output, it is often difficult or impractical to survey the entire population. For example, your program might have 20,000 registered users across multiple global regions, encompassing several business and functional units. Consequently, the assessment team must leverage statistical methods to ensure a representative group of this large community is identified to participate.

To that end, this author recommends that the assessment team select participants based on the following steps:

1. Company team members must identify the total population of non-management users who are consumers of the analytic program’s output. For example, we need to identify those who receive one or more of the reports produced by the analytic program, or perhaps the program publishes a sales dashboard to 5,000 sales representatives covering three geographic regions (North America, Asia, and Europe) and encompassing our manufacturing and distribution business units.

2. Once the total population of non-managerial consumers of analytic-centric output is quantified, a stratified random sample should be executed. This sample must take the following into consideration:

• What are the strata we want to consider in this assessment? This is a company decision. The company might have a heavy presence in a single geographic region or most of their revenue may be represented in a single business unit. All these aspects define your organization and must be considered when identifying the strata to be represented in the assessment.

• Once you’ve defined the strata, the total population of end users can be identified and serve as the basis of a random sample.

• A redacted list of the total potential survey respondents is consolidated into a file that contains the geographic region, Business Unit, and Functional Unit in which each respondent works.

• Using a tool such as IBM SPSS Statistics, a random sample is executed with the final survey participants selected.

3. Identified participants are invited to a pre-survey workshop.

Stratified random sampling is not a trivial process, assuming you want to have some confidence in the inferences made from the survey responses. From this author’s perspective, if the assessment team does not have the skill to properly conduct this sampling, the team must reach outside the group to ensure a skilled resource can assist.
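As a point of reference, the sketch below shows what a basic stratified random sample of the end user population can look like in code. It uses pandas rather than IBM SPSS Statistics; the file name, the strata columns, and the 10 percent sampling fraction are illustrative assumptions, and a real study would also check that each stratum yields enough respondents for the desired confidence level.

```python
# Minimal sketch: drawing a stratified random sample of end users with pandas.
# File name, strata columns, and the 10% fraction are hypothetical.
import pandas as pd

# Redacted list of all non-managerial consumers of analytic output,
# with one column per stratum: region, business unit, and functional unit.
population = pd.read_csv("end_user_population.csv")

strata = ["region", "business_unit", "functional_unit"]
sample = population.groupby(strata).sample(frac=0.10, random_state=42)

print(f"Invited {len(sample)} of {len(population)} end users")
sample.to_csv("survey_invitees.csv", index=False)
```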

Conducting Pre-Survey Workshop

While the objective of a pre-survey workshop is similar to that of the workshops for the other surveys, there are unique challenges in conducting one for the non-management participants. Most of the complexity is a direct result of regional diversity, time zones, and the sheer number of participants.

It is important that the assessment team does not underestimate the level of preparation necessary for large end user audiences. While global differences have obvious implications in terms of time zones and languages, these issues might also be experienced within a single country. There are several countries with multiple national languages. This means that your team may need to communicate/survey employees in all recognized national languages. This might be necessary to adhere to legal requirements or even union regulations.

Executing the End User Survey

By design, end user surveys are much smaller in scope and, therefore, ask fewer questions. It is recommended that these surveys be designed to be completed within 15 to 20 minutes, which limits the survey to about 15 questions.

Since there are only a limited number of questions, you only need to keep the survey active long enough for participants to respond. It is recommended that you pick a 48-hour window for end users to respond. And while a weekend may be provided for SMEs to consider their responses, we should never expect end users to work over their days off.

Techniques for Analyzing End User Survey Responses

In order to compare and contrast responses between executives, SMEs, and end users, the questions must be similar. While the wording of questions may vary to accommodate the different participating communities, the construction of each question and how it will be scored and incorporated into the assessment must be the same.

Refer to Appendix A for more information regarding the design and analysis of survey questions.

Project Plan

The project plan is based on the breadth of the assessment undertaken. The Assessment Information Channels section of this guide identifies several areas and information sources that can be included in an assessment, referred to as Information Channels. The more channels included, the more comprehensive the study and the higher the confidence you will have in its findings. Information Channels include:

• Executive Interviews

• Key Subject Matter Expert Surveys

• Non-management User Surveys

• Application Footprint Inventory

• Data Architecture Inventory

• Technology Inventory

For example, an assessment that includes all the information channels above and is conducted with full support of the organization can be executed in 8 to 10 weeks. Appendix E details a comprehensive study conducted in 8 weeks. The plan in Appendix E can be summarized by week as shown in Table 5.

On the other hand, if the organization only wants executives interviewed and a quick review of data and technical architecture, the assessment might be able to be completed within 5 to 6 weeks.

The number of Information Channels to be included in the scope of the assessment determines the length of the assessment. Other aspects of the effort can also contribute to the project duration, including:

• Availability of assessment team members from the organization. We generally recommend that team members from the client-side should be prepared to dedicate 50% to 75% of their time for the duration of the assessment.

• Availability of participants identified. For example, executives have busy schedules. If the organization’s executives are committed to participating, they will open their schedules to accommodate the study. On the other hand, executives can create scheduling delays that extend the project duration.

Assessments should not be excessively long or unnecessarily complex. A robust study can be executed relatively quickly, in as little as 8 weeks, assuming the commitment from top management is strong and consistent.

Table 5. Sample Plan Weekly Summary

Appendix A – Conducting Surveys and Analyzing Responses

When conducting surveys associated with the maturity assessment, the following steps are highly recommended by this author.

Conducting Surveys

There are five key steps for executing any survey:

1. Create the survey based on the target audience and integration to the overall assessment

2. Select survey participants

3. Conduct a short pre-survey workshop

4. Conduct survey

5. Analyze results and incorporate them into the overall findings

This author recommends that surveys are conducted online. Doing so affords the following benefits:

1. Online surveys allow the research team to collapse the overall assessment time since online surveys can be conducted for different respondent audiences at the same time as other Information Channels are executed, that is, other surveys, executive interviews, data architecture inventory, etc.

2. Responses capture an individual’s perspective, uninfluenced by their boss or peers. Taking an online survey is a personal event; there is no outside bias introduced while the respondent considers their answers.

3. SME surveys can be executed anonymously, which is likely to elicit more frank responses from participants, further enhancing their contribution to the overall assessment.

4. Sufficient time should be provided to respondents. Since the SME surveys are the most comprehensive, there are often 60 to 100 questions to answer, covering a broad array of topics. Consequently, it is important that respondents are given sufficient time to consider their answers. This author recommends that SME participants be given access to the survey on a Thursday and that it remain active until close of business the following Monday or Tuesday. This gives them enough time to carefully weigh their responses.

Unless a company already has survey software at its disposal, it is highly recommended that an existing survey service, such as SurveyMonkey, be employed. Not only will your organization save significant time and money compared to developing your own survey tools, but you will also be able to conduct the survey quickly, thus compressing the overall assessment time.

Techniques for Analyzing Survey Responses

While there are several methods for creating and scoring surveys, this author highly recommends adopting a Likert scale. This is one of the most popular methods for crafting and analyzing surveys, and it uses response formats that most of us are familiar with. There are three common Likert scales: 3-point, 5-point, and 7-point. Table 6 outlines response examples for each scale.

Aside from the obvious differences between the scales, there are other factors that you need to consider. For example, a 3-point scale is the easiest for participants simply because there are only three choices. But what is easier for respondents does not give researchers much information about their perspective. A 7-point scale provides considerably more detail about the perspective of participants but is more difficult to use, because the number of choices requires more thought. This author believes that a 5-point scale provides the right level of detail without being overly difficult for the various survey communities to complete.

The best practice for analyzing Likert surveys is to apply two distinct approaches to different aspects of the survey, as follows:

1. Aggregate Level of Dimensions/Constructs. This approach uses analysis of means. In other words, the researcher simply calculates the mean value of all responses for a particular construct that comprises two or more specific questions. This is often the type of result reported from surveys; for example, if the average response for all Technical Architecture questions was 4.1 on a scale of 1 to 5 (assuming a 5-point scale where 1 is negative and 5 is positive), the reader could infer a relatively positive perspective from respondents.

Analysis of means involves not only aggregating means across dimensions but also comparing them between respondents and participant communities. Proper analysis attempts to identify statistically significant differences between respondents based on their mean scores.

2. At the Question Level. This approach uses Range and Mode analysis, which provides guidance on the shape of the data by giving a measure of center (Median and Mode) and a measure of dispersion (Range).

• Range is used as an estimate of the statistical dispersion of the data. Specifically:

• A high value suggests a wide variety of opinions within a group of respondents. Range scores of 4 and 3 are indicative of wide variability of opinion.

Table 6. Likert Scales

• A low value is associated with consistency among respondents. Range scores of 2 and 1 indicate relatively consistent responses.

• Zero means complete agreement among respondents.

• Mode identifies the answer that was chosen more often than any other. It is significant if one or two mode responses exist for a single question.

These are two distinct methods for analyzing Likert-based surveys. A common mistake made by many researchers is using means analysis for specific questions. If your consulting firm uses only means analysis on a Likert-based survey, you should challenge their approach.
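To ground the two approaches, here is a minimal sketch that computes construct-level means and question-level range and mode from a table of 5-point responses. The file name, question columns, and construct groupings are illustrative assumptions.

```python
# Minimal sketch: two-level analysis of a 5-point Likert survey.
# File name, question columns, and construct groupings are hypothetical.
import pandas as pd

responses = pd.read_csv("sme_survey_responses.csv")    # answers coded 1 (negative) to 5 (positive)

constructs = {
    "Leadership": ["Q1", "Q2", "Q3"],
    "Technical Architecture": ["Q4", "Q5"],
}

# 1. Aggregate level: mean of all responses belonging to each construct.
for name, questions in constructs.items():
    print(f"{name} mean: {responses[questions].stack().mean():.2f}")

# 2. Question level: range (dispersion) and mode (most frequent answer).
for q in [q for qs in constructs.values() for q in qs]:
    answers = responses[q]
    rng = int(answers.max() - answers.min())            # 3-4 = wide disagreement, 0 = full agreement
    modes = answers.mode().tolist()                      # one or two modes is meaningful
    print(f"{q}: range={rng}, mode={modes}")
```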

Appendix B – Sample of Executive Structured Interview

Figure 4. Executive Structured Interview Sample

Appendix C – Sample Survey Questions

Table 7. Sample Survey Questions

Appendix D – Sample End User Survey

The end user survey should have questions that overlap with other surveys, like the self-assessment and the executive interviews. This affords cross-analysis in an effort to discern whether the organization has a consistent message among communities or whether there is a significant disconnect.

Table 8. Sample End User Survey

Appendix E – Sample Project Plan

Table 9. Sample Project Plan

Appendix F – References

Bharadwaj, A. S. (2000). A resource-based perspective on information technology capability and firm performance: An empirical investigation. MIS Quarterly, 24(1), 169-196.

Bhatt, G. D., & Grover, V. (2005). Types of information technology capabilities and their role in competitive advantage: An empirical study. Journal of Management Information Systems, 22(2), 253-277.

Dehning, B., & Stratopoulos, T. (2003). Determinants of a sustainable competitive advantage due to an IT-enabled strategy. Journal of Strategic Information Systems, 12(1), 7-28.

Gartner (2009). Key roles for successful BI/DW delivery; Business Intelligence solution architect. Gartner.

Gonzales, M.L. (2012). Competitive Advantage Factors and Diffusion of Business Intelligence and Data Warehousing, The University of Texas at El Paso.

Gonzales, M. L., Bagchi, K., Udo, G., & Kirs, P. (2011). Diffusion of business intelligence and data warehousing: An exploratory investigation of research and practice. 44th Hawaii International Conference on System Sciences.

Gonzales, M. L., Mahmood, M. A., & Gemoets, L. (2009). Technology-enabled competitive advantage: Leadership, skill and infrastructure. Decision Science Institute.

Gonzales, M. L., Mahmood, M. A., Gemoets, L., & Hall, L. (2009). Risk and IT factors that contribute to competitive advantage and corporate performance. Americas Conference on Information Systems, San Francisco, CA.

Gonzales, M. L., & Wells, D. L. (2006). BI strategy: How to create and document. El Paso, TX: HandsOn-BI, LLC.

IDC (2003). Leveraging the foundations of wisdom: The financial impact of business analytics. IDC.

Inmon, W. H. (1992). Building the data warehouse, 1st Edition. Boston, MA: QED Technical Pub. Group.

Johannessen, J., & Olsen, B. (2003). Knowledge management and sustainable competitive advantages: The impact of dynamic contextual training. International Journal of Information and Management, 23(4), 277-289.

Oh, W., & Pinsonneault, A. (2007). On the assessment of the strategic value of information technologies: Conceptual and analytical approaches. MIS Quarterly, 31(2), 239-265.

Piccoli, G., & Ives, B. (2005). IT-dependent strategic initiatives and sustained competitive advantage: A review and synthesis of the literature. MIS Quarterly, 29(4), 749-776.

Porter, M. E. (1979). How competitive forces shape strategy. Harvard Business Review, 57(2), 137-145.

Porter, M. E. (1980). Competitive strategy. New York, NY: The Free Press.

Porter, M. E. (1998). Competitive advantage: Creating and sustaining superior performance. New York, NY: The Free Press.

Ramiller, N. C., Swanson, E. B., & Wang, P. (2008) Research directions in information systems: Toward an institutional ecology. Journal of the Association for Information Systems, 9(1), 1-22.

Ross, J. W., & Beath, C. M. (2002). Beyond the business case: New approaches to IT investment. MIT Sloan Management Review, 43(2), 21-24.

Sambamurthy, V. (2000). Business strategy in hypercompetitive environments: rethinking the logic of IT differentiation. In: R. W. Zmud, Framing the domains of IT management (pp. 245-261). Cincinnati, OH: Pinnaflex Educational Resources.

Sambamurthy, V., Bharadwaj, A., & Grover, V. (2003). Shaping agility through digital options: Reconceptualizing the role of information technology in contemporary firms. MIS Quarterly, 27(2), 237-263.

Santhanam, R., & Hartono, E. (2003). Issues in linking information technology capability to firm performance. MIS Quarterly, 27(1), 125-153.

Turban, E., Aronson, J. E., Liang, T., & Sharda, R. (2007). Decision support and business intelligence systems (8th Ed). NJ: Prentice Hall.

Watson, H. J., Goodhue, D. L., & Wixom, B. H. (2002). The benefits of data warehousing: Why some organizations realize exceptional payoffs. Information & Management, 39(6), 491-502.

Weill, P., & Broadbent, M. (2000). Managing IT infrastructure: A Strategic Choice. Cincinnati, Ohio: Pinnaflex Educational Resources.

Weill, P., Subramani, M., & Broadbent, M. (2002). Building IT infrastructure for strategic agility. MIT Sloan Management Review, 44(1), 57-65.

Wixom, B. H., & Watson, H. J. (2001). An empirical investigation of the factors affecting data warehousing success. MIS Quarterly, 25(1), 17-41.

About Prolifics

Prolifics creates a competitive advantage for organizations around the world by implementing customized, end-to-end IT solutions that achieve business success, leveraging IBM, Microsoft and Open Source technologies in a global delivery model. For more than 35 years, the company’s technology expertise, industry-specific insights and certified technology accelerators have transformed organizations around the world by solving complex IT challenges.

For more information, visit http://www.prolifics.com.

Email us at: [email protected]

Check out additional content on Data Informed

Find other articles like these and more at Data Informed: data-informed.com

Data Informed gives decision makers perspective on how they can apply big data concepts and technologies to their business needs. With original insight, ideas, and advice, we also explain the potential risks and benefits of introducing new data technology into existing data systems. Follow us on Twitter, @data_informed

Data Informed is the leading resource for business and IT professionals looking for expert insight and best practices to plan and implement their data analytics and management strategies. Data Informed is an online publication produced by Wellesley Information Services, a publishing and training organization that supports business and IT professionals worldwide. © 2015 Wellesley Information Services.