White Paper

Understanding the Anametrix Cloud-based Analytics Platform
Leveraging a Multi-Tenant Architecture

Overview

Anametrix is a distributed data acquisition, processing and visualization platform that allows structured and unstructured data to be made available for reporting, visualization and data federation. To meet the extreme demands of its clients, Anametrix operates a cloud-based multi-tenant analytics platform that allows clients to gain analytical capabilities without upfront costs and investments in server and processing infrastructure.

This white paper explains the patent-pending technology that makes the Anametrix platform fast, scalable and secure for any type of application.

INTRODUCTION

A change in the way organizations access and manage data has created a major shift in the way software applications are designed, built, and accessed. Advances in technologies such as broadband Internet access and service-oriented architectures (SOAs) have created an environment better suited to handling and processing large amounts of data. At the same time, the cost inefficiencies of managing on-premises applications are driving a transition toward the delivery of Web-based services, or software as a service (SaaS). Anametrix uses a SaaS platform to deliver its solution to clients around the world.

THE MULTI-TENANT ARCHITECTURE

To reduce the cost of delivering the same application to many different clients, a number of applications are multi-tenant rather than single-tenant. A multi-tenant application can satisfy the needs of multiple tenants (companies, departments within a company, etc.) using the hardware resources and staff needed to manage just a single software instance. One set of resources can therefore serve many organizations.

The architecture is structured so that tenants of a multi-tenant service operate in virtual isolation from one another. Organizations can use and customize the application as though each had a separate instance, while their data and customizations remain secure and insulated from the activity of all other tenants. The single application instance effectively morphs at runtime to suit whichever tenant it is serving at a given moment.
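The idea of virtual isolation inside one shared instance can be illustrated with a minimal sketch. It is not the Anametrix implementation; the class name `TenantStore`, its methods, and the tenant names are hypothetical and chosen only for illustration. The point it shows is that every read and write is scoped by a tenant identifier, so data and customizations never cross tenant boundaries.

```python
from dataclasses import dataclass, field


@dataclass
class TenantStore:
    """One shared application instance serving many tenants (hypothetical sketch)."""
    _rows: dict = field(default_factory=dict)      # tenant_id -> list of rows
    _settings: dict = field(default_factory=dict)  # tenant_id -> customizations

    def customize(self, tenant_id: str, **settings) -> None:
        # Customizations are stored per tenant and never shared.
        self._settings.setdefault(tenant_id, {}).update(settings)

    def insert(self, tenant_id: str, row: dict) -> None:
        self._rows.setdefault(tenant_id, []).append(row)

    def query(self, tenant_id: str) -> list:
        # Every read is scoped by tenant_id, so one tenant never sees another's rows.
        return list(self._rows.get(tenant_id, []))

    def setting(self, tenant_id: str, key: str, default=None):
        return self._settings.get(tenant_id, {}).get(key, default)


store = TenantStore()
store.insert("acme", {"page": "/home"})
store.insert("globex", {"page": "/pricing"})
store.customize("acme", timezone="US/Pacific")

print(store.query("acme"))                         # only acme's rows
print(store.setting("globex", "timezone", "UTC"))  # globex falls back to the default
```

A production system would enforce the same scoping at the storage and query layers rather than in an in-memory dictionary, but the contract is the same: no operation runs without a tenant context.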

Multitenancy is a win-win for both application providers and users. Economies of scale are leveraged, and the cost of hardware resources is much lower than that required by on-premises applications. A relatively small, experienced administrative staff can efficiently manage a single stack of software and hardware, and developers can build and support a single code base on just one platform (operating system, database, etc.) rather than many. Also, because a multi-tenant application is a single large community hosted by the provider itself, operational information from the collective user population (which queries respond slowly, what errors happen, etc.) can be obtained more easily. This information can then be used to make frequent improvements to the service that benefit the entire user community.

These advantages allow the application provider to offer the service to end users at a much lower cost. Additional benefits of multitenancy include higher quality, user satisfaction, and customer retention.

DATA ACQUISITION

Anametrix uses several complementary techniques to acquire data from the various data sources that combine into a multi-channel data repository (Figure A).

Primarily, three methods are used for data acquisition:

API-based connections: Anametrix uses a third-party API to download report data and integrate it into the Anametrix data warehouse. This typically happens on a set schedule determined in accordance with recommendations from the API provider.

Batched data uploads from various sources: This approach is often used for client-specific data uploads from internal databases, end-user uploads from Anametrix tools (such as the Excel Client), or uploads from third parties. Batch uploads can happen on demand or on a schedule.

Live web-based data acquisition: This is the preferred mode for web analytics and is also the method with the least delay between data being created and becoming available for reporting. In the web analytics scenario, client-side data collection techniques (also known as page tags, web beacons, pixel technology and “web bugs”) are used to send real-time data to the Anametrix cloud for direct integration into the Anametrix data warehouse.
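To make the live acquisition mode more concrete, the sketch below shows how a collection endpoint might turn a page-tag request into an event record. Everything specific here is an assumption for illustration: the host `collect.example.com`, the `pixel.gif` path, and the parameter names `tenant`, `visitor` and `page` are hypothetical and do not describe the actual Anametrix tag format.

```python
from urllib.parse import urlsplit, parse_qs


def parse_beacon(request_url: str) -> dict:
    """Turn a page-tag request URL into a flat event record."""
    query = parse_qs(urlsplit(request_url).query)
    # parse_qs returns a list of values per key; keep the first value of each.
    return {key: values[0] for key, values in query.items()}


# Hypothetical request generated by a page tag in the visitor's browser.
event = parse_beacon(
    "https://collect.example.com/pixel.gif?tenant=acme&visitor=v42&page=%2Fhome"
)
print(event)
```

Because the browser issues the request at the moment the page is viewed, the record can be integrated immediately, which is why this mode has the least delay of the three.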

DATA TIMELINESS

The Anametrix cloud makes acquired data available in real time. Anametrix is always “as real-time as the source data”, meaning that data will be integrated as quickly as possible within the constraints imposed by third parties. In particular, certain data sets may be finalized only once a day and will therefore become available in the Anametrix interface on the same schedule, while others will be queryable as soon as the underlying events happen.

SESSIONIZATION, DATA CLEANSING AND STRUCTURING

Sessionization is the process by which the Anametrix solution orders the sequence of actions or requests made by an individual during the course of an interaction, or “session”, as part of a series of transactions made available to the Anametrix cloud. Sessionization capabilities allow Anametrix to extract and visualize essential information contained within data streams. With sessionization for web analytics, you can determine where visitors get lost or frustrated, how deeply they go into content, and where the opportunities are for site organizational improvements. Without a sessionization method, log files and page tags have no reliable way of determining that the individual who viewed page one is the same person who viewed page two.
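A minimal sketch of one common sessionization technique follows. It assumes each hit carries a visitor identifier and a timestamp in seconds, and it starts a new session after 30 minutes of inactivity. The 30-minute cutoff is a widespread web analytics convention used here as an assumption; the white paper does not state which rule Anametrix applies, and the function and variable names are hypothetical.

```python
from collections import defaultdict

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed inactivity cutoff, not from the source


def sessionize(hits, timeout=SESSION_TIMEOUT_SECONDS):
    """Group (visitor_id, timestamp, page) hits into ordered per-visitor sessions."""
    by_visitor = defaultdict(list)
    for visitor_id, timestamp, page in hits:
        by_visitor[visitor_id].append((timestamp, page))

    sessions = []
    for visitor_id, visitor_hits in by_visitor.items():
        visitor_hits.sort()  # order each visitor's hits by time
        current = []
        last_seen = None
        for timestamp, page in visitor_hits:
            # A gap longer than the timeout closes the current session.
            if last_seen is not None and timestamp - last_seen > timeout:
                sessions.append((visitor_id, current))
                current = []
            current.append(page)
            last_seen = timestamp
        sessions.append((visitor_id, current))
    return sessions


hits = [
    ("v1", 0, "/home"),
    ("v2", 10, "/pricing"),
    ("v1", 60, "/products"),
    ("v1", 4000, "/home"),
]
sessions = sessionize(hits)
for visitor_id, pages in sessions:
    print(visitor_id, pages)
```

In this example visitor `v1` produces two sessions, because the third hit arrives more than 30 minutes after the second, while `v2` produces one. The ordered page lists are what make path questions, such as where visitors drop off, answerable.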

To ensure that acquired data is actionable and report-ready, Anametrix also applies a layer of data-appropriate cleansing and restructuring to data that is provided for integration. The amount of transformation needed varies by data source: it ranges from large amounts of pre-processing for data with low entropy (in other words, a low amount of actionable information per transaction) to direct imports for data that is already report-ready.

QUERIES, DATA VISUALIZATION, AND EXTRACTION

The Anametrix distributed query engine is a comprehensive, real-time, cloud-based data storage and retrieval service that gives all products real-time query capability for clients while leveraging a multi-tenant processing architecture.

Anametrix receives billions of rows of client-supplied data each month and continuously integrates all acquired data in data centers. The system is responsible for handling, structuring and processing incoming data and for making it available to the query engine so that it is instantly available to the end user.

All data that is made available is replicated across a shared distributed query system. Data integrity and safety are ensured by an intelligent software layer that takes logical and physical parameters into account when storing data. In particular, the system is aware of the physical characteristics of each Anametrix storage system. Data is replicated so that there is no single point of failure, and it is spread evenly across servers, switches, server cabinets and data centers to guard against logical, physical, and geographical failures.
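Topology-aware replica placement of this kind can be sketched as a greedy selection that prefers nodes in failure domains not yet used. This is an illustrative assumption, not the patent-pending Anametrix algorithm: the `Node` fields, the node names, and the two-level preference (new data center first, then new cabinet) are hypothetical, and the sketch ignores switches and load balancing for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    data_center: str
    cabinet: str


def place_replicas(nodes, replica_count):
    """Pick nodes for one data block, preferring distinct failure domains."""
    chosen = []
    remaining = list(nodes)
    while remaining and len(chosen) < replica_count:
        used_dcs = {n.data_center for n in chosen}
        used_cabinets = {(n.data_center, n.cabinet) for n in chosen}
        # Lower score is better: an unused data center first, then an unused cabinet.
        best = min(
            remaining,
            key=lambda n: (
                n.data_center in used_dcs,
                (n.data_center, n.cabinet) in used_cabinets,
            ),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


nodes = [
    Node("a1", "dc-west", "cab-1"),
    Node("a2", "dc-west", "cab-1"),
    Node("a3", "dc-west", "cab-2"),
    Node("b1", "dc-east", "cab-1"),
]
replicas = place_replicas(nodes, 3)
print([n.name for n in replicas])
```

With three replicas requested, the sketch selects one node in each data center before doubling up, and when it must reuse a data center it picks a different cabinet. Losing a single server, cabinet, or data center therefore still leaves at least one copy reachable.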

Conclusions

The Anametrix approach to managed data acquisition, processing, visualization and reporting provides significant cost savings. Internet-based, shared computing platforms are attractive because they let businesses quickly access hosted, managed software assets on demand and altogether avoid the costs and complexity associated with the purchase, installation, configuration, and ongoing maintenance of an on-premises data center. Dedicated hardware, software, and accompanying administrative staff are not needed, which results in additional cost savings for businesses.

The Anametrix platform provides world-class security, proven scalability, performance and high availability.

Anametrix continually monitors and gathers operational information from the Anametrix cloud. This information is used to help drive incremental improvements and new features that benefit existing and new clients.

ABOUT ANAMETRIX

Anametrix transforms businesses with marketing analytics. We collect, analyze and make sense out of data across all marketing channels in real time to enable marketers to discover new truths about customers, prospects and the market at large. Anametrix delivers 360-degree visibility into business data to uncover new trends and hidden correlations, explore new relationships and deliver a bigger and more predictable impact on revenue. Founded in 2010 by the trailblazing web analytics team behind WebSideStory, Anametrix has headquarters in San Diego, Calif.

For more information, visit our Website, Twitter, Facebook, Google+, and our Blog.