© Nube Technologies Better decisions through better data

Reifier

Page 1: Reifier

© Nube Technologies

Better decisions through better data

Page 2: Reifier

About Myself and Nube

- Big data - Hadoop, Spark
- Analytics, Data wrangling, Machine Learning
- Nube Products - Reifier, Crux and HIHO
- IIT Delhi, 98.
- Cofounder from IIT Kanpur, 97

Page 3: Reifier

Business Challenges

Business data is spread across many systems
● Discovering information is a challenge - which are the entities we need to address?
● Consolidating information is a challenge - it is unclear whether the data ties back to a single entity
● Enhancing data is a challenge - are these new records genuine or do they already exist?

Page 4: Reifier

The problem - lake or swamp?

According to Gartner, businesses lose up to 25% of potential revenue due to the lack of a multichannel view of data. 67% of data scientists say cleaning, organizing and linking data is their most time-consuming task, and 52.3% cite poor data quality as their biggest challenge.

Page 5: Reifier

Technical Challenges for Matching

● Data volumes are high
● Each record has multiple dimensions
● Exact matches are rare
● Comparing each record with every other is not possible
● There are many disparate systems
● Languages have unique issues
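
The point that comparing each record with every other is not possible follows from the pair count: n records yield n(n-1)/2 candidate pairs. The sketch below is an illustration only and not Reifier's actual method. It counts those pairs and shows one common workaround, blocking, where only records sharing a cheap key are compared. The record fields and the `block_key` function are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pair_count(n):
    """Number of distinct record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def block_key(record):
    # Hypothetical blocking key: first letter of the name plus the city.
    return (record["name"][:1].lower(), record["city"].lower())


def candidate_pairs(records):
    """Yield index pairs of records that share a blocking key."""
    blocks = defaultdict(list)
    for idx, record in enumerate(records):
        blocks[block_key(record)].append(idx)
    for members in blocks.values():
        yield from combinations(members, 2)


# Made-up records for illustration.
records = [
    {"name": "Acme Corp", "city": "Delhi"},
    {"name": "ACME Corporation", "city": "Delhi"},
    {"name": "Bharat Traders", "city": "Kanpur"},
    {"name": "Acme Corp", "city": "Mumbai"},
]

all_pairs = pair_count(len(records))        # 6 pairs without blocking
blocked = sorted(candidate_pairs(records))  # [(0, 1)] with blocking
```

For the 40 million records mentioned in the Government of India case study, `pair_count(40_000_000)` is roughly 8 x 10^14 pairs, which is why some form of candidate reduction is needed. A blocking key that is too strict can miss true matches (the Mumbai record above is never compared), so the choice of key is a trade-off.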

Page 6: Reifier

Technical Challenges for Matching

● Discovering and maintaining rules for data quality is extremely tough
● Custom coding and domain-specific logic make maintenance a nightmare
● No one size fits all - big custom implementations are needed every time, even after using existing tools

Page 7: Reifier

Reifier Advantages

● Point and Shoot - Zero config
● Learns similarity definitions from data
● No hard coding of business rules
● Highly scalable - runs on open source Apache Spark
● Advanced Machine Learning algorithms pick the optimal solution
● Domain agnostic, can work with various kinds of data
● Utilities to create labeled data available - just point it to the data

Page 8: Reifier

Reifier Advantages

● Handles different languages - English, Chinese, Japanese
● Highly accurate results
● Available as a library or as a private/public cloud deployment
● REST interface
● AJAX based web front end
● Real time as well as batch support
● Support and documentation through web based support portal http://reifier.freshdesk.com

Page 9: Reifier

Customer Feedback

“Before Reifier we had to use a lot of manual efforts to identify potential duplicates in customer data, now the system can learn patterns and find duplicates for us intelligently. It’s a breakthrough to a long-standing issue of our businesses.”

- Mr. Dave Chan, Regional Director Business Intelligence, UBM Asia

Page 10: Reifier

Case Study - UBM Asia

- Deduplication of marketing data
- Combination of English, Chinese, Japanese and other languages
- Up to 1 million new records per week
- A temp can process only about 800 records per day
- AWS hosted, yearly license
- Reference customer
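
The gap between the weekly volume and the manual throughput above can be made concrete with a little arithmetic. This sketch uses only the two figures from the slide. The five-day working week is an assumption added for illustration and does not come from the case study.

```python
import math

new_records_per_week = 1_000_000  # upper bound quoted on the slide
records_per_temp_day = 800        # manual throughput quoted on the slide
working_days_per_week = 5         # assumption, not from the slide

# Temp-days of manual work generated by one week of new records.
temp_days_needed = new_records_per_week / records_per_temp_day

# Temps that would have to work in parallel to keep up.
temps_needed = math.ceil(temp_days_needed / working_days_per_week)

print(temp_days_needed)  # 1250.0
print(temps_needed)      # 250
```

Under these assumptions, keeping pace manually with a peak week would take about 250 temps working in parallel.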

Page 11: Reifier

Case Study - Government of India

- Invited for data matching for intelligence agencies
- Reifier outperformed leading international competition by 2x on accuracy and more than 10x on speed
- Matched 40 million records

Page 12: Reifier

Case Study - BFSI

A banking institution uses Reifier to run loan applications against credit listing data to ensure that they are not dealing with blacklisted individuals and corporates.

Page 13: Reifier

Case Study - BFSI

A leading insurance provider uses Reifier to prevent fraudulent claims. By creating a centralized consolidated data repository, the company reduces overexposure of an individual who has multiple policies. By matching records, Reifier also helps find out average policy per individual and household.

Page 14: Reifier

Case Study - BFSI

A credit rating company utilizes Reifier to consolidate personal credit histories from different sources and provide accurate ratings to their customers.

Page 15: Reifier

Case Study - Cross Selling

A telecom company offers various products and services and wants to cross sell to existing customers. Existing information is fuzzily matched for accurate customer segmentation and marketing.

Page 16: Reifier

Case Study - Regulatory

Regulatory compliance of all kinds - including compliance related to policies, taxes, privacy, anti-terror, and anti-money-laundering - requires matching up data pulled from a variety of sources. With Reifier, organizations can meet regulatory mandates with capabilities that support everything from simple deduplication of customer lists to matching data against government lists of suspected terrorists.

Page 17: Reifier

Case Study - Lead Generation

A services company sources organization and people data from LinkedIn and Crunchbase and uses Reifier to match it against existing in-house entities to identify leads.

Page 18: Reifier

Case Study - Retail Operations

By consolidating vendor information from different geographies, source systems and channels, a retail operator gets a complete view of its supply chain and is able to garner better deals and discounts from its vendors. Reifier helps in cutting costs for the retailer.

Page 19: Reifier

Case Study - Telecom

Using Reifier, telecom companies can detect delinquency patterns by identifying non-paying customers who evade detection by enrolling under similar-sounding names and addresses with different formatting and spellings.

Page 20: Reifier

Case Study - Directory Service

A local search company lists millions of regional businesses, restaurants and contacts. They periodically crawl the web to update their listing database. Information crawled from the web contains similar entries from different websites and also overlaps with pre-existing entries in the database. Reifier helps the search company compare their existing listings with potential listings from the crawled data, and keeps their directory up to date and free from duplicate data.

Page 21: Reifier

Case Study - Ecommerce

Matching for competitive pricing and catalog enrichment

Page 22: Reifier

Reifier News

Invited to present at Strata Hadoop World 2015, Singapore.

Page 23: Reifier

Reifier - News

Reifier presented at Spark Summit 2015, SFO, USA.

Page 24: Reifier

Reifier - News

Reifier technology presented at Spark Summit 2014 in San Francisco, USA.

Page 25: Reifier

Reifier - News

● Reifier 1.0 released in October 2014 with one international paying customer.
● Reifier 2.0 with interactive web GUI released March 2015.
● GOI POC in Aug - Sep 2015
● Working on real time matching, merging, GUI enhancements.

Page 26: Reifier

Reifier - News

Page 27: Reifier

Reifier Industry Validation

Part of MapR App Gallery

Page 28: Reifier

Reifier Industry Validation

Covered in Databricks as a leading machine learning tool

Page 29: Reifier

Reifier Industry Validation

Part of Databricks Certified on Spark Apps

Page 30: Reifier

Reifier Technology

● Accept or create training data with marked duplicates
● Identify similarity and indexing rules through Machine Learning
● Group near similar records together
● Match and predict similar records
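
The steps listed on this slide can be sketched in miniature. The code below is not Reifier's algorithm: it is a stand-in that uses Python's standard-library `difflib` similarity and a single learned threshold in place of real machine learning, and for brevity it compares all pairs instead of learning indexing rules. The training pairs, record names and function names are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def learn_threshold(labeled_pairs):
    """Pick the cut-off that classifies the most labeled pairs correctly."""
    scored = [(similarity(a, b), is_dup) for a, b, is_dup in labeled_pairs]
    candidates = sorted({score for score, _ in scored})

    def correct(threshold):
        return sum((score >= threshold) == is_dup for score, is_dup in scored)

    return max(candidates, key=correct)


def group_records(records, threshold):
    """Union-find grouping of records whose similarity reaches the threshold."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, record in enumerate(records):
        groups.setdefault(find(i), []).append(record)
    return sorted(groups.values())


# Step 1: training data with marked duplicates (made up for illustration).
training = [
    ("Nube Technologies", "Nube Technologies Pvt Ltd", True),
    ("UBM Asia", "UBM Asia Ltd.", True),
    ("Nube Technologies", "UBM Asia", False),
    ("Apache Spark", "ElasticSearch", False),
]

# Step 2: learn the similarity rule (here, just one threshold).
threshold = learn_threshold(training)

# Steps 3 and 4: match unseen records and group the near-similar ones.
unseen = ["Acme Corp", "Bharat Traders", "Acme Corp.",
          "Bharat Traders Ltd", "Zenith Foods"]
groups = group_records(unseen, threshold)
```

With this toy data, `groups` pairs each company with its variant spelling and leaves "Zenith Foods" on its own. A production system would use many features per record and a trained classifier, and would restrict comparisons to candidate pairs instead of scoring every pair.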

Page 31: Reifier

Reifier - learn

Page 32: Reifier

Reifier - learn

Page 33: Reifier

Reifier - learn

Page 34: Reifier

Reifier Under The Hood

● Built using open source
● Apache Spark
● ElasticSearch
● Machine Learning
● Java
● Scala

Page 35: Reifier

Thank You

Thanks for your time. Please feel free to write to [email protected] for more details.