© Nube Technologies
Better decisions through better data
About Myself and Nube
- Big data - Hadoop, Spark
- Analytics, Data wrangling, Machine Learning
- Nube Products - Reifier, Crux and HIHO
- IIT Delhi, 98
- Cofounder from IIT Kanpur, 97
Business Challenges
Business data is spread across many systems
● Discovering information is a challenge - which are the entities we need to address?
● Consolidating information is a challenge - it is unclear whether the data ties back to a single entity
● Enhancing data is a challenge - are these new records genuine or do they already exist?
The problem - lake or swamp?
According to Gartner, businesses lose up to 25% of potential revenue due to the lack of a multichannel view of data. 67% of data scientists say cleaning, organizing and linking data is their most time-consuming task, and 52.3% cite poor data quality as their biggest challenge.
Technical Challenges for Matching
● Data volumes are high
● Each record has multiple dimensions
● Exact matches are rare
● Comparing each record with every other is not possible
● There are many disparate systems
● Languages have unique issues
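Two of these challenges, rare exact matches and the impossibility of comparing every record with every other, are commonly addressed with fuzzy comparison plus blocking. The following is a minimal Python sketch of that general idea, not Reifier's implementation: the `blocking_key` rule, the `name` and `city` fields, the sample records and the 0.8 threshold are all assumptions made up for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record):
    """Hypothetical blocking rule: first three letters of the name plus the city."""
    return (record["name"].strip().lower()[:3], record["city"].strip().lower())


def similarity(a, b):
    """Fuzzy name similarity in [0, 1], since exact matches are rare."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def candidate_pairs(records):
    """Yield only index pairs that share a blocking key, not all n*(n-1)/2 pairs."""
    blocks = defaultdict(list)
    for idx, record in enumerate(records):
        blocks[blocking_key(record)].append(idx)
    for members in blocks.values():
        yield from combinations(members, 2)


def find_matches(records, threshold=0.8):
    """Return candidate pairs whose similarity reaches the (assumed) threshold."""
    return [
        (i, j)
        for i, j in candidate_pairs(records)
        if similarity(records[i], records[j]) >= threshold
    ]


# Made-up sample data for illustration only.
records = [
    {"name": "Acme Trading Co.", "city": "Delhi"},
    {"name": "ACME Trading Company", "city": "delhi"},
    {"name": "Bright Foods Ltd", "city": "Mumbai"},
    {"name": "Acme Trading Co", "city": "Singapore"},
]

all_pairs = len(records) * (len(records) - 1) // 2  # exhaustive comparison count
compared = list(candidate_pairs(records))           # pairs actually compared
matches = find_matches(records)
```

With these four records, blocking reduces the work from six pairwise comparisons to one. A hand-written key like this can also hide true duplicates that land in different blocks, which is one reason to learn such rules from data instead of hard coding them.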
Technical Challenges for Matching
● Discovering and maintaining rules for data quality is extremely tough
● Custom coding and domain-specific logic make maintenance a nightmare
● No one size fits all: big custom implementations are needed every time, even after using existing tools
Reifier Advantages
● Point and Shoot - Zero config
● Learns similarity definitions from data
● No hard coding of business rules
● Highly scalable - runs on open source Apache Spark
● Advanced Machine Learning algorithms pick the optimal solution
● Domain agnostic, can work with various kinds of data
● Utilities to create labeled data are available - just point them to the data
Reifier Advantages
● Handles different languages - English, Chinese, Japanese
● Highly accurate results
● Available as a library or as a private/public cloud deployment
● REST interface
● AJAX based web front end
● Real time as well as batch support
● Support and Documentation through web based support portal http://reifier.freshdesk.com
Customer Feedback
“Before Reifier we had to use a lot of manual efforts to identify potential duplicates in customer data, now the system can learn patterns and find duplicates for us intelligently. It’s a breakthrough to a long-standing issue of our businesses.”
- Mr. Dave Chan, Regional Director Business Intelligence, UBM Asia
Case Study - UBM Asia
- Deduplication of marketing data
- Combination of English, Chinese, Japanese and other languages
- Up to 1 million new records per week
- A temp can do only about 800 records per day
- AWS hosted, yearly license
- Reference customer
Case Study - Government of India
- Invited for data matching for intelligence agencies
- Reifier outperformed leading international competition 2x on accuracy and >10x for speed
- Matched 40 million records
Case Study - BFSI
A banking institution uses Reifier to run loan applications against credit listing data to ensure that they are not dealing with blacklisted individuals and corporates.
Case Study - BFSI
A leading insurance provider uses Reifier to prevent fraudulent claims. By creating a centralized consolidated data repository, the company reduces overexposure of an individual who has multiple policies. By matching records, Reifier also helps find out average policy per individual and household.
Case Study - BFSI
A credit rating company utilizes Reifier to consolidate personal credit histories from different sources and provide accurate ratings to their customers.
Case Study - Cross Selling
A telecom company offers various products and services and wants to cross-sell to existing customers. Existing information is fuzzily matched for accurate customer segmentation and marketing.
Case Study - Regulatory
Regulatory compliance of all kinds - including compliance related to policies, taxes, privacy, anti-terror, and anti-money-laundering - requires matching up data pulled from a variety of sources. With Reifier, organizations can meet regulatory mandates with capabilities that support everything from simple deduplication of customer lists to matching data against government lists of suspected terrorists.
Case Study - Lead Generation
A services company sources organization and people data from LinkedIn and Crunchbase and uses Reifier to match it against existing in-house entities to identify leads.
Case Study - Retail Operations
By consolidating vendor information from different geographies, source systems and channels, a retail operator gets a complete view of its supply chain and is able to garner better deals and discounts from its vendors. Reifier helps in cutting costs for the retailer.
Case Study - Telecom
Using Reifier, telecom companies can detect delinquency patterns by identifying non-paying customers who evade detection by enrolling with similar-sounding names and addresses that use different formatting and spellings.
Case Study - Directory Service
A local search company lists millions of regional businesses, restaurants and contacts. They periodically crawl the web to update their listing database. The crawled information contains similar entries from different websites, as well as entries that duplicate records already in the database. Reifier helps the search company compare its existing listings with potential listings from the crawled data, keeping the directory up to date and free from duplicate data.
Case Study - Ecommerce
Matching for competitive pricing and catalog enrichment
Reifier News
Invited to present at Strata Hadoop World 2015, Singapore.
Reifier - News
Reifier presented at Spark Summit 2015, SFO, USA.
Reifier - News
Reifier technology presented at Spark Summit 2014, San Francisco, USA.
Reifier - News
● Reifier 1.0 released in October 2014 with one international paying customer.
● Reifier 2.0 with interactive web GUI released March 2015.
● GOI POC in Aug - Sep 2015
● Working on real time matching, merging, GUI enhancements.
Reifier Industry Validation
Part of MapR App Gallery
Reifier Industry Validation
Covered in Databricks as a leading machine learning tool
Reifier Industry Validation
Part of Databricks Certified on Spark Apps
Reifier Technology
● Accept or create training data with marked duplicates
● Identify similarity and indexing rules through Machine Learning
● Group near similar records together
● Match and predict similar records
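The slides do not describe the algorithms behind these steps, so the following Python sketch only illustrates the overall flow under strong simplifying assumptions. It "learns" a single similarity cut-off from labeled pairs, standing in for the learned similarity rules, and then groups new records with a small union-find. It omits learned indexing rules entirely and compares all pairs. All names and sample data here (`learn_threshold`, `group_records`, the company names) are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive fuzzy string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def learn_threshold(labeled_pairs):
    """Pick the similarity cut-off that classifies the most labeled pairs correctly."""
    scored = [(similarity(a, b), is_dup) for a, b, is_dup in labeled_pairs]
    candidates = sorted({score for score, _ in scored})

    def correct(threshold):
        return sum((score >= threshold) == is_dup for score, is_dup in scored)

    return max(candidates, key=correct)


def group_records(names, threshold):
    """Group records whose similarity reaches the threshold (union-find)."""
    parent = list(range(len(names)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())


# Step 1: training data with marked duplicates (made-up examples).
labeled = [
    ("Acme Trading Co", "Acme Trading Co.", True),
    ("Bright Foods Ltd", "Bright Foods Limited", True),
    ("Acme Trading Co", "Bright Foods Ltd", False),
    ("Bright Foods Ltd", "Zenith Motors", False),
]

# Step 2: learn the matching rule (here, only a threshold).
threshold = learn_threshold(labeled)

# Steps 3 and 4: group and match unseen records.
new_records = [
    "Acme Trading Co",
    "ACME Trading Co.",
    "Bright Foods Ltd",
    "Bright Foods Limited",
    "Zenith Motors",
]
groups = group_records(new_records, threshold)
```

On this toy data the two spelling variants of each company end up in the same group and the unrelated record stays alone. A production system would learn over many fields and similarity functions and would use an index to avoid the all-pairs loop.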
Reifier - learn
Reifier Under The Hood
● Built using open source
● Apache Spark
● ElasticSearch
● Machine Learning
● Java
● Scala
Thank You
Thanks for your time. Please feel free to write to [email protected] for more details.