Your Grandmother Doesn’t Like Surprises
A case study of ANM’s Travel Site
Jeffrey Catlin – Lexalytics, Inc.
Bob Pierce – Fast Search & Transfer
May 3, 2005
Overview
Project Overview
  Project Goals
  Technology Elements
Site Features
  Improved Search
  Automated Processing of Hotel Reviews
Knowledge Management in Action
  Sentiment / Tone capability is unique and fully automated
  Improvement over 1 to 5 star ratings
Customer Reaction and Futures
  Go Live for this site
  Other sites utilizing this technology
Contacts:
  Jeff Catlin: jeff@lexalytics.com
  Bob Pierce: bob.pierce@fastsearch.com
Project Overview
ANM (Associated News Media) is a UK publisher that is leveraging its content to reach into Internet applications such as travel
Project Goals:
  Improve the stickiness of the site, which is key to generating more ad dollars
  Improve and simplify the search features of the site, including sorting by a variety of field types and making search available throughout the site
  Expose and automate user reviews. Providing accurate and ready access to user reviews improves stickiness and acceptance of the site
  Reduce the cost of utilizing user reviews
  Dramatically increase the breadth of coverage of user reviews
Project Overview
Technology Elements:
ANM:
  Custom application interface
  Utilizing FAST ESP for search features
FAST Marketrac:
  FAST ESP provides application search features
  FAST Content Processing Pipeline and web spider for reviews
Lexalytics:
  Salience Server for scoring hotel and travel reviews
  Sentiment Toolkit: build out a travel-focused sentiment/tone database
Knowledge Management in Action
Trustworthy user reviews are a key to the stickiness of the site
Reviews are obtained through feeds and spidering:
  Feeds: IgoUgo & Fodors
  Spidering: tripadvisor.com & virtualtourist.com
Reviews are monitored and updated continuously and processed through the FAST Content Processing Pipeline
Automated reviews are more consistent, trusted and up to date than star ratings
  Unique feature
  Totally automated and more consistent than human ratings
Knowledge Management in Action
How does it all work?
  Lexalytics provides out-of-the-box sentiment/tone analysis
  Toolkit to build scoring databases for verticals like travel, finance, security
  System builds up a dictionary of scored phrases that indicate good or bad depending on the vertical it’s used for
  Phrase scores are determined using a training set and MSN Search
  Scores measure the nearness of phrases to good and/or bad terms
  Results in a phrase dictionary with phrases like:
    Sunny Day: 1.2706
    Unsafe food: -0.7634
The Lexalytics Salience Server is embedded within FAST’s Marketrac product, so integration of sentiment/tone is very straightforward
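The phrase-scoring idea described above (scores reflect how near a phrase sits to good versus bad terms) can be illustrated with a log-ratio of co-occurrence counts. This is a minimal sketch under stated assumptions, not the Lexalytics formula: the function name `orientation_score`, the smoothing constant, and every count below are hypothetical, and in practice the counts would come from a training set or search-engine hit counts rather than a hard-coded table.

```python
import math


def orientation_score(near_good, near_bad, good_total, bad_total, smoothing=0.01):
    """Log-ratio of how often a phrase appears near good terms versus bad terms.

    Positive values lean "good", negative values lean "bad".
    The smoothing term avoids division by zero for unseen phrases.
    """
    numerator = (near_good + smoothing) * bad_total
    denominator = (near_bad + smoothing) * good_total
    return math.log2(numerator / denominator)


# Hypothetical co-occurrence counts, made up purely for illustration.
counts = {
    "sunny day": {"near_good": 900, "near_bad": 100},
    "unsafe food": {"near_good": 50, "near_bad": 400},
}
GOOD_TOTAL = 1_000_000  # assumed number of hits for the good seed terms
BAD_TOTAL = 1_000_000   # assumed number of hits for the bad seed terms

# Build a small phrase dictionary of signed scores.
phrase_dictionary = {
    phrase: orientation_score(c["near_good"], c["near_bad"], GOOD_TOTAL, BAD_TOTAL)
    for phrase, c in counts.items()
}

for phrase, score in phrase_dictionary.items():
    print(f"{phrase}: {score:+.4f}")
```

The resulting numbers will not match the dictionary values quoted on the slide; only the sign convention (positive for good, negative for bad) is carried over.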
Knowledge Management in Action
Looking at the scoring of an individual review:
  Review for Marriott Marquis: “Great stay, no elevator problems”
Reviews are scored, averaged and displayed on a 1 to 10 scale
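As a rough illustration of "scored, averaged and displayed on a 1 to 10 scale", the sketch below looks up dictionary phrases in each review, averages the matches, and maps the mean linearly onto 1 to 10. The slide does not state the actual mapping, so the assumed raw range of -2 to 2, the naive "no " negation rule, and the scores for "great stay" and "elevator problems" are all hypothetical; only the "unsafe food" value is taken from the phrase dictionary quoted earlier.

```python
def score_review(text, phrase_scores):
    """Average the scores of dictionary phrases found in the review text."""
    lowered = text.lower()
    found = []
    for phrase, score in phrase_scores.items():
        idx = lowered.find(phrase)
        if idx == -1:
            continue
        # Naive negation handling (an assumption): flip the sign when
        # "no " directly precedes the matched phrase.
        if lowered[max(0, idx - 3):idx] == "no ":
            score = -score
        found.append(score)
    return sum(found) / len(found) if found else 0.0


def to_ten_point_scale(raw, low=-2.0, high=2.0):
    """Clamp a raw score to [low, high] and map it linearly onto 1..10."""
    clamped = max(low, min(high, raw))
    return 1.0 + 9.0 * (clamped - low) / (high - low)


def hotel_rating(reviews, phrase_scores):
    """Score each review, average the results, and report on a 1..10 scale."""
    raw_scores = [score_review(review, phrase_scores) for review in reviews]
    mean = sum(raw_scores) / len(raw_scores)
    return round(to_ten_point_scale(mean), 1)


# Hypothetical scores, except "unsafe food", which is quoted in the deck.
phrase_scores = {
    "great stay": 1.5,
    "elevator problems": -1.0,
    "unsafe food": -0.7634,
}

print(hotel_rating(["Great stay, no elevator problems"], phrase_scores))
```

`hotel_rating` expects at least one review; an empty list would need a separate "no rating yet" branch.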
Customer Feedback
Customer is pleased with the site
Goes live today (5/3/05)
Tuning of the hotel scoring has allowed the customer to put their own touch on the system, giving them a unique offering
Combination of information discovery features and integrated booking should allow ANM to compete with any of the well-known travel sites.
Information Intelligence Examples
Financial news and market analysis
  Market intelligence portal and alerts for brokers
Pharmaceutical competitive analysis
  Tracking molecules, drugs and companies “in the rear-view mirror”
Intellectual property protection
  Content similarity analysis and alerting
Illegal e-commerce
  Contraband trafficking and the “whack-a-mole” problem
Cracking pornography rings
  Automated image analysis
  Chat room monitoring and alerting
Threat detection and analysis
Market Intelligence in Financial Services
Leading European financial services group
  Capital markets, insurance, real estate, asset management, securities
  Goal: Trade more competitively, create better analyst reports
Leveraged FAST ESP and FAST Marketrac
Collect actionable information ahead of general market availability
  Premium sources, blogs, local web sites, research reports, etc.
Real-time, personalized analysis
  Search domains selected by individual analysts
  Correlate price movements with related news
  Analyze news flow for market-moving potential
Communicate and act
  Minimal latency
  Profile-based SMS/e-mail alerting
  Automated “morning reports”
Because Timing is Money: First-mover Advantage in Markets
Speech by CEO at Copenhagen Business School quoted by Danish news site.
When Reuters published hours later, stock moved 2%. But these traders were already done with the trades....
Accelerate The Decision Cycle
[Figure: two decision-cycle timelines. Before: Search/Gather, Analyze, Decide, Act. After: Discover, Gather, Analyze, Decide, Act. Labels: Identify, Decision point, Impact point. Axis: Time.]
BETTER Decisions, FASTER!
Futures
Text analysis software has matured to the point where powerful applications can be deployed at a reasonable expense and with a high degree of confidence
Search and Text Analysis will play an increasingly important part in Business Intelligence, High Volume Storage and Consumer Electronics
  Entity extraction is relatively mature and fairly high-quality
  Classification (subject and tone) is being deployed in real-world apps
  Relationships between content elements are on the short-term horizon
Intellectual Property Protection
When Information, Time are the Assets
[Figure: IP protection architecture. Components shown: WWW, Crawler, Seed URL DB, Target Site profile, Real-time index, API - Similarity, Similarity Queries, Similarity Results, Check similarity, Detailed similarity check, IP Database.]
[Example document text: "...The Wimbledon and U.S. Open champion, seeded second, breezed past..." The same passage appears in the similar document.]
Similarity vector: <[wimbledon, 1][USopen,0,7][champion,0,6]…>
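One common way to compare weighted term vectors like the similarity vector in the diagram above is cosine similarity. The deck does not say which measure the system uses, so this is a sketch under assumptions: the vector values are read with decimal commas (0,7 as 0.7), the `candidate` vector is made up, and `cosine_similarity` is an illustrative helper rather than a FAST API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Terms and weights as read from the slide (decimal commas assumed).
protected = {"wimbledon": 1.0, "usopen": 0.7, "champion": 0.6}

# Hypothetical vector for a crawled page, for illustration only.
candidate = {"wimbledon": 0.9, "champion": 0.5, "seeded": 0.4}

print(round(cosine_similarity(protected, candidate), 3))
```

A score near 1 would flag the crawled page for the more detailed check; the threshold for that hand-off is not given in the deck.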
Real-Time Content Analysis
Sequential analysis compares longest common subsequence and maximum overlap.
Detected matches
1) Article extraction from websites
2) Computation of similarity primitives
Validate content and determine changes
Notification, Enforcement
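The longest common subsequence comparison mentioned above can be sketched with the standard dynamic-programming recurrence over word tokens. This is an illustrative version only: the deck does not describe tokenization or how "maximum overlap" is computed, so whitespace tokenization and the `copied_fraction` ratio are assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def copied_fraction(protected_text, candidate_text):
    """Share of the protected text's words that reappear, in order, in the candidate."""
    protected_words = protected_text.lower().split()
    candidate_words = candidate_text.lower().split()
    if not protected_words:
        return 0.0
    return lcs_length(protected_words, candidate_words) / len(protected_words)


protected_text = "The Wimbledon and U.S. Open champion, seeded second, breezed past"
candidate_text = "Yesterday the Wimbledon and U.S. Open champion, seeded second, lost"

print(copied_fraction(protected_text, candidate_text))
```

Working on words rather than characters keeps the table small and makes the ratio tolerant of inserted or dropped words, which suits detecting lightly edited copies.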