
Training Sentiment Analysis Models
How Amazon turns unstructured text into meaningful insight

Visionary companies like Amazon are leveraging sentiment analysis models to dig beyond a surface-level understanding of what people are saying and examine the nuances of how it’s being said. However, sentiment in language is a difficult thing to parse. One person’s “negative” doesn’t always match their neighbor’s, and even short phrases (“I never liked this dinky office, but I’ll be sad to leave it”) can contain layers of nuance. Those complications are only compounded in long-form writing like feature stories and product reviews.

Ideally, the most sophisticated sentiment models would deliver broad, composite scores for long-form content while simultaneously sifting through individual paragraphs, sentences, and words to extract granular-level insights. When Amazon wanted to turn that ideal into a reality, they partnered with DefinedCrowd.
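In code, that two-level goal looks roughly like the sketch below: score every segment with an off-the-shelf classifier, then roll the signed segment scores up into one document-level composite. This is a minimal illustration only; the Hugging Face pipeline and the length-weighted average are our assumptions, not the model Amazon actually trained.

```python
# Two-level sentiment sketch: per-segment labels plus a document composite.
# The default pipeline model and the weighting scheme are illustrative.
from transformers import pipeline

# Any off-the-shelf sentiment classifier works for the sketch; the default
# model returns {"label": "POSITIVE"/"NEGATIVE", "score": confidence}.
classifier = pipeline("sentiment-analysis")

def score_document(segments: list[str]) -> dict:
    """Score each segment, then combine into one composite score in [-1, 1]."""
    results = classifier(segments)
    # Map each prediction to a signed score: positive -> +conf, negative -> -conf.
    signed = [r["score"] if r["label"] == "POSITIVE" else -r["score"]
              for r in results]
    # Weight longer segments more heavily in the document-level composite.
    weights = [len(s) for s in segments]
    composite = sum(w * s for w, s in zip(weights, signed)) / sum(weights)
    return {"segment_scores": signed, "document_score": composite}

print(score_document([
    "I never liked this dinky office.",
    "But I'll be sad to leave it.",
]))
```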

The Challenge

This is exactly the kind of use case our dedicated team of NLP experts loves to sink its teeth into. Amazon provided more than 100,000 documents, ranging from short paragraphs left on their site to full-length 1,500-word articles published online.

The Solution

First, we analyzed those documents and developed an optimal segmentation methodology. On average, we cut each document into 4 distinct pieces, though the variance was wide: the longest document contained 84 unique segments.
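The flyer doesn’t spell out that segmentation methodology, so the sketch below shows just one plausible approach: split on paragraph breaks, then greedily pack the sentences of oversized paragraphs into chunks under a length cap. The max_chars value is an illustrative assumption.

```python
import re

def segment_document(text: str, max_chars: int = 600) -> list[str]:
    """Split a document into paragraph-based segments, breaking very long
    paragraphs at sentence boundaries so segments stay under max_chars."""
    segments = []
    for para in re.split(r"\n\s*\n", text.strip()):
        para = " ".join(para.split())  # normalize internal whitespace
        if len(para) <= max_chars:
            segments.append(para)
            continue
        # Greedily pack sentences into chunks of at most max_chars.
        # (A single sentence longer than the cap still becomes its own chunk.)
        chunk = ""
        for sentence in re.split(r"(?<=[.!?])\s+", para):
            if chunk and len(chunk) + len(sentence) + 1 > max_chars:
                segments.append(chunk)
                chunk = sentence
            else:
                chunk = f"{chunk} {sentence}".strip()
        if chunk:
            segments.append(chunk)
    return segments

doc = "First paragraph.\n\nSecond paragraph. It has two sentences."
print(segment_document(doc))
# ['First paragraph.', 'Second paragraph. It has two sentences.']
```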

Our Neevo contributors tagged the sentiment of each individual segment while also providing a high-level sentiment score for each document as a whole. Throughout that process, we ran a wide range of automated gatekeeping procedures to monitor the quality of their work in real time. In the end, we sourced half a million annotations on the original 100,000 documents.
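Where segments are labeled redundantly, a standard way to turn raw annotations into training labels is a majority vote with an agreement threshold, routing low-agreement items to an internal spot check. The sketch below illustrates that general technique; the threshold and data shapes are our assumptions, not DefinedCrowd’s actual pipeline.

```python
from collections import Counter

def aggregate_labels(annotations: dict[str, list[str]],
                     min_agreement: float = 2 / 3):
    """Majority-vote each segment's redundant labels; flag low-agreement
    segments for review instead of silently accepting them."""
    gold, flagged = {}, []
    for segment_id, labels in annotations.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count / len(labels) >= min_agreement:
            gold[segment_id] = top_label
        else:
            flagged.append(segment_id)  # route to an internal spot check
    return gold, flagged

gold, flagged = aggregate_labels({
    "doc42-seg1": ["negative", "negative", "neutral"],
    "doc42-seg2": ["positive", "negative", "neutral"],
})
print(gold)     # {'doc42-seg1': 'negative'}
print(flagged)  # ['doc42-seg2']
```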

The Results

In partnering with DefinedCrowd, Amazon benefitted from extensive data expertise, customizable workflows, and full-service data solutions that guaranteed quality results, even on a data collection this complex. Our rigorous qualification tests, analysis of text-to-speed and segment-to-speed ratios, and inter-annotator agreement calculations led to an error rate of less than 3%. High-precision training data makes for high-performance models. Amazon knows this all too well, and it chooses its data partners accordingly.

The results, by the numbers:

Documents provided: 100,000
Segments identified per document (average): 4
Annotations collected: 500,000
Accuracy: 97.3%*

* Quality controls run in real time:

RTA % Tag Precision (percentage of correct tags vs. the RTA task): users with a low RTA % were prevented from working.
Average Text-to-Speed Ratio (length of the input document / task time): outliers spot checked internally.
Average Segment-to-Speed Ratio (number of segments / task time): outliers spot checked internally.
Offensiveness (percentage of a user’s unique assessments vs. 2 other annotators): values above 20% spot checked internally.
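As an illustration of how the two speed ratios above can surface outliers, the sketch below computes a characters-per-second ratio for each task and flags anything more than three times the median rate, i.e. suspiciously fast work. The field names and the factor are illustrative assumptions; the flyer says only that outliers were spot checked internally.

```python
from statistics import median

def speed_outliers(tasks: list[dict], fast_factor: float = 3.0) -> list[str]:
    """Flag tasks whose text-to-speed ratio (characters of input per second
    of task time) exceeds fast_factor times the median ratio."""
    ratios = {t["task_id"]: t["chars"] / t["seconds"] for t in tasks}
    typical = median(ratios.values())
    return [tid for tid, r in ratios.items() if r > fast_factor * typical]

# Hypothetical task records: id, input length in characters, time spent.
tasks = [
    {"task_id": "t1", "chars": 1200, "seconds": 300},
    {"task_id": "t2", "chars": 1100, "seconds": 280},
    {"task_id": "t3", "chars": 1300, "seconds": 12},  # suspiciously fast
    {"task_id": "t4", "chars": 900, "seconds": 240},
]
print(speed_outliers(tasks))  # ['t3'] under these assumptions
```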

The days of the quantity-focused data provider are long gone. The dawn of the quality-focused data partner has arrived. Want to learn how partnering with DefinedCrowd can unlock cutting-edge AI solutions for your business?

Contact us: [email protected]
Visit us: www.definedcrowd.ai