
Building Unbiased Comment Toxicity Classification Model

Louise Qianying Huang, and Mi Jeremy Yu

Department of Statistics, Stanford University
{qyhuang, miyu1996}@stanford.edu


Task Introduction

Machine learning and deep learning techniques have achieved state-of-the-art results in many natural language processing tasks, such as question answering and neural machine translation. However, natural language often contains bias and stereotyping, and models trained on such corpora will unconsciously learn and intensify those biases. In the previous iteration of the Jigsaw toxicity classification competition, the data contained unintended biases, as certain identities were overwhelmingly referred to in offensive ways. Unfortunately, models internalize such language biases and learn to predict toxicity based on whether these identities are mentioned. In this project, we aim to build an unbiased classification model that takes raw online comments as input and classifies their toxicity, using both non-neural and neural techniques.

Data

The data were collected from the Civil Comments platform and annotated by the Jigsaw/Conversation AI team. The original training set has 1,804,874 examples of public comments. The toxicity of each comment is quantified as a label ranging from 0 to 1, where 0 represents not toxic at all and 1 represents severely toxic. Each example has the overall toxicity label as well as five toxicity sub-type labels. Additionally, each example is labelled with a variety of identity attributes, indicating whether those identities are mentioned in the comment. Since the original test set does not have the identity attributes, for faster testing we held out 5% of the original training set as our new test set. Therefore, the new training set has 1,715,241 comments and the new test set has 89,633 comments. The left graph shows some common words associated with toxicity greater than 0.5; the right graph shows the distribution of toxicity in the training set.
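The 5% hold-out described above can be sketched as follows. This is a minimal illustration using a tiny stand-in DataFrame; the column names and the fixed random seed are assumptions, not the competition's exact schema.

```python
# Sketch of holding out 5% of the labelled training data as a new test
# set, so identity annotations (absent from the official test set)
# remain available for bias evaluation. Toy data stands in for the
# real 1,804,874-row training set.
import pandas as pd
from sklearn.model_selection import train_test_split

train = pd.DataFrame({
    "comment_text": ["you are great", "awful person",
                     "nice day", "terrible idea"] * 5,
    "target": [0.0, 0.9, 0.1, 0.7] * 5,
})

# test_size=0.05 mirrors the 5% split described in the Data section.
train_df, test_df = train_test_split(train, test_size=0.05, random_state=0)
print(len(train_df), len(test_df))  # 19 1
```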

Non-Neural Methods

•Feature Extraction. We first tokenized the corpus using 1-gram and 2-gram word tokens and 1-gram up to 6-gram character tokens. We then counted the occurrence of each token with sublinearly scaled frequency to smooth the influence of raw counts, and finally normalized the frequencies by inverse document frequency. We selected the top 20,000 word features and 30,000 character features.

•Logistic Regression Baseline. The logistic regression classifier has the form $h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$, which returns the probability of a comment being toxic.

•Naive Bayes Classifier Baseline. The multinomial event model assumes that the tokens $x_j^{(i)}$ are conditionally independent given $y$. The likelihood for the multinomial event model is $\ell(\phi_y, \phi_{k|y=0}, \phi_{k|y=1}) = \prod_{i=1}^{n} \left( \prod_{j=1}^{d_i} p(x_j^{(i)} \mid y;\, \phi_{k|y=0}, \phi_{k|y=1}) \right) p(y^{(i)}; \phi_y)$. After estimating the parameters, the corresponding posterior probability can be calculated using Bayes' rule.
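The non-neural pipeline above can be sketched with scikit-learn: word and character TF-IDF features with sublinear term frequency, fed to the logistic regression and multinomial naive Bayes baselines. The toy comments and the binarized labels are illustrative assumptions, not the project's actual data.

```python
# Minimal sketch of the non-neural baselines: 1-2 gram word TF-IDF plus
# 1-6 gram character TF-IDF (sublinear tf, idf normalization), then
# logistic regression and multinomial naive Bayes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from scipy.sparse import hstack

comments = ["you are wonderful", "you are an idiot",
            "what a lovely day", "shut up, moron"]
labels = [0, 1, 0, 1]  # toxicity label binarized at 0.5 (assumption)

# Feature extraction as described: word 1-2 grams, char 1-6 grams,
# sublinear tf scaling, capped at 20,000 / 30,000 features.
word_vec = TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True,
                           max_features=20000)
char_vec = TfidfVectorizer(analyzer="char", ngram_range=(1, 6),
                           sublinear_tf=True, max_features=30000)
X = hstack([word_vec.fit_transform(comments),
            char_vec.fit_transform(comments)])

# h_theta(x) = 1 / (1 + exp(-theta^T x)): probability a comment is toxic.
lr = LogisticRegression().fit(X, labels)
# Multinomial event model on the same (non-negative) features.
nb = MultinomialNB().fit(X, labels)

X_new = hstack([word_vec.transform(["you idiot"]),
                char_vec.transform(["you idiot"])])
print(lr.predict_proba(X_new)[0, 1], nb.predict_proba(X_new)[0, 1])
```

Concatenating the two sparse feature blocks with `hstack` keeps memory manageable at the corpus scale described in the Data section.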

Neural Methods

•Embeddings. We used various pretrained word embeddings, on their own or combined together, to mitigate the potential biases in any single embedding. We experimented with GloVe 300D trained on Common Crawl [3] and fastText 300D trained on Common Crawl [2]. We also implemented the character-level embedding introduced in [1].
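One simple way to combine pretrained embeddings, as described above, is concatenation per token. The sketch below uses tiny random stand-ins for the real 300-D GloVe and fastText files, which are too large to inline; the zero-vector fallback for out-of-vocabulary tokens is also an assumption.

```python
# Sketch of combining two pretrained embeddings by concatenation.
# Random 4-D vectors stand in for the real GloVe / fastText 300-D files.
import numpy as np

rng = np.random.default_rng(0)
glove = {"toxic": rng.normal(size=4), "comment": rng.normal(size=4)}
fasttext = {"toxic": rng.normal(size=4), "comment": rng.normal(size=4)}

def embed(token, dim=4):
    """Concatenate GloVe and fastText vectors; zeros for OOV tokens."""
    g = glove.get(token, np.zeros(dim))
    f = fasttext.get(token, np.zeros(dim))
    return np.concatenate([g, f])

print(embed("toxic").shape)  # each token maps to one combined vector
```

With the real files, each token would map to a 600-D vector, so any bias specific to one embedding is diluted rather than relied upon.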

Results

The evaluation metric is generalized AUC.
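A hedged sketch of how a generalized AUC can be computed: the power mean (with p = -5, as in the Jigsaw metric) of per-identity-subgroup ROC AUCs, which penalizes a model that does poorly on any single identity group. The toy scores, labels, and subgroup masks below are illustrative assumptions, and this omits the competition's additional BPSN/BNSP terms.

```python
# Generalized (power-mean) AUC over identity subgroups.
import numpy as np
from sklearn.metrics import roc_auc_score

def power_mean(values, p=-5):
    """Generalized mean; p = -5 heavily weights the worst subgroup."""
    values = np.asarray(values, dtype=float)
    return np.mean(values ** p) ** (1.0 / p)

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.3, 0.8, 0.7, 0.5, 0.9, 0.4, 0.3])
# Boolean masks marking which comments mention each identity (toy data).
subgroups = {"a": np.array([1, 1, 1, 1, 0, 0, 0, 0], bool),
             "b": np.array([0, 0, 0, 0, 1, 1, 1, 1], bool)}

subgroup_aucs = [roc_auc_score(y_true[m], y_score[m])
                 for m in subgroups.values()]
print(power_mean(subgroup_aucs))
```

Because p is negative, a single poorly served subgroup drags the score down far more than the arithmetic mean would, which is exactly the bias sensitivity the metric is after.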

Discussion and Conclusion

To conclude, pre-trained embeddings are crucial, variations of network architecture have only minor influence on performance, and models with simpler structure are preferred. Seq2seq self-attention does help increase the performance of the model. For future work, we think incorporating pre-trained BERT and ELMo as the embedding layer may drastically increase performance. Also, if time permitted, training separate models for different subcategories of toxicity might further boost performance.

References

[1] Yoon Kim et al. "Character-Aware Neural Language Models". In: CoRR abs/1508.06615 (2015). arXiv: 1508.06615. URL: http://arxiv.org/abs/1508.06615.

[2] Tomas Mikolov et al. "Advances in Pre-Training Distributed Word Representations". In: Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018). 2018.

[3] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. "GloVe: Global Vectors for Word Representation". In: Empirical Methods in Natural Language Processing (EMNLP). 2014, pp. 1532–1543. URL: http://www.aclweb.org/anthology/D14-1162.
