CERC2016
Harassment detection: a benchmark on the #HackHarassment dataset
Alexei Bastidas, Edward Dixon, Chris Loo, John Ryan
Intel
email: edward.dixon@intel.com
Keywords: Machine Learning, Natural Language Processing, Cyberbullying
Introduction
Online harassment has been a problem to a greater or lesser extent since the early days of the internet. Previous work has applied anti-spam techniques like machine-learning-based text classification (Reynolds, 2011) to detecting harassing messages. However, existing public datasets are limited in size, with labels of varying quality. The #HackHarassment initiative [1] (an alliance of tech companies and NGOs devoted to fighting bullying on the internet) has begun to address this issue by creating a new dataset superior to its predecessors in terms of both size and quality. As we (#HackHarassment) complete further rounds of labelling, later iterations of this dataset will increase the available samples by at least an order of magnitude, enabling corresponding improvements in the quality of machine learning models for harassment detection. In this paper, we introduce the first models built on the #HackHarassment dataset v1.0 (a new open dataset, which we are delighted to share with any interested researchers) as a benchmark for future research.
Related Work
Previous work in the area by Bayzick (2011) showed that machine learning and natural language processing could be successfully applied to detect bullying messages on an online forum. However, the same work also made clear that the limiting factor on such models was the availability of a suitable quantity of labeled examples. For example, the Bayzick work relied on a dataset of 2,696 samples, only 196 of which were found to be examples of bullying behaviour. Additionally, this work relied on model types like J48 and JRIP (types of decision tree), and k-nearest-neighbour classifiers like IBk, as opposed to popular modern ensemble methods or deep-neural-network-based approaches.
Methodology
Our work was carried out using the #HackHarassment Version 1 dataset, the first iteration of which consists exclusively of Reddit posts. An initially random selection of posts, in which harassing content occurred at a rate of between 5% and 7%, was culled of benign content using models trained on a combination of existing cyberbullying datasets (Reynolds, 2011; also "Improved cyberbullying detection through personal profiles"). Each post was labelled independently by at least five Intel Security Web Analysts, and a post is considered "bullying" if it is labelled as such by 20% or more of the human labellers. As the histogram of labeller agreement shows, a perfect consensus is relatively rare, and so we rate a post as "harassing" if 20% or more of our five raters consider it to be harassing. This is a relatively balanced dataset, with 1,280 non-bullying/harassing posts and 1,118 bullying/harassing examples.
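
As a concrete reading of the labelling rule above, the following is a minimal sketch (illustrative only; this is not the project's actual vote-aggregation code) of deriving a per-post label from rater votes:

    # Minimal sketch of the labelling rule described above: a post is rated
    # "harassing"/"bullying" when at least 20% of its five (or more) raters flag it.
    def is_harassing(votes):
        """votes: one boolean per rater, True = rater flagged the post."""
        return sum(votes) / len(votes) >= 0.20

    print(is_harassing([True, False, False, False, False]))   # True  (1/5 = 20%)
    print(is_harassing([False, False, False, False, False]))  # False (0/5 = 0%)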
1 "Hack Harassment." 2016. 26 Jul. 2016 <http://www.hackharassment.com/>
All preprocessing, training and evaluation were carried out in Python, using the popular scikit-learn [2] library (for feature engineering and linear models) in combination with NumPy [3] (for matrix operations), and Keras [4] and TensorFlow [5] (for models based on deep neural networks, DNNs).
For the linear models, features were generated by tokenizing the text (breaking it apart into words), hashing the resulting unigrams, bigrams and trigrams (collections of one, two, or three adjacent words), and computing a TF/IDF value for each hashed n-gram. The resulting feature vectors were used to train and test Logistic Regression, Support Vector Machine and Gradient Boosted Tree models, with 80% of the data used for training and 20% held out for testing (results given are based on the held-out 20%).
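
A minimal sketch of this pipeline, assuming scikit-learn's HashingVectorizer and TfidfTransformer as the hashing and TF/IDF stages; the toy posts, labels, and parameter values are illustrative assumptions, not the paper's data or settings:

    from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    # Toy stand-ins for the Reddit posts and their 0/1 (benign/harassing) labels.
    posts = [
        "have a great day", "thanks for the advice", "nice post, well argued",
        "interesting point", "congrats on the launch",
        "nobody likes you", "you are pathetic", "get lost, loser",
        "shut up, idiot", "you should be ashamed",
    ]
    labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

    # Hash word uni-, bi- and tri-grams into a fixed feature space, then
    # re-weight the hashed counts by TF/IDF before the linear classifier.
    model = make_pipeline(
        HashingVectorizer(ngram_range=(1, 3), alternate_sign=False),
        TfidfTransformer(),
        LogisticRegression(),
    )

    X_train, X_test, y_train, y_test = train_test_split(
        posts, labels, test_size=0.2, stratify=labels, random_state=0)
    model.fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))

The same hashed TF/IDF vectors can be fed to the other linear and tree models named above by swapping the final pipeline step.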
For the DNN-based approach, tokenization was handled similarly: both bigram and trigram hashes were computed, these were one-hot encoded, and dense representations of these features were learned during training, as per Joulin et al. (2016).
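
The following Keras sketch shows a fastText-style model of this kind (Joulin et al., 2016); the hash-bucket count, sequence length, and embedding size are assumptions for illustration, not the settings used in the paper:

    import numpy as np
    from tensorflow import keras

    NUM_BUCKETS = 2 ** 20  # size of the n-gram hashing space (assumed)
    MAX_LEN = 100          # hashed n-gram ids per post, padded/truncated (assumed)

    model = keras.Sequential([
        # Each hashed n-gram id (an index into the one-hot bucket space) is
        # mapped to a dense vector; the vectors are averaged and classified.
        keras.layers.Embedding(input_dim=NUM_BUCKETS, output_dim=16),
        keras.layers.GlobalAveragePooling1D(),
        keras.layers.Dense(1, activation="sigmoid"),  # harassing vs. benign
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Random stand-ins for hashed bigram/trigram ids and labels.
    x = np.random.randint(0, NUM_BUCKETS, size=(32, MAX_LEN))
    y = np.random.randint(0, 2, size=(32,))
    model.fit(x, y, epochs=1, batch_size=8)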
[2] "scikit-learn: machine learning in Python — scikit-learn 0.17.1 …" 2011. 29 Jul. 2016 <http://scikit-learn.org/>
[3] "NumPy — Numpy." 2002. 29 Jul. 2016 <http:/