Hate Speech on Social Media – The Possibility of Automatic Detection and Elimination

Jędrzej Wieczorkowski, Aleksandra Suwińska

Abstract

The article addresses hate speech and other forms of verbal aggression on the Internet, as well as the possibility of detecting them automatically. It discusses studies confirming that text mining methods are partially effective in the automatic detection of hate speech on social media. Hate speech is verbal aggression directed at people because of their membership in a group (national, racial, religious, etc.) and has become a significant social and economic problem. Automatic detection can significantly support the management of online news websites and social media by assisting the moderation of incoming content. Moreover, eliminating online hate speech reduces its negative social and economic effects. The linguistic and cultural specificity of hate speech complicates detection, and the problem has so far not been adequately addressed for Polish conditions, which is the research gap considered here. The study used a database of Twitter posts, to which methods such as artificial neural networks, the naïve Bayes classifier, and support vector machines were applied. The results confirm the thesis that text mining methods can be used in the process of reducing hate speech, but at present the described methods do not allow the elimination of such content to be fully automated. The article presents the issue primarily in terms of the significance and scale of the problem and the possibility of solving it, and less from the point of view of technical details.

JEL: L86, L82, C40, O30

Keywords: hate speech, cyberbullying, social media, text mining, Twitter
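
As an illustration of one of the text mining methods named in the abstract, the following is a minimal sketch of a naïve Bayes text classifier. It is not the authors' implementation: the whitespace tokenizer, the class labels `hateful` and `neutral`, the class name `NaiveBayesTextClassifier`, and the training sentences are all assumptions made for this example. The tokens `slur_a`, `slur_b`, `group_x`, and `group_y` are invented placeholders, not real data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real Polish text
    # would need proper normalization and morphological handling.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # skip tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented placeholder data, for illustration only.
train_texts = [
    "group_x are slur_a and should leave",
    "all group_y are slur_b",
    "slur_a group_x go away",
    "great match yesterday evening",
    "the weather is lovely today",
    "looking forward to the concert",
]
train_labels = ["hateful", "hateful", "hateful", "neutral", "neutral", "neutral"]

classifier = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction_hateful = classifier.predict("group_y are slur_a")
prediction_neutral = classifier.predict("lovely concert yesterday")
```

A classifier of this kind only scores word frequencies, which is one reason such methods can support moderators but, as the abstract notes, do not yet allow the elimination of hateful content to be fully automated.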