Best Practices for Adopting the Computational Ethics for Natural Language Processing Course

Public Interest Technology University Network Projects

Member/Grantee

Carnegie Mellon University

Author

Yulia Tsvetkov, assistant professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington (formerly an assistant professor in the Language Technologies Institute, School of Computer Science at Carnegie Mellon University)


In spring 2020, the Computational Ethics for NLP course launched at Carnegie Mellon University and has since been adopted by several PIT-UN members, including Stanford University, MIT, and Georgia Tech, as well as the Technical University of Darmstadt in Germany.

The course attracted several students from underrepresented backgrounds and even led to a joint effort with investigative reporters at The Washington Post on a report for which former students computationally analyzed evidence of anti-Black discrimination in China over coronavirus fears. In addition, several course projects were published at leading research conferences and workshops, including the 12th International Conference on Social Informatics (SocInfo2020), by authors and collaborators who were women and members of other groups underrepresented in STEM (science, technology, engineering, and mathematics) fields.

Best Practices for Adopting the Course

Overall, developing and teaching this course was an extremely rewarding experience, and I thank the PIT-UN mission for enabling it. I hope more AI programs incorporate ethics-focused courses into their curricula, and I would be very happy if people considered adopting our course or parts of it.

Yulia Tsvetkov

1. It is important to acknowledge that computational ethics is an emerging and rapidly changing field.

With the development of AI, new ethical issues constantly arise, and, unfortunately, new pitfalls are revealed every few weeks. In the field of NLP, for example, only a couple of research papers published in 2016 raised concerns that the datasets we use contain social biases and that machine learning models are liable to absorb them. In 2017, there were maybe a dozen such papers.

This year, there are hundreds, along with full conference tracks, workshops, and tutorials dedicated to bias, hate speech, misinformation, and so on. This means that while the primary syllabus structure can be adopted as is, some content needs to be updated every year to cover the most recent research and to discuss the most relevant case studies. The modular structure of the course facilitates this, and we have released updated versions of the course in the past couple of years.

2. The field of ethical AI is highly interdisciplinary, and it is important to invite experts from diverse fields to offer their perspectives on issues at the intersection of ethics and AI.

At CMU, we invited researchers from philosophy, social sciences, statistics, humanities, public policy, and other departments. These lectures provided a fresh perspective, much needed to understand the problem in depth, and students enjoyed the lectures. I would recommend including such invited talks in the course syllabus.

3. For the course to have practical value for students, I recommend engaging the students in research in the form of course projects and practical exercises.

We found that students who worked on more creative and thought-provoking course projects were more engaged, and they were excited to learn to do research in the field of ethical NLP and to publish their research results. So it is important to prepare a list of ideas for students with projects at different difficulty levels and to involve teaching assistants to mentor the students more closely. We have publicly released a list of example projects that other researchers are most welcome to use.

Course Overview

As language technologies have become increasingly prevalent, there is growing awareness that decisions we make about our data, methods, and tools are often tied up with their impact on people and societies. This course introduces students to real-world applications of language technologies and the potential ethical implications associated with them. We discuss philosophical foundations of ethical research along with state-of-the-art techniques.

Discussion topics include:

Philosophical foundations

What ethics is and its history, medical and psychological experiments, IRBs and human subjects research, and ethical decision-making.

Misrepresentation and bias

Algorithms to identify biases in models and data, and adversarial approaches to debiasing.

Privacy

Algorithms for demographic inference, personality profiling, and anonymization of demographic and personal traits.

Civility in communication

Techniques to monitor trolling, hate speech, abusive language, cyberbullying, and toxic comments.

Democracy and the language of manipulation

Approaches to identifying propaganda and manipulation in news, fake news, and political framing.

Natural language processing for social good

Low-resource NLP, applications for disaster response and monitoring diseases, medical applications, psychological counseling, interfaces for accessibility.

Multidisciplinary perspective

Invited lectures from experts in behavioral and social sciences, rhetoric, etc.