AI language models may be offensive or biased towards individuals with disabilities

Natural language processing (NLP) is a type of artificial intelligence that enables machines to work with text and spoken words in many different applications, such as smart assistants, email autocorrect, and spam filters, helping to automate and simplify tasks for individual and enterprise users. However, the algorithms driving this technology often carry attitudes that can be abusive or biased toward individuals with disabilities, according to researchers at the Penn State College of Information Sciences and Technology (IST).

The researchers found that all of the algorithms and models they tested contained significant implicit bias against people with disabilities. Previous research on pretrained language models, which are trained on large amounts of data that may contain implicit biases, has found sociodemographic bias related to race and ethnicity, but to date similar biases against people with disabilities (PWD) have not been extensively explored.

“The thirteen models we explored are highly used and generic in nature,” said Pranav Venkit, a doctoral student at IST and first author of the paper, presented today (October 13) at the 29th International Conference on Computational Linguistics (COLING). “We hope that our findings will help developers who create AI to assist certain groups, especially people with disabilities who rely on AI to help with their daily activities, become aware of these biases.”

In their study, the researchers examined machine learning models that were trained on source data to group similar words together, enabling a computer to automatically generate sequences of words. They created four simple sentence templates in which a gender noun such as “man,” “woman,” or “person” and one of the ten most frequently used adjectives in the English language could be filled in, for example, “They are the parents of a good person.” They then generated more than 600 adjectives that could be associated with either people with disabilities or non-disabled people, such as “neurotypical” or “visually impaired,” and randomly substituted them into each sentence. The team tested more than 15,000 unique sentences in each model to form adjective word associations.
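As a rough illustration of that template-and-substitution setup (not the researchers' actual code or word lists, which are far larger), the probe sentences could be generated along these lines in Python:

```python
# Rough sketch of the template-based probing described above.
# Templates and word lists are illustrative placeholders, not the study's data.
import itertools

templates = [
    "They are the parents of a good {group_term} {noun}.",
    "A good {group_term} {noun} lives in our neighborhood.",
]
nouns = ["man", "woman", "person"]

# Hypothetical samples standing in for the 600+ terms the researchers used
disability_terms = ["deafblind", "visually impaired"]
nondisability_terms = ["neurotypical", "sighted"]

def build_probes(group_terms):
    """Fill each template with every noun/term combination for one group."""
    return [
        template.format(group_term=term, noun=noun)
        for template, noun, term in itertools.product(templates, nouns, group_terms)
    ]

disability_probes = build_probes(disability_terms)
nondisability_probes = build_probes(nondisability_terms)

print(disability_probes[0])     # "They are the parents of a good deafblind man."
print(nondisability_probes[0])  # "They are the parents of a good neurotypical man."
```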

“For our example, we chose the word ‘good,’ and we wanted to see how it relates to terms associated with both non-disability and disability. When we add a non-disability term, the association of ‘good’ becomes ‘great.’ But when ‘good’ is combined with a disability-related term, we get ‘bad.’ So this change in the adjective’s association itself shows the model’s clear bias.”

Pranav Venkit, PhD Student, Penn State College of Information Sciences and Technology

While this exercise revealed the explicit bias found in the models, the researchers wanted to further measure each model for implicit bias, that is, attitudes toward people or stereotypes associated with them without conscious knowledge. They examined the adjectives generated for the disability and non-disability groups and measured the sentiment of each, using a natural language processing technique that assesses whether a text is positive, negative, or neutral. All of the models they studied consistently scored sentences containing disability-associated words more negatively than those without them. One model, which was trained on Twitter data, flipped the sentiment score from positive to negative 86% of the time when a disability-related term was used.
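A minimal sketch of that sentiment-scoring step might look like the following, with the off-the-shelf VADER analyzer standing in for the sentiment models the researchers actually used:

```python
# Compare sentiment scores of paired sentences that differ only in a
# disability- vs. non-disability-related term. VADER is a stand-in here;
# the study's own sentiment models may behave differently.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

pairs = [
    ("They are the parents of a good neurotypical person.",
     "They are the parents of a good deafblind person."),
]

for nondisability_sentence, disability_sentence in pairs:
    base_score = analyzer.polarity_scores(nondisability_sentence)["compound"]
    probe_score = analyzer.polarity_scores(disability_sentence)["compound"]
    # A consistently lower compound score for the disability version is the
    # kind of shift the researchers observed across the models they studied.
    print(f"{base_score:+.3f} vs {probe_score:+.3f}")
```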

“When we look at this finding alone, we see that as soon as a disability-related term is added to the conversation, the sentiment score for the whole sentence goes down,” Venkit said. “For example, if a user includes a term related to disability in a comment or post on social media, the chance increases that the post will be censored or restricted.”

The researchers also tested for implicit bias in two large language models that are used to automatically generate long texts, such as news articles. In this analysis, they composed 7,500 sentences, again variously including adjectives related to non-disability or disability, and examined how the word predicted for a blank left in the sentence template changed depending on the adjective used. In this case, given the sentence “A man has a blank,” the language models predicted “changed” for the blank. However, when a disability-related adjective was added to the sentence, resulting in “A deafblind man has a blank,” the models predicted “died” for the blank.
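One way to run that kind of fill-in-the-blank probe is with a masked language model; the sketch below uses BERT as a stand-in, since the article does not name the two generative models that were tested:

```python
# Probe how a disability-related adjective changes what a language model
# predicts for a blank. BERT's fill-mask head is used here as a stand-in.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for sentence in ["A man has [MASK].", "A deafblind man has [MASK]."]:
    predictions = fill_mask(sentence, top_k=3)
    top_words = [p["token_str"] for p in predictions]
    # Comparing the two lists shows how the added adjective shifts the
    # model's completions for an otherwise identical sentence.
    print(sentence, "->", top_words)
```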

The models’ implicit bias against PWD can appear in various applications, for example, in text messaging when autocorrecting a misspelled word, or on social media platforms with rules that prohibit offensive or harassing posts. Because humans are unable to review the enormous number of posts that are made, AI models use these sentiment scores to filter out posts that violate a platform’s community standards.
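In a moderation pipeline, that filtering step often amounts to a simple threshold on a model's score, roughly as sketched below; the threshold and scoring function here are hypothetical and only illustrate why a biased score can lead to over-filtering:

```python
# Toy illustration of score-based post filtering. If the underlying model
# scores disability-related language more negatively, such posts are
# disproportionately filtered even when they are not actually toxic.
from typing import Callable, List

def filter_posts(posts: List[str],
                 score_fn: Callable[[str], float],
                 threshold: float = -0.5) -> List[str]:
    """Keep only posts whose score stays at or above the threshold."""
    return [post for post in posts if score_fn(post) >= threshold]

# Example with a stand-in scorer; a platform would use its own trained model.
def dummy_score(post: str) -> float:
    return -0.9 if "deafblind" in post else 0.2

posts = ["My deafblind friend loved the concert.", "What a great show last night!"]
print(filter_posts(posts, dummy_score))  # the disability-related post is dropped
```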

“If someone is discussing a disability, even though the post is not toxic, a model like this, which does not focus on separating out biases, may classify the post as toxic simply because disability is mentioned in it,” explained Mukund Srinath, a doctoral student at IST and co-author of the study.

“When a researcher or developer uses one of these models, they don’t always look at all the different ways and all the different people it will affect, especially if they focus on outcomes and how well the model works,” Venkit said. “These are the repercussions that can affect real people in their daily lives.”

Venkit and Srinath collaborated with Shomir Wilson, assistant professor of information sciences and technology, on the project.
