It doesn’t take a genius to perceive that mainstream films have a history of bias against women and around beauty standards. But it may take some artificial intelligence to quantify these suspicions.
Researchers at Carnegie Mellon University developed an AI that analyzed the subtitles of some 1,500 movies, including the 100 top-grossing Hollywood and Bollywood films from each of the past seven decades. The new research confirms that while there has been progress, there is still a lot of gender and social prejudice on screen. And it shows the way forward in studying, at unprecedented speed, how social problems manifest themselves in movies and television around the world.
“Gender Bias, Social Bias, and Representation: 70 Years of B(H)ollywood” is the title of the paper, co-authored by Kunal Khadilkar, Ashiqur R. KhudaBukhsh and Tom Mitchell. The researchers – Khadilkar in particular considers himself a huge Bollywood fan – knew that sexism was a problem in movies and wanted to quantify it. They acquired the subtitles of 1,400 films, all in English, from online sources.
Some of the findings are peculiar to India and its $2.1 billion film industry centered in Mumbai. For example, the AI calculated that from 1950 to 1999, 74% of babies born in Bollywood films were boys. Since then, however, the figure has fallen to 55%.
“So we’ve achieved some gender parity in more recent films,” said KhudaBukhsh, a project scientist at CMU’s Language Technologies Institute who, like Khadilkar, was born in India.
Another finding reflects what the authors said was a great social change in India: the decline of dialogue endorsing dowry. The practice of the bride’s family paying a dowry was considered socially acceptable for some time after it was banned in 1961, but has since fallen out of favor. The trend as reflected on screen was discerned by having the AI learn which words were most closely associated with “dowry”. In movies until 1969, these were words like “money” and “jewelry”; for the past two decades, they have been terms indicating non-compliance, such as “guts”, “divorce” and “refused”.
Looking at other social issues, the researchers compared Bollywood and Hollywood films. To quantify gender bias, they calculated the percentage of all gendered pronouns that were male, such as “he” and “him”. In the 1950s, around 65% of gendered pronouns in Hollywood hits were male, as were almost 60% in Bollywood films. Today, about 55% of the pronouns in the subtitles of both industries’ films are male. For comparison, the authors ran the same test on texts from Google Books from the same period. In the 1950s, those books favored male pronouns over female by an astonishing 3-1 ratio, but over the past decade the split had evened out to 50-50, they found.
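The pronoun measure described above can be illustrated with a minimal sketch. It is not the researchers’ code: the function name, the sample sentence and the choice of “she” and “her” as the female counterparts are assumptions made here for illustration.

```python
import re

# Male pronouns named in the article; the female set is an assumption.
MALE_PRONOUNS = {"he", "him"}
FEMALE_PRONOUNS = {"she", "her"}


def male_pronoun_share(text):
    """Return the fraction of gendered pronouns in `text` that are male.

    Returns None when the text contains no gendered pronouns at all.
    """
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    male = sum(token in MALE_PRONOUNS for token in tokens)
    female = sum(token in FEMALE_PRONOUNS for token in tokens)
    total = male + female
    return male / total if total else None


# Hypothetical subtitle snippet: 2 male and 3 female pronouns.
sample = "He said she would come. She told him to wait for her."
share = male_pronoun_share(sample)
print(f"Male share of gendered pronouns: {share:.0%}")
```

Run over all the subtitles from one decade, a share near 0.5 would indicate parity, while the figure of around 65% reported for 1950s Hollywood would correspond to roughly 0.65.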
Another issue was the association of particular skin tones with feminine beauty. “Colorism” is closely associated with racism in the United States, and KhudaBukhsh said there was a growing backlash in India over cosmetics promising to lighten skin.
To examine skin tone, the researchers used what is called a “cloze” test, in which the AI draws on its training data to predict a word omitted from a given sentence – in this case, “A beautiful woman should have [blank] skin.” Where a language model would predict “soft,” the researchers said, the AI trained on both Bollywood and Hollywood scripts more often came up with “fair.”
“So there is a beauty association with lighter skin color,” KhudaBukhsh said.
The result held across the decades, although the preference was more pronounced in Bollywood films than in Hollywood, the researchers said.
The study also explored gender bias through what’s called a Word Embedding Association Test (WEAT). The AI calculated how closely words associated with men (“he”, “man”), as opposed to words associated with women, correlated with given occupations, from “philosopher” and “boss” to “nurse” and “socialite”. Hollywood was found to be significantly less biased in assigning gender-typed jobs, and in both film industries the bias has decreased over time.
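The core idea behind a WEAT-style measure can be sketched as follows: a word’s score is its average similarity to male words minus its average similarity to female words. This is a simplified illustration, not the full test and not the researchers’ implementation. The two-dimensional vectors below are invented; a real analysis would use word embeddings learned from the subtitles.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, male_vecs, female_vecs):
    """Mean similarity to male words minus mean similarity to female words.

    A positive score means the word sits closer to the male words.
    """
    male = sum(cosine(word_vec, m) for m in male_vecs) / len(male_vecs)
    female = sum(cosine(word_vec, f) for f in female_vecs) / len(female_vecs)
    return male - female


# Hypothetical toy embeddings, invented purely for illustration.
male_vecs = [(1.0, 0.1), (0.9, 0.2)]    # stand-ins for "he", "man"
female_vecs = [(0.1, 1.0), (0.2, 0.9)]  # stand-ins for "she", "woman"
occupations = {"boss": (0.8, 0.3), "nurse": (0.3, 0.8)}

scores = {
    job: association(vec, male_vecs, female_vecs)
    for job, vec in occupations.items()
}
for job, score in scores.items():
    print(f"{job}: {score:+.3f}")
```

With these toy vectors, “boss” scores positive (closer to the male words) and “nurse” scores negative. An unbiased set of embeddings would yield scores near zero for both.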
Additionally, the team performed the WEAT analysis again, this time comparing the two film industries to a set of 150 Oscar nominees for Best Foreign Language Film. The “films of the world” had about 40% less gender bias in terms of occupation than films from Bollywood or Hollywood.
“Films that are particularly popular, and end up being nominated for an Oscar, they end up being less biased [than] those that generate millions of dollars in revenue and end up being blockbuster movies,” Khadilkar said.
This type of analysis has its limitations, the researchers acknowledged: it takes into account only subtitles, which primarily reflect spoken dialogue and song lyrics, and ignores how prejudices can be expressed through a film’s visuals or soundtrack, for example. Many moviegoers are familiar with informal ways of gauging a film’s esteem for women, such as the Bechdel test, which assesses relationships between female characters.
But researchers say AI’s ability to quickly sift through the subtitles of hundreds of movies is a boon to the field.
“You can now assign numbers to the amount of bias… thus helping to quantify the different film industries around the world,” Khadilkar said.
The research paper was presented in February at a virtual conference of the Association for the Advancement of Artificial Intelligence.