
GenAI CSAM – How Does This Impact Content Moderator Wellbeing?

March 14, 2024


In the Trust and Safety industry, there has been ongoing dialogue around the proliferation of Generative AI, its potential harms, and how companies can deploy it safely for public use. Several tools are already in regular public use, with a 2022 McKinsey survey highlighting that AI adoption has more than doubled over the past five years.

Tools like ChatGPT and GPT-4, DALL-E and its successors DALL-E 2 and 3, and Midjourney have gained a significant user base, even prompting academic institutions to issue policies to students on the use of tools like ChatGPT for assignments and projects, citing academic integrity concerns and risks to personal, private, confidential, or proprietary information.

In a recent article, Gartner predicts that by 2026, GenAI “will automate 60% of the design effort for new websites and mobile apps” and that “over 100 million humans will engage robocolleagues to contribute to their work”. By 2027, Gartner predicts that “nearly 15% of new applications will be automatically generated by AI without a human in the loop”. These predictions may evoke fear in some individuals, whilst others may find them exciting.

For the Trust and Safety industry, there is a trade-off between deploying GenAI tools for public consumption (to enhance working practices and to advance technology in sectors like healthcare for better patient outcomes) and ensuring users are kept safe from bad actors and malicious attacks online.

Current Challenges 

One of the most topical challenges discussed in the Trust and Safety industry is how to address this trade-off. Although GenAI tools are ostensibly developed with guardrails in place (much as other online platforms have user terms of service to curb bad online behaviour), it is not uncommon for bad actors to find methods of manipulating or circumventing these guardrails.

We have likely all seen media reports of incidents in which GenAI tools were used to create harmful imagery, such as the Taylor Swift deepfakes, which some have described as image-based sexual violence. Other problematic GenAI usage runs the gamut, including scammers using voice cloning, ongoing election disinformation campaigns, and even risks to journalistic integrity.

Unfortunately, the increase of AI-generated child sexual abuse material (CSAM) is one of the most deeply harmful issues that needs to be addressed within the industry. Stanford researchers found that Stable Diffusion, a text-to-image GenAI tool, was generating photo-realistic nude imagery, including CSAM, because its models were trained on an open-source dataset that included hundreds of known CSAM images scraped from the internet.

When these materials are generated through AI tools, it falls to Content Moderators to tackle the issue.

GenAI CSAM Content Moderation 

As highlighted above, some AI tools like Stable Diffusion have been found to generate CSAM because their models were trained on datasets that contained this imagery. The issue arises partly from the way these models are trained and partly from bad actors manipulating or circumventing the tools’ guardrails.
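One mitigation widely discussed in the industry is screening both training datasets and user uploads against vetted hash lists of known CSAM maintained by child-safety organisations. The sketch below is a minimal, hypothetical illustration of that idea only: the hash list, file layout, and function names are assumptions, and it uses simple exact-match SHA-1 hashing, whereas production systems typically rely on perceptual hashes (such as PhotoDNA or PDQ) that also catch near-duplicates.

    import hashlib
    from pathlib import Path
    from typing import Iterable, List, Set

    def sha1_of_file(path: Path) -> str:
        """Compute the SHA-1 digest of a file's bytes (exact-match hashing only)."""
        digest = hashlib.sha1()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    def filter_dataset(image_paths: Iterable[Path], known_bad_hashes: Set[str]) -> List[Path]:
        """Keep only images whose hashes are absent from a vetted known-CSAM hash list.

        `known_bad_hashes` is a hypothetical hash list supplied by a child-safety
        organisation. Matches would be quarantined and reported in line with legal
        obligations, not silently discarded; that handling is omitted here.
        """
        clean: List[Path] = []
        for path in image_paths:
            if sha1_of_file(path) not in known_bad_hashes:
                clean.append(path)
        return clean

The design point is simply that screening happens before the data ever reaches model training or a moderator’s queue, reducing both model contamination and moderator exposure.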

The ability of users to generate this imagery becomes a problem that platforms must tackle at scale, with Content Moderators at the forefront. Whether they are data labellers for AI companies or traditional Content Moderators for social media platforms, it is their role to ensure that users are prevented from generating or viewing these harmful and illegal materials online.

While reviewing CSAM is not new for many Content Moderators, there are unique challenges that come with reviewing GenAI CSAM versus ‘real’ CSAM. Some of these challenges include: 

  • Exponentially increased volume of CSAM content, 
  • Discerning GenAI CSAM from ‘real’ CSAM based on policies, 
  • Risk of vicarious traumatization and other mental health difficulties. 

Exponentially Increased Volume 

The increased volume of CSAM content is one challenge that Content Moderators must contend with. In 2023, the Internet Watch Foundation (IWF) published a report on its investigation into the proliferation of GenAI CSAM online. It found a total of 20,245 AI-generated images posted to one dark web CSAM forum in a one-month period. Of these, over 11,000 were selected for investigation because they were judged most likely to be criminal. The IWF stated in its report that “chief among those differences [from previous technologies] is the potential for offline generation of images at scale – with the clear potential to overwhelm those working to fight online child sexual abuse and divert significant resources from real CSAM towards AI CSAM”.

This exponentially increased volume of CSAM not only places undue stress on Content Moderators to remove the materials quickly but also means that they will require more specialized training to escalate these matters to law enforcement, the National Center for Missing and Exploited Children, and other third-party agencies.

Unfortunately, discerning GenAI CSAM from ‘real’ CSAM is challenging. 

Discerning GenAI CSAM from ‘Real’ CSAM 

It is up to Content Moderators to discern whether these images and videos are AI-generated or whether they are ‘real’ CSAM content – and in some cases, a combination of both. Bad actors are not only creating novel images that don’t depict real children, but they are also circumventing the tools’ guardrails to generate hundreds of new images of previous victims, sharing tips with other predators about how they are navigating around the safeguards in place, and re-victimizing children in the process. 
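To make the discernment task concrete, the hypothetical sketch below shows how a platform might pre-sort flagged items before human review, combining a lookup against a vetted hash list of known (‘real’) CSAM with a score from an AI-image detector. Both signals, and all names and thresholds, are assumptions for illustration; in practice the final judgement, escalation, and reporting remain with trained Content Moderators.

    from dataclasses import dataclass

    @dataclass
    class TriageResult:
        queue: str      # specialist review queue the item is routed to
        rationale: str  # recorded so reviewers and auditors can trace the routing

    def triage(matches_known_hash: bool, synthetic_score: float,
               synthetic_threshold: float = 0.8) -> TriageResult:
        """Pre-sort a flagged item ahead of human review (illustrative only).

        `matches_known_hash` would come from a hash-list lookup against known
        material of identified victims; `synthetic_score` from an assumed
        AI-image detector. Every branch still ends with trained human review
        and mandatory reporting where the law requires it.
        """
        if matches_known_hash:
            # Known-victim material takes priority regardless of any AI signal.
            return TriageResult("known_csam_escalation", "matched vetted hash list")
        if synthetic_score >= synthetic_threshold:
            return TriageResult("suspected_genai_csam", f"detector score {synthetic_score:.2f}")
        return TriageResult("standard_csam_review", "no hash match, low detector score")

Routing suspected GenAI material into its own queue also makes it easier to track the volumes, policy questions, and wellbeing supports that queue needs.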

Platform policies and user terms of service are what guide a Content Moderator to make accurate decisions about violative content. These policies are regularly reviewed and updated in response to changing user behaviour online, regulatory and legal requirements, and the advancement of technologies like GenAI.

The question is whether platform policies and terms of service have kept pace with the proliferation of GenAI CSAM. While many platforms’ terms of service classify digitally generated imagery, fictional characters, art, and other non-real depictions of CSAM as violations, it still falls to the Content Moderator to interpret these policies accurately and make final decisions about the materials.

Risk of Vicarious Traumatization and Other Mental Health Difficulties 

As with moderating real CSAM imagery and videos, Content Moderators now tasked with reviewing AI-generated CSAM are at higher risk of developing vicarious traumatization and other mental health difficulties. The added stressors of increased content volumes, swift takedowns, and accurate decision-making only heighten that risk.

Research conducted amongst adjacent populations, including law enforcement and mental health professionals who are similarly exposed to child abuse materials in their line of work, shows that repetitive exposure to CSAM can result in mental health difficulties such as elevated post-traumatic stress symptoms, anxiety and depressive symptoms, and lower subjective wellbeing.

In research conducted with ICAC (Internet Crimes Against Children) investigators, other factors that influenced these mental health difficulties included:

  • Less control over work assigned, 
  • Not knowing about final case resolutions, 
  • Not attending training programs related to CSAM, and 
  • Unavailability of process-oriented staff discussions, access to mental health professionals, and individual case reviews.

Supporting Content Moderators Investigating GenAI CSAM 

There are several ways that companies can support Content Moderators investigating cases of AI-generated or real CSAM. At Zevo, we strongly recommend a careful review of working practices and policies.

The literature suggests that several work-related factors can minimize the potential risk of harm to individuals exposed to this type of egregious content: giving moderators a sense of agency or autonomy in choosing their case work, offering ample personal time off, and providing opportunities for shared debriefing and process-oriented discussions between colleagues, facilitated by mental health professionals.

Finally, knowing the outcomes of their investigations has been shown to increase wellbeing scores and reduce mental ill-health symptomatology amongst adjacent populations, such as law enforcement and mental health professionals similarly exposed to CSAM. We therefore advocate for organizations to develop feedback loops between all stakeholders so that Content Moderators can see the positive outcomes of the work they are doing.

While the proliferation of GenAI CSAM will undoubtedly place additional pressures on organizations and Content Moderators alike to swiftly and accurately remove harmful materials and discern what is AI-generated versus real CSAM, there are indeed measures that can be implemented to protect moderation teams from further harm.