
The limits of community labeling services

Community labelling and moderation models are a great way to distribute power among platform users. But there are limits to what kinds of reports are appropriate and what we can expect from volunteers.

I'm Alice Hunsberger. Trust & Safety Insider is my weekly rundown on the topics, industry trends and workplace strategies that trust and safety professionals need to know about to do their job. This week, I'm thinking about:

  • Where community labeling efforts fell short on Bluesky (and what they could have done differently)
  • A new compensation survey for T&S professionals

Get in touch if you'd like your questions answered or just want to share your feedback. Here we go! — Alice


Lessons we learned from Aegis' collapse

Why this matters: Community labelling and moderation models are a great way to distribute power among platform users. But there are limits to what kinds of reports are appropriate and what we can expect from volunteers.

Warning: this post mentions sexual assault allegations, albeit not in detail.

I have enormous empathy and respect for volunteer community moderators. I was one myself; I started my Trust & Safety journey almost 25 years ago, when I founded an online message board and got a crash course in community management. Unfortunately, my beloved community soon succumbed to toxicity and bullying, and I eventually decided it wasn’t worth it anymore and quit. I was a naïve college kid, and this was in the Wild West days of the internet, so it’s not wholly surprising that it ended that way.

Twenty-five years on, you'd expect that community moderation would have matured. We have a quarter of a century of lessons to draw on and some great examples of online communities that have flourished while being led by volunteers. And yet the same mistakes are happening today.

We saw this play out recently with Aegis, a volunteer community labelling service mainly for LGBTQIA+ users on Bluesky. Like me, they had idealistic goals and some success; they were one of the first recipients of a small grant from Bluesky. And yet everything suddenly imploded and they quit.

The following will make more sense if you read Aegis' post about what happened, but in a nutshell, it appears as though there were two unhappy groups: one was being very loud about Aegis moderators not responding to sexual assault allegations quickly enough, and the other was criticising the first group for being too aggressive. The second group said that the criticism was a distraction from a different argument that had been going on, and they reported some people in the first group as "anti-social", bringing the decision to Aegis and leading to its downfall.

Here are the four lessons that I took from the whole saga:

  1. Never make policy decisions on the fly

When someone believes that moderators aren’t responding quickly enough to a serious allegation like sexual assault, it’s understandable for the reports to get more intense. It’s not uncommon for victims (or supporters of victims) to resort to tactics like creating sockpuppet accounts, spamming reports, or being incredibly aggressive and repetitive in order to be heard.

This is why some policies take into account the intent behind behaviour. It makes sense to make different enforcement decisions about someone who thinks it’s funny to be mean versus a victim who is merely trying to protect other people. However, creating exceptions is a slippery slope and full of potential for biased enforcement, especially when there is no on-platform evidence for the claim in the first place.

In this situation, a moderator (who wasn't the main organiser of the Aegis collective) decided to apply the "Anti-Social" label to the reported users. This was then reversed by the main moderator a few days later, because she said the first moderator had acted without "considering the optics of the situation". The whole situation would have worked better with a solid policy in place about how and when to consider intent (both behind posting and reporting). Moderators should never make policy decisions on the fly while they're considering enforcement decisions; this is just asking for personal bias to creep in.

  2. Make the policy as simple as possible

It is reasonable to expect that volunteers, like all moderators, will make mistakes. This is why the major dating platforms actually have incredibly simple policies when it comes to sexual assault reports: believe the victim, ban the reported account. Do not ask for evidence, do not try to play judge and jury, do not get into a back-and-forth, do not try to ascertain the truth. Believe the victim unless there are clear and obvious signs on the platform itself that the report is false. (This is incredibly rare; one example might be a user who reports 1,000 people around the world for personally sexually assaulting them that week.)

The policy is this simple because it's better to make the mistake of banning an innocent person than to potentially allow a sexual predator to remain on the platform. On a dating platform that focuses on real-life connections, this makes sense. For a platform that is more focused on speech and expression, like Bluesky, perhaps not. But the policy still needs to be simple so that it can be enforced fairly.

  3. Take care in using online tools to account for offline harms

One lesson that Aegis mentioned from this whole situation is the need to establish “crystal clear protocols for how to handle sexual assault/harassment allegations made against someone in the community beyond just things like reporting of boundary-violating replies.” I would argue that it’s naïve to think that a community labelling service can or should respond effectively to any offline harms, let alone sexual assault allegations.

If there is any evidence of an offline incident, it exists outside of the platform. If evidence of the incident can be submitted electronically, then it can be faked. Taking on the responsibility to mediate between potential liars with potentially doctored evidence is a fool's errand. However, I will note that, although there can be lying on both sides, in my experience it is overwhelmingly the case that victims are more likely to make true reports and alleged assaulters are more likely to be lying.

This is where the approach that Aegis took of creating a "Boundary Violator" label falls short. Aegis took on the responsibility of deciding which offline allegations are true, and then applied account labels based on that judgement. Interestingly, Aegis leaned further into this idea in its announcement post when it said:

It is also imperative that any future label operators wishing to protect the community from these sorts of threats be plugged into multiple whisper networks (both on and off Bluesky) to enable more thorough and proactive investigations.

This is an untenable policy.

A better idea could be to label the account without taking responsibility for determining the truth. For example: “This account has been reported to Aegis for sexual assault.” It is a fact that the account was reported. The absolute truth outside of that is not known. It is potentially helpful for users to be given that information, but Aegis is not stating the truth categorically. Users can choose what to do with the information.

Another option would be to have a policy that only public on-platform content and behaviour is considered. Offline violations are out of bounds. If someone is repeatedly violating boundaries in public on the platform, in a way that Aegis can verify directly without third-party submitted evidence, then Aegis can apply a "Boundary Violator" label. An example of this would be continually arguing with people in threads even after being asked to stop.

  4. Set reasonable expectations of what's possible

Regardless of labelling and policies, the final mistake is believing that volunteers should be moderating at all hours of the day. Part of the issue with this situation is that the main moderator was offline for a couple of days and didn’t respond immediately to the sexual assault reports that were coming in.

It is entirely reasonable for someone to be offline for a couple of days, and yet the internet never sleeps. This is why platforms pay people to be around 24/7 — so that serious issues can be handled immediately. It is unreasonable to expect volunteers to do this, and yet Aegis set the expectation that they could.

Finally, as I wrote about just a couple weeks ago, moderation is a thankless job. Moderators will make mistakes. They will be slow to respond sometimes. They will make people unhappy by not doing enough or by doing too much. And they will get torn apart by people who demand that they do better. This is why it’s so important to set reasonable expectations, to write detailed policies in advance, to give moderators clear boundaries, and to pay everyone involved well for their effort. It’s a complicated, time-consuming, emotionally difficult job, and even more so when the stakes are high. For Aegis, their goals were lofty and their intentions good, but they set themselves up to fail.

Networking Questions? I'm here to help.

I'm writing an ultimate guide to networking in T&S. Send me your questions — or situations you need help to think through — and I'll make sure to address them in the guide.

Get in touch

Job hunt

The Trust & Safety Professional Association is conducting its first Global Compensation Survey and is inviting all T&S professionals (regardless of TSPA member status) to participate:

As a professional member association, we want to better understand the global compensation landscape for T&S practitioners so that we can create a resource that will help T&S professionals (like you!) understand the value of their professional expertise, providing context for salary negotiations and career decisions.

You can participate anonymously, and they aim to release the data before the end of the year.


Also worth reading

Ensuring inclusive social media: the importance of anti-bias training in content moderation (PartnerHero)
Why? I wrote one final piece about Trust & Safety and the LGBTQ+ community to round out Pride month this year. This article was inspired by GLAAD's recommendation for moderators to receive anti-bias and cultural awareness training. The article provides concrete examples of what training could include, as well as why it's important, based on years of collaboration between PartnerHero and Grindr on training moderators to skillfully handle LGBTQ+ content.

TikTok LLM (The New Inquiry)
Why? I take issue with the framing in this article, which claims simple content moderation is "censorship", but if you can set that aside, it's a fascinating look at how moderation of certain issues on TikTok has resulted in linguistic changes which are then trickling out into life outside the internet.

Considering New York's Stop Addictive Feeds Exploitation for Kids Act (Tech Policy Press)
Why? This article provides an outline of two new bills passed into law in New York that focus on protecting children from online harms.

Deepfake News Center (Reality Defender)
Why? A compilation of news articles about AI safety and deepfakes, available for free (with your email).

AI-powered scams and what you can do about them (TechCrunch)
Why? Readers of this newsletter will probably already know about these scams, but this is a great resource to send to friends and family.


I'll be at TrustCon with a few shirts and stickers to give away, but if you really want one, you should order now. We've also added a "T&S doesn't take a day off" shirt!

Picture of a black and white t-shirt with slogans about trust and safety
The must-buy fashion item of the summer!

Enjoy today's edition? If so...

  1. Hit the thumbs or send a quick reply to share feedback or just say hello
  2. Forward to a friend or colleague (PS they can sign up here)
  3. Become an EiM member and get access to original analysis and the whole EiM archive for less than the price of a coffee a week