Not a good look for Mastodon - what can be done to automate the removal of CSAM?
https://stacks.stanford.edu/file/druid:vb515nd6874/20230724-fediverse-csam-report.pdf
I’d suggest that anyone who cares about the issue take the time to read the actual report, not just drama-oriented news articles about it.
So if I’m understanding right, based on their recommendations this will all be addressed as more moderation and QOL tools are introduced as we move further down the development roadmap?
What development roadmap? You’re not a product manager and this isn’t a Silicon Valley startup.
As does most successful open source software. It’s more of a “this is where we’d like to see things go long term” statement: it in no way restricts contributions, it merely helps communicate the ideas of the core contributors.
If I can try to summarize the main findings:
- Computer-generated (e.g., Stable Diffusion) child porn is not criminalized in Japan, and so many Japanese Mastodon servers don’t remove it
- Porn involving real children is removed, but not immediately, as it depends on instance admins to catch it, and they have other things to do. Also, when an account is banned, the Mastodon server software is not sending out a “delete” for all of their posted material (which would signal other instances to delete it)
Problem #2 can hopefully be improved with better tooling. I don’t know what you do about problem #1, though.
One option would be to decide that the underlying point of removing real CSAM is to avoid victimizing real children; and that computer-generated images are no more relevant to this goal than Harry/Draco slash fiction is.
And are you able to offer any evidence to reassure us that simulated child pornography doesn’t increase the risk to real children as pedophiles become normalised to the content and escalate (you know, like what already routinely happens with regular pornography)?
Or are we just supposed to sacrifice children to your gut feeling?
Would you extend the same evidence-free argument to fictional stories, e.g. the Harry/Draco slash fiction that I mentioned?
For what it’s worth, your comment has already caused ten murders. I don’t have to offer evidence, just as you don’t. I don’t know where those murders happened, or who was murdered, but it was clearly the result of your comment. Why are you such a terrible person as to post something that causes murder?
I have no problem saying that writing stories about two children having gay sex is pretty fucked in the head, along with anyone who forms a community around sharing and creating it.
But it’s also not inherently abuse, nor is it indistinguishable from reality.
You’re advocating that people just be cool with photo-realistic images of children, of any age, being raped, by any number of people, in any possible way, with no assurances that the images are genuinely “fake” or that pedophiles won’t be driven to make it a reality, despite other pedophiles cheering them on.
I was a teenage contrarian pseudo-intellectual once upon a time too, but I never sold out other people’s children for something to jerk off to.
If you want us to believe it’s harmless, prove it.
You keep making up weird, defamatory accusations. Please stop. This isn’t acceptable behavior here.
Awful pearl-clutchy for someone advocating for increased community support for photorealistic images of children being raped.
Which do you think is more acceptable to Lemmy in general? Someone saying “fuck”, or communities dedicated to photorealistic images of children being raped?
Maybe I’m not the one who should be changing their behavior.
Such a signal exists in the ActivityPub protocol, so I wonder why it’s not being used.
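For anyone wondering what that signal looks like on the wire: ActivityPub defines a `Delete` activity, and the object can be replaced by a `Tombstone`. Below is a minimal sketch of fanning out one `Delete` per post when an account is banned. The URLs and the helper names (`build_delete_activity`, `deletes_for_banned_account`) are made up for illustration, and this is not how Mastodon's own code is structured; a real server would also have to sign each request and deliver it to every follower's inbox.

```python
import json


def build_delete_activity(actor_url: str, object_url: str) -> dict:
    """Build an ActivityStreams Delete activity for a single post.

    The "#delete" id suffix is an arbitrary convention chosen for this sketch.
    """
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": f"{object_url}#delete",
        "type": "Delete",
        "actor": actor_url,
        "to": ["https://www.w3.org/ns/activitystreams#Public"],
        # Receiving servers are told the object is gone, not what it was.
        "object": {"id": object_url, "type": "Tombstone"},
    }


def deletes_for_banned_account(actor_url: str, post_urls: list) -> list:
    """Return one Delete activity per post of a banned account."""
    return [build_delete_activity(actor_url, url) for url in post_urls]


if __name__ == "__main__":
    # Hypothetical account and post URLs.
    activities = deletes_for_banned_account(
        "https://example.social/users/banned",
        ["https://example.social/posts/1", "https://example.social/posts/2"],
    )
    print(json.dumps(activities[0], indent=2))
```

Whether remote instances honour the `Delete` is up to them; the activity is a request, not a guarantee.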
I don’t know what you do about problem #1, though.
Well the simple answer is that it doesn’t have to be illegal to remove it.
The legal question is a lot harder, considering AI image generation has reached levels that are almost indistinguishable from reality.
In which case, admins should err on the side of caution and remove something that might be illegal.
I personally would prefer to have nothing remotely close to CSAM, but as long as children aren’t being harmed in any conceivable way, I don’t think it would be illegal to post art containing children. But communities should absolutely manage things however they think is best for their community.
In other words, I don’t think #1 is a problem at all, imo things should only be illegal if there’s a clear victim.
4.1 Illustrated and Computer-Generated CSAM
Stopped reading.
Child abuse laws “exclude anime” for the same reason animal cruelty laws “exclude lettuce.” Drawings are not children.
Drawings are not real.
Half the goddamn point of saying CSAM instead of CP is to make clear that Bart Simpson doesn’t count. Bart Simpson is not real. It is fundamentally impossible to violate Bart Simpson’s rights, because he doesn’t fucking exist. There is nothing to protect him from. He cannot be harmed. He is imaginary.
This cannot be a controversial statement. Anyone who can’t distinguish fiction from real life has brain problems.
You can’t rape someone in MS Paint. Songs about murder don’t leave a body. If you write about robbing Fort Knox, the gold is still there. We’re not about to arrest Mads Mikkelsen for eating people. It did not happen. It was not real.
If you still want to get mad at people for jerking off to the wrong fantasies, that is an entirely different problem from photographs of child rape.
You should keep reading then, because they cover that later.
What does that even mean?
There’s nothing to “cover.” They’re talking about illustrations of bad things, alongside actual photographic evidence of actual bad things actually happening. Nothing can excuse that.
No shit they are also discussing actual CSAM alongside… drawings. That is the problem. That’s what they did wrong.
Oh, wait, Japanese in the other comment, now I get it. This conversation is about AI Loli porn.
Pfft, of course, that’s why no one is saying the words they mean, because it suddenly becomes much harder to take the stance since hatred towards Loli Porn is not universal.
I mean, I think it’s disgusting, but I don’t think it should be illegal. I feel the same way about cigarettes, 2 girls 1 cup, and profane language. It’s absolutely not for me, but that shouldn’t make it illegal.
As long as there’s no victim, knock yourself out with whatever disgusting, weird stuff you’re into.
Oh no, what you describe is definitely illegal here in Canada. CSAM includes depictions here. Child sex dolls are illegal. And it should be that way because that stuff is disgusting.
CSAM includes depictions here.
Literally impossible.
Child rape cannot include drawings. You can’t sexually assault a fictional character. Not “you mustn’t.” You can’t.
If you think the problem with child rape amounts to ‘ew, gross,’ fuck you. Your moral scale is broken, if there’s not a vast gulf between those two bad things.
Okay, thanks for the clarification
Everyone except you still very much includes drawn & AI pornographic depictions of children within the basket of problematic content that should get filtered out of federated instances so thank you very much but I’m not sure your point changed anything.
They are not saying it shouldn’t be defederated, they are saying reporting this to authorities is pointless and that considering CSAM is harmful.
Everybody understands there’s no real kid involved. I still don’t see an issue reporting it to authorities and all the definitions of CSAM make a point of including simulated and illustrated forms of child porn.
What’s the point of reporting it to authorities? It’s not illegal, nor should it be because there’s no victim, so all reporting it does is take up valuable time that could be spent tracking down actual abuse.
If you don’t think images of actual child abuse, against actual children, is infinitely worse than some ink on paper, I don’t care about your opinion of anything.
You can be against both. Don’t ever pretend they’re the same.
Step up the reading comprehension please
I understand what you’re saying and I’m calling you a liar.
You mean to say I’m wrong or you actually mean liar?
‘Everyone but you agrees with me!’ Bullshit.
‘Nobody wants this stuff that whole servers exist for.’ Self-defeating bullshit.
‘You just don’t understand.’ Not an argument.
Hey, just because someone has a stupid take on one subject doesn’t mean they have a stupid take on all subjects. Attack the argument, not the person.
He invented the stupid take he’s fighting against. Nobody equated “ink on paper” with “actual rape against children”.
The bar to cross to be filtered out of the federation isn’t rape. Lolicon is already above the threshold, it’s embarrassing that he doesn’t realize that.
We’re not just talking about ‘ew gross icky’ exclusion from a social network. We’re talking about images whose possession is a felony. Images that are unambiguously the product of child rape.
This paper treats them the same. You’re defending that false equivalence. You need to stop.
Who places the bar for “exclusion from a social network” at felonies? Any kind of child porn has no place on the fediverse, simulated or otherwise. That doesn’t mean they’re equal offenses, you’re just not responsible for carrying out anything other than cleaning out your porch.
Some confused arguments reveal confused people. Some terrible arguments reveal terrible people. For example: I don’t give two fucks what Nazis think. Life’s too short to wonder which subjects they’re not facile bastards about.
If someone’s motivation for making certain JPEGs hyper-illegal is “they’re icky” - they’ve lost benefit of the doubt. Because of their decisions, I no longer grant them that courtesy.
Demanding pointless censorship earns my dislike.
Equating art with violence earns my distrust.
Perhaps. But pretty much everyone has a stupid take on something.
There’s obviously a limit there, but most people can be reasoned with. So instead of jumping to a conclusion, attempt a dialogue first until they prove that they can’t be reasoned with. This is especially true on SM where, even if you can’t convince the person you’re talking with, you may just convince the next person to come along.
Telling someone why they’re a stupid bastard for the sake of other people is not exactly a contradiction. You know what doesn’t do observers any good? “Debating” complete garbage, in a way that lends it respect and legitimacy. Sometimes you just need to call bullshit.
Some bullshit is so blatant that it’s a black mark against the bullshitter.
Mastodon is a piece of software. I don’t see anyone saying “phpBB” or “WordPress” has a massive child abuse material problem.
Has anyone in the history ever said “Not a good look for phpBB”? No. Why? Because it would make no sense whatsoever.
I’m kind of at a loss for words because of how obvious it should be. It’s like saying “paper is being used for illegal material. Not a good look for paper.”
What is the solution to someone hosting illegal material on an nginx server? You report it to the authorities. You want to automate it? Go ahead and crawl the web for illegal material and generate automated reports. Though you’ll probably be the first to end up in prison.
I get what you’re saying, but due to the federated nature, CSAM can easily spread to many instances without their admins noticing. Having even one piece of CSAM on your server is a huge risk for the server owner.
I don’t see what a server admin can do about it other than defederate the instant they get reports. Otherwise how can they possibly know?
This could be a really big issue though. People can make instances for really hateful and disgusting crap but even if everyone defederates from them it’s still giving them a platform, a tiny tiny corner on the internet to talk about truly horrible topics.
Again, if it’s illegal content that’s publicly available, officials can charge those site admins with the crime of hosting it. Everyone just has a duty to defederate.
Those corners will exist no matter what service they use and there is nothing Mastodon can do to stop this. There’s a reason there are public lists of instances to defederate. This content can only be prevented by domain providers and governments.
That’s a dumb argument, though.
phpBB is not the host or the provider. It’s just something you download and install on your server, with the actual service provider (you, the owner of the server and operator of the phpBB forum) being responsible for its content and curation.
Mastodon/Twitter/social media is the host/provider/moderator.
According to corporate news everything outside of the corporate internet is pedophiles.
Well, terrorists became boring, and they still want the loony wing of the GOP’s clicks, so best to back off on Nazis and pro-Russians, leaving pedophiles as the safest bet.
Nazis not being the go-to target for a poisoning-the-well approach worries me on many different levels.
Agreed. I’m in my 40s, and I’ve never seen anywhere near the level of subsurface signaling and intentional complacency we’re experiencing now.
Hasn’t Twitter had the same problem for years?
These articles are written by idiots, serving the whims of a corporate stooge to try and smear anything other than corporate services, and it isn’t even thinly veiled. Look at who this all comes from.
It’s weird how this headline shows up only when other headlines start covering how popular Mastodon is now.
Coincidence? Sure smells like it. God, I love astroturfing in the morning.
The article written by WaPo and regurgitated by The Verge is crap, but the study from Stanford is solid. However, it’s nowhere near as doom and gloom as the articles, and suggests plenty of ways to improve things. Primarily they suggest better tools for moderation.
better tools for moderation
Where have I heard that before?
The study from Stanford conflates pencil drawings of imaginary characters with actual evidence of child rape.
Half the goddamn point of saying CSAM instead of CP is to make that difference blindingly obvious. Somehow, they still missed it. Somehow they are talking about sexual abuse as if it’s something that can happen to pixels.
Direct link to the (short) report this article refers to:
https://stacks.stanford.edu/file/druid:vb515nd6874/20230724-fediverse-csam-report.pdf
https://purl.stanford.edu/vb515nd6874
After reading it, I’m still unsure what all they consider to be CSAM and how much of each category they found. Here are what they count as CSAM categories as far as I can tell. No idea how much the categories overlap, and therefore no idea how many beyond the 112 PhotoDNA images are of actual children.
- 112 instances of known CSAM of actual children (identified by PhotoDNA)
- 713 instances of assumed CSAM, based on hashtags.
- 1,217 text posts talking about stuff related to grooming/trading. Includes no actual CSAM or CSAM trading/selling on Mastodon, but some links to other sites?
- Drawn and Computer-Generated images. (No quantity given, possibly not counted? Part of the 713 posts above?)
- Self-Generated CSAM. (Example is someone literally selling pics of their dick for Robux.) (No quantity given here either.)
Personally, I’m not sure what the take-away is supposed to be from this. It’s impossible to moderate all the user-generated content quickly. This is not a Fediverse issue. The same is true for Mastodon, Twitter, Reddit and all the other big content-generating sites. It’s a hard problem to solve. Known CSAM being deleted within hours is already pretty good, imho.
Meta-discussion especially is hard to police. Based on the report, it seems that most CP-material by mass is traded using other services (chat rooms).
For me, there’s a huge difference between actual children being directly exploited and virtual depictions of fictional children. Personally, I consider it the same as any other fetish-images which would be illegal with actual humans (guro/vore/bestiality/rape etc etc).
If we took this to its logical conclusion, most popular games would be banned. How many JRPGs have underage protagonists? How many of those have some kind of love story going on in the background? What about FPS games where you’re depicted killing other people? What about fantasy RPGs where you can kill and control animals?
Things should always be legal unless there’s a clear victim. And communities should absolutely be allowed to filter out anything they want, even if it’s 100% legal. So the lack of clear articulation of the legal issues is very worrisome since it implies a moral obligation to remove legal but taboo content.
Nothing you can do except go after server owners like usual. Has nothing to do with the fedi. Mastodon has nothing to do with either because anyone can pop up their own alternative server. This is one of many protocols they have or will use to distribute this stuff.
This just in: criminals are using the TCP protocol to distribute CP!!! What can the internet do to stop this? Oh yeah, go after server owners and groups like usual.
Things are a bit complicated in the fediverse. Sure, your instance might not host any pedo community, but if a user on your instance subscribes to or interacts with those communities, the CSAM might get federated into your instance without you noticing. There are tools to help you combat this, but as an instance owner you can’t just assume it’s not your problem if some other instance hosts pedo stuff.
That is definitely alarming, and a downside of the fedi, but it seems like a necessary evil. Unfortunately, admins and mods of small communities in the fedi will be the ones exposed to this. There are better methods of handling this, though. There are shared block lists out there that already block out undesirable stuff like that, so that at least minimizes how often mods, who are just regular unpaid people, have to see disgusting stuff. Also, obviously those instances should be reported to the police, the FBI, or whatever the heck.
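As a rough illustration of how a shared block list gets applied, here is a minimal sketch that merges several lists and checks an incoming domain (and its parent domains) against the result. The function names and the example domains are invented for this sketch; real block lists come in various formats and real servers apply them inside their own federation code.

```python
def normalize(domain: str) -> str:
    """Lower-case a domain and drop surrounding whitespace and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def merge_blocklists(*lists) -> set:
    """Combine several shared block lists into one normalized set."""
    merged = set()
    for entries in lists:
        merged.update(normalize(entry) for entry in entries if entry.strip())
    return merged


def is_blocked(domain: str, blocklist: set) -> bool:
    """True if the domain or any of its parent domains is on the block list."""
    parts = normalize(domain).split(".")
    # Check "media.bad.example", then "bad.example", then "example".
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))


if __name__ == "__main__":
    # Hypothetical lists from two different maintainers.
    blocklist = merge_blocklists(["Bad.example"], ["worse.example."])
    for candidate in ("media.bad.example", "good.example"):
        print(candidate, "blocked" if is_blocked(candidate, blocklist) else "allowed")
```

Matching parent domains is a design choice in this sketch: it stops a blocked instance from dodging the list by serving media from a subdomain.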
There is a database of known CSAM files and their hashes. Mastodon could implement a filter for those, both at posting time and when federating content.
Shadow banning those users would be nice too
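The hash-filter idea above could look roughly like this. It is only a sketch under heavy assumptions: it uses plain SHA-256, which matches byte-identical files only, whereas tools like PhotoDNA use perceptual hashes that survive resizing and re-encoding. The `HashFilter` class and `accept_media` function are hypothetical names, the blocked entry is a harmless placeholder, and access to real hash lists is restricted to vetted organizations.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 of a file's bytes (exact match only, not perceptual)."""
    return hashlib.sha256(data).hexdigest()


class HashFilter:
    """Holds a set of known-bad file hashes and checks uploads against it."""

    def __init__(self, known_hashes):
        self._known = {h.lower() for h in known_hashes}

    def is_known(self, data: bytes) -> bool:
        return sha256_hex(data) in self._known


def accept_media(data: bytes, hash_filter: HashFilter) -> bool:
    """Run at upload time and again when media arrives via federation."""
    return not hash_filter.is_known(data)


if __name__ == "__main__":
    # Placeholder bytes stand in for a file that is on the hash list.
    blocked_file = b"placeholder-blocked-file"
    hash_filter = HashFilter([sha256_hex(blocked_file)])
    print(accept_media(blocked_file, hash_filter))      # rejected
    print(accept_media(b"holiday-photo", hash_filter))  # accepted
```

A rejected upload would normally also trigger a report to the relevant authority rather than just being dropped silently.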
They are talking about AI generated images. That’s the volume part.
This is one of the reasons I’m hesitant to start my own instance - the moderation load expands exponentially as you scale, and without some sort of automated tool to keep CSAM content from being posted in the first place, I can only see the problem increasing. I’m curious to see if anyone knows of lemmy or mastodon moderation tools that could help here.
That being said, it’s worth noting that the same Stanford research team reviewed Twitter and found the same dynamic in play, so this isn’t a problem unique to Mastodon. The ugly thing is that Twitter has (or had) a team to deal with this, and yet:
“The investigation discovered problems with Twitter’s CSAM detection mechanisms and we reported this issue to NCMEC in April, but the problem continued,” says the team. “Having no remaining Trust and Safety contacts at Twitter, we approached a third-party intermediary to arrange a briefing. Twitter was informed of the problem, and the issue appears to have been resolved as of May 20.”
Research such as this is about to become far harder—or at any rate far more expensive—following Elon Musk’s decision to start charging $42,000 per month for its previously free API. The Stanford Internet Observatory, indeed, has recently been forced to stop using the enterprise-level of the tool; the free version is said to provide read-only access, and there are concerns that researchers will be forced to delete data that was previously collected under agreement.
So going forward, such comparisons will be impossible because Twitter has locked down its API. So yes, the Fediverse has a problem, the same one Twitter has, but Twitter is actively ignoring it while reducing transparency into future moderation.
If you run your instance behind Cloudflare, you can enable the CSAM scanning tool, which can automatically block known CSAM and report it to the authorities if it’s uploaded to your server. This should reduce your risk as the instance operator.
https://developers.cloudflare.com/cache/reference/csam-scanning/
Sweet - thanks - that’s a brilliant tool. Bookmarked.
@Arotrios @corb3t @redcalcium perhaps we should learn to not stand behind cloudflare at all! their proxy:
- is filtering real people,
- is blocking randomly some requests between activity pub servers ❌
the best way to deal with non solicited content is human moderation, little instance, few people, human scale… #smallWeb made of lots of little instances without any need of a big centralized proxy… 🧠
some debates: 💡 https://toot.cafe/@Coffee/109480850755446647
https://g33ks.coffee/@coffee/110519150084601332
Thanks for the comment - I wasn’t aware of a cloudflare controversy in play, and went through your links and the associated wikipedia page. It’s interesting to me, as someone who previously ran a public forum, to see them struggle with the same issues surrounding hate speech I did on a larger scale.
I agree with your thoughts on a centralized service having that much power, although Cloudflare does have a number of competitors, so I’m not quite seeing the risk here, save for the fact that Cloudflare appears to be the only one offering CSAM filtering (will have to dig in further to confirm). The ActivityPub blocking for particular instances is concerning, but I don’t see a source on that - do you have more detail?
However, I disagree with your statement on handling non-solicited content - from personal experience, I can confidently state that there are some things that get submitted that you just shouldn’t subject another human to, even if it’s only to determine whether or not it should be deleted. CSAM falls under this category in my book. Having a system in place that keeps you and your moderators from having to deal with it is invaluable to a small instance owner.
@Arotrios @corb3t @redcalcium it’s all about community & trust. I agree that no one should ever have to deal with such offensive content, and my answer again is: running a small instance, with few people, creating a safe space, building trust… ☮️
Of course it’s a different approach about how to create and maintain an online community I guess. We don’t have to deal with non solicited content here because we are 20 and we kind of know each other, subscription is only available by invitation, so you are kind of responsible for who you’re bringing here… and we care about people, each of them! Again, community & trust over any tools 👌
obviously we do not share the same vision here, but it’s ok, I’m not trying to convince, I just wanted to say our approach is completely different 😉
more about filtering content: https://www.devever.net/~hl/cloudflare 💡
Thanks - that’s the detail I was looking for. Definitely food for thought.
I trust CloudFlare a helluva lot more than I trust most of these companies discussed on this thread. Their transparency is second to none.
I think the common sense solution is creating instances for physically local communities (thus keeping the moderation overhead to a minimum) and being very judicious about which instances you federate your instance with.
That being said, It’s only a matter of time before moderation tools are created for streamlining the process.
My instance is for members of a certain group, had to email the owner a picture of your card to get in. More instances should exist like that. General instances are great but it’s nice knowing all the people on my local are in this group too.
Nah, not intimidated. More that I ran a sizeable forum in the past and I know what a pain in the ass this kind of content can be to deal with. That’s why I was asking about automated tools to deal with it. The forum I ran got targeted by a bunch of Turkish hackers, and one of their attack techniques involved a wave of spambot accounts trying to post crap content. I wasn’t intimidated (fought them for about two years straight), but by the end of it I was exhausted to the point where it just wasn’t worth it anymore. An automated CSAM filter would have made a huge difference, but this was over a decade ago and those tools weren’t around.
“We got more photoDNA hits in a two-day period than we’ve probably had in the entire history of our organization of doing any kind of social media analysis, and it’s not even close,”
How do you have “probably” and “it’s not even close” in the same sentence?
Here’s the thing, and what I’ve been saying for a long time about The Fediverse:
I don’t care what platform you have, if it is sufficiently popular, you’re GOING to have CSAM. You’re going to have alt-right assholes. You’re going to have transphobia, you’re going to have racism and every other kind of discrimination.
People point fingers at Meta for “allowing” this but there’s no amount of money that can reasonably moderate 3 b-b-billion users. Meta, and probably every other platform that’s not Twitter or False social, does what they can about this.
Masto and Fedi admins need to be cognizant of the amount of users on their instances and need to have a sufficient number of moderators to manage those users. If they don’t have them, they need to close registrations.
But ultimately the Fediverse can also create safe-havens for these sorts of things. Making it easy to set up a discriminatory network that has no outside moderation. This is the downside of free speech.
Heck, Truth Social uses Mastodon, IIRC.
Ultimately, it’s software. Even if my home instance does a good job of enforcing its CoC, and every instance it federates with does as well, someone else can spin up their own instance, load up on whatever, and I’ll never know or even be aware if it’s never federated with my instance.
I think it uses SOME code
People point fingers at Meta for “allowing” this but there’s no amount of money that can reasonably moderate 3 b-b-billion users.
This is a prime use case for AI technology
“Now you have two problems.”
Thanks for reminding me of this masterpiece of writing about management of social networks
Seems odd that they mention Mastodon as a Twitter alternative in this article, but do not make any mention of the fact that Twitter is also rife with these problems, more so as they lose employees and therefore moderation capabilities. These problems have been around on Twitter for far longer, and not nearly enough has been done.
The actual report is probably better to read.
It points out that you upload to one server, and that server then sends the image to thousands of others. How do those thousands of others scan for this? In theory, using the PhotoDNA tool that large companies use, but then you have to send every image to PhotoDNA thousands of times, once for each server (because how do you trust another server telling you it’s fine?).
The report provides recommendations on how servers can use signatures and public keys to trust scan results from PhotoDNA, so images can be federated with a level of trust. It also suggests large players entering the market (Meta, Wordpress, etc) should collaborate to build these tools that all servers can use.
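To make the "trust a scan result instead of rescanning" idea concrete, here is a small sketch of a scan attestation that travels with the image. Big caveat: the report talks about signatures and public keys, while this sketch substitutes an HMAC with a shared secret, only because Python's standard library has no public-key signing. With a shared secret, any server holding the key could forge a result, so a real design would use an asymmetric signature from the scanning service. The field names and functions are hypothetical and are not taken from the report.

```python
import hashlib
import hmac
import json


def _payload(sha256: str, verdict: str) -> bytes:
    """Canonical bytes that get authenticated: file hash plus scan verdict."""
    return json.dumps({"sha256": sha256, "verdict": verdict}, sort_keys=True).encode()


def make_attestation(media: bytes, verdict: str, key: bytes) -> dict:
    """Scanner side: bind a verdict to the exact bytes that were scanned."""
    digest = hashlib.sha256(media).hexdigest()
    tag = hmac.new(key, _payload(digest, verdict), hashlib.sha256).hexdigest()
    return {"sha256": digest, "verdict": verdict, "tag": tag}


def verify_attestation(media: bytes, att: dict, key: bytes) -> bool:
    """Receiver side: check the tag and that it covers the bytes we received."""
    digest = str(att.get("sha256", ""))
    verdict = str(att.get("verdict", ""))
    expected = hmac.new(key, _payload(digest, verdict), hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, str(att.get("tag", "")))
    hash_ok = digest == hashlib.sha256(media).hexdigest()
    return tag_ok and hash_ok


if __name__ == "__main__":
    key = b"demo-key"  # stand-in for the scanning service's signing key
    att = make_attestation(b"image-bytes", "clean", key)
    print(verify_attestation(b"image-bytes", att, key))    # attestation matches
    print(verify_attestation(b"swapped-bytes", att, key))  # bytes were changed
```

The point is that a receiving server verifies one small attestation instead of sending the same image off for scanning again.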
Basically the original report points out the ease of finding CSAM on mastodon, and addresses the challenges unique to federation including proposing solutions. It doesn’t claim centralised servers have it solved, it just addresses additional challenges federation has.
I’m not actually going to read all that, but I’m going to take a few guesses that I’m quite sure are going to be correct.
First, I don’t think Mastodon has a “massive child abuse material” problem at all. I think it has, at best, a “racy Japanese style cartoon drawing” problem or, at worst, an “AI generated smut meant to look underage” problem. I’m also quite sure there are monsters operating in the shadows, dogwhistling and hashtagging to each other to find like minded people to set up private exchanges (or instances) for actual CSAM. This is no different than any other platform on the Internet, Mastodon or not. This is no different than the golden age of IRC. This is no different from Tor. This is no different than the USENET and BBS days. People use computers for nefarious shit.
All that having been said, I’m equally sure that this “research” claims that some algorithm has found “actual child porn” on Mastodon that has been verified by some “trusted third part(y|ies)” that may or may not be named. I’m also sure this “research” spends an inordinate amount of time pointing out the “shortcomings” of Mastodon (i.e. no built-in “features” that would allow corporations/governments to conduct what is essentially dragnet surveillance on traffic) and how this has to change “for the safety of the children.”
How right was I?
The content in question is unfortunately something that has become very common in recent months: CSAM (child sexual abuse material), generally AI-generated.
AI is now apparently generating entire children, abusing them, and uploading video of it.
Or, they are counting “CSAM-like” images as CSAM.
Of course they’re counting “CSAM-like” in the stats, otherwise they wouldn’t have any stats at all. In any case, they don’t really care about child abuse at all. They care about a platform existing that they haven’t been able to wrap their slimy tentacles around yet.
Halfway there. The PDF lists drawn 2D/3D, AI/ML generated 2D, and real-life CSAM. It does highlight the actual problem of young platforms with immature moderation tools not being able to deal with the sudden influx of objectionable content.
I’m not going to read all that. You were probably pretty right.
…If you read it, then you’d know if you’re right.
I’m always suspicious when someone argues for content filters with “protection of children” as the main argument…
As a parent, I’m always worried about any policy with “protect the children” as the main argument. There are lots of stupid policies proposed and sometimes implemented that are justified this way, such as:
- facial recognition to prevent underage kids from playing certain video games
- proof of ID to access social media and porn
- complicated parental controls on devices and services
And so on.
Most of these have easy ways to circumvent these rules and absolutely violate privacy, so I will be teaching my kids how to do that. In fact, once our home Internet gets fast enough, I may route all traffic through a VPN just to avoid most of these stupid rules and instead rely on trust with my kids to keep them safe on the Internet.
So what I’m reading is they didn’t actually look at any images, they found hashtags, undisclosed hashtags at that. So basically we’ve no idea what they think they found; for all we know, “cartoon” might’ve been one of the tags.
I bet there’s more CP hosted by Bing.
Is this Blahaj.zone admin “child abuse material” or actual child abuse material?
Or maybe it’s better to err on the side of caution when it comes to maybe one of the worst legal offences you can do?
I’m tired of people harping on this decision when it’s a perfectly legitimate one from a legal standpoint. There’s a reason tons of places are very iffy about nsfw content.