How Facebook Secretly Decides What Counts As Hate Speech

Facebook, like many social media platforms, wants to be a supporter of free expression without being seen as actively encouraging hate speech, threats, or harassment. Internally, Facebook has guidelines on what crosses that threshold, but a new investigation finds that these rules do not always make sense, nor do they tend to protect those who are most likely to be targeted.

In its latest investigation of Facebook’s black box, ProPublica dived into the actual rules Facebook uses to determine what qualifies as “hate speech” vs “legitimate political expression.”

The line, of course, can be blurry. Sometimes someone may quote or share hate speech they’ve received, in order to make a point. Or perhaps someone in a position of power, like a lawmaker, says something particularly hateful — in that instance, millions will share it, to meaningful effect. Or maybe someone really is a genuinely awful human being, throwing slurs, attacks, and threats at people who don’t look like them.

All of these, and more, happen every day. So what did ProPublica learn about how Facebook handles the (growing) problem?

Too Many People

One of the biggest challenges for Facebook is simply its sheer reach and scope.

The company wants to put in place globally applicable rules, but with a user base of more than 2 billion people, more than a quarter of the entire human population on Earth, the laws and cultures those users live under vary widely.

After a recent hiring spree, Facebook will eventually be up to approximately 7,500 content moderators. So right off the bat, users outnumber moderators by more than 266,000:1.

There aren’t any current, updated public statistics available about how much content hits Facebook every day. However, of Facebook’s 2 billion monthly active users, about two-thirds, or 1.28 billion, use the site on any given day. Back in 2013, four years ago, Facebook users were uploading 350 million photos a day, worldwide.

Even if that were all the content put on Facebook daily, it would still mean posts outnumber the moderators by more than 46,000:1. And the real number of photos, videos, and status updates shared each day is much, much higher than that.
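For readers who want to check the math, the two ratios above can be reproduced in a few lines. This is only a sketch of the arithmetic using the figures quoted in this story; the variable names are our own, not anything Facebook publishes.

```python
# Figures as quoted in this article; the names are illustrative only.
monthly_users = 2_000_000_000        # "more than 2 billion" monthly active users
moderators = 7_500                   # planned headcount after the hiring spree
photos_per_day_2013 = 350_000_000    # daily photo uploads reported in 2013

# Users per moderator, and 2013-era daily photos per moderator.
users_per_moderator = monthly_users / moderators
photos_per_moderator = photos_per_day_2013 / moderators

print(f"{users_per_moderator:,.0f} users per moderator")
print(f"{photos_per_moderator:,.0f} photos per moderator per day")
```

Both results are floors, since the user count is "more than" 2 billion and photos are only one kind of daily content.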

With a user base that large, scanning everything posted would simply be physically impossible, completely aside from any questions of speech rights or ethics — you just couldn’t do it.

So Facebook, as it recently explained, analyzes content that gets flagged or reported by other users as being hate speech. The company says that on average, over the past two months, it’s deleted 66,000 posts per week that were reported as hate speech.

That, however, isn’t how many posts are reported for hate speech. Nor does that number of deletions include groups, pages, or profiles that get deleted or suspended for inappropriate behavior.

(As for getting more granular numbers, Facebook says, “We are exploring a better process by which to log our reports and removals, for more meaningful and accurate data.”)

The Math Of Protection

ProPublica obtained a set of training slides that Facebook uses to teach its content moderation team what is and isn’t hate speech.

It starts out with reducing the nuance of language to a single mathematical equation: “Protected category + Attack = hate speech.”

Mentioning a protected category does not count as hate speech, the slides say, which makes sense. Attacking a protected category does count as hate speech. In theory.

Facebook trains its content reviewers to explicitly protect users based on eight key categories: Sex, sexual orientation, gender identity, religious affiliation, ethnicity, race, national origin, and disability or disease.

It also explicitly instructs reviewers not to protect certain other groupings: social class, continental origin, appearance, age, occupation, political ideology, religions, and countries.

It’s confusing right off the bat: Attacks based on someone’s religious affiliation or national origin seem to be against the rules, but attacks based on religions or countries seem not to be. Except for when they are.

Because it gets tricky when Facebook starts splitting people up into “subsets.” Someone who belongs to two protected categories is a protected person, Facebook seems to say, but someone belonging to one protected category and not another qualifies as “not.” The second, unprotected category basically overrides your presence in the protected category.

For example, the slide concludes, “Irish Women” qualify as a protected category, because that subset includes both national origin and sex. But “Irish teens” do not, because that subset is where national origin meets age.

Civil rights advocates often apply a concept called “intersectionality” to examinations of discrimination and hate speech, where a person is considered to sit at the nexus where all the aspects of their identity meet. So for example, one’s position in society is not simply considered on the basis of their gender, but also of their race, their disability status, their age, their socioeconomic standing, and so on.

But Facebook’s content moderation guidelines seem to apply the opposite tactic, instead considering people not as a whole, but as a pile of disparate parts, where the least qualified sets the overall rule.

And that leads to confusing results, like one training slide ProPublica obtained that included three examples of groups and asked, “Which of the below subsets do we protect?” The options were female drivers, black children, and white men; as Facebook has it, the correct answer to the quiz is “white men.”

Why? Because both “white” (a race) and “men” (a gender) are considered protected categories. But while “female” (a gender) and “black” (a race) are protected categories, neither “driver” (an occupation or skill) nor “children” (an age) are. The unprotected part overrides the protected part, and so those groups are not protected from hate speech as far as Facebook is concerned.
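The subset logic described in the slides is easier to see when written out as a rule. What follows is a rough sketch of that rule as this story describes it, not Facebook's actual system: the function name, the category labels, and the way each example group is mapped to categories are all our own assumptions. In particular, we file "men," "women," and "female" under "sex."

```python
# Category lists as reported from the training slides.
PROTECTED = {
    "sex", "sexual orientation", "gender identity", "religious affiliation",
    "ethnicity", "race", "national origin", "disability or disease",
}
NOT_PROTECTED = {
    "social class", "continental origin", "appearance", "age",
    "occupation", "political ideology", "religions", "countries",
}

def subset_is_protected(categories):
    """Hypothetical helper: a subset counts as protected only if every
    category that defines it is protected. One unprotected category
    overrides the rest."""
    return bool(categories) and all(c in PROTECTED for c in categories)

# Our own mapping of the article's examples onto category labels.
examples = {
    "Irish women": ["national origin", "sex"],
    "Irish teens": ["national origin", "age"],
    "female drivers": ["sex", "occupation"],
    "black children": ["race", "age"],
    "white men": ["race", "sex"],
}

for group, categories in examples.items():
    verdict = "protected" if subset_is_protected(categories) else "not protected"
    print(f"{group}: {verdict}")
```

Under this reading, only "Irish women" and "white men" come out as protected, which matches the quiz answer on the slide.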

Who It Hurts

As ProPublica found, Facebook’s uneven application of “protected class,” and exemptions for certain users, has a tendency to advantage the powerful and — inadvertently or not — silence those who are trying to draw attention to harm or oppression.

Facebook, for what it’s worth, is actively discussing its changing approach to hate speech as part of its corporate “hard questions” initiative, where it publicly discusses the challenges that terrorism, hate, and propaganda pose for the platform.

“Sometimes, it’s obvious that something is hate speech and should be removed – because it includes the direct incitement of violence against protected characteristics, or degrades or dehumanizes people,” Facebook says. “If we identify credible threats of imminent violence against anyone, including threats based on a protected characteristic, we also escalate that to local law enforcement.”

But sometimes, Facebook admits, it doesn’t correctly interpret context, and makes a bad call. For example, at one point last year civil rights activist Shaun King posted hate mail he had received. Facebook deleted it because its vulgar slurs were indeed hate speech — but King shared it not to attack anyone, but rather to show that he was attacked.

“When we were alerted to the mistake, we restored the post and apologized,” Facebook says. “Still, we know that these kinds of mistakes are deeply upsetting for the people involved and cut against the grain of everything we are trying to achieve at Facebook.”

However, that kind of experience is far from isolated, ProPublica reports.

It profiles several black academics and activists in the U.S. who have had posts voicing sentiments like, “White folks. When racism happens in public — YOUR SILENCE IS VIOLENCE” deleted without explanation.

As ProPublica notes, users who have content deleted, or who have their accounts suspended, are not usually told what rule they may have broken, and they cannot appeal the decision.

This is particularly challenging in parts of the world with recurrent armed conflict or occupation: ProPublica profiled activists and journalists from India, Ukraine, Western Sahara, and Israel who say they have had their accounts or pages disabled by Facebook, and have had to create new accounts.

Users who are high enough profile to be able to gain media attention may see an apology and their content restored. “If you get publicity, they clean it right up,” activist Leslie Mac told ProPublica.