Facebook’s language gaps weaken screening of hate, terrorism

Facebook reported internally it had erred in nearly half of all Arabic language takedown requests submitted for appeal. (File/AFP)
Updated 25 October 2021

  • Arabic poses particular challenges to Facebook’s automated systems and human moderators, each of which struggles to understand spoken dialects
  • In some of the world’s most volatile regions, terrorist content and hate speech proliferate because Facebook remains short on moderators who speak local languages and understand cultural contexts

DUBAI: As the Gaza war raged and tensions surged across the Middle East last May, Instagram briefly banned the hashtag #AlAqsa, a reference to the Al-Aqsa Mosque in Jerusalem’s Old City, a flash point in the conflict.
Facebook, which owns Instagram, later apologized, explaining its algorithms had mistaken the third-holiest site in Islam for the militant group Al-Aqsa Martyrs Brigade, an armed offshoot of the secular Fatah party.
For many Arabic-speaking users, it was just the latest potent example of how the social media giant muzzles political speech in the region. Arabic is among the most common languages on Facebook’s platforms, and the company issues frequent public apologies after similar botched content removals.
Now, internal company documents from the former Facebook product manager-turned-whistleblower Frances Haugen show the problems are far more systemic than just a few innocent mistakes, and that Facebook has understood the depth of these failings for years while doing little about it.
Such errors are not limited to Arabic. An examination of the files reveals that in some of the world’s most volatile regions, terrorist content and hate speech proliferate because the company remains short on moderators who speak local languages and understand cultural contexts. And its platforms have failed to develop artificial-intelligence solutions that can catch harmful content in different languages.
In countries like Afghanistan and Myanmar, these loopholes have allowed inflammatory language to flourish on the platform, while in Syria and the Palestinian territories, Facebook suppresses ordinary speech, imposing blanket bans on common words.
“The root problem is that the platform was never built with the intention it would one day mediate the political speech of everyone in the world,” said Eliza Campbell, director of the Middle East Institute’s Cyber Program. “But for the amount of political importance and resources that Facebook has, moderation is a bafflingly under-resourced project.”
This story, along with others published Monday, is based on Haugen’s disclosures to the Securities and Exchange Commission, which were also provided to Congress in redacted form by her legal team. The redacted versions were reviewed by a consortium of news organizations, including The Associated Press.
In a statement to the AP, a Facebook spokesperson said that over the last two years the company has invested in recruiting more staff with local dialect and topic expertise to bolster its review capacity around the world.
But when it comes to Arabic content moderation, the company said, “We still have more work to do. ... We conduct research to better understand this complexity and identify how we can improve.”
In Myanmar, where Facebook-based misinformation has been linked repeatedly to ethnic and religious violence, the company acknowledged in its internal reports that it had failed to stop the spread of hate speech targeting the minority Rohingya Muslim population.
The Rohingya’s persecution, which the US has described as ethnic cleansing, led Facebook to publicly pledge in 2018 that it would recruit 100 native Myanmar language speakers to police its platforms. But the company never disclosed how many content moderators it ultimately hired or revealed which of the nation’s many dialects they covered.
Despite Facebook’s public promises and many internal reports on the problems, the rights group Global Witness said the company’s recommendation algorithm continued to amplify army propaganda and other content that breaches the company’s Myanmar policies following a military coup in February.
In India, the documents show Facebook employees debating last March whether it could clamp down on the “fear mongering, anti-Muslim narratives” that Prime Minister Narendra Modi’s far-right Hindu nationalist group, Rashtriya Swayamsevak Sangh, broadcasts on its platform.
In one document, the company notes that users linked to Modi’s party had created multiple accounts to supercharge the spread of Islamophobic content. Much of this content was “never flagged or actioned,” the research found, because Facebook lacked moderators and automated filters with knowledge of Hindi and Bengali.
Arabic poses particular challenges to Facebook’s automated systems and human moderators, each of which struggles to understand spoken dialects unique to each country and region, their vocabularies salted with different historical influences and cultural contexts.
Moroccan colloquial Arabic, for instance, includes French and Berber words, and is spoken with short vowels. Egyptian Arabic, on the other hand, includes some Turkish from the Ottoman conquest. Other dialects are closer to the “official” version found in the Qur’an. In some cases, these dialects are not mutually comprehensible, and there is no standard way of transcribing colloquial Arabic.
Facebook first developed a massive following in the Middle East during the 2011 Arab Spring uprisings, and users credited the platform with providing a rare opportunity for free expression and a critical source of news in a region where autocratic governments exert tight controls over both. But in recent years, that reputation has changed.
Scores of Palestinian journalists and activists have had their accounts deleted. Archives of the Syrian civil war have disappeared. And a vast vocabulary of everyday words has become off-limits to speakers of Arabic, Facebook’s third-most common language with millions of users worldwide.
For Hassan Slaieh, a prominent journalist in the blockaded Gaza Strip, the first message felt like a punch to the gut. “Your account has been permanently disabled for violating Facebook’s Community Standards,” the company’s notification read. That was at the peak of the bloody 2014 Gaza war, following years of his news posts on violence between Israel and Hamas being flagged as content violations.
Within moments, he lost everything he’d collected over six years: personal memories, stories of people’s lives in Gaza, photos of Israeli airstrikes pounding the enclave, not to mention 200,000 followers. The most recent Facebook takedown of his page last year came as less of a shock. It was the 17th time that he had to start from scratch.
He had tried to be clever. Like many Palestinians, he’d learned to avoid the typical Arabic words for “martyr” and “prisoner,” along with references to Israel’s military occupation. If he mentioned militant groups, he’d add symbols or spaces between each letter.
Other users in the region have taken an increasingly savvy approach to tricking Facebook’s algorithms, employing a centuries-old Arabic script that lacks the dots and marks that help readers differentiate between otherwise identical letters. The writing style, common before Arabic learning exploded with the spread of Islam, has circumvented hate speech censors on Facebook’s Instagram app, according to the internal documents.
But Slaieh’s tactics didn’t make the cut. He believes Facebook banned him simply for doing his job. As a reporter in Gaza, he posts photos of Palestinian protesters wounded at the Israeli border, mothers weeping over their sons’ coffins, statements from the Gaza Strip’s militant Hamas rulers.
Criticism, satire and even simple mentions of groups on the company’s Dangerous Individuals and Organizations list — a docket modeled on the US government equivalent — are grounds for a takedown.
“We were incorrectly enforcing counterterrorism content in Arabic,” one document reads, noting the current system “limits users from participating in political speech, impeding their right to freedom of expression.”
The Facebook blacklist includes Gaza’s ruling Hamas party, as well as Hezbollah, the militant group that holds seats in Lebanon’s Parliament, along with many other groups representing wide swaths of people and territory across the Middle East, the internal documents show, resulting in what Facebook employees describe in the documents as widespread perceptions of censorship.
“If you posted about militant activity without clearly condemning what’s happening, we treated you like you supported it,” said Mai el-Mahdy, a former Facebook employee who worked on Arabic content moderation until 2017.
In response to questions from the AP, Facebook said it consults independent experts to develop its moderation policies and goes “to great lengths to ensure they are agnostic to religion, region, political outlook or ideology.”
“We know our systems are not perfect,” it added.
The company’s language gaps and biases have led to the widespread perception that its reviewers skew in favor of governments and against minority groups.
Former Facebook employees also say that various governments exert pressure on the company, threatening regulation and fines. Israel, a lucrative source of advertising revenue for Facebook, is the only country in the Mideast where Facebook operates a national office. Its public policy director previously advised former right-wing Prime Minister Benjamin Netanyahu.
Israeli security agencies and watchdogs monitor Facebook and bombard it with thousands of orders to take down Palestinian accounts and posts as they try to crack down on incitement.
“They flood our system, completely overpowering it,” said Ashraf Zeitoon, Facebook’s former head of policy for the Middle East and North Africa region, who left in 2017. “That forces the system to make mistakes in Israel’s favor. Nowhere else in the region had such a deep understanding of how Facebook works.”
Facebook said in a statement that it fields takedown requests from governments no differently from those from rights organizations or community members, although it may restrict access to content based on local laws.
“Any suggestion that we remove content solely under pressure from the Israeli government is completely inaccurate,” it said.
Syrian journalists and activists reporting on the country’s opposition also have complained of censorship, with electronic armies supporting embattled President Bashar Assad aggressively flagging dissident content for removal.
Raed, a former reporter at the Aleppo Media Center, a group of antigovernment activists and citizen journalists in Syria, said Facebook erased most of his documentation of Syrian government shelling on neighborhoods and hospitals, citing graphic content.
“Facebook always tells us we break the rules, but no one tells us what the rules are,” he added, giving only his first name for fear of reprisals.
In Afghanistan, many users literally cannot understand Facebook’s rules. According to an internal report in January, Facebook did not translate the site’s hate speech and misinformation pages into Dari and Pashto, the two most common languages in Afghanistan, where English is not widely understood.
When Afghan users try to flag posts as hate speech, the drop-down menus appear only in English. So does the Community Standards page. The site also doesn’t have a bank of hate speech terms, slurs and code words in Afghanistan used to moderate Dari and Pashto content, as is typical elsewhere. Without this local word bank, Facebook can’t build the automated filters that catch the worst violations in the country.
When it came to looking into the abuse of domestic workers in the Middle East, internal Facebook documents acknowledged that engineers primarily focused on posts and messages written in English. The flagged-words list did not include Tagalog, the major language of the Philippines, where many of the region’s housemaids and other domestic workers come from.
In much of the Arab world, the opposite is true — the company over-relies on artificial-intelligence filters that make mistakes, leading to “a lot of false positives and a media backlash,” one document reads. Largely unskilled human moderators, in over their heads, tend to passively field takedown requests instead of screening proactively.
Sophie Zhang, a former Facebook employee-turned-whistleblower who worked at the company for nearly three years before being fired last year, said contractors in Facebook’s Ireland office complained to her they had to depend on Google Translate because the company did not assign them content based on what languages they knew.
Facebook outsources most content moderation to giant companies that enlist workers far afield, from Casablanca, Morocco, to Essen, Germany. The firms don’t sponsor work visas for the Arabic teams, limiting the pool to local hires in precarious conditions — mostly Moroccans who seem to have overstated their linguistic capabilities. They often get lost in the translation of Arabic’s 30-odd dialects, flagging inoffensive Arabic posts as terrorist content 77 percent of the time, one document said.
“These reps should not be fielding content from non-Maghreb region, however right now it is commonplace,” another document reads, referring to the region of North Africa that includes Morocco. The file goes on to say that the Casablanca office falsely claimed in a survey it could handle “every dialect” of Arabic. But in one case, reviewers incorrectly flagged a set of Egyptian dialect content 90 percent of the time, a report said.
Iraq ranks highest in the region for its reported volume of hate speech on Facebook. But among reviewers, knowledge of Iraqi dialect is “close to non-existent,” one document said.
“Journalists are trying to expose human rights abuses, but we just get banned,” said one Baghdad-based press freedom activist, who spoke on condition of anonymity for fear of reprisals. “We understand Facebook tries to limit the influence of militias, but it’s not working.”
Linguists described Facebook’s system as flawed for a region with a vast diversity of colloquial dialects that Arabic speakers transcribe in different ways.
“The stereotype that Arabic is one entity is a major problem,” said Enam Al-Wer, professor of Arabic linguistics at the University of Essex, citing the language’s “huge variations” not only between countries but class, gender, religion and ethnicity.
Despite these problems, moderators are on the front lines of what makes Facebook a powerful arbiter of political expression in a tumultuous region.
Although the documents from Haugen predate this year’s Gaza war, episodes from that 11-day conflict show how little has been done to address the problems flagged in Facebook’s own internal reports.
Activists in Gaza and the West Bank lost their ability to livestream. Whole archives of the conflict vanished from newsfeeds, a primary portal of information for many users. Influencers accustomed to tens of thousands of likes on their posts saw their outreach plummet when they posted about Palestinians.
“This has restrained me and prevented me from feeling free to publish what I want for fear of losing my account,” said Soliman Hijjy, a Gaza-based journalist whose aerials of the Mediterranean Sea garnered tens of thousands more views than his images of Israeli bombs — a common phenomenon when photos are flagged for violating community standards.
During the war, Palestinian advocates submitted hundreds of complaints to Facebook, often leading the company to concede error and reinstate posts and accounts.
In the internal documents, Facebook reported it had erred in nearly half of all Arabic language takedown requests submitted for appeal.
“The repetition of false positives creates a huge drain of resources,” it said.
In announcing the reversal of one such Palestinian post removal last month, Facebook’s semi-independent oversight board urged an impartial investigation into the company’s Arabic and Hebrew content moderation. It called for improvement in its broad terrorism blacklist to “increase understanding of the exceptions for neutral discussion, condemnation and news reporting,” according to the board’s policy advisory statement.
Facebook’s internal documents also stressed the need to “enhance” algorithms, enlist more Arab moderators from less-represented countries and restrict them to where they have appropriate dialect expertise.
“With the size of the Arabic user base and potential severity of offline harm … it is surely of the highest importance to put more resources to the task to improving Arabic systems,” said the report.
But the company also lamented that “there is not one clear mitigation strategy.”
Meanwhile, many across the Middle East worry the stakes of Facebook’s failings are exceptionally high, with potential to widen long-standing inequality, chill civic activism and stoke violence in the region.
“We told Facebook: Do you want people to convey their experiences on social platforms, or do you want to shut them down?” said Husam Zomlot, the Palestinian envoy to the United Kingdom, who recently discussed Arabic content suppression with Facebook officials in London. “If you take away people’s voices, the alternatives will be uglier.”


France tries five for kidnapping journalists in Syria

Updated 17 February 2025

  • They were charged with holding four French journalists hostage for Daesh in war-torn Syria more than a decade ago
  • The journalists were held by Daesh in Aleppo for 10 months until their release in April 2014

PARIS: Five men went on trial in France on Monday charged with holding four French journalists hostage for Daesh in war-torn Syria more than a decade ago.

Daesh emerged in 2013 in the chaos that followed the outbreak of the Syrian civil war.

The militants kidnapped a number of foreign journalists and aid workers before US-backed forces eventually defeated the group in 2019.

Reporters Didier Francois and Edouard Elias, and then Nicolas Henin and Pierre Torres, were abducted 10 days apart while reporting from northern Syria in June 2013.

The journalists were held by Daesh in Aleppo for 10 months until their release in April 2014.

They were found blindfolded with their hands bound in the no-man’s land straddling the border between Syria and Turkiye.

More than a decade later, jailed militant Mehdi Nemmouche, 39, is among five men accused of their kidnapping at a trial to last until March 21.

Nemmouche is already in prison after a Belgian court jailed him for life in 2019 for killing four people at a Jewish museum in May 2014, after returning from Syria.

“I was never the jailer of the Western hostages or any other hostage, and I never met these people in Syria,” Nemmouche told the Paris court, breaking his silence after not speaking throughout the Brussels trial or during the investigation.

All four journalists told investigators they were sure Nemmouche was their jailer.

Henin, in a magazine article in September 2014, recounted Nemmouche, then called Abu Omar, punching him in the face and terrorizing Syrian detainees. He described him as “a self-centered fantasist.”

Also in the dock are Frenchman Abdelmalek Tanem, 35, who has already been sentenced in France for heading to fight in Syria in 2012, and a 41-year-old Syrian called Kais Al-Abdallah, accused of facilitating Henin’s kidnapping. Both have denied the charges.

Belgian militant Oussama Atar, a senior Daesh commander, is being tried in absentia because he is presumed to have died in Syria in 2017. He has already been sentenced to life over attacks in Paris in 2015 claimed by Daesh that killed 130 people, and Brussels bombings by the group that took the lives of 32 others in 2016.

French Daesh member Salim Benghalem, who was allegedly in charge of the hostages, is also on trial though believed to be dead.


West Bank booksellers say arrests reflect intensifying Israeli crackdown on Palestinian culture

Updated 15 February 2025

  • Mahmoud Muna and his nephew Ahmed were arrested on Sunday after Israeli police raided the family-owned bookshops on accusations of selling books that supported terrorism
  • “Case is not isolated event, but part of series of attack against Palestinian cultural institutions,” Mahmoud said

LONDON: Two booksellers from the West Bank, recently arrested by Israeli police, say their detention is part of an escalating effort by Israeli authorities to suppress Palestinian culture.

In an interview with The Guardian, Mahmoud Muna and his nephew Ahmed, whose family has owned the Educational Bookshop in East Jerusalem for more than 40 years, described the raid on their store as part of a broader campaign to stifle Palestinian identity and free expression.

“We should not look at this as an isolated event,” Mahmoud said. “There have been a series of attacks on cultural institutions in Jerusalem and beyond. I think there is an awareness in the Israeli establishment that cultural institutions are playing a role in galvanising and protecting Palestinian cultural identity.”

The raid occurred last Sunday when plainclothes officers entered two branches of the bookshop on Salah Eddin Street — one specializing in Arabic books, the other in English and foreign-language publications. Mahmoud and Ahmed were arrested and detained for two days.

Israeli police accused the men of “selling books containing incitement and support for terrorism,” claiming officers found materials with “nationalist Palestinian themes,” including a children’s coloring book that contained the Israeli-contested sentence “From the river to the sea.”

The two men said that police confiscated about 300 books for examination, but all were eventually returned except for eight, including the coloring book, which they said had been sent for review and was not on sale.

After the men appeared in Jerusalem Magistrates Court on Monday, the charges against them were downgraded to a public order offense, but they were ordered to spend another 24 hours in detention, followed by five days of house arrest.

Their arrest sparked international condemnation, with journalists and diplomats closely following the case. In Israel, the incident also drew criticism, with journalist Noa Simone calling the raid a “fascist act” that “evokes frightening historical associations with which every Jew is very familiar.”

Recalling their time in detention, the booksellers described the conditions as “simply unfit for a human to live in.” They said they were held in overcrowded, windowless cells without heating, forced to sleep on mats on a concrete floor in near-freezing temperatures — treatment they likened to psychological torture.

While their experience was harsh, they acknowledged that their situation could have been far worse without international attention and support.

“If we were not working in a bookstore with an international outreach with good international connections, what would have happened?” Mahmoud asked. “Probably the case would have been manipulated against us.”

He also warned of the broader implications of their arrest. “The question is how far are they going to go? If they’re attacking Palestinian bookstores now, they will be attacking Israeli bookstores next.”


Bristling at ‘Gulf of Mexico’ name change on maps, Mexico threatens to sue Google

Updated 14 February 2025

  • After assuming office as US president, Donald Trump declared that he was changing the name Gulf of Mexico to Gulf of America
  • Mexican President Claudia Sheinbaum said the name Gulf of Mexico dates back to 1607 and is recognized by the United Nations
  • Google has said that it maintains a “long-standing practice of applying name changes when they have been updated in official government sources”

MEXICO CITY: Mexican President Claudia Sheinbaum said Thursday that her government wouldn’t rule out filing a civil lawsuit against Google if it maintains its stance of calling the stretch of sea between northeastern Mexico and the southeastern United States the “Gulf of America.”
The area, long named the Gulf of Mexico across the world, has gained a geopolitical spotlight after President Donald Trump declared he would change the Gulf’s name.
Sheinbaum, in her morning news conference, said the president’s decree is restricted to the “continental shelf of the United States” because Mexico still controls much of the Gulf. “We have sovereignty over our continental shelf,” she said.
Sheinbaum said that despite the fact that her government sent a letter to Google saying that the company was “wrong” and that “the entire Gulf of Mexico cannot be called the Gulf of America,” the company has insisted on maintaining the nomenclature.
It was not immediately clear where such a suit would be filed.
Google reported last month on its X account, formerly Twitter, that it maintains a “long-standing practice of applying name changes when they have been updated in official government sources.”
As of Thursday, how the Gulf appeared on Google Maps depended on the user’s location and other data. For users in the United States, the body of water appeared as Gulf of America. For users physically in Mexico, it appeared as the Gulf of Mexico. In many other countries across the world, it appeared as “Gulf of Mexico (Gulf of America).”
Sheinbaum has repeatedly defended the name Gulf of Mexico, saying its use dates to 1607 and is recognized by the United Nations.
She has also mentioned that, according to the constitution of Apatzingán, the antecedent to Mexico’s first constitution, the North American territory was previously identified as “Mexican America”. Sheinbaum has used the example to poke fun at Trump and underscore the international implications of changing the Gulf’s name.
In that vein, Sheinbaum said on Thursday that the Mexican government would ask Google to make “Mexican America” pop up on the map when searched.
This is not the first time Mexicans and Americans have disagreed on the names of key geographic areas, such as the border river between Texas and the Mexican states of Chihuahua, Coahuila, Nuevo León and Tamaulipas. Mexico calls it Rio Bravo and for the United States it is the Rio Grande.
This week, the White House barred Associated Press reporters from several events, including some in the Oval Office, saying it was because of the news agency’s policy on the name. AP is using “Gulf of Mexico” while also acknowledging Trump’s renaming of it, to ensure that names of geographical features are recognizable around the world.

 


124 journalists killed, most by Israel, in deadliest year for reporters

Updated 13 February 2025

  • The uptick in killings marks a 22 percent increase over 2023
  • Journalists murdered across 18 different countries, including Palestine’s Gaza, Sudan and Pakistan

NEW YORK: Last year was the deadliest for journalists in recent history, with at least 124 reporters killed — and Israel responsible for nearly 70 percent of that total, the Committee to Protect Journalists reported Wednesday.
The uptick in killings, which marks a 22 percent increase over 2023, reflects “surging levels of international conflict, political unrest and criminality worldwide,” the CPJ said.
It was the deadliest year for reporters and media workers since CPJ began keeping records more than three decades ago, with journalists murdered across 18 different countries, it said.
A total of 85 journalists died in the Israeli-Hamas war, “all at the hands of the Israeli military,” the CPJ said, adding that 82 of them were Palestinians.
Sudan and Pakistan recorded the second-highest number of journalists and media workers killed, with six each.
In Mexico, which has a reputation as one of the most dangerous countries for reporters, five were killed, with CPJ reporting it had found “persistent flaws” in Mexico’s mechanisms for protecting journalists.
And in Haiti, where two reporters were murdered, widespread violence and political instability have sown so much chaos that “gangs now openly claim responsibility for journalist killings,” the report said.
Other deaths took place in countries such as Myanmar, Mozambique, India and Iraq.
“Today is the most dangerous time to be a journalist in CPJ’s history,” said the group’s CEO Jodie Ginsberg.
“The war in Gaza is unprecedented in its impact on journalists and demonstrates a major deterioration in global norms on protecting journalists,” she said.
CPJ, which has kept records on journalist killings since 1992, said that in 2024, 24 of the reporters were deliberately killed because of their work.
Freelancers, the report said, were among the most vulnerable because of their lack of resources, and accounted for 43 of the killings in 2024.
The year 2025 is not looking more promising, with six journalists already killed in the first weeks of the year, CPJ said.


Roblox CEO announces Arabic version at World Governments Summit

Updated 12 February 2025

DUBAI: Roblox CEO David Baszucki announced an Arabic version of the hit game platform during the World Governments Summit on Wednesday.

Baszucki said that the new feature enabled Arabic-speaking creators to reach audiences instantly all over the world.

Through the move, everything on the platform will be available in Arabic.

“Today, we launched worldwide in Arabic, everything on Roblox: Roblox Studio, the Roblox app, automatic translation. Anyone who’s building a Roblox experience in Arabic, it will automatically translate into languages around the world,” he said.

Roblox, an online game platform and game creation system, has more than 88.9 million daily active users.

Many brands use the platform to promote their products, from cosmetics to high-end luxury goods.

“Brands are using our platforms to build 3D experiences to help promote their brands — everything from e.l.f. Beauty to Lamborghini,” he added.

“We have been growing consistently for 18 years now, over 20 percent year on year.”

In the past, the gaming platform faced criticism over safety concerns regarding children on the platform. In 2018, it was banned for several years in the UAE for exposing children to swearing, violence and sexually explicit content.

Baszucki said that child safety is a major concern for the company and that Roblox is utilizing AI technology to ensure a safe gaming experience for users.

“AI is getting so good and evolving so quickly. We have over 200 AI systems on Roblox. We are clear that we are looking at everything on the platform for safety and stability. We are so into the notion of online safety — it’s a top priority,” he said.