How Gulf-developed large language models like Jais are bringing Arabic into the AI mainstream

As Gulf states aim to become AI leaders by investing in R&D and startups (Supplied/MBZUAI)
Updated 09 October 2023

  • ChatGPT understands inquiries in Arabic, but answers can sound unnatural or fail to convey the right message
  • Now homegrown LLMs can capture linguistic nuances and even comprehend dialects and cultural references

DUBAI: When ChatGPT made its debut last year, the artificial intelligence program caused a global sensation, as users found themselves communicating with a machine that could pass as another human being.

However, the enthusiasm among techies in the Arab world was somewhat diminished by ChatGPT’s limited grasp of Arabic, in part the result of the language’s complexity, diacritical markings, inflection system and regional dialects.

Although ChatGPT, which is based on a large language model, or LLM, can understand inquiries in Arabic and is able to translate, especially when using Modern Standard Arabic, answers can come across as unnatural, while literal translations do not always convey the right message.

That is why Jais, an LLM designed to support Arabic, was unveiled in July, bringing one of the world’s most widely spoken, though occasionally overlooked, languages into the AI mainstream.

Jais, a name that recalls the UAE’s highest peak in Ras Al-Khaimah, is the brainchild of a team of academics and engineers who embarked on the project because they felt too few LLMs were credibly multilingual.




The Ameca humanoid robot greets visitors at Dubai's Museum of the Future. (AFP)

Downloadable on the machine learning platform Hugging Face, Jais is the result of a collaboration between Cerebras Systems, Mohamed bin Zayed University of Artificial Intelligence, or MBZUAI, and a subsidiary of the Abu Dhabi-based G42 called Inception.

“It is vital that large language models are developed for languages other than English to ensure that innovation is accessible to everyone,” Andy Jackson, CEO of Inception, told Arab News.

“A quality Arabic LLM is critical for all sectors, businesses and organizations, as well as individuals. Innovation thrives when we collaborate, and Jais sets a new standard for AI advancement in the Middle East, ensuring that the Arabic language, with its depth and heritage, finds its voice within the AI landscape.

“Jais demonstrates our commitment to excellence, and our dedication to democratizing AI and promoting innovation.”

LLMs are machine learning models that use deep learning algorithms to process and understand natural human language. They are trained on large amounts of text data to learn patterns in the language.

These programs, which are rapidly proliferating in the wake of ChatGPT’s success, are capable of generating text on a seemingly endless array of subjects, producing everything from academic papers to poetry.

What is especially impressive about them is their ability to create convincingly human-like responses to questions in almost any language, including programming languages.

But in order to make those languages sound convincing, native-speaking human programmers are often required to provide a critical layer of context and understanding that can enhance accuracy and reliability.

“Jais is purpose-built for the Arabic language and excels in capturing its intricacies and nuances, ensuring highly accurate and contextually relevant responses — a distinct advantage over general-purpose models,” said Jackson.




AI programs that are responsive to the Arabic language could widen access to a transformational new technology. (MBZUAI)

“This specialization is a pivotal development, opening up opportunities for governments, industries, and individuals across the Arab world to tap into the potential of generative AI.”

Currently considered among the foremost Arabic LLMs, Jais is a 13-billion-parameter model trained on a newly developed 395-billion-token dataset made up of 116 billion Arabic tokens and 279 billion English tokens. Training ran on Condor Galaxy, one of the largest cloud AI supercomputers in the world, launched by G42 and Cerebras in July.
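For readers who want to check the dataset figures, the short Python sketch below adds up the token counts reported above and works out the share of each language. It is illustrative arithmetic only: the function name `language_shares` and the variable names are hypothetical, and nothing here reflects how the Jais team processed its data.

```python
# Token counts as reported in the article, in billions.
ARABIC_TOKENS_B = 116
ENGLISH_TOKENS_B = 279
REPORTED_TOTAL_B = 395


def language_shares(arabic_b, english_b):
    """Return the combined token count and each language's share of it."""
    total = arabic_b + english_b
    return {
        "total_b": total,
        "arabic_share": arabic_b / total,
        "english_share": english_b / total,
    }


shares = language_shares(ARABIC_TOKENS_B, ENGLISH_TOKENS_B)

# The two per-language counts should add up to the reported total.
assert shares["total_b"] == REPORTED_TOTAL_B

print(f"Total tokens: {shares['total_b']} billion")
print(f"Arabic share: {shares['arabic_share']:.1%}")
print(f"English share: {shares['english_share']:.1%}")
```

On these figures, Arabic makes up a little under a third of the training data, with English accounting for the rest.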

“Jais was born in Abu Dhabi and offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI,” Preslav Nakov, professor and deputy department chair of Natural Language Processing at MBZUAI, told Arab News.

“It will facilitate and expedite innovation, highlighting Abu Dhabi’s leading position as a hub for AI, innovation, culture preservation and international collaboration.”

As an open-source model, Jais is expected to engage scientists, academics and developers to accelerate the growth of an Arabic language AI ecosystem. It could also serve as a model for other languages now underrepresented in mainstream AI.

FASTFACTS

• Large language models, or LLMs, are a type of AI that can mimic human intelligence.

• Arabic is spoken by 400m people, but accounts for 1 percent of total global online content.

• Jais was created by Cerebras, MBZUAI, and a subsidiary of G42 called Inception.

“Jais outperforms existing Arabic models by a sizable margin,” said Nakov. “It is also competitive with English models of similar size despite being trained on significantly less English data.

“This exciting result shows that the model’s English component learned from the Arabic data and vice versa, opening a new era in LLM development and training.”

In Jais’s development, significant attention was devoted to pre-processing Arabic text, enhancing support for the language’s unique features, including its writing style and word order.

Jais also maintains a balanced Arabic-English dataset for optimal performance, offering a marked improvement over models trained on limited Arabic text.

Its developers say Jais, unlike other models, captures linguistic nuances and even comprehends various Arabic dialects and cultural references.

“Jais facilitates faster customization for specific Arabic-focused use cases and addresses data ownership concerns by being based in the UAE, offering a reassuring solution for local enterprises,” said Inception CEO Jackson.




LLMs are functional machine learning models that use deep learning algorithms to process and understand natural human language. (Supplied)

The UAE’s Ministry of Foreign Affairs and Ministry of Industry and Advanced Technology, Abu Dhabi’s National Oil Company and Department of Health, Etihad Airways, First Abu Dhabi Bank, and global technology group e& are planning to utilize Jais, offering valuable insights to enhance the model and its applications across their industries.

Given the strong digital transformation efforts by several of the Arab Gulf governments, accompanied by huge investments in high-tech industries and homegrown tech startups, AI programs that are responsive to the Arabic language could widen access to a transformational new technology and challenge the monopoly of a clutch of Silicon Valley companies.

Last month, Technology Innovation Institute, an Emirati research center in Abu Dhabi, released Falcon 180b, an open-source AI model. Established in 2020, TII released Falcon 40b, the first version of its flagship open-source AI model, in May this year, after unveiling Noor, an Arabic-based AI model, last year.

According to a report in The Economist magazine, TII is the applied-research arm of the Advanced Technology Research Council, a government agency that employs an 800-strong multinational staff working on subjects from biotechnology and robotics to quantum computing.

“We are entering the game to disrupt the core players,” Faisal Al-Bannai, secretary-general of the ATRC, told The Economist, adding that TII will build new proprietary models and applications catering for specific fields such as medicine and law.

For its part, Saudi Arabia launched its National Strategy for Data and Artificial Intelligence in October 2020, aiming to become a global leader in the field as it seeks to attract $20 billion in foreign and local investments by 2030.

The Kingdom is also determined to future-proof its workforce, initially by training and developing a pool of 20,000 AI and data specialists. In May this year, Deloitte’s AI Institute was officially launched at the Experience Analytics conference in Riyadh.

Just last week Saudi Arabia launched a National Olympiad for Programming and Artificial Intelligence open to all middle- and high-school pupils. An estimated 300,000 students will be selected from 3 million participants for training in programming and AI, according to media reports.




The hope is that the advent of AI and the automation of rapid translation will be a game changer for Arabic content. (LEAP)

The initiative is a collaboration between the Saudi Data and Artificial Intelligence Authority, the Ministry of Education, and the King Abdulaziz and His Companions Foundation for Giftedness and Creativity (Mawhiba).

Saudi Arabia’s adoption of digitalization and emerging technologies is forecast to contribute about 2.4 percent to its gross domestic product by 2030, according to a recent report by global consultancy firm PwC.

In terms of average annual growth in the contribution of AI by region, Saudi Arabia is expected to grab a 31.3 percent share in the technology’s expansion between 2018 and 2030, the PwC report added.

“AI is developing rapidly, and its impact will be felt more and more across all sectors and areas of life,” said MBZUAI’s Nakov. “In this context, it is vital that the Arab world has access to an advanced LLM that can be adapted and utilized across all sectors.

“The rapid advancement of AI means that organizations that fail to adapt and start using AI sooner rather than later will be left behind, which makes it even more essential for the Arab world to have access to quality LLMs.”

Beyond its business applications, however, a crucial aspect of a program such as Jais is its ability to champion neglected languages, preserve them in a fast-changing economy, and promote digital inclusivity.

Although Arabic is an official language in 22 countries and is partly spoken in 11 others, it accounts for just 1 percent of total global online content, according to Jais’s creators. The hope is that the advent of AI and the automation of rapid translation will be a game changer.

By placing the language at the forefront of the AI revolution, Jais and its successors could help to maintain Arabic’s global prominence and its distinctive cultural significance in the digital age.


Israeli strikes batter Lebanon, killing five medics

Updated 22 November 2024

  • Israel has pushed on with its intense military campaign against Hezbollah, tempering hopes that efforts by a US envoy could lead to an imminent ceasefire
  • Hezbollah said it had fired rockets at Israeli troops east of Khiyam at least four times on Friday

BEIRUT: Israeli strikes battered southern Lebanon and the outskirts of the capital Beirut on Friday, killing at least five medics, as ground troops clashed with Hezbollah fighters in the south.
Israel has pushed on with its intense military campaign against the Iran-backed armed group Hezbollah, tempering hopes that efforts by a US envoy could lead to an imminent ceasefire.
US mediator Amos Hochstein said earlier this week in Beirut that a truce was “within our grasp.” He traveled on to meet Israeli Prime Minister Benjamin Netanyahu and Defense Minister Israel Katz before returning to Washington, according to the news outlet Axios.
His trip aimed to end more than a year of hostilities between Israel and Hezbollah along Lebanon’s southern border, which escalated dramatically when Israel ramped up its strikes in late September and sent ground troops into Lebanon on Oct. 1.
Israeli troops have fought Hezbollah in a strip of towns all along the border and this week pushed deeper to the edges of Khiyam, a town some six km (four miles) from the border. Hezbollah said it had fired rockets at Israeli troops east of Khiyam at least four times on Friday.
Lebanese security sources told Reuters that Israeli troops had also advanced in a string of villages to the west. They said Israel was most likely trying to isolate Khiyam ahead of a major attack on the town.
Israeli strikes on two other villages in southern Lebanon killed a total of five medics from a rescue force affiliated with Hezbollah, the Lebanese health ministry said.
The more than 3,500 people killed by Israeli strikes over the last year include more than 200 medics, the health ministry said.
Israel says its aim is to secure the return home of tens of thousands of people evacuated from Israel’s north due to rocket attacks by Hezbollah, which began firing across the border in support of Hamas at the start of the Gaza war in October 2023.
Israel also mounted more strikes on Beirut’s southern suburbs, a once densely populated stronghold of Hezbollah.
It issued evacuation orders on the social media platform X for several buildings in the area on Friday. Reuters footage showed one of the strikes appearing to pierce the center of a multi-story building, sending the whole structure toppling in a massive cloud of smoke.


UN reports heavy clashes between Israeli troops and Hezbollah in south Lebanon

Updated 22 November 2024

  • “We are aware of heavy shelling in the vicinity of our bases,” UNIFIL spokesman Andrea Tenenti said
  • Asked if the peacekeepers and staff at the headquarters are safe, Tenenti said: “Yes for the moment”

BEIRUT: Israeli troops fought fierce battles with Hezbollah fighters on Friday in different areas in south Lebanon, including a coastal town that is home to the headquarters of UN peacekeepers.
A spokesman for the UN peacekeeping force known as UNIFIL told The Associated Press that they are monitoring “heavy clashes” in the coastal town of Naqoura and the village of Chamaa to the northeast.
UNIFIL’s headquarters are located in Naqoura, on Lebanon’s southern edge close to the border with Israel.
“We are aware of heavy shelling in the vicinity of our bases,” UNIFIL spokesman Andrea Tenenti said. Asked if the peacekeepers and staff at the headquarters are safe, Tenenti said: “Yes for the moment.”
Several UNIFIL posts have been hit since Israel began its ground invasion of Lebanon on Oct. 1, leaving a number of peacekeepers wounded.
The fighting came a day after the International Criminal Court issued arrest warrants for Israeli Prime Minister Benjamin Netanyahu, his former defense minister and a Hamas military leader, accusing them of war crimes and crimes against humanity over their 13-month war in Gaza and the October 2023 attack on Israel respectively.
The warrant marked the first time that a sitting leader of a major Western ally has been accused of war crimes and crimes against humanity by a global court of justice.
Israel’s war has caused heavy destruction across Gaza, decimated parts of the territory and driven almost the entire population of 2.3 million people from their homes, leaving most dependent on aid to survive.
Israel launched its war in Gaza after Hamas-led militants stormed into southern Israel on Oct. 7, 2023, killing some 1,200 people, mostly civilians, and abducting another 250. Around 100 hostages are still inside Gaza, at least a third of whom are believed to be dead.
Israel has also launched airstrikes against Lebanon after the Hezbollah militant group began firing rockets, drones and missiles into Israel the day after Hamas’ attack last October. A full-blown war erupted in September after nearly a year of lower-level conflict.


Gaza ministry: hospitals to cut or stop services ‘within 48 hours’ over fuel shortages

Updated 22 November 2024

  • All hospitals in Gaza would have to stop or reduce services “within 48 hours”

GAZA: The Hamas government’s health ministry warned Friday all hospitals in Gaza would have to stop or reduce services “within 48 hours” for lack of fuel, blaming Israel for blocking its entry.
“We raise an urgent warning as all hospitals in Gaza Strip will stop working or reduce their services within 48 hours due to the occupation’s (Israel’s) obstruction of fuel entry,” Marwan Al-Hams, director of Gaza’s field hospitals, said during a press conference.


Israel says to end ‘administrative detention’ for West Bank settlers

Updated 22 November 2024

  • Practice allows for detainees to be held for long periods without being charged or appearing in court
  • The Palestinian Prisoners Club advocacy group said in August that 3,432 Palestinians were held in administrative detention

JERUSALEM: Israeli authorities will stop holding Jewish settlers in the occupied West Bank under administrative detention, or incarceration without trial, the defense ministry announced Friday.
The practice allows for detainees to be held for long periods without being charged or appearing in court, and is often used against Palestinians whom Israel deems security threats.
Defense Minister Israel Katz said it was “inappropriate” for Israel to employ administrative detention against settlers who “face severe Palestinian terror threats and unjustified international sanctions.”
But, according to settlement watchdog Peace Now, it is one of only a few effective tools that Israeli authorities have to prevent settler attacks against Palestinians, which have surged in the West Bank over the past year.
Katz said in a statement issued by his office that prosecution or “other preventive measures” would be used to deal with criminal acts in the West Bank.
B’Tselem, an Israeli rights group, said authorities use administrative detention “extensively and routinely” to hold thousands of Palestinians for lengthy periods of time.
The Palestinian Prisoners Club advocacy group said in August that 3,432 Palestinians were held in administrative detention.
Israeli daily Haaretz reported on Friday that eight settlers were held under the same practice in November.
Yonatan Mizrahi, director of settlement watch for Peace Now, said that although administrative detention was mostly used in the West Bank to detain Palestinians, it was one of the few effective tools for temporarily removing the threat of settler violence through detention.
“The cancelation of administrative detention orders for settlers alone is a cynical... move that whitewashes and normalizes escalating Jewish terrorism under the cover of war,” the group said in a statement, referring to a spike in settler attacks throughout the Israel-Hamas conflict over the past 13 months.
Western governments, including Israel’s ally and military backer the United States, have recently imposed sanctions on Israeli settlers and settler organizations over ties to violence against Palestinians.
On Monday, US authorities announced sanctions against Amana, a movement that backs settlement development, and others who have “ties to violent actors in the West Bank.”
“Amana is a key part of the Israeli extremist settlement movement and maintains ties to various persons previously sanctioned by the US government and its partners for perpetrating violence in the West Bank,” the US Treasury said.
Excluding Israeli-annexed east Jerusalem, the West Bank — which Israel has occupied since 1967 — is home to three million Palestinians as well as about 490,000 Israelis living in settlements that are illegal under international law.


UK would arrest Netanyahu over ICC warrant: Senior politician 

Updated 22 November 2024

  • Emily Thornberry: Britain has ‘obligation under Rome Convention’ to arrest Israeli PM if he enters country 
  • Court: ‘Reasonable grounds to believe’ Netanyahu responsible for war crimes, crimes against humanity in Gaza

LONDON: The UK will arrest Israeli Prime Minister Benjamin Netanyahu if he enters the country, a senior British politician has said.

The International Criminal Court issued an arrest warrant for Netanyahu on Thursday for alleged war crimes and crimes against humanity, alongside his former Defense Minister Yoav Gallant, pertaining to the Gaza war.

Emily Thornberry — Labour chair of the foreign affairs committee, and former shadow foreign secretary and shadow attorney general — told Sky News: “If Netanyahu comes to Britain, our obligation under the Rome Convention would be to arrest him under the warrant from the ICC.

“(It is) not really a question of should — we are required to, because we are members of the ICC.”

UK Home Secretary Yvette Cooper has refused to be drawn on whether Netanyahu would be arrested if he set foot on British soil, saying it “wouldn’t be appropriate for me to comment.”

She told Sky: “We’ve always respected the importance of international law, but in the majority of the cases that they pursue, they don’t become part of the British legal process.

“What I can say is that obviously, the UK government’s position remains that we believe the focus should be on getting a ceasefire in Gaza.”

Netanyahu’s arrest warrant is the first to be issued against the premier of a major Western ally by an international court for alleged war crimes and crimes against humanity.

His office denounced the warrant as “anti-Semitic,” adding that Israel “rejects with disgust the absurd and false actions.” Israel is not an ICC member and rejects the court’s jurisdiction.

US President Joe Biden called the warrants against Netanyahu and Gallant “outrageous,” adding: “Whatever the ICC might imply, there is no equivalence — none — between Israel and Hamas.”

Hungarian Prime Minister Viktor Orban said he plans to invite Netanyahu to visit Budapest, adding that the arrest warrant will “not be observed” by his government.

The Italian and French governments, however, have indicated that Netanyahu will be arrested if he visits either country.

The ICC said on Thursday it has “reasonable grounds to believe” that Netanyahu and Gallant “bear criminal responsibility” for “the war crime of starvation as a method of warfare; and the crimes against humanity of murder, persecution, and other inhumane acts.”

The court also issued a warrant for Hamas commander Mohammed Diab Ibrahim Al-Masri for alleged war crimes and crimes against humanity.

Israel says Al-Masri, believed to have been the mastermind behind the Hamas attack of Oct. 7, 2023, was killed in Gaza earlier this year.

The ICC said it issued the warrant for his arrest because of insufficient evidence to prove his death.