How Gulf-developed large language models like Jais are bringing Arabic into the AI mainstream

As Gulf states aim to become AI leaders by investing in R&D and startups (Supplied/MBZUAI)
Updated 09 October 2023

  • ChatGPT understands inquiries in Arabic, but answers can sound unnatural or fail to convey the right message
  • Now homegrown LLMs can capture linguistic nuances and even comprehend dialects and cultural references

DUBAI: When ChatGPT made its debut last year, the artificial intelligence program caused a global sensation, as users found themselves communicating with a machine that could pass as another human being.

However, the enthusiasm among techies in the Arab world was somewhat diminished by ChatGPT’s limited grasp of Arabic, in part the result of the language’s complexity, diacritical markings, inflection system and regional dialects.

Although ChatGPT, which is based on a large language model, or LLM, can understand inquiries in Arabic and is able to translate, especially when using Modern Standard Arabic, answers can come across as unnatural, while literal translations do not always convey the right message.

That is why Jais, an LLM designed to support Arabic, was unveiled in July, bringing one of the world’s most widely spoken, though occasionally overlooked, languages into the AI mainstream.

Jais, a name that recalls the UAE’s highest peak in Ras Al-Khaimah, is the brainchild of a team of academics and engineers who embarked on the project because they felt too few LLMs were credibly multilingual.




The Ameca humanoid robot greets visitors at Dubai's Museum of the Future. (AFP)

Downloadable on the machine learning platform Hugging Face, Jais is the result of a collaboration between Cerebras Systems, Mohamed bin Zayed University of Artificial Intelligence, or MBZUAI, and a subsidiary of the Abu Dhabi-based G42 called Inception.

“It is vital that large language models are developed for languages other than English to ensure that innovation is accessible to everyone,” Andy Jackson, CEO of Inception, told Arab News.

“A quality Arabic LLM is critical for all sectors, businesses and organizations, as well as individuals. Innovation thrives when we collaborate, and Jais sets a new standard for AI advancement in the Middle East, ensuring that the Arabic language, with its depth and heritage, finds its voice within the AI landscape.

“Jais demonstrates our commitment to excellence, and our dedication to democratizing AI and promoting innovation.”

LLMs are machine learning models that use deep learning algorithms to process and understand natural human language. They are trained on large amounts of text data to learn patterns in the language.

These programs, which are rapidly proliferating in the wake of ChatGPT’s success, are capable of generating text on a seemingly endless array of subjects, producing everything from academic papers to poetry.

What is especially impressive is their ability to produce convincingly human-like responses to questions in almost any language, including programming languages.

But in order to make those languages sound convincing, native-speaking human programmers are often required to provide a critical layer of context and understanding that can enhance accuracy and reliability.

“Jais is purpose-built for the Arabic language and excels in capturing its intricacies and nuances, ensuring highly accurate and contextually relevant responses — a distinct advantage over general-purpose models,” said Jackson.




AI programs that are responsive to the Arabic language could widen access to a transformational new technology. (MBZUAI)

“This specialization is a pivotal development, opening up opportunities for governments, industries, and individuals across the Arab world to tap into the potential of generative AI.”

Currently considered among the foremost Arabic LLMs, Jais is a 13-billion-parameter model trained on a newly developed 395-billion-token dataset made up of 116 billion Arabic tokens and 279 billion English tokens. It was trained on Condor Galaxy, one of the largest cloud AI supercomputers in the world, which G42 and Cerebras launched in July.

“Jais was born in Abu Dhabi and offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI,” Preslav Nakov, professor and deputy department chair of Natural Language Processing at MBZUAI, told Arab News.

“It will facilitate and expedite innovation, highlighting Abu Dhabi’s leading position as a hub for AI, innovation, culture preservation and international collaboration.”

As an open-source model, Jais is expected to engage scientists, academics and developers to accelerate the growth of an Arabic-language AI ecosystem. It could also serve as a model for other languages now underrepresented in mainstream AI.

FASTFACTS

• Large language models, or LLMs, are a type of AI that can mimic human intelligence.

• Arabic is spoken by 400m people, but accounts for 1 percent of total global online content.

• Jais was created by Cerebras, MBZUAI, and a subsidiary of G42 called Inception.

“Jais outperforms existing Arabic models by a sizable margin,” said Nakov. “It is also competitive with English models of similar size despite being trained on significantly less English data.

“This exciting result shows that the model’s English component learned from the Arabic data and vice versa, opening a new era in LLM development and training.”

In Jais’s development, significant attention was devoted to pre-processing Arabic text, enhancing support for the language’s unique features, including its writing style and word order.

Jais also keeps a balance between Arabic and English in its training data for optimal performance, offering a marked improvement over models with a limited Arabic text presence.

Its developers say Jais, unlike other models, captures linguistic nuances and even comprehends various Arabic dialects and cultural references.

“Jais facilitates faster customization for specific Arabic-focused use cases and addresses data ownership concerns by being based in the UAE, offering a reassuring solution for local enterprises,” said Inception CEO Jackson.




LLMs are functional machine learning models that use deep learning algorithms to process and understand natural human language. (Supplied)

The UAE’s Ministry of Foreign Affairs and Ministry of Industry and Advanced Technology, Abu Dhabi’s National Oil Company and Department of Health, Etihad Airways, First Abu Dhabi Bank, and global technology group e& are planning to utilize Jais, offering valuable insights to enhance the model and its applications across their industries.

Given the strong digital transformation efforts by several of the Arab Gulf governments, accompanied by huge investments in high-tech industries and homegrown tech startups, AI programs that are responsive to the Arabic language could widen access to a transformational new technology and challenge the monopoly of a clutch of Silicon Valley companies.

Last month, Technology Innovation Institute, an Emirati research center in Abu Dhabi, released Falcon 180b, an open-source AI model. Established in 2020, TII released Falcon 40b, the first version of its flagship open-source AI model, in May this year, after unveiling Noor, an Arabic-based AI model, last year.

According to a report in The Economist magazine, TII is the applied-research arm of the Advanced Technology Research Council, a government agency that employs an 800-strong multinational staff working on subjects from biotechnology and robotics to quantum computing.

“We are entering the game to disrupt the core players,” Faisal Al-Bannai, secretary-general of the ATRC, told The Economist, adding that TII will build new proprietary models and applications catering for specific fields such as medicine and law.

For its part, Saudi Arabia launched its National Strategy for Data and Artificial Intelligence in October 2020, aiming to become a global leader in the field as it seeks to attract $20 billion in foreign and local investments by 2030.

The Kingdom is also determined to future-proof its workforce, initially by training and developing a pool of 20,000 AI and data specialists. In May this year, Deloitte’s AI Institute was officially launched at the Experience Analytics conference in Riyadh.

Just last week Saudi Arabia launched a National Olympiad for Programming and Artificial Intelligence open to all middle- and high-school pupils. An estimated 300,000 students will be selected from 3 million participants for training in programming and AI, according to media reports.




The hope is that the advent of AI and the automation of rapid translation will be a game changer for Arabic content. (LEAP)

The initiative is a collaboration between the Saudi Data and Artificial Intelligence Authority, the Ministry of Education, and the King Abdulaziz and His Companions Foundation for Giftedness and Creativity (Mawhiba).

Saudi Arabia’s adoption of digitalization and emerging technologies is forecast to contribute about 2.4 percent to its gross domestic product by 2030, according to a recent report by global consultancy firm PwC.

In terms of average annual growth in the contribution of AI by region, Saudi Arabia is expected to grab a 31.3 percent share in the technology’s expansion between 2018 and 2030, the PwC report added.

“AI is developing rapidly, and its impact will be felt more and more across all sectors and areas of life,” said MBZUAI’s Nakov. “In this context, it is vital that the Arab world has access to an advanced LLM that can be adapted and utilized across all sectors.

“The rapid advancement of AI means that organizations that fail to adapt and start using AI sooner rather than later will be left behind, which makes it even more essential for the Arab world to have access to quality LLMs.”

Beyond its business applications, however, a crucial aspect of a program such as Jais is its ability to champion neglected languages, preserve them in a fast-changing economy, and promote digital inclusivity.

Although Arabic is an official language in 22 countries and is partly spoken in 11 others, it accounts for just 1 percent of total global online content, according to Jais’s creators. The hope is that the advent of AI and the automation of rapid translation will be a game changer.

By placing the language at the forefront of the AI revolution, Jais and its successors could help to maintain Arabic’s global prominence and its distinctive cultural significance in the digital age.


Iran FM warns against ‘destructive interference’ in Syria’s future

Updated 6 sec ago

  • Abbas Araghchi: Iran ‘considers the decision-making about the future of Syria to be the sole responsibility of the people... without destructive interference or foreign imposition’
BEIJING: Iran’s top diplomat warned Friday against “destructive interference” in Syria’s future and said decisions should lie solely with the country’s people, writing in Chinese state media as he visited Beijing.
Abbas Araghchi touched down in the Chinese capital on Friday afternoon, Iranian state media reported, to begin his first official visit to the country since being appointed foreign minister.
China and Iran were both supporters of ousted Syrian president Bashar Assad.
Assad fled Syria this month after an Islamist-led offensive wrested city after city from his control, with the capital Damascus falling on December 8.
Iran “considers the decision-making about the future of Syria to be the sole responsibility of the people... without destructive interference or foreign imposition,” Araghchi wrote in a Chinese-language article in People’s Daily published on Friday.
He also emphasized Iran’s respect for Syria’s “unity, national sovereignty and territorial integrity.”
Iran’s supreme leader – a key backer of Assad’s administration – predicted on Sunday “the emergence of a strong, honorable group” that would stand against “insecurity” in Syria.
Ayatollah Ali Khamenei said Syria’s young men would “stand with strength and determination against those who have designed this insecurity and those who have implemented it, and God willing, he will overcome them.”
In People’s Daily, Araghchi said supporting the Syrian people was a “definite principle (that) should be taken into consideration by all the actors.”
Beijing had also built strong ties with Assad – he met President Xi Jinping in China last year, where the two leaders announced a “strategic partnership.”
China has affirmed its support for the Syrian people and has said it opposes terrorist forces taking advantage of the situation to create chaos.
Araghchi’s two-day visit will include talks with his Chinese counterpart Wang Yi, according to Iran’s foreign ministry.
China is Iran’s largest trade partner, and a top buyer of its sanctioned oil.
Xi pledged in October to increase ties with Iran during talks with his counterpart Masoud Pezeshkian in Russia on the sidelines of a BRICS summit.
Araghchi told reporters in a video published by Iranian state media as he arrived in Beijing that the visit was taking place “at a very suitable time.”
“Now it is natural that there are sensitive situations, both the region has various tensions, and there are various issues at the international level, also our nuclear issue in the new year will face a situation that needs more consultations,” he said.
“The invitation of our Chinese friends was for this reason, that at the beginning of the new year... we should think together, consult and be ready for the challenges that will come.”
He wrote in his editorial that Iran and China shared the “common view” that calling for an immediate ceasefire in Gaza was the biggest priority in the Middle East.

Lebanese university students launch donation campaign to aid war-displaced families

Updated 40 min 55 sec ago

  • ‘Hardship of war should never be faced alone,’ says student Nour Farchoukh
  • More than 1,000 families benefit from food and clothing donations

DUBAI: Three American University of Beirut students have launched a donation campaign to support families across Lebanon displaced by the 13-month war with Israel.

Titled “Hope for our Lebanon,” the campaign distributes food supplies, sanitary boxes, and clothes in collaboration with the charity organization Wahad Activism.

Nour Farchoukh, Celine Ghandour, and Kian Azad told Arab News that they provide the aid based on the needs of each family.

“We put snacks or diapers if there are children. We also ask if they need clothes,” said Ghandour, adding that the group depends on people’s in-kind donations.

So far, the donation campaign has reached more than 1,000 families in Baabda, Beirut, Chouf, Batroun, Barouk, and Hazmieh among other areas.

Israel stepped up its military campaign in south Lebanon in late September after nearly a year of cross-border exchanges launched by Hezbollah in retaliation for the war on Gaza.

Over 13 months, the war killed more than 4,000 people across Lebanon, injured over 16,600 people, and displaced 1 million people, according to the latest figures of the Lebanese health ministry.

On Nov. 27, a 60-day ceasefire agreement, brokered by the US and France, was signed between Hezbollah and Israel.

Azad said the campaign was still running after the ceasefire, with clothes donations being distributed to orphanages.

“We know that no matter how small the number of families we help, it will still make a difference,” he added.

“Every volunteer and every donation help rebuild Lebanon bit by bit. The hardship of war should never be faced alone,” Farchoukh said.

The three students have invited the community to take part in the initiative through donations or volunteering.


Israeli forces raid north Gaza hospital, health ministry says contact with staff lost

A woman and children react at the site of an Israeli strike in a residential area in the Tuffah neighbourhood, east of Gaza City
Updated 50 min 25 sec ago

  • Kamal Adwan Hospital is one of only three medical facilities on the northern edge of the Gaza Strip
  • Israeli forces order dozens of patients and hundreds of others to evacuate the compound

CAIRO/JERUSALEM: Israeli forces raided the Kamal Adwan Hospital, one of only three medical facilities on the northern edge of the Gaza Strip, on Friday, ordering dozens of patients and hundreds of others to evacuate the compound, officials said.

In separate incidents across Gaza, Israeli strikes killed at least 25 people, medics said. One of those strikes on a house in Gaza City killed 15 people, medics and the civil emergency service said.

The Palestinian health ministry said contact with staff inside the facility, which has been under heavy pressure from Israeli forces for weeks, had been lost.

“The occupation forces are inside the hospital now and they are burning it,” Munir Al-Bursh, director of the health ministry in Hamas-run Gaza, said in a statement.

The Israeli military said it had made efforts to mitigate harm to civilians and had “facilitated the secure evacuation of civilians, patients and medical personnel prior to the operation” but gave no details.

“Kamal Adwan Hospital serves as a Hamas terrorist stronghold in northern Gaza, from which terrorists have been operating throughout the war,” it said in a statement.

Kamal Adwan, as well as the Indonesian and Al-Awda hospitals, have been repeatedly attacked by Israeli forces, which have been clearing out the northern edge of the Gaza Strip for weeks, Palestinian medical staff say.

Friday’s raid comes a day after the army evacuated the nearby Indonesian Hospital and continued to press Al-Awda Hospital.

Bursh said the army had ordered 350 people inside the facility to leave to a nearby school sheltering displaced families. They included 75 patients, their companions, and 185 medical staff.

Hamas’ Al-Aqsa Television said that hours after the raid, Israeli forces set the hospital ablaze. Footage circulating on Palestinian and Arab media, which Reuters could not immediately verify, showed smoke rising from the area of the hospital.

There was no Israeli military comment.

Much of the area around the northern towns of Jabalia, Beit Hanoun and Beit Lahiya has been cleared of people and systematically razed, fueling speculation that Israel intends to keep the area as a closed buffer zone after the fighting in Gaza ends.

Israel denies the claims, saying its campaign is intended to prevent Hamas militants from regrouping.

On Thursday, health officials said five medical staff, including a pediatrician, were killed by Israeli fire at Kamal Adwan Hospital in Beit Lahiya, where Israeli forces have been operating since October.

In a statement, Hamas held Israel and the United States responsible for the fate of patients, injured people and the medical staff inside the hospital.

Israel’s campaign against Hamas in Gaza has killed more than 45,300 Palestinians, according to health officials in the enclave. Most of the population of 2.3 million has been displaced and much of Gaza is in ruins.

The war was triggered by Hamas’ attack on southern Israel on Oct. 7, 2023, in which 1,200 people were killed and 251 taken hostage to Gaza, according to Israeli tallies.


Israel strikes ‘infrastructure’ on Syria-Lebanon border

Updated 27 December 2024

  • It did not specify whether the strikes were on the Syrian or Lebanese side

JERUSALEM: The Israeli military reported it conducted air strikes on Friday targeting “infrastructure” on the Syrian-Lebanese border near the village of Janta, which it said was used to smuggle weapons to the armed group Hezbollah.
“Earlier today, the IAF (Israeli air force) struck infrastructure that was used to smuggle weapons via Syria to the Hezbollah terrorist organization in Lebanon at the Janta crossing on the Syrian-Lebanese border,” the military said in a statement.
It did not specify whether the strikes were on the Syrian or Lebanese side, but they came a day after Lebanon’s army accused Israel of “violation of the ceasefire agreement by attacking Lebanese sovereignty and destroying southern towns and villages.”
There is no official crossing point near Janta but the area is known for illegal crossings.
The UN peacekeeping force in southern Lebanon, UNIFIL, has also expressed concern over “continuing destruction” caused by Israeli forces in south Lebanon.
The Israeli military said Friday’s strikes were aimed at preventing weapons falling into the hands of Hezbollah, with whom it fought a land and air war for more than a year until a ceasefire was agreed upon last month.
“These strikes are an additional part of the IDF’s (Israeli military’s) effort to target weapons smuggling operations from Syria into Lebanon, and prevent Hezbollah from re-establishing weapons smuggling routes,” the military said.
“The IDF will continue to act to remove any threat to the state of Israel in accordance with the understandings in the ceasefire agreement.”
The truce went into effect on November 27, about two months after Israel stepped up its bombing campaign and later sent troops into Lebanon following nearly a year of exchanges of cross-border fire initiated by Hezbollah over the war in Gaza.


Israel hospital says woman killed in stabbing attack in coastal city

Updated 27 December 2024

  • Israel’s police said the suspected attacker had been arrested

HERZLIYA, Israel: An Israeli hospital reported that a woman in her eighties was killed after being stabbed in the coastal city of Herzliya on Friday, while police stated that the suspected attacker had been arrested.
“She was brought to the hospital with multiple stab wounds while undergoing resuscitation efforts, but the hospital staff was forced to pronounce her death upon arrival,” Tel Aviv Ichilov hospital said in a statement. Israel’s police said the suspected attacker had been arrested.