Abu Dhabi Releases Largest Openly-Available Language Model Falcon 180B

The Abu Dhabi government's Technology Innovation Institute (TII) released Falcon 180B, currently the largest openly-available large language model (LLM). Falcon 180B contains 180 billion parameters and outperforms GPT-3.5 on the MMLU benchmark.

Falcon 180B was trained on 3.5 trillion tokens of text, 4x the amount of data used to train Llama 2. In addition to the base model, TII also released a chat-specific model that is fine-tuned on instruction datasets. The models are available for commercial use, but the license includes several restrictions and requires additional permission for use in a hosted service. Although TII says that Falcon's performance is "difficult to rank definitively," it is "on par" with PaLM 2 Large and "somewhere between GPT 3.5 and GPT4," depending on the benchmark used. According to TII:

As a key technology enabler, we firmly believe that innovation should be allowed to flourish. That is why we decided to open source or open access all our Falcon models. We are launching our latest Falcon 180B LLM as an open access model for research and commercial use. Together, we can mobilize and fast-track breakthrough solutions worldwide – catalyzing innovation for outsized impact.

 

Falcon 180B is based on TII's smaller model, Falcon 40B, which was released earlier this year. One innovation in the Falcon architecture is the use of multiquery attention, which reduces the model's memory bandwidth requirements when running inference. Both models were trained on TII's RefinedWeb dataset; for the new 180B model, the amount of data was increased from 1.5 trillion tokens to 3.5 trillion. Training Falcon 180B took approximately 7 million GPU-hours on Amazon SageMaker, using 4,096 GPUs simultaneously.
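To see why multiquery attention reduces memory bandwidth at inference time, consider that all query heads share a single key/value projection, so the key/value cache that must be read at every decoding step is a fraction of the size used by standard multi-head attention. The PyTorch snippet below is a minimal sketch of the idea, not TII's implementation; the dimensions and weight shapes are arbitrary assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Minimal multi-query attention: many query heads, one shared key/value head."""
    batch, seq_len, d_model = x.shape
    head_dim = d_model // n_heads

    # Queries get n_heads heads; keys and values get a single shared head.
    q = (x @ w_q).view(batch, seq_len, n_heads, head_dim).transpose(1, 2)  # (B, H, T, Dh)
    k = (x @ w_k).view(batch, seq_len, 1, head_dim).transpose(1, 2)        # (B, 1, T, Dh)
    v = (x @ w_v).view(batch, seq_len, 1, head_dim).transpose(1, 2)        # (B, 1, T, Dh)

    # The single K/V head is broadcast across all query heads.
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5                     # (B, H, T, T)
    attn = F.softmax(scores, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(batch, seq_len, d_model)

# Toy example: 8 query heads sharing one key/value head.
d_model, n_heads = 64, 8
x = torch.randn(2, 10, d_model)
w_q = torch.randn(d_model, d_model)
w_k = torch.randn(d_model, d_model // n_heads)
w_v = torch.randn(d_model, d_model // n_heads)
out = multi_query_attention(x, w_q, w_k, w_v, n_heads)  # shape (2, 10, 64)
```

Because the cached keys and values have only one head, the per-token KV cache is roughly n_heads times smaller than in multi-head attention, which is what lowers memory traffic during generation.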

On X (formerly Twitter), several users posted about Falcon 180B. One user speculated that:

GDPR in the EU could make the Falcon 180B model the only viable option for those who prioritize data localization and privacy.

 

Although the model's size makes it difficult for most users to run locally, Hugging Face scientist Clémentine Fourrier pointed out that there is no difference in inference quality "between the 4-bit Falcon-180B and the bfloat-16 one," meaning that users could reduce memory needs by 75%. Georgi Gerganov, developer of llama.cpp, a package that helps users run LLMs on their personal hardware, claimed to be running the model on an Apple M2 Ultra.
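For readers who want to experiment with the 4-bit route, the sketch below uses the Hugging Face transformers and bitsandbytes integration to load the weights quantized to 4 bits. The model id tiiuae/falcon-180B and the prompt are assumptions, and even at 4-bit the weights still need on the order of 100 GB of accelerator memory, so this is illustrative rather than a recipe most users can run.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tiiuae/falcon-180B"  # assumed Hugging Face Hub id

# Store weights in 4-bit, compute in bfloat16 (matching Fourrier's comparison).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",          # shard layers across available GPUs
    trust_remote_code=True,     # may be needed for the model's custom code
)

inputs = tokenizer("The Technology Innovation Institute is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```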

Commenting on the model's generative capabilities, HyperWrite CEO Matt Shumer noted TII's claim that the model's performance was between GPT-3.5 and GPT-4 and predicted, "We're now less than two months away from GPT-4-level open-source models." NVIDIA senior AI scientist Dr. Jim Fan took issue with the model's lack of training on source code data:

Though it's beyond me why code is only 5% in the training mix. It's by far the most useful data to boost reasoning, master tool use, and power AI agents. In fact, GPT-3.5 is finetuned from a Codex base….I don't see any coding benchmark numbers. From the limited code pretraining, I'd assume it isn't good at it. One cannot claim "better than GPT-3.5" or "approach GPT-4" without coding. It should've been an integral part in the pretraining recipe, not a finetuning afterthought.

 

The Falcon 180B models, both base and chat, are available on the Hugging Face Hub. An interactive chat demo and the RefinedWeb dataset are also available on Hugging Face.