Ai2 releases new language models competitive with Meta’s Llama
There’s a new AI model family on the block, and it’s one of the few that can be reproduced from scratch.
Ai2, the nonprofit AI research organization established by the late Paul Allen, released OLMo 2, the second family of models in its OLMo series, on Tuesday. (OLMo is an acronym for “Open Language Model.”)
While there’s no shortage of “open” language models to choose from (e.g., Meta’s Llama), OLMo 2 meets the Open Source Initiative’s definition of open source AI, meaning the tools and data used to develop it are publicly available.
The Open Source Initiative, the long-running institution that aims to define and “steward” all things open source, finalized its definition of open source AI in October. But the first OLMo models, released in February, met the criteria as well.
“OLMo 2 [was] developed start-to-finish with open and accessible training data, open-source training code, reproducible training recipes, transparent evaluations, intermediate checkpoints, and more,” Ai2 wrote in a blog post. “We hope to provide the open-source community with the resources needed to discover new and innovative approaches by openly sharing our data, recipes, and findings.”
The OLMo 2 family consists of two models: one with 7 billion parameters (OLMo 7B) and one with 13 billion parameters (OLMo 13B).
Models with more parameters generally perform better than those with fewer, as parameter count roughly tracks a model’s capacity to solve problems.
OLMo 2 7B and 13B, like most language models, can execute various text-based tasks, including query answering, document summarization, and code writing.
Ai2 employed a data set of 5 trillion tokens to train the models. Tokens represent bits of unprocessed data; approximately 750,000 words are equivalent to 1 million tokens.
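The words-to-tokens ratio above can serve as a rough back-of-the-envelope conversion. A minimal sketch (the function name and the fixed 0.75 ratio are illustrative assumptions; real token counts vary by tokenizer and text):

```python
def words_to_tokens(word_count: int) -> int:
    """Rough token estimate using the ~750,000 words per 1 million
    tokens ratio cited above. Actual counts depend on the tokenizer
    and the text itself."""
    return round(word_count / 0.75)

# A 3,000-word article works out to roughly 4,000 tokens.
print(words_to_tokens(3_000))  # → 4000
```

By this estimate, the 5-trillion-token training set corresponds to something on the order of 3.75 trillion words.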
The training set comprised academic papers, math workbooks (both synthetic and human-generated), question-and-answer discussion forums, and websites that were “filtered for high quality.”
Ai2 claims the result is models that are competitive, performance-wise, with open models such as Meta’s Llama 3.1 release.
“Not only do we observe a dramatic improvement in performance across all tasks compared to our earlier OLMo model but, notably, OLMo 2 7B outperforms Llama 3.1 8B,” Ai2 writes. “OLMo 2 [represents] the best fully-open language models to date.”
The OLMo 2 models and all of their components are available for download on Ai2’s website. They are subject to the Apache 2.0 license, which permits their commercial use.
There’s been some debate recently over the safety of open models, with Llama models reportedly being used by Chinese researchers to develop defense tools.
Dirk Groeneveld, an engineer at Ai2, told me in February that he wasn’t worried about OLMo being misused, as he believes the benefits ultimately outweigh the harms.
“Yes, it’s possible open models may be used inappropriately or for unintended purposes,” he said. “[However, this] approach also promotes technical advancements that lead to more ethical models; is a prerequisite for verification and reproducibility, as these can only be achieved with access to the full stack; and reduces a growing concentration of power, creating more equitable access.”