Meta has built a massive new language AI—and it’s giving it away for free
“It’s a great move,” says Thomas Wolf, chief scientist at Hugging Face, the AI startup behind BigScience, a project in which more than 1,000 volunteers around the world are collaborating on an open-source language model. “The more open models are, the better.”
Large language models—powerful programs that can generate paragraphs of text and mimic human conversation—have been one of the hottest trends in AI over the past couple of years. But they are deeply flawed, parroting misinformation, prejudice, and toxic language.
In theory, putting the technology in more hands should help fix those problems. But because language models are complex and require vast amounts of data to train, they have remained projects for rich tech firms. The wider research community, including ethicists and social scientists concerned about their misuse, has had to watch from the sidelines.
Meta AI says it wants to change that. Many on the team were once university researchers themselves, says Joelle Pineau, managing director of Meta AI. “We know the gap between industry and universities in terms of the ability to build these models. Making this model available to researchers was an easy decision.” She hopes that others will pore over the work, taking it apart or building on top of it. Breakthroughs come faster when more people are involved, she believes.
Meta is making its model, called Open Pretrained Transformer (OPT), freely available for noncommercial use. It is also releasing its code and a logbook that documents the training process. The logbook contains daily updates from team members about the training data: how it was added to the model and when, what worked and what didn’t. In more than 100 pages of notes, the researchers log every bug, crash, and reboot in a three-month training process that ran nonstop from October 2021 to January 2022.
With 175 billion parameters (the values in a neural network that get tweaked during training), OPT is the same size as GPT-3. That was deliberate, says Pineau: OPT was built to match GPT-3 both in its accuracy on language tasks and in its toxicity. OpenAI has made GPT-3 available as a paid service but has not released the model itself or its code. The idea, says Pineau, was to give researchers a similar language model to study.
OpenAI declined to comment on Meta’s announcement.
Google, which is exploring the use of large language models in its search products, has also been criticized for a lack of transparency. The company sparked controversy in 2020 when it forced out leading members of its AI ethics team after they produced a study that highlighted problems with the technology.
So why is Meta doing this? After all, Meta is a company that has said little about how the algorithms behind Facebook and Instagram work and has a reputation for burying unfavorable findings by its own in-house research teams. Pineau, who has been advocating transparency in AI for many years, is a major reason for Meta AI’s new approach.
Pineau helped change how research is published at several of the largest AI conferences, introducing a checklist of things that researchers must submit alongside their results, including code and details about how experiments were run. Since joining Meta (then Facebook) in 2017, she has championed that culture in its AI lab.
“This commitment to open science is why I’m here,” she says, adding that she wouldn’t be there otherwise.
Ultimately, Pineau wants to change how we judge AI. “What we call state of the art nowadays can’t just be about performance,” she says. “It has to be state of the art in terms of responsibility as well.”
Still, giving away a large language model is a bold move for Meta. Pineau does not claim the model will never produce language the company isn’t proud of. “It will,” she says.
Weighing the risks
Margaret Mitchell, one of the AI ethics researchers Google forced out in 2020, who is now at Hugging Face, sees the release of OPT as a positive move. But she thinks there are limits to transparency. Has the language model been tested with sufficient rigor? Do the foreseeable benefits outweigh the foreseeable harms, such as the generation of misinformation or racist and misogynistic language?
“Releasing a large language model to the world, where a wide audience is likely to use it or be affected by its output, comes with responsibilities,” she says. Mitchell points out that the model could generate harmful content not only by itself but also through downstream applications that researchers build on top of it.

Meta AI audited OPT to remove some harmful behaviors, but the point, says Pineau, is to release a model that researchers can learn from, warts and all.
There were many conversations, she says, about how to do that in a way that lets the team sleep at night, knowing that the reputational risk and the risk of harm are both low. She rejects the idea that a model shouldn’t be released because it is too dangerous—the reason OpenAI gave for at first withholding GPT-3’s predecessor, GPT-2. “I understand the flaws of these models, but that’s not a research mindset,” she says.
Emily M. Bender, a computational linguist at the University of Washington who coauthored the study at the center of the Google dispute with Mitchell, also worries about how the potential harms will be handled. She says that grounding explorations and evaluations in specific use cases is key to mitigating the risks of any machine-learning technology. “What purpose will the system serve? Who will use it, and how will it be presented to them?”
Some researchers question why large language models are being built at all, given their potential for harm. For Pineau, those concerns should be met with more exposure, not less. Extreme transparency, she says, is the only way to build trust.
People hold many different opinions about what speech is appropriate, she says, and AI is part of that conversation. She doesn’t expect language models to say things that everyone agrees with. “But how can we deal with that? You need many voices in that discussion.”