OPT-175B: "I am your father."
Not even halfway through the year, we have already seen tech giants release back-to-back large language models (LLMs). A month ago, Google released its 540-billion-parameter Pathways Language Model (PaLM), followed by DeepMind's Chinchilla.
Meta AI has released Open Pretrained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available datasets. Meta claims the model is comparable to GPT-3 but required only one-seventh of the carbon footprint to develop.
In a move unusual for big tech companies, access to the model will be given to academic researchers and to those affiliated with government, civil society and industry research laboratories around the world.
The source code is available on GitHub.
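While the full 175-billion-parameter weights are gated behind a research access request, Meta also published a family of smaller OPT checkpoints openly on the Hugging Face Hub. A minimal sketch of trying one of them with the `transformers` library (assuming it is installed, and using the smallest public checkpoint, `facebook/opt-125m`) might look like this:

```python
# Sketch: load a small, openly hosted OPT checkpoint and generate text.
# Note: the 175B model itself requires an approved access request; this
# uses the 125M-parameter sibling, which anyone can download.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # smallest public OPT checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding keeps the example deterministic.
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

The same pattern scales to the larger checkpoints, subject to hardware and, for OPT-175B itself, Meta's access approval.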
Here's a list of some of the popular large language models:

- GPT-3 (OpenAI): 175 billion parameters
- PaLM (Google): 540 billion parameters
- Chinchilla (DeepMind): 70 billion parameters
- OPT-175B (Meta AI): 175 billion parameters
Meta's move to make the model available to the broader AI research community is a jab at OpenAI, whose GPT-3 is available only as a paid service, with no source code or model weights shared to date. "Isn't this like the most ironic thing ever? What's the 'Open' part of OpenAI," says Timnit Gebru, founder and executive director of the Distributed Artificial Intelligence Research Institute (DAIR).
Further, she questioned OpenAI's business model and said: "Wasn't OpenAI created to counteract how 'closed' big tech was and save humanity from them? So, they go from that with all the billionaires creating it cause 'white man's burden' and immediately go to $1 billion from Microsoft and paid service."
"So, why is Meta doing this?" she asked, noting that Meta has said little about how the algorithms behind Facebook and Instagram work and has a reputation for burying unfavourable findings from its in-house research teams.
Hugging Face's Margaret Mitchell sees the release of OPT as a positive move but thinks there are limits to transparency.
In its research paper, Meta has revealed some notable findings about just how harmful this model can be. Across tests, the researchers found that the model has a high propensity to generate toxic language and reinforce harmful stereotypes.
Compared to GPT-3, the researchers found that OPT-175B has a higher toxicity rate and appears to exhibit more stereotypical biases in almost all categories except religion. They also found that the model can produce harmful content 'even when provided with relatively innocuous prompts'. "Meaning that it might do some nasty things regardless of whether or not you tell it to," shares Arthur Holland Michel on Twitter.
Further, he said that a big part of why Meta released the model is so that a broader community can help address these issues, and that it will take a lot of smart people to figure it out. "TBS, there may also be questions as to whether the researchers have set sufficient groundwork for that to happen. Not to mention whether OPT-175B will create real harms, even at this experimental stage," he added.
He said that though some mitigation measures exist to prevent harm arising from such systems, the authors admit they have not applied them to OPT-175B. "Why? Because their 'primary goal' here was to replicate GPT-3," said Michel.
Michel said it appears that they chose not to attempt to reduce the system's propensity to be harmful because they fear that doing so would also reduce its performance relative to a competitor's AI. He said that the researchers also refrained from applying what they admit is necessary scrutiny to the AI's training dataset because their 'current practice is to feed the model with as much data as possible and minimal selection.'
"Meanwhile, their model card for the system's training dataset seems to be a bit thin on detail," writes Michel. He said that the researchers are not aware of any 'tasks for which the dataset should not be used.'
Michel also said that the model card seems to contradict itself. "In one place, they admit that the dataset used to train the system does 'relate to people,' but on the following page, they claim that it does not," he added, noting that the contradiction is non-trivial.
"By saying 'no' the second time, they were able to skip all the questions about whether the people in the dataset gave consent to be included," said Michel, stating that while the company's transparency has been welcomed, it has also opened the door to some challenging and uncomfortable questions.