OpenAI on Tuesday announced GPT-4, the latest version of its primary large language model, which it says exhibits “human-level performance” on many professional tests.
GPT-4 is “bigger” than previous versions, meaning it has been trained on more data and has more weights in its model file, which also makes it more expensive to run.
Many researchers in the field believe that much of the recent progress in artificial intelligence comes from running ever-larger models on thousands of supercomputers in training processes that can cost tens of millions of dollars. GPT-4 is an example of this approach, which centers on “scaling up” to achieve better results.
OpenAI said it used Microsoft Azure to train the model; Microsoft has invested billions in the startup. Citing “the competitive landscape,” OpenAI did not release the specific model size or the hardware used for training, details that could be used to recreate the model.
OpenAI’s GPT large language models power many of the AI demos that have wowed the tech industry in the past six months, including Bing’s AI chat and ChatGPT. The latest version is a preview of advances that could begin to filter down to consumer products such as chatbots in the coming weeks. Bing’s AI chatbot uses GPT-4, Microsoft said Tuesday.
OpenAI says the new model will produce fewer factually incorrect answers, will go off track or discuss forbidden topics less often, and will even outperform humans on many standardized tests.
GPT-4 performed at the 90th percentile on a simulated bar exam, the 93rd percentile on an SAT Reading exam, and the 89th percentile on the SAT Math exam, OpenAI claimed.
However, OpenAI warns that the new software is not yet perfect and that it is less capable than humans in many scenarios. It still has a significant problem with “hallucination,” or making things up, and it is not factually reliable, the company said. It also still tends to insist that it is correct when it is wrong.
“GPT-4 still has many known limitations that we are working to address, such as social biases, hallucinations and adversarial prompts,” the company said in a blog post.
“In casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference emerges when the complexity of the task reaches a sufficient threshold – GPT-4 is more reliable, creative and able to handle much more nuanced instructions than GPT-3.5,” OpenAI wrote in a blog post.
The new model will be available to paid ChatGPT subscribers and through an API that allows programmers to integrate the AI into their apps. OpenAI will charge about 3 cents for about 750 words of prompts and 6 cents for about 750 words of responses.
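To give a sense of what those rates mean in practice, here is a rough Python sketch of the arithmetic. It uses only the approximate figures quoted above. The function name and the assumption that usage is billed pro rata by word count are illustrative, not OpenAI's actual billing method.

```python
# Approximate rates as quoted in the article.
WORDS_PER_BLOCK = 750          # "about 750 words"
PROMPT_CENTS_PER_BLOCK = 3     # about 3 cents per block of prompt text
RESPONSE_CENTS_PER_BLOCK = 6   # about 6 cents per block of response text


def estimate_cost_cents(prompt_words: int, response_words: int) -> float:
    """Estimate the cost of one request in cents.

    Hypothetical helper: assumes cost scales linearly with word count,
    which is a simplification for illustration only.
    """
    if prompt_words < 0 or response_words < 0:
        raise ValueError("word counts must be non-negative")
    prompt_cost = prompt_words / WORDS_PER_BLOCK * PROMPT_CENTS_PER_BLOCK
    response_cost = response_words / WORDS_PER_BLOCK * RESPONSE_CENTS_PER_BLOCK
    return prompt_cost + response_cost


if __name__ == "__main__":
    # A 1,500-word prompt with a 750-word reply.
    print(estimate_cost_cents(1500, 750))
```

Under these assumptions, a 1,500-word prompt that draws a 750-word response would cost roughly 12 cents, half for the prompt and half for the reply.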