What Is GPT-3 and Why Is It Talking to Us?
Description: A rendering of some NVIDIA Graphics Processing Units (GPUs). Chips like these are used to quickly train large neural networks. Photo Credit: Nana Dua.
Silicon Valley is buzzing about GPT-3, the newest piece of research from the Elon Musk-backed research institute OpenAI.
GPT-3 is the largest and latest in a long line of machine learning models that generate language. Internally, it works much like the predictive text on your phone's keyboard--guessing the next word from the words that came before it.
But unlike your phone's predictive text, GPT-3 can spill out fully coherent passages of text that seemingly have never been written before. The technology behind GPT-3 powers many advances in natural language processing--everything from language translation to document summarization.
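GPT-3 itself is only reachable through OpenAI's invite-only API, but the core idea--predicting one word at a time and feeding each prediction back in--can be sketched with GPT-2, its openly released predecessor, via the Hugging Face transformers library. The prompt and sampling settings below are illustrative choices of mine, not anything OpenAI has published about how GPT-3 is configured.

# A minimal sketch of next-word prediction using GPT-2, GPT-3's openly
# released predecessor. Requires: pip install torch transformers
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The secret to writing a hit song is"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Autoregressive generation: predict one token, append it to the input,
# and repeat until the requested length is reached.
output_ids = model.generate(
    input_ids,
    max_length=40,                        # total tokens, prompt included
    do_sample=True,                       # sample instead of always taking the top word
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

The same loop, scaled up to 175 billion parameters and a vastly larger training corpus, is essentially what produces the demos below.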
Can GPT-3 Really Write Anything?
Well, people have been trying a lot of different things.
Gwern Branwen used GPT-3 to generate creative fiction. He also wrote a technical overview of GPT-3.
Kevin Lacker gave GPT-3 a variant of the Turing Test--asking questions designed to reveal whether the counterpart is a human or a machine.
Andrew Mayne ran various experiments with GPT-3, including tests of its ability to summarize text.
The entrepreneur Arthur Breitman also ran a Turing Test of his own. Then Eliezer Yudkowsky--a researcher known for his theories about artificial intelligence--saw the results and remarked that GPT-3 might be "hiding" the full extent of its intelligence.
So I don't want to sound alarms prematurely, here, but we could possibly be looking at the first case of an AI pretending to be stupider than it is. In this example, GPT-3 apparently fails to learn/understand how to detect balanced sets of parentheses. (1/10.) https://t.co/cmO1xJuyAQ pic.twitter.com/jqAefa9OXW
— Eliezer Yudkowsky (@ESYudkowsky) July 20, 2020
McKay Wrigley built an app that asks GPT-3 to deliver responses in the style of a specific person--like, say, Albert Einstein. Here is a screenshot of me asking McKay's app about how to write a hit song.
Oh my god. This is better than I thought it would be.
— James Mishra (@rishmishra) July 17, 2020
I asked @mckaywrigley's GPT-3 app https://t.co/Lfpk8jPHvj about how to make a hit song. pic.twitter.com/LS7qIQ31qp
Jordan Singer used GPT-3 to create a plugin for Figma--an online visual design tool--that can design seemingly anything you describe. Jonathan Lee discusses Jordan's tool in greater depth.
This changes everything. 🤯
— Jordan Singer (@jsngr) July 18, 2020
With GPT-3, I built a Figma plugin to design for you.
I call it "Designer" pic.twitter.com/OzW1sKNLEC
Sharif Shameem developed a GPT-3-powered layout generator for web applications.
This is mind blowing.
— Sharif Shameem (@sharifshameem) July 13, 2020
With GPT-3, I built a layout generator where you just describe any layout you want, and it generates the JSX code for you.
W H A T pic.twitter.com/w8JkrZO4lk
OpenAI employee Amanda Askell demonstrated GPT-3's ability to write music.
Guitar tab generated by GPT-3 from a fictional song title and artist. pic.twitter.com/ZTXuEcpMUV
— Amanda Askell (@AmandaAskell) July 16, 2020
People have also been trying to get GPT-3 to write software. Florent Crivello highlights this below:
GPT3 writing code. A compiler from natural language to code.
— Flo Crivello (@Altimor) July 2, 2020
People don't understand — this will change absolutely everything. We're decoupling human horsepower from code production. The intellectual equivalent of the discovery of the engine. https://t.co/QGJbQRBdQv pic.twitter.com/CJIaRK8j0M
Faraaz Nishtar was among the people who found that GPT-3 can write small amounts of SQL--a programming language used to query databases. (A rough sketch of the prompt-to-SQL pattern follows his tweet below.)
I got GPT-3 to start writing my SQL queries for me
— Faraaz Nishtar 🦊 (@FaraazNishtar) July 22, 2020
p.s. these work against my *actual* database! pic.twitter.com/6RoJewXEEx
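To make the idea concrete, here is a rough sketch of that prompt-to-SQL pattern: the table schema and a plain-English question go in as text, and GPT-3 completes the query. The schema, the question, and the settings are placeholders I made up, and the snippet assumes access to OpenAI's invite-only API and its Python client.

# Hypothetical sketch of prompting GPT-3 for SQL through OpenAI's
# invite-only API. Requires: pip install openai, plus an API key.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; real keys come from OpenAI

prompt = """Table users: id, name, email, signup_date
Table orders: id, user_id, total, created_at

Question: How many orders has each user placed?
SQL:"""

response = openai.Completion.create(
    engine="davinci",    # the largest GPT-3 engine exposed by the API
    prompt=prompt,
    max_tokens=64,
    temperature=0,       # low temperature keeps the query deterministic
    stop=["\n\n"],       # stop once the query looks finished
)
print(response.choices[0].text.strip())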
The startup OthersideAI demoed a feature that uses GPT-3 to generate responses to emails. Gmail's Smart Reply feature relies on a related family of language models.
GPT-3 is going to change the way you work.
— OthersideAI (@OthersideAI) July 22, 2020
Introducing Quick Response by OthersideAI
Automatically write emails in your personal style by simply writing the key points you want to get across
The days of spending hours a day emailing are over!!!
Beta access link in bio! pic.twitter.com/HFjZOgJvR8
Has This Been Done Before?
Researchers have spent decades trying to use statistics to model human language, but only recently have we been able to generate realistic human text.
Claude Shannon, the father of information theory, applied his theory to model the English language in a 1950 paper.
In 1954, IBM researchers demonstrated a machine that translated more than sixty (simple, carefully-selected) Russian sentences into English. This primitive start sparked military interest in automatically translating enemy communications--turning machine translation and language modeling into another front of the Cold War. The research community later built more comprehensive linguistic models on the mathematics of Andrey Markov.
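To give a flavor of those Markov-style models, here is a toy sketch of a bigram model: it records which word tends to follow which in a training text, then generates new text by repeatedly sampling the next word. The one-sentence "corpus" is an invented placeholder; real systems learned from far larger bodies of text.

# Toy bigram (first-order Markov) language model: count which word follows
# which, then generate text by sampling the next word from those counts.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Record transitions word -> next word.
transitions = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word].append(next_word)

# Generate up to ten words, starting from "the".
word = "the"
generated = [word]
for _ in range(10):
    followers = transitions.get(word)
    if not followers:      # dead end: no observed continuation
        break
    word = random.choice(followers)
    generated.append(word)

print(" ".join(generated))

GPT-3 replaces these simple word-pair counts with a neural network, but the generate-one-word-at-a-time loop is the same.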
Later, researchers developed a family of statistical models known as neural networks to solve a variety of different problems. Starting in 2001, researchers began modeling language with neural networks. Innovations in neural network design and ever-faster computers led to these initially-small networks growing to gigantic proportions. Leo Gao wrote about GPT-3 and scaling trends in neural networks.
How Big Is GPT-3?
OpenAI's largest version of GPT-3's predecessor--GPT-2--had 1.5 billion parameters. GPT-2's ability to generate realistic English was so good that OpenAI hesitated for several months to publicly release the model. GPT-3 has 175 billion parameters. Currently the model is only available through OpenAI's servers, and only to users who have been invited.
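For a sense of what 175 billion parameters means in practice, here is a back-of-the-envelope calculation of the memory needed just to store the weights, assuming each parameter is a 16-bit (2-byte) float--an assumption of mine, not a published detail of OpenAI's deployment.

# Rough memory footprint of the model weights alone, assuming each
# parameter is stored as a 16-bit (2-byte) float.
def weight_memory_gb(parameters, bytes_per_parameter=2):
    return parameters * bytes_per_parameter / 1e9

print(f"GPT-2 (1.5 billion parameters): {weight_memory_gb(1.5e9):.0f} GB")   # ~3 GB
print(f"GPT-3 (175 billion parameters): {weight_memory_gb(175e9):.0f} GB")   # ~350 GB

At roughly 350 GB, the weights alone would not fit on any single GPU available in 2020, which helps explain why GPT-3 runs on OpenAI's own infrastructure rather than being handed out for download.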
These networks' larger size allows them to memorize increasingly complex patterns in the English datasets that the models learn from. The models don't necessarily have intelligence or insight, but simply have more bandwidth for imitation.
Is GPT-3 Overhyped?
It depends. Building ever-larger neural networks sometimes feels like throwing money at a math problem, but it also takes significant technical innovation to make the creation of such networks even possible.
Sam Altman--currently the CEO of OpenAI--recently remarked that GPT-3 may be overhyped.
The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.
— Sam Altman (@sama) July 19, 2020
Jerome Pesenti, currently the Vice President of AI at Facebook, highlighted GPT-3's tendency to generate harmful, biased text when given certain handpicked prompts.
#gpt3 is surprising and creative but it’s also unsafe due to harmful biases. Prompted to write tweets from one word - Jews, black, women, holocaust - it came up with these (https://t.co/G5POcerE1h). We need more progress on #ResponsibleAI before putting NLG models in production. pic.twitter.com/FAscgUr5Hh
— Jerome Pesenti (@an_open_mind) July 18, 2020
Further Reading
OpenAI API (OpenAI's invite-only interface for GPT-3)
Language Models are Few-Shot Learners (the original GPT-3 paper)
OpenAI’s new language generator GPT-3 is shockingly good—and completely mindless
How GPT3 works. A visual thread.
Too big to deploy: How GPT-2 is breaking servers
A history of machine translation from the Cold War to deep learning
A Review of the Neural History of Natural Language Processing