[DeepMind’s AlphaCode outperforms many human programmers in
tricky software challenges]
AI LEARNS TO WRITE COMPUTER CODE IN ‘STUNNING’ ADVANCE
Matthew Hutson
December 8, 2022
Science
Snippets of code in white come from the AlphaCode artificial
intelligence system, whereas the purple code snippets were written by
humans trying to solve similar problems. (DeepMind)
Software runs the world. It controls smartphones, nuclear weapons, and
car engines. But there’s a global shortage of programmers.
Wouldn’t it be nice if anyone could explain what they want a program
to do, and a computer could translate that into lines of code?
A new artificial intelligence (AI) system called AlphaCode is bringing
humanity one step closer to that vision, according to a new study.
Researchers say the system—from the research lab DeepMind, a
subsidiary of Alphabet (Google’s parent company)—might one day
assist experienced coders, but probably cannot replace them.
“It’s very impressive, the performance they’re able to achieve
on some pretty challenging problems,” says Armando Solar-Lezama,
head of the computer assisted programming group at the Massachusetts
Institute of Technology.
AlphaCode goes beyond the previous standard-bearer in AI code writing:
Codex, a system released in 2021 by the nonprofit research lab OpenAI.
The lab had already developed GPT-3, a “large language model” that
is adept at imitating and interpreting human text after being trained
on billions of words from digital books, Wikipedia articles, and other
pages of internet text. By fine-tuning GPT-3 on more than 100
gigabytes of code from GitHub, an online software repository, OpenAI
came up with Codex. The software can write code when prompted with an
everyday description of what it’s supposed to do, for instance
counting the vowels in a string of text. But it performs poorly when
tasked with tricky problems.
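As a rough illustration of the kind of everyday task described above, a vowel counter might look like the Python sketch below. The function name `count_vowels` and the decision to treat only a, e, i, o, and u as vowels are assumptions made for this example; it is not output from Codex.

```python
def count_vowels(text: str) -> int:
    """Count the vowels (a, e, i, o, u, in either case) in `text`."""
    return sum(1 for ch in text.lower() if ch in "aeiou")


print(count_vowels("AlphaCode"))  # prints 4
```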
AlphaCode’s creators focused on solving those difficult problems.
Like the Codex researchers, they started by feeding a large language
model many gigabytes of code from GitHub, just to familiarize it with
coding syntax and conventions. Then, they trained it to translate
problem descriptions into code, using thousands of problems collected
from programming competitions. For example, a problem might ask for a
program to determine the number of binary strings (sequences of zeroes
and ones) of length n that don’t have any consecutive zeroes.
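For a sense of what a solution to that example problem could look like, here is one possible Python sketch. It is not AlphaCode’s output, and the function names are hypothetical. It assumes the approach of tracking, for each length, how many valid strings end in 1 and how many end in 0, since a 0 may only follow a 1. A brute-force counter is included only as a cross-check for small n.

```python
from itertools import product


def count_no_consecutive_zeroes(n: int) -> int:
    """Count binary strings of length n with no two adjacent zeroes."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 1  # only the empty string
    # Valid strings of the current length that end in 1 and in 0.
    end_one, end_zero = 1, 1  # length 1: "1" and "0"
    for _ in range(n - 1):
        # A 1 can follow any valid string; a 0 can only follow a 1.
        end_one, end_zero = end_one + end_zero, end_one
    return end_one + end_zero


def brute_force(n: int) -> int:
    """Enumerate all 2**n strings; practical only for small n."""
    return sum("00" not in "".join(bits) for bits in product("01", repeat=n))


print([count_no_consecutive_zeroes(n) for n in range(1, 6)])  # prints [2, 3, 5, 8, 13]
```

The loop does a constant amount of work per character of length, whereas the brute-force check doubles in cost with every extra character.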
When presented with a fresh problem, AlphaCode generates candidate
code solutions (in Python or C++) and filters out the bad ones. But
whereas researchers had previously used models like Codex to generate
tens or hundreds of candidates, DeepMind had AlphaCode generate, in
some cases, more than 1 million.
To filter them, AlphaCode first keeps only the 1% of programs that
pass test cases that accompany problems. To further narrow the field,
it clusters the keepers based on the similarity of their outputs to
made-up inputs. Then, it submits programs from each cluster, one by
one, starting with the largest cluster, until it alights on a
successful one or reaches 10 submissions (about the maximum that
humans submit in the competitions). Submitting from different clusters
allows it to test a wide range of programming tactics. That’s the
most innovative step in AlphaCode’s process, says Kevin Ellis, a
computer scientist at Cornell University who works on AI coding.
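The filter-then-cluster procedure described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DeepMind’s implementation: candidates are modeled as plain Python functions from an integer to an integer, "similarity" is reduced to exact agreement of outputs on the made-up inputs, and the names `safe_run`, `select_submissions`, and `probe_inputs` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Sequence

Program = Callable[[int], int]  # hypothetical stand-in for a candidate solution


def safe_run(program: Program, x: int):
    """Run a candidate, mapping any exception to a marker value."""
    try:
        return program(x)
    except Exception:
        return "error"


def select_submissions(
    candidates: Sequence[Program],
    example_tests: Sequence[tuple[int, int]],
    probe_inputs: Sequence[int],
    max_submissions: int = 10,
) -> list[Program]:
    # Step 1: keep only candidates that pass the problem's example tests.
    keepers = [
        p for p in candidates
        if all(safe_run(p, x) == expected for x, expected in example_tests)
    ]
    # Step 2: group keepers that produce identical outputs on made-up inputs
    # (an assumption: exact match stands in for "similarity" here).
    clusters = defaultdict(list)
    for p in keepers:
        signature = tuple(safe_run(p, x) for x in probe_inputs)
        clusters[signature].append(p)
    # Step 3: pick one program per cluster, largest cluster first.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:max_submissions]]


# Toy usage: the hypothetical task is "double the input".
candidates = [
    lambda x: x * x,  # passes the example test only by coincidence
    lambda x: 2 * x,
    lambda x: x + x,
    lambda x: x + 2,  # also passes only by coincidence
    lambda x: 0,      # fails the example test and is filtered out
]
picks = select_submissions(candidates, example_tests=[(2, 4)], probe_inputs=[1, 3, 5])
print(len(picks), picks[0](7))  # prints 3 14
```

In this toy run, the two doubling programs behave identically on the made-up inputs, so they form the largest cluster and one of them is submitted first. The stop-on-first-success loop over the returned list is left to the caller.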
After training, AlphaCode solved about 34% of assigned problems,
DeepMind reports this week in _Science_. (On similar benchmarks,
Codex achieved single-digit-percentage success.)
To further test its prowess, DeepMind entered AlphaCode into online
coding competitions. In contests with at least 5000 participants, the
system outperformed 45.7% of programmers. The researchers also
compared its programs with those in its training database and found it
did not duplicate large sections of code or logic. It generated
something new—a creativity that surprised Ellis.
“It continues to be impressive how well machine-learning methods do
when you scale them up,” he says. The results are “stunning,”
adds Wojciech Zaremba, a co-founder of OpenAI and co-author of their
Codex paper.
AI coding might have applications beyond winning competitions, says
Yujia Li, a computer scientist at DeepMind and paper co-author. It
could do software grunt work, freeing up developers to work at a
higher, or more abstract, level, or it could help noncoders create
simple programs.
David Choi, another study author at DeepMind, imagines running the
model in reverse: translating code into explanations of what it’s
doing, which could benefit programmers trying to understand others’
code. “There are a lot more things you can do with models that
understand code in general,” he says.
For now, DeepMind wants to reduce the system’s errors. Li says even
if AlphaCode generates a functional program, it sometimes makes simple
mistakes, such as creating a variable and not using it.
There are other problems. AlphaCode requires tens of billions of
trillions of operations per problem—computing power that only the
largest tech companies have. And the problems it solved from the
online programming competitions were narrow and self-contained. But
real-world programming often requires managing large code packages in
multiple places, which requires a more holistic understanding of the
software, Solar-Lezama says.
The study also notes the long-term risk of software that recursively
improves itself. Some experts say such self-improvement could lead to
a superintelligent AI that takes over the world. Although that
scenario may seem remote, researchers still want the field of AI
coding to institute guardrails, that is, built-in checks and balances.
“Even if this kind of technology becomes supersuccessful, you would
want to treat it the same way you treat a programmer within an
organization,” Solar-Lezama says. “You never want an organization
where a single programmer could bring the whole organization down.”
_MATTHEW HUTSON is a freelance writer for Science. He covers
artificial intelligence (AI), robotics, cybersecurity, and the
Internet of Things. He has a bachelor’s degree in cognitive
neuroscience from Brown University and a master’s degree in science
writing from the Massachusetts Institute of Technology, where his
thesis explored AI and creativity._
_Matt has written for Wired, The Atlantic, Newsweek, The New York
Times Magazine, The New Yorker online, and elsewhere. He is a former
news editor for Psychology Today and is the author of The 7 Laws of
Magical Thinking, about the psychology of superstition and religion.
He lives in New York City._