New AI System Can Give Human Programmers a Run for Their Money

Published date: 25.05.2022

Read time: 6 min

In 2021, we saw interesting developments in AI-assisted coding, such as OpenAI Codex, which helps programmers by autocompleting parts of their code. In early 2022, we are seeing even greater advances: AI systems that can compete with humans at writing code. In other words, AI is no longer simply assisting developers but is writing complete programs from scratch. Today we will take a closer look at AlphaCode, a technology that could potentially write better code than humans in the future, and explore the data annotation required to create it.

What is AlphaCode and Why is It So Impressive?

AlphaCode is an AI system created by DeepMind, a subsidiary of Alphabet (Google’s parent company), that can write computer programs at a competitive level. DeepMind claims AlphaCode is the first AI code generation system to perform at a competitive level in coding competitions for human developers. In fact, AlphaCode achieved an estimated rank within the top 54% of participants in programming competitions by solving new problems that require a combination of critical thinking, logic, algorithms, coding, and natural language understanding.

What this means is that AlphaCode can write code at roughly the level of the median competitor in these contests. That may sound modest, but it is still an impressive result because this is the first time an AI code generation system has reached a competitive level of performance in programming competitions.

How Can Businesses Take Advantage of AlphaCode?

One of the main benefits AlphaCode offers is that it can make developers more productive. Previously, code-assistance tools were limited to single-line suggestions and restricted to certain languages or short code snippets. AlphaCode does not have such limitations and can write large chunks of code. This could also help people who are not programmers express a solution without knowing how to write code.

The applications of machine programming are vast in scope, which explains the enthusiasm around it. According to a study from the University of Cambridge, at least half of developers’ efforts are spent debugging, which costs the software industry an estimated $312 billion per year. Even migrating a codebase to a more efficient language can command a princely sum. For example, the Commonwealth Bank of Australia spent around $750 million over the course of five years to convert its platform from COBOL to Java.

AlphaCode is also in line with the growing trend of low-code, a software development approach that requires little to no coding to build applications and processes. Low-code is expected to account for more than 65% of application development activity by 2024, and the global low-code/no-code development platform market is expected to generate $187 billion in revenue by 2030. AI technology is making development work much more accessible, and AlphaCode is a giant leap forward in that direction.

What are the Limitations of AlphaCode?

While AlphaCode is certainly very impressive, it does have some limitations. For starters, there might be some vulnerability issues. For example, models can generate code with exploitable weaknesses, including “unintentional vulnerabilities from outdated code or intentional ones injected by malicious actors into the training set.” Another concern is that such a level of automation could reduce demand for developers. According to DeepMind, however, the system is nowhere near being a threat to human programmers; the company says its systems need to develop problem-solving capabilities in order to help humanity.

How Was AlphaCode Trained to Write Code So Well?

According to reports, AlphaCode’s pre-training dataset included 715 GB of code from files taken from GitHub repositories written in C++, C#, Go, Java, JavaScript/TypeScript, Lua, Python, PHP, Ruby, Rust, and Scala. All of this training data would need to be annotated using sentiment annotation, semantic annotation, and many other techniques. At the simplest level, this means labeling each file as C++, C#, and so on, so the system can recognize each language. More advanced annotation would then be required, such as function and variable annotations in Python.
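To make the two kinds of labeling mentioned above more concrete, here is a minimal Python sketch. It is an illustration only and not a description of how DeepMind prepared AlphaCode’s dataset: the extension table, the function names label_language and collect_annotations, and the sample snippet are all hypothetical. It assumes Python 3.9 or later, because it relies on ast.unparse.

```python
import ast
from pathlib import PurePosixPath

# Hypothetical extension-to-language table, assumed for illustration.
# It is NOT taken from AlphaCode's actual data pipeline.
EXTENSION_TO_LANGUAGE = {
    ".cpp": "C++", ".cs": "C#", ".go": "Go", ".java": "Java",
    ".js": "JavaScript", ".ts": "TypeScript", ".lua": "Lua",
    ".py": "Python", ".php": "PHP", ".rb": "Ruby", ".rs": "Rust",
    ".scala": "Scala",
}


def label_language(path):
    """Label a source file with a language based only on its extension."""
    suffix = PurePosixPath(path).suffix.lower()
    return EXTENSION_TO_LANGUAGE.get(suffix, "unknown")


def collect_annotations(source):
    """Collect function and variable annotations from Python source.

    Returns a dict mapping a name to its annotation as text.
    """
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            # Annotated positional parameters, e.g. "width: float".
            for arg in node.args.args:
                if arg.annotation is not None:
                    key = f"{node.name}.{arg.arg}"
                    found[key] = ast.unparse(arg.annotation)
            # Return annotation, e.g. "-> float".
            if node.returns is not None:
                found[f"{node.name}.return"] = ast.unparse(node.returns)
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            # Annotated variable, e.g. "limit: int = 10".
            found[node.target.id] = ast.unparse(node.annotation)
    return found


# Hypothetical sample file contents used to demonstrate the helpers.
SAMPLE = '''
limit: int = 10

def area(width: float, height: float) -> float:
    return width * height
'''

if __name__ == "__main__":
    print(label_language("src/main.rs"))
    print(collect_annotations(SAMPLE))
```

Labeling by file extension is the crudest possible approach and will mislabel files with unusual or missing extensions, so a real annotation workflow would likely combine it with human review or content-based checks.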

Trust Mindy Support With All of Your Data Annotation Needs

Annotating large volumes of data in-house, such as the amount mentioned above, can be very tedious, time-consuming, and expensive. It’s also unnecessary, since you can outsource such work to a service provider like Mindy Support. We are a global company for data annotation and business process outsourcing, trusted by several Fortune 500 and GAFAM companies, as well as innovative startups. With nine years of experience under our belt and offices and representatives in Cyprus, Poland, Romania, the Netherlands, India, and Ukraine, Mindy Support’s team now stands strong with 2000+ professionals helping companies with their most advanced data annotation challenges. Contact us to learn more about what we can do for you.