On the 12th of August, OpenAI hosted a hackathon for anyone interested in trying out Codex. Codex is a new generation of their GPT-3 model that translates plain English commands into code.
We at Serokell thought it would be interesting to try it out: right now, free access to the beta is available only to a small group of people. One of our teammates got access after more than a year on the waiting list.
What was the format?
The challenge consisted of 5 small tasks, the same for every participant, designed to test the system. To be fair, they were quite simple, maybe because Codex can’t solve complex problems. For example, one task was to use pandas to calculate the number of days between two dates given in a string. There was also a simple algorithmic task: given a binary tree, restore the original message.
Our main motivation was to see what Codex can do, how well it understands tasks, and to follow the logic of its decisions. Spoiler alert: not everything was as great and smooth as in the OpenAI demo!
What were the problems?
The first problem was server lag: maybe the company wasn’t ready for such a large number of participants (a couple of thousand). Because of that, we wasted a lot of time trying to reconnect. Interestingly, the leaderboard logic was odd too. Solutions were ranked by the clock time at which they were completed, not by the time it took to solve the problem. So people who joined the challenge late were a priori low on the scoreboard.
To us, Codex did not seem to be a very smart coder. First of all, it made quite a few syntax mistakes. It can easily forget a closing bracket or introduce extra columns. Because of that, the code becomes incorrect. It really takes time and effort to catch these errors!
Secondly, Codex doesn’t seem to know how to work with data types. You as a programmer have to be very careful, or the model will mess things up.
For instance, in the earlier task of counting the days between two dates, Codex got the sequence of steps wrong. It forgot to convert the strings to dates and tried to operate on them as they were.
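To make the missing step concrete, here is a minimal sketch of what a correct solution might look like. The exact input format of the hackathon task is not given here, so the `"<date> to <date>"` layout, the `sep` parameter, and the function name `days_between` are assumptions made purely for illustration.

```python
import pandas as pd


def days_between(text: str, sep: str = " to ") -> int:
    """Count whole days between two dates embedded in one string.

    The "<date> to <date>" layout is an assumed input format,
    not the actual format used in the hackathon task.
    """
    start_raw, end_raw = text.split(sep)

    # The step that is easy to skip: parse the strings into timestamps first.
    # Subtracting the raw strings directly would raise a TypeError.
    start = pd.to_datetime(start_raw.strip())
    end = pd.to_datetime(end_raw.strip())

    # Subtracting two timestamps yields a Timedelta; .days is its day count.
    return abs((end - start).days)


print(days_between("2021-08-01 to 2021-08-12"))  # prints 11
```

The order of operations is the whole point: parse first, subtract second. Taking the absolute value is a design choice so that the order of the two dates does not matter.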
Finally, the solutions that Codex proposes are not optimal. That is a huge part of being a good programmer: understanding the task, breaking it down into manageable pieces, and implementing the most efficient solution in terms of execution time…
Continue reading: http://www.datasciencecentral.com/xn/detail/6448529:BlogPost:1064712