Recently, whenever I’ve opened LinkedIn¹, the majority of the posts it has shown me have been favourable opinions of “vibe coding”, so I decided to write a summary of my experience using ChatGPT for a variety of coding and coding-adjacent tasks, categorised by programming language. I have also tried other models, including Google Gemini and more than one of Meta’s, albeit more briefly. From what I’ve seen there are some differences between them, which I will describe later in this article, but my overall impressions haven’t differed significantly.
Bash is the only programming language for which I actually found ChatGPT useful. Within certain limitations, I can tell it what I want a bash script to do and it will output something that does exactly that.
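As a concrete illustration (reconstructed from memory, not verbatim ChatGPT output), a request like “rename every .jpeg file in the current directory to .jpg” would reliably come back as something along these lines:

```bash
#!/usr/bin/env bash
# Rename every .jpeg file in the current directory to use the .jpg extension.
set -euo pipefail

for f in *.jpeg; do
  [ -e "$f" ] || continue   # skip the literal glob pattern when nothing matches
  mv -- "$f" "${f%.jpeg}.jpg"
done
```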
The limitations I ran into are:
If I put my rabbit on my computer’s keyboard, it would have a greater chance of bringing about progress on one of my C++ projects than asking ChatGPT for help does.
Every Rust function I’ve seen ChatGPT produce has had both syntax and logic errors. Despite always ignoring some part of the prompt, it still produces an over-direct translation of the rest into something I can’t exactly call Rust.
This is where I mention rust-analyzer. Two things to take note of here.
ChatGPT consistently produces Haskell code that has valid syntax, but by the time I’ve edited it enough to satisfy the type-checker it bears no resemblance to what ChatGPT originally output.
While ChatGPT seems to be the most recommended model on LinkedIn, and is the one I’ve tried the most, it’s actually the second worst I’ve tried, after DeepSeek. When asked to perform programming tasks it routinely produces boilerplate-only code with internal comments indicating where the actual thing asked for would have to be inserted². Even when generating boilerplate it usually makes trivial errors, such as leaving out function parameters, and frequently even outputs the wrong language entirely.
Every time I’ve asked Llama or Ollama to produce code it has actually made an attempt at function bodies; I just haven’t seen either produce code that actually works. While the overall capabilities of Llama are greater than those of Ollama, their performance on software development tasks is indistinguishable.
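For anyone who wants to reproduce the local-model side of this, running a model through Ollama looks roughly like the following; the model name here is only an example, not necessarily the one I used:

```bash
# Fetch a model once, then ask it for code; "llama3" is an example model name.
ollama pull llama3
ollama run llama3 "Write a bash script that renames every .jpeg file to .jpg"
```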
Gemini performs very similarly to Llama and Ollama. While I wouldn’t necessarily say its output is closer to working code, I will at least say that the problems with it tend to be a bit less obvious to my human brain.
I haven’t properly had a chance to try out Copilot. A former employer trialled it while I was working for them, and on paper I was part of the trial, but for some reason my attempts to install the required software on my work computer were unsuccessful.
There’s enough talk about how LLM-based coding tools in general, and ChatGPT in particular, are going to become indispensable that it would have been irresponsible for me not to investigate.
For all but one of the languages I tried it on, using ChatGPT was more time-consuming than doing the tasks myself, mostly because ChatGPT was unable to perform them and I had to do them myself anyway.
I don’t feel morally comfortable having exacerbated California’s water problems for effectively nothing. I don’t think I will be using a cloud-based LLM again, and I don’t think I’ll bother with a locally run one either unless I expect it to be significantly better at coding than the current state of the art.
1. I believe our society’s preferred modes of communication are unhealthily dependent on a small number of giant corporations, and this also applies to LinkedIn. However, I still need to communicate with people I won’t be able to reach if I leave it.↩︎
2. There is actually a reason for this: the primary training objective for current-generation LLMs is learning to predict how documents found on the internet will continue, given the text up to a certain point. One problem with this for the application of vibe coding is that text on the internet phrased as a request to write code is more likely to be a homework problem than something directly associated with a solution, so GPT ends up predicting how the homework assignment sheet would continue rather than what the solution would look like.↩︎
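In standard notation, that training objective is the usual autoregressive cross-entropy over a document’s tokens (a schematic form of the objective, not any lab’s exact recipe):

$$\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_1, \dots, x_{t-1})$$

Nothing in this loss distinguishes the continuation the asker wanted from the continuation such a document typically has.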