Unreliable narrator, untrustworthy partner

Today I reached the limits of ChatGPT 4.0’s usefulness for coding, and I think it’s a pretty bad sign for the potential of LLMs as a transformative technology. Up to this point, it’s been useful as a sort of very fancy spell check. If I need a quick refresher on basic syntax in Python, or if a section of code is not working due to a rookie error, ChatGPT is a nice way to get things unstuck as fast as possible. I don’t want to have to pull a physical dictionary off a shelf to look up how to spell “reservoir” every time I write it, and I don’t want to plumb the depths of Stack Overflow every time I want a refresher on how class inheritance works.
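For example, the kind of two-minute refresher I mean (a throwaway sketch; the feed classes are invented for illustration):

```python
# Quick Python inheritance refresher: a child class reuses and
# extends its parent's behavior.
class Feed:
    def __init__(self, url):
        self.url = url

    def describe(self):
        return f"Feed at {self.url}"

class PodcastFeed(Feed):  # inherits __init__ and describe from Feed
    def describe(self):
        # super() reaches the parent class's version of the method
        return super().describe() + " (podcast)"

print(PodcastFeed("https://example.com/rss").describe())
# -> Feed at https://example.com/rss (podcast)
```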

The problem is that when things get to an intermediate level of complicated, ChatGPT can start giving you VERY confident-sounding answers that are completely wrong, or at least so dumb and backwards that you’re better off starting from scratch. It’s hard to put my finger on exactly when this starts. When it comes to coding, I’m not a domain expert (obviously, or I wouldn’t be asking ChatGPT), but sometimes over the course of a series of queries, I get a sort of sense, a little tingle, a suspicion that the solutions offered are way too loopy and repetitive to be any good.

For me today, this happened during a series of queries to help with building a GUI for my RSS reader using Tkinter. The details are tedious, but it boils down to ChatGPT offering me a solution that didn’t work, then offering me a different solution that didn’t work, then offering me the FIRST solution again, and that’s when I realized this thing isn’t actually thinking. OK, I knew that already, intellectually, but it can really sound like it is troubleshooting! LLMs specialize in convincing you that they aren’t just massive statistical engines pooping out the most plausible next token, but that’s really all they are.

For whatever reason, when it came to the Tkinter library, ChatGPT was incapable of explaining the basic structure, terrible at offering elegant structures to get started, and worse at debugging things when they ran into trouble. Part of me suspects this is because the output of Tkinter is graphical, and the model breaks down where the linguistic direction is supposed to translate into a visual result. So maybe that. But the other problem is that there is more than one way to do things in Tkinter. You can arrange widgets in the UI in any number of ways depending on preference (its geometry managers treat widgets as “slaves” that can be packed, gridded, or placed), but you need to take a consistent approach, and ChatGPT clearly wasn’t able to keep track of one over the course of troubleshooting.
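To make that concrete, here’s a minimal sketch of the consistency problem (invented for illustration; it’s not my actual RSS reader code). Tkinter has three geometry managers, pack(), grid(), and place(), and Tk refuses to mix grid() and pack() among children of the same parent:

```python
import tkinter as tk

root = tk.Tk()

# `frame` is a child of `root`, so pack() here is fine: consistency
# only matters among widgets that share the same parent.
frame = tk.Frame(root)
frame.pack(fill="both", expand=True)

# Consistent: every child of `frame` is managed by grid().
tk.Label(frame, text="Feed list").grid(row=0, column=0)
tk.Label(frame, text="Article view").grid(row=0, column=1)

# Inconsistent: uncommenting this raises a TclError, because pack()
# cannot manage a widget whose siblings are already managed by grid()
# ("...which already has slaves managed by grid").
# tk.Label(frame, text="Status bar").pack()

root.mainloop()
```

An assistant that forgets halfway through a thread whether the layout started with grid() or pack() will hand you exactly this kind of breakage.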

This makes me very cautious about using ChatGPT in the future to compose complex sections of code employing tools I don’t fully understand yet. While it can be good for getting started, today’s experience taught me that I’m better off learning from other sources before I dive in, as I might just have to do it all over again anyway.

ChatGPT is still an excellent tool for basic coding support, but it is a support tool, one of many in an IDE. This is worth emphasizing because some LLM boosters would have you believe this technology is set to change life as we know it, replacing human workers and revolutionizing how we do literally everything. It’s just not. There’s no depth to it. It’s inconsistent, unreliable, and untrustworthy. And I’m pretty worried the powers that be are going to try to force it into every corner of our digital lives anyway.

2 responses to “Unreliable narrator, untrustworthy partner”

  1. @pjk "Part of me suspects this is because the output of Tkinter is graphic, and the model breaks down where the linguistic direction is supposed to translate into a visual result."

    That's an interesting observation. I don't trust it enough to use code it generates that I can't fully validate based on my own understanding.

    That said, I'm still finding it helps me make connections and find approaches that work. Even though I'm still doing the bulk of the work, I can work so much faster with it.

  2. @pjk I completely agree with this take, well said. After using Copilot for a while now, I'm considering turning it off because I feel that there's a lot of mental overhead of always having to "trust but verify" on every suggestion outside the simple stuff.
