• ActivityPub ftw

    When I started this little blog, I wondered how anyone would find it. In the old days, you would do SEO so your posts show up in search. That involved writing little summaries of each post, making the links attractive to Google’s crawlers, posting a sitemap, linking to other blogs and soliciting linkbacks. Then later on, with Web 2.0, you would market your sites by building a “brand” on social media: setting up a Facebook page, boosting on Twitter, starting fights in the comments of YouTube videos, etc.

    But Google search is now a hot mess, and LLM-generated content is about to make it even more useless than it is now, so I’m not even trying to play that game. And building a brand on algorithmic social media seems like a suicide mission. I figured it was more important to do the thing and then worry about finding readers later.

    But I did stick it into the fediverse by activating an ActivityPub plugin, and the results have been surprising! I’ve consistently been getting thoughtful comments and boosts on every post, and the blog already has a couple dozen followers on Mastodon.

    Huge caveat: I’m writing about very Mastodon-adjacent topics, like coding and LLMs. And at the moment, with only a couple million users on the fediverse, reach is certainly limited. But I’m starting to believe the ActivityPub protocol does indeed have the potential to restructure how people who write and read find each other on the open web, without any algorithmic intermediaries harvesting our attention for profit and fucking with us.


  • Unreliable narrator, untrustworthy partner

    Today I reached the limits of ChatGPT 4.0’s usefulness for coding, and I think it’s a pretty bad sign for the potential of LLMs as a transformative technology. Up to this point, it’s been useful as a sort of very fancy spell check. If I need a quick refresher on basic syntax in Python, or if a section of code is not working due to a rookie error, ChatGPT is a nice way to get things unstuck as fast as possible. I don’t want to have to pull a physical dictionary off a shelf to look up how to spell “reservoir” every time I write it, and I don’t want to plumb the depths of Stack Overflow every time I want a refresher on how class inheritance works.

    The problem is that when things get to an intermediate level of complicated, ChatGPT can start giving you VERY confident-sounding answers that are completely wrong, or at least so dumb and backwards that you’re better off starting from scratch. It’s hard to put my finger on exactly when this starts. When it comes to coding, I’m not a domain expert (obviously, or I wouldn’t be asking ChatGPT), but sometimes over the course of a series of queries, I get a sort of a sense, a little tingle, a suspicion that the solutions offered seem way too loopy and repetitive to be any good.

    For me today, this happened during a series of queries to help with building a GUI for my RSS reader using Tkinter. The details are tedious, but it boils down to ChatGPT offering me a solution that didn’t work, then offering me a different solution that didn’t work, then offering me the FIRST solution again, and that’s when I realized this thing isn’t actually thinking. OK, I knew that already, intellectually, but it can really sound like it is troubleshooting! LLMs specialize in convincing you that they aren’t just massive statistical engines pooping out the most plausible next token, but that’s really all they are.

    For whatever reason, when it came to the Tkinter library, ChatGPT was incapable of explaining the basic structure, terrible at offering elegant structures to get started, and worse at debugging things when they ran into trouble. Part of me suspects this is because the output of Tkinter is graphic, and the model breaks down where the linguistic direction is supposed to translate into a visual result. So maybe that. But the other problem is that there is more than one way to do things in Tkinter. You can pack, grid, and place widgets in the UI in any number of ways depending on preference, but you need to take a consistent approach, and ChatGPT clearly wasn’t able to keep track of it over the course of troubleshooting.
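
    To be concrete about the consistency problem: in Tkinter, each container has to stick to one geometry manager, and calling pack() on one widget and grid() on a sibling in the same parent makes Tk throw an error. A minimal sketch of the grid-only approach (the widget names here are my own, not from my actual code):

```python
import tkinter as tk

def build_feed_form(parent):
    """Lay out a label and an entry field using grid() only.

    The trap is mixing managers: pack() on one widget and grid() on a
    sibling in the same parent raises a TclError, which is easy to
    stumble into when an assistant keeps switching layout styles on
    you mid-troubleshooting.
    """
    label = tk.Label(parent, text="Feed URL:")
    label.grid(row=0, column=0, sticky="w")
    entry = tk.Entry(parent)
    entry.grid(row=0, column=1, sticky="ew")
    parent.columnconfigure(1, weight=1)  # let the entry stretch with the window
    return entry
```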

    This makes me very cautious about using ChatGPT in the future to compose complex sections of code employing tools I don’t fully understand yet. While it can be good for getting started, today’s experience taught me that I’m better off learning from other sources before I dive in, as I might just have to do it all over again anyway.

    ChatGPT is still an excellent tool for basic coding support, but it is a support tool, one of many in an IDE. This is worth emphasizing because some LLM boosters would have you believe this technology is set to change life as we know it, replacing human workers and revolutionizing how we do literally everything. It’s just not. There’s no depth to it. It’s inconsistent, unreliable, and untrustworthy. And I’m pretty worried the powers that be are going to try to force it into every corner of our digital lives anyway.


  • Everything is SEO now

    I feel like I was just saying this the other day. People don’t write websites or blogs for human readers anymore: They write them for the machines.

    The relentless optimizing of pages, words, paragraphs, photos, and hundreds of other variables has led to a wasteland of capital-C Content that is competing for increasingly dwindling Google Search real estate as generative AI rears its head. You’ve seen it before: the awkward subheadings and text that repeats the same phrases a dozen times, the articles that say nothing but which are sprayed with links that in turn direct you to other meaningless pages. Much of the information we find on the web — and much of what’s produced for the web in the first place — is designed to get Google’s attention.

    It’s a very good article. LLMs are not new in this regard, their enshittification of the web is just finishing a process that is already 90% complete thanks to our reliance on ad-supported algorithmic search as a means of sorting information.


  • Parsing an RSS feed and the risks of a sycophantic coding assistant

    I don’t like the idea of a coding project that’s just you typing out what someone in a YouTube video tells you to type. That’s like doing a crossword puzzle with someone telling you what letters to put in each square. First, it’s boring. You don’t get the satisfaction of solving it on your own. Second, you can’t possibly learn much. There is no “right” way to code something, there are only better ways and worse ways, and part of learning is doing something in the stupidest way possible and then finding out first-hand why you shouldn’t do it that way.

    Anyway, I’m coding this desktop RSS reader I’ll call DonkeyFeed, for no particular reason. The GitHub repository is here, and apologies in advance, I still have no fucking idea how Git works, so I hope I didn’t accidentally expose my ass. I selected a desktop RSS reader with a GUI as a project because I need one, and it seems like the kind of thing I can iterate gradually as I learn how the pieces work.

    So far, that’s what is happening!

    It is turning out to be a great project for working with a number of different libraries and moving parts. Using a GUI means working on a front end and user experience and learning Tkinter. For RSS parsing on the back end, I have to learn to use feedparser and figure out how to get the parts of the RSS feed that I need, put them into some kind of array, and put that array into an object to be placed in a window. Then I’ll also need to manage a database, add user-initiated functionality, put in some testing modules, and maybe even write a readme.txt, who knows.

    So far, I have mostly worked on the GUI and the RSS parsing. For the former, I’m building a “Window” class with methods for all the different widgets (buttons, input fields, listboxes, etc.), although I’m still not super clear on what the most extensible/usable way to do that will be. I’ve made more progress on the RSS parsing: I’m building a class that can be instantiated for a single RSS feed, with methods that return the separate parts of the feed so I can place them in the window how I want later.
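
    For what it’s worth, here’s a rough sketch of the “Window” class idea, one method per widget type. The method names and pack()-only layout are my assumptions about where this is headed, not the actual DonkeyFeed code:

```python
import tkinter as tk

class Window:
    """Sketch of a wrapper class with one method per widget type.

    Hypothetical names; the real code may differ. Sticking to a single
    geometry manager (pack here) keeps the layout consistent.
    """

    def __init__(self, title="DonkeyFeed"):
        self.root = tk.Tk()
        self.root.title(title)

    def add_button(self, label, command):
        button = tk.Button(self.root, text=label, command=command)
        button.pack(fill="x")
        return button

    def add_input(self):
        entry = tk.Entry(self.root)
        entry.pack(fill="x")
        return entry

    def add_listbox(self):
        box = tk.Listbox(self.root)
        box.pack(fill="both", expand=True)
        return box

    def run(self):
        self.root.mainloop()
```

    Instantiating it (`Window().run()`) would open an empty window; the open design question is whether widget-adding methods like these stay manageable as the UI grows.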

    I worked with ChatGPT quite a bit in building this part, everything from very basic debugging (I’m still occasionally forgetting parentheses lol) to back-and-forth about how some feature will work. I’m finding ChatGPT can be dangerous because it WILL GIVE YOU WHAT YOU ASK FOR. This is a known problem with LLMs, sycophancy bias. Basically, the LLM tells you what you want to hear. My coding skills are so incipient at the moment that for now, I’m mostly getting “actually”ed by ChatGPT. Like here. Ooof. Big L.

    But as my queries get more sophisticated, it is happy to help me hang myself. For example, to parse an RSS feed, I built a method that uses feedparser to send the title, link, and summary of each entry to a list, and those items could then be retrieved for display. That is: three separate lists. Maybe you can see where this is going. ChatGPT was happy to assist me with this. It was later that I realized keeping three separate lists for items that need to be synched up with each other was… problematic. ChatGPT cheerfully agreed:

    It’s a great reminder that coding assistants are ASSISTANTS. You have to catch this stuff, they will not stop you from showing your ass, on the contrary, they will 100% help you pull your pants down and then tell you “good job, sir!!” Indeed, note that I’m the one who suggested using a dictionary here instead. I may actually be showing my ass in a new and exciting way.
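
    The dictionary version would look something like this. This is a sketch, with stand-in entries shaped like what feedparser returns, not my actual code:

```python
# Stand-ins for feedparser entries (feedparser.parse(url).entries
# behaves like a list of dicts with these keys, among others).
entries = [
    {"title": "Post one", "link": "https://example.com/1", "summary": "First."},
    {"title": "Post two", "link": "https://example.com/2", "summary": "Second."},
]

def collect_entries(entries):
    """Return one list of dicts instead of three parallel lists.

    Each entry's title, link, and summary travel together, so they
    can never drift out of sync the way separate lists can.
    """
    return [
        {
            "title": entry.get("title", ""),
            "link": entry.get("link", ""),
            "summary": entry.get("summary", ""),
        }
        for entry in entries
    ]

items = collect_entries(entries)
```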

    Anyway, the next thing I have to change in this code is to send the feed elements to a dictionary instead of three separate lists, and then I will be ready to test. I haven’t built any testing modules yet, but it’s on my mental list of things to do. Another functionality I need to add soon is some kind of database management, so a user can save a list of RSS feeds and query them whenever they open the program. Off to code.
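
    For the feed-saving piece, Python’s built-in sqlite3 module would probably be enough. A hypothetical sketch, where the table and column names are mine and nothing is decided yet:

```python
import sqlite3

# In-memory database for the sketch; the real app would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS feeds (name TEXT, url TEXT UNIQUE)")

def save_feed(conn, name, url):
    """Save a feed, ignoring duplicates thanks to the UNIQUE url column."""
    conn.execute("INSERT OR IGNORE INTO feeds VALUES (?, ?)", (name, url))
    conn.commit()

def list_feeds(conn):
    """Return all saved (name, url) pairs, sorted by name."""
    return conn.execute("SELECT name, url FROM feeds ORDER BY name").fetchall()

save_feed(conn, "Example blog", "https://example.com/rss")
```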


  • useful, sort of

    As part of my “learn Python” project, I’m building a desktop RSS reader. This could be a pretty straightforward little dingus, but I want to use it to explore structuring code in a way that can be extended and tested later, so I’m trying a class-based approach. It has been like a lot of the other creative things I’ve done over the years, in the sense that you work on it for a bit, create something bad, then smoosh it down and start over with what you’ve learned.

    My big take-away so far is that (SHOCKING!!) ChatGPT is not as useful as it looks at first. OK, so it definitely IS useful. You can ask it things like “how would I build a basic GUI using Python?” and it gives you Tkinter syntax examples to get the ball rolling. But it reminds me of doing translation (my day job) using a machine translation engine: 1) you get something that looks good; 2) great! you set out to tweak it and clean it up; 3) you find out it needs a fundamental re-write; 4) you’ve Ship of Theseus’d the thing, and it took longer than just doing it from scratch.

    Still, it’s been useful when I get stuck, and it’s a nice place to get quick answers to easy questions. What this experience is reinforcing for me is that humans are still best at the big-brained strategic/creative planning. While something like GitHub Copilot can help you build the pieces, you’re the one who has to know how to put the pieces together.


  • learning to code

    Simon Willison, a computer scientist and LLM researcher, published a roundup of everything we learned about LLMs in 2023 and it is a very good and entertaining post. You should read the whole thing, but I would specifically like to highlight this part:

    Over the course of the year, it’s become increasingly clear that writing code is one of the things LLMs are most capable of.

    If you think about what they do, this isn’t such a big surprise. The grammar rules of programming languages like Python and JavaScript are massively less complicated than the grammar of Chinese, Spanish or English.

    It’s still astonishing to me how effective they are though.

    One of the great weaknesses of LLMs is their tendency to hallucinate—to imagine things that don’t correspond to reality. You would expect this to be a particularly bad problem for code—if an LLM hallucinates a method that doesn’t exist, the code should be useless.

    Except… you can run generated code to see if it’s correct. And with patterns like ChatGPT Code Interpreter the LLM can execute the code itself, process the error message, then rewrite it and keep trying until it works!

    So hallucination is a much lesser problem for code generation than for anything else. If only we had the equivalent of Code Interpreter for fact-checking natural language!

    I’m in the process of learning to write software with Python and this rings absolutely true. ChatGPT 4.0 is BONKERS good, it’s like having a tutor sitting at the desk with me. It makes the learning process so much more fluid, and even if it gives me wrong answers, a) I can check them when I try to run the code; b) it gives me a starting point for solving a problem; and c) I can use it to iterate.

    LLMs open coding up to vastly more curious tinkerers who don’t like dealing with tedious syntax and impenetrable documentation. I suspect part of the reason Silicon Valley is losing its mind over this technology is because it can do what they do.


  • the open web is dead, long live the open web

    LLMs are going to bring back the open web. OK, wait, hear me out. Look around the internet sometime, just do a random Google search and poke it with your finger: it is LOADED with shit. Just pure, hot shit, bulging with it, and it’s also the slickest shit you’ve ever seen: Perfect front-end design, stock photos with beautiful people, and paragraphs of flawless nonsense that repeat your search terms like mantras. “Fuck this shit,” you might say, but that’s fine because you’re not actually supposed to look at it. It wasn’t made for humans, it was made for algorithms.

    Even when a website is for a real business or publisher, it HAS to be loaded with SEO shit to get Google to put it in front of a human to sell them a thing, show them an ad, or collect a piece of data, but whether the human (you) actually reads the writing on the website or looks at the images is mostly irrelevant. Even the media sites that are trying to communicate something to the public are so gummed up with popups and popovers it feels like 2003.

    How are LLMs going to fix this? They aren’t lol, they are going to make it VASTLY worse, infinitely worse. The technology makes it so easy to produce algorithm-hacking shit that it will bury everything, which is maybe why Google is flop-sweating right now: finding real information on the open web is changing from a needle-in-a-haystack problem to a needle-in-a-needlestack problem. And it’s not just search that is going to have this problem– it’s every platform that sorts content using algorithms.

    My contention is that, paradoxically, this will make human-made internet content very valuable. This is why Reddit is trying to lock down its content, and why Wirecutter does good business behind a paywall. Increasingly, as search engines are overwhelmed with the digital equivalent of cheap injection-molded plastic, wood grain and jade stone become more desirable, but to find them, you have to know a guy.

    I don’t know the future, but I remember the past, before algorithms filtered the internet, and back then there were still lots of ways to find the good stuff. There were portals, web rings, aggregators, message boards, and chat rooms, as well as blogs, RSS, and word of mouth. Today, we also have the ActivityPub protocol, podcasts, e-mail newsletters, and probably a bunch of stuff I’m forgetting.

    Algorithmic search marked the beginning of the end of the open web, and now LLMs will finish putting it in its grave. But I prefer to think of it as planting a seed.


  • zig when they zag

    Now that everyone is getting their information from an infinite scroll of America’s Funniest Home Videos, I figured I’d start a blog. It is a very Elder Millennial Move, but I am going for it.