Asymmetry

I had a fun time yesterday playing with customized prompts in Ollama. These basically wrap the prompts you give to your local LLM in a standing system prompt that conditions every response. So, if you want your LLM to respond to queries briefly, you can specify that. Or if you want it to respond to every prompt like Foghorn Leghorn, hey. You can also adjust the “temperature” of the model: the spicier you make it, the weirder things get.
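For the curious, the mechanism is an Ollama Modelfile. Here’s a minimal sketch of what that looks like — the model name, parameter value, and persona here are placeholders, not the exact ones I used:

```
# Modelfile — minimal sketch; model, temperature, and persona are placeholders
FROM llama2-uncensored

# Higher values make output more random — this is the "spice" knob
PARAMETER temperature 0.8

# The standing prompt that gets wrapped around every query
SYSTEM """
Answer every question briefly, in the voice of Foghorn Leghorn.
"""
```

Then something like `ollama create foghorn -f Modelfile` followed by `ollama run foghorn`, and every response comes back conditioned by that standing prompt.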

First I tried to make a conspiracy theorist LLM, and it didn’t really work. I used llama2-uncensored and it was relentlessly boring. It kind of “both sides”ed everything and was studiously cautious about taking a position on anything. (Part of the problem may have been that I had the temperature set at 0.8 because I thought the scale was 0-1.)

So I moved on to a different uncensored model, dolphin-mixtral, and oh boy. This time, I set the temperature to 2 and we were off to the races:

OK buddy, go home, you’re drunk.

Then I cranked the temperature up to 3 and made a mean one…

… and a kind of horny/rapey one that I won’t post here. It was actually very easy, in a couple of minutes, to customize an open-source LLM to respond however you want. I could have made a racist one. I could have made one that endlessly argues about politics. I could have made one that raises concerns about Nazis infiltrating the Ukrainian armed forces, or Joe Biden’s age, or the national debt.

Basically, there are a lot of easy ways to customize an LLM to make everyone’s day worse, but I don’t think I could have customized one to make anything better. There’s an asymmetry built into the technology. A bot that says nice things isn’t going to have the same impact as a bot that says mean things, or wrong things.

It got me thinking about Brandolini’s law, the bullshit asymmetry principle: “The amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it.” You can generalize it even further: it’s easier to break things than to make things.

As the technology stands right now, uncensored open-source LLMs can be very good at breaking things: trust, self-worth, fact-based reality, sense of safety. It would be trivial to inject LLM-augmented bots into social spaces and corrupt them with hate, conflict, racism, and disinformation. It’s a much bigger lift to use an LLM to make something.

The cliche is that a technology is only as good as its user, but I’m having a hard time imagining a good LLM user who can do as much good as a bad LLM user can do bad.
