Victor Miller, a mayoral candidate in Cheyenne, the capital of Wyoming, wants to use an AI bot to run the city. His idea is to use ChatGPT, specifically GPT-4, to manage the city, analyze laws, and make decisions in place of the mayor. He's currently running for office and has recently attracted a lot of publicity because of this idea. It's fascinating because it reflects the current excitement around artificial intelligence, particularly tools like ChatGPT: people are eager to apply these solutions everywhere, even in areas where they may not be useful.
This is something that could probably only happen in the U.S., a country where you find both the best and the worst of everything. Americans are generally more willing to take risks and less fearful than people in other parts of the world. So, while I wasn't surprised to see this happen in the U.S. first, it's still somewhat shocking (perhaps even 'crazy' is the right word) to see it actually unfolding. When you understand how large language models are trained, you realize they are not truly capable of reasoning. They can provide good recommendations, but they can also produce nonsensical ones depending on the input. It's completely unreasonable to think that an AI bot could effectively run a city.
There are many claims about the reasoning abilities of large language models, but the truth is that these models perform statistical matching and inference based on lexical cues. They aren't truly reasoning; they're pattern matching against the text they've been trained on. Large language models are very good at mimicking reasoning on information they were trained on, or on information similar to it. However, if you ask them something outside of their training, which is hard to do because these models are trained on nearly the entire content of the internet, you'll find that they struggle. Faced with genuinely novel questions that fall outside their training data, they often fail to respond effectively.
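To make the pattern-matching point concrete, here is a deliberately crude sketch: a word-bigram model a few lines long, nothing like GPT-4's transformer architecture, that only memorizes which word tends to follow which in its training text. The `training_text` and `next_word` names are purely illustrative; the point is that it produces a fluent-looking continuation for prompts it has seen and nothing useful for prompts it hasn't.

```python
from collections import Counter, defaultdict

# Toy illustration only: a word-bigram model, vastly simpler than a real LLM,
# but it shows the core idea of continuing a prompt with the statistically
# most likely next word observed during training.
training_text = (
    "the city council approved the budget . "
    "the city council approved the motion . "
    "the city council rejected the proposal ."
)

# Count how often each word follows each other word in the training text.
follows = defaultdict(Counter)
words = training_text.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev][nxt] += 1

def next_word(prompt_word: str) -> str:
    """Return the most frequent continuation seen in training, if any."""
    options = follows.get(prompt_word)
    if not options:
        return "<never seen in training: no useful answer>"
    return options.most_common(1)[0][0]

# A familiar prompt gets a plausible-looking continuation...
print(next_word("council"))  # -> "approved" (seen twice vs. "rejected" once)
# ...while anything outside the training data gets nothing useful.
print(next_word("wyoming"))  # -> "<never seen in training: no useful answer>"
```

Real models blur this failure mode because their training data is vast and their representations generalize, so instead of returning nothing they tend to interpolate a confident-sounding answer; but the underlying mechanism is still statistical prediction over seen text, not reasoning.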
This is something to keep in mind, particularly in real-life situations that go beyond the mundane things people typically do online, like searching for information. For an ordinary search, you're not the first person to ask about the topic, so a large language model will usually be able to give you an answer. Sometimes the answer will be wrong, and most people won't even notice the error unless they're particularly curious. But when questions are genuinely novel and fall outside the training set, large language models will struggle.
My take is that large language models are not yet capable of making autonomous decisions. I would still be curious to see the experiment, because in some respects large language models are more reasonable than people, who often make emotional or irrational decisions. However, letting a large language model make decisions about people's lives is very dangerous given its ethical limitations. OpenAI actually closed Victor Miller's account when they learned about his project, but he simply created another one. I suppose he could even use an open-source model if he wanted. I don't think he will win, but it is amusing to see this happening, and it's a good opportunity to reflect on what large language models can and cannot do.