EA - AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media by Oliver Z

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media, published by Oliver Z on April 18, 2023 on The Effective Altruism Forum.

Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

Subscribe here to receive future versions.

ChaosGPT and the Rise of Language Agents

Chatbots like ChatGPT usually only respond to one prompt at a time, and a human user must provide a new prompt to get a new response. But an extremely popular new framework called AutoGPT automates that process. With AutoGPT, the user provides only a high-level goal, and the language model will create and execute a step-by-step plan to accomplish the goal.

AutoGPT and other language agents are still in their infancy. They struggle with long-term planning and repeat their own mistakes. Yet because they limit human oversight of AI actions, these agents are a step towards dangerous deployment of autonomous AI.

Individual bad actors pose serious risks. One of the first uses of AutoGPT was to instruct a model named ChaosGPT to “destroy humanity.” It created a plan to “find the most destructive weapons available to humans” and, after a few Google searches, became excited by the Tsar Bomba, an old Soviet nuclear weapon. ChaosGPT lacks both the intelligence and the means to operate dangerous weapons, so the worst it could do was fire off a Tweet about the bomb. But this is an example of the “unilateralist’s curse”: if one day someone builds AIs capable of causing severe harm, it only takes one person to ask it to cause that harm.

More agents introduce more complexity. Researchers at Stanford and Google recently built a virtual world full of agents controlled by language models. Each agent was given an identity, an occupation, and relationships with the other agents. They would choose their own actions each day, leading to surprising outcomes. One agent threw a Valentine’s Day party, and the others spread the news and began asking each other on dates. Another ran for mayor, and the candidate’s neighbors would discuss his platform over breakfast in their own homes. Just as the agents in this virtual world had surprising interactions with each other, autonomous AI agents have unpredictable effects on the real world.

How do LLM agents like GPT-4 behave? A recent paper examined the safety of LLMs acting as agents. When playing text-based games, LLMs often behave in power-seeking, deceptive, or Machiavellian ways. This happens naturally. Much like how LLMs trained to mimic human writings may learn to output toxic text, agents trained to optimize goals may learn to exhibit ends-justify-the-means / Machiavellian behavior by default. Research to reduce LLMs’ Machiavellian tendencies is still in its infancy.

Natural Selection Favors AIs over Humans

CAIS director Dan Hendrycks released a paper titled Natural Selection Favors AIs over Humans.

The abstract for the paper is as follows:

For billions of years, evolution has been the driving force behind the development of life, including humans. Evolution endowed humans with high intelligence, which allowed us to become one of the most successful species on the planet. Today, humans aim to create artificial intelligence systems that surpass even our own intelligence. As artificial intelligences (AIs) evolve and eventually surpass us in all domains, how might evolution shape our relations with AIs? By analyzing the environment that is shaping the evolution of AIs, we argue that the most successful AI agents will likely have undesirable traits. Competitive pressures among corporations and militaries will give rise to AI agents that automate human roles, deceive others, and gain power. If such agents have intelligence that exceeds that of humans, this could lead to hu...
