EA - How to Diversify Conceptual AI Alignment: the Model Behind Refine by adamShimi
The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How to Diversify Conceptual AI Alignment: the Model Behind Refine, published by adamShimi on July 20, 2022 on The Effective Altruism Forum.

This work was done while at Conjecture.

Tl;dr: We need far more conceptual AI alignment research approaches than we have now if we want to increase our chances of solving the alignment problem. However, the conceptual alignment field remains hard to access, and what feedback and mentorship there is focuses on a few existing research directions rather than stimulating new ideas. This model led to the creation of Refine, a research incubator for potential conceptual alignment researchers, funded by the LTFF and hosted by Conjecture. Its goal is to help conceptual alignment research grow in both number and variety, through some minimal teaching and a lot of iteration and feedback on incubatees' ideas. The first cohort has been selected and will run from August to October 2022. In the bigger picture, Refine is an experiment within Conjecture to find ways of increasing the number of conceptual researchers and improving the rate at which the field makes productive mistakes.

The Problem: Not Enough Varied Conceptual Research

I believe that in order to solve the alignment problem, we need significantly more people attacking it from a lot of different angles. Why? First, because none of the current approaches appears to yield a full solution. I expect many of them to be productive mistakes we can and should build on, but they don't appear sufficient, especially with shorter timelines.
In addition, the history of science teaches us that for many important discoveries, especially in difficult epistemic situations, the answers don't come from one lone genius seeing through the irrelevant details, but instead from bits of evidence revealed by many different takes and operationalizations (possibly unified and compressed together at the end). And we should expect alignment to be hard based on epistemological vigilance.

So if we accept that we need more people tackling alignment in more varied ways, why are we falling short of that ideal? Note that I will focus here on conceptual researchers, as they are the source of most variations on the problem, and because they are so hard to come by. I see three broad issues with getting more conceptual alignment researchers working on wildly different approaches:

(Built-in Ontological Commitments) Almost all current attempts to create more conceptual alignment researchers (SERI MATS, independent mentoring...) rely significantly on mentorship by current conceptual researchers. Although this obviously comes with many benefits, it also leads to many ontological commitments being internalized when one is learning the field. As such, it's hard to go explore a vastly different approach, because the way you see the problem has been moulded by this early mentorship.

(Misguided Requirements) I see many incorrect assumptions about what it takes to be a good conceptual researcher floating around, both from field-builders and from potential candidates.
Here's a non-exhaustive list of the most frustrating ones:

- You need to know all previous literature on alignment. (The field has more breadth than depth, so getting a few key ideas is more important than knowing everything.)
- You need to master maths and philosophy. (A lot of good conceptual work only uses basic maths and philosophy.)
- You need to have an ML background. (You can pick up the relevant parts and just work on approaches different from pure prosaic alignment.)

(No Feedback) If you want to start on your own, you will have trouble getting any feedback at all. The AF doesn't provide much feedback even for established researchers, and it has almost nothing in store for newcomers. Really, the main source of feedback in the field is asking other researchers, but when y...
