Google recently released a paper outlining their new medical LLM, called AMIE (“Articulate Medical Intelligence Explorer”). AMIE can take medical histories, diagnose medical conditions, and communicate directly with patients. Despite all the caveats they place on their conclusions, it’s clear that AMIE works: in simulated medical settings, it outperformed human doctors on most measures (28 of 32 categories according to medical experts, and 24 of 26 according to the patients). Not only did AMIE make more accurate diagnoses than the doctors, but patients also rated it as more empathetic than the humans.*
There are so many new models coming out these days, both specialist and generalist, that this didn’t get a lot of attention. But it’s a big deal, for obvious and non-obvious reasons. Here are the five lessons to be learned from AMIE:
- This is the real deal. We have shortages of doctors all over the world, including both primary-care physicians and specialists in many areas. AMIE may or may not be the winning medical AI tool, but either it or something like it will become ubiquitous over the coming decade. Once we have technology that allows 10 doctors to do the work of 20, economic logic ensures that it will be deployed. And it should be deployed. Patients may not like giving their histories to a robot instead of a doctor, but in the absence of enough doctors to provide medical care to everyone in the traditional manner, tools like AMIE will save lives.
- AMIE is the first of many “do my job” LLMs to come. Everyone is still experimenting with AI, and most of the real-world deployments to date have either been extremely niche or involved minor productivity boosts like creating first drafts of copy. Although it’s intended as an assistant, AMIE could, if required, provide end-to-end basic care for many patients, whereas generalist LLM chatbots generally can’t do the entire job of a skilled worker. Robot doctors will be followed by robot lawyers, robot accountants, and robot teachers, to name a few.
- Specialist LLMs are more powerful than people realize. A crude formula for the effectiveness of an LLM is: Power (parameters, data, and design quality) divided by Breadth (the number of different things the LLM needs to be able to do). If you only want your LLM to function in a relatively narrow field, and it doesn’t need general knowledge or language skills outside that field, then you can get much higher levels of quality (as shown in accuracy and clarity) for a given level of underlying power. Generalist LLMs will always be useful for open-ended casual use, but professional applications will become the domain of narrower specialist models.
- AMIE keeps humans in the loop, but it doesn’t really need them. Google is understandably nervous about the prospect of an LLM replacing doctors entirely, for both ethical and practical reasons. So AMIE is intended only to help doctors in a supervised clinical setting. But there’s no reason why a medical LLM, or any professional LLM, has to be designed like that. It’s technically possible for someone to build a version of AMIE that works without supervision and make it available directly to patients (and indeed I think that’s pretty likely in the near future).
- We face an imminent decision on how to use the productivity gains from professional AI tools. It’s almost certain that more tools like AMIE will be deployed in the coming years, and while that raises a host of issues to work through, we will see significant benefits as we do so. There is a sociopolitical question about how we should use the productivity gains that result from these tools. In medicine, it’s likely that we will use them to relieve shortages of medical professionals, and I can’t imagine that will be contentious. But in fields without those kinds of shortages, it might be a different story. We can simply harvest those efficiency gains and spend less money on the affected activities (with fewer people employed in them), or we can reinvest the benefits, allowing the people who have those jobs more time to do the “human” part of the job. Personally, if we’re going to have robots assisting our teachers (for example, creating and assessing homework assignments), I hope that we don’t decide to employ fewer teachers. It would be so much better to give them that time back to spend with their students directly. We can make this choice differently, but we will have to make it.
Some people will love AMIE; some will find it terrifying. Either way, though, the right answer is to consider how we can ensure it plays the most constructive role possible in society, instead of simply celebrating it or criticizing it. AMIE is an early glimpse into the future of work, and by noticing that future, we can give ourselves more time to be ready for it.