Artificial intelligence is transforming how we discover and develop new medicines. But how far can it really take us?
In this episode of Hard Drugs, Saloni Dattani and Jacob Trefethen trace the path of drug development from discovery to testing, manufacturing, and delivery. They explore where AI could speed things up, and where it still hits the limits of biology, data, and economics. They ask what it would take, beyond algorithms, to actually cure and eradicate diseases.
Hard Drugs is a new podcast from Works in Progress and Open Philanthropy about medical innovation presented by Saloni Dattani and Jacob Trefethen.
You can watch or listen on YouTube, Spotify, or Apple Podcasts.
Saloni’s substack newsletter: https://www.scientificdiscovery.dev/
Jacob’s blog: https://blog.jacobtrefethen.com/
Transcript
Saloni Dattani:
Is AI about to cure all diseases? This year, Demis Hassabis, the CEO of DeepMind, said, “I think one day maybe we can cure all disease with the help of AI. I think that’s within reach, maybe within the next decade or so. I don’t see why not.” Is he right? In this episode, we figured we’d tackle that head on. I’m Saloni Dattani, this is Jacob Trefethen, and we’re presenting the new podcast, Hard Drugs.
Jacob Trefethen:
So far, we’ve talked about what proteins are, how they can be medicines like insulin, how AI can help scientists improve proteins to make even better drugs, and how AI can help design entirely new proteins never seen in nature.
Saloni Dattani:
Now we’ll zoom out to look at the drug development process as a whole, talk about what AI might speed up and where new drugs might still get stuck. This one is necessarily more speculative than usual; we’re going to draw on examples from the past and talk about the possibilities of the future.
Jacob Trefethen:
You can leave the end of the episode with better guesses of whether AI is about to change everything or whether it will be one tool among many that scientists can draw on. Will we cure all disease in 10 years? Let’s get into it.
Saloni Dattani:
I recently read a blog post whose author says, “We still can’t predict much of anything in biology.” I thought that was kind of interesting because the last two episodes we’ve talked about how AI is being used to improve protein structure prediction and design new proteins. But he basically explains that even though there has been a lot of progress, there are still a lot of really unsolved problems. Biology is much more complex than people would imagine, and even the types of problems that have been solved are not necessarily representative of all the problems that are out there. We haven’t really modeled the whole complexity of cells, organs, and organisms as a whole.
He says, “I remember in the early 2000s, David Baker was revolutionizing computational protein design with his Rosetta software suite, winning CASP competitions left and right, and writing papers that gave the impression computational protein design was solved. For example, computational design of novel folds was solved by 2003. Protein docking was solved by 2003. Enzyme design was solved by 2008. Atom level co-folding of multi-peptide chains was solved by 2009. Yet here we are 20 years later. All of these topics are still active areas of research, and if you have any particular system of interest, you may find that none of the available methods perform that well.”
He gives three different reasons for this. It’s not that the research is sloppy or anything like that, but essentially that the types of work that people have approached with AI so far have been ones where the likelihood of success is much higher. They’re kind of areas where you could make those predictions well already, and we have much more data available on those topics. We have only just started solving these soluble proteins (proteins dissolved in water) and not all of these other types. In nature, proteins might be wiggling around; they might be attached to a membrane, or they might be attached to some drug. They’re doing different things, and those things haven’t yet been solved.
Jacob Trefethen:
Yeah, I found that quote so interesting that you just read out because it can always feel like you’re on the precipice of sudden change. Hearing people talk about — people who have had that feeling before especially — what was the same that time around and what’s different this time around is really useful. There are different lenses and worldviews that people apply to AI progress, and I think people come at this question from extremely different places. I just want to outline those worldviews in case you’re a listener who feels described by one of them and are worried we’re not going to hear you properly because you don’t have a microphone in front of you.
On the one hand, there are many people who think AI will change everything. Hold aside the debate about whether people trying to make artificial superintelligence are actually going to achieve that, and just project forward: assume that AI progress will continue and that we’ll get to more powerful systems of some sort. I think a lot of people look at that and think, “Well, in principle, a lot about the physical world is knowable, and we just don’t know it yet. Science is about discovering that knowledge, and scientific discovery does progress. It is often limited by having some of the smartest people in the world, Einstein and all that lot, devote their energies to thinking about the external world around us. If we’re about to invent systems that can reason well, debate with us, and debate with each other, instead of having hundreds of thousands of working scientists alive at any time working on discovering the nature of the universe, we can have hundreds of millions maybe, but they are AI agents.”
There might be a period when you apply that to human biology, where in a few years, we ask those helpful assistants, “If we want to learn about the human body and develop drugs that will prevent death and prevent illness, help us do this in the fewest number of experiments, and communicate it to us in a way we’ll understand.” In a few years, those experiments will get carried out, they’ll get published, and humanity will be much better off. You’ll walk into a booth, you’ll get genotyped, you’ll maybe give a few blood samples, give your medical history, and an AI doctor will tell you, “Okay, here’s your regimen of seven daily pills you’ll take for the next year.” That will actually prevent ill health for you, based on all the knowledge we discovered of the human condition. I would say that this worldview is quite common where I live, San Francisco, so I hear this from friends quite often.
But let me say the other view, which is very common too, and probably more common from practicing scientists and people who’ve worked in drug development, which is: human biology is really complicated. We understand very little of it. It’s hard to even take good measurements of the human body, in certain parts of the human body — the brain, the heart, and so on — without doing harm to a given person. In the case of software, sure enough, transistors have faced Moore’s law where they’re getting cheaper and cheaper to make over time, and that’s led to a big boom in technology from a software point of view.
We have the reverse in drug development. We have what people sometimes call Eroom’s law. The reverse: E-R-O-O-M.
Saloni Dattani:
It’s Moore’s law backwards.
Jacob Trefethen:
Everything’s getting harder and more expensive. Progress is really hard won. The more positive spin on that is we’ve actually already technologically solved some of the worst health problems humanity used to face, in rich countries at least. We now have antibiotics. We now have vaccines for childhood diseases. We have statins to reduce your risk of heart disease. There’s more progress coming, but it’s fiddly, it’s difficult. The bottlenecks aren’t mostly in discovery where AI might help; they’re in off-target effects and toxicity from these drugs. The expense of clinical trials is the block. Manufacturing new modalities is the block. Health systems themselves are the block in making sure that people who need new drugs can actually access them. I think that’s a whole different worldview, and sometimes one person can be both of those people, but a lot of the time those people don’t communicate that well.
Saloni Dattani:
Do you think we should get all of these different people in a room and just watch them fight, or are we going to try to mediate them and solve all of their questions?
Jacob Trefethen:
In a world where we could get them all to fight, I think that might get us even more downloads on this podcast. But for now, you and I are going to have to hash it out in this episode.
To figure out how AI could affect medicine, we’re going to talk about the steps to making medical progress today and see if AI can speed that step up or let us skip that step entirely.
Drug discovery
Saloni Dattani:
So, one of the areas where AI seems most promising is drug discovery. That means finding potential drugs that could be used as treatments. You probably think understanding the disease is crucial to developing new drugs, right? Actually, wrong. Drugs can be developed without understanding the disease at all, and there are many different ways that that can happen. This was very common in the past, but it’s still common today. To explain why, I want to give you three different examples from different parts of history: one is Jenner’s smallpox vaccine; two, the discovery of a new drug for malaria in the 1970s; and three, a new schizophrenia drug that was approved last year.
Let’s start with Edward Jenner in 1796, a very long time ago. As many people have probably heard, Edward Jenner developed a vaccine against smallpox by using the pus from cowpox infections. Dairy maids who were infected by a related virus that causes small pustules on their hands were also protected from smallpox. He extracted that pus and transferred it to children, adults, and so on to protect them from a potential outbreak of smallpox. This is really interesting because the way that he had discovered that was really through other people’s case reports, observation, and collecting data. He went across many dairy farms and asked them, “Has anyone in your family been protected from smallpox in a previous outbreak? Did any of them not catch it?” He collected data on all of these different individuals who had contracted cowpox at some point and after that had been protected from a smallpox outbreak that all the rest of their family had been infected with. Through that and through experimentation, he developed a new vaccine. This is quite interesting; this is basically a very early form of epidemiological analysis. He’s collecting all of this data from these different examples in front of him, and he’s doing experimentation.
Similarly to that, the way that Louis Pasteur and Emile Roux developed the rabies vaccine in the 1880s was that they did lots of experiments in the lab. They didn’t really have any idea of how vaccines actually worked. At that point, it was quite common to think that the way a vaccine worked was it somehow depleted specific nutrients from your body that the virus or the infection needed in order to cause disease. If you used the vaccine, it would deplete those nutrients, and then you’d be protected from the real infection later on. They did lots of experimentation and seemed to find these methods that were effective, but really they had no idea why; they were very empirical in the way that they did this research. This is, again, long before it became clear how any of these vaccines worked immunologically.
There’s another example from the 1970s of the discovery of artemisinin, which is a malaria drug. That was discovered by a Chinese scientist called Tu Youyou. She was part of this secret research project in communist China, where she was part of the small medical research institute trying to discover new malaria drugs from ancient medical texts. At this point in the 1960s, the Vietnam War was going on, and the previous drugs that people had against malaria were gradually not working anymore. The parasite and the mosquitoes were becoming resistant to them, and it was becoming more of a problem for people fighting in the war, so they needed to find new potential medicines.
What she did was look through more than 2,000 different ancient medical texts (recipes of traditional medicines) for potential herbs and preparations that might be effective. She then narrowed down all of those thousands to a few hundred, and then tested some dozens of them in the lab and in animals and people. Eventually, she isolated this particular compound called artemisinin from the Qinghao, or sweet wormwood, plant. This was just one of hundreds or thousands of different potential recipes that could have worked according to those texts.
What she did is essentially a very early high-throughput screening. Even today, when pharmaceutical companies are trying to find potential drugs, they might just have this library of chemicals that they’ve used for different purposes, that they’ve experimented with before. They want to see, “Do any of those work against this disease that we’re trying to treat?” They will do this mass screening, testing all of these different drugs in the lab to see how they affect whatever — the receptor in cell culture or in animals and things like that. Sometimes they might come upon some that do work. Even in the 1970s, people were applying similar methods where they were really scrolling through all of these hundreds or thousands of different potential recipes and trying to find something.
Related to that, it’s quite funny that the two examples we’ve used seem a bit like traditional or ancient herbal medicines. You might think, “Okay well, why do we need modern science then? What’s the point of trying to do drug development the way that we do it now?” It makes me think about the types of refinements that you can do once you’ve found these compounds. The modern versions of them tend to be a lot better; they contain fewer contaminants. Once you’ve identified the key ingredient that is responsible for the effects, you can then tweak that and remove the impurities. You can try to test the dosing and try to get into a range that is both safe and effective. You can also tweak and improve the efficacy, reduce the side effects, or make it more heat-stable, more soluble, or easier to manufacture. That’s what happened with artemisinin. After Tu Youyou discovered this from this plant, people then adapted and improved on the compound to make it more bioavailable, so you would require a smaller dose to have the same effect.
Jacob Trefethen:
It’s so cool to do both — to use the wisdom of the ancients and the tools of modern science. It makes me want to go read some old texts and see what I can discover.
Saloni Dattani:
I’ve given two old examples, one from the 18th century and one from the 1970s. There’s also a third example that I want to give because there really are these three or four different pathways that you can go about designing or finding a new drug without understanding the disease. That is a schizophrenia drug that was approved last year called xanomeline trospium. I’ve written about this a little bit.
What’s really interesting about it was that it was discovered, or well, it was initially tested as a potential drug for Alzheimer’s disease in the 1990s. While conducting that research, scientists discovered that, surprisingly, it seemed to reduce people’s symptoms of hallucinations, delusions, and agitation in those patients. It wasn’t really slowing down the cognitive decline, but it seemed like, “Well, if this is reducing hallucinations and so on, maybe it could be useful as a schizophrenia drug instead.” After that, they started to test whether it could be repurposed as a schizophrenia drug. But in those trials, it caused a lot of side effects like vomiting and stomach pain. Because it was so hard for people to take, they just shelved the drug and didn’t continue that work.
In the meantime, other researchers were trying to figure out what was actually going on. Why is this drug seemingly causing a reduction in hallucinations and so on, but also causing all of these horrible digestive side effects? The reason is this drug targets muscarinic receptors. This is a type of receptor on the outside of your nerve cells in the brain, and that was the way that it seemed to be reducing these hallucinations and delusions. But also, very similar receptors are in other parts of your body, and they’re doing different things. In your digestive tract, there are muscarinic receptors as well. When both of them are targeted by this one drug, then you could have both of those effects.
What they did was, they combined this original drug, which was xanomeline, with another drug called trospium, which meant that it was unable to target the muscarinic receptors outside of the brain in the gut. It could only really affect the receptors in the brain and reduce those side effects. You get the benefits of reduced hallucinations, but you also don’t have the side effects that were seen earlier. The reason I brought up these three different examples is that there are three different ways that you can develop drugs without really understanding the disease at all.
The first one, Jenner’s smallpox vaccine or Pasteur’s vaccine, is really about this empirical analysis. You’re learning from epidemiology, you’re seeing, “Oh, it seems these people are protected, don’t know why.” You’re trying to experiment with that, see if you can tweak and improve the method; that’s one way. Another way is the malaria drug development, where you’re just testing hundreds or thousands of different compounds, seeing if anything works in the lab or in animals, and then based on that, you’re tweaking and improving it. The third way is you’re repurposing an existing drug. When you’re testing for one condition, you notice something else that might be helpful for another condition. So, can AI replace these methods? What do you think?
Jacob Trefethen:
I’m in two minds about it. I think that contrasting the examples you just laid out, where we didn’t need a full mechanistic understanding, with the more rational version that people might picture, I have some hope for AI spotting patterns that we haven’t yet. If large language models had ingested those ancient Chinese texts, would that have helped prompt good ideas for scientists to investigate further? I don’t see why not. Similarly, every time some new idea comes up with repurposed drugs, I always wonder, “Well, why didn’t that pop up earlier?” So those kinds of questions, I do have quite a bit of hope for AI.
A real input there, though, is what kinds of empirical observations are accessible to different large language models or to different reasoning agents? Are they located in published form? Are they in actual medical papers, are they in case reports, in doctor’s notes? Given different privacy and legal institutional concerns there, will there be access to that data of some form that could prompt these ideas? One of the things that makes me so hopeful though, is that especially for repurposing, we already know as a society how to make very cheap versions of small molecule drugs. The more you can improve people’s health by giving people the right combination of small molecule drugs, the more hope I have that many people can access health-improving technology. There are many other more complicated routes that you might be able to develop drugs than small molecules, and those get me...
Saloni Dattani:
Like antibodies or vaccines or things like that.
Jacob Trefethen:
Antibodies, vaccines...
Saloni Dattani:
Gene editing therapies.
Jacob Trefethen:
Gene editing, surgeries, organ transplants, all of those, incredibly important, but much harder. I worry more about the ability to really reach everyone who might need it.
Saloni Dattani:
I think you’re right; I think this area of drug discovery is maybe one of the more optimistic areas that we’ll talk about. But at the same time, even when you have spotted the patterns or the similarities in diseases, or you’ve analyzed the data from healthcare records and things like that, you still need to do lots of experiments in cells and animals and humans to confirm that they work. We’ll talk about that later on, but basically, even once you do have this collection, you still have to filter that down, and it’s really going to be just a fraction of those that will work in reality; that’s one thing to remember, I think.
The other is that when you’re testing so many different combinations of different drugs and different diseases, you actually need a huge amount of data to have the statistical power for making those inferences. If some of these drugs are not being taken by that many people and they don’t have other diseases, it’s hard to actually test, are those people actually getting better? I think maybe there are some diseases that this approach works better for and some that it works worse for.
My thinking is that there’s probably much more potential for drug discovery with neglected diseases through this route, or rare diseases and things like that, because there is more of a tendency for those types of conditions to be caused by a single exposure, like a single pathogen or a single environmental pollutant. They haven’t really been studied as much yet, so if we do make some effort, you could potentially make lots of progress. Some of these are also genetic congenital conditions that are very rare, so trying to figure out what single gene might be responsible for some of them could help design or develop new drugs.
But at the same time, there are quite a lot of different rare diseases, right? And lots of different neglected diseases where the data collection hasn’t really been that comprehensive, and there isn’t that much for AI models to go on. They haven’t been studied as much yet. There are a lot of different rare diseases; collectively, they’re not that rare, about 5% of the population has some kind of rare disease. But studying each one is very hard because there just aren’t that many people. There are usually 20 per 100,000 people in the population who have that, so there hasn’t been that much research done on that. Once you do collect the data, I would assume that it would be easier for AI to make a lot of progress on those types of conditions. But at the same time, you still have to collect that data in the first place. If we’re thinking about where AI would have the most impact, given the existing amount of data and effort, probably that’s not the rare diseases or the infectious diseases, but if you were to collect that data, there’s a lot more potential that you could have.
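The point about statistical power can be made concrete with a rough calculation. What follows is a minimal sketch, not anything described in the episode: it assumes a simple comparison of two proportions with the usual normal-approximation sample size formula, and a Bonferroni correction when many drug-disease pairs are screened at once. The function name `patients_per_group`, the 30% and 20% outcome rates, and the population of 1,000,000 are hypothetical; only the rate of 20 per 100,000 comes from the conversation.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(p_untreated, p_treated, n_comparisons=1,
                       alpha=0.05, power=0.80):
    """Approximate patients needed per group to compare two proportions.

    Uses the standard normal-approximation formula, with a Bonferroni
    correction that divides alpha by the number of comparisons.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - (alpha / n_comparisons) / 2)
    z_power = normal.inv_cdf(power)
    variance = p_untreated * (1 - p_untreated) + p_treated * (1 - p_treated)
    effect = p_untreated - p_treated
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical numbers: 30% of untreated patients worsen versus 20% of treated.
for n_pairs in (1, 100, 10_000):
    needed = patients_per_group(0.30, 0.20, n_comparisons=n_pairs)
    print(f"{n_pairs:>6} drug-disease pairs tested -> about {needed} patients per group")

# Patients available for one rare disease at the rate quoted above
# (20 per 100,000), in a hypothetical population of 1,000,000.
available = 20 * 1_000_000 // 100_000
print(f"Patients available: {available}")
```

Under these assumptions, the number of patients needed per group grows as more combinations are screened, while the number of patients available for any single rare disease stays small.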
Jacob Trefethen:
Got it. What sort of data should we be going after then?
Saloni Dattani:
I think one is sequencing data. People who have rare genetic conditions are often not in the types of genetic data that are commonly collected right now. If you’ve done 23andMe, for example, it basically just tests a very small fraction of your entire genome. Those are the areas that are very common for people to vary on, and it basically helps you to predict your ancestry and common differences between people. But most of the genome is not included there. People with rare genetic diseases tend to have mutations or changes in individual or specific parts that are very uncommon, and those tend to have much larger effects on their risk of diseases or things like that. That would be one where I think, if we have better sequence data from people, it’s going to be much easier to spot those patterns: people with these conditions tend to have these very rare mutations that haven’t been studied very much so far.
The other one is collecting data on environmental exposures. One that I think is quite interesting that I recently read about was ALS, which is motor neurone disease or amyotrophic lateral sclerosis. There was recently this article about how there’s this cluster of cases of some dozen people living in this small town in the Swiss Alps who all developed ALS. That’s very unusual because it’s quite a rare disease; you wouldn’t expect that in a small location. These researchers went to that town, talked to each of the individuals, their families. They talked to other people in the town, collected very detailed histories of their occupation, their genetics, their daily habits, the things that they ate and they did, things like that.
What they found was that all of the people who were cases in that town had previously eaten this type of mushroom, which is called Gyromitra venenata, and there’s another one called Gyromitra esculenta. They’re wild mushrooms, and all 13 of these ALS cases had eaten these wild mushrooms in that town. None of the controls in that town had ever eaten them. I think there’s a more common name for this type of mushroom, which is false morels. There are different types of false morel mushrooms, but the specific species that they ate also contain neurotoxins; this is well-known, and people are generally recommended not to eat these types of mushrooms. I generally think this is some fairly strong evidence that this is a cause, but still obviously people need to do more research and confirm this in more studies. This idea of going out and collecting all these data from these people on these kinds of rare exposures (you’re looking at mushrooms or maybe some toxic plant) is stuff that doesn’t exist in the existing literature unless someone actually goes out and collects it. So I do think that with things like environmental factors and very uncommon exposures, if you do go out and collect them, you’ll find some surprising things.
Jacob Trefethen:
Okay, we need more case reports. It’s interesting just stepping back on this principle you’re discussing: you don’t always need to understand a disease in order to make a drug. I think the relation between scientific understanding and the technology of intervening for some purpose is not always what you’d expect in fields outside of medicine as well. There’s often this perception that you have to understand something so that you can develop a tool or technology to intervene on it. How much of aerodynamics did you really have to understand before you could invent flight and airplanes?
Saloni Dattani:
Right. Steam engines were invented way before people understood thermodynamics.
Jacob Trefethen:
Yes, steam engines invented before thermodynamics. The big one, really, is fire. We were using fire well before we understood combustion or knew what oxygen was.
Saloni Dattani:
Right. This really reminds me of this article that Jason Crawford wrote for Works in Progress years ago called “Innovation is not linear.” He basically explains that people think of these as two very different approaches, and that you start off with doing basic research, just exploring, seeing what happens, developing theories, things like that. Separately, there’s engineering and tinkering and trying to make products or trying to make different tools and technologies. People generally think of that as being a linear process: you start with the basic research, you understand the disease or whatever, the theory, and then you develop the engineering outputs of that. But really, that’s not the only thing that happens. There’s a lot of feedback between these two different places, and often you start off with the product and then you figure out how it works, and then you develop these theories and those theories allow you to go forth and make way more technological improvements. You don’t have to start by understanding the disease in this case, but also once you do understand the disease, it can be really helpful.
Jacob Trefethen:
Invention feeds back into science and understanding, and they have this kind of loop together, which could have implications for AI if you think that AI is going to, for example, get better at reasoning before it gets better at taking new samples of the real world. Okay so, you talked me through some cases where we don’t need so much understanding in order to make medical progress. What about the reverse? Are there cases where medical progress is currently bottlenecked on understanding a disease, or recent cases where understanding a disease was helpful?
Saloni Dattani:
So I guess there are a lot of examples where once you understand a disease, or once you understand how a drug works, you can then tweak it and make a lot more tools related to that. Obviously, vaccines are a great example of this. In the 19th century, people didn’t really understand how they worked at all. But developing these processes of weakening a microbe in the lab — what’s called attenuation — meant that people could develop many more vaccines with that approach.
There are other approaches as well. Once you understand specifically, it’s not about having the entire microbe being weakened or something; you don’t need that entire thing as a vaccine. You could just have a specific antigen, or a specific part of this vaccine that your immune system recognizes and then matches to the pathogen in the wild. That knowledge or that theory was only really put together in the 20th century, in the 1920s to 40s. It meant that people could then develop better vaccines where you don’t include the entire pathogen, but you just have the specific proteins or the specific outer parts that are needed, and they’re much safer and easier to scale up and things like that.
We also talked about some of these other examples in the previous episodes. In the first episode, we talked about HIV treatment. One of the big breakthroughs in HIV treatments was developing protease inhibitors. Protease is an enzyme that the HIV virus has in order to mature into its infectious form. Only by understanding what the shape of that protein looked like did people develop drugs that fit into one of the little gaps in that enzyme to block it. That’s another example. Gene editing, for example, is much easier if you know the specific genetic cause of a condition. You could specifically target that gene with CRISPR or RNA therapies or things to silence the specific gene that is overactive or something like that.
There are also things like devices or surgeries, where the only way to develop a pacemaker, for example, is to understand that the heart uses electricity to pump blood, right? Knowing what kind of heart rhythm is needed for that helps you develop a pacemaker. Or if you’re conducting a surgery and you’re opening up the chest — initially when people tried to do that, you would try to open up the chest, and their lungs would immediately deflate and they would just suffocate and die. Also, if you’re trying to operate on the heart, people would lose blood so quickly that, again, they would just rapidly die.
In 1950, I think, people developed this machine called the heart-lung machine, which basically replaces these two functions of the heart and lung, in order to keep people alive during a cardiac surgery. The heart-lung machine essentially pumps blood — you’re connected to this machine, it pumps blood, keeps blood pumping — and then also bubbles oxygen into it. That means people can continue to have blood flow, but you can only really make that once you know that you need to do both of those things.
Jacob Trefethen:
Making the connection to AI then, I wonder how much do you think that, firstly, in big diseases, we still don’t have that knowledge connection, and if so, will AI be able to help? Secondly, if we do have that connection between a molecular target, say, and disease, do you think that means we’ll get a drug really soon, because AI will help?
Saloni Dattani:
I think there are a lot of diseases where we don’t have the right data that’s collected yet for people to understand the causal pathway. One thing that you and I are both interested in is tuberculosis, right? A lot of people are trying to get rid of tuberculosis in the general population by targeting what’s called latent infections. The bacterium is basically just hidden around; it’s not doing very much, but eventually it might reactivate and cause disease. In order to know how many people can be targeted with that or exactly how to treat those people and get rid of tuberculosis, you need good ways to test for people who have those latent infections. My understanding is that those testing methods are currently not very good, and because of that, we have a huge uncertainty about how many people even have latent infections in the world. The previous estimates were that a quarter of the world’s population has a latent TB infection. Unfortunately, that estimate is affected by people who are also vaccinated with BCG, right?
Jacob Trefethen:
And turn up positive on the skin test.
Saloni Dattani:
Right. Or have been infected in the past at some point, but have cleared it naturally. Many of these people don’t actually still have latent infections, and that means it’s really hard to actually test whether the drug that you’ve developed is going to treat and remove the bacterium from them. The new estimates are that actually only 3 to 6% of the global population has latent infections. I feel like we just don’t really know very well right now, and we won’t know that until people go out and collect this data with better testing and better immunological methods.
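To make the testing point concrete, here is a minimal sketch, with made-up numbers, of how a test that sometimes flags uninfected people (for example, people vaccinated with BCG, or people who have cleared a past infection) can inflate a prevalence estimate. The sensitivity, specificity and prevalence values are illustrative assumptions, not figures from the episode, and this is not how the published estimates were produced. The correction shown is the standard Rogan-Gladen adjustment.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Share of people who test positive, given the true infection prevalence."""
    true_positives = true_prev * sensitivity
    false_positives = (1 - true_prev) * (1 - specificity)
    return true_positives + false_positives


def corrected_prevalence(apparent_prev, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from the test-positive share."""
    estimate = (apparent_prev + specificity - 1) / (sensitivity + specificity - 1)
    # Clamp to a valid proportion; sampling noise can push the raw value outside [0, 1].
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    # Hypothetical inputs: 5% truly infected, a test with 80% sensitivity
    # and 80% specificity (all three numbers are assumptions for illustration).
    positives = apparent_prevalence(0.05, sensitivity=0.80, specificity=0.80)
    print(f"share testing positive: {positives:.1%}")
    print(f"corrected estimate:     {corrected_prevalence(positives, 0.80, 0.80):.1%}")
```

With these assumed values, roughly a quarter of people test positive even though only one in twenty is infected. The adjustment recovers the assumed prevalence only because the test’s error rates are known exactly here, which is the thing better testing would need to pin down.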
Jacob Trefethen:
As good as your drug design is, you still need to figure out what’s going on in lots of different people.
Saloni Dattani:
Right. More data collection is really important. This actually reminds me of something that happened in the 1850s, which makes me sound like an extremely old person. The story I’m going to tell you is about the discovery of what caused cholera. In the 1850s, London was having this big cholera outbreak. You might have heard the story of John Snow tracing one of the local epidemics to a water pump on Broad Street in London; that pump was contaminated with the cholera bacterium. He didn’t know that it was caused by that bacterium, but he did have this idea that something in that pump was contaminated and was causing disease.
What I find really fun and interesting about these historical scientific discoveries is, you know, that the way people often hear about them is, “Oh, it’s so obvious. If only someone had gone out and collected that data, it would just be obvious that that was the cause. Why did anyone not believe him at the time?” or something like that. But actually, there were these competing hypotheses and theories even at that point. There was a very common theory that cholera was not caused by germs or anything like that; it was actually caused by either bad air or by elevation, like from how far away you are from the sea, or maybe it’s just because of poverty.
The people who proposed these theories were not really stupid. They had collected lots of data. They had mapped out cholera cases across London, and they had noticed this correlation between, “Okay, the people who live closer to the River Thames have a higher rate of developing cholera.” Maybe it’s just caused by poverty or being close to the Thames. They assumed that socioeconomic factor was the cause and it wasn’t some germ. In this case, you need to think through what’s happening here. It’s not really that obvious that just collecting data on one thing or collecting data on the other thing is going to give you the answer.
But in this case, the environmental factors are actually confounders that lead to a higher risk of many different diarrheal diseases and many different diarrheal pathogens, including cholera, that someone might be infected with. At the time, people wouldn’t have been able to distinguish these different diarrheal diseases. Both of these theories could look correct, but only once you really understand the causal path (doing experiments, actually identifying the microbe involved, and seeing which tissues in the body it infects) will you be able to identify the cause of the disease. It’s an example of where you do need more data, but you also need a model, a theory of how these different hypotheses work together and how to distinguish between them. You still have to do lots of experiments and stuff to figure out what’s actually going on.
Jacob Trefethen:
It’s so interesting to think about the interventions that different forms of knowledge unlock in that context of cholera, because in that case, you can then have better sanitation and make sure you have cleaner water, which reduces the number of cases of cholera. At the same time, you don’t yet have a full understanding of molecular biology, certainly. Where I work now at Open Philanthropy, we just funded the development of a cholera conjugate vaccine. We’re still dealing with the problem of cholera in some parts of the world. We have only now got enough knowledge about proteins and carbohydrates and the immune response of kids to different vaccine technologies that, in 2025, this is going ahead, and we’ve co-funded a phase two trial being run by a vaccine developer. A vaccine like that would have been useful in the 1850s as well, but a lot more knowledge had to come first.
Saloni Dattani:
Right. There’s another example from COVID as well. Lots of people will know that the risk of a severe COVID infection, or of dying from the coronavirus, is affected by your age, and that your risk increases exponentially with age. But there are other biological factors as well. One of those is interferon autoantibodies. Some people, actually quite a large fraction of people who have severe disease from COVID, have antibodies in their body that react to a type of protein called type one interferon, which usually helps fight viruses. In this case, your antibodies are instead attacking the protein that you need to fight the infection off.
In some research, some 20% of people who died from COVID had these specific autoantibodies, and having autoantibodies to this protein raises the risk of death by 6 to 17 times. The chances of having that kind of reactive antibody are much higher among older people. But it’s an example of where, once you collect data on specific biomarkers or specific types of antibodies or immunological data, you can understand the causes of severe disease much better than you would from just the general data about age and things like that, and that might then help you develop better drugs.
Jacob Trefethen:
I think that example, and I guess some of the other ones you mentioned, does sort of reveal... we were at the beginning thinking, “Well, if you don’t have an understanding of a disease, can you still develop a drug?” And the answer is sometimes yes. The opposing version of that is, “If you have a perfect understanding of a disease, can you rationally develop a drug to hit certain targets?” And the answer is sometimes yes there. In fact, with most human diseases, we’re sort of in the middle. We’ve been developing our understanding, and we understand some things, and we’ve taken some forms of measurements. If we were taking other forms of measurements, we might start to understand those diseases more.
That brings me to a question of, “Okay, in the messy middle there, where we have some knowledge but we aren’t quite sure if the real bottleneck is knowledge or engineering and drug development, how much is AI going to help? And will it unlock some new progress?” Take neurodegenerative diseases. I think with Alzheimer’s, it depends which Alzheimer’s researcher you ask how much they think we know versus don’t know. There’s definitely some knowledge about some protein targets or relation to some processes in the body: amyloid beta, tau, inflammation. There’s some knowledge about how it’s related, but we already have drugs now approved that reduce amyloid beta plaques. Those drugs do not cure you of Alzheimer’s. We have a developed theory, and the theory can’t be simple and quite right; there must be something more complicated going on. I wonder in that case, do you think that the bottleneck is more understanding? Do you think the bottleneck is something else in drug development? Will AI help?
Saloni Dattani:
I think it’s probably lots of things. I think that’s a great summary. One thing that we don’t yet have with Alzheimer’s is really good animal models for testing out these drugs before they get to the clinic. The brain is also just really hard to study. Most of the research comes from post-mortem tissues instead of live brains, for obvious reasons. If we did have better methods, maybe we could learn, in a more real-time way, what these drugs are actually doing or how the disease is progressing.
The other is how to actually safely deliver drugs to the brain. I think both of the drugs you mentioned that were approved to treat Alzheimer’s can cause brain bleeds. I think there are various side effects and problems that a lot of medications that are targeted at the brain have. Part of the reason is that generally speaking, the brain is fairly protected from the rest of our body in terms of the toxins and the chemicals that go around our bloodstream. There’s a blood-brain barrier that means that it’s harder for certain compounds to get across and actually have any effect. That’s probably a good thing in general; you don’t want toxins to repeatedly go past and target your brain. But it also means that designing drugs in such a way that they’re effective and also safe is still quite hard.
I think the other is there are probably lots of things that we don’t know yet about the specific progression of the disease. There is also sadly lots of research fraud in this area, and that probably has slowed things down a bit where people are falsifying different experiments. That means it’s hard to know what’s actually going on. Knowing about what is not working is sometimes just as important as knowing what works. You’re not just repeating other people’s failed efforts in the past.
I think probably AI would be helpful in the fraud detection. Hopefully, it would be helpful in screening drugs for repurposing or trying to find potential drugs that target the amyloid plaques that develop in Alzheimer’s. I don’t want to say we need to solve this stuff in order to find an effective drug, because as we just talked about, you don’t need to understand these things to develop drugs sometimes. But I think that will make a difference.
Jacob Trefethen:
By the way, you know what I think we should do for delivery? It’s hard to get past the blood-brain barrier, which is usually a good thing. Be careful what you wish for if you get past it. My colleagues, Chris and Heather, have looked into funding a sort of gel that goes up the nose, along the olfactory nerve. You know the mummies where they used to pull their brains out through their nose before they got buried?
Saloni Dattani:
Well, I don’t know any of them, but I’ll take your word for it.
Jacob Trefethen:
It’s funny, that’s not what they told me. But if you could just sniff something or rub in a little gel so that you could deliver the drug through the nose, you could probably get hundreds of times the dosage that you’d get if you tried to go through the blood. You’ve got to be a little careful if you give a thousand times the dosage or something, though.
Saloni Dattani:
Have you heard of microbubbles?
Jacob Trefethen:
Microbubbles?
Saloni Dattani:
So they’re tiny bubbles. But these microbubbles basically can also be used as this drug delivery system. The bubble can be coated with something, but inside essentially it can contain some gene therapy or some chemical molecules or things like that. It’s this new type of technology that’s currently mostly being used in, I think, radiology and diagnostics. Because what you can do with these bubbles is that you can control where they go. With ultrasound, you can control when they pop. The bubbles kind of respond to sound waves. If you have a little sound, you could pop the bubble and release the drug in the right place. They can also be used to open up bits of the blood-brain barrier potentially with this little bubble getting popped over there. I’ve just been hearing about this from someone who’s writing an article about this for us, but I just thought that’s so fun. It’s still very far away from being used as a treatment for things. Right now it’s mostly being used in diagnostics.
Jacob Trefethen:
You know, that’s so fun that I really want you to have all the fun you want, so I think you should volunteer for that one first. Let me take the side of an AI advocate here. We just said that Alzheimer’s is hard to develop drugs for, for various reasons. I think some people listening might be thinking, “Well, hold on a second. If we get really advanced AI, just from the kind of reasoning agent point of view, take an LLM and make it even smarter and be able to reason in English and debate with other copies of itself in a virtual university.”
Saloni Dattani:
Wait, does that mean if it speaks in a different language, it’s just not useful to us?
Jacob Trefethen:
Well, it might be useful to communicate with itself in a different language; it’s more information dense.
Saloni Dattani:
I would love to talk to myself in different languages.
Jacob Trefethen:
You don’t do that already?
Saloni Dattani:
No.
Jacob Trefethen:
My internal monologue is sort of like, boop, boop, boop, boop, boop. But actually, you have to be a little careful about internal monologues for AI agents, because one thing that is so lovely from an AI safety point of view with the LLMs so far is that they reason in English, and you can read some of their reasoning output as they go. Now, of course, they may be encoding messages for each other that are not actually being reflected in English, but it’s quite nice they’re in English. In any case, they’re mostly going to be talking to each other for a while and then talking to us a bit, I would guess.
If I’m an AI booster here, I’d say, “Well, Saloni and Jacob, you’re just not being imaginative enough because you’re not taking seriously that we’re going to have the output of a lot of cognitive labor here.” Some of that output might say, “Okay, we agree that we don’t have good measurements of the brain yet, so what we’re going to do is design this non-invasive device that maybe you hold to your head and that uses ultrasound safely, but actually takes better measurements. If you’re not quite comfortable with that happening with a human brain, then we’ll make it as safe as we can. All you need are 12 non-human primates, and you need to take these 17 different measurements, and then you’ll actually understand what’s going on in the brain.” What do you say to that?
Saloni Dattani:
Well, okay, but even if that is the case, we do actually need to get those non-human primates and we do actually have to apply those ultrasound techniques ourselves. But I think there’s another issue, which is that for a lot of these conditions, that data just doesn’t exist. It’s really hard to come up with these hypotheses if there isn’t much to go on. We’ve seen that with the protein design and protein structure prediction problems that we talked about in the last episodes, where AI is not as good at predicting things where the data collection is very limited. I suspect that that’s still going to be a problem here. That might just be a matter of timing; someone needs to collect that data first, but I think you still need to do lots of experiments in order to find out what is working and what different types of tools you might need for different problems.
Jacob Trefethen:
Experiments are important; you got me there. In particular, when you’re saying experiment, I assume you mean we’re not just looking at observational data we might have picked up, we’re not just looking at case reports, we’re actually perturbing the real world with controls to try and get at the causality of a given system.
Saloni Dattani:
Right. I basically think that in some disciplines or when you’re working with some types of tools and techniques, when you are able to do experiments, you can understand the pathways and how things are connected to each other much better. Essentially, there are three reasons that I think experiments are really helpful. One, you can directly manipulate a specific point in this massive network of causal processes. Imagine your entire body as being a collection of all these different causal pathways. You have thousands or hundreds of thousands of different things interacting with each other. There are different nodes and connections. Maybe each one is a different hormone, signaling protein, enzyme, cell, whatever. When you have a drug or something that you can experiment with, you can intervene on these specific points in that giant network of different pathways. Doing that can help you understand the impact that that pathway is having, what is actually happening in the body or what is happening in this collection of cells and things like that. Because once you intervene on a specific area, it will vibrate and it will affect the other things that it’s connected to. Two, you have this controlled environment, so in the same way that you adjust for confounders in an observational study, you can keep the rest of the environment stable and just focus on that specific process. Three, you introduce lots of variation. You maybe introduce a new drug, something that’s never been seen before, and you see what effect that might have. Those large differences or interventions might not exist in the real world, and that can make it much harder to study what is happening out there in observational data.
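As a rough illustration of why intervening helps, here is a small simulation sketch. Everything in it is hypothetical: the confounder, the exposure, the risk levels and the function names are assumptions chosen for illustration, loosely echoing the cholera story in which a hidden factor drives both an exposure and disease. The exposure is set to have no effect at all, yet the observational comparison still shows a gap, while assigning exposure by coin flip does not.

```python
import random


def risk(outcomes):
    """Fraction of people in a group who got sick."""
    return sum(outcomes) / len(outcomes)


def simulate(n=20000, seed=0):
    rng = random.Random(seed)
    true_effect = 0.0  # assumption: the exposure itself does nothing

    observed = {True: [], False: []}    # exposure depends on the confounder
    randomized = {True: [], False: []}  # exposure assigned by coin flip

    for _ in range(n):
        # Hidden confounder, e.g. drinking from a contaminated water source.
        confounder = rng.random() < 0.5
        baseline = 0.30 if confounder else 0.05  # hypothetical risk levels

        # Observational world: the confounder also makes exposure more likely.
        exposed = rng.random() < (0.8 if confounder else 0.2)
        sick = rng.random() < baseline + (true_effect if exposed else 0.0)
        observed[exposed].append(sick)

        # Experimental world: exposure is independent of the confounder.
        assigned = rng.random() < 0.5
        sick = rng.random() < baseline + (true_effect if assigned else 0.0)
        randomized[assigned].append(sick)

    observational_gap = risk(observed[True]) - risk(observed[False])
    experimental_gap = risk(randomized[True]) - risk(randomized[False])
    return observational_gap, experimental_gap


if __name__ == "__main__":
    obs_gap, exp_gap = simulate()
    print(f"observational risk difference: {obs_gap:.3f}")
    print(f"randomized risk difference:    {exp_gap:.3f}")
```

The sketch only covers the randomization part of the argument. It does not model the other two points, a controlled environment and larger variation than exists in the real world.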
Jacob Trefethen:
How can we design systems that don’t necessarily require an entire human being put at risk in order to run experiments? Are there ways that we can simulate that system that might be experiments on a computer or experiments in something that doesn’t require the traditional ways of taking measurements that biologists might be used to? I think we have to, next up, talk about models.
Saloni Dattani:
Models. Well, yeah, let’s talk about models. Who’s your favorite fashion model?
Jacob Trefethen:
Probably Naomi Campbell, to be honest. Something about that walk.
Animal models
Saloni Dattani:
Let’s say we have discovered potential candidate drugs. What happens next? What’s the experimental process, or what are the next steps that we might have to test whether they work?
Jacob Trefethen:
Let me walk you through a toy model of that. The proviso, as you might expect, is that it does differ for different diseases and different drugs, but here’s a basic one. Let’s say you’ve got a drug candidate in hand that you think might work against some disease. You first will probably test in the lab in cells or on a plate against some biological material, whether it has some effect that you might care about. If the answer is ding, ding, ding, then you might test in non-human animals.
You, in many cases, will first test in mice, sometimes in mice that have been genetically altered to recapitulate some form of the disease that you’re trying to test against. After that, you will often try in another animal model that ends up being relevant for the disease you’re looking at. If you’re looking at a disease that affects the lungs, you might look at ferrets. Ferrets are used, I think, because the way they transmit respiratory viruses is a particularly useful property, but I don’t know the details so well. You sometimes use woodchucks or birds when it comes to hepatitis B.
Saloni Dattani:
Wow, that’s so random.
Jacob Trefethen:
It is very random. Historically, people have used chimpanzees for a lot of research that nowadays we would think of as unethical because chimpanzees can’t consent in the same ways that humans can and yet are probably sentient and can suffer. There are a lot of non-human primates that are not chimpanzees, usually smaller or further away from us on the evolutionary tree, that still are participants in medical research. Often a non-human primate, because of the genetic similarity to a human, will recapitulate a disease you’re looking at best of all.
Saloni Dattani:
There’s also this trade-off because primates are just probably much harder to work with than mice or mosquitoes or something like that in the lab. They’re much more expensive. They require much more space. It takes a much longer time for them to develop the disease. Their lifespan is much longer.
Jacob Trefethen:
Absolutely. People may remember in COVID that there were many vaccines and drugs that people wanted, scientists wanted, to test in different primates, but there were not many primates available because the laboratory system that we have is not so scalable to crises. You’re completely right that even just from a practical point of view, primates require a lot of space and a lot of food and a lot of care, and that does cost money and that will increase your grant budget required to do the experiments by a lot versus mice.
Saloni Dattani:
Right. You could use mice, and you have a cheaper animal model to work with, but then you might also lose these features of the disease.
Jacob Trefethen:
Yes. If you talk to anyone who works in biomedicine, most people are just very skeptical of most mouse models. I don’t think that’s an exaggeration, but certainly, I mean, we just talked about Alzheimer’s, mouse models for Alzheimer’s. What the heck are they even showing you? It’s just so far from...
Saloni Dattani:
Do mice have cognitive decline?
Jacob Trefethen:
Well, exactly. They don’t even live for anywhere near as long as humans, and then they have very different brains. So what the heck are we even looking at? If you get past that stage (usually the FDA will ask for two animal models, in different species), then you can submit an IND, an Investigational New Drug application, to the FDA in the US case, or to your health regulator of choice in other countries. That’s a whole big data package saying, “I would like to take this drug into humans.” At the same time, you can upgrade your manufacturing processes to make sure that the type of drug you’re making is what you think it is and is definitely safe. Well, you don’t know if it’s safe for sure, but at least it is what you think it is.
Then you can go into what’s called a phase one clinical trial. That is usually a clinical trial in healthy adults who may not be affected by the disease you care about at all, but are just participants to check that when you put this drug in the human system, it’s not causing big problems. If you get a tick there and you’re not causing big problems, then you will go to a phase two trial, which tests both for safety and efficacy. That will usually be in the population that is affected by the disease you’re studying.
A phase one might have tens of people. A phase two might have hundreds of people. You’re measuring safety at a larger scale, in many more people and maybe in more depth, and you’re measuring initial signs of efficacy. You might be trying multiple different doses of the drug and measuring, in each case, how this is affecting the outcomes that we care about.
If after a phase two, you’ve got a drug that looks pretty safe and looks like it might be effective at some dose you’ve chosen, you’ll go into what is usually the most expensive stage, a phase three trial, usually with just one dose, to confirm efficacy versus a placebo or versus the standard of care, meaning another drug that’s already used in the health system. That might involve hundreds, thousands, or sometimes tens of thousands of people, to determine that your drug is efficacious and to determine at the largest scale yet that your drug is not causing safety problems that are prohibitive.
And then if that all looks good, you’re going to submit a huge data package to the FDA and say, “Can I please sell this drug in America?” The FDA will take 6 to 10 months to review your data and your thousands of pages of submission, and get back to you with a thumbs up or a thumbs down. After you’re selling your drug, you’re still collecting data. The FDA might require further studies after they approve your drug if there are particular questions that they think should be addressed. If in those studies you end up with a negative result, they might withdraw your ability to sell the drug; that happens somewhat frequently. Also, you’re going to be collecting more side effect data in the real world. Once hundreds of thousands of people are using a drug, you will spot more side effects. They won’t be randomized, so you won’t necessarily get as high quality data, but you at least get more data as things come in, so the evidence collection does not stop once you get approval.
Saloni Dattani:
Wow, great. That’s a very long process. So I have two things about mice that I wanted to talk about. One is that how we actually find animal models at all is kind of interesting. Second, why are mouse models so common? It’s not just because they’re easy to work with. But let me start with the first one.
Imagine you’re trying to develop treatment for a disease. Let’s say you’re trying to develop drugs against malaria or something, and you don’t want to immediately test them in humans. You’re like, “Well, let’s see how this works in the different animals that we can work with.” But what you need to find out first is, “Are there any animals that have a disease similar to the one we have?” So let’s say you’re doing this for malaria. Malaria is caused by a parasite that is transmitted by a mosquito. What people would do in order to find a malaria animal model is look for other animals that are infected with the same or very similar parasites, which are transmitted by mosquitoes as well.
Finding animals that were also infected with malaria was a really difficult problem in the 1930s, or the early 20th century. I think scientists first found birds, such as ducks, that could be infected by similar parasites, but those didn’t seem to work as models. If you tested drugs in those birds, they seemed to be effective there, but then in humans, they caused all these toxic side effects. They were kind of looking for a better animal model. The way that you would do that is really to test these different animals to see whether they are infected by malaria right now. That really requires there to be a local malaria outbreak, which is maybe a rare occurrence in some cases. You have to go out and collect that data from the individual different types of animals.
Another thing that you could do is test the mosquitoes. When mosquitoes take a blood meal, they also ingest lots of other proteins that are in the bloodstream of the animal they bite. You can test for those and see which animals the mosquito was drinking blood from. This is one way that people tried to figure out which animals could be infected in the 1930s and 40s, I think. They tested whether the mosquitoes had ingested blood containing particular proteins that were seen in different animals. I think cattle, sheep, dogs, primates, stuff like that. All of those seemed to be negative.
Some researchers thought, “Okay, well, maybe this is a good thing. Maybe this suggests that it’s rodents who are getting infected. Let’s try to look for any rats or mice that are infected by malaria.” There were some researchers in what was then the Belgian Congo, and they were collecting lots of rats and mice near different rivers and villages and things like that, and testing them for evidence of malaria infection. I think I was reading about this because I had written this piece on the malaria vaccine a while ago now.
What was so strange to me was that they were trying to find where these rats or mice that were potentially infected by malaria were, and they really couldn’t find any for several years. One of the reasons for that was that there was a forest fire in that area, which deterred the mosquitoes, so there weren’t any local outbreaks. It was really hard for them. They eventually came upon this little thicket rat.
Jacob Trefethen:
Thicket rat?
Saloni Dattani:
Well it’s a type of rat, I don’t know that much about it, but it’s a type of rat.
Jacob Trefethen:
In the thick of it.
Saloni Dattani:
Right, and it happened to be infected by a similar but different type of Plasmodium, the parasite that causes malaria. That was the first ever rodent malaria model that was found. It just goes to show how difficult it is to actually figure out which animals in the wild are potential models that you could use, by doing all of this data collection, just testing things out. You kind of have to hope in some ways that there’s an animal outbreak or infection that you can capture. Or you get the species in the lab and you try to deliberately infect them, but that might not work; maybe they’re only infected by a different strain of the parasite or something like that. Sometimes it’s better to look for the natural outbreak and see what similarities it might have.
Jacob Trefethen:
It makes me quite grateful that we have division of labor in science because I really like the job of talking about it in front of a microphone. If I was out there catching thicket rats, I might go into another line of work.
Saloni Dattani:
I think they tested some 200 rats before they found one that was infected. Even after they did find that one that was infected, they still had to optimize the... in order to study that in the lab, you have to recapitulate the environmental conditions under which that rat is infected by malaria. That was my first mouse story. Second mouse story: why are mice and rats so common in laboratory research?
Jacob Trefethen:
Yes, we sort of take that for granted. It’s just the default.
Saloni Dattani:
There was this great article recently on the origin of the lab mouse. The author talks about how the majority of lab animals, 95% of them, are mice or rats. There are probably around 30 million rodents used for biomedical research every year in the US and Europe. This huge supply of mice is not just because they’re easier to work with and they’re small and have shorter lifespans than some of the other animals that you could work with; it’s partly because in the late 19th and early 20th centuries there was this culture or community of people who collected and bred different mouse varieties. They’re called mouse fanciers.
Jacob Trefethen:
Right.
Saloni Dattani:
I had never heard about this before reading this article and it was super interesting.
Jacob Trefethen:
I had not either, but I did read this article. Yeah.
Saloni Dattani:
In the early 20th century, basically they’re creating all these different varieties where some of them have spots and some of them are different colored mice breeds and things like that. People were just breeding these as a hobby. But then some of this was used in research. In the 1920s, there was this group of researchers in Maine who tried to standardize these mouse varieties and tried to basically create lineages of mice where they were very inbred. What that would hopefully help with was to reduce the amount of variation in the different mice in your lab. If you were testing a drug and it worked on some of the mice, but not the other ones, it would be really annoying if the reason for that difference was because they just had different genetics or something like that.
These researchers basically tried to have these very standardized, purebred lines of mice where each of the mice has the same genetics, essentially. Their response to different drugs is not going to vary because of their genetics. This happened in the 1920s, and people were developing these mouse models, or lineages of mice, to study the genetics of cancer. But then their funding dried up during the Great Depression and they moved to just working on creating more lineages of lab mice to supply other labs with. That laboratory, the Jackson Laboratory, is now one of the biggest providers of mice to laboratories around the world.
Jacob Trefethen:
It still is, all these years later.
Saloni Dattani:
Yeah. They have these really detailed procedures to keep their facility contamination-free. You have these physical barriers. If you want to enter the mouse room, you have to wear specialized equipment and stuff like that. They take really extreme measures to prevent any kind of contamination of those mice as well, which is a little bit grim, but also was very interesting to read about.
Jacob Trefethen:
I mean, it’s fascinating from a historical point of view. It sounds also like, you know, Hansel and Gretel, these scientists in this forest that make it American. It’s interesting, though, how much you gain by being able to control the genetics and the environment, but also how much you lose. Economists would call this internal validity and external validity, what I’m about to say. By making the mice so inbred, you can isolate causal factors of what is occurring when you give them drugs of different forms. When you’re making a drug, of course you care about, does this generalize, not just to one person in one room with one genetic code, but to all people who might benefit from the drug.
At some point, you’re going to have to have some experimental step, which is way broader, because you really need this generalizable effect in order for it to make sense, at least in the current paradigm of medicine. The way that we’ve set things up, we’ve decided to have this controlled, but something that most of the time does not generalize. That leads to a ridiculous amount of failure in the clinic where you have these drugs that look perfectly good in black mice or in whatever mouse model you’ve used and absolutely don’t work in humans.
Saloni Dattani:
You’re right. I do think that the extreme amount of control does let you understand some specific pathways, but it does also mean that once you are in the real world with a much wider variety of things, other things might be involved in that process as well. They might be modulating what is happening there and without that broader information, you can’t be sure really if these drugs are going to work for the average person. You don’t know if the average person is like those little mice in your lab. Probably they’re not.
Jacob Trefethen:
Reminds me of a story I heard about a plant biologist recently who said that, you know, you can tinker around with your theories in these great plant models as much as you want. But if he actually really wanted to know if something worked, he’d go out to the farmer’s market and get some spinach. Because if it worked in some spinach he found at the farmer’s market, well, that generalizes. Otherwise, probably it’s an artefact.
Saloni Dattani:
Right. One other thing. Do you have a favorite animal model?
Jacob Trefethen:
Gosh, I mean, if I’m being honest, to some degree, I hate all animal models. I just think it’s so awful that this is how we have to do medicine at this current stage of development. I just basically think animals can’t fully consent, so I hate to be a downer, but if I were to pick one, I do like the zebrafish because part of the reason a zebrafish gets used is because it’s transparent or translucent, so you can actually see visually things that might be going on. If you want to test something where that might be important, maybe go for a zebrafish. Zebrafish in the wild, I don’t think are transparent, but scientists have managed to change it so that whatever causes the pigmentation is different, so you can get transparent zebrafish.
The question of what’s our favorite animal model and us really kind of wishing that fewer animals were involved in medical research gets to the next question I wanted to ask you about, which is, can you design systems that might give you data that is useful, but where the systems are not alive or are not full organisms? One that comes up sometimes in the work that we’ve supported at Open Philanthropy, and we’ve funded some of this actually, organoid systems. Do you know much about organoids?
Saloni Dattani:
I know a little bit about organoids. They’re not fully organs; they’re parts of an organ, right? They’re derived from stem cells and they’re cultured in 3D. Imagine the dish that has the cells on it, but not that. It’s in a 3D shape or something like that. They’re kind of organized into little clusters of different types of cells, and they might reproduce some features of tissues in your body, including the different types of cells that are involved, or they might be doing some types of functions, but they’re not fully an organ in the lab. Is that right?
Jacob Trefethen:
I think that’s right. I think they’re sort of in between just doing experiments in cells and doing experiments in animals where you want to have a bit more complexity you can represent beyond just a cell. They’ll often have multiple cell types. You might have a lung organoid that has epithelial cells and then also some other cells. You might have a brain organoid that has neurons, but also microglial cells or something like that. I think this is a growing area that different fields or different organs honestly have gotten further along or less far along with. There’s various research going on that I’m cautiously hopeful we might get better organs on a chip over time.
Saloni Dattani:
When I hear the word organoid, I’m thinking about Futurama, like there are these jars filled with people’s heads only or different body parts or something like that. But it’s not really like that at all. It’s just cells in a 3D dish.
Jacob Trefethen:
That’s right. But they are 3D and there are some sort of bizarre things you observe when you try and grow some types of cells, neurons in particular. There’s a bit of a question of how big can this clump of neurons get before we feel sketchy about this experiment?
Saloni Dattani:
Sketchy in what way? Like it could start thinking?
Jacob Trefethen:
Yes.
Saloni Dattani:
What?!
Jacob Trefethen:
I think that people will often have up to maybe a million neurons, but the bigger you get than that, it does start resembling a brain a little bit, and then you might have some more philosophical questions about what you’re doing.
Saloni Dattani:
Huh. When I was in medical school for biomed, we had this anatomy class where we actually saw brain slices. I remember thinking how un-brain-like they were. They were fixed with this chemical to preserve them, and they had the consistency of a very thick tofu.
Jacob Trefethen:
That sounds pretty brain-like.
Saloni Dattani:
Yeah, right, but I guess when I’m imagining a brain, it’s quite active, it’s fluid filled, it’s a bit squishy, doing lots of electric pulses going on in there. That felt not very real somehow. This is a very different version of that. That’s in cell culture on a computer chip, or it’s on some 3D printed scaffold or something like that. But it is kind of alive, right?
Jacob Trefethen:
Well, I’m sure there are neurologists listening or other scientists listening who have strong opinions about that answer. It’s certainly alive in the sense that the cells are alive. The questions on top are, you watch them form these structures. You’re like, “Oh gosh, those... I don’t like the look of this.” You kind of want them to form structures so that they can be useful in experiments in terms of the similarity that they share with how a brain’s neurons are structured. I once went into a lab when I was visiting Boston a couple years ago, where they had the brain organoids in one room, basically. I did have a certain feeling of, “Oh, this room is strange.”
Let’s really bring this back to AI because everything we’ve described, even these organoids, still involves the physical world and still involves perturbing physical things in experiments. And I think part of the dream that AI boosters have is that you can do even more of the research and drug development without having to engage in a system outside of silicon. You want this to really happen computationally if you can, because then you can run way more experiments, you do way more, way quicker.
I don’t want to immediately laugh at it because we just talked about how mice don’t generalize that well to humans. I think it’s perfectly possible and I’m indeed hopeful for a world where in a few decades time, mice are much less involved in medical research and we’ve found other routes through. What’s more up for debate is what are those routes through? Do they involve perturbing biological systems, or are there things that you can do purely computationally that we’re not doing yet? Which brings me to the question I wanted to ask you next, which is around, I’m hearing a lot of talk about the new hypey area of biotech: virtual cells! Have you seen much of this talk in the last nine months?
Saloni Dattani:
I have actually. The thing that I read about this was this article by
. I don’t know if you read his substack; it’s very good.
Jacob Trefethen:
It’s a great substack.
Saloni Dattani:
He has this article where he writes about the history of the virtual cell, where it’s at now, what the research is like, what the future might look like. He kind of starts by talking about one of the first mechanistic virtual cells. When I first heard the phrase “virtual cell”, I imagined it was like, I don’t know, it’s in a computer, you’re looking at this 3D diagram or something. But sadly, it’s nothing like that at all. It’s just this computer model, and you don’t really know what is happening.
But with the first system, the first virtual cell that was built, it was a little bit like that, in that there was this team at Stanford that was led by a scientist called Markus Covert, and he studied a bacterium called Mycoplasma genitalium, which is the smallest free-living bacterium probably that we know of. It causes various genital infections, and it has a tiny genome of just 600 genes, and that’s why they picked it. What they wanted to find out was if we can map what each of these genes are doing in this one bacterial cell, maybe this will help us understand how cells actually work as a whole.
What they did was they created this computational model where they represented each individual cellular process in that one bacterium, from their DNA replication, their metabolism, whatever. They collected lots of data from thousands of papers and tried to encode each of these processes as a separate module in their computer model. They linked all of that together and then tried to approximate what would happen in a living cell by updating that model at one second intervals. In 2012, I think they finished making this model and they could use it to predict cell growth and division. It was the first time that a full organism, one bacterium, was simulated in a computer. And one thing that-
This is a really cool project, obviously, but in our second episode when we talked about proteins and all the cool things they’re doing in the body, I do remember talking about how many collisions there are between different molecules in a cell and how many times enzymes collide with other molecules. It was 50,000 or something per second. I do think that even though they’ve made this computer model of the virtual cell, if it’s updating in one-second intervals and it has these, that’s missing a lot of things that happened within that one second, right, as we talked about. So even this very simple bacterium with this very complex computational model is still quite far, I think, from the real biological bacterium in a lab or in real life.
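As a rough illustration of the modular, fixed-time-step design described above, here is a minimal Python sketch. It is not the Stanford team's model: the module names, rate constants, and state variables are all invented for illustration, and only the overall structure (separate process modules, one shared cell state, updates once per simulated second) follows the description in the conversation.

```python
from dataclasses import dataclass

# Figure quoted loosely in the conversation ("50,000 or something per second").
COLLISIONS_PER_SECOND = 50_000


@dataclass
class CellState:
    """Shared state that every module reads and updates (all units invented)."""
    nutrients: float = 5000.0
    biomass: float = 1.0          # 1.0 stands for a newborn cell
    dna_replicated: float = 0.0   # fraction of the genome copied, 0..1
    divisions: int = 0


def metabolism(state: CellState, dt: float) -> None:
    # Hypothetical rule: take up nutrients and convert a little into biomass.
    uptake = min(state.nutrients, 1.0 * dt)
    state.nutrients -= uptake
    state.biomass += 0.001 * uptake


def replication(state: CellState, dt: float) -> None:
    # Hypothetical rule: copy the genome at a constant rate, capped at 100%.
    state.dna_replicated = min(1.0, state.dna_replicated + 0.002 * dt)


def division(state: CellState, dt: float) -> None:
    # Hypothetical rule: divide once the genome is copied and biomass has doubled.
    if state.dna_replicated >= 1.0 and state.biomass >= 2.0:
        state.biomass /= 2.0
        state.dna_replicated = 0.0
        state.divisions += 1


MODULES = [metabolism, replication, division]


def simulate(seconds: float, dt: float = 1.0) -> CellState:
    """Advance every module once per time step, the way the talk describes."""
    state = CellState()
    for _ in range(round(seconds / dt)):
        for module in MODULES:
            module(state, dt)
    return state


def collisions_skipped_per_step(dt: float = 1.0) -> float:
    """Molecular collisions that happen inside one step and are never modeled."""
    return COLLISIONS_PER_SECOND * dt
```

Calling `simulate(2500)` runs 2,500 one-second updates, and `collisions_skipped_per_step()` makes the gap concrete: each update stands in for tens of thousands of molecular events that the model never represents individually.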
Jacob Trefethen:
Is there any way we can draw out the expectations here that we might have on future models? Is there some hope that next year there’s an even better virtual cell and the year after an even better one, and it’s that kind of problem? What do you think?
Saloni Dattani:
I think there are lots of efforts right now to improve these virtual cell projects. The Arc Institute has this Virtual Cell Atlas where they are basically putting together lots of different data sets, measuring different things within a cell and helping people to create better computer models of what’s happening inside each cell. What you’re then doing is trying to perturb each cell with a virtual gene edit. You’ve mapped out in this computer model what is happening between all of the different genes and proteins in the cell, and then according to the other literature or the data that you’ve collected, you’re saying, “Okay, what would happen if this gene was dysfunctional or something? How would that affect the cell as a whole?” They’re simulating billions of experiments and trying to predict their effects using that.
I think these things are definitely improving, but they are limited to an extent by the data that you can collect at all. We don’t have a great way to collect so much information at the level of milliseconds or microseconds, which is often how fast lots of things in biological systems happen. You can try to approximate longer processes and things like that that are happening, but it’s still going to be, I think, quite hard to get to the real life version of what is happening inside an organ or a tissue or a cell.
I think maybe it’s a little bit similar to weather forecasting. With weather forecasting, you’re also trying to make these predictions; weather is very complex, and there are loads of different local environments and things interacting with each other. There’s lots of local data collection efforts that are being pieced together for these bigger weather forecasting projects and computational models. It’s sort of similar to this, right, in that you might not have the data for each individual specific locality and you might not have the right level of data sometimes, but you can still make some predictions.
What’s going to be important is to compare those predictions with what you actually observe in the real world. Just like with weather forecasting, the way that you would improve a model, I think, is by making those predictions, comparing it to the real experiments or something like that and seeing what’s the difference between that and then can we use that to improve the performance of the model. You still do need to do lots of experiments and collect lots of data in order to improve these virtual cells at all. Even when you do, I think there’s still a lot of additional complexity that happens if you’re trying to understand how different cells work together, how different tissues develop or organs interact with each other and so on.
In order to understand those things, maybe you don’t need the specific pathways between every single protein in order to understand these wider structures. Maybe there’s a higher level of data that you need to collect and that could help you understand or predict those processes better. But if you’re really trying to recapitulate the entire thing, that’s going to be still incredibly hard.
Jacob Trefethen:
It’s interesting to try and think through if there are ways to represent complexity that mean that you don’t- to sort of reduce the number of real world samples you have to take. It’s been quite interesting to see over the last decade or so, these projects that are incredibly empirical to try and map out paradigm structures that are beyond just one cell. What comes to mind for me are the brain maps of most recently the fruit fly, where when I was growing up, I heard about scientists have mapped C. elegans, which is 302 neurons of complexity.
Saloni Dattani:
And that’s the worm.
Jacob Trefethen:
That’s a worm, yes, sorry. A small worm, in fact. But scientists know how each of the neurons in that worm’s brain connects to each other. More recently, well, what is a connection? A synapse is where neurons connect to each other. If you’ve got 300 neurons, you’ve actually got 7,000 synapses. More recently, scientists have tried to look at organisms closer to humans than worms, but not quite at the human level: fruit flies. They’re mapping in absolutely exhaustive detail the 50 million synapses that occur in a fruit fly’s brain. Humans, we’re dealing with tens of billions of neurons and I assume trillions of synapses, so we’re not yet at the human level of complexity, but having these models that are able to be represented computationally, but derived from a real fruit fly might be a nice kind of bridge here where we know that fruit flies share many genes, or sort of precursors and sometimes genes themselves, that we have related to our brain, so we know there’s going to be some relevance. We can hone in with a bit more detail on that. But do you- at the same time, this is not a functional model; this is a diagram of a brain more than anything else. Do you hold out hope there of those bridging models being useful for AI?
Saloni Dattani:
I mean, I think in some ways, yes. I think you’re right; this shows basically all the connections in the brain of the worm or the fruit fly or the human, but it doesn’t necessarily show how they interact with each other and what happens if one thing changes. I think that’s basically where you need the experimental information. But one thing that reminded me of was the very time-consuming, very manual effort of actually putting together these brain maps in the first place.
I don’t know if you know about how people figured out the process of programmed cell death, which is just this process by which as cells are developing in an embryo or whatever, there are some cells that basically die off at predictable points throughout an organism’s lifespan. In order to find out about this- there was this biologist called John Sulston, and he tried to map out the development of the embryo of the C. elegans, that worm that you described. He basically looked under extreme magnification at a single worm.
For two to four hours every day, for 18 months, he looked at the single worm and he noted every single cell division and death that was happening in the embryonic worm, as it was growing up. He noted down every single cell division and death as this series of circles. When he went home for his break or at the end of his shift, he kept the worm embryo at a cold temperature to freeze it, and would come back the next day and unfreeze it and then see what happened next. He mapped out this entire worm embryo, its developmental pathway, and discovered that in every C. elegans worm, the cells kind of divide and die at predictable points.
This work basically generalized later on to other species, in different ways obviously. But just the amount of care and detail and data collection and how manual some of that labor can be, in order to understand some of these processes, was really interesting to me. Hopefully, there are better models now that can recognize those things under a microscope so that someone doesn’t have to look under the microscope for two to four hours every day for 18 months to understand these cellular processes. But at the same time, it just goes to show how much data you need to collect in order to understand some of these pathways.
Jacob Trefethen:
I mean, thinking about that individual worm, there’s something almost spiritual about it, like the amount of knowledge that that one worm gave us.
Saloni Dattani:
Thank you to that worm.
Jacob Trefethen:
Yeah, thank you to that worm. It makes me think of a science fiction short story I wanted to write sometime. You know the short story by Ursula K. Le Guin, “The Ones Who Walk Away from Omelas?”
Saloni Dattani:
I don’t know it.
Jacob Trefethen:
I think I’m allowed to spoil it because it is a short story and the gist is famous, but fast forward 30 seconds if you don’t want the spoiler. Basically, there’s essentially a utopia that people are living in, but in order for this utopia to be maintained, there has to be a child who is living in terrible conditions and is at the heart of this otherwise glorious city, sort of living in dirt and not being treated like a human. There are some people who, as the title says, walk away from Omelas. I don’t know how to pronounce that. There are different interpretations of this, and I won’t get into that.
But the short story that I have sometimes wanted to write that’s maybe sci-fi horror is a more scientific version where, let’s say that people in the city called Earth or San Francisco or somewhere think that AI systems will be able to cure lots of diseases if only they had access to high quality data from a human. One person sacrifices themselves and says that for the next 18 months, like that worm, “You can study me with any sensors you want and perturb me in any way you want to generate the knowledge that will allow everyone else to be immortal and live for the rest of time.”
Saloni Dattani:
That’s amazing. I hope you write this short story.
Jacob Trefethen:
Once again, I’m not quite sure what the implication should be from it.
Saloni Dattani:
That did remind me actually of another very similar project in the 1990s called the Human Genome Project. This was again, a type of map, right? You’re trying to map what human genomes look like, where the genes are, what each gene is doing, which proteins it’s producing, and so on. I think the initial, the kind of reference genome that a lot of people use is this composite of just one single genome. They compare genetic mutations that you see in other people with that composite reference genome. While that is useful in a lot of ways, it’s also very limited because naturally there’s a lot of variation between people, and that reference might not be exactly the right reference that you should be using for a particular thing, or you could just be missing out on the internal variation, the larger structure, it’s not necessarily just that one mutation that might be different, but there could be many larger segments of your DNA that could be very different between people, and that’s something that would be hard to tell if you’re just using a single genome.
I think in the last five years, people had recently published this new research project called the PanGenome Consortium, where instead of just having a single human genome reference, they mapped some, I think it was like a few dozen people’s entire genome sequences and used diagrams to represent how different parts of their genome varied from each other. You have this more diagrammatic way to show what the entire genome looks like rather than the single linear sequence. I think it’s sort of similar because even with all of these, the brain maps or the developmental maps and the single worm that was used for this pathway, obviously, there is lots of stuff that we can learn from that. But at the same time, most of the interesting things in humanity are things that vary between us. If you aren’t able to study that and you don’t have good references for what the variation looks like, then it’s harder sometimes to tell what is causing differences between people, how different processes are linked to each other, how to actually understand the causes and effects within the body.
Jacob Trefethen:
Okay. Well, stepping back, we’ve discussed all different forms of models that currently get used in drug development and some models that are at the frontier getting more use. Let’s just see, is this going to still be a necessary part of drug development going forward, and will AI help or not so much?
Saloni Dattani:
My summary from all of what we talked about here, the animal models, the cell culture, the organoids, the virtual cells, is essentially all of them have various limitations. The animal models are easy to work with sometimes; they’re what we’ve traditionally used. But in many situations, they don’t really recapitulate what human disease is like. With the organoid models or even the cell culture, they’re quite limited to the specific features that you’re trying to replicate. At least with the organoids, I think they’re human cell derived, so there are some parts of them that might be more similar to us, but they don’t capture the whole complexity of what your organs are interacting with in your body and things like that.
The virtual cells, I think, are very interesting, but you still need to either experimentally or computationally perturb them in order to understand what is actually going on. Because really it’s this complex, almost black box model, and you need to do various computational things, or you need to do experiments in order to understand what the pathways are like within that cell, and it’s just a single cell. I think we are still pretty far from capturing what the biological complexity of a tissue or an organ or a human body is like.
Jacob Trefethen:
A human body, a human body, a human body. I think it’s time to talk about human clinical trials.
Drug efficacy
Saloni Dattani:
Okay, so animal models, organoid models, and virtual cells all have different limitations. But I think really the benchmark for whether something will show success in treating human disease or preventing human disease is actually by testing it in humans, in us.
Jacob Trefethen:
Fundamentally, what we’re asking in human trials can be split into two things in terms of the knowledge we’re trying to gain, I would say, which is: safety and efficacy. There are further things that a drug approver will be interested in that are to do with the manufacturing process of a given drug to assure that you’re making things to the right product quality and that kind of thing. But for simplicity, I’m just going to talk about knowledge generation of safety, knowledge generation of efficacy. You can pick, which do you want to do first?
Saloni Dattani:
Let’s do efficacy first. I want to know if things work.
Jacob Trefethen:
What are you doing when you’re testing for efficacy? Well, you are trying out a drug in a population that you think is relevant enough to the people who might end up using the drug after approval to show that it works for those people as good or better than the other options they might have. In order to do that, there’s a few steps you got to go through. First of all, how do you get those people in this clinical trial? That’s actually one of the under-appreciated maybe bottlenecks of drug development is that very few people actually take part in a clinical trial.
Something like 5% of people in America have taken part in a clinical trial in their lives. At least that’s the last statistic I saw. That means that you’re already dealing with a big loss of potential. If more people volunteered, we would get more medical research done. We might want to talk about why more people don’t volunteer.
I think there’s a couple reasons for that. One is just information; people are not aware that they might actually be able to contribute to medical research in a given topic, or it just doesn’t come to mind. You have to be told by a doctor if you have a particular disease, then they might make you aware in a one-to-one communication, or maybe you’ll see a poster on a pinboard somewhere and volunteer that way. I’m such a lazy person that I, even when I do think, “Oh, I should volunteer for something,” it ends up, you know, the last one I was considering volunteering for, it just clashed with my actual work life.
Saloni Dattani:
That seems normal.
Jacob Trefethen:
Oh, thank you. I didn’t end up doing it, but just the practicalities of life get in the way. Another one is compensation. There’s a debate in the bioethics community about how much you should compensate people who participate in clinical trials. Should you just give reimbursement for travel to get to the clinical trials, or should you compensate people with actual payments that are more substantial? There are different opinions on that question, but it is clear that it does affect how many people do partake in medical research. Some people report that they’re perfectly fine with the risk, but the fact that they have to take a day off work means that they will get less income and all of that.
Saloni Dattani:
That totally makes sense. I feel like there are also maybe other types of compensation. This is just speaking from my own personal interests, but I once tried to sign up for a study just because they said that they would provide me with a scan of my brain, and I was like, “I want that up on my wall or something. I want to know what my brain looks like.”
Jacob Trefethen:
You know, there’s compensation, but some things are priceless.
Saloni Dattani:
I totally agree that these things really are incentives because you are taking time out of your day, sometimes several days per month or sometimes for a period of months or even years potentially, to participate in a clinical trial. You really have to think about what the other things that people could be doing are. What is the opportunity cost for someone to be part of this clinical trial versus continue their job or relax at home or something like that? There’s also the potential risks that are involved with taking a new treatment that’s still in an experimental stage and things like that, and how you can compensate that as well.
Jacob Trefethen:
I think just walking through other examples of cases I’ve seen where recruitment has been difficult before, you know, before proposing any solutions, I mean, there’s this example from hepatology.
Saloni Dattani:
The liver?
Jacob Trefethen:
The liver, that’s right. There’s some viruses down there. One of them is hepatitis C. A group of scientists were trying to test a vaccine to see if it worked to prevent hepatitis C infections led by Andrea Cox.
Saloni Dattani:
Hepatitis C, that’s the one that causes cancer and liver disease, right?
Jacob Trefethen:
Yes, there are a few hepatitis viruses that cause liver disease, liver cancer, cirrhosis. Hepatitis C is one of the happy medical research stories of the different hepatitis viruses, because it was discovered in the 1980s and the first cure was developed for it only 22 years later, I think 2011. We now have many different cures for hepatitis C that you can take. Now, in addition to cures, which take about three months to complete, it would be quite useful to have a vaccine because a lot of people who get cured, a lot of people who are affected are people who inject drugs. If you don’t sterilize a needle, then you’re at risk of getting reinfected. Even if you get cured, you might get reinfected.
There was a research group that tried to run a clinical trial on a hepatitis C vaccine to determine whether it works. To do that, you have to enroll a lot of people. In this case, people who often don’t have the best contact with the medical establishment, people who inject drugs are often not as likely to show up for all future events in this clinical trial, so you’re already dealing with some complexity there. In addition, in order to statistically show that the vaccine is better than the placebo, you have to accumulate enough infections in the placebo group, more than the number of infections in the treatment group, to show that you have a vaccine that works. In this case, the limitations on how hard it was to enroll people plus the number of people you had to enroll to reach that conclusion meant that they ran the trial for six years, and it took six years to reach the conclusion that the vaccine probably didn’t work. It’s kind of horrifying how slow things can be when statistics is what is the real driver there.
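To make the statistical bottleneck concrete, here is a minimal Python sketch of why an event-driven trial can take years. It uses the standard Schoenfeld approximation for the number of events needed in a 1:1 randomized trial, and then asks how long a cohort would take to accumulate them. The enrollment size, infection rate, and efficacy below are invented for illustration and are not taken from the hepatitis C trial; dropout, which the conversation flags as a real problem, is ignored and would only lengthen the estimate.

```python
import math
from statistics import NormalDist


def required_events(vaccine_efficacy: float, alpha: float = 0.05,
                    power: float = 0.9) -> int:
    """Total infections (both arms) needed under Schoenfeld's approximation.

    Assumes 1:1 randomization and treats 1 - efficacy as the hazard ratio.
    """
    hazard_ratio = 1.0 - vaccine_efficacy
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


def years_to_reach(events: int, participants: int,
                   placebo_incidence: float, vaccine_efficacy: float) -> float:
    """Years of follow-up until the expected number of infections is reached.

    placebo_incidence is infections per person-year in the placebo arm.
    Half the participants are in each arm, so the average rate per person is
    the mean of the placebo rate and the (reduced) vaccine-arm rate.
    """
    mean_rate = placebo_incidence * (1 + (1 - vaccine_efficacy)) / 2
    return events / (participants * mean_rate)


# Hypothetical scenario: 500 participants, 5 infections per 100 person-years
# in the placebo arm, and a vaccine that halves the infection rate.
events_needed = required_events(vaccine_efficacy=0.5)
duration = years_to_reach(events_needed, participants=500,
                          placebo_incidence=0.05, vaccine_efficacy=0.5)
print(f"{events_needed} infections needed, about {duration:.1f} years of follow-up")
```

The point of the sketch is that the trial length is set by how fast infections accumulate, not by how fast the science moves: halving enrollment or halving the background infection rate doubles the wait.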
Saloni Dattani:
That also reminds me of another example of Zika virus, right? It’s quite different, but similar problems in the end. Zika virus, as people might know, is transmitted by mosquitoes. It is this epidemic infectious disease. There are big outbreaks in some years, and then for many years or a decade or more, there might be very little of this spread once people have built up enough immunity.
But what’s very challenging, therefore, about developing vaccines against the next outbreak is that in the meantime, when you have this lack of outbreaks going on for years, it’s very hard to get enough infections in the trial to see whether the vaccine protects you more, or reduces that number further, if the number to begin with is very low. But there’s another challenge, even if there was a local outbreak. Most people don’t tend to get symptoms from the infection. The really concerning part of the disease is the fraction of pregnant women who get infected, because it can cause miscarriage and it can cause congenital Zika syndrome, which causes various birth defects and things like that. That’s really what you want to prevent.
If you are trying to run a normal trial and you’re trying to see, “Well, is this vaccine going to reduce the number of cases of congenital Zika syndrome?” that’s really hard. In a typical type of trial, you would need hundreds of thousands of participants, because not very many of them develop any symptoms; there might be very few participants who get pregnant during the trial; and you aren’t necessarily testing that frequently for whether they’re infected.
There could be another option. I think there are different options that you can take to making this whole process faster. One is instead of only testing against congenital Zika syndrome, which is in pregnant women and babies, you could instead test, does it reduce infections of Zika virus in general? You could test everyone frequently with PCR testing or something like that to see if they’re infected by Zika virus and see how much the vaccine reduces that in the trial.
Or you could go even further and do what is called a challenge trial. You could deliberately infect volunteers, not pregnant women, but other women who are not planning to be pregnant, taking contraception and things like that, in order to test whether it protects them from an infection. Those volunteers would be deliberately exposed to the virus. If you did something like that, you can time the infection: you’re not just waiting for the infection to happen, you’re not waiting for an outbreak, you’re deliberately giving them an infection. You get to monitor them closely, test them very frequently, and actually understand the specifics of the disease and how it develops, so you can test more carefully whether this vaccine works. In contrast to the hundreds of thousands of people you would need in a typical type of trial, in this case, you would only need a few hundred, or even fewer sometimes. You can kind of speed up this process by really just changing the design of the trial, and that’s something that I find really interesting, but it’s just one way.
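To see why the outcome rate matters so much, here is a minimal sketch using the textbook sample-size formula for comparing two proportions. The risks plugged in are hypothetical and are not meant to reproduce the Zika numbers; they only show how the required size swings when the outcome goes from rare to common.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(control_risk, vaccine_risk, alpha=0.05, power=0.80):
    """Participants per arm needed to detect a difference between two proportions."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    variance = control_risk * (1 - control_risk) + vaccine_risk * (1 - vaccine_risk)
    return ceil(z_total ** 2 * variance / (control_risk - vaccine_risk) ** 2)


# Hypothetical field trial: the outcome hits 1 in 1,000 controls, the vaccine halves it.
field = n_per_arm(control_risk=0.001, vaccine_risk=0.0005)

# Hypothetical challenge trial: 80% of exposed controls get infected, the vaccine halves it.
challenge = n_per_arm(control_risk=0.8, vaccine_risk=0.4)

print(f"field trial: {field} per arm, challenge trial: {challenge} per arm")
```

The vaccine has the same relative effect in both cases; the difference comes entirely from how often the outcome occurs in the control group.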
I think there are many different ways that you can speed up the process of doing this recruitment or running a trial. I do think challenge trials are really cool and sometimes they’re really the only option if you can’t wait for another Zika virus outbreak or something like that. But on the other hand, they’re quite difficult to actually set up. Sometimes it’s hard to recruit volunteers to infect them with a dangerous pathogen or something like that.
Jacob Trefethen:
That’s crazy.
Saloni Dattani:
Imagine trying to run a challenge trial for rabies or something, which kills most people it infects. Or trying to run a challenge trial for some disease that infects children, you know, and that has a lot of ethical issues around it. But there are also scientific challenges as well, like trying to actually infect someone in a similar way that they would get infected naturally, to really see how well this vaccine or drug might work in the real world. Sometimes it’s quite hard to culture a pathogen or a microbe in the lab in order to do that and then to be able to infect them at all and know what the right dose is that you should give them and so on. Of course, this doesn’t really work at all for diseases that are not caused by infections. I don’t know if there’s a version of a challenge trial with other non-communicable diseases. For injury trials, I hope we’re not actually cutting off someone’s foot or something in order to test the treatment, but maybe there’s some analogy.
Jacob Trefethen:
I’m wondering if there’s, you know, those attempts to heal burns. Is there a challenge for you to get burned?
Saloni Dattani:
Burn heal!
Jacob Trefethen:
Burn heal. Final generalization point I was going to make on challenge trials, though, is about adults versus children. A lot of vaccines you’re making to try and benefit children, not always, but often. It would not be ethical to have children in a challenge trial because they can’t consent in the same ways that adults can. You end up doing trials in adults that you hope might generalize, but you really don’t know because the immune system of people at different ages is pretty different, and they’ve had different levels of exposure to the pathogen of interest before sometimes.
Saloni Dattani:
Right. I was going to say, maybe you just choose some really short adults like me or something and hope they’re representative of children. But you’re right that obviously I’m different from a seven-year-old.
Jacob Trefethen:
That’s my best guess.
Saloni Dattani:
So I mean, if we think about improving recruitment, for example, maybe you could have better websites or better search tools that match people according to the conditions that they have or their interests to the trials that are ongoing, something like that. Or in some cases, you could automatically enroll people into a trial if the different options that they would get for treatments are already things that they would get in real life. So let’s say it’s not for an experimental drug, but it’s for treatments that are already varying in the population. That might be an option. Maybe there are different administrative things you could do to make trials run faster.
Jacob Trefethen:
It sounds like most of those things that could be done are not AI-specific problems. Some of them might be amenable to some boosting from AI. So maybe the website one and the recruiting one. What do you think?
Saloni Dattani:
I think those two for sure. Right now in the US or even in the UK, it’s quite hard to actually figure out which trials are ongoing and to register your interest in them. There are some websites that do that, but if you have a disease or some condition, it’s quite hard to then get matched to trials. Even simplifying something like that would be probably really useful. But at the same time, maybe it’s just that 5% of people in America who are going to multiple trials. How are we going to actually expand the population beyond that? I think that’s something that needs more thinking. This could be a situation where you want that automatic enrollment for the different treatments that are already available to test what’s better. Or better compensation, or other incentives to get people volunteering, if the only reason that they have to not volunteer is because it’s too time consuming, or it doesn’t pay them enough, or it’s just not worth giving up other things they could be doing.
Jacob Trefethen:
Right. Just to give one more example that I think illustrates something that AI won’t be able to get around. Sometimes the data you’re collecting is more invasive than others. You just talked about a Zika challenge trial where you really want to stare at that before deciding to volunteer or not and make sure you’ve understood the risks. An example that we’ve come across in my work at Open Philanthropy is we’ve funded trials related to Alzheimer’s where you want to do a spinal tap, basically, and sample cerebrospinal fluid to be able to test two years into a trial whether something is having a benefit or not. That’s quite invasive; having a spinal tap is painful and not exactly fun. Sure enough, we’ve seen that dampen recruitment in trials because you just can’t get people to sign up for that part unless they really have to. Sometimes trials end up switching: “Okay, we were planning to collect spinal fluid, but for some participants or maybe all participants, we’re actually just going to do an Alzheimer’s blood test and use that as an input.” But then you didn’t get as useful a sample, and the AI is not going to have as much data to work with there because you didn’t get to collect it. That’s absolutely as it should be. People should get to decide when they give up their cerebrospinal fluid, so that’s another real-world difficulty here.
I do have some optimism on the efficacy front, though, from an AI point of view. Can I pitch you on a couple of things?
Saloni Dattani:
Sure.
Jacob Trefethen:
What you’re saying makes sense and it takes me back to our last discussion about animal models. One of the big things you’ve always got to have in mind is how much this animal model is going to generalize to the human populations we care about. Even with human models, such as challenge models, you have to keep generalizability fully in sight. Your example with Zika has me thinking about other ways that things won’t generalize. With hepatitis C, as I just mentioned, many people get infected from injecting drugs, which means that, strangely enough, that is quite similar to how you might get infected in a lab, because an injection is what doctors are used to doing.
But with something more like flu or strep or rhinovirus, if you’re trying to do a challenge model, the way that you might get infected by a doctor, if they use a needle, well, that’s not how I’m getting infected with rhinovirus usually. You’re like, “Okay, so you’re going to simulate a classroom where there’s some kids coughing next to me.” That’s not so similar. You might get the exposure level way too high, for example, or if you get it too low, you won’t get an infection.
Saloni Dattani:
I was going to say it reminds me of the challenge trials that are used for malaria research, malaria vaccines, where people are in a room filled with mosquitoes or they have their hands in a little or in a big jar that’s filled with mosquitoes and they’re just waiting to get bitten by them.
Jacob Trefethen:
I really want to volunteer for a malaria one because they’re so well established and they’re actually extremely safe because malaria is so curable. But something about intentionally putting your arm in somewhere where you know you’re going to get a bite that will give you malaria is just so hardcore, I love it. It reminds me of-
Saloni Dattani:
But it’s hundreds of bites, probably.
Jacob Trefethen:
You know, those old TV shows of challenges you had to complete to win some prize. I used to watch one growing up where you had to go into a snake pit.
Saloni Dattani:
Fear Factor, right?
Jacob Trefethen:
Yes. Oh, for sure. That’s exactly the vibe.
Saloni Dattani:
Are there examples of trials that have been quite fast and successful?
Jacob Trefethen:
Yes, and in that lies some of my optimism for AI. I think we’ve talked about how hard it is to recruit and how long things can take, but all of that is a statistical question. Now, compare a drug that’s okay with a drug that’s excellent and cures everyone: for the second drug, you will need fewer participants to statistically prove it works, so you can get there pretty quickly. There was this case of Gleevec, which is the brand name; imatinib is the drug name. It’s a cancer drug that was so good in the phase two trial that they actually got FDA authorization for the drug before the phase three trial had even reported out, and the phase three was almost more of a confirmatory trial. My hope would be that if you get AI improvements that lead to better drugs coming into clinical trials, you would also see a benefit of cheaper, shorter, and hopefully less recruitment-contingent efficacy trials. What do you make of that?
Saloni Dattani:
That’s a really good point. I mean, there are other examples of this as well, right? In our first episode, we talked about AZT, azidothymidine, as a treatment for HIV, and that trial also was stopped early in phase two. I remember that between the two different arms, 19 people died from HIV in the placebo arm, versus only one in the treatment arm. They stopped it early because this is clearly a big difference statistically, given the number of people in the trial. Because it was so effective in this short amount of time, it was so dramatic, you could end this trial ahead of schedule using these preliminary results. That’s all you really need to know at that point.
But then, sometimes it’s not just about, is this treatment working on average? Because one of the reasons that people do phase three trials is not just to get better efficacy data, but also to test it out in a larger number of people where there is more variation. Some of them are going to have rare side effects, or for some of them, it’s not going to work. In order to understand why those differences exist and to study them better, you sometimes still do need large trials.
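As a back-of-the-envelope check on how lopsided 19 deaths versus 1 is, here is a minimal sketch of a one-sided binomial test. It assumes the two arms were roughly the same size, which the conversation doesn’t specify, so each of the 20 deaths would be equally likely to land in either arm if the drug did nothing.

```python
from math import comb


def one_sided_p(deaths_in_treatment, deaths_total, share_in_treatment=0.5):
    """Chance of seeing this few (or fewer) deaths in the treatment arm if the drug did nothing."""
    p = share_in_treatment
    return sum(
        comb(deaths_total, k) * p ** k * (1 - p) ** (deaths_total - k)
        for k in range(deaths_in_treatment + 1)
    )


# 1 death on treatment out of 20 deaths overall, assuming equal-sized arms.
p_value = one_sided_p(deaths_in_treatment=1, deaths_total=20)
print(f"one-sided p-value: {p_value:.6f}")
```

Under that assumption the chance of a split this extreme is about 2 in 100,000, which is why a small trial was enough to stop early. A real interim analysis would use the actual arm sizes and a pre-specified stopping rule.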
It also reminds me of COVID. The COVID trials were very fast, finishing within a year. That’s despite them having some 30,000, 40,000 participants per trial, right, for the COVID vaccines. That really shows some of the things that can be improved in order to speed up clinical trials in general. I think there are several things that worked in those cases. One is, it was much easier to recruit people into COVID trials. Many more people were interested in it. Also, the disease was very prevalent to begin with, so it was easier to get the number of infections in the control group to see what the effect of the vaccines was.
The other reason was that the trials kind of happened in parallel. The phase one and two trials were happening at the same time, and the phase two and the phase three trials sometimes were happening at the same time. That means the full timeline can be shortened if you’re doing these different stages simultaneously.
The next thing was there was this rolling regulatory review. When you first described the drug development pathway, you said you go through phase one, phase two, phase three, and then you submit this huge data package to the regulator to see whether they approve the drug. But in this case, what happened was the regulators were looking at the data as it was coming in. They started looking at the vaccine manufacturing sites while the trials were ongoing. There was also Operation Warp Speed, and the amounts of funding meant that pharmaceutical companies could take risks, in the sense that they could start these phase three trials very soon, without waiting for the results of the phase one and phase two trials to come in, and they could decide, “Let’s just do them all at once.”
Jacob Trefethen:
Sounds like some of those are not amenable to AI, though, I mean, you tell me if I’m wrong. I’m curious if there are alternatives to long efficacy trials that you think might be more amenable.
Saloni Dattani:
I guess the other option is instead of looking at the general outcomes for: has someone developed some disease or have they died from the infection or whatever, you look at biomarkers or things earlier on in the disease progression, in order to see whether the treatment or the vaccine works. Sometimes that can work. If you’re looking at the shrinkage of tumors or something, in trying to understand whether this is going to improve survival of people with cancer, or if you’re looking at the viral load of something in the body, maybe that is a good correlate of how severe the disease is.
If you know about human papillomavirus, for example, the efficacy of those vaccines was tested in this way. Human papillomavirus is a type of virus that causes genital infections, and some of those can lead to cancers developing over a longer period. If you’re able to tell what the initial changes are in cells before they turn into full-blown tumors, and if you can see whether the vaccine reduces those initial stages, then, if there’s a progression between those things, you can hopefully have a pretty good idea that it’s also going to reduce the various cancers that the virus is associated with. That’s exactly what happened. The HPV vaccines were very effective against genital warts in clinical trials. Since then, they’ve also been shown to be extremely effective at reducing the types of tumors that people could get from HPV.
Jacob Trefethen:
As of now, there were no cases of cervical cancer last year in Scotland because the vaccine worked so well.
Saloni Dattani:
Yeah, so it was rolled out in schools and different cohorts. I think the cohort that was born roughly around my age, none of them had cervical cancer versus dozens or more in previous cohorts.
Jacob Trefethen:
That is so awesome. When thinking about new biomarkers, can we have the next case of that where you have some intermediary thing that does block off the really bad thing at the end? I think for me, this is an open question with AI, but I do frankly hold some hope out for it.
I think you can design systems that recognize particular signatures that someone in a disease state might be giving off, whether that’s from a blood sample, whether that’s from some other type of sample, that AIs might be better at clustering and creating those signatures from many different inputs that we might not think to put together. Now, for any one of those signatures, you’re going to really have to validate that it’s real. Wherever an AI spots some correlation, in the cytokines and white blood cell counts of some blood sample with a disease state, you then want to test in a new population, well, does that correlation hold? If you perturb these systems such that that count changes, are you definitely changing the disease state? So I don’t think it would be magical, but, you know, it might help a lot if you could validate more of those clustered setups. I do think AI is going to be better than us at some of those. Am I too hopeful? What do you think?
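One way to see why validation in a new population matters: a minimal sketch of how many spurious biomarker "hits" you would expect from screening alone. It assumes, purely for illustration, that none of the candidate signatures is truly related to the disease, that each is tested independently at the usual 5% threshold, and that the replication cohort is independent of the first.

```python
def expected_false_hits(n_candidates, alpha=0.05, n_independent_cohorts=1):
    """Expected number of unrelated biomarkers that pass by chance in every cohort."""
    return n_candidates * alpha ** n_independent_cohorts


# Hypothetical screen of 1,000 candidate signatures.
screen_only = expected_false_hits(1000)
with_replication = expected_false_hits(1000, n_independent_cohorts=2)

print(f"chance hits after one cohort: {screen_only:.1f}")
print(f"chance hits surviving replication: {with_replication:.1f}")
```

Replication cuts the chance findings sharply, but it only addresses noise. It doesn’t tell you whether changing the biomarker changes the disease, which still needs the kind of perturbation test described above.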
Saloni Dattani:
I think you’re probably right. I think it’s still quite hard because again, these are really just correlations. I think you often still do need to think about, you know, what is the causal pathway? Are these confounders? How are they related to the outcome that you’re interested in? Does the drug or the vaccine, if it reduces that biomarker, does it actually reduce the disease, or is it just reducing something else that’s a byproduct of the disease that isn’t going to cause its progression? With Alzheimer’s, for example, do we know that the amyloid plaques themselves are causing the disease or are they a confounder? I feel like I still don’t know the answer to that. If that’s the case for other diseases as well, you need stronger evidence sometimes, even if some of the correlational evidence is quite strong.
Jacob Trefethen:
Okay, you’re puncturing my optimism.
Saloni Dattani:
I do agree that I think probably AI models are going to be better than regular statistical models just testing each correlation one by one. Probably it’s easier to spot patterns and things like that with larger models and scan many different datasets and find these comparisons much better.
Jacob Trefethen:
What do you think about rare diseases?
Saloni Dattani:
With rare diseases, it’s often really hard to recruit enough participants for a regular trial, right, because it’s rare to begin with. Secondly, all the people who have some rare disease might all be contacted by different trial research groups to participate, but they can only really participate in one or two maybe, and so that’s often quite difficult. There are some registries online where people with rare diseases can sign up and then be notified if there’s a clinical trial for their condition. But the problem is really, how do you design a trial so that you can find out whether treatment is effective when there are only a small number of people who have that condition to begin with?
I think there’s maybe two different approaches to this. One is running a smaller scale trial. The people with these rare, very severe conditions don’t have any other options. If we can study them or monitor them closely and we can target the treatments very specifically to the condition that they have, then we don’t necessarily need a big trial. It’s the example that you gave before where let’s say someone has a rare genetic condition and you can use CRISPR or some kind of gene therapy to specifically target the individual gene that’s involved. If you’re able to do that, it might have a massive effect size. You don’t need that many people in the trial to see whether it’s effective. You still might want a larger trial to see what the side effects are like, to see how much heterogeneity there is, how much variation there is, between different people with that disease and how they respond to the drug or vaccine.
But there’s another option as well. That option is to set up collaborations. So not just to work in a single individual country, but set up these collaborations to recruit people from around the world for this particular condition. One really successful example of that is with childhood leukemia. When I was growing up, the movies that you would watch about kids with leukemia were extremely depressing. This child suddenly develops this horrible cancer and they only have a few years to live and their parents are really struggling with that future prospect being taken away from them. But the situation is very different now, and the survival rates are much higher than they were in the past. I think before the 1960s, the survival rate was around 15%; that is, about 15% would survive more than five years. If they managed to survive more than five years, that generally means they’ll have a roughly average lifespan, and by that point, you can see them as effectively cured of the condition. But now, that survival rate has moved up from 15% to 85% or 90% for some types of childhood leukemia and 60% to 70% for others.
That’s really high. I think there are different reasons for that amount of progress. One big reason is that there have been these collaborative research groups across the world essentially set up to study the condition. There were these collaborations to enroll different children with leukemia across the United States, across Canada, and then separately in Europe. They kind of merged, and now there are these international research groups where you’re testing particular treatments or different types of regimens across kids in different hospitals in different countries in the world. Because childhood leukemia itself is quite rare, if you’re able to find more people with the condition across the world, you don’t have to just limit yourself to one particular country, and you can then get enough data for a regular clinical trial. From my understanding, that collaboration has been a big driver of progress in the condition, just learning what types of regimens work better, how they work differently for different kids with leukemia, with different mutations in their cancers’ genomes, and things like that. They’ve really helped to make the treatment more effective and safe.
Jacob Trefethen:
That’s amazing. It’s the least AI-able thing, human cooperation.
Saloni Dattani:
I could imagine it’s really hard to set up these trials if you don’t speak the language there, or if you’re not aware of the other trials going on. I would imagine that to some degree, AI can probably help with that.
Jacob Trefethen:
While we’re talking about different efficacy trial designs and multicenter trials and all that, is there anything clever, AI or unrelated, that you wish more drug developers would try out?
Saloni Dattani:
I think one different approach is really to change the design of the trial itself. In a regular clinical trial, you are usually testing one treatment versus one control group, right? That’s fairly inefficient, I think, because you have to repeat having this control group for each new trial that you’re developing. There are better ways to do this.
One example is a platform trial. What you’re doing there generally is you’re testing multiple treatments against one control group. Because there are multiple treatment groups as well, you have more data, which means that you can kind of see what the natural fluctuation is or the natural amount of variation is between people. That allows you to reduce the sample size overall. But it also means instead of having five control groups for five trials, you have one control group for the five different treatments. That is a much faster way to run clinical trials. The most famous example of this was during the pandemic. The RECOVERY trial in the UK was a platform trial where researchers tested more than a dozen types of treatments against the control group in one single trial, across a period of two years.
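The saving from sharing a control group is simple arithmetic. A minimal sketch, assuming every arm is the same size; the arm size of 1,000 is a made-up number, and real platform trials often enlarge the shared control arm or adjust for multiple comparisons, so treat this as an upper bound on the saving.

```python
def separate_trials_total(n_treatments, n_per_arm):
    """Each treatment gets its own trial with its own control arm."""
    return n_treatments * 2 * n_per_arm


def platform_trial_total(n_treatments, n_per_arm):
    """All treatments share a single control arm."""
    return (n_treatments + 1) * n_per_arm


# Five treatments, hypothetical 1,000 participants per arm.
separate = separate_trials_total(5, 1000)
platform = platform_trial_total(5, 1000)
print(f"separate trials: {separate}, platform trial: {platform}")
print(f"participants saved: {separate - platform}")
```

With five treatments that is four control groups’ worth of participants who don’t need to be recruited, and the saving grows with every treatment added to the platform.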
I think that kind of thing is much more efficient than a regular type of trial. These types of trials are pretty great, I think, in terms of efficiency, but it’s hard to actually coordinate them and set them up because now instead of just working with one drug developer or something, you’re working with five, and they have different timelines. Maybe they don’t want to test their drugs against their competitors’ drugs in the same trial, and they don’t want theirs to look worse and they don’t want to take that risk. So it’s hard to set those up, but if you can find better incentives for them or better protocols, then sometimes they’re much more effective.
Jacob Trefethen:
My colleague Ray Kennedy at Open Philanthropy funded a trial I thought was very cleverly put together by the researchers, which was a platform trial of two different drugs that hopefully work as antivenoms if you get bitten by a snake. The way that that worked, given the difficulties you mentioned, is that both of those drugs were repurposed drugs, I think they were both off patent, so that just makes it way easier if a philanthropy or a government wants to do that. I think that was also true in the case of most, possibly all of the drugs in the RECOVERY trial that you mentioned. They were already drugs in use for other purposes, which means that a government can fund them to be used without having to question a pharmaceutical company too much.
So we talked about human efficacy trials and we’ve tried to speculate on some of the ways AI might help and AI might not help. What’s your overall take? How much will AI help with human efficacy data?
Saloni Dattani:
My takeaway was, human efficacy — collecting data on whether things work in humans — is really the goal. If you want to develop treatments or vaccines or preventives for human diseases, you at some point need to use data from humans. The problem is sometimes it takes a very long time to collect that data. Sometimes it’s really hard to recruit people into those trials. Sometimes the trials are just not designed very well and they just take much longer, or are not very informative. There are lots of ways to speed that up through improving recruitment, improving the design of the clinical trials, maybe by having big collaborations and things like that, or testing different outcomes using biomarkers, things like this.
I think with these various different approaches, there are ways that AI can help, but it can’t really replace this need for actual human data. The human body is really complex. It’s often hard to predict how effective things are going to be. Even when you use biomarkers, sometimes they don’t correlate very well with the disease. Sometimes they’re confounded. Sometimes we might just not have good biomarkers at all for the condition. Sometimes, even if you find ways to better match people to enroll in clinical trials, they still need to actually come in. You still need the nurses and the doctors to run the test, do the operations, or actually perform different functions in order to collect that data and see whether the drug or vaccine works.
Jacob Trefethen:
That all makes sense to me. I would say my biggest hope personally for efficacy trials and improvements coming from AI to efficacy trials actually comes from AI improving drugs earlier in the design pipeline. It’s not about the trials per se, it’s that if you have better drugs that are more likely to work really well entering the clinic, then you don’t have to have as expensive trials.
Saloni Dattani:
Yeah, I think I agree with that. I also feel like in some ways we are talking about various obstacles that are still going to remain even with AI. But I think in another sense, we’re actually kind of giving people a roadmap for what other things need to be fixed, what other things need to be reformed or sped up.
Jacob Trefethen:
Absolutely right.
Drug safety
Saloni Dattani:
So, we’ve talked about collecting data on whether a drug works, but what about whether it’s safe? I guess I tend to think of safety and efficacy as really two sides of the same coin. We can often study them in similar ways. The safety data is often collected as part of regular clinical trials, so you could analyze the difference between different safety outcomes in the same way that you would analyze the difference between the efficacy of a drug or a vaccine.
But in practice, that doesn’t really work. The reason is that it’s usually difficult to predict which side effects will develop in the participants. Usually what happens is you’re only doing a statistical analysis on the overall number of side effects that are seen in the treatment arm or the placebo arm, while each of the individual side effects might vary quite a lot between people.
The second thing that’s different is that it’s quite hard to detect rare complications without a very large trial. For example, the COVID vaccine trials showed that the vaccines are very effective, and they had 30,000 or more participants in each trial. These are some of the largest vaccine trials in history, and 30,000 or more is really large, I think, for a clinical trial. But at the same time, it’s still not enough participants to detect some of the rarer side effects of the vaccines like myocarditis, which affected 1 in 100,000 people overall with the mRNA vaccines. If a trial has 30,000 participants, that’s obviously a lot if you want to study the efficacy, but it’s not if you want to look at these rare side effects. Maybe only a single person in that trial might have had this side effect. That I think means that you have to treat safety as slightly different from efficacy, and often you’re thinking about collecting that data in different ways. What do you think about that summary?
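A quick sketch of the arithmetic, using the figures mentioned above (a side effect affecting 1 in 100,000 people, and a trial of 30,000). For simplicity it treats all 30,000 participants as if they received the vaccine, which overstates the chance of seeing a case, since a placebo-controlled trial gives the vaccine to only part of the group.

```python
from math import ceil, log


def prob_at_least_one(n_people, rate):
    """Chance that at least one person out of n_people has a side effect with this rate."""
    return 1 - (1 - rate) ** n_people


def n_for_detection(rate, confidence=0.95):
    """People needed to see at least one case with the given probability."""
    return ceil(log(1 - confidence) / log(1 - rate))


rate = 1 / 100_000
print(f"expected cases among 30,000: {30_000 * rate:.2f}")
print(f"chance of seeing any case: {prob_at_least_one(30_000, rate):.0%}")
print(f"people needed for a 95% chance of one case: {n_for_detection(rate)}")
```

Seeing a single case would not be enough to attribute it to the vaccine anyway, so in practice the numbers needed are larger still, which is part of why rare side effects tend to be picked up after rollout rather than in the trial.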
Jacob Trefethen:
I think that’s a fair summary. Also I often think about the timeline being different. You will get many adverse events that would count as safety flags near to the time that you first take the drug. If it’s a vaccine, you might get negative effects straight after taking it; injection site pain is probably the most common adverse event from vaccines and you get that immediately. Within seven days, you’ll see a lot of negative events. If they’re going to show up, they’ll show up by then, and especially within 30 days. Same with drugs: if you ingest a drug that is going to make you nauseous, it’s not going to make you nauseous in six months; it’s going to make you nauseous when you ingest it. So sometimes the safety signal will show up early.
Whereas for efficacy, again, it depends on the particular case you’re wondering about. Sometimes a drug will be efficacious within the day as well. But if you’re talking about curing hepatitis C like we did earlier, that takes three months. If you’re talking about curing or treating hepatitis D, that’s actually sometimes one year or two years of treatment. If you’re talking about the new weight loss drugs to show how much weight you’ve lost, that might take months, you know? I think about that as another difference.
The statistics are also different. Basically, the worst case for efficacy is that it doesn’t work. But the worst case for safety is much worse than that. If you have a rare event, that can be absolutely awful. You have to be conscious of that, whereas there’s no equivalent with efficacy of a rare event.
Finally, there’s another difference I think about when it comes to prevention versus treatment. For a treatment, when you already have a disease, you might have a different efficacy bar you’re willing as a patient to accept. If something might work, well, I might as well take it, that kind of thing. On the safety front, you might tolerate a lot more side effects because you want to get rid of the given disease.
With preventive drugs, such as lenacapavir from our first episode or the COVID vaccine you just mentioned, because it’s less likely that you’re going to get the worst case of a given disease, you need it, or most people will want it, to be much safer and have far fewer side effects. That’s the other lens I think about.
As I put those different lenses through the question of whether AI will help, I think in the best case, the answer is for cases where you can observe safety quickly. If you can get AI designing drugs early in the system that then enter clinical trials that have a higher chance of being efficacious, you could get drugs way quicker because you will have a high chance of it working and you won’t need as many people. You can do a larger safety population of thousands of people, but they don’t have to hang around forever in your trial.
Where I personally struggle to see AI helping as much is if the safety events may be happening later in time. If you are taking a daily drug for many years, that might accumulate in parts of your body in ways that actually are bad. I start getting worried. Then I start getting extra worried if that’s a daily preventive drug, because people will want to have a higher bar on safety. Say you’re taking a drug that prevents your progression of Alzheimer’s, but might risk brain bleeding because of neuroinflammation, and some of those brain bleeds happen two years in. Oh my gosh, it seems really hard for me to imagine how AI is going to help with that one. But I just said a lot there all at once. I’m curious what you think in reaction to that or other aspects of safety trials.
Saloni Dattani:
I think I basically agree. I think it’s going to be harder to predict long-term side effects and kind of optimize for them because it’s harder to get the data for that. There is less data from past trials on this so far. The other thing that I’m thinking about where safety is different from efficacy is that you care more about the heterogeneity than you do when you’re looking at efficacy, because the types of side effects might vary quite a lot between people. You really want to be able to capture as much of that variation as you can in order to know how to treat someone, especially if it’s a preventive. Then you don’t want to be unnecessarily treating people who are going to have the side effects that could be quite severe for them.
But I do wonder: many drugs don’t stay in the body for very long. Vaccines don’t stay in the body for very long. Why worry about these hypothetical long-term side effects that we can’t see in trials anyway?
Jacob Trefethen:
I think I worry about them more for the drugs you’re taking daily. I’m now trying to think about, do I worry about them for one-off drugs? Do you?
Saloni Dattani:
Sometimes. Imagine that it’s an antiviral or some kind of drug that causes damage to some vulnerable organ. But most of your body has a lot of reserve capacity. In your lungs, you have a lot of reserve capacity. You might not actually notice the symptoms until quite late, and that’s true for diseases, but it’s also true for potential side effects. There are things that you might only know about if you have been using the drug repeatedly, or after a very long period. That’s one.
Then, I don’t know, there are kind of other, like, potentially dangerous effects that are only seen in clusters or rare or smaller demographics. One example that I often think about is clozapine. Clozapine is a schizophrenia drug, and it was initially developed in 1958, but it was only approved in the US in 1990. It’s quite effective against schizophrenia that is not responsive to other types of treatments. I remember reading about this and thinking, “Wow, that’s more than 30 years that it took to reach the clinic. What happened?”
The reason was that it was approved first in Europe in the late 1950s and early 1960s. In the countries that approved it, they saw these clusters of this rare condition called fatal agranulocytosis. Basically, the white blood cells are getting depleted, and white blood cells are really important in your immune system. If that happens, then getting an infection could be very dangerous. There were these clusters of people in Scandinavia, I think, that developed this condition after taking clozapine, and many countries in Europe then withdrew the drug. But it really was quite effective at the same time.
Is there a way to kind of manage this trade-off or find a way to reduce the side effects? There were these pharmaceutical companies in the US who tried to do the trials again, to try to see if they could develop a safer version. I think in order to get it approved in the US, the FDA required them to set up a safety monitoring system. In order to prescribe the drug, basically, the way that you would get a refill for the drug was to send in blood tests. That was very expensive, and until that system was set up, they weren’t able to approve it, and that only happened in 1990.
Then, of course, there are other situations like medical implants. Maybe there’s some kind of heart implant or some brain implant that someone gets that might be effective for that particular condition, but also it carries a small risk of infection, or it might just damage different parts of that organ. Over time it rusts, or something happens, and over the long term, it could cause side effects.
Jacob Trefethen:
That’s a really good point.
Saloni Dattani:
I think basically there are these long-term risks, but there can sometimes be these monitoring approaches or these secondary treatments to manage them or to understand which demographics are more affected by them. But at the same time, you do need to collect lots of data over a long period. Especially when there’s a lot of variation between people, it’s harder to skip that process, I think, and it’s hard to predict because the individual people who have these side effects might have some other differences that are not seen in a smaller clinical trial or in an AI model.
Jacob Trefethen:
Another thing that rules out some drugs from being useful is the safety question of drug-drug interactions. If you’re on one drug and you’re going to go on another drug, are they going to interact? Do you think AI might help with that, or what do you think the story is there?
Saloni Dattani:
In one sense, that’s very hard to figure out because in order to know whether drugs have interactions in people, so not in a lab or not in animals, you need to have enough people who are taking those multiple drugs, who are taking the combinations. That’s much less common than someone just taking one drug, right? Because there are so many potential pairs of combinations, most of these are too rare to study systematically, and you often need some kind of network model or something to compare how these drugs might be interacting.
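[Editor's sketch: a rough illustration of why most drug pairs are too rare to study. The drug counts, population size, and usage shares below are hypothetical, and the co-use estimate assumes people take the two drugs independently of each other, which real prescribing does not follow.]

```python
import math


def pair_count(n_drugs: int) -> int:
    """Number of distinct two-drug combinations among n_drugs."""
    return math.comb(n_drugs, 2)


def expected_co_users(population: int, share_a: float, share_b: float) -> float:
    """Expected number of people on both drugs, assuming the two
    prescriptions are statistically independent (a simplification)."""
    return population * share_a * share_b


# Hypothetical formulary sizes, chosen only to show how fast pairs grow.
pairs_by_size = {n: pair_count(n) for n in (100, 1_000, 5_000)}
for n_drugs, pairs in pairs_by_size.items():
    print(f"{n_drugs} drugs -> {pairs} possible pairs")

# Hypothetical: 1,000,000 people, each drug taken by 1% of them.
co_users = expected_co_users(1_000_000, 0.01, 0.01)
print(f"expected people on both drugs: {co_users:.0f}")
```

[Under these made-up numbers, even two fairly common drugs share only about a hundred users in a million people, while the number of pairs to check runs into the millions.]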
Right now, the way that we collect data on these potential interactions is either it’s in labs or it’s in clinical trials, or it’s basically through these voluntary reporting systems and these online databases that just have this scattered evidence that they’ve compiled. There are websites like DrugBank and Drugs.com where there are potential side effects from having multiple drugs that interfere with each other. But it’s quite hard to know whether that’s causal because you just don’t have enough of the data in trials and you’re essentially just relying on people to report what effects they had after a drug.
First of all, those reports are sometimes not even verified. The US has two systems, one for vaccines and one for drugs where people can report potential side effects, and they are not necessarily side effects of a drug; they are just things that happened after the person took the drug or the vaccine. They could include things like the guy got struck by lightning or the guy got divorced; these are genuine examples of entries that people have put into these systems, so they’re not necessarily causal at all. It’s really just this system to have a public repository for people to enter potential complications, and then it’s something that other researchers will then follow up on and do proper analysis on. So really I think of that as an initial step.
With a lot of drugs like this, if you’re just relying on data that’s collected naturally from the population after a clinical trial has finished, you are often capturing things that are related to the disease, not the drug. The way that we understand whether something is linked to the drug itself usually is by running a randomized controlled trial. It’s hard to fully replicate that when you’re doing analysis of these internet databases or whatever you might collect afterwards. Right now, I think we don’t have anywhere near the amount of data that’s curated and that’s verified in order to have good predictions of drug interactions from that.
But on the other hand, I think there’s two potential ways that you could think this might be improved. One is maybe scraping the internet, not just using the databases that exist. Maybe they’re reporting on Reddit that they took these two drugs and then they felt really sick or something. Obviously this also needs verification, and maybe a robot wrote that or something, I don’t know, but it could be a first step. I can imagine sometimes that it will be helpful to use AI to scrape public databases or Reddit or Twitter or something like that.
The reason that I say this is because during COVID, one of the symptoms of COVID infections, the loss of smell, was initially- the reason that we know that that was linked at all was because people were reporting it online. People were tweeting about, you know, having something like suddenly losing their sense of smell and suddenly coffee tasted like poop or something. I remember reading a tweet along those lines and feeling really bad for the person. But it’s things like that where you wouldn’t really think to study that. That might not be a thing that people are collecting data on in a clinical trial or even thinking to input into these side effect reporting systems. They’re just these other random potential side effects. It’s only when people freely talk and report different things, their experiences from these drugs, that you can look into them. But really, I see that as a first step.
In this case as well, looking at these tweets online helps researchers then do epidemiological research on, is this loss of smell related to COVID specifically? Is it not just some natural, I don’t know, some other cause?
The second possibility is maybe like the virtual cells that we were talking about, maybe someone does a virtual liver. The way that a lot of these drug-drug interactions happen, or even how the side effects happen, is that the liver is not clearing them well enough, or one drug is interfering with the enzyme that usually clears another drug. If you’re able to predict the pathways in the liver itself, I would guess that you can go a long way.
Jacob Trefethen:
Most of the issue with drugs getting dropped from clinical trials and not making it to people is just from off-target effects and raw toxicity, right? What are the current mechanisms for tracking that? What are we going to do about that?
Saloni Dattani:
Right now, there are a few. I mentioned the US has this drug side effect reporting system. This is people’s reports of potential side effects; it’s publicly available, but it’s not necessarily verified. There are other datasets as well. They’re more healthcare-focused ones, where the clinicians enter online reports about potential side effects using their own judgment of what would naturally happen with other people. Then there’s some other surveillance systems in the US where you use electronic healthcare records and insurance claims to try to monitor the safety of drugs. I think the US also has various toxicology and poisoning databases, and I’m sure this is true in other wealthy countries as well, where there are these systems where there’s poison control tracking and toxicology data where if someone has some very short-term reaction to something that is often investigated and followed up and entered into these databases for researchers to use.
Jacob Trefethen:
Because AI is often training on public data, one thing that sounds great about that system is that a lot of that is public, and then it takes my head to another implication of what is not yet public that would be useful for predicting off-target effects. When drug companies apply to the FDA at various stages, if they don’t get to the finish line with a drug, I imagine that some of the data that has some useful insight into toxicology for that failed drug ends up staying private rather than going public. That’s one thing. I don’t know if the FDA could release more from what has already been submitted to them along those lines and not be quite as careful around making sure that drug companies are happy on that front and just take the public interest over the private interest there.
Another thing is, could you- if you were purposely trying to create public datasets that were useful for predicting off-target effects, what would you do? I know there’s this nonprofit focused research organization called EveBio that’s trying to create a dataset like that, that is basically going in with the public interest in mind of, “Okay, I’m not actually trying myself to develop a drug, but I want everyone who’s making drugs to have some knowledge of what they can dodge and weave that’s better.” That’s another angle on it, I think.
Okay, so, putting it all together, safety is something that you want to test for in real humans and you don’t want to take for granted. What do you think about AI’s effect, if any, on safety trials?
Saloni Dattani:
I think it’s helpful probably as a first step for flagging these potential side effects that people are reporting online. Secondly, I think if you’re able to build good models of liver metabolism of different drugs, that might be pretty helpful. Trying to model drug interactions is going to be hard and it’s going to need a lot of data, I think. But at the same time, I can imagine that analyzing that with network models or AI or something like that can probably help improve what we know about interactions. And then, other side effects, the problem that we described is really the number of people. Some of these side effects are rare. In order to predict them, you’re going to need a large sample. Things like this are often quite hard to replace with computer models that are only trained on a more limited set of patients.
Jacob Trefethen:
I think I would take the more skeptical side on this one. As I introspect about especially preventive medicine, would I myself take a preventive drug that an AI predicted would be safe, or would I wait for the safety data? Would I advise a family member who’s older to take a preventive Alzheimer’s drug that had not yet been taken for at least five years by one person, but it’s a daily drug? That feels hard to me. Firstly, it might be illegal for a drug company to sell it before they accumulate that data. Secondly, if something is strong enough to affect your brain enough to reduce Alzheimer’s, it is strong enough to do some damage up there too. I think that I struggle to see AI getting around the need for multi-year safety data in real humans for each of these new drugs.
All that said, I wrote a blog post about AI and medical progress once and got a response from someone who I respect very highly, who said I was being way too conservative, and in fact, safety data is not going to be as much of a problem as I think. This is really getting at the edge of speculation and I’m not sure.
Saloni Dattani:
Well, I think there’s two last things that I wanted to mention. One is the long-term safety data that you mentioned, where that’s even harder to collect because you need such a long follow-up. The second thing I think is, I mean, sometimes the disease is bad enough that I can imagine that even with, I don’t know, with more limited safety data collection, I think some people would be willing to try them just because if you can demonstrate that this has a large efficacy, then maybe that’s worth that trade-off.
I guess I’m also thinking about the way that people might think about the risk profile, or whether they’re willing to take the risk of a drug that’s only been validated by AI or something like that. It’s a bit like participating in a trial. You don’t really know what’s going to happen. But it’s worse than participating in a trial because it’s not randomized.
Jacob Trefethen:
No!
Saloni Dattani:
So probably your experience doesn’t help as many people as it would if you were in a clinical trial.
Jacob Trefethen:
Unfortunate.
Manufacturing and healthcare
Saloni Dattani:
Once we’ve found these potential candidate drugs and we’ve tested them in lab models or in animals, or we’ve tested them directly in humans, and we’ve found out that they’re effective and safe, I think the next part might be even more challenging for AI to solve. I think it would be fun to talk about what happens after that. Once you have a drug, how do we get it out to people? How do we scale it up? What do you think that pathway is going to look like? How hard is it, and how much can AI replace?
Jacob Trefethen:
Well, let’s break it down. First you have to manufacture the drug in question. If a lot of people are receiving it, you have to manufacture it at scale. Then what you’re going to have to do is find a way to plug into a health system of the given country you’re in and make sure that people who need the drug have access to the drug and can have it delivered.
Let’s start with manufacturing. Manufacturing as it stands is very different for different modalities or different types of drug. We have really, as a society and as a species, nailed it when it comes to small molecules, at least in my opinion.
Saloni Dattani:
So those are chemical drugs, right?
Jacob Trefethen:
Well, all drugs are chemicals in a sense, but this is really, yeah, small chemicals. Just imagine a string of atoms and imagine it’s usually not that big and it’s small enough that it can diffuse into your cells and sort of do some useful property once it’s in there. When most people think of a drug, that’s probably what they’re thinking of. If you swallow a pill, it’s probably got a small molecule as the active pharmaceutical ingredient in it.
There are other forms of drugs too. In our previous episodes, we were talking a lot about protein drugs. There are protein therapeutics: for example, antibodies are proteins, so if you’re ever getting antibody treatment — there are many different cancers that get treated with antibodies — that’s actually a protein, not a small molecule. You also have vaccines that are proteins, not small molecules. You have peptides like GLP-1s, which are small proteins.
Then there’s other modalities still. There’s RNA, so you can get mRNA vaccines. You can get RNA medicines, siRNA, or different forms of RNA: they can be in circles, they can be linear, they can be all sorts. There’s DNA. You can get DNA therapies, DNA is extremely cheap to make. Then on the more complicated end, there’s entire cells. Like there’s CAR T therapy where you get your own cells, T cells redone into a chimera.
Saloni Dattani:
Oh, that’s nice. Plastic surgery for my cells.
Jacob Trefethen:
Yes. Very, very chic. Then really, go way up to the other end and you can replace whole organs, not just cells. You can get a completely new, I don’t know, what do you want today? Kidney?
Saloni Dattani:
Heart.
Jacob Trefethen:
Heart, yeah, we can even replace hearts these days. Those are all interventions at different scales. Those are all things entering your body that are doing particular functions. But the ways to manufacture each of those vary a lot. My starting perspective here on AI is that there’s one vision of how AI can be useful that is mostly about small molecule medicines. I think that it would be astonishing if you only needed to take one small molecule and it magically did all sorts. GLP-1s are about as close as you can get. That’s a pretty small string. It’s a small peptide and it is pretty magical seeming.
But I think the vision of AI helping with small molecules is more like a vision of personalized medicine where people are getting a lot of different things they might need at different times. So the manufacturing, I don’t think will be a problem there.
But we can talk about what the problem might be. If we’re thinking of a vision that is more to do with more complicated modalities, if it involves something like CAR T or cell therapy or something more like dialysis, where you have to go in every couple of weeks to have some procedure done and you have to be monitored by a healthcare professional while the procedure is being done to make sure that you don’t get harmed, that is not just going to be high price in the sense of, “A patented small molecule is really expensive, how are we going to afford it?” It’s actually going to be high cost, in that even if there was no profit in the system and it was simply paying the salaries of the people who are monitoring the equivalent of the dialysis machine and simply paying for that machine itself, that costs a lot and someone’s got to pay. It might be an individual, it might be your insurance company, it might be your government, but someone has to pay for that. So I think it’s a little underappreciated by some AI advocates who hope for a great speed up of drugs here, that the manufacturing of those more difficult modalities is not yet commoditized, and that even if it gets there, if you need a lot of monitoring the cost will be high. If you need an MRI machine to diagnose you, then that’s going to cost a thousand dollars to do a scan, and the machine itself costs a lot more than that. If you need a PET scan, it costs five thousand dollars. That’s where my head goes when it comes to scaling.
Saloni Dattani:
I’m going to, for once, take the AI booster side. I think in some cases, you could simplify this manufacturing process. One thing that I’m thinking of is when people used to try to culture cells in the lab. They used to use serum from different animals as the medium to help grow the cells and keep them alive. Then some researchers figured out that actually you don’t need the whole serum to keep the cells alive. You only need a small number of molecules: some amino acids, some vitamins, some key ingredients, and basically you could replace all the other stuff. Obviously, you’re not going to capture the whole complexity of a living organism, but you can at least do those basic things with just a small number of molecules. So I kind of wonder, with these different machines and these different testing methods and things like that, whether maybe there’s a simplified version that can get quite a long way and that is easier to scale up.
Jacob Trefethen:
I mean, I hope you’re right, of course, but my first response is just to look at the historical track record for modalities so far. Small molecules are now dirt cheap. You can get a drug where the active pharmaceutical ingredient maybe only took pennies to create, so let’s call that solved. The next biggest product class is antibodies. Antibodies have had 50 years now to improve the process of how they’re developed and become cheaper in how they’re developed. Still, the cost is not low enough to manufacture them at scale to provide for some use cases that would save lives. You should by now, in my opinion, be able to have preventive malaria monoclonal antibodies that kids in West Africa get given before rainy season when malaria is common. But we still cannot get the price low enough. We can’t get it below $10 per gram. That is, it’s not cost effective enough for public budgets to then pay for it. There has been more innovation in antibodies, and we’ve had 50 years, and I’m like, oh my gosh. I’m curious how you respond to that.
Saloni Dattani:
I think there’s a few things that come to mind. One is, one version of antibody treatments is antivenoms, right? You’re trying to treat people who have had a snake bite and sometimes they don’t know which snake has bitten them and they don’t know which antivenoms they should use. I think if you had more data collection on which venomous snakes are in your area, and maybe you have better recognition technology of what the bites look like versus what the snakes are like, maybe you can kind of help predict which antivenoms the person should be taking. That hopefully reduces the delivery costs because you then get to narrow down which ones are given to people. That doesn’t really work for the other types of antibody therapy, obviously.
But I do wonder if maybe antibody therapy needs an effort like the, you know, after the Human Genome Project, there was another project to get the thousand dollar genome. Before that, it was millions of dollars just to sequence one single human genome. Now it costs less than a thousand dollars. I wonder if, were we to develop some big initiative like that for making antibody therapy cheaper, maybe it’s possible.
Jacob Trefethen:
You got me to sign up, but I don’t think that AI is going to be the bottleneck there. I think the way to get it cheaper will probably be a lot of process improvements that you learn as you go. You’re going to have to have a bunch of capital investments in big new ways of scaling up antibody production that you’re only going to be able to improve and tweak as you go, and that will take many years.
And then that’s only antibodies. What if we want to do something more like, well, I mean, let’s say RNA that is fixed in a particular conformation that’s hard to print. That describes some pretty interesting new medicines, but are we going to be able to scale it? I’m not sure yet. What about if it needs to be personalized like CAR T and you need to take a sample and then make it in a one-off fashion? I’m not yet seeing the AI improvements there.
Saloni Dattani:
What if someone develops a 3D printer for antibodies?
Jacob Trefethen:
I mean, I won’t rule it out because that’s basically how we make mRNA vaccines. mRNA vaccines, approximately you print the RNA, then you shove it in a lipid nanoparticle machine and you’re done. So yeah, maybe someone could do an antibody printer.
Saloni Dattani:
The other thing with the CAR T cells, so I guess I agree that that is personalized right now, but maybe it doesn’t need to be personalized. So I was recently reading about what happens after you get an organ transplant or transplant of pancreatic cells.
When people get transplants, they usually have to be also given immunosuppressants because their body might react to the new cells that are from someone else and recognize that as foreign material and try to destroy it, and that can cause various problems for them. But you can probably find ways to, I don’t know, downregulate the molecules on the surface of those transplanted cells to basically make them silent to your body if they’re transplanted into you. Then you wouldn’t need to personalize the way that that transplant happens. You wouldn’t necessarily need to match it as well, as closely. Maybe that’s the same for other types of personalized treatments, that there are other ways to kind of improve them in such a way that they’re less personalized while still working. I’m not really saying that AI can solve this. I’m basically saying, what if the personalized therapies become less personalized?
Jacob Trefethen:
I am hopeful for that too. Now, let’s then go to the next part of the debate on delivery then. Okay, let’s just grant that the manufacturing costs have gone way down. You still need a trained doctor to stand next to you to do the injection, to make sure that they’re doing it right. Are you imagining a robot for that?
Saloni Dattani:
I wasn’t imagining a robot. I think this is the part that’s hardest to solve because you actually need to deliver these drugs to different places. I mean, maybe you can have drone deliveries of medicines and stuff, but you still can’t do surgical procedures with them. From what I understand, the hardware of doing robots is often much harder than the software improvements.
Jacob Trefethen:
You know, for what it’s worth, I have heard AI boosters say the opposite, that you often get more limited on software, but I don’t have a strong opinion there. I mean, I think the thing that will be necessary is, at the very least, a huge amount of capital investment in robotics that pays off along these lines. I think more fundamentally, I just think you’re going to need economic growth for this to work because I think the underlying cost structure of delivering complicated, dangerous personalized medicine — dangerous in the sense of if you get it wrong, it’s dangerous — is going to be high. To deliver that to many people, we have to have a larger economy, and either growth in the sense that people’s incomes grow and they pay for it themselves if they want, sort of Bryan Johnson style, or in the sense of you have a big tax base and really good healthcare and healthcare is a really large proportion of a large economy.
Saloni Dattani:
That I agree with. It sort of reminds me of this quite horrible situation, or well, not that horrible, but it was horrible for the few days that I had it. This is a situation where I think the clinician probably couldn’t be replaced by a robot unless it was a very skilled one.
So last year, after I had a cold, I think a few days later, I woke up one day and I turned my head to the side. I think I was just scrolling on my phone or something, and suddenly the room started spinning and I basically had developed vertigo. That was probably a complication of the infection, which was probably a virus. Basically the entire room was spinning, spinning around. The whole disorientation I had from that really made me feel, firstly that I was going crazy or something very scary had happened to me, and secondly, even after it had slowed down and stopped, I basically just felt like throwing up and just kept vomiting for a while. I was like, “Something has gone wrong in my brain. What is going on?”
I remember when it had finally calmed down, I kind of tried to Google to find out what I should be doing. Are there any, I don’t know, is there any self-treatment I can do for now? I also called up the emergency medical service here in the UK, and one of the things that is recommended online is this maneuver.
Jacob Trefethen:
Manoeuvre.
Saloni Dattani:
When you get vertigo — one of the causes of vertigo is vestibular neuritis, which is I think the one that I had — what happens is in your inner ear. You have your ear and then slightly deeper than that is your eardrum. Beyond that is this thing called the cochlea, which looks a bit like a snail, and it has these little loops and those loops are called semicircular canals, and those canals have little hairs in them, I think, and also have fluid in them. As the fluid moves around in the canals and touches the hairs, it sort of detects your position relative to gravity or whatever. That helps your brain figure out which position you’re in and stuff like that.
What is happening with this condition, with this type of vertigo that I had, is that there are these tiny little calcium crystals in a part just at the border of those loops, in this place called the utricle. Those calcium carbonate crystals get dislodged from that usual spot, and they move up and they start floating in the semicircular canals. Because of that, they kind of mess up the fluid detection that’s happening to detect your balance and your position and that. You feel like the room is spinning around. What you’re supposed to do is try to get those crystals back in the right place.
Jacob Trefethen:
This is like the worst video game of all time.
Saloni Dattani:
It’s like those little toy games where there’s a little marble or something and you’re trying to move around the box to get the marble into the hole. That’s basically what you have to do with your head, except you don’t know what the little canals look like at all, and where the crystals are. It’s fairly elaborate, but it takes about half a minute or something to do, to guide those crystals back into the right place. You have to do these precise head and body angled movements to make that happen.
So I read these instructions online when I was feeling very sick. I think partly because I was feeling very sick, but also partly because I’d never done this before. I was trying to do this maneuver myself and just follow the instructions. Can you imagine having these crystals in this little part of your ear? If you’re doing them the wrong way, it’s going to make it worse.
Jacob Trefethen:
Oh no!
Saloni Dattani:
The crystals are moving around in the wrong direction. I was trying this maneuver on myself, and it really made me feel very sick, even worse than I did before.
Jacob Trefethen:
This story is horrible.
Saloni Dattani:
But I think it was a Saturday, and I was also so confused about this whole situation, I was so freaked out. I had the day before bumped my head a little bit on the wall, and I was like, “What is this? Is this because of that?” I had no idea what was going on, so I went into A&E. After a while, I finally saw a nurse and she asked me a few questions. I think she was kind of skeptical at first that I actually had any problems because I looked quite calm, even though I had a vomit bag with me and was throwing up.
Jacob Trefethen:
Ridiculous.
Saloni Dattani:
But then she noticed, she asked me to do this eye exercise. Then she noticed that actually my eyes were basically spinning around or whatever, my vision.
Jacob Trefethen:
Your eyes were moving?
Saloni Dattani:
Yeah, yeah. Your eyes can move left to right in response to the weird balance that you feel. Because they usually try to-
Jacob Trefethen:
This is giving me vertigo hearing about this. Your brain’s trying to right it, yeah.
Saloni Dattani:
Have you ever seen those videos of a pigeon or something or a chicken? And the chicken is kind of moved around, but then their head stays in the same place.
Jacob Trefethen:
Okay. I can kind of- like an owl?
Saloni Dattani:
Yeah, yeah, yeah. Your body basically tries to do that with your vision. So when you move around, it tries to make sure that you’re still looking in the same place, so your eyes are kind of fixed on the same thing, but while your whole body is moving around, so your eyes are kinda moving. This is basically what was happening to me, so my eyes were kind of spinning around a little bit and she noticed that and she was like, “Oh, okay, I better do the maneuver.” She did the maneuver and I felt a little bit sick, but it basically solved the problem. It, like, moved the crystal back.
Jacob Trefethen:
She did the maneuver.
Saloni Dattani:
She did the maneuver on me.
Jacob Trefethen:
Okay. I thought you meant like a yoga instructor. She was like, “Okay, now you have to do this pose.”
Saloni Dattani:
No, no, no. She picked up my head and did the specific maneuvers on my head.
Jacob Trefethen:
She picked up your head? What are you doing? Your head is attached to your body. How did she pick up your head? Oh my god.
Saloni Dattani:
I love that we’re four hours through this. This is the best story ever.
Jacob Trefethen:
If you ever find yourself four hours into a podcast, you will start describing the maneuver.
Saloni Dattani:
She eventually, you know, moved around my head and my shoulders and stuff to reposition the little crystals in my inner ear and to move them into the right place, and it kind of solved the issue for a short while. There was still the issue that basically that viral infection was inflaming, I think, that part of my ear. If I turned in a certain direction, the crystals would dislodge again and then the whole, you know, room would spin and blablabla. It sort of took a few days to get back to normal. It was very hard, but it was only a few days.
Jacob Trefethen:
A few days of vomiting?
Saloni Dattani:
A few days of the room spinning if I moved my head a certain way.
Jacob Trefethen:
Oh my god, Saloni, no, I don’t want this.
Saloni Dattani:
But it basically just made me think, and this is the connection to AI, that it would be really hard for an AI to replace that, because even when you do have the instructions, you’re too sick to really do the thing properly. Even if you weren’t, it’s really hard to get it right the first time and it makes you feel much worse if you get it wrong. You need a trained professional sometimes to perform these procedures in real life, and you need a person or maybe a very sophisticated robot.
Jacob Trefethen:
Our message from this podcast is no science, no robots, what we really need is a skilled professional to do crystal therapy.
Saloni Dattani:
I love that there are literally crystals in your ear that mess up your balance.
Jacob Trefethen:
The human body can’t be real. No, I think the real response here is that what we need to develop is not only software AI, but also software AI that can plug into a human-sized gyroscope and can then twiddle you around in the right orientation.
Saloni Dattani:
Wow. I don’t know what I think about that. I feel like I’m imagining one of those, when you go into a cinema and it’s a 4D cinema and the chair shakes around.
Jacob Trefethen:
Smell-O-Vision.
Saloni Dattani:
You need training and you need to make the connections. That’s very hard to replace even when you have the instructions or something for someone to try to do it themselves.
Jacob Trefethen:
Fair enough. Okay, so what’s our takeaway?
Saloni Dattani:
I think AI can, to some degree, maybe improve which modalities are being used or narrow down the way that they’re used. In the case of anti-venoms, if you’re able to better match patients who have been bitten by a venomous snake to the right antibody that they should be taking, that seems like something that AI could help with. But I can’t really imagine it being that helpful for delivery systems. How does AI help develop better hospital clinics? You still need human people to do those things, even if I think there are various places where AI might help.
Jacob Trefethen:
I’m with you on that, and I’ll end the segment by being the AI booster in that maybe there’s some way around this that does involve some combination of vitamins, supplements, and small molecules that can be determined to be safe enough that you don’t need much monitoring, you don’t have to go in and get checked up all the time, and you do get really good personalized advice. I think that is some place I hold out some hope. But my head then goes to a different bottleneck, which is no longer manufacturing and delivery, but who’s going to pay to generate the knowledge that a particular combination will work for you.
Saloni Dattani:
What about drones and robots and nanobots?
Jacob Trefethen:
Drones and robots and nanobots will save us all.
Saloni Dattani:
Will they?
Jacob Trefethen:
No, I mean, take the last one of that. I think that drones already are used for delivery to rural areas; that will help in some cases. Robots are already used in surgeries, used in — massively used depending on how you define what robot means — in manufacturing, a lot of that is robotic, it’s not a human that’s holding all of that liquid.
I think my perspective on robots is that there will be a set of improvements to biomanufacturing that will make things way better with loads of modalities. But I think it will take time and iteration in the real world rather than something AI can magically do.
Then, nanobots. I mean, I hear about nanobots probably more than your average person just because I live in San Francisco. When people say nanobots, what they’re often using the term to mean is a future technology that can go into your body, swim around, and sense different things going on in your body and then tune up things it sees that are broken. That, to me, is a concept that fills in at such a level of generality that I can’t yet comment on it, really. Basically, it sounds sometimes like people want to reinvent what a cell is or something, and then replace your cells with better cells. Maybe that will lead to healthier bodies. I think we need to get a couple more steps before we can have that debate in a way where we know what we’re talking about though.
Saloni Dattani:
I think, imagine you did have one supercell that swims around in your blood, and maybe it detects, “Oh, something might be wrong in your finger or something.” But how is it actually going to solve that problem? Can it actually manufacture the right drugs or gene therapy or something to deliver it to that specific part of your body? That’s really hard. That’s like combining the transcription, the small space, getting through each of these different parts of your tissues, not being killed off by your immune system or just by enzymes and things like that, or just falling apart. Then also it’s somehow delivering these drugs to specifically where they need to go. That does sound like it’s reinventing a cell, except a really complicated cell, and we don’t even really know how to reinvent one cell.
Jacob Trefethen:
I’d love to be wrong on this one, and I look forward to people writing in and saying that we’re simply unimaginative.
Saloni Dattani:
Well, I feel like, okay, even if that’s possible, that’s definitely not possible in the next 10 years, I would say. That’s a prediction I feel quite comfortable making.
Jacob Trefethen:
I think the other side would be, “Look, Jacob and Saloni, what you’re not taking seriously enough is that before we get all these improvements in manufacturing and improvements in nanobot technology, that’s coming from having got these enormous explosive improvements in software intelligence, where there’s going to be hundreds of millions of people, who are not people, they’re actually in a data center and they’re actually AI agents who think of themselves as scientists or who we have told to think of themselves that way. But those entities have all the time in the world to think through all the problems that we may be coming up with. They will have all the best podcast debates you could imagine about what constitutes a nanobot and what experiments you have to do to make sure that it works and doesn’t get ejected by your immune system and can recapitulate metabolism, so it still has energy to move around and all of that. ‘Jacob and Saloni, why are you even having this conversation about 10 years? What you should be having a conversation about is...’”
Saloni Dattani:
Maybe we should be listening to the Jacob and Saloni nanobots instead.
Jacob Trefethen:
I wonder if we can let those bots loose on Spotify and Apple Podcasts to give us ratings.
Saloni Dattani:
All right. Let’s say we can, in principle, develop new drugs to tackle these difficult diseases. We can test them in humans. We can test their safety and efficacy. We can manufacture them at scale. Even if all of those things are possible theoretically in the next 5 or 10 years for certain diseases, I think we’re still not going to make... a lot of things are still going to go unsolved. One of the reasons for that is bad economic incentives and just who is working on the problems, what kind of problems they’re working on. Would you say that’s fair?
Jacob Trefethen:
Well, it makes me depressed to think that’s true, but let me get into the headspace. I think that if you look at what, in fact, the world currently looks like. Imagine Jacob and Saloni two generations ago. What would that have been like? Talking through a similar set of questions, but without reference to AI and just thinking about how the future might go. We might have said something like, “I think that people are going to discover lots of new drugs and will save lots of lives.” That would have been true. We have way lower mortality rates globally than we did back then, and a good portion of the contribution to that fact is drugs, not the only contribution. Sure enough, at that time, Jacob and Saloni, two generations ago, would have been talking when artemisinin was being discovered and was being created for curing malaria.
But now let’s fast forward two generations. With all this economic growth and the fact that small molecules now cost pennies to make the active ingredient of, are people still getting malaria? “Well, they can’t be because there’s a cure for malaria. Artemisinin, we invented it in the 70s in China.” Well, I hate to break it to you, but people are still dying of malaria. That’s because of more complicated societal and economic problems. If your kid gets malaria, but you live in rural Tanzania, far away from a clinic or hospital, then you might not have access to artemisinin in time. Or if the drug regulator of Tanzania has not found a way to verify the imports from different countries manufacturing that drug, sometimes you’ll get the drugs, but they’ll be substandard or it’ll be made up and it will be a fake drug. There’s a lot of stuff that means that even with arbitrarily cheap, essentially, technology of artemisinin, oral or injectable drugs, people are still dying of malaria; 600,000 people are dying of malaria every year.
Saloni Dattani:
I guess the same is true for maybe TB and hepatitis C, right? You mentioned that hepatitis C is now curable, but still lots of people have it today and are untreated. Are there maybe diseases like that in richer countries where there is a pretty good cure or treatment, but there’s still a big gap in how many people receive the drug or vaccine?
Jacob Trefethen:
There are, and in the US, hepatitis C is one example where many people who are in prison have hepatitis C and are not treated. Currently, the main treatment drugs are still on patent, so it’s still very expensive in the US. That means that the prison system itself would have to pay to get those people treatments that would be useful for them and they, you know, that would- that’s not something that they’ve budgeted for from their central budget, so there are cases right now. The way that health systems in different countries work, even between different high-income countries, varies a lot. In the US, you have mostly a private insurance system. In the UK, you have mostly a national payer and a nationalized healthcare. It really gets down to the specifics of who gets affected positively and negatively by each of those. I think one shared similarity between many countries though is that people rurally get worse healthcare on average.
Saloni Dattani:
Right. I mean, I guess I’m also thinking about, you know, let’s say, measles. We have a really effective vaccine against it, but some parents don’t take it. I guess there are other issues where maybe there is a cure, there is a treatment, but someone hasn’t gotten diagnosed with the disease, or they’re skeptical of healthcare in general, or maybe it’s quite difficult for them to access hospitals or the right specialists to get treated with that therapy, or it’s some complicated procedure and they have decided that it’s not worth the risks for them, or they are not interested in taking it.
But I’m also interested in maybe some of the diseases or problems that are not solved and where there is no treatment or cure because it’s not seen as a priority, or the economic incentives to develop these drugs and take them through trials, or test them and manufacture them at scale, are missing. What do you think of that? Are there things that come to mind?
Jacob Trefethen:
There definitely are, and actually, I currently work at a foundation that gives away money on this theme of what R&D is undersupplied because it would develop products that would help mostly people in lower income countries, so pharmaceutical industry is not that well incentivized to produce those technologies. I mean, there’s a couple of themes I would point to there. One is the wealth gap, but another is just the way that pharmaceutical development is incentivized even in the US and in the UK and in higher income countries has kinks in the system.
If you are developing a new chemical entity that you can patent, or actually any form of chemical entity that you can patent, that gives you a 20-year window from when you invent it to get through clinical trials, and if it works, sell at a high price. If you are a company that has been invested in by different investors who are maybe pension funds originally, and they expect a return, then you will charge high prices and then maybe make a return or make some profits, and that sort of system has its own internal logic.
However, if there are drugs that have already been invented, that have passed through the 20-year window, or if there are supplements that aren’t even drugs, say, like vitamin D or like lithium orotate, which some people are now looking into whether that might be useful for Alzheimer’s delay or prevention, but it’s not well known yet, then there’s no pharmaceutical company that would be incentivized to pay the tens or hundreds of millions of dollars of a phase three trial to determine that lithium orotate prevents Alzheimer’s because out the other end, they can’t charge a high price because it’s a commodity market and anyone can sell you lithium.
That is another relatively well-known market failure of the generation of knowledge. There is not knowledge being generated in clinical trials that would be useful for people in medical practice. There are a few ways you can try and solve that, and I think I personally am somewhat hopeful that some of those ways will get tried out because there’s some building energy around it at the moment. Again, I think that is more of an economic problem and an incentive problem than an AI-amenable problem. Do you agree with that?
Saloni Dattani:
I think so. It reminded me of antibiotic development and the incentives for that. I think now AI models to discover and develop new antibiotics are getting pretty good. What you can do in order to develop them is you can sort of mine the genome of bacteria and fungi to see if they’re potentially producing compounds that would interfere with another bacteria’s growth. If you want to develop new antibiotics that bacteria are not resistant to, it’s really important to have this pipeline of new drugs coming out.
One other interesting idea on this front is if a bacteria becomes resistant to one antibiotic, sometimes that will make it more vulnerable to other antibiotics. Can you find pairs of antibiotics that you can put together so that if it becomes resistant to one, it then becomes vulnerable to the other, so it’s really hard for it to develop resistance overall? I think there’s lots of promising potential ways that you could develop new antibiotic compounds. But at the same time, the actual development of antibiotics has really slowed down.
Most of the antibiotics that we have today came from a 20-year period in the mid 20th century. This is what was called the golden age of antibiotics, and that has really slowed down, I think, for several reasons. One thing is that infectious disease prevalence has kind of reduced in a lot of wealthier countries with better sanitation, food safety regulation, things like that, so there’s sometimes less need for antibiotics in general. But the other reason is that the economic incentives to produce a new antibiotic are really limited because there’s this problem of resistance developing to most antibiotics that we see. When a new antibiotic comes along, people like doctors want to reserve it for the most serious cases. They don’t want to give it out to everyone because we might have resistance developing and then there’s no last resort remaining.
So any new antibiotic that comes onto the market now has a smaller and smaller population that can use it, and so the market is just very limited. Potentially there are economic, I don’t know, you can change the incentives around or change how you pay for these drugs to incentivize better innovation.
One thing that the UK is doing is using this subscription model, so I’ve heard, where instead of paying per consumption or per purchase of the antibiotic, you’re paying a specific amount per year in order to use stocks of new antibiotics that are developed. You’re able to have this market that drug developers know is going to be there once they finish developing a new product, but it also means that you can control how much of the drug is given out to reduce the chances of resistance developing. I think that’s, I mean, I can see how you can use different economic incentives to incentivize better innovation in these areas, but it’s something that actually needs political will. It requires better incentives; it needs more funding, and that’s not necessarily something that AI is going to be able to solve in the same computational way.
Jacob Trefethen:
I think that people who, again, with my AI booster hat on, are thinking, “Well, AI will actually help solve a lot of these problems.” Often, I think what they’re imagining is that AI will help with cognitive labor and be able to replace or augment human researchers at an increasing rate. My synthesis of these positions is that I think in the world where AI starts increasing or replacing cognitive labor, some of the conclusions that those systems will come to are ones we already know. I think that there will be recommendations coming out of that digital university, such as you should have a subscription model for antibiotics. I think that some of those, you can actually do better than just waiting. You can allocate capital better now. You can do more public funding of science that may be not exactly perfect, but is extremely reasonable, and you will have been glad that that’s how societal resources were spent in 2025, even if AI goes gangbusters from here.
R&D funding
Saloni Dattani:
I think maybe we should do a little fun fact section on just how much the R&D funding landscape is skewed and some of the statistics that people might not know about in this area.
Jacob Trefethen:
Ooh, fun, I love it.
Saloni Dattani:
My first question, how much is spent on R&D globally per year, in let’s say, 2023?
Jacob Trefethen:
Ooh, fun. Okay, here are the ways I’ll try and get to an answer. Number one, I think of R&D as being about 2% of the GDP of OECD countries, or high- and middle-income countries. Then I could back into my answer via global GDP. But now I have to think what global GDP is. I’m going to guess that it is, okay, what’s the US? The US is like 20 trillion and the US is 25%. Okay, I’m going to say global GDP is 80 trillion and I’m going to say 2% of that because it skews to richer countries, which equals one- no god, what is it? One hundred, wait hold on, this can’t be right, yeah no that’s right... 1.6 trillion.
Saloni Dattani:
That was a really good reasoning process. You’re kind of off by, well, like half the way there. The estimate is 2.75 trillion US dollars.
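The back-of-the-envelope reasoning above can be sketched in a few lines of Python. This is only an illustration: the inputs are the round guesses quoted in the conversation, not official statistics, and the constant and function names are made up for this sketch.

```python
# Rough reconstruction of the Fermi estimate of global R&D spending.
# All inputs are the round guesses quoted in the conversation.

US_GDP_TRILLIONS = 20.0            # guess for US GDP, in trillions of US dollars
US_SHARE_OF_WORLD_GDP = 0.25       # guess for the US share of global GDP
RND_SHARE_OF_GDP = 0.02            # guess for R&D spending as a share of GDP
QUOTED_ESTIMATE_TRILLIONS = 2.75   # figure cited in the conversation for 2023


def estimate_global_rnd(us_gdp, us_share, rnd_share):
    """Back into global GDP from the US figures, then take the R&D share."""
    global_gdp = us_gdp / us_share
    return global_gdp * rnd_share


guess = estimate_global_rnd(US_GDP_TRILLIONS, US_SHARE_OF_WORLD_GDP, RND_SHARE_OF_GDP)
ratio = QUOTED_ESTIMATE_TRILLIONS / guess

print(f"Guess: {guess:.2f} trillion USD")
print(f"Quoted estimate is {ratio:.2f}x the guess")
```

With these inputs the guess comes out at about 1.6 trillion, so the quoted 2.75 trillion is roughly 1.7 times the guess.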
Jacob Trefethen:
Oh I’ll take that, I’m happy with that.
Saloni Dattani:
Maybe that is because of the national differences and the sizes of the countries that are spending more. But anyway, that was a great guess.
Jacob Trefethen:
Maybe we should stop there. I think that one went fine. Okay. Well, I’ll ask you one.
Saloni Dattani:
Okay.
Jacob Trefethen:
My question for you is, where is that research taking place? Where are the world’s researchers located? I’m going to give you, actually, let me make it more specific. What do you think the population share of the world is in sub-Saharan Africa? What do you think the research share is in sub-Saharan Africa?
Saloni Dattani:
You know, I don’t have some of these basic facts in mind with these. I’m like, “I don’t know, ten? Three?” But, what percentage of the world’s population is in sub-Saharan Africa? The world’s population is seven, eight billion people. I know that three billion or so of those are in India and China. So the remaining, and then there’s another 0.5 billion people in the U.S. There’s only three or four billion left. How many of them are in sub-Saharan Africa? I don’t know, maybe 10% of that remaining population. How much is that? That’s like 10% of 4 billion, which is 400 million, which is what percent of the world’s population is that? Is that like 20% or so?
Jacob Trefethen:
I think the last two steps you got wrong, but they kind of canceled out.
Saloni Dattani:
Damn it.
Jacob Trefethen:
I think it’s more like a billion, but the number I’ve got is 14 percent. It sounds like roughly eight billion divided by eight.
Saloni Dattani:
Well, thank you for my two errors that canceled out.
Jacob Trefethen:
For people who do back of the envelope calculations, a tip I really have is that if you don’t have systematic directional reasons that your estimates are bad, then they will cancel out pretty often because you’ll get some too low and some too high, and then it’ll end up being okay. Anyway. Okay, and what about the research share?
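A small simulation can illustrate the tip above about errors cancelling. It is only a sketch under assumptions that are not from the conversation: each step of an estimate is off by a random factor of up to 2 in either direction, with no systematic bias, and the function name `median_error_factor` is hypothetical.

```python
import math
import random


def median_error_factor(n_steps, per_step_factor=2.0, trials=10_000, seed=0):
    """Median overall error factor when n_steps unbiased guesses are multiplied.

    Assumption: each guess is off by a factor drawn log-uniformly between
    1/per_step_factor and per_step_factor, so over- and under-estimates
    are equally likely. The returned factor is always >= 1.
    """
    rng = random.Random(seed)
    log_limit = math.log(per_step_factor)
    error_factors = []
    for _ in range(trials):
        # Errors multiply, so their logarithms add (and can partly cancel).
        total_log_error = sum(
            rng.uniform(-log_limit, log_limit) for _ in range(n_steps)
        )
        error_factors.append(math.exp(abs(total_log_error)))
    error_factors.sort()
    return error_factors[len(error_factors) // 2]


median_error = median_error_factor(n_steps=4)
worst_case = 2.0 ** 4  # every step off by 2x in the same direction

print(f"Median overall error factor: {median_error:.2f}")
print(f"Worst-case error factor: {worst_case:.0f}")
```

Under these assumptions the typical overall error is far smaller than the worst case of 16x, because independent over- and under-estimates partly offset each other. If the errors all lean the same way, that cancellation disappears.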
Saloni Dattani:
You know, I think it’s going to be more skewed than that. I don’t know how much more skewed. I’m going to say, 5 to 10 times more skewed than that. So what is that, like 2 or 3%?
Jacob Trefethen:
Good guess, but the skew is 20 times, so it’s 0.7%. 14 to 0.7.
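The skew arithmetic here can be checked with a short Python sketch. The two percentages are the ones quoted in the conversation; the function names are illustrative only.

```python
# Population share vs. research share for sub-Saharan Africa,
# using the percentages quoted in the conversation.

POPULATION_SHARE_PCT = 14.0  # quoted share of world population
RESEARCH_SHARE_PCT = 0.7     # quoted share of the world's researchers


def skew_factor(population_share, research_share):
    """How many times smaller the research share is than the population share."""
    return population_share / research_share


def implied_research_share(population_share, skew):
    """Research share implied by a guessed skew factor."""
    return population_share / skew


actual_skew = skew_factor(POPULATION_SHARE_PCT, RESEARCH_SHARE_PCT)
guess_high = implied_research_share(POPULATION_SHARE_PCT, 5)   # "5 times more skewed"
guess_low = implied_research_share(POPULATION_SHARE_PCT, 10)   # "10 times more skewed"

print(f"Actual skew: {actual_skew:.0f}x")
print(f"Guessed range: {guess_low:.1f}% to {guess_high:.1f}%")
```

A 5 to 10 times skew implies a research share between 1.4% and 2.8%, which lines up with the guess of 2 or 3%, while the quoted figures give a skew of about 20 times.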
Saloni Dattani:
Oh, wow. 0.7% of the world’s researchers are in sub-Saharan Africa.
Jacob Trefethen:
Yep.
Saloni Dattani:
I think that is definitely going to affect the types of diseases that get studied and the types of potential treatments that people make. Maybe the considerations that they have, if you’re trying to develop treatment for a disease in a wealthier population, maybe you’re not thinking about the same types of bottlenecks or you’re not thinking about the same degree of variation and, like, how heat stable the thing has to be or how easy it is to transport or whether you can, I don’t know, whether you can take it in pill form versus injection and lots of things like that, I would assume are affected by that amount of skew.
Jacob Trefethen:
Yes, absolutely, and there’s actually evidence in the economic literature for that.
Saloni Dattani:
Okay, I have another question. How much is spent on healthcare in the US versus in Nigeria per person?
Jacob Trefethen:
Okay. So US, I have in my head that healthcare is shockingly high percentage of GDP. I think the last number I remember is 18. That’s roughly half public spending, half private spending. So if that were true and GDP per capita is 60K, call it 20%. So say roughly 10,000.
Saloni Dattani:
Wow, that was a really good guess. That’s 12,000.
Jacob Trefethen:
Boom. Okay. Nigeria, I don’t have any of those statistics to hand, including GDP per capita, although maybe that’s 5,000 or something like that. So, but let’s say, or maybe a bit lower, but let’s say, oh God, I don’t even know how to make a guess in Nigeria, but I’ll go with the GDP per capita is 5,000 or 3,000 and that the share going to healthcare is lower and that means say 5%. So that would be, oh god, okay, $150.
Saloni Dattani:
Also, I feel like the reasoning here is great, but it’s more like $90.
Jacob Trefethen:
Okay, that’s further off to be fair. Wait, 90. Okay, so I was out in the wrong direction on each one, which means that the spread is even larger. Okay, so you’re saying 12,000.
Saloni Dattani:
12,000 per person per year in the US is spent on healthcare versus $90 per person per year in Nigeria. So the difference is more than 100 times.
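The per-person spending comparison can be laid out as a short Python sketch. The guessed inputs are the round numbers from the conversation (the 18% figure for the US, and the lower of the two GDP-per-capita guesses for Nigeria), the quoted figures are the ones stated in the episode, and the variable names are illustrative.

```python
# Per-person healthcare spending: guesses vs. the figures quoted in the episode.
# All amounts are in US dollars per person per year.

def spending_per_person(gdp_per_capita, health_share_of_gdp):
    """Healthcare spending per person as a share of GDP per capita."""
    return gdp_per_capita * health_share_of_gdp


# Guesses from the conversation (not official statistics).
us_guess = spending_per_person(60_000, 0.18)      # rounded to "roughly 10,000"
nigeria_guess = spending_per_person(3_000, 0.05)  # the "$150" guess

# Figures quoted in the episode.
US_QUOTED = 12_000
NIGERIA_QUOTED = 90

guessed_ratio = us_guess / nigeria_guess
quoted_ratio = US_QUOTED / NIGERIA_QUOTED

print(f"Guessed: US ${us_guess:,.0f} vs Nigeria ${nigeria_guess:,.0f} ({guessed_ratio:.0f}x)")
print(f"Quoted:  US ${US_QUOTED:,} vs Nigeria ${NIGERIA_QUOTED:,} ({quoted_ratio:.0f}x)")
```

Because the US guess was too low and the Nigeria guess too high, the quoted gap of about 133 times is wider than the guessed one.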
Jacob Trefethen:
Wow. Yeah, my gosh, that is not exactly going to get you multiple medical interventions per year. That’s really not getting you much at all.
Saloni Dattani:
What that means is that there is often lots more research effort that goes into diseases that we still see in wealthier countries than into diseases that have been largely eliminated there but not yet in poorer countries. Especially infectious diseases, but I think also maybe things related to food safety and things like that.
Any diseases that are caused by pollutants and exposures like that are probably harder to treat because there’s less effort that’s going into them because we have other infrastructural reasons that those things have been largely reduced in wealthy countries. For example, things like cholera are still quite common and there are big outbreaks in different parts of South America and Africa, especially near monsoon season, I think, and near coastal areas. But in wealthier countries where sewage systems and clean water systems are much better, the bacteria get filtered out and you have a much lower chance of getting infected or having severe disease from those conditions.
Jacob Trefethen:
The happy news is that some technologies that get developed in richer countries first do spill over in the sense of they, you know, if you do all of that investment in mRNA vaccine technology, well, at the time that COVID hit, that was useful in the US, but probably was not that useful in Nigeria. But if you fast forward the next 10 years and people keep iterating on mRNA and making it more stable and making it require less cold chain and delivery, and improving the profile in various other ways, maybe that will find its way into a vaccine for a totally different disease in Nigeria in 10 years time. We can’t know, but if it does, that will have been helped by the, or a necessary driver of that was the, original vaccine investment in rich countries. The clearest example recently of this for me is lenacapavir, the drug we did our first episode about, where Gilead, an American company, spent many years doing very complicated tweaking R&D of this molecule. The people who will most benefit from it are women in Southern Africa. That is not where Gilead is going to make most money from the product; they’ll make that money in the US.
Saloni Dattani:
I think that’s a really good example. What I’m wondering about as well is, let’s say you do have currently an expensive process for developing a treatment or it’s some complicated surgical operation or something like that. How does the transfer of that knowledge actually happen right now? Can AI help speed that up or improve the distribution and the access to that knowledge?
Jacob Trefethen:
I think sometimes yes, and sometimes it’s tricky. I think that often a lot of knowledge in manufacturing is fiddly and you have to be able to walk another outfit through, “Here’s how we got this cell line to work so that it would produce, you know, the protein vaccine that we’re trying to make in the right quantity and, oh, this thing we tried doesn’t work and this thing does.” It’s hard to just implement a checklist, though the checklist helps. What do you think?
Saloni Dattani:
Is that because there are just too many things to record? That sort of reminds me of how sometimes it’s quite hard to reproduce or replicate findings even if the research is done well, there might just not be enough information in an academic paper to know how to replicate it at all. Let’s say someone has developed a new method to either make a vaccine or some cell therapy or something. Unless they really give a lot of detail, it’s quite hard to do the same thing in a different lab. Is that the kind of problem that you mean? Would this be solved with, I don’t know, video cameras? If someone like a lab technician wears a helmet with a video camera and records everything they’re doing, will that solve the problem?
Jacob Trefethen:
I think that’s such a cool question. I think that in the limit with better AI, you could imagine that helping. I think the problem is often helped by software that can transfer between companies already. Even without the cameras, there are parts of the process that have been digitized. You’ll be taking measurements of what’s happening in a given machine or bioreactor or something. If the conditions get too warm, you will automatically turn on a cooler. That kind of thing where you’re tuning the conditions does transfer already probably pretty well and probably better than the lab example that’s hard to reproduce because in order to sell a product, you need to demonstrate to a health regulator that you have a consistent way of making a product, so that you are sure what you’re getting out the other end. So actually that probably transfers better. At the limit of that, small molecules transfer extremely well already. You don’t need to do an 18-month tech transfer process that costs $10 million and lots of flights and visits. With small molecules, knowing the chemical itself is a pretty big first step.
Saloni Dattani:
That makes me think that maybe for, let’s say, biological drugs, a lot of this kind of technical know-how is probably proprietary knowledge. It’s probably quite hard for other firms to learn from the process unless it’s all documented publicly.
I think probably there are some pharma companies or biotech companies who will share some of this knowledge publicly, but it’s not that common. I do remember reading the Genentech book which is behind me on my shelf. In that, Sally Smith Hughes talks about how one of the things that drew people to Genentech was that they were willing to publish their methods and their results in academic journals. That was important for the reputation of academics who wanted to join Genentech and leave academia for it because it meant they still have this standing of, you know, they’re producing, they’re doing this research, they’re sharing it with the world and other people can trust what they’re doing because if they can see exactly how the methods work, then they can replicate it and see and kind of build on that knowledge. I think that the building on this knowledge could be really helpful. If there’s some way to make that more public information, it would be really valuable probably, to help speed up development by other research groups or other firms working on similar problems.
Jacob Trefethen:
One source of data I would love, if it was more public, is all of the decades of documents and spreadsheets and data packages that pharma companies have for drugs that never made it to the market. If a drug does make it to the market, then most regulators, especially the EMA in Europe, will publish a lot of information about their review of your drug, which is wonderful and helps patients and helps doctors. If the drug doesn’t make it to the market, you get less of the data. Even if the drug does work, you sometimes have redactions to protect the proprietary information of the company. I think what you’ll see, at least at the beginning of the AI technologies getting used in pharma, is that traditional pharma companies that have operated for a long time will have a big data advantage for that reason.
Saloni Dattani:
Is there some way to use AI to un-redact redacted information?
Jacob Trefethen:
That is a fun punk science project. Yeah, someone listening, give that a go, but don’t cite us.
Saloni Dattani:
The thing that I’m imagining is there was this meme a few years ago, I think, of the TV show CSI that someone would use AI or something to... You would have camera footage of some crime taking place. The camera quality is not very high. The person would be like, “Magnify!” and then the software would just somehow improve the resolution of the footage, but actually you would just be hallucinating the information there. Which is really bad because you’re probably creating this fake persona that is doing this crime and has not actually done it. I wonder if the same thing would happen here. You’re just kind of hallucinating the details by using AI to project it.
Trust and ambition
Saloni Dattani:
Let’s say we’ve fixed all of the research and development funding problems. We have more Open Philanthropies, for example, who are helping to plug in these gaps or try to realign and get people to recognize which areas are neglected. Are there still going to be problems that we can’t solve?
Jacob Trefethen:
Aside from just simply public health systems being inadequate in a general sense — it’s not worth having a whole episode on US health insurance — reforming health insurance in many countries is going to be a requirement for getting people healthcare they need. Let’s just leave that aside. The final thing I would return back to is something you said about measles, which is measles is, technologically speaking, a solved disease in the sense of the measles vaccines are great and they’ll stop you from getting measles. But not everyone trusts that that is true and not everyone trusts the messengers who say that. So, at the end of the day, if people are going to benefit from new health technologies, the final boss is societal trust and individual trust.
That sounds, as I say it, like a negative, like, “Oh my God, AI is just going to make it so much worse because AI is populating my Twitter feed with deep fakes and I don’t know what the hell’s going on and why could I trust anything.” I’m actually not sure that that’s the way it will go. I have some hope that AI will increase people’s trust in better information, but I don’t want to bank on it. I mean, if I just look anecdotally at large language models so far, and I’d be so curious what you think of this, what I’m going to wager is that they have led to better medical information, not worse. A lot of people talk to ChatGPT or Claude or Gemini about what drug they should take, or what’s going on with some symptom they’re having. I think that, although each of those hallucinates and makes stuff up, they probably give better answers by a lot than the next best alternative, which is asking your friends. So you might have more trust in science, if not more trust in doctors. But what do you think?
Saloni Dattani:
Hm, I think to some degree I agree with that because I think when you’re thinking about what is the effect of talking to ChatGPT about medical information and things like that, the counterfactual to that is not having information necessarily from the CDC or something, right? The counterfactual is usually asking Google or looking up the first search results and maybe the first search result isn’t very good. I know that Google has kind of changed their algorithms in a way to prioritize better healthcare websites and information from them. But at the same time, usually that information is pretty limited. It’s, you know, you’re Google searching or you’re asking your friends or you’re asking one local doctor or something like that. People tend to have more trust in those sources of information, but generally speaking, I would say that an LLM is going to be better at providing information about, you know, should you take a measles vaccine? What are the benefits and risks and things like that? It’s going to be able to actually tell you about what the literature says and summarize that, and you can kind of talk back and forth with it and try to get it to answer your questions, and ask it to explain that to you in plain language and things like that.
But at the same time, my guess is that there’s also so much variation in this. It’s sort of like I want to run a clinical trial or something to see how much ChatGPT affects trust in the population, because I can imagine that sometimes it just goes wrong and the hallucinations are actually really harmful. If someone has some, I don’t know, rare condition, or maybe they even have a common condition, but ChatGPT convinces them that they need some horrible procedure for it, that might actually make them make a worse decision. It’s hard to predict how that will go. But I guess I’m also thinking about things beyond the LLMs: the social media spaces that are now cluttered with bots and deepfakes and things like that. That didn’t feel like it was the case 10 years ago. Obviously there were other issues with the internet back then, but this feels new and this feels like it’s maybe reducing public trust.
Jacob Trefethen:
It’s hard for me to take the optimistic side on that presently. If I really try and summon my optimism, it might be that we’re passing through a particular era that is not so epistemologically healthy. But we all, at the end of the day, have some desire to see truly and some incentive to see truly when it comes to our own health. That’s where my optimism comes from.
Saloni Dattani:
What about the actual ambition to solve some of the problems that we talked about? A lot of them are going to take people actually deciding to go out there and solve some of these economic incentive issues, or we need to maybe train some doctors, or we need to run better clinical trial designs, or we need to find ways to improve the recruitment of participants. Maybe some of these are policy changes, or they’re cultural changes, or they’re training people up. How is all of that going to happen? Is AI going to solve all of this as well?
Jacob Trefethen:
I think that one is for humans, and probably for the listeners of this podcast. When I get again in an optimistic mood, I just think a lot of these problems require people to take them seriously, think them through, and then you can make progress. You look at some of the big pushes of the past that have made progress, and we’ve done things much harder. We eradicated smallpox — like, oh my God, that’s a big push. I think this just being a little bit more self-confident is some of what’s missing. Will AI help with our self-confidence? I don’t know, maybe. I think overall, we just got to take a deep breath in and give it a go.
Saloni Dattani:
Do you think we would be able to eradicate smallpox today?
Jacob Trefethen:
Oh, what a horrible question. I hate to contemplate that the idea might be no. Oh my gosh.
Saloni Dattani:
I’ve been meaning to write a piece about this because my view is that smallpox was surprisingly cheap to eradicate. I don’t know if that means it was surprisingly politically easy, and I think it wasn’t; I think that was the difficult part. But if you look at just the amount of spending that went into that, it was less than the annual funding that went into the malaria eradication program. That suggests to me that there are some diseases where, if we put in the effort and we have the right technology and things like that, we can actually make really huge amounts of progress, as long as people come together and decide to coordinate, develop these efforts, and target the disease in a specific or an effective way. It does make me think that there’s a lot out there that could still be solved. What are your top three public health problems that people should solve that you think could be done with more ambition and effort?
Jacob Trefethen:
I think lenacapavir in HIV is a special case where a new tool could really change the game on a massive disease. People already know a lot about HIV. People already want to keep making a dent. Now’s the time to just...
Saloni Dattani:
Eliminate it. Use this extremely effective preventive drug.
Jacob Trefethen:
Yep. Next up would be more countries should follow Egypt’s lead on hepatitis C. Egypt in a few years went from having very unusually high rates of hepatitis C, due to public health campaigns in the 70s with needles that weren’t sterilized properly. They had probably over a 10% positive rate for hepatitis C, and now they’ve got it down to near zero because they screened and treated people. It’s the same in Georgia, the country, not the state. This is an issue that other countries could pick up, get loans from an international health funder, and just do it.
My third would be, I guess, malaria vector control. It’s possible that something like gene drives or some other form of biological vector control could, in some countries, eradicate malaria. I don’t think it will suddenly eradicate malaria everywhere; you’d have to do such a strategic set of releases, and it might pop back up, but you could really have a discontinuous effect, a drop way down. That would be wonderful. What are yours?
Saloni Dattani:
What are mine? I’m going to go a different route. I think all of yours were really great ideas. I’m going to say there are specific data collection efforts that people should be working on that will help for better research, but also will help AI tools help us more.
One of them is actually to map out the world. I don’t know if you’ve heard about OpenStreetMap, which is kind of Google Maps, but the open access, open source version of that, and anyone can contribute. You have to take coordinates from where you are, and you can insert onto this global map where the roads are, where the hospitals are, where the schools are, and specific little features on a map. For, I think, over 10 years, people have contributed to this mapping project voluntarily. I think they also receive some nonprofit funding now to do humanitarian mapping, which is basically volunteers using satellite information, or going out into the field and mapping specific roads, hospitals, and clinics and things like that in some of the poorest parts of the world. That is so that people who are delivering drugs or running nonprofits to provide humanitarian aid to these remote places know exactly where to go, how to travel there, and where the nearest clinics are, and so that the distribution process can be mapped and improved. I think that kind of thing is definitely one. Just better information about the world out there seems like a really big thing that could hopefully be solved with collaboration and open source data collection efforts.
Another one is the Demographic and Health Surveys. This is a huge initiative to collect data on causes of death in children and mothers, and also to estimate various other things related to that, like fertility rates and mortality rates. I think there are some specific efforts to collect data on HIV and malaria, like the prevalence and drug resistance and things like that, and that helps to inform lots of other efforts that are involved in distributing and delivering some of these things, and also to see how effective they’ve been, what the trends are like, and if anything is ticking up or if there are new outbreaks.
Then I think maybe the third one is, I feel like there are just a bunch of diseases that could be eliminated or eradicated that we haven’t really tried hard enough for yet. Rabies is one of them, where you could just vaccinate all of the wild animals in different ways. In Europe, for example, there have been efforts to... well, you catch rabies by a bite from a rabid animal, like a bat or a dog or a fox. Humans don’t transmit it between each other. The place that the rabies virus lives, the reservoir, is in wild animals, usually dogs, foxes, and bats. If you are able to vaccinate all the bats or vaccinate all the foxes in the wild, then you could eliminate rabies without having to develop a treatment or without even having to vaccinate humans at all.
In Europe, I think in the 2000s and 2010s, there were large-scale efforts to drop oral bait vaccines from helicopters into forests and get all the foxes and dogs and things vaccinated, so they would just eat these vaccines up. That’s not super effective because some of the foxes might be eating more of the vaccines than others, and some of them are not getting protected, and it’s funny to think about gobbling up vaccines in the wild. But you could have efforts like that in other countries as well, where you’re trying to catch infected animals and vaccinate them, or educate the population to teach them what the signs are of a rabid animal. I just think there are various diseases like this where, if we put in the effort, we already have some of the tools to eliminate them. We could just do that and why don’t we, you know?
Jacob Trefethen:
I love it. Let’s do it. Let’s do it. Let’s do it.
Summary
Saloni Dattani:
We are on the last stretch of this episode, so we should do a summary of what we’ve talked about so far.
Jacob Trefethen:
Okay, what did we cover?
Saloni Dattani:
Well, in order to develop drugs at all, we said that biological understanding is not necessary. Sometimes you can develop drugs without understanding how the disease works, and that has happened in the past. It’s less common today, and we have more rational drug design, but it does help to understand how things work. You can have new theories, you can improve the technologies you have, and you can make big breakthroughs once you can understand the actual mechanisms. Then you can really filter down this search from before, where you might just experiment with hundreds or thousands of different compounds, and narrow that down once you understand how the disease works. Even if we do understand how diseases work, sometimes it’s still hard to develop drugs for different reasons. Sometimes it’s hard to deliver the drugs to the right place, like in the brain, or it’s hard to make drugs that are effective and safe. There’s still a lot left: even if AI is able to help us develop better candidate drugs, we still have to go through lots of testing before they can cure disease.
Jacob Trefethen:
Next, we discussed models. Animal models in the status quo are okay, but not great. Sometimes animals don’t get the same disease or don’t recapitulate infection in the same way as humans. It’s hard to generalize from animal results to human results, which leads to a lot of failure later in the clinic. Organoids are currently fairly limited, but can be a more human-like approach, and involve human cells and aggregations of multiple types of cells; they’re a newer part of the toolkit. Virtual cells that can run on a computer are still in the early stages of utility, but absolutely an exciting area to push further forward. More complicated virtual systems, which involve more than just a virtual cell and include many different cells connected to each other, are also in the early days, but there are some exciting things there that you could keep pushing forward and do more with.
Saloni Dattani:
Right, and we often need experimental data to even validate whether those virtual cells are working correctly.
The next thing is, imagine you have made a candidate drug. You then need to test it out in humans eventually, at some point, if the drug or the vaccine is meant for people to take. I think to some degree, AI can improve the pipeline of drugs that get to this stage, but it’s really hard to still collect human efficacy data. Sometimes it takes a really long time to run trials because the disease might be rare, or the outbreak might have passed and it’s going to take a while before the next outbreak happens, and in the meantime, you might be waiting years, you might have to have thousands or tens of thousands of people in your clinical trial in order to see whether a potential drug has any effect at all. Sometimes even after that long period, you might find that it has failed. There are probably lots of non-AI related reforms that you could do to improve this process: better trial design, improved recruitment of participants into trials, and maybe AI can help in some ways better matching people to trials that they might be interested in. There are still a lot of human, regulatory, and policy-related obstacles that mean efficacy data is quite hard to collect. Then there are various statistical reasons that it takes a long time, and it’s something that is very difficult to automate or predict in advance.
We also talked about biomarkers, how they are sometimes useful and can sometimes tell you about the earlier stages of a disease, but you still need to validate them. You still need to collect data to find out how well they correlate with the later stages of a disease. Then you need causal inference research. You need to find out whether the biomarkers are just a correlate or are they just a byproduct of the disease process, and if you treat them, you might not actually improve the overall disease that much. I think that to some degree, AI can help synthesize the existing research, but you still need to actually do experiments, and you still need to validate this stuff in real life.
Jacob Trefethen:
That’s the human efficacy data. We then discussed safety. Safety is probably harder, at least in my view, to skirt around or improve upon with AI than efficacy, especially when it comes to preventive drugs. My optimism would be for treatment drugs, where your other option is really bad and you’re in a bad disease state already, where maybe there’ll be some acceleration that people will tolerate more. There are other things you could try and do with AI. You could try and model liver metabolism better, model drug interactions better, and that will help. A lot of the knowledge or data that already exists that would help you do that is proprietary in drug companies, rather than public, so there’s a limitation on how you can build public models there currently.
When it comes to safety data, another thing that’s hard to get around is that variety and heterogeneity between different people means you need larger samples. You need larger samples to detect rarer events, and you just need even more data to detect drug-drug interactions. Existing datasets, when they get really large, are usually the ones that are not randomized because they are people self-reporting, so it’s hard to get the causal links there. You don’t want to blast through safety too quickly for things that could do a lot of harm.
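As a rough illustration of why rarer events demand larger samples, here is a back-of-the-envelope sketch rather than anything worked through in the episode. It assumes each participant independently experiences an adverse event with probability p, so the chance of seeing at least one case among n participants is 1 - (1 - p)^n. The function name and the 95% detection threshold are illustrative assumptions.

```python
import math


def participants_needed(event_rate: float, detection_prob: float = 0.95) -> int:
    """Smallest n with P(at least one event among n people) >= detection_prob.

    Simplifying assumption: every participant independently has the same
    probability `event_rate` of experiencing the adverse event.
    """
    if not 0 < event_rate < 1:
        raise ValueError("event_rate must be strictly between 0 and 1")
    if not 0 < detection_prob < 1:
        raise ValueError("detection_prob must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= detection_prob for n, then round up.
    return math.ceil(math.log(1 - detection_prob) / math.log(1 - event_rate))


# Hypothetical event rates, chosen only to show how n scales with rarity.
for rate in (1e-2, 1e-3, 1e-4):
    n = participants_needed(rate)
    print(f"event in 1 of {round(1 / rate):,} people: about {n:,} participants")
```

Under these assumptions the required sample grows roughly as three divided by the event rate, and that is only enough to observe a single case, not to show that the drug caused it or to pick up an interaction between two drugs.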
Saloni Dattani:
Once we’ve developed drugs, we’ve tested them, and it turns out they’re effective and safe, there are still various challenges that AI might not be able to solve. What are some of those?
Jacob Trefethen:
Manufacturing and delivery is what we talked about next. Some technologies here are going to be easier than others, especially small molecule drugs and anything else you can already buy over the counter: vitamins, supplements, all of that. AI may help if knowledge can be generated about what combinations of those drugs people could use to benefit themselves. Ultimately, for some new modalities, in my opinion, we need more economic growth to support the high costs that they may require. In particular, even if we manage to get manufacturing costs down, there’s always going to be the cost of delivery. The more you need an expert human or humans to be present to be doing procedures or to be monitoring procedures, and the more regular those procedures are, the more expensive the cost structure here will be. So if some of those technologies get invented by AI, that will be wonderful, but we’ll need some economic growth to go alongside, and some good health systems and good insurance reforms to go alongside those inventions, to deliver the drugs to people who need them.
Saloni Dattani:
The economic growth angle you mentioned is really interesting. It also leads to this other thing that we talked about, which is the skew of which problems we actually have decided to tackle so far. For now, most of the problems that the new technologies being developed in medical research address are kind of rich-country diseases. They’re diseases where the incentives for new drug development are very high, the markets are large, and there’s going to be a higher return on investment. That’s much less the case for tropical diseases, for rare diseases, for areas where the funding model is just not going to work, like antibiotics, where new drugs that are developed have a very limited market and there isn’t much of an incentive to develop the drugs at all, or once they’re developed, to actually take them through clinical trials and to manufacture them at scale. Better incentives are needed, better funding models are needed. Otherwise, some products that are developed won’t get delivered, and some diseases won’t have drugs developed against them. Even when cures do exist, they won’t reach everyone who wants or needs them.
Jacob Trefethen:
The next thing we talked about was trust: societal trust in the medical system, and trust from different individuals in the information that they receive about what might benefit them. In my opinion, this is actually the biggest wild card, but a wild card in the sense that it won’t necessarily get worse from where we are right now. Possibly it will get better, if better information is delivered in ways that people trust via large language models or other routes by which AI improves things. I think we’ll just have to wait and see on this one.
Saloni Dattani:
Finally, we talked about ambition. I think more ambition is generally important for solving public health problems. Lots of problems are solvable, lots of diseases can be eliminated, but it doesn’t happen without the funding, without the incentives, without the willpower, and also the right ideas and types of reforms to solve those big challenges by focusing on the right barriers and actually making progress against some of the more tractable areas here. I feel like I am optimistic for the future of medical progress. I’m less certain about how much AI will help here. I do think that there are lots of ways that it can help, but there are just so many other problems outside of this that need other types of reform, other types of efforts.
Jacob Trefethen:
What parts of the drug development system do you think will be easier for AI to solve?
Saloni Dattani:
I think the areas where there is abundant data and there’s clear structure to it, where people have verified the data and collected it in a curated way. Protein and drug design, for example, that we talked about in the previous episodes, are areas where there has been a lot of progress. Even then, we haven’t really solved some of the bigger challenges. We’ve made a lot of great headway against predicting the structure of proteins dissolved in water, but we haven’t really done that for other types of interactions, how they move around, and their dynamics. Similarly, if you’re trying to repurpose drugs or discover new potential targets for drugs, I think those are things that AI could be quite helpful for.
Maybe improving the clinical trial recruitment process or designing better trials or identifying better biomarkers, I think those are things where AI could help. I still think there are quite a lot of areas where you need someone to actually think through what are the right incentives to develop for this, what are the policy changes, and who are the specific people who need to be involved to collect or improve the data in the first place. I guess there are some of these trickier things where you want to solve a problem, but you need to integrate lots of different data: genomics data, imaging data, chemical data, if you wanted to simulate something. That’s going to be much easier for AI to do than for one person or a small team of researchers.
Jacob Trefethen:
Which parts of drug development do you think are going to be harder for AI to help with?
Saloni Dattani:
Well I guess in some ways the reverse. Areas where there’s very limited data, where the data isn’t digitized, things like that. Also the physical operations, the manufacturing that we talked about, manufacturing things on the frontier, the types of technologies where it’s hard to replace or simplify the process, and where the capital is very expensive. Then areas where you need regulatory reform, or maybe you need medical regulatory capacity and staffing and expertise to build. Also, the areas where the disease or the incentives are not very good, diseases of poorer countries, other conditions where developing treatment doesn’t have very good incentives for it, places where there isn’t very much political will or political capital for people to be interested in solving that problem. Maybe finally, areas where it’s hard to get data on efficacy and safety. Maybe we haven’t developed the right tools yet in order to test the right places of our body, such as the brain, or it takes a really long time to see what the outcome of these studies is going to be.
Jacob Trefethen:
Putting that all together, let’s go back to the question that we started with. Saloni, will AI cure all disease in a decade?
Saloni Dattani:
No, I think no. Not even if there’s a radical change in the drug development pipeline. For lots of diseases, the data isn’t there. Biology is really complex; it’s hard to predict. We don’t have the tools yet. We don’t have the starting point of the data to make these predictions very well. The economic incentives are still quite limited for many diseases. There’s a lack of access to drugs even now, where there are cures for diseases or extremely effective vaccines. Things like getting the trust, the ambition, and the access, those are still barriers. I think yes, there are lots of things that could be improved by reforming drug development overall. We can make lots of medical progress by trying to solve all of these different issues. I think AI could be impactful, but I think no, it’s not going to solve this alone. What do you think?
Jacob Trefethen:
I don’t think AI will cure all diseases in a decade, but I do think that some of the progress we might see from AI may lead to more discontinuous types of progress within parts of the scientific system, like we have so far seen in proteins and discussed in this series. So I would maintain probably actually more optimism than the average scientist that things might start looking pretty interesting.
The level of optimism I’m not at yet is that, if AI progress in software terms keeps continuing, and AI companies build new large language models or reasoning agents who can act as scientists in digital form, talk to each other, and reason with each other, I think that that could lead to a lot of scientific progress too and might be very hard for us to predict because we haven’t approached that type of system yet.
What I would emphasize to people who are thinking through that lens of AI progress is that even in that case, some of the things we’ve discussed today will keep applying, and the bottlenecks to medical progress in manufacturing, in delivery, in working on problems that affect people that don’t have great financial incentives to work on, all of those will still exist.
The AI will maybe advise us to start working on those problems, but the great news is that we can already start working on those problems right now. Those problems are not bottlenecked on more intelligence per se. They are bottlenecked on ambition, they’re bottlenecked on energy, they’re bottlenecked on money, and they’re bottlenecked on caring. Let’s get that right starting now, let’s take it seriously. Let’s take some of the advice the AIs of the future may give us, but take it in 2025.
Saloni Dattani:
We are now at the end of the episode. Thank you so much for listening. I hope you subscribe and share this with your friends and enemies. You should give us a rating on Spotify or Apple or wherever you listen to this. If we’ve changed your mind, let us know. If we haven’t, let us know still. But I hope you enjoyed this episode.
Jacob Trefethen:
Thanks for staying with us. Thanks for letting us speculate. Great to talk to all of you again.
Saloni Dattani:
Bye!
Show notes
Blogposts:
Claus Wilke (2025) We still can’t predict much of anything in biology https://blog.genesmindsmachines.com/p/we-still-cant-predict-much-of-anything
Elliot Hershberg (2025) What are virtual cells? https://centuryofbio.com/p/virtual-cell
Jacob Trefethen (2025) Blog series.
1) What does AI progress mean for medical progress? https://blog.jacobtrefethen.com/ai-progress-medical-progress/
2) AI will not suddenly lead to an Alzheimer’s cure https://blog.jacobtrefethen.com/ai-san-francisco/
3) AI could help lead to an Alzheimer’s cure https://blog.jacobtrefethen.com/ai-optimism/
Articles:
Wendi Yan (2024) Discovering an antimalarial drug in Mao’s China https://www.asimov.press/p/antimalarial-drug
Jason Crawford (2020) Innovation is not linear https://worksinprogress.co/issue/innovation-is-not-linear/
Shayla Love (2025) An ‘impossible’ disease outbreak in the Alps https://www.theatlantic.com/health/archive/2025/03/als-outbreak-montchavin-mystery/682096/
Alex Telford (2024) Origins of the lab mouse https://www.asimov.press/p/lab-mouse
Jonathan Karr et al. (2012) A whole-cell computational model predicts phenotype from genotype https://pmc.ncbi.nlm.nih.gov/articles/PMC3413483/
Wen-Wei Liao et al. (2023) A draft human pangenome reference https://www.nature.com/articles/s41586-023-05896-x
Per-Ola Carlsson (2025) Survival of transplanted allogeneic beta cells with no immunosuppression https://www.nejm.org/doi/pdf/10.1056/NEJMoa2503822
Saloni Dattani (2024) Antipsychotic medications: a timeline of innovations and remaining challenges https://ourworldindata.org/antipsychotic-medications-timeline
Saloni Dattani (2024) What was the Golden Age of antibiotics, and how can we spark a new one? https://ourworldindata.org/golden-age-antibiotics
Books:
Sally Smith Hughes (2011) Genentech: The beginnings of biotech
Theses:
Alvaro Schwalb (2025) Estimating the burden of Mycobacterium tuberculosis infection and the impact of population-wide screening for tuberculosis
Acknowledgements:
Aria Babu, editor at Works in Progress
Graham Bessellieu, video editor
Abhishaike Mahajan, cover art
Atalanta Arden-Miller, art direction
David Hackett, composer
Works in Progress & Open Philanthropy