Lessons from Vivek Natarajan, Lead of Med-PaLM-2 Google Health and Viswesh Krishna, CTO and Co-Founder of Valar Labs building AI applications and models in healthcare
Recap from Pear Healthcare x AI Panel
Welcome back to the Pear Healthcare Playbook! Every week, we’ll be getting to know trailblazing healthcare leaders and diving into building a digital health business from 0 to 1.
Today, we’re excited to get to know Vivek, AI researcher at Google and one of the lead researchers for Med-PaLM 2, and Viswesh, CTO and Co-Founder of Valar Labs!
Vivek is a Research Scientist at Google Health AI advancing biomedical AI to help scale world-class healthcare to everyone. Vivek is particularly interested in building large language models and multimodal foundation models for biomedical applications and leads the Google Brain moonshot behind Med-PaLM, Google's flagship medical large language model. Med-PaLM has been featured in publications such as Scientific American and Forbes. Vivek earned his master's in Computer Science from UT Austin and his bachelor's from the National Institute of Technology in India.
Viswesh is the CTO and Co-Founder of Valar Labs, building clinical-grade deep learning to analyze each patient's characteristics and provide clarity to oncologists during decision making. Their AI is built with oncologists at the center and provides interpretable and actionable insights. Prior to founding Valar Labs, Viswesh was a Research Assistant in Stanford's Artificial Intelligence Laboratory (SAIL), leveraging cutting-edge artificial intelligence to solve healthcare problems. He was also the founder of Kanna, a patented and clinically validated method to detect amblyopia in children in India. Viswesh graduated with a bachelor's in Computer Science from Stanford.
In this episode, Vivek and Viswesh share how they got into healthcare AI research and how they ended up on different career paths, one leading research at Google Health AI and the other as the CTO and Co-Founder of Valar Labs. We talk about the future of LLMs in healthcare, and also how to build defensibility in AI healthcare startups.
What did you want to be growing up?
Vivek: “Cricket commentator.”
Viswesh: “Healthcare.”
What brought you to this intersection of Healthcare and Machine Learning? What were the different pivotal moments in your life?
Vivek:
Growing up, healthcare had a huge impact on Vivek. He grew up in parts of India where for most people, going to the doctor meant walking 30 miles in extreme weather and giving up a day’s wages. Most people would go their entire lifetimes without seeing a doctor.
Vivek studied electronics engineering during his undergrad in India. His final thesis was an app called Ask the Doctor Anytime, Anywhere. He had very little idea of AI or machine learning, but he shares that he was a product of the MOOC (Massive Open Online Courses) revolution.
He stumbled onto a course one day when he was browsing the Internet with whatever little bandwidth he had access to. It was an ML course from Professor Yaser Abu-Mostafa at Caltech called Learning from Data, and that got him hooked on AI and machine learning. When he came to grad school, he took as many courses in machine learning and AI as possible and landed his first industry role at Facebook AI Research back in 2014.
Vivek spent four years at Facebook working on a bunch of different problem areas: conversational AI, natural language processing, computer vision, speech recognition. He loved that they were deploying models into real products for hundreds of millions of users at scale, and that it was lifting key metrics like engagement or revenue like never before.
When it came to his next move, Vivek thought, “Where should I work on AI? Where should I try to use AI next? And the answer to me was healthcare.” Google Health was coming together at that time with researchers from Google Brain and DeepMind. Vivek joined and has loved his time there, trying to build increasingly general medical AI models, the latest instantiation of which is Med-PaLM 2, with hopefully many more to come. He shares that this is probably the most exciting time to be at the intersection of AI and life sciences.
Viswesh:
Viswesh shares that his brother was diagnosed with amblyopia at age 3. Viswesh started his deep learning journey trying to build solutions that could catch this disease early. Over several years, they ended up taking this from an idea, through conversations with doctors, to a deployed solution.
When Viswesh came to Stanford, he joined the Stanford AI Lab (SAIL). We’ll get into this more later with Valar Labs’ founding story.
Google Health
Google’s Med-PaLM 2 model has reached state-of-the-art performance of 86.5% on MedQA, the USMLE-style question set within the MultiMedQA benchmark, and Med-PaLM 2 answers were judged to better reflect medical consensus 72.9% of the time compared to physician answers. Further, Med-PaLM 2 performed significantly better than Med-PaLM across every axis, and the two models were released less than 3 months apart. I asked Vivek about the secret sauce:
Vivek thinks a lot of the work is just standing on the shoulders of giants. He attributes the large language model revolution to a handful of big developments over the last few years:
The Transformer architecture paper (2017)
The rise of decoder-only large language models, such as OpenAI’s GPT-3 (2020)
The development of very strong alignment techniques, which have enabled companies to put these models out in the real world without causing issues
The availability of datasets and benchmarks that researchers at the National Library of Medicine and the National Institutes of Health have created over a long period of time
How did it significantly improve in 3 months? Can we expect it to improve even more?
Vivek couldn’t share many details, but he thinks the pace of improvement is going to be the same, if not steeper. Google researchers have been building PaLM 2, which the team can build on top of, and having access to great models helps.
How are clinicians involved in the design?
The entire team has a lot of involvement from clinicians. Vivek shares that his partner in crime is Alan Karthikesalingam, an MD and PhD.
“We believe medicine and healthcare is a multi-disciplinary endeavor, so if you all come together and work with people in regulatory policy and other domains with expertise in ethics, safety, and bias, then you can build something because this domain needs all of that.”
What level of performance is needed to be viable for medical use cases? Is the current USMLE result enough?
Vivek shares that the USMLE result is a headline-grabbing result, but it tells you nothing about how useful these models are in practice, whether that’s in provider workflows or in more safety-critical diagnostic applications. The work that they need to do to understand the capabilities of these systems is a long and arduous journey, but they’re excited to take it on.
Valar Labs
Can you share the founding story of Valar Labs? How did you meet your co-founders?
Viswesh shares that when he first came to Stanford, he immediately joined the Stanford AI Lab. Led by Andrew Ng, the Lab has two programs: one in climate change and the other in healthcare. He worked through a series of different projects and papers that covered a whole gamut of healthcare delivery, and that’s where he and his co-founders first met.
With the help of Stanford Med’s close relationship to Stanford AI, they ended up speaking to hundreds of oncologists, pathologists, and radiologists in order to deeply understand the cancer care journey for the patient. What they found was that there’s a lot of uncertainty in which treatment to give a cancer patient at the start of their care journey. Here’s an example:
“As soon as a patient is diagnosed with pancreatic cancer, the oncologist has two frontline decisions to make— there’s basically two treatments that he can give the patients. And, if you look at the Phase 3 data for these drugs, they’re almost equivalent in terms of their efficacy. So how do you choose?”
Viswesh shares that oncologists will typically start on one treatment and, if it doesn’t work, try the next. What they found was that in cases where the patient fails the first treatment, the second treatment was likely to succeed. Clearly, there was an opportunity here to make identifying the right treatment clearer.
This is the pain point that started it all. What they found was that you need data to make this decision, and that data comes from pathology: the slide a pathologist prepares to diagnose a patient and identify the cancer. Viswesh shares that this slide holds a really rich amount of information. It’s currently only being used to detect the presence of cancer (yes or no), but from pathology you can determine the biology of the patient’s cancer, which treatment is the right one to give them, and more.
For those not familiar with Valar, tell us about your product.
Valar Labs builds clinical-grade AI diagnostics that, by mapping the biology of a patient’s cancer, identify what treatment that patient should get at the start of their care journey. It optimizes patient care as well as helping the clinician make the right decision for the patient.
Viswesh shares that the key piece of the discovery phase, and something many people underestimate, is “understanding what the right clinical questions are”. It’s important to identify the most pressing need for the clinician, and that’s what they spent a lot of time doing. Treatment decision uncertainty is a huge problem for every clinician— the work was then figuring out where technology would be most useful in that process.
Valar’s initial focus was on pancreatic and bladder cancers. Viswesh shares why:
There are certain decisions where digital pathology can make a huge impact. Using bladder cancer as an example, clinicians usually look at whether it’s a big or small tumor and decide treatment from there, which isn’t the best way to be treating patients. A lot of people tend to think about diagnosis in terms of genomics, but Viswesh shares that fewer than 10 to 20% of cancer patients have actionable genomic mutations, so it’s not as impactful as you would think. Pathology exists for every patient, and visual pathology can enhance utility, especially in cancers where 90% of diagnoses are uncertain.
Can you share the progress Valar has made in the last year on initial studies (e.g., ASCO)?
Viswesh shares that Valar has focused on putting a lot of their research work out there over the last year. In pancreas and bladder, they’ve published a large number of full-length papers and gone to conferences to help folks understand how useful their platform is. Outside of research, they’ve focused on working with a lot of collaborators and getting people to buy into the platform.
After this first phase of clinically validating these diagnostics, the next phase is to start shipping them out and getting them to patients.
Perspectives on healthcare and AI at large
What are your predictions on how generative AI will impact healthcare? Do you think it’ll be easier to create software for clinician workflows or back office workflows?
Vivek:
Vivek sees AI being integrated into everything eventually. In the near term, he thinks administrative documentation workflows will be the first thing freed up for care providers and clinicians.
Diagnostic applications are further out because they require much more rigorous validation, studies, and regulatory frameworks— but this is Vivek’s north star, and his estimate is that we’ll get there in 5-10 years.
“The amount of data that we’re generating today — at the individual level, but also at the population in terms of life sciences, bio, health, everything — I don’t think humans can feasibly incorporate and integrate all of that data and use that to make decisions. We definitely need AI, but that AI has to be validated to ensure that it has the right safety and accuracy.”
Finally, Vivek also believes AI will help us better understand human diseases / mechanisms of how things evolve, which will help us discover new therapeutics.
Viswesh:
Viswesh doesn’t think Valar will end up using generative AI, but Valar is still fueled by the foundation model improvements that we’ve seen over the last 5-10 years. From drug development to patient management to further surveillance, there are plenty of places to implement AI, some using generative AI and some not.
What excites you about AI applications in healthcare generally, including outside of generative AI?
Vivek:
Vivek personally believes there’s a ton of short-term value, but the ultimate goal is to put a world-class GP in smartphones — or whatever devices that billions of people around the world will be able to use. “Put an AI doctor into the pockets of billions.”
The Med-PaLM team plans to partner with healthcare organizations through Google Cloud’s trusted tester program. They’ve already begun with big providers and life sciences companies. Vivek believes it’s unlikely that they’ll open up these models for everyone to use since they haven’t fully characterized the capabilities or limitations yet, but they are excited to work with researchers in the community who have the right expertise to help them understand these models. He thinks they’ll gradually expand access as time goes on.
Is the future of healthcare LLMs applications built on open source projects?
Viswesh thinks there are two sides to it:
The open source community is going to continue to innovate.
On the application side, it’s going to be more closed due to the fact that you have to be careful about using patient data.
He sees both being involved — more open on the discovery side but more closed on the patient management side.
How do you suggest controlling for patient data and building on-premise servers?
Viswesh shares that genomics has defined a solid pathway to partnering with hospitals: you send any of the required data out and then move the results back into the physician’s hand.
In terms of the cloud vs. on-prem side, Viswesh believes there will be more cloud adoption because that’s where the industry is going. But in general — “healthcare tends to move more slowly because it’s all about ‘do no harm’ for the patient”, so the cloud will slowly eat into healthcare applications.
How should healthcare AI companies think about differentiation and moat? Is there such a thing?
Viswesh says this is always complicated: sometimes execution is a moat, sometimes IP is a moat, and so on. For healthcare companies, he sees the moat as developing a data engine. His view is that a product with AI built into it is useful.
“But, if your product is simply just AI and you don’t have anything that’s surrounding it, then it’s very easy for someone to come in and potentially produce a better model.”
Vivek shares that today, incumbents are integrating AI very quickly given their distribution. His opinion is that this isn’t where the opportunity is…
“The opportunity is to take more of a long-term view and reimagine, for example, how care delivery is done today. Look at problem areas where people aren’t paying enough attention to— like reimagining drug discovery, for example.
I think this is where the opportunity is because AI is fundamentally a platform shift similar to the Internet, mobile, everything.”
Vivek is noticing similarities to previous platform shifts. At first, people tried to integrate these technologies or platforms into existing workflows. The companies that became unicorns were the ones that reimagined and rebuilt workflows on the new platform. “I don’t see anyone doing that yet, but I think that’s where the opportunity or the most value will be created.”
Is there anything that worries you about the use of AI in clinical delivery, drug discovery, or other aspects of healthcare? What are the biggest risks in your mind and what are some interesting ways to address them as humanity?
Vivek:
Vivek believes that safety is paramount. Depending on the application scenario, healthcare can be high stakes, as in diagnostics. Consequences can be life-threatening, so it’s important that the team validates with proper studies, understands the limitations of these models, and shows that there are fail-safes.
“Doing it responsibly matters a lot more to us. There’s nothing more important. If you make a mistake, that sets back the entire field and community by years, and we don’t want that.”
How do you deal with figuring out commercialization? What do you think about the regulatory tailwinds in AI and healthcare?
Vivek:
Vivek shares that it’s still too early to give a definite answer, but he sees an open mind from people on the regulatory side. Everyone clearly sees the opportunity, but it has to be done in the right way. He shares that it’s an ongoing dialogue, but he expects there to be more clarity on regulatory policies in the near future.
Viswesh:
Viswesh thinks the SaMD (Software as a Medical Device) vs. LDT (Laboratory Developed Test) distinction is important here. For SaMD, there’s a regulatory procedure you need to go through that certifies your software. LDTs are regulated by different bodies. Valar’s tech ended up looking more like an LDT, which put them under a different regulatory paradigm. Viswesh shares that the SaMD framework needs to be updated for AI, and whatever is finalized there will decide how AI software is regulated.
What advice would you give someone wanting to pursue a similar career?
Vivek:
The deep learning landscape has shifted quite a bit since 2014. Back then, he shares that it was easy for someone with no credentials to jump in as long as they were ready to learn and work hard and it was right place right time— consider Alec Radford at OpenAI who doesn’t have a PhD. Unfortunately, Vivek believes credentials matter a lot more now. What’s changing is the open source movement and community. “You can sit back at your home in your garage and have a GPU– and maybe train a model.”
His advice: “Go out and build stuff.” If you’re working in healthcare, make sure the patient is your primary concern and be driven by that mission.
Viswesh:
Viswesh echoes the sentiment that being in the right place at the right time matters. His advice is to be intentional about planning, without feeling that you always have to stick to those plans.
Questions from the crowd
Question 1: What are the different datasets Med-PaLM 2 uses?
Vivek:
Vivek shares that they haven’t deviated too much from the core architecture they have for their base language models. By training on internet-scale data, the models pick up a great deal of general intelligence and become easy to build upon. Next-word prediction requires a lot of reasoning; the models need to learn about physics, medicine, chemistry, and more.
Vivek uses the comparison of a student going to medical school. In medical school, you learn more about biology and clinical medicine, but you also experience real-world clinical settings and workflows. You’re taking these very intelligent models and specializing them, work that Vivek calls “build[ing] on the substrate of general intelligence that we think is necessary”. This helps them rapidly adapt the models to the requirements of the domain while ensuring that these models are really specialized and suited for safety-critical applications.
Their evaluation process is very fine-grained, looking at multiple axes like scientific precision, harm, bias, and more. Vivek shares that they also train and reward models to continue aligning them to the requirements of the domain; then, they can control the outputs through fine-grained knobs. The alignment process is detailed and tailored to the requirements of the domain, which Vivek thinks is necessary.
Question 2: How do you qualify a model?
Viswesh:
Viswesh shares that Valar thinks about this very differently from the way most AI deep learning research is done. Typically you have black box models that spit out outputs, and you go with the flow. These models tend to break as the distribution shifts, let’s say from hospital to hospital as different procedures will lead to radically different predictions.
Valar is focused on building in two stages: first, showing enough adaptation to ensure that the model itself produces something interpretable by humans, and second, building complex mathematical signatures to model the biology.
Vivek:
Vivek is more okay with AI systems being black boxes, since they’re difficult to understand. He shares that he’s more interested in ensuring fail-safes. “When we’re deploying these models, how accurately are we monitoring them for any sort of shifts and predictions? Do we have humans in the loop to verify the solutions or catch in case something goes wrong?”
AI is just one part of a bigger solution. There’s a lot that comes before and after. Vivek recommends reading this paper, a human centered evaluation of an AI diabetic retinopathy model. You need to understand the workflow where you’re deploying these models.
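The kind of fail-safe Vivek describes, watching deployed predictions for shifts and keeping humans in the loop, can be sketched in a few lines. This is only an illustration, not how Google or Valar do it: the Population Stability Index (PSI) is one common drift measure, the 0.2 alert level is a rule of thumb, and the review band in `needs_human_review` is a made-up assumption.

```python
import math
from collections import Counter

def psi(reference, live, bins=10):
    """Population Stability Index between two lists of scores in [0, 1]."""
    def histogram(scores):
        counts = Counter(min(int(s * bins), bins - 1) for s in scores)
        # A small floor keeps empty bins from producing log(0).
        return [max(counts.get(b, 0) / len(scores), 1e-6) for b in range(bins)]
    ref, cur = histogram(reference), histogram(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

def needs_human_review(score, low=0.35, high=0.65):
    """Hypothetical rule: route uncertain predictions to a clinician."""
    return low <= score <= high

# Scores from validation vs. scores seen at a new site (synthetic data).
validation_scores = [i / 100 for i in range(100)]
new_site_scores = [0.5 + i / 200 for i in range(100)]

drift = psi(validation_scores, new_site_scores)
if drift > 0.2:  # rule-of-thumb alert level, an assumption
    print(f"Prediction distribution shifted (PSI={drift:.2f}); investigate.")
```

A check like this only flags that something changed; deciding why, and what to do about it, still needs people who understand the workflow.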
Question 3: Hype?
Vivek:
Vivek’s honest answer is to ignore overhyped products. If something’s really worth incorporating into your workflow, it’ll be around after six months or longer.
Question 4 from Stanford faculty: financial incentives are not aligned + how do we bring expensive technology to the communities that need it the most?
Vivek:
Vivek believes that there is rapid digitization happening in certain places around the world, and people’s health records are now becoming easier to access. Data is coming online very rapidly. He also believes that healthcare is only one application of this technology, and there are thousands more. The cost of training these models is amortized over so many different applications, so he’s less worried about the financial incentives given how much value will be created.
Second, as you train these models at scale, they become more and more robust and more and more data efficient. The amount of data you need to adapt these models to new situations or fine tune them is also rapidly decreasing, and so is the amount of work.
Viswesh:
Viswesh only has two things to say: software’s eating the world and technology’s deflationary. These two theses take care of all of this.
Question 5: Specialties
Viswesh:
Even within a specialization, there are hundreds of tasks. Previously, AI wasn’t generalizable. You had to build an AI for each individual task, and that can never scale. This is the key shift that’s happening— AI is becoming generalizable, and over time we’ll see how that affects us.
Vivek:
Vivek shares that for him and his team at Google, everything is an interdisciplinary endeavor. Whether it’s the metrics they want to optimize or how they train the models, everything is done by a combination of researchers, clinicians, and others with the proper knowledge. “It’s an interdisciplinary endeavor, and we need all of us working together… There’s still going to be a lot of value for specialist knowledge.”
Question 6: How do you ground models with local sources?
Vivek:
The first step is to get hold of evaluation sets that reflect your institution’s use cases and workflows, and then see how well your model performs on them. Even a small evaluation set goes a long way in terms of benchmarking.
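As a rough sketch of what that first step might look like, here is a tiny benchmarking loop. Everything in it is hypothetical: `toy_model` stands in for whatever model you are testing, the questions are placeholders, and exact-match scoring is the simplest possible metric, far cruder than the multi-axis evaluation described earlier.

```python
def evaluate(model, eval_set):
    """Return exact-match accuracy and the list of missed examples."""
    failures = []
    for question, expected in eval_set:
        answer = model(question)
        if answer.strip().lower() != expected.strip().lower():
            failures.append((question, expected, answer))
    accuracy = 1 - len(failures) / len(eval_set)
    return accuracy, failures

# Placeholder evaluation set reflecting a local workflow (made-up content).
local_eval_set = [
    ("Which form starts a referral?", "Form A"),
    ("Who signs off on discharge?", "Attending physician"),
    ("Where are imaging requests sent?", "Radiology desk"),
]

# Stand-in model: a lookup table that gets one question wrong.
canned_answers = {
    "Which form starts a referral?": "form a",
    "Who signs off on discharge?": "Attending physician",
    "Where are imaging requests sent?": "Front desk",
}
toy_model = lambda question: canned_answers.get(question, "unknown")

accuracy, failures = evaluate(toy_model, local_eval_set)
print(f"accuracy: {accuracy:.2f}, misses: {len(failures)}")
```

Reading through the misses is often more informative than the headline accuracy, since it shows where the model and the local workflow disagree.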
Vivek shares that as the LLM space evolves, we’re seeing retrieval-augmented language models that can make more use of local context documents as additional information. This will help LLMs become more and more practical solutions for institutions.
Vivek shares that we’re getting better and better at grounding the model responses in authoritative sources that people can trust; that’s going to help solve this domain generalization challenge.
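To make the retrieval-augmented idea concrete, here is a minimal sketch, assuming a simple bag-of-words similarity in place of the learned retrievers real systems use. The documents are made-up placeholders, and `build_prompt` is a hypothetical helper; the resulting prompt would be handed to whatever language model the institution uses.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=1):
    """Return the k local documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=1):
    """Put retrieved local context in front of the question."""
    context = "\n".join(retrieve(query, documents, k))
    return (f"Use only the context below to answer.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

# Placeholder local documents (made-up content).
local_docs = [
    "Clinic A discharge protocol: schedule follow-up visit within 7 days.",
    "Clinic A imaging protocol: chest imaging requests go through the radiology desk.",
]
query = "What is the discharge follow-up protocol?"
prompt = build_prompt(query, local_docs)
print(prompt)
```

Because the answer is tied to a retrieved document, a reviewer can check the response against a source they already trust.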
Question 7: Future multi-modal applications
Vivek:
If you look at how doctors provide care today, they’re able to interpret data from many different modalities— lab records, medical images, EHRs.
“We’ve so far been talking about language models. We obviously need to become multimodal so that you can provide the full context to the model when it starts making these consequential decisions.”
Along with these new multimodal capabilities, we’ll have new systems that can interpret this data.
Advice for people entering today’s ecosystem of deep learning:
Vivek:
The checklist: 1. Choose what you want to build. 2. Try building it. 3. See if it has traction. If it has traction, then the rest will follow.