Date: 2023-01-04

Med-PaLM is evaluated on MultiMedQA, a benchmark that combines multiple-choice questions with questions posed by medical professionals and non-professionals. The benchmark draws on six existing datasets: MedQA, MedMCQA, PubMedQA, LiveQA, MedicationQA, and MMLU. A new dataset of curated, frequently searched medical inquiries, called HealthSearchQA, was also added to round out MultiMedQA.

The HealthSearchQA dataset consists of 3375 frequently asked consumer questions. It was collected using seed medical diagnoses and their related symptoms. Med-PaLM builds on PaLM, a 540-billion-parameter LLM, and its instruction-tuned variant Flan-PaLM, both of which were evaluated using MultiMedQA.

Med-PaLM reportedly performs well, especially compared to Flan-PaLM. It has yet, however, to outperform the judgment of human medical experts. So far, a group of healthcare professionals has determined that 92.6 percent of Med-PaLM responses were in line with scientific consensus, nearly on par with clinician-generated answers (92.9 percent).

Excited to share Med-PaLM, a large language model aligned to the medical domain to generate safe and helpful answers.



Our work advances SOTA in 7 medical question-answering tasks, including achieving 67% on MedQA USMLE improving prior work by >17%. https://t.co/FSSpzATotz pic.twitter.com/B0rvtUEysV — Shek Azizi (@AziziShekoofeh) December 27, 2022

The result is a marked improvement, as only 61.9 percent of the long-form Flan-PaLM answers were deemed to be in line with doctor assessments. Meanwhile, only 5.8 percent of Med-PaLM answers were deemed to potentially contribute to negative consequences, compared to 6.5 percent of clinician-generated answers and 29.7 percent of Flan-PaLM answers. By this measure, Med-PaLM replies are much safer than Flan-PaLM's and comparable to those written by clinicians.

Other AI-based ventures

This isn’t the first time Google has ventured into AI-based healthcare. In May 2019, Google teamed up with medical researchers to train its deep learning AI to detect lung cancer in CT scans. The model performed as well as or better than trained radiologists, achieving just over 94 percent accuracy.

In May 2021, Google unveiled a diagnostic AI for skin conditions on smartphones, intended to give any smartphone owner an idea of what their diagnosis might be. The app was not meant to replace a professional dermatologist, but it was a significant step forward for AI in healthcare.