Microsoft's New AI Tool Can Diagnose Patients More Accurately Than Doctors


Microsoft researchers unveiled a new artificial intelligence (AI) system on Monday that can diagnose patients more accurately than human doctors. Dubbed the Microsoft AI Diagnostic Orchestrator (MAI-DxO), it includes multiple AI models and a framework that allows it to go through patient symptoms and history to suggest relevant tests. Based on the results, it then suggests possible diagnoses. The Redmond-based tech giant highlighted that apart from the accuracy of the diagnosis, the system is also trained to be cost-effective in terms of tests conducted.

Microsoft Develops Benchmark to Test MAI-DxO's Performance

In a post on X (formerly known as Twitter), Mustafa Suleyman, the CEO of Microsoft AI, introduced the MAI-DxO system. Calling it a “big step towards medical superintelligence,” he said the AI system can solve some of the world's toughest medical cases with higher accuracy and lower costs than traditional diagnostic methods.

MAI-DxO simulates a virtual panel of physicians with diverse diagnostic approaches who collaborate to solve medical cases, the company said in a blog post. The Orchestrator is a multi-agent system in which one agent provides a hypothesis, one picks the tests, two others provide checklists and stewardship, and the last challenges the hypothesis.


MAI-DxO workflow
Photo Credit: Microsoft

Once a hypothesis passes this panel, the AI system can ask a question, request tests, or provide a diagnosis if it judges that it has enough information. If it recommends a test, it performs a cost analysis to ensure that the overall cost remains reasonable. The system is also model-agnostic, meaning it can work with any third-party AI model.
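The three-way choice described above can be illustrated with a small sketch. This is not Microsoft's implementation: the `CaseState` fields, the `next_action` function, the confidence threshold, and the rule for falling back to a question are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseState:
    """Hypothetical snapshot of a case after the panel has deliberated."""
    confidence: float    # panel's confidence in the leading hypothesis, 0..1
    spent: float = 0.0   # cumulative cost of tests ordered so far


def next_action(state: CaseState,
                proposed_test_cost: Optional[float],
                budget: float,
                confidence_threshold: float = 0.8) -> str:
    """Pick one of 'diagnose', 'test', or 'ask' for the next turn.

    Assumed rule: diagnose once confidence is high enough; otherwise order
    the proposed test only if it keeps total spend within budget; otherwise
    fall back to asking a question, which is treated as free here.
    """
    if state.confidence >= confidence_threshold:
        return "diagnose"
    if (proposed_test_cost is not None
            and state.spent + proposed_test_cost <= budget):
        return "test"
    return "ask"


if __name__ == "__main__":
    print(next_action(CaseState(confidence=0.5, spent=450.0), 100.0, 500.0))
```

In this toy rule, the cost check only gates tests; a real system would presumably weigh how informative a test is as well as what it costs.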

Microsoft claims that the system boosts the diagnostic performance of every AI model that was tested. OpenAI's o3 fared the best, correctly solving 85.5 percent of the New England Journal of Medicine (NEJM) benchmark cases. The company said that the same cases were also given to 21 practising physicians from the US and UK, all of whom had between five and 20 years of clinical experience. The human doctors had an accuracy of 20 percent.

MAI-DxO can be configured to operate within defined cost constraints, the company said. Once an input budget has been added, the system explores cost-to-value trade-offs while making diagnostic decisions. This helps the AI system order only the necessary tests, instead of every possible test to rule out all causes of the symptoms.
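One simple way to picture a cost-to-value trade-off under a budget is a greedy selection by value per unit of cost. The sketch below is only an illustration of that general idea, not the method Microsoft describes: the `pick_tests` function, the test names, the prices, and the "expected value" scores are all invented for the example.

```python
from typing import List, Tuple

# (test name, cost, expected diagnostic value); all numbers are made up.
Candidate = Tuple[str, float, float]


def pick_tests(candidates: List[Candidate], budget: float) -> List[str]:
    """Greedily choose tests with the best value-to-cost ratio within budget.

    Assumes every cost is positive. Tests are considered from the highest
    ratio to the lowest, and any test that no longer fits is skipped.
    """
    chosen: List[str] = []
    remaining = budget
    ranked = sorted(candidates, key=lambda c: c[2] / c[1], reverse=True)
    for name, cost, _value in ranked:
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen


if __name__ == "__main__":
    options = [
        ("CBC", 50.0, 0.3),
        ("MRI", 1200.0, 0.6),
        ("Chest X-ray", 100.0, 0.4),
    ]
    print(pick_tests(options, budget=200.0))  # ['CBC', 'Chest X-ray']
```

With a budget of 200, the expensive scan is left out even though it has the highest raw value, which mirrors the article's point about not ordering every possible test.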

To assess the AI system, Microsoft also developed a new benchmark dubbed the Sequential Diagnosis Benchmark (SD Bench). Unlike typical medical benchmark tests that ask multiple-choice questions, this test assesses AI systems' ability to iteratively ask the right questions and order the right tests. Then it evaluates the answers by comparing them to the outcome published in the NEJM.
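The difference from a multiple-choice test can be sketched as a loop in which information is revealed only when the system asks for it. Everything below is a hypothetical simplification: the `run_case` function, the case dictionary layout, the turn limit, and the exact-string comparison are stand-ins, and the actual benchmark's interface and grading may differ.

```python
from typing import Callable, Dict, List, Tuple

# An agent maps the transcript so far to (kind, content), where kind is
# "ask", "test", or "diagnose". This interface is an assumption.
Agent = Callable[[List[str]], Tuple[str, str]]


def run_case(agent: Agent, case: Dict, max_turns: int = 20) -> bool:
    """Play one case turn by turn and report whether the diagnosis matched.

    `case` is assumed to hold a 'presentation' string, a 'findings' dict
    mapping each question or test to its result, and a 'final_diagnosis'.
    """
    transcript = [case["presentation"]]
    for _ in range(max_turns):
        kind, content = agent(transcript)
        if kind == "diagnose":
            # Naive normalised string match, used here only as a placeholder.
            expected = case["final_diagnosis"].strip().lower()
            return content.strip().lower() == expected
        # Reveal only what was requested; unknown requests yield nothing new.
        result = case["findings"].get(content, "not available")
        transcript.append(f"{kind}: {content} -> {result}")
    return False  # ran out of turns without committing to a diagnosis
```

A system that never commits to a diagnosis fails the case in this sketch, so asking good questions early matters as much as the final answer.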

Notably, MAI-DxO is not yet approved for clinical use and is meant as initial research into developing AI capability in diagnostics. Microsoft said that its AI system can only be approved for clinical usage after rigorous safety testing, clinical validation, and regulatory reviews.
