August 29, 2023

Hi, it's Ryan. Today's AI+ is 1,271 words, a 5-minute read.

1 big thing: ChatGPT plays doctor with 72% success

Illustration: Maura Losch/Axios

AI capabilities are advancing on the complex medical scenarios doctors face daily, but the technology remains controversial in medical communities.

The big picture: Doctors are grappling with questions about what counts as an acceptable success rate for AI-supported diagnosis and whether AI's reliability under controlled research conditions will hold up in the real world.

Driving the news: A new study from Mass General Brigham researchers testing ChatGPT's performance on textbook-drawn case studies found the AI bot achieved 72% accuracy in overall clinical decision making, ranging from identifying possible diagnoses to making final diagnoses and care decisions.

Why it matters: AI could ultimately improve both the efficiency and the accuracy of diagnosis as U.S. healthcare grows more expensive and complicated, individuals live longer and the overall population ages.

Details: The Mass General Brigham study is among the first to assess the capacity of large language models across the full scope of clinical care, rather than a single task.

  • The study "comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario" including post-diagnosis care management, the report's co-author Marc Succi, executive director at Mass General Brigham's innovation incubator, told Axios.
  • ChatGPT got the final diagnosis right 77% of the time. But in cases requiring "differential diagnosis" — an understanding of all the possible conditions a given set of symptoms might indicate — the bot's success rate dropped to 60%.
  • A second study across 171 hospitals in the U.S. and the Netherlands found that a machine learning model called ELDER-ICU succeeded at identifying the illness severity of older adults admitted to intensive care units, meaning it "can assist clinicians in identification of geriatric ICU patients who need greater or earlier attention."

Be smart: While AI has outperformed medical professionals in some specific tasks, such as cancer detection from medical imaging, many studies of the possible medical uses of AI have yet to translate into real-world practice, and some critics argue that AI studies aren't grounded in real clinical needs.

Of note: AI tests in a research setting come with no risk of malpractice lawsuits, unlike humans operating alone or with the assistance of AI in real clinical settings.

What they're saying: Succi told Axios there's more work to do to "bridge the gap from a useful machine learning model to actual use in clinical practice."

  • The value of AI assistance to doctors is clearest "in the early stages of patient care when little presenting information (is available) and a list of possible diagnoses is needed," Succi said.
  • "Large language models need to be improved in differential diagnosis before they're ready for prime time," Succi said, while noting their potential use in tasks that do not require final diagnosis, such as emergency room triage.
  • Succi said that ChatGPT is starting to exhibit the capabilities of a newly graduated doctor, but judging whether AI is adding value to a doctor's work will remain complicated.

What's next: To allow ChatGPT or comparable AI models to be deployed in hospitals, Succi said that more benchmark research and regulatory guidance is needed, and diagnostic success rates need to rise to between 80% and 90%.

2. Scoop: Zuck and Musk headed to Schumer forum

Schumer speaks during a news conference on July 26. Photo: Drew Angerer/Getty Images

The CEOs of the most powerful U.S. tech companies are heading to Capitol Hill next month for Senate Majority Leader Chuck Schumer's first AI insight forum, sources tell Axios Pro Policy's Maria Curi and Ashley Gold.

What's happening: The closed-door forum, scheduled for Sept. 13, will feature a slew of heavy hitters, including X's Elon Musk, Meta's Mark Zuckerberg, Google's Sundar Pichai, OpenAI's Sam Altman, Nvidia's Jensen Huang and Microsoft co-founder Bill Gates.

  • Microsoft CEO Satya Nadella, former Google CEO Eric Schmidt, civil society groups and unions will also attend.
  • The meeting is expected to last between two and three hours and will focus on the implications of AI, sources say.

Why it matters: Schumer's bipartisan AI Insight Forums are intended to educate lawmakers on the rapidly evolving technology and lay the groundwork for regulation.

  • Observers have expressed skepticism that the fast-moving private sector is really ready to deal with government guardrails.

A version of this story was published first on Axios Pro.

3. Google DeepMind's permanent AI labels

The SynthID watermark remains detectable even after modifications such as adding filters, changing colors or adjusting brightness. Image: Google DeepMind

Google's DeepMind unit is unveiling a new method today for invisibly and permanently labeling images that have been generated by artificial intelligence, Axios' Ina Fried reports.

Why it matters: It's become increasingly hard for people to distinguish between images made by humans and those generated by AI programs. Google and other tech giants have pledged to develop technical means to do so.

Details: Google says its "experimental launch" of the new watermark technology, dubbed SynthID, will aim to validate that the watermarks remain persistent without degrading image quality.

  • Google says the watermark is designed to remain detectable even after modifications such as adding filters, changing colors or adjusting brightness. And unlike visible watermarks, SynthID can't be removed just by cropping.
  • DeepMind will test SynthID by making it available to users of Google Cloud's Imagen text-to-image generator.

The big picture: Google was one of a number of leading AI companies that pledged to the White House in July to adopt watermarking.

  • An Adobe-led consortium has been developing a means for encoding how an image was created or captured and documenting subsequent changes.

Between the lines: It's likely that bad actors will choose AI engines that don't document or label images as AI-generated.

Separately, Google is announcing broad availability of Duet AI for Google Workspace, a generative AI tool similar in concept to the "copilot" that Microsoft is building into Office and other apps.

  • Google is also announcing updated versions of Vertex AI, its cloud-based suite of generative AI tools.

4. Workers want AI to help with burnout

Companies plan to hire more as a result of generative AI, per research from freelance platform Upwork, but workers with data science and natural-language processing skills are in short supply, Axios' Hope King reports.

Why it matters: Many workers are worried about automation and AI taking their jobs. But what they might actually want is for AI to help them do their jobs more efficiently.

Driving the news: 64% of C-suite respondents say they will hire more due to generative AI, the strongest level of agreement among the 1,400 U.S. business leaders surveyed by Upwork in May.

State of play: While 49% of people say they're worried about job displacement, 70% would actually delegate as much work as possible to AI in order to reduce their workloads, according to a separate Microsoft study.

  • 76% said they would be comfortable using AI for administrative tasks, as well as for helping them formulate ideas for their work.

What they're saying: "People are more excited about AI rescuing them from burnout than they are worried about it eliminating their jobs," organizational psychology expert Adam Grant said in the Microsoft study.

5. Training data

6. + This

Film or fire? Airline execs were unhappy with these crew members who danced on a Boeing 777's wing — but the internet loved them.

Thanks to Scott Rosenberg for editing and Bryan McBournie for copy editing this newsletter.