AI worse than humans in every way at summarising information, government trial finds

Introversion

Pie aren't squared, pie are round!
Kind Benefactor
Super Member
Registered
Joined
Apr 17, 2013
Messages
11,468
Reaction score
17,297
Location
Massachusetts

Amazon conducted the test earlier this year for Australia’s corporate regulator, the Australian Securities and Investments Commission (ASIC), using submissions made to an inquiry. The outcome of the trial was revealed in an answer to a question on notice at the Senate select committee on adopting artificial intelligence.

The trial involved testing generative AI models before selecting one to ingest five submissions from a parliamentary inquiry into audit and consultancy firms. The most promising model, Meta’s open-source model Llama2-70B, was prompted to summarise the submissions with a focus on ASIC mentions, recommendations, and references to more regulation, and to include page references and context.

Ten ASIC staff, of varying levels of seniority, were also given the same task with similar prompts. Then, a group of reviewers blindly assessed the summaries produced by both humans and AI for coherency, length, ASIC references, regulation references and for identifying recommendations. They were unaware that this exercise involved AI at all.

These reviewers overwhelmingly found that the human summaries beat out their AI competitors on every criterion and on every submission, scoring 81% on an internal rubric compared with the machine’s 47%.

 

Brigid Barry

Crazy horse person
Kind Benefactor
Super Member
Registered
Joined
Jan 22, 2012
Messages
9,886
Reaction score
16,731
Location
Maine, USA
The ONE thing it's good at: it goes through Amazon reviews to find repeated words and condenses them into a blurb. If you’re looking at pillows and everyone says "very soft" and a handful of people say it smells funny, it'll spit out "customers found this soft but some customers found it smelled funny".

Amazon has rolled out an AI assistant, Rufus (it has something to do with Alexa and cannot be removed/disabled in the app), that's supposed to be able to make recommendations and answer questions, and I can't wait to read the article about how Rufus recommends products based on how much the seller/manufacturer pays.
 

dickson

Hairy on the inside
Super Member
Registered
Joined
Mar 12, 2017
Messages
3,904
Reaction score
4,789
Location
Directly over the center of the Earth
Yet another reason for me to radically curtail the amount of money I give to Jeff Bezos.