ChatGPT Fails 6th Grade Tests in Math, Science, but Quickly Learns

OpenAI’s ChatGPT failed math and science tests that sixth-graders in Singapore take.
It made errors in simple addition and could not comprehend any diagrams, per The Straits Times.
But ChatGPT appeared to have learned from its mistakes. It got some questions right when Insider tested it.

When the viral, AI-powered ChatGPT bot was asked to solve questions from Singapore’s sixth-grade examinations, it failed miserably.

During an experiment in February, the Singaporean news outlet The Straits Times asked ChatGPT to answer questions from the Primary School Leaving Examination. The PSLE is an exam that all 12-year-olds in Singapore must take, and it determines which secondary school they attend.

ChatGPT was given questions from the PSLE’s 2020, 2021, and 2022 papers on mathematics, science, and English.

It scored an average of 16 out of 100 marks for the three mathematics papers it took, per The Straits Times. During the test, it could not understand or answer any questions that referenced diagrams or graphs, and was given zero marks for these questions.

But ChatGPT also made mistakes with simple, text-based questions. When asked for the sum of 60,000, 5,000, 400, and 3, it said the answer was 65,503, The Straits Times reported.

The correct answer is 65,403.

However, when Insider tried the same question, ChatGPT’s answer was correct.

ChatGPT fared a little better at the science papers, getting an average of 21 out of 100 marks.

But on Monday, when Insider tested ChatGPT on two PSLE science questions — one from 2020 and another from 2022 — it got both questions right.

ChatGPT passed the English tests, scoring an average of 11 out of 20 marks across the three papers it took, The Straits Times reported. It still ran into problems, this time with questions containing words that have multiple meanings.

One example The Straits Times cited was the word “value.” ChatGPT disregarded the question’s context, where “value” referred to one’s moral principles, and answered as if it meant monetary value.

ChatGPT was developed by the artificial intelligence company OpenAI and launched in November 2022. It had 100 million users by the end of January 2023.

The bot’s inability to pass Singapore’s sixth-grade exams is surprising — it managed to pass a final exam at the Wharton business school, passed tests in four law school courses, and comfortably cleared a US medical licensing exam.

Universities are now revamping examinations over concerns that AI bots could be used for cheating, The New York Times reported in January. This pivot in testing involves more oral exams, group work, and handwritten assessments instead of typed submissions, per The Times.

Representatives at OpenAI and the Ministry of Education in Singapore did not immediately respond to Insider’s request for comment.
