
Is ChatGPT running out of steam? AI chatbot's performance raises burnout concerns



After taking the tech world by storm and almost single-handedly starting an AI arms race among big tech companies, OpenAI’s ChatGPT appears to be experiencing fluctuations in performance, raising questions about potential burnout. 

What Happened: A study conducted by researchers at Stanford University delved into the performance of ChatGPT over several months, focusing on four diverse tasks: solving math problems, answering sensitive questions, generating software code, and visual reasoning. 

The study revealed wild fluctuations, referred to as drift, in the chatbot’s ability to perform these tasks. Benzinga reviewed the research and highlighted the study’s key findings.

GPT-3.5 Vs. GPT-4

The study compared two versions of OpenAI’s AI-powered platform: GPT-3.5 and GPT-4. Surprisingly, GPT-4’s performance on math problems declined significantly in just three months, between March and June.

In March, the model correctly identified that 17077 is a prime number 97.6% of the time, but by June, its accuracy plummeted to a mere 2.4%. Conversely, GPT-3.5 demonstrated an almost opposite trajectory, with the March version answering correctly only 7.4% of the time and the June version consistently right 86.8% of the time.
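For reference, the question the researchers posed is easy to verify deterministically with a few lines of code. The sketch below (the function name and structure are illustrative, not taken from the study) confirms that 17077 is indeed prime:

```python
def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2  # only odd candidates need checking
    return True

print(is_prime(17077))  # prints True
```

Trial division is sufficient here because 17077 is a five-digit number, so only divisors up to about 131 need to be checked.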

The Black Box Dilemma

Since OpenAI decided against making its code open source, researchers and the public have little visibility into the changes made to the neural architectures or the training data, making it difficult to understand the hidden complexities behind these fluctuations. 

ChatGPT’s Fading Explanation Skills

Alongside the declining performance, ChatGPT’s willingness to explain its reasoning has also diminished. The study noted that in March the chatbot provided step-by-step reasoning for certain questions, but by June it had stopped doing so, with no clear explanation.

Not Sure Of The Exact Reason

James Zou, a Stanford computer science professor and study author, highlighted the unintended consequences of tweaking large language models. Adjustments aimed at improving performance on specific tasks can harm performance on others due to complex interdependencies in the model’s responses, which remain poorly understood because of the model’s closed-source nature, Fortune reported.

Why It’s Important: Last month, it was reported that ChatGPT experienced a summer slump, with a surprising 9.7% decrease in website traffic in June compared to May, raising concerns about its sustained popularity.

Unique visitors also dropped by 5.7%, and time spent on the site declined by 8.5%, hinting at a possible decline in user engagement. Some experts suggest that the initial novelty of ChatGPT may be wearing off, while the launch of the iOS app in May could have also diverted traffic to the more convenient mobile application.

This story was produced by Benzinga and reviewed and distributed by Stacker Media.

