Tech News
Gemini’s data-analyzing abilities aren’t as good as Google claims
Google’s flagship generative AI models, Gemini 1.5 Pro and 1.5 Flash, are touted for their ability to process large amounts of data. However, recent research suggests that these models may not be as effective as claimed.
Two separate studies found that Gemini 1.5 Pro and 1.5 Flash struggle to accurately answer questions about large datasets, with correct answers given only 40-50% of the time in some tests.
Researchers have observed that while these models can technically process long contexts, they may not truly understand the content they are analyzing.
Gemini’s context window is lacking
Gemini’s context window refers to the amount of input data it considers before generating output. The latest versions of Gemini can handle up to 2 million tokens as context, the largest of any commercially available model.
Despite demonstrations highlighting Gemini’s long-context capabilities, research has shown that the models struggle to accurately evaluate true/false statements about complex works of fiction.
In another study, Gemini 1.5 Flash performed poorly in reasoning tasks involving video content, suggesting limitations in its ability to understand and analyze visual data.
Google is overpromising with Gemini
Although the studies have not been peer-reviewed and tested older versions of the Gemini models, they raise concerns about Google's claims regarding the capabilities of these systems. Notably, other models tested in the same studies performed poorly on similar tasks as well.
Google is the only model provider to make the context window a centerpiece of its advertising, according to Saxon, one of the researchers behind the studies. While a model's technical specifications matter, the real question is how useful the models are in practical applications.
Generative AI is facing increased scrutiny as businesses and investors question its limitations. Recent surveys show that many C-suite executives doubt generative AI’s ability to boost productivity and are concerned about potential errors and data breaches. Deal-making in generative AI has also declined in recent quarters.
Customers are seeking genuinely useful solutions amid chatbots that provide inaccurate information and AI search platforms that generate plagiarized content. Google tried to differentiate itself with Gemini's long-context feature, but that bet may have been premature.
There is a lack of transparency in how models handle long context processing, making it difficult to verify claims of reasoning and understanding. Without standardized benchmarks and third-party evaluation, it is challenging to assess the true capabilities of generative AI models.
Google did not provide a comment on these issues. Saxon and Karpinska suggest that better benchmarks and more independent scrutiny are needed to address exaggerated claims about generative AI. Current benchmarking methods may not accurately measure a model’s ability to answer complex questions, leading to potential misconceptions about their capabilities.