Google’s Gemini AI has again faced criticism after researchers compared it with GPT-3.5 Turbo. According to a paper published on arXiv.org, an in-depth look at Gemini’s language abilities found that the Pro model achieved unexpectedly lower accuracy than OpenAI’s older GPT-3.5 Turbo. The paper also compares performance across several large language models (LLMs), and the researchers found that Gemini Pro performed worse on most tasks than the model behind the free version of ChatGPT.

Researchers Discover That Google Gemini Isn’t Even Close To GPT-3.5 Turbo In Quality

According to the research team at Carnegie Mellon, paying ChatGPT Plus subscribers, who have access to GPT-4 and GPT-4V, get a better experience than Gemini users. The team also suggested that Google’s researchers need to do more work to get Gemini to perform well and give users accurate answers. Google researchers responded to the Carnegie Mellon team, saying that they found Gemini Pro to be much better than GPT-3.5, and promised to launch a more powerful version, Gemini Ultra, in 2024.

The research team tested several LLMs: Google’s Gemini Pro, OpenAI’s GPT-3.5 Turbo and GPT-4 Turbo, and Mixtral 8x7B from the recently founded startup Mistral AI. The researchers used LiteLLM, an AI aggregator, to conduct the tests. Over four days, they ran each model on a range of prompts, including multiple-choice questions and questions from the humanities and social sciences, to test knowledge-based question answering (QA).

In multiple-choice questions with options labeled A, B, C, and D, Gemini chose option D more often than the other options, apparently without reasoning its way to it. OpenAI’s GPT-3.5 and GPT-4 took time to answer but chose the right option most of the time. According to the researchers, Gemini was not tuned to answer multiple-choice questions as well as GPT-3.5 Turbo, and it fared even worse when compared with GPT-4.
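A bias toward one answer label, like the one described above, can be checked by comparing how often a model picks each option with how often that option is actually correct. The following Python snippet is a minimal sketch of that idea. It is not the researchers' evaluation code, and the function names and the sample answers are hypothetical, made up purely for illustration.

```python
from collections import Counter


def choice_distribution(answers, labels="ABCD"):
    """Return the fraction of answers that fall on each option label."""
    if not answers:
        raise ValueError("answers must not be empty")
    counts = Counter(answers)
    return {label: counts.get(label, 0) / len(answers) for label in labels}


def accuracy(predicted, gold):
    """Return the fraction of predictions that match the gold answers."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical data: the correct answers are spread evenly over A-D,
# while the (made-up) model leans heavily toward option D.
gold = ["A", "B", "C", "D", "A", "B", "C", "D"]
predicted = ["D", "B", "D", "D", "A", "D", "C", "D"]

print("gold distribution:     ", choice_distribution(gold))
print("predicted distribution:", choice_distribution(predicted))
print("accuracy:              ", accuracy(predicted, gold))
```

If the predicted distribution puts far more weight on one label than the gold distribution does, as happens with D in this made-up sample, that points to a position bias rather than a gap in knowledge alone.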

The researchers also found that Gemini was worse than GPT-3.5 in knowledge-based QA. The team asked all the LLMs questions across various categories, including human sexuality, professional medicine, formal logic, and mathematics. Gemini behaved oddly here: it refused to answer some questions, saying that it would not comply with the instruction due to content restrictions.

The Outcome Of The LLM Test

The researchers wrote in their report that Gemini underperformed GPT-3.5 in both tests. They also concluded that GPT-4 still dominates across the tests, which makes it the most accurate LLM at this time. They suggested that Google’s research team do more work on Gemini and improve how it selects answers.

Countering this analysis, Google’s research team claims that Gemini Pro is better than GPT-3.5 Turbo. They also promised to launch Gemini Ultra next year, which they say will be more powerful than GPT-4. They are working day and night to improve their LLM so that it answers questions more accurately.

Meanwhile, Gemini’s performance was clearly below average, which raises questions about Google’s AI ambitions. For now, Google needs to work hard to compete with OpenAI in the generative AI race. According to sources, Google will not be able to launch Gemini Ultra even next year, which would push the company further behind in the AI race.

Overall, OpenAI’s ChatGPT is still the king of the generative AI field and one of the best choices for consumers and enterprises. Some AI influencers, such as Ethan Mollick, a professor at the University of Pennsylvania, agreed with the research report.
