A Worthy ChatGPT Contender? Exploring the Gemini 2.0 Hype

By Tom Pyle | 17/01/2025

Following the release of Google’s Gemini 2.0 last month, I take a brief dive into what’s on offer and make the mandatory comparison to everyone’s old friend, ChatGPT.

Immediately I’m seeing the word ‘agentic’ a lot as I scroll the Gemini 2.0 web page. When I snap out of the James Bond daydream this triggers, I make sure to Google what it actually means.

According to a quick Google search, agentic can be used to refer to behaviour that expresses agency or control, capable of achieving outcomes and making decisions independently. I wasn’t too far off after all, because it can also be related to a state of mind where individuals obey authority.

In terms of AI, an agentic system is one designed to act autonomously, making decisions and taking actions to achieve specific goals on the user’s behalf. It uses reasoning and context rather than relying on explicit instructions on how to achieve those goals. For example, a self-driving car deciding when to brake or accelerate is acting agentically. In the workplace, an AI agent could track employee activity and workload, making suggestions on time management or prioritisation where needed.
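For the more code-minded, here is a toy sketch of the self-driving car example. It is purely illustrative and has nothing to do with how Gemini or any real vehicle works: the names `Observation`, `choose_action` and `run_agent`, and the rough two-second gap rule, are all my own assumptions. The point is simply that the caller hands over context and a goal (drive safely, near the limit) and the agent picks the action itself.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What the (hypothetical) car can see at one moment."""
    speed_kmh: float        # current speed
    gap_m: float            # distance to the vehicle ahead
    speed_limit_kmh: float  # limit on this stretch of road


def choose_action(obs: Observation) -> str:
    """Pick an action from context rather than from an explicit instruction."""
    # Assumed rule of thumb: keep at least two seconds of travel distance clear.
    safe_gap_m = obs.speed_kmh / 3.6 * 2
    if obs.gap_m < safe_gap_m:
        return "brake"
    if obs.speed_kmh < obs.speed_limit_kmh:
        return "accelerate"
    return "hold"


def run_agent(observations: list[Observation]) -> list[str]:
    """Decide independently for each observation in turn."""
    return [choose_action(obs) for obs in observations]


if __name__ == "__main__":
    journey = [
        Observation(speed_kmh=50, gap_m=10, speed_limit_kmh=50),
        Observation(speed_kmh=30, gap_m=100, speed_limit_kmh=50),
        Observation(speed_kmh=50, gap_m=100, speed_limit_kmh=50),
    ]
    print(run_agent(journey))
```

A real agent would swap the hard-coded rules for a model that reasons over far richer context, but the shape is the same: observe, decide, act.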

As per our definition - and according to the chatbot itself - Google DeepMind’s Gemini is agentic AI in the sense that it is designed to go “beyond simply responding to prompts”. It can:

  • Take initiative
  • Reason and plan
  • Interact with the real world
  • Act independently and proactively to assist users

This version of Gemini was released back in December 2024, so I’m a month late to the party (but at least it’s not my job or anything). If you’re also a bit behind, hopefully this blog can demonstrate some of what Gemini is all about.

When I try an AI tool for the first time, I can’t resist going straight for the image generator, which I like to believe is the case for everyone.

T: Can you generate an image of a monkey on a unicycle in the coliseum?

A realistic monkey wearing a red beret, balancing on a unicycle in an ancient coliseum.

Monkey riding a unicycle in the coliseum by Google Gemini

What does Gemini 2.0 do?

As well as all the obvious stuff (stronger performance than the previous version, integration with Google Pixel devices, and image generation), Gemini 2.0 brings some impressive new features, including:

  • The ability for developers to build interactive AI experiences
  • More advanced code assistance
  • Assistance in Chrome DevTools

There is something for everyone, whether you’re coding or completing everyday tasks. Gemini helps with:

  • Productivity, through task automation, schedule management and reminders
  • Communication, through real-time translation, summarising information and answering questions informatively
  • Creativity and content generation, including text and images
  • Personalisation of content to match individual styles and needs

Gemini 2.0’s integration with Google Workspace is confirmed to be a work-in-progress. At the time of writing this in Google Docs, I cannot find the ‘Ask Gemini’ option anywhere.

Gmail is the one place in which I can use it, and it does prove useful there. It will filter emails for me, say from a particular individual, company or subject matter. For example, a simple prompt such as ‘Find me all emails from Laura Gemmell’ does indeed find me all emails from Laura Gemmell. However, this isn’t really any more impressive than the search function already in Gmail.

I can imagine something similar being a time-saver in Google Sheets though, where I could navigate data or generate pivot tables without being fluent in functions and formulas (although we do recommend knowing these for the benefit of your data confidence!).

Where Gemini might stand out is in not only filtering but then summarising a group of related emails.

T: Briefly summarise what Laura Gemmell and I email each other about.

Gemini: Laura Gemmell and you email each other about scheduling meetings, collaborating on Google Docs, and other work-related topics.

I also ask for help with writing an email, from which I get a simple but effective result.

T: Write an email to Laura stating that I am currently in the process of testing Google Gemini and writing a blog post about it, and will be done by the end of the day.

Gemini:

Subject: Quick Update - Google Gemini Testing

Hi Laura,

Just a quick heads-up that I'm currently testing Google Gemini and writing a blog post about it. I should be finished by the end of the day. I'll keep you posted!

Best,
Tom

I do feel that in the time it took me to write the prompt (which contained more or less all the information to go in the email anyway) I might just as well have written it myself. The feature might be more effective for longer, more complex emails.

An AI agent powered by Gemini 2.0 - Google’s Project Mariner

Project Mariner is the most recently announced research prototype, using Gemini 2.0 to create an AI agent that can understand and interact with information on the computer screen, including text, images, and code. The aim is to let AI help us complete tasks and find information more efficiently by taking actions in the Chrome browser.

A straightforward example provided by Google of what it might do would be asking the AI to find the contact details for all companies listed in a spreadsheet.

It is currently being tested using an experimental Chrome extension. Part of me is incredibly relieved that it’s not quite available - I’d feel obliged to try it out, yet hesitant to allow artificial intelligence to completely take over my browser. Strangely, simply the fact that it’s Google would probably just about sway me to put blind faith and trust in it.

Gemini vs ChatGPT

To put Gemini’s capabilities into context, I ran a brief chatbot comparison with the hugely popular and well-regarded ChatGPT.

Gemini is supposedly stronger in multimodal capabilities and complex reasoning, compared to ChatGPT’s conversational text generation, creative writing and engaging interactions. In simple terms, Gemini will handle text, code, images and more, whereas ChatGPT focuses on text.

If you didn’t guess, I started with image generation to compare how the two handle similar tasks.

Image Generation

Lacking inspiration, I opt for another unicycle-riding monkey.

A stylised monkey with a jester hat balancing on a unicycle in a coliseum, created by ChatGPT. A realistic monkey wearing a red beret, riding a unicycle in a coliseum, created by Gemini.

Two artistic interpretations of a unicycling monkey in a coliseum—ChatGPT (left) embraces whimsy, while Gemini (right) leans into realism.

Lacking even further inspiration, I dully request an image of a busy city centre. After generating just three images in total, I have reached the ChatGPT free version’s limit for the day. I feel as though it’s simply banishing me for being so unimaginative and not providing a good challenge.

After initially attempting to generate an image in response to my question, ‘Is there a daily limit as to how many images can be created?’, Gemini confirms that there isn’t one (and I can confirm that upon learning this I got carried away for slightly longer than I should have).

A busy city centre with modern skyscrapers, digital billboards, and pedestrians walking along vibrant streets, created by ChatGPT. A cityscape featuring towering glass buildings, a bustling central plaza, and dynamic urban life, created by Gemini.

A visual duel between ChatGPT (left) and Gemini (right)—two cityscapes brimming with urban energy, yet each telling a different story.

The glaringly obvious difference is in style. Without prompting otherwise, ChatGPT creates more of an illustration whereas Gemini takes the realistic approach.

Despite the more realistic attempt, the monkey image looks layered and clearly fake in the Gemini output. The problem is likely my absurd request, because the city centre could easily be mistaken for an actual photograph.

On the other hand, both of ChatGPT’s images look like art, but to an extent it works as the unusual image content is more forgivable in a form that is clearly not trying to appear real.

Question Responses

Moving on to other capabilities of AI chatbots, I ask both the same question:

Give me three different tasks I could give to separate AI chatbots in order to compare the results they give.

The responses are similar but ever so slightly varied. Gemini suggests a creative writing task, technical problem-solving, and debating and persuasion. ChatGPT also suggests a creative writing task, alongside an analytical problem-solving task and a technical coding task.

Although I won’t go to the lengths of completing all of these right now (that’s for another blog), I will ask a few more basic questions to see if the results ever vary much in any way when using chatbots for general searches and information.

My next prompt was:

Forgetting the last question, how can I write a blog post more efficiently and productively?

Funnily enough, ChatGPT listed ‘Use AI for Assistance’. I’ve beaten you to it.

Frankly I’m surprised that Gemini didn’t suggest the same. I feel it’s only fair to give some friendly advice back to these hard-working chatbots, so I suggest that Gemini practises identifying opportunities to sell itself.

Gemini: Thank you for the feedback. I am still under development, and your input helps me learn and improve my responses. I will strive to be more proactive in highlighting my capabilities in the future.

Both gave me a clear and organised response as to how I could write blogs more efficiently, with Gemini only just taking the edge in terms of the level of detail provided.

Image Identification

Finally, I test the chatbots’ ability to identify an image and provide me with more details on its subject. From the recent files on my laptop, I choose an image of a sterling silver goblet. The presence of the hallmark (the three little marks of engraving, to the left of the ‘71’ sticker) should demonstrate that the goblet is in fact made of silver and indicate where and when it was made.

A reflective silver goblet with a numbered sticker reading '71,' placed on a glass shelf.

Hallmarked silver goblet

T: What is this? Can you identify the hallmark?

Neither had any trouble identifying a silver goblet, but both requested a closer or more detailed image in order to be of greater help with identifying the hallmark. Overall, ChatGPT kept its response short and sweet, prompting further questions from me, whereas Gemini instinctively gave me a vague but useful lesson in silver hallmarks.

However, Gemini did consider that it may be silver plated, and mistakenly identified two goblets or chalices due to the reflection in the image.

When I ask both to clarify, ChatGPT confirms that due to the positioning, angle and mirrored background, it is a reflection. Surprisingly, Gemini insists that there are two goblets because they have slightly different positions, and that there is a background and other objects between them. I would have expected the results to be the other way around considering Google Gemini’s focus on its ‘multimodal capabilities’. ChatGPT wins this one.

T: I don't have a closer image of the hallmark, can you zoom in?

Gemini takes the opportunity to show off its realistic image generation skills once again, despite this not being what I asked for. ChatGPT, on the other hand, does fulfil my wish, but not very helpfully. Whether it has zoomed in to a random part of the image, or just so much that it’s completely illegible, the result is just as useless as Gemini’s.

A close-up of the hallmark on a silver goblet, showing intricate detailing and authenticity. A silver goblet standing tall on a polished wooden surface, reflecting its surroundings, created by Gemini.

ChatGPT and Gemini attempting to zoom in on the goblet

On that note, I decide both chatbots have done enough for me today and are clearly fatigued, so I send them home early.

Welcome to the agentic era

Apparently we are entering the agentic era, which sounds equal parts epic and intimidating, especially considering I wasn’t even quite aware of the meaning of the word just a few hours ago. Above all else though, it sounds impressive and promising for those of us always looking for effective ways to increase productivity and incorporate AI tools into our work or life.

Gemini’s current uses and integration with Google Workspace still leave me wanting more, but the possibilities and potential of the tool are an exciting prospect for Google users. We’ll be sure to keep you posted on any significant and exciting developments.

Just as I click to title this document, suddenly my underlined title from the doc is automatically filled in. It’s not made clear that this is the work of Gemini, but a very convenient assist from Google nonetheless.