Gemini GEMS: How to create assistants with default tools (Canvas, Deep Research, Nanobanana)

There is one thing that fascinates me (and makes me a little angry at the same time): how easily we get used to doing pointless tasks “because it has always been done that way.” Opening an assistant and having to tell it every time “hey, search the internet”, “hey, use Canvas”, “hey, generate images”… is like walking into your own house and asking permission to turn on the light.

Gemini’s huge GEMS update comes to tell you: “Cristina, stop begging the robot.” Because now you can configure an assistant so that it starts out with the right tools already switched on. And it’s not a small change: it’s the kind of improvement that turns a “chat” into a coworker with judgment… or at least with clear instructions.

The problem from before (and why it was taking up your time)

Until recently, opening a GEM was like hiring a brilliant intern… who forgets every morning what he has to do. You opened it, and it was time to repeat the same old song:

“Search the internet.”
“Activate Canvas.”
“Generate images.”
“Do deep research.”

And of course, if you are the only one using it, that’s okay: a little tedious, but you can live with it. The real drama comes when you share it with your team or with a client. Because each person uses it differently. And the assistant stops being “a system” and becomes a lottery.

The update: a GEM with default tools (finally)

Now, when creating a new GEM, you can link tools from the chatbot itself and leave them activated by default. This includes, for example:

Canvas (to generate/edit/preview code and texts)
Deep Research
Nanobanana (image/infographic generation)
Dynamic connection with documents (Drive/Sheets/Docs “live”)

The best part: creating each assistant can take you less than 2 minutes. And yes, GEMS is still a free feature.

Example 1: A GEM with Canvas that builds web/app components aligned to your brand

This is one of my favorites because it’s the kind of thing that, if you do it right, saves you hours… and if you do it wrong, it leaves you with a Frankenstein website.

The idea

Create a GEM that generates visual “assets” for a website or app (sliders, carousels, buttons, blog tiles…), but with a non-negotiable condition: that it respects 100% of the brand identity.

The trick that makes it work

You upload brand documentation as knowledge: color palette, fonts, size hierarchies, rules of use. And you configure the GEM so that its default tool is Canvas, because Canvas is not just for writing nice text: it also generates, edits and previews code.

What it can generate (real examples from the video)

Horizontal scroll-controlled slider:
  4 slides
  Background image + centered text (title and subtitle)
  Color layer with opacity, for readability without overloading the image
  Invisible scroll
  Automatically generated dark mode

Blog post carousel:
  3 columns
  Featured image
  Title appears on hover
  Consistent light/dark version

Button with hover:
  Slight “lift” animation
  Positive/negative color swap
  Colors invert logically between light and dark mode (not randomly)
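To make the button case less abstract, here is a minimal sketch of the kind of CSS such a GEM might produce from brand tokens. It is not the output from the video: the color values, the `BRAND` dictionary, the `.btn` class and the `button_css` helper are all hypothetical, and I am writing it in Python only to show the logic (two brand colors, swapped on hover, with their roles reversed in dark mode).

```python
# Hypothetical brand tokens. In a real GEM these would come from the brand
# documentation uploaded as knowledge, not from a hard-coded dictionary.
BRAND = {"primary": "#0b3d91", "on_primary": "#ffffff"}


def button_css(tokens: dict) -> str:
    """Return CSS for a button that swaps colors on hover and inverts in dark mode."""
    bg, fg = tokens["primary"], tokens["on_primary"]
    lines = [
        ".btn {",
        f"  --btn-bg: {bg};",
        f"  --btn-fg: {fg};",
        "  background: var(--btn-bg);",
        "  color: var(--btn-fg);",
        "  transition: transform 0.2s ease, background 0.2s ease, color 0.2s ease;",
        "}",
        "/* Hover: slight lift plus positive/negative color swap */",
        ".btn:hover {",
        "  background: var(--btn-fg);",
        "  color: var(--btn-bg);",
        "  transform: translateY(-2px);",
        "}",
        "/* Dark mode: the same two brand colors, with roles reversed */",
        "@media (prefers-color-scheme: dark) {",
        "  .btn {",
        f"    --btn-bg: {fg};",
        f"    --btn-fg: {bg};",
        "  }",
        "}",
    ]
    return "\n".join(lines)


print(button_css(BRAND))
```

The point of the sketch is that dark mode is derived from the same two tokens instead of being invented separately, which is what keeps the inversion “logical” rather than random.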

How to configure (simple structure)

Element | What you put | Why it matters
--- | --- | ---
Name and description | Something specific (“CSS Assets for Lucit”) | Keeps it from being used “for everything” and ending up useless
Instructions | Detailed: how it should think, what to prioritize, what to deliver | You set the standard (accessibility, responsive design, visual coherence)
Knowledge | Markdown with fonts, colors, rules, examples | The brand stops being an “opinion” and becomes a system
Default tool | Canvas | So that it produces code with a preview that you can download

Example 2: A GEM with Nanobanana that converts transcripts into notes + infographics

This case is pure gold if you do training, create educational content or run an academy: you feed in the transcript of a live session or video and you get ready-made teaching material.

Input and output

Input: transcript (even if it is chaotic, with fillers, interruptions… real life)
Output: teaching text (structured notes) + interspersed infographics that illustrate what has just been explained

The important insight: alternating text and image in the same answer

It’s not “make me 5 separate images”. It is: text → infographic → text → infographic. This maintains coherence and rhythm, and reinforces learning. In the example in the video, the “pillars of your personal brand” appear (mission, vision, values, value proposition, tone of communication) and right after that an infographic with those same elements.
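If it helps to see the pattern written down, this is a tiny sketch of the alternating structure. It does not call Gemini or Nanobanana; the `Section` class, the `interleave` function and the block labels are names I made up, and the infographic is only a placeholder string standing in for the generated image.

```python
from dataclasses import dataclass


@dataclass
class Section:
    title: str
    notes: str


def interleave(sections: list) -> list:
    """Return blocks in the order text, infographic, text, infographic..."""
    blocks = []
    for section in sections:
        blocks.append(f"TEXT: {section.title}\n{section.notes}")
        # Placeholder only: a real GEM would generate the image at this point.
        blocks.append(f"INFOGRAPHIC: visual summary of '{section.title}'")
    return blocks


sections = [
    Section(
        "Pillars of your personal brand",
        "Mission, vision, values, value proposition, tone of communication.",
    ),
    # Hypothetical second section, just to show the alternation continuing.
    Section("Next topic from the transcript", "Structured notes go here."),
]

for block in interleave(sections):
    print(block)
    print()
```

Each infographic sits right after the text it illustrates, which is the whole idea: the image summarizes what was just explained instead of arriving as a separate batch at the end.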

Minimal configuration (and here comes the beauty)

There is no need to build a NASA-grade setup. This GEM can work without additional knowledge: just a generous instruction and Nanobanana as the default tool.

Powerful extra: persistent visual identity

By uploading reference photos into the GEM knowledge, you can generate consistent images of a person without having to say “I’ve attached photos” every time. For personal brands, this is huge: visual consistency and speed. Basically, the assistant already “knows who you are.”

Example 3: A GEM with Deep Research for deep reports with pinned sources

Here we enter “serious” mode: real research, with a controlled corpus. Instead of having the assistant “look around,” you give it a locked library and tell it to look only in there.

What you load

Dense documents: reports from consulting firms, Stanford’s annual AI report (hundreds of pages), papers on productivity, European regulation (AI Act), Pew Research, World Economic Forum… In short, the kind of reading where you say “I’ll read it on the weekend” and then three months go by.

What do you get?

Thematic reports (e.g. impact of AI in education)
An implicit research plan (you can tell that it is drawing on the loaded material)
Exportable document (in the example, an 11-page result)

And here is the strategic key: pin the sources. This is editorial control. Quality control. And yes: peace of mind.

Example 4: A GEM dynamically connected to Drive/Sheets with live data

This is the classic “nobody talks about this, and then it changes your business.” You connect a GEM to Google Drive documents (Sheets/Docs) so that it consults the data every time you ask it. It’s not a file uploaded once and forgotten: it’s a dynamic connection.

Simple (but devastating) demonstration

A GEM for a “bike shop” connected to a sheet with orders. Question: “How much has Hugo López spent?” Answer: €30. You change the cell in the Sheet (you update the units, and the total becomes €75). You ask again. Answer: €75.
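The behavior in that demonstration can be sketched in a few lines. This is only an illustration of “read the data at question time”, not how Gemini connects to Sheets: the list of dictionaries stands in for the sheet, and the column names, the unit price of €15, the quantities and the second customer are hypothetical values chosen so that the totals match the €30 and €75 from the demo.

```python
# Stand-in for the connected Sheet. Only the customer name "Hugo López" and
# the totals (30 and 75 euros) come from the demo; everything else is assumed.
orders = [
    {"customer": "Hugo López", "units": 2, "unit_price": 15},
    {"customer": "Another customer", "units": 1, "unit_price": 40},
]


def total_spent(rows, customer):
    """Sum units * unit_price for one customer, reading the rows at call time."""
    return sum(
        row["units"] * row["unit_price"]
        for row in rows
        if row["customer"] == customer
    )


print(total_spent(orders, "Hugo López"))  # 30

# Someone edits the cell in the sheet: the units go from 2 to 5.
orders[0]["units"] = 5

print(total_spent(orders, "Hugo López"))  # 75
```

Nothing is cached between the two questions, so the second answer reflects the edit. A file uploaded once would behave like a copy of `orders` taken before the change and would keep answering €30.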

That means the GEM can be:

a sales consultant who looks at your history
an “operations assistant” who reviews stock/prices
a collaborative document reader (“what did the team update today?”)

My conclusion (from person to person): this is not about “AI”, it is about designing systems

The real leap is not “Gemini does more things.” It is that you can now create assistants that behave as they should from minute one. With default tools, well-loaded knowledge and a clear objective, a GEM stops being a toy and becomes a repeatable process.

And if you take away only one idea, let it be this: a good GEM is not the one that answers beautifully; it is the one that works just as well when you are not there to correct it. That’s where the magic begins. And that, finally, is where its usefulness multiplies.