[PAID] 🧠 Gemini Extension to interact with the Gemini-pro model from Google

Black_Knight · June 28, 2025, 11:55pm

This is an example of how to use Gemini Vesion API with thermer chat extension

See the video :

The blocks :

Black_Knight · August 15, 2025, 4:22pm

New Update _{_gemini.aix}

Two old blocks of "single block response & streaming response" merged into this single block with new argument of isStreaming: boolean

blocks3338×646 212 KB

Two blocks were added to make it easy and fast to ask the model one single question without continuous chat, unlike the previous function GenerateGeminiContent block that enables you to create continuous chat :

blocks900×268 31.9 KB

blocks (1)1236×320 43 KB

sidrobo · August 18, 2025, 2:47pm

Can you please provide all features in one message

Black_Knight · August 18, 2025, 3:06pm

All Gemini extension features

Text Generation :
- Simple Chat : Provides a basic Ask function for simple text-in, text-out conversations.
- Advanced Generation : A powerful GenerateGeminiContent function that supports both single-turn and multi-turn conversations, system instructions, and optional tools.
- Streaming : Offers streaming versions of all major generation functions (StreamGenerateGeminiContent, StreamGenerateGroundedContent, etc.) that provide the response in real-time chunks.
Image Understanding (Vision) :
- Simple Image Queries : An AskWithImage block to ask questions about a single image.
- Multi-Modal Analysis : The ability to send multiple images, videos, audio files, PDFs, and text in a single prompt for comprehensive analysis.
- Multiple Input Sources : Accepts files from local paths, Base64 encoded strings, content URIs, and public URLs (for PDFs and images).
- YouTube Video Analysis : Can analyze content directly from a public YouTube URL (including Shorts) when provided with a prompt.
Image Generation & Editing :
- Text-to-Image : GenerateImage function to create an image from a text description.
- Image Editing : EditImage and EditImageFromPath functions to modify an existing image based on a text prompt.
- Multi-Image Editing : EditMultipleImagesSimple function to process a prompt against a list of images from various sources (URLs, paths, Base64).
Audio Understanding :
- Can process local audio files (e.g., MP3, WAV) as part of a prompt to be analyzed by the model.
Video Understanding :
- Can process local video files as part of a prompt, allowing the AI to analyze the video's content frame-by-frame.
Text-to-Speech (TTS) :
- Single Speaker : GenerateSingleSpeakerSpeech function to convert text into speech using a specified prebuilt voice.
- Multi-Speaker : GenerateMultiSpeakerSpeech function to create dialogue with multiple distinct voices from a structured script.

Advanced Features & Tools

Structured Output (JSON) :
- Users can provide a JSON Schema to force the model to return its answer in a structured JSON format, making it easy to parse and use data in the app. This is supported by multiple functions.
- Includes a CreateJsonSchema helper block to easily build the required schema.
Google Search Grounding :
- The StreamGenerateGroundedContent function can be enabled to have the model perform a Google search to ground its response in real-world information, providing source links for its claims.
Code Execution :
- The model can be given the ability to generate and execute code (like Python) to solve complex problems, with the results returned in the response.
File API Integration :
- Efficient File Uploads : Includes robust functions to upload large files (like videos) directly to Google's servers. This is highly efficient as the file is processed on the server and referenced by a URI, avoiding the need to send the full file with every request.
- File Management : Provides blocks to get detailed metadata (UploadFileAndGetMetadata) and the direct download link (GetFileContentUri) for uploaded files.
- Reusability : Uploaded files can be reused in multiple API calls by referencing their URI.

Events and Callbacks

The extension is event-driven, providing specific events to handle different outcomes:

General Responses : RespondedToGemini (for single responses), GotGeminiStream (for each piece of a streaming response), and StreamFinished.
Image & Audio Generation : GotImageResponse (returns Base64 image data and a saved file path) and GotSpeechAudio (returns Base64 audio and a saved file path).
File Uploads : FileUploadProgress (provides real-time progress for large uploads) and FileUploadComplete / GotFileMetadata (fires when a file is uploaded and processed, returning its URI and details).
Error Handling : A robust ErrorOccurred event that provides detailed error messages for easier debugging.
API Key Validation : APIKeyValid, APIKeyInvalid, and APIKeyCheckError events to confirm if the provided API key works.
Grounding Sources : GotGroundingInfo event that returns a list of source URLs and titles when using Google Search.

Utility and Helper Functions

File Encoding : Multiple blocks to encode various file types (images, videos, PDFs) into Base64 format.
Path Conversion : A GetFilePathFromURI function to handle file paths provided by components like the Activity Starter or File Picker.
Permission Handling : Blocks to check for and request the necessary storage permissions on Android.
Image Display : A DisplayBase64Image helper to easily display a Base64 string in an Image component.
Model Management : A GetGeminiModelNames function to retrieve a list of all available models for the user's API key.
Favicon Fetcher : A simple utility to get the URL for a website's favicon.

Configuration

Designer Properties : The extension allows setting key parameters directly in the MIT App Inventor designer, including:
- API Key and default Model Name.
- Generation controls: Temperature, Top P, Top K, and Max Output Tokens.
- Safety settings: Category and Threshold for content moderation.

sidrobo · August 19, 2025, 4:20am

Which gemini model needs to be used to access all features

Black_Knight · August 19, 2025, 10:04am

There is no model that can access all features

For General Analysis (Text, Chat, Vision, Audio, Video):
- Use Gemini 1.5 Pro or Gemini 2.5 Pro . This covers most of the extension's features. For a faster alternative, use Gemini 1.5 Flash .
For Generating and Editing Images:
- Use an Imagen model (e.g., imagen-4).
For Generating Speech (Text-to-Speech):
- Use a Gemini TTS model (e.g., gemini-2.5-flash-preview-tts).

Black_Knight · September 24, 2025, 7:23pm

I'm excited to share a major update to the Gemini extension!

We've just added a powerful new feature: Image Editing . To celebrate, we are also introducing our most powerful and a-peeling model yet: the Nano Bananana AKA gemini-2.5-flash-image-preview model!

Now you can perform powerful image edits directly within your App Inventor projects. Take a look:

We are very excited to see what you can create with this new functionality.

Happy Inventing

Black_Knight · November 19, 2025, 9:50am

Gemini 3 pro is here this is the game changer!

https://x.com/Google/status/1990924447402828120?t=1Avhi2kbi6XVDg7SQNkuQA&s=19

Black_Knight · December 11, 2025, 11:23pm

Major Update: Function Calling & Files API Integration!

Hello App Inventors!

We are thrilled to announce a game-changing update for the Gemini extension. This version transforms Gemini from a simple chatbot into a powerful AI Agent capable of controlling your app, while also giving you massive upgrades in file and image handling.

What's New?

Demo :

1. Function Calling: Turn Gemini into an Android Agent

The biggest feature in this update is Function Calling. You can now teach Gemini how to use tools within your app!

Instead of just returning text, Gemini can intelligently decide to trigger events in your app based on the user's conversation.

How it Works (The Tool Loop):

Send Request: You provide a prompt ("Give me the weather in Egypt") and a list of tools your app has.
Tool Needed: Gemini realizes it can't answer directly, so it asks you to run the get_weather tool.
Execution: Your app runs the function (e.g., gets data from a weather API).
Returning Data: You send the result (e.g., "30°C, Sunny") back to Gemini using SendFunctionResponse.
Completion: Gemini uses that data to give a final natural language answer: "The weather in Egypt is 30°C and Sunny."

Example Declaration: Here is how you define a function for Gemini using the functionDeclarations parameter:

[

{

"name": "get_weather",

"description": "this function job to get weather status for specific location",

"parameters": {

"type": "object",

"properties": {

"location": {

"type": "string"

}

},

"required": [

"location"

],

"propertyOrdering": [

"location"

]

}

}

]

Key Blocks:

DeclareFunctions: Define the available tools (like the JSON above).
FunctionCallRequested (Event): Fires when Gemini wants to perform an action.
SendFunctionResponse: Return the action's result back to the model.

2. Files API & Hybrid Image Engine

We've completely overhauled how files and images are handled to eliminate size limits and boost performance.

Hybrid Image Engine

Smart Switching: Small images (< 4MB) are processed instantly. Large images (> 4MB) automatically use the Files API.
No More Limits: Send full-resolution 20MB+ raw photos without crashing your app!

Full Files API Control

Manage your AI's knowledge base dynamically:

UploadFile: Upload PDFs, Audio, Video, or Images to Gemini's cloud storage.
ListUploadedFiles: View what's stored in your project.
AskWithFile / AskWithUploadedFiles: "Read this PDF" or "Watch this video" and answer questions about it.
DeleteFile: Manage your storage quota programmatically.

(Add a screenshot here of the new file management blocks)

Why Update?

Build Agents: Create smart home assistants, personal schedulers, or data analysis bots that actually do things.
Stable & Fast: The new image engine prevents "Payload Too Large" and Out-Of-Memory errors.
Multimodal Power: Analyze huge documents and long videos with ease.

PAID_file

Price: 5.99$ not 7$ for limited period
Purchase: PayPal Link or You can pay HERE using your credit card
In both cases after payment, you'll be redirected to the download URL. Contact me for any help or issues.

Happy Coding!

Black_Knight · January 5, 2026, 8:44am

Hi everyone,
I'm working on v2 of the Gemini Extension, and I want to make sure it covers your specific use cases.
Instead of just asking for features, I want to know: What kind of AI app are you trying to build right now?
Are you building a chatbot assistant?
An educational app for homework help?
A tool to generate marketing text?
If you tell me what you are building, I can add the specific blocks or parameters to make that easier for you. Let me know in the comments!