The Gemini extension for AI2 allows you to interact with the Google Gemini-Pro, Gemini-Pro-Vision, and Gemini 1.5 Flash models (including models Bard uses) to generate text and control streaming text generation.
Features of the Gemini Extension for AI2:
- Gemini API Text Generation: Use Gemini-Pro, Gemini-Pro-Vision, and Gemini 1.5 Flash models.
- Image Generation: Create new images from text prompts using supported models.
- Image Editing: Modify existing images using text instructions (from path or Base64).
- Streaming Text Generation: Real-time, interactive responses.
- Vision Support: Generate text from images, video thumbnails, PDFs (URL/Local).
- Local Audio Processing: Generate text based on local audio files.
- Gemini 1.5 Flash Model Support: Access the faster Flash model.
- Code Execution (Optional): Enable code execution in Gemini responses.
- Structured JSON Output: Define schemas for predictable results.
- File Handling: Optimized Base64 encoding (images, video, PDF, audio, etc.), path/URI handling, MIME type detection.
- PaLM API Integration: Basic text generation support for PaLM.
- List Available Models: Fetch model names.
- Stream Control: Start/Stop streaming functions.
- Error Handling: Events for API/file/JSON errors, stream completion/stopping.
- Asynchronous Operations: Non-blocking UI for responsiveness.
Benefits:
- Integrate cutting-edge AI text and multimodal generation easily into AI2 projects.
- Create dynamic user experiences with streaming and interactive features.
- Generate diverse content from text, images, videos, PDFs, and audio (local/web).
- Build data-driven apps with reliable structured JSON output.
- Seamlessly work with local and web files for richer AI interactions.
- Choose between Gemini (including Flash) and PaLM APIs/models.
- Easy to use within App Inventor, designed for extensibility.
Use Cases:
- Chatbots: Build bots understanding text, images, audio.
- Content Creation: Generate articles, posts, stories (with text, image, PDF, audio prompts).
- Media Analysis: Analyze images/videos, generate descriptions/summaries.
- Document Processing: Process PDFs (URL/Local) for summaries/Q&A.
- Audio Processing: Transcribe/summarize local audio.
- Code Tools: Generate/assist with code (optional execution).
- Data Structuring: Extract info into structured JSON via schemas.
- Edu/Creative Tools: Interactive learning, story generation.
The potential applications are vast!
Blocks


Explanation
Generating Content (Non-Streaming)
Use the
GenerateGeminiContent
block.
modelName
(String): Gemini model (e.g., "gemini-1.5-flash"). See docs.apiKey
(String): Your Google API key.contents
: List of dictionaries representing conversation turns. Each dict hasrole
(String) andparts
(List of dicts withtext
key).Blocks example:
The
RespondedToGemini
event handles the response.
apiResponse
: Raw API response.textParts
: List of generated text strings.role
: Role of the response.finishReason
: Why generation stopped.index
: Content index.safetyRatings
: List of safety rating dictionaries (category
,probability
).
Function: StreamGenerateGeminiContent
Stream content using the Gemini API, with optional Code Execution.
Parameters:
contents
: List of conversation turn dictionaries (same structure asGenerateGeminiContent
).apiKey
(String): Your Google API key.modelName
(String): Gemini model (e.g., "gemini-1.5-flash"). See docs.enableCodeExecution
(boolean): Enable code execution (true/false).
Blocks example
Functionality:
Initiates streaming via Server-Sent Events (SSE). Extracts text/code from chunks. TriggersGotGeminiStream
with content (Markdown for code). TriggersStreamFinished
on completion. TriggersErrorOccurred
for API or JSON errors.Callbacks:
GotGeminiStream(textValue)
,StreamFinished()
,ErrorOccurred(errorMessage, component)
.Usage Notes: For streaming text/code.
contents
handles multi-turn/images.enableCodeExecution
allows interactive code. Handle events for results and errors. Requires internet and valid API key.
GotGeminiStream` event.
Triggered with each chunk of streamed data.
text
: String representing the generated text chunk.
Use
StopStream
to manually stop.StoppedStream
event fires when stopped.
Use
IsStreaming
to check if a stream is active.
Function: GenerateGeminiThinkingContent
Generate content using the Gemini 1.5 Flash model (non-streaming).
prompt
(String): Text prompt.apiKey
(String): Google API key.
Function: StreamGenerateGroundedContent
Streams content from Gemini, instructing it to use Google Search for grounding (fact-checking).
prompt
(String): Text prompt.apiKey
(String): Google API key.
Event: GotGroundingInfo
Fires (usually near end of stream) with web sources used for grounding.
sourceUris
: List of URLs consulted.sourceTitles
: Corresponding list of page titles.
Function: GetFaviconUrl
Constructs a URL for a website's favicon using Google's service.
url
(String): The website URL.
Function: StreamGenerateGeminiThinkingContent
Stream content using the Gemini 1.5 Flash model. Retrieves content in chunks.
prompt
(String): Text prompt.apiKey
(String): Google API key.Triggers
GotGeminiStream
,StreamFinished
, andErrorOccurred
.
Function: StreamGenerateContentFromPdfUrl
Stream content based on a PDF file retrieved from a URL.
pdfUrl
(String): URL of the PDF.prompt
(String): Text prompt related to the PDF content.apiKey
(String): Google API key.modelName
(String): Gemini model (e.g., "gemini-pro-vision").
Function: StreamGenerateGeminiStructuredContent
-----------------------Stream structured JSON content matching a provided schema.
contents
(List): Conversation turns (same format asStreamGenerateGeminiContent
).apiKey
(String): Google API key.modelName
(String): Gemini model (e.g., "gemini-pro").scheme
(String): JSON Schema string defining desired output structure (useCreateJsonSchema
).Usage Notes: Get structured JSON streamed.
GotGeminiStream
provides chunks that form the final JSON. Requires model supporting structured output.
Function: CreateJsonSchema
Builds a JSON Schema string.
propertyNames
(List of String): Names of JSON properties.propertyTypes
(List of String): Corresponding types ("string", "number", "array", "boolean", "integer", "object").propertyDescriptions
(List of String): Descriptions for each property.requiredProperties
(List of String): Names of required properties.
Function: StreamGenerateContentFromLocalPdfPath
Stream content based on a PDF file from the device's local storage.
pdfPath
(String): Absolute path to the local PDF file.prompt
(String): Text prompt.apiKey
(String): Google API key.modelName
(String): Gemini model.
Function: StreamGenerateContentFromLocalAudioPath
Stream content based on an audio file from the device's local storage.
audioPath
(String): Absolute path to the local audio file. Needs storage permissions.prompt
(String): Text prompt related to the audio.apiKey
(String): Google API key.modelName
(String): Gemini model (e.g., "gemini-pro-vision" or future audio models).
Generating Content with Images (Streaming)
Use
StreamGenerateGeminiVisionContent
.
contents
: List of dictionaries. Can include text parts ("text": "..."
) and image parts ("inlineData": {"mimeType": "image/jpeg", "data": "base64_string..."}
).apiKey
: Your Google API key.Blocks example:
This block opens a stream; results arrive via the
GotGeminiStream
event.
StreamGenerateGeminiFileContentFromBase64
Streams content based on provided Base64 encoded files and text.
apiKey
(String): Google API key.modelName
(String): Gemini model. See docs.fileBase64List
(List): List of Base64 encoded file strings.mimeTypeList
(List): Corresponding list of MIME types.additionalText
(String): Additional text prompt.
GenerateImage
Creates a new image from a text description.
prompt
(Text): Description of the desired image.apiKey
(Text): Google API Key.modelName
(Text): Image generation model (e.g., "gemini-1.5-flash"). Check Google docs.
EditImage
Modifies an existing image provided as Base64.
prompt
(Text): Instructions for changes.inputImageBase64
(Text): Image to edit (Base64 string).inputMimeType
(Text): MIME type of input image (e.g., "image/jpeg").apiKey
(Text): Google API Key.modelName
(Text): Image editing model.
EditImageFromPath
Modifies an existing image using its file path.
prompt
(Text): Instructions for changes.inputImagePath
(Text): Full path to the image file on device.apiKey
(Text): Google API Key.modelName
(Text): Image editing model.
EditMultipleImagesSimple
Advanced editing/generation using multiple input images (URL/Path/Base64) and text.
prompt
(Text): Instructions involving the images.imageSourceStrings
(List): List of image sources (URLs, paths, or Base64 strings).apiKey
(Text): Google API Key.modelName
(Text): Multi-image capable model.
DisplayBase64Image
Helper block to display Base64 image data on an Image component.
base64Data
(Text): Base64 image data (fromGotImageResponse
).mimeType
(Text): Image MIME type (fromGotImageResponse
).imageComponent
(Component): The Image component to display on.
Event: GotImageResponse
Fires when image generation/editing succeeds.
imageBase64
(Text): Resulting image as Base64. Empty on failure.mimeType
(Text): MIME type of the result (e.g., "image/png").responseText
(Text): Any text from the API (e.g., errors if blocked).rawApiResponse
(Text): Full JSON response (for debugging).imagePath
(Text): Path where the result image was saved in app storage (ASD). Empty on failure.
Examples of generating and editing with Gemini
- StreamGenerateContentFromLocalVideoPath
Parameters:
videoPath (String): The local file path to a video file.
prompt (String): The text prompt related to the video content.
apiKey (String): Your Google AI API Key.
modelName (String): The Gemini model to use.
systemInstructionsValue (String): Optional system instructions.
jsonSchemaString (String): Optional JSON schema for structured output.
Description: Uploads a local video file using the File API, polls until the file is processed ("ACTIVE"), and then starts a streaming request based on the video content and prompt. Optionally includes system instructions and/or requests structured output via a JSON schema. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure.
StreamGenerateContentFromLocalVideoPathWithInstructions
Parameters:
videoPath (String): The local file path to a video file.
prompt (String): The text prompt related to the video content.
apiKey (String): Your Google AI API Key.
modelName (String): The Gemini model to use.
systemInstructionsValue (String): Optional system instructions.
Description: Similar to StreamGenerateContentFromLocalVideoPath, but only includes the option for system instructions (no structured output schema). Uploads the video, waits for processing, then starts the streaming request. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure. Uses standard Designer Properties for generation config.
- StreamGenerateContentFromYouTubeUrl (Overload 1 - Basic)
Parameters:
youtubeUrl (String): Public URL of a YouTube video (including Shorts).
prompt (String): Text prompt relating to the video.
apiKey (String): Your Google AI API Key.
modelName (String): The Gemini model to use.
Description: Starts a streaming analysis request using a YouTube URL and prompt. Uses default generation settings from Designer Properties. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure.
StreamGenerateStructuredContentFromYouTubeUrl (Overload 2 - Advanced)
Parameters:
youtubeUrl (String): Public URL of a YouTube video (including Shorts).
prompt (String): Text prompt relating to the video.
apiKey (String): Your Google AI API Key.
modelName (String): The Gemini model to use.
systemInstructionsValue (String): Optional system instructions.
jsonSchemaString (String): Optional JSON schema for structured output.
Description: Starts a streaming analysis request using a YouTube URL and prompt. Optionally includes system instructions and/or requests structured output via a JSON schema. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure.
GetGeminiModelNamesRetrieves a list of available Gemini model names.
apiKey
(String): Your Google API key.
Events:
- GotGeminiModelNames(modelNames as List): Triggered on success with the list of model names.
- ErrorOccurred(message, component): Triggered if an error occurs during the API request.
Encoding Images to Base64
The
EncodeImageToBase64
block encodes an image file path to Base64 (removes line breaks).
imagePath
: Path to the image file.Returns the Base64 string.
Error Handling
The
ErrorOccurred
event signals errors from the extension.
message
: Error description.component
: Name of the component causing the error ("Gemini", "Gemini-JSON", etc.).
Examples
Example: Generate text (non-streaming):
Example: Generate text (streaming):
Example: Generate text with images (streaming):
Example: Generate text with images in FreeForm Prompt (streaming):
Use TextFormater extension for FreeForm layout.
Applications that use this extension :
Videos preview:
Aix_file:
Check the comparison between PAID and FREE versions:point_down:
PAID_file
Price: $5.99
Purchase: PayPal Link or You can pay HERE using your credit card
In both cases after payment, you'll be redirected to the download URL. Contact me for any help or issues.
FREE_file
Gemini_Mini.aix (11.6 KB)
Have Inquiries?
For questions about the Gemini extension, contact me via PM on Telegram.
Note :
You can try Gemini and get your API key from Google AI Studio.