The Gemini extension for AI2 allows you to interact with the Google Gemini-Pro, Gemini-Pro-Vision, and Gemini 1.5 Flash models (including models Bard uses) to generate text and control streaming text generation.
Features of the Gemini Extension for AI2:
- Gemini API Text Generation: Use Gemini-Pro, Gemini-Pro-Vision, and Gemini 1.5 Flash models.
- Image Generation: Create new images from text prompts using supported models.
- Image Editing: Modify existing images using text instructions (from path or Base64).
- Streaming Text Generation: Real-time, interactive responses.
- Vision Support: Generate text from images, video thumbnails, PDFs (URL/Local).
- Local Audio Processing: Generate text based on local audio files.
- Gemini 1.5 Flash Model Support: Access the faster Flash model.
- Code Execution (Optional): Enable code execution in Gemini responses.
- Structured JSON Output: Define schemas for predictable results.
- File Handling: Optimized Base64 encoding (images, video, PDF, audio, etc.), path/URI handling, MIME type detection.
- PaLM API Integration: Basic text generation support for PaLM.
- List Available Models: Fetch model names.
- Stream Control: Start/Stop streaming functions.
- Error Handling: Events for API/file/JSON errors, stream completion/stopping.
- Asynchronous Operations: Non-blocking UI for responsiveness.
Benefits:
- Integrate cutting-edge AI text and multimodal generation easily into AI2 projects.
- Create dynamic user experiences with streaming and interactive features.
- Generate diverse content from text, images, videos, PDFs, and audio (local/web).
- Build data-driven apps with reliable structured JSON output.
- Seamlessly work with local and web files for richer AI interactions.
- Choose between Gemini (including Flash) and PaLM APIs/models.
- Easy to use within App Inventor, designed for extensibility.
Use Cases:
- Chatbots: Build bots understanding text, images, audio.
- Content Creation: Generate articles, posts, stories (with text, image, PDF, audio prompts).
- Media Analysis: Analyze images/videos, generate descriptions/summaries.
- Document Processing: Process PDFs (URL/Local) for summaries/Q&A.
- Audio Processing: Transcribe/summarize local audio.
- Code Tools: Generate/assist with code (optional execution).
- Data Structuring: Extract info into structured JSON via schemas.
- Edu/Creative Tools: Interactive learning, story generation.
The potential applications are vast!
Blocks

Explanation
Generating Content (Non-Streaming)
Use the
GenerateGeminiContent
block.
modelName
(String): Gemini model (e.g., "gemini-1.5-flash"). See docs.apiKey
(String): Your Google API key.contents
: List of dictionaries representing conversation turns. Each dict hasrole
(String) andparts
(List of dicts withtext
key).Blocks example:
The
RespondedToGemini
event handles the response.
apiResponse
: Raw API response.textParts
: List of generated text strings.role
: Role of the response.finishReason
: Why generation stopped.index
: Content index.safetyRatings
: List of safety rating dictionaries (category
,probability
).
Function: StreamGenerateGeminiContent
Stream content using the Gemini API, with optional Code Execution.
Parameters:
contents
: List of conversation turn dictionaries (same structure asGenerateGeminiContent
).apiKey
(String): Your Google API key.modelName
(String): Gemini model (e.g., "gemini-1.5-flash"). See docs.enableCodeExecution
(boolean): Enable code execution (true/false).
Blocks example
Functionality:
Initiates streaming via Server-Sent Events (SSE). Extracts text/code from chunks. TriggersGotGeminiStream
with content (Markdown for code). TriggersStreamFinished
on completion. TriggersErrorOccurred
for API or JSON errors.Callbacks:
GotGeminiStream(textValue)
,StreamFinished()
,ErrorOccurred(errorMessage, component)
.Usage Notes: For streaming text/code.
contents
handles multi-turn/images.enableCodeExecution
allows interactive code. Handle events for results and errors. Requires internet and valid API key.
GotGeminiStream` event.
Triggered with each chunk of streamed data.
text
: String representing the generated text chunk.
Use
StopStream
to manually stop.StoppedStream
event fires when stopped.
Use
IsStreaming
to check if a stream is active.
Function: GenerateGeminiThinkingContent
Generate content using the Gemini 1.5 Flash model (non-streaming).
prompt
(String): Text prompt.apiKey
(String): Google API key.
Function: StreamGenerateGroundedContent
Streams content from Gemini, instructing it to use Google Search for grounding (fact-checking).
prompt
(String): Text prompt.apiKey
(String): Google API key.
Event: GotGroundingInfo
Fires (usually near end of stream) with web sources used for grounding.
sourceUris
: List of URLs consulted.sourceTitles
: Corresponding list of page titles.
Function: GetFaviconUrl
Constructs a URL for a website's favicon using Google's service.
url
(String): The website URL.
Function: StreamGenerateGeminiThinkingContent
Stream content using the Gemini 1.5 Flash model. Retrieves content in chunks.
prompt
(String): Text prompt.apiKey
(String): Google API key.Triggers
GotGeminiStream
,StreamFinished
, andErrorOccurred
.
Function: StreamGenerateContentFromPdfUrl
Stream content based on a PDF file retrieved from a URL.
pdfUrl
(String): URL of the PDF.prompt
(String): Text prompt related to the PDF content.apiKey
(String): Google API key.modelName
(String): Gemini model (e.g., "gemini-pro-vision").
Function: StreamGenerateGeminiStructuredContent
-----------------------Stream structured JSON content matching a provided schema.
contents
(List): Conversation turns (same format asStreamGenerateGeminiContent
).apiKey
(String): Google API key.modelName
(String): Gemini model (e.g., "gemini-pro").scheme
(String): JSON Schema string defining desired output structure (useCreateJsonSchema
).Usage Notes: Get structured JSON streamed.
GotGeminiStream
provides chunks that form the final JSON. Requires model supporting structured output.
Function: CreateJsonSchema
Builds a JSON Schema string.
propertyNames
(List of String): Names of JSON properties.propertyTypes
(List of String): Corresponding types ("string", "number", "array", "boolean", "integer", "object").propertyDescriptions
(List of String): Descriptions for each property.requiredProperties
(List of String): Names of required properties.
Function: StreamGenerateContentFromLocalPdfPath
Stream content based on a PDF file from the device's local storage.
pdfPath
(String): Absolute path to the local PDF file.prompt
(String): Text prompt.apiKey
(String): Google API key.modelName
(String): Gemini model.
Function: StreamGenerateContentFromLocalAudioPath
Stream content based on an audio file from the device's local storage.
audioPath
(String): Absolute path to the local audio file. Needs storage permissions.prompt
(String): Text prompt related to the audio.apiKey
(String): Google API key.modelName
(String): Gemini model (e.g., "gemini-pro-vision" or future audio models).
Generating Content with Images (Streaming)
Use
StreamGenerateGeminiVisionContent
.
contents
: List of dictionaries. Can include text parts ("text": "..."
) and image parts ("inlineData": {"mimeType": "image/jpeg", "data": "base64_string..."}
).apiKey
: Your Google API key.Blocks example:
This block opens a stream; results arrive via the
GotGeminiStream
event.
StreamGenerateGeminiFileContentFromBase64
Streams content based on provided Base64 encoded files and text.
apiKey
(String): Google API key.modelName
(String): Gemini model. See docs.fileBase64List
(List): List of Base64 encoded file strings.mimeTypeList
(List): Corresponding list of MIME types.additionalText
(String): Additional text prompt.
GenerateImage
Creates a new image from a text description.
prompt
(Text): Description of the desired image.apiKey
(Text): Google API Key.modelName
(Text): Image generation model (e.g., "gemini-1.5-flash"). Check Google docs.
EditImage
Modifies an existing image provided as Base64.
prompt
(Text): Instructions for changes.inputImageBase64
(Text): Image to edit (Base64 string).inputMimeType
(Text): MIME type of input image (e.g., "image/jpeg").apiKey
(Text): Google API Key.modelName
(Text): Image editing model.
EditImageFromPath
Modifies an existing image using its file path.
prompt
(Text): Instructions for changes.inputImagePath
(Text): Full path to the image file on device.apiKey
(Text): Google API Key.modelName
(Text): Image editing model.
EditMultipleImagesSimple
Advanced editing/generation using multiple input images (URL/Path/Base64) and text.
prompt
(Text): Instructions involving the images.imageSourceStrings
(List): List of image sources (URLs, paths, or Base64 strings).apiKey
(Text): Google API Key.modelName
(Text): Multi-image capable model.
DisplayBase64Image
Helper block to display Base64 image data on an Image component.
base64Data
(Text): Base64 image data (fromGotImageResponse
).mimeType
(Text): Image MIME type (fromGotImageResponse
).imageComponent
(Component): The Image component to display on.
Event: GotImageResponse
Fires when image generation/editing succeeds.
imageBase64
(Text): Resulting image as Base64. Empty on failure.mimeType
(Text): MIME type of the result (e.g., "image/png").responseText
(Text): Any text from the API (e.g., errors if blocked).rawApiResponse
(Text): Full JSON response (for debugging).imagePath
(Text): Path where the result image was saved in app storage (ASD). Empty on failure.
Examples of generating and editing with Gemini
GetGeminiModelNamesRetrieves a list of available Gemini model names.
apiKey
(String): Your Google API key.
Events:
- GotGeminiModelNames(modelNames as List): Triggered on success with the list of model names.
- ErrorOccurred(message, component): Triggered if an error occurs during the API request.
Encoding Images to Base64
The
EncodeImageToBase64
block encodes an image file path to Base64 (removes line breaks).
imagePath
: Path to the image file.Returns the Base64 string.
Error Handling
The
ErrorOccurred
event signals errors from the extension.
message
: Error description.component
: Name of the component causing the error ("Gemini", "Gemini-JSON", etc.).
Examples
Example: Generate text (non-streaming):
Example: Generate text (streaming):
Example: Generate text with images (streaming):
Example: Generate text with images in FreeForm Prompt (streaming):
Use TextFormater extension for FreeForm layout.
Freeform preview example:
Applications that use this extension :
Videos preview:
Aix_file:
Check the comparison between PAID and FREE versions:point_down:
PAID_file
Price: $5.99
Purchase: PayPal Link
After payment, you'll be redirected to the download URL. Contact me for any help or issues.
FREE_file
Gemini_Mini.aix (11.6 KB)
Have Inquiries?
For questions about the Gemini extension, contact me via PM on Telegram.
Note :
You can try Gemini and get your API key from Google AI Studio.