[PAID] 🧠 Gemini Extension to interact with the Gemini-pro model from Google

Gemini

The Gemini extension for AI2 allows you to interact with the Google Gemini-Pro, Gemini-Pro-Vision, and Gemini 1.5 Flash models (including models Bard uses) to generate text and control streaming text generation.

Features of the Gemini Extension for AI2:

  • Gemini API Text Generation: Use Gemini-Pro, Gemini-Pro-Vision, and Gemini 1.5 Flash models.
  • Image Generation: Create new images from text prompts using supported models.
  • Image Editing: Modify existing images using text instructions (from path or Base64).
  • Streaming Text Generation: Real-time, interactive responses.
  • Vision Support: Generate text from images, video thumbnails, PDFs (URL/Local).
  • Local Audio Processing: Generate text based on local audio files.
  • Gemini 1.5 Flash Model Support: Access the faster Flash model.
  • Code Execution (Optional): Enable code execution in Gemini responses.
  • Structured JSON Output: Define schemas for predictable results.
  • File Handling: Optimized Base64 encoding (images, video, PDF, audio, etc.), path/URI handling, MIME type detection.
  • PaLM API Integration: Basic text generation support for PaLM.
  • List Available Models: Fetch model names.
  • Stream Control: Start/Stop streaming functions.
  • Error Handling: Events for API/file/JSON errors, stream completion/stopping.
  • Asynchronous Operations: Non-blocking UI for responsiveness.

Benefits:

  • Integrate cutting-edge AI text and multimodal generation easily into AI2 projects.
  • Create dynamic user experiences with streaming and interactive features.
  • Generate diverse content from text, images, videos, PDFs, and audio (local/web).
  • Build data-driven apps with reliable structured JSON output.
  • Seamlessly work with local and web files for richer AI interactions.
  • Choose between Gemini (including Flash) and PaLM APIs/models.
  • Easy to use within App Inventor, designed for extensibility.

Use Cases:

  • Chatbots: Build bots understanding text, images, audio.
  • Content Creation: Generate articles, posts, stories (with text, image, PDF, audio prompts).
  • Media Analysis: Analyze images/videos, generate descriptions/summaries.
  • Document Processing: Process PDFs (URL/Local) for summaries/Q&A.
  • Audio Processing: Transcribe/summarize local audio.
  • Code Tools: Generate/assist with code (optional execution).
  • Data Structuring: Extract info into structured JSON via schemas.
  • Edu/Creative Tools: Interactive learning, story generation.
    The potential applications are vast!

Blocks


image





image

Explanation


Generating Content (Non-Streaming)

Use the GenerateGeminiContent block.

  • modelName (String): Gemini model (e.g., "gemini-1.5-flash"). See docs.
  • apiKey (String): Your Google API key.
  • contents: List of dictionaries representing conversation turns. Each dict has role (String) and parts (List of dicts with text key).

Blocks example:




The RespondedToGemini event handles the response.

  • apiResponse: Raw API response.
  • textParts: List of generated text strings.
  • role: Role of the response.
  • finishReason: Why generation stopped.
  • index: Content index.
  • safetyRatings: List of safety rating dictionaries (category, probability).




Function: StreamGenerateGeminiContent

Stream content using the Gemini API, with optional Code Execution.

Parameters:

Blocks example

Functionality:
Initiates streaming via Server-Sent Events (SSE). Extracts text/code from chunks. Triggers GotGeminiStream with content (Markdown for code). Triggers StreamFinished on completion. Triggers ErrorOccurred for API or JSON errors.

Callbacks: GotGeminiStream(textValue), StreamFinished(), ErrorOccurred(errorMessage, component).

Usage Notes: For streaming text/code. contents handles multi-turn/images. enableCodeExecution allows interactive code. Handle events for results and errors. Requires internet and valid API key.





GotGeminiStream` event.

Screenshot 2023-12-16 235605

Triggered with each chunk of streamed data.

  • text: String representing the generated text chunk.

Screenshot 2023-12-16 235605

Use StopStream to manually stop. StoppedStream event fires when stopped.

Screenshot 2023-12-16 235605

Use IsStreaming to check if a stream is active.





Function: GenerateGeminiThinkingContent

Generate content using the Gemini 1.5 Flash model (non-streaming).

  • prompt (String): Text prompt.
  • apiKey (String): Google API key.




Function: StreamGenerateGroundedContent

Streams content from Gemini, instructing it to use Google Search for grounding (fact-checking).

  • prompt (String): Text prompt.
  • apiKey (String): Google API key.

component_event

Event: GotGroundingInfo

Fires (usually near end of stream) with web sources used for grounding.

  • sourceUris: List of URLs consulted.
  • sourceTitles: Corresponding list of page titles.




Function: GetFaviconUrl

Constructs a URL for a website's favicon using Google's service.

  • url (String): The website URL.




Function: StreamGenerateGeminiThinkingContent

Stream content using the Gemini 1.5 Flash model. Retrieves content in chunks.

  • prompt (String): Text prompt.
  • apiKey (String): Google API key.

Triggers GotGeminiStream, StreamFinished, and ErrorOccurred.




Function: StreamGenerateContentFromPdfUrl

Stream content based on a PDF file retrieved from a URL.

  • pdfUrl (String): URL of the PDF.
  • prompt (String): Text prompt related to the PDF content.
  • apiKey (String): Google API key.
  • modelName (String): Gemini model (e.g., "gemini-pro-vision").




Function: StreamGenerateGeminiStructuredContent

-----------------------

Stream structured JSON content matching a provided schema.

  • contents (List): Conversation turns (same format as StreamGenerateGeminiContent).
  • apiKey (String): Google API key.
  • modelName (String): Gemini model (e.g., "gemini-pro").
  • scheme (String): JSON Schema string defining desired output structure (use CreateJsonSchema).

Usage Notes: Get structured JSON streamed. GotGeminiStream provides chunks that form the final JSON. Requires model supporting structured output.




Function: CreateJsonSchema

Builds a JSON Schema string.

  • propertyNames (List of String): Names of JSON properties.
  • propertyTypes (List of String): Corresponding types ("string", "number", "array", "boolean", "integer", "object").
  • propertyDescriptions (List of String): Descriptions for each property.
  • requiredProperties (List of String): Names of required properties.




Function: StreamGenerateContentFromLocalPdfPath

Stream content based on a PDF file from the device's local storage.

  • pdfPath (String): Absolute path to the local PDF file.
  • prompt (String): Text prompt.
  • apiKey (String): Google API key.
  • modelName (String): Gemini model.



Function: StreamGenerateContentFromLocalAudioPath

Stream content based on an audio file from the device's local storage.

  • audioPath (String): Absolute path to the local audio file. Needs storage permissions.
  • prompt (String): Text prompt related to the audio.
  • apiKey (String): Google API key.
  • modelName (String): Gemini model (e.g., "gemini-pro-vision" or future audio models).




Generating Content with Images (Streaming)

Use StreamGenerateGeminiVisionContent.

  • contents: List of dictionaries. Can include text parts ("text": "...") and image parts ("inlineData": {"mimeType": "image/jpeg", "data": "base64_string..."}).
  • apiKey: Your Google API key.

Blocks example:



Screenshot 2023-12-16 235605

This block opens a stream; results arrive via the GotGeminiStream event.



StreamGenerateGeminiFileContentFromBase64

Streams content based on provided Base64 encoded files and text.

  • apiKey (String): Google API key.
  • modelName (String): Gemini model. See docs.
  • fileBase64List (List): List of Base64 encoded file strings.
  • mimeTypeList (List): Corresponding list of MIME types.
  • additionalText (String): Additional text prompt.


GenerateImage

Creates a new image from a text description.

  • prompt (Text): Description of the desired image.
  • apiKey (Text): Google API Key.
  • modelName (Text): Image generation model (e.g., "gemini-1.5-flash"). Check Google docs.

EditImage

Modifies an existing image provided as Base64.

  • prompt (Text): Instructions for changes.
  • inputImageBase64 (Text): Image to edit (Base64 string).
  • inputMimeType (Text): MIME type of input image (e.g., "image/jpeg").
  • apiKey (Text): Google API Key.
  • modelName (Text): Image editing model.

EditImageFromPath

Modifies an existing image using its file path.

  • prompt (Text): Instructions for changes.
  • inputImagePath (Text): Full path to the image file on device.
  • apiKey (Text): Google API Key.
  • modelName (Text): Image editing model.

EditMultipleImagesSimple

Advanced editing/generation using multiple input images (URL/Path/Base64) and text.

  • prompt (Text): Instructions involving the images.
  • imageSourceStrings (List): List of image sources (URLs, paths, or Base64 strings).
  • apiKey (Text): Google API Key.
  • modelName (Text): Multi-image capable model.

DisplayBase64Image

Helper block to display Base64 image data on an Image component.

  • base64Data (Text): Base64 image data (from GotImageResponse).
  • mimeType (Text): Image MIME type (from GotImageResponse).
  • imageComponent (Component): The Image component to display on.

Event: GotImageResponse

Fires when image generation/editing succeeds.

  • imageBase64 (Text): Resulting image as Base64. Empty on failure.
  • mimeType (Text): MIME type of the result (e.g., "image/png").
  • responseText (Text): Any text from the API (e.g., errors if blocked).
  • rawApiResponse (Text): Full JSON response (for debugging).
  • imagePath (Text): Path where the result image was saved in app storage (ASD). Empty on failure.

Examples of generating and editing with Gemini



  • StreamGenerateContentFromLocalVideoPath
    • Parameters:

      • videoPath (String): The local file path to a video file.

      • prompt (String): The text prompt related to the video content.

      • apiKey (String): Your Google AI API Key.

      • modelName (String): The Gemini model to use.

      • systemInstructionsValue (String): Optional system instructions.

      • jsonSchemaString (String): Optional JSON schema for structured output.

    • Description: Uploads a local video file using the File API, polls until the file is processed ("ACTIVE"), and then starts a streaming request based on the video content and prompt. Optionally includes system instructions and/or requests structured output via a JSON schema. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure.

  • StreamGenerateContentFromLocalVideoPathWithInstructions

    • Parameters:

      • videoPath (String): The local file path to a video file.

      • prompt (String): The text prompt related to the video content.

      • apiKey (String): Your Google AI API Key.

      • modelName (String): The Gemini model to use.

      • systemInstructionsValue (String): Optional system instructions.

    • Description: Similar to StreamGenerateContentFromLocalVideoPath, but only includes the option for system instructions (no structured output schema). Uploads the video, waits for processing, then starts the streaming request. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure. Uses standard Designer Properties for generation config.



  • StreamGenerateContentFromYouTubeUrl (Overload 1 - Basic)
    • Parameters:

      • youtubeUrl (String): Public URL of a YouTube video (including Shorts).

      • prompt (String): Text prompt relating to the video.

      • apiKey (String): Your Google AI API Key.

      • modelName (String): The Gemini model to use.

    • Description: Starts a streaming analysis request using a YouTube URL and prompt. Uses default generation settings from Designer Properties. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure.

  • StreamGenerateStructuredContentFromYouTubeUrl (Overload 2 - Advanced)

    • Parameters:

      • youtubeUrl (String): Public URL of a YouTube video (including Shorts).

      • prompt (String): Text prompt relating to the video.

      • apiKey (String): Your Google AI API Key.

      • modelName (String): The Gemini model to use.

      • systemInstructionsValue (String): Optional system instructions.

      • jsonSchemaString (String): Optional JSON schema for structured output.

    • Description: Starts a streaming analysis request using a YouTube URL and prompt. Optionally includes system instructions and/or requests structured output via a JSON schema. Response chunks arrive via GotGeminiStream. Triggers StreamFinished when done or ErrorOccurred on failure.




GetGeminiModelNames

Retrieves a list of available Gemini model names.

  • apiKey (String): Your Google API key.

Events:

  • GotGeminiModelNames(modelNames as List): Triggered on success with the list of model names.



7

  • ErrorOccurred(message, component): Triggered if an error occurs during the API request.



Encoding Images to Base64

The EncodeImageToBase64 block encodes an image file path to Base64 (removes line breaks).

  • imagePath: Path to the image file.

Returns the Base64 string.


Error Handling

Screenshot 2023-12-16 235605

The ErrorOccurred event signals errors from the extension.

  • message: Error description.
  • component: Name of the component causing the error ("Gemini", "Gemini-JSON", etc.).

Examples

Example: Generate text (non-streaming):



Example: Generate text (streaming):

Example: Generate text with images (streaming):

Example: Generate text with images in FreeForm Prompt (streaming):
Use TextFormater extension for FreeForm layout.


Applications that use this extension :

Videos preview:

Aix_file:

Check the comparison between PAID and FREE versions:point_down:


PAID_file

Price: $5.99
Purchase: PayPal Link or You can pay HERE using your credit card
In both cases after payment, you'll be redirected to the download URL. Contact me for any help or issues.

FREE_file

Gemini_Mini.aix (11.6 KB)

Have Inquiries?
For questions about the Gemini extension, contact me via PM on Telegram.

Note :

You can try Gemini and get your API key from Google AI Studio.

2 Likes

PaLM 2 NewBlocks added

1 Like

Hi, I just paid for gemini.aix but it sent me chatGPT.aix

2 Likes

@Long_Cao I have sent you the file of Gemini. aix and I thank you for reporting this issue
and for ChatGpt.aix it's yours cause this is my fault not yours

1 Like

I just made payment via "Gemini.aix".

I paid for Gemini.aix today, but I was not directed to the website to download the file.

Taifun

I sent a private message and also tried via chat, but I still couldn't speak to "Cavaleiro_Negro".

I think "Black_Knight" is the extension author, not the one you said.
Oh wait you traslated the author's name, you obviously won't be able to send a message to them! Send it to "Black_Knight" (and do not translate the name please)...

Sorry for this error, the author of the extension is "Black_Knight". Thanks!

1 Like

You're welcome.

I would like help in contacting "Black_Knight". I already sent a private message, tried via chat, but I was unsuccessful. I purchased the "Gemini.aix" extension, but after payment I was not directed to download the file.

@Black_Knight also gets messages if you post something here in this thread
just be patient

Taifun

Thank you very much for your help, I will wait for the answer

Firstly I am very sorry for my very late reply and for this bad situation I was very busy with my military missions at the last of those days and you

You can Contact me again here and I will solve your problem

Send me the URL to download the extension and everything will be resolved.

Thank you very much for sending me the file. If I have any questions about the application, I will post them here.

1 Like

You are welcome man !

Ok you can ask if you want any thing about it

New update for the Extension to meet the latest updates of the Gemini API .

Here's a summary of the updates made to the Gemini.aix compared to the initial version.

1. Model Selection:

  • The GenerateGeminiContent, StreamGenerateGeminiContent, and functions now all accept a modelName parameter, allowing the user to specify which Gemini model to use for the request. This provides flexibility in choosing the appropriate model for different tasks.

2. StreamGenerateGeminiFileContentFromBase64 Function:

  • New Function: A new function called StreamGenerateGeminiFileContentFromBase64 has been added.



  • Base64 File Input: This function accepts a list of Base64 encoded files (fileBase64List) and a corresponding list of MIME types (mimeTypeList).

  • Generic File Handling: It handles various file types (not just images) by using the MIME type information.

  • Streaming Response: It uses streaming to receive the response from the Gemini API and triggers the GotGeminiStream event for each chunk of text received.

3. GetGeminiModelNames Function:

  • New Function: A new function called GetGeminiModelNames has been added.




  • Retrieving Model Names: It retrieves a list of available Gemini model names from the API and triggers the GotGeminiModelNames event with the list.

4. GetFilePathFromDataURI Function:

  • New Function: A new function called GetFilePathFromDataURI has been added.

  • Data URI to File Path: It converts a Data URI (representing a file) to a local file path. It handles content://, file://, and data:// URI schemes.

5. getMimeType Function:

  • New Function: A new function called getMimeType has been added.

  • Get MIME Type: It takes a file path as input and returns the MIME type of the file using Files.probeContentType(path).

6. Code Cleanup and Improvements:

  • Removed Redundant Parameter: The contents parameter in the StreamGenerateGeminiVisionContentFromPathsAndText function was removed as it became unnecessary after adding separate parameters for images and text.

  • Error Handling: The code now includes more robust error handling, using try-catch blocks and triggering the ErrorOccurred event when necessary.

Overall, the updated code is more versatile, efficient, and user-friendly:

  • More Features: It provides functions to retrieve model names, handle various file types, and work with Data URIs.

  • Flexibility: Users can now choose specific Gemini models and send different file types to the API.

  • Efficiency: Streaming responses allow for better handling of large data.

  • Improved Usability: The code is more organized and includes better documentation and error handling.

These updates enhance the functionality and make the extension more useful for a wider range of applications within App Inventor.

If any one interested of this competition AI event from google