@techxsarthak you can try this feature inside Google AI Studio
with structured output
without structured output
@techxsarthak you can try this feature inside Google AI Studio
with structured output
without structured output
Added a block in GroqText, as I said earlier... was working on an update with new blocks, released it just now.
Same for Groq Playground - GroqCloud. Its JSON mode btw ![]()
Added blocks for it in my extension too!
UserRequest= Question to ask to the AI
customScheme= The json type to generate
For eg.
Output
Yes its now possible with GroqText
Btw Gemini extension is also very good!
Good job ![]()
Thanks, your extensions are also good as always
Gemini Extension brings Google’s powerful, multimodal Gemini AI directly into MIT App Inventor, enabling you to build and customize any AI-driven API—text processing, OCR, image analysis, video intelligence, image editing, and more—without incurring extra third-party API fees
By embedding Google’s state-of-the-art Gemini model (optimized in Ultra, Pro, and Nano sizes), this extension delivers enterprise-grade AI capabilities right inside your App Inventor projects—no external subscriptions required blog.google. You define your own JSON Schemas to guarantee structured, consistent outputs, eliminating parsing headaches and ensuring your data flows smoothly into your app
OCR Gemini customized API Example:
APK file
Try from here
Transform the way you build AI features in MIT App Inventor with only 5.99$. Install Gemini Extension now, unlock Google’s Gemini AI, and craft the exact APIs your app demands—cost-efficiently, reliably, and at scale.
Hello,
I recently purchased your Gemini Extension for MIT App Inventor through PayPal, but I wasn’t redirected to the download link after payment was completed.
Could you please send the extension file (or the access link) directly to my email address at [mail id removed by mod, please do not post personal info, use PMs]
Let me know if you need any payment confirmation or transaction details.
Thank you for your support!
Best regards,
I am sorry for such a situation,
I have sent the Extension for you please check your email,
I hope this extension will move your app development to the next level.
Thanks,
Uploads a local file, waits for it to be processed (ACTIVE), " +
"and returns detailed metadata via the 'GotFileMetadata' event. " +
"Also reports progress via 'FileUploadProgress'.Reports the progress of a file upload (e.g., video, audio, pdf)."
Retrieves the direct download URI for the content of a file identified by its resource name (e.g., 'files/your_file_id'). " +
"Use this URI to download the file content directly (e.g., with Web component).
Analyzes multiple images (from URLs/Paths) based on prompt, streaming results. Optionally provide system instructions and/or JSON schema. Results via GotGeminiStream/StreamFinished events.
Analyzes an image from URL based on prompt, streaming results. Optionally provide system instructions and/or a JSON schema for structured output. Results via GotGeminiStream/StreamFinished events.
Get ready! You can now transform text into incredibly realistic speech with the NEW Text-to-Speech (TTS) service just added to your Gemini AI extension, powered by Google! ![]()
This isn't just any speech generator; it's INSANELY powerful! Create:
Crystal-clear single voice narrations
Dynamic multi-speaker conversations (perfect for podcasts!)
And so much more!
We're excited to announce a powerful new feature for the Gemini extension: URL Context . This update allows your apps to give the Gemini model the ability to read and understand the content of web pages you provide directly in your prompt!

Imagine you want Gemini to summarize a news article, compare two product pages, or answer questions based on a specific blog post. Before, you would have to copy and paste all the text.
Now, you can simply include the web page links (URLs) in your prompt and enable the new enableUrlContext feature. Gemini will visit those URLs, read the content, and use that information to give you a much more relevant and contextual response.
This opens up amazing new possibilities, such as:
Article Summarization: "Summarize the key points of this article for me: [URL]"
Data Extraction: "Extract all the technical specifications from this product page: [URL]"
Content Comparison: "Compare the pros and cons of the cameras reviewed in [URL1] and [URL2]"
Question Answering: "Based on the information at [URL], what is the main ingredient in their recipe?"
See the video :
The blocks :
isStreaming: booleanGenerateGeminiContent block that enables you to create continuous chat :Can you please provide all features in one message
Text Generation :
Simple Chat : Provides a basic Ask function for simple text-in, text-out conversations.
Advanced Generation : A powerful GenerateGeminiContent function that supports both single-turn and multi-turn conversations, system instructions, and optional tools.
Streaming : Offers streaming versions of all major generation functions (StreamGenerateGeminiContent, StreamGenerateGroundedContent, etc.) that provide the response in real-time chunks.
Image Understanding (Vision) :
Simple Image Queries : An AskWithImage block to ask questions about a single image.
Multi-Modal Analysis : The ability to send multiple images, videos, audio files, PDFs, and text in a single prompt for comprehensive analysis.
Multiple Input Sources : Accepts files from local paths, Base64 encoded strings, content URIs, and public URLs (for PDFs and images).
YouTube Video Analysis : Can analyze content directly from a public YouTube URL (including Shorts) when provided with a prompt.
Image Generation & Editing :
Text-to-Image : GenerateImage function to create an image from a text description.
Image Editing : EditImage and EditImageFromPath functions to modify an existing image based on a text prompt.
Multi-Image Editing : EditMultipleImagesSimple function to process a prompt against a list of images from various sources (URLs, paths, Base64).
Audio Understanding :
Video Understanding :
Text-to-Speech (TTS) :
Single Speaker : GenerateSingleSpeakerSpeech function to convert text into speech using a specified prebuilt voice.
Multi-Speaker : GenerateMultiSpeakerSpeech function to create dialogue with multiple distinct voices from a structured script.
Structured Output (JSON) :
Users can provide a JSON Schema to force the model to return its answer in a structured JSON format, making it easy to parse and use data in the app. This is supported by multiple functions.
Includes a CreateJsonSchema helper block to easily build the required schema.
Google Search Grounding :
Code Execution :
File API Integration :
Efficient File Uploads : Includes robust functions to upload large files (like videos) directly to Google's servers. This is highly efficient as the file is processed on the server and referenced by a URI, avoiding the need to send the full file with every request.
File Management : Provides blocks to get detailed metadata (UploadFileAndGetMetadata) and the direct download link (GetFileContentUri) for uploaded files.
Reusability : Uploaded files can be reused in multiple API calls by referencing their URI.
The extension is event-driven, providing specific events to handle different outcomes:
General Responses : RespondedToGemini (for single responses), GotGeminiStream (for each piece of a streaming response), and StreamFinished.
Image & Audio Generation : GotImageResponse (returns Base64 image data and a saved file path) and GotSpeechAudio (returns Base64 audio and a saved file path).
File Uploads : FileUploadProgress (provides real-time progress for large uploads) and FileUploadComplete / GotFileMetadata (fires when a file is uploaded and processed, returning its URI and details).
Error Handling : A robust ErrorOccurred event that provides detailed error messages for easier debugging.
API Key Validation : APIKeyValid, APIKeyInvalid, and APIKeyCheckError events to confirm if the provided API key works.
Grounding Sources : GotGroundingInfo event that returns a list of source URLs and titles when using Google Search.
File Encoding : Multiple blocks to encode various file types (images, videos, PDFs) into Base64 format.
Path Conversion : A GetFilePathFromURI function to handle file paths provided by components like the Activity Starter or File Picker.
Permission Handling : Blocks to check for and request the necessary storage permissions on Android.
Image Display : A DisplayBase64Image helper to easily display a Base64 string in an Image component.
Model Management : A GetGeminiModelNames function to retrieve a list of all available models for the user's API key.
Favicon Fetcher : A simple utility to get the URL for a website's favicon.
Designer Properties : The extension allows setting key parameters directly in the MIT App Inventor designer, including:
API Key and default Model Name.
Generation controls: Temperature, Top P, Top K, and Max Output Tokens.
Safety settings: Category and Threshold for content moderation.
Which gemini model needs to be used to access all features
There is no model that can access all features
For General Analysis (Text, Chat, Vision, Audio, Video):
For Generating and Editing Images:
For Generating Speech (Text-to-Speech):
I'm excited to share a major update to the Gemini extension!
We've just added a powerful new feature: Image Editing . To celebrate, we are also introducing our most powerful and a-peeling model yet: the Nano Bananana AKA gemini-2.5-flash-image-preview model!
Now you can perform powerful image edits directly within your App Inventor projects. Take a look:
We are very excited to see what you can create with this new functionality.
Happy Inventing
https://x.com/Google/status/1990924447402828120?t=1Avhi2kbi6XVDg7SQNkuQA&s=19