Gemma 4 models handle function calling, structured JSON output, and system instructions at the model level, rather than through prompt engineering techniques.
Google's Gemma 4 family of open-source models is now available through the Gemini API and Google AI Studio. Built on the same research behind Gemini 3, these models offer enhanced reasoning, native function calling, multimodal understanding, and a 256K context window in a single open-source package, licensed under Apache 2.0, that you can run anywhere.
There are currently two models available through the Gemini API:
- gemma-4-26b-a4b-it
- gemma-4-31b-it
What makes Gemma 4 different?
Gemma 4 models handle function calling, structured JSON output, and system instructions at the model level, rather than through prompt engineering techniques. The dense 31B model currently ranks #3 among open-source models on the Arena text rankings, with the 26B MoE model at #6, competing with models 20 times its size.
Key features:
- 256K context window on both models
- Native function calling and structured JSON output
- Multimodal input: text, images, and video
- Core language support for over 140 languages
- Apache 2.0 license, allowing full, unrestricted commercial use
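Structured JSON output means a reply can be parsed directly instead of scraped out of prose. A minimal sketch of consuming such a reply with the standard library; the sample payload and its field names are hypothetical, not part of the API:

```python
import json

# Hypothetical JSON reply from a structured-output request (illustrative only).
raw_reply = '{"dish": "ramen", "broth": "pork bone", "best_season": "winter"}'

# Parse straight into a Python dict -- no regex scraping of free-form text.
data = json.loads(raw_reply)

# Validate that the fields we asked the model for are actually present.
for field in ("dish", "broth", "best_season"):
    assert field in data, f"missing field: {field}"

print(data["dish"])  # -> ramen
```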
Get started with AI Studio
The quickest way to try Gemma 4 is through Google AI Studio. Select gemma-4-26b-a4b-it or gemma-4-31b-it from the model selector, type a prompt, and start chatting. You can set system instructions, adjust the temperature, and experiment with multimodal input right in your browser. No API key or code is required.
Or click Get Code to export Python, JavaScript, or cURL snippets from any conversation.
Using Gemma 4 with the Gemini API
Install the Python SDK:
```shell
pip install google-genai
```
Set your API key as an environment variable. You can generate a key at aistudio.google.com/apikey.

```shell
export GEMINI_API_KEY="your-api-key"
```
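The client reads GEMINI_API_KEY from the environment automatically, so no explicit key argument is needed. A quick standard-library check that the variable is visible to your Python process (the helper below is a convenience for illustration, not part of the SDK):

```python
import os

def check_api_key(env=os.environ):
    # genai.Client() picks up GEMINI_API_KEY from the environment automatically.
    key = env.get("GEMINI_API_KEY")
    if not key:
        return "GEMINI_API_KEY is not set; export it before creating a client."
    # Show only a prefix, never the full key.
    return "API key found: " + key[:4] + "..."

print(check_api_key({"GEMINI_API_KEY": "AIza-example"}))  # -> API key found: AIza...
```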
Generate text
Generate text with Gemma 4:
```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemma-4-26b-a4b-it",
    contents="Compare ramen and udon in 3 bullet points: broth, noodle texture, and best season to eat."
)
print(response.text)
```
Pass a system instruction to set the model's behavior:
```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemma-4-31b-it",
    config=types.GenerateContentConfig(
        system_instruction="You are a wise Kyoto tea master. Speak calmly and poetically, using nature metaphors. Keep answers under 3 sentences."
    ),
    contents="What is the purpose of the tea ceremony?"
)
print(response.text)
```
Multi-turn conversation
The SDK provides a stateful chat interface that tracks conversation history for you:
```python
from google import genai

client = genai.Client()
chat = client.chats.create(model="gemma-4-26b-a4b-it")

response = chat.send_message("What are the three most famous castles in Japan?")
print(response.text)

response = chat.send_message("Which one should I visit in spring for cherry blossoms?")
print(response.text)
```
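Under the hood, a chat session simply accumulates alternating user and model turns and resends them with each request. A stripped-down sketch of that bookkeeping, using a stub in place of the model (this illustrates the idea, not the SDK's actual internals):

```python
# Minimal stand-in for the history tracking a chat session performs.
class ToyChat:
    def __init__(self):
        self.history = []  # list of (role, text) tuples

    def send_message(self, text, reply_fn):
        # Record the user turn, get a reply given the full history, record it.
        self.history.append(("user", text))
        reply = reply_fn(self.history)
        self.history.append(("model", reply))
        return reply

# A stub "model" that just counts how many user turns it has seen.
def stub_model(history):
    n = sum(1 for role, _ in history if role == "user")
    return f"reply #{n}"

chat = ToyChat()
print(chat.send_message("What are the three most famous castles in Japan?", stub_model))  # -> reply #1
print(chat.send_message("Which one should I visit in spring?", stub_model))               # -> reply #2
```

Because every turn is resent, long conversations consume context window; the 256K window gives Gemma 4 plenty of headroom here.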
Understanding images
Pass an image along with your text prompt:
```python
from google import genai
from google.genai import types

client = genai.Client()

with open("path/to/image.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemma-4-26b-a4b-it",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Describe this image in 2-3 sentences as if writing a caption for a Japanese travel magazine."
    ]
)
print(response.text)
```
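The mime_type must match the actual file format. Rather than hard-coding it, you can derive it from the filename with the standard library (this small helper is a convenience for illustration, not part of the SDK):

```python
import mimetypes

def guess_image_mime(path):
    # mimetypes maps common extensions to types like "image/png" or "image/jpeg".
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"{path} does not look like a supported image file")
    return mime

print(guess_image_mime("castle.png"))  # -> image/png
print(guess_image_mime("castle.jpg"))  # -> image/jpeg
```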
Function calling
Define tools as function declarations. The model decides when to call them:
```python
from google import genai
from google.genai import types

# Define the function declaration
get_weather = {
    "name": "get_weather",
    "description": "Get current weather for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and state, e.g. 'San Francisco, CA'",
            },
        },
        "required": ["location"],
    },
}

client = genai.Client()
tools = types.Tool(function_declarations=[get_weather])
config = types.GenerateContentConfig(tools=[tools])

response = client.models.generate_content(
    model="gemma-4-26b-a4b-it",
    contents="Should I bring an umbrella to Kyoto today?",
    config=config,
)

# The model returns a function call instead of text
if response.candidates[0].content.parts[0].function_call:
    fc = response.candidates[0].content.parts[0].function_call
    print(f"Function: {fc.name}")
    print(f"Arguments: {fc.args}")
```
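The declaration only tells the model what the tool looks like; your code still has to execute it. One common pattern is a registry mapping declared function names to local implementations. A sketch of that pattern — the get_weather body and its stub data are invented for illustration:

```python
# Hypothetical local implementation of the declared get_weather tool.
def get_weather(location: str) -> dict:
    # A real implementation would call a weather service; this is a stub.
    return {"location": location, "forecast": "light rain", "temp_c": 18}

# Map declared tool names to the functions that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}

def dispatch(name, args):
    # Look up the implementation and invoke it with the model-supplied arguments.
    impl = TOOL_REGISTRY[name]
    return impl(**args)

# In practice, name and args would come from fc.name and fc.args above.
result = dispatch("get_weather", {"location": "Kyoto, JP"})
print(result)
```

You would then send the result back to the model in a follow-up turn so it can compose the final natural-language answer.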
Grounding with Google Search
Ground Gemma 4's responses in real-time web data with Google Search:
```python
from google import genai
from google.genai import types

client = genai.Client()
response = client.models.generate_content(
    model="gemma-4-26b-a4b-it",
    contents="What are the dates for cherry blossom season in Tokyo this year?",
    config=types.GenerateContentConfig(
        tools=[{"google_search": {}}]
    ),
)
print(response.text)

# Access grounding metadata for citations
for chunk in response.candidates[0].grounding_metadata.grounding_chunks:
    print(f"Source: {chunk.web.title} — {chunk.web.uri}")
```
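Grounding metadata is only populated when the model actually used search, so production code should guard against it being absent. A defensive sketch; SimpleNamespace is used here only to simulate the attribute shape of a response candidate, under the assumption that the fields match those accessed above:

```python
from types import SimpleNamespace

def list_sources(candidate):
    # Guard: metadata (or its chunk list) may be missing when search was not used.
    meta = getattr(candidate, "grounding_metadata", None)
    if meta is None or not getattr(meta, "grounding_chunks", None):
        return []
    return [(c.web.title, c.web.uri) for c in meta.grounding_chunks]

# Simulated candidates mirroring the attribute shape used above.
chunk = SimpleNamespace(web=SimpleNamespace(title="Example source", uri="https://example.com"))
grounded = SimpleNamespace(grounding_metadata=SimpleNamespace(grounding_chunks=[chunk]))
ungrounded = SimpleNamespace(grounding_metadata=None)

print(list_sources(grounded))    # -> [('Example source', 'https://example.com')]
print(list_sources(ungrounded))  # -> []
```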