How to Use File Search in the Gemini API

Creating a RAG application has become much easier. Google’s File Search tool for the Gemini API now handles the heavy lifting of connecting LLMs to your data: chunking, embedding, indexing, and retrieval are all handled for you. And with the latest update, it has gone multimodal. You can now search both text and images in one place, with custom metadata filtering and built-in page-level citations. In this guide, we’ll walk through how File Search works and use it in practical examples.
What Does File Search Do?
File Search helps Gemini access and use information from your data sources such as reports, documents, research papers, code, and internal knowledge bases.
When you upload a file, Gemini breaks it up into small pieces called “chunks” and creates an embedding for each chunk. These embeddings are numerical representations that capture the meaning of the content, helping Gemini understand the context. They are then stored in the File Search Store for easy retrieval.
When you ask a question, Gemini searches the embedding database for the most relevant chunks and uses them as context to generate answers. This is the essence of Retrieval Augmented Generation (RAG).
Gemini File Search goes beyond just text. It also supports multimodal RAG, which allows text and images to be indexed and searched together. This means you can retrieve information from PDFs, images, charts, screenshots, and more using natural language queries.
For multimodal content, Gemini uses gemini-embedding-2 for image and multimodal embeddings, while gemini-embedding-001 handles text embeddings. Note that audio and video formats are not yet supported.
Also Read: Building an LLM Model using Google Gemini API
How Does File Search Work?
File Search is powered by semantic vector search. Instead of matching keywords directly, it finds information based on meaning and context. This means File Search can surface relevant information even when the query uses different words from the document.
Here’s how it works step by step:
- Upload the file
The file is divided into small pieces called “chunks.”
- Embedding generation
Each chunk is converted into a vector of numbers that represents its meaning.
- Storage
The embeddings are stored in the File Search Store, a vector store designed specifically for retrieval.
- Query
When a user asks a question, File Search converts that question into an embedding as well.
- Retrieval
The query embedding is compared with the stored embeddings to find the most similar chunks.
- Grounding
The relevant chunks are added to the Gemini model’s context so that the answer is grounded in factual data from your documents.
This entire process is managed by the Gemini API. The developer does not have to manage any additional infrastructure or databases.
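To build intuition for the retrieval step, here is a toy sketch of how vector similarity ranks stored chunks against a query. The three-dimensional vectors and chunk names are made up for illustration; real Gemini embeddings have hundreds of dimensions and are produced by the API, not by hand.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend these are the stored embeddings of three indexed chunks.
store = {
    "chunk_pricing":  [0.9, 0.1, 0.0],
    "chunk_refunds":  [0.5, 0.5, 0.2],
    "chunk_shipping": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the user's question.
query_embedding = [0.85, 0.2, 0.05]

# Rank chunks by similarity to the query, most relevant first.
ranked = sorted(store, key=lambda k: cosine_similarity(query_embedding, store[k]), reverse=True)
print(ranked[0])  # the chunk File Search would hand to the model first
```

File Search performs this comparison for you inside the API; the point here is only that retrieval is driven by vector similarity, not keyword overlap.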
Setup Requirements
To use the File Search tool, you need a few key components: Python 3.9 or newer, the google-genai client library, and a valid Gemini API key with access to gemini-2.5-pro or gemini-2.5-flash.
Install the client library by running:
pip install google-genai -U
Then, set your API key as an environment variable:
export GOOGLE_API_KEY="your_api_key_here"
Create a File Search Store
The File Search Store is where Gemini keeps the embeddings generated from your uploaded files. Once a file is uploaded and indexed, the indexed data remains available for retrieval until you manually delete it.
With a text-only RAG, you can create a regular File Search Store. For a multimodal RAG, where you want to upload and search both documents and images, create a store with models/gemini-embedding-2.
from google import genai
from google.genai import types
import time
import os
from pathlib import Path
# Do not hardcode your API key in the notebook.
# Set it as an environment variable instead.
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

file_search_store = client.file_search_stores.create(
    config={
        "display_name": "my_multimodal_rag_store",
        "embedding_model": "models/gemini-embedding-2"
    }
)

print("File Search Store created:", file_search_store.name)

Output:
This detail matters because the official documentation specifies embedding_model: models/gemini-embedding-2 when creating a File Search Store for multimodal use.
Upload File
After the File Search Store is created, you can upload files to it. When a file is uploaded, Gemini File Search automatically chunks the content, generates embeddings, and indexes them for quick retrieval.
For text-based RAG, File Search supports documents such as PDF, DOCX, TXT, and JSON, as well as code files such as .py and .js.
In multimodal RAG, File Search also supports image files. This means you can upload documents and images to the same File Search Store and ask questions that require both textual and visual context. For example, you can upload a research paper, a product image, and a chart, and ask Gemini to summarize the paper and describe the relevant visual information.
For uploading images, make sure the File Search Store is created with models/gemini-embedding-2. According to the official documentation, the supported image formats are PNG and JPEG. Image files can be at most 4K x 4K pixels, and a maximum of 6 images can be included at a time.
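Since oversized images will be rejected, it can be handy to check PNG dimensions locally before uploading. The sketch below reads the width and height straight out of the PNG header (the IHDR chunk) using only the standard library; the 4096-pixel ceiling mirrors the limit mentioned above, so adjust it if the official docs change.

```python
import struct

MAX_SIDE = 4096  # assumed per-side pixel limit, from the constraint above

def png_dimensions(header: bytes):
    """Read (width, height) from the first 24 bytes of a PNG file."""
    if header[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    # The IHDR chunk follows the signature: width and height are
    # two big-endian uint32s starting at byte 16.
    width, height = struct.unpack(">II", header[16:24])
    return width, height

def fits_file_search(header: bytes) -> bool:
    """True if the image is within the assumed File Search size limit."""
    w, h = png_dimensions(header)
    return w <= MAX_SIDE and h <= MAX_SIDE
```

In practice you would call this with `open(path, "rb").read(24)` before handing the file to `upload_to_file_search_store`.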
Upload Text File
# Upload and import a document into the File Search Store.
# The display name will be visible in citations.
operation = client.file_search_stores.upload_to_file_search_store(
    file="/content/Paper2Agent.pdf",
    file_search_store_name=file_search_store.name,
    config={
        "display_name": "Paper2Agent.pdf",
    }
)

# Wait until the import is complete
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)

print("Document successfully uploaded and indexed.")

Output:

After this step, the document is chunked, embedded, indexed, and ready for retrieval.
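The poll-until-done loop above will repeat for every file you import, so it is worth wrapping in a small helper with a timeout, so a stuck import cannot hang the script forever. The `refresh` callable is injected (with the real client you would pass `client.operations.get`) so the helper can be exercised without a live API; treat this as a convenience sketch, not part of the SDK.

```python
import time

def wait_for_operation(operation, refresh, poll_seconds=5, timeout_seconds=600):
    """Poll `refresh(operation)` until `operation.done`, or raise on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while not operation.done:
        if time.monotonic() > deadline:
            raise TimeoutError("file import did not finish in time")
        time.sleep(poll_seconds)
        operation = refresh(operation)  # fetch the latest operation state
    return operation
```

With the real client, the upload loop becomes `operation = wait_for_operation(operation, client.operations.get)`.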
Upload an Image File for Multimodal Retrieval
You can also upload an image file to the same File Search Store. This is useful if your application needs to retrieve information from product images, screenshots, charts, diagrams, or other visual content.
# Upload an image file for multimodal retrieval.
operation = client.file_search_stores.upload_to_file_search_store(
    file="/content/product_image.jpg",
    file_search_store_name=file_search_store.name,
    config={
        "display_name": "product_image.jpg",
    }
)

# Wait until the import is complete
while not operation.done:
    time.sleep(5)
    operation = client.operations.get(operation)

print("Image successfully uploaded and indexed.")

Output:

Once an image is indexed, Gemini can retrieve it during File Search whenever the user’s query matches the image content.
Upload Multiple Documents and Images
In real-world applications, you may want to upload multiple files at once. These files can include both text and image documents.
from pathlib import Path
import time

files_to_upload = [
    "/content/Paper2Agent.pdf",
    "/content/product_image.jpg",
    "/content/sales_chart.png"
]

for file_path in files_to_upload:
    operation = client.file_search_stores.upload_to_file_search_store(
        file=file_path,
        file_search_store_name=file_search_store.name,
        config={
            "display_name": Path(file_path).name,
        }
    )
    while not operation.done:
        time.sleep(5)
        operation = client.operations.get(operation)
    print(f"Uploaded and indexed: {file_path}")

Output:

After the upload step, all files are chunked, embedded, indexed, and ready for retrieval. If the File Search Store contains both documents and images, Gemini can retrieve relevant content from both sources while answering user queries.
Ask Questions About the File
Once your files are indexed, Gemini can answer questions using the uploaded documents and images as context. It searches the File Search Store, finds the most relevant chunks, and uses them to generate a grounded response.
For a text-only use case, you can ask a question about the uploaded PDF:
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Summarize what is in the research paper.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)

print("Model Response:\n")
print(response.text)

Output:

Here, FileSearch is used as a tool inside generate_content(). The model first searches your stored embeddings, pulls the most relevant chunks, and generates an answer based on that context.
For a multimodal use case, you can ask a question that uses both a document and an image:
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="""
    Based on the uploaded research paper and the images,
    summarize the key idea from the paper and explain what the images show.
    """,
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)

print("Multimodal Response:\n")
print(response.text)

Output:

This time, the model searches the stored embeddings, finds the most relevant text or image context, and generates a response grounded in the retrieved information.
Customize Chunking
By default, File Search decides how files are split into chunks, but you can control this behavior for better retrieval accuracy.
operation = client.file_search_stores.upload_to_file_search_store(
    file_search_store_name=file_search_store.name,
    file="path/to/your/file.txt",
    config={
        'chunking_config': {
            'white_space_config': {
                'max_tokens_per_chunk': 200,
                'max_overlap_tokens': 20
            }
        }
    }
)

This configuration caps each chunk at 200 tokens with a 20-token overlap so that context flows smoothly across chunk boundaries. Smaller chunks give more precise retrieval, while larger ones preserve more surrounding context, which is useful for research papers and code files.
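To see how these two settings interact, here is a back-of-the-envelope estimate of how many chunks a document will produce under a given chunking config. It mirrors the sliding-window arithmetic (each chunk after the first contributes max_tokens minus overlap new tokens); the real splitter also respects whitespace boundaries, so treat this as an approximation rather than an exact count.

```python
import math

def estimate_chunks(total_tokens, max_tokens_per_chunk=200, max_overlap_tokens=20):
    """Approximate chunk count for a document of `total_tokens` tokens."""
    if total_tokens <= max_tokens_per_chunk:
        return 1
    # After the first chunk, each additional chunk covers this many new tokens.
    step = max_tokens_per_chunk - max_overlap_tokens
    return 1 + math.ceil((total_tokens - max_tokens_per_chunk) / step)

print(estimate_chunks(10_000))  # a ~10k-token paper at the settings above
```

This kind of estimate is useful when tuning chunk size: more chunks means more embeddings billed at indexing time, but finer-grained retrieval.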
Show Retrieved Content Excerpts
You can also print the retrieved excerpts to check which files or chunks Gemini used while generating the response. The official documentation states that citation information is available via grounding_metadata, and image references can include media citation information.
grounding_metadata = response.candidates[0].grounding_metadata

print("\nRetrieved Context:\n")
if grounding_metadata and grounding_metadata.grounding_chunks:
    for chunk in grounding_metadata.grounding_chunks:
        context = chunk.retrieved_context
        if context:
            print("Source:", getattr(context, "title", "Unknown"))
            print("Text:", getattr(context, "text", "No text available"))
            if getattr(context, "page_number", None):
                print("Page Number:", context.page_number)
            if getattr(context, "media_id", None):
                print("Media ID:", context.media_id)
            print("-" * 50)
else:
    print("No grounding metadata found.")

Output:

This is useful for debugging and for building trust in the output: you see not only the answer but also the exact source context Gemini used.
Manage Your File Search Stores
You can easily list, view, and delete file search stores using the API.
print("\nAvailable File Search Stores:")
for s in client.file_search_stores.list():
    print(" -", s.name)

# Get detailed info
details = client.file_search_stores.get(name=file_search_store.name)
print("\nStore Details:\n", details)

# Delete the store (optional cleanup)
client.file_search_stores.delete(name=file_search_store.name, config={'force': True})
print("File Search Store deleted.")
These management options help keep your storage organized. Indexed data remains stored until you delete it manually, while files uploaded via the Files API are automatically removed after 48 hours.
Also Read: 12 Things You Can Do With The Free Gemini API
File Search Support and Restrictions
File Search is available for the following Gemini models: Gemini 3.1 Pro Preview, Gemini 3.1 Flash-Lite Preview, Gemini 3 Flash Preview, Gemini 2.5 Pro, and Gemini 2.5 Flash-Lite.
Gemini 3 models let you combine File Search with custom tools via function calling. However, File Search is not yet supported in the Live API and cannot be used with some built-in tools such as Grounding with Google Search or URL Context.
File Search supports many types of file formats, including PDFs, Word documents, spreadsheets, presentations, JSON, CSV, HTML, XML, Markdown, YAML, code files, ZIP files, and Jupyter notebooks. For multimodal RAG, it also supports PNG and JPEG images when the store is created with models/gemini-embedding-2.
File Size and Storage Limits
| User tier | File size limit | Total storage limit |
|---|---|---|
| Free | 100 MB per file | 1 GB |
| Tier 1 | 100 MB per file | 10 GB |
| Tier 2 | 100 MB per file | 100 GB |
| Tier 3 | 100 MB per file | 1 TB |
Recommended: Keep each store under 20 GB for best performance and lowest latency.
Regarding pricing, embeddings are charged at indexing time. Storage and query-time embeddings are free, and retrieved document tokens are billed like normal context tokens.
Also read: How to access and use Gemini API?
Conclusion
File Search takes the infrastructure work out of building RAG systems. No external vector databases, no custom embedding pipelines. Just upload your files and start querying. With the new multimodal support, you can now search across documents and images together. Metadata filtering helps you narrow results to what’s relevant, and page-level citations make every response traceable back to its source. Whether you’re prototyping or building for production, File Search gives you a solid, manageable foundation to build on. Get started in Google AI Studio or with the Gemini API documentation linked in the article.