Google AI Launches PaperBanana: An Agentic Framework That Generates Publication-Ready Methodology Diagrams and Statistical Plots

Producing publication-ready illustrations is a labor-intensive bottleneck in the research workflow. While AI scientists can now handle literature reviews and code, they still struggle to communicate complex findings visually. A team of researchers from Google and Peking University has introduced a new framework called ‘PaperBanana’ that aims to change that by using a multi-agent system to automate high-quality academic diagrams and plots.

5 Specialized Agents: The Architecture
PaperBanana does not rely on a single prompt. It orchestrates a collaborative team of 5 agents that turn raw text into professional visuals.


Phase 1: Linear Planning
- Retriever Agent: Identifies the 10 most relevant reference examples from a database to guide style and layout.
- Planner Agent: Translates the technical methodology text into a detailed textual description of the target figure.
- Stylist Agent: Acts as a design consultant to ensure the output matches the “NeurIPS Look” using specific color palettes and style properties.
Phase 2: Iterative Refinement
- Visualizer Agent: Converts the description into a visual output. For diagrams, it uses image models such as Nano-Banana-Pro. For statistical plots, it writes executable Python Matplotlib code.
- Critic Agent: Checks the generated image against the source text for factual errors or visual glitches. It provides feedback over 3 rounds of refinement.
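The two phases above can be sketched as a short control loop. This is a minimal illustration, not the authors' implementation: the `Agents` container, the `Critique` type, and every function signature are hypothetical stand-ins, and stopping early once the critic is satisfied and appending feedback to the description are assumptions. Only the agent roles, the 10 references, and the 3 refinement rounds come from the article.

```python
from dataclasses import dataclass
from typing import Callable, List

NUM_REFERENCES = 10        # the Retriever selects 10 reference examples
MAX_REFINEMENT_ROUNDS = 3  # the Critic gives feedback for 3 rounds


@dataclass
class Critique:
    """Hypothetical critic verdict: pass/fail plus free-text feedback."""
    ok: bool
    feedback: str = ""


@dataclass
class Agents:
    """Hypothetical callables standing in for the five agents."""
    retrieve: Callable[[str, int], List[str]]   # (source text, k) -> references
    plan: Callable[[str, List[str]], str]       # -> figure description
    style: Callable[[str, List[str]], str]      # -> styled description
    visualize: Callable[[str], str]             # -> figure (image path or code)
    critique: Callable[[str, str], Critique]    # (figure, source text) -> verdict


def generate_figure(method_text: str, agents: Agents) -> str:
    # Phase 1: linear planning (retrieve -> plan -> style).
    references = agents.retrieve(method_text, NUM_REFERENCES)
    description = agents.plan(method_text, references)
    description = agents.style(description, references)

    # Phase 2: iterative refinement (visualize <-> critique).
    figure = agents.visualize(description)
    for _ in range(MAX_REFINEMENT_ROUNDS):
        review = agents.critique(figure, method_text)
        if review.ok:  # assumption: stop early when no issues are found
            break
        # Assumption: feedback is folded back into the description.
        description = f"{description}\nRevision note: {review.feedback}"
        figure = agents.visualize(description)
    return figure
```

Each agent is passed in as a plain callable, so the loop can be exercised with stub functions before any model is wired in.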
Beating the NeurIPS 2025 Benchmark


The research team introduced PaperBananaBench, a dataset of 292 test cases curated from real NeurIPS 2025 papers. Using a VLM-as-a-Judge approach, they compared PaperBanana against leading baselines.
| Metric | Improvement over Baseline |
| Overall Score | +17.0% |
| Conciseness | +37.2% |
| Readability | +12.9% |
| Aesthetics | +6.6% |
| Faithfulness | +2.8% |
The system performs best on ‘Agent & Reasoning’ diagrams, achieving a 69.9% overall score. It also provides an automated ‘Aesthetic Guideline’ that favors ‘Soft Tech Pastels’ over harsh primary colors.
Statistical Plots: Code vs. Image
Statistical plots require numerical precision that standard image models often lack. PaperBanana solves this by having the Visualizer Agent write code instead of drawing pixels.
- Image Generation: Excels aesthetically but often suffers from ‘numerical hallucinations’ or repeated elements.
- Code-Based Generation: Ensures 100% data fidelity by using the Matplotlib library to render the final plot.
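To make the code-based route concrete, here is a minimal sketch of what "writing code instead of drawing pixels" can look like. A fixed template stands in for the model-written script, and the function name `build_plot_script` and its parameters are hypothetical. The point it illustrates is that the data values are embedded verbatim in the emitted Matplotlib script, so the rendered bars cannot drift from the input numbers.

```python
def build_plot_script(labels, values, ylabel, out_path="plot.png"):
    """Return Python source that renders a bar chart with Matplotlib.

    Hypothetical helper: a template stands in for the code an LLM-based
    Visualizer would write. The data is copied in as literals.
    """
    labels, values = list(labels), list(values)
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    lines = [
        "import matplotlib",
        'matplotlib.use("Agg")  # render off-screen, no display needed',
        "import matplotlib.pyplot as plt",
        "",
        f"labels = {labels!r}",
        f"values = {values!r}",
        "",
        "fig, ax = plt.subplots(figsize=(5, 3))",
        "ax.bar(labels, values)",
        f"ax.set_ylabel({ylabel!r})",
        "fig.tight_layout()",
        f"fig.savefig({out_path!r}, dpi=300)",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two of the improvement figures reported in the article.
    script = build_plot_script(
        ["Readability", "Aesthetics"], [12.9, 6.6], "Improvement (%)"
    )
    print(script)
```

Saving the printed script to a file and running it with Python (with Matplotlib installed) writes `plot.png`. Because the plot is produced by executing code over the literal values, any numeric error would have to come from the data itself, not from the renderer.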
Domain-Specific Aesthetic Preferences in AI Research
According to the PaperBanana style guide, aesthetic choices often shift with the research domain to match the expectations of different scholarly communities.
| Research Domain | Visual ‘Vibe’ | Key Design Elements |
| Agent & Reasoning | Illustrative, Narrative, “Friendly” | 2D vector robots, human avatars, emojis, and “User Interface” aesthetics (chat bubbles, document icons) |
| Computer Vision & 3D | Spatial, Dense, Geometric | Camera cones (frustums), ray lines, point clouds, and RGB color coding for axis coordinates |
| Generative & Learning | Modular, Flow-oriented | 3D cuboids for tensors, matrix grids, and “Spatial” techniques that use bright pastel fills to group logic |
| Theory and development | Minimalist, Abstract, “Textbook” | Graph nodes (circles), manifolds (planes), and a restrained gray palette with single highlight colors |
A Comparison of Visualization Paradigms
For statistical plots, the framework highlights a clear trade-off between using an image generation model (IMG) and executable code (Coding).
| Feature | Plots via Image Generation (IMG) | Plots via Coding (Matplotlib) |
| Aesthetics | Often higher; plots look more “visually appealing” | Professional, standard academic look |
| Faithfulness | Lower; prone to “numerical hallucinations” or repeated elements | 100% accurate; strictly represents the raw data provided |
| Readability | Excels with sparse data but struggles with complex datasets | Consistently high; handles dense or multi-series data without error |
Key Takeaways
- Multi-Agent Collaborative Framework: PaperBanana is a reference-driven system that orchestrates 5 specialized agents (Retriever, Planner, Stylist, Visualizer, and Critic) to transform raw technical text and captions into publication-quality methodology diagrams and statistical plots.
- Two-Phase Generation Process: The workflow consists of a Linear Planning Phase that retrieves reference examples and sets aesthetic guidelines, followed by a 3-round Iterative Refinement Loop in which the Critic agent identifies errors and the Visualizer agent regenerates the image for higher accuracy.
- Strong Performance on PaperBananaBench: Evaluated on 292 test cases from NeurIPS 2025, the framework outperformed vanilla baselines in Overall Score (+17.0%), Conciseness (+37.2%), Readability (+12.9%), and Aesthetics (+6.6%).
- Precision-Focused Statistical Plots: For statistical data, the system switches from direct image generation to executable Python Matplotlib code; this hybrid approach ensures numerical precision and eliminates the “hallucinations” common in standard AI image generators.




