
VectifyAI Launches Mafin 2.5 and PageIndex: 98.7% Financial RAG Accuracy With a New Vectorless Tree Index System

Building a Retrieval-Augmented Generation (RAG) pipeline is easy; building one that doesn’t hallucinate when put to the test on a 10-K filing is nearly impossible. For devs in the financial sector, the ‘standard’ vector-based RAG approach (chunking text and hoping for the best) often results in a ‘text soup’ that loses the important structural context of tables and balance sheets.

VectifyAI is trying to bridge this gap with the release of Mafin 2.5, a financial agent, and PageIndex, an open-source framework that aims to shift the industry toward ‘Vectorless RAG.’

The Problem: Why Vector RAG Fails in Finance

Traditional RAG relies on semantic similarity. When you ask about ‘Income,’ the vector database looks for chunks of text that sound like income. However, financial documents depend on layout. A number in a cell means nothing without its header, and those headers are often stripped out during traditional PDF-to-text conversion.

This is the ‘garbage in, garbage out’ trap: even the smartest LLM can’t reason correctly if the input data has lost its hierarchical structure.
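The failure mode described above can be reproduced with a toy example. The sketch below is illustrative only: the table values are made up, and `chunk_words` is a hypothetical helper standing in for any fixed-size chunker, not a function from PageIndex or any vector database.

```python
# Toy table as it might look after PDF-to-text flattening (values are made up).
flattened = (
    "Revenue Cost of sales Net income "
    "2023 1200 800 150 "
    "2022 1000 700 110"
)


def chunk_words(text: str, size: int) -> list[str]:
    """Split text into fixed-size chunks of `size` words, ignoring structure."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


chunks = chunk_words(flattened, 6)
for chunk in chunks:
    print(repr(chunk))
```

The header row ends up in one chunk and the figures in others, so a retriever that returns only a numeric chunk hands the LLM numbers with no column labels attached.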

Mafin 2.5: Accuracy at Scale

Mafin 2.5 is not just a fine-tuned model; it is a reasoning engine that achieved 98.7% accuracy on FinanceBench, significantly outperforming GPT-4o and Perplexity on retrieval tasks.

What sets it apart for devs is its native integration with high-fidelity data sources:

  • Full SEC Access: Direct access to 10-K, 10-Q, and 8-K filings.
  • Earnings Intelligence: Real-time and historical earnings call transcripts.
  • Market Data: Live tickers across the Russell 3000 and Nasdaq.

PageIndex: Moving to ‘Vectorless’ RAG

The ‘secret sauce’ behind the accuracy of Mafin 2.5 is PageIndex. PageIndex replaces traditional flat embeddings with a hierarchical tree index.

Instead of searching for random chunks, PageIndex allows an LLM to ‘reason’ about the structure of the document. It builds a semantic tree, essentially a smart map of the document, which lets the agent identify the exact section, page, and line item needed.
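To make the idea concrete, here is a minimal sketch of navigating a document tree. It is not the PageIndex API: the `Node` class, the section titles, and the page numbers are all assumptions made for illustration, and `pick_child` uses simple word overlap as a stand-in for the step where an LLM would choose which branch to follow.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    page: int
    children: list["Node"] = field(default_factory=list)


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, punctuation removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_child(question: str, children: list[Node]) -> Node:
    """Stand-in for the LLM: choose the child whose title overlaps the question most."""
    q = tokens(question)
    return max(children, key=lambda c: len(q & tokens(c.title)))


def navigate(root: Node, question: str) -> list[Node]:
    """Walk from the root to a leaf, choosing one branch per level."""
    path = [root]
    while path[-1].children:
        path.append(pick_child(question, path[-1].children))
    return path


# Hypothetical outline of a filing; titles and pages are invented.
report = Node("Form 10-K", 1, [
    Node("Item 1. Business", 3),
    Node("Item 8. Financial Statements", 40, [
        Node("Consolidated Balance Sheets", 42),
        Node("Consolidated Statements of Income", 44),
    ]),
])

path = navigate(report, "What was net income in the consolidated statements of income?")
print(" > ".join(n.title for n in path), f"(page {path[-1].page})")
```

The point of the sketch is the control flow: the search descends through headings one level at a time instead of ranking disconnected chunks, so the answer is always located within a known section and page.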

Key technical features include:

  • Native Vision Support: PageIndex supports vision-based RAG, allowing models to ‘see’ the overall layout of a page (charts, complex grids) rather than relying solely on OCR text.
  • Hierarchical Navigation: Converts PDFs into a navigable tree structure, ensuring that relationships between topics and data remain intact.
  • Traceability: Unlike the ‘black box’ of vector similarity, every answer has a clear path in the document tree, providing a much-needed audit trail for regulated financial environments.
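The hierarchical navigation and traceability features can be sketched together. The code below is an assumption-laden illustration, not PageIndex's implementation: it takes a flat list of `(level, title, page)` headings (invented here), nests them into a tree with a stack, and records each node's path from the root so that an answer can cite where it came from.

```python
def build_tree(headings):
    """Nest (level, title, page) headings, given in document order, into a tree."""
    root = {"title": "ROOT", "page": None, "children": [], "path": []}
    stack = [(0, root)]  # (heading level, node); the root sits at level 0
    for level, title, page in headings:
        # Pop back up until the top of the stack is a shallower heading.
        while stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1]
        node = {"title": title, "page": page, "children": [],
                "path": parent["path"] + [title]}
        parent["children"].append(node)
        stack.append((level, node))
    return root


def find(node, title):
    """Depth-first search for a node by exact title; returns None if absent."""
    if node["title"] == title:
        return node
    for child in node["children"]:
        hit = find(child, title)
        if hit is not None:
            return hit
    return None


def citation(node):
    """Format the node's path through the tree as a human-readable audit trail."""
    return " > ".join(node["path"]) + f" (p. {node['page']})"


# Hypothetical headings; titles and page numbers are invented.
headings = [
    (1, "Item 7. MD&A", 30),
    (1, "Item 8. Financial Statements", 40),
    (2, "Consolidated Balance Sheets", 42),
    (2, "Notes to Financial Statements", 50),
    (3, "Note 4. Debt", 55),
]

tree = build_tree(headings)
trail = citation(find(tree, "Note 4. Debt"))
print(trail)
```

Storing the path on each node at build time is a design choice made for this sketch: it means any node an answer is drawn from can be cited without re-walking the tree.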

Key Takeaways

  • Unprecedented Financial Accuracy (98.7%): Mafin 2.5 sets a new record on the FinanceBench benchmark, achieving 98.7% accuracy. This significantly outperforms general-purpose models such as GPT-4o (~31%) and Perplexity (~45%) by focusing on specialized financial reasoning rather than general-purpose retrieval.
  • Switching to ‘Vectorless RAG’: Moving away from the “vibe-based” similarity search of traditional vector databases, PageIndex introduces reasoning-based RAG. It uses an LLM to ‘reason’ its way through the document structure, simulating how a human analyst would navigate a report to find specific data points.
  • Hierarchical ‘Tree’ Indexing vs. Chunking: Instead of slicing documents into meaningless, context-free chunks of text, PageIndex organizes PDFs into a semantic tree structure (a smart table of contents). This preserves important relationships between headers, nested tables, and footnotes that conventional RAG tends to destroy.
  • Vision-Native & OCR-Free Workflows: The framework supports vision-based vectorless RAG, allowing AI to ‘see’ and derive information directly from page images. This matters in financial documents, where the visual structure of a balance sheet or complex grid is as important as the numbers themselves.
  • Enterprise-Grade Traceability: Unlike the ‘black box’ of vector matching, PageIndex provides a fully auditable reasoning path. Every response is linked to specific nodes, pages, and sections, providing the transparency needed for rigorous audits and compliance.



Michal Sutter is a data science expert with a Master of Science in Data Science from the University of Padova. With a strong foundation in statistical analysis, machine learning, and data engineering, Michal excels at turning complex data sets into actionable insights.
