
Stop blaming the LLM when your AI architecture fails risk reviews in production. The root cause is almost never the model's reasoning; it is almost always your retrieval strategy. The pattern I've seen across ninety-three production applications is consistent: teams pick a tool before solving the problem. In regulated environments, assuming meaning lives in the relationships between words (Vector RAG) fails when meaning lives in the structure of the document (PageIndex). To fix your production systems, stop treating retrieval as a "nice-to-have" feature and start validating whether your data requires "neighborhood search" or "structural reasoning."
We are not just choosing a database; we are choosing a theory of meaning.
Vector RAG assumes language is a dense map where similar concepts cluster together. It works by splitting documents into chunks (usually ~512 tokens), converting them into dense vectors, and retrieving neighbors based on cosine similarity.
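The mechanism can be sketched with a toy index. This is a minimal illustration, not a production pipeline: the fixed-vocabulary bag-of-words `embed` function stands in for a real embedding model, but the retrieval mechanics (dense vectors ranked by cosine similarity) are the same.

```python
import math

# Toy stand-in for an embedding model: a fixed-vocabulary bag-of-words
# vector. A real system would call an embedding API; the retrieval step
# (dense vectors + cosine similarity) works identically.
VOCAB = ["drug", "anticoagulants", "patient", "revenue", "quarterly", "growth"]

def embed(text):
    words = text.lower().replace(".", "").split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Embed each chunk once at ingestion time, then rank neighbors per query.
chunks = [
    "The patient must not combine this drug with anticoagulants.",
    "Quarterly revenue growth exceeded expectations.",
]
index = [(c, embed(c)) for c in chunks]

def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]
```

Note that the retriever returns the nearest neighbor whether or not it actually answers the question; nothing in the score distinguishes "relevant" from "merely similar."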
PageIndex replaces embeddings with a hierarchical tree that mirrors the physical document (Tables of Contents, Section headers, Subsections).
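A minimal sketch of that tree, assuming a simple `Node` dataclass (the names `Node` and `node_path` are illustrative, not PageIndex's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    title: str
    text: str = ""
    children: list = field(default_factory=list)

# The tree mirrors the document's physical layout (ToC -> sections ->
# subsections), not semantic clusters in an embedding space.
doc = Node("root", "Drug Monograph", children=[
    Node("s1", "Indications", "Approved for treatment of hypertension."),
    Node("s4", "Interactions", children=[
        Node("s4.1", "Anticoagulants", "Do not combine with warfarin."),
    ]),
])

def node_path(node, target_id, path=()):
    # Depth-first search for a node; the returned chain of ids doubles
    # as the audit trail a traversal-based retriever can report.
    path = path + (node.node_id,)
    if node.node_id == target_id:
        return list(path)
    for child in node.children:
        found = node_path(child, target_id, path)
        if found:
            return found
    return None
```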
The industry is pushing embeddings harder, but the data is getting messier. I have watched otherwise excellent retrieval systems fail third-party audits because the team couldn't explain why a specific passage was retrieved. A cosine score of 0.87 is not a defense in a regulatory hearing or a risk committee review. We treat similarity as a proxy for truth, but in regulated environments, similarity is not evidence.
On FinanceBench, a benchmark built from long financial filings, the gap between the two approaches is stark.
This gap exists because Vector RAG struggles with "distributed" information. In a drug monograph or legal contract, the answer might be split across a header, a table, and a footnote. Vector RAG retrieves one piece of that data and hallucinates the rest. PageIndex understands that to answer the question, the system must traverse the node tree to the specific node containing the interaction table.
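A toy demonstration of the failure, assuming naive fixed-size chunking (by words here; by tokens in practice). The chunk boundary separates the table row from both the section header above it and the footnote that defines what "Major" means:

```python
def chunk(text, size=6):
    # Fixed-size chunking cuts across the structural units that carry
    # the meaning: headers, table rows, and footnotes get separated.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# A flattened monograph fragment: header, table row, and footnote all
# belong together, but chunking splits them apart.
monograph = (
    "Section 4 Interactions "
    "Drug A | Drug B | Severity "
    "Warfarin | This drug | Major "
    "Footnote: severity Major means do not co-administer"
)
pieces = chunk(monograph)
```

The chunk containing "Warfarin" carries neither the "Interactions" header nor the footnote, so a retriever that returns it alone hands the LLM an unanchored table fragment.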
Before you write a single line of ingestion code, determine whether your data calls for neighborhood search or structural reasoning.
If you determine your architecture requires PageIndex, your system design shifts from a "Map" to a "Tree."
**Data Schema Concept:** Instead of storing chunks as flat vectors, store parent-child relationships between nodes.

**Ingestion Flow:** Parse the Table of Contents, section headers, and subsections into that node tree; do not split on arbitrary token counts.

**Query Runtime (Pseudo-Code):**
```python
# Pseudo-code: summarize_node, choose_section_based_on_question,
# is_sub_section_relevant, and construct_answer are LLM calls.
def query_risk(document_tree, question):
    # Step 1: Summarize the Table of Contents and pick a section
    toc_summary = summarize_node(document_tree.toc)
    relevant_section_id = choose_section_based_on_question(toc_summary, question)

    # Step 2: Descend the tree (sequential retrieval)
    current_node = document_tree.get_node(relevant_section_id)
    trace = [relevant_section_id]
    while current_node.has_children():
        # Decide whether to go deeper into sub-sections
        if is_sub_section_relevant(current_node.text, question):
            current_node = current_node.children[0]  # Go deeper
            trace.append(current_node.node_id)
        else:
            break

    final_answer = construct_answer(current_node.text, question)
    return final_answer, trace  # The audit trail is built-in
```
Note the difference: Vector RAG launches 10 parallel requests. PageIndex launches 3 sequential requests. This chain of calls is the only thing that guarantees you know exactly where the answer came from.
| Feature | Vector RAG | PageIndex |
|---|---|---|
| Core Mechanism | Dense Embeddings + Cosine Similarity | Hierarchy Traversal + LLM Reasoning |
| Document Suitability | Short, Unstructured, High Volume | Long, Structured, Low Volume |
| Latency | Low (ms) | Moderate (seconds) |
| Cost | Low | Higher (more tokens parsed) |
| Audit Trail | Weak (Statistical score only) | Strong (Node path reference) |
| Failure Mode | Silent Hallucination (Retrieval mismatch) | Low Precision (Tree too deep) |
Q: Can I use both Vector DB and PageIndex? A: Yes. Use Vector RAG to find which documents might have the answer, and use PageIndex to find the exact clause within that document.
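That two-stage pattern can be sketched as follows. This is a self-contained toy, not the real integration: word overlap stands in both for the vector search in stage 1 and for the LLM's "is this section relevant?" call in stage 2, and the `hybrid_retrieve` name and dict-based tree are my own illustration.

```python
def hybrid_retrieve(question, summaries, trees):
    """Stage 1: pick a candidate document (stand-in for vector search).
    Stage 2: walk that document's section tree to the matching clause."""
    q_words = set(question.lower().split())

    def overlap(text):
        # Word overlap stands in for cosine similarity (stage 1) and
        # for the LLM relevance judgment (stage 2).
        return len(q_words & set(text.lower().split()))

    # Stage 1: Vector RAG's job -- narrow the corpus to one document.
    doc_id = max(summaries, key=lambda d: overlap(summaries[d]))

    # Stage 2: PageIndex's job -- descend to the exact clause, recording
    # the node path as a built-in audit trail.
    node, path = trees[doc_id], []
    while node.get("children"):
        node = max(node["children"], key=lambda c: overlap(c["title"]))
        path.append(node["id"])
    return doc_id, node.get("text", ""), path

summaries = {
    "monograph": "drug safety interactions dosing",
    "filing": "revenue earnings quarterly results",
}
trees = {
    "monograph": {"id": "root", "children": [
        {"id": "s2", "title": "dosing", "text": "Take once daily."},
        {"id": "s4", "title": "interactions", "children": [
            {"id": "s4.1", "title": "anticoagulants",
             "text": "Do not combine with warfarin."},
        ]},
    ]},
    "filing": {"id": "root", "children": []},
}
```

The returned path is what survives an audit: not a score, but the chain of sections the system read to reach the clause.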
Q: Is PageIndex too slow for production? A: Only if your query volume is massive (millions/day). For internal enterprise systems handling thousands of queries, the latency is acceptable.
Q: How do I handle messy documents? A: PageIndex struggles with genuinely messy data. If your headers are inconsistent, Vector RAG is likely the better tool, or you need a pre-processing step to "standardize" the PDF's structure before building the tree.
The next time an AI fails a "risk review," do not point to the hallucination. Look at your retrieval architecture. If your documents rely on structure rather than proximity, you are likely using the wrong system. Stop blaming the model, and start blaming the architecture.