High-Precision Retrieval with Double-Hop Query Reformulation
The Problem
Legal documents are written in complex, formal terminology ("legalese"), while users typically ask questions in casual, colloquial language. Standard RAG systems often fail to bridge this semantic gap; a vector search for "My boss cut my pay" might miss the relevant document titled "Wage Deduction Regulations" because the keywords do not align. This mismatch leads to poor retrieval accuracy and model hallucinations.
The Solution
I engineered a specialized "Double-Hop" RAG architecture designed to translate user intent into legal precision. The system does not just search; it interprets. By implementing an intermediate reasoning layer, the system reformulates ambiguous user questions (e.g., "Can I sell waqf land?") into standardized legal queries (e.g., "Legal status of waqf land transfer under Law No. 41/2004") before executing the final search.
Technical Architecture
The pipeline utilizes a multi-stage retrieval strategy to maximize precision:
Query Reformulation Agent: An LLM (Qwen 3 32B) analyzes the initial user input and rewrites it into formal legal queries, bridging the vocabulary gap.
Double-Hop Retrieval: The system performs a broad initial search to gather context, refines the query based on that context, and then executes a second "precision hop" to retrieve specific regulations.
Cross-Encoder Reranking: To ensure the most relevant regulation appears at the top, the final retrieved documents undergo a reranking process, achieving a 93% Mean Reciprocal Rank (MRR).
Performance & Impact
The system was rigorously benchmarked using an "LLM-as-a-Judge" framework on a custom Indonesian legal dataset.
100% Hit Rate: Successfully retrieved the correct legal document for every test query.
98.6% Faithfulness: The logic layers significantly reduced hallucinations, ensuring answers were strictly grounded in the retrieved statutes.
Validation: Proven capability to map highly informal inputs (e.g., "curhat" style questions) to exact legal articles without losing context.
Core Stack
Python, Qwen 3 (via Groq), FAISS, Google Embedding-004, Modal (Serverless Deployment).


