LawRAG: Retrieval-Augmented Generation for Judicial Case Law: An Embedding Model Benchmark

L. K. Suresh Kumar, Mohammed Yaseen, Mohammed Junaid Adil, V. Ramesh

Abstract

This paper presents LawRAG, a Retrieval-Augmented Generation (RAG) system for legal question answering over judicial case law in the Australian legal domain. The framework combines a corpus of legal documents, optimized vector embeddings, and a state-of-the-art large language model to produce authoritative, contextually grounded responses. Unlike prior work focused on statutory texts, LawRAG addresses the nuanced structure of court judgments through a parent document retrieval strategy, which preserves critical legal context and improves factual accuracy. We evaluate multiple embedding models on a rigorously curated legal QA dataset and identify GTE-large as the most reliable encoder, with a BERTScore of 0.8476 and the highest answer relevancy (0.7444). The system’s Dockerized implementation offers a fully reproducible pipeline for judicial case law analysis and establishes new best practices for contextual retrieval in legal AI applications.
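
The abstract does not spell out how parent document retrieval works, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes that each judgment (the parent) is split into small chunks, that similarity search runs over the chunks, and that the full parent of the best-matching chunk is what gets passed to the language model. The bag-of-words `embed` function is a toy stand-in for a real encoder such as GTE-large, and the two judgments, their identifiers, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real encoder (assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_chunks(text, size=12):
    """Split a parent document into child chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_index(parents, size=12):
    """Index every child chunk together with the id of its parent."""
    index = []
    for parent_id, text in parents.items():
        for chunk in split_into_chunks(text, size):
            index.append((parent_id, chunk, embed(chunk)))
    return index


def retrieve_parents(query, index, parents, top_k=1):
    """Rank child chunks by similarity, then return their full parents."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda e: cosine(query_vec, e[2]), reverse=True)
    parent_ids = []
    for parent_id, _chunk, _vec in ranked:
        if parent_id not in parent_ids:  # de-duplicate parents, keep rank order
            parent_ids.append(parent_id)
        if len(parent_ids) == top_k:
            break
    return [(pid, parents[pid]) for pid in parent_ids]


# Hypothetical judgments, invented purely for illustration.
parents = {
    "case_a": (
        "The appellant challenged the tribunal decision on procedural fairness "
        "grounds. The court held that the applicant was denied a reasonable "
        "opportunity to respond to adverse material. The appeal was allowed "
        "and the matter remitted."
    ),
    "case_b": (
        "The plaintiff claimed damages for breach of a commercial lease. "
        "The court found that the landlord failed to repair the premises. "
        "Damages were awarded for lost trading income."
    ),
}

index = build_index(parents)
query = "Was the landlord liable for failing to repair the premises?"
results = retrieve_parents(query, index, parents, top_k=1)
```

In this sketch the query matches a short chunk about the landlord's failure to repair, but the function returns the whole judgment, so the generator also sees the claim and the outcome that surround the matched passage. Small chunks keep the similarity search precise, while returning the parent keeps the legal context intact.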
