Abstract: In this paper, we present Paramanu-Ayn, a collection of legal language models trained exclusively on Indian legal case documents. This 97-million-parameter auto-regressive (AR) decoder-only model was pretrained from scratch with a context size of 8192 on a single GPU for just 185 hours, achieving an efficient model FLOPs utilization (MFU) of 41.35. We also developed a BPE tokenizer specialized for the legal domain. We evaluated our model using perplexity and two zero-shot tasks: case judgment prediction with explanation, and abstractive case summarization. Paramanu-Ayn outperformed Llama-2 7B and Gemini-Pro on the case judgment prediction with explanation task by nearly 2 percentage points in test accuracy, despite being 72 times smaller. In zero-shot abstractive summarization, it surpassed decoder-only LLMs generating fixed-length summaries (5000 tokens) by over 10 percentage points in BLEU and METEOR, and by nearly 4 percentage points in BERTScore. Further evaluations on zero-shot commonsense and mathematical benchmarks showed that Paramanu-Ayn excelled despite being trained exclusively on legal documents, outperforming Llama-1, Llama-2, and Falcon on the AGIEVAL-AQuA-RAT and AGIEVAL-SAT-Math tasks. We also instruction-tuned our model on 10,763 diverse legal tasks, including legal clause generation, legal drafting, and case summarization, among others. The Paramanu-Ayn-instruct model scored above 8 out of 10 on clarity, relevance, completeness, and legal reasoning, as rated by GPT-3.5-Turbo. We found that our models were able to learn drafting knowledge and generalize to drafting legal contracts and legal clauses with limited instruction-tuning. Hence, we conclude that for a strong domain-specialized generative language model (such as a legal one), domain-specialized pretraining from scratch is more cost-effective and environmentally friendly, and remains competitive with larger models, or is even better than adapting LLMs for legal domain tasks.

Pre-training Transformers on Indian Legal Text

Pre-trained Language Models for the Legal Domain: A Case Study on Indian Law

Transformer-Based Approaches for Legal Text Processing

Legal Transformer Models May Not Always Help

Bringing order into the realm of Transformer-based language models for artificial intelligence and law

Analysing similarities between legal court documents using natural language processing approaches based on Transformers

PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation?

LegaLMFiT: Efficient Short Legal Text Classification with LSTM Language Model Pre-Training

Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts

TransformLLM: Adapting Large Language Models via LLM-Transformed Reading Comprehension Text

On the Effectiveness of Pre-Trained Language Models for Legal Natural Language Processing: An Empirical Study

Rhetorical Role Labeling of Legal Documents using Transformers and Graph Neural Networks

Large Scale Legal Text Classification Using Transformer Models

IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning

Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages

Rethinking Legal Judgement Prediction in a Realistic Scenario in the Era of Large Language Models

Leveraging open-source models for legal language modeling and analysis: a case study on the Indian constitution

SLJP: Semantic Extraction based Legal Judgment Prediction

Legal Information Retrieval and Entailment Using Transformer-based Approaches

Human Centered AI for Indian Legal Text Analytics

LegalNLP -- Natural Language Processing methods for the Brazilian Legal Language