Case Law Research
Semantic Search Across Indian Case Law
From Keywords to Meaning
Traditional legal research works like searching a dictionary — you type exact words and hope the judgment you need uses those same words. If the court said "specific performance" but you searched for "enforcement of contract," you miss it. This is the fundamental limitation of keyword search, and it is the reason advocates spend hours in libraries refining search terms.
Semantic search addresses this limitation. Instead of matching exact words, the system compares the meaning of your query with the meaning of each judgment. Search for "tenant's right to continue occupation after lease expiry" and it can surface judgments about "holding over," "tenancy by sufferance," and "Section 116 of the Transfer of Property Act," even if your exact phrase appears nowhere in those judgments.
This chapter explains how semantic legal search works, which Indian databases offer it, and — critically — how to verify that the cases AI finds actually exist.
How Semantic Legal Search Works
Think of it as a two-step process:
Step 1: Understanding. When you type a query, AI converts your question into a mathematical representation of its meaning: a vector of numbers that captures concepts, not just words. "Landlord evicting tenant for personal use" and "owner seeking possession under Section 13(1)(g) of the Bombay Rent Act" end up as similar vectors because they mean similar things.
Step 2: Matching. AI compares your query vector against pre-computed vectors for millions of judgment paragraphs. The closest matches by meaning — not by keyword overlap — appear first. This is why semantic search finds relevant precedents that keyword search misses entirely.
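The two steps above can be sketched in a few lines of Python. This is a toy illustration, not how any particular database implements search: the three-number vectors are hand-written stand-ins for what an embedding model would produce, and the names `PARAGRAPH_VECTORS` and `rank_by_meaning` are hypothetical. Cosine similarity is assumed here as the closeness measure because it is a common choice; the source does not say which measure any given tool uses.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical pre-computed vectors for three judgment paragraphs.
# The dimensions loosely stand for (tenancy, contract, criminal) concepts;
# a real system would use hundreds of dimensions from an embedding model.
PARAGRAPH_VECTORS = {
    "holding over after lease expiry": [0.9, 0.2, 0.0],
    "specific performance of a sale agreement": [0.1, 0.9, 0.0],
    "anticipatory bail conditions": [0.0, 0.1, 0.9],
}

def rank_by_meaning(query_vector, paragraph_vectors):
    """Step 2: score every paragraph against the query, best match first."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in paragraph_vectors.items()
    ]
    return sorted(scored, reverse=True)

# Step 1 stand-in: an assumed vector for the query
# "tenant's right to continue occupation after lease expiry".
query = [0.8, 0.3, 0.1]

ranking = rank_by_meaning(query, PARAGRAPH_VECTORS)
for score, text in ranking:
    print(f"{score:.3f}  {text}")
```

The query shares no keywords with "holding over," yet that paragraph ranks first because its vector points in nearly the same direction as the query vector.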
Open data/case-law-search-examples.json in the code panel. This file contains 15 sample queries with their keyword search results vs semantic search results, showing the difference in relevance and completeness.
Indian Citation Formats
Before using any AI research tool, you must understand Indian citation formats so you can verify what AI returns:
| Format | Example | Source |
|---|---|---|
| AIR | AIR 1973 SC 1461 (Kesavananda Bharati) | All India Reporter |
| SCC | (2017) 10 SCC 1 (Puttaswamy — Right to Privacy) | Supreme Court Cases |
| SCR | [1960] 3 SCR 250 (In re Berubari Union) | Supreme Court Reports |
| SCC OnLine | 2019 SCC OnLine Del 1234 | SCC Online (court-specific; "Del" is the Delhi High Court) |
| MANU | MANU/SC/0001/2023 | Manupatra unique ID |
| Neutral Citation | 2023 INSC 456 | Supreme Court neutral citation (new format) |
AI tools sometimes return citations in non-standard formats or mix up citation styles. If AI returns "Supreme Court, 2017, Privacy case" instead of "(2017) 10 SCC 1," that is a sign the AI is summarizing rather than citing — and the citation needs verification.
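As a first-pass filter, the formats in the table can be recognised with regular expressions. The sketch below is an assumption-laden illustration: the patterns are simplified to fit the examples in the table, the names `CITATION_PATTERNS` and `classify_citation` are hypothetical, and real citations have more variants than these patterns cover. A match only shows that a string is shaped like a citation. It says nothing about whether the case exists.

```python
import re

# Simplified patterns modelled on the examples in the table above.
# They are illustrative assumptions, not an exhaustive grammar.
CITATION_PATTERNS = {
    "AIR": re.compile(r"^AIR \d{4} [A-Za-z]+ \d+$"),
    "SCC": re.compile(r"^\(\d{4}\) \d+ SCC \d+$"),
    "SCR": re.compile(r"^\[\d{4}\] (\d+ )?SCR \d+$"),
    "SCC OnLine": re.compile(r"^\d{4} SCC OnLine [A-Za-z]+ \d+$"),
    "MANU": re.compile(r"^MANU/[A-Z]+/\d+/\d{4}$"),
    "Neutral Citation": re.compile(r"^\d{4} INSC \d+$"),
}

def classify_citation(text):
    """Return the name of the matching format, or None if nothing matches."""
    cleaned = text.strip()
    for name, pattern in CITATION_PATTERNS.items():
        if pattern.match(cleaned):
            return name
    return None

for candidate in [
    "AIR 1973 SC 1461",
    "(2017) 10 SCC 1",
    "2023 INSC 456",
    "Supreme Court, 2017, Privacy case",
]:
    print(f"{candidate!r} -> {classify_citation(candidate)}")
```

A result of `None`, as for the last candidate, is the signal described above: the AI has summarized rather than cited, and the reference needs to be traced to a proper citation before use.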
The Court Hierarchy and Precedent
AI research tools must account for India's court hierarchy, and you must verify that AI weights precedents correctly:
Supreme Court of India
↓ Binding on all courts
High Courts (25)
↓ Binding within state/territory
District Courts
↓
Subordinate Courts / Tribunals (NCLT, NCLAT, SAT, ITAT, etc.)
A common AI error is treating a High Court judgment as equivalent to a Supreme Court judgment on the same point. If you are arguing in the Bombay High Court, a Delhi High Court judgment is persuasive but not binding, while a Supreme Court judgment is binding. AI may not make this distinction clear in its results.
Similarly, AI may not flag when a judgment has been overruled, distinguished, or superseded by legislation. A 1985 Supreme Court judgment interpreting Section 498A of the Indian Penal Code (IPC) may still be good law on the principle it decides, but the provision itself has been re-enacted as Section 85 of the Bharatiya Nyaya Sanhita, 2023 (BNS). AI should flag this transition, and often does not.
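The binding versus persuasive distinction described above can be written down as a small rule, which is useful as a checklist when reviewing AI results. This is a deliberately simplified sketch: the function name `precedent_weight` and its labels are hypothetical, and it ignores bench strength, tribunals, and whether the judgment has since been overruled, all of which still need a manual check.

```python
SUPREME_COURT = "Supreme Court of India"

def precedent_weight(cited_court: str, forum_high_court: str) -> str:
    """Classify a cited judgment for a forum within one High Court's territory.

    cited_court: the court that delivered the judgment being cited.
    forum_high_court: the High Court whose territory the forum falls in.
    Simplified on purpose: bench strength and overruled status are ignored.
    """
    if cited_court == SUPREME_COURT:
        return "binding"
    if cited_court.endswith("High Court"):
        if cited_court == forum_high_court:
            return "binding within territory"
        return "persuasive"
    # Anything else (tribunals, district courts) is left for manual review.
    return "unclassified"

for court in [SUPREME_COURT, "Delhi High Court", "Bombay High Court"]:
    print(court, "->", precedent_weight(court, "Bombay High Court"))
```

Tagging each AI-returned judgment this way before reading it makes it harder to overlook that a result from another High Court carries only persuasive weight.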
Indian Legal Databases
Here are the primary databases where you verify AI-found citations:
SCC Online
The most comprehensive paid database. Covers Supreme Court, all High Courts, tribunals, and some district courts. Subscription-based (approximately ₹15,000-50,000/year depending on package). Best for: comprehensive research, verified citations, editorial notes on overruled judgments.
Manupatra
India's oldest legal database. Strong coverage of statutes, rules, and notifications alongside case law. Offers AI-assisted search features. Best for: statute tracking, regulatory updates, bare act with commentary.
Indian Kanoon
Free, open-access database with the largest collection of Indian judgments. Not as well-organized as SCC Online, but invaluable for quick verification. Best for: verifying that a citation exists, reading the full text, preliminary research.
NearLaw
AI-native search engine specifically built for Indian case law. Understands Indian citation formats and court hierarchy natively. Best for: semantic search, finding recent High Court judgments, understanding judicial trends.
Open data/legal-databases-comparison.json to see a detailed comparison of these databases by coverage, cost, AI features, and suitability for different types of legal research.
The Hallucination Problem
This is the most dangerous aspect of AI legal research: AI can fabricate citations that look completely real but do not exist.
Here is an actual pattern of AI hallucination in legal research:
You ask: "Find Supreme Court judgments on the enforceability
of arbitration clauses in unstamped agreements."
AI responds: "In M/s Garware Wall Ropes Ltd v. Coastal Marine
Constructions, (2019) 9 SCC 209, the Supreme Court held that
an unstamped arbitration agreement is not enforceable..."
This citation is REAL and the holding is accurate.
But AI might also say: "In Patel Engineering Consortium v.
Northern Railways, (2020) 4 SCC 112, the Court further
clarified..."
This citation may be FABRICATED — the case name sounds
plausible, the citation format is correct, but the case
does not exist.
The danger is that fabricated citations are formatted correctly and sound authoritative. An advocate who cites a non-existent case in court faces professional embarrassment and potential disciplinary action.
Verification Protocol
Every AI-found citation must go through this verification:
Prompt Engineering for Legal Research
Here are effective prompts for using general AI tools for case law research:
Prompt: "I need to find Indian Supreme Court judgments on
[legal issue]. For each case:
1. Provide the exact citation (SCC or AIR format)
2. State the key holding in one sentence
3. Note if this judgment has been overruled or distinguished
4. Indicate confidence level — are you certain this case
exists or is this a summary from memory?
Important: If you are not certain a case exists, say so.
Do not fabricate citations."
The last instruction is critical. Explicitly asking AI not to fabricate reduces (but does not eliminate) hallucination. Always verify regardless.
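If you reuse this prompt often, it can help to keep it as a template so the anti-fabrication instructions are never dropped by accident. The following is a minimal sketch in plain Python with no AI library involved; the names `PROMPT_TEMPLATE` and `build_research_prompt` are hypothetical, and the wording is taken from the prompt above.

```python
# The prompt from this section, with the legal issue left as a placeholder.
PROMPT_TEMPLATE = """I need to find Indian Supreme Court judgments on {legal_issue}. For each case:
1. Provide the exact citation (SCC or AIR format)
2. State the key holding in one sentence
3. Note if this judgment has been overruled or distinguished
4. Indicate confidence level: are you certain this case exists or is this a summary from memory?
Important: If you are not certain a case exists, say so.
Do not fabricate citations."""

def build_research_prompt(legal_issue: str) -> str:
    """Fill the template, refusing an empty issue so no blank prompt is sent."""
    issue = legal_issue.strip()
    if not issue:
        raise ValueError("legal_issue must not be empty")
    return PROMPT_TEMPLATE.format(legal_issue=issue)

prompt = build_research_prompt(
    "enforceability of arbitration clauses in unstamped agreements"
)
print(prompt)
```

The template only standardizes the request. Every citation in the response still has to be verified against a database.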
Building a Research Workflow
The most effective legal research workflow combines AI speed with human verification:
| Step | Tool | Time |
|---|---|---|
| 1. Frame the legal issue | Your expertise | 5 min |
| 2. Broad semantic search | AI (Claude/NearLaw) | 10 min |
| 3. Verify top citations | SCC Online / Indian Kanoon | 20 min |
| 4. Read key judgments | SCC Online (full text) | 30-60 min |
| 5. Check overruled status | SCC Online editorial notes | 10 min |
| 6. Organize by relevance | AI for summarization | 10 min |
Total: 1.5-2 hours for research that would take 6-8 hours with purely manual methods.
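The total can be checked by adding up the table. The short sketch below copies the step times from the table (the variable names are hypothetical) and sums the low and high ends of each range; the result is consistent with the rounded 1.5-2 hour figure.

```python
# (step, minimum minutes, maximum minutes), copied from the workflow table.
WORKFLOW_STEPS = [
    ("Frame the legal issue", 5, 5),
    ("Broad semantic search", 10, 10),
    ("Verify top citations", 20, 20),
    ("Read key judgments", 30, 60),
    ("Check overruled status", 10, 10),
    ("Organize by relevance", 10, 10),
]

low_total = sum(low for _, low, _ in WORKFLOW_STEPS)
high_total = sum(high for _, _, high in WORKFLOW_STEPS)

print(f"Total: {low_total}-{high_total} minutes "
      f"({low_total / 60:.1f}-{high_total / 60:.1f} hours)")
```

Only the reading step varies, so the spread in the total comes entirely from how many key judgments need a full read.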
Key Takeaways
This is chapter 3 of AI for Legal Professionals.