Technical White Paper

How an 11,230-Chunk Knowledge Base Is Powering India's Smartest Used Car Agent

A Technical White Paper by Team CarArth

Version: 1.0 | March 2026
Authors: Team CarArth, Hyderabad, India
Contact: kritarth@cararth.com
Classification: Public, for Press, Developer & Investor Audiences

Abstract

India's used car market transacted over 5.9 million vehicles in 2025, surpassing new car sales in volume for the first time in the country's automotive history[1]. Yet for the buyer standing at the centre of a ₹36,000 crore market[2], the experience remains fundamentally broken — fragmented listings across a dozen platforms, opaque pricing, zero regulatory guidance, and no intelligent advisor to help navigate a purchase that, for most Indian families, is the second-largest financial decision of their lives.

CarArth is building the intelligence layer this market has never had. At its core is a structured, curated, continuously refreshed knowledge base — currently at 4,020 chunks and growing toward a research-backed target of 11,230 chunks — that powers two AI agents: Ms. 7, a buyer-facing conversational advisor, and Master 7, a market intelligence and deal-scoring engine.

This white paper documents the architecture, methodology, and philosophy behind the knowledge base. It also presents a complete, real-world use case — a buyer researching SUVs under ₹10 lakh in Hyderabad — demonstrating every layer of the system working in concert to deliver verified, contextual, actionable automotive intelligence at scale.

1. The Problem: Why India's Used Car Market Needs Its Own Knowledge Base

1.1 A Market Built on Information Asymmetry

The Indian used car buyer walks into a transaction carrying almost none of the information the seller holds. The seller knows the car's real history. The dealer knows the true market price. The financier knows the actual loan margins. The RTO knows the pending challans. The insurer knows the claim history.

The buyer knows none of this.

This asymmetry is not accidental — it is structural, and it has persisted because no single entity has had the incentive to resolve it. Platforms like Cars24, Spinny, and OLX Autos are fundamentally seller-optimised: their revenue depends on transaction volume, not buyer outcomes[3]. Their AI systems, where they exist, are trained to convert, not to advise.

This information imbalance has created a trust deficit that pervades the entire market. According to EY India's 2026 Agentic AI report, only 23% of Indian used car buyers report "high trust" in digital platforms, compared to 67% trust levels in new car purchases[4]. The gap is not technological — it is informational.

This is the gap CarArth was built to fill.

1.2 Why a Generic LLM Cannot Solve This Problem

A large language model trained on the general internet — even a frontier model like GPT-4o or Gemini 2.5 Pro — fails the Indian used car buyer in five specific, documented ways:

  1. Regulatory staleness:
    India's RTO rules, transfer fees, and documentation requirements vary across 36 states and union territories, change frequently, and are rarely represented accurately in LLM training data. A buyer in Telangana asking about RC transfer fees may receive Maharashtra rules, outdated procedures, or, worse, fabricated figures. In our internal testing in February 2026, GPT-4o returned incorrect RTO fee schedules for 7 of 10 randomly selected states.
  2. Pricing hallucination:
    A model asked "what is a fair price for a 2021 Hyundai Creta SX diesel in Hyderabad with 45,000 km?" will generate a plausible-sounding but factually uncorroborated number. Our own testing showed GPT-4o's price estimates for specific model-city-year-mileage combinations were off by 15–35% compared to actual live market data from CarArth's aggregated listings database.
  3. Insurance ignorance:
    The nuances of IRDAI's depreciation schedule, NCB portability rules, IDV calculation methodology, and zero-depreciation applicability on used vehicles are almost entirely absent from general LLM knowledge, yet they directly affect the total cost of ownership by ₹15,000–₹80,000 per transaction.
  4. Fraud pattern blindness:
    The seven most common used car fraud patterns in India — odometer rollback, flood damage concealment, engine number tampering, duplicate RC issuance, encumbrance suppression, ownership mismatch, and RC blacklist evasion — require India-specific, operationally current knowledge that no general model carries with sufficient precision.
  5. EV-specific gaps:
    India's evolving EV landscape — FAME II subsidy reclaim risks, battery State of Health (SOH) assessment protocols, green number plate transfer procedures, ARAI's AC-on testing rule change effective October 2026 — is too recent and too India-specific for any general model to handle reliably.
A purpose-built knowledge base is not a supplement to LLM intelligence. In the Indian used car context, it is the prerequisite for trustworthy advice.
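The pricing gap described in item 2 can be expressed as a simple percentage deviation between a model's estimate and a benchmark drawn from comparable live listings. The sketch below is illustrative only: the function name `price_deviation_pct`, the choice of the median as the benchmark, and all rupee figures are assumptions made for the example, not CarArth's actual methodology or data.

```python
from statistics import median


def price_deviation_pct(estimate_inr: float, listing_prices_inr: list[float]) -> float:
    """Signed percentage gap between an estimate and the median comparable listing.

    Hypothetical helper: using the median as the benchmark is an assumption.
    """
    if not listing_prices_inr:
        raise ValueError("need at least one comparable listing")
    benchmark = median(listing_prices_inr)
    return (estimate_inr - benchmark) / benchmark * 100.0


# Made-up comparable listings for one model-city-year-mileage bucket (INR).
comparables = [1_150_000, 1_200_000, 1_250_000]
# Made-up estimate standing in for a general-purpose model's answer (INR).
llm_estimate = 1_500_000

deviation = price_deviation_pct(llm_estimate, comparables)
print(f"{deviation:+.1f}%")  # +25.0%
```

With these made-up numbers the estimate sits 25% above the median listing, which would fall inside the 15–35% error band reported above. A signed value is kept so that over- and under-estimates can be told apart.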

2. The Research-Backed Case for 11,230 Chunks

3. Live Use Case: "Research SUVs Under ₹10 Lakh in Hyderabad"

4. The Architecture Behind the Use Case

5. The Living Knowledge System

6. The Road to 11,230 Chunks: Phased Build Plan

7. What Makes This Knowledge Base a Moat

8. Early Results and Validation

9. The Road Ahead: From Knowledge Base to Knowledge Network

10. Conclusion

References

Quick Answers

About CarArth

CarArth is India's first buyer-neutral used car search engine and AI advisory platform. Founded in 2025 and headquartered in Hyderabad, CarArth aggregates listings from every major platform — Cars24, Spinny, OLX, CarDekho, and independent dealers — and provides verified, unbiased market intelligence to help buyers make informed decisions.

Unlike traditional platforms optimised for transaction volume, CarArth is optimised for buyer outcomes. The company's AI agents, Ms. 7 (buyer advisor) and Master 7 (market analyst), are powered by a structured automotive knowledge base that currently holds 4,020 verified chunks and is growing toward a target of 11,230, covering regulatory procedures, pricing intelligence, safety ratings, owner reviews, and market signals.

Website: www.cararth.com | Contact: kritarth@cararth.com | Location: Hyderabad, Telangana, India