TLDC 📏
Infrastructure for evaluating and governing AI models.

Spotlight
What if evaluating AI models became as standard and scalable as writing code tests?
Quick Pitch: The LLM Data Company (TLDC) is building the evaluation layer for AI. Its flagship product, doteval, turns LLM evaluations into programmable assets, making it easier to test, govern, and ship AI models with confidence.
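
To ground the code-test analogy, here is a minimal, hypothetical Python sketch of an LLM eval written like a unit test. The names are stand-ins for illustration, not doteval's actual API:

```python
# Conceptual sketch only: an LLM eval written like a unit test.
# `run_model` is a stub, not doteval's API; swap in a real model call.

def run_model(prompt: str) -> str:
    # Canned response so the example runs without a model behind it.
    return "Refunds are accepted within 30 days of purchase."

def test_refund_policy_answer():
    output = run_model("What is our refund window?")
    # The eval passes only if the answer states the required fact.
    assert "30 days" in output, f"Missing policy detail: {output!r}"

if __name__ == "__main__":
    test_refund_policy_answer()
    print("eval passed")
```

In practice the assertion would be a graded rubric rather than a substring check, but the shape is the same: versionable, runnable, pass/fail.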


The Problem
Widespread Adoption: 200,000+ enterprises are deploying GenAI, but few can measure performance reliably.
Manual, Brittle Workflows: Evaluations are run through spreadsheets, JSON files, or internal tools that don’t scale.
Deployment Risk: Without proper evals, teams ship untested prompts, agents, or fine-tuned models into production.

Snapshot
Industry: AI Infrastructure and Evaluation Tools
Headquarters: San Francisco, California
Year Founded: 2025 (YC S25)
Traction: Used by Perplexity, Diode, and Cubic in legal, technical, and safety-critical workflows
Founder Profiles
Daanish Khazi, Co-Founder: Ex-Traba engineer with experience at Tesla and Meta
Gavin Bains, Co-Founder: Former Traba engineer with a background at leading tech companies
Joseph Besgen, Co-Founder: Ex-Traba, with experience at Honey and the consulting firm Roland Berger
Funding
Current Round: Raising (Seed)
Lead Investor: Y Combinator
Other Backers: Multiple Tier-1 VCs & Angels
Revenue Engine
Target: Mid-market enterprises with eval needs
Product: Developer-first, subscription-based platform
Go-to-Market: Direct sales to applied AI teams
What Users Love
Easy-to-write eval instructions in YAML, with AI helping speed up authoring (see the sketch after this list)
Versioning and reuse across models and prompts
Tight collaboration between legal, product, and engineering
Measurable improvement in speed and accuracy of AI rollouts
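
As a rough illustration of what YAML-authored evals can look like, the sketch below loads a tiny eval spec and runs it against a stub model in Python. The schema (name, version, cases, must_contain) is invented for this example and is not doteval's actual format:

```python
# Illustrative only: a made-up YAML eval schema, not doteval's real format.
# Requires PyYAML (pip install pyyaml).
import yaml

SPEC = """
name: contract-clause-check
version: 2  # versioned specs can be reused across models and prompts
cases:
  - input: "Does this NDA cover subcontractors?"
    must_contain: "subcontractors"
  - input: "What is the termination notice period?"
    must_contain: "30 days"
"""

def run_model(prompt: str) -> str:
    # Stub so the sketch runs end to end; swap in a real model client.
    return "The clause applies to subcontractors, with a 30 days notice period."

spec = yaml.safe_load(SPEC)
for case in spec["cases"]:
    output = run_model(case["input"])
    verdict = "PASS" if case["must_contain"] in output else "FAIL"
    print(f"[{spec['name']} v{spec['version']}] {case['input']}: {verdict}")
```

Because the spec is plain data, it can be checked into version control and rerun unchanged against a new model or prompt, which is the versioning-and-reuse workflow users cite above.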

Playing Field
Pi Labs: Provides a copilot for generic rubrics, with proprietary graders
Haize Labs: Focuses on judge alignment but offers limited authoring infrastructure
Internal Lab Solutions: Custom-built but not scalable or reusable
TLDC's Edge: First to make evaluation design structured, collaborative, and repeatable across organizations and domains
Why It Matters
Evaluations are the new QA for AI. As GenAI adoption grows, with 71% of organizations now using it in at least one business function, teams need graded examples and guardrails to assess quality before deployment.

What Sets Them Apart
Evaluation-as-Infrastructure: Not just a results viewer, but full-stack eval creation and management
Cross-Domain Fit: Legal, tech, healthcare, reasoning tasks
Collaborative Authoring: Empowers non-engineers to build and own evals
Speed: Turns weeks of manual work into hours via reusable, collaborative workflows
Analysis
Bulls Case 📈
Used in high-stakes production environments
Evaluation is becoming standard in GenAI deployment
Backed by top-tier investors
Large and growing enterprise market
Bears Case 📉
Varying customer needs across domains
Competition from internal tooling at large labs
Need to balance flexibility with simplicity
Early-stage go-to-market execution

Verdict
As GenAI moves from pilot to production, TLDC addresses a core gap: evaluation infrastructure. Its structured, developer-first approach and early traction signal strong product-market fit. Much as GitHub defined how teams collaborate on and review code, TLDC has the potential to define how teams test and trust AI. The challenge will be scaling while staying usable, and proving value faster than internal tools can.
In Partnership With
Stop Asking AI Questions and Start Building Personal AI Software.
Transform your AI skills in just 5 days through this free email course. Whatever your starting point, by Day 5 you'll be building working software without writing code.
Each day delivers actionable techniques and real-world examples straight to your inbox. No technical skills required, just knowledge you can apply immediately.
The Startup Pulse
Grammarly — Acquired Superhuman for an undisclosed amount to expand its AI productivity suite; Superhuman was last valued at $825M with ~$35M ARR.
Figma — Filed to go public under “FIG,” aiming to raise $1.5B after $749M in 2024 revenue and a return to profitability in Q1.
DataBahn AI — The Dallas-based startup raised a $17M Series A, led by Forgepoint Capital, to build AI-native data pipelines for enterprise security.
Written by Ashher

© 2025 AngelsRound
228 Park Ave S, #29976, New York, New York 10003, United States