Legal Challenge 2026 is an international engineering competition focused on building production-grade AI systems for the legal domain. Teams develop Agentic & RAG solutions evaluated for accuracy, faithfulness to legal sources, and real-time performance.
This is not a hackathon but a benchmark-style challenge, designed to test systems under real-world conditions with conference-level rigor.
Reach: Global
Attendees: 3,500+
AWARDS CEREMONY AT MACHINES CAN SEE
Strategic Partners
The winners will be announced during a special online awards event affiliated with Machines Can See, bringing together competition participants, organizers, and members of the global AI community, and showcasing the best solutions developed during the challenge.
The Agentic RAG Legal Challenge is a flagship event of Dubai AI Week and part of the Machines Can See 2026 ecosystem.
Dubai, UAE — April 2026
Benchmark Principles & Guarantees
The pillars defining the most rigorous legal AI evaluation framework ever created.
Open Results
Final datasets, telemetry, and winning approaches are released for transparent and repeatable research.
Anti-Gaming
Private test set, mandatory telemetry, chunk-ID validation, and code audits prevent gaming and ensure fairness.
Real-World Corpus
A corpus of real regulations, case law, and long-form contracts that reflects true legal complexity.
Focus on Faithfulness
Strict grounding checks ensure every answer is verifiably based on retrieved legal sources.
Production Evaluation
Systems are tested as full production pipelines, including ingestion, retrieval, generation, and telemetry.
PRODUCTION-GRADE EVALUATION FRAMEWORK
Designed to mirror real-world deployment conditions, our framework evaluates accuracy, faithfulness, latency, and system integrity as a unified production benchmark.

01. Deterministic Fact Checking
Strict evaluation for exact-answer questions, verifying factual correctness and required data types.
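For illustration, a minimal sketch of what such a deterministic check could look like, assuming a gold answer with a required type; the function name, type-coercion rules, and fields are our assumptions, not the official harness:

```python
# Hypothetical exact-answer check: the submission must parse to the required
# data type AND equal the gold value exactly. Schema is illustrative only.
from datetime import date

def check_exact_answer(submitted: str, expected_value, expected_type: type) -> bool:
    try:
        if expected_type is date:
            parsed = date.fromisoformat(submitted.strip())
        else:
            parsed = expected_type(submitted.strip())
    except (ValueError, TypeError):
        return False  # wrong data type -> no credit
    return parsed == expected_value

# Example: a question whose gold answer is the integer 30
assert check_exact_answer("30", 30, int)
assert not check_exact_answer("thirty", 30, int)
```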
02. LLM-as-Judge Scoring
Generative explanations are evaluated by an LLM judge for correctness, completeness, grounding, clarity, and uncertainty handling.
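A sketch of how a judge prompt covering these five criteria might be phrased; the wording, scale, and output format are illustrative assumptions, since the official rubric is not reproduced here:

```python
# Hypothetical judge prompt template; fill {question}, {sources}, {answer}
# via str.format. The 0-5 scale and JSON output shape are assumptions.
JUDGE_PROMPT = """You are grading a legal RAG answer. Score each criterion 0-5:
1. Correctness: is the legal conclusion right?
2. Completeness: are all required points covered?
3. Grounding: is every claim supported by the retrieved sources?
4. Clarity: is the explanation well structured?
5. Uncertainty handling: are gaps and caveats stated honestly?

Question: {question}
Retrieved sources: {sources}
Answer under review: {answer}

Return JSON: {{"correctness": 0, "completeness": 0, "grounding": 0,
"clarity": 0, "uncertainty": 0}}"""
```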
03. Mandatory Telemetry
Each answer must include full telemetry (TTFT, total latency, token usage, and retrieved chunk IDs), with penalties for missing data.
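A per-answer telemetry record might look like the following; the field names are our guesses at a schema, so consult the starter kit for the real one:

```python
# Hypothetical telemetry record covering the fields named above.
from dataclasses import dataclass, field

@dataclass
class AnswerTelemetry:
    question_id: str
    ttft_ms: float            # time to first token
    total_latency_ms: float   # end-to-end response time
    prompt_tokens: int
    completion_tokens: int
    retrieved_chunk_ids: list[str] = field(default_factory=list)

record = AnswerTelemetry(
    question_id="q-0042",     # illustrative ID
    ttft_ms=310.0,
    total_latency_ms=2450.0,
    prompt_tokens=3120,
    completion_tokens=270,
    retrieved_chunk_ids=["difc-reg-12#c3", "difc-reg-12#c7"],
)
```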
04. Grounding Verification
Retrieval is validated through gold-chunk matching; if the correct sources are not retrieved, the grounding score is zero.
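A minimal sketch of gold-chunk matching: only the zero-on-miss rule comes from the description above; the recall-style partial credit is our assumption.

```python
# Hypothetical grounding check: zero if no gold chunk was retrieved,
# otherwise the fraction of gold chunks recovered (assumed partial credit).
def grounding_score(retrieved_ids: set[str], gold_ids: set[str]) -> float:
    hits = retrieved_ids & gold_ids
    if not hits:
        return 0.0  # correct sources not retrieved -> grounding is zero
    return len(hits) / len(gold_ids)

assert grounding_score({"a", "b"}, {"b", "c"}) == 0.5
assert grounding_score({"a"}, {"b", "c"}) == 0.0
```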
05. Latency as a Metric
Time-to-First-Token directly impacts scoring, rewarding fast, production-grade systems and penalizing slow responses.
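One plausible shape for a TTFT-based score: full credit under a fast threshold, zero past a slow one, linear in between. Both thresholds and the curve are assumptions for illustration:

```python
# Hypothetical latency scoring curve; thresholds are invented for the sketch.
def latency_score(ttft_ms: float, fast_ms: float = 500.0, slow_ms: float = 5000.0) -> float:
    if ttft_ms <= fast_ms:
        return 1.0
    if ttft_ms >= slow_ms:
        return 0.0
    return (slow_ms - ttft_ms) / (slow_ms - fast_ms)
```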
06. Private-Set Evaluation
Final rankings are determined by a 24-hour evaluation run on a private test set, ensuring fairness and resistance to gaming.
Prizes
The prize pool is divided between Main Prizes (overall ranking) and Special Nominations (engineering excellence in specific areas):

Main Prizes (by Total Score)
1st Place: $12,000
2nd Place: $8,000
3rd Place: $4,000
Special Nominations

Best Publication (2x): $1,000
AI popularization: best video/post/etc. about the competition (by the jury's choice)

Retrieval Master: $2,000
Best grounding: highest Grounding Score (Total Score ≥ 70%)

Efficiency Expert: $2,000
Most token-efficient: highest Score/Token ratio (Total Score ≥ 70%; see the worked example below)

Speed Champion: $2,000
Fastest solution: lowest average TTFT (Total Score ≥ 70%)
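For intuition on the Score/Token ratio: if one team scores 82% using 4.0M total tokens and another scores 78% using 1.5M, the ratios are 82 / 4.0 = 20.5 versus 78 / 1.5 = 52.0 points per million tokens, so the leaner system takes the nomination despite the lower total score (both clear the 70% floor). Per-million-token normalization is our assumption; the organizers define the exact formula.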
All top-3 teams receive a fully funded trip to Dubai for the Machines Can See 2026 Summit.
Schedule
Key milestones of the challenge, from registration and onboarding to final evaluation and awards.

Phase 01: Sign-up & Onboarding (11 February – 11 March)
Team registration opens. Community support via Discord.

Phase 02: Competition Start (11 March – 18 March)
Starter kit & documentation. Demo dataset (30 documents). 100 sample questions released. Live leaderboard updates.

Phase 03: Active Competition (18 March – 25 March)
Full corpus unlocked (300+ docs). Full evaluation (1,000+ questions). Final submission: March 22.

Final Event: Awards (6 April – 9 April)
Winners announced at the MCS online event. Winning teams' pitches. Part of Dubai AI Week.
FAQ
Are the documents real or synthetic?
All documents are real, public legal materials. No synthetic or toy documents are used.
Which models and APIs can we use?
Any publicly accessible APIs for LLMs, embeddings, or search. Private, self-hosted, or undisclosed models are not allowed, to ensure reproducibility.
What telemetry must each answer include?
Every answer must include trace data: timing, TTFT, token usage, and retrieved chunk IDs. Missing telemetry causes a 10% penalty for that answer. Telemetry also supports anti-gaming and fairness.
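A minimal sketch of how the stated 10% penalty could be applied per answer; the multiplicative form is our reading of the rule, not a published formula:

```python
# Hypothetical per-answer penalty: 10% off when telemetry is incomplete.
def apply_telemetry_penalty(answer_score: float, telemetry_complete: bool) -> float:
    return answer_score if telemetry_complete else answer_score * 0.9
```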
Will the data and results be released?
Yes. The full private set, results, and winning approaches are released after the awards ceremony for reproducibility and community research.
Who organizes the competition?
The competition is organized in partnership with Machines Can See, as part of Dubai AI Week. This is a pilot edition, with future iterations planned.
Where can I ask questions?
All questions should be submitted via Discord. There are dedicated channels:
#questions: general inquiries
#tech-support: technical issues
#judgment-support: evaluation and scoring questions
Registered participants receive access to all relevant channels.
Do the organizers provide API keys or credits?
No. Participants must independently choose and access the models they wish to use (proprietary or open-source) and cover any associated costs. The competition is model-agnostic, and no API keys or usage credits are provided by the organizers.
Is there a required model or stack?
No. Participants may use:
Any LLM (OpenAI, Anthropic, open-source, etc.)
Any retrieval model
Any additional models integrated into their pipeline
There are no restrictions on architecture or tooling.
Does a bigger model or more tokens guarantee a win?
No, the competition is not won by using more tokens or more powerful models. Evaluation considers multiple factors:
Answer quality
Grounding (clear citations to exact documents or sections)
Retrieval quality
Performance metrics (including Time to First Token)
A highly advanced model without proper grounding or with poor latency will score lower than a well-engineered system with strong retrieval and citations. Success depends on the overall system design, not on model size or token usage (a weighted-score sketch follows below).
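A sketch of how such multi-factor scoring could combine, with invented weights; the real weighting is defined by the organizers and is not published in this FAQ:

```python
# Hypothetical weighted total; all weights are assumptions for illustration.
def total_score(answer_quality: float, grounding: float,
                retrieval: float, latency: float) -> float:
    return (0.40 * answer_quality
            + 0.25 * grounding
            + 0.20 * retrieval
            + 0.15 * latency)
```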
What is in the dataset?
The dataset consists of:
~300 real-world legal and corporate documents (primarily from the Dubai International Financial Centre, DIFC)
1,000 evaluation questions, divided into:
Public Test Set: 100 questions across 30 documents, available from March 11 for local testing
Private Test Set: 900 questions covering the full corpus, released on March 18
Question types include:
Single-document fact extraction
Clause and reference analysis
Multi-document reasoning
Negative and adversarial queries
A preliminary version of the public dataset is available via Discord in the #participant-announcements channel after registration.
Stay in the Loop
Not ready to compete? Get early access to the next challenge, research releases, and Dubai AI Week announcements.