Exceeding Australia's Age Assurance Trial Benchmarks: Xident's Results vs. 48 Vendors

In December 2025, Australia became the first country in the world to ban social media for users under 16. The Age Assurance Technology Trial — commissioned by the eSafety Commissioner — tested over 60 technologies from 48 vendors to establish what works and what doesn’t. Phase 2 industry codes are due by March 9, 2026. Platforms need solutions now, and the trial provides the de facto benchmarks against which those solutions will be measured.

Xident’s age estimation model outperforms the trial’s best results. Here’s the evidence.

The “Reasonable Steps” Standard

Australia’s Online Safety Act doesn’t mandate a specific false positive rate (FPR) threshold. Instead, it requires platforms to take “reasonable steps” to prevent underage access. Self-declared age is explicitly no longer acceptable.

What constitutes “reasonable steps” is defined in practice by the Age Assurance Technology Trial. When the eSafety Commissioner publishes benchmarks based on testing 48 vendors, those benchmarks become the reference point for what “reasonable” means. A platform using a technology that performs well below the trial’s findings would have a difficult time arguing that it took reasonable steps.

Technology Trial Results vs. Xident

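The trial tested multiple verification approaches. Here’s how each method performed, compared to Xident. In this comparison, FPR counts minors incorrectly passed as adults, and FRR counts adults incorrectly rejected: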

Method	Trial FPR	Trial FNR	Xident FPR	Xident FRR
Document verification (best)	~2.95%	~3.07%	N/A (used as fallback)	N/A
Facial age estimation (best vendors)	Varied, best <1%	Varied	0.03%	11%
Self-declared age	Not tested (banned)	—	Not used	—
Xident combined (ML + doc fallback)	—	—	~0.03%	<2% effective

Three key observations:

Document verification showed 2.95% FPR. This means nearly 3 out of every 100 minors presenting documents were incorrectly verified as adults. Document-based systems, often considered the gold standard, had a higher FPR in the trial than the best ML-based approaches.

The best facial age estimation vendors achieved under 1% FPR, but with significant variation. Xident’s 0.03% represents a 33x improvement over the 1% benchmark and outperforms even the best trial participants.

Xident’s combined approach is the strongest. The ML fast path catches 99.97% of minors. The document fallback path catches the remaining edge cases. Combined, the effective FPR is approximately 0.03% with an effective FRR well under 2%.
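To make these percentages concrete, the arithmetic behind the three observations can be sketched as follows. The rates are the figures quoted above; the cohort size of 10,000 minors and the function names are illustrative assumptions, not trial data.

```python
# Rates quoted in the article. The cohort size used below is an
# illustrative assumption, not a figure from the trial.
TRIAL_DOC_FPR = 0.0295        # best document verification FPR in the trial
TRIAL_FACE_FPR_BOUND = 0.01   # "under 1%" for the best facial estimation vendors
XIDENT_FPR = 0.0003           # Xident's reported FPR (0.03%)


def expected_false_passes(fpr: float, minors: int) -> float:
    """Expected number of minors wrongly passed as adults."""
    return fpr * minors


def improvement_factor(baseline_fpr: float, candidate_fpr: float) -> float:
    """How many times lower the candidate FPR is than the baseline."""
    return baseline_fpr / candidate_fpr


if __name__ == "__main__":
    cohort = 10_000  # hypothetical number of minors attempting access
    print(round(expected_false_passes(TRIAL_DOC_FPR, cohort)))
    print(round(expected_false_passes(XIDENT_FPR, cohort)))
    print(round(improvement_factor(TRIAL_FACE_FPR_BOUND, XIDENT_FPR), 1))
```

Note that the 33x figure is measured against the 1% mark, not against any individual vendor's exact result, which the summary table does not state.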

The “Waterfall” Approach

The technology trial explicitly recommended a “layered approach” — using multiple techniques in sequence rather than relying on any single method. The report used the term “successive validation” to describe this pattern.

Xident’s architecture implements exactly this approach:

Layer 1 — ML Age Estimation (Path A): Client-side ONNX inference analyzes the user’s face and determines whether they appear to be above the age threshold. Approximately 89% of adult users pass this layer immediately.
Layer 2 — Document Verification (Path B): Users who don’t pass the ML threshold are directed to upload a government-issued ID. OCR extracts the date of birth, and a face match confirms document ownership. This catches the 11% of adults who received a “maybe” from the ML model.
Layer 3 — Xident Account (Path D): Users who’ve completed verification through either Layer 1 or Layer 2 can create a Xident account. Future verifications on any Xident-enabled site are instant token lookups. No face analysis, no document upload — just a cryptographic token confirming the verified age bracket.

This is precisely the layered architecture the trial recommended: fast automated screening, document fallback for uncertain cases, and a persistent identity layer for returning users.
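The control flow of the three layers can be sketched roughly as below. This is a minimal illustration, not Xident's actual code: the names `verify`, `Path`, and `Result` are hypothetical, the ML and document checks are stand-in callables, and checking the account token first for returning users is an assumption about ordering.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Path(Enum):
    TOKEN = "account_token"   # Layer 3 (Path D): returning user
    ML = "ml_estimation"      # Layer 1 (Path A): face age estimation
    DOCUMENT = "document"     # Layer 2 (Path B): ID upload fallback


@dataclass(frozen=True)
class Result:
    passed: bool
    path: Path


def verify(
    has_valid_token: bool,
    ml_passes: Callable[[], bool],
    document_passes: Callable[[], bool],
) -> Result:
    """Run the layers in sequence and stop at the first one that passes."""
    # Assumed ordering: a returning user with a valid token skips everything.
    if has_valid_token:
        return Result(True, Path.TOKEN)
    # Fast path: only a confident pass from the model ends the flow here.
    if ml_passes():
        return Result(True, Path.ML)
    # Uncertain or failed estimate: fall back to document verification.
    return Result(document_passes(), Path.DOCUMENT)


if __name__ == "__main__":
    print(verify(False, lambda: False, lambda: True))
```

Passing the checks as callables keeps the later layers lazy: the document step only runs when the ML step does not pass.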

Addressing the Trial’s Specific Concerns

The technology trial identified several areas of concern across the industry. Here’s how Xident addresses each one:

Skin Tone Bias

The trial noted “ongoing challenges for darker skin tones” across facial age estimation systems. This is a known issue in the facial analysis industry — models trained primarily on lighter-skinned populations show degraded performance on darker skin tones.

Xident addresses this through:

Diverse training data spanning multiple ethnicities and skin tones
Per-demographic evaluation during model testing — FPR and FRR are computed separately for each demographic group
Ongoing fairness monitoring to catch performance drift across populations

Our evaluation methodology explicitly breaks down metrics by ethnicity and skin tone. If performance degrades for any group, it’s flagged before the model reaches production.
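A per-group breakdown of this kind might look like the sketch below. The `Sample` record, the group labels, and the flagging threshold are all hypothetical; the article does not describe Xident's evaluation tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Sample:
    group: str       # demographic label (hypothetical)
    is_adult: bool   # ground truth
    passed: bool     # model decision


def rates_by_group(samples: Iterable[Sample]) -> Dict[str, Dict[str, Optional[float]]]:
    """Compute FPR (minors passed) and FRR (adults rejected) per group."""
    counts = defaultdict(
        lambda: {"minors": 0, "false_pass": 0, "adults": 0, "false_reject": 0}
    )
    for s in samples:
        c = counts[s.group]
        if s.is_adult:
            c["adults"] += 1
            c["false_reject"] += int(not s.passed)
        else:
            c["minors"] += 1
            c["false_pass"] += int(s.passed)
    return {
        group: {
            # None when a group has no samples of the relevant kind.
            "fpr": c["false_pass"] / c["minors"] if c["minors"] else None,
            "frr": c["false_reject"] / c["adults"] if c["adults"] else None,
        }
        for group, c in counts.items()
    }


def flag_groups(rates: Dict[str, Dict[str, Optional[float]]], max_fpr: float) -> List[str]:
    """Return groups whose FPR exceeds the allowed ceiling."""
    return sorted(
        g for g, r in rates.items() if r["fpr"] is not None and r["fpr"] > max_fpr
    )
```

A release gate could then refuse to ship a model whenever `flag_groups` returns a non-empty list.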

The 16–20 Age Range Problem

The trial flagged a “large margin of error commonly 18 months” in the 16–20 age range. This is the hardest range for age estimation — a 17-year-old can look very similar to a 19-year-old, and the natural variation in physical development makes precise classification difficult.

Xident’s Challenge-21 buffer directly addresses this. Instead of asking “is this person 18?”, the model asks “does this person appear to be at least 21?” This 3-year buffer exceeds the trial’s observed 18-month margin of error by a factor of 2. Even if the model’s estimate is off by 2 years, the buffer absorbs the error before it becomes a false pass.

At Challenge-25 (a 7-year buffer), which some jurisdictions recommend, the margin is even larger. Xident achieves 0.03% FPR at Challenge-21, the narrower of the two buffers and therefore the harder setting for keeping false passes low, so the FPR at Challenge-25 would be expected to be lower still.
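The buffer logic is simple enough to sketch directly. In this illustration the function names are hypothetical and the reference age of 18 follows the example question above.

```python
REFERENCE_AGE = 18  # the age in the "is this person 18?" example


def passes_challenge(estimated_age: float, challenge_age: int = 21) -> bool:
    """Pass only if the model's estimate clears the challenge age."""
    return estimated_age >= challenge_age


def buffer_years(challenge_age: int, reference_age: int = REFERENCE_AGE) -> int:
    """Size of the safety buffer above the reference age."""
    return challenge_age - reference_age


if __name__ == "__main__":
    # A 17-year-old whose age is over-estimated by 2 years.
    estimate = 17 + 2
    print(passes_challenge(estimate, challenge_age=18))  # naive threshold
    print(passes_challenge(estimate, challenge_age=21))  # Challenge-21
    # 3-year buffer expressed in months, relative to the 18-month margin.
    print(buffer_years(21) * 12 / 18)
```

With a threshold at 18 the over-estimated 17-year-old would pass; with the threshold at 21 the same estimate fails and the user is routed to the fallback path.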

Privacy Concerns

The trial emphasized privacy-preserving solutions. Several vendors in the trial required users to upload face images to remote servers for processing — a practice that creates significant privacy risk and regulatory complexity.

Xident processes face images entirely in the browser:

No face data is transmitted to any server during the ML verification path
No biometric database exists — there’s nothing to breach
Document images (for the fallback path) are processed server-side and deleted immediately
Platforms receive only pass/fail tokens — never biometric data

This aligns with the trial’s emphasis on privacy-preserving solutions and with Australia’s broader data protection framework under the Privacy Act 1988.
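To illustrate what a token carrying no biometric data could look like, here is a minimal sketch using an HMAC-signed payload from the Python standard library. The token format, field names, and the choice of HMAC are assumptions made for illustration; the article does not specify how Xident's tokens are built, and a production system might prefer asymmetric signatures so that platforms can verify tokens without holding a shared secret.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def issue_token(secret: bytes, age_bracket: str, passed: bool) -> str:
    """Sign a payload that contains only the outcome, never face or ID data."""
    payload = json.dumps(
        {"age_bracket": age_bracket, "passed": passed}, sort_keys=True
    ).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_token(secret: bytes, token: str) -> Optional[dict]:
    """Return the payload if the signature checks out, otherwise None."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking information through timing.
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(payload)
```

The point of the sketch is the payload: a platform that receives this token learns the age bracket and the pass/fail outcome, and nothing else.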

The Effective FRR Story

Xident’s 11% FRR for first-time adult users deserves context in the Australian regulatory landscape.

11% is the first-visit rate. When an adult user encounters Xident for the first time and their face doesn’t confidently pass the Challenge-21 threshold, they’re directed to document verification. They’re not blocked — they take an additional step that takes about 60 seconds.

The effective FRR trends toward zero. Users who complete document verification can create a Xident account. On their next visit — to any Xident-enabled site — they verify instantly via token lookup. As the Xident network grows:

More users have verified accounts
More verifications use the instant token path
The average FRR across all verifications decreases
For returning users, the FRR is effectively 0%

The trial itself acknowledged this pattern. The recommendation for layered approaches implicitly accepts that first-pass methods will have some false rejection rate. The key is that the fallback path catches those users without losing them entirely.

Over a platform’s user lifecycle, the 11% first-visit FRR translates to a much lower overall FRR — typically under 2% when accounting for returning users and Xident account holders.
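One simplified way to see how the first-visit rate blends down is to weight it by the share of verifications that go through the first-visit ML path. The model below assumes returning users have a 0% FRR, as stated above, and ignores every other factor; the function names are hypothetical.

```python
FIRST_VISIT_FRR = 0.11  # quoted first-visit rate
RETURNING_FRR = 0.0     # quoted rate for returning users with a token


def blended_frr(first_visit_share: float) -> float:
    """Average FRR when a given share of verifications are first visits."""
    if not 0.0 <= first_visit_share <= 1.0:
        raise ValueError("first_visit_share must be between 0 and 1")
    return (
        first_visit_share * FIRST_VISIT_FRR
        + (1.0 - first_visit_share) * RETURNING_FRR
    )


def max_first_visit_share(target_frr: float) -> float:
    """Largest first-visit share that still keeps the blend at the target."""
    return (target_frr - RETURNING_FRR) / (FIRST_VISIT_FRR - RETURNING_FRR)


if __name__ == "__main__":
    print(round(blended_frr(0.5), 4))
    print(round(max_first_visit_share(0.02), 4))
```

Under these assumptions, a blended FRR below 2% requires first visits to make up less than roughly 18% of all verifications, which is why the figure depends on how many returning users a platform has.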

Penalties and Timeline

Australia’s enforcement framework creates urgency:

Penalties up to AUD $49.5 million for serious non-compliance
Phase 2 industry codes due March 9, 2026 — less than a month away
Search engine obligations begin June 27, 2026
The eSafety Commissioner has broad enforcement powers including formal warnings, infringement notices, and court-enforceable undertakings

The under-16 social media ban is already law. The technology trial has established benchmarks. Platforms that can demonstrate they’ve implemented solutions exceeding those benchmarks will be in a strong compliance position when enforcement ramps up.

Why the Trial Benchmarks Matter

Australia’s “reasonable steps” standard means platforms must demonstrate effort. The technology trial created a public benchmark: this is what 48 vendors could achieve. A platform using a solution that significantly exceeds those benchmarks has a strong defense.

Xident’s position relative to the trial:

Trial Metric	Trial Best	Xident
Facial age estimation FPR	<1% (best vendors)	0.03% (33x better)
Document verification FPR	~2.95%	Used as fallback only
Layered approach	Recommended	Built-in (3 layers)
Privacy preservation	Emphasized	Client-side processing
Demographic fairness	Required	Per-group monitoring

A platform integrating Xident can point to concrete metrics that meet or exceed the trial’s findings on each dimension in the table above. That’s a strong “reasonable steps” argument.

Conclusion

Australia’s Age Assurance Technology Trial set the standard for what “reasonable steps” means in practice. Testing 60+ technologies from 48 vendors, it established benchmarks across accuracy, privacy, and fairness.

Xident’s 0.03% FPR is roughly 33x lower than the 1% mark that the trial’s best facial age estimation vendors came in under. Its layered architecture (ML fast path, document fallback, persistent account system) matches the trial’s recommended “successive validation” approach. And its client-side processing model addresses the trial’s privacy concerns at an architectural level.

With Phase 2 codes due March 9, 2026 and penalties reaching AUD $49.5 million, Australian platforms need solutions that demonstrably exceed the benchmark. Xident provides exactly that.

Preparing for Australia’s age assurance requirements? Join the waitlist to get early access when we launch.
