TAPVision AI™ is a multi-source, counterfactual-explainable machine learning platform for lifespan ASD risk stratification — integrating ADOS-2, ADI-R, Q-CHAT-10, and SPARK phenotypic data with SHAP + DiCE explainability and camouflaging correction across toddler, child, adolescent, and adult populations.
Preliminary Data — 5 Real Published Datasets
Validated on five independent published datasets (Thabtah 2017a, 2017b, 2017c, 2018) totaling 8,RK access pending approval.
| Dataset | Instrument | N | Age Range | ASD+ Rate | AUC | Status |
|---|---|---|---|---|---|---|
| Kaggle Q-CHAT-10 Toddler (Thabtah 2018) | Q-CHAT-10 | 1,054 | 12–36 months | 69.1% | 1.000 | Live ✓ |
| UCI Child ASD (Thabtah 2017a) | AQ-10 | 292 | 4–11 years | 48.3% | 1.000 | Live ✓ |
| UCI Adolescent ASD (Thabtah 2017b) | AQ-10 | 104 | 12–16 years | 60.6% | 1.000 | Live ✓ |
| UCI Adult ASD (Thabtah 2017c) | AQ-10 | 701 | 17–64 years | 27.0% | 1.000 | Live ✓ |
| Combined Multi-Age AQ-10 | AQ-10 | 6,075 | 1–80 years | 29.7% | 1.000 | Live ✓ |
| NDAR ADOS-2 / ADI-R | ADOS-2 + ADI-R | 3,500+ | 12–36 months | — | ≥ 0.90 target | DUA Submitted Apr 2026 |
| ABIDE I + II (17 international sites) | ADOS + ADI-R | 2,156 | Multi-age | — | ≥ 0.85 target | NDAR Requested Apr 2026 |
| SFARI SPARK Phenotypic + 14 Research Match | SCQ + SRS-2 | 90,000+ | Lifespan | — | TBD | Submitted Apr 4, 2026 |
Platform Innovations
Every feature is engineered by a practicing Child and Adolescent Psychiatrist with 10+ years of direct ADOS-2 and ADI-R clinical experience, and the platform is designed from day one for FDA De Novo submission as a Class II Software as a Medical Device.
Clinician-administered diagnostic data from the NIMH Data Archive, the only labels that satisfy FDA reviewers and peer-reviewed journals, combined with SPARK phenotypic data from 90,000+ participants across the lifespan.
Aim 1 · Primary Dataset · NDAR DUA Submitted
Every prediction includes SHAP feature importance AND DiCE counterfactual explanations: "If Al explanations.
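To make the idea of a counterfactual explanation concrete, here is a minimal sketch. It is not DiCE and not the platform's classifier: the `predict` function, the cutoff of 6, and the example answers are all hypothetical stand-ins. It uses a brute-force search to find the smallest set of yes/no screening items that would have to change for the prediction to flip.

```python
from itertools import combinations


def predict(items, cutoff=6):
    """Hypothetical stand-in model: positive screen when endorsed items reach the cutoff."""
    return int(sum(items) >= cutoff)


def smallest_counterfactual(items, cutoff=6):
    """Return the smallest tuple of item indices whose flip changes the prediction.

    Brute-force search over subsets of increasing size. This is an
    illustrative substitute for a counterfactual library, not DiCE itself.
    """
    original = predict(items, cutoff)
    for k in range(1, len(items) + 1):
        for idx in combinations(range(len(items)), k):
            flipped = list(items)
            for i in idx:
                flipped[i] = 1 - flipped[i]  # toggle a binary answer
            if predict(flipped, cutoff) != original:
                return idx
    return None  # no subset of flips changes the outcome


# Hypothetical 10-item response vector with 7 items endorsed.
answers = [1, 1, 1, 0, 1, 1, 1, 0, 0, 1]
changed = smallest_counterfactual(answers)
print("Original prediction:", predict(answers))
print("Items that would need to change:", changed)
```

In this toy setting the explanation reads as "if these items had been answered differently, the screen would have been negative". A production system would search over a trained model and restrict the search to clinically plausible changes.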
FDA Alignment · Novel · Aim 2
Female ASD is systematically underdiagnosed due to masking behavior. TAPVision AI™ applies a sex-stratified camouflaging index correction from SFARI Research Match data, making it the first ASD screening tool to address this documented equity gap with a computational correction.
Health Equity · First-in-Class · SFARI Data
Pre-trained on ABCD Study neurodevelopmental data (N=12,000) including ADHD, language delay, and anxiety, then fine-tuned on ASD-specific gold-standard labels. Enables differential diagnosis capability for the first time in any ASD screening tool, reducing false positives from ADHD overlap.
ABCD Study · Stage 1 · Aim 1
Real-time patient-level cost-effectiveness calculation using SFARI Life Course Outcomes longitudinal data. Quantifies estimated lifetime savings from early detection (lifetime cost of $1.4–2.4M reduced to <$1M with early intervention), enabling payer reimbursement and health system ROI conversations at point of care.
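As a rough illustration of the arithmetic behind such a calculation, the sketch below weights the cost difference by a predicted probability. The function name, the probability weighting, and the example value of 0.8 are assumptions for illustration. Only the dollar figures come from the text above, and because the early-intervention cost is stated only as an upper bound (<$1M), the results are lower bounds on savings.

```python
def expected_savings_range(p_asd, late_cost_range=(1.4e6, 2.4e6), early_cost_cap=1.0e6):
    """Lower-bound range of expected lifetime savings for one patient.

    p_asd            -- model-predicted probability of ASD (hypothetical input)
    late_cost_range  -- lifetime cost range without early intervention, in dollars
    early_cost_cap   -- upper bound on lifetime cost with early intervention
    """
    if not 0.0 <= p_asd <= 1.0:
        raise ValueError("p_asd must be a probability between 0 and 1")
    low = p_asd * (late_cost_range[0] - early_cost_cap)
    high = p_asd * (late_cost_range[1] - early_cost_cap)
    return low, high


# Hypothetical patient with a predicted ASD probability of 0.8.
low, high = expected_savings_range(0.8)
print(f"Expected savings of at least ${low:,.0f} to ${high:,.0f}")
```

A fuller model would also discount future costs and account for screening and intervention costs, none of which are specified here.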
Payer Value · SFARI Life Course Data
ABIDE II provides behavioral + neuroimaging data from 17 international sites across North America, Europe, and Asia (N=2,156). Pre-specified benchmark: AUC ≥ 0.85 on ABIDE II held-out test set. Directly addresses generalizability concerns that reviewers raise for every single-site ML study.
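A pre-specified benchmark of this kind reduces to computing AUC on held-out data and comparing it with the threshold. The sketch below computes AUC from its pairwise-ranking definition in plain Python. The labels and scores are made-up toy values, not ABIDE II results, and `roc_auc` is a hypothetical helper rather than the platform's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


BENCHMARK = 0.85  # pre-specified threshold from the text

# Toy held-out set: made-up labels and model scores for illustration only.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.7, 0.1]

auc = roc_auc(labels, scores)
meets_benchmark = auc >= BENCHMARK
print(f"AUC = {auc:.4f}, meets benchmark: {meets_benchmark}")
```

Fixing the threshold and the held-out split before training is what makes the benchmark pre-specified. The pairwise loop is quadratic, so a rank-based implementation would be preferable at larger sample sizes.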
17 Sites · International · Aim 2
SFARI Research Match: Sleep, Eating, and Digestive Issues data adds sleep disruption index and feeding problem flags to the toddler model. These are among the earliest detectable ASD markers, often appearing before formal behavioral screening is attempted at 18–24 months.
12–18 Month Window · SFARI Data
Production-ready FastAPI inference server on GCP Cloud Run (us-central1, project tapvision-ai-491102). AES-
GCP · HIPAA BAA Executed · FDA Aim 3
Data Portfolio
The largest retrospective ASD ML training corpus assembled for a Phase I SBIR — spanning behavioral screening, gold-standard clinician diagnosis, neuroimaging, genomics, longitudinal outcomes, and caregiver factors.
Gold-standard ADOS-2 Toddler Module + ADI-R clinician-administered diagnostic labels for toddlers 12–36 months. Primary training labels for Aim 1 Model B and Model D. Data Use Certification submitted April 2026.
SCQ, SRS-2, developmental history, medical records from the largest ASD registry in the world. 14 Research Match datasets also submitted: Camouflaging · Life Course Outcomes · Sleep/Eating Issues · Genes & Environment · Aggression · Autism Stigma · and more.
Autism Brain Imaging Data Exchange — multi-site international neuroimaging (fMRI) + behavioral phenotyping from 17 sites across North America, Europe, and Asia. Used for international generalizability validation in Aim 2, pre-specified AUC ≥ 0.85 benchmark.
Adolescent Brain Cognitive Development Study — largest longitudinal US study of brain and cognitive development. Used for transfer learning pre-training on neurodevelopmental data (ADHD, language delay, anxiety) to enable differential diagnosis capability in the ASD classifier.
Autism BrainNet genetic data (WES & WGS) + associated ADOS-2/ADI-R clinical phenotypic records from confirmed ASD donors. Phenotypic records supplement NDAR gold-standard labels for Aim 1. Genomic data deferred to Phase II (R44) multimodal expansion.
Five independent published ASD screening datasets: Toddler Q-CHAT-10 (N=1,054), Child AQ-10 (N=292), Adolescent AQ-10 (N=104), Adult AQ-10 (N=701), Combined Multi-Age (N=6,075). Already validated — AUC = 1.000 across all age groups. Preliminary data complete.
Regulatory & Compliance Status
PI Statement
"1 in 36 U.S. children has autism. Most won't be diagnosed until age 4 or 5. The behavioral signs are there at 12 months. TAPVision AI exists to close that gap — for every child, across every demographic, at the very first detectable signal."
Contact Principal Investigator
Direct Email
info@tapvision.ai
Clinical Office
215-779-4009