NeuroBayes Diagnostics | Cloud & Data Engineering Portfolio

Problem to Solve

Despite considerable advances in the understanding of the neurobiology of Autism Spectrum Disorder (ASD), its method of diagnosis has barely changed since it was devised some 50 years ago. The process is cumbersome, lengthy (18 months to 3 years in the UK), and prone to subjective error.

Who is affected?
• Autism spectrum disorder (ASD) patients.
• ASD patient families.
• ASD patient carers.

Who are the End Users of This Tool?
• Medical Doctors
• General Practitioners (GPs)
• Clinicians
• Psychologists
• Psychiatrists
• Specially trained Pervasive Developmental Disorder (PDD) assessors.

Objective

The objective of this project is to deploy a diagnostic tool that analyses standardised medical data to calculate, within seconds, the probability of a subject having autism or another neurodivergent trait or disorder. The tool has already been developed beyond proof of concept to a working prototype, with excellent performance on simulated data. The goal is to deploy it as a specialised tool with access restricted to specialist and consultant medical professionals, following further model development, evaluation on real patient data, and publication in a peer-reviewed journal.

Medical professionals need a demonstrably accurate, objective, and cost-effective diagnostic tool that is quick without sacrificing rigour, to bring the diagnosis of autism and other forms of neurodivergence into the 21st century.

The deployed application is intended to reduce, and eventually eliminate, the long queue of patients waiting for diagnosis, together with the anxiety and distress that the wait causes patients and their families. It will enable the families of patients with a confirmed diagnosis to access specialised educational institutions and resources, interventions, support, and therapies in a timely manner.

It will bring the current antiquated and laborious system of autism and PDD diagnosis into the digital and computer age.

How does the tool work?

This Clinical Decision Support System (CDSS) functions via a secure, user-friendly frontend interface where clinicians input formatted medical records, questionnaire parameters, observational data, and, most importantly, quantitative laboratory-determined parameters from MRIs and from genetic and genomic tests. They then receive a predicted probability in near real time (expressed as a percentage) indicating the likelihood of a diagnosis of autism or another form of neurodivergence. Personally Identifiable Information (PII) is excluded at the input stage, and redundancies will be built in to deal with any PII that slips through.
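To make the idea of a predicted probability concrete, here is a minimal sketch of how evidence updates a prior by Bayes' rule. It is not the project's model: the real network is a causal Bayesian Network built in AgenaRisk, whereas this sketch assumes a much simpler structure in which each finding depends only on the condition. The finding names, the prior, and every conditional probability are hypothetical values chosen for illustration.

```python
from math import prod

# Hypothetical prior and conditional probability tables (CPTs).
# None of these numbers come from the real model; they only show the arithmetic.
PRIOR = 0.02  # P(condition = present)

# For each finding: (P(finding | present), P(finding | absent))
CPT = {
    "questionnaire_above_cutoff": (0.85, 0.10),
    "mri_marker_detected": (0.60, 0.05),
    "genetic_variant_detected": (0.30, 0.02),
}


def posterior(findings: dict[str, bool]) -> float:
    """Return P(condition = present | findings) by direct enumeration."""

    def likelihood(index: int) -> float:
        # index 0 selects the "present" column of the CPT, index 1 "absent".
        terms = []
        for name, observed in findings.items():
            p = CPT[name][index]
            terms.append(p if observed else 1.0 - p)
        return prod(terms)  # prod of an empty list is 1.0

    joint_present = PRIOR * likelihood(0)
    joint_absent = (1.0 - PRIOR) * likelihood(1)
    return joint_present / (joint_present + joint_absent)


if __name__ == "__main__":
    evidence = {"questionnaire_above_cutoff": True, "mri_marker_detected": True}
    print(f"{posterior(evidence) * 100:.1f}%")
```

With no findings supplied the function returns the prior unchanged; each additional positive finding raises the posterior, and each negative finding lowers it.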

Once the clinician submits the data, the payload triggers a serverless compute pipeline on AWS Lambda, which runs a custom Python container image stored in Amazon Elastic Container Registry (ECR) to evaluate the conditional probability logic of a pre-trained Bayesian Network model. This containerised serverless design scales automatically with incoming traffic and removes the need to run and maintain persistent servers.
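A minimal sketch of what the Lambda entry point might look like, assuming an API Gateway proxy integration (the request arrives as a JSON string under `body`, and the response is a dictionary with `statusCode`, `headers`, and `body`). The function `predict_probability` and the field `probability_percent` are hypothetical placeholders; the real inference step is not shown here.

```python
import json


def predict_probability(features: dict) -> float:
    """Hypothetical stand-in for the Bayesian Network inference step."""
    # A real deployment would query the pre-trained model here.
    return 0.5


def _error(status: int, message: str) -> dict:
    """Build an API Gateway proxy-style error response."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"error": message}),
    }


def lambda_handler(event, context):
    """Entry point invoked by AWS Lambda for each API Gateway request."""
    try:
        features = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _error(400, "Request body must be valid JSON")

    # Reject empty payloads and anything that is not a JSON object.
    if not isinstance(features, dict) or not features:
        return _error(400, "Request body must be a non-empty JSON object")

    probability = predict_probability(features)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"probability_percent": round(probability * 100, 1)}),
    }
```

Keeping the handler thin, with validation and response shaping separated from the inference call, means the model code can be tested locally without any AWS dependency.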

To safeguard data privacy and ensure that unauthorised PII and Protected Health Information (PHI) do not propagate down the data pipeline, incoming text inputs are scanned and sanitised with Amazon Comprehend Medical, which detects and flags sensitive medical identifiers before calculations run. Amazon Macie, which discovers sensitive data held in Amazon S3, provides a second check on anything that reaches storage.
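The sanitisation step could be sketched as follows. The call `detect_phi(Text=...)` on the boto3 `comprehendmedical` client returns a list of entities with `BeginOffset`, `EndOffset`, `Type`, and `Score` fields. The function names and the `min_score` cut-off of 0.5 are assumptions made for this example, not project settings.

```python
def redact_phi(text: str, entities: list[dict]) -> str:
    """Replace each detected span with a [TYPE] placeholder."""
    # Work from the end of the string so that earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = entity["BeginOffset"], entity["EndOffset"]
        text = text[:start] + f"[{entity['Type']}]" + text[end:]
    return text


def sanitise(text: str, min_score: float = 0.5) -> str:
    """Detect PHI with Amazon Comprehend Medical, then redact it."""
    import boto3  # imported lazily so redact_phi can be exercised offline

    client = boto3.client("comprehendmedical")
    response = client.detect_phi(Text=text)
    # min_score is a hypothetical confidence cut-off for this sketch.
    confident = [e for e in response["Entities"] if e["Score"] >= min_score]
    return redact_phi(text, confident)
```

Separating the pure redaction function from the AWS call keeps the string handling testable without credentials or network access.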

The entire application infrastructure is architected under the AWS Shared Responsibility Model to comply with international healthcare privacy and data governance frameworks. All operational inputs and telemetry are written to an Amazon S3 bucket configured with Object Lock. This provides an immutable, write-once-read-many (WORM) audit log, so that the records required under the GDPR and the Data Protection Act 2018 (in the UK), as well as HIPAA audit trails (in the USA), are preserved for their legally mandated retention periods.
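As a rough sketch of the audit-log write, the snippet below builds the arguments for a boto3 `put_object` call that stores a record in compliance mode with a retain-until date. It assumes Object Lock is already enabled on the bucket. The six-year `RETENTION_DAYS` value and the function names are hypothetical; the actual retention period would need to come from legal and regulatory advice.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention period; the real value depends on legal requirements.
RETENTION_DAYS = 365 * 6


def audit_put_args(bucket: str, key: str, body: bytes, now: datetime) -> dict:
    """Build put_object arguments for a locked, compliance-mode audit record."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # S3 requires a checksum (or Content-MD5) on Object Lock uploads.
        "ChecksumAlgorithm": "SHA256",
        # COMPLIANCE mode: the object cannot be overwritten or deleted by
        # any user until the retain-until date has passed.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=RETENTION_DAYS),
    }


def write_audit_record(bucket: str, key: str, body: bytes) -> None:
    """Write one immutable audit record to S3."""
    import boto3  # imported lazily so audit_put_args can be tested offline

    s3 = boto3.client("s3")
    s3.put_object(**audit_put_args(bucket, key, body, datetime.now(timezone.utc)))
```

A default retention rule configured on the bucket itself would be an alternative to setting the retain-until date on every object.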

Cloud Architectural Flow Description (Provisional)

Clinician App Frontend
│
▼ ( Secure HTTPS / REST Requests + Auth Token )
Amazon API Gateway
│
├─► ( Validates JWT Token ) ─► Amazon Cognito ◄── ( User Realm and Auth )
│
▼ ( Authorised Payload )
Amazon Comprehend Medical ◄── ( PII Detection and Redaction )
│ ( Scans unstructured medical text in real-time )
│
▼ ( Sanitised Medical Data Payload )
AWS Lambda ( Serverless Orchestrator - Bayesian Engine )
│
├─► ( Pulls custom model environment ) ──► Amazon ECR
│ ( Container Image: Python Runtime + Model Weights )
│
├─► ( Executes Bayesian Inference Loop )
│ ( Conditional probability calculations )
│
├─► ( Writes probability outputs ) ───────► Amazon DynamoDB
│ ( Real-time diagnostic scores )
│
├─► ( Triggers clinical alerts ) ─────────► Amazon SNS
│ │ ( Probability threshold breach detection )
│ └──► Clinician Email Notification
│
▼ ( Streams execution traces and audit telemetry )
Amazon CloudWatch
│ ( Request latency, inference duration, error tracking )
│
▼ ( Automated Log Export Pipeline )
Amazon S3 ◄── ( WORM Compliance via Object Lock )
│ ( Immutable audit trail - GDPR / DPA 2018 / HIPAA )
└─► Healthcare Compliance Vault
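
The DynamoDB write and the SNS threshold alert in the flow above could be sketched as follows. The table attributes (`request_id`, `created_at`, `probability`), the 0.80 alert threshold, and the function names are all hypothetical. One detail worth noting is that the boto3 DynamoDB resource API does not accept Python `float` values, so the probability is converted to `Decimal`.

```python
from decimal import Decimal

# Hypothetical alert threshold; the real value would be set clinically.
ALERT_THRESHOLD = 0.80


def build_score_item(request_id: str, probability: float, timestamp: str) -> dict:
    """Build the DynamoDB item that records one diagnostic score."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return {
        "request_id": request_id,
        "created_at": timestamp,
        # Convert via str so the Decimal holds the rounded decimal value.
        "probability": Decimal(str(round(probability, 4))),
    }


def needs_alert(probability: float, threshold: float = ALERT_THRESHOLD) -> bool:
    """Return True when the score reaches the alert threshold."""
    return probability >= threshold


def persist_and_notify(item: dict, table_name: str, topic_arn: str) -> None:
    """Store the score, then publish an SNS alert if the threshold is met."""
    import boto3  # imported lazily so the helpers above can be tested offline

    boto3.resource("dynamodb").Table(table_name).put_item(Item=item)
    if needs_alert(float(item["probability"])):
        boto3.client("sns").publish(
            TopicArn=topic_arn,
            Subject="Diagnostic score above threshold",
            # Only the request identifier is sent; no patient data.
            Message=f"Request {item['request_id']} exceeded the alert threshold.",
        )
```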

Architecture Diagram

Provisional Cloud Architecture Diagram

N.B. - This architecture is yet to be finalised - under review.

Software, Cloud, and Data Engineering Trade-offs

Under review - coming soon.

Cost Optimisations

Under review - coming soon.

Technical Details

Coming soon.

🏆 Proof of Concept and Early Prototype

Project: A causal Bayesian Network model for early diagnosis of autism

N.B. - This model requires Agena.AI / AgenaRisk software as a prerequisite to run.

🏆 Remaining Major Milestones

1. Use further Agena.AI / AgenaRisk utilities and capabilities to obtain an even more robust model.
2. Further refinement of model capabilities: for example, MRI scan types currently modelled as node states will be expanded, so that longitudinal MRI, ROI-based volumetry, tensor-based morphometry, surface-based morphometry, and voxel-based morphometry are each modelled as a full node.
3. Real patient data will replace the synthetic data used so far, and the resulting model performance metrics (accuracy, precision, F1-score, sensitivity, and specificity) will be documented.
4. An object-oriented Multiobject Bayesian Network Model (MBNM) version will be built and its performance compared with that of the improved mainstream version.
5. Detailed publication in a peer-reviewed scientific journal.
6. Submit the model to the Medicines and Healthcare products Regulatory Agency (MHRA) and the National Institute for Health and Care Excellence (NICE) for approval for clinical use in the UK. In the USA, approval would be sought from the Food and Drug Administration (FDA) and the Centers for Medicare & Medicaid Services (CMS).
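
The performance metrics named in milestone 3 all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical and the counts in the usage example are made up, not results from this project.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute standard binary classification metrics from confusion counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a metric is undefined.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),           # positive predictive value
        "sensitivity": ratio(tp, tp + fn),         # recall, true positive rate
        "specificity": ratio(tn, tn + fp),         # true negative rate
        "f1_score": ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in diagnostic_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.3f}")
```

For a diagnostic tool, sensitivity and specificity are usually reported alongside accuracy, because accuracy alone can look high when the condition is rare in the evaluation sample.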

NeuroBayes Diagnostics - Autism and Other Neurodiversity Diagnostic Tool