Agentic AI, evals, and reliability engineering for high-stakes domains

Agentic AI Systems That Hold Up in Production

I design, evaluate, and ship AI systems for healthcare, legal, and enterprise workflows. As the creator of Continuous Alignment Testing (CAT), I bring mathematical rigor, evaluation discipline, and production reliability to systems deployed at Mayo Clinic, eBay, Trust & Will, and Arrive Health.

Agentic Architectures

Production-grade RAG, orchestration, and workflow design

Evals & Reliability

Validators, monitoring, regression thinking, and CAT

High-Stakes Domains

Healthcare, legal, commerce, identity, and payments

Featured Work

Public writing, production case studies, and systems work centered on agentic architecture, evaluation, and reliability engineering.

20 min read

AI Engineering | Sep 11, 2024

Reliability Testing for LLM-Based Systems

The CAT Framework White Paper

Comprehensive framework for conducting reliability tests on LLM systems using validators, verifiers, and reliability tensors. Includes binomial experiments, generative conditional validators, and production monitoring strategies.

#CAT Framework #Reliability Testing #LLM Systems
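
As a rough illustration of the binomial-experiment idea mentioned in the description above, the sketch below treats each validator run as a pass/fail trial and puts a confidence interval around the observed pass rate. The function name `wilson_interval`, the sample counts, and the choice of a Wilson score interval are assumptions made for this example. They are not taken from the white paper.

```python
import math


def wilson_interval(passes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial pass rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= passes <= trials:
        raise ValueError("passes must be between 0 and trials")

    p_hat = passes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical experiment: a validator passed 96 of 100 sampled LLM responses.
low, high = wilson_interval(passes=96, trials=100)
print(f"observed pass rate: 0.96, 95% interval: [{low:.3f}, {high:.3f}]")
```

An interval like this makes it easier to see whether a change in pass rate between two runs is larger than sampling noise, which is one plausible way to frame a reliability regression.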

10 min read

AI Engineering | Sep 11, 2024

Agentic Architecture

Structuring Agent-Tool-User Interactions

A structured approach to modeling interactions between agents and tools through chat histories, cycles, and graph-theoretic representations. Foundational concepts for understanding agentic AI systems.

#Agentic AI #Architecture #Graph Theory

15 min read

Technical Deep Dive | May 11, 2024

Formalized Structures: The Algebra of Agentic Architectures

Mathematical Formalization of Tool-Using AI Systems

Mathematical formalization of agentic AI through tool call matrices, selection masks, and tensor operations. Defines the algebra underlying agent-tool interactions with formal propositions and proofs.

#Mathematics #Agentic AI #Formal Methods

eBay

2025

AI Engineer (via Artium AI)

Enterprise-scale Agentic AI system optimizing legacy seller workspaces.

AI Capabilities

Agentic AI Architecture, Legacy System Integration, Intelligent Workflow Automation, Enterprise-Scale AI Deployment

Key Highlights

Architected agentic AI workflows for enterprise-scale deployment
Integrated AI capabilities into legacy seller workspace systems
Deployed production AI at massive scale

Python, TypeScript, OpenAI, Agentic Frameworks, Enterprise APIs

Mayo Clinic

2024

AI Engineer (via Artium AI)

Multi-agent RAG system accelerating medical research and discovery for one of the world's leading healthcare institutions.

AI Capabilities

Multi-Agent Systems, Retrieval-Augmented Generation (RAG), Medical Knowledge Synthesis, Production AI Monitoring

Key Highlights

Designed and implemented multi-agent architecture for medical research
Built retrieval-augmented generation (RAG) system for knowledge synthesis
Enabled researchers to accelerate discovery through AI-powered insights

Python, LangChain, Vector Databases, OpenAI GPT-4, RAG Architecture

Trust & Will

2024

AI Engineer (via Artium AI)

Attorney-in-the-loop automation system digitizing estate planning practices with human oversight and AI efficiency.

AI Capabilities

Human-in-the-Loop AI, Legal Document Automation, Workflow Orchestration, Compliance Monitoring

Key Highlights

Designed human-in-the-loop AI architecture for legal workflows
Automated estate planning document generation and processing
Maintained attorney oversight while increasing efficiency

Python, OpenAI GPT-4, Workflow Orchestration, Document Automation

Arrive Health

2024

AI Engineer (via Artium AI)

Complex AI flows condensing critical clinical information to support patient care and healthcare decision-making.

AI Capabilities

Clinical Data Processing, AI-Driven Summarization, Information Extraction, Healthcare AI Systems

Key Highlights

Built AI systems for clinical information processing
Designed complex information extraction pipelines
Enabled AI-driven summarization for critical patient support

Python, OpenAI, Clinical Data Processing, Information Extraction

AI/ML

Graph Theoretic Multi-Agent Dynamics

My First Python Script

Mathematical simulation of autonomous agents reaching consensus through graph-based communication. Graduate thesis work in Applied Mathematics.

Python, NumPy, SciPy, Matplotlib, +1 more
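
For readers unfamiliar with graph-based consensus, here is a minimal sketch of the general idea: each agent repeatedly nudges its value toward those of its neighbors until all agents agree. This is a textbook-style averaging update written in plain Python, not the thesis code. The graph, the initial values, the step size, and the function names `consensus_step` and `run_consensus` are all hypothetical.

```python
def consensus_step(values, neighbors, step_size):
    """One synchronous update: each agent moves toward its neighbors' values."""
    return [
        x_i + step_size * sum(values[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(values)
    ]


def run_consensus(values, neighbors, step_size=0.25, tol=1e-9, max_steps=10_000):
    """Iterate until the spread between agents falls below tol."""
    for step in range(max_steps):
        if max(values) - min(values) < tol:
            return values, step
        values = consensus_step(values, neighbors, step_size)
    return values, max_steps


# Hypothetical 4-agent path graph: 0 - 1 - 2 - 3 (undirected).
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
initial = [1.0, 5.0, 3.0, 7.0]

final, steps = run_consensus(initial, neighbors)
print(f"converged after {steps} steps to {final}")
```

On a connected undirected graph with a step size below one over the maximum degree, this update preserves the average of the initial values, so the agents settle on that average (4.0 for the values above).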

Hillcrest Ski & Sports

Software Developer → Principal Consultant

Built and continue to maintain an e-commerce platform driving over $1M in annual sales. Ongoing client with active support and feature development.

Key Highlights

Built the company's first dynamic e-commerce platform (2020-2021)
Transformed static site into real-time multi-channel fulfillment system
Integrated 23,000+ SKUs with automated data pipelines

React, Node.js, Python, E-commerce APIs, Data Pipelines

Continuous Alignment Testing (CAT)

A framework for evaluating and monitoring agentic AI systems in production. CAT combines validators, reliability tensors, and statistical rigor to make high-leverage AI systems more observable, auditable, and dependable at scale.