Evra Backend Architecture

Technical Walkthrough & System Design

1. System Overview: End-to-End Flow

The Evra backend operates as a closed-loop clinical intelligence system. It ingests multi-modal data, synthesizes it into actionable context, creates structured plans, and learns from user interaction.

Figure 1: End-to-End Data Flow Architecture

1. Ingestion (Inputs)

The system aggregates three distinct data streams:

2. Insight Engine (Synthesis)

Data is normalized in parallel to create a "Patient Snapshot":

3. Coaching (Agentic Decision)

The LangGraph Controller (chat_service.py) orchestrates the response:

  1. Intent Classification: Determines if the query is health-related.
  2. Context Retrieval: Fetches memories, lab summaries, and vitals simultaneously.
  3. Generation: Produces a clinically grounded response using the synthesized context.
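The three steps above can be sketched in plain asyncio. This is an illustration only, not the LangGraph graph in chat_service.py: every function name, the HEALTH_TERMS keyword set, and the stubbed return values are hypothetical. The point it shows is step 2, where the three lookups are issued concurrently with asyncio.gather.

```python
import asyncio
from dataclasses import dataclass

# Hypothetical keyword set standing in for a real intent classifier.
HEALTH_TERMS = {"sleep", "glucose", "heart", "labs", "vitals", "cholesterol"}


def classify_intent(query: str) -> bool:
    """Step 1: return True if the query looks health-related."""
    words = {w.strip(".,?!").lower() for w in query.split()}
    return bool(words & HEALTH_TERMS)


# The three fetchers below are stubs with made-up data and simulated I/O.
async def fetch_memories(user_id: str) -> list[str]:
    await asyncio.sleep(0.01)
    return ["prefers morning walks"]


async def fetch_lab_summary(user_id: str) -> str:
    await asyncio.sleep(0.01)
    return "LDL within reference range"


async def fetch_vitals(user_id: str) -> dict[str, float]:
    await asyncio.sleep(0.01)
    return {"resting_hr": 58.0}


@dataclass
class Context:
    memories: list[str]
    lab_summary: str
    vitals: dict[str, float]


async def retrieve_context(user_id: str) -> Context:
    """Step 2: issue the three lookups concurrently, not one after another."""
    memories, labs, vitals = await asyncio.gather(
        fetch_memories(user_id),
        fetch_lab_summary(user_id),
        fetch_vitals(user_id),
    )
    return Context(memories, labs, vitals)


def generate(query: str, ctx: Context) -> str:
    """Step 3: placeholder for the LLM call; it only echoes what it was given."""
    return f"Q: {query} | labs: {ctx.lab_summary} | vitals: {ctx.vitals}"


async def handle(user_id: str, query: str) -> str:
    if not classify_intent(query):
        return "I can only help with health-related questions."
    ctx = await retrieve_context(user_id)
    return generate(query, ctx)
```

Calling asyncio.run(handle("u1", "How is my sleep?")) runs all three steps. A query with no health terms returns the refusal string without touching the fetchers.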

4. Agentic Action (Execution)

Insights are converted into concrete database records via the Goal Planner. The goals_service converts intent into structured ActionItems anchored to specific calendar dates.
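As a rough sketch of what "anchored to specific calendar dates" could mean in practice, the snippet below expands a recurring intent into one dated record per occurrence. The ActionItem fields and the plan_goal signature are assumptions made for illustration; the actual goals_service schema is not shown in this document.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ActionItem:
    # Hypothetical shape: a title plus the concrete date it is due.
    title: str
    due: date


def plan_goal(title: str, start: date, days_of_week: set[int], weeks: int) -> list[ActionItem]:
    """Expand a recurring intent into dated items.

    days_of_week uses date.weekday() numbering (Monday=0 ... Sunday=6).
    """
    items = []
    for offset in range(weeks * 7):
        day = start + timedelta(days=offset)
        if day.weekday() in days_of_week:
            items.append(ActionItem(title=title, due=day))
    return items
```

For example, plan_goal("20-minute walk", date(2024, 1, 1), {0, 2, 4}, weeks=2) yields six items, one for each Monday, Wednesday, and Friday in the two-week window.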

5. Feedback Loop

2. Insights Layer: Longitudinal Signals

The Insights Layer transforms raw, high-frequency data streams into coherent health narratives. Deterministic logic first separates Snapshots (daily state) from Trends (longitudinal evolution); AI interpretation runs only on that separated output.

Figure 2: Signal Processing and Insight Generation
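One way to picture the Snapshot/Trend split is the sketch below. It is an assumption-laden illustration: the resting_hr metric, the three-day window, and the 2.0 tolerance are hypothetical, not values taken from Evra. A snapshot reduces one day of raw readings to a single number, and the trend label comes from a fixed comparison of two adjacent windows, so the same input always yields the same label before any model sees it.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Snapshot:
    day: str
    resting_hr: float  # daily mean of the raw readings (hypothetical metric)


def daily_snapshot(day: str, readings: list[float]) -> Snapshot:
    """Collapse one day of high-frequency readings into a single daily value."""
    return Snapshot(day=day, resting_hr=round(mean(readings), 1))


def trend(snapshots: list[Snapshot], window: int = 3, tolerance: float = 2.0) -> str:
    """Label the change between the latest window and the one before it."""
    if len(snapshots) < 2 * window:
        return "insufficient_data"
    recent = mean(s.resting_hr for s in snapshots[-window:])
    prior = mean(s.resting_hr for s in snapshots[-2 * window:-window])
    delta = recent - prior
    if delta > tolerance:
        return "rising"
    if delta < -tolerance:
        return "falling"
    return "stable"
```

The label returned by trend, rather than the raw stream, would then be what gets handed to the interpretation step.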

1. Signal Processing Strategy

2. Preventing Drift & Contradictions

3. State & Memory: Representation & Evolution

User state in Evra is a Federated State Model, dynamically assembled from structured metrics, semantic memories, and cached profiles.

Figure 3: Federated State Model
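A minimal sketch of "dynamically assembled" state follows, assuming the three sources can be modeled as simple lookups keyed by user ID. The UserState fields and the assemble_state helper are hypothetical; plain dicts stand in for whatever stores back the real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserState:
    metrics: dict[str, float]  # structured metrics
    memories: list[str]        # semantic memories
    profile: dict[str, str]    # cached profile


def assemble_state(
    user_id: str,
    metrics_store: dict[str, dict[str, float]],
    memory_store: dict[str, list[str]],
    profile_cache: dict[str, dict[str, str]],
) -> UserState:
    """Build the state view on demand; no single record holds all of it."""
    return UserState(
        # Copies keep the assembled view from mutating the underlying stores.
        metrics=dict(metrics_store.get(user_id, {})),
        memories=list(memory_store.get(user_id, [])),
        profile=dict(profile_cache.get(user_id, {})),
    )
```

A source with no entry for the user contributes an empty value, so in this sketch a cold profile cache degrades the view instead of failing the request.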

1. User State Representation

2. Persisted vs. Ephemeral

3. Conflict Resolution & Updates

4. Coaching Layer: Logic & Goal Setting

The Coaching Layer translates clinical insights into sustainable behavioral changes. It strictly adheres to a non-clinical persona by decoupling medical analysis from action planning.

Figure 4: Behavior-First Coaching Architecture

1. Structure (Behavior-First)

2. Goal Setting & Adjustment

3. Avoiding Repetition & Over-Specificity

5. Prompt Engineering: Modular Composition & Safety

Prompt engineering in Evra is treated as software architecture, not string concatenation. Prompts are dynamically assembled modules that enforce strict clinical boundaries and personalize the voice before the LLM receives the input.

Figure 5: Dynamic Prompt Assembly

1. High-Level Design: Dynamic Assembly

Prompts are constructed in layers at runtime (chat_service.py), ensuring every request contains the necessary context without exceeding token limits or losing focus.
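Layered assembly under a token limit might look like the sketch below. The PromptLayer type, the priority scheme, and the word-count token estimate are all assumptions for illustration; a real implementation would use the model's own tokenizer. Required layers are always included, and optional layers are added in priority order only while they fit the budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptLayer:
    name: str
    text: str
    priority: int = 0       # lower value = added earlier among optional layers
    required: bool = False  # required layers are never dropped


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def assemble_prompt(layers: list[PromptLayer], budget: int) -> str:
    required = [l for l in layers if l.required]
    optional = sorted((l for l in layers if not l.required), key=lambda l: l.priority)

    used = sum(estimate_tokens(l.text) for l in required)
    if used > budget:
        raise ValueError("required layers exceed the token budget")

    chosen = list(required)
    for layer in optional:
        cost = estimate_tokens(layer.text)
        if used + cost <= budget:  # skip any optional layer that would overflow
            chosen.append(layer)
            used += cost
    return "\n\n".join(l.text for l in chosen)
```

Dropping whole layers, rather than truncating text mid-sentence, keeps each included block intact. That is a design choice of this sketch, not a documented Evra behavior.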

2. Scope Enforcement (The Guardrail)

We do not rely on the main LLM to "figure out" whether it should answer.
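The idea can be sketched as a gate that sits in front of the main model, so an out-of-scope query never reaches it. The make_guarded helper, the refusal text, and the keyword-based classifier in the usage note are hypothetical stand-ins; the source does not say how the real scope check is implemented.

```python
from typing import Callable

REFUSAL = "I can only help with health and wellness questions."


def make_guarded(
    in_scope: Callable[[str], bool],
    main_llm: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap main_llm so the scope decision is made before it is ever called."""

    def guarded(query: str) -> str:
        if not in_scope(query):
            return REFUSAL  # main_llm is not invoked on this path
        return main_llm(query)

    return guarded
```

Usage, with a toy classifier: guarded = make_guarded(lambda q: "sleep" in q.lower(), main_llm). Because the gate is ordinary code, the refusal path can be unit-tested without calling a model.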

3. Tone & Safety Enforcement

6. Agentic Layer: Decision-Making & Execution

The Agentic Layer bridges the gap between knowing something (Insight) and doing something (Action). It uses a "Human-in-the-Loop" architecture where the AI proposes structured actions, but execution requires validation or explicit triggers.

Figure 6: Agentic Logic (Suggest -> Prepare -> Execute)
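The Suggest -> Prepare -> Execute progression can be modeled as a small state machine in which execution is refused until the user has confirmed. This is a sketch under assumptions: the Stage enum, the ProposedAction class, and the list used as a stand-in for the database are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    SUGGESTED = "suggested"
    PREPARED = "prepared"
    EXECUTED = "executed"


@dataclass
class ProposedAction:
    description: str
    stage: Stage = Stage.SUGGESTED
    confirmed_by_user: bool = False

    def prepare(self) -> None:
        """Turn the suggestion into a structured, ready-to-run action."""
        if self.stage is not Stage.SUGGESTED:
            raise RuntimeError("only a suggested action can be prepared")
        self.stage = Stage.PREPARED

    def confirm(self) -> None:
        """Record the explicit human approval."""
        self.confirmed_by_user = True

    def execute(self, store: list[str]) -> None:
        """Write the action to the store; refuses without prior confirmation."""
        if self.stage is not Stage.PREPARED:
            raise RuntimeError("action must be prepared before execution")
        if not self.confirmed_by_user:
            raise PermissionError("execution requires explicit user confirmation")
        store.append(self.description)
        self.stage = Stage.EXECUTED
```

In this sketch the AI side can only call prepare; the write happens in execute, which raises PermissionError unless confirm was called first.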

1. Translation Logic (Suggest -> Prepare -> Execute)

2. Confirmation Loops

3. Reversibility & Conflict Handling

4. Failure Handling

7. Safety & Guardrails: Constraining Authority

Evra enforces safety through architectural constraints rather than relying solely on model training. We treat the LLM as a text processing engine, not a doctor, by wrapping it in deterministic logic layers.

Figure 7: Safety Pipeline and Risk Escalation
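To illustrate what wrapping the model in deterministic layers could look like, the sketch below adds one check before the model call and one after it. The regular expressions, message texts, and the safe_reply name are invented for illustration and are far too coarse for production use; they only show where the deterministic checks sit relative to the LLM.

```python
import re
from typing import Callable

# Hypothetical patterns; real rules would be far more extensive.
RED_FLAGS = re.compile(r"\b(chest pain|can't breathe|suicidal)\b", re.IGNORECASE)
AUTHORITY_PHRASES = re.compile(
    r"\b(i diagnose|i prescribe|stop taking your medication)\b", re.IGNORECASE
)

ESCALATION = "This may need urgent attention. Please contact a medical professional or emergency services."
FALLBACK = "I can't advise on that. Please discuss it with your clinician."


def safe_reply(user_text: str, llm: Callable[[str], str]) -> str:
    # Pre-check: red-flag input escalates without calling the model at all.
    if RED_FLAGS.search(user_text):
        return ESCALATION
    draft = llm(user_text)
    # Post-check: drafts that claim medical authority are replaced, not edited.
    if AUTHORITY_PHRASES.search(draft):
        return FALLBACK
    return draft
```

Both checks are plain code paths, so their behavior does not depend on how the model was trained or prompted.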

1. Scope & Output Constraints

2. Avoiding Implied Medical Authority

3. Uncertainty & Escalation

8. Evaluation & Learnings: Real-World Performance

Our architecture has evolved significantly based on real-world friction points. The two biggest lessons: latency kills engagement, and determinism beats cleverness.

Figure 8: Wins, Failures, and Architectural Shifts

1. What Worked (The Wins)

2. What Broke (The Failures)

3. Architectural Shifts

9. Failure Modes: Detection & Mitigation

The Evra system is designed with a Defense-in-Depth strategy. Instead of assuming perfect AI behavior or 100% API uptime, the architecture anticipates failure at the dependency, logic, and data layers, implementing specific "Safety Nets" for each.

Figure 9: Failure Detection and Mitigation Logic
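For the dependency layer, one common form of safety net is retry-then-fallback: try the external call a bounded number of times, then return a degraded but safe result instead of raising. The helper below is a generic sketch of that pattern, not Evra's actual mechanism; the call_with_fallback name, retry count, and backoff schedule are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[Exception], T],
    retries: int = 2,
    base_delay: float = 0.0,
) -> T:
    """Try primary up to retries + 1 times, then hand the last error to fallback."""
    last_error: Exception = RuntimeError("primary was never attempted")
    for attempt in range(retries + 1):
        try:
            return primary()
        except Exception as exc:  # broad on purpose: any failure triggers the net
            last_error = exc
            if attempt < retries:
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return fallback(last_error)
```

A typical use would pass a live API call as primary and a cached or templated response as fallback, so the user still receives a reply during an outage.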

1. External Dependency Failure (APIs)

2. Logic & Hallucination Failure (The "Drift")

3. Data Ingestion Failure (Docs & Streams)

10. What's Non-Trivial: Integration Challenges

Replicating Evra is difficult not because of any single component, but because of the "Three-Speed Integration" problem. The system must harmonize High-Frequency Streams (Device Vitals), Static Documents (PDFs), and Interactive Latency (Chat) into a single, medically safe narrative.

Figure 10: The Three-Speed Integration Problem

1. Integration Complexity (The Temporal Mismatch)

2. Iteration Cycles (Code-Backed Prompting)

3. Key Tradeoffs (Determinism vs. Flexibility)