Introduction

This System Design Document (SDD) describes the Deep Evidence Agent (DEA) – the core "AI in Engineering" system within the AURAISON platform. DEA is an engineering-grade, multi-agent system that turns heterogeneous engineering artifacts (requirements, designs, code, tests, standards, incident reports) into traceable, evidence-grounded insights. It operationalizes the capabilities defined in the PRD – Deep Evidence Agent and the ADD – Deep Evidence Agent, with a focus on detailed design, component responsibilities, data flows, and deployment topology.

The SDD is written for:

  • Systems and software engineers implementing DEA components.
  • ML / data platform engineers operating model and retrieval infrastructure.
  • Enterprise and security architects reviewing compliance with organizational standards.
  • Product and engineering leaders needing a concrete view of how PRD goals are realized in the running system.

Relationship to PRD and ADD

  • The PRD defines what the AI in Engineering system must achieve for users (goals, use cases, functional and non-functional requirements).
  • The ADD defines the logical architecture (major components, responsibilities, and views) at a technology-agnostic but architecture-specific level.
  • This SDD refines the ADD into an implementation-oriented design:
    • Concrete services, modules, and data stores.
    • Chosen technologies and integration patterns where decided.
    • Detailed sequence and data flow diagrams for key engineering workflows.
    • Operational aspects (scaling, observability, safety controls).

Traceability is maintained via explicit references to PRD IDs (e.g., FR-1, NFR-1) and ADD components (e.g., Orchestrator, Engineering Retrieval Service, Traceability Graph Store).

Scope

This SDD covers the "AI in Engineering" Deep Evidence Agent capabilities, including:

  • Requirements traceability and impact analysis across lifecycle artifacts (PRD UC-1, FR-4).
  • Evidence-grounded design decision support (PRD UC-2).
  • Standards and compliance evidence assembly (PRD UC-3).
  • Engineering knowledge capture and reuse (PRD UC-5).
  • Human-in-the-loop workflows where engineers review, curate, and approve trace links and generated artifacts.

Out of scope for this SDD (but potentially part of the broader AURAISON platform):

  • Generic conversational assistants without explicit engineering traceability.
  • Standalone simulation, optimization, or CAD/CAE tooling that does not integrate with DEA via defined APIs.

System Overview

At a high level, DEA exposes an Engineering Assistant API and UI to end users. Internally it orchestrates multiple role-specialized agents over a structured representation of engineering knowledge, backed by a traceability graph and hybrid retrieval.

DEA is stateless at the request layer but maintains session and project state (trace graphs, curated evidence sets, revisions) in persistent stores to support long-running engineering analyses.

Detailed Design – Component View

1. Interface & Session Layer

1.1 Engineering UI / API Gateway

Responsibilities

  • Provide a web-based engineering UI (and potentially notebook or IDE plugins) exposing:
    • Requirements traceability views (matrices, graphs) aligned with DOORS-style representations.
    • Impact analysis dashboards (change requests → affected requirements/design/code/tests).
    • Evidence browsers with side-by-side artifact and claim views.
    • Session timelines showing planner decisions, agent calls, and human interventions.
  • Expose a versioned REST/GraphQL API for programmatic access (e.g., CI/CD integrations, batch analyses).
  • Enforce authentication and authorization via the organization’s IdP and role-based access control (RBAC).

Key Design Decisions

  • API endpoints are coarse-grained and session-oriented (e.g., POST /sessions/{id}/impact-analysis), rather than chat-style, to match engineering workflows (PRD FR-6, FR-8).
  • All UI and API interactions emit structured telemetry events for auditability (PRD NFR-6).

1.2 Session & Project Management

Responsibilities

  • Represent engineering work as sessions scoped to a project, subsystem, or change request.
  • Persist:
    • Session metadata (project, scope, involved users, timestamps).
    • Planner plans (task DAGs) and revisions.
    • Snapshots of trace graphs and evidence sets at key milestones (e.g., before/after curation).
  • Support resume, fork, and compare operations for sessions.

Data Model (Simplified)

Session:
  id: string
  project_id: string
  scope: string # e.g., REQ-123, subsystem name
  status: open | closed | archived
  created_by: user_id
  created_at: timestamp
  last_updated_at: timestamp

SessionSnapshot:
  id: string
  session_id: string
  trace_graph_version: string
  evidence_set_ids: [string]
  planner_plan_id: string
  created_at: timestamp

Interfaces

  • createSession(projectId, scope, initiator) → sessionId.
  • saveSnapshot(sessionId, traceGraphRef, evidenceSetRefs, planRef).
  • getSession(sessionId) / listSessions(projectId).
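A minimal in-memory sketch of the session interface above, for illustration only (class names and the dictionary-backed store are assumptions; the real service persists to the session store described in this section):

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Session:
    id: str
    project_id: str
    scope: str          # e.g., "REQ-123" or a subsystem name
    created_by: str
    status: str = "open"
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class SessionService:
    """Illustrative in-memory stand-in for the Session & Project Management service."""

    def __init__(self) -> None:
        self._sessions: dict[str, Session] = {}

    def create_session(self, project_id: str, scope: str, initiator: str) -> str:
        session = Session(id=str(uuid.uuid4()), project_id=project_id,
                          scope=scope, created_by=initiator)
        self._sessions[session.id] = session
        return session.id

    def get_session(self, session_id: str) -> Session:
        return self._sessions[session_id]

    def list_sessions(self, project_id: str) -> list[Session]:
        return [s for s in self._sessions.values() if s.project_id == project_id]
```

A caller would use it exactly as the interface list suggests: create a session scoped to a change request, then retrieve or enumerate sessions per project.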

2. Orchestration & Agent Layer

2.1 Orchestrator & Planner (PRD FR-1, FR-5)

Responsibilities

  • Accept engineering questions or commands from the UI/API (e.g., "What is the impact of changing REQ-123?").
  • Build a task plan (DAG) decomposing work into well-defined steps:
    • Requirements lookup and neighborhood expansion.
    • Design artifact retrieval and filtering.
    • Code and test discovery.
    • Standards lookup and policy checks.
    • Evidence consolidation and critique.
    • Synthesis (trace matrix, narrative impact summary).
  • Assign tasks to specialized agents and manage their execution (possibly with concurrency and retries).
  • Capture rationale and decisions for audit and explainability (PRD NFR-6, NFR-7).

Design

  • Plans are represented as structured graphs (e.g., JSON DAG) with typed nodes:
    PlanNode:
      id: string
      type: "requirements_lookup" | "design_scan" | "code_scan" | "test_scan" | "standards_lookup" | "evidence_consolidation" | "synthesis" | ...
      inputs: [node_id]
      params: object
      status: pending | running | completed | failed
  • Planner implementation:
    • Uses LLM-based planning constrained by templates of allowed task types to avoid arbitrary tool invocation.
    • Validates generated plans against schemas and policy rules before execution.
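The schema-and-policy validation step above can be sketched as follows. This is a minimal illustration, not the production validator: the allow-list mirrors the PlanNode types shown, and the acyclicity check uses Kahn's algorithm.

```python
ALLOWED_TASK_TYPES = {
    "requirements_lookup", "design_scan", "code_scan", "test_scan",
    "standards_lookup", "evidence_consolidation", "synthesis",
}


def validate_plan(nodes: list[dict]) -> list[str]:
    """Return validation errors for an LLM-generated plan DAG: disallowed
    task types, dangling input references, and cycles."""
    errors = []
    ids = {n["id"] for n in nodes}
    for n in nodes:
        if n["type"] not in ALLOWED_TASK_TYPES:
            errors.append(f"{n['id']}: disallowed task type {n['type']!r}")
        for dep in n.get("inputs", []):
            if dep not in ids:
                errors.append(f"{n['id']}: unknown input {dep!r}")

    # Cycle check via Kahn's algorithm: repeatedly remove zero-indegree nodes.
    indegree = {n["id"]: len(n.get("inputs", [])) for n in nodes}
    deps_of = {n["id"]: list(n.get("inputs", [])) for n in nodes}
    ready = [nid for nid, deg in indegree.items() if deg == 0]
    seen = 0
    while ready:
        current = ready.pop()
        seen += 1
        for nid, deps in deps_of.items():
            if current in deps:
                indegree[nid] -= 1
                if indegree[nid] == 0:
                    ready.append(nid)
    if seen != len(nodes):
        errors.append("plan contains a cycle")
    return errors
```

Only a plan that passes with zero errors would be handed to the Agent Runtime for execution.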

2.2 Agent Runtime (Researcher, Critic, Synthesizer)

Responsibilities

  • Provide a common execution framework for role-specialized agents:
    • Researcher Agent: queries retrieval services and normalizes results into evidence units.
    • Critic Agent: evaluates trace link quality, detects gaps and inconsistencies.
    • Synthesizer Agent: generates structured outputs (matrices, engineering reports) with explicit evidence citations.
  • Manage tooling sandbox and rate limits for calls to retrieval, model serving, and external systems.
  • Emit detailed logs of agent invocations, parameters, and outputs.

Agent Contract (Logical)

AgentRequest:
  agent_type: "researcher" | "critic" | "synthesizer"
  session_id: string
  task_type: string
  inputs: object

AgentResponse:
  status: success | error
  outputs: object # evidence units, scores, narrative, etc.
  used_tools: [string]
  metrics:
    tokens_in: int
    tokens_out: int
    latency_ms: int
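The common execution framework can be sketched as a dispatcher over the contract above (a simplified illustration: role handlers are plain callables here, and only the latency metric is computed):

```python
import time


def run_agent(request: dict, handlers: dict) -> dict:
    """Dispatch an AgentRequest to its role handler and wrap the result in
    the AgentResponse envelope, capturing status and basic metrics."""
    start = time.monotonic()
    handler = handlers.get(request["agent_type"])
    if handler is None:
        return {"status": "error",
                "outputs": {"reason": f"unknown agent_type {request['agent_type']!r}"}}
    try:
        outputs = handler(request["task_type"], request["inputs"])
        status = "success"
    except Exception as exc:  # agent failures become structured errors, not crashes
        outputs, status = {"reason": str(exc)}, "error"
    return {
        "status": status,
        "outputs": outputs,
        "metrics": {"latency_ms": int((time.monotonic() - start) * 1000)},
    }
```

In the real runtime, the handler table would be populated with the Researcher, Critic, and Synthesizer implementations, and each invocation would also emit the detailed logs noted above.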

3. Engineering Knowledge & Data Layer

3.1 Artifact Ingestion Pipelines (PRD FR-2, FR-3)

Responsibilities

  • Continuously ingest and normalize engineering artifacts from lifecycle tools and repositories:
    • Requirements (DOORS/ReqIF, Jama, Polarion).
    • Design documents (Confluence, SharePoint, Git-hosted docs).
    • SysML/UML/Simulink models.
    • Source code (Git-based systems).
    • Test cases, results, and CI logs.
    • Standards documents and internal guidelines.
  • Extract structured metadata and evidence units (paragraphs, sections, symbols, log excerpts) with stable identifiers.
  • Populate both the vector search index and the traceability graph with normalized entities.

Pipeline Stages

  1. Source Connector
    • Tool-specific connectors invoking APIs, file exports, or webhook-based change feeds.
  2. Parsing & Normalization
    • Parsers per artifact type (ReqIF/XMI/Markdown/PDF/AST/logs).
    • Canonical internal model with consistent IDs and versioning.
  3. Enrichment
    • Embedding generation for semantic similarity.
    • Heuristic and ML-based link suggestions (e.g., requirement ↔ code comment or test name).
  4. Indexing & Graph Update
    • Upsert into vector index.
    • Upsert/update nodes and edges in trace graph.

Design Considerations

  • Pipelines are idempotent and incremental, keyed by artifact IDs and versions.
  • For safety-critical environments, ingestion can be configured to run in air-gapped or restricted zones.
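The idempotency rule above reduces to keying each upsert on (artifact ID, version), so replayed change feeds are harmless. A minimal sketch, with the index abstracted as a dictionary:

```python
def incremental_upsert(index: dict, artifact_id: str, version: str,
                       payload: dict) -> bool:
    """Upsert an artifact keyed by (artifact_id, version).

    Returns True if indexing work was performed. Re-running the pipeline on
    an already-ingested version is a no-op, so source connectors can safely
    replay webhook change feeds; a new version replaces the indexed entry.
    """
    current = index.get(artifact_id)
    if current is not None and current["version"] == version:
        return False  # already ingested at this version: idempotent no-op
    index[artifact_id] = {"version": version, "payload": payload}
    return True
```

The same keying discipline applies to both the vector index and the trace graph update in stage 4.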

3.2 Engineering Retrieval Service (PRD FR-2)

Responsibilities

  • Provide a single, consistent API for high-level retrieval queries:
    • Hybrid search (keyword + semantic + filters) over ingested artifacts.
    • Neighborhood expansion (e.g., all artifacts directly or transitively linked to a requirement or design element).
    • Time-bounded queries (e.g., only artifacts as of a given baseline).

Retrieval API (Logical)

RetrieveRequest:
  scope: project_id
  query: string # natural language or artifact ID
  filters:
    artifact_types: ["requirement", "design", "code", "test", "standard"]
    time_range: [timestamp, timestamp]
  top_k: int

RetrieveResponse:
  hits:
    - id: string
      artifact_type: string
      score: float
      snippet: string
      source_ref: object # path, URL, tool-specific identifiers

Implementation Notes

  • Combines:
    • Vector similarity search (semantic relevance) via the vector database.
    • Lexical / fielded search (IDs, exact phrases, filters) via a search engine (e.g., OpenSearch/Elasticsearch) or database indexes.
  • Retrieves full artifact bodies and metadata from the object store or source connectors as needed.
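The SDD does not mandate how vector and lexical hit lists are merged; one common, illustrative choice is reciprocal rank fusion (RRF), sketched here:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several ranked hit lists (e.g., one
    from the vector index, one from the lexical search engine) into a
    single order. Each hit scores sum(1 / (k + rank)) over the lists it
    appears in, so items ranked well by multiple retrievers rise."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF is attractive here because it needs no score normalization across heterogeneous retrievers, though learned re-ranking is an equally valid alternative.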

3.3 Traceability & Provenance Graph (PRD FR-4)

Responsibilities

  • Store and maintain a graph of engineering entities and their relationships:
    • Requirement, DesignElement, CodeComponent, TestCase, StandardClause, EvidenceUnit, Claim.
  • Support operations:
    • Compute coverage metrics (e.g., % requirements covered by tests).
    • Impact analysis queries (e.g., all paths downstream of a changed requirement).
    • Version-aware views (graphs at specific baselines or snapshots).

Schema (Simplified)

Node types:

  Requirement:
    id: string
    source: string
    text: string
    version: string

  DesignElement:
    id: string
    doc_path: string
    section_anchor: string

  CodeComponent:
    id: string
    repo: string
    file: string
    symbol: string

  TestCase:
    id: string
    tool: string
    status: string

  EvidenceUnit:
    id: string
    artifact_ref: object
    text: string

  Claim:
    id: string
    text: string
    created_by: user_or_agent

Edge types:

  implements: Requirement -> CodeComponent
  derives_from: DesignElement -> Requirement
  verifies: TestCase -> Requirement
  tested_by: Requirement -> TestCase
  supported_by: Claim -> EvidenceUnit
  conflicts_with: Requirement -> Requirement
  affects: Requirement -> DesignElement

Implementation Notes

  • Backed by a graph database or graph layer on top of a relational store.
  • All mutating operations are audited with who/what/when to satisfy PRD NFR-6.

4. Model & Inference Layer

4.1 Model Serving (LLMs and Embedders)

Responsibilities

  • Provide managed access to:
    • LLMs for planning, evidence extraction, critique, and synthesis.
    • Embedding models for artifact indexing and retrieval.
  • Enforce guardrails and usage policies (e.g., approved models per domain or tenant).

Design Aspects

  • Requests carry contextual metadata (tenant, project, classification) used by a policy engine to route to permitted models.
  • LLM prompts are templated and versioned so that outputs are reproducible and comparable across runs.
  • Model serving exposes metrics (latency, token counts, error rates) feeding into NFR-2, NFR-3, NFR-5.
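Templated, versioned prompts might be managed as in the sketch below (the template name, version tag, and fields are all illustrative assumptions; the point is that the version travels with every model call for reproducibility):

```python
import string

# Hypothetical registry: (template name, version) -> template text.
PROMPT_TEMPLATES = {
    ("impact_summary", "v2"): string.Template(
        "Summarize the impact of changing $requirement_id on: $artifacts.\n"
        "Cite an evidence ID for every claim."
    ),
}


def render_prompt(name: str, version: str, **fields) -> tuple[str, dict]:
    """Render a versioned prompt and return it together with the metadata
    logged alongside the model call, so runs are comparable across versions."""
    text = PROMPT_TEMPLATES[(name, version)].substitute(**fields)
    return text, {"template": name, "template_version": version}
```

Promoting a new prompt version then becomes an explicit, auditable change rather than an ad hoc string edit.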

4.2 Safety & Guardrails

Responsibilities

  • Enforce constraints such as:
    • No unsupported claims in final outputs (must reference evidence units and trace links).
    • Prohibition of external web retrieval for regulated projects.
    • Redaction/handling policies for PII or sensitive design information.
  • Integrate pre- and post-processing steps:
    • Input validation and normalization.
    • Output validation checks (e.g., every claim has at least one supporting evidence ID).
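The "every claim has at least one supporting evidence ID" post-check can be sketched as a simple validation pass (field names follow the Claim/EvidenceUnit schema; the flat structure here is an illustration):

```python
def validate_output(claims: list[dict], known_evidence_ids: set[str]) -> list[str]:
    """Return IDs of unsupported claims: a claim passes only if it cites at
    least one evidence ID that actually exists in the trace graph."""
    violations = []
    for claim in claims:
        cited = [e for e in claim.get("evidence_ids", []) if e in known_evidence_ids]
        if not cited:
            violations.append(claim["id"])
    return violations
```

A non-empty violation list would block the synthesized output from reaching the user, routing it back to the Critic Agent or to human review instead.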

Key Workflows

1. Requirements Traceability & Impact Analysis (UC-1)

This workflow realizes PRD UC-1 and FR-1–FR-4, while ensuring that every synthesized claim in the report is backed by at least one EvidenceUnit in the trace graph.

2. Design Decision Justification (UC-2)

At a high level:

  1. Engineer selects a design decision record (e.g., ADR, design review item) or starts a new one.
  2. Orchestrator and Researcher Agent gather relevant requirements, prior decisions, standards clauses, and test results.
  3. Critic verifies that proposed rationale does not contradict existing requirements or standards.
  4. Synthesizer generates a Design Decision Document with explicit references to requirements, alternatives, and standards.
  5. Engineer reviews, edits, and signs off; DEA stores the decision as a Claim with supported_by links to EvidenceUnits and related nodes.

Quality Attributes & NFR Realization

Accuracy & Trust (NFR-1)

  • All final user-facing engineering claims must:
    • Be associated with at least one EvidenceUnit node in the graph.
    • Include navigable links back to the original artifact (requirement, design doc, code, test, or standard).
  • The Critic Agent runs consistency checks on trace graphs and synthesized outputs, flagging potentially unsupported or conflicting claims.
  • Evaluation datasets (curated by domain experts) are used to periodically measure claim correctness and evidence completeness.

Performance & Scalability (NFR-2, NFR-3, NFR-5)

  • Stateless application services (UI/API, Orchestrator, Retrieval, Agent Runtime) are horizontally scalable.
  • Retrieval requests are optimized via:
    • Pre-computed indexes and embeddings.
    • Caching of popular queries and trace neighborhoods.
  • Long-running analyses (e.g., large impact studies) are executed asynchronously with progress indicators.

Availability & Resilience (NFR-4)

  • Core services are deployed with:
    • Multiple replicas behind load balancers.
    • Health checks and auto-restart policies.
    • Graceful degradation modes (e.g., disable heavy synthesis while keeping basic retrieval and trace views available).

Auditability & Explainability (NFR-6, NFR-7)

  • Every session maintains:
    • Planner plans and revisions.
    • Agent requests/responses with tool usage metadata.
    • Trace graph versions and diffs.
  • UI exposes a session timeline view where engineers can inspect:
    • Which agents ran.
    • What artifacts were retrieved.
    • How evidence supported specific claims.

Security, Privacy, and Compliance

Authentication & Authorization (SEC-1, SEC-3)

  • All ingress to DEA passes through the API gateway integrated with an enterprise IdP (e.g., OIDC/SAML).
  • RBAC enforces project-level and role-level access to:
    • Artifacts (requirements, designs, code, tests).
    • Sessions and reports.
    • Administrative and configuration functions.

Data Protection (SEC-2, PRIV-1, PRIV-2)

  • Encryption in transit using TLS for all internal and external calls.
  • Encryption at rest for:
    • Trace graph stores.
    • Object storage with engineering artifacts.
    • Vector indexes.
  • PII handling policies are implemented via classification tags on artifacts and masking/redaction where required.

Policy Enforcement (ADD Security & Governance)

  • A policy engine is consulted before high-risk actions (e.g., using a particular model, accessing certain repositories, or exporting data):
    • Per-tenant and per-project configurations control external connectivity, allowed model families, and data egress rules.

Deployment View (High Level)

DEA is deployed as a set of microservices (or well-separated services) on a container orchestration platform (e.g., Kubernetes) in one or more engineering-aligned regions.

Key Deployment Groups

  • Edge / Ingress Layer: API gateway, web UI.
  • Control Plane: Orchestrator, Planner, Agent Runtime, Session Service.
  • Data Plane – Knowledge: Ingestion pipelines, Retrieval, Trace Graph, Vector DB, Object Store.
  • Model Plane: Model serving endpoints (LLMs, embedders) managed by the ML platform.
  • Observability Plane: Logging, metrics, traces, audit event streams.

Network segmentation and security groups ensure that:

  • Engineering data remains within approved zones and regions (PRD C-2).
  • Model serving endpoints only accept traffic from DEA services and approved internal clients.

Risks and Open Issues

Identified Risks

  • R1 – Connector Coverage: Incomplete support for key engineering lifecycle tools may limit traceability.
    • Mitigation: Prioritize DOORS/ReqIF, Git, and primary test systems; design connectors as pluggable modules.
  • R2 – Model Drift and Behavior Changes: Model upgrades may affect trace link suggestions or synthesis quality.
    • Mitigation: Version model configurations; run regression evaluations on engineering test sets before promotion.
  • R3 – Complex Graph Queries at Scale: Large trace graphs may degrade impact analysis response times.
    • Mitigation: Pre-computed indexes, caching of common traversals, and performance testing on representative datasets.

Open Issues

  • Final selection of concrete technologies (graph database, vector database, search engine) per environment.
  • Detailed schema for domain-specific extensions (e.g., safety case arguments, hazard logs) beyond the core traceability graph.
  • Definition of organization-specific standards mapping (e.g., how ISO 26262 clauses map into StandardClause nodes).