Introduction
This System Design Document (SDD) describes the Deep Evidence Agent (DEA) – the core "AI in Engineering" system within the AURAISON platform. DEA is an engineering-grade, multi-agent system that turns heterogeneous engineering artifacts (requirements, designs, code, tests, standards, incident reports) into traceable, evidence-grounded insights. It operationalizes the capabilities defined in the PRD – Deep Evidence Agent and the ADD – Deep Evidence Agent, with a focus on detailed design, component responsibilities, data flows, and deployment topology.
The SDD is written for:
- Systems and software engineers implementing DEA components.
- ML / data platform engineers operating model and retrieval infrastructure.
- Enterprise and security architects reviewing compliance with organizational standards.
- Product and engineering leaders needing a concrete view of how PRD goals are realized in the running system.
Relationship to PRD and ADD
- The PRD defines what the AI in Engineering system must achieve for users (goals, use cases, functional and non-functional requirements).
- The ADD defines the logical architecture (major components, responsibilities, and views) at a technology-agnostic but architecture-specific level.
- This SDD refines the ADD into an implementation-oriented design:
- Concrete services, modules, and data stores.
- Chosen technologies and integration patterns where decided.
- Detailed sequence and data flow diagrams for key engineering workflows.
- Operational aspects (scaling, observability, safety controls).
Traceability is maintained via explicit references to PRD IDs (e.g., FR-1, NFR-1) and ADD components (e.g., Orchestrator, Engineering Retrieval Service, Traceability Graph Store).
Scope
This SDD covers the "AI in Engineering" Deep Evidence Agent capabilities, including:
- Requirements traceability and impact analysis across lifecycle artifacts (PRD UC-1, FR-4).
- Evidence-grounded design decision support (PRD UC-2).
- Standards and compliance evidence assembly (PRD UC-3).
- Engineering knowledge capture and reuse (PRD UC-5).
- Human-in-the-loop workflows where engineers review, curate, and approve trace links and generated artifacts.
Out of scope for this SDD (but potentially part of the broader AURAISON platform):
- Generic conversational assistants without explicit engineering traceability.
- Standalone simulation, optimization, or CAD/CAE tooling that does not integrate with DEA via defined APIs.
System Overview
At a high level, DEA exposes an Engineering Assistant API and UI to end users. Internally it orchestrates multiple role-specialized agents over a structured representation of engineering knowledge, backed by a traceability graph and hybrid retrieval.
DEA is stateless at the request layer but maintains session and project state (trace graphs, curated evidence sets, revisions) in persistent stores to support long-running engineering analyses.
Detailed Design – Component View
1. Interface & Session Layer
1.1 Engineering UI / API Gateway
Responsibilities
- Provide a web-based engineering UI (and potentially notebook or IDE plugins) exposing:
- Requirements traceability views (matrices, graphs) aligned with DOORS-style representations.
- Impact analysis dashboards (change requests → affected requirements/design/code/tests).
- Evidence browsers with side-by-side artifact and claim views.
- Session timelines showing planner decisions, agent calls, and human interventions.
- Expose a versioned REST/GraphQL API for programmatic access (e.g., CI/CD integrations, batch analyses).
- Enforce authentication and authorization via the organization’s IdP and role-based access control (RBAC).
Key Design Decisions
- API endpoints are coarse-grained and session-oriented (e.g., POST /sessions/{id}/impact-analysis), rather than chat-style, to match engineering workflows (PRD FR-6, FR-8).
- All UI and API interactions emit structured telemetry events for auditability (PRD NFR-6).
1.2 Session & Project Management
Responsibilities
- Represent engineering work as sessions scoped to a project, subsystem, or change request.
- Persist:
- Session metadata (project, scope, involved users, timestamps).
- Planner plans (task DAGs) and revisions.
- Snapshots of trace graphs and evidence sets at key milestones (e.g., before/after curation).
- Support resume, fork, and compare operations for sessions.
Data Model (Simplified)
Session:
  id: string
  project_id: string
  scope: string # e.g., REQ-123, subsystem name
  status: open|closed|archived
  created_by: user_id
  created_at: timestamp
  last_updated_at: timestamp
SessionSnapshot:
  id: string
  session_id: string
  trace_graph_version: string
  evidence_set_ids: [string]
  planner_plan_id: string
  created_at: timestamp
Interfaces
- createSession(projectId, scope, initiator) → sessionId
- saveSnapshot(sessionId, traceGraphRef, evidenceSetRefs, planRef)
- getSession(sessionId) / listSessions(projectId)
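The interfaces above could be sketched roughly as follows. This is a minimal in-memory illustration, not the actual service: the class name `InMemorySessionService`, the snake_case method names, and the use of UUIDs for identifiers are assumptions made for the example, and a real implementation would write to the persistent stores described earlier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Session:
    id: str
    project_id: str
    scope: str
    created_by: str
    status: str = "open"  # open | closed | archived
    created_at: datetime = field(default_factory=_now)


@dataclass
class SessionSnapshot:
    id: str
    session_id: str
    trace_graph_version: str
    evidence_set_ids: list[str]
    planner_plan_id: str
    created_at: datetime = field(default_factory=_now)


class InMemorySessionService:
    """Hypothetical stand-in for the persistent session store."""

    def __init__(self) -> None:
        self._sessions: dict[str, Session] = {}
        self._snapshots: dict[str, list[SessionSnapshot]] = {}

    def create_session(self, project_id: str, scope: str, initiator: str) -> str:
        session = Session(id=str(uuid4()), project_id=project_id,
                          scope=scope, created_by=initiator)
        self._sessions[session.id] = session
        self._snapshots[session.id] = []
        return session.id

    def save_snapshot(self, session_id: str, trace_graph_ref: str,
                      evidence_set_refs: list[str], plan_ref: str) -> str:
        if session_id not in self._sessions:
            raise KeyError(f"unknown session: {session_id}")
        snapshot = SessionSnapshot(id=str(uuid4()), session_id=session_id,
                                   trace_graph_version=trace_graph_ref,
                                   evidence_set_ids=list(evidence_set_refs),
                                   planner_plan_id=plan_ref)
        self._snapshots[session_id].append(snapshot)
        return snapshot.id

    def get_session(self, session_id: str) -> Session:
        return self._sessions[session_id]

    def list_sessions(self, project_id: str) -> list[Session]:
        return [s for s in self._sessions.values() if s.project_id == project_id]
```

Keeping snapshots in an append-only list per session is one simple way to support the compare operation, since earlier milestones are never overwritten.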
2. Orchestration & Agent Layer
2.1 Orchestrator & Planner (PRD FR-1, FR-5)
Responsibilities
- Accept engineering questions or commands from the UI/API (e.g., "What is the impact of changing REQ-123?").
- Build a task plan (DAG) decomposing work into well-defined steps:
- Requirements lookup and neighborhood expansion.
- Design artifact retrieval and filtering.
- Code and test discovery.
- Standards lookup and policy checks.
- Evidence consolidation and critique.
- Synthesis (trace matrix, narrative impact summary).
- Assign tasks to specialized agents and manage their execution (possibly with concurrency and retries).
- Capture rationale and decisions for audit and explainability (PRD NFR-6, NFR-7).
Design
- Plans are represented as structured graphs (e.g., JSON DAG) with typed nodes:
PlanNode:
  id: string
  type: "requirements_lookup" | "design_scan" | "code_scan" | "test_scan" | "standards_lookup" | "evidence_consolidation" | "synthesis" | ...
  inputs: [node_id]
  params: object
  status: pending|running|completed|failed
- Planner implementation:
- Uses LLM-based planning constrained by templates of allowed task types to avoid arbitrary tool invocation.
- Validates generated plans against schemas and policy rules before execution.
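The validation step might look something like the sketch below. It assumes plans arrive as a list of dictionaries shaped like PlanNode, and it checks only three things: task types come from an allow-list, every input refers to a known node, and the graph is acyclic. The function name `validate_plan` and the exact allow-list are illustrative; the allow-list contains only the task types named above, whereas the real set is open-ended.

```python
from graphlib import CycleError, TopologicalSorter

# Only the task types listed in PlanNode; the real allow-list may be longer.
ALLOWED_TASK_TYPES = {
    "requirements_lookup", "design_scan", "code_scan", "test_scan",
    "standards_lookup", "evidence_consolidation", "synthesis",
}


def validate_plan(nodes: list[dict]) -> list[str]:
    """Return node IDs in a valid execution order, or raise ValueError."""
    ids = [node["id"] for node in nodes]
    if len(ids) != len(set(ids)):
        raise ValueError("plan contains duplicate node IDs")
    known = set(ids)

    predecessors: dict[str, set[str]] = {}
    for node in nodes:
        if node["type"] not in ALLOWED_TASK_TYPES:
            raise ValueError(f"task type not allowed: {node['type']}")
        inputs = set(node.get("inputs", []))
        unknown = inputs - known
        if unknown:
            raise ValueError(f"node {node['id']} references unknown inputs: {sorted(unknown)}")
        predecessors[node["id"]] = inputs

    try:
        # static_order() yields each node only after all of its inputs.
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError as exc:
        raise ValueError("plan contains a cycle") from exc
```

Rejecting the plan before any agent runs keeps an LLM-generated plan from invoking a tool outside the templated task types.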
2.2 Agent Runtime (Researcher, Critic, Synthesizer)
Responsibilities
- Provide a common execution framework for role-specialized agents:
- Researcher Agent: queries retrieval services and normalizes results into evidence units.
- Critic Agent: evaluates trace link quality, detects gaps and inconsistencies.
- Synthesizer Agent: generates structured outputs (matrices, engineering reports) with explicit evidence citations.
- Manage tooling sandbox and rate limits for calls to retrieval, model serving, and external systems.
- Emit detailed logs of agent invocations, parameters, and outputs.
Agent Contract (Logical)
AgentRequest:
  agent_type: "researcher" | "critic" | "synthesizer"
  session_id: string
  task_type: string
  inputs: object
AgentResponse:
  status: success|error
  outputs: object # evidence units, scores, narrative, etc.
  used_tools: [string]
  metrics:
    tokens_in: int
    tokens_out: int
    latency_ms: int
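As an illustration of how a runtime could honor this contract, the sketch below dispatches a request to a registered handler and wraps the result in an AgentResponse. The `AgentRuntime` class, the handler signature, and the zeroed token counts are assumptions for the example; a real runtime would take token counts from the model serving layer and enforce sandboxing and rate limits.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class AgentRequest:
    agent_type: str  # "researcher" | "critic" | "synthesizer"
    session_id: str
    task_type: str
    inputs: dict[str, Any]


@dataclass
class AgentResponse:
    status: str  # "success" | "error"
    outputs: dict[str, Any]
    used_tools: list[str] = field(default_factory=list)
    metrics: dict[str, int] = field(default_factory=dict)


# Hypothetical handler shape: returns (outputs, names of tools used).
Handler = Callable[[AgentRequest], tuple[dict[str, Any], list[str]]]


class AgentRuntime:
    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def register(self, agent_type: str, handler: Handler) -> None:
        self._handlers[agent_type] = handler

    def run(self, request: AgentRequest) -> AgentResponse:
        started = time.perf_counter()
        try:
            handler = self._handlers[request.agent_type]
            outputs, used_tools = handler(request)
            status = "success"
        except Exception as exc:  # report failures through the contract
            outputs, used_tools, status = {"error": repr(exc)}, [], "error"
        latency_ms = int((time.perf_counter() - started) * 1000)
        # Token counts are placeholders; no model is called in this sketch.
        metrics = {"tokens_in": 0, "tokens_out": 0, "latency_ms": latency_ms}
        return AgentResponse(status, outputs, used_tools, metrics)
```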
3. Engineering Knowledge & Data Layer
3.1 Artifact Ingestion Pipelines (PRD FR-2, FR-3)
Responsibilities
- Continuously ingest and normalize engineering artifacts from lifecycle tools and repositories:
- Requirements (DOORS/ReqIF, Jama, Polarion).
- Design documents (Confluence, SharePoint, Git-hosted docs).
- SysML/UML/Simulink models.
- Source code (Git-based systems).
- Test cases, results, and CI logs.
- Standards documents and internal guidelines.
- Extract structured metadata and evidence units (paragraphs, sections, symbols, log excerpts) with stable identifiers.
- Populate both the vector search index and the traceability graph with normalized entities.
Pipeline Stages
- Source Connector
- Tool-specific connectors invoking APIs, file exports, or webhook-based change feeds.
- Parsing & Normalization
- Parsers per artifact type (ReqIF/XMI/Markdown/PDF/AST/logs).
- Canonical internal model with consistent IDs and versioning.
- Enrichment
- Embedding generation for semantic similarity.
- Heuristic and ML-based link suggestions (e.g., requirement ↔ code comment or test name).
- Indexing & Graph Update
- Upsert into vector index.
- Upsert/update nodes and edges in trace graph.
Design Considerations
- Pipelines are idempotent and incremental, keyed by artifact IDs and versions.
- For safety-critical environments, ingestion can be configured to run in air-gapped or restricted zones.
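One way to read "idempotent and incremental, keyed by artifact IDs and versions" is sketched below. The `IngestionState` class and the content digest are assumptions for illustration; the dictionary named `index` merely stands in for the vector index and trace graph upserts.

```python
import hashlib


class IngestionState:
    """Hypothetical tracker of already-processed (artifact_id, version) pairs."""

    def __init__(self) -> None:
        self._seen: dict[tuple[str, str], str] = {}
        self.index: dict[str, dict] = {}  # stand-in for vector index + trace graph

    def ingest(self, artifact_id: str, version: str, body: str) -> bool:
        """Upsert an artifact; return False when the call was a no-op."""
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        key = (artifact_id, version)
        if self._seen.get(key) == digest:
            return False  # same ID, version, and content: nothing to do
        self._seen[key] = digest
        # Upsert semantics: the latest ingested version replaces the entry.
        self.index[artifact_id] = {"version": version, "digest": digest, "body": body}
        return True
```

Re-running a connector over an unchanged export then produces no writes, which is what makes retries after a partial failure safe.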
3.2 Engineering Retrieval Service (PRD FR-2)
Responsibilities
- Provide a single, consistent API for high-level retrieval queries:
- Hybrid search (keyword + semantic + filters) over ingested artifacts.
- Neighborhood expansion (e.g., all artifacts directly or transitively linked to a requirement or design element).
- Time-bounded queries (e.g., only artifacts as of a given baseline).
Retrieval API (Logical)
RetrieveRequest:
  scope: project_id
  query: string # natural language or artifact ID
  filters:
    artifact_types: ["requirement","design","code","test","standard"]
    time_range: [timestamp, timestamp]
  top_k: int
RetrieveResponse:
  hits:
    - id: string
      artifact_type: string
      score: float
      snippet: string
      source_ref: object # path, URL, tool-specific identifiers
Implementation Notes
- Combines:
- Vector similarity search (semantic relevance) via VDB.
- Lexical / fielded search (IDs, exact phrases, filters) via a search engine (e.g., OpenSearch/Elasticsearch) or database indexes.
- Retrieves full artifact bodies and metadata from STORE/connectors as needed.
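This document does not specify how the vector and lexical result lists are merged. As one common option, the sketch below uses reciprocal rank fusion (RRF), which needs only the rank of each hit and so avoids comparing scores from different engines. The function name and the constant k = 60 are assumptions for the example, not a decided design.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60,
                           top_k: int = 10) -> list[tuple[str, float]]:
    """Fuse ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, artifact_id in enumerate(ranking, start=1):
            scores[artifact_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


# Hypothetical result lists from the two backends.
vector_hits = ["REQ-1", "DES-2", "CODE-3"]
lexical_hits = ["DES-2", "TEST-4", "REQ-1"]
fused = reciprocal_rank_fusion([vector_hits, lexical_hits])
```

Artifacts that appear in both lists (here DES-2 and REQ-1) rise above those found by only one backend.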
3.3 Traceability & Provenance Graph (PRD FR-4)
Responsibilities
- Store and maintain a graph of engineering entities and their relationships: Requirement, DesignElement, CodeComponent, TestCase, StandardClause, EvidenceUnit, Claim.
- Support operations:
- Compute coverage metrics (e.g., % requirements covered by tests).
- Impact analysis queries (e.g., all paths downstream of a changed requirement).
- Version-aware views (graphs at specific baselines or snapshots).
Schema (Simplified)
Node types:
  Requirement:
    id: string
    source: string
    text: string
    version: string
  DesignElement:
    id: string
    doc_path: string
    section_anchor: string
  CodeComponent:
    id: string
    repo: string
    file: string
    symbol: string
  TestCase:
    id: string
    tool: string
    status: string
  EvidenceUnit:
    id: string
    artifact_ref: object
    text: string
  Claim:
    id: string
    text: string
    created_by: user_or_agent
Edge types:
  implements: Requirement -> CodeComponent
  derives_from: DesignElement -> Requirement
  verifies: TestCase -> Requirement
  tested_by: Requirement -> TestCase
  supported_by: Claim -> EvidenceUnit
  conflicts_with: Requirement -> Requirement
  affects: Requirement -> DesignElement
Implementation Notes
- Backed by a graph database or graph layer on top of a relational store.
- All mutating operations are audited with who/what/when to satisfy PRD NFR-6.
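The impact analysis and coverage operations can be illustrated over a plain edge list. The sketch below is an in-memory approximation, not a graph database query: it assumes that impact flows along implements, tested_by, and affects edges in their stated direction, and against the direction of derives_from and verifies edges (since those point at the requirement). That interpretation, and the function names, are assumptions for the example.

```python
from collections import deque

Edge = tuple[str, str, str]  # (source_id, edge_type, target_id)

# Assumed impact direction per edge type (see lead-in).
FORWARD = {"implements", "tested_by", "affects"}
REVERSE = {"derives_from", "verifies"}


def impacted_nodes(edges: list[Edge], changed_id: str) -> set[str]:
    """Breadth-first search for everything downstream of a changed node."""
    neighbours: dict[str, set[str]] = {}
    for source, edge_type, target in edges:
        if edge_type in FORWARD:
            neighbours.setdefault(source, set()).add(target)
        elif edge_type in REVERSE:
            neighbours.setdefault(target, set()).add(source)

    seen = {changed_id}
    queue = deque([changed_id])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(changed_id)
    return seen


def requirement_test_coverage(requirement_ids: list[str], edges: list[Edge]) -> float:
    """Fraction of requirements with at least one verifies/tested_by edge."""
    covered: set[str] = set()
    for source, edge_type, target in edges:
        if edge_type == "verifies":
            covered.add(target)
        elif edge_type == "tested_by":
            covered.add(source)
    requirements = set(requirement_ids)
    return len(requirements & covered) / len(requirements) if requirements else 0.0
```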
4. Model & Inference Layer
4.1 Model Serving (LLMs and Embedders)
Responsibilities
- Provide managed access to:
- LLMs for planning, evidence extraction, critique, and synthesis.
- Embedding models for artifact indexing and retrieval.
- Enforce guardrails and usage policies (e.g., approved models per domain or tenant).
Design Aspects
- Requests carry contextual metadata (tenant, project, classification) used by a policy engine to route to permitted models.
- LLM prompts are templated and versioned so that outputs are reproducible and comparable across runs.
- Model serving exposes metrics (latency, token counts, error rates) that are used to track NFR-2, NFR-3, and NFR-5.
4.2 Safety & Guardrails
Responsibilities
- Enforce constraints such as:
- No unsupported claims in final outputs (must reference evidence units and trace links).
- Prohibition of external web retrieval for regulated projects.
- Redaction/handling policies for PII or sensitive design information.
- Integrate pre- and post-processing steps:
- Input validation and normalization.
- Output validation checks (e.g., every claim has at least one supporting evidence ID).
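The output validation check can be sketched in a few lines. The claim shape (a dictionary with `id` and `evidence_ids`) and the function name are assumptions for illustration; in the running system the evidence IDs would be resolved against EvidenceUnit nodes in the trace graph.

```python
def unsupported_claims(claims: list[dict], known_evidence_ids: set[str]) -> list[str]:
    """Return IDs of claims that cite no evidence unit known to the graph."""
    failures = []
    for claim in claims:
        cited = set(claim.get("evidence_ids", []))
        # A claim passes only if at least one cited ID actually resolves.
        if not cited & known_evidence_ids:
            failures.append(claim["id"])
    return failures


# Hypothetical synthesized output awaiting release to the user.
draft_claims = [
    {"id": "C-1", "text": "REQ-123 is verified by TC-9.", "evidence_ids": ["EV-1"]},
    {"id": "C-2", "text": "No code implements REQ-200.", "evidence_ids": []},
    {"id": "C-3", "text": "Design D-4 derives from REQ-7.", "evidence_ids": ["EV-404"]},
]
blocked = unsupported_claims(draft_claims, known_evidence_ids={"EV-1", "EV-2"})
```

A non-empty result would block the output or route it back to the Critic Agent, depending on policy.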
Key Workflows
1. Requirements Traceability & Impact Analysis (UC-1)
This workflow realizes PRD UC-1 and FR-1–FR-4, while ensuring that every synthesized claim in the report is backed by at least one EvidenceUnit in the trace graph.
2. Design Decision Justification (UC-2)
At a high level:
- Engineer selects a design decision record (e.g., ADR, design review item) or starts a new one.
- Orchestrator and Researcher Agent gather relevant requirements, prior decisions, standards clauses, and test results.
- Critic verifies that proposed rationale does not contradict existing requirements or standards.
- Synthesizer generates a Design Decision Document with explicit references to requirements, alternatives, and standards.
- Engineer reviews, edits, and signs off; DEA stores the decision as a Claim with supported_by links to EvidenceUnits and related nodes.
Quality Attributes & NFR Realization
Accuracy & Trust (NFR-1)
- All final user-facing engineering claims must:
- Be associated with at least one EvidenceUnit node in the graph.
- Include navigable links back to the original artifact (requirement, design doc, code, test, or standard).
- The Critic Agent runs consistency checks on trace graphs and synthesized outputs, flagging potentially unsupported or conflicting claims.
- Evaluation datasets (curated by domain experts) are used to periodically measure claim correctness and evidence completeness.
Performance & Scalability (NFR-2, NFR-3, NFR-5)
- Stateless application services (UI/API, Orchestrator, Retrieval, Agent Runtime) are horizontally scalable.
- Retrieval requests are optimized via:
- Pre-computed indexes and embeddings.
- Caching of popular queries and trace neighborhoods.
- Long-running analyses (e.g., large impact studies) are executed asynchronously with progress indicators.
Availability & Resilience (NFR-4)
- Core services are deployed with:
- Multiple replicas behind load balancers.
- Health checks and auto-restart policies.
- Graceful degradation modes (e.g., disable heavy synthesis while keeping basic retrieval and trace views available).
Auditability & Explainability (NFR-6, NFR-7)
- Every session maintains:
- Planner plans and revisions.
- Agent requests/responses with tool usage metadata.
- Trace graph versions and diffs.
- UI exposes a session timeline view where engineers can inspect:
- Which agents ran.
- What artifacts were retrieved.
- How evidence supported specific claims.
Security, Privacy, and Compliance
Authentication & Authorization (SEC-1, SEC-3)
- All ingress to DEA passes through the API gateway integrated with an enterprise IdP (e.g., OIDC/SAML).
- RBAC enforces project-level and role-level access to:
- Artifacts (requirements, designs, code, tests).
- Sessions and reports.
- Administrative and configuration functions.
Data Protection (SEC-2, PRIV-1, PRIV-2)
- Encryption in transit using TLS for all internal and external calls.
- Encryption at rest for:
- Trace graph stores.
- Object storage with engineering artifacts.
- Vector indexes.
- PII handling policies are implemented via classification tags on artifacts and masking/redaction where required.
Policy Enforcement (ADD Security & Governance)
- A policy engine is consulted before high-risk actions (e.g., using a particular model, accessing certain repositories, or exporting data).
- Per-tenant and per-project configurations control external connectivity, allowed model families, and data egress rules.
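A policy check of this kind might be shaped as below. The `ProjectPolicy` fields mirror the three controls named above (external connectivity, allowed model families, data egress), but the field names, action strings, and deny-by-default behavior are assumptions for the example rather than the actual policy engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProjectPolicy:
    """Hypothetical per-project policy; everything is denied unless enabled."""
    allow_external_connectivity: bool = False
    allowed_model_families: frozenset[str] = frozenset()
    allow_data_export: bool = False


def is_permitted(policy: ProjectPolicy, action: str, *,
                 model_family: Optional[str] = None) -> bool:
    """Decide whether a high-risk action may proceed under the policy."""
    if action == "use_model":
        return model_family in policy.allowed_model_families
    if action == "external_retrieval":
        return policy.allow_external_connectivity
    if action == "export_data":
        return policy.allow_data_export
    return False  # unknown actions are denied by default


# Example: a regulated project limited to one internal model family.
regulated = ProjectPolicy(allowed_model_families=frozenset({"internal-llm"}))
```

Denying unknown actions by default means a newly added capability stays blocked until a policy explicitly allows it.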
Deployment View (High Level)
DEA is deployed as a set of microservices (or well-separated services) on a container orchestration platform (e.g., Kubernetes) in one or more engineering-aligned regions.
Key Deployment Groups
- Edge / Ingress Layer: API gateway, web UI.
- Control Plane: Orchestrator, Planner, Agent Runtime, Session Service.
- Data Plane – Knowledge: Ingestion pipelines, Retrieval, Trace Graph, Vector DB, Object Store.
- Model Plane: Model serving endpoints (LLMs, embedders) managed by the ML platform.
- Observability Plane: Logging, metrics, traces, audit event streams.
Network segmentation and security groups ensure that:
- Engineering data remains within approved zones and regions (PRD C-2).
- Model serving endpoints only accept traffic from DEA services and approved internal clients.
Risks and Open Issues
Identified Risks
- R1 – Connector Coverage: Incomplete support for key engineering lifecycle tools may limit traceability.
- Mitigation: Prioritize DOORS/ReqIF, Git, and primary test systems; design connectors as pluggable modules.
- R2 – Model Drift and Behavior Changes: Model upgrades may affect trace link suggestions or synthesis quality.
- Mitigation: Version model configurations; run regression evaluations on engineering test sets before promotion.
- R3 – Complex Graph Queries at Scale: Large trace graphs may degrade impact analysis response times.
- Mitigation: Pre-computed indexes, caching of common traversals, and performance testing on representative datasets.
Open Issues
- Final selection of concrete technologies (graph database, vector database, search engine) per environment.
- Detailed schema for domain-specific extensions (e.g., safety case arguments, hazard logs) beyond the core traceability graph.
- Definition of organization-specific standards mapping (e.g., how ISO 26262 clauses map into StandardClause nodes).