System Architecture

mBedLM-core is a contract-driven platform built around layered runtime boundaries: model serving, memory and policy, orchestration, and domain adapters.

Last updated: 2026-06-23
Keywords: architecture, layers, runtime, governance, orchestration

Architecture Overview

The platform separates concerns so each layer can evolve independently while preserving deterministic request behavior. Core requests flow through validated ingress, intent/routing, policy-governed inference, normalized response contracts, and observability hooks.

Runtime Layers

Ingress and API Surfaces: entrypoints in web/API services accept requests and normalize runtime context.
Orchestration and Routing: intent detection, route policy, fallback, and escalation decisions.
Memory and Inference Policy: retrieval context, generation controls, caching/governors, and contract shaping.
Tooling and Skills: structured action calls and specialist behavior extensions.
Domain Adapters: forecasting, RL, OCR, and business-specific runtime modules.

High-Level Topology

Client/Web App
  -> API ingress and auth
  -> Intent and route policy
  -> Tool/skill mediation (optional)
  -> Memory + inference policy
  -> Model serving backends
  -> Response normalization (content_json.v1)
  -> Telemetry and persistence
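
The topology above can be read as an ordered chain of stages in which only tool/skill mediation is optional. The following is a minimal sketch of that ordering, not the platform's actual wiring: the stage names, the `make_stage` placeholder, and the `run_pipeline` function are all hypothetical and exist only to make the flow concrete.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran (hypothetical)."""
    def stage(ctx: Context) -> Context:
        trace = list(ctx.get("trace", []))
        trace.append(name)
        return {**ctx, "trace": trace}
    return stage


# (stage name, is_optional), in the order shown in the topology.
# The names are illustrative labels, not real module names.
TOPOLOGY: List[Tuple[str, bool]] = [
    ("ingress_auth", False),
    ("intent_route_policy", False),
    ("tool_skill_mediation", True),  # marked optional in the topology
    ("memory_inference_policy", False),
    ("model_serving", False),
    ("response_normalization", False),
    ("telemetry_persistence", False),
]


def run_pipeline(ctx: Context, use_tools: bool) -> Context:
    """Pass the context through each stage in order, skipping optional ones."""
    for name, optional in TOPOLOGY:
        if optional and not use_tools:
            continue
        ctx = make_stage(name)(ctx)
    return ctx
```

Calling `run_pipeline({"request": "hi"}, use_tools=False)` returns a context whose `trace` lists every stage except tool/skill mediation, in topology order.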

Request Lifecycle (Simplified)

1. Validate request envelope, tenant/session context, and safety constraints.
2. Classify intent and select route strategy (direct, tool-assisted, or orchestrated).
3. Hydrate contextual memory and optional retrieval/tool outputs.
4. Execute inference with runtime policy controls (timeouts, fallbacks, gating).
5. Normalize to response contract and emit observability metadata.
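
As a rough illustration of the lifecycle above, the sketch below walks one request through validation, route selection, memory hydration, inference with fallback, and normalization. Everything in it is an assumption made for the example: the envelope fields (`tenant_id`, `session_id`, `prompt`), the keyword-based `classify` heuristic, and the shape of the returned dictionary. The source names the `content_json.v1` contract but does not define its fields. Timeouts and gating are left out to keep the sketch short.

```python
from typing import Any, Callable, Dict, List, Tuple

Backend = Tuple[str, Callable[[str], str]]


class ValidationError(ValueError):
    """Raised when the request envelope is incomplete (hypothetical)."""


def validate(envelope: Dict[str, Any]) -> None:
    # Assumed required fields; the real envelope schema is not documented here.
    for key in ("tenant_id", "session_id", "prompt"):
        if not envelope.get(key):
            raise ValidationError(f"missing field: {key}")


def classify(prompt: str) -> str:
    # Toy heuristic standing in for real intent detection.
    if prompt.startswith("/tool"):
        return "tool-assisted"
    if len(prompt.split()) > 50:
        return "orchestrated"
    return "direct"


def execute_with_fallback(backends: List[Backend], prompt: str) -> Tuple[str, str]:
    """Try each backend in order and return the first successful result."""
    errors: List[str] = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # any backend failure triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def handle_request(
    envelope: Dict[str, Any],
    backends: List[Backend],
    memory: Dict[str, List[str]],
) -> Dict[str, Any]:
    validate(envelope)                                   # step 1
    route = classify(envelope["prompt"])                 # step 2
    context = memory.get(envelope["session_id"], [])     # step 3
    backend, text = execute_with_fallback(backends, envelope["prompt"])  # step 4
    return {                                             # step 5 (assumed shape)
        "contract": "content_json.v1",
        "content": text,
        "meta": {"route": route, "backend": backend, "context_items": len(context)},
    }
```

Validation runs before any other work, so a malformed envelope never reaches a model backend. The `meta` block stands in for the observability metadata mentioned in the final step.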

Reference Startup Sequence

Validating configuration before any service starts reduces drift between environments and prevents partial-route instability, where only some routes come up healthy.

1. Validate config and secrets
2. Confirm model artifacts and endpoint reachability
3. Start serving backends (general and specialist)
4. Start memory and tool substrate
5. Start orchestration services
6. Start web/API product surfaces
7. Enable domain modules and optional enhancers
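
One way to enforce this ordering is a sequencer that runs each step's readiness check in turn and halts at the first failure, so later services never start on top of an unhealthy dependency. The sketch below assumes each step can be reduced to a boolean check. `run_startup`, `StartupError`, and `always_ok` are hypothetical names, and the step labels simply mirror the list above.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]

# Labels mirror the reference startup sequence; they are not real service names.
STARTUP_ORDER: List[str] = [
    "validate config and secrets",
    "confirm model artifacts and endpoint reachability",
    "start serving backends",
    "start memory and tool substrate",
    "start orchestration services",
    "start web/API product surfaces",
    "enable domain modules and optional enhancers",
]


class StartupError(RuntimeError):
    """Carries the failed step and the steps that completed before it."""

    def __init__(self, failed: str, completed: List[str]):
        super().__init__(f"startup halted at '{failed}' after {completed}")
        self.failed = failed
        self.completed = completed


def run_startup(steps: List[Step]) -> List[str]:
    """Run checks in order; raise at the first one that reports not ready."""
    completed: List[str] = []
    for name, check in steps:
        if not check():
            raise StartupError(name, completed)
        completed.append(name)
    return completed


def always_ok() -> bool:
    """Placeholder readiness check used for illustration."""
    return True
```

With `steps = [(name, always_ok) for name in STARTUP_ORDER]`, `run_startup(steps)` returns the seven labels in order. If the third check returns `False`, the raised `StartupError` reports only the first two steps as completed.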

Governance and Safety Boundaries

Canonical request paths prevent accidental traffic to experimental entrypoints.
Trust tiers (production, canary, experimental) constrain auto-routing behavior.
Response contracts keep downstream integrations stable across model changes.
Guardrails enforce rewrite/salvage behavior when output quality degrades.
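
To make the trust-tier idea concrete, the sketch below filters candidate routes by tier before automatic routing. The tier names come from the text above. The ordering, the `Route` type, the example paths, and the rule that only production routes are auto-routable by default are assumptions for illustration, not the platform's documented policy.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class TrustTier(IntEnum):
    # Higher value means more trusted; this ordering is an assumption.
    EXPERIMENTAL = 0
    CANARY = 1
    PRODUCTION = 2


@dataclass(frozen=True)
class Route:
    path: str        # hypothetical route identifier
    tier: TrustTier


def auto_routable(
    routes: List[Route], min_tier: TrustTier = TrustTier.PRODUCTION
) -> List[Route]:
    """Keep only routes at or above the minimum tier allowed for auto-routing."""
    return [r for r in routes if r.tier >= min_tier]


ROUTES = [
    Route("/v1/chat", TrustTier.PRODUCTION),
    Route("/v1/chat-canary", TrustTier.CANARY),
    Route("/lab/chat-next", TrustTier.EXPERIMENTAL),
]
```

Here `auto_routable(ROUTES)` keeps only `/v1/chat`, while `auto_routable(ROUTES, TrustTier.CANARY)` also admits the canary route. Under this rule an experimental route can only be reached by an explicit request, never by automatic selection.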

Extension Points

Add new domain adapters without changing core contract semantics.
Attach skill packs to orchestration for specialist workflows.
Enable optional runtime systems (OCR, RL, prediction) after core health gates pass.
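
A minimal sketch of these extension points follows, assuming adapters share a small interface and the registry refuses to enable them until a core health check passes. `DomainAdapter`, `AdapterRegistry`, `EchoAdapter`, and the response fields are hypothetical. Only the `content_json.v1` contract name is taken from this page.

```python
from typing import Any, Callable, Dict, Protocol


class DomainAdapter(Protocol):
    """Assumed adapter interface; the real one may differ."""
    name: str

    def handle(self, payload: Dict[str, Any]) -> Dict[str, Any]: ...


class AdapterRegistry:
    def __init__(self, core_healthy: Callable[[], bool]):
        self._core_healthy = core_healthy
        self._adapters: Dict[str, DomainAdapter] = {}

    def enable(self, adapter: DomainAdapter) -> bool:
        """Register an adapter only after the core health gate passes."""
        if not self._core_healthy():
            return False
        self._adapters[adapter.name] = adapter
        return True

    def dispatch(self, name: str, payload: Dict[str, Any]) -> Dict[str, Any]:
        adapter = self._adapters.get(name)
        if adapter is None:
            raise KeyError(f"adapter not enabled: {name}")
        # Every adapter result is wrapped in the same envelope, so adding an
        # adapter does not change what downstream consumers see.
        return {
            "contract": "content_json.v1",
            "adapter": name,
            "content": adapter.handle(payload),
        }


class EchoAdapter:
    """Toy adapter used only to exercise the registry."""
    name = "echo"

    def handle(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        return {"echo": payload}
```

Because the registry wraps every adapter's output in the same envelope, a new adapter can be added without touching the contract that core consumers rely on.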

Curated Source Synthesis

This page condenses the repository's architecture documents into operator-safe guidance. It summarizes the key architecture outcomes rather than linking to the raw documents.

Runtime criticality is tiered so core availability remains isolated from optional modules.
Startup is dependency-ordered: serving and memory substrate are ready before orchestration admission.
Canonical request paths and trust tiers reduce accidental promotion of experimental routes.
Response contract normalization protects downstream systems from provider/model churn.
Ownership boundaries are mapped to platform, memory, tooling, and domain teams for clear accountability.

What Is Intentionally Not Included

Environment-specific secrets, credentials, and private endpoint values.
Detailed cutover/rollout playbooks and incident runbooks.
Internal migration task sequencing and unreleased feature timelines.
