Runsight

Runsight is a YAML-first workflow engine for AI agents that enables developers to design, commit, run, and evaluate agent workflows with built-in cost tracking and Git-native version control.

Introduction

Overview

Runsight is designed specifically for AI agent development and the growing complexity of orchestrating agents. It targets three recurring problems: workflow logic scattered across disparate Python files, little visibility into execution decisions, and unexpected cost overruns. Built for developers and technical teams working with AI agents, Runsight takes a Git-native approach in which every workflow exists as a YAML file in version control, so the development practices used for traditional software can be applied to AI agent systems. The product sits in the broader AI orchestration and workflow automation space and offers a structured alternative to ad-hoc agent implementations.

Key Features

YAML-first workflow design enables developers to define AI agent workflows using clean, version-controlled YAML files that live directly in the filesystem. The platform supports multiple block types including linear, gate, and code blocks, with each block referencing specific agent "souls" or configurations. This approach transforms AI workflows from scattered Python scripts into structured, maintainable configuration files that can be reviewed, branched, and merged using standard Git workflows.
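As a rough illustration of what a structured workflow buys over loose scripts, the sketch below models a workflow in Python and checks it for obvious mistakes. The three block types come from the description above; every field name (`id`, `type`, `soul`), the `validate` helper, and the example workflow are hypothetical and are not Runsight's actual YAML schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Block types named in the feature description.
BLOCK_TYPES = {"linear", "gate", "code"}


@dataclass
class Block:
    # Hypothetical fields; the real schema may differ.
    id: str
    type: str                   # expected to be one of BLOCK_TYPES
    soul: Optional[str] = None  # name of the agent configuration the block references


@dataclass
class Workflow:
    name: str
    blocks: List[Block] = field(default_factory=list)


def validate(workflow: Workflow, known_souls: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    seen_ids = set()
    for block in workflow.blocks:
        if block.id in seen_ids:
            problems.append(f"duplicate block id: {block.id}")
        seen_ids.add(block.id)
        if block.type not in BLOCK_TYPES:
            problems.append(f"{block.id}: unknown block type {block.type!r}")
        if block.soul is not None and block.soul not in known_souls:
            problems.append(f"{block.id}: unknown soul {block.soul!r}")
    return problems


# Illustrative workflow: classify a query, route it, then format the result.
triage = Workflow(
    name="support_triage",
    blocks=[
        Block(id="classify", type="linear", soul="classifier"),
        Block(id="route", type="gate", soul="router"),
        Block(id="format", type="code"),
    ],
)
triage_problems = validate(triage, known_souls={"classifier", "router"})
```

Because the whole workflow is plain data, a check like this can run in a pre-commit hook or code review before any model call is made.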

Dual canvas and YAML editing provides two synchronized views of the same workflow state. Developers can edit YAML in the Monaco editor with automatic canvas updates, or manipulate nodes visually on the canvas while maintaining clean YAML output. This bidirectional editing capability eliminates the disconnect between visual workflow design and actual implementation code, ensuring that both representations remain consistent throughout the development process.

Per-run cost tracking monitors every block execution down to the cent, providing real-time visibility into AI agent expenses. The system allows developers to set hard budget caps that automatically kill execution before overspending occurs, with detailed breakdowns showing cost accumulation per step. This feature addresses the common problem of surprise bills from uncontrolled AI agent runs by providing granular financial controls and transparency.
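The budget-cap idea can be sketched in a few lines of Python. This is a minimal illustration, not Runsight's implementation: the `RunBudget` class, the `BudgetExceeded` exception, and the assumption that a cost figure is available before it is committed are all hypothetical.

```python
from decimal import Decimal


class BudgetExceeded(Exception):
    """Raised when a charge would push the run past its hard cap."""


class RunBudget:
    def __init__(self, cap_usd: str):
        # Decimal avoids float rounding when tracking cents.
        self.cap = Decimal(cap_usd)
        self.spent = Decimal("0")
        self.by_block = {}  # block id -> accumulated cost

    def charge(self, block_id: str, cost_usd: str) -> None:
        """Record a block's cost, refusing any charge that would exceed the cap."""
        cost = Decimal(cost_usd)
        if self.spent + cost > self.cap:
            raise BudgetExceeded(
                f"{block_id}: {self.spent} + {cost} exceeds cap {self.cap}"
            )
        self.spent += cost
        self.by_block[block_id] = self.by_block.get(block_id, Decimal("0")) + cost


# Illustrative run with a 1.00 USD cap.
budget = RunBudget("1.00")
budget.charge("classify", "0.40")
budget.charge("route", "0.50")
try:
    budget.charge("respond", "0.20")  # would bring the total to 1.10
    stopped = False
except BudgetExceeded:
    stopped = True
```

The rejected charge leaves `spent` untouched, so the per-block breakdown still reflects only work that actually ran.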

Built-in evaluation framework incorporates assertions on every block output, transformation hooks for structured data extraction, and regression testing capabilities across multiple runs. This systematic approach to quality assurance replaces subjective "looks good" assessments with verifiable, automated testing of AI agent behavior and output quality, enabling more reliable production deployments.
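The three pieces described here (a transform hook, assertions on the output, and regression checks across runs) fit together roughly as follows. The function names and the shape of the block output are assumptions made for illustration only.

```python
import json


def extract_json(raw_output: str) -> dict:
    """Transform hook: turn a block's raw text output into structured data."""
    return json.loads(raw_output)


def evaluate(raw_output: str, transform, assertions: dict) -> dict:
    """Apply the transform, then run each named assertion; return {name: passed}."""
    value = transform(raw_output)
    return {name: bool(check(value)) for name, check in assertions.items()}


def regressions(baseline: dict, current: dict) -> list:
    """Names of assertions that passed in the baseline run but fail now."""
    return sorted(
        name for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )


# Hypothetical assertions for a classification block.
checks = {
    "has_label": lambda v: "label" in v,
    "confident": lambda v: v.get("confidence", 0) >= 0.8,
}

baseline_run = evaluate('{"label": "billing", "confidence": 0.9}', extract_json, checks)
current_run = evaluate('{"label": "billing", "confidence": 0.5}', extract_json, checks)
regressed = regressions(baseline_run, current_run)
```

Here the second run still produces a label but falls below the confidence threshold, so `regressed` names only the `confident` check.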

Runtime control mechanisms allow developers to pause running agents mid-execution for state inspection, then resume or terminate as needed. This capability prevents wasted computational resources and budget when agents encounter unexpected conditions or begin producing undesirable outputs, providing intervention points that traditional batch execution systems typically lack.
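One way to picture pause, resume, and kill is a run that advances one block at a time and checks its state before each step. The sketch below is a simplified, single-threaded model under that assumption; `ControlledRun` and `RunState` are hypothetical names, and a real engine would likely do this asynchronously.

```python
from enum import Enum


class RunState(Enum):
    RUNNING = "running"
    PAUSED = "paused"
    KILLED = "killed"
    DONE = "done"


class ControlledRun:
    def __init__(self, steps):
        self.steps = list(steps)  # callables: state dict in, state dict out
        self.position = 0
        self.state = RunState.RUNNING
        self.data = {}            # inspectable while paused

    def pause(self):
        if self.state is RunState.RUNNING:
            self.state = RunState.PAUSED

    def resume(self):
        if self.state is RunState.PAUSED:
            self.state = RunState.RUNNING

    def kill(self):
        if self.state is not RunState.DONE:
            self.state = RunState.KILLED

    def step(self) -> bool:
        """Execute the next step if the run is active; return True if one ran."""
        if self.state is not RunState.RUNNING:
            return False
        if self.position >= len(self.steps):
            self.state = RunState.DONE
            return False
        self.data = self.steps[self.position](self.data)
        self.position += 1
        if self.position == len(self.steps):
            self.state = RunState.DONE
        return True


# Illustrative two-step run, paused after the first step for inspection.
run = ControlledRun([
    lambda d: {**d, "category": "billing"},
    lambda d: {**d, "reply": "drafted"},
])
run.step()
run.pause()
ran_while_paused = run.step()   # no work happens while paused
snapshot = dict(run.data)       # state available for inspection
run.resume()
run.step()
```

Calling `kill()` instead of `resume()` would leave the second step unexecuted, which is the point at which further spend is avoided.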

Git-native workflow management treats every YAML workflow file as standard source code that can be diffed, committed, branched, and reviewed using existing Git tooling. The platform automatically scaffolds projects when none exist and integrates seamlessly with existing version control practices, making AI agent development feel like conventional software engineering rather than experimental scripting.
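Because a workflow is plain text, a change to it reads like any other code change. The sketch below uses Python's standard `difflib` to produce the kind of unified diff a reviewer would see; the YAML content and the file name are invented for illustration and do not reflect Runsight's real schema.

```python
import difflib


def workflow_diff(old_text: str, new_text: str, path: str = "workflow.yaml") -> list:
    """Return unified-diff lines between two versions of a workflow file."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
            lineterm="",
        )
    )


# Two hypothetical versions of the same workflow: only the soul changes.
old_version = "blocks:\n  - id: route\n    type: gate\n    soul: router_v1\n"
new_version = "blocks:\n  - id: route\n    type: gate\n    soul: router_v2\n"

diff_lines = workflow_diff(old_version, new_version)
print("\n".join(diff_lines))
```

The diff isolates the single changed line, which is what makes a review of an agent configuration change practical.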

Self-hosted deployment model ensures that all execution occurs on the user's infrastructure with their API keys and models, maintaining data privacy and security. The open-source nature of the platform eliminates vendor lock-in concerns, as workflows remain readable YAML configurations even if the Runsight platform itself were to disappear.

How It Works

The user journey begins with a single command-line installation using uvx, which starts the local server and opens the browser interface without requiring a cloud account or signup. On launch, the platform either connects to an existing project directory or scaffolds a new one with the appropriate structure. Developers then design workflows in the visual canvas or by editing YAML directly in the integrated Monaco editor, with both views kept in sync. API keys are added through an onboarding flow, and workflows are saved as YAML files in the local filesystem. A workflow runs with a single click, with real-time visualization of block execution and cost and latency metrics displayed per step. The entire system operates locally, with workflows automatically committed to Git repositories for version history and collaboration.

Use Cases

A solo AI developer building customer support automation can use Runsight to create a multi-stage agent workflow that first classifies incoming queries, then routes them to specialized analysis blocks, and finally generates personalized responses. The built-in evaluation framework ensures response quality meets predefined standards, while cost tracking prevents budget overruns during high-volume periods. This enables reliable, cost-controlled automation without the debugging nightmares of scattered Python scripts.

A research team at a financial institution developing market analysis agents benefits from the Git-native workflow management, allowing multiple researchers to collaborate on complex analytical pipelines through standard branch and merge workflows. The dual editing interface enables quantitative analysts to design workflows visually while developers refine the underlying YAML configurations, with per-run cost tracking ensuring compliance with departmental budget constraints for computational resources.

A startup building a content generation platform utilizes Runsight's pause and kill functionality to intervene when agents begin producing off-brand content, preventing public-facing quality issues. The self-hosted deployment keeps proprietary training data and model configurations within the company's infrastructure, while the structured YAML workflows enable rapid iteration and A/B testing of different agent configurations across content types and target audiences.

Who It's For

Runsight targets technical teams and individual developers working with AI agents, particularly those in organizations where version control, cost management, and systematic testing are priorities. The platform serves companies ranging from startups to enterprises that have moved beyond experimental AI implementations and require production-grade orchestration tools. Compared to alternatives like LangChain or custom Python implementations, Runsight distinguishes itself through its Git-native approach, integrated cost controls, and dual editing interface that bridges visual design with code-based configuration. The product assumes moderate technical proficiency but doesn't require deep infrastructure expertise thanks to its self-contained deployment model and straightforward command-line interface.
