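CLI/TUI: The Terminal-First Interaction Layer

Core source files for this chapter: cli.py (8,736 lines), hermes_cli/main.py (5,580 lines), hermes_cli/commands.py (1,025 lines)

Scope: this chapter dissects Hermes's terminal interaction layer, from the entry-point dispatch of the hermes command to HermesCLI's prompt_toolkit TUI, to show how the CLI-First bet lands in 8,736 lines of code. Prerequisites: Chapter 3 (the request journey) and Chapter 4 (the AIAgent core). Read it if you want to understand how the CLI handles interaction, or if you plan to add a new slash command.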

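Why the CLI Needs 8,736 Lines

A CLI is usually thought of as a "lightweight" entry point: parse arguments, call an API, print the result. Hermes's CLI is not a thin command-line wrapper, though; it is a full TUI application:

Multi-line editing: users can press Alt+Enter to compose a multi-line message instead of being confined to a single line
Streaming rendering: the model's response is rendered to the terminal token by token, with Markdown formatting
Interrupt and redirect: when the user types a new message while the agent is running, the agent is interrupted and handles the new message
Status bar: the bottom line shows live information such as the model name, token usage, and context occupancy
Multimodal interaction: clipboard image paste, voice input and TTS output, sudo password prompts
Slash command system: 40+ commands with autocompletion, aliases, and categorized help

These interaction requirements are what push cli.py to 8,736 lines. The growth is not arbitrary; it is the engineering cost of the CLI-First bet.

Entry-Layer Architecture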

graph TD
    A["hermes command"] --> B["hermes_cli/main.py<br/>argparse dispatch"]
    B --> C1["hermes chat<br/>default"]
    B --> C2["hermes setup"]
    B --> C3["hermes gateway"]
    B --> C4["hermes cron"]
    B --> C5["hermes doctor"]
    B --> C6["hermes model"]
    B --> C7["hermes tools"]
    B --> C8["hermes config"]
    B --> C9["hermes sessions"]
    B --> C10["hermes acp"]
    
    C1 --> D["cli.py:main 8525"]
    D --> E["HermesCLI.__init__"]
    E --> F1["Single-query mode<br/>cli.py:8696"]
    E --> F2["Interactive mode<br/>cli.py:7064 run"]
    
    F2 --> G["prompt_toolkit<br/>Application"]

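Two Layers of Entry Points

Hermes's CLI has two layers of entry points. The first is hermes_cli/main.py, which uses argparse to dispatch the hermes command to its subcommands:

# hermes_cli/main.py:0-43 (the docstring shows the full list of subcommands)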
"""
Usage:
    hermes                     # Interactive chat (default)
    hermes chat                # Interactive chat
    hermes gateway             # Run gateway in foreground
    hermes gateway start       # Start gateway as service
    hermes setup               # Interactive setup wizard
    hermes cron                # Manage cron jobs
    hermes doctor              # Check configuration and dependencies
    hermes acp                 # Run as an ACP server for editor integration
    hermes sessions browse     # Interactive session picker with search
"""

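Note the first line: running hermes with no subcommand drops straight into interactive chat. This is deliberate, a CLI-First design decision: the most common operation should not require remembering any subcommand.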
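The "no subcommand means chat" behavior can be sketched with the standard argparse module. This is a minimal illustration, not the code in hermes_cli/main.py: build_parser and resolve_command are hypothetical names, and only a handful of the subcommands from the docstring are registered.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a few of the subcommands listed in the docstring."""
    parser = argparse.ArgumentParser(prog="hermes")
    # dest="command" stores the chosen subcommand name; it stays None if omitted.
    subparsers = parser.add_subparsers(dest="command")
    for name in ("chat", "setup", "gateway", "cron", "doctor", "acp"):
        subparsers.add_parser(name)
    return parser


def resolve_command(argv: list) -> str:
    """Return the subcommand to run, falling back to interactive chat."""
    args = build_parser().parse_args(argv)
    return args.command or "chat"
```

Calling resolve_command([]) selects chat, while resolve_command(["doctor"]) selects doctor, so the default path costs the user nothing to remember.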

The second layer is cli.py:main() (cli.py:8525), the real entry point for interactive chat. It uses python-fire to expose the function's parameters as command-line options:

# cli.py:8525-8546
def main(
    query: str = None,
    q: str = None,
    toolsets: str = None,
    skills: str | list[str] | tuple[str, ...] = None,
    model: str = None,
    provider: str = None,
    api_key: str = None,
    base_url: str = None,
    max_turns: int = None,
    verbose: bool = False,
    quiet: bool = False,
    compact: bool = False,
    list_tools: bool = False,
    gateway: bool = False,
    resume: str = None,
    worktree: bool = False,
    w: bool = False,
    checkpoints: bool = False,
    pass_session_id: bool = False,
):
    ...  # function body omitted in this excerpt

The Profile System

Before any other module is imported, main.py performs a critical preprocessing step, the profile override (hermes_cli/main.py:82-136):

# hermes_cli/main.py:82-83
def _apply_profile_override() -> None:
    """Pre-parse --profile/-p and set HERMES_HOME before module imports."""

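Profiles let a user maintain several independent Hermes configurations (different API keys, different memories, different skills). The preprocessing must run before all other module imports, because many modules read HERMES_HOME at import time and cache it as a module-level constant. If the profile switch happened after those imports, the new HERMES_HOME value would never reach the modules that had already cached the old one.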
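The pre-parse idea can be sketched as follows. This is an assumption-laden illustration rather than the body of _apply_profile_override(): the helper names are hypothetical, the profiles/<name> directory layout is invented for the example, and the environment is passed in as a plain mapping so the sketch can be exercised without touching os.environ.

```python
from pathlib import Path
from typing import MutableMapping, Optional, Sequence


def extract_profile(argv: Sequence[str]) -> Optional[str]:
    """Scan raw arguments for --profile/-p without importing a full parser."""
    for i, arg in enumerate(argv):
        if arg in ("--profile", "-p") and i + 1 < len(argv):
            return argv[i + 1]
        if arg.startswith("--profile="):
            return arg.split("=", 1)[1]
    return None


def apply_profile_override(
    argv: Sequence[str],
    env: MutableMapping[str, str],
    base_dir: Path,
) -> None:
    """Set HERMES_HOME for the selected profile (hypothetical directory layout)."""
    name = extract_profile(argv)
    if name:
        # Assumption: each profile lives under <base_dir>/profiles/<name>.
        env["HERMES_HOME"] = str(base_dir / "profiles" / name)
```

In a real entry point this would be called with sys.argv[1:] and os.environ at the very top of the module, above the imports that read HERMES_HOME.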

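Interactive subcommands (hermes tools, hermes setup, hermes model) need terminal input. When they are invoked through a pipe (for example echo "" | hermes tools), curses and input() spin and burn 100% CPU. The _require_tty() guard (hermes_cli/main.py:52-66) checks whether stdin is a terminal before these commands run and rejects non-interactive invocations: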

# hermes_cli/main.py:52-66
def _require_tty(command_name: str) -> None:
    """Exit with a clear error if stdin is not a terminal."""
    if not sys.stdin.isatty():
        print(
            f"Error: 'hermes {command_name}' requires an interactive terminal.\n"
            f"It cannot be run through a pipe or non-interactive subprocess.",
            file=sys.stderr,
        )
        sys.exit(1)

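The HermesCLI Class: Core of the TUI

HermesCLI (cli.py:1307) is the heart of the CLI. Its job is not only to call AIAgent but also to build a complete prompt_toolkit application.

Initialization Chain

HermesCLI.__init__() (cli.py:1315-1456) resolves configuration from three sources, listed from highest to lowest priority:

Source	Example	Precedence
CLI arguments	`--model claude-opus-4-20250514`	Highest
config.yaml	`model.default: gpt-5.3-codex`	Middle
Environment variables	`HERMES_INFERENCE_PROVIDER=openrouter`	Lowest (honored only in some cases)

One important design decision: the LLM_MODEL and OPENAI_MODEL environment variables are not consulted (cli.py:1379-1383). In multi-agent setups these variables may be set by other tools and would cause unexpected model switches. Apart from an explicit CLI argument, config.yaml is the sole authoritative source for the model.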

# cli.py:1379-1386
# Model comes from: CLI arg or config.yaml (single source of truth).
# LLM_MODEL/OPENAI_MODEL env vars are NOT checked — config.yaml is
# authoritative.  This avoids conflicts in multi-agent setups where
# env vars would stomp each other.
_model_config = CLI_CONFIG.get("model", {})
_config_model = (_model_config.get("default") or _model_config.get("model") or "") \
    if isinstance(_model_config, dict) else (_model_config or "")
self.model = model or _config_model or _DEFAULT_CONFIG_MODEL

prompt_toolkit TUI Architecture

The run() method (cli.py:7064) builds a prompt_toolkit Application; prompt_toolkit is a mature terminal UI framework:

# cli.py:7064-7110 (key state initialization)
def run(self):
    """Run the interactive CLI loop with persistent input at bottom."""
    self._agent_running = False
    self._pending_input = queue.Queue()     # normal input (commands + new queries)
    self._interrupt_queue = queue.Queue()   # interrupt messages sent while the agent is running
    self._should_exit = False

The two-queue design is central to how the TUI handles interaction:

stateDiagram-v2
    [*] --> Idle
    
    Idle --> Processing: user input enters pending_input
    Processing --> Idle: agent returns a result
    
    Processing --> Interrupted: new user message enters interrupt_queue
    Interrupted --> Processing: agent handles the new message after the interrupt
    
    Processing --> Clarify: agent calls the clarify tool
    Clarify --> Processing: user picks an option or types an answer
    
    Processing --> Approval: dangerous command needs approval
    Approval --> Processing: user approves or denies
    
    Idle --> [*]: /quit

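run() also initializes a large set of UI state variables (cli.py:7123-7159):

_clarify_state / _clarify_freetext: question-and-answer state for the clarify tool
_sudo_state: sudo password prompt state
_approval_state: dangerous-command approval state
_secret_state: secret input state
_attached_images: clipboard image attachments
_voice_mode / _voice_recording: voice mode state

Each of these states has a corresponding deadline (timeout), so that a user who forgets to respond cannot block the agent indefinitely.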

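Key Bindings and Input Routing

run() registers the key bindings that matter (cli.py:7183-7313). Routing for the Enter key is the most complex part of the TUI, because where the input goes depends on the current UI state:

UI state	Enter behavior	Destination
Sudo password prompt	Submit the password	`_sudo_state["response_queue"]`
Secret input	Submit the secret	`_secret_state["response_queue"]`
Dangerous-command approval	Confirm the choice	`_approval_state["response_queue"]`
Clarify, free text	Submit the answer	`_clarify_state["response_queue"]`
Clarify, selection mode	Confirm the option	`_clarify_state["response_queue"]`
Agent running + text	Interrupt or enqueue	`_interrupt_queue` or `_pending_input`
Agent idle + text	Submit the query	`_pending_input`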
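The routing table can be read as a priority chain: modal prompts first, then the two main queues. The sketch below illustrates that reading only. UIState and route_enter are hypothetical names, the order in which the modal states are checked simply follows the table and is an assumption, and the two Clarify rows are collapsed into one state.

```python
import queue
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UIState:
    """Hypothetical stand-in for the state attributes kept on HermesCLI."""
    sudo: Optional[dict] = None
    secret: Optional[dict] = None
    approval: Optional[dict] = None
    clarify: Optional[dict] = None
    agent_running: bool = False
    busy_input_mode: str = "interrupt"
    pending_input: queue.Queue = field(default_factory=queue.Queue)
    interrupt_queue: queue.Queue = field(default_factory=queue.Queue)


def route_enter(ui: UIState, text: str) -> str:
    """Deliver text to its destination and return the route that was taken."""
    # 1. An active modal prompt consumes the input (order is an assumption).
    for name in ("sudo", "secret", "approval", "clarify"):
        state = getattr(ui, name)
        if state is not None:
            state["response_queue"].put(text)
            return name
    # 2. Nothing to submit.
    if not text.strip():
        return "ignored"
    # 3. Agent busy: interrupt it or enqueue, depending on busy_input_mode.
    if ui.agent_running and ui.busy_input_mode == "interrupt":
        ui.interrupt_queue.put(text)
        return "interrupt"
    # 4. Agent idle, or busy in queue mode: treat as a normal query.
    ui.pending_input.put(text)
    return "pending"
```

Keeping the decision in one function makes the precedence explicit: a pending sudo prompt always wins over an interrupt, so a password can never be mistaken for a new query.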

# cli.py:7279-7287
@kb.add('escape', 'enter')
def handle_alt_enter(event):
    """Alt+Enter inserts a newline for multi-line input."""
    event.current_buffer.insert_text('\n')

@kb.add('c-j')
def handle_ctrl_enter(event):
    """Ctrl+Enter (c-j) inserts a newline."""
    event.current_buffer.insert_text('\n')

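Alt+Enter and Ctrl+Enter insert a newline, so the user can compose a multi-line message without submitting it. This is a key capability that takes the CLI beyond a traditional readline REPL.

Interrupt and Redirect

When the user types a new message while the agent is running, one of two modes applies (cli.py:1358-1360):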

# cli.py:1358-1360
_bim = CLI_CONFIG["display"].get("busy_input_mode", "interrupt")
self.busy_input_mode = "queue" if str(_bim).strip().lower() == "queue" else "interrupt"

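interrupt mode (default): the new message goes into _interrupt_queue, the agent is interrupted, and the new message is handled immediately
queue mode: the new message goes into _pending_input and is handled after the current turn finishes

Interrupt mode is the core interaction innovation of CLI-First: the user does not have to wait for the agent to finish its current task before sending a new instruction. The chat() method (cli.py:6424) polls _interrupt_queue between API calls and tool executions, and as soon as it finds a new message it sets agent._interrupt_requested = True.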
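The polling pattern can be sketched in a single thread. This is a simplification under stated assumptions: FakeAgent and run_turn are hypothetical, each "step" stands in for one API call or tool execution, and the real chat() coordinates with an agent running elsewhere rather than driving the steps itself.

```python
import queue


class FakeAgent:
    """Hypothetical stand-in exposing only the interrupt flag named in the text."""

    def __init__(self) -> None:
        self._interrupt_requested = False


def run_turn(agent: FakeAgent, steps, interrupt_queue: queue.Queue):
    """Run steps in order, checking the interrupt queue before each one.

    Returns (results of completed steps, interrupting message or None).
    """
    completed = []
    for step in steps:
        try:
            new_message = interrupt_queue.get_nowait()  # non-blocking poll
        except queue.Empty:
            new_message = None
        if new_message is not None:
            # Flag the agent and hand the new message back to the caller.
            agent._interrupt_requested = True
            return completed, new_message
        completed.append(step())
    return completed, None
```

Because the check happens only between steps, a long-running step still finishes before the interrupt is noticed; that is the trade-off of polling over preemption.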

The Slash Command System

Command Registry

All slash commands are defined in COMMAND_REGISTRY in hermes_cli/commands.py (commands.py:45-144):

# hermes_cli/commands.py:26-38
@dataclass(frozen=True)
class CommandDef:
    """Definition of a single slash command."""
    name: str                          # canonical name: "background"
    description: str                   # human-readable description
    category: str                      # "Session", "Configuration", etc.
    aliases: tuple[str, ...] = ()      # aliases: ("bg",)
    args_hint: str = ""                # argument placeholder: "<prompt>"
    subcommands: tuple[str, ...] = ()  # subcommands for tab completion
    cli_only: bool = False             # available in the CLI only
    gateway_only: bool = False         # available in the Gateway only
    gateway_config_gate: str | None = None  # config gate

The registry is the single source of truth (the module docstring at commands.py:0-8 says so explicitly). CLI help, Gateway dispatch, Telegram BotCommands, Slack subcommand mapping, and autocompletion all derive their data from COMMAND_REGISTRY.

Commands are grouped by function:

Category	Commands	Examples
Session	16	`/new`, `/retry`, `/undo`, `/branch`, `/compress`, `/background`
Configuration	9	`/model`, `/prompt`, `/yolo`, `/reasoning`, `/voice`
Tools & Skills	7	`/tools`, `/skills`, `/cron`, `/browser`, `/plugins`
Info	7	`/help`, `/usage`, `/insights`, `/platforms`, `/paste`
Exit	1	`/quit` (aliases `/exit`, `/q`)

Plugin Command Registration

Third-party plugins can register new commands at runtime through register_plugin_command() (commands.py:172-175):

# hermes_cli/commands.py:172-175
def register_plugin_command(cmd: CommandDef) -> None:
    """Append a plugin-defined command to the registry and refresh lookups."""
    COMMAND_REGISTRY.append(cmd)
    rebuild_lookups()

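rebuild_lookups() (commands.py:178-199) rebuilds every derived lookup dictionary, which ensures that a plugin command shows up in help, autocompletion, and Gateway dispatch as soon as it is registered.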
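A reduced model of the registry shows why rebuilding the lookups on every registration keeps consumers in sync. The sketch is illustrative only: CommandDef is trimmed to four fields, the descriptions are made up, _LOOKUP and resolve are hypothetical names, and the real rebuild_lookups() maintains more derived structures than the single alias map shown here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CommandDef:
    name: str
    description: str
    category: str
    aliases: Tuple[str, ...] = ()


# Two entries whose names and aliases appear in the chapter; descriptions are invented.
COMMAND_REGISTRY: List[CommandDef] = [
    CommandDef("quit", "Exit the CLI", "Exit", aliases=("exit", "q")),
    CommandDef("background", "Run a prompt in the background", "Session", aliases=("bg",)),
]

_LOOKUP: Dict[str, CommandDef] = {}  # hypothetical derived index: name or alias -> command


def rebuild_lookups() -> None:
    """Recompute the derived index from the registry."""
    _LOOKUP.clear()
    for cmd in COMMAND_REGISTRY:
        _LOOKUP[cmd.name] = cmd
        for alias in cmd.aliases:
            _LOOKUP[alias] = cmd


def register_plugin_command(cmd: CommandDef) -> None:
    """Append a command and refresh the index so it is usable at once."""
    COMMAND_REGISTRY.append(cmd)
    rebuild_lookups()


def resolve(text: str) -> Optional[CommandDef]:
    """Map user input such as '/bg fix the tests' to its CommandDef."""
    parts = text.lstrip("/").split()
    return _LOOKUP.get(parts[0]) if parts else None


rebuild_lookups()
```

A full rebuild is cheap at this scale (a few dozen commands) and avoids the class of bugs where one derived dictionary is updated and another is forgotten.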

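CLI/Gateway Command Separation

The cli_only and gateway_only flags on CommandDef control which entry point sees a command:

/clear: cli_only=True, because clearing the screen means nothing on a messaging platform
/approve, /deny: gateway_only=True, because the CLI handles approval through its built-in UI
/model, /new: available on both sides

The gateway_config_gate field gives finer-grained control (commands.py:97): /verbose is cli_only by default, but if config.yaml sets display.tool_progress_command: true, the Gateway enables it as well.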
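The visibility rules above can be expressed as two small predicates. This is a sketch under assumptions: the helper names are hypothetical, CommandDef is reduced to the relevant fields, and the gate is assumed to be a dotted path into the loaded config dictionary, which the chapter implies but does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommandDef:
    name: str
    cli_only: bool = False
    gateway_only: bool = False
    gateway_config_gate: Optional[str] = None


def _config_flag(config: dict, dotted_key: str) -> bool:
    """Walk a dotted key such as 'display.tool_progress_command' through nested dicts."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict):
            return False
        node = node.get(part)
    return bool(node)


def visible_in_cli(cmd: CommandDef) -> bool:
    return not cmd.gateway_only


def visible_in_gateway(cmd: CommandDef, config: dict) -> bool:
    if not cmd.cli_only:
        return True
    # A cli_only command is exposed to the Gateway only if its gate is switched on.
    gate = cmd.gateway_config_gate
    return gate is not None and _config_flag(config, gate)
```

With this shape, /verbose stays hidden from the Gateway under an empty config and appears once display.tool_progress_command is true, while /clear (no gate) never does.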

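Callback Adaptation

The CLI adapts terminal I/O to AIAgent through its 11 callback interfaces (see Chapter 4). The core pattern is bridging the agent's synchronous callbacks to prompt_toolkit's asynchronous UI:

When the agent calls the clarify tool, clarify_callback writes the question and its options into _clarify_state. prompt_toolkit's render loop notices the state change and switches the UI into selection mode. Once the user picks an option with the arrow keys, the answer travels back to the agent thread through response_queue (a queue.Queue). The same cross-thread mechanism is used for sudo password entry (_sudo_state), secret capture (_secret_state), and dangerous-command approval (_approval_state).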
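The cross-thread handoff can be sketched with the standard queue and threading modules. ClarifyBridge and its method names are hypothetical, the UI side is reduced to a single submit_answer call, and returning None on timeout is an assumption standing in for whatever the deadline handling in cli.py actually does.

```python
import queue
import threading
from typing import List, Optional


class ClarifyBridge:
    """Hypothetical bridge between a blocking agent callback and a UI thread."""

    def __init__(self) -> None:
        self.state: Optional[dict] = None  # polled by the UI to decide what to draw

    def clarify_callback(
        self, question: str, choices: List[str], timeout: float = 30.0
    ) -> Optional[str]:
        """Runs on the agent thread: publish the question, then block for an answer."""
        response_queue: queue.Queue = queue.Queue()
        self.state = {
            "question": question,
            "choices": choices,
            "response_queue": response_queue,
        }
        try:
            # The timeout plays the role of the deadline: never block forever.
            return response_queue.get(timeout=timeout)
        except queue.Empty:
            return None
        finally:
            self.state = None  # return the UI to its normal input mode

    def submit_answer(self, answer: str) -> None:
        """Runs on the UI thread when the user confirms a choice."""
        state = self.state
        if state is not None:
            state["response_queue"].put(answer)


def ask_in_background(bridge: ClarifyBridge, out: dict) -> threading.Thread:
    """Start a fake agent thread that asks one question and stores the answer."""
    def worker() -> None:
        out["answer"] = bridge.clarify_callback("Which branch?", ["main", "dev"], timeout=5.0)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread
```

From the agent's point of view the callback is an ordinary blocking function call; all knowledge of rendering and key handling stays on the UI side of the queue.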

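Branching Logic in main()

main() (cli.py:8525) takes a different path depending on its arguments:

Path	Trigger	Behavior
Gateway	`--gateway`	Starts the messaging-platform gateway (`cli.py:8584`)
List tools	`--list-tools`	Prints the tool list and exits
Single query (quiet)	`-q "..." --quiet`	Runs the query and prints only the result and session_id
Single query (normal)	`-q "..."`	Shows the banner, then runs the query
Interactive mode	no `-q`	Starts the TUI REPL (`cli.py:8732`)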
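The table amounts to a short decision function. The sketch below is a reading of the table, not the code in cli.py: select_path is a hypothetical name, the order of the checks follows the table rows, and treating query and q as interchangeable is an assumption based on the signature shown earlier.

```python
from typing import Optional


def select_path(
    gateway: bool = False,
    list_tools: bool = False,
    query: Optional[str] = None,
    q: Optional[str] = None,
    quiet: bool = False,
) -> str:
    """Return a label for the branch that main() would take."""
    if gateway:
        return "gateway"
    if list_tools:
        return "list_tools"
    effective_query = query or q  # assumption: -q is a short alias for --query
    if effective_query:
        return "single_query_quiet" if quiet else "single_query"
    return "interactive"
```

Note that quiet only matters once a query is present; select_path(quiet=True) on its own still lands in interactive mode under this reading.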

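Worktree isolation (cli.py:8593-8614) is available in both interactive and single-query modes. When --worktree or -w is given, or when the worktree key is set in the CLI config, _setup_worktree() creates a separate git worktree so that the current agent instance works on an isolated branch: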

# cli.py:8597-8608
use_worktree = worktree or w or CLI_CONFIG.get("worktree", False)
if use_worktree:
    _repo = _git_repo_root()
    if _repo:
        _prune_stale_worktrees(_repo)
    wt_info = _setup_worktree()
    if wt_info:
        _active_worktree = wt_info
        os.environ["TERMINAL_CWD"] = wt_info["path"]
        atexit.register(_cleanup_worktree, wt_info)

Worktree information is also injected into the agent's system prompt (cli.py:8669-8678), so the agent knows it is working on an isolated branch and should commit and open a PR.
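For readers unfamiliar with git worktrees, the following sketch shows one way a helper like _setup_worktree() could be built on git worktree add. Everything specific here is an assumption: the function names, the .worktrees directory, and the hermes/<name> branch naming are invented for illustration and do not come from cli.py.

```python
import subprocess
import uuid
from pathlib import Path
from typing import List


def worktree_command(repo_root: Path, name: str) -> List[str]:
    """Build the git invocation that creates a new branch in its own worktree."""
    path = repo_root / ".worktrees" / name   # hypothetical location
    branch = f"hermes/{name}"                # hypothetical branch naming
    return ["git", "-C", str(repo_root), "worktree", "add", "-b", branch, str(path)]


def setup_worktree(repo_root: Path) -> dict:
    """Create an isolated worktree and return its path and branch name."""
    name = uuid.uuid4().hex[:8]
    cmd = worktree_command(repo_root, name)
    # check=True raises CalledProcessError if git refuses (e.g. not a repository).
    subprocess.run(cmd, check=True, capture_output=True)
    return {"path": cmd[-1], "branch": cmd[-2]}
```

The returned dictionary mirrors the wt_info["path"] usage in the excerpt above: the caller points the terminal working directory at the new path and registers cleanup for exit.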

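Design Takeaways

Taking apart the 8,736 lines of CLI/TUI code yields three design principles:

Interaction complexity is absorbed at the entry layer: interrupt and redirect, multimodal input, password prompts, and similar interaction logic all live in the CLI layer, while the orchestration layer (AIAgent) sees only a simple synchronous interface through callbacks. This lets the same AIAgent adapt, without modification, to entirely different interaction models such as the Gateway
A single command registry: COMMAND_REGISTRY is the only source of truth for slash commands, and every consumer (CLI help, Gateway, Telegram, Slack, autocompletion) derives from it. Adding a command means adding one CommandDef; all five consumers update automatically
Progressive complexity: hermes (no arguments) goes straight into chat, hermes -q runs a single query, hermes --toolsets customizes the toolset, and hermes -w isolates the workspace. Users unlock features as they need them, from simple to complex

Back to the design bets: this chapter is the central expression of the CLI-First bet. The 8,736 lines of TUI code show that Hermes does not treat the terminal as a "downgraded" web UI: multi-line editing, streaming rendering, interrupt and redirect, the status bar, voice input, and more put the terminal experience on par with a graphical interface. The callback system and the command registry also tie back to the Run Anywhere bet: one command system serves both the CLI and the Gateway.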

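Version History Notes

The core analysis in this chapter is based on Hermes Agent v0.8.0 (April 2026). HermesCLI's prompt_toolkit TUI and the unified command registry already existed around the v0.3.0 release window. Worktree isolation also landed early, within the v0.3.0 window, while busy_input_mode can be placed firmly in the v0.5.0 release window.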

Hermes Agent: Source Code and Design