diff --git a/docs/dev/cli/README.md b/docs/dev/cli/README.md
new file mode 100644
index 0000000000..85a80afc04
--- /dev/null
+++ b/docs/dev/cli/README.md
@@ -0,0 +1,494 @@
+# `rock job run` CLI 参数设计方案
+
+## 1. 背景
+
+当前 `rock/cli/command/job.py` 对 bash 和 harbor 两种 job 类型使用不同的输入方式：
+
+| job_type | 必填参数 | 其他参数 |
+|----------|----------|----------|
+| `bash`   | `--script` 或 `--script-content`（二选一） | `--image / --memory / --cpus / --env / --local-path / --timeout / ...` |
+| `harbor` | `--job_config`（YAML 文件） | `--image`（覆盖 YAML 中的字段） |
+
+存在的问题：
+
+1. **Bash 不支持 YAML 配置** — 真实业务里 bash 也会携带大量 env、setup_commands、uploads、image 等字段，挤在一行 CLI flag 里既难写又难复用。
+2. **`--type` 与入参强耦合** — `--type bash` 强制走 CLI flags 路径，`--type harbor` 强制走 YAML 路径，入参形态随类型跳变。实际上，"从 YAML 载入 config" 和 "类型"是正交的两个维度。
+3. **错误提示不友好** — 现有实现只打印一行 `logger.error`，不显示 help / 示例 / 合法组合，用户无法自助排查。
+4. **参数互斥关系隐式** — `--script` vs `--script-content`、`--job_config` vs `--script*`、`--type` vs YAML 里的类型等互斥/冲突关系散落在校验代码里，没有集中说明。
+
+已具备的能力（参考 `rock/sdk/job/config.py`）：
+
+- `JobConfig.from_yaml(path)` 已支持类型自动识别 —— 通过 `HarborJobConfig → BashJobConfig` 顺序的严格模型校验（`extra="forbid"`）定位子类。
+- `BashJobConfig` / `HarborJobConfig` 都是 `from_yaml` 的可用入口。
+
+所以 CLI 层只需要把"YAML 路径"升格为一等公民，即可统一两种类型的入参形态。
+
+---
+
+## 2. 设计目标
+
+1. **YAML 优先** — `--job_config` 是推荐路径，适用于所有 job 类型；CLI flags 是简化入口，仅适用于 bash。
+2. **单一输入模式** — `--job_config` 模式与 flags 模式互斥，不允许混用以避免配置优先级歧义。
+3. **类型自动识别** — 使用 `--job_config` 时不需要 `--type`；`--type` 仅作为 flags 模式下的显式声明（默认 `bash`）。
+4. **错误提示优雅** — 任何参数校验失败都：
+   - 清晰指出错了什么；
+   - 给出合法组合示例；
+   - 最后附上 `rock job run --help` 提示。
+5. **argparse 原生风格** — 使用 `parser.error()`，退出码为 2，日志/错误都写 stderr，与其他 CLI 子命令一致。
+
+---
+
+## 3. 入参规则
+
+### 3.1 两种互斥的输入模式
+
+```
+┌──────────────────────────────────────────────┬───────────────────────────────────────────────┐
+│  Mode A: YAML config（推荐，通用）              │  Mode B: CLI flags（简化，仅 bash）           │
+├──────────────────────────────────────────────┼───────────────────────────────────────────────┤
+│  rock job run --job_config job.yaml                │  rock job run [--type bash]                   │
+│                [--image ...] [--cluster ...]   │                --script path/to/script.sh     │
+│                [--base-url ...] [--timeout N]  │               (或 --script-content "...")     │
+│                                                │                [--image / --memory / --cpus] │
+│                                                │                [--env / --local-path / ...]  │
+└──────────────────────────────────────────────┴───────────────────────────────────────────────┘
+```
+
+### 3.2 参数分类
+
+| 组别 | 参数 | Mode A 允许 | Mode B 允许 |
+|------|------|:-----------:|:-----------:|
+| **模式开关** | `--job_config PATH` | ✅（必填） | ❌ |
+| **类型声明** | `--type {bash,harbor}` | ❌（类型由 YAML 决定） | ✅（可选，默认 `bash`；若指定 `harbor` 必须配 `--job_config`） |
+| **bash 脚本** | `--script / --script-content` | ❌ | ✅（二选一，必填） |
+| **环境覆盖** | `--image / --memory / --cpus` | ✅（覆盖 YAML 中对应字段） | ✅ |
+| **运行位置** | `--base-url / --cluster / --extra-headers / --xrl-authorization` | ✅ | ✅ |
+| **环境变量** | `--env KEY=VALUE`（可重复） | ✅（追加/覆盖 YAML 的 `environment.env`） | ✅ |
+| **文件上传** | `--local-path / --target-path` | ✅（追加到 YAML 的 `environment.uploads`） | ✅ |
+| **超时** | `--timeout N` | ✅（覆盖 YAML 的 `timeout`） | ✅ |
+
+### 3.3 校验矩阵
+
+| 情况 | 判定 | 行为 |
+|------|------|------|
+| 两种 mode 都没给（既没 `--job_config` 也没 `--script*`） | 错误 | 打印缺参提示 + usage + help 指引 |
+| 同时给 `--job_config` 和 `--script` / `--script-content` | 错误 | 明确告诉用户两种模式互斥，二选一 |
+| 同时给 `--script` 和 `--script-content` | 错误 | 告知"inline 内容和文件路径只能二选一" |
+| `--type harbor` 但没 `--job_config` | 错误 | 告知 harbor 类型必须走 YAML |
+| `--type bash` + `--job_config` 指向 HarborJobConfig YAML | 错误 | 告知类型冲突（YAML 自动识别为 harbor） |
+| `--job_config` 文件不存在 / 解析失败 | 错误 | 转发 `from_yaml` 的 `ValueError` 细节 |
+
+---
+
+## 4. 错误提示设计
+
+所有入参校验失败都走一个统一的 `_fail(parser, msg, hint=None)` 辅助函数，保持风格一致。
+
+### 4.1 辅助函数示意
+
+```python
+def _fail(parser: argparse.ArgumentParser, msg: str, *, hint: str | None = None) -> None:
+    """Emit a consistent CLI error: message + optional hint + help指引，然后退出 2。"""
+    lines = [msg]
+    if hint:
+        lines.append("")
+        lines.append(hint)
+    lines.append("")
+    lines.append("Run `rock job run --help` for full usage.")
+    parser.error("\n".join(lines))
+```
+
+`parser.error()` 会：先打印 usage（来自 argparse），再把我们给的消息写到 stderr，最后以 exit code 2 退出 —— 这正是 argparse 原生校验失败时的行为，一致性好。
+
+### 4.2 错误样例（对用户实际看到的输出）
+
+**Case 1: 什么都没给**
+
+```
+usage: rock job run [-h] [--type {bash,harbor}] [--job_config CONFIG] [--script SCRIPT] ...
+rock job run: error: Missing job definition. Provide either a YAML config or inline script.
+
+Examples:
+  rock job run --job_config job.yaml                     # any job type, auto-detected
+  rock job run --script path/to/run.sh               # bash, script file
+  rock job run --script-content "echo hi"            # bash, inline snippet
+
+Run `rock job run --help` for full usage.
+```
+
+**Case 2: 同时给了 `--job_config` 和 `--script`**
+
+```
+rock job run: error: --job_config is mutually exclusive with --script / --script-content.
+
+Pick one mode:
+  • YAML mode:  rock job run --job_config job.yaml
+  • flags mode: rock job run --script run.sh
+
+Run `rock job run --help` for full usage.
+```
+
+**Case 3: `--type harbor` 没带 `--job_config`**
+
+```
+rock job run: error: --type harbor requires --job_config <yaml>.
+
+Harbor jobs cannot be expressed purely via CLI flags.
+Example:
+  rock job run --job_config harbor.yaml
+
+Run `rock job run --help` for full usage.
+```
+
+**Case 4: `--script` 和 `--script-content` 同时给**
+
+```
+rock job run: error: --script and --script-content are mutually exclusive (pick a file path OR an inline snippet).
+
+Run `rock job run --help` for full usage.
+```
+
+**Case 5: YAML 解析失败（来自 `JobConfig.from_yaml`）**
+
+```
+rock job run: error: Failed to load --job_config 'job.yaml':
+YAML does not match any known job type.
+  As HarborJobConfig: 1 validation error for HarborJobConfig
+    experiment_id: Field required [type=missing, ...]
+  As BashJobConfig:   1 validation error for BashJobConfig
+    agents: Extra inputs are not permitted [type=extra_forbidden, ...]
+
+Run `rock job run --help` for full usage.
+```
+
+---
+
+## 5. 实现草图
+
+```python
+# rock/cli/command/job.py
+
+import argparse
+from pathlib import Path
+
+from rock.cli.command.command import Command
+from rock.logger import init_logger
+
+logger = init_logger(__name__)
+
+
+def _fail(parser: argparse.ArgumentParser, msg: str, *, hint: str | None = None) -> None:
+    """Emit a consistent CLI error + help pointer, then exit 2."""
+    parts = [msg]
+    if hint:
+        parts.extend(["", hint])
+    parts.extend(["", "Run `rock job run --help` for full usage."])
+    parser.error("\n".join(parts))
+
+
+class JobCommand(Command):
+    name = "job"
+
+    # The run sub-parser is stashed here so arun() can call parser.error().
+    _run_parser: argparse.ArgumentParser | None = None
+
+    async def arun(self, args: argparse.Namespace):
+        if args.job_command == "run":
+            await self._job_run(args)
+        else:
+            logger.error(f"Unknown job subcommand: {args.job_command}")
+
+    async def _job_run(self, args: argparse.Namespace):
+        from rock.sdk.job import Job
+        from rock.sdk.job.config import BashJobConfig, JobConfig
+
+        parser = self._run_parser  # set in add_parser_to
+
+        # ── 1. 模式互斥校验 ──────────────────────────────────────────
+        has_config = bool(args.config)
+        has_script = bool(args.script or args.script_content)
+
+        if not has_config and not has_script:
+            _fail(
+                parser,
+                "Missing job definition. Provide either a YAML config or inline script.",
+                hint=(
+                    "Examples:\n"
+                    "  rock job run --job_config job.yaml                     # any job type, auto-detected\n"
+                    "  rock job run --script path/to/run.sh               # bash, script file\n"
+                    "  rock job run --script-content \"echo hi\"            # bash, inline snippet"
+                ),
+            )
+
+        if has_config and has_script:
+            _fail(
+                parser,
+                "--job_config is mutually exclusive with --script / --script-content.",
+                hint=(
+                    "Pick one mode:\n"
+                    "  • YAML mode:  rock job run --job_config job.yaml\n"
+                    "  • flags mode: rock job run --script run.sh"
+                ),
+            )
+
+        if args.script and args.script_content:
+            _fail(
+                parser,
+                "--script and --script-content are mutually exclusive "
+                "(pick a file path OR an inline snippet).",
+            )
+
+        if args.type == "harbor" and not has_config:
+            _fail(
+                parser,
+                "--type harbor requires --job_config <yaml>.",
+                hint=(
+                    "Harbor jobs cannot be expressed purely via CLI flags.\n"
+                    "Example:\n"
+                    "  rock job run --job_config harbor.yaml"
+                ),
+            )
+
+        # ── 2. 组装 config ──────────────────────────────────────────
+        if has_config:
+            config = self._config_from_yaml(parser, args)
+        else:
+            config = self._config_from_flags(args)
+
+        # ── 3. 用 CLI overrides 打补丁（两种模式共用）────────────────
+        self._apply_overrides(config, args)
+
+        # ── 4. 运行 ────────────────────────────────────────────────
+        try:
+            result = await Job(config).run()
+            if result.trial_results:
+                for tr in result.trial_results:
+                    output = getattr(tr, "raw_output", None) or ""
+                    if output:
+                        print(output)
+            logger.info(f"Job completed: status={result.status}")
+        except Exception as e:
+            logger.error(f"Job failed: {e}")
+
+    # ── helpers ───────────────────────────────────────────────────
+
+    def _config_from_yaml(self, parser, args):
+        from rock.sdk.job.config import BashJobConfig, JobConfig
+
+        path = args.config
+        if not Path(path).is_file():
+            _fail(parser, f"--job_config path does not exist: {path}")
+
+        try:
+            config = JobConfig.from_yaml(path)  # 自动识别 Bash/Harbor
+        except (ValueError, Exception) as exc:
+            _fail(parser, f"Failed to load --job_config {path!r}:\n{exc}")
+
+        # --type 显式声明时做一次一致性检查
+        if args.type:
+            expected_type = args.type
+            actual_type = "bash" if isinstance(config, BashJobConfig) else "harbor"
+            if expected_type != actual_type:
+                _fail(
+                    parser,
+                    f"--type {expected_type} does not match YAML (detected as {actual_type}).",
+                    hint="Remove --type and let the YAML decide, or pass a matching config file.",
+                )
+        return config
+
+    def _config_from_flags(self, args):
+        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
+        from rock.sdk.job.config import BashJobConfig
+
+        env = {}
+        for item in args.env or []:
+            key, _, value = item.partition("=")
+            env[key] = value
+
+        uploads = [(args.local_path, args.target_path)] if args.local_path else []
+
+        return BashJobConfig(
+            script=args.script_content,
+            script_path=args.script,
+            environment=RockEnvironmentConfig(
+                image=args.image,
+                memory=args.memory,
+                cpus=args.cpus,
+                base_url=args.base_url,
+                cluster=args.cluster,
+                extra_headers=args.extra_headers,
+                xrl_authorization=args.xrl_authorization,
+                uploads=uploads,
+                auto_stop=True,
+                env=env,
+            ),
+            timeout=args.timeout,
+        )
+
+    def _apply_overrides(self, config, args):
+        """Apply CLI overrides that are valid in both modes (e.g. --image)."""
+        env = config.environment
+        if args.image:
+            env.image = args.image
+        if args.memory:
+            env.memory = args.memory
+        if args.cpus:
+            env.cpus = args.cpus
+        if args.base_url:
+            env.base_url = args.base_url
+        if args.cluster:
+            env.cluster = args.cluster
+        if args.xrl_authorization:
+            env.xrl_authorization = args.xrl_authorization
+        # --env / --local-path / --timeout 在 YAML 模式下也追加/覆盖
+        for item in args.env or []:
+            key, _, value = item.partition("=")
+            env.env[key] = value
+        if args.local_path:
+            env.uploads = list(env.uploads) + [(args.local_path, args.target_path)]
+        if args.timeout is not None:
+            config.timeout = args.timeout
+        env.auto_stop = True
+
+    # ── parser ────────────────────────────────────────────────────
+
+    @staticmethod
+    async def add_parser_to(subparsers: argparse._SubParsersAction):
+        job_parser = subparsers.add_parser("job", help="Manage sandbox jobs")
+        job_subparsers = job_parser.add_subparsers(dest="job_command")
+
+        run_parser = job_subparsers.add_parser(
+            "run",
+            help="Run a job in a sandbox",
+            description=(
+                "Run a sandbox job in one of two modes:\n"
+                "  (1) YAML mode  : --job_config <file>         (type auto-detected)\n"
+                "  (2) flags mode : --script / --script-content  (bash only)\n"
+                "The two modes are mutually exclusive."
+            ),
+            formatter_class=argparse.RawDescriptionHelpFormatter,
+        )
+
+        # mode switches
+        run_parser.add_argument(
+            "--type",
+            choices=["bash", "harbor"],
+            default=None,
+            help="Explicit job type (flags mode only; YAML mode auto-detects).",
+        )
+        run_parser.add_argument("--job_config", default=None, help="YAML config path (any job type).")
+        run_parser.add_argument("--script", default=None, help="Bash script file path (flags mode).")
+        run_parser.add_argument("--script-content", default=None, help="Inline bash snippet (flags mode).")
+
+        # shared overrides
+        run_parser.add_argument("--image", default=None, help="Sandbox image (overrides YAML).")
+        run_parser.add_argument("--memory", default=None, help="Memory (e.g. 8g). Overrides YAML.")
+        run_parser.add_argument("--cpus", default=None, type=float, help="CPU count. Overrides YAML.")
+        run_parser.add_argument("--timeout", type=int, default=None, help="Timeout in seconds.")
+        run_parser.add_argument("--local-path", default=None, help="Local dir to upload.")
+        run_parser.add_argument("--target-path", default="/root/job", help="Target dir in sandbox.")
+        run_parser.add_argument("--base-url", default=None, help="Admin service base URL.")
+        run_parser.add_argument("--cluster", default=None, help="Cluster name (e.g. vpc-sg-sl-a).")
+        run_parser.add_argument("--extra-headers", default=None, help="Extra HTTP headers (JSON).")
+        run_parser.add_argument(
+            "--env",
+            action="append",
+            default=None,
+            metavar="KEY=VALUE",
+            help="Environment variable, repeatable (e.g. --env FOO=bar --env BAZ=qux).",
+        )
+        run_parser.add_argument("--xrl-authorization", default=None, help="XRL authorization token.")
+
+        # stash on command class so arun() can call parser.error() consistently
+        JobCommand._run_parser = run_parser
+```
+
+> 注：`parser.error()` 在 argparse 内部 `sys.exit(2)`，因此 `arun()` 里不需要 return；后续代码永远不会执行到。
+
+---
+
+## 6. 使用示例
+
+### 6.1 Bash — flags 模式（原行为保持）
+
+```bash
+rock job run --script ./train.sh \
+    --image python:3.11 --memory 8g --cpus 4 \
+    --env FOO=bar --env BAZ=qux \
+    --local-path ./workspace --target-path /root/job
+```
+
+### 6.2 Bash — YAML 模式（新能力）
+
+`bash_job.yaml`：
+
+```yaml
+script_path: ./train.sh
+timeout: 7200
+environment:
+  image: python:3.11
+  memory: 8g
+  cpus: 4
+  env:
+    FOO: bar
+    BAZ: qux
+  uploads:
+    - ["./workspace", "/root/job"]
+```
+
+```bash
+rock job run --job_config bash_job.yaml
+
+# 仍可以用 CLI override 局部字段
+rock job run --job_config bash_job.yaml --image python:3.12 --timeout 1800
+```
+
+### 6.3 Harbor — YAML 模式（与之前一致）
+
+```bash
+rock job run --job_config harbor.yaml
+rock job run --job_config harbor.yaml --image harbor-runner:v2   # 覆盖 image
+```
+
+---
+
+## 7. 向后兼容
+
+| 用法 | 是否仍然可用 | 说明 |
+|------|:------------:|------|
+| `rock job run --type bash --script foo.sh` | ✅ | 等价于 flags 模式，`--type bash` 为显式声明。 |
+| `rock job run --script foo.sh` | ✅ | `--type` 默认视为 `bash`。 |
+| `rock job run --type harbor --job_config harbor.yaml` | ✅ | 显式声明 + YAML 校验一致性。 |
+| `rock job run --job_config harbor.yaml` | ✅ | 推荐新用法，自动识别。 |
+| `rock job run --job_config ... --script ...` | ❌（新增校验） | 之前不明确，现在明确报错。 |
+| `rock job run`（无参） | ❌（错误消息更友好） | 之前报错含糊，现在打印 usage + 示例。 |
+
+没有被移除的 flag，没有破坏性变更。
+
+---
+
+## 8. 变更清单
+
+- `rock/cli/command/job.py`
+  - 新增 `_fail()` 辅助函数
+  - 拆分 `_config_from_yaml` / `_config_from_flags` / `_apply_overrides`
+  - `--type` 默认改为 `None`（由模式推导），仅做一致性校验
+  - `--timeout` 默认改为 `None`（不覆盖 YAML 中的 timeout，除非显式给出）
+  - `--job_config` 支持 bash 类型
+  - 所有入参错误走 `parser.error()`，统一格式
+  - `description` 里写清楚两种模式（`RawDescriptionHelpFormatter`）
+- `tests/unit/cli/command/test_job.py`（新增 / 扩展）
+  - Case: 两种模式互斥
+  - Case: `--type harbor` 无 `--job_config`
+  - Case: YAML 模式下 `--type` 与 YAML 一致 / 冲突
+  - Case: bash YAML 正常加载 + CLI override 生效
+  - Case: `parser.error()` 以 exit code 2 退出、stderr 包含 usage
+
+---
+
+## 9. 未来扩展
+
+1. **`rock job validate --job_config foo.yaml`** — 只做 YAML 校验 + 类型识别，便于 CI 中 lint。
+2. **`rock job run --job_config foo.yaml --dry-run`** — 打印最终合并后的 config（含 CLI overrides）但不实际提交。
+3. **多个 `--job_config`** — 支持 config 合并（base + override），对应 `JobConfig.model_validate` 的深合并。
+4. **`ROCK_JOB_CONFIG` 环境变量** — 作为 `--job_config` 的默认值，与 `ROCK_CONFIG` 风格一致。
diff --git a/rock/cli/command/job.py b/rock/cli/command/job.py
index 7142c662a8..9c1bb8c93c 100644
--- a/rock/cli/command/job.py
+++ b/rock/cli/command/job.py
@@ -6,9 +6,26 @@
 logger = init_logger(__name__)
 
 
+def _fail(parser: argparse.ArgumentParser, msg: str, *, hint: str | None = None) -> None:
+    """Emit a consistent CLI error: message + optional hint + help pointer, then exit 2.
+
+    Uses ``parser.error()`` which prints the parser's usage line to stderr, writes
+    the message, and calls ``sys.exit(2)``. Never returns.
+    """
+    parts = [msg]
+    if hint:
+        parts.extend(["", hint])
+    parts.extend(["", "Run `rock job run --help` for full usage."])
+    parser.error("\n".join(parts))
+
+
 class JobCommand(Command):
     name = "job"
 
+    # Cached reference to the `run` sub-parser; populated by add_parser_to,
+    # used by _job_run to call parser.error() consistently.
+    _run_parser: argparse.ArgumentParser | None = None
+
     async def arun(self, args: argparse.Namespace):
         if args.job_command == "run":
             await self._job_run(args)
@@ -17,73 +34,64 @@ async def arun(self, args: argparse.Namespace):
 
     async def _job_run(self, args: argparse.Namespace):
         # Import lazily to avoid pulling in bench/Harbor modules for bash-only uses
-        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
         from rock.sdk.job import Job
-        from rock.sdk.job.config import BashJobConfig
 
-        job_type = args.type or "bash"
-
-        if job_type == "bash":
-            if not args.script and not args.script_content:
-                logger.error("Either --script or --script-content is required for bash type")
-                return
-            if args.script and args.script_content:
-                logger.error("--script and --script-content cannot be used together")
-                return
-
-            env_kwargs = {}
-            if args.image:
-                env_kwargs["image"] = args.image
-            if args.memory:
-                env_kwargs["memory"] = args.memory
-            if args.cpus:
-                env_kwargs["cpus"] = args.cpus
-            if getattr(args, "base_url", None):
-                env_kwargs["base_url"] = args.base_url
-            if getattr(args, "cluster", None):
-                env_kwargs["cluster"] = args.cluster
-            if getattr(args, "extra_headers", None):
-                env_kwargs["extra_headers"] = args.extra_headers
-            if getattr(args, "xrl_authorization", None):
-                env_kwargs["xrl_authorization"] = args.xrl_authorization
-
-            uploads = []
-            if args.local_path:
-                uploads.append((args.local_path, args.target_path))
-
-            env = {}
-            if getattr(args, "env", None):
-                for item in args.env:
-                    key, _, value = item.partition("=")
-                    env[key] = value
-
-            config = BashJobConfig(
-                script=args.script_content,
-                script_path=args.script,
-                environment=RockEnvironmentConfig(
-                    **env_kwargs,
-                    uploads=uploads,
-                    auto_stop=True,
-                    env=env,
+        parser = self._run_parser
+
+        # ── 1. Mode validation ────────────────────────────────────────
+        has_config = bool(args.job_config)
+        has_script = bool(args.script or args.script_content)
+
+        if not has_config and not has_script:
+            _fail(
+                parser,
+                "Missing job definition. Provide either a YAML config or inline script.",
+                hint=(
+                    "Examples:\n"
+                    "  rock job run --job_config job.yaml                 # any job type, auto-detected\n"
+                    "  rock job run --script path/to/run.sh               # bash, script file\n"
+                    '  rock job run --script-content "echo hi"            # bash, inline snippet'
+                ),
+            )
+
+        if has_config and has_script:
+            _fail(
+                parser,
+                "--job_config is mutually exclusive with --script / --script-content.",
+                hint=(
+                    "Pick one mode:\n"
+                    "  - YAML mode:  rock job run --job_config job.yaml\n"
+                    "  - flags mode: rock job run --script run.sh"
                 ),
-                timeout=args.timeout,
             )
 
-        elif job_type == "harbor":
-            if not args.config:
-                logger.error("--config is required for harbor type")
-                return
-            from rock.sdk.bench.models.job.config import HarborJobConfig
+        if args.script and args.script_content:
+            _fail(
+                parser,
+                "--script and --script-content are mutually exclusive (pick a file path OR an inline snippet).",
+            )
 
-            config = HarborJobConfig.from_yaml(args.config)
-            if args.image:
-                config.environment.image = args.image
-            config.environment.auto_stop = True
+        if args.type == "harbor" and not has_config:
+            _fail(
+                parser,
+                "--type harbor requires --job_config <yaml>.",
+                hint=(
+                    "Harbor jobs cannot be expressed purely via CLI flags.\n"
+                    "Example:\n"
+                    "  rock job run --job_config harbor.yaml"
+                ),
+            )
 
+        # ── 2. Build config ───────────────────────────────────────────
+        if has_config:
+            config = self._config_from_yaml(parser, args)
         else:
-            logger.error(f"Unknown job type: {job_type}")
-            return
+            config = self._config_from_flags(args)
+
+        # ── 3. Apply overrides (shared across both modes) ─────────────
+        self._apply_overrides(config, args)
 
+        # ── 4. Run ────────────────────────────────────────────────────
         try:
             result = await Job(config).run()
             if result.trial_results:
@@ -95,23 +103,157 @@ async def _job_run(self, args: argparse.Namespace):
         except Exception as e:
             logger.error(f"Job failed: {e}")
 
+    def _apply_overrides(self, config, args: argparse.Namespace) -> None:
+        """Apply CLI overrides that are valid in both YAML and flags modes.
+
+        Mutates ``config`` in place. Works for both BashJobConfig and HarborJobConfig
+        because both use ``RockEnvironmentConfig`` for ``environment``.
+        """
+        env = config.environment
+        if args.image:
+            env.image = args.image
+        if args.memory:
+            env.memory = args.memory
+        if args.cpus:
+            env.cpus = args.cpus
+        if getattr(args, "base_url", None):
+            env.base_url = args.base_url
+        if getattr(args, "cluster", None):
+            env.cluster = args.cluster
+        if getattr(args, "extra_headers", None):
+            env.extra_headers = args.extra_headers
+        if getattr(args, "xrl_authorization", None):
+            env.xrl_authorization = args.xrl_authorization
+
+        for item in args.env or []:
+            key, _, value = item.partition("=")
+            env.env[key] = value
+
+        if args.local_path:
+            env.uploads = list(env.uploads) + [(args.local_path, args.target_path)]
+
+        if args.timeout is not None:
+            config.timeout = args.timeout
+
+        env.auto_stop = True
+
+    def _config_from_yaml(self, parser: argparse.ArgumentParser, args: argparse.Namespace):
+        """Load config via JobConfig.from_yaml and enforce --type consistency."""
+        from pathlib import Path
+
+        from rock.sdk.job.config import BashJobConfig, JobConfig
+
+        path = args.job_config
+        if not Path(path).is_file():
+            _fail(parser, f"--job_config path does not exist: {path}")
+
+        try:
+            config = JobConfig.from_yaml(path)
+        except ValueError as exc:
+            # from_yaml raises ValueError with a combined Bash/Harbor error message
+            _fail(parser, f"Failed to load --job_config {path!r}:\n{exc}")
+        except Exception as exc:  # YAML parse error, IO error, etc.
+            _fail(parser, f"Failed to load --job_config {path!r}:\n{exc}")
+
+        if args.type is not None:
+            actual_type = "bash" if isinstance(config, BashJobConfig) else "harbor"
+            if args.type != actual_type:
+                _fail(
+                    parser,
+                    f"--type {args.type} does not match YAML (detected as {actual_type}).",
+                    hint="Remove --type and let the YAML decide, or pass a matching config file.",
+                )
+        return config
+
+    def _config_from_flags(self, args: argparse.Namespace):
+        """Build a BashJobConfig purely from CLI flags (mode B)."""
+        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
+        from rock.sdk.job.config import BashJobConfig
+
+        env: dict[str, str] = {}
+        for item in args.env or []:
+            key, _, value = item.partition("=")
+            env[key] = value
+
+        uploads = [(args.local_path, args.target_path)] if args.local_path else []
+
+        env_kwargs: dict = {}
+        if args.image:
+            env_kwargs["image"] = args.image
+        if args.memory:
+            env_kwargs["memory"] = args.memory
+        if args.cpus:
+            env_kwargs["cpus"] = args.cpus
+        if getattr(args, "base_url", None):
+            env_kwargs["base_url"] = args.base_url
+        if getattr(args, "cluster", None):
+            env_kwargs["cluster"] = args.cluster
+        if getattr(args, "extra_headers", None):
+            env_kwargs["extra_headers"] = args.extra_headers
+        if getattr(args, "xrl_authorization", None):
+            env_kwargs["xrl_authorization"] = args.xrl_authorization
+
+        cfg_kwargs: dict = {}
+        if args.timeout is not None:
+            cfg_kwargs["timeout"] = args.timeout
+
+        return BashJobConfig(
+            script=args.script_content,
+            script_path=args.script,
+            environment=RockEnvironmentConfig(
+                **env_kwargs,
+                uploads=uploads,
+                auto_stop=True,
+                env=env,
+            ),
+            **cfg_kwargs,
+        )
+
     @staticmethod
     async def add_parser_to(subparsers: argparse._SubParsersAction):
         job_parser = subparsers.add_parser("job", help="Manage sandbox jobs")
         job_subparsers = job_parser.add_subparsers(dest="job_command")
 
-        run_parser = job_subparsers.add_parser("run", help="Run a job in a sandbox")
-        run_parser.add_argument("--type", choices=["bash", "harbor"], default="bash", help="Job type (default: bash)")
+        run_parser = job_subparsers.add_parser(
+            "run",
+            help="Run a job in a sandbox",
+            description=(
+                "Run a sandbox job in one of two mutually-exclusive modes:\n"
+                "  (1) YAML mode  : --job_config <file>          (type auto-detected)\n"
+                "  (2) flags mode : --script / --script-content  (bash only)"
+            ),
+            formatter_class=argparse.RawDescriptionHelpFormatter,
+        )
+        run_parser.add_argument(
+            "--type",
+            choices=["bash", "harbor"],
+            default=None,
+            help="Explicit job type (flags mode only; YAML mode auto-detects).",
+        )
         # bash args
         run_parser.add_argument("--script", default=None, help="Path to script file")
         run_parser.add_argument("--script-content", default=None, help="Inline script content")
-        # harbor args
-        run_parser.add_argument("--config", default=None, help="Harbor YAML config path")
+        # YAML config (mode A) — flag name is --job_config (distinct from the
+        # top-level --config that points at the CLI INI config). Also accept
+        # --job-config as the hyphen-form alias.
+        run_parser.add_argument(
+            "--job_config",
+            "--job-config",
+            dest="job_config",
+            default=None,
+            metavar="YAML",
+            help="Job YAML config path (any job type; auto-detected).",
+        )
         # shared args
         run_parser.add_argument("--image", default=None, help="Sandbox image")
         run_parser.add_argument("--memory", default=None, help="Memory (e.g. 8g)")
         run_parser.add_argument("--cpus", default=None, type=float, help="CPU count")
-        run_parser.add_argument("--timeout", type=int, default=3600, help="Timeout in seconds")
+        run_parser.add_argument(
+            "--timeout",
+            type=int,
+            default=None,
+            help="Timeout in seconds (overrides YAML when given).",
+        )
         run_parser.add_argument("--local-path", default=None, help="Local dir to upload")
         run_parser.add_argument("--target-path", default="/root/job", help="Target dir in sandbox")
         run_parser.add_argument("--base-url", default=None, help="Admin service base URL")
@@ -128,3 +270,6 @@ async def add_parser_to(subparsers: argparse._SubParsersAction):
             default=None,
             help="XRL authorization token",
         )
+
+        # Stash on the class so _job_run can call parser.error() with the right parser.
+        JobCommand._run_parser = run_parser
diff --git a/rock/cli/main.py b/rock/cli/main.py
index 61ceecc9f3..ebb6225c1b 100644
--- a/rock/cli/main.py
+++ b/rock/cli/main.py
@@ -27,13 +27,18 @@ def load_config_from_file(args):
     manager = ConfigManager(config_path)
     cli_config = manager.get_config()
 
-    # If command line arguments are not set, use configuration file
-    if not args.base_url:
-        args.base_url = cli_config.base_url
-    if not args.auth_token and (authorization := cli_config.extra_headers.get("xrl-authorization")):
-        args.auth_token = authorization
-    if not args.cluster and (cluster := cli_config.extra_headers.get("cluster")):
-        args.cluster = cluster
+    # For the `job` command, the job YAML (--job_config) is the source of truth
+    # for base_url / cluster / auth_token. Backfilling from the CLI INI here would
+    # clobber YAML-specified values inside JobCommand._apply_overrides, which
+    # cannot distinguish user-supplied --base-url from an INI backfill.
+    # (Users who want INI defaults for a job can pass --base-url explicitly.)
+    if args.command != "job":
+        if not args.base_url:
+            args.base_url = cli_config.base_url
+        if not args.auth_token and (authorization := cli_config.extra_headers.get("xrl-authorization")):
+            args.auth_token = authorization
+        if not args.cluster and (cluster := cli_config.extra_headers.get("cluster")):
+            args.cluster = cluster
 
     # Process extra_headers, first get from configuration file
     extra_headers = cli_config.extra_headers.copy()
diff --git a/tests/unit/cli/__init__.py b/tests/unit/cli/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/unit/cli/command/__init__.py b/tests/unit/cli/command/__init__.py
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/tests/unit/cli/command/test_job.py b/tests/unit/cli/command/test_job.py
new file mode 100644
index 0000000000..557fadfc16
--- /dev/null
+++ b/tests/unit/cli/command/test_job.py
@@ -0,0 +1,674 @@
+"""Unit tests for rock.cli.command.job.JobCommand.
+
+All tests in this file are fast: no Docker, Ray, or network. We drive the
+sub-parser end-to-end with argparse so mutual-exclusion and error messages
+match what users see at the terminal.
+"""
+
+from __future__ import annotations
+
+import argparse
+import asyncio
+
+import pytest
+
+from rock.cli.command.job import JobCommand
+
+
+def _build_parser() -> argparse.ArgumentParser:
+    """Build a top-level parser with `job` subcommand wired in, same as the CLI."""
+    top = argparse.ArgumentParser(prog="rock")
+    subparsers = top.add_subparsers(dest="command")
+    asyncio.run(JobCommand.add_parser_to(subparsers))
+    return top
+
+
+def test_parser_builds():
+    """Smoke: the parser builds without error and exposes --job_config / --script."""
+    parser = _build_parser()
+    ns = parser.parse_args(["job", "run", "--job_config", "foo.yaml"])
+    assert ns.command == "job"
+    assert ns.job_command == "run"
+    assert ns.job_config == "foo.yaml"
+    assert ns.script is None
+    assert ns.script_content is None
+
+
+def test_top_level_config_does_not_collide_with_job_config():
+    """Regression: `rock --config cli.ini job run --job_config foo.yaml` keeps
+    both values separate — the top-level --config (INI loader) and the sub-parser's
+    --job_config (YAML) must not overwrite each other.
+    """
+    top = argparse.ArgumentParser(prog="rock")
+    top.add_argument("--config", help="top-level CLI config (INI)")
+    subparsers = top.add_subparsers(dest="command")
+    asyncio.run(JobCommand.add_parser_to(subparsers))
+
+    ns = top.parse_args(["--config", "cli.ini", "job", "run", "--job_config", "bash.yaml"])
+    assert ns.config == "cli.ini"
+    assert ns.job_config == "bash.yaml"
+
+
+def test_job_config_hyphen_alias():
+    """Both --job_config and --job-config should work (argparse accepts the alias)."""
+    parser = _build_parser()
+    ns = parser.parse_args(["job", "run", "--job-config", "foo.yaml"])
+    assert ns.job_config == "foo.yaml"
+
+
+class TestFailHelper:
+    def test_fail_exits_with_code_2_and_usage(self, capsys):
+        """_fail() must print usage + msg and exit code 2 (argparse convention)."""
+        from rock.cli.command.job import _fail
+
+        parser = argparse.ArgumentParser(prog="rock job run")
+        parser.add_argument("--config")
+
+        with pytest.raises(SystemExit) as excinfo:
+            _fail(parser, "boom")
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "usage:" in err
+        assert "boom" in err
+        assert "rock job run --help" in err  # always appended
+
+    def test_fail_includes_hint_when_given(self, capsys):
+        from rock.cli.command.job import _fail
+
+        parser = argparse.ArgumentParser(prog="rock job run")
+
+        with pytest.raises(SystemExit):
+            _fail(parser, "boom", hint="try this: X")
+
+        err = capsys.readouterr().err
+        assert "boom" in err
+        assert "try this: X" in err
+
+    def test_fail_no_hint_still_appends_help_pointer(self, capsys):
+        from rock.cli.command.job import _fail
+
+        parser = argparse.ArgumentParser(prog="rock job run")
+
+        with pytest.raises(SystemExit):
+            _fail(parser, "boom")
+
+        err = capsys.readouterr().err
+        assert "rock job run --help" in err
+
+
+class TestRunParserStash:
+    def test_run_parser_stashed_on_class_after_add_parser_to(self):
+        """After add_parser_to runs, JobCommand._run_parser must point to the 'run' sub-parser."""
+        # Reset to isolate from other tests
+        JobCommand._run_parser = None
+
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+
+        assert JobCommand._run_parser is not None
+        assert isinstance(JobCommand._run_parser, argparse.ArgumentParser)
+        # Sanity: it is the parser that knows about --config (stored as job_config)
+        actions = {a.dest for a in JobCommand._run_parser._actions}
+        assert "job_config" in actions
+        assert "script" in actions
+
+
+class TestJobRunValidation:
+    @pytest.fixture(autouse=True)
+    def _parser(self):
+        """Rebuild the parser for each test so _run_parser is populated and fresh."""
+        JobCommand._run_parser = None
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+        self.top = top
+        yield
+
+    def _run(self, argv):
+        """Parse argv and invoke JobCommand.arun; SystemExit bubbles up."""
+        ns = self.top.parse_args(argv)
+        cmd = JobCommand()
+        asyncio.run(cmd.arun(ns))
+
+    def test_missing_both_config_and_script_errors(self, capsys):
+        """With neither --config nor --script*, must exit 2 with helpful message."""
+        with pytest.raises(SystemExit) as excinfo:
+            self._run(["job", "run"])
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "Missing job definition" in err
+        assert "--job_config job.yaml" in err  # example in hint
+        assert "--script" in err
+        assert "rock job run --help" in err
+
+    def test_config_and_script_mutually_exclusive(self, capsys, tmp_path):
+        """--job_config together with --script must error with mutex hint."""
+        yaml_path = tmp_path / "job.yaml"
+        yaml_path.write_text("script_path: ./run.sh\n")
+
+        with pytest.raises(SystemExit) as excinfo:
+            self._run(["job", "run", "--job_config", str(yaml_path), "--script", "run.sh"])
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "mutually exclusive" in err
+        assert "YAML mode" in err
+        assert "flags mode" in err
+
+    def test_config_and_script_content_mutually_exclusive(self, capsys, tmp_path):
+        yaml_path = tmp_path / "job.yaml"
+        yaml_path.write_text("script_path: ./run.sh\n")
+
+        with pytest.raises(SystemExit) as excinfo:
+            self._run(["job", "run", "--job_config", str(yaml_path), "--script-content", "echo hi"])
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "mutually exclusive" in err
+
+    def test_script_and_script_content_mutually_exclusive(self, capsys):
+        with pytest.raises(SystemExit) as excinfo:
+            self._run(["job", "run", "--script", "run.sh", "--script-content", "echo hi"])
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "--script and --script-content are mutually exclusive" in err
+
+    def test_type_harbor_requires_config(self, capsys):
+        with pytest.raises(SystemExit) as excinfo:
+            self._run(["job", "run", "--type", "harbor", "--script", "run.sh"])
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "--type harbor requires --job_config" in err
+        assert "cannot be expressed purely via CLI flags" in err
+
+    def test_type_default_is_none(self):
+        """--type should default to None; 'bash' is implied only when mode is flags."""
+        ns = self.top.parse_args(["job", "run", "--script", "run.sh"])
+        assert ns.type is None
+
+
+class TestConfigFromFlags:
+    def _args(self, **overrides):
+        """Build an argparse.Namespace matching the run sub-parser defaults, then overlay overrides."""
+        JobCommand._run_parser = None
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+        argv = ["job", "run", "--script", "dummy.sh"]
+        ns = top.parse_args(argv)
+        for k, v in overrides.items():
+            setattr(ns, k, v)
+        return ns
+
+    def test_inline_script_content_with_env_and_uploads(self):
+        ns = self._args(
+            script=None,
+            script_content="echo hi",
+            image="python:3.11",
+            memory="8g",
+            cpus=4.0,
+            env=["FOO=bar", "BAZ=qux"],
+            local_path="/tmp/workspace",
+            target_path="/root/job",
+            timeout=1800,
+        )
+        config = JobCommand()._config_from_flags(ns)
+
+        from rock.sdk.job.config import BashJobConfig
+
+        assert isinstance(config, BashJobConfig)
+        assert config.script == "echo hi"
+        assert config.script_path is None
+        assert config.timeout == 1800
+        env = config.environment
+        assert env.image == "python:3.11"
+        assert env.memory == "8g"
+        assert env.cpus == 4.0
+        assert env.env == {"FOO": "bar", "BAZ": "qux"}
+        assert env.uploads == [("/tmp/workspace", "/root/job")]
+        assert env.auto_stop is True
+
+    def test_script_path_mode_no_env_no_uploads(self):
+        ns = self._args(script="run.sh", script_content=None, env=None, local_path=None)
+        config = JobCommand()._config_from_flags(ns)
+
+        assert config.script_path == "run.sh"
+        assert config.script is None
+        assert config.environment.env == {}
+        assert config.environment.uploads == []
+
+
+class TestConfigFromYaml:
+    def _setup(self):
+        JobCommand._run_parser = None
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+        return top
+
+    def _make_args(self, top, argv):
+        return top.parse_args(argv)
+
+    def test_loads_bash_yaml_autodetected(self, tmp_path):
+        top = self._setup()
+        yaml_path = tmp_path / "bash.yaml"
+        yaml_path.write_text("script_path: ./run.sh\ntimeout: 1800\nenvironment:\n  image: python:3.11\n")
+        ns = self._make_args(top, ["job", "run", "--job_config", str(yaml_path)])
+        config = JobCommand()._config_from_yaml(JobCommand._run_parser, ns)
+
+        from rock.sdk.job.config import BashJobConfig
+
+        assert isinstance(config, BashJobConfig)
+        assert config.script_path == "./run.sh"
+        assert config.timeout == 1800
+        assert config.environment.image == "python:3.11"
+
+    def test_missing_file_errors_via_parser(self, tmp_path, capsys):
+        top = self._setup()
+        missing = tmp_path / "nope.yaml"
+        ns = self._make_args(top, ["job", "run", "--job_config", str(missing)])
+
+        with pytest.raises(SystemExit) as excinfo:
+            JobCommand()._config_from_yaml(JobCommand._run_parser, ns)
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "--job_config path does not exist" in err
+
+    def test_invalid_yaml_surfaces_from_yaml_error(self, tmp_path, capsys):
+        top = self._setup()
+        yaml_path = tmp_path / "weird.yaml"
+        yaml_path.write_text("totally_unknown_field: 1\n")
+        ns = self._make_args(top, ["job", "run", "--job_config", str(yaml_path)])
+
+        with pytest.raises(SystemExit) as excinfo:
+            JobCommand()._config_from_yaml(JobCommand._run_parser, ns)
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "Failed to load --job_config" in err
+        assert "YAML does not match any known job type" in err
+
+    def test_type_bash_matches_yaml(self, tmp_path):
+        top = self._setup()
+        yaml_path = tmp_path / "bash.yaml"
+        yaml_path.write_text("script_path: ./run.sh\n")
+        ns = self._make_args(top, ["job", "run", "--type", "bash", "--job_config", str(yaml_path)])
+        config = JobCommand()._config_from_yaml(JobCommand._run_parser, ns)
+
+        from rock.sdk.job.config import BashJobConfig
+
+        assert isinstance(config, BashJobConfig)
+
+    def test_type_harbor_mismatch_against_bash_yaml(self, tmp_path, capsys):
+        top = self._setup()
+        yaml_path = tmp_path / "bash.yaml"
+        yaml_path.write_text("script_path: ./run.sh\n")
+        ns = self._make_args(top, ["job", "run", "--type", "harbor", "--job_config", str(yaml_path)])
+
+        with pytest.raises(SystemExit) as excinfo:
+            JobCommand()._config_from_yaml(JobCommand._run_parser, ns)
+
+        assert excinfo.value.code == 2
+        err = capsys.readouterr().err
+        assert "--type harbor does not match YAML (detected as bash)" in err
+
+
+class TestApplyOverrides:
+    def _setup(self):
+        JobCommand._run_parser = None
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+        return top
+
+    def test_override_image_memory_cpus_on_bash_config(self):
+        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
+        from rock.sdk.job.config import BashJobConfig
+
+        top = self._setup()
+        config = BashJobConfig(
+            script="echo hi",
+            environment=RockEnvironmentConfig(image="python:3.10", memory="2g", cpus=1),
+            timeout=7200,
+        )
+        ns = top.parse_args(
+            [
+                "job",
+                "run",
+                "--job_config",
+                "unused.yaml",
+                "--image",
+                "python:3.12",
+                "--memory",
+                "16g",
+                "--cpus",
+                "8",
+                "--timeout",
+                "900",
+            ]
+        )
+        JobCommand()._apply_overrides(config, ns)
+
+        assert config.environment.image == "python:3.12"
+        assert config.environment.memory == "16g"
+        assert config.environment.cpus == 8.0
+        assert config.timeout == 900
+        assert config.environment.auto_stop is True
+
+    def test_env_overrides_append_and_overwrite(self):
+        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
+        from rock.sdk.job.config import BashJobConfig
+
+        top = self._setup()
+        config = BashJobConfig(
+            script="echo hi",
+            environment=RockEnvironmentConfig(env={"FOO": "old", "KEEP": "1"}),
+        )
+        ns = top.parse_args(
+            [
+                "job",
+                "run",
+                "--job_config",
+                "unused.yaml",
+                "--env",
+                "FOO=new",
+                "--env",
+                "NEW=2",
+            ]
+        )
+        JobCommand()._apply_overrides(config, ns)
+
+        assert config.environment.env == {"FOO": "new", "KEEP": "1", "NEW": "2"}
+
+    def test_uploads_appended(self):
+        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
+        from rock.sdk.job.config import BashJobConfig
+
+        top = self._setup()
+        config = BashJobConfig(
+            script="echo hi",
+            environment=RockEnvironmentConfig(uploads=[("/a", "/b")]),
+        )
+        ns = top.parse_args(
+            [
+                "job",
+                "run",
+                "--job_config",
+                "unused.yaml",
+                "--local-path",
+                "/src",
+                "--target-path",
+                "/dst",
+            ]
+        )
+        JobCommand()._apply_overrides(config, ns)
+
+        assert config.environment.uploads == [("/a", "/b"), ("/src", "/dst")]
+
+    def test_parser_timeout_default_is_none(self):
+        """--timeout should default to None so YAML timeout is not unconditionally overridden."""
+        top = self._setup()
+        ns = top.parse_args(["job", "run", "--script", "run.sh"])
+        assert ns.timeout is None
+
+    def test_parser_description_mentions_two_modes(self):
+        self._setup()  # populates JobCommand._run_parser
+        desc = JobCommand._run_parser.description or ""
+        assert "YAML mode" in desc
+        assert "flags mode" in desc
+
+    def test_no_overrides_leaves_config_untouched_except_auto_stop(self):
+        from rock.sdk.bench.models.trial.config import RockEnvironmentConfig
+        from rock.sdk.job.config import BashJobConfig
+
+        top = self._setup()
+        config = BashJobConfig(
+            script="echo hi",
+            environment=RockEnvironmentConfig(image="python:3.10", env={"X": "1"}),
+            timeout=1234,
+        )
+        ns = top.parse_args(["job", "run", "--job_config", "unused.yaml"])
+        # Force timeout to None so this test is robust regardless of the parser's
+        # current default (Task 11 flips it to None permanently).
+        ns.timeout = None
+        JobCommand()._apply_overrides(config, ns)
+
+        assert config.environment.image == "python:3.10"
+        assert config.environment.env == {"X": "1"}
+        assert config.timeout == 1234
+        assert config.environment.auto_stop is True  # always set
+
+
+class TestJobRunEndToEnd:
+    """End-to-end-ish tests for _job_run, mocking only Job.run."""
+
+    def _setup(self):
+        JobCommand._run_parser = None
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+        return top
+
+    def test_flags_mode_builds_bash_config_and_runs_job(self, monkeypatch):
+        """A --script invocation should produce a BashJobConfig and call Job(config).run()."""
+        from unittest.mock import MagicMock
+
+        from rock.sdk.job.config import BashJobConfig
+
+        captured = {}
+
+        class FakeJob:
+            def __init__(self, cfg):
+                captured["cfg"] = cfg
+
+            async def run(self):
+                result = MagicMock()
+                result.status = "COMPLETED"
+                result.trial_results = []
+                return result
+
+        monkeypatch.setattr("rock.sdk.job.Job", FakeJob)
+
+        top = self._setup()
+        ns = top.parse_args(
+            [
+                "job",
+                "run",
+                "--script",
+                "run.sh",
+                "--image",
+                "python:3.11",
+                "--env",
+                "A=1",
+            ]
+        )
+        asyncio.run(JobCommand().arun(ns))
+
+        cfg = captured["cfg"]
+        assert isinstance(cfg, BashJobConfig)
+        assert cfg.script_path == "run.sh"
+        assert cfg.environment.image == "python:3.11"
+        assert cfg.environment.env == {"A": "1"}
+
+    def test_yaml_mode_with_override_image(self, monkeypatch, tmp_path):
+        from unittest.mock import MagicMock
+
+        from rock.sdk.job.config import BashJobConfig
+
+        yaml_path = tmp_path / "bash.yaml"
+        yaml_path.write_text("script_path: ./run.sh\nenvironment:\n  image: python:3.10\n")
+
+        captured = {}
+
+        class FakeJob:
+            def __init__(self, cfg):
+                captured["cfg"] = cfg
+
+            async def run(self):
+                r = MagicMock()
+                r.status = "COMPLETED"
+                r.trial_results = []
+                return r
+
+        monkeypatch.setattr("rock.sdk.job.Job", FakeJob)
+
+        top = self._setup()
+        ns = top.parse_args(
+            [
+                "job",
+                "run",
+                "--job_config",
+                str(yaml_path),
+                "--image",
+                "python:3.12",
+            ]
+        )
+        asyncio.run(JobCommand().arun(ns))
+
+        cfg = captured["cfg"]
+        assert isinstance(cfg, BashJobConfig)
+        assert cfg.script_path == "./run.sh"
+        assert cfg.environment.image == "python:3.12"  # override applied
+
+
+class TestYamlSourceOfTruth:
+    """Regression: YAML-loaded fields (base_url, cluster) must survive when the
+    user doesn't pass the corresponding CLI flag, even though main.py would
+    otherwise backfill args.base_url from the ~/.rock/config.ini default.
+    """
+
+    def _setup(self):
+        JobCommand._run_parser = None
+        top = argparse.ArgumentParser(prog="rock")
+        # Mirror the top-level flags declared in rock/cli/main.py so we can
+        # drive load_config_from_file() as main.py does.
+        top.add_argument("--config")
+        top.add_argument("--base-url")
+        top.add_argument("--auth-token")
+        top.add_argument("--cluster")
+        top.add_argument("--extra-header", action="append", dest="extra_headers_list")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+        return top
+
+    def test_load_config_from_file_skips_backfill_for_job(self, tmp_path, monkeypatch):
+        """load_config_from_file must NOT backfill base_url/cluster/auth_token
+        when the command is `job` — the YAML is the source of truth.
+        """
+        from rock.cli import main as main_mod
+
+        top = self._setup()
+        ns = top.parse_args(["job", "run", "--job_config", "unused.yaml"])
+        # Pretend the INI file would return http://ini.example.com as base_url
+        fake_cli_config = type(
+            "CLI", (), {"base_url": "http://ini.example.com", "extra_headers": {"cluster": "ini-cluster"}}
+        )()
+        monkeypatch.setattr(
+            main_mod, "ConfigManager", lambda _path: type("M", (), {"get_config": lambda self: fake_cli_config})()
+        )
+
+        main_mod.load_config_from_file(ns)
+
+        # Backfill should have been skipped for `job`
+        assert ns.base_url is None
+        assert ns.cluster is None
+        assert ns.auth_token is None
+
+    def test_load_config_from_file_backfills_for_non_job(self, tmp_path, monkeypatch):
+        """Non-job commands still get the backfill behavior."""
+        from rock.cli import main as main_mod
+
+        # Build a parser with a non-job subcommand
+        top = argparse.ArgumentParser(prog="rock")
+        top.add_argument("--config")
+        top.add_argument("--base-url")
+        top.add_argument("--auth-token")
+        top.add_argument("--cluster")
+        top.add_argument("--extra-header", action="append", dest="extra_headers_list")
+        subparsers = top.add_subparsers(dest="command")
+        subparsers.add_parser("sandbox")
+        ns = top.parse_args(["sandbox"])
+
+        fake_cli_config = type(
+            "CLI", (), {"base_url": "http://ini.example.com", "extra_headers": {"cluster": "ini-cluster"}}
+        )()
+        monkeypatch.setattr(
+            main_mod, "ConfigManager", lambda _path: type("M", (), {"get_config": lambda self: fake_cli_config})()
+        )
+
+        main_mod.load_config_from_file(ns)
+
+        assert ns.base_url == "http://ini.example.com"
+        assert ns.cluster == "ini-cluster"
+
+    def test_yaml_base_url_overridden_when_user_passes_flag(self, tmp_path, monkeypatch):
+        """When user explicitly passes --base-url, it should override YAML."""
+        from unittest.mock import MagicMock
+
+        yaml_path = tmp_path / "bash.yaml"
+        yaml_path.write_text("script_path: ./run.sh\nenvironment:\n  base_url: http://xrl.alibaba-inc.com\n")
+
+        captured = {}
+
+        class FakeJob:
+            def __init__(self, cfg):
+                captured["cfg"] = cfg
+
+            async def run(self):
+                r = MagicMock()
+                r.status = "COMPLETED"
+                r.trial_results = []
+                return r
+
+        monkeypatch.setattr("rock.sdk.job.Job", FakeJob)
+
+        top = self._setup()
+        ns = top.parse_args(
+            [
+                "job",
+                "run",
+                "--job_config",
+                str(yaml_path),
+                "--base-url",
+                "http://explicit.example.com",
+            ]
+        )
+        asyncio.run(JobCommand().arun(ns))
+
+        cfg = captured["cfg"]
+        assert cfg.environment.base_url == "http://explicit.example.com"
+
+
+class TestArun:
+    """Tests for JobCommand.arun dispatch (not _job_run)."""
+
+    def test_unknown_job_command_logs_error(self):
+        from unittest.mock import patch
+
+        ns = argparse.Namespace(job_command="weird")
+        with patch("rock.cli.command.job.logger") as mock_logger:
+            asyncio.run(JobCommand().arun(ns))
+            mock_logger.error.assert_called()
+
+
+class TestHelpOutput:
+    def test_help_output_mentions_both_modes(self, capsys):
+        top = argparse.ArgumentParser(prog="rock")
+        subparsers = top.add_subparsers(dest="command")
+        asyncio.run(JobCommand.add_parser_to(subparsers))
+
+        with pytest.raises(SystemExit) as excinfo:
+            top.parse_args(["job", "run", "--help"])
+
+        assert excinfo.value.code == 0
+        out = capsys.readouterr().out
+        assert "YAML mode" in out
+        assert "flags mode" in out
+        assert "--job_config" in out
+        assert "--script" in out
diff --git a/tests/unit/sdk/job/test_cli_job.py b/tests/unit/sdk/job/test_cli_job.py
deleted file mode 100644
index b46a405f7e..0000000000
--- a/tests/unit/sdk/job/test_cli_job.py
+++ /dev/null
@@ -1,395 +0,0 @@
-"""Tests for rock.cli.command.job — JobCommand with --type bash/harbor routing."""
-
-from __future__ import annotations
-
-import argparse
-import tempfile
-from pathlib import Path
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from rock.cli.command.job import JobCommand
-
-
-async def _build_parser() -> argparse.ArgumentParser:
-    p = argparse.ArgumentParser()
-    sub = p.add_subparsers(dest="main_command")
-    await JobCommand.add_parser_to(sub)
-    return p
-
-
-def _make_mock_job_result():
-    mock_result = MagicMock()
-    mock_result.trial_results = []
-    mock_result.status = "completed"
-    return mock_result
-
-
-# ----------------------------------------------------------------------------
-# Parser tests
-# ----------------------------------------------------------------------------
-
-
-async def test_parser_default_type_is_bash():
-    p = await _build_parser()
-    args = p.parse_args(["job", "run", "--script-content", "echo hi"])
-    assert args.type == "bash"
-    assert args.script_content == "echo hi"
-
-
-async def test_parser_accepts_type_bash_explicit():
-    p = await _build_parser()
-    args = p.parse_args(["job", "run", "--type", "bash", "--script-content", "echo hi"])
-    assert args.type == "bash"
-
-
-async def test_parser_accepts_type_harbor():
-    p = await _build_parser()
-    args = p.parse_args(["job", "run", "--type", "harbor", "--config", "/tmp/c.yaml"])
-    assert args.type == "harbor"
-    assert args.config == "/tmp/c.yaml"
-
-
-async def test_parser_supports_all_bash_args():
-    p = await _build_parser()
-    args = p.parse_args(
-        [
-            "job",
-            "run",
-            "--script",
-            "/tmp/s.sh",
-            "--image",
-            "python:3.11",
-            "--memory",
-            "4g",
-            "--cpus",
-            "2",
-            "--timeout",
-            "600",
-            "--local-path",
-            "/tmp/local",
-            "--target-path",
-            "/root/other",
-        ]
-    )
-    assert args.script == "/tmp/s.sh"
-    assert args.image == "python:3.11"
-    assert args.memory == "4g"
-    assert args.cpus == 2.0
-    assert args.timeout == 600
-    assert args.local_path == "/tmp/local"
-    assert args.target_path == "/root/other"
-
-
-async def test_parser_invalid_type_rejected():
-    p = await _build_parser()
-    with pytest.raises(SystemExit):
-        p.parse_args(["job", "run", "--type", "invalid"])
-
-
-# ----------------------------------------------------------------------------
-# arun / _job_run behavior tests
-# ----------------------------------------------------------------------------
-
-
-def _bash_args(**overrides):
-    defaults = dict(
-        job_command="run",
-        type="bash",
-        script=None,
-        script_content=None,
-        image=None,
-        memory=None,
-        cpus=None,
-        local_path=None,
-        target_path="/root/job",
-        timeout=3600,
-        config=None,
-        base_url=None,
-        cluster=None,
-        extra_headers=None,
-        env=None,
-    )
-    defaults.update(overrides)
-    return argparse.Namespace(**defaults)
-
-
-async def test_bash_creates_bash_job_config_and_runs():
-    from rock.sdk.job.config import BashJobConfig
-
-    args = _bash_args(
-        script_content="echo hello",
-        image="python:3.11",
-        memory="4g",
-        cpus=2.0,
-        timeout=600,
-    )
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        MockJob.assert_called_once()
-        config_arg = MockJob.call_args[0][0]
-        assert isinstance(config_arg, BashJobConfig)
-        assert config_arg.script == "echo hello"
-        assert config_arg.script_path is None
-        assert config_arg.environment.image == "python:3.11"
-        assert config_arg.environment.memory == "4g"
-        assert config_arg.environment.cpus == 2.0
-        assert config_arg.timeout == 600
-        assert config_arg.environment.auto_stop is True
-        mock_instance.run.assert_awaited_once()
-
-
-async def test_bash_with_script_path():
-    from rock.sdk.job.config import BashJobConfig
-
-    args = _bash_args(script="/tmp/my_script.sh")
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        config_arg = MockJob.call_args[0][0]
-        assert isinstance(config_arg, BashJobConfig)
-        assert config_arg.script_path == "/tmp/my_script.sh"
-        assert config_arg.script is None
-
-
-async def test_bash_with_file_upload():
-    args = _bash_args(
-        script_content="echo hi",
-        local_path="/tmp/src",
-        target_path="/root/target",
-    )
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        config_arg = MockJob.call_args[0][0]
-        assert config_arg.environment.uploads == [("/tmp/src", "/root/target")]
-
-
-async def test_bash_requires_script_or_script_content():
-    args = _bash_args()  # neither set
-
-    with patch("rock.sdk.job.Job") as MockJob, patch("rock.cli.command.job.logger") as mock_logger:
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        MockJob.assert_not_called()
-        mock_logger.error.assert_called()
-
-
-async def test_bash_rejects_both_script_and_script_content():
-    args = _bash_args(script="/tmp/s.sh", script_content="echo hi")
-
-    with patch("rock.sdk.job.Job") as MockJob, patch("rock.cli.command.job.logger") as mock_logger:
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        MockJob.assert_not_called()
-        mock_logger.error.assert_called()
-
-
-async def test_harbor_requires_config():
-    args = _bash_args(type="harbor", config=None)
-
-    with patch("rock.sdk.job.Job") as MockJob, patch("rock.cli.command.job.logger") as mock_logger:
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        MockJob.assert_not_called()
-        mock_logger.error.assert_called()
-
-
-async def test_harbor_loads_from_yaml():
-    from rock.sdk.bench.models.job.config import HarborJobConfig
-
-    yaml_content = """
-experiment_id: exp-123
-job_name: my-harbor-job
-"""
-    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
-        f.write(yaml_content)
-        yaml_path = f.name
-
-    try:
-        args = _bash_args(type="harbor", config=yaml_path)
-
-        with patch("rock.sdk.job.Job") as MockJob:
-            mock_instance = MagicMock()
-            mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-            MockJob.return_value = mock_instance
-
-            cmd = JobCommand()
-            await cmd.arun(args)
-
-            MockJob.assert_called_once()
-            config_arg = MockJob.call_args[0][0]
-            assert isinstance(config_arg, HarborJobConfig)
-            assert config_arg.experiment_id == "exp-123"
-            assert config_arg.environment.auto_stop is True
-    finally:
-        Path(yaml_path).unlink(missing_ok=True)
-
-
-async def test_harbor_image_override():
-    yaml_content = """
-experiment_id: exp-abc
-"""
-    with tempfile.NamedTemporaryFile(mode="w", suffix=".yaml", delete=False) as f:
-        f.write(yaml_content)
-        yaml_path = f.name
-
-    try:
-        args = _bash_args(type="harbor", config=yaml_path, image="custom:tag")
-
-        with patch("rock.sdk.job.Job") as MockJob:
-            mock_instance = MagicMock()
-            mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-            MockJob.return_value = mock_instance
-
-            cmd = JobCommand()
-            await cmd.arun(args)
-
-            config_arg = MockJob.call_args[0][0]
-            assert config_arg.environment.image == "custom:tag"
-    finally:
-        Path(yaml_path).unlink(missing_ok=True)
-
-
-async def test_parser_accepts_base_url_and_cluster():
-    p = await _build_parser()
-    args = p.parse_args(
-        [
-            "job",
-            "run",
-            "--script-content",
-            "echo hi",
-            "--base-url",
-            "http://example.com",
-            "--cluster",
-            "test-cluster-a",
-        ]
-    )
-    assert args.base_url == "http://example.com"
-    assert args.cluster == "test-cluster-a"
-
-
-async def test_bash_passes_base_url_and_cluster_to_environment():
-    args = _bash_args(
-        script_content="echo hello",
-        base_url="http://example.com",
-        cluster="test-cluster-a",
-    )
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        config_arg = MockJob.call_args[0][0]
-        assert config_arg.environment.base_url == "http://example.com"
-        assert config_arg.environment.cluster == "test-cluster-a"
-
-
-async def test_parser_accepts_env_args():
-    p = await _build_parser()
-    args = p.parse_args(
-        [
-            "job",
-            "run",
-            "--script-content",
-            "echo hi",
-            "--env",
-            "FOO=bar",
-            "--env",
-            "BAZ=qux=123",
-        ]
-    )
-    assert args.env == ["FOO=bar", "BAZ=qux=123"]
-
-
-async def test_parser_env_default_is_none():
-    p = await _build_parser()
-    args = p.parse_args(["job", "run", "--script-content", "echo hi"])
-    assert args.env is None
-
-
-async def test_bash_passes_env_to_config():
-    args = _bash_args(script_content="echo hello")
-    args.env = ["SERP_DEV_KEY=abc123", "RUN_CMD=claw-eval batch --parallel 4"]
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        config_arg = MockJob.call_args[0][0]
-        assert config_arg.environment.env == {
-            "SERP_DEV_KEY": "abc123",
-            "RUN_CMD": "claw-eval batch --parallel 4",
-        }
-
-
-async def test_bash_env_with_equals_in_value():
-    """Values containing '=' should not be split."""
-    args = _bash_args(script_content="echo hello")
-    args.env = ["API_KEY=sk-abc=def=="]
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        config_arg = MockJob.call_args[0][0]
-        assert config_arg.environment.env == {"API_KEY": "sk-abc=def=="}
-
-
-async def test_bash_no_env_defaults_to_empty():
-    args = _bash_args(script_content="echo hello")
-
-    with patch("rock.sdk.job.Job") as MockJob:
-        mock_instance = MagicMock()
-        mock_instance.run = AsyncMock(return_value=_make_mock_job_result())
-        MockJob.return_value = mock_instance
-
-        cmd = JobCommand()
-        await cmd.arun(args)
-
-        config_arg = MockJob.call_args[0][0]
-        assert config_arg.environment.env == {}
-
-
-async def test_unknown_job_command_logs_error():
-    args = argparse.Namespace(job_command="weird")
-
-    with patch("rock.cli.command.job.logger") as mock_logger:
-        cmd = JobCommand()
-        await cmd.arun(args)
-        mock_logger.error.assert_called()