
[Bug]: Async subprocess stdout output loss #308

Labels: bug, verification, output, performance

Description

@ouyinghui

What happened?

Problem

When using the prompt parameter with agent tasks, or when running long-running bash tasks, output is sometimes lost or incomplete.

Root causes identified

  1. Async stdout read buffering: stdout.read(4096) returns whatever is available, up to 4 KB per call, so output the child process has not yet flushed is not visible at read time.
  2. No immediate flush: file writes are not followed by flush() + fsync(), leaving data in Python and OS buffers where it can be lost.
  3. Limited read: read_task_output() returns only the tail of the output (max_bytes defaults to 12000), so early output is dropped for larger files.

Solution

Patch 1: _copy_output() in manager.py (lines 276-281)

Before:

while True:
    chunk = await process.stdout.read(4096)
    if not chunk:
        return
    async with self._output_locks[task_id]:
        with self._tasks[task_id].output_file.open("ab") as handle:
            handle.write(chunk)

After:

while True:
    chunk = await process.stdout.read(4096)
    if not chunk:
        return
    async with self._output_locks[task_id]:
        with self._tasks[task_id].output_file.open("ab") as handle:
            handle.write(chunk)
            handle.flush()              # flush Python's buffer immediately
            os.fsync(handle.fileno())   # force the write to disk
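
For anyone who wants to try the patched loop outside openharness, here is a minimal standalone sketch of the same pattern. The names copy_output and run_and_capture are hypothetical, the per-task lock from manager.py is omitted, and the child process is just a Python one-liner that prints numbered lines.

```python
import asyncio
import os
import sys
from pathlib import Path


async def copy_output(process: asyncio.subprocess.Process, output_file: Path) -> None:
    """Append the child's stdout to output_file, syncing after every chunk."""
    assert process.stdout is not None  # requires stdout=PIPE
    while True:
        chunk = await process.stdout.read(4096)
        if not chunk:  # EOF: the child closed its stdout
            return
        with output_file.open("ab") as handle:
            handle.write(chunk)
            handle.flush()             # drain Python's userspace buffer
            os.fsync(handle.fileno())  # ask the OS to commit the data to disk


async def run_and_capture(line_count: int, output_file: Path) -> int:
    """Run a child that prints line_count lines; return its exit code."""
    # Hypothetical workload, not the test script used in this issue.
    script = f"for i in range({line_count}): print(f'line {{i:04d}}')"
    process = await asyncio.create_subprocess_exec(
        sys.executable, "-c", script,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    await copy_output(process, output_file)
    return await process.wait()
```

Calling asyncio.run(run_and_capture(500, Path("task.log"))) and counting the lines in task.log gives a quick check. Note that calling fsync on every 4 KB chunk trades write throughput for durability, which may matter for very chatty tasks.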

Patch 2: read_task_output() in manager.py (line 230)

Before:

def read_task_output(self, task_id: str, *, max_bytes: int = 12000) -> str:

After:

def read_task_output(self, task_id: str, *, max_bytes: int = 12000, flush: bool = False) -> str:
    """Return the tail of a task's output file.

    Args:
        task_id: Task identifier
        max_bytes: Maximum bytes to return (default 12000)
        flush: If True, force flush file system buffers before reading
    """
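    task = self._require_task(task_id)
    if flush:
        # Best effort: fsync commits data already written to the file by any
        # handle; it cannot flush another handle's Python-level buffer.
        try:
            with task.output_file.open("ab") as handle:
                handle.flush()
                os.fsync(handle.fileno())
        except OSError:
            pass
    content = task.output_file.read_text(encoding="utf-8", errors="replace")
    # Note: the slice below counts decoded characters, not bytes.
    if len(content) > max_bytes:
        return content[-max_bytes:]
    return content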
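
As a possible follow-up to root cause 3, the tail could be read by seeking from the end of the file instead of loading the whole file and slicing it. This is only a sketch under that assumption: read_tail is a hypothetical helper, not part of openharness, and it treats max_bytes as a true byte count.

```python
import os
from pathlib import Path


def read_tail(path: Path, max_bytes: int = 12000) -> str:
    """Return at most the last max_bytes bytes of path, decoded as UTF-8."""
    with path.open("rb") as handle:
        size = handle.seek(0, os.SEEK_END)     # file size in bytes
        handle.seek(max(0, size - max_bytes))  # jump to the start of the tail
        data = handle.read()
    # A cut in the middle of a multi-byte character decodes to U+FFFD.
    return data.decode("utf-8", errors="replace")
```

Callers that need the early output can pass a larger max_bytes; a file smaller than max_bytes is returned in full.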

Verification Results

Test Suite (100% Pass Rate: 6/6)

Test                           Expected   Actual   Status
100 lines (basic)              100        100      ✅ PASS
500 lines (basic)              500        500      ✅ PASS
1000 lines (basic)             1000       1000     ✅ PASS
100 lines (10ms delay/line)    100        100      ✅ PASS
200 lines (5ms delay/line)     200        200      ✅ PASS
Hello World (5 lines)          5          5        ✅ PASS

Output Size Verification

Task              Lines   Size         Status
100_line test     100     900 bytes
500_line test     500     4500 bytes
1000_line test    1000    9000 bytes
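
The sizes above work out to 9 bytes per line in every case. A check like this can be scripted with a small helper such as the one below. summarize_output is a hypothetical name, and the exact line format used in the tests is not stated here, so any 9-byte line would give the same numbers.

```python
from pathlib import Path


def summarize_output(path: Path) -> tuple[int, int]:
    """Return (line_count, size_in_bytes) for a captured output file."""
    data = path.read_bytes()
    return len(data.splitlines()), len(data)
```

Comparing the returned pair against the expected line count and size catches both missing lines and truncated ones.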

Files Modified

  • /usr/local/lib/python3.12/site-packages/openharness/tasks/manager.py

Environment

Version: openharness 0.1.9
Python: 3.12.13
OS: Linux (Debian 13 trixie)
Container: Docker (overlay filesystem)
