Releases: ZJU-REAL/ClawGUI
Releases · ZJU-REAL/ClawGUI
clawgui-app-v0.3.1
Fixes
- Feishu GUI runner now drives step-by-step. The chat bubble updates with the current action on every step instead of staying frozen on "正在执行任务…". Each step has a 5-minute timeout so a wedged LLM call can't hang the loop forever.
- Clean error surfacing for Feishu replies. SDK errors are decoded to
code: messageinstead of the previousError@<hash>. Pre-flight failures (missing Vision creds, no device-control auth) now reply to Feishu with a human-readable reason instead of silently bailing. Image-reply failures land in Settings → 外部通道 → 调试日志 so users can see them without logcat. @_user_Nmention stripping. Was hardcoded to@_user_1; now regex-matches@_(user|chat)_Nso@bot 发朋友圈reaches PhoneAgent as just发朋友圈.- Live screenshot fallback when image-reply is enabled but trace recording is off — users still get a final-state image instead of silent nothing.
- IME enabled check via the Android Framework API instead of
ime list -s. Users who'd already enabled "ClawGUI Input" via system settings would see "未启用" in our panel because the shell command needed Shizuku/wadb auth to succeed. Settings → 输入法 also dropped the manual switcher button — agent handles the switching itself.
Chore
- Repo cleanup: retired the v1
clawgui-app/(Java client) and promotedclawgui-app-ng/into the naturalclawgui-app/path. Android applicationId stayscom.clawgui.ngso existing v0.3.0 installs upgrade in place.
clawgui-app-v0.3.0
Structured Plan + execution trace
- PhoneAgent now maintains a user-visible task plan through a 6-state machine —
PENDING / IN_PROGRESS / DONE / SKIPPED / FAILED / BLOCKED. - The model emits incremental
<plan>{"ops":[...]}</plan>blocks each step (init/update/insert_after/remove). - A
PlanCardrenders inside the assistant bubble; status transitions fade + scale, the IN_PROGRESS row pulses. - New
ActionTraceList: every step shown as a chip (index / name / args / status badge), with a spinner placeholder while waiting on the next LLM round-trip.
Floating Plan + Trace overlay
- While a task runs, a semi-transparent draggable card floats above the host app so the user can keep an eye on progress without flipping back to ClawGUI.
- 3 collapse stages: 20dp blue dot → 1-line chip → full panel. Tap cycles, long-press drags.
- Auto-hides on finish / cancel / error. × dismisses for the current run.
- Settings → Floating Panel: enable toggle, alpha slider (40–100%), permission prompt.
Mid-task Ask
- The agent now proactively asks the user when uncertain: missing parameters, multiple candidates, sensitive operations, no clear next step, fuzzy task scope.
- The input field is embedded directly in the floating panel — no need to switch back to ClawGUI to answer. IME pops automatically.
- The answer is injected as a system-role context turn; PhoneAgent resumes immediately.
Image attachments + multimodal
+button in the input bar opens a sheet: pick from gallery or take a photo. Up to 3 attachments.- Thumbnails render inside the user bubble; tap for a full-screen Lightbox (pinch-zoom + pan).
- Images are treated as task inputs (not just reference): auto-exported to
Pictures/ClawGUI/so any host app's picker can find them. - Brain falls back to the Vision provider when it can't see images.
ImageInsightruns a generic content-summary + extractable-text pre-pass.
Polish
- Follow-up suggestions: 3 one-tap chips under the latest assistant reply that drop a draft into the input box.
- Thinking separation: Brain
reasoning_contentand inline<think>tags are split off into a collapsible panel. PhoneAgent bubbles show per-step thinking when you peek a trace row. - Appearance: Light / Dark / System in settings now actually drives the theme (the UI was there but unwired).
- Return-to-ClawGUI: task finish brings the chat back to the foreground via NEW_TASK + REORDER_TO_FRONT + a PendingIntent fallback that works on Honor / MIUI.
Robustness
- AutoGLM parser hardening: tolerates stray
<plan>blocks, natural-language preambles, markdown fences, full-width punctuation (【】, ,), single/double quotes, whitespace variants (do (/do(action='), JSON-shape fallbacks, and nested quotes. - Step-1 action guard: if the model violates the "step 1 may only Launch / Home / Ask / finish" rule, the runtime intercepts and injects a system feedback turn — no more poking ClawGUI's own UI.
- Plan protocol fault-tolerance: any malformed op is dropped silently; the main task still moves forward on
<answer>. - Parse-failure retry: first parse error gets one retry with a system correction; only the second consecutive failure bails.
- Overlay lifecycle fix:
LifecycleRegistryis single-shot after ON_DESTROY, which was why the floating panel never came back on the second task — each show() now allocates a fresh owner.
Prompt overhaul
- System prompt compressed from 200 → ~80 lines, format contract pulled to the top so the model's attention lands on it.
- Step 1 is pure planning (no screenshot; the prompt no longer hints there's a screen), restricted to Launch / Home / Ask / finish.
- Explicit "pick the app first" framing with concrete mappings (WeChat / Moments / Xiaohongshu / Weibo / Douyin / Meituan / …).
- Tells the model to ignore ClawGUI's own floating panel — it's UI for the user, not part of the task.
- Ask reframed as "ask when uncertain, not only when something breaks", with 5 trigger scenarios + 4 counter-examples + a 3-per-task cap.
clawgui-app-v0.2.0
What's Changed
- feat(clawgui-eval): add Kimi K2.5 evaluation support by @ZichenWen1 in #1
- fix: handle reasoning_content in streaming for reasoning models by @ekkoitac in #8
- fix(agent): fix issues in webui.py by @jucve in #9
- feat: add clawgui-app on-device deployment module by @gta886 in #14
- docs cleanup, remove diagnostic export, refine Roadmap wording by @lhppppp in #15
- fix(clawgui-app): restore source files swallowed by root .gitignore by @lhppppp in #16
- Remove some configs, as it causes the system to freeze. by @HangFang6 in #17
New Contributors
- @ZichenWen1 made their first contribution in #1
- @ekkoitac made their first contribution in #8
- @jucve made their first contribution in #9
- @gta886 made their first contribution in #14
- @lhppppp made their first contribution in #15
- @HangFang6 made their first contribution in #17
Full Changelog: https://github.com/ZJU-REAL/ClawGUI/commits/clawgui-app-v0.2.0