tell the agent what to do differently. it picks this up at the start of its next step (expect 5–30s of latency). this does NOT abort the run.
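a minimal sketch of what "picked up at the start of its next step" could look like, assuming steering messages sit in an in-process queue. `drain_steering`, `run_agent`, and the trace strings are hypothetical names for illustration, not the agent's actual code.

```python
import queue
from typing import List

def drain_steering(q: queue.Queue) -> List[str]:
    """Collect every pending steering message without blocking."""
    messages = []
    while True:
        try:
            messages.append(q.get_nowait())
        except queue.Empty:
            return messages

def run_agent(steps: int, q: queue.Queue) -> List[str]:
    """Run a fixed number of steps; steering is only read at step boundaries."""
    trace = []
    for i in range(steps):
        # Assumption: steering is applied before the step's work, never mid-step,
        # and never stops the loop (the run is not aborted).
        for msg in drain_steering(q):
            trace.append(f"step {i}: steering applied: {msg}")
        trace.append(f"step {i}: work done")
    return trace
```

under this assumption a message sent mid-step waits until the current step finishes, which would account for the latency.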
🙋 the agent needs you
the agent called ask_human. it's blocked, polling for your answer. it will resume the moment you send.
tip: cmd/ctrl+enter to send
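a rough sketch of the blocked-and-polling behaviour, assuming the agent side repeatedly asks some store for your answer. `wait_for_answer` and the `fetch` callable are hypothetical; how ask_human really talks to this page is not shown here.

```python
import time
from typing import Callable, Optional

def wait_for_answer(
    fetch: Callable[[], Optional[str]],
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Block until fetch() returns a non-None answer, sleeping between polls."""
    while True:
        answer = fetch()
        if answer is not None:
            return answer  # resume as soon as an answer shows up
        sleep(interval_s)
```

`sleep` is injectable only so the loop can be exercised without real waiting.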
log
latest screenshot
run history
started
task
status
testing center — service health
probes every service Manny2 depends on. green = ok, red = broken. click a row to see detail.
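a small sketch of the green/red logic, assuming each probe is a callable that raises on failure. `check_services` and the result shape are hypothetical, not the testing center's real implementation.

```python
from typing import Callable, Dict

def check_services(probes: Dict[str, Callable[[], None]]) -> Dict[str, dict]:
    """Run every probe; a probe that raises marks its service red."""
    results = {}
    for name, probe in probes.items():
        try:
            probe()
            results[name] = {"status": "green", "detail": "ok"}
        except Exception as exc:
            # Keep the error text so a click on the row has something to show.
            results[name] = {"status": "red", "detail": f"{type(exc).__name__}: {exc}"}
    return results
```

one failing probe does not stop the others from running, so every row still gets a colour.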
live SHM frame: what Manny2 sees right now (click to refresh)
tools — call any service directly
pick an endpoint, fill in params if needed, click call. raw request URL + response body shown below.
🧠 parse current frame with omniparser (canonical)
runs Microsoft's canonical OmniParser pipeline (YOLO + EasyOCR + Florence-2) on the live SHM frame. ~0.8s when warm. returns text and icon elements with bounding boxes and labels.
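a sketch of how the returned elements might be used, assuming each one carries a kind, a label, and a pixel bbox as (x1, y1, x2, y2). the `Element` shape and `find_center` are assumptions for illustration; the real response format may differ.

```python
from typing import List, Optional, Tuple, TypedDict

class Element(TypedDict):
    kind: str                         # assumed: "text" or "icon"
    label: str                        # assumed: OCR text or icon caption
    bbox: Tuple[int, int, int, int]   # assumed: pixel coords (x1, y1, x2, y2)

def center(bbox: Tuple[int, int, int, int]) -> Tuple[int, int]:
    """Return the integer midpoint of a bounding box."""
    x1, y1, x2, y2 = bbox
    return ((x1 + x2) // 2, (y1 + y2) // 2)

def find_center(elements: List[Element], query: str) -> Optional[Tuple[int, int]]:
    """Return the center of the first element whose label contains query."""
    needle = query.lower()
    for el in elements:
        if needle in el["label"].lower():
            return center(el["bbox"])
    return None
```

the match is a case-insensitive substring test, so it only helps when the label text is already known.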
🎯 ask UI-TARS to locate something on screen
natural language → vision LLM finds the pixel. e.g. "the Inbox folder in the left sidebar", "the Send button"