Interface Agents and the People Current Benchmarks Ignore
Interface agents promise to operate software on behalf of users. Their benchmarks should include the people for whom delegated software control is an access need rather than a convenience.
Many current agent evaluations measure whether a system can complete a web or desktop task from an instruction. The task is treated as a neutral sequence: find a form, click a field, enter information, submit, and report completion. This setup misses how software tasks change when the user cannot easily see the screen, control a pointer, sustain attention, or recover quickly from a mistaken action.
For disabled users, delegation is not simply automation. It is a redistribution of agency between user and system. The agent may need to read context aloud, ask before pressing a destructive control, preserve an audit trail, or pause when uncertainty is high. Speed is not always the correct objective. Sometimes the correct objective is controlled progress with clear opportunities to intervene.
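One way to make these behaviors concrete is a small decision policy that an agent consults before each step. The sketch below is illustrative only: the names Action, Decision, and DelegationPolicy, the self-reported confidence value, and the 0.8 threshold are all assumptions made for this example, not part of any existing agent framework.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"   # act without interrupting the user
    CONFIRM = "confirm"   # describe the step and ask before acting
    PAUSE = "pause"       # stop and report what is unclear


@dataclass
class Action:
    description: str      # summary that can be read aloud or shown as text
    destructive: bool     # e.g. delete, send, or submit a payment
    confidence: float     # agent's own estimate in [0, 1] (assumed available)


@dataclass
class DelegationPolicy:
    min_confidence: float = 0.8                     # hypothetical threshold
    audit_log: list = field(default_factory=list)   # (description, decision) pairs

    def decide(self, action: Action) -> Decision:
        # Uncertainty is checked first: a low-confidence step pauses
        # even when it is not destructive.
        if action.confidence < self.min_confidence:
            decision = Decision.PAUSE
        elif action.destructive:
            decision = Decision.CONFIRM
        else:
            decision = Decision.PROCEED
        # Every decision is recorded so the user can review it later.
        self.audit_log.append((action.description, decision.value))
        return decision
```

Under these assumptions, a step such as Action("Press Delete account", destructive=True, confidence=0.95) yields Decision.CONFIRM, while any step below the threshold yields Decision.PAUSE. The ordering of the checks reflects the point above: the policy favors controlled progress over speed.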
Benchmarks that ignore these conditions reward brittle behavior. An agent may score well by completing a transaction quickly while providing no useful account of what it did. It may rely on visual cues without alternative representations. It may fail when the interface changes slightly, then leave the user with no explanation and no safe recovery path.
Accessible agent evaluation should include assistive workflows, multimodal interaction, confirmation policies, and state summaries that can be understood without looking at the screen. It should measure whether the user remains in control, not only whether the final task state matches the instruction.
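A benchmark built along these lines might report several measures per episode instead of a single success flag. The following is a minimal sketch under stated assumptions: the EpisodeRecord fields and the metric names are hypothetical, chosen to mirror the criteria in the paragraph above, and do not describe an existing benchmark.

```python
from dataclasses import dataclass


@dataclass
class EpisodeRecord:
    task_completed: bool                  # final state matches the instruction
    destructive_actions: int              # irreversible steps the agent took
    confirmed_destructive_actions: int    # of those, steps the user approved first
    steps: int                            # total agent steps
    steps_with_nonvisual_summary: int     # steps reported in speech or text
    failures: int                         # steps that went wrong
    explained_failures: int               # failures reported with a recovery path


def ratio(numerator: int, denominator: int) -> float:
    # With nothing to measure, the criterion counts as satisfied.
    return 1.0 if denominator == 0 else numerator / denominator


def score(ep: EpisodeRecord) -> dict:
    # Measures are kept separate so that fast completion cannot
    # mask a loss of user control.
    return {
        "task_success": 1.0 if ep.task_completed else 0.0,
        "confirmation_rate": ratio(ep.confirmed_destructive_actions,
                                   ep.destructive_actions),
        "summary_coverage": ratio(ep.steps_with_nonvisual_summary, ep.steps),
        "recovery_rate": ratio(ep.explained_failures, ep.failures),
    }
```

Keeping the four values apart, instead of averaging them into one number, is a deliberate choice in this sketch: an agent that completes the task but confirms none of its destructive actions would still show a confirmation rate of zero.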
The research challenge is also institutional. Developers need datasets, test environments, and reporting norms that treat disabled users as central participants. Community organizations need ways to contest unsafe design choices before systems become infrastructure.
Interface agents may become an important layer of accessible computing. Whether they do so responsibly depends on whether benchmark design includes the people whose needs reveal the real difficulty of delegated action.
Contact the research office
For collaboration proposals or questions about accessible interface agents, contact the institute.
research@compaccess.edu.kg