Evaluating Whether an AI System Is Actually Usable
A system can achieve high task accuracy and still fail the people it is meant to support. Accessibility evaluation must therefore examine use, not only output.
Most AI evaluation asks whether the system produced the correct answer under a specified test condition. That question is necessary, but it is insufficient for accessibility. A blind user may need to know why a visual description matters. A user with limited motor control may need the system to avoid actions that require rapid correction. A user with cognitive fatigue may need summaries that preserve decision points rather than compressing everything into fluent prose.
Usability depends on the interaction around the model. Did the system communicate uncertainty in a way the user could act on? Could the user stop an action before it completed? Were errors recoverable? Did the system preserve state after interruption? Did it ask for confirmation when a delegated action had meaningful consequences?
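The five questions above can be recorded per observed session as a simple checklist. The following is a minimal sketch, not an established instrument: the class name, field names, and the `unmet_checks` helper are all hypothetical, chosen only to mirror the questions in the paragraph.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InteractionChecks:
    """One observed session scored against the five interaction questions.

    Field names are illustrative assumptions, not a standard rubric.
    """
    uncertainty_actionable: bool                  # could the user act on stated uncertainty?
    action_interruptible: bool                    # could the user stop an action in progress?
    errors_recoverable: bool                      # could mistakes be undone?
    state_preserved_after_interruption: bool      # did work survive an interruption?
    confirmation_for_consequential_actions: bool  # was consent requested when it mattered?


def unmet_checks(session: InteractionChecks) -> list[str]:
    """Return the names of the checks the session failed, in declaration order."""
    return [f.name for f in fields(session) if not getattr(session, f.name)]


# Hypothetical session: the system could not be interrupted and did not
# ask for confirmation before a consequential delegated action.
session = InteractionChecks(
    uncertainty_actionable=True,
    action_interruptible=False,
    errors_recoverable=True,
    state_preserved_after_interruption=True,
    confirmation_for_consequential_actions=False,
)
print(unmet_checks(session))
```

Reporting the failed checks by name, rather than collapsing them into a single score, keeps each interaction failure visible alongside task accuracy.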
Evaluation must also account for sustained use. A system that succeeds in a lab task may impose fatigue when used repeatedly. It may require prompts that are too difficult to compose, produce explanations that are too long to scan, or make small recurring mistakes that shift labor back to the user. Accessibility research should measure these burdens directly.
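One way to measure these burdens directly is to log each repeated session and compare early use with later use. The sketch below assumes three hypothetical per-session measures (prompt attempts, manual corrections, explanation length) and a simple first-half versus second-half comparison. Neither the measures nor the comparison come from an established protocol.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SessionLog:
    """Burden observed in one session (all fields are illustrative assumptions)."""
    prompt_attempts: int    # prompts the user wrote before the task proceeded
    user_corrections: int   # system mistakes the user had to fix by hand
    explanation_words: int  # length of the explanation the user had to read


def burden_summary(sessions: list[SessionLog]) -> dict[str, float]:
    """Summarize burden across sessions given in chronological order."""
    if not sessions:
        raise ValueError("at least one session is required")
    half = len(sessions) // 2
    early = sessions[:half] or sessions  # a single session is compared with itself
    late = sessions[half:]
    return {
        "mean_prompt_attempts": float(mean(s.prompt_attempts for s in sessions)),
        "mean_explanation_words": float(mean(s.explanation_words for s in sessions)),
        "total_user_corrections": float(sum(s.user_corrections for s in sessions)),
        # Positive drift: the user fixes more mistakes in later sessions than in early ones.
        "correction_drift": float(
            mean(s.user_corrections for s in late)
            - mean(s.user_corrections for s in early)
        ),
    }


# Hypothetical log of four sessions with the same tool.
logs = [
    SessionLog(prompt_attempts=2, user_corrections=1, explanation_words=120),
    SessionLog(prompt_attempts=2, user_corrections=1, explanation_words=140),
    SessionLog(prompt_attempts=4, user_corrections=3, explanation_words=200),
    SessionLog(prompt_attempts=4, user_corrections=3, explanation_words=220),
]
summary = burden_summary(logs)
print(summary)
```

A rising `correction_drift` would indicate that labor is shifting back to the user over time, which a single lab task would not reveal.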
Benchmarks need tasks grounded in real access needs. These include document reading, form completion, device navigation, meeting participation, multilingual communication, and software control under varied sensory and motor conditions. They also include failure cases, because safe recovery is often the difference between a useful assistive tool and a dangerous one.
Evaluation should include disabled users and community organizations as research partners rather than late-stage testers. Their role is not to validate a finished system. It is to shape the definition of success, the task environment, and the acceptable risk profile.
An AI system is actually usable when people can rely on it without surrendering agency, privacy, or interpretability. That standard is higher than accuracy, and it is the standard that accessible intelligence requires.
Contact the research office
For collaboration proposals or questions about accessibility evaluation, contact the institute.
research@compaccess.edu.kg