Which evaluation frameworks measure social trust in human-AI collaboration?

Social trust in human-AI collaboration is assessed by a mix of psychological models, psychometric scales, and standards that translate theoretical constructs into measurable indicators. Core frameworks focus on perceived competence, integrity, and benevolence, and on how these shape reliance, cooperation, and oversight.

Psychological and organizational models

The Mayer, Davis, and Schoorman integrative model of organizational trust identifies ability, benevolence, and integrity as antecedents to trust and has been widely adapted to automation and AI research. The human factors perspective developed by John D. Lee and Katrina A. See emphasizes how transparency, predictability, and performance history shape operator trust in automation. Empirical work by Jiun-Yin Jian, Ann M. Bisantz, and Colin G. Drury produced a validated Trust in Automation scale that operationalizes subjective trust through survey items, supporting quantitative comparison across designs.
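
As a concrete illustration, here is a minimal sketch of how responses to a Jian-style trust questionnaire might be scored. It assumes a 7-point Likert format in which the first five (distrust-worded) items are reverse-coded before averaging; item counts, wording, and coding conventions vary across administrations, so treat the specifics as assumptions rather than the canonical scoring procedure.

```python
from statistics import mean

# Hypothetical responses to a 12-item, 7-point Likert trust-in-automation
# questionnaire (1 = not at all, 7 = extremely), in item order.
responses = [2, 3, 1, 2, 2, 6, 5, 6, 7, 5, 6, 6]

# Assumption: the first five items are distrust-worded (e.g., "the system
# is deceptive") and are reverse-coded on the 7-point scale before averaging.
REVERSE_CODED = {0, 1, 2, 3, 4}
SCALE_MAX = 7

def trust_score(items: list[int]) -> float:
    """Average item score after reverse-coding distrust-worded items."""
    coded = [
        (SCALE_MAX + 1 - r) if i in REVERSE_CODED else r
        for i, r in enumerate(items)
    ]
    return mean(coded)

print(f"Mean trust score: {trust_score(responses):.2f}")  # 1..7, higher = more trust
```

A single mean like this supports comparisons across interface designs or conditions, which is the main use the scale was built for.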

Standards and risk frameworks

Standards and regulatory frameworks provide instruments and metrics for evaluating trustworthiness at system and organizational levels. The National Institute of Standards and Technology issues the AI Risk Management Framework, which links technical measures, such as robustness and explainability, to governance practices that affect societal trust. The International Organization for Standardization publishes guidance on AI trustworthiness, such as ISO/IEC TR 24028, that frames metrics for safety, transparency, and accountability. These institutional frameworks make trust a measurable target for audits and certification rather than only a psychological state.
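
To make this concrete, the sketch below shows how an audit checklist might map the NIST AI RMF's four core functions (Govern, Map, Measure, Manage) to trust-relevant checks. The function names come from the framework itself; the individual checks are hypothetical placeholders, not items taken from the framework text.

```python
# Illustrative mapping from NIST AI RMF core functions to example
# trust-relevant audit checks. Function names are from the framework;
# the checks are hypothetical placeholders for a real audit plan.
AUDIT_CHECKLIST = {
    "Govern": ["accountability roles assigned", "incident response policy in place"],
    "Map": ["intended use and users documented", "known failure modes catalogued"],
    "Measure": ["robustness tested on perturbed inputs", "explanation quality reviewed"],
    "Manage": ["monitoring thresholds defined", "decommissioning criteria set"],
}

def audit_coverage(completed: set[str]) -> dict[str, float]:
    """Fraction of example checks marked complete, per RMF function."""
    return {
        fn: sum(c in completed for c in checks) / len(checks)
        for fn, checks in AUDIT_CHECKLIST.items()
    }

done = {"accountability roles assigned", "intended use and users documented",
        "robustness tested on perturbed inputs"}
for fn, frac in audit_coverage(done).items():
    print(f"{fn}: {frac:.0%} of example checks complete")
```

Framing trustworthiness as checklist coverage is what makes it auditable: the same structure supports certification, gap analysis, and repeat assessment over time.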

Measurement approaches converge on three methods: psychometric surveys that capture perceived trust and intent to rely; behavioral metrics that record reliance, intervention, and override rates during tasks (see the sketch below); and system-level audits that evaluate compliance with governance, explainability, and bias-mitigation requirements. Cultural and regional variation matters: social norms about authority and privacy change how users interpret system behavior and answer surveys, so cross-cultural validation is essential.
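
The behavioral strand can be illustrated with a minimal sketch of how reliance and override rates might be computed from task logs, assuming each trial records the AI's recommendation, the user's final decision, and whether the user inspected the system's explanation first; the log schema and metric definitions here are illustrative assumptions, not a standard instrument.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    ai_recommendation: str      # what the system advised
    user_decision: str          # what the user finally did
    user_checked_details: bool  # did the user open the explanation first?

# Hypothetical interaction log from a decision-support task.
log = [
    Trial("approve", "approve", False),
    Trial("deny",    "approve", True),
    Trial("approve", "approve", True),
    Trial("deny",    "deny",    False),
    Trial("approve", "deny",    True),
]

n = len(log)
reliance_rate = sum(t.user_decision == t.ai_recommendation for t in log) / n
override_rate = 1.0 - reliance_rate
verification_rate = sum(t.user_checked_details for t in log) / n

print(f"Reliance rate:     {reliance_rate:.0%}")      # agreement with the AI
print(f"Override rate:     {override_rate:.0%}")      # user went against the AI
print(f"Verification rate: {verification_rate:.0%}")  # oversight behavior
```

Behavioral measures like these complement surveys because they capture what people do rather than what they report, and divergence between the two is itself informative about trust calibration.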

The consequences of measurement choices are practical and ethical. Focusing only on short-term reliance can encourage automation complacency, while emphasizing governance and transparency supports sustainable public acceptance and reduces harm. Research grounded in human-robot interaction by Cynthia Breazeal at the Massachusetts Institute of Technology shows that social cues and embodiment alter trust dynamics, illustrating that technical fixes alone do not guarantee social legitimacy. Combining validated psychological scales, behavioral indicators, and standards-based audits creates a multidimensional evaluation that aligns design, deployment, and policy toward responsible human-AI collaboration.
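
As a closing sketch, one way such a multidimensional evaluation might be assembled is to normalize a psychometric score, a behavioral indicator, and an audit coverage fraction to a common [0, 1] range and report them side by side rather than collapsing them into a single number. The normalization and thresholds below are illustrative assumptions.

```python
# Illustrative aggregation of the three measurement strands discussed above.
# Keeping the dimensions separate (a profile, not one number) avoids hiding
# trade-offs, e.g., high reliance masking weak governance.
def normalize_likert(score: float, lo: float = 1.0, hi: float = 7.0) -> float:
    """Map a 1..7 Likert mean onto [0, 1]."""
    return (score - lo) / (hi - lo)

trust_profile = {
    "psychometric": normalize_likert(5.9),  # e.g., a Jian-style scale mean
    "behavioral":   0.60,                   # e.g., observed reliance rate
    "audit":        0.25,                   # e.g., fraction of checks passed
}

for dimension, value in trust_profile.items():
    flag = "  <- below 0.5, investigate" if value < 0.5 else ""
    print(f"{dimension:12s}: {value:.2f}{flag}")
```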