Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data

Aparna Balagopalan, Gillian K. Hadfield, David Madras, David H. Yang, Marzyeh Ghassemi

Files

thesis.pdf (1.61 MB)

Date

2024

Authors

Aparna Balagopalan, Gillian K. Hadfield, David Madras, David H. Yang, Marzyeh Ghassemi

Abstract

As governments and industry turn to increased use of automated decision systems, it becomes essential to consider how closely such systems can reproduce human judgment. We identify a core potential failure, finding that annotators label objects differently depending on whether they are being asked a factual question or a normative question. This challenges a natural assumption maintained in many standard machine-learning (ML) data acquisition procedures: that there is no difference between predicting the factual classification of an object and an exercise of judgment about whether an object violates a rule premised on those facts. We find that using factual labels to train models intended for normative judgments introduces a notable measurement error. We show that models trained using factual labels yield significantly different judgments than those trained using normative labels and that the impact of this effect on model performance can exceed that of other factors (e.g., dataset size) that routinely attract attention from ML researchers and practitioners.

URI

https://demo.dspace.org/handle/10673/1382

Collections

Artical
