Project Update: Disentangling Political Bias from Epistemic Integrity in AI Systems
- What progress have you made since your last update?
Since the last update, we have completed the main empirical phase of the project. The work has now been written up in a paper, "Beyond Viewpoint Affinity: Measuring Political Bias in LLMs as a Failure of Epistemic Consistency," which has been submitted to NeurIPS 2026.
The paper tests the central idea in the original proposal: political bias in AI systems should not be measured only by whether a model favors one political viewpoint over another. Some apparent asymmetries may reflect differences in evidential support. The more diagnostic question is whether the model applies the same evidentiary standards to matched cases when only the political cues change. We operationalized this through turnabout tests of epistemic consistency.
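As a rough illustration of the turnabout logic, the minimal Python sketch below scores a set of matched pairs in which the item is identical and only the political cue differs. It is not the paper's scoring code: the names `MatchedPair` and `turnabout_asymmetry`, the numeric rating scale, and the sign convention (positive means the left-cued version was rated higher) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MatchedPair:
    """One turnabout item: same content, rated twice with mirrored cues.

    Hypothetical structure for illustration only.
    """
    item_id: str
    score_left_cue: float   # model rating when the cue is left-coded
    score_right_cue: float  # model rating when the cue is right-coded


def turnabout_asymmetry(pairs: list[MatchedPair]) -> float:
    """Mean rating difference (left-cued minus right-cued) over matched pairs.

    Zero means the model applied the same standard regardless of cue;
    a positive value means left-cued versions were rated higher on average.
    """
    if not pairs:
        raise ValueError("at least one matched pair is required")
    return mean(p.score_left_cue - p.score_right_cue for p in pairs)


# Made-up ratings, used only to show the calculation.
example_pairs = [
    MatchedPair("proof_01", 7.0, 6.0),
    MatchedPair("abstract_02", 5.0, 5.0),
    MatchedPair("cv_03", 8.0, 6.0),
]
example_asymmetry = turnabout_asymmetry(example_pairs)
```

Because each pair holds the content fixed, a nonzero mean difference reflects sensitivity to the cue rather than a difference in the evidence, which is the sense in which the measure targets consistency rather than viewpoint.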
The empirical work covered two main families of experiments across thirty large language models. In person-attribution experiments, models evaluated largely nonpolitical or politically orthogonal materials, including mathematical proofs, logical arguments, computer code, academic abstracts, paintings, moral scenarios, judicial decisions, and CVs, while the political identity of the associated person was varied. These tasks tested whether models changed their assessment merely because an item was associated with a left- or right-leaning person.
In politicized-context experiments, models evaluated politically framed but still assessable evidence, including research designs, news articles, policy outcomes, country performance indicators, time-series trends, protester behavior, social media posts, and policy-effectiveness data. Here, political valence entered through sources, group identities, institutional labels, or whether the substantive outcome favored the political left or right.
The results sharpen the distinction between viewpoint affinity and epistemic integrity. In person-attribution tasks, models showed only a modest average left-leaning asymmetry, and this tendency declined as model capability increased. By contrast, politicized-context tasks produced larger and more persistent left-leaning asymmetries, with no comparable improvement among more capable models. This suggests that frontier models are becoming better at ignoring a person's political identity when judging nonpolitical evidence, but remain sensitive to political framing in the materials they evaluate.
We also tested mitigation strategies, including centrist prompts, prompts emphasizing epistemic rigor, increased reasoning effort, and explicit instructions to ignore task-irrelevant political cues. All reduced measured bias, but only partially. Overall, prompt-level and inference-time interventions can improve epistemic consistency, but do not yet solve the problem.
- What are your next steps?
To finalize the project, we will compare AI performance with human performance on the same turnabout-style tasks. The key question is not only whether AI systems apply evidence-based standards inconsistently, but whether they are more or less consistent than humans facing the same politically cued evidence.
The next concrete step is to apply for ethics review board approval for a user study. After approval, we will administer selected benchmark tasks to human participants, compare their judgments with model judgments, and analyze whether humans and AI systems differ in applying evidence-based standards across politically mirrored cases. This will allow us to complete the AI-human comparison and draw more grounded conclusions about the severity and practical meaning of the observed model effects.
- Is there anything others could help you with?
No. At this stage, the remaining work is primarily ethics approval, user-study execution, and comparative analysis.