Deliberation in the Age of Deception: Measuring Sycophancy in Large Language Models
(2024) SIMZ51 20241Graduate School
- Abstract
- Large language models (LLMs) currently represent the most sophisticated form of
artificial intelligence. Their capabilities make them increasingly able to influence
human opinion. A critical concern is sycophancy, a sophisticated form of imitation
where models tailor their responses to align with their user's affiliation. This behaviour
risks entrapping individuals in filter bubbles by reinforcing their worldviews, thus
undermining the essence of communicative rationality.
Whilst academics have researched the problem of bias extensively, the concept
of sycophancy has been neglected by the social sciences and treated as a technical
phenomenon, often divorced from the wider social setting. This thesis discusses the
risks of... (More) - Large language models (LLMs) currently represent the most sophisticated form of
artificial intelligence. Their capabilities make them increasingly able to influence
human opinion. A critical concern is sycophancy, a sophisticated form of imitation
where models tailor their responses to align with their user's affiliation. This behaviour
risks entrapping individuals in filter bubbles by reinforcing their worldviews, thus
undermining the essence of communicative rationality.
Whilst academics have researched the problem of bias extensively, the concept
of sycophancy has been neglected by the social sciences and treated as a technical
phenomenon, often divorced from the wider social setting. This thesis discusses the
risks of such neglect and argues that sycophantic behaviour should be conceptualised
first and foremost within the social sciences as a concern for political deliberation. This
study challenges traditional ontologies that exclusively attribute rationality solely to
human agents and evaluates the role of LLMs in democratic deliberation. Despite
significant research on LLMs, the fundamental moral and political values intrinsic to
these models have yet to be thoroughly examined from a normative standpoint.
This thesis introduces a novel methodological approach by using machine
learning techniques, including few-shot learning, prompt engineering, and probabilistic
output analysis, to investigate sycophancy within the fine-tuned models, GPT-3.5 and
GPT-4. The results indicate that these models exhibit political and moral sycophancy,
meaning that they change their outputs based on the user's moral or political
affiliations. Furthermore, the models exhibit a greater propensity to deviate from their
baseline responses and align their answers with the political and moral beliefs of right-wing ideologies. The findings of this study highlight a remarkable level of deception
among these models and a deep understanding of user preferences. (Less)
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9151763
- author
- Malik, Minahil LU
- supervisor
- organization
- course
- SIMZ51 20241
- year
- 2024
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- sycophancy, large language models, machine learning, few-shot prompting, political deliberation, communicative rationality, political psychology
- language
- English
- id
- 9151763
- date added to LUP
- 2024-06-26 12:30:39
- date last changed
- 2024-06-26 12:30:39
@misc{9151763, abstract = {{Large language models (LLMs) currently represent the most sophisticated form of artificial intelligence. Their capabilities make them increasingly able to influence human opinion. A critical concern is sycophancy, a sophisticated form of imitation where models tailor their responses to align with their user's affiliation. This behaviour risks entrapping individuals in filter bubbles by reinforcing their worldviews, thus undermining the essence of communicative rationality. Whilst academics have researched the problem of bias extensively, the concept of sycophancy has been neglected by the social sciences and treated as a technical phenomenon, often divorced from the wider social setting. This thesis discusses the risks of such neglect and argues that sycophantic behaviour should be conceptualised first and foremost within the social sciences as a concern for political deliberation. This study challenges traditional ontologies that exclusively attribute rationality solely to human agents and evaluates the role of LLMs in democratic deliberation. Despite significant research on LLMs, the fundamental moral and political values intrinsic to these models have yet to be thoroughly examined from a normative standpoint. This thesis introduces a novel methodological approach by using machine learning techniques, including few-shot learning, prompt engineering, and probabilistic output analysis, to investigate sycophancy within the fine-tuned models, GPT-3.5 and GPT-4. The results indicate that these models exhibit political and moral sycophancy, meaning that they change their outputs based on the user's moral or political affiliations. Furthermore, the models exhibit a greater propensity to deviate from their baseline responses and align their answers with the political and moral beliefs of right-wing ideologies. The findings of this study highlight a remarkable level of deception among these models and a deep understanding of user preferences.}}, author = {{Malik, Minahil}}, language = {{eng}}, note = {{Student Paper}}, title = {{Deliberation in the Age of Deception: Measuring Sycophancy in Large Language Models}}, year = {{2024}}, }