
Lund University Publications


Counter-productivity and suspicion : two arguments against talking about the AGI control problem

Stenseke, Jakob LU (2025) In Philosophical Studies
Abstract
How do you control a superintelligent artificial being given the possibility that its goals or actions might conflict with human interests? Over the past few decades, this concern – the AGI control problem – has remained a central challenge for research in AI safety. This paper develops and defends two arguments that provide pro tanto support for the following policy for those who worry about the AGI control problem: don’t talk about it. The first is the argument from counter-productivity, which states that unless kept secret, efforts to solve the control problem could be used by a misaligned AGI to counter those very efforts. The second is the argument from suspicion, which states that open discussions of the control problem may serve to make humanity appear threatening to an AGI, thereby increasing the risk that the AGI perceives humanity as a threat. I consider objections to the arguments and find them unsuccessful. Yet I also consider objections to the don’t-talk policy itself and find it inconclusive whether it should be adopted. Additionally, the paper examines whether the arguments extend to other areas of AI safety research, such as AGI alignment, and argues that they likely do, albeit not necessarily as directly. I conclude by offering recommendations on what one can safely talk about, regardless of whether the don’t-talk policy is ultimately adopted.
author
Stenseke, Jakob
publishing date
2025-07
type
Contribution to journal
publication status
epub
keywords
AI alignment, AI control, Existential risk, Superintelligence, Artificial general intelligence
in
Philosophical Studies
pages
24 pages
publisher
Springer
ISSN
0031-8116
DOI
10.1007/s11098-025-02379-9
language
English
LU publication?
yes
id
ef8f9dbb-1af9-4d71-9cc9-b06287ff590f
date added to LUP
2025-07-10 16:46:38
date last changed
2025-07-16 12:45:29
@article{ef8f9dbb-1af9-4d71-9cc9-b06287ff590f,
  abstract     = {{How do you control a superintelligent artificial being given the possibility that its goals or actions might conflict with human interests? Over the past few decades, this concern– the AGI control problem– has remained a central challenge for research in AI safety. This paper develops and defends two arguments that provide pro tanto support for the following policy for those who worry about the AGI control problem: don’t talk about it. The first is argument from counter-productivity, which states that unless kept secret, efforts to solve the control problem could be used by a misaligned AGI to counter those very efforts. The second is argument from suspicion, stating that open discussions of the control problem may serve to make humanity appear threatening to an AGI, which increases the risk that the AGI perceives humanity as a threat. I consider objections to the arguments and find them unsuccessful. Yet, I also consider objections to the don’t-talk policy itself and find it inconclusive whether it should be adopted. Additionally, the paper examines whether the arguments extend to other areas of AI safety research, such as AGI alignment, and argues that they likely do, albeit not necessarily as directly. I conclude by offering recommendations on what one can safely talk about, regardless of whether the don’t-talk policy is ultimately adopted.}},
  author       = {{Stenseke, Jakob}},
  issn         = {{0031-8116}},
  keywords     = {{AI alignment; AI control; Existential risk; Superintelligence; Artificial general intelligence}},
  language     = {{eng}},
  month        = {{07}},
  publisher    = {{Springer}},
  series       = {{Philosophical Studies}},
  title        = {{Counter-productivity and suspicion : two arguments against talking about the AGI control problem}},
  url          = {{http://dx.doi.org/10.1007/s11098-025-02379-9}},
  doi          = {{10.1007/s11098-025-02379-9}},
  year         = {{2025}},
}