Counter-productivity and suspicion: two arguments against talking about the AGI control problem
(2025) In Philosophical Studies
Abstract
- How do you control a superintelligent artificial being, given the possibility that its goals or actions might conflict with human interests? Over the past few decades, this concern – the AGI control problem – has remained a central challenge for research in AI safety. This paper develops and defends two arguments that provide pro tanto support for the following policy for those who worry about the AGI control problem: don’t talk about it. The first is the argument from counter-productivity, which states that unless kept secret, efforts to solve the control problem could be used by a misaligned AGI to counter those very efforts. The second is the argument from suspicion, which states that open discussions of the control problem may serve to make humanity appear threatening to an AGI, thereby increasing the risk that the AGI perceives humanity as a threat. I consider objections to the arguments and find them unsuccessful. Yet I also consider objections to the don’t-talk policy itself and find it inconclusive whether the policy should be adopted. Additionally, the paper examines whether the arguments extend to other areas of AI safety research, such as AGI alignment, and argues that they likely do, albeit not necessarily as directly. I conclude by offering recommendations on what one can safely talk about, regardless of whether the don’t-talk policy is ultimately adopted.
Please use this URL to cite or link to this publication:
https://lup.lub.lu.se/record/ef8f9dbb-1af9-4d71-9cc9-b06287ff590f
- author: Stenseke, Jakob (LU)
- organization:
- publishing date: 2025-07-10
- type: Contribution to journal
- publication status: epub
- subject:
- keywords: AI alignment, AI control, Existential risk, Superintelligence, Artificial general intelligence
- in: Philosophical Studies
- pages: 24 pages
- publisher: Springer
- ISSN: 0031-8116
- DOI: 10.1007/s11098-025-02379-9
- language: English
- LU publication?: yes
- id: ef8f9dbb-1af9-4d71-9cc9-b06287ff590f
- date added to LUP: 2025-07-10 16:46:38
- date last changed: 2025-07-16 12:45:29
@article{ef8f9dbb-1af9-4d71-9cc9-b06287ff590f,
  abstract  = {{How do you control a superintelligent artificial being given the possibility that its goals or actions might conflict with human interests? Over the past few decades, this concern– the AGI control problem– has remained a central challenge for research in AI safety. This paper develops and defends two arguments that provide pro tanto support for the following policy for those who worry about the AGI control problem: don’t talk about it. The first is argument from counter-productivity, which states that unless kept secret, efforts to solve the control problem could be used by a misaligned AGI to counter those very efforts. The second is argument from suspicion, stating that open discussions of the control problem may serve to make humanity appear threatening to an AGI, which increases the risk that the AGI perceives humanity as a threat. I consider objections to the arguments and find them unsuccessful. Yet, I also consider objections to the don’t-talk policy itself and find it inconclusive whether it should be adopted. Additionally, the paper examines whether the arguments extend to other areas of AI safety research, such as AGI alignment, and argues that they likely do, albeit not necessarily as directly. I conclude by offering recommendations on what one can safely talk about, regardless of whether the don’t-talk policy is ultimately adopted.}},
  author    = {{Stenseke, Jakob}},
  issn      = {{0031-8116}},
  keywords  = {{AI alignment; AI control; Existential risk; Superintelligence; Artificial general intelligence}},
  language  = {{eng}},
  month     = {{07}},
  publisher = {{Springer}},
  series    = {{Philosophical Studies}},
  title     = {{Counter-productivity and suspicion : two arguments against talking about the AGI control problem}},
  url       = {{http://dx.doi.org/10.1007/s11098-025-02379-9}},
  doi       = {{10.1007/s11098-025-02379-9}},
  year      = {{2025}},
}