Adaptive Tactics in Software Operations: How People Manage Complexity by Asking for Help
(2024) FLMU16 20232
Division of Risk Management and Societal Safety
- Abstract
- Society relies on the continued operation of software systems, from
healthcare to social media platforms. These systems can be engineered to be
highly reliable, but complexity assures surprise. People have the ability to
manage this complexity, filling gaps in design by recognizing, diagnosing and
correcting anomalies. Rarely can surprises be managed by individuals alone.
When engaged in anomaly response, people ask for help, recruiting others into
the response.
This study explores recruitment – what criteria influence people to recruit
additional resources, who do they recruit, why do they recruit them, and what
methods do they use for recruitment? A case study approach was used, focusing
on instances of recruitment across four cases of anomaly response in software
operations. Data were collected in the form of chat transcripts, and records
from monitoring and alerting tools and code repositories. These data were
used to inform semi-structured interviews and cued recall with participants
to elicit perspectives related to decision-making around specific instances
of recruitment.
In attempting to maintain operations, individuals sought help for a variety
of reasons: diagnosis, repair, coordination, cross-checking, information
gathering and approval. Recruitment occurred across a range of roles, and
requests for help occurred throughout the response. People also employed
various strategies in their attempts to recruit others, such as communicating
urgency by sharing a “scary” graph and leveraging the organizational
structure to identify who might be available to help. In all cases, restoring
system operations required recruiting additional people into the response,
and this recruitment was enabled by the accumulated experiences and
relationships of the recruiters.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9150096
- author
- Wettick, Michael Vincent LU
- supervisor
- organization
- course
- FLMU16 20232
- year
- 2024
- type
- H1 - Master's Degree (One Year)
- subject
- keywords
- Recruitment, Software Operations, Anomaly Response, Incident Response, Escalation, Adaptive Capacity, Decision Making, Tacit Knowledge, Complexity
- language
- English
- id
- 9150096
- date added to LUP
- 2024-03-21 15:40:50
- date last changed
- 2024-03-21 15:40:50
@misc{9150096,
  abstract = {{Society relies on the continued operation of software systems, from healthcare to social media platforms. These systems can be engineered to be highly reliable, but complexity assures surprise. People have the ability to manage this complexity, filling gaps in design by recognizing, diagnosing and correcting anomalies. Rarely can surprises be managed by individuals alone. When engaged in anomaly response, people ask for help, recruiting others into the response. This study explores recruitment – what criteria influence people to recruit additional resources, who do they recruit, why do they recruit them, and what methods do they use for recruitment? A case study approach was used, focusing on instances of recruitment across four cases of anomaly response in software operations. Data were collected in the form of chat transcripts, and records from monitoring and alerting tools and code repositories. These data were used to inform semi-structured interviews and cued recall with participants to elicit perspectives related to decision-making around specific instances of recruitment. In attempting to maintain operations, individuals sought help for a variety of reasons: diagnosis, repair, coordination, cross-checking, information gathering and approval. Recruitment occurred across a range of roles, and requests for help occurred throughout the response. People also employed various strategies in their attempts to recruit others, such as communicating urgency by sharing a “scary” graph and leveraging the organizational structure to identify who might be available to help. In all cases, restoring system operations required recruiting additional people into the response, and this recruitment was enabled by the accumulated experiences and relationships of the recruiters.}},
  author = {{Wettick, Michael Vincent}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Adaptive Tactics in Software Operations: How People Manage Complexity by Asking for Help}},
  year = {{2024}},
}