Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice

Borg, Markus; Bengtsson, Johan; Osterling, Harald; Hagelborn, Alexander; Gagner, Isabella; Tomaszewski, Piotr


Borg, Markus; Bengtsson, Johan; Osterling, Harald; Hagelborn, Alexander; Gagner, Isabella and Tomaszewski, Piotr (2022). 1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022. In Proceedings - 1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022, p. 22-32.

Abstract: Due to the migration megatrend, efficient and effective second-language acquisition is vital. One proposed solution involves AI-enabled conversational agents for person-centered interactive language practice. We present results from ongoing action research targeting quality assurance of proprietary generative dialog models trained for virtual job interviews. The action team elicited a set of 38 requirements for which we designed corresponding automated test cases for 15 of particular interest to the evolving solution. Our results show that six of the test case designs can detect meaningful differences between candidate models. While quality assurance of natural language processing applications is complex, we provide initial steps toward an automated framework for machine learning model selection in the context of an evolving conversational agent. Future work will focus on model selection in an MLOps setting.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/9ebe54d7-4156-4ba8-bc0c-ed6af761f908

author

Borg, Markus; Bengtsson, Johan; Osterling, Harald; Hagelborn, Alexander; Gagner, Isabella and Tomaszewski, Piotr

publishing date

2022

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Other Computer and Information Science

keywords

action research, AI quality, conversational agent, generative dialog model, requirements engineering, software testing

host publication

Proceedings - 1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022

series title

Proceedings - 1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022

pages

11 pages

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

conference name

1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022

conference location

Pittsburgh, United States

conference dates

2022-05-16 - 2022-05-17

external identifiers

scopus:85133467455

ISBN

9781450392754

DOI

10.1145/3522664.3528592

language

English

LU publication?

no

id

9ebe54d7-4156-4ba8-bc0c-ed6af761f908

date added to LUP

2022-10-24 14:57:11

date last changed

2025-10-14 10:42:23

@inproceedings{9ebe54d7-4156-4ba8-bc0c-ed6af761f908,
  abstract     = {{Due to the migration megatrend, efficient and effective second-language acquisition is vital. One proposed solution involves AI-enabled conversational agents for person-centered interactive language practice. We present results from ongoing action research targeting quality assurance of proprietary generative dialog models trained for virtual job interviews. The action team elicited a set of 38 requirements for which we designed corresponding automated test cases for 15 of particular interest to the evolving solution. Our results show that six of the test case designs can detect meaningful differences between candidate models. While quality assurance of natural language processing applications is complex, we provide initial steps toward an automated framework for machine learning model selection in the context of an evolving conversational agent. Future work will focus on model selection in an MLOps setting.}},
  author       = {{Borg, Markus and Bengtsson, Johan and Osterling, Harald and Hagelborn, Alexander and Gagner, Isabella and Tomaszewski, Piotr}},
  booktitle    = {{Proceedings - 1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022}},
  isbn         = {{9781450392754}},
  keywords     = {{action research; AI quality; conversational agent; generative dialog model; requirements engineering; software testing}},
  language     = {{eng}},
  pages        = {{22--32}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  series       = {{Proceedings - 1st International Conference on AI Engineering - Software Engineering for AI, CAIN 2022}},
  title        = {{Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice}},
  url          = {{http://dx.doi.org/10.1145/3522664.3528592}},
  doi          = {{10.1145/3522664.3528592}},
  year         = {{2022}},
}

