
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Human Text Query for Sport Data using LLMs

Blaho Mildton, Eira and Randsalu, Isa (2025) EITM01 20251
Department of Electrical and Information Technology
Abstract
This thesis explores the use of Large Language Models (LLMs) to convert natural language questions into SQL (Structured Query Language) queries within the field of sports analytics, with a focus on a real-world use case for Spiideo, a provider of cloud-based video analysis tools. The study compares different system designs, including single-agent and multi-agent setups, various prompting strategies, and the use of Retrieval-Augmented Generation (RAG).

The systems tested include configurations using ReAct (Reasoning + Acting) and StateGraph agents across single-agent setups as well as network- and supervisor-based multi-agent architectures. Each setup is evaluated using both zero-shot and few-shot prompting strategies, with and without RAG. Performance is measured by execution accuracy, exact-set-match accuracy, and response time across test sets of varying complexity.
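The two accuracy metrics named above can be sketched in a few lines. This is a minimal illustration, not the thesis's actual evaluation harness: the toy `players` table and the whitespace-token normalization in `exact_set_match` are assumptions for demonstration (benchmarks such as Spider compare SQL at the clause level rather than as raw token sets).

```python
import sqlite3

def execution_match(db: sqlite3.Connection, pred_sql: str, gold_sql: str) -> bool:
    """Execution accuracy: do both queries return the same rows (order-insensitive)?"""
    try:
        pred_rows = set(db.execute(pred_sql).fetchall())
    except sqlite3.Error:
        return False  # an unexecutable prediction counts as a miss
    gold_rows = set(db.execute(gold_sql).fetchall())
    return pred_rows == gold_rows

def exact_set_match(pred_sql: str, gold_sql: str) -> bool:
    """Exact-set-match: same SQL tokens regardless of order or case.
    A crude stand-in for clause-level comparison, for illustration only."""
    norm = lambda q: set(q.lower().replace(",", " ").split())
    return norm(pred_sql) == norm(gold_sql)

# Toy sports table (hypothetical schema, for illustration only).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE players (name TEXT, goals INTEGER)")
db.executemany("INSERT INTO players VALUES (?, ?)", [("Ada", 7), ("Bo", 3)])

# Semantically equivalent queries agree on execution but not on exact tokens.
print(execution_match(db, "SELECT name FROM players WHERE goals > 5",
                          "SELECT name FROM players WHERE goals >= 6"))  # → True
```

Note the asymmetry the two metrics capture: execution accuracy rewards any query that returns the right rows, while exact-set-match penalizes surface differences even when results agree.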

Results show that few-shot prompting consistently outperforms zero-shot prompting in processing time, execution accuracy, and exact-set-match accuracy. While multi-agent systems demonstrate stronger capability in handling complex queries and longer contexts, they typically require more processing time due to inter-agent communication and coordination. Among the multi-agent setups, the network architecture achieves the highest overall execution accuracy.

In contrast, single-agent systems are generally faster and benefit the most from the inclusion of RAG, which retrieves the most relevant examples and adds them to the prompt using semantic search. The StateGraph agent exhibits more stable behavior across test cases, whereas the ReAct agent occasionally repeats actions unnecessarily. These outcomes illustrate a clear trade-off between system complexity and performance: multi-agent systems excel in capability but at the cost of speed, while simpler single-agent systems prioritize efficiency, often requiring more targeted context to perform well.
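The RAG step described above can be sketched as follows. The thesis does not specify its embedding model or vector store, so this sketch substitutes a toy bag-of-words vector for a real sentence embedding; the example questions, SQL, and table names are likewise hypothetical.

```python
from collections import Counter
from math import sqrt

# Toy store of (question, SQL) example pairs; schema names are hypothetical.
EXAMPLES = [
    ("How many goals did each player score?",
     "SELECT name, SUM(goals) FROM events GROUP BY name"),
    ("Which team won the most matches?",
     "SELECT team, COUNT(*) AS wins FROM matches "
     "WHERE result = 'W' GROUP BY team ORDER BY wins DESC LIMIT 1"),
    ("List players faster than 30 km/h.",
     "SELECT name FROM tracking WHERE top_speed > 30"),
]

def embed(text: str) -> Counter:
    """Stand-in for a real sentence embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(question: str, k: int = 2) -> str:
    """Retrieve the k most similar stored examples and prepend them as few-shots."""
    q_vec = embed(question)
    ranked = sorted(EXAMPLES, key=lambda ex: cosine(q_vec, embed(ex[0])), reverse=True)
    shots = "\n\n".join(f"Q: {q}\nSQL: {s}" for q, s in ranked[:k])
    return f"{shots}\n\nQ: {question}\nSQL:"

print(build_prompt("How many goals did player X score?"))
```

In a production system the bag-of-words `embed` would be replaced by a learned embedding model and the linear scan by an approximate-nearest-neighbor index, but the retrieve-then-prepend structure is the same.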

Overall, the findings provide insight into how different architectural and prompting choices influence the effectiveness of LLM-driven systems for natural language to SQL translation, offering practical guidance for building efficient and accurate text-to-SQL applications in real-world environments.
author: Blaho Mildton, Eira and Randsalu, Isa
organization: Department of Electrical and Information Technology
course: EITM01 20251
year: 2025
type: H2 - Master's Degree (Two Years)
report number: LU/LTH-EIT 2025-1086
language: English
id: 9209729
date added to LUP: 2025-08-21 15:28:09
date last changed: 2025-08-21 15:28:09
@misc{9209729,
  abstract     = {{This thesis explores the use of Large Language Models (LLMs) to convert natural language questions into SQL (Structured Query Language) queries within the field of sports analytics, with a focus on a real-world use case for Spiideo, a provider of cloud-based video analysis tools. The study compares different system designs, including single-agent and multi-agent setups, various prompting strategies, and the use of Retrieval-Augmented Generation (RAG).

The systems tested include configurations using ReAct (Reasoning + Acting) and StateGraph agents across single-agent setups as well as network- and supervisor-based multi-agent architectures. Each setup is evaluated using both zero-shot and few-shot prompting strategies, with and without RAG. Performance is measured by execution accuracy, exact-set-match accuracy, and response time across test sets of varying complexity.

Results show that few-shot prompting consistently outperforms zero-shot prompting in processing time, execution accuracy, and exact-set-match accuracy. While multi-agent systems demonstrate stronger capability in handling complex queries and longer contexts, they typically require more processing time due to inter-agent communication and coordination. Among the multi-agent setups, the network architecture achieves the highest overall execution accuracy.

In contrast, single-agent systems are generally faster and benefit the most from the inclusion of RAG, which retrieves the most relevant examples and adds them to the prompt using semantic search. The StateGraph agent exhibits more stable behavior across test cases, whereas the ReAct agent occasionally repeats actions unnecessarily. These outcomes illustrate a clear trade-off between system complexity and performance: multi-agent systems excel in capability but at the cost of speed, while simpler single-agent systems prioritize efficiency, often requiring more targeted context to perform well.

Overall, the findings provide insight into how different architectural and prompting choices influence the effectiveness of LLM-driven systems for natural language to SQL translation, offering practical guidance for building efficient and accurate text-to-SQL applications in real-world environments.}},
  author       = {{Blaho Mildton, Eira and Randsalu, Isa}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Human Text Query for Sport Data using LLMs}},
  year         = {{2025}},
}