Human Text Query for Sport Data using LLMs
(2025) EITM01 20251, Department of Electrical and Information Technology
- Abstract
- This thesis explores the use of Large Language Models (LLMs) to convert natural language questions into SQL (Structured Query Language) queries within the field of sports analytics, with a focus on a real-world use case for Spiideo, a provider of cloud-based video analysis tools. The study compares different system designs, including single-agent and multi-agent setups, various prompting strategies, and the use of Retrieval-Augmented Generation (RAG).
The systems tested include configurations using ReAct (Reasoning + Acting) and StateGraph agents across single-agent setups as well as network- and supervisor-based multi-agent architectures. Each setup is evaluated using both zero-shot and few-shot prompting strategies, with and without RAG. Performance is measured by execution accuracy, exact-set-match accuracy, and response time across test sets of varying complexity.
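A ReAct-style agent interleaves model reasoning with tool calls until it can answer. The following is a minimal sketch of that loop only, not the thesis's implementation: the model is a scripted stub standing in for an LLM, and the tool name, database schema, and return values are all hypothetical.

```python
# Minimal ReAct-style loop: the model alternates reasoning and tool
# actions until it emits a final answer. All names here are illustrative.

def run_sql(query):
    """Stub tool: a real agent would execute this against the sports DB."""
    return [("Lionel Messi", 30)]

TOOLS = {"run_sql": run_sql}

def stub_model(history):
    """Scripted stand-in for an LLM call: first requests a tool action,
    then produces a final answer once it has seen an observation."""
    if not any(step[0] == "observation" for step in history):
        return ("action", "run_sql",
                "SELECT player, goals FROM stats ORDER BY goals DESC LIMIT 1;")
    return ("final", "Lionel Messi scored the most goals.")

def react_loop(question, model, max_steps=5):
    """Drive the act/observe cycle, capping steps to avoid the repeated
    actions the ReAct agent is reported to occasionally fall into."""
    history = [("question", question)]
    for _ in range(max_steps):
        step = model(history)
        if step[0] == "final":
            return step[1]
        _, tool, arg = step
        history.append(("observation", TOOLS[tool](arg)))
    return None
```

The step cap is one simple guard against the unnecessary repeated actions noted for the ReAct agent; a StateGraph design instead constrains transitions explicitly.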
Results show that few-shot prompting consistently outperforms zero-shot prompting in processing time, execution accuracy, and exact-set-match accuracy. While multi-agent systems demonstrate stronger capability in handling complex queries and longer contexts, they typically require more processing time due to inter-agent communication and coordination. Among the multi-agent setups, the network architecture achieves the highest overall execution accuracy.
In contrast, single-agent systems are generally faster and benefit the most from the inclusion of RAG, which retrieves the most relevant examples and adds them to the prompt using semantic search. The StateGraph agent exhibits more stable behavior across test cases, whereas the ReAct agent occasionally repeats actions unnecessarily. These outcomes illustrate a clear trade-off between system complexity and performance: multi-agent systems excel in capability but at the cost of speed, while simpler single-agent systems prioritize efficiency, often requiring more targeted context to perform well.
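The retrieval step described above — selecting the most relevant question/SQL examples by semantic search and prepending them to the prompt — can be sketched as follows. This is a toy illustration only: the bag-of-words "embedding", the example pool, and the table names are all hypothetical, whereas a real system would use a learned embedding model.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical pool of (question, gold SQL) examples to retrieve from.
EXAMPLES = [
    ("How many goals did each player score?",
     "SELECT player, COUNT(*) FROM goals GROUP BY player;"),
    ("List all matches played in 2024.",
     "SELECT * FROM matches WHERE year = 2024;"),
    ("Which team won the most matches?",
     "SELECT winner FROM matches GROUP BY winner "
     "ORDER BY COUNT(*) DESC LIMIT 1;"),
]

def build_prompt(question, k=2):
    """Retrieve the k most similar examples and prepend them few-shot style."""
    q_vec = embed(question)
    ranked = sorted(EXAMPLES,
                    key=lambda ex: cosine(q_vec, embed(ex[0])),
                    reverse=True)
    shots = "\n\n".join(f"Q: {q}\nSQL: {s}" for q, s in ranked[:k])
    return f"{shots}\n\nQ: {question}\nSQL:"
```

Because the retrieved shots are tailored to each incoming question, a single-agent system gets targeted context without the coordination overhead of a multi-agent setup.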
Overall, the findings provide insight into how different architectural and prompting choices influence the effectiveness of LLM-driven systems for natural language to SQL translation, offering practical guidance for building efficient and accurate text-to-SQL applications in real-world environments.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9209729
- author
- Blaho Mildton, Eira and Randsalu, Isa
- supervisor
- organization
- course
- EITM01 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- report number
- LU/LTH-EIT 2025-1086
- language
- English
- id
- 9209729
- date added to LUP
- 2025-08-21 15:28:09
- date last changed
- 2025-08-21 15:28:09
@misc{9209729,
  abstract  = {{This thesis explores the use of Large Language Models (LLMs) to convert natural language questions into SQL (Structured Query Language) queries within the field of sports analytics, with a focus on a real-world use case for Spiideo, a provider of cloud-based video analysis tools. The study compares different system designs, including single-agent and multi-agent setups, various prompting strategies, and the use of Retrieval-Augmented Generation (RAG). The systems tested include configurations using ReAct (Reasoning + Acting) and StateGraph agents across single-agent setups as well as network- and supervisor-based multi-agent architectures. Each setup is evaluated using both zero-shot and few-shot prompting strategies, with and without RAG. Performance is measured by execution accuracy, exact-set-match accuracy, and response time across test sets of varying complexity. Results show that few-shot prompting consistently leads to better performance than zero-shot, in processing time, execution accuracy and exact-set-match accuracy. While multi-agent systems demonstrate stronger capability in handling complex queries and longer contexts, they typically require more processing time due to inter-agent communication and coordination. Among the multi-agent setups, the network architecture achieves the highest overall execution accuracy. In contrast, single-agent systems are generally faster and benefit the most from the inclusion of RAG, which retrieves the most relevant examples and adds them to the prompt using semantic search. The StateGraph agent exhibits more stable behavior across test cases, whereas the ReAct agent occasionally repeats actions unnecessarily. These outcomes illustrate a clear trade-off between system complexity and performance: multi-agent systems excel in capability but at the cost of speed, while simpler single-agent systems prioritize efficiency, often requiring more targeted context to perform well. Overall, the findings provide insight into how different architectural and prompting choices influence the effectiveness of LLM-driven systems for natural language to SQL translation, offering practical guidance for building efficient and accurate text-to-SQL applications in real-world environments.}},
  author    = {{Blaho Mildton, Eira and Randsalu, Isa}},
  language  = {{eng}},
  note      = {{Student Paper}},
  title     = {{Human Text Query for Sport Data using LLMs}},
  year      = {{2025}},
}