
LUP Student Papers

LUND UNIVERSITY LIBRARIES

LLM-assisted trend scout

Ek, Alexander LU and Månsson, Lucas LU (2025) EITL05 20251
Department of Electrical and Information Technology
Abstract

It is important for a large company like Bosch to be at the forefront of technological development, and a central part of this is scouting for trends in the market. To do so, the company searches the Internet for news articles that could be of interest, which yields vast amounts of information. A problem, however, is that this information covers many different areas and regions, which makes it difficult and time-consuming to sort out what is interesting and relevant. This thesis explores how LLMs can be used to assist with finding relevant information on the Internet.

The thesis contains two main parts. The first explores how a pre-trained LLM can be optimized through different techniques to evaluate given information according to its relevance to Bosch. The second explores how this solution could be integrated into a web-based application. To test how an LLM could be optimized, five LLM techniques (zero-shot prompt, few-shot prompt, multi-stage pipeline, RAG 1, and RAG 2) were identified through literature research and implemented. Each technique was tested on a test set of articles that a Bosch employee had already labeled on a scale from 0 to 3, where 0 meant not relevant and 3 meant highly relevant. To test how the solution could be integrated into a web-based application, a prototype was built based on requirements from a Bosch employee, and a Bosch employee carried out user tests.
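
As an illustration of the zero-shot and few-shot prompting described above, the following is a minimal sketch of how a scoring prompt on the 0 to 3 scale could be assembled and how a model reply could be parsed. It is not the thesis's actual prompt or code: the class `LabeledArticle`, the `INSTRUCTION` text, and the functions `build_prompt` and `parse_score` are all hypothetical names introduced here, and the call to the LLM itself is deliberately left out.

```python
import re
from dataclasses import dataclass


@dataclass
class LabeledArticle:
    """Hypothetical container for one human-labeled example article."""
    title: str
    summary: str
    score: int  # relevance label on the 0-3 scale


# Hypothetical instruction text; the prompt used in the thesis is not shown here.
INSTRUCTION = (
    "Rate how relevant the news article is to the company on a scale "
    "from 0 to 3, where 0 means not relevant and 3 means highly relevant. "
    "Answer with a single digit."
)


def build_prompt(examples, title, summary):
    """Build a scoring prompt; an empty `examples` list gives a zero-shot prompt."""
    parts = [INSTRUCTION, ""]
    for ex in examples:
        # Each labeled example is shown in the same layout as the final query.
        parts += [f"Title: {ex.title}", f"Summary: {ex.summary}",
                  f"Score: {ex.score}", ""]
    # The article to be rated ends with an open "Score:" for the model to fill in.
    parts += [f"Title: {title}", f"Summary: {summary}", "Score:"]
    return "\n".join(parts)


def parse_score(reply):
    """Extract the first standalone digit 0-3 from the model's reply."""
    match = re.search(r"\b[0-3]\b", reply)
    if match is None:
        raise ValueError(f"no score in 0-3 found in reply: {reply!r}")
    return int(match.group())
```

Under these assumptions, the only difference between the zero-shot and few-shot variants is whether `examples` is empty; the text returned by `build_prompt` would be sent to the model, and `parse_score` would turn its reply back into an integer label.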

The results showed that the multi-stage pipeline performed best in F1-score, Cohen's kappa, and Quadrant kappa when predicting the correct relevance score for an article. However, in another test where scores 2 and 3 were treated as positive relevance and scores 0 and 1 as negative relevance, few-shot performed better than multi-stage in recall of articles with positive relevance, meaning it found more of the positively relevant articles. To minimize the risk of missing a relevant article in the prototype, a few-shot prompt would therefore be the best LLM technique for the prototype.
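
To make the two kinds of evaluation above concrete, here is a minimal sketch of Cohen's kappa on the 0 to 3 labels and of recall after the labels are collapsed into positive (2, 3) and negative (0, 1). It assumes that "Quadrant kappa" refers to the quadratically weighted variant of Cohen's kappa, which the abstract does not confirm. The function names `weighted_kappa` and `positive_recall` and the sample labels are hypothetical, and F1-score is omitted for brevity.

```python
from collections import Counter


def weighted_kappa(y_true, y_pred, n_classes=4, weights="quadratic"):
    """Cohen's kappa; weights='none' gives the plain, unweighted statistic."""
    if weights not in ("none", "quadratic"):
        raise ValueError("weights must be 'none' or 'quadratic'")
    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # joint counts of (true, predicted)
    true_marg = Counter(y_true)              # how often each label was given
    pred_marg = Counter(y_pred)              # how often each label was predicted
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            if weights == "quadratic":
                w = (i - j) ** 2             # larger misses cost more
            else:
                w = 0 if i == j else 1       # every miss costs the same
            num += w * observed[(i, j)] / n
            den += w * true_marg[i] * pred_marg[j] / (n * n)
    if den == 0:
        raise ValueError("kappa is undefined when only one label is used")
    return 1.0 - num / den


def positive_recall(y_true, y_pred, threshold=2):
    """Recall of positives after mapping scores >= threshold to 'relevant'."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t >= threshold and p >= threshold)
    misses = sum(1 for t, p in zip(y_true, y_pred) if t >= threshold and p < threshold)
    if hits + misses == 0:
        raise ValueError("no positive articles in y_true")
    return hits / (hits + misses)


# Made-up labels for illustration only; these are not results from the thesis.
human = [3, 2, 2, 0]
model = [3, 1, 2, 0]
kappa_q = weighted_kappa(human, model)
recall = positive_recall(human, model)
```

In this made-up example, a single prediction of 1 for an article labeled 2 leaves the quadratically weighted kappa at about 0.9 but drops positive recall to 2/3, because the miss is small on the 0 to 3 scale yet crosses the positive/negative boundary. This is one way a technique can score well on agreement metrics while still missing relevant articles.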

Because of the limited training data available to the RAG 1 technique, no reliable conclusion could be drawn from its results. Since RAG 1 is similar to the few-shot technique, RAG 1 was used in the prototype even though the few-shot prompt performed better in the LLM technique tests. To make a better-founded decision about which LLM technique to use in the prototype, more test data should be used to compare the techniques. The final prototype was tested by a Bosch employee according to the user tests, and the results showed that the prototype was easy to use and returned a high number of relevant articles.
Abstract (Swedish)

It is important for large companies like Bosch to be at the forefront of technological development. A central part of this is monitoring market trends by searching the Internet for news that may be of interest, which yields large amounts of information. A problem, however, is that the information covers very many different areas in many different regions, which makes it difficult and time-consuming to sift out what may be interesting and relevant. This thesis investigates how an LLM can be used to separate relevant from irrelevant news articles.

The thesis consists of two main parts. The first explores how a pre-trained LLM can be optimized with different techniques to evaluate the relevance of given information to Bosch. The second explores how this solution can be integrated into a web-based application. To test how an LLM can be optimized, five LLM techniques (zero-shot, few-shot, multi-stage pipeline, RAG 1, and RAG 2) were developed with the help of a literature study. Each technique was tested on a test set of articles that a Bosch employee had already labeled on a scale of 0–3, where 0 meant not relevant and 3 meant highly relevant. To test how the solution could be integrated into a web-based application, a prototype was built based on a number of requirements from a Bosch employee. A Bosch employee then carried out a set of specifically designed user tests to see whether the prototype met the requirements.

The results showed that the multi-stage pipeline performed best in F1-score, Cohen's kappa, and Quadrant kappa when rating the articles on the 0–3 scale. However, in another test where scores 2 and 3 were treated as positive relevance and scores 0 and 1 as negative relevance, few-shot performed better in recall of positive articles. This means that few-shot found more articles with positive relevance than multi-stage did. To minimize the risk of missing relevant articles in the prototype, a few-shot prompt would have been a better LLM technique than the multi-stage pipeline.

Because of the limited amount of training data in the RAG 1 technique, no credible conclusion can be drawn from the results for RAG 1. Since the RAG 1 technique is similar to the few-shot technique, RAG 1 was used in the prototype, even though few-shot performed better in the tests. To make a better decision about which LLM technique to use in the prototype, more test data could have been used to compare the techniques. The final prototype was tested by a Bosch employee according to the user tests, and the results showed that the prototype was easy to use and returned a high number of relevant articles.
author
Ek, Alexander LU and Månsson, Lucas LU
alternative title
LLM-assisterad trendanalys
course
EITL05 20251
year
2025
type
M2 - Bachelor Degree
keywords
Artificial Intelligence, Large Language Models, GPT-4o mini, OpenAI, Prompting, Retrieval Augmented Generation, Information Retrieval, Web development.
report number
LU/LTH-EIT 2025-1064
language
English
id
9197278
date added to LUP
2025-06-13 15:08:48
date last changed
2025-06-13 15:08:48
@misc{9197278,
  abstract     = {{It is important for a large company like Bosch to be at the forefront of technological development, and a central part of this is scouting for trends in the market. To do so, the company searches the Internet for news articles that could be of interest, which yields vast amounts of information. A problem, however, is that this information covers many different areas and regions, which makes it difficult and time-consuming to sort out what is interesting and relevant. This thesis explores how LLMs can be used to assist with finding relevant information on the Internet.

The thesis contains two main parts. The first explores how a pre-trained LLM can be optimized through different techniques to evaluate given information according to its relevance to Bosch. The second explores how this solution could be integrated into a web-based application. To test how an LLM could be optimized, five LLM techniques (zero-shot prompt, few-shot prompt, multi-stage pipeline, RAG 1, and RAG 2) were identified through literature research and implemented. Each technique was tested on a test set of articles that a Bosch employee had already labeled on a scale from 0 to 3, where 0 meant not relevant and 3 meant highly relevant. To test how the solution could be integrated into a web-based application, a prototype was built based on requirements from a Bosch employee, and a Bosch employee carried out user tests.

The results showed that the multi-stage pipeline performed best in F1-score, Cohen's kappa, and Quadrant kappa when predicting the correct relevance score for an article. However, in another test where scores 2 and 3 were treated as positive relevance and scores 0 and 1 as negative relevance, few-shot performed better than multi-stage in recall of articles with positive relevance, meaning it found more of the positively relevant articles. To minimize the risk of missing a relevant article in the prototype, a few-shot prompt would therefore be the best LLM technique for the prototype.

Because of the limited training data available to the RAG 1 technique, no reliable conclusion could be drawn from its results. Since RAG 1 is similar to the few-shot technique, RAG 1 was used in the prototype even though the few-shot prompt performed better in the LLM technique tests. To make a better-founded decision about which LLM technique to use in the prototype, more test data should be used to compare the techniques. The final prototype was tested by a Bosch employee according to the user tests, and the results showed that the prototype was easy to use and returned a high number of relevant articles.}},
  author       = {{Ek, Alexander and Månsson, Lucas}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{LLM-assisted trend scout}},
  year         = {{2025}},
}