Transformer-Based Self-Supervised Learning for Human Activity Recognition Using Accelerometer Data
(2025) BMEM01 20251, Department of Biomedical Engineering
- Abstract
- Human Activity Recognition (HAR) from wearable sensors is often constrained by limited labeled data and varying device placements. This thesis investigates a Transformer-based self-supervised approach that learns representations via a masked reconstruction and noise-injection pretext task, followed by fine-tuning on smaller labeled datasets.
Experiments on WISDM, REALWORLD, and OPPORTUNITY confirm that fully fine-tuning the Transformer significantly outperforms training from scratch, enhancing F1-score by 18.1% on average. In addition, the model shows moderate cross-subject and cross-placement transfer, although certain sensor locations remain challenging. Despite its relatively small footprint (fewer than 30 MFLOPs and 100k parameters), the proposed model consistently achieves robust performance. Overall, these findings suggest that self-supervised Transformers can reduce reliance on labeled data while adapting effectively across diverse devices and user populations.
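As a concrete illustration of the pretext task described above, here is a minimal sketch in PyTorch. The model class, layer sizes, window length, sampling rate, masking ratio, and noise level are all illustrative assumptions rather than the thesis's actual configuration; the sketch only shows the shape of a masked-reconstruction objective with noise injection.

```python
# Minimal sketch of a masked-reconstruction + noise-injection pretext task.
# All names, dimensions, and settings below are illustrative assumptions,
# not the configuration used in the thesis.
import torch
import torch.nn as nn

class TinyHARTransformer(nn.Module):
    """Small Transformer encoder over windows of 3-axis accelerometer data."""
    def __init__(self, d_model=64, n_heads=4, n_layers=4, window_len=200):
        super().__init__()
        self.embed = nn.Linear(3, d_model)               # project (x, y, z) samples
        self.pos = nn.Parameter(torch.zeros(1, window_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.reconstruct = nn.Linear(d_model, 3)         # predict raw samples back

    def forward(self, x):                                # x: (batch, time, 3)
        h = self.encoder(self.embed(x) + self.pos)
        return self.reconstruct(h)

def pretext_step(model, x, mask_ratio=0.15, noise_std=0.05):
    """One self-supervised step: corrupt and hide samples, reconstruct originals."""
    mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio  # (batch, time)
    corrupted = x + noise_std * torch.randn_like(x)      # noise injection
    corrupted[mask] = 0.0                                # masked timesteps hidden
    pred = model(corrupted)
    # Reconstruction loss only on the masked positions.
    return ((pred - x) ** 2)[mask].mean()

model = TinyHARTransformer()
x = torch.randn(8, 200, 3)    # dummy windows, e.g. ten seconds at an assumed 20 Hz
loss = pretext_step(model, x)
loss.backward()
```

Because the reconstruction targets come from the input signal itself, this step needs no human labels; labels enter only in the later fine-tuning stage.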
- Popular Abstract
- Compact AI Lets Wearables Learn Your Moves on Their Own
A compact AI model trains itself on raw wrist-sensor data, turning every shake and sway into clues about daily activities, without the large teams of people usually needed to tag videos.
Picture a jigsaw puzzle with a few pieces missing. A compact Transformer network studied half a million ten-second motion snippets from a watch's accelerometer and tried to guess the hidden parts. After a few hours of this game the model had an inner "feel" for everyday movement, without a single human label.
The experiments demonstrate that this self-supervised approach can reduce the need for large labeled datasets in Human Activity Recognition while maintaining effectiveness across different devices, placements, and users. When tested on three much smaller, labeled datasets, the self-taught model spotted activities 18.1% more accurately than the same network trained from scratch. The entire model is efficient, squeezing into about 100,000 parameters and requiring only 30 million calculations per prediction. This makes it light enough to run on the chip already inside most fitness bands.
Perhaps most importantly for real-world deployment, the model shows promising adaptability when conditions change. When the sensor is moved from the wrist to a different body location like the shin, just a few minutes of labeled training data was enough to boost accuracy from poor performance to genuinely useful results. This demonstrates meaningful transfer learning capabilities across different device placements, though the adaptation isn't perfect.
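The adaptation described above corresponds to fine-tuning the pretrained encoder on a small labeled set from the new placement. Below is a hedged sketch building on the hypothetical TinyHARTransformer from the earlier example; the class count, pooling strategy, and optimizer settings are assumptions for illustration.

```python
# Hedged sketch of adapting the self-supervised encoder to a new placement:
# attach a classification head and fine-tune on a handful of labeled windows.
# Class count, pooling, and optimizer settings are illustrative assumptions.
import torch
import torch.nn as nn

class ActivityClassifier(nn.Module):
    def __init__(self, backbone, n_classes=6):
        super().__init__()
        self.backbone = backbone                    # encoder from the pretext task
        self.head = nn.Linear(64, n_classes)        # 64 matches the encoder width above

    def forward(self, x):                           # x: (batch, time, 3)
        h = self.backbone.encoder(self.backbone.embed(x) + self.backbone.pos)
        return self.head(h.mean(dim=1))             # mean-pool over time, then classify

clf = ActivityClassifier(model)                     # `model` pretrained in the sketch above
opt = torch.optim.Adam(clf.parameters(), lr=1e-4)   # full fine-tuning: every weight updates
x_few = torch.randn(32, 200, 3)                     # "a few minutes" of labeled windows
y_few = torch.randint(0, 6, (32,))                  # activity labels for those windows
opt.zero_grad()
loss = nn.functional.cross_entropy(clf(x_few), y_few)
loss.backward()
opt.step()
```

Updating every weight in this way is the "full fine-tuning" setup that, per the abstract, outperformed training the same network from scratch.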
During testing, the network sometimes made curious mistakes that reveal how it "thinks." For instance, it occasionally labeled "eating a sandwich" as "drinking." Both gestures involve raising a hand to the mouth, showing just how finely the system must sense subtle differences in motion patterns to distinguish between similar activities.
This breakthrough opens doors across several important applications. In health and care settings, such responsive AI could enable quicker fall alerts for seniors and provide early hints of chronic disease when daily routines begin to change subtly. For personalized fitness, imagine wrist-worn coaches that learn new exercises and tailor advice to each user's actual activity patterns rather than generic recommendations. From an environmental perspective, these bite-sized models keep the computational carbon footprint far below today's AI giants.
Looking ahead, the same self-supervised learning approach could potentially reveal much larger behavioral routines, such as shopping trips or dining out. A sequence like sitting, then standing, then walking, then sitting again might reveal a café visit, all without any camera or privacy-invasive monitoring.
Ultimately, a small, self-taught Transformer can make wearables smarter, more environmentally friendly, and far less dependent on hard-to-obtain labeled training data. Feed it raw accelerometer data and, like a toddler learning to walk, it gradually finds the underlying patterns on its own. This opens the door to more helpful, privacy-friendly gadgets that can adapt to our daily lives.
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9205993
- author
- Myrén, Jonathan
- supervisor
- organization
- alternative title
- Transformerbaserad självövervakad inlärning för igenkänning av mänsklig aktivitet med accelerometerdata
- course
- BMEM01 20251
- year
- 2025
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Self-Supervised Learning, Transformer, Human Activity Recognition, Accelerometer, Wearable Sensors
- language
- English
- additional info
- 2015-13
- id
- 9205993
- date added to LUP
- 2025-06-30 12:18:32
- date last changed
- 2025-06-30 12:18:32
@misc{9205993,
  abstract = {{Human Activity Recognition (HAR) from wearable sensors is often constrained by limited labeled data and varying device placements. This thesis investigates a Transformer-based self-supervised approach that learns representations via a masked reconstruction and noise-injection pretext task, followed by fine-tuning on smaller labeled datasets. Experiments on WISDM, REALWORLD, and OPPORTUNITY confirm that fully fine-tuning the Transformer significantly outperforms training from scratch, enhancing F1-score by 18.1% on average. In addition, the model shows moderate cross-subject and cross-placement transfer, although certain sensor locations remain challenging. Despite its relatively small footprint (fewer than 30 MFLOPs and 100k parameters), the proposed model consistently achieves robust performance. Overall, these findings suggest that self-supervised Transformers can reduce reliance on labeled data while adapting effectively across diverse devices and user populations.}},
  author = {{Myrén, Jonathan}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Transformer-Based Self-Supervised Learning for Human Activity Recognition Using Accelerometer Data}},
  year = {{2025}},
}