Keystroke Classification of Motion Sensor Data - An LSTM Approach
(2021) In Master's Theses in Mathematical Sciences, FMSM01 20211, Mathematical Statistics
- Abstract
- With mobile phones often being used to type sensitive information, it is important that they remain secure and leak no information. One conceivable channel for information leakage, though, is the motion sensors (accelerometer and gyroscope), which an app can use without requesting permission. Do the data they produce contain information about the keys being typed?
To answer this question, this thesis investigates whether, using LSTM networks, keystrokes can be classified as (1) either backspace or not backspace, and (2) any key on the keyboard, using only motion sensor data collected around the keystroke. Furthermore, the problems are investigated in three different cases, one where the models are built on a user-basis, one where they are built on a mobile phone brand-basis, and the last where they are built on a general basis, using data pertaining to all users and brands.
The thesis finds that the motion sensor data do indeed contain information about the keys being typed. The different cases yield similar results for the backspace problem (1), while models built on a user-basis perform best in the more general problem (2). Training on a user-basis yields an EER of 0.11 and an F1-score of 0.61 for the backspace problem, and an accuracy of 51 % and a macro-averaged F1-score of 0.32 for the more general problem, much better than naïve model performance.
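The metrics quoted in the abstract can be illustrated with a minimal sketch. This record does not state how the thesis computed them, so the definitions below are the common textbook ones, the function names `macro_f1` and `equal_error_rate` are hypothetical, and the toy data are invented purely for illustration (they are not taken from the thesis).

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def equal_error_rate(scores, labels):
    """Approximate EER: scan the observed scores as thresholds and return the
    mean of the false-accept and false-reject rates where they are closest."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    best_gap, eer = float("inf"), None
    for thr in sorted(set(scores)):
        far = sum(s >= thr for s in negatives) / len(negatives)  # false accepts
        frr = sum(s < thr for s in positives) / len(positives)   # false rejects
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy, assumed data: an imbalanced 3-key problem and a naive "always 'a'" model.
y_true = ["a", "a", "a", "a", "b", "c"]
y_naive = ["a"] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_naive)) / len(y_true)
print(f"naive accuracy: {accuracy:.3f}, naive macro-F1: {macro_f1(y_true, y_naive):.3f}")

# Toy, assumed scores for a binary "backspace vs. not backspace" detector.
det_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
det_labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(f"EER: {equal_error_rate(det_scores, det_labels):.3f}")
```

On imbalanced classes, a majority-class guesser can reach a fairly high accuracy while its macro-averaged F1 stays low, which is one reason to report both figures alongside a naïve baseline.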
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9068608
- author: Patricks, Marcus LU
- supervisor:
- organization:
- alternative title: Tangentklassificering av mobil rörelsedata med LSTM-nätverk
- course: FMSM01 20211
- year: 2021
- type: H2 - Master's Degree (Two Years)
- subject:
- keywords: Neural Networks, Keystroke Dynamics, LSTM-Networks, Accelerometer, Gyroscope, Mobile Sensors, Transfer Learning
- publication/series: Master's Theses in Mathematical Sciences
- report number: LUTFMS-3433-2021
- ISSN: 1404-6342
- other publication id: 2021:E68
- language: English
- id: 9068608
- date added to LUP: 2021-12-01 14:48:06
- date last changed: 2021-12-01 14:48:06
@misc{9068608,
  abstract = {{With mobile phones often being used to type sensitive information, it is important that they remain secure and leak no information. One conceivable channel for information leakage, though, is the motion sensors accelerometer and gyroscope, sensors that require no permission to be used by an app. Do the data they produce contain information about the keys being typed? To answer this question, this thesis investigates whether, using LSTM networks, keystrokes can be classified as (1) either backspace or not backspace, and (2) any key on the keyboard, using only motion sensor data collected around the keystroke. Furthermore, the problems are investigated in three different cases, one where the models are built on a user-basis, one where they are built on a mobile phone brand-basis, and the last where they are built on a general basis, using data pertaining to all users and brands. The thesis finds that the motion sensor data do indeed contain information about the keys being typed. The different cases yield similar results for the backspace problem (1), while models built on a user-basis perform best in the more general problem (2). Training on a user-basis yields an EER of 0.11 and an F1-score of 0.61 for the backspace problem, and an Accuracy of 51 % and a macro-averaged F1-score of 0.32 for the more general problem, much better than naïve model performance.}},
  author = {{Patricks, Marcus}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's Theses in Mathematical Sciences}},
  title = {{Keystroke Classification of Motion Sensor Data - An LSTM Approach}},
  year = {{2021}},
}