
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Time Series Active Learning using Automated Feature Extraction

Lindskog, William LU (2021) In Master's thesis in Mathematical Sciences FMSM01 20211
Mathematical Statistics
Abstract
Time series classification is a prevalent problem for sensor data in industrial settings. The ability to classify time series correctly enables predictive maintenance, which is the process of optimally maintaining assets and ensuring minimal "down-time" of machines or processes. Classifying time series requires labeled time series that can be used as training instances, and labelling data is a time-consuming and costly activity. To overcome this, time series active learning attempts to label the most useful time series so that a machine learning model learns more quickly. Time series active learning is argued to be an understudied topic. Time series classification also faces the problem of large data dimensionality. By extracting features from the time series, one may reduce the dimensionality and achieve quicker processing. It is found that both suggested methods are faster and more accurate than a state-of-the-art method for time series active learning.
Popular Abstract
Predictive maintenance aims to foresee and fix problems before they occur. Machine learning can be used to achieve that. For a company, it could mean fewer unnecessary costs and a more digitalized organisation.

I have written my master's thesis together with Viking Analytics AB, located in Gothenburg. The thesis's main area of work is machine learning, one of many subfields of artificial intelligence. It aims at training computers to identify underlying patterns and rules in order to solve a task. Apart from machine learning, I have worked with time series.

A time series is a series of values recorded over time. Take the stock price of a listed company, for instance. Within a manufacturing company, a relevant time series might be the energy usage of a machine. Such companies need to be able to classify the information gathered from processes and machines. Classifying in this case means being able to distinguish one time series from another and tell why they are different. For a machine, one could classify its current state into three classes: (1) good state, (2) repair, or (3) exchange. This is often called predictive maintenance. Its goal is to foresee costly or undesirable events before they happen.

A computer needs training examples to know whether a time series is "good" or "bad". Today, employees must add this information manually, which is very time-consuming.

I have identified two problems: (1) a computer does not know how a time series should be classified, and (2) adding information about the time series manually is a costly and time-consuming process. Solving these problems is therefore relevant, as the work could be automated and costs could be avoided.

I have found a solution for both problems. Time series can be classified by looking at their features. An intuitive feature is the minimum value. Imagine a simple example where a person wears a smart watch while sleeping. The smart watch records the person's heart rate, and it is recommended that the person is notified in the morning if the heart rate dropped below 40 beats per minute (bpm) at any point during the night. The recorded heart rate can be seen as a time series. Classifying whether the heart rate stayed at or above 40 bpm, or fell below it, is easy, as one only has to look at the minimum value of the time series. For more complicated time series, one might have to look at more features to be able to classify them.
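
The smart watch example can be sketched in a few lines of Python. This is only an illustration of classifying by a single feature; the function names and the heart rate readings are made up for the example and do not come from the thesis.

```python
def min_feature(series):
    """Return the minimum value of a time series (one scalar feature)."""
    return min(series)


def low_heart_rate_alert(heart_rate_bpm, threshold_bpm=40):
    """Return True if the heart rate fell below the threshold at any point."""
    return min_feature(heart_rate_bpm) < threshold_bpm


# Hypothetical overnight readings in beats per minute.
night_a = [62, 58, 55, 51, 49, 47, 52, 60]
night_b = [60, 54, 48, 43, 39, 41, 50, 58]

print(low_heart_rate_alert(night_a))  # False: the minimum is 47 bpm
print(low_heart_rate_alert(night_b))  # True: the minimum is 39 bpm
```

However long the recording is, the decision depends on one number, which is the sense in which feature extraction reduces the dimensionality of the problem.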

The idea is that a machine learning model can be trained to classify these time series. However, it must first train on examples that include information such as whether the minimum value is below 40 bpm or not. Someone has to sit and add this information for the model to train on. To tackle this problem, I have used active learning. The idea of active learning is that the machine learning model trains on the most useful time series, meaning the time series the computer is most uncertain about. If the model can accurately classify the uncertain training examples, the more certain examples will be easier to classify.
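
One common way to pick the time series the model is "most uncertain about" is uncertainty sampling. The sketch below is a generic illustration and not the method developed in the thesis: it assumes three simple features (minimum, mean, maximum), a nearest-centroid classifier, and a distance margin as the uncertainty measure. All names and data are hypothetical.

```python
import math


def extract_features(series):
    """Summarise a time series as (minimum, mean, maximum)."""
    return (min(series), sum(series) / len(series), max(series))


def class_centroids(features, labels):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for feature, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(feature))
        for i, value in enumerate(feature):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def margin(feature, centroids):
    """Gap between the two nearest class centroids (needs >= 2 classes).

    A small gap means the model can barely tell the classes apart.
    """
    distances = sorted(math.dist(feature, c) for c in centroids.values())
    return distances[1] - distances[0]


def most_uncertain(pool_features, centroids):
    """Index of the unlabeled series with the smallest margin."""
    return min(range(len(pool_features)),
               key=lambda i: margin(pool_features[i], centroids))


# Hypothetical data: two labeled series and three unlabeled ones.
labeled = [[60, 55, 50], [45, 38, 36]]
labels = ["ok", "alert"]
pool = [[62, 58, 56], [50, 44, 41], [40, 35, 33]]

centroids = class_centroids([extract_features(s) for s in labeled], labels)
query = most_uncertain([extract_features(s) for s in pool], centroids)
print(query)  # 1: the middle series lies between the two classes
```

In a full active learning loop, the selected series would be shown to a person for labelling, added to the labeled set, and the model retrained before the next query.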

My solution is more accurate and faster than the best existing algorithm that I have found. I have tested my solution on three different datasets. Two of the three stem from industrial applications, and one consists of Swedish leaves.

I found the Swedish leaves dataset interesting, as I had not imagined that it could be seen as a time series classification problem. Imagine that you "roll out" the edge of a leaf and lay it flat on a table as one string. This string will be somewhat shaky and bumpy and can be seen as a time series.
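
One way to make the "rolled out" edge concrete is to walk around the outline and record, at each point, the distance to the centre of the shape. The sketch below assumes this centroid-distance representation and uses a made-up square outline instead of a real leaf; it is not necessarily how the Swedish leaves dataset was constructed.

```python
import math


def contour_to_series(points):
    """Turn an ordered outline [(x, y), ...] into a 1-D series.

    Each value is the distance from the outline's centroid to one
    boundary point, taken in the order the outline is traversed.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [math.hypot(x - cx, y - cy) for x, y in points]


# Hypothetical outline: a square traversed corner, edge midpoint, corner, ...
outline = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
series = contour_to_series(outline)
print([round(v, 3) for v in series])
# Alternates between about 1.414 (corners) and 1.0 (edge midpoints).
```

A real leaf outline would produce a longer and bumpier series, which can then be treated like any other time series, for example by extracting features from it.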

William Lindskog
M.Sc. Industrial Engineering, LTH
author: Lindskog, William LU
supervisor:
organization:
alternative title: Aktiv maskininlärning för tidsserier med hjälp av automatisk särdragsextrahering
course: FMSM01 20211
year: 2021
type: H2 - Master's Degree (Two Years)
subject:
keywords: Machine Learning, Active Learning, Time Series, Artificial Intelligence, Feature Extraction, Predictive Maintenance
publication/series: Master's thesis in Mathematical Sciences
report number: LUTFMS-3419-2021
ISSN: 1404-6342
other publication id: 2021:E38
language: English
id: 9052567
date added to LUP: 2021-07-02 16:08:54
date last changed: 2021-07-02 16:08:54
@misc{9052567,
  abstract     = {{Time series classification is a prevalent problem for sensor data in an industrial setting. The ability to classify time series correctly allows for predictive maintenance which is the process of optimally maintaining assets and ensuring minimal "down-time" of machines or processes. A requirement for classifying the time series are labeled time series that can be used as training instances. Labelling data is a time consuming and costly activity. Overcoming this, time series active learning attempts to label the most useful time series in order for a machine learning model to learn quicker. Time series active learning is argued to be an understudied topic. Time series classification also faces the problem of large data dimensionality. By extracting features from the time series, one may reduce the dimensionality and leverage quicker processing. It is found that both suggested methods are faster and more accurate than a state-of-the-art method for time series active learning.}},
  author       = {{Lindskog, William}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's thesis in Mathematical Sciences}},
  title        = {{Time Series Active Learning using Automated Feature Extraction}},
  year         = {{2021}},
}