LUP Student Papers

LUND UNIVERSITY LIBRARIES

Imputing connections of random gene networks from time series data using ANNs

Andersson, Sofia LU (2022) FYTM03 20221
Computational Biology and Biological Physics - Has been reorganised
Abstract
This thesis presents the architecture of a convolutional neural network which is trained to impute the connections of randomly generated gene regulatory networks under varying amounts of regularisation. The generated gene networks are simulated from 10 different starting conditions for each set of connections in order to obtain multiple time series. The generated time series are fed into the neural network for classification of the connections of each node. Ternary classification labels connections as inhibiting, absent, or promoting, whereas binary classification labels connections as present or absent. The performance of the neural network is evaluated by testing its accuracy on data it had not previously seen, known as test data. Ternary classification is unable to obtain a test accuracy above ~54% and binary classification cannot increase test accuracy beyond ~64%.

Despite a relatively low test accuracy, the time dynamics which are obtained from the predicted networks perform much better than randomly generated networks. Furthermore, some of the higher test accuracy scores were associated with distributions of guesses which were heavily biased towards only guessing zeroes, whereas somewhat lower test accuracy scores had very realistic guess distributions. While the results did not produce a particularly high numerical accuracy, the time dynamics obtained from the predicted connections of the neural network resemble the real dynamics, which indicates that there is potential in the method. Making some improvements, such as changing the way gene networks are generated and creating super learner ensembles, has potential to substantially increase performance.
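
The setup described in the abstract can be illustrated with a minimal sketch. The thesis's actual network generator, dynamics and network sizes are not given here, so everything below is an assumption made for illustration: the sigmoidal dynamics, the Euler integration, the 5-gene size, the 50% share of absent connections, and the helper names `random_connections` and `simulate`. The sketch draws one random ternary connection matrix, simulates it from 10 starting conditions, and then scores a classifier that always guesses "absent", which shows why a high accuracy can coincide with guesses biased towards zeroes.

```python
import numpy as np

def random_connections(n_genes, p_absent, rng):
    """Hypothetical generator: -1 inhibiting, 0 absent, +1 promoting."""
    p_signed = (1.0 - p_absent) / 2.0
    return rng.choice([-1, 0, 1], size=(n_genes, n_genes),
                      p=[p_signed, p_absent, p_signed])

def simulate(W, x0, n_steps=50, dt=0.1):
    """Euler integration of dx/dt = sigmoid(W @ x) - x (assumed dynamics)."""
    x = np.array(x0, dtype=float)
    series = [x.copy()]
    for _ in range(n_steps):
        x = x + dt * (1.0 / (1.0 + np.exp(-(W @ x))) - x)
        series.append(x.copy())
    return np.stack(series)  # shape (n_steps + 1, n_genes)

rng = np.random.default_rng(0)
n_genes = 5  # assumed size, not taken from the thesis
W = random_connections(n_genes, p_absent=0.5, rng=rng)

# 10 starting conditions for one set of connections, as in the abstract.
starts = rng.uniform(0.0, 1.0, size=(10, n_genes))
data = np.stack([simulate(W, x0) for x0 in starts])  # (10, 51, 5)

# Baseline: a classifier that always guesses "absent" (0).
always_zero = np.zeros_like(W)
ternary_acc = np.mean(always_zero == W)               # exact label match
binary_acc = np.mean((always_zero != 0) == (W != 0))  # present vs absent
```

For the all-zero guesser both accuracies equal the fraction of absent connections in `W`, so accuracy alone cannot separate a trivial predictor from one with a realistic guess distribution. Comparing the distribution of guesses, or the dynamics simulated from the predicted connections, gives a fuller picture.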
Popular Abstract (translated from Swedish)
The Internet is a network in which many computers communicate with one another. In the same way, a gene network consists of a set of genes that are connected so that they can communicate. These connections, however, are not always easy to discern merely by studying the dynamics of the network. The process, which is often long and challenging, requires many tests to arrive at a correct network structure. This is a significant problem in many areas of biology, but particularly in medicine. By understanding the structure of a gene network, it may be possible to steer the network so that a skin cell is converted into a brain cell. This immediately brings to mind the medical applications that these kinds of networks make possible.

The aim of this project is to simplify the process of finding these gene connections. To do this, we use neural networks. As the name suggests, a neural network resembles the nerve cells in the brain. The nodes work together to find a solution to the task. The network is trained by giving it many gene networks and their connections to analyse. This allows the network to recognise patterns and then apply them to gene networks it has not seen before.
@misc{9090103,
  abstract     = {{This thesis presents the architecture of a convolutional neural network which is trained to impute the connections of randomly generated gene regulatory networks under varying amounts of regularisation. The generated gene networks are simulated from 10 different starting conditions for each set of connections in order to obtain multiple time series. The generated time series are fed into the neural network for classification of the connections of each node. Ternary classification labels connections as inhibiting, absent, or promoting, whereas binary classification labels connections as present or absent. The performance of the neural network is evaluated by testing its accuracy on data it had not previously seen, known as test data. Ternary classification is unable to obtain a test accuracy above ~54% and binary classification cannot increase test accuracy beyond ~64%.

Despite a relatively low test accuracy, the time dynamics which are obtained from the predicted networks perform much better than randomly generated networks. Furthermore, some of the higher test accuracy scores were associated with distributions of guesses which were heavily biased towards only guessing zeroes, whereas somewhat lower test accuracy scores had very realistic guess distributions. While the results did not produce a particularly high numerical accuracy, the time dynamics obtained from the predicted connections of the neural network resemble the real dynamics, which indicates that there is potential in the method. Making some improvements, such as changing the way gene networks are generated and creating super learner ensembles, has potential to substantially increase performance.}},
  author       = {{Andersson, Sofia}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Imputing connections of random gene networks from time series data using ANNs}},
  year         = {{2022}},
}