LUP Student Papers

LUND UNIVERSITY LIBRARIES

With a Little Help from My Friends – A Comparative Study of Decentralized Deep Learning Strategies

Ihre-Thomason, Eric and Hagander, Tom (2024) In Master's Theses in Mathematical Sciences FMAM05 20241
Mathematics (Faculty of Engineering)
Abstract
This thesis investigates various communication strategies and similarity metrics within decentralized deep learning (DL). Decentralized learning allows organizations or users to collaborate on improving personalized deep neural networks while maintaining the privacy of their datasets. When the distribution of data varies across participating users, this task becomes more challenging, as not all collaboration is beneficial. This underscores the need for effective algorithms and similarity metrics that can identify good collaborators without sharing private data.

Specifically, this study considers two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). It utilizes diverse similarity metrics such as inverse training loss, cosine similarity of weights and gradients, and the inverse L2 distance between weights. Different model merging protocols are also examined to provide a comprehensive analysis of DL strategies. Our research provides insights into the performance of these metrics and communication strategies, highlighting their potential for effective collaboration in DL and contributing to the development of robust DL methods.
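
For concreteness, the similarity metrics named above can all be computed on flattened parameter or gradient vectors. The sketch below is a minimal illustration under that assumption; the function names and the small epsilon added for numerical stability are illustrative choices, not the thesis's actual implementation.

```python
import numpy as np

def inverse_loss_similarity(peer_loss: float, eps: float = 1e-8) -> float:
    # Inverse training loss: a peer whose model achieves a lower loss
    # on our local training data is considered more similar.
    return 1.0 / (peer_loss + eps)

def cosine_similarity(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> float:
    # Cosine similarity between two flattened weight (or gradient) vectors.
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)

def inverse_l2_similarity(a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> float:
    # Inverse Euclidean (L2) distance between flattened weight vectors.
    return 1.0 / (np.linalg.norm(a - b) + eps)
```
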
Popular Abstract
Imagine a world where your data helps improve technology without ever leaving your device. Our research in decentralized deep learning aims to make this a reality by creating better models through collaboration without compromising privacy.

In the rapidly evolving field of machine learning, privacy concerns are becoming increasingly important. Traditional methods of training machine learning models often require large amounts of data, which leads to entities collecting data that people would rather keep private. Our thesis explores decentralized deep learning strategies, where individual users or organizations can collaboratively train models without exposing their private data.

Decentralized deep learning works by letting each participating party calculate its similarity to the other parties. Participants then choose whom to collaborate with based on these similarities, as dictated by a communication strategy. To preserve privacy, collaborators share only their model parameters, both when calculating similarities and when sharing their insights. This lets collaborators that do not have enough data to train a well-performing model on their own still obtain one without exposing their private data, even when the data differs significantly between collaborators.
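
As an illustration, one highly simplified communication round might rank peers by a similarity function and then average model parameters with the top matches. The top-k selection and uniform averaging below are simplifying assumptions for the sketch; DAC and PANM select and weight collaborators in more adaptive ways.

```python
import numpy as np

def communication_round(my_weights, peer_weights, similarity, k=3):
    # Score every peer with the chosen similarity metric.
    scores = [similarity(my_weights, w) for w in peer_weights]
    # Keep the k most similar peers as collaborators.
    top_k = np.argsort(scores)[-k:]
    chosen = [peer_weights[i] for i in top_k]
    # Merge by averaging parameters; only weights are ever exchanged,
    # never the underlying private data.
    return np.mean([my_weights, *chosen], axis=0)
```
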

We investigated two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). Additionally, we explored various similarity metrics together with these communication strategies.

Our experiments were conducted using various datasets, including CIFAR-10 and Fashion-MNIST, simulating real-world scenarios where data is distributed unevenly among multiple participants. The results highlight the potential of DAC and PANM strategies in improving model performance while maintaining data privacy in various settings.
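
Such uneven (non-IID) distributions are often simulated with label skew, where each participant only observes a few classes. The following is a hypothetical sketch of such a split, not the exact partitioning scheme used in the thesis.

```python
import numpy as np

def label_skew_partition(labels, n_clients, classes_per_client=2, seed=0):
    # Give each client samples from only a few classes, so local data
    # distributions differ sharply across clients (non-IID).
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    partitions = []
    for _ in range(n_clients):
        own = rng.choice(classes, size=classes_per_client, replace=False)
        idx = np.flatnonzero(np.isin(labels, own))
        partitions.append(rng.permutation(idx))
    return partitions
```
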

Our research demonstrates the feasibility of decentralized deep learning and identifies some open problems, providing a foundation for future work in this area. By improving communication strategies and the use of similarity metrics, we can make collaborative learning systems more reliable and effective. This has far-reaching implications in fields such as healthcare and finance, where data privacy is paramount and collaborative learning can drive significant advances.
author: Ihre-Thomason, Eric and Hagander, Tom
organization: Mathematics (Faculty of Engineering)
course: FMAM05 20241
year: 2024
type: H2 - Master's Degree (Two Years)
keywords: Decentralized Deep Learning, Privacy-Preserving Machine Learning, Collaborative Learning, Federated Learning, Similarity Metrics, Non-IID Data, Model Averaging
publication/series: Master's Theses in Mathematical Sciences
report number: LUTFMA-3536-2024
ISSN: 1404-6342
other publication id: 2024:28
language: English
id: 9159064
date added to LUP: 2024-06-28 16:04:00
date last changed: 2024-06-28 16:04:00

@misc{9159064,
  abstract     = {{This thesis investigates various communication strategies and similarity metrics within decentralized deep learning (DL). Decentralized learning allows organizations or users to collaborate on improving personalized deep neural networks while maintaining the privacy of their datasets. When the distribution of data varies across participating users, this task becomes more challenging, as not all collaboration is beneficial. This underscores the need for effective algorithms and similarity metrics that can identify good collaborators without sharing private data.

Specifically, this study considers two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). It utilizes diverse similarity metrics such as inverse training loss, cosine similarity of weights and gradients, and the inverse L2 distance between weights. Different model merging protocols are also examined to provide a comprehensive analysis of DL strategies. Our research provides insights into the performance of these metrics and communication strategies, highlighting their potential for effective collaboration in DL and contributing to the development of robust DL methods.}},
  author       = {{Ihre-Thomason, Eric and Hagander, Tom}},
  issn         = {{1404-6342}},
  language     = {{eng}},
  note         = {{Student Paper}},
  series       = {{Master's Theses in Mathematical Sciences}},
  title        = {{With a Little Help from My Friends – A Comparative Study of Decentralized Deep Learning Strategies}},
  year         = {{2024}},
}