With a Little Help from My Friends – A Comparative Study of Decentralized Deep Learning Strategies
(2024) In Master's Theses in Mathematical Sciences, FMAM05 20241, Mathematics (Faculty of Engineering)
- Abstract
- This thesis investigates various communication strategies and similarity metrics within decentralized deep learning (DL). Decentralized learning allows organizations or users to collaborate on improving personalized deep neural networks while maintaining the privacy of their datasets. When the distribution of data varies across participating users, this task becomes more challenging, as not all collaboration is beneficial. This underscores the need for effective algorithms and similarity metrics that can identify good collaborators without sharing private data.
Specifically, this study considers two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). It utilizes diverse similarity metrics such as inverse training loss, cosine similarity of weights and gradients, and the inverse L2 distance between weights. Different model merging protocols are also examined to provide a comprehensive analysis of DL strategies. Our research provides insights into the performance of these metrics and communication strategies, highlighting their potential for effective collaboration in DL and contributing to the development of robust DL methods.
- Popular Abstract
- Imagine a world where your data helps improve technology without ever leaving your device. Our research in decentralized deep learning aims to make this a reality by creating better models through collaboration without compromising privacy.
In the rapidly evolving field of machine learning, privacy concerns are becoming more and more important. Traditional methods of training machine learning models require large amounts of data, which often leads to entities collecting data that people would rather keep private. Our thesis explores decentralized deep learning strategies, where individual users or organizations can collaboratively train models without exposing their private data.
Decentralized deep learning works by letting each participating party calculate its similarity to the other participating parties. Participants then choose whom to collaborate with based on these similarities, as dictated by a communication strategy. To preserve privacy, collaborators share only their model parameters, both when calculating similarities and when sharing their insights. This lets collaborators that do not have enough data to train a well-performing model on their own still obtain one, without exposing their private data, even when the data differs significantly between collaborators.
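The similarity metrics the thesis names (inverse training loss, cosine similarity of weights or gradients, and inverse L2 distance between weights) can be sketched as below. This is a minimal illustration, assuming each participant's model has been flattened into a NumPy parameter vector; the function names are hypothetical and not the thesis's actual implementation.

```python
import numpy as np

def cosine_similarity(w_a: np.ndarray, w_b: np.ndarray) -> float:
    """Cosine similarity between two flattened parameter (or gradient) vectors."""
    return float(np.dot(w_a, w_b) / (np.linalg.norm(w_a) * np.linalg.norm(w_b)))

def inverse_l2_similarity(w_a: np.ndarray, w_b: np.ndarray, eps: float = 1e-8) -> float:
    """Inverse L2 distance between weight vectors: large when the models are close."""
    return 1.0 / (np.linalg.norm(w_a - w_b) + eps)

def inverse_loss_similarity(peer_loss_on_own_data: float, eps: float = 1e-8) -> float:
    """Inverse training loss: evaluating a peer's model on one's own data and
    inverting the loss scores more useful collaborators higher."""
    return 1.0 / (peer_loss_on_own_data + eps)
```

Note that only parameters (or losses computed from them) enter these scores, which is how similarity can be estimated without exchanging any raw data.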
We investigated two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). Additionally, we explored various similarity metrics together with these communication strategies.
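Once collaborators are chosen, their models must be combined; the abstract mentions that several model merging protocols were examined. The sketch below shows one plausible variant, a similarity-weighted parameter average, as a generic illustration only (it assumes nonnegative similarity scores and is not claimed to be the protocol used in the thesis).

```python
import numpy as np

def merge_models(own_weights: np.ndarray,
                 peer_weights: list[np.ndarray],
                 similarities: list[float]) -> np.ndarray:
    """Similarity-weighted average of own and peer parameter vectors.
    Peers that look more similar contribute more to the merged model.
    Assumes nonnegative similarity scores."""
    # The own model is weighted by the maximum peer similarity; this is an
    # arbitrary illustrative choice, not one prescribed by the thesis.
    weights = np.array([max(similarities)] + list(similarities), dtype=float)
    weights /= weights.sum()
    stacked = np.stack([own_weights] + list(peer_weights))
    return (weights[:, None] * stacked).sum(axis=0)
```

For example, merging a zero vector with one equally weighted peer of all ones yields the midpoint, all 0.5s.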
Our experiments were conducted using various datasets, including CIFAR-10 and Fashion-MNIST, simulating real-world scenarios where data is distributed unevenly among multiple participants. The results highlight the potential of DAC and PANM strategies in improving model performance while maintaining data privacy in various settings.
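Unevenly distributed (non-IID) data of the kind simulated in these experiments is commonly produced with label-skewed partitions. The sketch below uses a Dirichlet split over labels, a standard device in the federated-learning literature; the thesis may partition CIFAR-10 and Fashion-MNIST differently.

```python
import numpy as np

def dirichlet_partition(labels: np.ndarray, n_clients: int,
                        alpha: float = 0.5, seed: int = 0) -> list[np.ndarray]:
    """Split sample indices among clients, with each label's proportions drawn
    from a Dirichlet(alpha) distribution. Smaller alpha -> more skewed clients."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        rng.shuffle(idx)
        # Fraction of this label's samples assigned to each client.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part)
    return [np.array(ci) for ci in client_indices]
```

Every sample is assigned to exactly one client, so the partition is disjoint and complete, while small `alpha` gives each client a very different label mixture.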
Our research demonstrates the feasibility of decentralized deep learning and presents some unsolved problems, providing a foundation for future work in this area. By improving communication strategies and the use of similarity metrics, we can make collaborative learning systems more reliable and effective. This has far-reaching implications, from healthcare to finance, where data privacy is highly important and collaborative learning can drive significant advancements.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9159064
- author
- Ihre-Thomason, Eric and Hagander, Tom
- supervisor
- organization
- course
- FMAM05 20241
- year
- 2024
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Decentralized Deep Learning, Privacy-Preserving Machine Learning, Collaborative Learning, Federated Learning, Similarity Metrics, Non-IID Data, Model Averaging
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUTFMA-3536-2024
- ISSN
- 1404-6342
- other publication id
- 2024:28
- language
- English
- id
- 9159064
- date added to LUP
- 2024-06-28 16:04:00
- date last changed
- 2024-06-28 16:04:00
@misc{9159064,
  abstract = {{This thesis investigates various communication strategies and similarity metrics within decentralized deep learning (DL). Decentralized learning allows organizations or users to collaborate on improving personalized deep neural networks while maintaining the privacy of their datasets. When the distribution of data varies across participating users, this task becomes more challenging, as not all collaboration is beneficial. This underscores the need for effective algorithms and similarity metrics that can identify good collaborators without sharing private data. Specifically, this study considers two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). It utilizes diverse similarity metrics such as inverse training loss, cosine similarity of weights and gradients, and the inverse L2 distance between weights. Different model merging protocols are also examined to provide a comprehensive analysis of DL strategies. Our research provides insights into the performance of these metrics and communication strategies, highlighting their potential for effective collaboration in DL and contributing to the development of robust DL methods.}},
  author = {{Ihre-Thomason, Eric and Hagander, Tom}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's Theses in Mathematical Sciences}},
  title = {{With a Little Help from My Friends – A Comparative Study of Decentralized Deep Learning Strategies}},
  year = {{2024}},
}