With a Little Help from My Friends – A Comparative Study of Decentralized Deep Learning Strategies
(2024) In Master's Theses in Mathematical Sciences, FMAM05 20241, Mathematics (Faculty of Engineering)
- Abstract
- This thesis investigates various communication strategies and similarity metrics within decentralized deep learning (DL). Decentralized learning allows organizations or users to collaborate on improving personalized deep neural networks while maintaining the privacy of their datasets. When the distribution of data varies across participating users, this task becomes more challenging, as not all collaboration is beneficial. This underscores the need for effective algorithms and similarity metrics that can identify good collaborators without sharing private data.
Specifically, this study considers two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). It utilizes diverse similarity metrics such as inverse training loss, cosine similarity of weights and gradients, and the inverse L2 distance between weights. Different model merging protocols are also examined to provide a comprehensive analysis of DL strategies. Our research provides insights into the performance of these metrics and communication strategies, highlighting their potential for effective collaboration in DL and contributing to the development of robust DL methods.
- Popular Abstract
- Imagine a world where your data helps improve technology without ever leaving your device. Our research in decentralized deep learning aims to make this a reality by creating better models through collaboration without compromising privacy.
In the rapidly evolving field of machine learning, privacy concerns are becoming more and more important. Traditional methods of training machine learning models require large amounts of data, which often leads to entities collecting data that people would rather keep private. Our thesis explores decentralized deep learning strategies, where individual users or organizations can collaboratively train models without exposing their private data.
Decentralized deep learning works by letting each participating party calculate its similarity to the other participating parties. Participants then choose whom to collaborate with based on these similarities, as dictated by a communication strategy. To preserve privacy, collaborators share only their model parameters, both when calculating similarities and when sharing their insights. This lets collaborators that do not have enough data to train a well-performing model on their own still obtain one, without exposing their private data, even when the data differs significantly between collaborators.
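The similarity metrics the thesis names (inverse training loss, cosine similarity of weights or gradients, and inverse L2 distance between weights) can be sketched as below. This is a minimal illustration, assuming each participant's model has been flattened into a NumPy parameter vector; the function names are hypothetical and not the thesis's actual implementation.

```python
import numpy as np

def cosine_similarity(w_a: np.ndarray, w_b: np.ndarray) -> float:
    """Cosine similarity between two flattened parameter (or gradient) vectors."""
    return float(np.dot(w_a, w_b) / (np.linalg.norm(w_a) * np.linalg.norm(w_b)))

def inverse_l2_similarity(w_a: np.ndarray, w_b: np.ndarray, eps: float = 1e-8) -> float:
    """Inverse L2 distance between weight vectors: large when the models are close."""
    return 1.0 / (np.linalg.norm(w_a - w_b) + eps)

def inverse_loss_similarity(peer_loss_on_own_data: float, eps: float = 1e-8) -> float:
    """Inverse training loss: evaluating a peer's model on one's own data and
    inverting the loss scores more useful collaborators higher."""
    return 1.0 / (peer_loss_on_own_data + eps)
```

Note that only parameters (or losses computed from them) enter these scores, which is how similarity can be estimated without exchanging any raw data.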
We investigated two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). Additionally, we explored various similarity metrics together with these communication strategies.
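Once collaborators are chosen, their models must be combined; the abstract mentions that several model merging protocols were examined. The sketch below shows one plausible variant, a similarity-weighted parameter average, as a generic illustration only (it assumes nonnegative similarity scores and is not claimed to be the protocol used in the thesis).

```python
import numpy as np

def merge_models(own_weights: np.ndarray,
                 peer_weights: list[np.ndarray],
                 similarities: list[float]) -> np.ndarray:
    """Similarity-weighted average of own and peer parameter vectors.
    Peers that look more similar contribute more to the merged model.
    Assumes nonnegative similarity scores."""
    # The own model is weighted by the maximum peer similarity; this is an
    # arbitrary illustrative choice, not one prescribed by the thesis.
    weights = np.array([max(similarities)] + list(similarities), dtype=float)
    weights /= weights.sum()
    stacked = np.stack([own_weights] + list(peer_weights))
    return (weights[:, None] * stacked).sum(axis=0)
```

For example, merging a zero vector with one equally weighted peer of all ones yields the midpoint, all 0.5s.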
Our experiments were conducted using various datasets, including CIFAR-10 and Fashion-MNIST, simulating real-world scenarios where data is distributed unevenly among multiple participants. The results highlight the potential of DAC and PANM strategies in improving model performance while maintaining data privacy in various settings.
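Unevenly distributed (non-IID) data of the kind simulated in these experiments is commonly produced with label-skewed partitions. The sketch below uses a Dirichlet split over labels, a standard device in the federated-learning literature; the thesis may partition CIFAR-10 and Fashion-MNIST differently.

```python
import numpy as np

def dirichlet_partition(labels: np.ndarray, n_clients: int,
                        alpha: float = 0.5, seed: int = 0) -> list[np.ndarray]:
    """Split sample indices among clients, with each label's proportions drawn
    from a Dirichlet(alpha) distribution. Smaller alpha -> more skewed clients."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        rng.shuffle(idx)
        # Fraction of this label's samples assigned to each client.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part)
    return [np.array(ci) for ci in client_indices]
```

Every sample is assigned to exactly one client, so the partition is disjoint and complete, while small `alpha` gives each client a very different label mixture.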
Our research demonstrates the feasibility of decentralized deep learning and presents some unsolved problems, providing a foundation for future work in this area. By improving communication strategies and the use of similarity metrics, we can make collaborative learning systems more reliable and effective. This has far-reaching implications, from healthcare to finance, where data privacy is highly important and collaborative learning can drive significant advancements.
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9159064
- author
- Ihre-Thomason, Eric and Hagander, Tom
- supervisor
- organization
- course
- FMAM05 20241
- year
- 2024
- type
- H2 - Master's Degree (Two Years)
- subject
- keywords
- Decentralized Deep Learning, Privacy-Preserving Machine Learning, Collaborative Learning, Federated Learning, Similarity Metrics, Non-IID Data, Model Averaging
- publication/series
- Master's Theses in Mathematical Sciences
- report number
- LUTFMA-3536-2024
- ISSN
- 1404-6342
- other publication id
- 2024:28
- language
- English
- id
- 9159064
- date added to LUP
- 2024-06-28 16:04:00
- date last changed
- 2024-06-28 16:04:00
@misc{9159064,
  abstract = {{This thesis investigates various communication strategies and similarity metrics within decentralized deep learning (DL). Decentralized learning allows organizations or users to collaborate on improving personalized deep neural networks while maintaining the privacy of their datasets. When the distribution of data varies across participating users, this task becomes more challenging, as not all collaboration is beneficial. This underscores the need for effective algorithms and similarity metrics that can identify good collaborators without sharing private data. Specifically, this study considers two main communication strategies: Decentralized Adaptive Clustering (DAC) and Personalized Adaptive Neighbor Matching (PANM). It utilizes diverse similarity metrics such as inverse training loss, cosine similarity of weights and gradients, and the inverse L2 distance between weights. Different model merging protocols are also examined to provide a comprehensive analysis of DL strategies. Our research provides insights into the performance of these metrics and communication strategies, highlighting their potential for effective collaboration in DL and contributing to the development of robust DL methods.}},
  author = {{Ihre-Thomason, Eric and Hagander, Tom}},
  issn = {{1404-6342}},
  language = {{eng}},
  note = {{Student Paper}},
  series = {{Master's Theses in Mathematical Sciences}},
  title = {{With a Little Help from My Friends – A Comparative Study of Decentralized Deep Learning Strategies}},
  year = {{2024}},
}