A Supervised Co-complex Probability Weighting of Yeast Composite Protein Networks using Gradient-boosted Trees for Protein Complex Detection
- DOI
- 10.2991/978-94-6463-388-7_21
- Keywords
- protein complex prediction; graph clustering; PPIN
- Abstract
Many earlier methods detect protein complexes by applying clustering algorithms to protein-protein interaction networks (PPINs), relying only on the topology of the PPIN. However, PPINs contain many false positives and false negatives, which makes them unreliable when used alone to detect protein complexes. Moreover, not all proteins in a complex interact with each other, and not all proteins that interact with each other belong to the same complex. Relying solely on the physical interactions of proteins is therefore not ideal for detecting protein complexes. This study extends SWC, a method by Yong et al. that integrates other heterogeneous data sources into the PPIN to create a composite network in which each edge is weighted by its posterior co-complex probability. Combined with various clustering algorithms, SWC yielded more accurate protein complex detection. This study attempts to improve SWC by integrating additional data sources and by using a more advanced machine learning model, gradient-boosted trees. When predicting co-complex edges, the proposed method outperformed SWC on every performance metric, often by a considerable margin in precision-recall AUC, Brier score loss, and log loss. More importantly, it also outperformed SWC in precision-recall AUC when used together with the Markov Cluster algorithm (MCL) to detect protein complexes. Finally, it outperformed various unsupervised weighting methods in all of these evaluations. All evaluations were performed on two yeast PPINs.
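Two of the edge-level metrics named in the abstract, Brier score loss and log loss, can be illustrated with a minimal sketch. This is not the authors' code: the edge list, protein names, and probability values below are hypothetical, and the functions are plain-Python versions of the standard definitions, applied to predicted co-complex probabilities and 0/1 co-complex labels.

```python
import math


def brier_score(labels, probs):
    """Mean squared difference between each predicted probability and its 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of the 0/1 labels.

    Probabilities are clipped to [eps, 1 - eps] so that log(0) never occurs.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical weighted edges (illustrative values only):
# (protein_a, protein_b, predicted co-complex probability, true co-complex label)
edges = [
    ("P1", "P2", 0.9, 1),
    ("P1", "P3", 0.8, 1),
    ("P2", "P4", 0.3, 0),
    ("P3", "P5", 0.1, 0),
]

labels = [label for _, _, _, label in edges]
probs = [prob for _, _, prob, _ in edges]

print("Brier score loss:", brier_score(labels, probs))
print("Log loss:", log_loss(labels, probs))
```

Lower values are better for both metrics. Both penalize confident wrong predictions, log loss much more heavily than Brier score, which is why they are commonly reported together when edge weights are meant to be read as probabilities.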
- Copyright
- © 2024 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Anthony Van C. Cayetano
AU - John Justine S. Villar
PY - 2024
DA - 2024/02/29
TI - A Supervised Co-complex Probability Weighting of Yeast Composite Protein Networks using Gradient-boosted Trees for Protein Complex Detection
BT - Proceedings of the Workshop on Computation: Theory and Practice (WCTP 2023)
PB - Atlantis Press
SP - 342
EP - 365
SN - 2589-4900
UR - https://doi.org/10.2991/978-94-6463-388-7_21
DO - 10.2991/978-94-6463-388-7_21
ID - Cayetano2024
ER -