J. Biosci. Agric. Res. | Volume 26, Issue 02, 2177-2184 | https://doi.org/10.18801/jbar.260220.266
Article type: Research article | Received: 22.10.2020; Revised: 17.11.2020; First published online: 10 December 2020.
Modified naive Bayes classifier for classification of protein-protein interaction sites
Mohammad Ahsan Uddin1 and Md. Shakil Ahmed2
1Department of Statistics, University of Dhaka, Dhaka-1000, Bangladesh.
2Department of Statistics, University of Rajshahi, Rajshahi-6205, Bangladesh.
✉ Corresponding author: [email protected] (Uddin, M.A.).
Abstract
The prediction of protein-protein interaction (PPI) sites is of vital importance in biology for understanding the physical and functional interactions between molecules in living systems. Several classification approaches exist for predicting PPI sites, and the naïve Bayes classifier is one of the most popular. However, the ordinary naïve Bayes classifier is sensitive to unusual protein sequence profiling feature data and sometimes gives ambiguous predictions. To overcome this problem, we modified the naïve Bayes classifier with a radial basis function (RBF) kernel for the prediction of PPI sites. We compared the performance of the proposed method (mNBC) with that of popular classifiers, namely linear discriminant analysis (LDA), the naïve Bayes classifier (NBC), support vector machine (SVM), AdaBoost and k-nearest neighbor (KNN), through analysis of protein sequence profiling data. The mNBC method achieved a sensitivity of 86%, a specificity of 81%, an accuracy of 83% and a Matthews correlation coefficient (MCC) of 65% for the prediction of PPI sites.
Key Words: Protein Sequences Profiling, PPI sites, Relative Solvent Accessibility (rSA), RBF Kernel and Naïve Bayes Classifier
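The abstract does not spell out how the RBF kernel enters the classifier, so the Python sketch below is only one possible reading and should not be taken as the authors' mNBC. It assumes that each class-conditional feature density is estimated by averaging Gaussian (RBF) kernels centred on the training values, and that the four reported measures are computed from a binary confusion matrix in the usual way. The names `KernelNaiveBayes`, `rbf_kernel_density`, `binary_metrics` and the fixed `bandwidth` value are hypothetical and chosen purely for illustration.

```python
import numpy as np


def rbf_kernel_density(x, samples, bandwidth):
    """Density at x as the mean of Gaussian (RBF) kernels centred on samples."""
    z = (x - samples) / bandwidth
    return np.mean(np.exp(-0.5 * z ** 2)) / (bandwidth * np.sqrt(2.0 * np.pi))


class KernelNaiveBayes:
    """Naive Bayes with one RBF kernel density estimate per feature and class.

    Assumption: the bandwidth is a fixed, hand-picked hyperparameter.
    """

    def __init__(self, bandwidth=0.5):
        self.bandwidth = bandwidth

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Keep the training rows of each class; the kernels are centred on them.
        self.samples_ = {c: X[y == c] for c in self.classes_}
        self.log_prior_ = {c: np.log(np.mean(y == c)) for c in self.classes_}
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        predictions = []
        for row in X:
            scores = []
            for c in self.classes_:
                train = self.samples_[c]
                # Naive independence assumption: add per-feature log densities.
                # The tiny constant avoids log(0) far from every training point.
                log_lik = sum(
                    np.log(rbf_kernel_density(row[j], train[:, j], self.bandwidth) + 1e-300)
                    for j in range(train.shape[1])
                )
                scores.append(self.log_prior_[c] + log_lik)
            predictions.append(self.classes_[int(np.argmax(scores))])
        return np.array(predictions)


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC, with label 1 as 'interaction site'."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(y_true),
        "mcc": (tp * tn - fp * fn) / denom if denom > 0 else 0.0,
    }
```

In use, one would call `KernelNaiveBayes().fit(X_train, y_train).predict(X_test)` on a feature matrix of sequence-profile values and pass the result to `binary_metrics`. A kernel density estimate makes no Gaussian assumption about each feature, which is one plausible reason to prefer it when the feature data contain unusual values.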
Article Citations:
MLA
Uddin and Ahmed. “Modified naive Bayes classifier for classification of protein-protein interaction sites”. Journal of Bioscience and Agriculture Research, 26(02), (2020): 2177-2184.
APA
Uddin, M. A. and Ahmed, M. S. (2020). Modified naive Bayes classifier for classification of protein-protein interaction sites. Journal of Bioscience and Agriculture Research, 26(02), 2177-2184.
Chicago
Uddin, M. A. and Ahmed, M. S. “Modified naive Bayes classifier for classification of protein-protein interaction sites”. Journal of Bioscience and Agriculture Research, 26(02), (2020): 2177-2184.
Harvard
Uddin, M. A. and Ahmed, M. S. 2020. Modified naive Bayes classifier for classification of protein-protein interaction sites. Journal of Bioscience and Agriculture Research, 26(02), pp. 2177-2184.
Vancouver
Uddin, MA and Ahmed, MS. Modified naive Bayes classifier for classification of protein-protein interaction sites. Journal of Bioscience and Agriculture Research, 2020 December 26(02): 2177-2184.
© 2020 The Authors. This article is freely available for anyone to read, share, download and print, and may be used and built upon without restriction, provided that the original author(s) and publisher are given due credit. All published articles are distributed under the Creative Commons Attribution 4.0 International License.
Journal of Bioscience and Agriculture Research EISSN 2312-7945.