Fuzzy kernel evidence Random Forest for identifying pseudouridine sites
2024 (English). In: Briefings in Bioinformatics, ISSN 1467-5463, E-ISSN 1477-4054, Vol. 25, no 3, p. 1-14, article id bbae169
Article in journal (Refereed) Published
Abstract [en]
Pseudouridine is an RNA modification that is widely distributed in both prokaryotes and eukaryotes and plays a critical role in numerous biological activities. Despite its importance, precisely identifying pseudouridine sites through experimental approaches is challenging and requires substantial time and resources. There is therefore a growing need for computational techniques that can reliably and quickly identify pseudouridine sites from vast amounts of RNA sequencing data. In this study, we propose a fuzzy kernel evidence Random Forest (FKeERF) and use it to build a predictor, PseU-FKeERF, which identifies pseudouridine sites from RNA sequencing data with high accuracy. PseU-FKeERF selects four RNA feature coding schemes with relatively good performance, combines their features, and feeds the combined features into the newly proposed FKeERF method for category prediction. FKeERF uses fuzzy logic to expand the original feature space and combines this with kernel methods, which are generally easy to interpret, for category prediction. Both cross-validation tests and independent tests on benchmark datasets show that PseU-FKeERF has better predictive performance than several state-of-the-art methods. The new method not only improves the accuracy of pseudouridine site identification but also provides a reference for future disease control and related drug development. © The Author(s) 2024. Published by Oxford University Press.
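As a rough illustration of the two ideas named in the abstract (expanding a feature space with fuzzy logic, then comparing samples with a kernel), the following minimal Python sketch may help. It is not the authors' implementation: the nucleotide-composition encoding, the Gaussian membership functions, the three fuzzy set centers, the width, and the function names (`composition`, `fuzzy_expand`, `rbf_kernel`) are all assumptions chosen for illustration, and the evidence Random Forest step is omitted entirely.

```python
import math

NUCLEOTIDES = "ACGU"


def composition(sequence):
    """Nucleotide composition: frequency of A, C, G, U in the sequence.

    Assumption: a simple stand-in for an RNA feature coding scheme;
    the abstract does not say which four schemes the paper uses.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return [seq.count(base) / len(seq) for base in NUCLEOTIDES]


def gaussian_membership(value, center, width):
    """Degree (0..1) to which value belongs to a fuzzy set around center."""
    return math.exp(-((value - center) ** 2) / (2.0 * width ** 2))


def fuzzy_expand(features, centers=(0.0, 0.5, 1.0), width=0.25):
    """Replace each feature by its memberships in 'low', 'medium', 'high'.

    Assumption: three Gaussian fuzzy sets per feature, so a vector of
    length n becomes a vector of length 3 * n.
    """
    expanded = []
    for value in features:
        expanded.extend(gaussian_membership(value, c, width) for c in centers)
    return expanded


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel similarity between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    a = fuzzy_expand(composition("AUGCUUAGC"))
    b = fuzzy_expand(composition("UUUUAUUGU"))
    print(len(a), rbf_kernel(a, b))
```

In a fuller pipeline, the expanded vectors or the kernel similarities would be passed to a classifier; how FKeERF actually does this is described in the article itself, not here.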
Place, publisher, year, edition, pages
Oxford: Oxford University Press, 2024. Vol. 25, no 3, p. 1-14, article id bbae169
Keywords [en]
evidence Random Forest, fuzzy feature set, kernel method, pseudouridine sites, RNA sequences
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hh:diva-53330
DOI: 10.1093/bib/bbae169
PubMedID: 38622357
Scopus ID: 2-s2.0-85190762553
OAI: oai:DiVA.org:hh-53330
DiVA, id: diva2:1860119
Note
This work is funded by the National Natural Science Foundation of China (NSFC 32270786, 62172076 and U22A2038), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY23F020003) and the Municipal Government of Quzhou, China (2023D036).
2024-05-23, 2024-05-23, 2024-05-23. Bibliographically approved