Fuzzy Neural Tangent Kernel Model for Identifying DNA N4-methylcytosine Sites
2024 (English). In: IEEE transactions on fuzzy systems, ISSN 1063-6706, E-ISSN 1941-0034, Vol. 14, no. 9, p. 5259-5271. Article in journal (Refereed). Published
Abstract [en]
Identifying DNA N4-methylcytosine (4mC) sites is an important task in bioinformatics, to which machine learning methods have been applied effectively. Because of noise, existing deep learning methods for detecting 4mC consistently show low recognition rates on positive samples. With fuzzy rules and membership functions, fuzzy systems can achieve good results in processing noisy signals. Traditional fuzzy systems, however, lack deep feature representation and sample measurement, so we introduce novel techniques to enhance generalization and feature representation. By incorporating the neural tangent kernel (NTK) and a kernel learning algorithm into the fuzzy system, we propose the fuzzy neural tangent kernel (FNTK) model and the radius-based FNTK (R-FNTK) model to predict DNA 4mC sites. To achieve better generalization performance than traditional kernel functions, we first train the NTK for feature representation learning and sample measurement. Based on the membership function and the NTK matrix, a different fuzzy kernel matrix is then constructed for each fuzzy subset of the fuzzy system. Finally, we use two types of iterative kernel optimization algorithms to fuse the multiple NTK-based fuzzy kernels and obtain the final prediction model. Tests on six benchmark datasets demonstrate the superiority of our approach, with significant improvements in predictive performance. © IEEE
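The pipeline outlined in the abstract (an NTK matrix, one fuzzy kernel per fuzzy subset, then kernel fusion) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the one-hot encoding, the empirical NTK of an untrained two-layer ReLU network (the abstract trains the NTK), the Gaussian membership functions with centers picked from the samples, and the non-iterative kernel-target alignment weights that stand in for the paper's two iterative kernel optimization algorithms. All function names are hypothetical.

```python
import numpy as np

def one_hot_encode(seqs):
    """One-hot encode equal-length DNA strings into flat, unit-norm vectors (assumed encoding)."""
    alphabet = "ACGT"
    X = np.zeros((len(seqs), len(seqs[0]) * 4))
    for i, s in enumerate(seqs):
        for j, ch in enumerate(s):
            X[i, 4 * j + alphabet.index(ch)] = 1.0
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def empirical_ntk(X, width=256, seed=0):
    """Empirical NTK of f(x) = a . relu(W x) / sqrt(width) at random initialization.

    K[i, j] is the inner product of the parameter gradients of f at x_i and x_j.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((width, X.shape[1]))
    a = rng.standard_normal(width)
    pre = X @ W.T                        # pre-activations, shape (n, width)
    act = np.maximum(pre, 0.0)           # gradient with respect to a (up to scaling)
    gate = (pre > 0).astype(float) * a   # a_j * 1[w_j . x > 0], from the gradient w.r.t. W
    return (act @ act.T + (gate @ gate.T) * (X @ X.T)) / width

def gaussian_memberships(X, n_subsets=3, sigma=1.0):
    """Membership of every sample in each fuzzy subset (assumed Gaussian, sample-based centers)."""
    centers = X[np.linspace(0, len(X) - 1, n_subsets).astype(int)]
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))   # shape (n, n_subsets)

def fuzzy_kernels(K, mu):
    """One kernel per fuzzy subset: K_c[i, j] = mu_c(x_i) * mu_c(x_j) * K[i, j]."""
    return [np.outer(mu[:, c], mu[:, c]) * K for c in range(mu.shape[1])]

def alignment_weights(kernels, y):
    """Kernel-target alignment weights (heuristic stand-in, not the paper's optimizer)."""
    Y = np.outer(y, y)
    w = np.array([(Kc * Y).sum() / (np.linalg.norm(Kc) * np.linalg.norm(Y)) for Kc in kernels])
    w = np.clip(w, 0.0, None)
    if w.sum() == 0.0:                   # fall back to uniform weights
        w = np.ones(len(kernels))
    return w / w.sum()

def fuse(kernels, weights):
    """Weighted sum of the fuzzy kernels."""
    return sum(w * Kc for w, Kc in zip(weights, kernels))
```

The fused matrix could then be passed to any kernel classifier that accepts a precomputed kernel. Each fuzzy kernel is the elementwise product of two positive semidefinite matrices, so the fused kernel remains positive semidefinite as long as the weights are nonnegative.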
Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2024. Vol. 14, no. 9, p. 5259-5271
Keywords [en]
Biological system modeling, Brain modeling, DNA, DNA N4-methylcytosine, Fuzzy model, Fuzzy systems, Kernel, Neural tangent kernel, Predictive models, Sequence classification, Support vector machines
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-54352
DOI: 10.1109/TFUZZ.2024.3425616
Scopus ID: 2-s2.0-85198307928
OAI: oai:DiVA.org:hh-54352
DiVA, id: diva2:1886495
Note
Funding: The National Natural Science Foundation of China (NSFC 62250028, 62322215, 62172076 and U22A2038), the Excellent Young Scientists Fund in Hunan Province (Grant No. 2022JJ20077), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY23F020003) and the Municipal Government of Quzhou (2023D038).
2024-08-01, 2024-08-01, 2025-10-01. Bibliographically approved