Fuzzy Neural Tangent Kernel Model for Identifying DNA N4-methylcytosine Sites
2024 (English). In: IEEE transactions on fuzzy systems, ISSN 1063-6706, E-ISSN 1941-0034, Vol. 32, no 9, p. 5259-5271. Article in journal (Refereed), Published
Abstract [en]
Identifying DNA N4-methylcytosine (4mC) sites is an important task in bioinformatics, and machine learning methods have been applied to it effectively. Because of noise, existing deep learning methods for detecting 4mC sites have consistently low recognition rates on positive samples. With fuzzy rules and membership functions, fuzzy systems can achieve good results in processing noisy signals. Traditional fuzzy systems, however, lack deep feature representation and sample measurement, so we introduce techniques to enhance generalization and feature representation. By incorporating the neural tangent kernel (NTK) and a kernel learning algorithm into the fuzzy system, we propose the fuzzy neural tangent kernel (FNTK) model and the radius-based FNTK (R-FNTK) model to predict DNA 4mC sites. To achieve better generalization performance than traditional kernel functions, we first train the NTK for feature representation learning and sample measurement. Based on the membership function and the NTK matrix, a separate fuzzy kernel matrix is constructed for each fuzzy subset of the fuzzy system. Finally, we use two types of iterative kernel optimization algorithms to fuse the multiple NTK-based fuzzy kernels and obtain the final prediction model. Testing on six benchmark datasets demonstrates the superiority of our approach, with significant performance improvements. © IEEE
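As a rough illustration of the pipeline the abstract describes (a base NTK matrix, one fuzzy kernel per fuzzy subset, then fusion), the following Python sketch may help. It is not the authors' implementation, and every function name in it is hypothetical. It rests on several assumptions: the closed-form NTK of an infinitely wide two-layer ReLU network (up to a constant factor) stands in for the trained NTK, memberships are normalized Gaussians around hand-picked centers, each fuzzy kernel is the NTK matrix weighted by the outer product of one subset's memberships, and fusion uses fixed weights rather than the iterative kernel optimization mentioned in the abstract.

```python
import numpy as np

def relu_ntk(X, Z):
    """Closed-form NTK of an infinitely wide two-layer ReLU net (up to scaling).
    Assumption: a stand-in for the trained NTK described in the abstract."""
    norms = np.linalg.norm(X, axis=1, keepdims=True) * np.linalg.norm(Z, axis=1, keepdims=True).T
    u = np.clip((X @ Z.T) / np.maximum(norms, 1e-12), -1.0, 1.0)  # cosine similarity
    k0 = (np.pi - np.arccos(u)) / np.pi
    k1 = (u * (np.pi - np.arccos(u)) + np.sqrt(1.0 - u**2)) / np.pi
    return norms * (u * k0 + k1)

def gaussian_memberships(X, centers, width):
    """Normalized Gaussian membership of each sample in each fuzzy subset (assumed form)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    m = np.exp(-d2 / (2.0 * width**2))
    return m / m.sum(axis=1, keepdims=True)

def fuzzy_kernels(K, M):
    """One kernel per fuzzy subset: K weighted elementwise by that subset's memberships.
    Assumption: the outer-product weighting keeps each matrix positive semidefinite."""
    return [np.outer(M[:, r], M[:, r]) * K for r in range(M.shape[1])]

def fuse(kernels, weights):
    """Convex combination of kernels with fixed weights (a simplification;
    the abstract describes iterative optimization of the fusion instead)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * Ki for wi, Ki in zip(w, kernels))

# Toy data: 8 samples with 5 features, 2 fuzzy subsets centered on the first two samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))
K = relu_ntk(X, X)
M = gaussian_memberships(X, centers=X[:2], width=1.0)
K_fused = fuse(fuzzy_kernels(K, M), weights=[1.0, 1.0])
```

The fused matrix could then be passed to any kernel classifier that accepts a precomputed kernel, such as a support vector machine.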
Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2024. Vol. 32, no 9, p. 5259-5271
Keywords [en]
Biological system modeling, Brain modeling, DNA, DNA N4-methylcytosine, Fuzzy model, Fuzzy systems, Kernel, Neural tangent kernel, Predictive models, Sequence classification, Support vector machines
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-54352
DOI: 10.1109/TFUZZ.2024.3425616
Scopus ID: 2-s2.0-85198307928
OAI: oai:DiVA.org:hh-54352
DiVA, id: diva2:1886495
Note
Funding: The National Natural Science Foundation of China (NSFC 62250028, 62322215, 62172076 and U22A2038), the Excellent Young Scientists Fund in Hunan Province (Grant No. 2022JJ20077), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY23F020003) and the Municipal Government of Quzhou (2023D038).
2024-08-01, 2024-08-01, 2024-10-01. Bibliographically approved