
LDAEXC: LncRNA-Disease Associations Prediction with Deep Autoencoder and XGBoost Classifier.

Posted: 2023-12-08 09:57:12
Tags: Associations, lncRNA, diseases, XGBoost, lncRNAs, LncRNA, LDAEXC, features

Authors: Lu Cuihong; Xie Minzhu
Affiliation: College of Information Science and Engineering, Hunan Normal University, Changsha, China (both authors)
DOI: 10.1007/S12539-023-00573-Z
Sources: PubMed; Springer Nature

Abstract

A large body of scientific evidence has shown that long non-coding RNAs (lncRNAs) are involved in the progression of complex human diseases and in biological processes. Identifying novel, potentially disease-related lncRNAs therefore helps with the diagnosis, prognosis and therapy of many complex human diseases. Because traditional laboratory experiments are costly and time-consuming, many computational algorithms have been proposed for predicting the relationships between lncRNAs and diseases. However, there is still much room for improvement. In this paper, we introduce an accurate framework named LDAEXC to infer LncRNA-Disease Associations with a deep autoencoder and an XGBoost Classifier. LDAEXC uses different similarity views of lncRNAs and human diseases to construct features for each data source. The constructed feature vectors are then fed into a deep autoencoder to obtain reduced features, and finally an XGBoost classifier uses the reduced features to compute latent lncRNA-disease association scores. Fivefold cross-validation experiments on four datasets showed that LDAEXC reached AUC scores of 0.9676 ± 0.0043, 0.9449 ± 0.022, 0.9375 ± 0.0331 and 0.9556 ± 0.0134, respectively, significantly higher than those of other advanced computational methods of the same kind. Extensive experimental results and case studies of two complex diseases (colon and breast cancers) further indicated the practicability and excellent prediction performance of LDAEXC in inferring unknown lncRNA-disease associations.

LDAEXC uses disease semantic similarity, lncRNA expression similarity, and Gaussian interaction profile kernel similarity of lncRNAs and diseases for feature construction. The constructed features are fed to a deep autoencoder to extract reduced features, and an XGBoost classifier predicts the lncRNA-disease associations from the reduced features. Fivefold and tenfold cross-validation experiments on a benchmark dataset showed that LDAEXC achieved AUC scores of 0.9676 and 0.9682, respectively, significantly higher than those of other state-of-the-art methods of the same kind.
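To make one of the similarity views named in the abstract more concrete, here is a minimal sketch of a Gaussian interaction profile (GIP) kernel similarity computed from a binary lncRNA-disease association matrix. It follows the commonly used GIP formulation, with the bandwidth normalized by the mean squared norm of the interaction profiles; the abstract does not give the exact formula or parameters LDAEXC uses, so the function name `gip_kernel_similarity`, the bandwidth choice, and the toy matrix `A` are all assumptions made for illustration.

```python
import numpy as np


def gip_kernel_similarity(assoc: np.ndarray) -> np.ndarray:
    """GIP kernel similarity between the rows of a binary association matrix.

    Each row is treated as one entity's interaction profile. The bandwidth
    is 1 / (mean squared profile norm), a common convention that is assumed
    here and not taken from the LDAEXC paper.
    """
    profiles = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        # All profiles are empty, so all pairwise distances are zero.
        return np.ones((len(profiles), len(profiles)))
    gamma = 1.0 / mean_sq
    # Squared Euclidean distance between every pair of profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-gamma * sq_dists)


# Hypothetical toy data: 3 lncRNAs (rows) x 3 diseases (columns).
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])

lnc_sim = gip_kernel_similarity(A)    # lncRNA-lncRNA similarity
dis_sim = gip_kernel_similarity(A.T)  # disease-disease similarity
print(np.round(lnc_sim, 4))
```

In a pipeline like the one the abstract describes, matrices such as `lnc_sim` and `dis_sim` would be combined with the other similarity views into a feature vector for each lncRNA-disease pair before dimensionality reduction; how LDAEXC combines them is not specified in the abstract.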

From: https://www.cnblogs.com/wangprince2017/p/17884511.html
