A novel approach to estimate proximity in a random forest: An exploratory study
Englund, Cristofer. Viktoria Institute, Göteborg, Sweden. ORCID iD: 0000-0002-1043-8773
Verikas, Antanas. Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligenta system (IS-lab); Department of Electrical & Control Equipment, Kaunas University of Technology, Kaunas, Lithuania. ORCID iD: 0000-0003-2185-8973
2012 (English). In: Expert systems with applications, ISSN 0957-4174, E-ISSN 1873-6793, Vol. 39, no. 17, p. 13046-13050. Article in journal (Refereed). Published
Abstract [en]

A data proximity matrix is an important information source in random forest (RF) based data mining, including data clustering, visualization, outlier detection, substitution of missing values, and finding mislabeled data samples. A novel approach to estimate proximity is proposed in this work. The approach is based on measuring the distance between two terminal nodes in a decision tree. To assess the consistency (quality) of the data proximity estimate, we suggest using the proximity matrix as a kernel matrix in a support vector machine (SVM), under the assumption that a matrix of higher quality leads to higher classification accuracy. It is experimentally shown that the proposed approach improves the proximity estimate, especially when the RF is made of a small number of trees. It is also demonstrated that, for some tasks, an SVM exploiting the suggested proximity matrix based kernel outperforms an SVM based on a standard radial basis function kernel and the standard proximity matrix based kernel. © 2012 Elsevier Ltd. All rights reserved.
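To make the two notions in the abstract concrete, here is a minimal Python sketch. The first function computes the standard RF proximity (the fraction of trees in which two samples fall into the same terminal node). The second counts the edges on the path between two terminal nodes of one tree. The abstract does not state how the authors define or use the terminal-node distance, so the edge-count measure, the function names, and the `parent` map representation are all assumptions made for illustration, not the paper's method.

```python
from typing import Dict, List, Optional, Sequence


def standard_proximity(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Standard RF proximity matrix.

    leaves[i][t] is the id of the terminal node reached by sample i in tree t.
    Entry (i, j) of the result is the fraction of trees in which samples
    i and j end up in the same terminal node.
    """
    n_samples = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            shared = sum(
                1 for t in range(n_trees) if leaves[i][t] == leaves[j][t]
            )
            prox[i][j] = shared / n_trees
    return prox


def leaf_distance(parent: Dict[int, Optional[int]], a: int, b: int) -> int:
    """Number of edges on the path between nodes a and b of a single tree.

    parent maps every node id to its parent id (None for the root).
    ASSUMPTION: edge count is only one possible terminal-node distance;
    the abstract does not specify the measure used in the paper.
    """
    # Record how many steps each ancestor of a (including a) is away from a.
    steps_from_a: Dict[int, int] = {}
    node: Optional[int] = a
    steps = 0
    while node is not None:
        steps_from_a[node] = steps
        node = parent[node]
        steps += 1
    # Walk up from b until the first common ancestor is reached.
    node, steps = b, 0
    while node not in steps_from_a:
        node = parent[node]
        steps += 1
    return steps + steps_from_a[node]


# Toy data: 3 samples, 2 trees (hypothetical terminal-node ids).
toy_leaves = [[3, 5], [3, 6], [4, 6]]
toy_prox = standard_proximity(toy_leaves)

# Toy tree: root 0 with children 1 and 2; node 1 has terminal nodes 3 and 4.
toy_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}
sibling_distance = leaf_distance(toy_parent, 3, 4)
far_distance = leaf_distance(toy_parent, 3, 2)
```

With scikit-learn, the terminal-node ids could be obtained from `RandomForestClassifier.apply(X)`, and a proximity matrix could be passed to `sklearn.svm.SVC(kernel="precomputed")` to use it as a kernel matrix, as the abstract suggests. The standard proximity treats two samples in sibling terminal nodes the same as two samples in opposite halves of the tree (both count as zero for that tree), whereas a path-based distance distinguishes the two cases.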

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2012. Vol. 39, no. 17, p. 13046-13050
Keywords [en]
Random forest, Proximity matrix, Support vector machine, Kernel matrix, Data mining
National Category (HSV)
Identifiers
URN: urn:nbn:se:hh:diva-19380
DOI: 10.1016/j.eswa.2012.05.094
ISI: 000308449300031
Scopus ID: 2-s2.0-84865043451
OAI: oai:DiVA.org:hh-19380
DiVA, id: diva2:548335
Available from: 2012-08-30. Created: 2012-08-30. Last updated: 2018-03-22. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Scopus

Person records

Englund, Cristofer; Verikas, Antanas
