CCPortal
DOI: 10.1016/j.ecoinf.2019.05.003
Classification and regression with random forests as a standard method for presence-only data SDMs: A future conservation example using China tree species
Zhang, Lei1; Huettmann, Falk2; Liu, Shirong3; Sun, Pengsen3; Yu, Zhen4; Zhang, Xudong1; Mi, Chunrong5
Publication date: 2019
ISSN: 1574-9541
EISSN: 1878-0512
Volume: 52; Pages: 46-56
Abstract

The random forests (RF) algorithm is a superb learner and classifier in machine learning applications. This ensemble model is also one of the most popular species distribution model algorithms (SDMs) available to date. RF by default can produce categorical and numerical species distribution maps based on its classification tree (CT) and regression tree (RT) algorithms, respectively. Statistically, CT can also produce numerical predictions (class probability). Many real-world applications (e.g. conservation planning) employ binary presence-absence outputs that use classification thresholds to make these conversions. However, there is little available information regarding the difference in model performance between CT and RT for inference settings. Here, under an ensemble modeling framework, 52 forest tree species with presence-only data for all of China were selected for comparison of the performance of CT and RT algorithms in projecting the distribution and potential range shifts of these species under current and future climates. Five climatic variables were used to develop CT and RT models. Eight threshold-setting approaches were employed to convert numerical predictions into binary predictions. With regard to probabilistic predictions, the relative performance of CT and RT depended on the choice of evaluation criteria. For both RT and CT, threshold-setting methods significantly altered the determination of thresholds, model performance, and subsequently projections of species range shifts under climate change. The four threshold selection methods (MaxKappa, MaxOA, MaxTSS, and MinROCdist) based on the composite model accuracy measures most often achieved significantly higher model performance than the CT default threshold method and the other threshold methods. They consistently projected that species' geographical ranges changed in response to climate change in the same direction and with the same magnitude.
We argue for choosing RT rather than CT as the SDM if model discrimination capacity (the ability to differentiate between occurrences of presence and absence) is viewed as more important than model reliability (the agreement between predicted relative indexes of occurrence and observed proportions of occurrence), and vice versa. In line with gradient theory, we recommend the use of numerical predictions for species distribution modeling since they convey more information than binary predictions. Binary conversion of model outputs should only be carried out when it is clearly justified by the application's objective. The four aforementioned threshold methods are promising objective methods for binary conversions of continuous predictions when presence-only data are available. This study proposes guidelines on how machine learning can be used for specific applied and theoretical applications in an SDM context.
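
As an illustration of the binary conversion step the abstract describes, the following is a minimal sketch of how the four accuracy-based thresholds (MaxKappa, MaxOA, MaxTSS, MinROCdist) might be selected from continuous predictions. It is not the authors' implementation: the function names, the rule that a score at or above the threshold counts as presence, the use of the observed scores as candidate thresholds, and the toy data are all assumptions made for this example. With presence-only data, the 0 labels would stand for pseudo-absence or background points.

```python
import math


def confusion(scores, labels, threshold):
    """Count TP, FP, TN, FN, treating score >= threshold as predicted presence."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy_measures(tp, fp, tn, fn):
    """Return Cohen's kappa, overall accuracy, TSS and ROC-corner distance."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    overall_accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (overall_accuracy - chance) / (1 - chance) if chance < 1 else 0.0
    return {
        "kappa": kappa,
        "oa": overall_accuracy,
        "tss": sensitivity + specificity - 1,
        # Distance from the ROC point to the perfect corner (FPR=0, TPR=1).
        "roc_dist": math.hypot(1 - sensitivity, 1 - specificity),
    }


def select_thresholds(scores, labels):
    """Pick one threshold per criterion; candidates are the observed scores."""
    if 0 not in labels or 1 not in labels:
        raise ValueError("labels must contain both presences (1) and absences (0)")
    rows = []
    for t in sorted(set(scores)):
        rows.append((t, accuracy_measures(*confusion(scores, labels, t))))
    return {
        "MaxKappa": max(rows, key=lambda r: r[1]["kappa"])[0],
        "MaxOA": max(rows, key=lambda r: r[1]["oa"])[0],
        "MaxTSS": max(rows, key=lambda r: r[1]["tss"])[0],
        "MinROCdist": min(rows, key=lambda r: r[1]["roc_dist"])[0],
    }


# Hypothetical toy data: predicted indexes of occurrence and 0/1 observations.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
toy_thresholds = select_thresholds(toy_scores, toy_labels)
print(toy_thresholds)
```

The same function applies to either kind of numerical output (CT class probabilities or RT predictions), since it only needs a score per site. On real data the four criteria can return different thresholds, and ties are resolved here in favor of the lowest candidate.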


WOS research area: Environmental Sciences & Ecology
Source journal: ECOLOGICAL INFORMATICS
Document type: Journal article
Item identifier: http://gcip.llas.ac.cn/handle/2XKMVOVA/99955
Author affiliations:
1. Chinese Acad Forestry, Res Inst Forestry, Beijing 10091, Peoples R China;
2. UAF, Dept Biol & Wildlife, Inst Arctic Biol, EWHALE LAB, Fairbanks, AK USA;
3. Chinese Acad Forestry, Res Inst Forest Ecol Environm & Protect, Key Lab Forest Ecol & Environm State Forestry & G, Beijing 10091, Peoples R China;
4. Iowa State Univ Sci & Technol, Dept Ecol Evolut & Organismal Biol, Ames, IA 50011 USA;
5. Chinese Acad Sci, Inst Zool, Beijing 100101, Peoples R China
Recommended citation
GB/T 7714
Zhang, Lei,Huettmann, Falk,Liu, Shirong,et al. Classification and regression with random forests as a standard method for presence-only data SDMs: A future conservation example using China tree species[J],2019,52:46-56.
APA Zhang, Lei.,Huettmann, Falk.,Liu, Shirong.,Sun, Pengsen.,Yu, Zhen.,...&Mi, Chunrong.(2019).Classification and regression with random forests as a standard method for presence-only data SDMs: A future conservation example using China tree species.ECOLOGICAL INFORMATICS,52,46-56.
MLA Zhang, Lei,et al."Classification and regression with random forests as a standard method for presence-only data SDMs: A future conservation example using China tree species".ECOLOGICAL INFORMATICS 52(2019):46-56.
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.