CCPortal
DOI: 10.1186/s13321-018-0263-1
OPERA models for predicting physicochemical properties and environmental fate endpoints
Mansouri, Kamel (1,2,3); Grulke, Chris M. (1); Judson, Richard S. (1); Williams, Antony J. (1)
Publication date: 2018-03-08
ISSN: 1758-2946
Volume: 10
Abstract

The collection of chemical structure information and associated experimental data for quantitative structure-activity/property relationship (QSAR/QSPR) modeling is facilitated by an increasing number of public databases containing large amounts of useful data. However, the performance of QSAR models depends heavily on the quality of the data and the modeling methodology used. This study aims to develop robust QSAR/QSPR models for chemical properties of environmental interest that can be used for regulatory purposes. It primarily uses data from the publicly available PHYSPROP database, consisting of a set of 13 common physicochemical and environmental fate properties. These datasets have undergone extensive curation using an automated workflow to select only high-quality data, and the chemical structures were standardized prior to calculation of the molecular descriptors. The modeling procedure was developed based on the five Organization for Economic Cooperation and Development (OECD) principles for QSAR models. A weighted k-nearest neighbor approach was adopted using a minimum number of required descriptors calculated using PaDEL, an open-source software package. Genetic algorithms selected only the most pertinent and mechanistically interpretable descriptors (2-15, with an average of 11 descriptors). The sizes of the modeled datasets varied from 150 chemicals for biodegradability half-life to 14,050 chemicals for logP, with an average of 3222 chemicals across all endpoints. The optimal models were built on randomly selected training sets (75%) and validated using fivefold cross-validation (CV) and test sets (25%). The CV Q² of the models varied from 0.72 to 0.95, with an average of 0.86, and the test-set R² varied from 0.71 to 0.96, with an average of 0.82. Modeling and performance details are described in the QSAR model reporting format (QMRF) and were validated by the European Commission's Joint Research Center to be OECD compliant. 
All models are freely available as an open-source, command-line application called OPEn structure-activity/property Relationship App (OPERA). OPERA models were applied to more than 750,000 chemicals to produce freely available predicted data on the U.S. Environmental Protection Agency's CompTox Chemistry Dashboard.
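
The validation scheme summarized in the abstract (a distance-weighted k-nearest neighbor model, a random 75%/25% train/test split, fivefold cross-validation, and the Q² and R² statistics) can be illustrated with a minimal sketch. This is not OPERA's code: the data are synthetic two-feature points standing in for PaDEL descriptors, the inverse-distance weighting and k = 5 are assumptions, the genetic-algorithm descriptor selection is omitted, and all function names are hypothetical. Q² is computed here as one minus the ratio of the cross-validated residual sum of squares to the total sum of squares, which is one common definition.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=5):
    """Distance-weighted k-nearest-neighbor prediction for one query point."""
    nearest = sorted((math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y))[:k]
    # Inverse-distance weights (an assumed scheme); epsilon avoids division by zero.
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validated_q2(X, y, k=5, folds=5):
    """Q2 from k-fold cross-validation: each point is predicted while held out."""
    preds = [0.0] * len(X)
    for f in range(folds):
        held_out = set(range(f, len(X), folds))
        tr_X = [X[i] for i in range(len(X)) if i not in held_out]
        tr_y = [y[i] for i in range(len(X)) if i not in held_out]
        for i in held_out:
            preds[i] = knn_predict(tr_X, tr_y, X[i], k)
    return r_squared(y, preds)


# Synthetic stand-in data: two "descriptors" and a noisy linear "property".
rng = random.Random(0)
X = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(400)]
y = [2.0 * a + 3.0 * b + rng.gauss(0, 0.05) for a, b in X]

# The points are generated in random order, so slicing gives a random 75/25 split.
split = int(0.75 * len(X))
train_X, test_X = X[:split], X[split:]
train_y, test_y = y[:split], y[split:]

q2 = cross_validated_q2(train_X, train_y)
r2_test = r_squared(test_y, [knn_predict(train_X, train_y, x) for x in test_X])
print(f"CV Q2 = {q2:.2f}, test R2 = {r2_test:.2f}")
```

Reporting both statistics, as the abstract does, separates two questions: Q² measures how well the model predicts training chemicals it has not seen within a fold, while the test-set R² measures performance on data held out from model building entirely.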


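Keywords: OPERA; QSAR/QSPR; Physicochemical properties; Environmental fate; OECD principles; Open data; Open source; Model validation; QMRF
Language: English
WOS accession number: WOS:000427171600001
Source journal: Journal of Cheminformatics
Source institution: U.S. Environmental Protection Agency
Document type: Journal article
Item identifier: http://gcip.llas.ac.cn/handle/2XKMVOVA/58949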
Author affiliations:
1. US EPA, Natl Ctr Computat Toxicol, Off Res & Dev, Res Triangle Pk, NC 27711 USA;
2. Oak Ridge Inst Sci & Educ, 1299 Bethel Valley Rd, Oak Ridge, TN 37830 USA;
3. ScitoVation LLC, 6 Davis Dr, Res Triangle Pk, NC 27709 USA
Recommended citation
GB/T 7714
Mansouri, Kamel, Grulke, Chris M., Judson, Richard S., et al. OPERA models for predicting physicochemical properties and environmental fate endpoints[J]. Journal of Cheminformatics, 2018, 10.
APA Mansouri, Kamel, Grulke, Chris M., Judson, Richard S., & Williams, Antony J. (2018). OPERA models for predicting physicochemical properties and environmental fate endpoints. Journal of Cheminformatics, 10.
MLA Mansouri, Kamel, et al. "OPERA models for predicting physicochemical properties and environmental fate endpoints". Journal of Cheminformatics 10 (2018).