[2017, Issue 2] Deep Learning for Chemoinformatics (Part 2)
CLC number: TP301    Document code: A
Deep learning for chemoinformatics
XU Youjun, PEI Jianfeng
Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
Abstract: Deep learning has been successfully used in computer vision, speech recognition, and natural language processing, leading to the rapid development of artificial intelligence. The key technologies of deep learning have also been applied to chemoinformatics, speeding up the implementation of artificial intelligence in chemistry. As developing quantitative structure-activity relationship (QSAR) models is one of the major tasks of chemoinformatics, this review focuses on the application of deep learning to QSAR research, discussing how three kinds of deep learning frameworks, namely deep neural networks, convolutional neural networks, and recurrent or recursive neural networks, have been applied in QSAR. A perspective on the future impact of deep learning on chemoinformatics is also given.
Key words: deep learning; artificial intelligence; quantitative structure-activity relationship; chemoinformatics
Citation: XU Y J, PEI J F. Deep learning for chemoinformatics[J]. Big Data Research, 2017, 3(2): 45-66.
4 Comparison and analysis of deep learning frameworks
Table 1 summarizes the applications of deep neural network frameworks to QSAR. From it, current QSAR research under deep learning frameworks shows the following characteristics.
● As data sets grow in number and diversity, researchers increasingly adopt multi-task training strategies; the concept of transfer learning within multi-task learning has been applied to tasks with little data, improving predictive performance on those tasks. Most multi-task models are evaluated with AUC, indicating that multi-task models are currently suited only to classification problems; for multi-task regression, better training methods and strategies remain to be developed.
● ReLU is currently the most widely used training technique in QSAR, appearing in essentially all DNN and CNN frameworks. Developing better and faster training techniques remains an open direction.
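The multi-task setup described above can be sketched in a few lines. The following is a minimal illustration rather than any specific published architecture: a shared ReLU hidden layer feeds two task-specific sigmoid heads, so two classification tasks share one learned representation. The layer sizes and the random, untrained weights are assumptions made purely for the sketch.

```python
import numpy as np

def relu(x):
    # Rectified linear unit, the activation most commonly used in QSAR DNN/CNN models
    return np.maximum(0.0, x)

def sigmoid(z):
    # Maps head outputs to probabilities, matching AUC-based classification evaluation
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 input features per molecule, 16 shared hidden units
W_shared = rng.normal(size=(8, 16))
# Two task-specific heads (e.g., two assay endpoints); the shared layer is what
# lets information learned on a data-rich task transfer to a data-poor one
W_task_a = rng.normal(size=(16, 1))
W_task_b = rng.normal(size=(16, 1))

def forward(x):
    h = relu(x @ W_shared)  # shared representation used by both tasks
    return sigmoid(h @ W_task_a), sigmoid(h @ W_task_b)

x = rng.normal(size=(4, 8))  # a batch of 4 hypothetical molecules
p_a, p_b = forward(x)
print(p_a.shape, p_b.shape)  # (4, 1) (4, 1)
```

In training, each head's loss would be backpropagated through its own weights and the shared layer, which is how the multi-task coupling arises.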
Regarding molecular encoding techniques in deep learning, we find that atom-level feature input is gradually replacing input based on molecular descriptors or fingerprints. This shows that deep learning is capable of extracting, from the atomic level, information that supports molecule-level prediction, confirming its powerful feature-extraction ability. What is still lacking, however, is in-depth analysis of these learned features. Researchers currently rely mainly on specially designed experiments to visualize the molecular fragments in the hidden layers that relate to the target property, rather than analyzing hidden-layer features directly from the high-quality QSAR models themselves; research in this direction needs to be strengthened.
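As a toy illustration of what atom-level input means compared with a precomputed descriptor or fingerprint vector, the sketch below one-hot encodes each heavy atom of a SMILES string. This is a deliberately naive tokenizer written only for illustration; a real pipeline would use a cheminformatics toolkit such as RDKit and far richer per-atom features, and the atom vocabulary here is an arbitrary assumption.

```python
# Hypothetical organic-subset atom vocabulary for this toy example
ATOM_VOCAB = ["C", "N", "O", "S", "F"]

def atom_features(smiles):
    """Return one one-hot vector per recognized atom symbol.

    Naive single-character scan: it ignores rings, bonds, brackets, and
    two-letter elements, so it is illustrative only.
    """
    feats = []
    for ch in smiles:
        if ch in ATOM_VOCAB:
            feats.append([1 if ch == a else 0 for a in ATOM_VOCAB])
    return feats

# Ethanol ("CCO"): two carbons and one oxygen give three per-atom vectors
features = atom_features("CCO")
print(features)
```

A graph-convolution model would then combine such per-atom vectors along bond connectivity, letting the network learn fragment-level patterns itself instead of relying on a fixed fingerprint.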
5 Summary and outlook
In summary, because chemical molecules are numerous and structurally diverse, traditional algorithms often fall short, and deep learning outperforms traditional machine learning mainly because it is a form of multi-layer representation learning built from simple, nonlinear modules: each module transforms a representation (starting from raw or near-raw input) into a higher-level, more abstract one. The key point is that these abstract features are not hand-designed but learned automatically by the model from large amounts of data. This capability proves especially adept, and more intelligent, when facing the large volumes of experimental data in chemistry. Judging from current applications, although deep learning is already widely used in speech processing, computer vision, and natural language processing, its application in QSAR, and in chemoinformatics more broadly, is still at a preliminary stage; yet the successes so far point to a bright future for deep learning in chemistry. In terms of problem complexity, developing multi-task QSAR models used to be very difficult, but it becomes relatively simple with deep learning, and the resulting models perform outstandingly. In encoding QSAR models, preliminary findings show that features designed with chemical expertise (such as molecular descriptors) are no longer so important: high-quality QSAR models can be built from very simple atom-level information alone. This is undoubtedly due to deep learning's powerful feature-learning ability. Moreover, these features can even be mapped in the hidden layers to real compound substructures, such as the toxicophores involved in DeepTox and the property-related fragments identified by the NGF method, advancing research on the interpretability of deep learning in QSAR. Deep neural networks are a framework well suited to "perception"; combining perception-oriented deep learning with reasoning-centered Bayesian neural networks to form a "perception, reasoning, decision" paradigm would accelerate the development of deep-learning-based drug design.
Several key scientific problems remain to be solved in applying deep learning to chemoinformatics: how to further mitigate overfitting and speed up the training of deep neural networks; how to develop encoding methods and network architectures better suited to two- and three-dimensional molecular structure information, along with hyperparameter optimization algorithms and multi-objective deep learning algorithms; and how to accurately predict compounds' interactions with biological networks and their biological activities. Efficiently processing unstructured chemistry-related text and image data is another key problem to be solved. The powerful capacity of deep learning to process and understand data also offers a possible new route toward a better understanding of the physicochemical nature of chemical molecular structures.
XU Youjun (1990- ), male, Ph.D. candidate at the Academy for Advanced Interdisciplinary Studies, Peking University. His research interests include drug design and drug informatics.
PEI Jianfeng (1975- ), male, Ph.D., distinguished research fellow at the Academy for Advanced Interdisciplinary Studies, Peking University. His research interests include drug design and drug informatics.