By Xuan Zhang, Department of Computer Science and Engineering, Texas A&M University, USA | Limei Wang, Department of Computer Science and Engineering, Texas A&M University, USA | Jacob Helwig, Department of Computer Science and Engineering, Texas A&M University, USA | Youzhi Luo, Department of Computer Science and Engineering, Texas A&M University, USA | Cong Fu, Department of Computer Science and Engineering, Texas A&M University, USA | Yaochen Xie, Department of Computer Science and Engineering, Texas A&M University, USA | Meng Liu, Department of Computer Science and Engineering, Texas A&M University, USA | Yuchao Lin, Department of Computer Science and Engineering, Texas A&M University, USA | Zhao Xu, Department of Computer Science and Engineering, Texas A&M University, USA | Keqiang Yan, Department of Computer Science and Engineering, Texas A&M University, USA | Keir Adams, Department of Chemical Engineering, Massachusetts Institute of Technology, USA | Maurice Weiler, AMLab, University of Amsterdam, The Netherlands | Xiner Li, Department of Computer Science and Engineering, Texas A&M University, USA | Tianfan Fu, Department of Computer Science, University of Illinois Urbana-Champaign, USA | Yucheng Wang, Department of Electrical and Computer Engineering, Texas A&M University, USA | Alex Strasser, Department of Materials Science and Engineering, Texas A&M University, USA | Haiyang Yu, Department of Computer Science and Engineering, Texas A&M University, USA | YuQing Xie, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Xiang Fu, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Shenglong Xu, Department of Physics & Astronomy, Texas A&M University, USA | Yi Liu, Department of Applied Mathematics and Statistics, Stony Brook University, USA and Department of Computer Science, Stony Brook University, USA | Yuanqi Du, Department of Computer Science, Cornell 
University, USA | Alexandra Saxton, Department of Computer Science and Engineering, Texas A&M University, USA | Hongyi Ling, Department of Computer Science and Engineering, Texas A&M University, USA | Hannah Lawrence, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Hannes Stärk, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Shurui Gui, Department of Computer Science and Engineering, Texas A&M University, USA | Carl Edwards, Department of Computer Science, University of Illinois Urbana-Champaign, USA | Nicholas Gao, Department of Computer Science, Technical University of Munich, Germany | Adriana Ladera, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Tailin Wu, Department of Computer Science, Stanford University, USA | Elyssa F. Hofgard, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Aria Mansouri Tehrani, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Rui Wang, Department of Computer Science and Engineering, University of California San Diego, USA | Ameya Daigavane, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Montgomery Bohde, Department of Computer Science and Engineering, Texas A&M University, USA | Jerry Kurtin, Department of Computer Science and Engineering, Texas A&M University, USA | Qian Huang, Department of Computer Science, Stanford University, USA | Tuong Phung, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Minkai Xu, Department of Computer Science, Stanford University, USA | Chaitanya K. Joshi, Department of Computer Science and Technology, University of Cambridge, UK | Simon V. 
Mathis, Department of Computer Science and Technology, University of Cambridge, UK | Kamyar Azizzadenesheli, Nvidia, USA | Ada Fang, Department of Chemistry and Chemical Biology, Harvard University, USA | Alán Aspuru-Guzik, Department of Chemistry, University of Toronto, Canada and Department of Computer Science, University of Toronto, Canada | Erik Bekkers, AMLab, University of Amsterdam, The Netherlands | Michael Bronstein, Department of Computer Science, University of Oxford, UK and AITHYRA, Austria | Marinka Zitnik, Department of Biomedical Informatics, Harvard University, USA | Anima Anandkumar, Nvidia, USA and Department of Computing and Mathematical Sciences, California Institute of Technology, USA | Stefano Ermon, Department of Computer Science, Stanford University, USA | Pietro Liò, Department of Computer Science and Technology, University of Cambridge, UK | Rose Yu, Department of Computer Science and Engineering, University of California San Diego, USA | Stephan Günnemann, Department of Computer Science, Technical University of Munich, Germany | Jure Leskovec, Department of Computer Science, Stanford University, USA | Heng Ji, Department of Computer Science, University of Illinois Urbana-Champaign, USA | Jimeng Sun, Department of Computer Science, University of Illinois Urbana-Champaign, USA | Regina Barzilay, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Tommi Jaakkola, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Connor W. 
Coley, Department of Chemical Engineering, Massachusetts Institute of Technology, USA and Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Xiaoning Qian, Department of Computer Science and Engineering, Texas A&M University, USA and Department of Electrical and Computer Engineering, Texas A&M University, USA and Computing and Data Sciences, Brookhaven National Laboratory, USA | Xiaofeng Qian, Department of Electrical and Computer Engineering, Texas A&M University, USA and Department of Materials Science and Engineering, Texas A&M University, USA and Department of Physics & Astronomy, Texas A&M University, USA | Tess Smidt, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA | Shuiwang Ji, Department of Computer Science and Engineering, Texas A&M University, USA, sji@tamu.edu
Advances in artificial intelligence (AI) are fueling a new paradigm of discovery in the natural sciences. Today, AI has started to advance the natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). As an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed, yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim to understand the physical world across scales, from the subatomic (wavefunctions and electron density) through the atomic (molecules, proteins, materials, and interactions) to the macroscopic (fluids, climate, and subsurface), and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture the first principles of physics, especially symmetries, in natural systems with deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may spark more community interest and effort to further advance AI4Science.
Decades of artificial intelligence (AI) research have culminated in the renaissance of neural networks under the name of deep learning. Intensive research has led to many breakthroughs in this area, including, for example, ResNet, diffusion and score-based models, attention, transformers, and, more recently, large language models (LLMs) such as ChatGPT. These developments have led to continually improving performance of deep models. When coupled with growing computing power and large-scale datasets, deep learning methods are becoming dominant approaches in various fields, such as computer vision and natural language processing. Propelled by these advances, AI has started to advance the natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). As an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed, yet challenging.
This monograph provides a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim to understand the physical world across scales, from the subatomic (wavefunctions and electron density) through the atomic (molecules, proteins, materials, and interactions) to the macroscopic (fluids, climate, and subsurface), and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture the first principles of physics, especially symmetries, in natural systems with deep learning methods. An in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations is provided. Other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification, are also discussed. To facilitate learning and education, categorized lists of useful resources are provided. The presentation aims to be thorough and unified, and it is hoped that this initial effort will spark more community interest and effort to further advance AI4Science.
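As a concrete illustration of what equivariance to symmetry transformations means, the following is a minimal numerical sketch, not a method taken from this work. It assumes a point cloud of 3D positions and two toy functions whose names (`invariant_energy`, `equivariant_vectors`, `random_rotation`) are hypothetical and chosen only for this example. A scalar output should be unchanged when the input is rotated and translated (invariance), while a vector output should rotate along with the input (equivariance).

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # fix column signs of the orthogonal factor
    if np.linalg.det(q) < 0:      # flip one column to get a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def invariant_energy(pos):
    """Toy scalar output (hypothetical): sum of all pairwise distances."""
    diff = pos[:, None, :] - pos[None, :, :]
    return np.linalg.norm(diff, axis=-1).sum() / 2.0

def equivariant_vectors(pos):
    """Toy vector output (hypothetical): displacement of each point from the centroid."""
    return pos - pos.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))      # five points in 3D
rot = random_rotation(rng)         # rotation R
shift = rng.normal(size=(1, 3))    # translation t
moved = pos @ rot.T + shift        # apply x -> R x + t to every point

# Invariance: the scalar output does not change under rotation and translation.
invariance_ok = np.isclose(invariant_energy(moved), invariant_energy(pos))

# Equivariance: the vector output rotates with the input (translation cancels out).
equivariance_ok = np.allclose(equivariant_vectors(moved), equivariant_vectors(pos) @ rot.T)

print(invariance_ok, equivariance_ok)
```

Deep models for physical systems are typically designed so that such relations hold for the learned function by construction, rather than being checked after the fact as in this sketch.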