December 19: Seminar on human resource development for automated and autonomous materials exploration, with invited researchers from Europe
October 9, 2023
Last updated:
October 30, 2023
miyahara
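Overview: In the 21st century, materials science has entered a data-driven era. For this installment of the seminar series, we have invited Professor Claudia Draxl and Associate Professor Milica Todorović from Europe to present strategies and techniques for turning the vast amounts of data accumulated every day into knowledge and value. The talks will cover the importance of FAIR data infrastructure and AI techniques for materials science that make use of heterogeneous data. The seminar is intended for anyone interested in AI and in extracting knowledge from data in materials science, from graduate students to university professors and industrial researchers. We look forward to your participation. (Hiori Kino, NIMS)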
- Format: on-site only
- Language: English
- Registration: to help us estimate attendance, please register in advance whenever possible.
(This has been changed.)
The enormous amounts of research data produced every day in the field of materials science represent a gold mine of the 21st century. How can we turn these data into knowledge and value? Here, a FAIR (Findable, Accessible, Interoperable, and Re-usable) data infrastructure plays a decisive role, as this gold mine is of little value if the data are not comprehensively characterized and made available. Only then can data be readily shared and explored by data analytics and artificial-intelligence (AI) methods. Making data Findable and AI Ready (another interpretation of the acronym) will change the way science is done today.
In this talk, I will discuss how the NFDI consortium FAIRmat [1] is approaching these goals [2], making data from sample synthesis, various experimental probes, and computational materials science FAIR. Particular emphasis will be placed on the I, interoperability. With selected examples, I will also show how knowledge can be gained from these data, be it with unsupervised [3] or supervised [4] machine-learning techniques.
References
[1] https://fairmat-nfdi.eu
[2] M. Scheffler, M. Aeschlimann, M. Albrecht, T. Bereau, H.-J. Bungartz, C. Felser, M. Greiner, A. Groß, C. Koch, K. Kremer, W. E. Nagel, M. Scheidgen, C. Wöll, and C. Draxl, Nature 604, 635 (2022).
[3] M. Kuban, S. Gabaj, W. Aggoune, C. Vona, S. Rigamonti, and C. Draxl, MRS Bulletin 47, 991 (2022).
[4] T. Bechtel, D. Speckhard, J. Godwin, and C. Draxl, preprint.
The arrival of materials science data infrastructures in the past decade has ushered in the era of data-driven materials science based on artificial intelligence (AI) algorithms, which has facilitated breakthroughs in materials optimisation and design. Of particular interest are active learning algorithms, where datasets are collected on the fly in the search for optimal solutions. We encoded such a probabilistic algorithm into the Bayesian Optimization Structure Search (BOSS) Python tool for materials optimisation [1].
BOSS builds N-dimensional surrogate models for materials’ energy or property landscapes to infer global optima, allowing us to conduct targeted materials engineering. The models are iteratively refined by sequentially sampling materials data with high information content. This creates compact and informative datasets. We utilised this approach for computational density functional theory studies of molecular surface adsorbates [2], thin film growth [3], solid-solid interfaces [4] and molecular conformers [5]. With experimental colleagues, we applied BOSS to accelerate the development of novel materials with targeted properties, and to optimise materials processing [7]. With recent multi-objective and multi-fidelity implementations for active learning, BOSS can make use of different information sources to help us discover optimal solutions faster in both academic and industrial settings.
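The active learning loop described above can be illustrated with a minimal sketch. This is not the BOSS API: the functions `rbf_kernel`, `gp_posterior`, `toy_energy` and `bayes_opt`, the fixed kernel hyperparameters, and the lower-confidence-bound acquisition rule are all assumptions chosen for illustration. A Gaussian process surrogate is fitted to the points sampled so far, and the next point is taken where the surrogate predicts either a low energy or a high uncertainty.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two sets of 1D points."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-5):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def toy_energy(x):
    """Hypothetical 1D energy landscape standing in for an expensive calculation."""
    return np.sin(3.0 * x) + 0.5 * x**2

def bayes_opt(objective, bounds=(-2.0, 2.0), n_init=3, n_iter=15, kappa=2.0, seed=0):
    """Sequentially sample the objective where the surrogate is most promising."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(*bounds, size=n_init)
    y = objective(x)
    grid = np.linspace(*bounds, 401)  # candidate points for the next sample
    for _ in range(n_iter):
        offset = y.mean()  # centre the data for the zero-mean GP
        mean, std = gp_posterior(x, y - offset, grid)
        # Lower confidence bound: low predicted energy or high uncertainty.
        lcb = mean + offset - kappa * std
        x_next = grid[np.argmin(lcb)]
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    best = np.argmin(y)
    return x[best], y[best], x, y

x_best, y_best, xs, ys = bayes_opt(toy_energy)
print(f"best x = {x_best:.3f}, energy = {y_best:.3f}, evaluations = {len(xs)}")
```

The parameter `kappa` sets the balance between exploring uncertain regions and refining the current minimum. In a real study the toy function would be replaced by a simulation or an experiment, and the kernel hyperparameters would be fitted rather than fixed.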
[1] npj Comput. Mater. 5, 35 (2019)
[2] Beilstein J. Nanotechnol. 11, 1577-1589 (2020); Adv. Funct. Mater. 31, 2010853 (2021)
[3] Adv. Sci. 7, 2000992 (2020)
[4] ACS Appl. Mater. Interfaces 14, 12758-12765 (2022)
[5] J. Chem. Theory Comput. 17, 1955 (2020)
[6] MRS Bulletin 47, 29-37 (2022)
[7] ACS Sustainable Chem. Eng. 10, 9469 (2022)
Artificial intelligence (AI) approaches have accelerated discovery in materials science, and are rapidly becoming a standard tool in our research methodology. The next challenge is to boost AI-driven discovery and make it more data- and resource-efficient. This task is complicated by the heterogeneous nature of materials data, which range from numerical data, via spectral and image data, to text, for both computation and experiments.
In our work, we seek to convert the apparent obstacle of heterogeneous data types into an advantage. By exploiting the complexity of different data types, our objective is to learn complementary information about materials that would otherwise be unavailable through a single data channel. To this end, we develop approaches to address each data type (modality) alone, but we also explore complex multi-modal AI frameworks for combining varied sources of information. Here, I will review a range of single-modality approaches on numerical, spectral and textual data, and even how we encoded scientific knowledge into an expert preference data type.
Next, I will introduce recent work on multi-modal AI algorithms and demonstrate initial applications on multi-fidelity data. We used multi-task Gaussian Process models to integrate information from classical force field (FF) simulations, density functional theory computations (PBE and PBE0 levels) and CCSD(T) data. Compared to a single-modal approach, multi-fidelity Bayesian optimisation structure search was notably more time-efficient. Benefits were delivered by the correlation between the fidelity levels and strategic sampling with cost-aware acquisition functions. By adapting such frameworks to include further information sources, we intend to guide materials research tasks towards optimal solutions and expedite the development of new technologies.
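The idea of combining fidelity levels with a cost-aware acquisition can be sketched in a few lines. This is only an illustration under stated assumptions, not the speaker's implementation: it uses two fidelities instead of four, an intrinsic coregionalisation kernel as one common form of multi-task Gaussian process, a fixed task covariance `B`, hypothetical relative costs `COST`, and toy functions `f_low` and `f_high`. The fidelity is chosen by how much an observation would reduce the variance of the high-fidelity prediction per unit cost.

```python
import numpy as np

# Assumed task covariance between fidelities (0 = cheap, 1 = expensive).
# In practice this would be learned; 0.9 encodes a strong correlation.
B = np.array([[1.0, 0.9],
              [0.9, 1.0]])
COST = np.array([1.0, 20.0])  # hypothetical relative cost of one evaluation

def rbf(a, b, length_scale=0.3):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def icm_kernel(xa, ta, xb, tb):
    """Multi-task kernel: task covariance times input covariance."""
    return B[ta[:, None], tb[None, :]] * rbf(xa, xb)

def posterior(x_tr, t_tr, y_tr, x_q, t_q, noise=1e-4):
    """Posterior mean and covariance of the multi-task GP at (x_q, t_q)."""
    K = icm_kernel(x_tr, t_tr, x_tr, t_tr) + noise * np.eye(len(x_tr))
    K_s = icm_kernel(x_tr, t_tr, x_q, t_q)
    mean = K_s.T @ np.linalg.solve(K, y_tr)
    cov = icm_kernel(x_q, t_q, x_q, t_q) - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

def choose_fidelity(x_tr, t_tr, y_tr, x_next, noise=1e-4):
    """Pick the fidelity that removes the most high-fidelity variance per cost."""
    x_q = np.array([x_next, x_next])
    t_q = np.array([0, 1])
    _, cov = posterior(x_tr, t_tr, y_tr, x_q, t_q, noise)
    # Variance removed from the high-fidelity prediction by observing fidelity t.
    gain = cov[1, :] ** 2 / (np.diag(cov) + noise)
    return int(np.argmax(gain / COST))

def f_high(x):
    """Hypothetical expensive reference calculation."""
    return np.sin(3.0 * x) + 0.5 * x**2

def f_low(x):
    """Hypothetical cheap approximation, correlated with f_high."""
    return f_high(x) + 0.2 * np.cos(5.0 * x)

def multi_fidelity_search(budget=120.0, kappa=2.0):
    grid = np.linspace(-2.0, 2.0, 201)
    hi = np.ones(len(grid), dtype=int)  # always predict at the high fidelity
    x_tr = np.array([-1.5, 1.5, 0.0])
    t_tr = np.array([0, 0, 1])
    y_tr = np.array([f_low(-1.5), f_low(1.5), f_high(0.0)])
    spent = COST[t_tr].sum()
    while spent < budget:
        offset = y_tr.mean()
        mean, cov = posterior(x_tr, t_tr, y_tr - offset, grid, hi)
        std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
        x_next = grid[np.argmin(mean + offset - kappa * std)]
        t_next = choose_fidelity(x_tr, t_tr, y_tr - offset, x_next)
        y_next = f_high(x_next) if t_next == 1 else f_low(x_next)
        x_tr = np.append(x_tr, x_next)
        t_tr = np.append(t_tr, t_next)
        y_tr = np.append(y_tr, y_next)
        spent += COST[t_next]
    return x_tr, t_tr, y_tr, spent

x_all, t_all, y_all, spent = multi_fidelity_search()
mask = t_all == 1
best = np.argmin(y_all[mask])
print(f"best high-fidelity x = {x_all[mask][best]:.2f}, "
      f"cheap evaluations = {int((~mask).sum())}, "
      f"expensive evaluations = {int(mask.sum())}, cost = {spent:.0f}")
```

With these assumed settings, the cheap fidelity is preferred where nothing is known yet, and the expensive fidelity is used only once cheap data have already narrowed down a location. This is the kind of benefit that correlated fidelity levels and cost-aware sampling are meant to provide.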