PEESEgroup/Awesome-Materials-Aware-Large-Language-Models

Awesome - Materials-Aware Large Language Models

Materials-Aware Large Language Models (LLMs) are transforming materials science by automating complex tasks that traditionally relied on human expertise. Building on advances in AI, these models support everything from data extraction and property prediction to inverse design, synthesis planning, and self-driving labs.

Here, we provide a curated, non-exhaustive list of research papers that showcase the applications of LLMs in advancing materials science.

🌟 Introduction

A non-exhaustive progression of LLMs tailored for materials science, highlighting key milestones within each general model family.

🔍 Applications Overview

sort by date

📑 Data Extraction

LLMs for data extraction can process text, images, tables, and graphs from scientific literature, converting unstructured information into structured data, which is essential for building comprehensive materials databases.

| Name & Link | Models | Material Types | Release Date | Journal |
| --- | --- | --- | --- | --- |
| MagBERT (Kumar et al.) | BERT | Magnesium | 2024.08 | Materials Today Communications |
| MagBERT (Zhumabayeva et al.) | BERT | Magnetic | 2024.07 | The Journal of Physical Chemistry C |
| ChemREL | RoBERTa | Chemicals | 2024.07 | Journal of Chemical Information and Modeling |
| LLaMat | LLaMA-2-7B | Crystal | 2024.07 | OpenReview |
| MaTableGPT | GPT-4 | Electrocatalysts | 2024.06 | arXiv |
| MatGPT | GPT-3, LLaMA-7B | Solar cell | 2024.05 | Cell Reports Physical Science |
| LLaMP | GPT-3.5, GPT-4 | General materials | 2024.01 | arXiv |
| MatSci-LumEn | GPT-3.5, GPT-4 | General materials | 2024.01 | GitHub |
| MatSciRE | BERT, RoBERTa | General materials | 2024.01 | arXiv |
| ACE | Transformer | Single-atom heterogeneous catalysts | 2023.12 | Nature Communications |
| MechGPT | OpenOrca-Platypus2-13B | Materials failure | 2023.10 | arXiv |
| DARWIN | LLaMA-7B | Solar cell | 2023.08 | arXiv |
| GPT Chemistry Assistant | GPT-3.5, GPT-4 | MOF | 2023.06 | Journal of the American Chemical Society |
| Recycle-BERT | BERT | Recycling plastic | 2023.08 | ACS Sustainable Chemistry & Engineering |
| GPT-MLP | GPT-3, GPT-3.5, GPT-4 | Solid-state, doped semiconductors, gold nanoparticle | 2023.08 | Communications Materials |
| MatSci-NLP | BERT | General materials | 2023.05 | arXiv |
| ChatExtract | GPT-3.5, GPT-4 | High entropy alloys | 2023.03 | Nature Communications |
| OpticalBERT | BERT | Optical | 2023.03 | Journal of Chemical Information and Modeling |
| BatteryDataExtractor | BERT | Battery | 2022.09 | Chemical Science |
| MaterialsBERT | BERT | Polymer | 2022.09 | npj Computational Materials |
| BatteryBERT | BERT | Battery | 2022.05 | Journal of Chemical Information and Modeling |
| MatBERT | BERT | Solid-state, doped semiconductors, gold nanoparticle | 2022.04 | Patterns |
| MatSciBERT | BERT | Solid oxide fuel cells | 2021.09 | npj Computational Materials |
| ChemRxnExtractor | BERT | Chemical Reaction | 2021.06 | Journal of Chemical Information and Modeling |
| ChemBERT | BERT | Chemical Reaction | 2021.06 | Journal of Chemical Information and Modeling |
| RXNMapper | Transformer | Chemical reactions | 2021.04 | Science Advances |
| RXN4Chemistry | Transformer | Chemical reactions | 2019.12 | GitHub |
| SciBERT | BERT | General scientific text | 2019.03 | arXiv |
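
As a rough illustration of what "converting unstructured information into structured data" can mean in practice, the sketch below turns a sentence into a typed record. Everything here is hypothetical: the `PropertyRecord` schema, the `extract_records` helper, and the sample sentence are assumptions made for illustration, and a regular expression stands in for the LLM call that the tools listed above would actually make.

```python
import re
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PropertyRecord:
    """Hypothetical target schema for one extracted data point."""
    material: str
    property_name: str
    value: float
    unit: str


# Stand-in for an LLM: matches "the <property> of <material> is <value> <unit>".
PATTERN = re.compile(
    r"the (?P<prop>[a-z ]+?) of (?P<mat>[A-Za-z0-9]+) is "
    r"(?P<val>\d+(?:\.\d+)?) (?P<unit>[A-Za-z/]+)",
    re.IGNORECASE,
)


def extract_records(text: str) -> List[PropertyRecord]:
    """Return one structured record per matching statement in `text`."""
    return [
        PropertyRecord(
            material=m.group("mat"),
            property_name=m.group("prop").strip().lower(),
            value=float(m.group("val")),
            unit=m.group("unit"),
        )
        for m in PATTERN.finditer(text)
    ]


# Illustrative input only; not taken from any paper in the table.
sample = "The band gap of TiO2 is 3.2 eV."
records = extract_records(sample)
print([asdict(r) for r in records])
```

Records in this shape can be appended to a database or a CSV file. A real pipeline would replace the regular expression with a model prompt and validate the model's output against the same schema.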

📊 Data Mining

LLMs for data mining support advanced querying, knowledge graph construction, and answering complex questions within materials science.

| Name | Models | Material Types | Release Date | Journal |
| --- | --- | --- | --- | --- |
| SciQAG | vicuna-7b-v1.5-16k | Question-answering | 2024.05 | arXiv |
| BatteryGPT | ChatGPT | Question-answering | 2024.03 | Cell Reports Physical Science |
| MatKG | BERT | Knowledge graph | 2024.01 | Scientific Data |
| LitLLM | GPT-3.5, GPT-4 | Literature Review | 2023.12 | arXiv |
| PaperQA | GPT-3.5, GPT-4 | Question-answering | 2023.12 | arXiv |
| LitQA | GPT-3.5, GPT-4 | Question-answering | 2023.10 | arXiv |
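
For readers unfamiliar with knowledge graphs, the sketch below shows one minimal way to store mined facts as (subject, relation, object) triples and query them. It does not reflect how MatKG or any other listed tool is implemented: the triples, the relation names, and the `build_graph` and `query` helpers are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples, as might be mined from text.
triples = [
    ("TiO2", "has_property", "band gap"),
    ("TiO2", "used_in", "photocatalysis"),
    ("LiFePO4", "used_in", "battery cathode"),
]


def build_graph(triples):
    """Index triples as graph[subject][relation] -> set of objects."""
    graph = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        graph[subject][relation].add(obj)
    return graph


def query(graph, subject, relation):
    """Return the sorted objects linked to `subject` by `relation`."""
    # .get() avoids creating empty entries for unknown subjects or relations.
    return sorted(graph.get(subject, {}).get(relation, set()))


graph = build_graph(triples)
print(query(graph, "TiO2", "used_in"))
```

A question-answering layer can then translate a natural-language question into one or more such lookups.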

🧬 Property Prediction

LLMs assist in predicting various properties of materials, helping researchers design new materials with targeted characteristics.

| Name | Models | Material Types | Release Date | Journal |
| --- | --- | --- | --- | --- |
| MolecularGPT | T5 | Organic molecule | 2024.06 | arXiv |
| ChatMOF | GPT-4, GPT-3.5 | MOF | 2024.06 | Nature Communications |
| ChemLLM | InternLM2-Base-7B | Organic molecule | 2024.04 | arXiv |
| AlloyBERT | RoBERTa | Alloy | 2024.03 | arXiv |
| CrystalLLM (Gruver et al.) | LLaMA-2 70B | Inorganic | 2024.02 | arXiv |
| GPTChem | GPT-3 | Organic molecule | 2024.02 | Nature Machine Intelligence |
| LLaMP | GPT-3.5, GPT-4 | Crystal | 2024.01 | arXiv |
| PolyNC | T5 | Polymer | 2023.12 | Chemical Science |
| FG-BERT | BERT | Organic molecule | 2023.11 | Briefings in Bioinformatics |
| LLM-Prop | T5 | Crystalline Solids | 2023.10 | arXiv |
| GPT-MolBERTa | BERT, RoBERTa | Organic molecule | 2023.09 | arXiv |
| CatBERTa | RoBERTa | Catalyst | 2023.09 | ACS Catalysis |
| DARWIN | LLaMA-7B | Thermoelectric | 2023.08 | arXiv |
| GIMLET | T5 | Thermoelectric | 2023.08 | arXiv |
| MolRoPE-BERT | T5 | Organic molecule | 2023.07 | Journal of Molecular Graphics and Modelling |
| BERTOS | BERT | Inorganic | 2022.11 | Advanced Science |
| SolvBERT | BERT | Solvent | 2022.10 | Digital Discovery |
| PolyBERT | DeBERTa | Polymer | 2022.09 | Nature Communications |
| ChemBERTa | RoBERTa | Organic molecule | 2022.08 | arXiv |
| ChemGPT | GPT-Neo | Organic molecule | 2022.05 | Nature Machine Intelligence |
| Mol-BERT | BERT | Organic molecule | 2022.05 | Journal of Chemistry |
| ChemBERTa | RoBERTa | Organic molecule | 2022.03 | arXiv |
| SMILES-BERT | BERT | RT | 2019.09 | Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics |

⚛️ Structure Generation

LLMs contribute to generating new material structures, especially for complex materials, enabling accelerated discovery of novel materials.

| Name | Models | Material Types | Release Date | Journal |
| --- | --- | --- | --- | --- |
| ChatMol | T5 | MOF | 2024.09 | Bioinformatics |
| MatterGPT | Customized GPT | Crystalline Solids | 2024.08 | arXiv |
| MOLLEO | GPT-4, T5 | Organic molecule | 2024.07 | arXiv |
| AtomGPT | GPT-2 | Crystalline Solids | 2024.06 | The Journal of Physical Chemistry Letters |
| ChatMOF | GPT-4, GPT-3.5 | MOF | 2024.06 | Nature Communications |
| CrystalLLM (Antunes et al.) | Transformer-based | Rutiles, spinels, pyrochlores | 2024.02 | arXiv |
| GPTChem | GPT-3 | Organic molecule | 2024.02 | Nature Machine Intelligence |
| GPT Linker Designer | GPT-3.5 | MOF Linker | 2023.12 | Journal of the American Chemical Society |
| DARWIN | LLaMA-7B | MOF | 2023.08 | arXiv |
| Text+Chem T5 | T5 | Inorganic | 2023.02 | arXiv |
| MolT5 | T5 | Inorganic | 2022.11 | arXiv |
| MT-GPT | GPT | Inorganic | 2022.10 | arXiv |
| MT-GPT2 | GPT-2 | Inorganic | 2022.10 | arXiv |
| MT-GPTNeo | GPT-Neo | Inorganic | 2022.10 | arXiv |
| MT-GPTJ | GPT-J | Inorganic | 2022.10 | arXiv |
| MT-BART | BART | Inorganic | 2022.10 | arXiv |
| MT-RoBERTa | RoBERTa | Inorganic | 2022.10 | arXiv |
| MolGPT | Customized GPT | Organic molecule | 2021.10 | Journal of Chemical Information and Modeling |

🧪 Synthesis Planning

LLMs are employed to predict synthesis routes, aiding researchers in planning experiments and identifying potential synthesis challenges.

| Name | Models | Material Types | Release Date | Journal |
| --- | --- | --- | --- | --- |
| CSLLM | LLaMA-7B | Crystal | 2024.07 | arXiv |
| SynthGPT | GPT-3.5, GPT-4 | Inorganic | 2024.04 | Journal of the American Chemical Society |
| ReactionT5 | T5 | Organic | 2023.03 | arXiv |
| MatChat | LLaMA2 | Inorganic | 2023.10 | arXiv |
| GPT Chemistry Assistant | GPT-3.5, GPT-4 | MOF | 2023.08 | Journal of the American Chemical Society |
| T5Chem | T5 | Organic | 2022.03 | Journal of the American Chemical Society |
| ChemFormer | BART | Organic | 2022.01 | Machine Learning: Science and Technology |

🤖 Agent-Driven Laboratory

LLM-based agent systems facilitate laboratory automation by controlling instruments, analyzing real-time data, and autonomously adjusting experiments.

| Name | Models | Material Types | Release Date | Journal |
| --- | --- | --- | --- | --- |
| ChemAgents | Llama-3-70B | Literature reader, experiments designer, robot operator, computation performer | 2024.07 | ChemRxiv |
| LLMatDesign | GPT-4o | Data acquisition and filtering, integrated simulations, data analysis and visualization | 2024.06 | arXiv |
| MicroGPT | GPT-4 | - | 2024.05 | Digital Discovery |
| ChatGPT Research Group | GPT-4 | Synthesis conditions extraction, code generation, research planning, and procedural guidance | 2023.11 | ACS Central Science |
| GPT-Lab | GPT-4 | Requirements analysis, literature retrieval, text mining, human researcher feedback, experiment execution | 2023.09 | arXiv |
| AtomAgents | GPT-4 | Automatic robotic experiments | 2023.07 | arXiv |
| CREST | GPT-3.5 | - | 2023.07 | ChemRxiv |
| GPT-4 Reticular Chemist | GPT-4 | Project overview, progress summary, propose task choices, evaluation | 2023.06 | Angewandte Chemie International Edition |
| ChemCrow | GPT-4 | Synthesis execution | 2023.04 | Nature Machine Intelligence |
| Coscientist | GPT-4 | Web and documentation search, code execution | 2023.03 | Nature |

Citation

If you find our work and this repository useful, please consider giving it a star ⭐ and citing it 🍺:

@misc{yuan2024materials,
      title={Materials-Aware Large Language Models as Enablers of Scaling Metadata Ontology and Autonomous Discovery}, 
      author={Wenhao Yuan and Guangyao Chen and Zhilong Wang and Fengqi You},
      year={2024},
      note={Unpublished manuscript},
      institution={Cornell University},
      url={https://github.com/PEESEgroup/Awesome-Materials-Aware-Large-Language-Models}
}

How to Contribute

Contributions are welcome! Please submit a pull request to add new resources, models, or papers to the repository.
