My CV
Scientific computing and data science.
(And sometimes both together)
Last updated: 2022.08.20
Broad research interests
- Predicting materials properties with ML
- Text mining scientific literature
- Adaptive design (black box optimization) for scientific computation
- Presenting these technologies via accessible, open interfaces (e.g., webpages)
Formal education
- B.Sc., Chemical Engineering - UCLA (2014 - 2017)
- M.S., Materials Science - UC Berkeley (2017 - 2020)
- Ph.D., Materials Science - UC Berkeley (2017 - present)
- Research topic: High throughput modelling, data-driven design, and text mining for semiconductor materials
Experience
Graduate Student Research Assistant @ LBNL. Using data mining to elucidate structure-property relationships and accelerate predictions of material properties. Running many thousands of density functional theory (DFT) calculations to evaluate candidate thermoelectrics and communicating results to experimental collaborators. Writing open-source software packages for data mining materials properties and running massively parallel calculations on supercomputers. (2017 - present)
Scientific Software Engineering Consultant @ Toyota Research Institute - Advanced Materials Design and Discovery. (Contracted through HireArt) Software for predicting Li-ion battery cycling characteristics (lifetimes) with machine learning. (2020 - present)
Consultant @ MaterialsQM Consulting. High-throughput synthesis pathway screening using density functional theory and combinatorics. Communicating with clients, preparing reports, and helping guide discovery of novel semiconductor materials. (2018 - 2019)
Undergraduate Student Research Assistant @ LBNL. Remote position. Wrote a black-box Bayesian optimization (adaptive design) package for use with the workflow software FireWorks. Incorporated several machine learning algorithms as optimization engines, and tested the performance on two example use cases in materials science. (2016 - 2017)
Principal Web Developer @ RYE Limousine, Inc. Remote position. Designed and deployed a corporate website for a limousine service, serving hundreds of customers per month, using WordPress and LimoAnywhere. The website included live chat between RYE employees and customers and let customers interface with a remote scheduling system. (2018)
Howard Hughes Medical Institute Undergraduate Researcher @ UCLA. Studied on-chip microscopy at the Ozcan Lab. Investigated techniques for rapidly polymerizing nanolenses inside mobile microscopes to identify nanoparticles (such as viruses). (2015 - 2016)
Lead App Designer @ UCLA Dept. of Anesthesiology Mobile App Team. Led UX design for a mobile application for perioperative/anesthetic care for UCLA Health. Worked alongside the UCLA Health Center to develop a comprehensive program for wireless bioinformatics. (2015 - 2016)
Peer-reviewed Papers and Books
2022
Dagdelen, J., Dunn, A. “Algorithms for Materials Discovery” in Accelerated Materials Discovery. De Gruyter. Chapter, Book
Walker, N., Trewartha, A., Huo, H., Lee, S., Cruse, K., Dagdelen, J., Dunn, A., Persson, K., Ceder, G., Jain, A. Quantifying the advantage of domain-specific pre-training on named entity recognition tasks in materials science. Patterns 3, 4 (2022)
Huo, H., Bartel, C., He, T., Trewartha, A., Dunn, A., Ouyang, B., Jain, A., Ceder, G. Machine-Learning Rationalization and Prediction of Solid-State Synthesis Conditions. Chemistry of Materials 34, 16 (2022)
2020
Dunn, A., Wang, Q., Ganose, A., Dopp, D., Jain, A. Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Comput. Mater. 6, 138 (2020)
Dylla, M., Dunn, A., Anand, S., Jain, A., Snyder, G. J. Machine Learning Chemical Guidelines for Engineering Electronic Structures in Half-Heusler Thermoelectric Materials. Research 2020, 6375171 (2020)
Bartel, C. J., Trewartha, A., Wang, Q., Dunn, A., Jain, A., Ceder, G. A critical examination of compound stability predictions from machine-learned formation energies. npj Comput. Mater. 6, 97 (2020)
Ricci, F., Dunn, A., Jain, A., Rignanese, G. M., Hautier, G. Gapped metals as thermoelectric materials revealed by high-throughput screening. J. Mater. Chem. A 8, 17579-17594 (2020)
Pohls, J-H., Chanakian, S., Park, J., Ganose, M., Dunn, A., Friesen, N., Bhattacharya, A., Hogan, B., Bux, S., Jain, A., Mar, A., Zevalkink, A. Experimental validation of high thermoelectric performance in RECuZnP2 predicted by high-throughput DFT calculations. Materials Horizons 8, 209-215 (2020)
2019
Tshitoyan, V., Dagdelen, J., Weston, L., Dunn, A., Rong, Z., Kononova, O., Persson, K. A., Ceder, G., & Jain, A. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95-98 (2019)
Dunn, A., Brenneck, J., Jain, A. Rocketsled: a software library for optimizing high-throughput computational searches. J. Phys. Mater. 2, 034002 (2019).
2018
Ward, L., Dunn, A., Faghaninia, A., Zimmermann, N. E. R., Bajaj, S., Wang, Q., Montoya, J. H., Chen, J., Bystrom, K., Dylla, M., Chard, K., Asta, M., Persson, K., Snyder, G. J., Foster, I., Jain, A. Matminer: An open source toolkit for materials data mining. Comput. Mater. Sci. 152, 60-69 (2018).
Speaking
Dunn, A. “Machine Learning with Matminer” at Materials Project Workshop 2021, Remote. August 8, 2021
Dunn, A., Jain, A. “Software tools for Accelerating Materials Discovery with Machine Learning” at Foundational and Applied Data Science for Molecular and Material Science Engineering (Lehigh I-DISC Institute for Data, Intelligent Systems, and Computation), Bethlehem, Pennsylvania. May 23, 2019.
Dunn, A., Wang, Q., Ganose, A., Faghaninia, A., Jain, A. “An Automatic Materials Science Machine Learning Tool for Benchmarking and Prediction” at AI-based Investigation of Material Properties (TMS 2019), San Antonio, Texas. March 12, 2019
Dunn, A., Faghaninia, A. “Matminer: Data Mining for Materials Science” at Materials Project Workshop 2018, Berkeley, California. August 10, 2018
Dunn, A., Bajaj, S., Jain, A. “Automatic Optimization Algorithms for Maximum-Throughput Materials Design and Discovery” at Science Undergraduate Laboratory Internship Program, Berkeley, California. August 5, 2016
Dunn, A., Ray, A., Daloglu, M.U., Ozcan, A. “The Development of Polymer-based Nanolenses Towards Enhanced Nanoparticle Imaging” at UCLA HHMI Day, Los Angeles, California. May 31, 2016
Ganose, A., Dunn, A. “Data Mining for Materials” at Materials Project Workshop 2019, Berkeley, California. August 2, 2019
Leadership, Memberships, and Awards
- NERSC User Group Executive Committee - Elected member of the executive committee, which administers supercomputing policy at NERSC. (2019 - 2022)
- UCLA Chemical and Biomolecular Engineering Alumni Association - VP of Technology (2021 - present)
- UC Berkeley Graduate Data Visualization Contest Overall Winner - Won schoolwide competition by creating an interactive website for graduate financial data. (2019)
- Computational Materials Science at Berkeley - Co-Founder, officer (2018)
- Magna Cum Laude - UCLA (2017)
- Tau Beta Pi CA Epsilon - Distinguished member (2016)
- Edward and Doris Rhoad Scholarship - Selected recipient (2014)
- National AP Scholar with Distinction (2014)
- Regent’s and Chancellor’s Scholarship at Berkeley - Selected but declined (2014)
Skills
Programming
- Python - 5+ years experience
- Bash scripting - 4+ years experience
- NoSQL (MongoDB) - 4+ years experience
- Julia - 1 year experience
- C++ - 1 year experience
- C - occasional use
- Go - occasional use
- JavaScript - occasional use
Linux
- Networking, file operations, process management
- Parallel CPU computing frameworks such as OpenMP and MPI
- GPU computing frameworks such as CUDA and Thrust
- Queue computing platforms such as SLURM and PBS
- DevOps tools such as Docker and Rancher
Data science libraries
- Machine learning - scikit-learn, keras, pytorch
- Numerical analysis - pandas, numpy, scipy
- Dashboarding - Plotly Dash
- Tools of the trade - ipython, jupyter notebook, matplotlib, plotly
Other software and frameworks (ordered by experience)
- MongoDB
- HTML, CSS
- LaTeX
- Git
- MATLAB
- Julia
- VASP - Ab-initio Density Functional Theory simulation software
Open source software
- rocketsled: (maintainer) Black box optimization framework for high-throughput computing.
- lbnlp: (maintainer) Natural language processing tools for chemistry/physics/materials science literature.
- automatminer: (maintainer) An autoML tool for predicting materials properties.
- matbench: (maintainer) A benchmarking test set and python package for property prediction in materials science.
- matminer: (primary developer) Data mining tools for materials science.
- matscholar: (developer) Text mining analysis of millions of materials science abstracts, including public API and website.
- pymatgen: (contributor) Python materials genomics.
- fireworks: (contributor) High throughput workflow management.
- atomate: (contributor) Pre-built workflows to calculate materials properties.
- among others…