Theses Doctoral

Machine learning and statistical approaches to extend structure solution methods to lower symmetry cases

Lan, Ling

Crystallography has transformed our understanding of atomic arrangements in materials, yet modern applications increasingly demand more complex, nanoscale structures where traditional methods fall short. The atomic pair distribution function (PDF), derived from X-ray total scattering data, has proven valuable for probing these lower-symmetry structures. However, decoding structural information from PDF data, especially for materials with intricate atomic disorder, is an ill-posed inverse problem that requires innovative solutions.

In this thesis, we propose a “divide and conquer” framework that decomposes this challenge into manageable, well-defined sub-problems by applying constraints on the scope of structures, then solving each sub-problem using machine learning methods. Defining these sub-problems is itself challenging, as lower-symmetry structures often exhibit slight randomness and local deviations from the average structure that are difficult to quantify and simulate in general. To address this, we use a continuous representation of finite clusters and propose a symmetry-breaking measure based on Jensen-Shannon divergence. This measure not only offers a statistical tool that facilitates complex structure analysis by quantifying local symmetry changes in response to distortions, but also provides a universal metric for comparing various structures with differing distortions. The measure also supports ML-based symmetry discovery by serving as a loss function or labeling method.
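As a rough illustration of how a Jensen-Shannon divergence can quantify symmetry breaking, the Python sketch below compares a smoothed density of a small 2D cluster with the density of the same cluster after a 90-degree rotation. This is not the measure defined in the thesis: the Gaussian smearing, the grid, the width `sigma`, the choice of rotation, and the names `js_divergence` and `density` are all assumptions made for the example.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def density(points, grid, sigma=0.3):
    """Hypothetical continuous representation: one Gaussian per atom, sampled on a grid."""
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    rho = np.exp(-d2 / (2 * sigma**2)).sum(axis=1)
    return rho / rho.sum()

# Square grid centered on the origin, so it maps onto itself under 90-degree rotation.
xs = np.linspace(-2.0, 2.0, 41)
gx, gy = np.meshgrid(xs, xs)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)

rot90 = np.array([[0.0, -1.0], [1.0, 0.0]])  # 90-degree rotation matrix

square = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
distorted = square.copy()
distorted[0] = [1.3, 0.0]  # displace one atom to break the four-fold symmetry

# Divergence between each cluster and its rotated copy.
js_sym = js_divergence(density(square, grid), density(square @ rot90.T, grid))
js_dist = js_divergence(density(distorted, grid), density(distorted @ rot90.T, grid))
print(js_sym, js_dist)
```

In this toy setting the undistorted square gives a divergence of essentially zero, while the distorted cluster gives a strictly positive value that grows with the displacement. Because the divergence in bits is bounded between 0 and 1, values from different clusters can be placed on a common scale.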

To demonstrate the potential of ML methods in extracting structural information from PDF and diffraction data, we first evaluate the robustness of PDFs as inputs for deep learning models. We then address two case studies: classifying point defects in metals and regressing octahedral tilts in perovskites. For the latter, we introduce a new parameterization of distortions that ensures invariance under distance-preserving transformations, enabling a one-to-one mapping between structural signals and parameterized values. For both studies, we generate structural and PDF datasets, ensuring accessibility to the ML community and fostering interdisciplinary collaboration. Our results show that deep learning models can effectively extract distortion information from PDFs when the inverse problem is well-defined. This work provides novel tools and insights for applying PDFs and ML to analyze complex, lower-symmetry structures.
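One way to see why invariance under distance-preserving transformations matters: a PDF is built from interatomic distances, so two structures related by a rotation, reflection, or translation produce the same signal and should therefore receive the same label. The Python sketch below checks this numerically for a random cluster. It illustrates the invariance only, not the parameterization introduced in the thesis; the helper `pair_distances` and the random test cluster are assumptions made for the example.

```python
import numpy as np

def pair_distances(points):
    """Sorted list of all interatomic distances in a finite cluster."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return np.sort(dist[upper])

rng = np.random.default_rng(0)
cluster = rng.normal(size=(6, 3))  # hypothetical 6-atom cluster

# QR of a random matrix yields an orthogonal matrix (a rotation or a reflection).
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
shift = rng.normal(size=3)
moved = cluster @ q.T + shift  # apply the isometry to every atom

same = np.allclose(pair_distances(cluster), pair_distances(moved))
print(same)
```

If a distortion label changed under such a transformation while the distances did not, several labels would correspond to one signal and the regression target would be ambiguous.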

Files

This item is currently under embargo. It will be available starting 2026-01-27.

More About This Work

Academic Units
Applied Physics and Applied Mathematics
Thesis Advisors
Du, Qiang
Billinge, Simon
Degree
Ph.D., Columbia University
Published Here
January 29, 2025