Benjamin Sanchez-Lengeling
I am a research scientist on the Brain Team at Google Research, working at the intersection of molecules and AI. My research centers on using and improving computational tools for molecular discoveries, striving to make them real, for molecules of all sizes: small, large (proteins), and periodic (polymers). Application areas include solar cells, solubility, drug design, and olfaction. I care about interpretability for scientific discoveries and about making research clear and approachable.
Besides research, I am also passionate about science education and outreach. I am one of the founders and organizers of Clubes de Ciencia Mexico, a STEM-education NGO, and of RIIAA, a LatinX-centered AI conference.
Education
PhD in Chemistry @ Harvard with Alán Aspuru-Guzik, secondary field in computational science and engineering and a specialization in energy policy.
Master's in Quantum Chemistry, European Erasmus Mundus.
BA in Mathematics and Computer Science @ University of Guanajuato.
Research
Machine learning for olfaction
There is machine learning (ML) for vision and sound, but what about smell? Olfaction is our guide to the chemical world. If we want to understand and digitize smell, we must study small molecules. We charted the basics of how to map smell in 2019 and then spent three years validating this idea in several scenarios: 1) testing the technology with a panel of raters smelling never-before-smelled molecules, 2) discovering new mosquito repellents, 3) exploring the relationship between distances in odor space and distances in metabolic space, and 4) private industrial collaborations. This work eventually became the foundation for a startup.
Data-driven molecular design
With deep learning techniques and large datasets, we have ways of transforming molecules into vectors that live in a latent space with meaningful qualities for chemistry. With this newly learned representation, we can improve our capabilities at 1) predicting properties, 2) generalizing to small datasets, 3) searching for similar molecules, 4) generating new molecules, and 5) optimizing molecules based on constraints, among many other things! We also want to move things to the lab, so I co-organize a NeurIPS workshop in this direction.
Deep learning with graphs
Graphs are a very flexible data structure: they allow us to model objects and their relationships. Applying deep learning to graph data is now possible with Graph Neural Networks (GNNs). We motivated this technology in a gentle, illustrated, and interactive blog post on Distill.
Because we wanted to make scientific discoveries that involve graphs, we looked at building trust in GNNs by probing the strengths and weaknesses of different explainability techniques.
Applications with molecular materials
Molecules are everywhere, literally! So of course we can study their effects in many systems across a multitude of scales, such as photovoltaic solar cells, mixing in solution, metabolic networks, crystal structures, metal-organic frameworks, and batteries.