Invited Keynote Speakers

AlphaFold: Improved protein structure prediction using potentials from deep learning – Dr Andrew Senior (DeepMind)
DeepMind’s AlphaFold protein structure prediction system was recently ranked first in free-modelling at the CASP13 (Critical Assessment of Protein Structure Prediction) biennial blind assessment of prediction methods. The system relies upon prediction of inter-residue distances by a very deep neural network. Using these distance distributions and a reference distribution from a similar neural network, we construct a potential and show that we can optimize this potential by a simple application of gradient descent, as well as with a more conventional fragment assembly / simulated annealing algorithm. Despite not using templates, the system also performed well in the CASP template-based category. We will discuss the training and use of the neural network, present contact- and structure-prediction results from the CASP assessment, and indicate potential future directions.
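As a rough illustration of the idea of turning distance distributions into a potential and minimising it by gradient descent, here is a toy sketch in Python. It is not the AlphaFold implementation: the Gaussian distributions, the parameter values and the names `potential_and_gradient` and `minimise` are all assumptions made for this example, standing in for the network-predicted and reference distributions described in the abstract.

```python
import numpy as np

def potential_and_gradient(coords, mu, sigma, mu_ref, sigma_ref):
    """Return V = sum_{i<j} [-log P_ij(d_ij) + log P_ref(d_ij)] and dV/dcoords.

    Both distributions are toy Gaussians in the pair distance d_ij (additive
    constants dropped); they are assumptions standing in for network outputs.
    """
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise displacements
    dist = np.linalg.norm(diff, axis=-1)             # pairwise distances
    pair_v = (dist - mu) ** 2 / (2 * sigma**2) - (dist - mu_ref) ** 2 / (2 * sigma_ref**2)
    upper = np.triu_indices(len(coords), k=1)        # count each pair once
    value = pair_v[upper].sum()
    dv_dd = (dist - mu) / sigma**2 - (dist - mu_ref) / sigma_ref**2
    np.fill_diagonal(dv_dd, 0.0)                     # no self-interaction
    safe = np.where(dist > 0, dist, 1.0)             # avoid division by zero
    grad = ((dv_dd / safe)[:, :, None] * diff).sum(axis=1)
    return value, grad

def minimise(coords, mu, steps=2000, lr=0.01, sigma=0.5, mu_ref=5.0, sigma_ref=4.0):
    """Plain gradient descent on the coordinates."""
    coords = coords.copy()
    for _ in range(steps):
        _, grad = potential_and_gradient(coords, mu, sigma, mu_ref, sigma_ref)
        coords -= lr * grad
    return coords

# Hypothetical data: "predicted" mean distances taken from a random target shape.
rng = np.random.default_rng(0)
target = 3.0 * rng.normal(size=(6, 3))
mu = np.linalg.norm(target[:, None, :] - target[None, :, :], axis=-1)
start = rng.normal(size=(6, 3))
final = minimise(start, mu)
v_start, _ = potential_and_gradient(start, mu, 0.5, 5.0, 4.0)
v_end, _ = potential_and_gradient(final, mu, 0.5, 5.0, 4.0)
```

Subtracting the reference term means the potential rewards distances that are more likely than the background expectation, rather than distances that are merely common.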

Bio: Andrew Senior is a research scientist in the science team at DeepMind in London where he led the AlphaFold team which achieved significant improvements in protein structure prediction in the CASP13 assessment. Previously he was tech lead for neural networks research in Google’s speech recognition acoustic modelling group. Before joining Google, he taught at Columbia University and worked at IBM Research on computer vision and biometrics. He received his PhD from Cambridge University for his thesis on neural networks and is a fellow of the IEEE.

‘Next–next’ Generation Quantum DNA Sequencing with Chemical Surface Design and Capsule Nets – Professor Tim Albrecht (University of Birmingham)
In the project “‘Next–next’ Generation Quantum DNA Sequencing with Chemical Surface Design and Capsule Nets”, we combine quantum tunnelling-based biosensing with advanced machine learning methods. DNA sequencing based on quantum mechanical tunnelling in principle allows for the label-free identification of nucleotides, based on their intrinsic electronic properties, and thus in some ways constitutes the ultimate limit in single-molecule sensing and sequencing. While the sensor performance is affected by many factors, including the design of the tunnelling junction and the surface chemistry of the (metal) electrodes, in this project the main focus is on maximising the level of information that is extracted from the data. For example, we have been able to demonstrate a significantly improved error rate when employing convolutional neural networks for “base calling”, compared to support vector machines, and are now exploring Capsule Nets for further improvements.[1,2]
[1] T. Albrecht, E. Alonso et al., “Deep learning for single-molecule science”, Nanotechnology 2017, (42), 423001.
[2] A. Vladyka, T. Albrecht et al., manuscript in preparation

Bio: Tim joined the School of Chemistry at the University of Birmingham from Imperial College in 2017, as Chair in Physical Chemistry, and became the School’s Director of Research in 2018. His research interests cover a broad range of topics with a focus on charge transport at the nanoscale, single-molecule thermoelectrics, single-molecule biosensing using nanopores and nanopipettes, automation, data analysis and machine learning, in particular for unsupervised data classification and sensing.

Machine Learning for Modelling Microstructure Evolution – Professor Nigel Clarke (University of Sheffield)
Our aim is to enhance our modelling capabilities for microstructure evolution with machine learning. In particular, we focus on Gaussian processes (GPs), a popular non-parametric class of models used extensively in ML and uncertainty quantification, which have well documented predictive abilities. In our preliminary studies, we apply existing GP methodology to microstructure evolution to determine the feasibility of generating an emulator to supplement more traditional, computationally intense, approaches. As an exemplar, we focus on the non-linear Cahn-Hilliard equation for describing phase separation in blends. Spatio-temporal problems are particularly challenging for ML due to their high dimensionality, hence we use a method recently proposed for using machine learning to predict video images, based on the idea of light cones, in which the present is only dependent on the past in the immediate spatial neighbourhood, analogous to real-space time-stepping numerical schemes for PDEs. We will present results which highlight both the strengths and challenges of using ML for modelling microstructure.
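To make the light-cone analogy concrete, the sketch below shows one explicit real-space time step of a dimensionless Cahn-Hilliard equation, dc/dt = M ∇²(c³ − c − κ∇²c), on a periodic grid in Python. It illustrates only the conventional numerical scheme, not the GP emulator described in the talk; the grid size, the parameter values and the function names are assumptions chosen for the example.

```python
import numpy as np

def laplacian(f, dx=1.0):
    """Five-point periodic Laplacian: each value uses only its nearest neighbours."""
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0)
            + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4.0 * f) / dx**2

def cahn_hilliard_step(c, dt=0.01, dx=1.0, kappa=1.0, mobility=1.0):
    """One forward-Euler step of dc/dt = M * lap(c^3 - c - kappa * lap(c))."""
    chemical_potential = c**3 - c - kappa * laplacian(c, dx)
    return c + dt * mobility * laplacian(chemical_potential, dx)

# Hypothetical initial condition: small random fluctuations about a 50:50 blend.
rng = np.random.default_rng(0)
c0 = 0.05 * rng.standard_normal((32, 32))
c = c0.copy()
for _ in range(200):
    c = cahn_hilliard_step(c)
```

Because the update applies the Laplacian twice, the new value at a grid point depends only on points within two lattice steps of it at the previous time, which is the same locality that the light-cone construction exploits.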

Bio: Nigel joined the Department of Physics and Astronomy at the University of Sheffield as Professor of Soft Matter Theory in 2011, following over 10 years in the Department of Chemistry at Durham University. Nigel is also a University of Sheffield alumnus, having earned his first degree and PhD at the University in 1991 and 1994, respectively. Nigel’s experience spans a number of disciplines, including physics, mathematics, material sciences and chemistry. The range of his research contributions includes topics as diverse as current/voltage characteristics in organic photovoltaics, dynamics and structure in polymer nanocomposites, instabilities and pattern formation in thin films, mechanical properties of organo-gels and the cytoskeleton, blood flow through vein valves, phase separation and microstructure evolution in polymer blends and the coupling between phase transitions and flow. Nigel’s direct contributions encompass theoretical and computational science, simulations and experimental science.

Data driven models that predict protein function from sequence – Dr Lucy Colwell (University of Cambridge)
A central challenge is to predict functional properties of a protein from its sequence, and thus (i) discover new proteins with specific required functionality and (ii) better understand the functional effect of genomic mutations. Experimental breakthroughs in our ability to read and write DNA enable the data required to train and validate machine learning models that predict protein function from sequence to be rapidly acquired. Because in many cases phenotypic changes are controlled by more than one amino acid, the mutations that separate one phenotype from another may not be independent, requiring us to build models that take into account the correlation structure of the data. In this talk we show that such models rival the accuracy of existing hidden Markov models at sequence annotation, even when given relatively little training data. The representation of sequence space learned by the model can be used to build families that the model was not trained on. Finally, we report experimental confirmation that machine learning models can accurately identify variants of the AAV capsid protein that assemble integral capsids and package their genome, for potential application in gene therapy.

Bio: Lucy is a fellow at Clare College in the University of Cambridge, an Assistant Professor in the Chemistry Department, and the head of her group in the Centre for Molecular Informatics. She completed her PhD in 2010 in Applied Mathematics at Harvard University, where she then worked as a Postdoctoral Researcher in Applied Mathematics. Her main research interests are in making sense of data: how can we gain new insight and understanding from large bodies of data?

Multi-fidelity Statistical Learning Approach for Organic Molecular Crystal Structure Prediction – Dr Roohollah Hafizi & Dr Olga Egorova (University of Southampton)
The discovery of new crystalline materials is important in many application areas, including healthcare, energy generation and storage. The discovery of new crystal forms can be guided by computational methods for crystal structure prediction (CSP), which typically involves a global search of the lattice energy surface, followed by energy ranking of the local energy minima, which correspond to possible crystal structures. Because of the weak nature of packing forces in molecular crystals, the energy differences between structures are small and energy ranking should be performed using a high level of theory, such as hybrid-functional solid state density functional theory. However, the number of energy evaluations in a typical calculation is prohibitive, and a more efficient way of energy ranking is required. Here, we show that statistical learning can be applied to upgrade the energy ranking provided by efficient force field methods. We start from an affordable atomistic energy model to provide the distinct crystal structures and their energies, then collect more accurate energy data points at various levels using solid state quantum mechanical methods, and use them as our training set for learning the difference between different levels of theory. Gaussian process regression is applied to learn corrections to the energy differences between different levels of theory, using descriptors of local atomic environments to define similarities between crystal structures. The results demonstrate that high level energy ranking of structures can be achieved at low cost.
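The idea of learning a correction between two levels of theory can be sketched with a minimal Gaussian process regression in Python. This is an illustrative toy, not the authors' workflow: the one-dimensional descriptor, the synthetic "low-level" and "high-level" energies, the RBF kernel and all names below are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential similarity between two sets of descriptor vectors."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, length_scale=1.0, jitter=1e-6):
    """Posterior mean of a zero-mean GP; jitter keeps the solve well conditioned."""
    k_train = rbf_kernel(x_train, x_train, length_scale) + jitter * np.eye(len(x_train))
    weights = np.linalg.solve(k_train, y_train)
    return rbf_kernel(x_test, x_train, length_scale) @ weights

# Hypothetical data: 50 structures, each described by a single number.
descriptors = np.linspace(0.0, 5.0, 50)[:, None]
e_low = np.cos(2.0 * descriptors[:, 0])             # cheap model (synthetic)
e_high = e_low + 0.3 * np.sin(descriptors[:, 0])    # expensive model (synthetic)

# Only 10 structures are evaluated at the expensive level and used for training.
train = np.linspace(0, 49, 10).astype(int)
delta_pred = gp_predict(descriptors[train], (e_high - e_low)[train], descriptors)
e_corrected = e_low + delta_pred

mae_low = np.abs(e_low - e_high).mean()
mae_corrected = np.abs(e_corrected - e_high).mean()
```

Learning the difference rather than the high-level energy itself is useful when the correction varies more smoothly across structures than the energy does, so that few expensive evaluations are needed.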

Olga Bio: Olga is a Postdoctoral Research Fellow in Chemistry at the University of Southampton. She obtained her MSc and then PhD in Statistics from the University of Southampton in 2017, with the heart of her research lying in methodological developments for optimal experimental design. Then she gained experience in the area of consulting, mainly working on applying and adapting statistical methods and tools for data processing and process automation in infrastructure. In October 2018 she joined the project on Active Learning for Computational Polymorph Landscape Analysis, focusing on using a Bayesian approach and Gaussian processes for exploratory and optimisation problems in the crystal structure prediction framework.

Roohollah Bio: Roohollah is a Postdoctoral Research Fellow in Statistics at the University of Southampton. He obtained his Ph.D. in computational condensed matter physics from the Isfahan University of Technology in 2018. Electronic structure calculations in the framework of density functional theory are the underlying method of all his research, including inorganic material discovery, construction of machine-learned interatomic potentials and tight-binding models, and simulation of quantum transport properties of semiconductors. In February 2019, he joined the project on Active Learning for Computational Polymorph Landscape Analysis, focusing on efficiently sampling the landscape of molecular crystals at the GGA and hybrid levels of DFT using a combination of PW- and GTO-based codes.

Non-equilibrium Physics and Machine Learning – Professor Juan P. Garrahan (University of Nottingham)
I will discuss work at the interface of current questions in non-equilibrium physics and machine learning methods. I will focus on the general statistical mechanics issue of accessing and characterising rare dynamical events in stochastic systems. I will describe the connection between trajectory ensemble methods – often based on the mathematics of large deviations – and reinforcement learning (RL) in Markov decision processes (MDPs). I will explain how the problem of “making rare events typical” in a stochastic system corresponds to finding the optimal dynamics in an MDP. The results I will present illustrate the many possible synergies between statistical physics and machine and deep learning.

Bio: Juan P. Garrahan has held a Chair in Physics at the University of Nottingham since 2007. His research covers a broad area of theoretical statistical physics and its applications, with particular interests in the dynamics of complex and slow relaxing materials such as supercooled liquids and glasses, molecular self-assembly, quantum non-equilibrium systems, and the theory of large deviations. He obtained his PhD from the University of Buenos Aires in 1997, was Glasstone Fellow at the University of Oxford, an EPSRC Advanced Fellow, and a visiting professor at UC Berkeley in 2007. At the University of Nottingham he is currently the head of the Centre for Quantum Non-Equilibrium Systems (CQNE) and the director of the Machine Learning in Science Initiative (MLiS) of the Faculty of Science.

Materials Development in the Energy and Electronics Sectors through Combinatorial Synthesis, High-Throughput Screening and Machine Learning – Professor Brian Hayden (University of Southampton)
The combinatorial synthesis of solid-state materials combined with high-throughput characterization and screening provides an opportunity to develop increasingly large materials databases. Machine learning approaches are crucial in several aspects of the building, interpretation and exploitation of such databases. These can also be constructed to include physical and chemical descriptors from, for example, ab initio calculations. The challenge is to ensure an audited content and consistent format of the data. Examples of how machine learning is being developed in the interpretation of raw data sets are presented using data from high-throughput investigations of electrocatalysts mediating the oxygen reduction reaction (ORR) and oxygen evolution reaction (OER) for the development of reversible fuel cells and rechargeable metal-air batteries. The results provide an insight into the potential opportunities of machine learning in the future in the predictive development of functional materials.

Bio: Brian Hayden (FRSC, FIOP) obtained his PhD in Bristol in 1979, was a postdoctoral fellow at the Fritz Haber Institute of the Max Planck Society, Berlin, and was appointed lecturer at the University of Bath (1983) and Southampton (1988), where he was appointed to a Personal Chair in 1995. In 2000, he extended thin film MBE based methodologies to the combinatorial synthesis and high-throughput screening of materials. He is a founder (2004) and Chief Scientific Officer of Ilika plc, involved in materials discovery and development for the electronics and energy sectors. He is author of over 150 refereed papers (h-index 39) and 30 active patent families.

The Automation of Science: Robot Scientists for Chemistry and Biology – Professor Ross King (Chalmers Technical University)
A Robot Scientist is a physically implemented robotic system that applies techniques from artificial intelligence to execute cycles of automated scientific experimentation. A Robot Scientist can automatically execute cycles of hypothesis formation, selection of efficient experiments to discriminate between hypotheses, execution of experiments using laboratory automation equipment, and analysis of results. The motivation for developing Robot Scientists is both to better understand the scientific method and to make scientific research more efficient. The Robot Scientist ‘Adam’ was the first machine to autonomously discover scientific knowledge: it both formed and experimentally confirmed novel hypotheses. Adam worked in the domain of yeast functional genomics. The Robot Scientist ‘Eve’ was originally developed to automate early-stage drug development: active machine learning for Quantitative Structure Activity Relationship (QSAR) learning. More recently my colleagues and I have adapted Eve to work on yeast systems biology, and cancer. We argue that it is likely that advances in AI and lab automation will drive the development of ever-smarter Robot Scientists. The Physics Nobel laureate Frank Wilczek is on record as saying that in 100 years’ time the best physicist will be a machine. If this comes to pass it will transform our understanding of science and the Universe.

Bio: Ross D. King is Professor of Machine Intelligence at Chalmers Technical University, Sweden, and a Fellow of the Alan Turing Institute. He is one of the most experienced machine learning researchers in Europe. His main research interest is the interface between computer science and science. He originated the idea of a ‘Robot Scientist’: integrating AI and laboratory robotics to physically implement closed-loop scientific discovery. His Robot Scientist ‘Adam’ was the first machine to autonomously discover scientific knowledge. His Robot Scientist ‘Eve’ is currently searching for drugs against neglected tropical diseases, and cancer. This research has been published in top scientific journals, Science, Nature, etc. and has received wide publicity. He is currently developing a 3rd Generation Robot Scientist for yeast systems biology. He also originated the idea of implementing nondeterministic universal Turing machines using DNA, and he is developing special purpose DNA based hardware for NP-complete problems.

Deep Learning Enhanced Quantum Chemistry: Pushing the limits of Materials Discovery – Dr Reinhard Maurer (University of Warwick)
Atomistic simulation based on quantum mechanics (QM) is currently being revolutionized by artificial intelligence and machine-learning (ML) methods. This involves approaches to efficiently predict materials and molecules with specific properties within the vast space of possible chemical compounds. It also involves efficient regression in high-dimensional parameter spaces to accelerate computationally demanding quantum chemical calculations of molecular properties such as the thermodynamic stability or spectroscopic signatures while retaining the predictive power of QM. Most previous approaches have used ML to predict measurable observables that arise from the QM wave function of molecules. However, all properties derive from the wave function, so an AI model able to predict the wave function has the potential to predict all molecular properties. In this talk, I will explore ML approaches to directly represent wave functions for the purpose of developing methods that use AI and quantum chemistry in synergy. After presenting approaches to encode physical symmetries into deep learning infrastructures, I will present our recent efforts to use data-driven deep learning to develop a highly efficient Density-Functional Tight-binding simulation method to describe hybrid metal-organic materials.

Bio: Reinhard Maurer is an Assistant Professor in Computational Chemistry at the University of Warwick. He obtained his PhD in Theoretical Chemistry at the Technical University Munich in 2014, and then worked as a Postdoctoral Fellow there for 7 months before moving to be a Postdoctoral Associate at Yale University until 2017. His research focuses on the theory and simulation of molecular reactions on surfaces and in materials. He studies the structure, composition, and reactivity of molecules interacting with solid surfaces. Using quantum mechanical simulation methods, such as Density-Functional Theory, his goal is to find a detailed understanding of the explicit molecular-level dynamics of molecular reactions as they appear in catalysis, photochemistry and nanotechnology. His method development efforts target the efficient simulation of nonadiabatic and quantum effects in large surface-adsorbate systems, complex surface nanostructures, and gas-surface dynamics.

Learning with Complex Priors and Interactions – Professor John Shawe-Taylor (University College London)
We have seen remarkable success in learning from large quantities of raw data. For many scenarios this seems to represent a wasteful approach, in that we should be able to leverage prior knowledge and models or a sequence of interactions in order to make learning more efficient. The talk will review work that has looked at methods and analysis of such approaches.

Bio: John is a professor at University College London (UK) where he is Director of the Centre for Computational Statistics and Machine Learning (CSML). His main research area is Statistical Learning Theory, but his contributions range from Neural Networks, to Machine Learning, to Graph Theory. John Shawe-Taylor obtained a PhD in Mathematics at Royal Holloway, University of London in 1986. He subsequently completed an MSc in the Foundations of Advanced Information Technology at Imperial College. He was promoted to Professor of Computing Science in 1996. He has published over 150 research papers. He moved to the University of Southampton in 2003 to lead the ISIS research group. He has been appointed the Director of the Centre for Computational Statistics and Machine Learning at University College, London from July 2006. He has coordinated a number of European wide projects investigating the theory and practice of Machine Learning, including the NeuroCOLT projects. He is currently the scientific coordinator of a Framework VI Network of Excellence in Pattern Analysis, Statistical Modelling and Computational Learning (PASCAL) involving 57 partners.

Predicting the Activity of Drug Candidates where there is No Target – Professor Matthew Todd (University College London)
The discovery of new antimalarial medicines with novel mechanisms of action is key to combating the increasing reports of resistance to our frontline treatments. The Open Source Malaria (OSM) consortium have been developing compounds (“Series 4”) which possess potent activity against Plasmodium falciparum in vitro and in vivo and have been suggested to act through the inhibition of PfATP4, an essential ion pump in the parasite membrane that regulates intracellular Na+ and H+ concentrations. This pump has not yet been crystallised, so in the absence of structural information about this target, a public competition was created to develop a model that would allow us to predict when compounds in Series 4 are likely to be active. In the first round in 2016, six participants used the open data collated by OSM to develop moderately predictive models using diverse methods. Notably all submitted models were available to all other participants in real time. Since then further bioactivity data have been acquired and machine learning methods have rapidly developed, so a second round of the competition is now underway. The best-performing models from this second round will be used to predict novel analogs in Series 4 that will be synthesised and evaluated against the parasite. As such the project will openly demonstrate the abilities of new machine learning algorithms in the prediction of active compounds where there is no confirmed target, frequently the central problem in phenotypic drug discovery.

Bio: Mat Todd obtained his PhD in organic chemistry from Cambridge University in 1999, was a Wellcome Trust postdoc at The University of California, Berkeley, a college fellow back at Cambridge University, a lecturer at Queen Mary, University of London and between 2005 and 2018 was at the School of Chemistry, The University of Sydney. He is now Chair of Drug Discovery at University College London. His research interests include the development of new ways to make molecules, particularly how to make chiral molecules with new catalysts. He is also interested in making metal complexes that do unusual things when they meet biological molecules or metal ions. His lab motto is “To make the right molecule in the right place at the right time”, and his students are currently trying to work out what this means. He has a significant interest in open science, and how it may be used to accelerate research, with particular emphasis on open source discovery of new medicines. He founded and currently leads the Open Source Malaria (OSM) and Open Source Mycetoma (MycetOS) consortia, and is a founder of a broader Open Source Pharma movement. In 2011 he was awarded a NSW Scientist of the Year award in the Emerging Research category for his work in open science and in 2012 the OSM consortium was awarded one of three Wellcome Trust/Google/PLoS Accelerating Science Awards. For his open source research, Mat was selected for the Medicine Maker’s Power List in 2017 and 2018. He is on the Editorial Boards of PLoS One, ChemistryOpen and Nature Scientific Reports.

The UKRI Review of Support for AI – Dr Renée Van de Locht (EPSRC)
There is growing interest in Artificial Intelligence from researchers and innovators, innovation funders, users of innovation, and Government, with particular debate around the role the UK can play in the development of AI technologies and their application. UKRI, and its research and innovation communities, have a key role to play in the development and application of world leading AI technologies. In order to provide a holistic approach to the support of AI research and innovation across UKRI, which spans the remit, portfolio, and community of UKRI, EPSRC is working with colleagues across UKRI to undertake a review of the AI landscape. This review aims to understand current UK support for AI research and innovation, and engage key stakeholders in identifying the opportunities and future directions for AI research and innovation in the UK.

Bio: Renée joined EPSRC in 2014 and has worked in the ICT Theme as a Senior Portfolio Manager since 2017. Renée coordinates the ICT Peer Review including standard mode panels. Renée manages the strategy of the following topics: Natural Language Processing and Artificial Intelligence.

Explainable AI and Scientific Discovery – Dr Richard Tomsett (IBM)
A broad variety of industries are interested in the potential of AI (particularly machine learning) technologies for supporting business decisions. However, many companies are hesitant to invest in machine learning systems as they are not easily interpretable. At least part of this hesitance is driven by regulations that require firms to be able to explain certain kinds of decision (for example, the so-called “right to an explanation” under GDPR). Such concerns have stimulated investment in “explainable AI” research, leading to an explosion in methods for explaining the behaviour of black-box machine learning models. In this talk, I will present an overview of recent work on explainable AI methods by both academic and industry research groups, how this work is being applied in business, and what these developments could mean for scientific research. In particular, I will discuss the potential application of explainable machine learning techniques as a tool for scientific discovery.

Bio: Richard works in the Emerging Technology team in IBM Research, where he specialises in AI and data science applications across a broad range of industries. He is also a researcher in the Distributed Analytics & Information Sciences International Technology Alliance, a 10-year program that brings together partners from government, academia and industry in the UK and US to conduct collaborative fundamental research on distributed analytics. His focus in this program is on interpretable and adversarial machine learning, particularly in distributed computing environments. Prior to joining IBM, Richard completed a PhD in theoretical neuroscience at Newcastle University, followed by a postdoctoral fellowship studying neural information processing at the Okinawa Institute of Science and Technology in Japan. He only occasionally misses the sub-tropical island lifestyle.

Deep Machine Learning of Quantum Chemical Hamiltonians – Professor David Yaron (Carnegie Mellon University)
The high computational cost of ab initio electronic structure calculations remains a challenge for computational design of molecules and materials. Semi-empirical models, such as Density Functional Tight Binding Theory (DFTB), can compute electronic structure at a greatly reduced cost. However, the accuracy of such models is insufficient for many applications. We will present a means to systematically improve the accuracy of such models while maintaining their low computational cost. The key enabling technology is implementation of the DFTB as a layer that can be integrated into deep learning models. This layer takes, as input, DFTB parameters generated from a standard deep learning network and generates, as output, electronic properties from self-consistent solutions of the DFTB model. The quantum chemical layer allows backpropagation, such that the system can be trained efficiently on data on electronic properties.

Bio: David Yaron is a Professor in the Department of Chemistry at Carnegie Mellon University where he develops quantum chemical methods for large systems, including especially organic materials for electronic and photophysical applications. Most recently, he has been working on ways to integrate machine learning into quantum chemical models.

Selected Submitted Speakers

Isometric classifications of periodic crystals – Dr Vitaliy Kurlin (University of Liverpool)
Solid crystalline materials (briefly, crystals) have numerous applications from high-temperature superconductors to gas capture. A periodic crystal is an infinite arrangement of atoms or molecules obtained by translating a unit cell (a non-rectangular box) in 3 independent directions. Crystal Structure Prediction (CSP) aims to discover a solid crystal that is based on a given chemical composition and has several target properties, most importantly the energy of its thermodynamic stability. Prof Sally Price (UCL) has summarised the state-of-the-art in CSP as “embarrassment of over-prediction”, because modern software outputs 1000s of simulated crystals without identifying the few most promising candidates for synthesising in a lab. The underlying problem is the enormous ambiguity of crystal representation via conventional unit cells, because many different unit cells can define identical (up to a rigid motion) or nearly identical crystals. Reduced cells compare crystals only exactly (giving an answer yes/no) without quantifying a similarity between crystals in a continuous way. We propose a new classification of crystals by geometric invariants that change continuously under perturbations (atomic vibrations). These invariants will provide a well-defined distance between crystals that can be used for visualising large datasets of simulated crystals by continuously varying a threshold for similarity. This work is joint with several colleagues from the groups of Prof Andy Cooper FRS and Prof Matt Rosseinsky FRS at the Materials Innovation Factory, University of Liverpool.

Bio: Vitaliy is a Computer Scientist at the Materials Innovation Factory in Liverpool, where he facilitates the collaboration between Chemists and Computer Scientists. He was awarded the Marie Curie International Incoming Fellowship (2005-2007) and the EPSRC grant “Persistent Topological Structures in Noisy Images” (2011-2013). In 2014-2016 he gained industrial experience through Knowledge Transfer Secondments in the Computer Vision group at Microsoft Research, Cambridge, UK. Since 2018 he has led the Liverpool team on a £2.8M EPSRC 5-year grant “Application-Driven Topological Data Analysis” (with Oxford and Swansea). His research group includes one postdoc and five PhD students working on applications of topology and geometry to Materials Science, Computer Vision and Climate.

Practical applications of deep learning to imputation of drug discovery data – Dr Benedict Irwin (Optibrium)
We describe a novel deep learning method for completing sparse data matrices that accepts both molecular descriptors and sparse experimental data as inputs to exploit the correlations between experimentally measured endpoints, as well as structure-activity relationships (SAR). The method can robustly estimate the confidence in each prediction and often greatly improves the accuracy of prediction over conventional quantitative SAR models. We describe practical applications to drug discovery, including pharma-scale collections comprised of up to one million compounds as well as smaller, project-specific data sets. We illustrate how the filling in of missing data, combined with the ability to focus on the most confident predictions, guides the selection of compounds and prioritisation of experimental resources in hit-to-lead and lead optimisation.

Bio: Benedict Irwin has been a senior scientist at Optibrium Ltd. since 2018. Before this he completed a theoretical physics PhD at the University of Cambridge (UK) in the theory of condensed matter group studying free energy methods in molecular simulation. He currently works on applying machine learning and deep learning methods to drug discovery data as well as designing mathematical solutions for drug discovery problems.

Dense periodic packings in the light of crystal structure prediction – Miloslav Torda (Leverhulme Research Centre for Functional Materials Design)
One of the methods in the design of new materials is to predict the crystal structure of a new compound from its molecular composition. This process involves generating many hypothetical structures based on lattice energy optimization. A different approach, based only on the geometry of a molecule, will be presented, along with its potential to speed up classical crystal structure prediction computations. Our preliminary results on the periodic packing of a geometric representation of the pentacene molecule using Monte-Carlo molecular dynamic simulations will also be shown. Limitations and downsides of the presented approach will be discussed, and future directions will be proposed.

Bio: Miloslav has been a PhD Student in Computer Science at the University of Liverpool since 2018, and is funded by the Leverhulme Research Centre for Functional Materials Design. Before that, Miloslav gained an MSc in Probability theory and Mathematical statistics at the Faculty of Mathematics, Physics and Informatics of the Comenius University in Bratislava. Miloslav’s current research is centred around exploring applications of discrete and computational geometry to problems in materials science, specifically crystal structure prediction.

Data Science and the Physical Sciences Data-Science Service – Dr Nicola Knight (University of Southampton)
The Physical Sciences Data-Science Service is a newly funded national research facility that provides access to high-quality curated data resources for UK researchers in the Physical Sciences. This talk introduces the new service and its data resources, and will include a discussion of community data usage.

Bio: Nicola Knight is an Enterprise Research Fellow at the University of Southampton working on the Physical Sciences Data-Science Service (PSDS). She completed her Master of Chemistry (MChem) at the University of Southampton before undertaking a PhD in Chemistry under the supervision of Professor Jeremy Frey. Her PhD focused on the interface between Chemistry and Computing, with research in chemical modelling, remote experiments and the implementation of IoT technology in scientific research. Nicola’s current research interests are chemical data and its use, both in the physical sciences community and beyond, and the use of computing within scientific labs and the research process.

Ellipsoids as a new descriptor for materials – Dr James Cumby (University of Edinburgh)
A key challenge in applying machine learning to crystalline materials is to generate materials ‘descriptors’ that accurately and concisely capture the atomic arrangements in a manner allowing comparisons between different structures. This is particularly true of extended solids, where the lack of a molecular boundary can pose a problem. Many of the existing approaches are based on atom-centred functions such as radial distributions or SOAP (smooth overlap of atomic positions) which, although useful, are not necessarily interpretable. Such methods can also produce huge feature spaces, which have negative consequences for machine learning applications where the amount of data is limited. As such, there is a need for new atom-centred descriptors (applicable to both periodic and non-periodic problems) which result in few parameters. This talk will focus on short-range coordination environments as the building blocks of materials, and discuss a new method based on minimum-bounding ellipsoids for comparing different bonding environments on an equal scale. The method can also quantify the distortions that are commonly present in such coordination polyhedra and that can give rise to many important physical properties such as bulk polarisation or magnetic ordering. This ellipsoidal approach has so far been applied to metal oxides, revealing an understanding of ferroelectric phase transitions and a potential new area in multiferroic materials, but is more generally applicable to any atomic environment where the geometry of nearest neighbours can be defined.
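
As a rough illustration of what a minimum-bounding ellipsoid is, the sketch below fits the smallest-volume ellipsoid enclosing a set of neighbour positions using Khachiyan's iterative algorithm. It is not taken from the talk and is not the speaker's implementation: the function name `min_bounding_ellipsoid`, the tolerance, and the example coordinates (a slightly distorted octahedron of six neighbours) are all assumptions chosen for illustration. The sketch also assumes the points are not all coplanar.

```python
import numpy as np


def min_bounding_ellipsoid(points, tol=1e-6, max_iter=1000):
    """Approximate the minimum-volume enclosing ellipsoid (Khachiyan's algorithm).

    Returns (centre, A, radii) such that every point x satisfies
    (x - centre)^T A (x - centre) <= 1 (up to the tolerance).
    Hypothetical helper for illustration; assumes non-coplanar points.
    """
    P = np.asarray(points, dtype=float)
    n, d = P.shape
    Q = np.vstack([P.T, np.ones(n)])      # lift points to d+1 dimensions
    u = np.full(n, 1.0 / n)               # one weight per point, summing to 1

    for _ in range(max_iter):
        X = Q @ np.diag(u) @ Q.T
        # M[i] = q_i^T X^{-1} q_i for every lifted point q_i
        M = np.einsum("ij,ji->i", Q.T @ np.linalg.inv(X), Q)
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break

    centre = P.T @ u
    cov = P.T @ np.diag(u) @ P - np.outer(centre, centre)
    A = np.linalg.inv(cov) / d
    # Semi-axis lengths are 1/sqrt of the eigenvalues of A
    radii = 1.0 / np.sqrt(np.linalg.eigvalsh(A))
    return centre, A, radii


# Hypothetical example: six neighbours forming a distorted octahedron
# around a central atom at the origin (arbitrary length units).
ligands = np.array([
    [2.2, 0.0, 0.0], [-2.2, 0.0, 0.0],
    [0.0, 2.0, 0.0], [0.0, -2.0, 0.0],
    [0.0, 0.0, 1.9], [0.0, 0.0, -1.9],
])
centre, A, radii = min_bounding_ellipsoid(ligands)
print("centre:", centre)
print("semi-axes:", radii)
```

In this toy case the three semi-axis lengths summarise the environment in a handful of numbers, and their spread gives one simple measure of how far the polyhedron departs from a regular octahedron. Which quantities the method presented in the talk actually derives from the ellipsoid is not specified here.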

Bio: James Cumby is a Lecturer in Inorganic Chemistry at the University of Edinburgh. He completed his PhD in Materials Chemistry at the University of Birmingham in 2014. His work examines the complex electronic, ionic and magnetic behaviour in metal oxide-fluoride materials. He is particularly interested in combining experimental techniques with data-driven approaches to discover new functional materials.