This paper was recommended by my colleague Eunji. She says it is well written and has had a huge influence on current molecular generation models.
Since I personally don’t have a strong background in molecular generation, I will focus on introducing the motivation, the background, and the diffusion part.
Overall, molecular generation is about generating a molecule that performs some function, for example insulin, or an enzyme that facilitates a reaction.
In this paper, the motivation is to modify an original large molecule so that it is more stable and easier to produce.
Note that the modification is not made by adding or cutting small chemical groups on the original molecule. The modification, or “prediction”, is in my opinion made by generating the 3D geometry of the molecule using the prior knowledge we have about it. In other words, we can draw the chemical formula on 2D paper just like in high school, but what we don’t know is the angle between two groups. This shape is crucial both for keeping the function of the molecule and for production, since we want the most stable shape that still keeps the function of the molecule.
So far, we understand that the task of this paper is actually predicting the shape of the molecule so that it is stable and retains its original function.
However, there is no explicit function that we can apply to find this shape. Besides that, iterating through all possible shapes of a large molecule is intractable. Therefore, the pipeline here is that we first use a computational method to generate a set of plausible shapes. Their stability can be approximately assessed through the so-called Boltzmann distribution from statistical mechanics. We then select some molecules to run simulations on. According to the simulation results, we then select some of the best candidates for real chemical experiments.
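To make the stability step concrete, here is a minimal sketch of Boltzmann weighting, assuming each candidate shape already has an energy from some simulation. The function name `boltzmann_weights`, the energies, and the temperature are all hypothetical and only for illustration; this is not the paper's code.

```python
import math

# Gas constant in kcal/(mol*K), i.e. the Boltzmann constant per mole.
R_KCAL_PER_MOL_K = 0.0019872041

def boltzmann_weights(energies, temperature=298.15):
    """Return normalized Boltzmann weights for energies given in kcal/mol."""
    beta = 1.0 / (R_KCAL_PER_MOL_K * temperature)
    # Subtracting the minimum energy avoids overflow and leaves the
    # normalized weights unchanged.
    e_min = min(energies)
    unnormalized = [math.exp(-beta * (e - e_min)) for e in energies]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]

# Hypothetical energies (kcal/mol) of three candidate shapes.
energies = [0.0, 0.5, 2.0]
weights = boltzmann_weights(energies)

# Rank candidates from most to least probable (lowest energy first).
ranking = sorted(range(len(energies)), key=lambda i: weights[i], reverse=True)
```

Lower-energy shapes get larger weights, so a ranking like this is one plausible way to pick which candidates go on to the more expensive simulation stage.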
For this paper, we only focus on the first stage, which is generating a reasonable number of plausible molecular shapes.
Notation and Problem Definition
We represent the molecule as a graph $\mathcal{G} = \langle\mathcal{V}, \mathcal{E}\rangle$,
where $\mathcal{V}$ is the set of nodes (vertices) and $\mathcal{E}$ is the set of edges.
$\mathcal{V}$ represents the atoms and $\mathcal{E}$ represents the inter-atomic bonds. Note that we use virtual edges to connect pairs of atoms that are not bonded. Therefore, $\mathcal{E} \subseteq \mathcal{V}\times\mathcal{V}$.
In this paper, the authors capture the differences in angles by representing the location of every atom with 3D coordinates, collected in a matrix $\mathcal{C} = [c_1, c_2, \ldots, c_n] \in \mathbb{R}^{n\times3}$.
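As a small illustration of this notation, the sketch below stores a molecule as atoms, edges, and an $n \times 3$ coordinate list, and recovers an angle from the coordinates. Water is my own toy example, the coordinates are approximate, and the names `atoms`, `edges`, `coords`, and `angle_degrees` are hypothetical; none of this comes from the paper.

```python
import math

# Vertex set V: one entry per atom (water, as a toy example).
atoms = ["O", "H", "H"]

# Edge set E: pairs of atom indices. "virtual" marks a pair that is
# not chemically bonded but is still connected in the graph.
edges = {
    (0, 1): "bond",
    (0, 2): "bond",
    (1, 2): "virtual",
}

# Conformation C: one 3D coordinate per atom (approximate, in angstroms).
coords = [
    [0.0000, 0.0000, 0.0],
    [0.9572, 0.0000, 0.0],
    [-0.2400, 0.9266, 0.0],
]

def angle_degrees(coords, i, j, k):
    """Angle at atom j formed by atoms i-j-k, in degrees."""
    u = [a - b for a, b in zip(coords[i], coords[j])]
    v = [a - b for a, b in zip(coords[k], coords[j])]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# The H-O-H angle is not visible in the 2D drawing, but the
# coordinates determine it.
hoh_angle = angle_degrees(coords, 1, 0, 2)
```

The point of the example is that once the coordinates are fixed, every angle and distance in the molecule is fixed too, which is why generating the coordinate matrix is enough to describe the shape.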
Here we define the problem more formally. We call this task “molecular conformation generation”. We are given multiple graphs $\mathcal{G}$, and for each $\mathcal{G}$ we have conformations $\mathcal{C}$ that follow its underlying Boltzmann distribution. The underlying Boltzmann distribution is, roughly, the distribution over the stable conformations of a particular graph. We are interested in generating stable conformations, so we approximate this distribution with a model $p_\theta(\mathcal{C}|\mathcal{G})$ with parameters $\theta$. We use a diffusion model for this approximation.
Equivariance
We will introduce them in the following order: the basic method, then the equivariant reverse generative process.
The procedure is easy to see from this figure; it is the diffusion procedure, just in a “molecular” manner.
The left one, $\mathcal{C}^T$, is the very noisy version of the molecular shape described by the matrix $\mathcal{C}$ (the superscript $T$ is the final diffusion step, not a transpose). As we gradually denoise it, we recover the so-called original form, or ground-truth form, of the coordinates.
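To show what the noisy end of this picture looks like, here is a minimal sketch of the noising direction only, assuming the standard closed form used in DDPM-style diffusion models and a linear noise schedule. The schedule values, the function names, and the toy coordinates are my assumptions, not details taken from the paper. The denoising direction needs a trained network and is not shown.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Noise levels beta_1..beta_T, linearly spaced (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alpha_bar(betas):
    """Running product of (1 - beta_t); shrinks toward 0 as t grows."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noise_coordinates(coords, alpha_bar_t, rng):
    """Sample a noisy C^t from clean coordinates in one shot:
    C^t = sqrt(alpha_bar_t) * C^0 + sqrt(1 - alpha_bar_t) * eps."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [[signal * x + noise * rng.gauss(0.0, 1.0) for x in atom]
            for atom in coords]

# Toy clean coordinates (n x 3); values are illustrative only.
clean = [[0.0, 0.0, 0.0], [0.9572, 0.0, 0.0], [-0.24, 0.9266, 0.0]]

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha_bar(betas)
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# At the last step almost no signal is left: C^T is close to pure noise.
noisy_last = noise_coordinates(clean, alpha_bars[-1], rng)
```

With this schedule the last value of `alpha_bars` is very close to zero, so the final sample keeps almost nothing of the original shape, which matches the very noisy left end of the figure.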
The advantage of denoising for molecules is very interesting. This is drawn from the paper and from some thermodynamics papers. The diffusion process in the microscopic world corresponds to increasing the temperature of the environment. Every part of the molecule then starts to move irregularly, and this motion can be represented by a Gaussian distribution! Here we find the origin of the diffusion model. When the temperature falls back, the motion settles down, the uncertainty clearly decreases, and the molecule takes on a structure again. In CS we call this part denoising. In the microscopic world, however, we have a distribution, and most of the compounds we generate are actually stable ones.
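The heating picture above lines up, as far as I can tell, with the standard Gaussian noising step of diffusion models. As a sketch, assuming a noise level $\beta_t \in (0, 1)$ at step $t$ and the identity matrix $I$ (symbols that do not appear elsewhere in these notes), one step of adding noise to the coordinates can be written as:

$$q(\mathcal{C}^t \mid \mathcal{C}^{t-1}) = \mathcal{N}\left(\mathcal{C}^t;\ \sqrt{1-\beta_t}\,\mathcal{C}^{t-1},\ \beta_t I\right), \qquad t = 1, \ldots, T.$$

Loosely speaking, a larger $\beta_t$ plays the role of a higher temperature: each step adds more random motion to every atom.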