I spent the summer of 2018 at EPFL as a research intern under Professor Victor Panaretos, the chair of mathematical statistics. I analyzed and identified the shortcomings of two of the leading state-of-the-art algorithms for molecular structure estimation: RELION and cryoSPARC. I derived the formulas of the expectation-maximization algorithm used in RELION and of the stochastic gradient descent (SGD) algorithm used in cryoSPARC. To verify the analysis, I also implemented both algorithms for estimating 2D structures instead of 3D structures. In the process, I developed an efficient yet accurate algorithm for the Radon transform and its corresponding back-projection operation, entirely in the Fourier domain.
Mathematical details of the algorithm
For the derivation of all the formulas, including how SGD is implemented and how the expectation-maximization algorithm is formulated, see reports/cryosparc-algorithm for cryoSPARC and reports/math-relion.pdf for RELION.
The entire repository can be found over here.
The cryoSPARC algorithm resides in scripts/cryoSPARC.m, which produces an initial estimate of the image by iterating stochastic gradient descent steps toward the correct structure.
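As a rough illustration of the gradient-descent idea, here is a minimal Python sketch. The repository's scripts are MATLAB, and this is not a port of scripts/cryoSPARC.m. It assumes a toy setting that is not taken from the reports: orientations are restricted to the four quarter-turns so that rotations are exact on the pixel grid, projections are column sums, noise is Gaussian with a known sigma, and the orientation prior is uniform. All function names and parameter values are hypothetical.

```python
import math
import random

def rotate(img, k):
    """Rotate a square image by k quarter-turns (exact on the pixel grid)."""
    n = len(img)
    for _ in range(k % 4):
        img = [[img[j][n - 1 - i] for j in range(n)] for i in range(n)]
    return img

def project(img, k):
    """Toy projection: rotate by k quarter-turns, then sum each column."""
    return [sum(col) for col in zip(*rotate(img, k))]

def backproject(residual, k):
    """Adjoint of project: smear the 1D residual down the columns, undo the rotation."""
    smeared = [list(residual) for _ in residual]
    return rotate(smeared, -k % 4)

def responsibilities(image, observed, sigma):
    """Posterior weight of each quarter-turn, the residuals, and the log marginal."""
    residuals = [[p - o for p, o in zip(project(image, k), observed)] for k in range(4)]
    log_w = [-sum(r * r for r in res) / (2.0 * sigma ** 2) for res in residuals]
    top = max(log_w)                      # subtract the max before exponentiating
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w], residuals, top + math.log(total / 4.0)

def neg_log_likelihood(image, data, sigma):
    """Negative log-likelihood marginalised over orientations (up to a constant)."""
    return -sum(responsibilities(image, d, sigma)[2] for d in data)

def gradient(image, batch, sigma):
    """Gradient of neg_log_likelihood over the batch, with respect to the image."""
    n = len(image)
    grad = [[0.0] * n for _ in range(n)]
    for observed in batch:
        w, residuals, _ = responsibilities(image, observed, sigma)
        for k in range(4):
            smeared = backproject(residuals[k], k)
            for y in range(n):
                for x in range(n):
                    grad[y][x] += w[k] * smeared[y][x] / sigma ** 2
    return grad

def sgd(data, n, steps=300, batch_size=4, lr=0.05, sigma=1.0, seed=0):
    """Mini-batch SGD from a random start (an all-zero start stays rotation-symmetric)."""
    rng = random.Random(seed)
    image = [[rng.random() for _ in range(n)] for _ in range(n)]
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        grad = gradient(image, batch, sigma)
        image = [[image[y][x] - lr * grad[y][x] / batch_size for x in range(n)]
                 for y in range(n)]
    return image

# Hypothetical test image and noisy projections at unknown (to the solver) orientations.
truth = [[0, 1, 2, 0], [3, 4, 1, 0], [0, 2, 5, 1], [1, 0, 0, 3]]
noise = random.Random(1)
data = [[p + noise.gauss(0.0, 1.0) for p in project(truth, i % 4)] for i in range(16)]
estimate = sgd(data, n=4)
```

The estimate can only be determined up to a rotation, and with so few projection directions many images explain the data equally well, so this sketch makes no claim about recovering `truth`. Comparing `neg_log_likelihood` before and after a run is one way to monitor progress.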
The RELION algorithm resides in scripts/MAP2D.m. Given the projections of the image, it correctly estimates the orientation of each projection and then reconstructs the image using the formulae derived for the expectation-maximization algorithm.
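To make the orientation-estimation step more concrete, here is a minimal Python sketch of the posterior over orientations for a single projection. It is an illustration under stated assumptions, not the code in scripts/MAP2D.m (which is MATLAB): only the four quarter-turn orientations are considered, projections are column sums, noise is Gaussian with a known sigma, and the prior over orientations is uniform. The function names and the test image are hypothetical.

```python
import math

def rotate90(img):
    """Rotate a square image by one quarter-turn (exact on the pixel grid)."""
    n = len(img)
    return [[img[j][n - 1 - i] for j in range(n)] for i in range(n)]

def project(img, k):
    """Toy projection: apply k quarter-turns, then sum each column."""
    for _ in range(k % 4):
        img = rotate90(img)
    return [sum(col) for col in zip(*img)]

def orientation_posterior(observed, estimate, sigma):
    """Posterior probability of each quarter-turn given one observed projection."""
    log_w = []
    for k in range(4):
        predicted = project(estimate, k)
        sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        log_w.append(-sq_err / (2.0 * sigma ** 2))   # Gaussian log-likelihood
    top = max(log_w)                                  # log-sum-exp for stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

# Hypothetical current estimate; the observation is its noiseless projection
# after two quarter-turns.
estimate = [[0, 1, 2, 0], [3, 4, 1, 0], [0, 2, 5, 1], [1, 0, 0, 3]]
observed = project(estimate, 2)
posterior = orientation_posterior(observed, estimate, sigma=1.0)
most_likely = max(range(4), key=lambda k: posterior[k])
```

In an expectation-maximization loop, weights like these would be recomputed for every projection in each iteration and then used to weight each projection's contribution to the updated image.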
The back-projection and projection algorithms have been implemented in scripts/backproject_fourier_alternate.m and scripts/project_fourier_alternate.m, respectively. The probability of each projection having a particular orientation is calculated by scripts/calc_prob_for_each_orientation.m.
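Projection in the Fourier domain rests on the Fourier slice theorem: the 1D Fourier transform of a projection equals a central slice of the image's 2D Fourier transform. The following Python sketch checks this numerically for the one case where no interpolation is needed, a projection along the vertical axis. It is only a demonstration of the principle with hypothetical names and a made-up test image, not a description of what the MATLAB scripts do; slices at arbitrary angles fall between grid points and need an interpolation scheme.

```python
import cmath

def dft(signal):
    """Naive 1D discrete Fourier transform (O(n^2), fine for a tiny example)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def dft2(img):
    """2D DFT: transform every row, then every column of the result."""
    rows = [dft(row) for row in img]
    cols = [dft(list(col)) for col in zip(*rows)]   # indexed [u][v]
    return [list(row) for row in zip(*cols)]        # back to [v][u] layout

def project_columns(img):
    """Projection along the vertical axis: sum each column."""
    return [sum(col) for col in zip(*img)]

# Hypothetical 4x4 test image.
image = [[0, 1, 2, 0], [3, 4, 1, 0], [0, 2, 5, 1], [1, 0, 0, 3]]

central_slice = dft2(image)[0]                      # zero vertical frequency
slice_from_projection = dft(project_columns(image))

max_error = max(abs(a - b) for a, b in zip(central_slice, slice_from_projection))
```

Here `max_error` should be at the level of floating-point round-off. Back-projection runs the same relation in reverse: the transform of each projection is written into the corresponding slice of the 2D spectrum before inverting.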
For a more detailed overview, please refer to the comprehensive review over here.
This is the structure of the original image we are trying to estimate.
This is what the successive iterations of RELION look like -
The iterations in the ab-initio reconstruction of the object, using my implementation of cryoSPARC, look like this -
This work would not have been possible without the guidance of Professor Victor Panaretos and the resources provided to me by EPFL. Some of the references used in this work are -
- A Bayesian View on Cryo-EM Structure Determination
- RELION: Implementation of a Bayesian approach to cryo-EM structure determination
- For an idea about how to go about the projection and back-projection operation - Direct Fourier Reconstruction of a Tomographic Slice
- For the interpolation-scheme using least-squares approach - inpaint_nans