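In 1972, Christian Anfinsen postulated that a protein's amino acid sequence should fully determine its structure. Researchers have spent the five decades since trying to predict that structure from the sequence alone.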
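The CASP competition scores predictions with GDT (Global Distance Test). Roughly speaking, GDT is the percentage of amino acid residues (beads in the protein chain) whose predicted position lies within a threshold distance of the true position. A score above about 90 is informally regarded as competitive with experimental methods, which is to say the problem can be considered solved, and AlphaFold2 reached that level.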
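
To make the GDT definition concrete, here is a minimal Python sketch. It makes several assumptions that are not in the text above: the function name `gdt_ts` is hypothetical, each residue is represented by a single 3D point, the two structures are already superimposed, and the score is averaged over cutoffs of 1, 2, 4 and 8 Å (the commonly used GDT_TS variant). The real CASP evaluation also searches over superpositions, which this sketch skips.

```python
import math


def gdt_ts(predicted, reference, thresholds=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS: average, over the distance cutoffs, of the
    percentage of residues whose predicted position is within the cutoff
    of the reference position.

    Assumes both structures are already superimposed and that each
    residue is one (x, y, z) point in angstroms.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")

    # Per-residue distance between prediction and reference.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues within each cutoff.
    fractions = [
        sum(d <= t for d in distances) / len(distances) for t in thresholds
    ]
    return 100.0 * sum(fractions) / len(thresholds)


# Toy example: four residues, with prediction errors of 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
print(gdt_ts(predicted, reference))  # 56.25
```

In the toy example, 1, 2, 3 and 3 of the four residues fall within the 1, 2, 4 and 8 Å cutoffs, so the score is the mean of 25, 50, 75 and 75.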

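The model is an end-to-end, attention-based neural network system. It treats a folded protein as a spatial graph, with residues as nodes and edges between residues that lie close together, and it reasons over this graph, which it builds implicitly. To refine the graph it uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs.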
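
One way to picture the graph mentioned above is to treat residues as nodes and connect pairs that sit close together in space. The sketch below is only an illustration of that idea, not AlphaFold2's method: the model's graph is implicit, whereas this code builds one explicitly from known coordinates. The function name `contact_graph` and the 8 Å cutoff are assumptions chosen for the example.

```python
import math


def contact_graph(coords, cutoff=8.0):
    """Return the set of edges (i, j), i < j, between residues whose
    positions are within `cutoff` angstroms of each other.

    `coords` holds one (x, y, z) point per residue. The cutoff value is
    an illustrative assumption.
    """
    edges = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges.add((i, j))
    return edges


# Toy chain: three residues close together and one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(sorted(contact_graph(coords)))  # [(0, 1), (0, 2), (1, 2)]
```

The fourth residue ends up with no edges because it is more than 8 Å from every other residue.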

Caption of the figure in the original post: An overview of the main neural network model architecture. The model operates over evolutionarily related protein sequences as well as amino acid residue pairs, iteratively passing information between both representations to generate a structure.