A hybrid AI system beats AlphaFold at predicting complex protein structures
D-I-TASSER combines deep learning with physics-based simulations to achieve superior accuracy on difficult proteins and multi-domain structures.

A new hybrid artificial intelligence system that combines deep learning with traditional physics-based modeling has outperformed the well-known AlphaFold system on some of the hardest problems in protein structure prediction. The method, called D-I-TASSER, departs from purely AI-driven approaches by integrating predictions from multiple deep learning sources with classical physics-based simulations.
D-I-TASSER achieved the highest modeling accuracy in both single-domain and multidomain structure prediction categories in the most recent community-wide CASP15 experiment, with average accuracy scores 18.6% and 29.2% higher than AlphaFold2 on the most challenging protein targets. The research, published today in Nature Biotechnology, demonstrates that combining AI with physics-based approaches can push protein prediction beyond what either method achieves alone.
The system was developed by a team led by Yang Zhang at the University of Michigan, building on decades of work with the original I-TASSER protein folding platform. D-I-TASSER introduces a domain splitting and assembly protocol for the automated modeling of large multidomain protein structures, addressing a key limitation in current prediction methods.
Unlike AlphaFold, which relies primarily on end-to-end deep learning, D-I-TASSER takes a hybrid approach. It uses multiple neural networks to predict spatial relationships between amino acids, then employs Monte Carlo simulations to physically assemble these predictions into complete protein structures. The pipeline couples multisource deep learning features, including contact/distance maps and hydrogen-bonding networks, with cutting-edge iterative threading assembly simulations for atomic-level protein tertiary structure modeling.
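The general idea of assembling a structure from predicted distances with Monte Carlo moves can be illustrated with a toy example. The sketch below is not D-I-TASSER's algorithm or force field: it is a minimal Metropolis Monte Carlo loop that moves points in 3D until their pairwise distances match a set of "predicted" distances. All names (`restraint_energy`, `metropolis_fold`) and parameters (temperature, step size, number of steps) are hypothetical choices made for illustration.

```python
import math
import random


def restraint_energy(coords, target):
    """Sum of squared differences between model and predicted pair distances."""
    total = 0.0
    for (i, j), d_pred in target.items():
        d_model = math.dist(coords[i], coords[j])
        total += (d_model - d_pred) ** 2
    return total


def metropolis_fold(target, n_points, steps=5000, temperature=0.1,
                    step_size=0.5, seed=0):
    """Toy Metropolis search for coordinates that satisfy distance restraints.

    target maps index pairs (i, j) to a predicted distance.
    Returns the lowest-energy coordinates seen and their energy.
    """
    rng = random.Random(seed)
    # Start from random positions (an arbitrary choice for this sketch).
    coords = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(n_points)]
    energy = restraint_energy(coords, target)
    best_coords = [c[:] for c in coords]
    best_energy = energy

    for _ in range(steps):
        i = rng.randrange(n_points)
        old = coords[i][:]
        # Propose a small random displacement of one point.
        coords[i] = [x + rng.gauss(0.0, step_size) for x in old]
        new_energy = restraint_energy(coords, target)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_energy = energy
                best_coords = [c[:] for c in coords]
        else:
            coords[i] = old  # reject the move and restore the old position

    return best_coords, best_energy


if __name__ == "__main__":
    # Three points with made-up "predicted" distances forming a triangle.
    predicted = {(0, 1): 3.8, (1, 2): 3.8, (0, 2): 5.5}
    _, final_energy = metropolis_fold(predicted, n_points=3)
    print(f"final restraint energy: {final_energy:.4f}")
```

A real pipeline would combine such restraints with a physics- and knowledge-based energy function and a far more sophisticated sampling scheme; the toy version only shows how predicted distances can steer a stochastic search.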
One key advantage of this approach is its ability to handle complex, multi-domain proteins—structures that contain multiple functional units that must fold and interact correctly. Two-thirds of prokaryotic proteins and four-fifths of eukaryotic proteins incorporate multiple domains and execute higher-level functions through domain–domain interactions. Most current AI methods, including AlphaFold, struggle with these complex structures because they were primarily designed for single-domain proteins.
The D-I-TASSER team tested their method extensively, including on a challenging set of 500 proteins with no similar structures in existing databases. D-I-TASSER achieved an average template modeling (TM) score of 0.870, which is 108% and 53% higher than the previous I-TASSER-based pipelines, and it outperformed AlphaFold2 on 84% of the test cases.
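For readers unfamiliar with the metric, the TM score is a length-normalized measure of structural similarity that ranges from 0 to 1, with higher values indicating a closer match. The sketch below uses the standard published formula, which this article does not spell out, and assumes the model has already been superimposed on the reference structure; the full metric additionally searches for the superposition that maximizes the score. The function names are illustrative.

```python
def d0_for_length(target_length):
    """Length-dependent distance scale (in angstroms) used by the TM score."""
    if target_length <= 15:
        raise ValueError("this sketch only handles targets longer than 15 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length):
    """TM score for one fixed superposition.

    distances: distance in angstroms between each aligned residue pair
    target_length: number of residues in the reference (target) structure
    """
    d0 = d0_for_length(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # Hypothetical 100-residue target where every aligned pair is 2 angstroms apart.
    print(round(tm_score([2.0] * 100, 100), 3))
```

Because each residue contributes at most 1 and the sum is divided by the target length, unaligned or badly placed residues pull the score down, which is why the metric is less sensitive to protein size than a raw RMSD.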
Where D-I-TASSER really shines is with the most difficult protein targets. For the 148 more difficult domains, where at least one of the methods performed poorly, the TM score difference is dramatic (0.707 for D-I-TASSER versus 0.598 for AlphaFold2). This suggests that while AlphaFold excels at routine predictions, hybrid approaches may be necessary for the hardest cases.
The method's success in the blind CASP15 experiment was particularly striking. D-I-TASSER outperformed all other groups in terms of the sums of z scores, achieving cumulative z scores of 67.20 and 35.53, which were 2- and 16-fold higher than the performance of AlphaFold2 for the domains and multidomain targets, respectively.
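A cumulative z score of this kind can be sketched as follows. For each target, every group's score is converted to a z score relative to all groups on that target, and the z scores are then summed per group. The article does not give the exact CASP protocol, so the details here are assumptions: the population standard deviation is used, and negative z scores are floored at zero so that a single bad prediction does not cancel out good ones. The data layout and function name are hypothetical.

```python
from statistics import mean, pstdev


def summed_z_scores(scores, floor=0.0):
    """Sum per-target z scores for each group.

    scores: {target_id: {group_name: score}}
    floor:  lower bound applied to each z score (an assumption of this sketch)
    """
    totals = {}
    for by_group in scores.values():
        values = list(by_group.values())
        mu = mean(values)
        sigma = pstdev(values)
        for group, score in by_group.items():
            # If all groups tie on a target, nobody gains or loses.
            z = 0.0 if sigma == 0 else (score - mu) / sigma
            totals[group] = totals.get(group, 0.0) + max(z, floor)
    return totals


if __name__ == "__main__":
    # Made-up scores for three groups on two targets.
    example = {
        "T1": {"A": 0.9, "B": 0.5, "C": 0.1},
        "T2": {"A": 0.8, "B": 0.8, "C": 0.2},
    }
    print(summed_z_scores(example))
```

Summing z scores rewards groups that beat the field by a wide margin on hard targets, which is one reason a method that excels on difficult cases can lead this ranking by a large factor.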
Beyond individual protein structures, the team applied D-I-TASSER to the structural modeling of the entire human proteome, roughly 20,000 proteins, which resulted in a larger coverage of foldable sequences compared to the recently released AlphaFold Structure Database. This genome-wide analysis revealed interesting patterns, including protein clusters on specific chromosomes that were particularly challenging to model.
The research has important implications for drug discovery and disease research. Following the sequence-to-structure-to-function paradigm, the team further applied the well-established COFACTOR protocol to annotate biological functions of the human genome based on the D-I-TASSER-predicted models. They found that human proteins are most commonly involved in oxidation-reduction processes and frequently bind ATP and iron-sulfur clusters.
However, D-I-TASSER is not without limitations. The method is considerably slower than AlphaFold, taking an average of 8.2 hours compared to AlphaFold's 1.2 hours, although it needs less memory (20 GB versus 60 GB). Its accuracy also depends on the quality of the multiple sequence alignments: for targets with very few homologous sequences (Neff < 1), D-I-TASSER achieves an average TM score of 0.67.
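The number of effective sequences (Neff) measures how much independent information a multiple sequence alignment contains. The article does not say which definition D-I-TASSER uses, so the sketch below assumes one definition found in the literature: each sequence is down-weighted by the number of other sequences that are at least 80% identical to it, and the sum of weights is divided by the square root of the alignment length. The identity measure and function names are simplifications chosen for illustration.

```python
import math


def sequence_identity(a, b):
    """Fraction of alignment columns where both sequences share the same residue."""
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def neff(msa, identity_cutoff=0.8):
    """Normalized number of effective sequences for an aligned set of sequences.

    msa: list of equal-length aligned sequences (gaps written as '-')
    """
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    total_weight = 0.0
    for n, seq_n in enumerate(msa):
        # Count other sequences that are near-duplicates of this one.
        neighbours = sum(
            1
            for m, seq_m in enumerate(msa)
            if m != n and sequence_identity(seq_n, seq_m) >= identity_cutoff
        )
        total_weight += 1.0 / (1 + neighbours)
    return total_weight / math.sqrt(length)


if __name__ == "__main__":
    # Four identical toy sequences carry no more information than one.
    print(neff(["ACDEFGHIKLMNPQRS"] * 4))
```

Under this definition, an alignment made of many near-identical sequences scores no higher than a single sequence, so a low Neff signals that the deep learning predictors have little evolutionary signal to work with.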
The work represents a broader trend in computational biology toward hybrid approaches that combine the pattern recognition capabilities of AI with the fundamental physics of molecular interactions. AlphaFold 1 used a number of separately trained modules to produce a guide potential, which was then combined with a physics-based energy potential. AlphaFold 2 replaced this with a system of interconnected sub-networks, forming a single, differentiable, end-to-end model based on pattern recognition. D-I-TASSER suggests there may be value in returning to physics-guided approaches, at least for certain challenging cases.
The implications extend beyond academic research. As computational protein prediction becomes more accurate and reliable, it's increasingly being used to guide experimental work and drug development. Recent CASPs saw enormous jumps in the accuracy of computed structures, first in CASP14 (2020) for single proteins and domains, with many models competitive in accuracy with experiment, and second, in CASP15 (2022), with a large increase in the accuracy of protein complexes.
The research team has made the D-I-TASSER programs and the genome-wide modeling results, including the complete human proteome predictions, freely accessible to the scientific community through https://zhanggroup.org/D-I-TASSER/.
The success of D-I-TASSER demonstrates that while pure AI approaches like AlphaFold have achieved remarkable success, there's still room for innovation by combining machine learning with traditional scientific approaches. As protein prediction moves from single domains to complex multi-protein assemblies and dynamic structures, hybrid methods may prove essential for tackling biology's remaining structural mysteries.
The research represents an important step forward in computational structural biology, showing that the future of protein prediction may not lie in choosing between AI and physics, but in finding the optimal way to combine both approaches.