The Hepatitis B polymerase: A difficult problem begins to crack

Hi all,

I’d like to share some ongoing work in my lab that is greatly improving our understanding of HBV replication and accelerating our drug discovery efforts.

The HBV polymerase protein (abbreviated as “P”) is the only enzyme (ie, a protein that makes chemical reactions occur) that HBV produces. It has 2 functions, both essential for HBV to copy itself: the reverse transcriptase (RT), which makes new viral DNA, and the ribonuclease H (RH), which destroys the viral RNA after it has been copied into the first of the 2 strands of HBV DNA. The RT is the target of nucleoside analog drugs like tenofovir and entecavir, but there are no drugs against the RH. Overall, our understanding of P’s function is extremely limited because the protein is incredibly difficult to work with.

I’ve been working with P since 1992 and have been making slow but steady progress. Over that entire time, I’ve wanted to know its 3-dimensional structure because a protein’s structure dictates its function. About 2 years ago Google figured out how to use artificial intelligence to accurately predict the structures of most proteins, so we used their program (AlphaFold2) to predict the structure of P. We then worked really hard to validate that the prediction was right, and it is! A free version of the paper reporting the structure is on bioRxiv at

This structure has given us great ideas about how HBV P binds to the viral RNA, how it starts synthesizing DNA, and how the DNA and RNA move from the RT site to the RH site. This is interesting basic science in its own right, but it also has big implications for drug discovery: 1) it gives drug companies guidance on how to make better nucleoside analog drugs; 2) it permitted us to launch multiple new drug discovery efforts against P that do not target the RT or RH sites (such drugs would be analogous to the NNRTIs that work so well against HIV); and 3) it told us exactly why we were having so much trouble making active RH enzyme in the lab. We’ve used that information to engineer 2 versions of the HBV RH that can easily be made in the lab, and we are now adapting the enzyme to make assays that are useful for developing anti-RH drugs. Having such an enzyme in our toolbox is really accelerating our progress with the RH drug discovery project.

I’m personally deeply happy about this advance because I’ve been working on these problems for over 30 years, and they are finally beginning to crack. Hopefully they will yield benefits for HBV+ patients in time.



@john.tavis Thank you so much for the efforts. To think you’ve been on this for such a long time, it’s really incredible. You are well appreciated. Hopefully your findings will accelerate the much needed cure for this virus. God be your strength.


Thank you, Sir, for this good news. May this discovery lead to a complete cure for hepatitis B. May the good Lord continue to shower you and your team with strength and wisdom.

Amazing work. Superb that such a hard problem can finally be solved right out of left field

Hopefully these first two attempts you’re making prove fruitful. Who knows, maybe they will even provide the answer now that the blank has finally been filled in with this critical information.

It would be very interesting to hear how it goes. Please do send any updates, time permitting.


Great work, @john.tavis.

I know others have been asking about the use of AI and machine learning algorithms in HBV research (e.g. @IWillBeCured, @catcher.007, @ImHopefull), and it looks like this has been the seed of this structure.

I wanted to ask how big a grain of salt you take these AI/machine learning outputs with? Is it “trust, but verify” or “it’s a start, but assume that they’re probably wrong in some way”?



I had the same question, so I took a look (programming / machine learning is my field). Assuming this is generally the quality it provides, the results seem quite stunning: scroll to the gif comparing the predicted protein with the actual protein on the GitHub page for deepmind/alphafold (“Open source code for AlphaFold”). It doesn’t look too hard to run either.

I’ll let @john.tavis answer with actual experience in the field; I just thought the GitHub page was interesting in the meantime.


Hi all,

Thanks for the kind words! They are very much appreciated.

As to reliability: Thomas’ 2nd option, “It’s a start, but assume they’re probably wrong in some way” is the best way to go.

As with any computer program, AlphaFold has its limitations. It tends to give the energetic ground state for a structure (ie, the one with the least amount of tension within it), but that may not be the active form. Also, it is very poor at predicting areas of a protein that are highly flexible (but those areas don’t have a single, well-defined structure anyway). It also gives a single snapshot of the protein, but proteins are highly dynamic: they flex, twist, stretch, and scrunch in response to heat energy (they are constantly being bombarded by water molecules). Finally, most proteins function in complexes with other proteins, nucleic acids, or other cofactors such as metal ions, ATP, etc. Those interactions can alter the shape of the protein, and AlphaFold may or may not reflect those interactions.

In our case, my response when I saw the structure was “That’s a pretty picture, but does it mean anything?” So I tasked 3 people in my lab with validating it. We worked hard for 3 months digging up every bit of information we could find about the HBV polymerase or related animal virus proteins such as the Duck Hepatitis B Virus polymerase and comparing the data to the model. It passed those quite stringent tests really well!

The current status is that we are confident the basic fold of the amino acid chain is correct for the large majority of the protein. The extreme ends of the chain have substantial uncertainty in their positions, and the “Spacer domain” is so flexible as to be completely undefined in the model. Other limitations are that we don’t have a numerical understanding of its resolution (ie, how far off it is from reality), we don’t know whether the resolution is the same over all the regions of the protein that fold well, and we don’t know whether the model represents one of the at least 5 conformations the protein is known to adopt early during reverse transcription or an average of those states.
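For the programmers in the thread who want to look at this kind of regional uncertainty themselves, here is a minimal sketch. It relies on one fact about AlphaFold output: the PDB files it writes store the per-residue confidence score (pLDDT, 0 to 100) in the B-factor column. Everything else is an assumption for illustration: the function names are made up, the cutoff of 70 is just the commonly used "low confidence" threshold, and this is not the validation procedure used in the lab. pLDDT is the model's own confidence estimate, not a measured resolution.

```python
from typing import Dict, List, Tuple


def plddt_per_residue(pdb_text: str) -> Dict[int, float]:
    """Read pLDDT from the B-factor column of each residue's CA atom.

    Assumes a single-chain, AlphaFold-style PDB file in fixed-column format.
    """
    scores: Dict[int, float] = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue_number = int(line[22:26])    # PDB columns 23-26
            scores[residue_number] = float(line[60:66])  # PDB columns 61-66
    return scores


def low_confidence_runs(
    scores: Dict[int, float], cutoff: float = 70.0
) -> List[Tuple[int, int]]:
    """Return (first, last) residue numbers of consecutive runs below cutoff."""
    runs: List[Tuple[int, int]] = []
    start = None
    prev = None
    for res in sorted(scores):
        if scores[res] < cutoff:
            if start is None:
                start = res
            elif res != prev + 1:
                # Gap in numbering: close the old run and open a new one.
                runs.append((start, prev))
                start = res
            prev = res
        elif start is not None:
            # A confident residue ends the current low-confidence run.
            runs.append((start, prev))
            start = None
    if start is not None:
        runs.append((start, prev))
    return runs
```

To try it, read a predicted structure into a string (for example `open("model.pdb").read()`, where the file name is hypothetical) and pass the result of `plddt_per_residue` to `low_confidence_runs`. Long low-confidence runs are the kind of region, such as flexible linkers or chain ends, where the predicted coordinates deserve the most skepticism.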

So yes, a healthy dose of skepticism is essential in this process! The model needs to be measured by its utility: does it guide good experiments that advance the science? So far, it has been very useful!

Here’s a picture of the enzyme. I’m so happy with it that I might just get it framed, and my grad student Daniel was so excited that he printed it with his home 3D printer! For the aficionados among you, red is the terminal protein domain, gray is the spacer, yellow is the reverse transcriptase domain, and green is the ribonuclease H domain. Magenta spheres are magnesium ions, and Y63 is tyrosine residue 63 that primes (ie, starts) DNA synthesis.

[Image: HBV P 072822]


The deep science being shared on this site is so impressive. It is unbelievably refreshing to see such patient research rather than uninformed opinions and hyperbole.


Dear @john.tavis

I really don’t understand much about this. However, I am really happy that your research has good progress. I really can’t wait to see the cure in the next few years with all the hard work from all of you



Thanks, John, for your efforts so far in ensuring there is a solution for mankind. I pray God grants you the knowledge to speedily eradicate this deadly virus.


Hello Dr. John, how is the progress on your research?
Praying for a breakthrough this year! I wanted to ask you how close we are this year to a DEFINITE functional cure, at least for HBV patients. Thanks, GOD BLESS YOU AND YOUR TEAM.