A collection of computer systems and programming tips that you may find useful.
 
Brought to you by Craic Computing LLC, a bioinformatics consulting company.

Thursday, March 14, 2019

Using Google BERT to Classify Biomedical Papers

I have been using Google's BERT language representation model to help classify a certain type of biomedical paper based on abstracts in the PubMed database.

The scripts that I use for data preparation and a detailed walk through of the process are written up in a new GitHub repository that I have created.

My work was based on a blog post by Javed Qadrud-Din that I found extremely helpful.

With my dataset I am getting around 91% accuracy - which is much better than my earlier experiments with LSTM, CNN, etc approaches.

Archive of Tips