Hello Everyone,

Join us next Wednesday, June 3rd at 1 pm ET for an exciting talk tackling one of plant genomics' most pressing challenges: predicting enzyme function at scale. Discover how large language models (LLMs), protein language models (PLMs), and phylogenetics are being combined into an end-to-end annotation workflow to make it possible.

As part of the AgBioData webinar series, Gaurav Moghe, Associate Professor in the School of Integrative Plant Science at Cornell University, will present a talk titled “Advancing biochemical studies with LLMs and PLMs”.

The Zoom link and abstract are below. We hope you can join us!

Best,

Marcela

--
Wednesday, May 6th, 1PM ET
10A PT | 11A MT | 12P CT | 1P ET  
Find your local time here.

Meeting ID: 820 3835 6125
Passcode: 160683
--


Speaker: Gaurav Moghe, Associate Professor, Integrative Plant Science @ Cornell University

Title: Advancing biochemical studies with LLMs and PLMs

Abstract: As sequenced plant genomes continue to accumulate rapidly, innovation in the functional annotation of genes is becoming increasingly critical. However, allelic divergence, duplication-divergence, promiscuity, and redundancy are major hurdles in function prediction. In this talk, I will describe a workflow we have developed for enzyme function prediction that includes LLM-assisted extraction from literature (FuncFetch), organization in a database (FuncZymeDB), and function prediction using phylogeny (FuncPred-OG) and protein language models (FuncPred-AI). Together, this workflow enables rapid prediction of the substrate classes utilized by BAHD acyltransferases (our test enzyme family), has high data provenance, is expandable to other families, and will complement manual biocuration efforts.



--
Marcela Karey Tello-Ruiz, PhD
AgBioData Program Manager
Phoenix Bioinformatics