Hello Everyone,
Join us tomorrow, Wednesday, June 3rd, at 1 pm ET for an exciting talk tackling one of plant genomics' most pressing challenges: predicting enzyme function at scale. Discover how large language models (LLMs), protein language models (PLMs), and phylogenetics are being combined into an end-to-end annotation workflow to make it possible.
As part of the AgBioData webinar series, Gaurav Moghe, Associate Professor in the School of Integrative Plant Science at Cornell University, will present the talk titled: “Advancing biochemical studies with LLMs and PLMs”.
The Zoom details and abstract are below. We hope you can join us!
Best,
Marcela
--
Wednesday, June 3rd, 1PM ET
10A PT | 11A MT | 12P CT | 1P ET
Meeting ID: 820 3835 6125
Passcode: 160683
--
Speaker: Gaurav Moghe, Associate Professor, Integrative Plant Science @ Cornell University
Title: Advancing biochemical studies with LLMs and PLMs
Abstract: As the number of sequenced plant genomes continues to grow rapidly, innovation in functional annotation of genes is becoming increasingly critical. However, allelic divergence, duplication-divergence, promiscuity, and redundancy are major hurdles in function prediction. In this talk, I will describe a workflow we have developed for enzyme function prediction that includes LLM-assisted extraction from literature (FuncFetch), organization in databases (FuncZymeDB), and functional prediction using phylogeny (FuncPred-OG) and protein language models (FuncPred-AI). Together, this workflow enables rapid prediction of substrate classes utilized by BAHD acyltransferases (our test enzyme family), has high data provenance, is expandable to other families, and will complement manual biocuration efforts.