Hello Everyone,
Join us next Wednesday, June 3rd at 1 pm ET for an exciting talk tackling
one of plant genomics' most pressing challenges: predicting enzyme function
at scale. Discover how large language models (LLMs), protein language
models (PLMs), and phylogenetics are being combined into an end-to-end
annotation workflow to make it possible.
As part of the AgBioData webinar series, *Gaurav Moghe*, Associate
Professor in the School of Integrative Plant Science at Cornell University,
will present the talk titled: “*Advancing biochemical studies with LLMs and
PLMs*”.
The Zoom link and abstract are below. We hope you can join us!
Best,
Marcela
--
Wednesday, June 3rd, 1PM ET
10A PT | 11A MT | 12P CT | 1P ET
Find your local time *here*
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+June+20…>
.
Join Zoom Meeting
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
--
Speaker: Gaurav Moghe, Associate Professor, Integrative Plant Science @ Cornell
University
Title: Advancing biochemical studies with LLMs and PLMs
Abstract: As the number of sequenced plant genomes continues to grow
rapidly, innovation in the functional annotation of genes is becoming
increasingly critical. However, allelic divergence,
duplication-divergence, promiscuity, and redundancy are major hurdles in
function prediction. In this talk, I will describe a workflow we have
developed for enzyme function prediction that includes LLM-assisted
extraction from literature (FuncFetch), organization in databases
(FuncZymeDB), and function prediction using phylogeny (FuncPred-OG) and
protein language models (FuncPred-AI). Together, this workflow enables
rapid prediction of substrate classes utilized by BAHD acyltransferases
(our test enzyme family), has high data provenance, is expandable to other
families, and will complement manual biocuration efforts.
--
Marcela Karey Tello-Ruiz, PhD
AgBioData Program Manager
Phoenix Bioinformatics