Dear all,
Registration for the new working group (WG) on *Natural Language Processing
(NLP) for biocuration* is now OPEN!
The NLP4Biocuration WG will explore the application of existing NLP tools
to biocuration and identify their limitations with AgBioData-curated
content.
*Join the NLP4Biocuration working group by completing this form
<https://docs.google.com/forms/d/e/1FAIpQLSe0exyHDowMCnL1_VkCBsff6T0sSPTl3pg…>
by February 21!*
*The commitment will generally be 2 to 6 hours per month*, including
meeting attendance and offline work (e.g., literature, writing). This
is your opportunity to shape the future of our research community by
contributing to the development of best practices for data archiving and
management that can benefit you and all other stakeholders.
Participating in the AgBioData working groups will also offer you
opportunities for
- professional networking
- leadership skill development
- authorship of impactful manuscripts
- participation in the AgBioData Ambassador and
Championship programs
Please share the registration link with anyone you think might be
interested.
Best,
Annarita
Hi everybody,
Don't miss our monthly webinar TOMORROW, Feb. 5th, at 12 PM CT. David
Molik and Adam Wright from the Genome Assembly and Annotation Nomenclature
working group will present their recent publication (
https://doi.org/10.1093/genetics/iyaf006) and recommendations on gene and
genome nomenclature.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Guidelines for Gene and Genome Assembly Nomenclature (GAAN)*
Clear and informative naming schemes enhance the utility of genome
assemblies and gene annotations. We present a comprehensive nomenclature
framework which incorporates species, sequencing group,
colony/breed/strain, version, and other critical metadata in a
well-structured format. This approach aligns with standards from AgBioData
discussions and ensures compatibility with major repositories such as those
in the INSDC. To facilitate adoption, we developed the Gene and Genome
Assembly Nomenclature (GAAN) tool for validating names under these
guidelines. Future iterations of GAAN will integrate external databases
like the Darwin Tree of Life Identifiers and the Vertebrate Breed Ontology
for further validation.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=David+Molik+and+A…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Dear agricultural genome or phenome researcher:
This is a further announcement of the series of four “Field Days” on Single
Cell genomics in agriculture. Also, please see below for a second reminder
on travel funding for attending the AG2PI Single Cell Workshop March 29-30!
Details on the four sessions can be found at:
https://www.faang.org/bbs?s=2025_wkshop..txt
If you are interested in attending one or more of these *free* online
sessions, please register at least 2 days in advance at:
https://docs.google.com/forms/d/1W2aSNN0QEE-yzqehj2DHsGKa9N3fvCXHPMY9gAeZa7…
We will follow up with registered people to provide Zoom information one
day before the session and provide information on obtaining the sessions'
recordings once they are available.
*ALSO:*
**We still have a handful of seats available for early career researchers
to request $600 funding that will cover costs for the SC Workshop in
Orlando, Florida:
https://www.agbt.org/home/home/agbt-ag/agworkshops/#workshop3
If you are interested in this, please have your advisor confirm that you
are eligible, i.e., that you are currently in graduate school or within
5 years of attaining your PhD.**
Thanks for your interest!
Chris Tuggle for the AG2PI Single Cell Workshop Organizing Committee
Hi everybody,
Join us for the first monthly webinar of 2025! Next Wednesday, Feb. 5th, at
12 PM CT, David Molik and Adam Wright from the Genome Assembly and
Annotation Nomenclature working group will present their recent publication
(https://doi.org/10.1093/genetics/iyaf006) and recommendations on gene and
genome nomenclature.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Guidelines for Gene and Genome Assembly Nomenclature (GAAN)*
Clear and informative naming schemes enhance the utility of genome
assemblies and gene annotations. We present a comprehensive nomenclature
framework which incorporates species, sequencing group,
colony/breed/strain, version, and other critical metadata in a
well-structured format. This approach aligns with standards from AgBioData
discussions and ensures compatibility with major repositories such as those
in the INSDC. To facilitate adoption, we developed the Gene and Genome
Assembly Nomenclature (GAAN) tool for validating names under these
guidelines. Future iterations of GAAN will integrate external databases
like the Darwin Tree of Life Identifiers and the Vertebrate Breed Ontology
for further validation.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
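For anyone who wants to double-check the time-zone line in the announcement, here is a minimal Python sketch that converts the webinar start (Feb. 5, 2025, 12 PM Central) into the other US zones. The zone names are standard IANA identifiers; the helper name `local_times` is only illustrative. Note that early February falls in standard time, so the Central zone reports CST rather than CDT.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# IANA zone identifiers for the four zones listed in the announcement.
ZONES = {
    "ET": "America/New_York",
    "CT": "America/Chicago",
    "MT": "America/Denver",
    "PT": "America/Los_Angeles",
}

def local_times(start):
    """Return the same instant expressed in each zone of ZONES."""
    return {label: start.astimezone(ZoneInfo(name)) for label, name in ZONES.items()}

# Webinar start: Feb. 5, 2025, 12:00 in US Central time.
webinar = datetime(2025, 2, 5, 12, 0, tzinfo=ZoneInfo("America/Chicago"))
times = local_times(webinar)

for label, moment in times.items():
    # Prints the zone label, the local wall-clock time, and the abbreviation.
    print(label, moment.strftime("%I:%M %p"), moment.tzname())
```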
Dear everyone,
AgBioData will be at PAG 32 in San Diego! If you are attending the
conference, please join us:
- For *our workshop* on *Friday, Jan. 10, at 4 PM in room Town and
Country C*. There will be presentations from the current working groups
and we will provide updates on the consortium's direction. More info on the
workshop is available here <https://www.agbiodata.org/agb-pag32>.
- At *booth #210!* We will be there with our member databases to meet
you in person and answer questions.
Looking forward to seeing you at PAG!
Best,
Annarita
Hi everybody,
Join us TOMORROW, at 12 PM CT, for our monthly webinar with Sarah Dyer.
She will discuss Ensembl as a precious resource for annotated genomes
across the tree of life.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Plants, pollinators, and pests in Ensembl - integrating genomics data for
agriculturally relevant species*
Ensembl is an open platform integrating publicly available genomic data to
support the exploration of gene annotations, genetic variation, and
comparative genomics. Increasing numbers of genomes are available for
agriculturally relevant species, with multiple high-quality genomes now
being generated for many crops. In addition, large-scale biodiversity
projects are increasing the number of insect genomes available for pest and
pollinator species. Providing ways to explore this genomic data is key to
supporting research and breeding efforts for agricultural species. Ensembl
Metazoa and Ensembl Plants now hold more than 380 invertebrate and almost
160 plant genomes, respectively. In addition, our new Ensembl site (
https://beta.ensembl.org) already holds more than 1,500 invertebrate and
over 100 plant genomes. The new site will ultimately replace the current
suite of Ensembl component sites, bringing annotated genomes together from
across the Tree of Life. In this webinar, we will look at the species, data
types, and tools you can find in Ensembl and ways you can access these
resources.
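As one illustration of programmatic access, the sketch below lists species from the public Ensembl REST service (https://rest.ensembl.org) for a given division. It is a minimal sketch, not part of the webinar material: it assumes the `info/species` endpoint accepts a `division` parameter and returns a JSON object with a top-level `species` list whose entries have a `name` field, and the helper names are hypothetical. It queries the current Ensembl releases, not the beta site mentioned above.

```python
import json
import urllib.parse
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST service

def species_url(division):
    """Build the info/species URL for one division, e.g. 'EnsemblPlants'."""
    query = urllib.parse.urlencode({"division": division})
    return f"{REST_SERVER}/info/species?{query}"

def list_species(division):
    """Fetch the species names for a division (requires network access)."""
    request = urllib.request.Request(
        species_url(division),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumption: the payload holds a "species" list whose entries have "name".
    return sorted(entry["name"] for entry in payload["species"])

# Example call (not executed here because it needs network access):
#     print(len(list_species("EnsemblMetazoa")))
```

Keeping the URL construction separate from the network call makes it easy to check the request before sending it.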
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar next Wednesday, Dec. 4th, at 12 PM CT.
Sarah Dyer will discuss Ensembl as a precious resource for
annotated genomes across the tree of life.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Plants, pollinators, and pests in Ensembl - integrating genomics data for
agriculturally relevant species*
Ensembl is an open platform integrating publicly available genomic data to
support the exploration of gene annotations, genetic variation, and
comparative genomics. Increasing numbers of genomes are available for
agriculturally relevant species, with multiple high-quality genomes now
being generated for many crops. In addition, large-scale biodiversity
projects are increasing the number of insect genomes available for pest and
pollinator species. Providing ways to explore this genomic data is key to
supporting research and breeding efforts for agricultural species. Ensembl
Metazoa and Ensembl Plants now hold more than 380 invertebrate and almost
160 plant genomes, respectively. In addition, our new Ensembl site (
https://beta.ensembl.org) already holds more than 1,500 invertebrate and
over 100 plant genomes. The new site will ultimately replace the current
suite of Ensembl component sites, bringing annotated genomes together from
across the tree of life. In this webinar, we will look at the species, data
types, and tools you can find in Ensembl and ways you can access these
resources.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Dear all,
We invite the AgBioData community to *a virtual roundtable on
artificial intelligence* in agricultural genomics *TOMORROW (November
6) at 12 PM Central Time* (Zoom link
<https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09>).
AI and natural language processing (NLP) for biocuration are becoming more
popular in the community, enabling new applications while also raising new
data-related challenges. *We invite the AgBioData member databases and the
larger community to provide feedback on these challenges*, help us
understand their importance for the AgBioData member databases, and define
the focus of new AgBioData working groups.
The meeting will be one hour long and will feature breakout room sessions on
two main topics:
- *How databases can make data ready for AI*
- *NLP for biocuration.*
Please forward this invite to anyone you think might be interested.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Virtual…>
Join Zoom Meeting
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Dear all,
We invite the AgBioData community to *a virtual roundtable on
artificial intelligence* in agricultural genomics *next Wednesday (November
6) at 12 PM Central Time* (Zoom link
<https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09>).
AI and natural language processing (NLP) for biocuration are becoming more
popular in the community, enabling new applications while also raising new
data-related challenges. *We invite the AgBioData member databases and the
larger community to provide feedback on these challenges*, help us
understand their importance for the AgBioData member databases, and define
the focus of new AgBioData working groups.
The meeting will be one hour long and will feature breakout room sessions on
two main topics:
- *How databases can make data ready for AI*
- *NLP for biocuration.*
Please forward this invite to anyone you think might be interested.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Virtual…>
Join Zoom Meeting
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us *tomorrow*, Oct. 2nd, *at 12 PM CDT*. Montana Smith will discuss
the National Microbiome Data Collaborative (NMDC) initiative and its work
on advancing microbiome science through FAIR and standardized metadata and
data.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*NMDC: Advancing microbiome science through FAIR and standardized metadata
and data*
The National Microbiome Data Collaborative (NMDC)’s mission is to support a
FAIR microbiome data-sharing network through infrastructure, data
standards, and community building that addresses pressing challenges in
environmental sciences. In this webinar, we will dive into what the NMDC is
and how standardized metadata capture enables FAIR data. We will walk
through the 4 NMDC products and how they’re lowering barriers for
experimental scientists to conduct their research in a way that ensures
data re-use.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar next Wednesday, Oct. 2nd, at 12 PM CDT.
Montana Smith will talk about the National Microbiome Data Collaborative
(NMDC) initiative and its work on advancing microbiome science through
FAIR and standardized metadata and data.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*NMDC: Advancing microbiome science through FAIR and standardized metadata
and data*
The National Microbiome Data Collaborative (NMDC)’s mission is to support a
FAIR microbiome data-sharing network through infrastructure, data
standards, and community building that addresses pressing challenges in
environmental sciences. In this webinar, we will dive into what the NMDC is
and how standardized metadata capture enables FAIR data. We will walk
through the 4 NMDC products and how they’re lowering barriers for
experimental scientists to conduct their research in a way that ensures
data re-use.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar tomorrow at 12 PM CDT! *Dr. David Emms from
InstaDeep* will discuss AgroNT, a foundational large language model for
plant genomics.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*AgroNT: A Foundational Large Language Model for Plant Genomics*
Foundational large language models can be pre-trained on large unlabelled
datasets and subsequently fine-tuned to a wide range of specific tasks.
We’ll present AgroNT (Agro Nucleotide Transformer), a foundational DNA
large language model pre-trained on reference genomes from 48 plant species
with a predominant focus on crops. We have shown that AgroNT can be
fine-tuned to obtain state-of-the-art predictions of many genomic elements,
including polyadenylation sites, splice sites, open chromatin and enhancer
regions. Furthermore, AgroNT can be fine-tuned to, e.g., predict
tissue-specific gene expression levels or prioritize functional variants.
Building on our Nucleotide Transformer, the novel SegmentNT model is able
to make nucleotide-resolution predictions, well suited to tasks such as de
novo genome annotation of previously unseen species. Both our AgroNT and
SegmentNT models are open-sourced for academic research and non-commercial
uses on our GitHub repository
https://github.com/instadeepai/nucleotide-transformer and HuggingFace space
https://huggingface.co/InstaDeepAI.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar next Wednesday, Sept. 4th, at 12 PM CDT.
Dr. David Emms from InstaDeep will discuss AgroNT, a foundational large
language model for plant genomics.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*AgroNT: A Foundational Large Language Model for Plant Genomics*
Foundational large language models can be pre-trained on large unlabelled
datasets and subsequently fine-tuned to a wide range of specific tasks.
We’ll present AgroNT (Agro Nucleotide Transformer), a foundational DNA
large language model pre-trained on reference genomes from 48 plant species
with a predominant focus on crops. We have shown that AgroNT can be
fine-tuned to obtain state-of-the-art predictions of many genomic elements,
including polyadenylation sites, splice sites, open chromatin and enhancer
regions. Furthermore, AgroNT can be fine-tuned to, e.g., predict
tissue-specific gene expression levels or prioritize functional variants.
Building on our Nucleotide Transformer, the novel SegmentNT model is able
to make nucleotide-resolution predictions, well suited to tasks such as de
novo genome annotation of previously unseen species. Both our AgroNT and
SegmentNT models are open-sourced for academic research and non-commercial
uses on our GitHub repository
https://github.com/instadeepai/nucleotide-transformer and HuggingFace space
https://huggingface.co/InstaDeepAI.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
This is a friendly reminder of tomorrow's webinar at 12 PM CDT. *Seth
Murray <https://soilcrop.tamu.edu/people/murray-seth-c/> *(Texas A&M
University, TAMU) will present on temporal field phenomics.
I have included below more details about the webinar and the Zoom link to
attend the webinar.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Capturing Nature AND Nurture with Temporal Field Phenomics to Breed Better
Crops*
An organism’s phenome results from genotype (nature), environment and
management effects (nurture) and their interactions, as well as measurement
error. For over 30 years, DNA sequencing and genomics tools have advanced
genotyping to where genomes can now be routinely saturated with
measurements. In contrast, most focus in high-throughput phenotyping and
phenomics to date has been on automating previously known “traits” as
measurable and interpretable phenotypes, akin to focusing on measuring a
single DNA marker rather than measuring a saturated genome. Tools such as
unoccupied aerial systems (UAS, aka UAVs, drones) collecting temporal
phenomic measurements in the field now allow novel methods in plant
breeding and new insights into plant biology. Viewing phenomics as a
platform for discovery, similar to genomics, opens new methods for
capturing phenomena in nature and nurture. To date, our experience with
phenomic prediction from UAS in maize breeding for cumulative, complex
phenotypes such as grain yield suggests it’s possible to predict organismal
performance in untested environments, in fact possibly better than
gold-standard genomic methods. These activities have also yielded surprising
insights into biology: predicting plant disease and resistance, evaluating
genotypic resilience to stress, and identifying early-season growth periods
for crop improvement that could not previously be selected for. Method
development and data analytics in phenomics are large investments, but worth
making. Successfully measuring the phenome will impact every aspect of
science and society, across biological disciplines from germplasm curators
and physiologists to breeders, and on to education, the courtroom,
and policy.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us next Wednesday, August 7th, at 12 PM CDT for our monthly webinar. *Seth
Murray <https://soilcrop.tamu.edu/people/murray-seth-c/> *(Texas A&M
University, TAMU) will present on temporal field phenomics.
I have included below more details about the webinar and the Zoom link to
attend the webinar.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Capturing Nature AND Nurture with Temporal Field Phenomics to Breed Better
Crops*
An organism’s phenome results from genotype (nature), environment and
management effects (nurture) and their interactions, as well as measurement
error. For over 30 years, DNA sequencing and genomics tools have advanced
genotyping to where genomes can now be routinely saturated with
measurements. In contrast, most focus in high-throughput phenotyping and
phenomics to date has been on automating previously known “traits” as
measurable and interpretable phenotypes, akin to focusing on measuring a
single DNA marker rather than measuring a saturated genome. Tools such as
unoccupied aerial systems (UAS, aka UAVs, drones) collecting temporal
phenomic measurements in the field now allow novel methods in plant
breeding and new insights into plant biology. Viewing phenomics as a
platform for discovery, similar to genomics, opens new methods for
capturing phenomena in nature and nurture. To date, our experience with
phenomic prediction from UAS in maize breeding for cumulative, complex
phenotypes such as grain yield suggests it’s possible to predict organismal
performance in untested environments, in fact possibly better than
gold-standard genomic methods. These activities have also yielded surprising
insights into biology: predicting plant disease and resistance, evaluating
genotypic resilience to stress, and identifying early-season growth periods
for crop improvement that could not previously be selected for. Method
development and data analytics in phenomics are large investments, but worth
making. Successfully measuring the phenome will impact every aspect of
science and society, across biological disciplines from germplasm curators
and physiologists to breeders, and on to education, the courtroom,
and policy.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Dear all,
Thank you for being part of the AgBioData community. We are almost at the
end of a three-year National Science Foundation (NSF) RCN project. We want
to estimate the impact of these years' efforts on increasing the awareness
and implementation of FAIR practices in the ag research community.
*If you haven't already*, we invite you to participate in a *brief survey* *on
the impact of AgBioData activities on FAIR data management awareness*. This
survey, which follows up on one we ran in 2022, will help us quantify any
significant change in the implementation of FAIR practices since the
beginning of the project.
Click here <https://tinyurl.com/AgBioData24> to participate in this survey.
Your participation in this survey is crucial for our mission to enhance
FAIR data in agricultural research. It will provide insights to help us
define the consortium's directions and secure future funding. We want to
emphasize that your participation is entirely voluntary and anonymous.
Best,
Annarita
Dear all,
A friendly reminder of *tomorrow's appointment* with the scRNA Biocuration
WG *at 8 a.m. PST / 10 a.m. CST / 11 a.m. EST / 5 p.m. CET* (Zoom link
<https://us06web.zoom.us/j/85091905235?pwd=Oqhn0Pvy3iXd5jObJi0JyacJy0bgmf.1>).
Muskan Kapoor, a graduate research assistant in Tuggle's lab, will discuss
the current state of developing a single-cell data portal for farm animals.
There are more details on the talk at the bottom of this email.
I hope you will join us.
Best,
Annarita
----------------------------------------------------------------------------------------------------------------
*Abstract*:
*Building a FAIR data ecosystem for incorporating single-cell genomics data
into agricultural G2P research*
The agriculture genomics community has numerous data submission standards
available, but the standards for describing and storing single-cell (SC,
e.g., scRNA-seq) data are comparatively underdeveloped. To bridge this gap,
we leveraged recent advancements in human genomics infrastructure, such as
the integration of the Human Cell Atlas Data Portal with Terra, a secure,
scalable, open-source platform for biomedical researchers to access data,
run analysis tools, and collaborate, co-developed by the Broad Institute of
MIT and Harvard, Microsoft, and Verily. In parallel, the Single Cell
Expression Atlas at EMBL-EBI offers a comprehensive data ingestion portal
for high-throughput sequencing datasets, including plants, protists, and
animals (including humans). Developing data tools connecting these
resources would offer significant advantages to the agricultural genomics
community. The FAANG data portal at EMBL-EBI emphasizes delivering rich
metadata and highly accurate and reliable annotation of farmed animals but
is not computationally linked to either of these resources. Herein, we
describe a pilot-scale project that determines whether the current FAANG
metadata standards for livestock can be used to ingest scRNA-seq datasets
into Terra in a manner consistent with HCA Data Portal standards.
Importantly, rich scRNA-seq metadata can now be brokered through the FAANG
data portal using a semi-automated process, thereby avoiding the need for
substantial expert curation. We have further extended the functionality of
this tool so that validated and ingested SC files within the HCA Data
Portal are transferred to Terra for further analysis. In addition, we
verified data ingestion into Terra, hosted on Azure, and demonstrated the
use of a workflow to analyze the first ingested porcine scRNA-seq dataset.
We have also developed prototype tools to visualize the
output of scRNA-seq analyses on genome browsers to compare gene expression
patterns across tissues and cell populations. This JBrowse tool now
features distinct tracks, showcasing PBMC scRNA-seq alongside two bulk
RNA-seq experiments. We intend to further build upon these existing tools
to construct a scientist-friendly data resource and analytical ecosystem
based on Findable, Accessible, Interoperable, and Reusable (FAIR) SC
principles to facilitate SC-level genomic analysis through data ingestion,
storage, retrieval, re-use, visualization, and comparative annotation
across agricultural species.
Dear all,
The AgBioData Standards for Genetic Variation Working Group (SGV) is
preparing a white paper to support the adoption of rsIDs for agriculture
and seeks your valuable input and data to further our research efforts.
They invite the AgBioData community to join them on *July 18 at 11:00 ET* to
review the material and *discuss collaboration opportunities*. In
particular, the SGV is looking for the following information:
1. Genetic Markers Linked to Traits:
-
A list of genetic markers associated with specific traits used by the
breeding and research community across various agricultural species,
including plants, animals, and insects. See an example here
<https://docs.google.com/spreadsheets/d/167DaYxbdKejoL0l6UhGqMPu-GbGsW6vIok7…>
.
-
Information on the species and traits these markers are linked to and
the methodologies/platforms employed for genotyping in your community.
2. Functional Validation and Fitness Outcomes:
-
Examples of genetic variations that have been functionally validated,
including descriptions of the phenotypic differences these
variations lead
to.
-
Case studies or data demonstrating these genetic variations have
resulted in measurable fitness outcomes for the species in question.
-
Information on related publications and any supporting data available.
-
We would like to request examples from each of the different
species/database providers for review at the upcoming GV working group
meeting in July.
3. Standards and Data Access for Genetic & Phenotypic Variation:
FAIR/Interoperability
- Standard formats for exchanging information, identifiers, formats,
and controlled vocabulary on:
  - Germplasm
  - Genetic Variation
  - Phenotypes
- Types of views and files provided for data access.
- Future targets for operating, displaying, or providing access to
these data types.
- Usage of rsIDs and potential barriers to their adoption in your
resource.
- For more information on rsIDs, please see the following
document: RefSNPs: Clustered Variants
<https://docs.google.com/document/d/1PHXqW7M50mE5SSl4Zprd904KCr9nWzSR4L8sx3U…>
4. Collaboration Opportunities:
- Barriers encountered when working with these data types.
- Opportunities for collaboration or data standards, sharing, and
interoperability.
We really appreciate your support and look forward to your positive
response to the request for information in any of the four categories.
*If you are willing to participate or are interested in contributing to our
initiative and would like to collaborate, please complete this form
(https://forms.gle/jcjWnLibHEKuESJe6) by July 16.* We will follow up with
you with the Zoom link and other resources.
Best,
The AgBioData Standards for Genetic Variation WG
Dear all,
The scRNA Biocuration WG invites you to an open meeting on *July 17th at 8
a.m. PST / 10 a.m. CST / 11 a.m. EST / 5 p.m. CET* (Zoom link:
<https://us06web.zoom.us/j/85091905235?pwd=Oqhn0Pvy3iXd5jObJi0JyacJy0bgmf.1>).
Muskan Kapoor, a graduate research assistant in the Tuggle lab, will discuss
the current state of developing a single-cell data portal for farm animals.
More details on the talk are at the bottom of this email.
I hope you will join us.
Best,
Annarita
----------------------------------------------------------------------------------------------------------------
*Abstract*:
*Building a FAIR data ecosystem for incorporating single-cell genomics data
into agricultural G2P research*
The agriculture genomics community has numerous data submission standards
available, but the standards for describing and storing single-cell (SC,
e.g., scRNA-seq) data are comparatively underdeveloped. To bridge this gap,
we leveraged recent advancements in human genomics infrastructure, such as
the integration of the Human Cell Atlas Data Portal with Terra, a secure,
scalable, open-source platform for biomedical researchers to access data,
run analysis tools, and collaborate, co-developed by the Broad Institute of
MIT and Harvard, Microsoft, and Verily. In parallel, the Single Cell
Expression Atlas at EMBL-EBI offers a comprehensive data ingestion portal
for high-throughput sequencing datasets, including plants, protists, and
animals (including humans). Developing data tools connecting these
resources would offer significant advantages to the agricultural genomics
community. The FAANG data portal at EMBL-EBI emphasizes delivering rich
metadata and highly accurate and reliable annotation of farmed animals but
is not computationally linked to either of these resources. Herein, we
describe a pilot-scale project that determines whether the current FAANG
metadata standards for livestock can be used to ingest scRNA-seq datasets
into Terra in a manner consistent with HCA Data Portal standards.
Importantly, rich scRNA-seq metadata can now be brokered through the FAANG
data portal using a semi-automated process, thereby avoiding the need for
substantial expert curation. We have further extended the functionality of
this tool so that validated and ingested SC files within the HCA Data
Portal are transferred to Terra for further analysis. In addition, we
verified data ingestion into Terra, hosted on Azure, and demonstrated the
use of a workflow to analyze the first ingested porcine scRNA-seq dataset.
We have also developed prototype tools to visualize the
output of scRNA-seq analyses on genome browsers to compare gene expression
patterns across tissues and cell populations. This JBrowse tool now
features distinct tracks, showcasing PBMC scRNA-seq alongside two bulk
RNA-seq experiments. We intend to further build upon these existing tools
to construct a scientist-friendly data resource and analytical ecosystem
based on Findable, Accessible, Interoperable, and Reusable (FAIR) SC
principles to facilitate SC-level genomic analysis through data ingestion,
storage, retrieval, re-use, visualization, and comparative annotation
across agricultural species.
Hello everyone,
We will be in Honolulu next week for the Plant Biology 2024 Conference!
If you will be there, join us for "Plant Bioinformatics Resources for
FAIR Agricultural Data Discovery and Reuse" on Saturday, June 22, at 10 AM
in room 312. Some of our member databases will present their recent updates
and resources (https://www.agbiodata.org/pbr-aspb-2024).
We will also be in *booth #406* to meet all of you in person and talk about
our work and future directions.
Have a lovely week,
Annarita
Dear AgBioData members,
This is a friendly reminder to participate in our *brief survey* on the
impact of AgBioData activities on FAIR data management awareness if you
haven't already done so.
This survey is a follow-up to a similar one we ran in 2022 and will help us
quantify the impact of our efforts during the three-year National Science
Foundation (NSF) RCN project on increasing the awareness and implementation
of FAIR practices in the AgBioData community.
Your participation in this survey is crucial for our mission to enhance
FAIR data in agricultural research and will provide insights that can help
us define the consortium's directions and secure future funding.
Click here <https://tinyurl.com/AgBioData24> to participate in this survey.
We would like to emphasize that your participation in this survey is
entirely voluntary and anonymous.
For general questions or assistance accessing and completing the survey,
please write to me or contact Dr Michael Coe at
michael(a)cedarlakeresearch.com.
Thanks for your contribution to AgBioData and for being part of our
community!
Best,
Annarita
Hi everybody,
This is a friendly reminder of our monthly webinar tomorrow at 12 PM CDT.
We have *Ethy Cannon* (USDA-ARS) talking about the *pan-genomic resources
at MaizeGDB*.
I have included more details about the webinar and the Zoom link to attend
it below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Pan-genomic resources at MaizeGDB*
Pan-genomes, encompassing the entirety of genetic sequences found in a
collection of genome assemblies within a clade, can be more useful than
single reference genomes. This is especially true for Zea mays, which has a
particularly diverse and complex genome. Presenting full pan-genome data is
challenging, especially for a diverse species, but valuable when
pan-genomic data can be linked to extensive gene models and gene data,
including classical gene information, markers, insertions, expression and
proteomic data, and protein structures, as is the case at MaizeGDB. I will
present the pan-gene analysis pipeline Pandagma and MaizeGDB’s pan-gene
data centre, which offers a variety of browsing and visualization options,
including sequence alignment visualization, gene trees, and more, enabling
exploration of pan-genes in Zea.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
<https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09>
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us next Wed, June 5th, at 12 PM CDT for our monthly webinar. *Ethy
Cannon* (USDA-ARS) will talk about the *pan-genomic resources at MaizeGDB*.
I have included more details about the webinar and the Zoom link to attend
it below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Pan-genomic resources at MaizeGDB*
Pan-genomes, encompassing the entirety of genetic sequences found in a
collection of genome assemblies within a clade, can be more useful than
single reference genomes. This is especially true for Zea mays, which has a
particularly diverse and complex genome. Presenting full pan-genome data is
challenging, especially for a diverse species, but valuable when
pan-genomic data can be linked to extensive gene models and gene data,
including classical gene information, markers, insertions, expression and
proteomic data, and protein structures, as is the case at MaizeGDB. I will
present the pan-gene analysis pipeline Pandagma and MaizeGDB’s pan-gene
data centre, which offers a variety of browsing and visualization options,
including sequence alignment visualization, gene trees, and more, enabling
exploration of pan-genes in Zea.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
<https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09>
Meeting ID: 820 3835 6125
Passcode: 160683
Dear AgBioData members,
This is a friendly reminder to participate in our *brief survey* on the
impact of AgBioData activities on FAIR data management awareness if you
haven't already done so.
This survey is a follow-up to a similar one we ran in 2022 and will help us
quantify the impact of our efforts during the three-year National Science
Foundation (NSF) RCN project on increasing the awareness and implementation
of FAIR practices in the AgBioData community.
Your participation in this survey is crucial for our mission to enhance
FAIR data in agricultural research and will provide insights that can help
us define the consortium's directions and secure future funding.
Click here <https://tinyurl.com/AgBioData24> to participate in this survey.
Please submit your answers *by May 31*.
We would like to emphasize that your participation in this survey is
entirely voluntary and anonymous.
For general questions or assistance accessing and completing the survey,
please feel free to write to me or contact Dr Michael Coe at
michael(a)cedarlakeresearch.com.
Thanks for your contribution to AgBioData and for being part of our
community!
Best,
Annarita