Dear all,
We invite the AgBioData community to *a virtual roundtable on
artificial intelligence* in agricultural genomics *TOMORROW (November 6)
at 12 PM Central Time* (Zoom link
<https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09>
).
AI and natural language processing (NLP) for biocuration are becoming more
popular in the community, enabling new applications and raising new
data-related challenges. *We invite the AgBioData member databases and the
larger community to provide feedback on these challenges*, help us
understand their importance for the AgBioData member databases, and define
the focus of new AgBioData working groups.
The meeting will last one hour and will feature breakout room sessions on two
main topics:
- *How databases can make data ready for AI*
- *NLP for biocuration.*
Please forward this invite to anyone you think might be interested.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Virtual…>
Join Zoom Meeting
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Dear all,
We invite the AgBioData community to *a virtual roundtable on
artificial intelligence* in agricultural genomics *next Wednesday (November
6) at 12 PM Central Time* (Zoom link
<https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09>
).
AI and natural language processing (NLP) for biocuration are becoming more
popular in the community, enabling new applications and raising new
data-related challenges. *We invite the AgBioData member databases and the
larger community to provide feedback on these challenges*, help us
understand their importance for the AgBioData member databases, and define
the focus of new AgBioData working groups.
The meeting will last one hour and will feature breakout room sessions on
two main topics:
- *How databases can make data ready for AI*
- *NLP for biocuration.*
Please forward this invite to anyone you think might be interested.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Virtual…>
Join Zoom Meeting
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us *tomorrow*, Oct. 2nd, *at 12 PM CDT*. Montana Smith will discuss
the National Microbiome Data Collaborative (NMDC) initiative and its work
on advancing microbiome science through FAIR and standardized metadata and
data.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*NMDC: Advancing microbiome science through FAIR and standardized metadata
and data*
The National Microbiome Data Collaborative (NMDC)’s mission is to support a
FAIR microbiome data-sharing network through infrastructure, data
standards, and community building that addresses pressing challenges in
environmental sciences. In this webinar, we will dive into what the NMDC is
and how standardized metadata capture enables FAIR data. We will walk
through the 4 NMDC products and how they’re lowering barriers for
experimental scientists to conduct their research in a way that ensures
data re-use.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar next Wednesday, Oct. 2nd, at 12 PM CDT.
Montana Smith will talk about the National Microbiome Data Collaborative
(NMDC) initiative and its work on advancing microbiome science through
FAIR and standardized metadata and data.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*NMDC: Advancing microbiome science through FAIR and standardized metadata
and data*
The National Microbiome Data Collaborative (NMDC)’s mission is to support a
FAIR microbiome data-sharing network through infrastructure, data
standards, and community building that addresses pressing challenges in
environmental sciences. In this webinar, we will dive into what the NMDC is
and how standardized metadata capture enables FAIR data. We will walk
through the 4 NMDC products and how they’re lowering barriers for
experimental scientists to conduct their research in a way that ensures
data re-use.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar tomorrow at 12 PM CDT! *Dr. David Emms from
InstaDeep* will discuss AgroNT, a foundational large language model for
plant genomics.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*AgroNT: A Foundational Large Language Model for Plant Genomics*
Foundational large language models can be pre-trained on large unlabelled
datasets and subsequently fine-tuned to a wide range of specific tasks.
We’ll present AgroNT (Agro Nucleotide Transformer), a foundational DNA
large language model pre-trained on reference genomes from 48 plant species
with a predominant focus on crops. We have shown that AgroNT can be
fine-tuned to obtain state-of-the-art predictions of many genomic elements,
including polyadenylation sites, splice sites, open chromatin and enhancer
regions. Furthermore, AgroNT can be fine-tuned, for example, to predict
tissue-specific gene expression levels or to prioritize functional variants.
Building on our Nucleotide Transformer, the novel SegmentNT model is able
to make nucleotide-resolution predictions, well suited to tasks such as de
novo genome annotation of previously unseen species. Both our AgroNT and
SegmentNT models are open-sourced for academic research and non-commercial
uses on our GitHub repository
https://github.com/instadeepai/nucleotide-transformer and HuggingFace space
https://huggingface.co/InstaDeepAI.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us for our monthly webinar next Wednesday, Sept. 4th, at 12 PM CDT.
Dr. David Emms from InstaDeep will discuss AgroNT, a foundational large
language model for plant genomics.
I have included more details about the webinar and the Zoom link to attend
the webinar below.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*AgroNT: A Foundational Large Language Model for Plant Genomics*
Foundational large language models can be pre-trained on large unlabelled
datasets and subsequently fine-tuned to a wide range of specific tasks.
We’ll present AgroNT (Agro Nucleotide Transformer), a foundational DNA
large language model pre-trained on reference genomes from 48 plant species
with a predominant focus on crops. We have shown that AgroNT can be
fine-tuned to obtain state-of-the-art predictions of many genomic elements,
including polyadenylation sites, splice sites, open chromatin and enhancer
regions. Furthermore, AgroNT can be fine-tuned, for example, to predict
tissue-specific gene expression levels or to prioritize functional variants.
Building on our Nucleotide Transformer, the novel SegmentNT model is able
to make nucleotide-resolution predictions, well suited to tasks such as de
novo genome annotation of previously unseen species. Both our AgroNT and
SegmentNT models are open-sourced for academic research and non-commercial
uses on our GitHub repository
https://github.com/instadeepai/nucleotide-transformer and HuggingFace space
https://huggingface.co/InstaDeepAI.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
This is a friendly reminder of tomorrow's webinar at 12 PM CDT. *Seth
Murray* <https://soilcrop.tamu.edu/people/murray-seth-c/> (Texas A&M
University, TAMU) will present on temporal field phenomics.
I have included below more details about the webinar and the Zoom link to
attend the webinar.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Capturing Nature AND Nurture with Temporal Field Phenomics to Breed Better
Crops*
An organism’s phenome results from genotype (nature), environment and
management effects (nurture) and their interactions, as well as measurement
error. For over 30 years, DNA sequencing and genomics tools have advanced
genotyping to where genomes can now be routinely saturated with
measurements. In contrast, most focus in high-throughput phenotyping and
phenomics to date has been on automating previously known “traits” as
measurable and interpretable phenotypes, akin to focusing on measuring a
single DNA marker rather than measuring a saturated genome. Tools such as
unoccupied aerial systems (UAS, aka UAVs, drones) collecting temporal
phenomic measurements in the field now allow novel methods in plant
breeding and new insights into plant biology. Viewing phenomics as a
platform for discovery, similar to genomics, opens new methods for
capturing phenomena in nature and nurture. To date, our experience with
phenomic prediction from UAS in maize breeding for cumulative, complex
phenotypes such as grain yield suggests it’s possible to predict organismal
performance in untested environments, possibly even better than
gold-standard genomic methods. Surprising insights into biology have also
been gained through these activities, including predicting plant disease and
resistance, evaluating genotypic resilience to stress, and identifying
early-season growth periods for crop improvement that previously could not
be selected for. Method development and data analytics in phenomics are large
investments, but worth making. Successfully measuring the phenome will
impact every aspect of science and society, from biological disciplines
(germplasm curators, physiologists, and breeders) to education, the courtroom,
and policy.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Hi everybody,
Join us next Wednesday, August 7th, at 12 PM CDT for our monthly webinar. *Seth
Murray* <https://soilcrop.tamu.edu/people/murray-seth-c/> (Texas A&M
University, TAMU) will present on temporal field phenomics.
I have included below more details about the webinar and the Zoom link to
attend the webinar.
I hope you will join us.
Best,
Annarita
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
*Abstracts:*
*Capturing Nature AND Nurture with Temporal Field Phenomics to Breed Better
Crops*
An organism’s phenome results from genotype (nature), environment and
management effects (nurture) and their interactions, as well as measurement
error. For over 30 years, DNA sequencing and genomics tools have advanced
genotyping to where genomes can now be routinely saturated with
measurements. In contrast, most focus in high-throughput phenotyping and
phenomics to date has been on automating previously known “traits” as
measurable and interpretable phenotypes, akin to focusing on measuring a
single DNA marker rather than measuring a saturated genome. Tools such as
unoccupied aerial systems (UAS, aka UAVs, drones) collecting temporal
phenomic measurements in the field now allow novel methods in plant
breeding and new insights into plant biology. Viewing phenomics as a
platform for discovery, similar to genomics, opens new methods for
capturing phenomena in nature and nurture. To date, our experience with
phenomic prediction from UAS in maize breeding for cumulative, complex
phenotypes such as grain yield suggests it’s possible to predict organismal
performance in untested environments, possibly even better than
gold-standard genomic methods. Surprising insights into biology have also
been gained through these activities, including predicting plant disease and
resistance, evaluating genotypic resilience to stress, and identifying
early-season growth periods for crop improvement that previously could not
be selected for. Method development and data analytics in phenomics are large
investments, but worth making. Successfully measuring the phenome will
impact every aspect of science and society, from biological disciplines
(germplasm curators, physiologists, and breeders) to education, the courtroom,
and policy.
*| 1P ET | 12P CT | 11A MT | 10A PT |*
Find your local time here
<https://www.timeanddate.com/worldclock/fixedtime.html?msg=AgBioData+Monthly…>
*Join Zoom Meeting*
https://us06web.zoom.us/j/82038356125?pwd=YVFMRElMdEpHZmtObXFvZlA4QVFXQT09
Meeting ID: 820 3835 6125
Passcode: 160683
Dear all,
Thank you for being part of the AgBioData community. We are almost at the
end of a three-year National Science Foundation (NSF) RCN project. We want
to estimate the impact of these years' efforts on increasing the awareness
and implementation of FAIR practices in the ag research community.
*If you haven't already*, we invite you to participate in a *brief survey* *on
the impact of AgBioData activities on FAIR data management awareness*. This
survey, which follows up on one we ran in 2022, will help us quantify any
significant change in the implementation of FAIR practices since the
beginning of the project.
Click here <https://tinyurl.com/AgBioData24> to participate in this survey.
Your participation in this survey is crucial for our mission to enhance
FAIR data in agricultural research. It will provide insights to help us
define the consortium's directions and secure future funding. We want to
emphasize that your participation is entirely voluntary and anonymous.
Best,
Annarita
Dear all,
A friendly reminder of *tomorrow's appointment* with the scRNA Biocuration
WG *at 8 a.m. PST / 10 a.m. CST / 11 a.m. EST / 5 p.m. CET* (Zoom link
<https://us06web.zoom.us/j/85091905235?pwd=Oqhn0Pvy3iXd5jObJi0JyacJy0bgmf.1>
).
Muskan Kapoor, a graduate research assistant in Tuggle's lab, will discuss
the current state of developing a single-cell data portal for farm animals.
There are more details on the talk at the bottom of this email.
I hope you will join us.
Best,
Annarita
----------------------------------------------------------------------------------------------------------------
*Abstract*:
*Building a FAIR data ecosystem for incorporating single-cell genomics data
into agricultural G2P research*
The agriculture genomics community has numerous data submission standards
available, but the standards for describing and storing single-cell (SC,
e.g., scRNA-seq) data are comparatively underdeveloped. To bridge this gap,
we leveraged recent advancements in human genomics infrastructure, such as
the integration of the Human Cell Atlas Data Portal with Terra, a secure,
scalable, open-source platform for biomedical researchers to access data,
run analysis tools, and collaborate, co-developed by the Broad Institute of
MIT and Harvard, Microsoft, and Verily. In parallel, the Single Cell
Expression Atlas at EMBL-EBI offers a comprehensive data ingestion portal
for high-throughput sequencing datasets, including plants, protists, and
animals (including humans). Developing data tools connecting these
resources would offer significant advantages to the agricultural genomics
community. The FAANG data portal at EMBL-EBI emphasizes delivering rich
metadata and highly accurate and reliable annotation of farmed animals but
is not computationally linked to either of these resources. Herein, we
describe a pilot-scale project that determines whether the current FAANG
metadata standards for livestock can be used to ingest scRNA-seq datasets
into Terra in a manner consistent with HCA Data Portal standards.
Importantly, rich scRNA-seq metadata can now be brokered through the FAANG
data portal using a semi-automated process, thereby avoiding the need for
substantial expert curation. We have further extended the functionality of
this tool so that validated and ingested SC files within the HCA Data
Portal are transferred to Terra for further analysis. In addition, we
verified data ingestion into Terra, hosted on Azure, and demonstrated the
use of a workflow to analyze the first ingested porcine scRNA-seq dataset.
We have also developed prototype tools to visualize the
output of scRNA-seq analyses on genome browsers to compare gene expression
patterns across tissues and cell populations. This JBrowse tool now
features distinct tracks, showcasing PBMC scRNA-seq alongside two bulk
RNA-seq experiments. We intend to further build upon these existing tools
to construct a scientist-friendly data resource and analytical ecosystem
based on Findable, Accessible, Interoperable, and Reusable (FAIR) SC
principles to facilitate SC-level genomic analysis through data ingestion,
storage, retrieval, re-use, visualization, and comparative annotation
across agricultural species.