A portable R script that produces BibTeX citations, a network-conditional acknowledgments file, and a review-flags file for a list of FLUXNET sites you have downloaded. It draws on the manifest from your FLUXNET shuttle download as its metadata source — not a live query to the shuttle.
FLUXNET is not a single monolithic dataset. The measurements behind each tower site were collected by an individual site team — typically a principal investigator plus students, technicians, and collaborators — who installed the instruments, maintained them through seasons and storms, and spent years quality-controlling the records. A single FLUXNET site entry can represent a decade or more of on-the-ground work before a single data file is made available for synthesis.
The data hubs — AmeriFlux, ICOS, TERN, SAEON, JapanFlux, KoFlux, and others — coordinate distribution, standardize processing through the ONEFlux pipeline, and host the downloads. They are essential infrastructure. But they do not own the underlying measurements; those belong to the site teams. The hubs require user attribution as a condition of access, and this is not bureaucracy: it is how site teams demonstrate impact, justify continued funding, recruit students, and get credit for work that often spans careers.
The required attribution goes beyond citing the hub or the ONEFlux pipeline paper. It requires an individual site-level reference for every tower whose data appears in your analysis. A study using 50 FLUXNET sites needs 50 site-level citations in its reference list — one per tower — plus the relevant hub-level and synthesis references. Assembling those 50 citations by hand from the FLUXNET registry is tedious and error-prone. That is what this tool does for you.
Failing to attribute individual sites correctly is unfair to the researchers whose work you used, and it may breach the data use agreements you accepted when you downloaded the data. Both outcomes are avoidable.
Does:
- Reads a manifest CSV from a completed FLUXNET shuttle download.
- Produces a `.bib` file with one `@misc` BibTeX entry per site, formatted correctly for the site's hub family (AmeriFlux, ICOS, TERN, SAEON), plus mandated synthesis references appended at the end.
- Produces a `_acknowledgments.md` file with network-conditional acknowledgment text ready to paste into a manuscript.
- Produces a `_review_flags.md` file noting per-site issues that need human attention before the bibliography is finalised (parse errors, missing identifiers, preserved upstream typos, no-author ICOS entries).
- Optional subsetting: an optional `site_ids_csv` parameter restricts citations to a named subset of the manifest. Any site ID in the list that is not present in the manifest causes an explicit error: you cannot cite what you have not downloaded.
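The subset rule above (every requested site ID must be present in the manifest) can be sketched in a few lines of base R. This is a minimal illustration, not the tool's actual code: the helper name `check_site_subset()` and the error wording are assumptions made for this example.

```r
# Hypothetical helper: illustrates the "explicit error on unknown site ID"
# rule. It is not the implementation used by generate_fluxnet_citations().
check_site_subset <- function(requested_ids, manifest_ids) {
  # IDs that were requested but do not appear in the manifest
  missing_ids <- setdiff(requested_ids, manifest_ids)
  if (length(missing_ids) > 0) {
    stop(
      "Site IDs not found in manifest: ",
      paste(missing_ids, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(requested_ids)
}

manifest_ids <- c("US-Ha1", "DE-Tha", "AU-How")

# Passes silently: both IDs are in the manifest.
check_site_subset(c("US-Ha1", "AU-How"), manifest_ids)

# Would stop with an error, because "XX-Zzz" was never downloaded:
# check_site_subset(c("US-Ha1", "XX-Zzz"), manifest_ids)
```

Failing loudly here, instead of dropping unknown IDs, keeps the bibliography from quietly covering fewer sites than the analysis used.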
Does not:
- Download FLUXNET data. A completed download is a prerequisite.
- Query `flux_listall()` or the shuttle at citation-generation time. Citations reflect the manifest's metadata at download time. This is intentional: pulling fresh metadata later could produce citations that don't match what was actually downloaded. Six months of upstream changes (new sites added, data re-versioned, author typos corrected, PI lists updated) could silently corrupt your bibliography.
- Verify citations against the BIF (site information) files on disk.
- Check author affiliations or resolve persistent identifiers (DOI, handle) against their registries.
- Produce non-BibTeX output formats.
R 4.0 or later (R 4.6 recommended).
Three packages from CRAN:

```r
install.packages(c("dplyr", "readr", "stringr"))
```

No other R dependencies. No Python. No environment variables.

To also run the shuttle download workflow that produces the manifest in the
first place, install Eric Scott's `fluxnet` R package:

```r
install.packages("pak")
pak::pak("EcosystemEcologyLab/fluxnet-package")
```

The `example/` directory in this repo contains a frozen 10-site manifest so
you can run the tool and inspect the outputs without installing the `fluxnet`
package or downloading any data.
```r
# From the fluxnet-citations/ directory:
setwd("example")
source("../generate_fluxnet_citations.R")
generate_fluxnet_citations(
  site_ids_csv  = "example_sites.csv",
  manifest_path = "example_manifest.csv",
  output_prefix = "output/example"
)
```

Three files are written to `example/output/`:

- `output/example.bib`
- `output/example_acknowledgments.md`
- `output/example_review_flags.md`

See `example/README.md` for a full description of the expected outputs.
Download your sites using the shuttle. The recommended path is through the
`fluxnet` R package, which handles authentication and batch management:

```r
library(fluxnet)

# Always save the manifest before downloading.
manifest <- flux_listall()
readr::write_csv(manifest, "fluxnet_shuttle_snapshot_20260601T120000.csv")

flux_download(
  file_list_df = manifest,
  download_dir = "data/raw"
)
```

Save `flux_listall()` output to a CSV file before calling `flux_download()`.
That saved CSV is your manifest. The manifest freezes the metadata state
(author names, product IDs, citation strings) as it existed when you
downloaded the data. Without it, you have no durable record of that state.
`flux_download()` itself does not save a manifest; if you skip this step, you
cannot recover it exactly later.
The manifest is the CSV you saved in step 1. Its filename follows the pattern
fluxnet_shuttle_snapshot_<YYYYMMDDTHHMMSS>.csv. Keep it alongside your data
files or in a version-controlled location.
See Where is my manifest? below for details.
If your analysis used only a subset of your downloaded sites, create a
one-column CSV with a `site_id` header:

```
site_id
US-Ha1
DE-Tha
AU-How
```

If you want citations for all sites in your manifest, skip this step.
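If you do need a subset and already hold the site IDs in an R vector, the one-column file can be written from R instead of by hand. The sketch below uses only base R; the filename `my_sites.csv` is an example choice, not a required name.

```r
# Site IDs used in the analysis (example values).
my_sites <- data.frame(site_id = c("US-Ha1", "DE-Tha", "AU-How"))

# Write a one-column CSV with a site_id header and no row names or quotes.
write.csv(my_sites, "my_sites.csv", row.names = FALSE, quote = FALSE)

# Read it back to confirm the header and contents.
check <- read.csv("my_sites.csv")
print(check$site_id)
```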
```r
source("generate_fluxnet_citations.R")
generate_fluxnet_citations(
  site_ids_csv  = "my_sites.csv",  # omit to cite all manifest sites
  manifest_path = "fluxnet_shuttle_snapshot_20260601T120000.csv",
  output_prefix = "outputs/citations/fluxnet_2026"
)
```

The manifest is the CSV produced by `flux_listall()` and saved before your
download run.
Behavioral rule: Always save your flux_listall() output to a CSV file
before calling flux_download(). That saved CSV is your manifest. The shuttle
does not save one for you automatically — you must do it explicitly.
```r
manifest <- flux_listall()
readr::write_csv(manifest, paste0(
  "fluxnet_shuttle_snapshot_",
  format(Sys.time(), "%Y%m%dT%H%M%S"),
  ".csv"
))
```

If you used the paper-repo workflow (`01_download.R`): the manifest was
saved automatically by write_snapshot() before each download run. It lives
in data/snapshots/fluxnet_shuttle_snapshot_<timestamp>.csv in your
repository clone. The most recent snapshot written before or during your
download run is the correct input. There may be several timestamped snapshots
in data/snapshots/ if you ran the download in multiple batches; use the one
that matches the run in question.
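When `data/snapshots/` holds several timestamped files, choosing "the most recent snapshot written before or during the run" can be done from the filenames alone. The sketch below is an illustration under stated assumptions: `pick_snapshot()` is a hypothetical helper (not part of this tool or the `fluxnet` package), filenames follow the `fluxnet_shuttle_snapshot_<YYYYMMDDTHHMMSS>.csv` pattern, and the filename timestamps and the run time are in the same time zone (UTC here, for simplicity).

```r
# Hypothetical helper: returns the newest snapshot whose filename timestamp
# is at or before run_end (the time the download run finished).
pick_snapshot <- function(snapshot_dir, run_end, tz = "UTC") {
  files <- list.files(
    snapshot_dir,
    pattern = "^fluxnet_shuttle_snapshot_[0-9]{8}T[0-9]{6}\\.csv$",
    full.names = TRUE
  )
  # Extract the timestamp portion of each filename and parse it.
  stamps <- sub("^fluxnet_shuttle_snapshot_(.*)\\.csv$", "\\1", basename(files))
  times <- as.POSIXct(stamps, format = "%Y%m%dT%H%M%S", tz = tz)

  eligible <- !is.na(times) & times <= run_end
  if (!any(eligible)) {
    stop("No snapshot found at or before the given run time.", call. = FALSE)
  }
  candidates <- files[eligible]
  candidates[which.max(as.numeric(times[eligible]))]
}

# Toy demonstration with empty placeholder files in a temporary directory.
demo_dir <- file.path(tempdir(), "snapshots_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c(
  "fluxnet_shuttle_snapshot_20260601T120000.csv",
  "fluxnet_shuttle_snapshot_20260603T090000.csv",
  "fluxnet_shuttle_snapshot_20260610T150000.csv"
)
invisible(file.create(file.path(demo_dir, demo_files)))

run_end <- as.POSIXct("2026-06-05 00:00:00", tz = "UTC")
chosen <- pick_snapshot(demo_dir, run_end)
print(basename(chosen))
```

If the snapshots were written with `Sys.time()` in local time, pass the matching `tz` value so the comparison is not skewed by a time-zone offset.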
If you downloaded with the Python shuttle CLI directly: check whether you
ran fluxnet-shuttle listall -o <dir> before downloading. If you did, the
output is fluxnet_shuttle_snapshot_<timestamp>.csv in the directory you
specified. If you did not save a manifest before downloading, there is no exact
recovery path; your best option is to call flux_listall() now and save it,
accepting that some metadata may have drifted since your download.
All three outputs share the same prefix you pass to output_prefix.
{prefix}.bib — BibTeX entries, one @misc per site. A %% NOTICE
block at the top of the file explains the provenance. Sites are followed by
mandated synthesis references (@article entries) that vary by the networks
present in your site list.
{prefix}_acknowledgments.md — Network-conditional acknowledgment text.
Each hub family (AmeriFlux, ICOS, TERN/OzFlux, ChinaFlux/KoFlux, SAEON) has
its own paragraph, generated only when sites from that family are present. A
data availability statement with [SOURCE] and [DATE] placeholders appears
at the end; fill these in before submission.
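The "generated only when sites from that family are present" behaviour can be pictured as a lookup from hub family to paragraph, filtered by the families that occur in the site list. The sketch below is an assumption-laden illustration: the function name, the per-site hub labels, and the bracketed paragraph text are placeholders, not the tool's internals or the hubs' actual acknowledgment wording.

```r
# Placeholder paragraphs keyed by hub family. The real wording comes from
# each hub's data policy and is NOT reproduced here.
ack_templates <- c(
  AmeriFlux = "[AmeriFlux acknowledgment paragraph]",
  ICOS      = "[ICOS acknowledgment paragraph]",
  TERN      = "[TERN/OzFlux acknowledgment paragraph]",
  SAEON     = "[SAEON acknowledgment paragraph]"
)

# Hypothetical helper: keep only the paragraphs whose hub family appears
# at least once among the cited sites, in template order.
build_acknowledgments <- function(site_hubs, templates) {
  present <- intersect(names(templates), unique(site_hubs))
  unname(templates[present])
}

# Toy input: one hub label per cited site (labels assumed for illustration).
site_hubs <- c("ICOS", "ICOS", "AmeriFlux")
ack_text <- build_acknowledgments(site_hubs, ack_templates)
cat(ack_text, sep = "\n\n")
```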
{prefix}_review_flags.md — Sites that need human review before the
bibliography is finalised. Common flags:
- ICOS/JPF/KOF sites with no author block in their citation string
- Preserved upstream typos (e.g., `AU-Cow`'s `FLUXNEXT`, `US-Hsm`'s `Kyle_Delwiche`), flagged and preserved verbatim, not silently corrected
- Sites missing a `product_id` (entry generated, but no persistent identifier)
- Parse errors (rare; indicate an unexpected citation format in the manifest)
- A checklist of mandated references to verify before submission
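Two of the flags listed above, a missing `product_id` and a known upstream typo, are simple enough to sketch directly. The example below is illustrative only: `flag_manifest_rows()` is a hypothetical helper, the `site_id` column name is an assumption about the manifest layout, and the toy rows use placeholder site IDs and citation strings, not real manifest data.

```r
# Hypothetical helper: returns one row per (site, issue) pair.
flag_manifest_rows <- function(manifest,
                               typo_patterns = c("FLUXNEXT", "Kyle_Delwiche")) {
  # Missing identifier: NA or an empty/whitespace-only string.
  missing_id <- is.na(manifest$product_id) |
    !nzchar(trimws(manifest$product_id))
  # Known upstream typos: flagged, never corrected.
  has_typo <- grepl(paste(typo_patterns, collapse = "|"),
                    manifest$product_citation)
  rbind(
    data.frame(site_id = manifest$site_id[missing_id],
               flag = rep("missing product_id", sum(missing_id))),
    data.frame(site_id = manifest$site_id[has_typo],
               flag = rep("known upstream typo preserved verbatim",
                          sum(has_typo)))
  )
}

# Toy manifest with placeholder values (not real FLUXNET metadata).
toy_manifest <- data.frame(
  site_id = c("XX-Aaa", "XX-Bbb", "XX-Ccc"),
  product_id = c("placeholder-id-1", NA, ""),
  product_citation = c(
    "Placeholder citation mentioning FLUXNEXT",
    "Placeholder citation with no issues",
    "Placeholder citation by Kyle_Delwiche"
  )
)

flags <- flag_manifest_rows(toy_manifest)
print(flags)
```

The citation strings are only inspected, never modified, which matches the preserve-and-flag policy described above.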
- A manifest is required. Users who have not downloaded FLUXNET data, or who downloaded without saving a manifest, cannot use this tool. This is intentional: the manifest is the only durable record of what you actually downloaded and what the metadata said at that time.
- Upstream typos are preserved, not corrected. `AU-Cow`'s product citation contains "FLUXNEXT" (a typo for "FLUXNET") and `US-Hsm`'s citation contains the author name "Kyle_Delwiche" (with a literal underscore). These are reproduced verbatim from the manifest and flagged in `_review_flags.md`. They reflect upstream data; correcting them silently would make your citations diverge from the registered record.
- No automated verification. Citations are taken from the manifest's `product_citation` field without cross-checking against BIF files or the FLUXNET registry. The `_review_flags.md` output is not optional reading before submission: it exists specifically to flag cases that need a human check.
- BibTeX only. No RIS, CSL, or plain-text output formats.
See CITATION.md for:
- How to cite this tool (placeholder; to be updated when the FLUXNET 2026 annual paper publishes)
- How to cite Eric Scott's `fluxnet` R package
- The Pastorello et al. 2020 ONEFlux pipeline reference (placeholder; to be updated with the FLUXNET 2026 synthesis citation)
- A note on why site-level citations are required and not optional
MIT. See LICENSE.
Issues and pull requests are welcome on GitHub: https://github.com/EcosystemEcologyLab/fluxnet-citations
For direct contact: David J. P. Moore — davidjpmoore@arizona.edu