JOSE paper #657
|
@navidcy - I appreciate being included and happy to be here. I will attempt to read today before the tidal pull of the multi-tasking moon sinks this deep into the to-do list. Thanks for your efforts and those of other authors. |
| require non-trivial development, understanding of model numerics, and validation before | ||
| they can be used confidently. If every new user or project had to reconstruct | ||
| those workflows independently, a large amount of effort would be repeatedly spent | ||
| on rebuilding analysis code rather than conducting research using the model output. |
not convinced that "using the model output" is adding much here; could just be deleted
| other groups. First, each notebook is intended to be self-contained and | ||
| well-documented, so learners can read, run, and modify a complete workflow | ||
| without reconstructing missing context from scattered notes. Second, the | ||
| repository distinguishes tutorials from recipes. Tutorials teach transferable |
"Second, the repository distinguishes tutorials from recipes." When I first read this sentence, I thought "hang on, that needs to be explained earlier on". I think it might be in the initial cookbook introduction paragraph, but I missed it on the first pass because of the foregrounding of cooking terms. If you're okay with my suggested rephrase there, then ignore this comment; otherwise I think the purpose and distinguishing features of recipes & tutorials should be made clearer somehow
| tutorials and finally to recipe notebooks that address concrete scientific | ||
| questions. The categories of "easy" and "advanced" recipes provide | ||
| a lightweight pedagogical cue about expected complexity and scope, with | ||
| "regional specialties" specifically covering recipes for regional model configurations |
Is this "recipes to do analysis on regional model configurations"? In which case I'm not clear on why the regional configs need their own recipes, and why these aren't the same thing as the global model grids.
Or is it "recipes to set up regional model configurations"? In which case I think that deserves more clarity. Perhaps "specifically covering recipes to configure a regional model"?
|
Thanks Navid & Julia I'm happy to contribute, though I don't really feel I've done enough. I'm going to trust that you've used some reasonable principles to make this decision, so I'll say, "Sure, count me in!" but please be aware that I'm not going to be offended if I'm removed from the author list because I haven't really done much. I've left a handful of comments on the text. My one generic comment is to do with the COSIMA ethics, and if it's related enough as a set of guidelines for sharing code that it's worth mentioning? Also, I was very confused for years about what the purpose of the COSIMA cookbook was, and the value that other community members were getting from it (I'm not denying that it clearly had value for people, the specifics were just confusing me). It's great that we're having conversations about this now, and personally that's going to make it much easier for me to contribute and support the community. I do think that the opening & point of this paper could be further strengthened by having some more of those discussions, and being more clear on whether we want it to be a teaching tool for the basics of how you do analysis, or a teaching tool for specific sets of analysis in general, or a bunch of stuff that just works to begin with and lets you focus on other things. Maybe my mind's being clouded by the fact that I know these discussions are happening, but it seems like the intro might be trying to be vague about which of these purposes are being served and I'm not sure that's beneficial. |
Yes, we have reasonable principles in place. |
Excellent points! They also resonate with some comments made by others in this PR or through chats here and there. |
| those workflows independently, a large amount of effort would be repeatedly spent | ||
| on rebuilding analysis code rather than conducting research using the model output. | ||
|
|
||
| The COSIMA Cookbook facilitates knowledge sharing and accelerates research with |
Maybe "knowledge sharing" is a bit broad here? Would something like "the sharing of analysis methods and workflows" be a bit more accurate?
actually "workflow" appears quite frequently in this paragraph already
| affiliation: 2 | ||
| - name: Matthis Auger | ||
| orcid: 0000-0001-6228-5732 | ||
| affiliation: 16 |
Now in "University of Brest, France"
|
Hi @navidcy, I am happy to be included. I'll have a read through it and will provide some feedback next week! Cheers, Jaap |
|
The paper is looking great! Well done everyone, particularly @navidcy and @julia-neme . I have added a few comments. More generally, we should consider mentioning AI coding tools for the following reasons:
|
|
Just a friendly reminder for people to read this and provide comments by June 30th. Thanks! Also: @chrisb13, @NoahDay, @AVEllepola, @lidefi87, @mauricehuguenin, @aidanheerdegen, @AndyHoggANU, @wghuneke, @ruth-moorman, @paigem, @max-anu, @mmr0, @PaulSpence I haven't heard from you whether you wanna be part of this. If I don't hear back by June 30th I'll take you off the manuscript. |
|
@navidcy -- all, thanks for putting this together -- great idea! I do apologise that I missed this. I would appreciate being included, I'll have a read this week. Let me know if there's anything specific it would be helpful for me to look at. |
|
Thanks @navidcy and @julia-neme for leading this! I would love to be included. I will read the draft asap :) |
|
Hi @navidcy if you are happy to have me I am equally happy to be included. Thanks for organising this. I have read the draft but have no comments at this time. |
Likewise I think the draft is great! A lot of breadth to cover in a limited number of pages; you've done well |
paigem
left a comment
I'm thrilled to see this paper in the works - the Cookbook is very worthy of an official publication! Like many others, I don't feel like I've necessarily contributed enough, but I trust your process and am happy to be included.
A few wording suggestions - hopefully they are helpful, but feel free to ignore if not!
| landing page used to navigate tutorials and recipes. The live site is available | ||
| at https://cosima-recipes.readthedocs.io. \label{fig:website}](website.png) | ||
|
|
||
| Following the "Cookbook" concept, the website sections are deliberately named |
I agree with the comments here. As a first pass, I suggest rewording the first sentence so as not to imply that the concept of a "Cookbook" has been mentioned before this paragraph. One rewording option is:
| Following the "Cookbook" concept, the website sections are deliberately named | |
| The COSIMA Cookbook is deliberately organized |
| at https://cosima-recipes.readthedocs.io. \label{fig:website}](website.png) | ||
|
|
||
| Following the "Cookbook" concept, the website sections are deliberately named | ||
| using a gastronomy theme. "Cooking Tutorials" refers to tutorials that teach |
I think it would be helpful to add a few words to explain why the gastronomy theme is used, as it may not be immediately obvious to a new reader of the paper. E.g.:
| using a gastronomy theme. "Cooking Tutorials" refers to tutorials that teach | |
| using a gastronomy theme to help users quickly understand and navigate the website. "Cooking Tutorials" refers to tutorials that teach |
| that are more elaborate and demonstrate advanced analysis techniques, and lastly the "Regional Specialties" | ||
| contains recipes for regional model configurations [@barnes2024regionalmom6]. | ||
| Together these sections present the documentation as a browsable collection of | ||
| lessons and recipes gathered into a single cookbook. |
I think it would be worth mentioning that the concept of "Cookbook" for computational workflows is not unique - COSIMA is just one example of using this terminology. I suggest a potential sentence to add at the end of this paragraph where we'd link to a couple other examples. There are likely some better examples out there - these are just some I'm aware of!
| lessons and recipes gathered into a single cookbook. | |
| lessons and recipes gathered into a single cookbook. The COSIMA Cookbook is one of several examples of using a gastronomy theme for computational workflows, for example [Project Pythia Cookbooks](https://cookbooks.projectpythia.org) for the general geoscience community and the [FAIR Cookbook](https://faircookbook.elixir-europe.org/content/home.html) for working with life science data. |
|
|
||
| Within this structure, the tutorials leverage and demonstrate open source tools for | ||
| scientific analysis of large data. This includes examples of loading model output | ||
| through Intake-based catalogues [@intake], using xarray [@hoyer2017xarray] |
Pretty sure Xarray should be capitalized (see Xarray docs here)... (happy to be proven wrong if others disagree!)
| through Intake-based catalogues [@intake], using xarray [@hoyer2017xarray] | |
| through Intake-based catalogues [@intake], using Xarray [@hoyer2017xarray] |
| Ocean and climate model analysis has a steep entry cost. New users must learn | ||
| how to find datasets, load them efficiently, interpret metadata, operate on | ||
| multi-dimensional arrays, and produce scientifically meaningful diagnostics. | ||
| General-purpose libraries such as xarray [@hoyer2017xarray] provide the |
| General-purpose libraries such as xarray [@hoyer2017xarray] provide the | |
| General-purpose libraries such as Xarray [@hoyer2017xarray] provide the |
| always practical to distribute or mirror in full. However, beyond the initial | ||
| data-loading step, which usually occurs early in each recipe, the analysis | ||
| workflows are generally applicable to most operational ocean--sea ice model | ||
| outputs that can be loaded with xarray [@hoyer2017xarray]. |
| outputs that can be loaded with xarray [@hoyer2017xarray]. | |
| outputs that can be loaded with Xarray [@hoyer2017xarray]. |
| "regional specialties" specifically covering recipes for regional model configurations | ||
| [@barnes2024regionalmom6]. | ||
|
|
||
| The intended mode of use is also explicit. The repository is designed around |
I would be more explicit that these recipes/tutorials will only run (as is) on NCI.
I think "designed around" wasn't clear enough to me while reading that these are specifically for use on NCI.
| The intended mode of use is also explicit. The repository is designed around | |
| The intended mode of use is also explicit. The tutorials and recipes showcase Jupyter-based analysis that can be run directly on Australia's National Computational Infrastructure, |
Also, since JupyterLab is mentioned later in this sentence, we could also remove mention of that here.
| [@barnes2024regionalmom6]. | ||
|
|
||
| The intended mode of use is also explicit. The repository is designed around | ||
| Jupyter-based analysis on the Australian National Computational Infrastructure, |
Remove if you accept my previous suggested edit
| Jupyter-based analysis on the Australian National Computational Infrastructure, |
| accessing shared data holdings. This makes the Cookbook more than a static | ||
| collection of examples: it is operational documentation for a real analysis | ||
| environment. At the same time, the notebooks demonstrate broadly transferable | ||
| patterns for working with labelled geophysical data, so individual lessons can |
| patterns for working with labelled geophysical data, so individual lessons can | |
| workflows for working with labelled geophysical data, so individual lessons can |
| collection of examples: it is operational documentation for a real analysis | ||
| environment. At the same time, the notebooks demonstrate broadly transferable | ||
| patterns for working with labelled geophysical data, so individual lessons can | ||
| be reused or adapted outside that environment. |
Trying to use clearer language here
| be reused or adapted outside that environment. | |
| be reused or adapted for specific research questions or for use on other computational platforms. |
Happy that you trust the process. :) |
|
Sorry for the delayed response, thanks @navidcy and @julia-neme! I also don't feel like I've contributed enough, but if you think so then I'm happy to be included of course. Either way, I'll have a read over it asap. |
|
A lot of people are commenting that they feel they don't deserve to be part of this or something along those lines. Just a remark: names were not randomly drawn out of a hat or something. |
|
Thank you for including me for this!! Happy to be a part of it and will try to contribute more to the process. |
|
Thanks for leading this effort, happy to be included. I'll have a read before the hackathon. |
|
Count me in - thanks! |
|
Hi Navid, |
| used to analyse modern ocean model output are powerful, but the gap between | ||
| package-level documentation and reproducible end-to-end workflows remains | ||
| large. Users need to understand not only Python and Jupyter | ||
| [@kluyver2016jupyter], but also how to navigate high-dimensional model output, |
| [@kluyver2016jupyter], but also how to navigate high-dimensional model output, | |
| [@kluyver2016jupyter], but also how to navigate high-dimensional and often very large model output, |
| challenge: turning expert tacit knowledge into examples that newcomers can | ||
| adapt. The Cookbook offers a model for how a scientific collaboration can | ||
| capture that knowledge in version-controlled notebooks, publish it as a living | ||
| resource, and continuously improve it through community contribution. |
| resource, and continuously improve it through community contribution. | |
| resource online, and continuously improve it through community contribution. |
Was just thinking of adding a note that the recipes have already been used by other communities. Some in the US have copied parts of recipes and adapted them for use in CESM. So the recipes, by nature of being open-source, have already been very helpful for others.
cf-xarray links to the COSIMA Model Agnostic Analysis recipe in their documentation.
Is that worth mentioning?
| communities maintain model output on shared infrastructure and face the same | ||
| challenge: turning expert tacit knowledge into examples that newcomers can | ||
| adapt. The Cookbook offers a model for how a scientific collaboration can | ||
| capture that knowledge in version-controlled notebooks, publish it as a living |
| capture that knowledge in version-controlled notebooks, publish it as a living | |
| capture that knowledge in version-controlled notebooks, publish it as a living open |
|
@navidcy, I'd love to contribute |
I'm of the opinion that having examples that teach people what they're doing and why is even more important in the time of AI coding tools, not because the tools will work well with these but because I think learning and understanding for yourself is important. If resources like the cookbook don't exist, then people will fall back on the easy AI option without properly understanding what they're doing, and then their code will have bugs. I'm okay with mentioning AI tools, but I think we should be careful not to frame it as "everyone uses AI tools so the cookbook supports this approach" |
|
I have a couple of broad and rather minor points to make:
But otherwise, this was a great read and a good summary of what the cookbook does for the community. |
Hi team!
cc @willaguiar, @MatthisAuger, @ashjbarnes, @rbeucher, @dhruvbhagtani, @chrisb13, @hrsdawson, @NoahDay, @fabiobdias, @edoddridge, @AVEllepola, @matthew-england-unsw, @lidefi87, @angus-g, @mauricehuguenin, @aidanheerdegen, @AndyHoggANU, @rmholmes, @wghuneke, @jemmajeffree, @aekiss, @minghangli-uni, @josuemtzmo, @janjaapmeijer, @Thomas-Moore-Creative, @ruth-moorman, @paigem, @adele-morrison, @jmunroe, @adityarn, @micaeljtoliveira, @ongqingyee, @max-anu, @mmr0, @schmidt-christina, @taimoorsohail, @PaulSpence, @dougiesquire, @anton-seaice, @charles-turner-1, @vsilvafelipe, @marc-white, @luweiyang, @claireyung, and @janzika!
@julia-neme and I have been thinking about publishing a paper about the Cookbook in the Journal of Open Source Education! JOSE is an open source journal (similar to JOSS). They publish relatively short papers (~3-5 pages). But together with the paper, the whole repository and its documentation are also reviewed and published!
(Similar to the regional-mom6 paper.) The whole review process occurs openly on GitHub via issues and pull requests.
We've included as coauthors everybody who has contributed to the repository and had their name in the .zenodo.json file. That said, we may have left somebody out -- it's not intentional! Please point out any such case, or message @navidcy privately if you prefer.
This PR adds a first draft of the JOSE paper. After every commit, a GitHub Action generates a PDF and uploads it at: https://github.com/COSIMA/cosima-recipes/blob/ncc/jose-paper/paper/paper.pdf
What is needed from you at this point?
If you are keen to be part of this then:
Have a read (it's only 3-4 pages)! Click on the link mentioned above to see the PDF.
You can suggest edits on the Markdown file (paper.md) the same way you would do in a PR. Or if you prefer, you can make a PR from your fork to this branch ncc/jose-paper.
Give generic thoughts/feelings/comments as a post in this PR.
Check your affiliation and add any grants in the acknowledgments.
The aim is to submit during or just after the mini-hackathon at the beginning of July 2026.
Closes #665
Confirmed participation [@navidcy edits list below when they see verbal confirmation]