What subset of data to load: pixels, echelle orders, nods, full frames, nights?

I was mucking around with the dataloader on IGRINS data today at the Texas Innovation Center meetup with Jae-Joon Lee, Kyle Kaplan, Erica S, and Greg Mace.  We currently subimage a single echelle order and treat the batch index as the nod.  This nodding approach has some regularization advantage: we think the underlying spectrum and coordinate system are unchanged from nod to nod, and only the pixels the target fell upon differ.  That gives lots of resilience to cosmic rays, etc.

But the problem is that the IGRINS data are too large: I run out of GPU RAM.  We could simply move to a bigger GPU, and we should.  But it made me wonder whether we should instead grab a random subset of the pixels on each iteration.  We'll have to run much longer, but eventually every pixel will be seen.  One risk is that random subsets may encourage a random walk in parameters that we don't want to drift much.

How this would work: the data loader yields some number of pixels at a time, say 10,000-40,000 depending on GPU memory usage.  IGRINS has about 4 million pixels, so it'll take ~100-400 iterations just to see every pixel once (one epoch).  Then we'll need to run for hundreds or thousands of epochs.  But we can do it with 1 (cheap) GPU.
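To make the iteration count concrete, here is a minimal sketch of such a loader in plain NumPy. It is not the existing dataloader: `pixel_batches` is a hypothetical helper, and the 2048 x 2048 frame size is an assumption chosen to match the "about 4 million pixels" figure. Each epoch shuffles the flat pixel indices and yields them in chunks, so every pixel is visited exactly once per epoch.

```python
import numpy as np


def pixel_batches(n_rows, n_cols, batch_size, rng):
    """Yield (i, j) index arrays that cover every pixel once per epoch.

    Hypothetical helper for illustration, not part of the current code.
    """
    n_pix = n_rows * n_cols
    order = rng.permutation(n_pix)  # shuffled flat pixel indices
    for start in range(0, n_pix, batch_size):
        flat = order[start:start + batch_size]
        # Convert flat indices back to (row, column) coordinates.
        i, j = np.unravel_index(flat, (n_rows, n_cols))
        yield i, j


rng = np.random.default_rng(0)
n_rows = n_cols = 2048  # assumed frame size, about 4.2 million pixels
batches = list(pixel_batches(n_rows, n_cols, 40_000, rng))
n_batches = len(batches)  # iterations needed for one epoch
print(n_batches)
```

With 40,000 pixels per batch this gives 105 iterations per epoch; with 10,000 it would be 420, in line with the rough estimate above. The last batch in each epoch is smaller than the rest.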

Specifically, `model.forward(inds)` would take in this set of $(i,j)$ pixel coordinates.  We tried something similar with blase and it *didn't* work.  So I'm not too hopeful.  Just brainstorming ways to get IGRINS to work.  
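As a rough illustration of that interface only, the sketch below uses a toy NumPy stand-in. `ToyPixelModel` and `minibatch_loss` are hypothetical names, and the tilted-plane "scene" has nothing to do with the real spectral model. The point is just that the model is evaluated at the sampled $(i,j)$ coordinates and compared against `data[i, j]`, so the full frame never has to be rendered at once.

```python
import numpy as np


class ToyPixelModel:
    """Hypothetical stand-in for the real model: a tilted plane."""

    def __init__(self, amplitude=1.0, row_slope=0.0, col_slope=0.0):
        self.amplitude = amplitude
        self.row_slope = row_slope
        self.col_slope = col_slope

    def forward(self, inds):
        # inds is a pair of integer arrays: row indices i, column indices j.
        i, j = inds
        return self.amplitude + self.row_slope * i + self.col_slope * j


def minibatch_loss(model, data, inds):
    """Mean squared residual over only the sampled pixels."""
    i, j = inds
    resid = data[i, j] - model.forward(inds)
    return np.mean(resid ** 2)


rng = np.random.default_rng(42)
shape = (64, 64)  # small frame so the example runs quickly
model = ToyPixelModel(amplitude=2.0, row_slope=0.005, col_slope=0.01)
data = model.forward(np.indices(shape))  # noiseless fake data

# Draw a random subset of pixels and evaluate the loss on it alone.
flat = rng.choice(shape[0] * shape[1], size=500, replace=False)
inds = np.unravel_index(flat, shape)
loss = minibatch_loss(model, data, inds)
```

The same `data[i, j]` fancy indexing works on PyTorch tensors with integer index tensors, so the pattern should carry over, though whether the real model can be evaluated pixel-by-pixel without building the full frame is exactly the open question.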

Maybe a bigger GPU is easier than changing the architecture...
