What subset of data to load: pixels, echelle orders, nods, full frames, nights? #25

@gully

I was mucking around with the dataloader on IGRINS data today at the Texas Innovation Center meetup with Jae-Joon Lee, Kyle Kaplan, Erica S, and Greg Mace. We currently subimage a single echelle order, and we treat the index as the nod. This nodding approach has some regularization advantage because we think the underlying spectrum and coordinate system are unchanged from nod to nod; only the pixels the target falls on change. That gives lots of resilience to cosmic rays, etc.

But the problem is that the IGRINS data are too large: I run out of GPU RAM. We could simply move to a bigger GPU, and we should. But it made me wonder whether we should instead grab a random subset of the pixels at each step. We'll have to run much longer, but eventually every pixel will be seen. The downside is that this may encourage a random walk in parameters that we don't want to drift too much.

How this would work: the data loader yields some number of pixels at a time, say 10,000-40,000 depending on GPU memory usage. An IGRINS frame has about 4 million pixels, so it'll take ~100-400 iterations just to see the full frame once (one epoch). Then we'll need to run for hundreds or thousands of epochs. But we can do it with 1 (cheap) GPU.
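To make the iteration count concrete, here is a minimal sketch of such a pixel-level loader in plain NumPy. The names `pixel_batches` and `batches_per_epoch` are hypothetical, and the 2048 x 2048 frame size is an assumption chosen to match the ~4 million pixels quoted above. It shuffles the flat pixel indices once per epoch, so every pixel is seen exactly once.

```python
import math
import numpy as np


def pixel_batches(shape, batch_size, rng):
    """Yield (rows, cols) index arrays that cover every pixel once per epoch."""
    n_pix = shape[0] * shape[1]
    order = rng.permutation(n_pix)  # shuffle flat pixel indices, no repeats
    for start in range(0, n_pix, batch_size):
        flat = order[start:start + batch_size]
        # Convert flat indices back to (i, j) pixel coordinates
        yield np.unravel_index(flat, shape)


def batches_per_epoch(shape, batch_size):
    """Number of iterations needed to visit every pixel once."""
    return math.ceil(shape[0] * shape[1] / batch_size)


frame_shape = (2048, 2048)  # assumed detector size, ~4.2 million pixels
for batch_size in (10_000, 40_000):
    print(batch_size, batches_per_epoch(frame_shape, batch_size))
```

Sampling without replacement within an epoch guarantees full coverage; drawing each batch independently would be simpler but leaves some pixels unseen for many iterations.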

Specifically, `model.forward(inds)` would take in this set of $(i,j)$ pixel coordinates. We tried something similar with blase and it didn't work, so I'm not too hopeful. Just brainstorming ways to get IGRINS to work.
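A rough sketch of what a coordinate-based `forward` could look like, again in NumPy rather than the project's actual framework. `ToyPixelModel` and `minibatch_loss` are hypothetical stand-ins, not the real model: the point is only that the scene is evaluated at the requested $(i,j)$ pixels and compared against the matching data pixels, so memory scales with the batch size rather than the frame size.

```python
import numpy as np


class ToyPixelModel:
    """Hypothetical stand-in: a Gaussian trace profile across detector rows."""

    def __init__(self, amplitude, center_row, width):
        self.amplitude = amplitude
        self.center_row = center_row
        self.width = width

    def forward(self, inds):
        """Evaluate the model only at the requested (i, j) pixels."""
        rows, cols = inds  # cols is unused in this toy profile
        profile = np.exp(-0.5 * ((rows - self.center_row) / self.width) ** 2)
        return self.amplitude * profile


def minibatch_loss(model, data, inds):
    """Mean squared residual over the sampled pixels only."""
    residual = data[inds] - model.forward(inds)
    return np.mean(residual ** 2)


shape = (64, 32)  # small toy frame, not the IGRINS detector size
model = ToyPixelModel(amplitude=100.0, center_row=30.0, width=3.0)
rows, cols = np.indices(shape)
data = model.forward((rows, cols))  # noiseless fake data from the same model

rng = np.random.default_rng(0)
flat = rng.choice(shape[0] * shape[1], size=200, replace=False)
inds = np.unravel_index(flat, shape)
loss = minibatch_loss(model, data, inds)
```

Because the fake data come from the same model, the loss here should vanish; with real data, each minibatch gives a noisy estimate of the full-frame loss, which is where the parameter drift concern comes in.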

Maybe a bigger GPU is easier than changing the architecture...
