Run RawKNNClassifier._predict_fc as parallel to avoid memory issues #9

Open
grovduck wants to merge 5 commits into main from predict-fc-batch

Conversation

@grovduck

Member

Closes #8 by creating a joblib.Parallel job that retrieves predictions for a feature collection in batches. Two options control the job: chunk_size sets the batch size and num_threads sets the number of threads to use. Once all neighbors are retrieved, the batch results are stitched back together into a single ee.FeatureCollection.
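
For readers unfamiliar with the pattern, here is a minimal sketch of the split, predict, and stitch flow described above. It is not the code in this PR: it uses the standard library's ThreadPoolExecutor as a stand-in for joblib.Parallel, plain Python dicts as a stand-in for an ee.FeatureCollection, and the helper names (chunked, predict_batch, predict_in_batches) are hypothetical. Only chunk_size and num_threads come from the PR description.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, chunk_size):
    """Split a list into consecutive batches of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def predict_batch(batch):
    """Hypothetical stand-in for one batch request.

    A real implementation would retrieve neighbors for this batch from the
    server; here we return dummy neighbor ids so the sketch runs offline.
    """
    return [{"id": f["id"], "neighbors": [f["id"] + 1, f["id"] + 2]} for f in batch]


def predict_in_batches(features, chunk_size=2, num_threads=2):
    """Predict each batch on a thread pool, then stitch results in order."""
    batches = chunked(features, chunk_size)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in the same order as the input batches,
        # so the stitched output lines up with the original features.
        results = list(pool.map(predict_batch, batches))
    # Flatten the per-batch results back into one list.
    return [row for batch_result in results for row in batch_result]


features = [{"id": i} for i in range(5)]
predictions = predict_in_batches(features, chunk_size=2, num_threads=2)
```

Smaller batches keep each individual request within memory limits at the cost of more round trips, which is the trade-off chunk_size is meant to expose.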

Note that this approach keeps the work server-side, which may not be the best workflow for this use case. Feature collection mode is typically used either to cross-validate the plots used to fit the model or to predict a new set of targets. We are investigating an alternative: 1) converting the feature collection client-side; and 2) using sknnr to run the prediction locally.

We will keep this PR open as we decide on the best path forward.

Development

Successfully merging this pull request may close these issues.

User memory issues on immediate retrieval of predict for feature collections

1 participant