Skip to content

ENH: add reduce_data() + reconstruct_data(); add tests (closes #43)#51

Open
Maryamvalian wants to merge 4 commits into
Eelbrain:masterfrom
Maryamvalian:mv
Open

ENH: add reduce_data() + reconstruct_data(); add tests (closes #43)#51
Maryamvalian wants to merge 4 commits into
Eelbrain:masterfrom
Maryamvalian:mv

Conversation

@Maryamvalian

Copy link
Copy Markdown
Collaborator

@christianbrodbeck @proloyd Please check reduce_data() and reconstruct_data() and related tests. (Closes #43)

@christianbrodbeck christianbrodbeck left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Maryamvalian ! Functionality is looks great! I've added a lot of more coding style based comments :) hope they're helpful! Let me know if any of them are unclear or you're wondering about why.

Comment thread ncrf/_model.py Outdated
Reduce model size and save meta data in model.megmeta.
"""
if getattr(self, "_data", None) is None:
raise ValueError("self._data is None.")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this error message meaningful to end users? I suspect this would correspond to something like "The model has not been fit yet"?

Also, I think this would more properly be a RuntimeError (see https://docs.python.org/3/library/exceptions.html#ValueError)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christianbrodbeck It can happen if the model has already been reduced. The model still exists, but ._data has been removed.

Comment thread ncrf/_model.py Outdated
raise ValueError("self._data is None.")

if not hasattr(self._data, "meg"):
raise ValueError("self._data.meg is missing.")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above (and more below). We should only check for problems that we can actually identify, and issue meaningful error messages. If it's a problem we're not aware of, then the "self._data.meg is missing" is not actually more meaningful than Python's AttributeError

Comment thread ncrf/_model.py Outdated
"""
Reduce model size and save meta data in model.megmeta.
"""
if getattr(self, "_data", None) is None:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During normal use, self._data should always be present. I would not use getattr, as it makes code harder to read and suggests (wrongly) that there are use cases in which there is no self._data.

(This comment also applies to several cases below)

Comment thread ncrf/_model.py Outdated
raise ValueError("self._data.meg is missing.")

if not hasattr(self, "megmeta") or self.megmeta is None:
self.megmeta = {}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep that private (unless there is a use case for a public attribute).

Comment thread ncrf/_model.py Outdated
Comment on lines +923 to +932
meglen = len(self._data.meg)
meg_shapes = []
for i in range(meglen):
meg_i = self._data.meg[i]
if not hasattr(meg_i, "shape"):
raise TypeError(f"self._data.meg[{i}] has no .shape attribute.")
meg_shapes.append(tuple(meg_i.shape))

self.megmeta["meglen"] = meglen
self.megmeta["meg_shapes"] = meg_shapes

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could make that simpler because meglen is already contained in len(meg_shapes).

In general, let's add explicit type specification (e.g. RegressionData.meg needs to be a list of arrays) and rely o this, rather than adding vagueness and flexibility for cases we don't expect to happen in normal usage (like testing hasattr(meg_i, "shape")).

Comment thread ncrf/_model.py Outdated
meg_all,
stim_all,
in_place=False,
do_post_normalization=True,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it make sense to store this parameter on the NCRF object instead of having to pass it again?

Comment thread ncrf/_model.py Outdated
self,
meg_all,
stim_all,
in_place=False,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this copy or copy_data?

In Python, in_place is generally interpreted as “this method mutates the object the method is operating on” (i.e., not an argument).

Comment thread ncrf/_model.py Outdated
):
"""
Reconstruct RegressionData
in_place = False : meg and stim data will not change.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please complete the docstring

Comment thread ncrf/_model.py Outdated
got = tuple(meg_proc.shape)
exp = tuple(expected_shapes[i])
if got != exp:
raise ValueError(f"Processed MEG shape mismatch at run {i}")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try to put some more information in the error message. From personal experience :) : A user will send you the error message they get. You want as much information in there as may help you diagnose the problem.

@Maryamvalian Maryamvalian Feb 3, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christianbrodbeck This error occurs when the input MEG for reconstructing data has a different shape (e.g., different channel count or time points) than the shape stored in the MEG metadata.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the information that should be communicated to the user in the error message (and no need to explain these on GitHub separately :) )

Comment thread ncrf/_model.py Outdated
@Maryamvalian

Copy link
Copy Markdown
Collaborator Author

@christianbrodbeck Updated code according to your feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ENH: Option to Reduce Model Pickle File Size

2 participants