Update SMP scripts #84

Merged
jomey merged 7 commits into api_upload_update from add_smp on Oct 21, 2025

@jomey jomey commented Oct 16, 2025

Member

Closes #42

This imports all of the SMP data and no longer subsets it as it used to.

To speed up the imports, I added or changed two things:

  • Added a local lookup cache to the BaseUpload class that holds the metadata DB objects locally instead of fetching them over and over again for every upload. This was a bottleneck when adding a single SMP file, which holds more than 100K records.
  • Changed points and layers to use a bulk insert per batch and call session.commit() once for each batch. This was another performance boost.

With the two changes, one SMP file now uploads in a little more than a minute, where it took around 5 minutes before.
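
For readers unfamiliar with the idea, here is a minimal sketch of a lookup cache like the one described above. It is not the code from this PR: `LookupCache`, `fetch`, and `fake_table` are hypothetical names, and a plain dict stands in for the metadata table that BaseUpload would query.

```python
from typing import Any, Callable, Dict, Hashable


class LookupCache:
    """In-memory cache in front of a slow lookup (hypothetical helper)."""

    def __init__(self, fetch: Callable[[Hashable], Any]) -> None:
        self._fetch = fetch  # stands in for a DB select
        self._records: Dict[Hashable, Any] = {}
        self.misses = 0  # number of calls that had to reach fetch

    def get(self, key: Hashable) -> Any:
        # Only the first request for a key reaches the slow lookup.
        if key not in self._records:
            self.misses += 1
            self._records[key] = self._fetch(key)
        return self._records[key]


# Assumed stand-in for a metadata table: measurement type -> primary key.
fake_table = {"force": 7}
cache = LookupCache(fake_table.__getitem__)

# 100K rows of the same measurement type resolve through the cache.
ids = [cache.get("force") for _ in range(100_000)]
```

In this sketch, `cache.misses` stays at 1 after the loop, which is the effect the cache is meant to have on repeated metadata selects.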

Dependencies

Needs PR M3Works/insitupy#32

jomey added 5 commits October 9, 2025 13:32
After a successful download, the returned list of files holds strings. When checking
whether those files already exist locally, pathlib returns Path objects. This makes the
return type identical by casting them back to strings.
Add the measurement ID as a key to the site name so that we create one location per
measurement in the sites table.
Add an in-memory lookup cache attribute to the base class to prevent repeated DB selects
per inserted data row. This has a big impact when inserting layer data of the same
measurement type, such as SMP with 100K+ records of the same type.
…call

Add all layer row entries in one transaction and commit once at the end. This speeds up
the import. Also add an expunge to reduce memory footprint after a profile has been
uploaded in the DB session.
jomey added 2 commits October 16, 2025 08:31
Improve the lookup cache by storing a bare-bones object with only the key and primary ID.
This also changes the strategy to use a bulk add and commit via a configurable batch size.
No batch size has been set for points, but layers use 100K.
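
The batching strategy in the commit above can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's code: the standard library's sqlite3 stands in for the project's DB session, and `batched`, `bulk_insert`, and the `layers` table layout are hypothetical. A `batch_size` of `None` is treated as "everything in one batch".

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[int, float]


def batched(rows: Iterable[Row], batch_size: Optional[int]) -> Iterator[List[Row]]:
    """Yield lists of at most batch_size rows; None means a single batch."""
    iterator = iter(rows)
    if batch_size is None:
        batch = list(iterator)
        if batch:
            yield batch
        return
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def bulk_insert(
    conn: sqlite3.Connection, rows: Iterable[Row], batch_size: Optional[int] = None
) -> int:
    """Insert rows batch by batch, committing once per batch."""
    commits = 0
    for batch in batched(rows, batch_size):
        conn.executemany("INSERT INTO layers (depth, value) VALUES (?, ?)", batch)
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layers (depth INTEGER, value REAL)")
rows = [(i, i * 0.5) for i in range(250)]
commits = bulk_insert(conn, rows, batch_size=100)
count = conn.execute("SELECT COUNT(*) FROM layers").fetchone()[0]
```

With 250 rows and a batch size of 100, this sketch commits three times (100, 100, and 50 rows) rather than once per row.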

@aaarendt aaarendt left a comment

Contributor


Confirming I tested this locally and it works as expected.

@jomey jomey merged commit ca4fdab into api_upload_update Oct 21, 2025
0 of 4 checks passed
@jomey jomey deleted the add_smp branch October 21, 2025 15:47
2 participants