use js-regexp for wasm targets#64
Conversation
waltzofpearls
left a comment
There was a problem hiding this comment.
Hey @Jazzpirate, thanks for adding js-regexp for wasm!
I left some comments in the diffs. The implementation looks good, but the CI ran into issues with tests. When running cargo test --all-features on non-wasm targets, the tests fail because js-regexp requires a wasm runtime.
Other than that, I have some feedback/questions:
- I assume you've tested the changes with wasm. Have you observed any performance regression comparing to the non-wasm version? I'm curious because in your wasm specific implementation of the new/constructor, it doesn't create/compile the regex, as a result, every single call to the date and time format (and the regex is_match method) will trigger a recompile of the regex. That could significantly slows down the date and time parsing, if the code iterates on a large set of data.
- The wasm target doesn't have any unit test coverage. It would be nice to add some for wasm with the support of
wasm-bindgen-test. That said, if you feel this is out of scope. I can add some tests later.
| ) -> Option<R>; | ||
| } | ||
|
|
||
| #[cfg(not(feature = "wasm"))] |
There was a problem hiding this comment.
Can you please change this to
#[cfg(not(all(feature = "wasm", target_arch = "wasm32")))]|
|
||
| #[cfg(not(feature = "wasm"))] | ||
| use regex::Regex; | ||
| #[cfg(not(feature = "wasm"))] |
There was a problem hiding this comment.
Can you please change this to
#[cfg(not(all(feature = "wasm", target_arch = "wasm32")))]| } | ||
| } | ||
|
|
||
| #[cfg(feature = "wasm")] |
There was a problem hiding this comment.
Can you please change this to
#[cfg(all(feature = "wasm", target_arch = "wasm32"))]| } | ||
| } | ||
| } | ||
| #[cfg(feature = "wasm")] |
There was a problem hiding this comment.
Can you please change this to
#[cfg(all(feature = "wasm", target_arch = "wasm32"))]| chrono = "0.4.31" | ||
| lazy_static = "1.4.0" | ||
| regex = "1.10.2" | ||
| js-regexp = {version="0.2",optional=true} |
There was a problem hiding this comment.
Can you please remove it from [dependencies], and add another section just for wasm:
[target.'cfg(target_arch = "wasm32")'.dependencies]
js-regexp = { version = "0.2", optional = true }
I have not compared them directly, but I am actively using this branch for my own purposes now, and yes, I would suspect there is a performance regression involved; there is naturally a tradeoff here between binary size and performance. Using
Yes, the API for There are ways to optimize around that; e.g. I could happily experiment with
Sounds reasonable. I haven't worked with |
|
update: I started toying around with thread_local and benchmarks show significant performance improvements in non-wasm targets by over lazy_static |
…ic (seems to also improve performance), 3. wasm tests
|
ok, added another commit that refactors things to avoid constructing the same regex in wasm every time; also got rid of lazy_static in favor of thread_local for more uniformity (seems to show performance improvements as well) and added wasm tests, including to the CI. |
|
update: made a mistake in the wasm test and didn't even use the feature -.- getting errors now. Will fix |
|
ok, several rabbit holes later, it turned out there is a bug in So instead, I got rid of I also |
waltzofpearls
left a comment
There was a problem hiding this comment.
LGTM.
Thanks for the contribution. Great additions and improvements!
The clippy errors are not caused by your changes. I will fix them separately in this repo.
|
awesome, thank you :) |
the
regexcrate is a relatively complex state machine that includes the entire unicode table for parsing. This of course makes perfect sense in general, but it also means that it adds ~0.5MB of binary size. When compiling to wasm and targeting browser, this is nonsensical, since javascript already has a regex implementation that could be used instead, so avoidingregexin wasm is strongly recommended.This PR adds a feature "wasm" that uses
js-regexpinstead of theregexcrate. Since the latter has a very different interface, I had to do some minor refactoring, but I was careful to change as little as possible, offloading things to a common trait implemented by bothregex::Regexand (a custom wrapper around)js_regexp::RegExp