Docker OS packages scan (for Debian and Alpine) #164
|
This looks interesting, though I'm wondering if it's trying to do too much in one go, as I'm seeing what feels like three distinct parts:
It looks like the dpkg status parser can conform to the same interface as the other parsers, so I think it's probably best to land that in its own PR, since it's a valid parser all by itself even if the scanner doesn't immediately support using it. That will increase the sample size of "package manager" type parsers, which should make it easier to think about how to tackle #124 and co (as an aside, it would be good to increase the sample size further if possible, so if anyone knows other tools that could fit and can look into parsers for them, that'd be good - the main one that comes to mind for me is what do you think? |
|
I agree with you. I can make a PR for just the parser first (with more tests too!). Do you think the apk and dpkg parsers could be moved to another package, like "pkgmanager", leaving lockfile for "real" lockfiles only? About additional parsers: if we add yum, which OSV ecosystem would it map to? (I'm not yet familiar with osv.dev internals.) For point 2: which use case are you thinking about, other than scanning container packages? Scanning the local OS? Thank you! |
I don't think there'd be much value in that for now: while yes, technically they're not lockfiles in the purest sense, I think the essence still fits, and really we're never going to find one term that is correct for every kind of file we are parsing with
I think none right now, but that's because no one (afaik) has proposed an ecosystem, but that shouldn't be a blocker to writing a parser especially for a well-used tool like (note that I've not done a lot of work with
Yeah scanning os/system is the main one that comes to mind - we've spoken a bit about it here, but this is also why I'm interested in expanding the "sample size" because I think right now we've identified there's at least one OS-level tool we could detect automatically ( |
|
I see your point.
Let me know if I'm missing something. |
ah ok see that was the sort of difference I was expecting to hit by increasing the sample size - reading from a db (even one that is locally available) is probably going to be more complex and costly than parsing a single file, so might require more than a single flag. It'll still be something to support eventually, but probably worth opening a dedicated issue where we can capture notes/thoughts/etc |
|
Thank you very much for drafting this, and sorry for the delay in getting back to you (it's been a hectic few weeks). What you have here looks like a reasonable starting point, but we'll need to investigate the approach a bit more to make sure we make the best design decision :) This also touches on some points in #176, and we want to make sure we get that right too. More generally: relying on external libraries is fine and certainly not a blocker for this. We'll circle back on this soon! |
As discussed in #164 [here](#164 (comment)), this PR adds support for DPKG parsing. Structure is similar to APK parser. --------- Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com> Co-authored-by: Gareth Jones <Jones258@Gmail.com>
|
Closing this as #836 overlaps this in functionality. |
Hello @oliverchang, @G-Rath (and others!),
I propose a solution to start addressing #64.
It's only a draft to see whether you all think this is an actionable direction, so the code is minimal and of course needs to be improved; if you decide to discard it, not much time will be wasted! :-)
The key point is: use the stereoscope library to quickly access a Docker image's filesystem and layers. I don't know if that's OK for you from a technical or licensing point of view (it's Apache 2 licensed).
Included in this PR:
Another key point for possible future enhancements: to add full scanning of an image's filesystem, the lockfile parsers may need to be modified to accept an io.ReadCloser instead of a pathname string. Alternatively, files can be extracted from the image into a temporary folder and then processed by the parsers by pathname on the local filesystem. That may not be the cleanest solution, but the code changes would be minimal.
Passing an io.ReadCloser could also address #95 (but I think it's not a priority).
Going along with this PR could also address #124 and #119 (I think...).
Thank you.
Regards.