Skip to content

Docker OS packages scan (for Debian and Alpine)#164

Closed
cmaritan wants to merge 2 commits into
google:mainfrom
cmaritan:container-scan
Closed

Docker OS packages scan (for Debian and Alpine)#164
cmaritan wants to merge 2 commits into
google:mainfrom
cmaritan:container-scan

Conversation

@cmaritan

Copy link
Copy Markdown
Contributor

Hello @oliverchang, @G-Rath (and others!),

I propose a solution to start addressing #64.
It's only a draft to understand if you all think that is an actionable direction so the code is minimal and of course needs to be improved, but if you decide to discard it, not much time will be wasted! :-)
The main key point is: use stereoscope library to quickly access docker images filesystem and layers. I don't know if it's ok for you from a technical/licensing (but it's Apache 2) etc... point of view.
Included in this PR:

  1. Identification of Image OS (Debian and Alpine)
  2. Selection of corresponding parser
  3. Scan of package manager db file inside the image to identify installed packages vulnerabilities

Another key point for possible future enhancements: to add full scanning of image filesystem, maybe lockfile parsers will need to be modified to accept an io.ReadCloser object instead of a string with pathname. Alternatively, files can be extracted from image and put in a temporary folder, then processed by parser by pathname on local filesystem. Maybe it won't be the cleanest solution but the code changes will be minimal.

Passing an io.ReadCloser could also address #95 (but I think it's not a priority).
Going along with this PR could also address #124 and #119 (I think...).

Thank you.
Regards.

@G-Rath

G-Rath commented Jan 25, 2023

Copy link
Copy Markdown
Collaborator

This looks interesting, though I'm wondering if it's trying to do too much in one go as I'm seeing what feels like three distant parts:

  1. adding a dpkg status parser
  2. supporting scanning "package manager" stuff
  3. improving docker container scanning

It looks like the dpkg status parser can conform to the same interface as the other parsers, so I think it's probably best to land that in its own PR since its a valid parser all by itself even if the scanner doesn't immediately support using it - that'll then increase the sample size of "package manager" type parsers which should make it easier to think about how to tackle #124 and co (on an aside, it would be good to try and increase the sample size further if possible, so if anyone knows other tools that could fit and look into parsers for them that'd be good - the main one that comes to mind for me is yum).

what do you think?

@cmaritan

cmaritan commented Jan 25, 2023

Copy link
Copy Markdown
Contributor Author

I agree with you, I can make a PR only for parser first (adding also more tests!).

Do you think that apk and dpkg parsers could be moved to another package like "pkgmanager" leaving lockfile only to "real" lockfiles?

About additional parsers: if we add yum, which osv Ecosystem will map to them? (I'm still not into osv.dev internals).

For point 2: which use case are you thinking about (other than scanning container packages)? Scanning local os?

Thank you!

@G-Rath

G-Rath commented Jan 26, 2023

Copy link
Copy Markdown
Collaborator

Do you think that apk and dpkg parsers could be moved to another package like "pkgmanager" leaving lockfile only to "real" lockfiles?

I don't think there'd be much value in that for now because while yes technically they're not necessarily lockfiles in the purest sense, I think the essence still fits and that really we've never going to find one term that is correct for every kind of file we are parsing with lockfile - that combined with the fact that Go only includes what is actually used from a package means I don't think it's worth stressing over the package name being technically incorrect for some of these parsers/files :)

About additional parsers: if we add yum, which osv Ecosystem will map to them? (I'm still not into osv.dev internals).

I think none right now, but that's because no one (afaik) has proposed an ecosystem, but that shouldn't be a blocker to writing a parser especially for a well-used tool like yum - this recently happened with #59.

(note that I've not done a lot of work with yum / RedHat, so I'm assuming it is at least somewhat similar to apk and apt in terms of how it records what versions of stuff is installed, but ldo let me know if I'm wrong and it's somehow different)

For point 2: which use case are you thinking about (other than scanning container packages)? Scanning local os?

Yeah scanning os/system is the main one that comes to mind - we've spoken a bit about it here, but this is also why I'm interested in expanding the "sample size" because I think right now we've identified there's at least one OS-level tool we could detect automatically (apk) and we've got a possibly good way to handle that proposed, but it's hard if that is a good fit with a sample size of 1 because other OS-level tools might have caveats (like being expensive) and once we land something we can't make breaking changes without doing a new major - so increasing the sample size helps get a better idea of what we can support and what is required for that support.

@cmaritan

Copy link
Copy Markdown
Contributor Author

I see your point.
To recap:

  1. adding a dpkg status parser
    I made dedicated PR Support for DPKG (Debian) parser #168 maintaining standard parsers API, same structure of already merged APK support.
  2. supporting scanning "package manager" stuff
    Discussion on how to implement this is still in progress, cmdline switch has been halted for now. In the meantime, I can take a look at RPM. It will be different from other parsers since RPM relies on sqlite db and on Berkley DB for versions lower than 4.16.0.
  3. improving docker container scanning
    This PR start addressing this point. I wait for further opinion from you, @oliverchang and/or others to go on. This could surely be the first use case of "os level package scanning".

Let me know if I'm missing something.
Thank you for your feedback and support.

@G-Rath

G-Rath commented Jan 27, 2023

Copy link
Copy Markdown
Collaborator

It will be different from other parsers since RPM relies on sqlite db and on Berkley DB for versions lower than 4.16.0.

ah ok see that was the sort of difference I was expecting to hit by increasing the sample size - reading from a db (even one that is locally available) is probably going to be more complex and costly than parsing a single file, so might require more than a single flag.

It'll still be something to support eventually, but probably worth opening a dedicated issue where we can capture notes/thoughts/etc

@oliverchang

Copy link
Copy Markdown
Collaborator

Thank you very much for drafting this, and sorry for the delay in getting back to you on this (it's been a hectic few weeks).

What you have here looks like a reasonable starting point, but we'll need to do a bit more investigation on the approach here to make sure we make the best design decision :) This also touches on some points in #176 as well, and we want to make sure we get that right too.

More generally: relying on external libraries is fine and certainly not a blocker for this. We'll circle back on this soon!

another-rex added a commit that referenced this pull request Feb 20, 2023
As discussed in #164
[here](#164 (comment)),
this PR adds supports for DPKG parsing.
Structure is similar to APK parser.

---------

Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com>
Co-authored-by: Gareth Jones <Jones258@Gmail.com>
@cmaritan cmaritan force-pushed the container-scan branch 2 times, most recently from 8bad1f9 to 1a20860 Compare March 1, 2023 19:56
hayleycd pushed a commit that referenced this pull request Mar 9, 2023
As discussed in #164
[here](#164 (comment)),
this PR adds supports for DPKG parsing.
Structure is similar to APK parser.

---------

Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com>
Co-authored-by: Gareth Jones <Jones258@Gmail.com>
julieqiu pushed a commit to julieqiu/osv-scanner that referenced this pull request May 2, 2023
As discussed in google#164
[here](google#164 (comment)),
this PR adds supports for DPKG parsing.
Structure is similar to APK parser.

---------

Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com>
Co-authored-by: Gareth Jones <Jones258@Gmail.com>
julieqiu pushed a commit to julieqiu/osv-scanner that referenced this pull request May 2, 2023
As discussed in google#164
[here](google#164 (comment)),
this PR adds supports for DPKG parsing.
Structure is similar to APK parser.

---------

Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com>
Co-authored-by: Gareth Jones <Jones258@Gmail.com>
@another-rex

Copy link
Copy Markdown
Collaborator

Closing this as #836 overlaps this in functionality.

@another-rex another-rex closed this Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants