pkg-stats was initially a Buildroot maintenance oriented tool: it was
designed to examine all Buildroot packages and provide
statistics/details about them.
However, it turns out that a number of details provided by pkg-stats,
especially CVEs, are relevant also for Buildroot users, who would like
to check regularly if their specific Buildroot configuration is
affected by CVEs or not, and possibly check if all packages have
license information, license files, etc.
The cve-checker script was recently introduced to provide an output
relatively similar to pkg-stats, but focused on CVEs only.
But in fact, its main difference is on the set of packages that we
consider: pkg-stats considers all packages, while cve-checker uses
"make show-info" to only consider packages enabled in the current
configuration.
So, this commit introduces a -c option to pkg-stats, to tell pkg-stats
to generate its output based on the list of configured packages. -c is
mutually exclusive with the -p option (explicit list of packages) and
-n option (a number of packages, picked randomly).
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Currently, pkg-stats expects being executed from Buildroot's top-level
source directory. As we are going to extend pkg-stats to cover only
the packages available in the current configuration, it makes sense to
be able to run it from the output directory, which can be anywhere
compared to Buildroot's top-level directory.
This commit adjusts pkg-stats to this, by inferring all Buildroot
paths based on the location of the pkg-stats script itself.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Note that one is silenced, rather than fixed: we indeed need to import
after we add the local directory to the modules search path.
Signed-off-by: Yann E. MORIN <yann.morin.1998@free.fr>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Arnout Vandecappelle <arnout@mind.be>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Current X.org X server is incompatible with this driver.
We no longer support unmaintainted versions of X.org X server.
Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
The affects method of the CVE uses the Package class defined in
pkg-stats. The purpose of migrating the CVE class outside of pkg-stats
was to be able to reuse it from other scripts. So let's remove the
Package dependency and only use the needed information.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
In order to be able to use the CVE checking logic outside of
pkg-stats, move the CVE class in a module that can be used by other
scripts.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Some CVE entries in the NVD database have version_value set to "-",
which seems to indicate that it applies to all versions of the
software project, or that they don't really know which versions are
affected, and which are not.
So, for the benefit of doubt, it seems more appropriate to consider
such CVEs as affecting our packages.
This makes the total number of CVEs affecting our next branch jump
from 141 CVEs to 658 CVEs, but that number will go back down once we
switch to the JSON 1.1 schema. Indeed, in the JSON 1.0 schema, there
are often cases where a version_value is set to "=" *and* specific
versions are set to.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
This commit slightly improves the output of pkg-stats by showing the
progress of the upstream URL checks and latest version retrieval, on a
package basis:
Checking URL status
[0001/0062] curlpp
[0002/0062] cmocka
[0003/0062] snappy
[0004/0062] nload
[...]
[0060/0062] librtas
[0061/0062] libsilk
[0062/0062] jhead
Getting latest versions ...
[0001/0064] libglob
[0002/0064] perl-http-daemon
[0003/0064] shadowsocks-libev
[...]
[0061/0064] lua-flu
[0062/0064] python-aiohttp-security
[0063/0064] ljlinenoise
[0064/0064] matchbox-lib
Note that the above sample was run on 64 packages. Only 62 packages
appear for the URL status check, because packages that do not have any
URL in their Config.in file, or don't have any Config.in file at all,
are not checked and therefore not accounted.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
This commit reworks the code that checks if the upstream URL of each
package (specified by its Config.in file) using the aiohttp
module. This makes the implementation much more elegant, and avoids
the problematic multiprocessing Pool which is causing issues in some
situations.
Suggested-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
This commit reworks the code that retrieves the latest upstream
version of each package from release-monitoring.org using the aiohttp
module. This makes the implementation much more elegant, and avoids
the problematic multiprocessing Pool which is causing issues in some
situations.
Since we're now using some async functionality, the script is Python
3.x only, so the shebang is changed to make this clear.
Suggested-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
This fixes the following flake8 warning:
support/scripts/pkg-stats:1005:9: E117 over-indented
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
With python 3, when a package has a version number x-y-z instead of
x.y.z, then the version returned by LooseVersion can't be compared
which raises a TypeError exception:
Traceback (most recent call last):
File "./support/scripts/pkg-stats", line 1062, in <module>
__main__()
File "./support/scripts/pkg-stats", line 1051, in __main__
check_package_cves(args.nvd_path, {p.name: p for p in packages})
File "./support/scripts/pkg-stats", line 613, in check_package_cves
if pkg_name in packages and cve.affects(packages[pkg_name]):
File "./support/scripts/pkg-stats", line 386, in affects
return pkg_version <= cve_affected_version
File "/usr/lib64/python3.8/distutils/version.py", line 58, in __le__
c = self._cmp(other)
File "/usr/lib64/python3.8/distutils/version.py", line 337, in _cmp
if self.version < other.version:
TypeError: '<' not supported between instances of 'str' and 'int'
This patch handles this exception by adding a new return value when
the comparison can't be done. The code is adjusted to take of this
change. For now, a return value of CVE_UNKNOWN is handled the same way
as a CVE_DOESNT_AFFECT return value, but this can be improved later
on.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
When the 'nvd-path', 'json' and 'html' are used like this:
--html ~/foo
then the tilde expansion is properly done by the shell. However, when
they are used like this:
--html=~/foo
The shell doesn't do the tilde expansion, and pkg-stats doesn't do
it. This commit modifies pkg-stats to ensure that tilde expansion is
done when parsing the 'nvd-path', 'json' and 'html' arguments.
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
[Thomas: improve commit log]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
flake8 complains with:
support/scripts/pkg-stats:339:13: E722 do not use bare 'except'
Due to the construct:
try:
something
except:
print("some message")
raise
Which is in fact OK because the exception is re-raised. This issue is
discussed at https://github.com/PyCQA/pycodestyle/issues/703, and the
general agreement is that these "bare except" are OK, and should be
ignored from flake8 using a noqa statement.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
flake8 complains with:
pkg-stats:38:1: E402 module level import not at top of file
This is due to sys.path.append() being before the import from
getdeveloperlib, but we really need this sys.path.append() to be
before, so let's ignore this flake8 warning.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
If there is no infra set or infra is virtual the status is set to 'na'.
This is done for the follwing checks:
- license
- license-files
- hash
- hash-license
- patches
- version
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
This value can be used for later processing.
In the buildroot-stats application this is used to create links pointing
to the git repo of buildroot.
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Unify the status check information. The status is stored in a tuple. The
first entry is the status that can be 'ok', 'warning' or 'error'. The
second entry is a verbose message.
The following checks are performed:
- url: status of the URL check
- license: status of the license presence check
- license-files: status of the license file check
- hash: status of the hash file presence check
- patches: status of the patches count check
- pkg-check: status of the check-package script result
- developers: status if a package has developers in the DEVELOPERS file
- version: status of the version check
With that status information the following variables are replaced:
has_license, has_license_files, has_hash, url_status
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Use the function 'parse_developers' function from getdeveloperlib that
collect the information about the developers and the files they
maintain. Then set the maintainer(s) to each package.
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Remove the patch_count attribute and use a class property instead.
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
This patch changes the type of the latest_version variable to a dict.
This is for better readability/usability of the data. With this the json
output is more descriptive in later processing of the json output.
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
During the CVE checking phase, we can still see a huge amount of
Python processes (actually 128) running on the host, even though
the CVE step is entirely ran in the main thread.
These are actually the worker processes spawned to check for the
packages URL statuses and the latest versions from release-monitoring.
This is because of an issue in Python's multiprocessing implementation:
https://bugs.python.org/issue34172
The problem was already there before the CVE matching step was
introduced, but because pkg-stat was terminating right after the
release-monitoring step, it went unnoticed.
Also, do not hold a reference to the multiprocessing pool from
the Package class, as this is not needed.
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
In Python 3, the functions from the subprocess module return bytes
(and no longer strings as in Python 2), which must be decoded for
further text operations.
Now, pkg-stats can be run in Python 3.
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
It seems like throughout the series that the CVE pkg-stats support
went through, the support for ignoring CVEs in the per-package
<pkg>_IGNORE_CVES variable was forgotten.
Let's re-introduce this, which is now very simple thanks to the CVE
class, its .identifier() propertly and the .is_cve_ignored() method of
the Package class
Cc: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
During the CVE checking phase, we can still see a huge amount of
Python processes (actually 128) running on the host, even though
the CVE step is entirely ran in the main thread.
These are actually the worker processes spawned to check for the
packages URL statuses and the latest versions from release-monitoring.
This is because of an issue in Python's multiprocessing implementation:
https://bugs.python.org/issue34172
The problem was already there before the CVE matching step was
introduced, but because pkg-stat was terminating right after the
release-monitoring step, it went unnoticed.
Also, do not hold a reference to the multiprocessing pool from
the Package class, as this is not needed.
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
In Python 3, the functions from the subprocess module return bytes
(and no longer strings as in Python 2), which must be decoded for
further text operations.
Now, pkg-stats can be run in Python 3.
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
The NVD files that are used to build the list of CVEs affecting
Buildroot packages are quite large (a few hundreds MB of json),
and cause the pkg-stats scripts to have a huge memory footprint
(a few GB with Python 2.7).
However, because we only need to iterate on CVE items one by one,
we can process them in streaming (ie decoding one CVE at a time
from the JSON representation). Because the json module from the
python standard library does not support such a mode of operation,
we switch to the third-party package ijson, which is compatible
with both Python 2 and Python3.
To run the script with these modifications, one should install
the ijson python package. This can be done with pip:
`pip install ijson`. On Debian based distributions, this can
also be done with the apt package manager:
`apt install python-ijson`.
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Reviewed-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Tested-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
The NVD files that are used to build the list of CVEs affecting
Buildroot packages are quite large (a few hundreds MB of json),
and cause the pkg-stats scripts to have a huge memory footprint
(a few GB with Python 2.7).
However, because we only need to iterate on CVE items one by one,
we can process them in streaming (ie decoding one CVE at a time
from the JSON representation). Because the json module from the
python standard library does not support such a mode of operation,
we switch to the third-party package ijson, which is compatible
with both Python 2 and Python3.
To run the script with these modifications, one should install
the ijson python package. This can be done with pip:
`pip install ijson`. On Debian based distributions, this can
also be done with the apt package manager:
`apt install python-ijson`.
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Reviewed-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Tested-by: Thomas De Schampheleire <thomas.de_schampheleire@nokia.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
It seems like throughout the series that the CVE pkg-stats support
went through, the support for ignoring CVEs in the per-package
<pkg>_IGNORE_CVES variable was forgotten.
Let's re-introduce this, which is now very simple thanks to the CVE
class, its .identifier() propertly and the .is_cve_ignored() method of
the Package class
Cc: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
This commit extends the pkg-stats script to grab information about the
CVEs affecting the Buildroot packages.
To do so, it downloads the NVD database from
https://nvd.nist.gov/vuln/data-feeds in JSON format, and processes the
JSON file to determine which of our packages is affected by which
CVE. The information is then displayed in both the HTML output and the
JSON output of pkg-stats.
To use this feature, you have to pass the new --nvd-path option,
pointing to a writable directory where pkg-stats will store the NVD
database. If the local database is less than 24 hours old, it will not
re-download it. If it is more than 24 hours old, it will re-download
only the files that have really been updated by upstream NVD.
Packages can use the newly introduced <pkg>_IGNORE_CVES variable to
tell pkg-stats that some CVEs should be ignored: it can be because a
patch we have is fixing the CVE, or because the CVE doesn't apply in
our case.
>From an implementation point of view:
- A new class CVE implement most of the required functionalities:
- Downloading the yearly NVD files
- Reading and extracting relevant data from these files
- Matching Packages against a CVE
- The statistics are extended with the total number of CVEs, and the
total number of packages that have at least one CVE pending.
- The HTML output is extended with these new details. There are no
changes to the code generating the JSON output because the existing
code is smart enough to automatically expose the new information.
This development is a collective effort with Titouan Christophe
<titouan.christophe@railnova.eu> and Thomas De Schampheleire
<thomas.de_schampheleire@nokia.com>.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Titouan Christophe <titouan.christophe@railnova.eu>
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
As suggested by Baruch Siach, using "git rev-parse HEAD" is a lot
simpler than playing around with "git log" to just retrieve the commit
id corresponding to the current HEAD.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
pkg-stats extracts the Buildroot commit id from which the package
information was collected. However, when doing so, it always assumes
we're using the master branch, by running "git log master".
But in fact, pkg-stats can be run from any branch/tag, so it makes a
lot more sense to use "git log HEAD".
Cc: victor.huesca@bootlin.com
Cc: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
The major bottleneck in pkg-stats is the time spent waiting for
answers from remote servers. Two functions involve such communication
with remote servers:
- 'check_package_urls' which checks that each package upstream website
is up, it is efficient due to the use of process-pools thanks to
Matt Weber.
- 'check_package_latest_version' which fetches the latest package
version from release-monitoring, it uses a http-pool but runs
sequentially.
This patch extends the use of process-pools to 'check_latest_version'.
Due to some limitations of multiprocess callbacks, this patch loses
the overall progress of packages in favour of just the current package
name.
Runtimes for this function are ~3m vs ~25m for the linear version.
Tested on an i7 7500U (2/4 cores/threads @3.5GHz) with 15ms ping.
Note: There have already been work trying to parallelize this function
using threads but there were a failure on some configurations [1].
This implementation rely on a dedicated module already in use on this
script, so it's unlikely to see failure with this version.
[1] http://lists.busybox.net/pipermail/buildroot/2018-March/215368.html
Signed-off-by: Victor Huesca <victor.huesca@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Fixes:
- blank space before ':'
- unused 'o' variable left from a previous patch
- bad continuous alignment
Signed-off-by: Victor Huesca <victor.huesca@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
The pkg-stats calls 3 times `make` to get a bunch of variables. These
variables can be obtained in only one make invocation. This patch
replaces the three calls by just one and adjusts the parsing logic
accordingly.
Note: another option suggested by Arnout would be to run `make
show-info` that produces a json with the necessary variables. This
would avoid the duplicated effort done in pkg-stats and pkg-utils and
allow to add other infos to pkg-stats like dependencies, reversed
dependencies or if the package is virtual.
In order to use this method, the following changes are required in
pkg-generic's show-info:
- include license_files;
- have an option to run it on *all* packages, not just the selected
ones.
This patch take the simplest approach of only factorizing the make
calls as it requires less changes.
Signed-off-by: Victor Huesca <victor.huesca@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Since it's used only for the HTML output, and all other functions used
for HTML output are prefixed by dump_html, let's do so for
dump_gen_info() as well by renaming it to dump_html_gen_info().
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
The 'dump_html' and 'dump_json' both include commit infos as well as the
current date. It make more sense to retrieve these information once.
This patch simply does this factorization.
Signed-off-by: Victor Huesca <victor.huesca@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Pkg-stats is a great script that get a lot of interesting info from
buildroot packages. Unfortunately it is currently designed to output a
static HTML page only. While this is great to include on the
buildroot's website, the HTML is not designed to be easily parsable and
thus it is difficult to reuse it in other scripts.
This patch provide a new option to output a JSON file in addition to the
HTML one.
The old 'output' option has been renamed to 'html' to distinguish from
the new 'json' option.
Signed-off-by: Victor Huesca <victor.huesca@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Move the mutual exculsion of the '-n' and '-p' options to be part of the
parser instead of being checked in main.
Signed-off-by: Victor Huesca <victor.huesca@bootlin.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>