Helping to clean CPAN: June 2016

7 minute read

… where I get out my virtual broom and sweep up cruft in my assigned distribution for this month’s edition of the CPAN Pull Request Challenge.

This month’s module: Algorithm::TokenBucket

Algorithm::TokenBucket implements a token bucket rate limiting algorithm.

First impressions

In this section I try to get a feeling for the state of the module, how up to date it is, how often people are contributing to it, how many other distributions are depending on it, how many bugs/issues it currently has, what the CPANTS kwalitee is, etc.

Initial inspection of the source code

After forking the repo and cloning a local copy, let’s have a look at the project to see what build system it uses, if the test suite works and the tests pass, if it could do with a Travis-CI config file (or if present, if it can be updated).

  • README is markdown, however looks like it is automatically generated from API docs, however it doesn’t mention build instructions.
    • README is automatically generated from Pod::Markdown
  • uses Module::Build, however Build.PL says that it is automatically generated by MINILLA, which is a distribution building tool similar to Dist::Zilla
  • Minilla: https://metacpan.org/pod/Minilla
  • there is no META.yml in the repository
  • no Travis-CI config; add a .travis.yml?
    • fixed in PR#2
  • cpanm --installdeps . works ok
  • perl Build.PL runs without a problem; warns that META.yml is missing
  • ./Build runs without problems
  • ./Build test runs without problems (perl 5.18.4, Linux x64, Debian Jessie)
  • only one test file! It’s jam-packed with tests, though.

Code Coverage

Looking at the code coverage can give an indication of code quality. If the project is well covered, this means most changes made in pull requests can be made with some confidence that any problems will be caught by the test suite. If the code coverage is low, then this is something that one could address as a pull request (or set of pull requests).

In EUMM and Build::Module projects, one simply needs to install Devel::Cover and run

$ cover -test

In Dist::Zilla projects, one needs to install the Dist::Zilla::App::Command::cover plugin, after which the code coverage can be checked via:

$ dzil cover

In this distribution, the coverage is:

100% statement coverage; 96.5% total coverage

This is great coverage!

POD checks

The utility podchecker searches through Perl source code for POD which might not conform to the POD standard, and thus not necessarily be parseable by all POD parsers. Fixing any issues found by podchecker has the positive effect of also removing any warnings noted in the project’s documentation displayed on MetaCPAN.

Running podchecker gives the following errors and warnings:

$ find ./lib -name '*.pm' | xargs podchecker |& grep ERROR # 4 errors

*** ERROR: unresolved internal link 'state' at line 84 in file ./lib/Algorithm/TokenBucket.pm
*** ERROR: unresolved internal link 'until()' at line 247 in file ./lib/Algorithm/TokenBucket.pm
*** ERROR: unresolved internal link 'until' at line 272 in file ./lib/Algorithm/TokenBucket.pm
*** ERROR: unresolved internal link 'get_token_count' at line 275 in file ./lib/Algorithm/TokenBucket.pm
$ find ./lib -name '*.pm' | xargs podchecker |& grep WARN # 0 warnings

Fixed as part of PR#1.

Nit-picking

Some projects consider this a must, and will disallow commits to be submitted which contain trailing whitespace (the Linux kernel is an example project where trailing whitespace isn’t permitted). Other projects see whitespace cleanup as simply nit-picking. Either way one sees it personally, this could be a useful pull request to a project, so it’s worthwhile fixing and submitting; the worst that can happen is that the pull request is closed unmerged.

To look for files with trailing whitespace, run git grep ' $'. It can be helpful to load the files found directly into vim:

$ vim $(git grep ' $' | cut -d':' -f1 | sort | uniq)
3 files to edit

Fixed as part of PR#1.

perlcritic

Perl::Critic will show up many potential issues for Perl code. By simply running perlcritic on the lib and t directories, one can get a further handle on the code’s quality.

$ perlcritic lib
  • code before strictures warnings found

Fixed in PR#3.

$ perlcritic t

Source in t directory OK.

Stale URLs

Links to websites can go out of date, so it’s a good idea to see if they need updating or removing. A quick grep finds all the links. After which, we just need to see which links need fixing, if any.

$ git grep 'http://\|https://'
$ git grep 'ftp://'
$ git grep 'git://'

The site http://www.eecs.harvard.edu/cs143/assignments/pa1/ no longer exists, however could be found on the internet archive. The link was redirected to the internet archive as part of PR#1.

Other than that, links are ok.

This probably also sits in the nit-picking category for some people, however, copyright dates (theoretically) need to be kept up to date. The appropriate copyright year is usually the year of the last release. However, if a release looks imminent, then the current year is likely to be the right candidate. Some distributions put the author’s email address on the same line as the copyright date, hence this needs to be checked as well.

Here we do a case-insensitive grep over the source for the word “copyright”, the line of which we check for the existence of a year (i.e. 4 digits), look for the appropriate year and then clean up the grep results to get something we can pass directly to vim.

$ git grep -i copyright
$ vim $(git grep -i copyright | grep -P '\d{4}' | grep -v 2015 | cut -d':' -f1 | uniq)

=> 3 files

This is really only 2 files since the LICENSE file contains the copyright date for the LICENSE file itself. Only lib/Algorithm/TokenBucket.pm and README.md need to be updated, and it looks like the README.md is updated from the module’s POD, hence really only one file needs to be fixed.

$ vim $(git grep -i copyright | grep -P '\d{4}' | grep -v '@' | cut -d':' -f1 | uniq)

=> 3 files

Same as above; only need to update the module’s POD. Possibly doesn’t need to have the email address added though.

The copyright date (2015) is actually ok, since there hasn’t been a release since then.

Spell check files with POD

Good documentation can be a wonder to read. Not everyone’s docs are awesome, however we can keep the error rate to a minimum. A quick spell check will pick up most typos that don’t need to be there and fixing them can help improve the quality of a project.

In general, we want to find all files containing POD and run a spell checker (e.g. aspell) over all files, fixing typos we come across as we go. Not all projects require this much effort, however here’s a general-ish way to look for and check all POD in a project:

$ files_with_pod=$(find ./lib -name '*.pm' -o -name '*.pod' \
                 | xargs podchecker 2>&1 \
                 | grep 'pod syntax' | cut -d' ' -f1)
$ for filename in $files_with_pod
do
    pod2text $filename > $filename.txt;
    aspell -c $filename.txt;
done

Now look for .bak files and check differences between them and their respective output .txt files:

$ find ./ -name '*.bak'

$ diff -u lib/Path/To/Module.pm.pm.txt.bak lib/Path/To/Module.pm.txt

Then update the appropriate .pm file as necessary.

There is only one page of POD in this module, so no need for the hard work mentioned above. There weren’t any obvious typos or spelling errors, so the text was simply read and checked for ease of reading and basic sentence flow.

Kwalitee tests

Although CPANTS is the main kwalitee reference, one can also run the kwalitee tests locally. One can use the t/kwalitee.t test script from http://peateasea.de/cpan-pull-request-challenge/ for this purpose. However, the script only uses Test::Kwalitee which doesn’t cover as many metrics as CPANTS. Test::Kwalitee::Extra uses set of metrics closer to that used on CPANTS, so replace the kwalitee_ok call with simply use Test::Kwalitee::Extra. More information about the many options to Test::Kwalitee::Extra can be found on the MetaCPAN page.

Run the kwalitee tests in an ExtUtils::MakeMaker or Build::Module distribution like so:

$ RELEASE_TESTING=1 prove t/99kwalitee.t

or, if the distribution uses Dist::Zilla, run

$ dzil test --author --release

Basic kwalitee tests pass.

In order to check the kwalitee, it was necessary to install Minilla (cpanm Minilla) so that the minil dist command could be run in order that the MANIFEST was created as expected.

GitHub issues

There aren’t any issues on GitHub to fix.

RT issues

There didn’t turn out to be enough time to attack the issues in RT this month.

Overview of the pull requests made

Conclusion

Many thanks to KAPPA for promptly merging the PRs!