Perl Toolchain Summit 2018 report

I’m back from another incredible Perl Toolchain Summit – my first in a couple years. As usual, it was an amazing experience: getting dedicated time to work with incredible contributors on code at the heart of Perl’s community and ecosystem.

This year, we were back where we started ten years ago: Oslo, Norway. Oslo is a beautiful city, and I spent a couple days wandering around recovering from jet lag while getting ready for the summit.

My main goal for the summit was to follow through on a decision we’d reached at a toolchain summit several years ago: automatically approving PAUSE ID requests instead of holding them for manual moderation. My plan was to implement reCAPTCHA v2 on the ID request page and automatically approve applications if it validated.

Making that happen required me to shave some yaks. A lot of yaks.

Day 1: Thursday

Back in 2015, Kenichi Ishigaki (charsbar) converted PAUSE to run on Plack instead of mod_perl, which made it much easier to run PAUSE locally on a laptop for development testing. Unfortunately, I discovered that the PAUSE README describing how to get a working local installation was out of date. So my first order of business was discovering how to do it and updating the README. I parked myself at a table next to Andreas Koenig and Kenichi and had two experts ready to help.

My quest involved learning how to install mysqld and nginx via MacPorts and configure them. After beating them into submission over a couple hours, I had a self-signed TLS reverse proxy running against the Plack-ified PAUSE code. I later found an obscure config option that let PAUSE run locally without TLS and was able to ditch nginx and test PAUSE directly with Plack.

Once I had PAUSE running locally it was straightforward to get the reCAPTCHA rendering on the page. Andreas asked me to protect it with a feature flag so we could turn it off if we had concerns, so I did that, too. In the waning bit of the day, I wrote the backend code for verifying reCAPTCHAs. But actually wiring it up into the website was going to have to wait for Friday.
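For readers unfamiliar with how reCAPTCHA v2 verification works on the server side, here is a minimal sketch. It is written in Python for illustration and is not the PAUSE code, which is Perl. The idea is to post the token from the form, together with the site secret, to Google's siteverify endpoint and check the "success" field of the JSON reply. The function names, the "enabled" parameter standing in for the feature flag, and the injectable "transport" are all assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

# Google's verification endpoint for reCAPTCHA.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def post_form(url, fields, timeout=10):
    """POST form-encoded fields and return the decoded JSON reply."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(url, data=body, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def recaptcha_passed(token, secret, enabled=True, transport=post_form):
    """Return True only if the flag is on and the token verifies.

    `enabled` stands in for the feature flag and `transport` is injectable
    so the logic can be exercised without network access. Both names are
    hypothetical.
    """
    if not enabled or not token:
        return False
    try:
        reply = transport(VERIFY_URL, {"secret": secret, "response": token})
    except (OSError, ValueError):
        # Network or JSON parsing failure: fail closed, so the request
        # can fall back to manual moderation.
        return False
    return isinstance(reply, dict) and reply.get("success") is True
```

Failing closed is a deliberate choice in this sketch: if verification cannot be completed, the request is treated as unverified rather than auto-approved.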

Day 2: Friday

The problem with working on PAUSE code is that it’s old… really old. It’s so old, the code I needed to work on was in a directory called “pause_1999”. Many of the pages are rendered, validated, and do post-form processing in single subroutines, each often hundreds of lines long. The HTML generation is not templated – snippets of HTML are pushed onto an array to be joined later. The HTML generating code is frequently interspersed with database SQL calls.

I didn’t want to try to wire up reCAPTCHA without refactoring the existing user registration into distinct, reusable units of work, so that took much of Friday. Take “Render HTML for submitted ID request”… and put it in a subroutine. Take “Send one time password email”… and put it in a subroutine. And so on, and so on. Eventually, just before the end of the day, I had all the pieces I needed and reCAPTCHA-validated user registration was running on my local PAUSE! I cleaned up my work in some rebases, got Ricardo Signes to code-review it, and sent Andreas a pull request.

Towards the end of the day, Merijn Brand (Tux) said he had some available time to help out anyone who needed it, so I asked him to be a fresh set of eyes to try my README for setting up a local PAUSE web server. He promptly found several typos and thinkos, which I fixed up on Saturday.

As the day wound down, Ricardo and I discussed ideas for consolidating the business logic code for PAUSE module permissions management – a project that would wind up being my second major deliverable from the toolchain summit.

Day 3: Saturday

While I was working on PAUSE reCAPTCHA, charsbar was nearing completion of a more ambitious project he started in 2017: converting PAUSE to run on top of the Mojolicious web framework. On Saturday, he and I discussed how to get my work into his branch… which largely turned out to be him taking my PR and just splicing it by hand into his work. Thank you, charsbar!

Andreas had some concerns about reCAPTCHA abuse, so I implemented a simple, server-side rate limiter. After a pre-set number of user registrations in a day, reCAPTCHA would be disabled and the legacy, manual moderation process would be used instead.
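A limiter of this kind can be as small as a counter that resets each day. The sketch below is in Python for illustration and is not the PAUSE implementation. The class name, the in-memory counter, and the injectable clock are assumptions. A real deployment would more likely keep the count somewhere persistent, such as the database.

```python
import datetime


class DailyRegistrationLimiter:
    """Allow at most `limit` auto-approved registrations per calendar day."""

    def __init__(self, limit, today=datetime.date.today):
        self.limit = limit
        self._today = today  # injectable clock, handy for testing
        self._day = None
        self._count = 0

    def allow_auto_approval(self):
        """Count one registration; return False once the daily cap is hit."""
        day = self._today()
        if day != self._day:
            # First registration seen on a new day: reset the counter.
            self._day = day
            self._count = 0
        if self._count >= self.limit:
            # Caller should fall back to manual moderation.
            return False
        self._count += 1
        return True
```

When allow_auto_approval() returns False, the registration page would skip reCAPTCHA and queue the request for an admin, as in the legacy flow.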

In the existing PAUSE code, the approving PAUSE admin’s ID was recorded in new user records. In an auto-approval world, that doesn’t apply, so I created a dummy ‘RECAPTCHA’ PAUSE account to serve as the “approver” for such accounts.

At this point on Saturday, we were entering into the home stretch and everyone was hard at work trying to ensure they could finish what they’d started.

For me, I was ready to start one last project: the PAUSE permissions manager. The problem I was trying to solve was that the database code for module permissions checks and modification was in SQL statements scattered throughout the code base. We wanted to centralize that logic, initially as a pure lift-out refactoring, so I created a class for it and began the painstaking process of lifting out each piece and testing each change.

Along the way, I discovered some subtle expectations around database handle management and localization of error handling. I based my code on a branch that Ricardo was working on to refactor state management across various PAUSE modules. So, in addition to the lift out, I made sure that every database call was using the same, centralized handle management that Ricardo had put in place. That made some unexpected test failures go away.

I also had a startling discovery about PAUSE permissions error handling: in order to effect an idempotent insert (i.e. upsert-like logic), inserts were run with exceptions turned off and errors ignored, so that unique key constraint errors could be ignored. Of course, that silently ignores any other errors, too! While I preserved that logic in the lift-out, I’ve bookmarked it as an area for future work.
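One possible shape for that future work is to ignore only the duplicate-key failure and let everything else propagate. The sketch below uses Python and an in-memory SQLite database purely for illustration. PAUSE itself is Perl on MySQL, and the table, column, and function names here are hypothetical.

```python
import sqlite3


def grant_permission(conn, package, userid):
    """Insert a (package, userid) row; return False if it already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO perms (package, userid) VALUES (?, ?)",
                (package, userid),
            )
    except sqlite3.IntegrityError:
        # Assumed to mean the unique key already holds this pair, since
        # the example table has no other constraints. Any other database
        # error still propagates to the caller.
        return False
    return True


# Hypothetical schema with a unique key over the pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE perms (package TEXT, userid TEXT, UNIQUE (package, userid))"
)
```

Calling grant_permission twice with the same arguments leaves a single row, while a genuine failure such as a missing table raises instead of vanishing. Note that IntegrityError covers other constraint violations too, so a table with more constraints would need a narrower check.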

Saturday night, the summit local organizer team, Salve Nilsen and Stig Palmquist, invited us to hang out at their heavily-graffitied hacker-space, hackeriet.no.

Day 4: Sunday

Most of Sunday was spent finishing up the PAUSE permissions refactor, which was uneventful, if dull. But the code was much more DRY afterwards, and I was happy to see it merged the same day.

Throughout the toolchain summit, I’d been applying some pull requests from my long backlog. On Sunday, since I didn’t want to start any new major work for PAUSE, I tackled the backlog with intensity, shipping over half a dozen minor updates.

Over the whole summit, I shipped new versions of twelve modules: Capture::Tiny, Data::GUID::Any, DateTime::Tiny, Dist::Zilla::Plugin::BumpVersionAfterRelease, Dist::Zilla::Plugin::OSPrereqs, HTTP::Tiny::UA, Session::Storage::Secure, TAP::Harness::Restricted, Task::BeLike::DAGOLDEN, Tie::Handle::Offset, Time::Tiny, and Types::Path::Tiny.

Closing thoughts and thanks

I hadn’t been to a toolchain summit in a couple years and being back was a great reminder of why it’s so valuable, both to the community and to me personally.

For the community, having so many high-caliber people able to spend dedicated time on the infrastructure of Perl is a hugely effective way of getting things done and making the most of volunteer time. Having the right people in the room means that almost no question is too obscure to get an answer from at least one of the attendees.

For big projects, like PAUSE or MetaCPAN, having the key developers face-to-face also helps with high-bandwidth discussions about change. Changing these sites is risky, and being able to talk and plan F2F means decisions happen much faster than they do over email and IRC the rest of the year.

For me, personally, I felt much more energized about the Perl ecosystem and came out of the summit with renewed interest in contributing to the modernization of PAUSE.

Such a wonderful event would be impossible without the help of the organizers and the support of the Perl Toolchain Sponsors. Thank you very much to Salve, Stig, Philippe, Laurent, and Neil, and to NUUG Foundation, Teknologihuset, Booking.com, cPanel, FastMail, Elastic, ZipRecruiter, MaxMind, MongoDB, SureVoIP, Campus Explorer, Bytemark, Infinity Interactive, OpusVL, Eligo, Perl Services, and Oetiker+Partner.