highway-code-diff
11 Apr 2022
For most of my life I’ve been fortunate to live in cities with ample cycle infrastructure, cheap public transport, or both, so I never really felt any urge to learn to drive until the pandemic hit and I could no longer travel around as easily. Here in the UK we have both a written and a practical exam, with the former based heavily on the information contained in the Highway Code, a set of rules and guidance for all road users in the UK.
Once you’ve passed your test, though, that’s far from the end of your relationship with the Code because it is, unsurprisingly, a living document. The Code has had at least one update a year since 2015 (apart from in 2020), and often more than one. It has been updated twice since I started learning to drive, and I only found out about the update in January 2022 because I read a newspaper article about it. Clearly this wasn’t the right way to go about staying up-to-date with changes to the Code, so I went looking for ways to keep myself better informed.
The Code’s website maintains a list of updates, but they’re not terribly helpful:
> Updated rule 1 to include reference to footways, and to add guidance about remaining aware of your environment and avoiding unnecessary distractions.
In that example I know that a specific rule has changed, and I have a sense of how it has changed, but I don’t know what the new wording is; for that I need to go and look at the rule itself (which isn’t linked to from the update, unfortunately). This, combined with the fact that I’d have to remember to visit this page frequently, left me looking for something else.
You can also sign up for email alerts about changes to the Code. These are better in two ways: they let you know when something has changed, so you don’t have to remember to check a page on a government website periodically, and they link to a more useful page that lists the relevant changes and links to the changed rules.
Unfortunately it turns out that knowing which rule has changed isn’t really all that helpful either. Some of the rules are long, and there’s no way to tell what specifically has been changed in them, so you’re forced to re-read each one in its entirety and compare it against your hazy recollection of what it might have said previously. Perhaps this is intentional and they want you to re-read each rule completely? For me it feels a bit like reviewing a pull request: sometimes the change requires reading the rest of the function in order to understand it, but sometimes it’s a one-liner and enough by itself. It all depends on the context.
That’s where my highway-code-diff tool comes in. It:
- periodically fetches the HTML from the official Highway Code website;
- converts the HTML into Markdown; and
- opens a pull request in the same repo whenever there are changes.
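As a rough illustration of those three steps, here is a minimal Python sketch. It is not the tool’s actual code: the `refresh` and `fetch_html` names, the `pages` mapping and the UTF-8 assumption are all hypothetical, and the HTML-to-Markdown converter is passed in as a plain function so that the example only needs the standard library.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_html(url: str) -> str:
    """Download a page and decode it as UTF-8 (an assumed encoding)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def refresh(
    pages: dict[str, str],
    out_dir: Path,
    convert: Callable[[str], str],
    fetch: Callable[[str], str] = fetch_html,
) -> list[Path]:
    """Fetch each page, convert it to Markdown and rewrite files that changed.

    `pages` maps an output file name to the URL it is scraped from
    (a hypothetical layout). Returns the paths whose contents differ
    from what was previously on disk.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    changed = []
    for name, url in pages.items():
        markdown = convert(fetch(url))
        target = out_dir / name
        previous = target.read_text(encoding="utf-8") if target.exists() else None
        if markdown != previous:
            target.write_text(markdown, encoding="utf-8")
            changed.append(target)
    return changed
```

In a real run `convert` could be the `markdownify` function from the markdownify package, and a non-empty return value would be the cue to commit the files and open a pull request.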
The tool gives me a single view of all changes (including changes that aren’t included in the official updates) and I can use GitHub’s pull request UI to get more context on a given change as needed. I get notified when the pull request is created, and a GitHub Action performs the periodic refresh so that I don’t have to remember to do so myself. On top of that I can view the individual Markdown files in the repo as an alternative to going to the website.
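The kind of view a pull request provides can be approximated locally with Python’s standard `difflib` module. The sketch below is only an illustration: `rule_diff` is a hypothetical helper, and the rule wording is placeholder text rather than the real text of any rule.

```python
import difflib


def rule_diff(old: str, new: str, name: str, context: int = 1) -> str:
    """Return a unified diff between two versions of a Markdown file."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
        n=context,      # number of unchanged lines shown around each change
        lineterm="",    # inputs have no trailing newlines, so add none
    )
    return "\n".join(lines)


# Placeholder wording, not the real text of any rule.
old = "Line one of a rule.\nUse the pavement.\nLine three."
new = "Line one of a rule.\nUse the pavement or footway.\nLine three."
print(rule_diff(old, new, "rule.md"))
```

Raising `context` plays the same role as expanding the surrounding lines in a pull request: a one-line change can be read by itself, or with as much of the rest of the rule as is needed to understand it.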
I should say that the tool is not without its own issues:
- the HTML on the official website isn’t always semantically correct, so the Markdown conversion library I’m using (markdownify) can end up making mistakes and I might miss (part of) a change as a result
- the code will pick up any change and notify me of it, even if it’s only somebody adding a trailing space to an HTML page, so I run the risk of getting needless notifications
- it will only keep working as long as I’m able to scrape the official site, which isn’t necessarily a given because I’m hitting all the URLs related to the Code, even image ones, so I could get rate-limited or even blocked (this happened numerous times during development)
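One possible way to reduce the whitespace-only notifications mentioned above is to normalise the Markdown before comparing versions. This is a sketch of a mitigation, not something the tool is known to do, and both `normalise` and `meaningfully_changed` are hypothetical names.

```python
import re


def normalise(markdown: str) -> str:
    """Strip trailing whitespace and collapse runs of blank lines."""
    lines = [line.rstrip() for line in markdown.splitlines()]
    text = "\n".join(lines).strip("\n")
    # Collapse three or more consecutive newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text + "\n"


def meaningfully_changed(old: str, new: str) -> bool:
    """True when two versions differ in more than whitespace layout."""
    return normalise(old) != normalise(new)
```

The trade-off is that two trailing spaces mean a hard line break in Markdown, so this normalisation would also hide a change that only adds or removes such a break.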
Beyond that, the environment for developing GitHub Actions could definitely do with some love, as the iteration cycles aren’t particularly quick depending on what you’re doing. My naive initial method was to set the action to run whenever I pushed a change to the repo, which got a little tiresome after a while. Security is definitely a concern too, given that the built-in actions are limited and you may end up having to give marketplace actions access to tokens. GitHub’s own stability has also been patchy of late, and this has affected Actions, but I’d imagine that’s only temporary. In principle, though, and I know this is far from an original observation, I am quite taken with GitHub Actions as a pattern for stateful, version-controlled serverless functions and plan to make more use of them in the future.