Developing in the Open

“Hey Ben, what have you been working on recently?” is a question that usually ends in a lengthy answer about new work stuff and how I'm excited for the future, but with no code links to show what I've been working on. Until now, at least. The team behind bbc.co.uk/programmes is switching to a “Developing in the Open” mindset for our new projects, a process popularised by GDS, The Guardian, and The Financial Times.

The BBC has a long tradition of publishing Open Source software and has used Developing in the Open for internal-use libraries such as Sport's Grandstand CSS framework, but I think this might be the first instance of opening up the application code that is used to directly power BBC systems.

Here's a little primer on what that means and what the team had to go through to get to this point.

What is Developing in the Open?

Developing in the Open is having your source code publicly accessible and licensed and thus available to be seen and used by all. This is very similar to the traditional model you think of when you hear “Open Source”, however there is a crucial difference. Open Source is for projects that you expect other people to use and contribute to, while Developing in the Open offers no such expectation.

As the code we are publicising is specifically for our team's needs, nobody else would find value in copying the whole thing verbatim and thus wouldn't be interested in providing patches. Though I'm sure some people will make some typo fixes in our READMEs so they get marked as having contributed to a BBC repo. However that doesn't mean we should hide our code away - there is still value in small portions of the app: a reusable CSS utility, or how to structure a set of domain models would be useful things that other people can see and make use of.

It's an inversion of the classical belief that things should be hidden by default - instead let's make things visible by default and only hide that which is sensitive. If something isn't worth selling, then why not offer it for free?

Why Develop in the Open?

The big, selfish reason is a sense of pride. We're working on applications we're proud of, both in terms of output and how they're put together. We should be able to show the world what we've made.

Occasionally we come up with something we think is worth sharing, such as a CSS utility to deal with image ratios, a neat naming convention for grid width classes, or even a methodology for structuring a design system. Having our code be open allows us to show people those things within the context of our applications, in addition to creating toy demo examples.

It helps improve our code quality. We have to be stricter about following best practices instead of giving in to the temptation to get sloppy, and we'll feel more compelled to produce higher quality code and commit messages if we know there's a chance more people will read what we produce.

The Ministry of Justice recently put out a blog post about why they code in the open that covers a couple of other pros.

Due Diligence

Moving to an open by default model does have some risks. You need to be more careful about how you handle sensitive data such as passwords, API keys and information that may give malicious outsiders too much information. Ideally you should already be mindful about these things - storing secrets in a private place that is then merged with your public code using environment variables or some other mechanism as part of your deployment process.
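
As a rough illustration of keeping secrets out of the repo, here is a minimal Python sketch that reads required values from environment variables at start-up and fails loudly if any are missing. The variable names and the `load_secrets` helper are made up for the example; this is not how Cosmos or our apps actually do it.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def load_secrets(required, environ=None):
    """Return a dict of the required secrets read from environment variables.

    `environ` defaults to os.environ; passing a plain dict makes this testable.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in required if not environ.get(name)]
    if missing:
        # Report the names only, never the values.
        raise MissingSecretError("missing secrets: " + ", ".join(sorted(missing)))
    return {name: environ[name] for name in required}


# Hypothetical variable names, assumed to be injected at deploy time.
REQUIRED_SECRETS = ["DATABASE_PASSWORD", "API_KEY"]
```

The public code only ever mentions the names of the secrets; the values live wherever your deployment tooling keeps them.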

Ensuring nothing sensitive gets into the repo is an ongoing process, and should anything sensitive be accidentally committed then the secret should be revoked and changed immediately. Fortunately we are pretty good at this already: our secrets are stored in a separate system - Cosmos, the BBC's in-house deployment tooling.

As we are opening up projects that have already been in active development, we had to audit the history of each project to ensure nothing sensitive existed in the repo.
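
To give a flavour of what part of such an audit could look like, here is a small Python sketch that flags lines resembling secrets. It is an assumption-laden example rather than the process we followed: the three patterns are illustrative only, and a real audit needs human eyes as well.

```python
import re

# Illustrative patterns only; a real audit needs a much broader set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
}


def find_suspect_lines(text):
    """Return (line_number, pattern_name) pairs for lines that look sensitive."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

Feeding it the output of `git log -p --all` would cover every commit rather than just the current checkout.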

We found no passwords, but we stored our AWS infrastructure configuration - CloudFormation templates and the Troposphere config that builds them - as part of the repository. While this information is not instantly recognisable as sensitive - it is a template we fill with precise values at a later point - we decided that there is no value in exposing this information. The templates are equivalent to a blueprint of a house - any potential bad actor would be able to work it out should they get inside, but that doesn't mean we should tell them the layout up front. This configuration was purged from our git history using git filter-branch, following GitHub's guide for removing sensitive data.

GDS have written a couple of articles about other things to consider when opening up a code base and when it might not be appropriate.

Licensing

By default a person (or team) retains the rights to their own work and thus nobody else may use that work. You need to explicitly include a LICENSE file in your repository to loosen or change those rights if you want other people to be able to use your work without fear of repercussion. choosealicense.com provides a list of licenses suitable for open source that allow other people to use your work, roughly ranked in order of restrictiveness.

For my team's projects we wanted to allow anybody to be able to use our code in any project. We don't want a viral copyleft license like AGPL that forces any project that takes code from our repo to also be openly licensed under the same terms. Making a project change its license just to use a neat CSS pattern we created is unreasonably heavy-handed.

We also want to retain the rights to any patentable work. Software patents are idiotic, but by us asserting that we own the right to patent anything we produce, we stop somebody else taking our work and patenting it (and profiting off it) themselves.

The license we chose is the Apache 2.0 License. This offers us patent protection, which its nearest sibling the MIT License does not - this is pretty much the only difference between the two. It also does not enforce a license on any projects that use any of our code. This is the “standard” license that the BBC uses for its open-source works and also what the Guardian uses for its webapp code.

Conclusion

A list of our public repositories can be found on GitHub, tagged with bbc‑programmes. Of particular note right now are:

  • programmes-pages-service - A PHP library containing our Database schema, domain models and tools to request data from our Database. Our new model layer.
  • programmes-clifton - A JSON API powered by Symfony 3, using programmes-pages-service. It is a drop-in replacement for a legacy API, so we can continue to produce pages using the existing www.bbc.co.uk/programmes frontend while we work on its replacement.
  • programmes-frontend - The new Symfony 3 powered web app that shall eventually power www.bbc.co.uk/programmes. This is very, very work-in-progress right now and we're still working out how it'll all fit together.

Bower vs. npm for packaging Sass

Package managers for the front end. Fun eh? I'm about to be playing with creating a modular CSS framework that needs to be shared across applications, so I figured now would be a good time to investigate the tools available. As nobody else seems to have done this with a specific eye to CSS I might as well write about it too. Here is my use-case:

  • I'm building a CSS framework, written in Sass (though this investigation's outcome would apply to all CSS preprocessors).
  • I'll have some fundamental mixins and functions, which shall be used by multiple components; these components shall then be included into my application.
  • My build pipeline shall be using Grunt.

There are two package managers on my shortlist: npm and Bower. Both are generic package managers. Both make you specify your dependencies in json files. Both install their dependencies into a folder within your application. At first glance they are very similar and my initial gut feeling was “if I'm already using npm for Grunt then why do I need Bower?”. Why complicate the project with an additional package manager? There must be something I'm missing.

The key difference between the two lies in how they store dependent packages. Bower has a flat listing, while npm uses a nested hierarchy. To demonstrate this difference let's use a couple of hypothetically named packages that map neatly onto a real world example used by the Guardian:

  • Application - your app (e.g. The Guardian website) - which depends on:
  • Component - a generic reusable CSS object (e.g. guss-layout) - which depends on:
  • Helper - a selection of utility Sass mixins / functions (e.g. sass-mq)

This is a drastically simplified example with a single component and helper, but think about how this can expand when there are multiple component packages each depending on various helper packages.

Bower's flat listing

Bower flattens your dependency graph and installs all dependencies at the same level, so after running bower install the application folder would look like this:

|--- bower_components
|    |--- component
|    |    |--- _component.scss
|    |--- helper
|    |    |--- _helper.scss
|--- bower.json
|--- styles.scss

This is really simple, but it can potentially pose a major problem - dependency hell, which can occur when multiple components rely on different versions of the same helper. Because Bower does not allow installing two versions of the same package everything grinds to a halt.
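
To make the failure mode concrete, here is a toy Python sketch of flattening a dependency graph. It is a simplification under stated assumptions: it compares exact version strings rather than resolving ranges the way a real package manager does, and the package names are hypothetical.

```python
def flatten(requirements):
    """Flatten {package: {dependency: version}} into one {dependency: version} map.

    Raises ValueError when two packages ask for different versions of the same
    dependency, since a flat layout has room for only one copy.
    """
    installed = {}
    requested_by = {}
    for package, deps in requirements.items():
        for name, version in deps.items():
            if name in installed and installed[name] != version:
                raise ValueError(
                    f"{name}: {requested_by[name]} wants {installed[name]}, "
                    f"{package} wants {version}"
                )
            installed[name] = version
            requested_by[name] = package
    return installed
```

Two components that agree on helper 1.0.0 flatten happily; as soon as one of them wants 2.0.0 there is nowhere to put the second copy.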

Bower's flat listing means that the paths to all of the app's dependencies are one level deep, so the app's styles.scss would look like this:

@import 'bower_components/helper/helper';
@import 'bower_components/component/component';

This can be tidied up by adding bower_components to the Sass load_path to save repeating it every time:

@import 'helper/helper';
@import 'component/component';

Very neat and tidy.
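
If the load path idea is unfamiliar, this Python sketch shows the gist of it: each import is tried against every load path in turn, checking both the plain file name and the underscore-prefixed partial. It is a deliberately simplified model (real Sass also checks relative to the importing file and knows about other extensions), and `resolve_import` is a name I made up.

```python
import posixpath


def resolve_import(target, load_paths, files):
    """Return the first file matching `target` under any load path, or None.

    `files` is a set of project-relative paths standing in for the filesystem.
    For each load path this tries the plain name and then the
    underscore-prefixed partial name, both with a .scss extension.
    """
    directory, name = posixpath.split(target)
    for load_path in load_paths:
        for filename in (name + ".scss", "_" + name + ".scss"):
            candidate = posixpath.normpath(
                posixpath.join(load_path, directory, filename)
            )
            if candidate in files:
                return candidate
    return None
```

With bower_components on the load path, 'helper/helper' finds bower_components/helper/_helper.scss without the prefix being spelled out in every import.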

Npm's dependency tree

Npm keeps the dependency tree intact, installing each package's dependencies into a node_modules folder within that package, so after running npm install the application folder would look like this:

|--- node_modules
|    |--- component
|    |    |--- _component.scss
|    |    |--- node_modules
|    |    |    |--- helper
|    |    |    |    |--- _helper.scss
|--- package.json
|--- styles.scss

This layout is a bit more complex than Bower's flat listing, but it avoids dependency hell as each component is responsible for its own dependencies rather than having them all merged into the same level.

This nested dependency graph means the app's stylesheet's @imports end up looking a little uglier than the Bower version:

@import 'node_modules/component/node_modules/helper/helper';
@import 'node_modules/component/component';

All this nesting means there isn't much value in adding node_modules to the load path as there is still a load of node_modules references at lower levels littering the @import paths anyway.

Unavoidable Dependency Hell

The thing is though, npm's resolution to dependency hell works great for JS modules where you can scope the inclusion of a particular version of a helper to a particular component. Sass has no such scoping though. You can only import files into the global scope which means that any mixins and functions contained within them also live in the global scope. Sass does not complain about overwriting an existing mixin or function, which can lead to subtle and insidious bugs when one version of a helper overwrites a different one and gets used within a component that does not support it.
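
Here is the overwriting problem modelled in a few lines of Python. It is an analogy, not Sass itself: the `GlobalScope` class stands in for Sass's single global namespace, and the two `gutter` helpers are hypothetical versions of the same mixin.

```python
class GlobalScope:
    """A single shared namespace, standing in for Sass's global scope."""

    def __init__(self):
        self.mixins = {}

    def import_file(self, definitions):
        # Later definitions silently replace earlier ones with the same name.
        self.mixins.update(definitions)

    def include(self, name, *args):
        return self.mixins[name](*args)


# Two hypothetical versions of one helper: same name, different behaviour.
helper_v1 = {"gutter": lambda size: f"margin: {size}px"}
helper_v2 = {"gutter": lambda size: f"margin: {size}rem"}
```

A component written against helper_v1 that includes gutter with 10 gets "margin: 10rem" once helper_v2 has been imported after it, and nothing warns you that it happened.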

It's a potential minefield and I can't come up with a way out of this mismatched dependency thing in Sass other than being careful not to fall prey to it in the first place. That means being careful not to break backwards compatibility in your helper's API or, when you do inevitably break it, upgrading everything that depends on that helper at the same time. It's a sucky problem but really your helpers should be so simple that you will not need to update them in a way that breaks prior expectations anyway. As the old joke goes: “Doctor doctor, it hurts when I do this” “Well, don't do it then”. I've not got a better solution than the utterly unhelpful “you just need to know when you're about to do it wrong”, sorry.

So, npm's nested dependency tree is not actually a feature that is relevant to writing a Sass framework anyway as Sass does not support that sort of scoping. Which means that Bower is looking a lot nicer due to its simpler folder structure (as npm's apparent complexity does not help solve anything in Sass-land) and its ability to give a bit of sugar around load paths. Balls, this wasn't the answer I was hoping for, I don't want to require a second package manager. What if I can make npm use a flat folder structure like Bower?

NPM Peer Dependencies

What if I fiddle with that dependency graph a little bit? What if I say that a component should not be responsible for loading in the helpers it needs, but instead should trust that the application has already loaded a compatible version of the helper that is available for the component to use? It means your components break if you do not include that helper in your application's scss file, but it seems like a small price to pay for ensuring you are only pulling in a single version of your helpers. It sounds daft and horrible but it might get us out of this.

Here is the revised dependency chain, where the application explicitly states that it requires Component and Helper, and Component hints that it needs a specific version of Helper:

  • Application - your app (e.g. The Guardian website) - which depends on:
    • Component (hinted: I need Helper to be included before me)
    • Helper

NPM supports this kind of hint, for things a component needs but does not load itself, through a feature called peer dependencies. By specifying the helper as a peer dependency, npm shall throw an error if two components attempt to rely on two different versions of a single helper.

The component's package.json would look like:

{
  "name": "component",
  "version": "1.0.0",
  "dependencies": {},
  "devDependencies": {},
  "peerDependencies": {
    "helper": "~1.0.0"
  }
}

The app's package.json would look like this:

{
  "name": "myApp",
  "dependencies": {
    "helper": "~1.0.0",
    "component": "~1.0.0"
  },
  "devDependencies": {},
  "peerDependencies": {}
}

When you run npm install, your application's tree shall look like:

|--- node_modules
|    |--- component
|    |    |--- _component.scss
|    |--- helper
|    |    |--- _helper.scss
|--- package.json
|--- styles.scss

And thus your app's stylesheet would look like:

@import 'node_modules/helper/helper';
@import 'node_modules/component/component';

Currently peer dependencies are installed automatically and npm throws an error when packages want to install conflicting versions of a peer dependency, so we don't really need to explicitly specify them in the application's package.json. However the npm maintainers don't like the idea of peer dependencies in this form. They would rather change it so that peerDependencies are not installed automatically and that npm would warn rather than error when there are peer dependency conflicts. I would rather be a bit more explicit in the application's package.json to be ready for that impending change. So now we have that flat Bower-like layout that we were hoping for, at the expense of having to write a little more in our package.json manifest file. I'm pretty happy with that.
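
For the curious, the check being described can be sketched in Python like so. This is my own toy model, not npm's implementation: it only understands full three-part tilde ranges such as ~1.0.0 (meaning at least 1.0.0 and below 1.1.0), and the function names are invented for the example.

```python
def parse_version(text):
    """Turn '1.2.3' into the tuple (1, 2, 3)."""
    return tuple(int(part) for part in text.split("."))


def satisfies_tilde(version, tilde_range):
    """True if `version` satisfies a three-part tilde range such as '~1.0.0'.

    '~1.0.0' allows patch-level changes only: >=1.0.0 and <1.1.0.
    """
    major, minor, patch = parse_version(tilde_range.lstrip("~"))
    lower = (major, minor, patch)
    upper = (major, minor + 1, 0)
    return lower <= parse_version(version) < upper


def unmet_peers(installed, packages):
    """List (package, peer, wanted_range) for every peer range not satisfied.

    `installed` maps package name to installed version; `packages` maps
    package name to its peerDependencies dict.
    """
    problems = []
    for package, peers in packages.items():
        for peer, wanted in peers.items():
            have = installed.get(peer)
            if have is None or not satisfies_tilde(have, wanted):
                problems.append((package, peer, wanted))
    return problems
```

An empty list means every component's hint is satisfied by the single copy of the helper the application installed; anything else is a conflict to report, whether as an error or a warning.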

Conclusion

The more I sit and think about this, the more I think that using npm with peerDependencies is a good idea. It is not explicitly what peerDependencies was originally envisioned for, but I think it seems like a good ideological fit and certainly appears to solve the problem - while keeping build-time complexity down thanks to not needing the overhead of Bower. Please tell me if I'm crazy.

Powered By Middleman

I started using static site generators some time last year, beginning with probably the most well known: Jekyll, which powers GitHub Pages. I found it ideal for creating simple sites with no need for interactivity that can be deployed anywhere (due to being pure html/css/js) while still allowing me control over the build process and asset optimisation.

Jekyll's focus is on being minimal, lean and slightly opinionated and I was getting tired of having to implement functionality I wanted as plugins. If only there was some tool that concerned itself with front-end best practices like concatenation and minification of CSS and JavaScript and offered tooling like LiveReload as standard…

Enter Middleman

Middleman is a full-fat static site generator that cares about optimisation and developer joy out of the box (or via official extensions). A quick list of things that I had to implement in Jekyll that are easy in Middleman:

Naturally this site is built using Middleman and published onto GitHub Pages, and the source to generate it is available on my GitHub account.

Tubewhack

This is a backdated post from way back in the day

Based on a suggestion in b3ta newsletter #419 I have created Tubewhack; a tool that lets you find words whose letters appear in a single tube station name. For instance: Pimlico is the only Underground station which does not contain any of the letters in the word “badger”.
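
The core of the idea fits in a few lines of Python. This is a sketch written for this post rather than Tubewhack's actual source, the `is_whack` name is invented, and the station list is a tiny sample, not the full network.

```python
def stations_sharing_no_letters(word, stations):
    """Return the stations whose names contain none of the letters in `word`."""
    letters = {ch for ch in word.lower() if ch.isalpha()}
    return [
        station
        for station in stations
        if letters.isdisjoint(station.lower())
    ]


def is_whack(word, stations):
    """True when exactly one station shares no letters with `word`."""
    return len(stations_sharing_no_letters(word, stations)) == 1


# A tiny sample list for illustration, not the full network.
SAMPLE_STATIONS = ["Pimlico", "Bank", "Oval", "Victoria", "Temple"]
```

Against this sample, “badger” leaves only Pimlico standing, which is what makes it a whack.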

Credit for the name goes to @djmarland as Tubewhack is considerably more succinct than “Words Not Contained In Tube Names”.

Kittenify Bookmarklet

This is a backdated post from way back in the day

Based on a suggestion in b3ta newsletter #390 I have created Kittenify; a little bookmarklet that replaces all images on the current page with pictures of kittens pulled from Flickr.

One Week Later: WOO and YAY I've made it into b3ta newsletter #391 twice. Once for the bookmarklet itself, and again for getting b3ta a bit of media coverage as Kittenify appeared in both the online and print editions of the Metro newspaper.