Innovating Open

Sam Pikesley

The idea of “Innovation time” originated, as far as I’ve been able to establish, at 3M in 1948. HP are reported to have allowed their engineers to work on whatever they wanted on a Friday afternoon, and “20% time” is a firm fixture at Google.

Here at the ODI, the Tech Team are fortunate enough to be given 20% time, but it’s not always easy to think of something to work on. However yesterday we were having a discussion around how we manage deployments (and there’ll doubtless be lots more about that in future blosts as we move towards the Holy Grail of continuous deployment) and something James said gave me an idea.

So I’ve created a cookbook for the Chef configuration-management tool; this will certainly solve a problem for us, but because we’re all about Open it’s available to anybody who wants it under the MIT license. The hope is that it will help other people with a similar itch to scratch, but more than that, that some of those people can help us to make it better - we hold it as axiomatic that there are more good ideas in the world than there are in the ODI.

The Technical Stuff

Our applications rely on a bunch of configuration variables (API keys, Oauth tokens, URLs etc) which up until now we’ve been storing in our shared password file and passing around between nodes as required. This process is clearly error prone (“Why is the Jenkins build broken again?” “Oh, I forgot to add that new env thing, sorry”) and would (as with so many other things) be much better handled by robots. So, let’s automate!

Our configuration-management weapon of choice is Chef, which has the concept of data bags. We’ll store our configuration data as JSON in one of these.

So we have a bit of JSON that looks like this:

{
  "id": "development",
  "content": {
    "capsulecrm": {
      "account_name": "foobar",
      "api_token": "123abc",
      "default_owner": "some_user"
    },
    "eventbrite": {
      "api_key": "456xyz",
      "organizer_id": "6031769",
      "user_key": "s00pahs3kr3t"
    },
    "github": {
      "login": "user",
      "organisation": "theodi",
      "password": "icouldtellyoubutthenidhavetokillyou"
    },
    "leftronic": {
      "api_key": "igot99problems",
      "github": {
        "forks": "987fgh",
        "issues": "asdf1974"
      }
    }
  }
}

(NOTE: FAKE DATA!!!) Now we’re using the rather splendid Dotenv gem, which expects to read environment files in either a conventional Unix-ish KEY=value format or, more interestingly for us, a YAML-ish KEY: value style. It’s only YAML-ish, though, it doesn’t support nesting (AFAIK). We could just stuff the JSON with the raw keys, but doing them hierarchically like this reduces redundancy, improves readability, and makes processing the data a lot more interesting.

So I have this data and I need to turn it into something Dotenv-friendly like this:

CAPSULECRM_ACCOUNT_NAME: foobar
CAPSULECRM_API_TOKEN: 123abc
CAPSULECRM_DEFAULT_OWNER: some_user
EVENTBRITE_API_KEY: 456xyz
EVENTBRITE_ORGANIZER_ID: 6031769
EVENTBRITE_USER_KEY: s00pahs3kr3t
GITHUB_LOGIN: user
GITHUB_ORGANISATION: theodi
GITHUB_PASSWORD: icouldtellyoubutthenidhavetokillyou
LEFTRONIC_API_KEY: igot99problems
LEFTRONIC_GITHUB_FORKS: 987fgh
LEFTRONIC_GITHUB_ISSUES: asdf1974

which means recursion.

Once, a long time ago, I knew how to do recursion. It’s astonishing how completely you can forget things. So, for the benefit of me and anybody else who’s stumbled across this because they’re having similar problems, here’s the solution at which I arrived after several hours of headscratching and Googling:

def walk hash, stack = [], output = []
  hash.each do |key, val|
    stack << key
    if val.is_a?(Hash)
      walk val, stack, output
    else
      output << "%s: %s" % [
          stack.join("_").upcase,
          val
      ]
      stack.pop
    end
  end
  stack.pop

  output.join "\n"
end

So, we pass in our hash, plus two arrays (which default to empty). We step through the top level of the hash, and for each entry, push the key onto our stack, and examine the value. If it’s a hash, we go around again, but this time the arrays may contain some stuff: particularly, the stack might have one or more keys in it.

Sometimes, the value is not a hash, it’s a string. We have a winner! So we take our current stack of keys and glue them together, with underscores, into a string, and attach our value, and throw this into our output array. And because we’ve now got the value for this (compound) key, we can pop the last value off the stack and bin it.

And now we come to the bit that caused the most swearing: when we’ve processed the entire hash, we need to pop the remaining value off the stack. Because I’d constructed my test JSON poorly, it all looked superficially successful until I pointed it at the real stuff. There’s a lesson there somewhere.

Anyway, that’s enough CompSci 101. If you’re of a DevOps-ish bent and would like to actually use this cookbook, you can find it here; and if you’d like to help us improve it, we are always open to pull-requests.