---
title: Continuous Integration
---

Whilst Nix's strict control of software versions and dependencies is excellent
when handling *other people's* programs, it can be a little frustrating when
developing one's own software.

In particular, version control systems like Git have their own notion of
immutable state (commits) and dependencies (each commit's hash includes that of
its parent, just like Nix derivations include their dependencies' hashes). This
similarity often leads to overlaps, where I need to synchronise Nix packages
with Git updates, or vice versa.

I've tried several methods of resolving this, and have finally found one that
I'm happy with. Here I'll explain each approach in turn.

## Manual Git Revisions ##

This is the most tedious approach, but it ensures you're using a known-good
configuration. For example, this is the way you'd manage an official package in
nixpkgs.

Let's say we're maintaining the following packages `foo` and `bar`. For the sake
of argument, let's say we keep these definitions in our
`~/.nixpkgs/config.nix` file:

```
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = "f000001";
    sha256 = "f000000000000000000000000000000000000000000000000001";
  };
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = fetchgit {
    url    = "https://example.com/bar.git";
    rev    = "b000001";
    sha256 = "b000000000000000000000000000000000000000000000000001";
  };
  buildInputs = [ foo ];
};
```

Now let's say we make a change to `foo`, which lives in a new Git revision
`f000002`. We need to go and edit our package definition:

```
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = "f000002";
    sha256 = "f000000000000000000000000000000000000000000000000001";
  };
};
```

In fact, this package will not build, since the SHA256 hash will not match. Keep
in mind that the SHA sum identifying the Git commit has nothing to do with the
SHA256 sum which Nix calculates for the fetched source, so we can't just copy it
over. Instead, we must ask Nix what the new SHA sum should be, and the easiest
way to do that is to attempt to build the package:

```
$ nix-shell -p foo
...
output path ‘/nix/store/...foo’ should have r:sha256 hash
‘f000000000000000000000000000000000000000000000000001’, instead has
‘f000000000000000000000000000000000000000000000000002’
...
```

This mismatch is what we were expecting, so we can now safely copy the new hash
in place of the old one:

```
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    rev    = "f000002";
    sha256 = "f000000000000000000000000000000000000000000000000002";
  };
};
```

Clearly, this is a tedious process. The worst part is that we need to do the
same thing even for *tiny* changes. Let's say that `bar` requires some small
change to the API of `foo`, e.g. changing a private function into a public one
(before you scoff, read on for a treatment of the software engineering
aspects!).

Even if we have
[a nice development workflow for `bar`](developing_on_nixos.html), with this
setup we will still need to bump the version of `foo` *globally*, just to have
the new version available to `bar`.

This is problematic in a couple of ways. Firstly, *all* package definitions will
now point to the new version of `foo`. Of course, thanks to the way Nix works,
this won't affect any *existing* derivations (i.e. all of our installed packages
will carry on working as-is); however, it will affect *new* derivations, like
those instantiated by `nix-shell`. Symptoms might include a load of expensive
rebuilds, or subtle breakages that we don't want to contend with *right now*
(software engineers, read on!).

The other problem is that our change might not work! With this approach, we must
go to all this effort and bump the global version of `foo` *just to try
something out*!

As for the software engineering perspective, it's certainly true that any change
to a public API should lead to a version increase, and shouldn't be considered
"tiny". It's also true that we should try not to release new API versions if
they're known to break existing clients, at least without understanding whether
the breakages are acceptable.

*However*, our problem shouldn't have anything to do with releases and versions!
Even at the stage of *experimenting* and *prototyping*, to see if the proposed
change will even work in the first place, we are suddenly forced to deal with a
release management scenario!

This is obviously a bad state of affairs, in particular because it penalises
modularity: these headaches in tying packages together will subconsciously bias
us against re-using existing components, or splitting up our task into smaller
sub-tasks.

## Ignoring Git ##

We can also go to the opposite extreme and ignore version control altogether!
Consider the following:

```
foo = stdenv.mkDerivation {
  name = "foo";
  src  = ~/Programming/foo;
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = ~/Programming/bar;
  buildInputs = [ foo ];
};
```

Using hard-coded paths is clearly a bit of a code smell, as these package
definitions are no longer portable to other machines. However, look at what
we've gained! We no longer need to specify a Git revision or an SHA256 checksum:
Nix will scan the contents of the directories, and if they've changed since the
last build it will copy them into the store and make a new derivation.

Unfortunately, there are obvious problems with this approach. In particular,
we've still got the issue that any changes we make will have a global effect. In
fact, we no longer have to commit something in order to alter our system! If we
have uncommitted changes in the `foo` or `bar` directories, any new derivations
depending on these packages will get those changes, regardless of whether we
intended them to or not.

This undermines a lot of the purpose of Nix, since we want to have confidence in
our system integrity *and* the freedom to develop software without fear of
breaking anything with half-finished changes.

## Automating Updates ##

Another solution I tried is to automate the process of updating revisions and
hashes. I wrote two scripts, called `bumpCommit` and `bumpEverything`, to aid
this process. First, the packages would be defined to load their revision and
hash attributes from external files:

```
foo = stdenv.mkDerivation {
  name = "foo";
  src  = fetchgit {
    url    = "https://example.com/foo.git";
    # Path literals, not strings: `import` rejects relative paths given as strings
    rev    = import ./foo.rev.nix;
    sha256 = import ./foo.sha256.nix;
  };
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = fetchgit {
    url    = "https://example.com/bar.git";
    rev    = import ./bar.rev.nix;
    sha256 = import ./bar.sha256.nix;
  };
  buildInputs = [ foo ];
};
```
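
For the imports above to yield strings, each external file only needs to hold a
single Nix string expression. A minimal sketch, assuming the files contain
nothing but a string literal (their exact contents are not shown in this post):

```nix
# foo.rev.nix (assumed contents): the whole file is one string expression,
# so importing it evaluates to the revision ID
"f000002"
```

`foo.sha256.nix` would hold the matching hash in the same way, which keeps both
files trivial for a script to overwrite.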

`bumpCommit` would look at the current working directory, and use a lookup
table to find the corresponding `rev.nix` and `sha256.nix` files, which it would
overwrite with the new versions.

Since these package definitions *themselves* are stored in Git repositories,
running `bumpCommit` on a project may cause changes in other projects, which
require more commits, and so on.

That's where `bumpEverything` came in: it would loop through all projects in
dependency order, commit any outstanding changes to `rev.nix` or `sha256.nix`
files, then invoke `bumpCommit`.

This was clearly a hack, and was very fragile in the face of changing projects,
e.g. splitting a project into two parts. It also generated a large number of Git
commits, which would do nothing other than bump revision numbers and hashes.

## latestGit ##

My current solution abandons the `bumpCommit` and `bumpEverything`
commands. Instead, we try to combine the best features of the Git approach *and*
the no-Git approach.

The aim is to treat a Git URL as if it were a local directory: no need to
specify a particular revision (just fetch a sensible default like `HEAD`,
`master`, etc.) and no need to specify a particular hash (just like with local
directories, if it's changed then build a new derivation; of course, in practice
a particular Git revision *will not* change, and is hence safe to cache).

To do this, we abuse Nix's idea of derivations in order to run arbitrary scripts
and cache the results in the Nix store for subsequent access.

First, we write a "package" which is just a single file, containing the latest
Git revision of a particular repository:

```
with import <nixpkgs> {};
with builtins;

getHeadRev = { url, ref ? "HEAD" }:
  stdenv.mkDerivation {
    inherit url ref;
    name    = "repo-head-${hashString "sha256" url}";
    version = toString currentTime;

    # Required for SSL
    GIT_SSL_CAINFO = "${cacert}/etc/ssl/certs/ca-bundle.crt";

    buildInputs  = [ git gnused ];
    buildCommand = ''
      source $stdenv/setup
      # printf is an ugly way to avoid trailing newlines
      printf "%s" $(git ls-remote "$url" "$ref" | head -n1 | sed -e 's/\s.*//g') > "$out"
    '';
  };
```

The `getHeadRev` function takes a `url` parameter and, optionally, a `ref`
(defaulting to `HEAD`). It then generates a package with a name derived from the
given URL, and a version based on the current time. Since the time differs on
every evaluation, so does the derivation, which ensures that we avoid the Nix
cache and query the repository afresh.

The contents of the package, stored in the file at location `$out`, are
generated by the `buildCommand`, which queries the given URL for the latest
revision ID.

Next, we need a way to access this revision information from within Nix. We do
this by coercing the packages generated by `getHeadRev` into strings, which will
correspond to the `$out` path at which they're stored. We use `readFile` to
read the contents of the generated files:

```
rev = args: unsafeDiscardStringContext (readFile "${getHeadRev args}");
```

We use `unsafeDiscardStringContext` to ignore the "dependencies" of this string,
i.e. the particular invocation of the `git ls-remote` command; all we care about
is the revision, not the time at which it was looked up.
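
String context can be inspected directly. The following sketch is not part of
`latestGit`: it uses `builtins.toFile` as a stand-in for any store path, assumes
a Nix version which provides `builtins.hasContext`, and `context.nix` is a
made-up file name:

```nix
# Evaluate with: nix-instantiate --eval --strict context.nix
with builtins;

let
  # Strings naming store paths carry a context recording what they depend on
  path = toFile "example" "contents";
in {
  withContext    = hasContext path;
  withoutContext = hasContext (unsafeDiscardStringContext path);
}
```

The first attribute should come out as `true` and the second as `false`: the
text of the string is unchanged, but Nix no longer tracks where it came from.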

Now that we have a Git revision, we just need to avoid the hash requirement of
`fetchgit`, to prevent it from being a so-called "fixed-output derivation". We
do this in two steps, utilising Nix's lazy evaluation. First, we do a regular
`fetchgit` invocation:

```
fg = args: fetchgit {
  url = args.url;
  rev = rev args;

  # Dummy hash
  sha256 = hashString "sha256" args.url;
};
```

Finally, we override this derivation to strip out all of the hashing
information:

```
latestGit = args: stdenv.lib.overrideDerivation (fg args) (old: {
  outputHash     = null;
  outputHashAlgo = null;
  outputHashMode = null;
  sha256         = null;
});
```

Without these attributes the result is an ordinary derivation rather than a
fixed-output one, so Nix never compares the fetched source against our dummy
hash. Now we can use this `latestGit` function to specify our package sources:

```
foo = stdenv.mkDerivation {
  name = "foo";
  src  = latestGit {
    url = "https://example.com/foo.git";
  };
};

bar = stdenv.mkDerivation {
  name = "bar";
  src  = latestGit {
    url = "https://example.com/bar.git";
  };
  buildInputs = [ foo ];
};
```
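
Since `latestGit` hands all of its arguments on to `getHeadRev`, the optional
`ref` can be given in the same place to follow something other than `HEAD`. A
small sketch, where the `develop` branch is a made-up example:

```nix
# Hypothetical: track the tip of a `develop` branch instead of HEAD
foo = stdenv.mkDerivation {
  name = "foo";
  src  = latestGit {
    url = "https://example.com/foo.git";
    ref = "refs/heads/develop";
  };
};
```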

This ensures that Nix and Git are always synchronised, since Git revisions are
now the canonical form of package versions, and Nix will ask Git whether these
have changed whenever a new derivation is instantiated.

From a development point of view, this gives us the freedom to tinker and
experiment without fear of breaking our system. When our local changes are
suitable for wider use, we can do the usual `git push` to make them available to
the world; which now includes *our* installation of Nix!

Our packages are also portable, as long as `latestGit` is made available
somewhere.

This doesn't *quite* solve the problem of using experimental versions of `foo`
from within `bar`; however, we're free to define *extra* packages, like
`foo-unstable`, which use the local directory as their source. If we make `foo`
a *parameter* of the `bar` package, we can override it with `nix-shell` to get a
one-off development shell using `foo-unstable`, without breaking anything,
without exposing our dodgy prototypes to the world, and without building a
mountain of trivial Git commits.
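
As a rough sketch of that last idea, `bar` can be written as a function of `foo`
and wrapped with `lib.makeOverridable`. The names `mkBar` and `packages.nix` are
made up for illustration, and `latestGit` is assumed to be in scope as defined
above:

```nix
# packages.nix (hypothetical file name); assumes `latestGit` is in scope
with import <nixpkgs> {};

let
  # bar as a function of foo, so callers can choose which foo to build against
  mkBar = { foo }: stdenv.mkDerivation {
    name = "bar";
    src  = latestGit { url = "https://example.com/bar.git"; };
    buildInputs = [ foo ];
  };
in rec {
  # The default foo tracks the Git repository
  foo = stdenv.mkDerivation {
    name = "foo";
    src  = latestGit { url = "https://example.com/foo.git"; };
  };

  # The prototype builds straight from the working directory
  foo-unstable = stdenv.mkDerivation {
    name = "foo-unstable";
    src  = ~/Programming/foo;
  };

  # makeOverridable adds an `override` attribute for swapping arguments
  bar = lib.makeOverridable mkBar { inherit foo; };
}
```

A one-off shell for hacking on `bar` against the prototype would then be
something like
`nix-shell -E 'with import ./packages.nix; bar.override { foo = foo-unstable; }'`,
while everything else keeps using the default `foo`.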
