---
title: Paths of Least Resistance
---

> There are two types of programming languages: those which people complain
> about, and those which nobody uses.

My previous boss used to trot out this line whenever I bemoaned some failing of
the technologies we were using (which almost always meant PHP doing something
braindead). This is an example of a [thought-terminating cliche](
https://en.wikipedia.org/wiki/Thought-terminating_clich%C3%A9): a pithy way to
end a discussion with a whiff of resolution, without actually addressing any
point or preventing the problem arising again. I've seen this same "apologist"
attitude come up a few times in online discussions too, so I thought it was time
I wrote the sort of rebuttal I think it deserves. I apologise for the verbosity
of this response; it is unfortunate that, by design, the densely-packed nonsense
implicit in a thought-terminating cliche can take a lot of work to extract,
expose and debunk.

---

Like many things in life, everything in programming has problems; that *doesn't*
mean everything is equally problematic! In particular, I've become wary of
approaches which are fraught with:

- Gotchas: These are known, acknowledged breakages in functionality or
  abstraction, which seemingly come out of nowhere for the uninitiated
  (violating the principle of least surprise). Those with more experience tend
  to instinctively code defensively to avoid them, obfuscating otherwise-good
  code for the fear of triggering one of these situations.
- Misaligned Incentives: Where following the "correct" practice is directly in
  conflict with some other objective. For example, if the "correct" solution is
  more verbose, slower, harder to debug, harder to read, less modular, etc.
- Lack of Objectively Checkable Criteria: Where there's no way to agree if
  something is done "correctly" or not. This makes it hard for learners to
  guess what the "correct" approach calls for in any given situation; and also
  allows goalposts to be moved after the fact, to denounce genuine attempts to
  follow the practice as "not done *properly*" if they turn out to make things
  worse.

Avoiding these problems can be summarised as fixing *the path of least
resistance*: making the easiest thing to do more likely to be correct. Here are
some examples of practices which are *all* flawed, but where some are less prone
to the above than the alternatives.

# Static Typing #

Most static typing gotchas are conservative, i.e. some code isn't allowed even
though it *might* be correct. This is preferable to allowing broken code, as
long as it's not too burdensome to pass the type-checker. The widespread use of
statically typed languages (e.g. Java and C#) shows that it need not be too
burdensome. Examples of gotchas are [Haskell's "monomorphism restriction"](
https://wiki.haskell.org/Monomorphism_restriction) and [Java's lack of
multiple-inheritance](
https://stackoverflow.com/questions/52620936/why-does-java-not-allow-multiple-inheritance-but-does-allow-conforming-to-multip);
in both of these cases it's *possible* to cause a problem, so the type checker
forbids it (even though it *might* be fine).

For 'incentives', type-checkers don't have any of their own, but they're a
mechanism to bring the programmer's incentives ("get this code to compile") into
alignment with those of library designers ("make sure users call things
correctly"). Hence they're the *opposite* of misaligned incentives. An example
is string concatenation, which is the easiest way to dynamically create URLs,
HTML pages, SQL queries, shell commands, etc. but is vulnerable to injection
attacks. If a library/framework makes each of these a different type then this
easy-but-vulnerable approach is no longer possible; if escaping
functions/methods are the only way to convert a string from one language to
another, then the easiest way to combine strings is to escape them
appropriately, hence bringing the user's incentive ("do the easiest thing that
compiles") into alignment with the designer's ("prevent vulnerabilities").

Another example is sequential coupling, where one function/method, like
`dbConnect`, must be called before others, like `dbQuery`. The easiest thing is
to just call `dbQuery`, which won't work; the designer can forbid this misuse by
having that function require a `ConnectedDB`, and have `dbConnect` be the only
way to obtain one. Again, the easiest thing for the user to do (in order for the
compiler to succeed) is to use the library correctly.

There are many other examples of this sort of thing: in general, we *cannot*
necessarily increase the safety or correctness of the easiest approach to a
problem; but we *can* use types to easily *forbid* such easy "solutions",
forcing the use of more "correct" approaches. Overall this can make life more
annoying, but it increases the safety of the path of least resistance.
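
As a rough sketch of the injection and sequential coupling examples, here is how
they might look in TypeScript. Only `dbConnect`, `dbQuery` and `ConnectedDB` are
names from the text above; the `Html` class, its methods, the `greeting`
function and the in-memory "database" are hypothetical stand-ins, not any real
library's API.

```typescript
// Hypothetical HTML fragment type. The private field makes TypeScript treat
// the class nominally, so neither a plain string nor a look-alike object
// literal is accepted where an Html is required.
class Html {
  private constructor(private readonly markup: string) {}

  // The only route from an arbitrary string to Html: escape it.
  static escape(text: string): Html {
    return new Html(
      text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;")
    );
  }

  // Markup is built from a fixed set of tags wrapped around existing Html.
  static element(tag: "p" | "em" | "li", child: Html): Html {
    return new Html(`<${tag}>${child.markup}</${tag}>`);
  }

  render(): string {
    return this.markup;
  }
}

// Returning `"<p>Hello " + visitor + "</p>"` here would not compile, since a
// string is not an Html; the easiest code that compiles is the escaped one.
function greeting(visitor: string): Html {
  return Html.element("p", Html.escape("Hello " + visitor));
}

// Sequential coupling: a ConnectedDB can only come from dbConnect, so there is
// no way to call dbQuery without having connected first.
class ConnectedDB {
  private constructor(private readonly rows: readonly string[]) {}

  // Stand-in for opening a real connection: it just wraps some canned rows.
  static dbConnect(rows: readonly string[]): ConnectedDB {
    return new ConnectedDB(rows);
  }

  // Stand-in for a real query: returns the rows starting with `prefix`.
  static dbQuery(db: ConnectedDB, prefix: string): string[] {
    return db.rows.filter((row) => row.indexOf(prefix) === 0);
  }
}

const rendered = greeting("<script>").render();
// rendered is "<p>Hello &lt;script&gt;</p>"

const connection = ConnectedDB.dbConnect(["alice", "bob", "alan"]);
const matches = ConnectedDB.dbQuery(connection, "al");
// matches is ["alice", "alan"]
```

In both cases the misuse is a compile-time error rather than something to
remember, at the cost of a little extra ceremony for the caller.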

Regarding objective checkability, static types are not only objective, but also
*automatically* checkable, since that's what type-checkers are all about. It's
less trivial to know whether a particular *choice* of types is objectively
"good", but there *are* some general criteria, e.g. "invalid states should be
unrepresentable". Relatedly, we can objectively (and *sometimes* automatically)
check whether all cases have been handled, which gives an indication of whether
our types are a good fit for the problem (e.g. if we're passing around JSON data
using strings, there will be lots of boilerplate and/or unhandled cases for what
to do when given a non-JSON string; using a more precise type would reduce
this, indicating that it's a better fit for the problem). Note that these are
general rules, not vague sometimes-"proper"-sometimes-not heuristics.
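
To make "invalid states should be unrepresentable" and case checking a little
more concrete, here is a small TypeScript sketch. The `Fetch` type and the
`summarise` function are invented for illustration; the `never` assignment is
one common way to ask the compiler whether every case has been handled.

```typescript
// A looser design, such as { loading: boolean; error?: string; data?: number[] },
// has room for nonsense like "loading, failed and done all at once".
// This tagged union only has room for the three states that make sense.
type Fetch =
  | { state: "loading" }
  | { state: "failed"; error: string }
  | { state: "done"; data: number[] };

function summarise(f: Fetch): string {
  switch (f.state) {
    case "loading":
      return "still loading";
    case "failed":
      return "failed: " + f.error;
    case "done":
      return f.data.length + " items";
    default: {
      // If a new state is added to Fetch without a case above, `f` no longer
      // has type `never` here and this assignment becomes a compile error.
      const unhandled: never = f;
      return unhandled;
    }
  }
}

const summary = summarise({ state: "done", data: [1, 2, 3] });
// summary is "3 items"
```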

# Automated Testing #

Testing has both overly-conservative and overly-liberal gotchas.
Overly-conservative examples are things like depending too heavily on
implementation details (e.g. checking if one list equals another, when their
order doesn't actually matter) or testing invalid situations (e.g. generating
test data which doesn't satisfy some required invariant). Like static typing,
these will forbid some correct code, making some tasks harder than
necessary. Overly-liberal examples are things like only testing the happy path,
or forgetting some edge case or space of inputs (e.g. negative numbers), or
failing to test the right code (e.g. mocking the thing we want to test). These
allow easy-but-broken code through, which might otherwise be caught. This is a
real problem with automated testing, and one reason why it's not a silver
bullet.
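
The list-equality gotcha can be shown with a few lines of TypeScript. This is
only a sketch: `uniqueTags` is a hypothetical function under test whose result
order is assumed to be unspecified, and the two comparison helpers are made up
for the example.

```typescript
// Hypothetical function under test: returns each tag once, in no promised order.
function uniqueTags(tags: string[]): string[] {
  return tags.filter((tag, i) => tags.indexOf(tag) === i);
}

// Overly conservative check: it also pins down the order of the result.
function sameInOrder(xs: string[], ys: string[]): boolean {
  return xs.length === ys.length && xs.every((x, i) => x === ys[i]);
}

// Order-insensitive check: compare sorted copies, leaving the inputs untouched.
function sameIgnoringOrder(xs: string[], ys: string[]): boolean {
  return sameInOrder(xs.slice().sort(), ys.slice().sort());
}

const tagsFound = uniqueTags(["b", "a", "b"]);
// tagsFound is ["b", "a"]

const brittleCheck = sameInOrder(tagsFound, ["a", "b"]);
// brittleCheck is false, even though the result is acceptable

const robustCheck = sameIgnoringOrder(tagsFound, ["a", "b"]);
// robustCheck is true
```

A test written with the first helper would start failing if `uniqueTags` were
reimplemented to return its tags in a different order, despite still being
correct.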

Automated testing also has misaligned incentives, if the developer of some
feature is the one deciding which tests to write. This is because the path of
least resistance is to have no tests at all. Some measures try to prevent this,
e.g. code coverage and mutation testing, but they have their own
issues. Incentives can be better aligned if some of the tests come from a
separate source, e.g. acceptance tests based on some requirements spec.

There are objective criteria for automated tests, like code coverage and
mutation testing mentioned above. They're not perfect, but seem to work well as
long as they're not being gamed (i.e. when they're treated as an indicator, not
as a goal).
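
To illustrate how mutation testing adds information beyond coverage, here is a
hand-rolled TypeScript sketch. Real tools generate and run mutants
automatically; the `isAdult` function, its mutant and the two "suites" below
are all invented for the example.

```typescript
type AgeCheck = (age: number) => boolean;

// The code under test.
const isAdult: AgeCheck = (age) => age >= 18;

// A hand-written "mutant": the same code with `>=` changed to `>`.
const isAdultMutant: AgeCheck = (age) => age > 18;

// Each suite returns true when all of its checks pass for an implementation.
const weakSuite = (f: AgeCheck): boolean => f(30) && !f(5);
const strongSuite = (f: AgeCheck): boolean => weakSuite(f) && f(18);

// The weak suite runs all of isAdult's code, so coverage looks perfect, yet it
// cannot tell the mutant apart from the original: the mutant "survives".
const survivesWeak = weakSuite(isAdultMutant);
// survivesWeak is true

// Adding the boundary case "kills" the mutant.
const survivesStrong = strongSuite(isAdultMutant);
// survivesStrong is false
```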

# Purity #

This is another example of gotchas being overly-conservative, since we might
want to use components like an internal cache or in-place mutation, which are
impure but might just so happen to be safe. Again, I think it's better to
make the path of least resistance more correct, even if the general way to do
that is to forbid the easy things (which are often wrong).

I think the only real incentive misalignment for purity is efficiency; yet as
Knuth's warning about premature optimisation tells us, this is usually fine to
ignore. It's likely that more pervasive use of pure languages in our stack (from
the kernel up to our scripting) would make things slower overall; yet I think
the safety benefits would outweigh those downsides for many (me included).

The interesting thing about purity is how much it can simplify (automated)
reasoning: sure it might seem inefficient at first glance, but compilation can
perform much more invasive changes on pure code than are possible in the
presence of side-effects. Supercompilation and superoptimisation come to mind;
although there is certainly a cognitive burden when trying to keep track of how
our code will ultimately compile.
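
A tiny TypeScript sketch shows why side-effects block this kind of rewriting.
The `square` and `makeCounter` functions are made up for the example, and the
"optimisation" is done by hand rather than by any particular compiler.

```typescript
// Pure: the result depends only on the argument.
const square = (n: number): number => n * n;

// Impure: each call changes hidden state, so repeated calls give different
// results.
function makeCounter(): () => number {
  let count = 0;
  return () => {
    count += 1;
    return count;
  };
}

// For a pure function, `f(x) + f(x)` can safely be rewritten to `2 * f(x)`.
const pureOriginal = square(3) + square(3);
const pureRewritten = 2 * square(3);
// both are 18

// The same rewrite changes the meaning of impure code.
const tickA = makeCounter();
const impureOriginal = tickA() + tickA();
// impureOriginal is 1 + 2 = 3

const tickB = makeCounter();
const impureRewritten = 2 * tickB();
// impureRewritten is 2 * 1 = 2
```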

Purity is trivially (automatically) checkable, since we simply don't bother
putting mutable things into our language. Haskell has famously struggled with
this, sticking to lazy evaluation (since, after all, anyone wanting strict
semantics could already pick any of the MLs, Schemes, etc.) which seems to
necessitate purity (due to evaluation being forced "back to front"). This "hair
shirt" approach paid off massively with the recognition of the importance of
monads; more recent investigations into algebraic effects have also proved
useful, again precisely because of the need to deal with purity (arrows were
interesting back in the day, but fell out of favour once applicative functors
and profunctors were adopted).
