chriswarbo-net: 90d33fe5372ed09b5f2a4845f067458062fde2e4

     1: ---
     2: title: Small Functions
     3: ---
     4: 
     5: I saw
     6: [Small functions considered harmful](https://medium.com/@cindysridharan/small-functions-considered-harmful-91035d316c29)
     7: on [Hacker News](https://news.ycombinator.com/item?id=14988206) today, and
     8: wanted to comment but it got a little long, so I've posted it here ;)
     9: 
    10: As a lot of other commenters point out, too much of anything is a bad thing (by
    11: definition!).
    12: 
    13: Whilst reading the article, a few things bubbled up in my subconscious (NOTE:
    14: the below "quotes" are paraphrased, but hopefully not misrepresenting the
    15: arguments):
    16: 
    17: > Splitting functionality over many small functions makes it harder to read/see
    18: > what's happening
    19: 
    20: If you find yourself wanting/needing to read a whole bunch of definitions at
    21: once, then *that* is the problem that needs solving; not necessarily the fact
    22: that so many functions are used. This can be partially due to the code (i.e.
    23: when it *really does* require digging down a few levels to understand what's
    24: going on), but it can also be due to the developer or team psychology
    25: (i.e. subconsciously doubting that `doFoo` actually does 'foo').
    26: 
    27: The latter might be related to the difficulty many people have with recursion,
    28: since that also requires the ability to reason *under the assumption that* some
    29: "black box" function call works as expected. For example, `mergesort` is trivial
    30: to understand *under the assumption that* `mergesort(firstHalf)` and
    31: `mergesort(secondHalf)` work as advertised; without that suspension of
    32: disbelief, it's utter voodoo.
    33: 
    34: Note that as a code smell (rather than an overly distrustful dev), this might be
    35: due to *not enough* abstraction, as well as too much or misaligned abstractions
    36: like the article mentioned. For example, I might open a file at random and pick
    37: a random function, say `sendOrder`: if that contains a bunch of
    38: number-twiddling, deletes some fields, rearranges some lists, etc. then I might
    39: doubt its correctness, since I have no idea why those things have anything to do
    40: with sending orders; on the other hand, if that stuff were pulled out into a
    41: `formatForInitech` function then I might find it reasonable that sending an
    42: order requires formatting it for initech, and if I stumbled upon the
    43: `formatForInitech` function at random then I might find it reasonable that such
    44: formatting requires deleting fields, twiddling numbers, etc. Note that this
    45: could also be achieved by adding a comment, but I don't think that changes the
    46: argument: if I see `// Required to comply with Initech systems` followed by some
    47: gobbledegook, I should (be able to) have confidence that the code is doing what
    48: the comment says (and no more).
    49: 
    50: This confidence in randomy-chosen functions isn't just a nice-to-have property
    51: of a codebase either: when we're debugging we *know* there's a problem
    52: somewhere, and it saves time and mental capacity if we can skim over unrelated
    53: things when doing a deep dive into the problematic area.
    54: 
    55: Another problem with *lack* of abstraction, which can force us to dig into
    56: function definitions, is when our API is too fine-grained: it allows things
    57: which don't make sense in the domain, and hence places a burden on us to avoid
    58: such incoherent states. If we try sticking to the "language keyword" level, it's
    59: very easy to end up in such situations. For example, a common (anti?) pattern is
    60: an "iterator" object, with methods like `hasNext` and `next`. This lets us use
    61: our language's built-in `for` loops, but it couples control flow to
    62: otherwise-meaningless state (the "cursor position" of our iterator) whose leaks
    63: can cause problems like composition-breaking interference (if some inner call
    64: tries to loop over the same iterator). Much better to provide small functions
    65: like `map` and `filter` (or, even better, domain-specific alternatives), which
    66: are "unfamiliar" compared to keywords but which avoid inconsistent states and
    67: make sense in the domain.
    68: 
    69: If functions are "invisibly" coupled to each other, e.g. `doBar(...)` assumes
    70: that `doFoo(...)` has been called beforehand, then I'd say these may
    71: legitimately be "too small"; specifically, we might say they aren't "doing one
    72: thing", they're actually doing "half a thing": again, the API is allowing too
    73: many things (i.e. "barring in an unfooed state"). A common example is
    74: `dbQuery(...)` requiring `dbConnect(...)` to be run first.
    75: 
    76: Inlining isn't the only, or necessarily best, solution though. If a "fooed
    77: state" makes sense in the problem domain, e.g. "a connected database", then
    78: another solution is to have `doFoo` *return* the relevant state and have `doBar`
    79: consume it, so we get `doBar(..., doFoo(...))`. This is especially useful if we
    80: might want to re-use the same result (e.g. the same DB connection) many
    81: times. We can introduce nice types to ensure correct usage too,
    82: e.g.
    83: 
    84:     dbConnect : DbCredentials -> Database
    85: 
    86:     dbQuery : Database -> Sql -> Table
    87: 
    88: This way there's no chance to attempt a query without having connected.
    89: 
    90: Alternatively, if the "intermediate state" doesn't make sense in our domain and
    91: is only an implementation detail, then another solution is to have a `withFoo`
    92: function, which takes a function as argument and runs it in the correct state,
    93: e.g.
    94: 
    95:     withFoo = function(f) { doFoo(); return f(); };
    96: 
    97: This came to mind when I saw `renderPageWithSetupAndTeardown`: I agree that it
    98: seems rather pungent. One possible alternative would be to have the `setup` and
    99: `teardown` functions be (optional?) fields of a `page`, which the regular
   100: `renderPage` function would call if found. Alternatively, we could pull out a
   101: function:
   102: 
   103:     withSetupAndTeardown = function(setup, teardown, f) {
   104:                              result = f(setup());
   105:                              teardown(result);
   106:                              return result;
   107:                            };
   108: 
   109: This can be used with anything, including the regular `renderPage` function
   110: (this is analogous to the widely-used `bracket` function in Haskell, for
   111: example).
   112: 
   113: I think that this specific `renderPageWithSetupAndTeardown` example is actually
   114: quite weak, since it smells to me of coming from a language without first-class
   115: functions, which is going to make any proposed solutions overly convoluted
   116: (since "setup" and "teardown" are inherently function-like concepts), and hence
   117: there are many ways to improve it in languages *with* first-class functions
   118: (like Ruby, or whatever).
Generated by git2html.