chriswarbo-net: fe4a57bc398a6073b316e5debfe1583d1ad5a895

---
title: Deleting files
---

While trying to free up space on my NixOS installation, I ran into the following
situation.

There is a directory containing `n` files, which we can list in alphabetical
order. Some files are in use, so can't be deleted, but we don't know which ones;
we want to delete all of the files which aren't in use.

We can delete any set of files in a single operation, but each operation carries
a large constant cost, regardless of how many files it names. If any file in the
chosen set is in use, none will be deleted.

What strategy will delete all of the required files using the fewest delete
operations?

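To make the rules concrete, here is a minimal Python sketch of the delete
operation as described above. The `Directory` class, its `delete` method and the
`calls` counter are hypothetical names invented for illustration; they model the
all-or-nothing behaviour and are not the interface of any real tool.

```python
class Directory:
    """Toy model of the directory (all names here are illustrative assumptions)."""

    def __init__(self, files, in_use):
        self.files = set(files)    # files currently present
        self.in_use = set(in_use)  # unknown to whoever is choosing batches
        self.calls = 0             # number of delete operations attempted

    def delete(self, batch):
        """Try to delete `batch` as one operation; return True on success."""
        self.calls += 1
        if any(f in self.in_use for f in batch):
            return False           # one in-use file blocks the whole batch
        self.files -= set(batch)
        return True


d = Directory(["a", "b", "c", "d"], in_use={"c"})
print(d.delete(["a", "b", "c"]))  # False: "c" is in use, nothing is removed
print(d.delete(["a", "b"]))       # True: both files are removed
print(sorted(d.files), d.calls)   # ['c', 'd'] 2
```
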
# Trivial Solutions #

If we try to delete all of the files at once, we use the fewest possible calls
(just one); but we fail to delete *any* files if one or more are in use, so on
its own this isn't actually a solution.

If we delete each file individually, we're guaranteed to remove all the required
files, but this makes no use of batching: it always takes exactly `n` delete
operations, which is O(n).

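The one-at-a-time strategy can be sketched in a few lines of Python. The
`make_deleter` helper is a hypothetical stand-in for the real delete operation
(it knows which files are in use, which we would not), and it counts calls so
the cost is visible.

```python
def make_deleter(in_use):
    """Hypothetical all-or-nothing delete operation that counts its calls."""
    stats = {"calls": 0, "deleted": set()}

    def delete(batch):
        stats["calls"] += 1
        if any(f in in_use for f in batch):
            return False               # the whole batch is refused
        stats["deleted"].update(batch)
        return True

    return delete, stats


def delete_one_by_one(files, delete):
    """Attempt each file separately, so a failure only affects that file."""
    for f in files:
        delete([f])


files = ["a", "b", "c", "d", "e"]
delete, stats = make_deleter(in_use={"b", "d"})
delete_one_by_one(files, delete)
print(sorted(stats["deleted"]), stats["calls"])  # ['a', 'c', 'e'] 5
```
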
# Smarter approaches #

One approach which leaps out to a programmer is divide and conquer: we try to
delete the whole lot, and see if that works. If not, we split our list of files
into two halves and try again with each half, stopping when a batch is deleted
successfully or when a single file fails (which tells us it is in use). This is
elegant and recursive.

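A minimal Python sketch of this halving strategy follows. The `delete_all`
function and the `make_deleter` stand-in for the real delete operation are names
assumed for illustration; `delete_all` returns how many delete operations it
used.

```python
def make_deleter(in_use):
    """Hypothetical all-or-nothing delete operation (illustrative only)."""
    deleted = set()

    def delete(batch):
        if any(f in in_use for f in batch):
            return False
        deleted.update(batch)
        return True

    return delete, deleted


def delete_all(files, delete):
    """Delete every deletable file in `files`; return the number of calls made."""
    if not files:
        return 0
    if delete(files):
        return 1                 # the whole batch went in one call
    if len(files) == 1:
        return 1                 # a lone file that fails must be in use
    mid = len(files) // 2
    return 1 + delete_all(files[:mid], delete) + delete_all(files[mid:], delete)


files = list("abcdefghijklmnop")          # 16 files, already sorted
delete, deleted = make_deleter(in_use={"c"})
calls = delete_all(files, delete)
print(calls, len(deleted))                # 9 15
```
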
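In the worst case, every other file is in use: no grouping of neighbouring files
will work, so we end up attempting each one individually, which takes `n` calls
as mentioned above, plus one failed call for every group we split along the way.
Halving `n` files down to single files creates `n - 1` such groups, spread over
O(log(n)) levels of recursion, so the total is `2n - 1` calls. That is still
O(n) delete operations, roughly twice the cost of deleting one by one; a worst
case of O(n*log(n)) would only arise if each call's cost grew with the number of
files it names, rather than being constant.
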
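The worst-case call count can be checked with a small simulation. This sketch
only counts the attempts the halving strategy would make; `count_calls` is a
hypothetical helper, and it assumes the list is always split at its midpoint.

```python
def count_calls(files, in_use):
    """Number of delete attempts made by the halving strategy (simulation only)."""
    if not files:
        return 0
    blocked = any(f in in_use for f in files)
    if not blocked or len(files) == 1:
        return 1                     # success, or a single in-use file
    mid = len(files) // 2
    return 1 + count_calls(files[:mid], in_use) + count_calls(files[mid:], in_use)


for n in (2, 8, 64, 1000):
    files = list(range(n))
    in_use = set(range(0, n, 2))     # every other file is in use
    print(n, count_calls(files, in_use), 2 * n - 1)
# 2 3 3
# 8 15 15
# 64 127 127
# 1000 1999 1999
```
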
The best case is O(1): if none of the files are in use, our first call deletes
the lot.

I think this is the best approach, but would love to know if there's something
better!