As an avid TeX user, I like to have a full installation of TeX Live, the standard Unix distribution, around. New TeX Live releases come out every year, but it’s often useful to keep old releases around to ensure you can build a paper you wrote three years ago, or to help others who run a non-recent version of TeX Live.
Unfortunately, TeX Live installations are getting quite big; see here the sizes of the versions I have:
# du --apparent-size -shc /opt/texlive/20*
4.9G /opt/texlive/2017
5.5G /opt/texlive/2018
6.2G /opt/texlive/2019
6.2G /opt/texlive/2020
Note that on common file systems, these require even more disk space because they consist of many rather small text files, each of which is rounded up to the block size, e.g. here on an ext4 file system:
6.6G /opt/texlive/2020
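To see the rounding effect on a small scale, here is a minimal sketch that prints the apparent size and the allocated size of a directory side by side. It assumes GNU du from coreutils; `size_report` and the throwaway demo directory are names made up for this illustration, and the exact numbers depend on the file system.

```shell
# Compare apparent size (sum of file lengths) with allocated size
# (blocks actually used) for a directory tree.
# Assumes GNU du; size_report is a hypothetical helper name.
size_report() {
    dir=$1
    apparent=$(du -s --apparent-size --block-size=1 "$dir" | cut -f1)
    allocated=$(du -s --block-size=1 "$dir" | cut -f1)
    echo "apparent=$apparent allocated=$allocated"
}

# Demo: 100 files of 10 bytes each in a temporary directory
demo=$(mktemp -d)
i=1
while [ "$i" -le 100 ]; do
    printf '0123456789' > "$demo/f$i"
    i=$((i + 1))
done
size_report "$demo"
rm -rf "$demo"
```

On a file system with 4 KiB blocks and no inline storage of tiny files, the allocated figure should come out far larger than the apparent one.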
Luckily, the machine above uses ZFS, which has built-in LZ4 compression that does a pretty good job on TeX Live:
4.2G /opt/texlive/2017
4.7G /opt/texlive/2018
5.3G /opt/texlive/2019
5.3G /opt/texlive/2020
20G total
Yesterday, I realized that many of these files don’t actually change between releases, since there are many packages (especially fonts) that rarely get updated. Let’s use jdupes to find out how many:
# jdupes -r -m /opt/texlive/20*
Scanning: 729045 files, 53145 items (in 4 specified)
467364 duplicate files (in 208444 sets), occupying 13469 MB
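For intuition about what is being counted here, the same idea can be roughly sketched with standard tools: hash every file and group the paths that share a hash. This is a simplification and not how jdupes works internally; it assumes GNU findutils and coreutils, and `list_dupes` is a hypothetical helper name.

```shell
# Print groups of files with identical SHA-256 hashes, separated by blank lines.
# Assumes GNU find, xargs, sha256sum and uniq; list_dupes is a made-up name.
list_dupes() {
    find "$@" -type f -size +0 -print0 |   # regular, non-empty files only
        xargs -0 -r sha256sum |            # "hash  path" per file
        sort |                             # bring equal hashes together
        uniq -w64 --all-repeated=separate  # keep only repeated hashes (64 hex chars)
}

# Usage (slow on a full TeX Live tree):
#   list_dupes /opt/texlive/20*
```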
13GB of duplicates, that is very impressive! After a short verification that tlmgr update will break hardlinks (it does, due to use of tar), I decided I can safely hardlink all the same files together:
# jdupes -r -L /opt/texlive/20*
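The kind of check mentioned above can be sketched in isolation: hardlink a file into two trees, unpack a newer copy over one of them with tar, and see whether both names still point to the same inode. This does not reproduce tlmgr's actual invocation; the directory names, `pkg.sty`, and the `tar_breaks_link` helper are made up for the demonstration, and it assumes a tar that replaces existing files on extraction (the default in GNU tar).

```shell
# Does unpacking with tar replace a hardlinked file, or write through the link?
# Prints "broken" if the two names no longer share an inode, "intact" otherwise.
# tar_breaks_link and all paths are hypothetical, for illustration only.
tar_breaks_link() {
    work=$(mktemp -d)
    mkdir "$work/2019" "$work/2020" "$work/update"

    echo 'old contents' > "$work/2019/pkg.sty"
    ln "$work/2019/pkg.sty" "$work/2020/pkg.sty"   # one inode, two names

    # Build an "update" archive with a new version of the file
    echo 'new contents' > "$work/update/pkg.sty"
    tar -C "$work/update" -cf "$work/update.tar" pkg.sty

    # Unpack it over the 2020 tree
    tar -C "$work/2020" -xf "$work/update.tar"

    if [ "$work/2019/pkg.sty" -ef "$work/2020/pkg.sty" ]; then
        echo intact    # same inode: the 2019 copy was modified too
    else
        echo broken    # different inodes: the 2019 copy is untouched
    fi
    rm -rf "$work"
}

tar_breaks_link
```

A "broken" result is the safe one: an update to one release then cannot silently change the files of another.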
The final result (scanning in reverse order, so you see the overhead of old releases):
# du -shc /opt/texlive/20{20..17}
5.2G /opt/texlive/2020
672M /opt/texlive/2019
1.5G /opt/texlive/2018
1.1G /opt/texlive/2017
8.4G total
So you can keep TeX Live 2019 around at the cost of 12% of its original size, and even older versions for roughly a quarter to a third of theirs.
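The reverse ordering in the du call matters because du counts each hardlinked file only once, charging it to the first directory in which it is seen. A small sketch of that effect, assuming GNU du; `du_order_demo` and the `old`/`new` directories are made up for the illustration, and `--apparent-size` is used only to keep the numbers independent of the file system.

```shell
# Show that du attributes a hardlinked file to whichever directory comes first.
# du_order_demo and the directory names are hypothetical.
du_order_demo() {
    work=$(mktemp -d)
    mkdir "$work/old" "$work/new"

    # A 1 MiB file shared between both directories via a hardlink
    dd if=/dev/zero of="$work/new/shared" bs=1024 count=1024 2>/dev/null
    ln "$work/new/shared" "$work/old/shared"

    # Sizes in KiB; the shared file is charged to the first argument only
    (cd "$work" && du -sk --apparent-size new old)
    (cd "$work" && du -sk --apparent-size old new)

    rm -rf "$work"
}

du_order_demo
```

In each pair of output lines, the directory listed first carries the megabyte and the second shows only its own overhead.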
NP: Pearl Jam—Quick Escape