leah blogs: April 2020

26apr2020 · Brute forthing minimal programs for stack arrangements

Recently I have been nerdsniped into thinking about optimal stack programs, and one part of this are optimal stack arrangements, that is, words (i.e. procedures) that rearrange the stack but do not look into the contents. Borrowing terms of functional programming parlor, we can call these “words for free”, as their behavior is uniquely determined by their stack effect, which we write as ( before -- after ) as is customary in Forth. For example ( a -- aa ) is dup, which duplicates top of stack, while ( ab -- ba ) is swap, which exchanges the two topmost elements of stack.

In most common Forths, the stack arrangement primitives are: drop, dup, swap. Together with the words to handle the return stack (>r and r>), every stack arrangement can be performed, as we’ll see below. I don’t want to explain the semantics of the return stack, for now we can just think of >r moving an element from the stack to the return stack, and r> from the return stack to the stack. Additionally, at the end of the word, the return stack must be in the same state at as the beginning of the word.

For reasons of performance, of course many Forths also implement more complex words as primitives.

I wondered if a Forth with fewer than these 5 words can still be complete, and a bit of research turned up a paper by Mikael Patel from 1990. It’s written in Swedish, but we can figure out the essentials from page 4:

variable t
: t! ( x -- ) t ! ;
: t@ ( -- x ) t @ ;

If you don’t speak Forth, that’s just a single global variable and two words to set (t!) and read it (t@). The paper then shows how to define drop, dup and swap from these. (Or see below.)

We shall call a Forth with t!, t@, >r, and r> a Minimal Forth.

So, I wondered how to get the optimal definitions for many stack words, and while I wished for a smarter approach I didn’t find anything that can deal with the infinite stack well (of course, you can limit it, but also implementing a stack machine in a way a SAT/SMT solver likes seems to be inconvenient), I just decided I can brute force short forth programs.

So I wrote femtoforth, which implements these minimal subsets in Ruby:

def femtoforth(input, code)
  s = input.dup
  r = []
  t = nil
  code.each { |insn|
    case insn
    when :"t@";    s << t
    when :"t!";    return :oos  if s.empty?;      t = s.pop
    when :">r";    return :oos  if s.empty?;      r << s.pop
    when :"r>";    return :oor  if r.empty?;      s << r.pop
    when :"dup";   return :oos  if s.empty?;      s << s.last
    when :"swap";  return :oos  if s.size < 2;    s[-1],s[-2] = s[-2],s[-1]
    when "drop";   return :oos  if s.empty?;      s.pop
    else           raise "Invalid insn: #{insn}"
    end
  }

  return :error  if !r.empty? or s.include? nil
  s
end

And then we can happily brute force all programs of length l:

stdops.repeated_permutation(l).each { |code|
  if femtoforth(input, code) == output
    return code
  end
}

Turns out that with recent hardware, programs up to size 16 are tractable in a few hours without a particularily fast implementation.

These are the results I found for the standard words, and some custom ones I think can be useful:

Word Effect Standard Forth Minimal Forth
DROP ( a – ) drop t!
DUP ( a – aa ) dup t! t@ t@
SWAP ( ab – ba ) swap t! >r t@ r>
NIP ( ab – b ) swap drop >r t! r>
OVER ( ab – aba ) >r dup r> swap >r t! t@ r> t@
TUCK ( ab – bab ) dup >r swap r> t! >r t@ r> t@
ROT ( abc – bca ) >r swap r> swap >r >r t! r> r> t@
2SWAP ( abcd – cdab ) >r swap >r swap r> r> swap >r swap r> >r t! >r >r t@ r> r> r> t! >r >r t@ r> r>
2DUP ( ab – abab ) >r dup r> dup >r swap r> t! t@ t@ >r >r t! t@ r> t@ r>
2OVER ( abcd – abcdab ) n/a n/a
2DROP ( ab – ) drop drop t! t!
-ROT ( abc – cab ) swap >r swap r> t! >r >r t@ r> r>
3REV ( abc – cba ) swap >r swap r> swap t! >r t@ >r t! r> r> t@
4REV ( abcd – dcba ) >r swap r> swap >r swap >r swap r> swap r> swap t! >r t@ >r t! r> r> t@ >r >r >r t! r> r> r> t@
THIRD ( abc – abca ) >r >r dup r> swap r> swap >r >r t! t@ r> r> t@
FOURTH ( abcd – abcda ) >r >r >r dup r> swap r> swap r> swap >r >r >r t! t@ r> r> r> t@
3DUP ( abc – abcabc ) dup >r swap dup >r swap >r >r dup r> swap r> swap r> r> n/a

The words with n/a are not possible to define in up to 16 words.

We know that 2OVER is >r >r 2dup r> r> 2swap, so by mere inlining that we get 21 primitive words in Standard Forth, and 28 primitive words in Minimal Forth. A definition for 3DUP is third third third, so 3DUP is at most 21 words in Minimal Forth, which is shorter than a translation of Standard Forth 3DUP to Minimal Forth.

Why are both sets of words complete in the sense that any stack arrangement can be expressed with it? Essentially, we can decompose every stack arrangement into a compsition of multiple words that extract a certain stack value and then keep the rest of the stack below, e.g. to implement the word ( abc -- cab ) we can compose the four words ( abc -- cabc ) then ( abc -- aabc ) then ( abc -- babc ) then ( abc -- ). Of course, this doesn’t yield an efficient program. But all of these words are easy to define by shuffling the stack to the return stack until there is the value you need, and then dup‘ing and swap‘ing it around until you move the whole return stack back to the stack in the end.

Word Effect Standard Forth Minimal Forth
W1 ( abc – cabc ) dup >r swap >r swap r> r> t! >r >r t@ r> r> t@
W2 ( abc – aabc ) >r >r dup r> r> >r >r t! t@ t@ r> r>
W3 ( abc – babc ) >r dup >r swap r> r> >r t! >r t@ r> t@ r>
W4 ( abc – ) drop drop drop t! t! t!

In this contemplation, we focused on the runtime of the words (assuming every primitive takes 1 step). It would also be interesting to optimize for size, by factoring out common (sub)words into their own definitions, but this makes the problem a lot more complex.

NP: Pearl Jam—Take The Long Way

18apr2020 · Shrinking multiple TeX Live installations

As an avid TeX user, I like to have a full installation of TeX Live, the standard Unix distribution, around. New TeX Live releases come every year, but it’s often useful to keep old releases around to ensure you can build a paper you wrote three years ago, or to help others which run a non-recent version of TeX Live.

Unfortunately, TeX Live installations are getting quite big; see here the sizes of the versions I have:

# du --apparent-size -shc /opt/texlive/20*
4.9G    /opt/texlive/2017
5.5G    /opt/texlive/2018
6.2G    /opt/texlive/2019
6.2G    /opt/texlive/2020

Note that on common file systems, these require even more disk space due to consisting of many rather small text files files, which are rounded up to the block size, e.g. here on an ext4 file system:

6.6G    /opt/texlive/2020

Luckily, the machine above uses ZFS which has built-in LZ4 compression that does a pretty good job on TeX Live:

4.2G    /opt/texlive/2017
4.7G    /opt/texlive/2018
5.3G    /opt/texlive/2019
5.3G    /opt/texlive/2020
20G     total

Yesterday, I realized that actually many of these files don’t change, since there are many packages (especially fonts) that rarely get updated. Let’s use jdupes to find out how many:

# jdupes -r -m /opt/texlive/20*
Scanning: 729045 files, 53145 items (in 4 specified)
467364 duplicate files (in 208444 sets), occupying 13469 MB

13GB of duplicates, that is very impressive! After a short verification that tlmgr update will break hardlinks (it does, due to use of tar), I decided I can safely hardlink all the same files together:

# jdupes -r -L /opt/texlive/20*

The final result (scanning in reverse order, so you see the overhead of old releases):

# du -shc /opt/texlive/20{20..17}
5.2G    /opt/texlive/2020
672M    /opt/texlive/2019
1.5G    /opt/texlive/2018
1.1G    /opt/texlive/2017
8.4G    total

So you can keep around TeX Live 2019 at the cost of 12% it’s original size, and even older versions for less than 25%.

NP: Pearl Jam—Quick Escape

Copyright © 2004–2022