leah blogs: April 2020

Recently I have been nerdsniped into thinking about optimal stack programs, and one part of this are optimal stack arrangements, that is, words (i.e. procedures) that rearrange the stack but do not look into the contents. Borrowing terms of functional programming parlor, we can call these “words for free”, as their behavior is uniquely determined by their stack effect, which we write as ( before -- after ) as is customary in Forth. For example ( a -- aa ) is dup, which duplicates top of stack, while ( ab -- ba ) is swap, which exchanges the two topmost elements of stack.

In most common Forths, the stack arrangement primitives are: drop, dup, swap. Together with the words to handle the return stack (>r and r>), every stack arrangement can be performed, as we’ll see below. I don’t want to explain the semantics of the return stack, for now we can just think of >r moving an element from the stack to the return stack, and r> from the return stack to the stack. Additionally, at the end of the word, the return stack must be in the same state at as the beginning of the word.

For reasons of performance, of course many Forths also implement more complex words as primitives.

I wondered if a Forth with fewer than these 5 words can still be complete, and a bit of research turned up a paper by Mikael Patel from 1990. It’s written in Swedish, but we can figure out the essentials from page 4:

variable t
: t! ( x -- ) t ! ;
: t@ ( -- x ) t @ ;

If you don’t speak Forth, that’s just a single global variable and two words to set (t!) and read it (t@). The paper then shows how to define drop, dup and swap from these. (Or see below.)

We shall call a Forth with t!, t@, >r, and r> a Minimal Forth.

So, I wondered how to get the optimal definitions for many stack words, and while I wished for a smarter approach I didn’t find anything that can deal with the infinite stack well (of course, you can limit it, but also implementing a stack machine in a way a SAT/SMT solver likes seems to be inconvenient), I just decided I can brute force short forth programs.

So I wrote femtoforth, which implements these minimal subsets in Ruby:

def femtoforth(input, code)
  s = input.dup
  r = []
  t = nil
  code.each { |insn|
    case insn
    when :"t@";    s << t
    when :"t!";    return :oos  if s.empty?;      t = s.pop
    when :">r";    return :oos  if s.empty?;      r << s.pop
    when :"r>";    return :oor  if r.empty?;      s << r.pop
    when :"dup";   return :oos  if s.empty?;      s << s.last
    when :"swap";  return :oos  if s.size < 2;    s[-1],s[-2] = s[-2],s[-1]
    when "drop";   return :oos  if s.empty?;      s.pop
    else           raise "Invalid insn: #{insn}"
    end
  }

  return :error  if !r.empty? or s.include? nil
  s
end

And then we can happily brute force all programs of length l:

stdops.repeated_permutation(l).each { |code|
  if femtoforth(input, code) == output
    return code
  end
}

Turns out that with recent hardware, programs up to size 16 are tractable in a few hours without a particularily fast implementation.

These are the results I found for the standard words, and some custom ones I think can be useful:

Word	Effect	Standard Forth	Minimal Forth
DROP	( a – )	drop	t!
DUP	( a – aa )	dup	t! t@ t@
SWAP	( ab – ba )	swap	t! >r t@ r>
NIP	( ab – b )	swap drop	>r t! r>
OVER	( ab – aba )	>r dup r> swap	>r t! t@ r> t@
TUCK	( ab – bab )	dup >r swap r>	t! >r t@ r> t@
ROT	( abc – bca )	>r swap r> swap	>r >r t! r> r> t@
2SWAP	( abcd – cdab )	>r swap >r swap r> r> swap >r swap r>	>r t! >r >r t@ r> r> r> t! >r >r t@ r> r>
2DUP	( ab – abab )	>r dup r> dup >r swap r>	t! t@ t@ >r >r t! t@ r> t@ r>
2OVER	( abcd – abcdab )	n/a	n/a
2DROP	( ab – )	drop drop	t! t!
-ROT	( abc – cab )	swap >r swap r>	t! >r >r t@ r> r>
3REV	( abc – cba )	swap >r swap r> swap	t! >r t@ >r t! r> r> t@
4REV	( abcd – dcba )	>r swap r> swap >r swap >r swap r> swap r> swap	t! >r t@ >r t! r> r> t@ >r >r >r t! r> r> r> t@
THIRD	( abc – abca )	>r >r dup r> swap r> swap	>r >r t! t@ r> r> t@
FOURTH	( abcd – abcda )	>r >r >r dup r> swap r> swap r> swap	>r >r >r t! t@ r> r> r> t@
3DUP	( abc – abcabc )	dup >r swap dup >r swap >r >r dup r> swap r> swap r> r>	n/a

The words with n/a are not possible to define in up to 16 words.

We know that 2OVER is >r >r 2dup r> r> 2swap, so by mere inlining that we get 21 primitive words in Standard Forth, and 28 primitive words in Minimal Forth. A definition for 3DUP is third third third, so 3DUP is at most 21 words in Minimal Forth, which is shorter than a translation of Standard Forth 3DUP to Minimal Forth.

Why are both sets of words complete in the sense that any stack arrangement can be expressed with it? Essentially, we can decompose every stack arrangement into a compsition of multiple words that extract a certain stack value and then keep the rest of the stack below, e.g. to implement the word ( abc -- cab ) we can compose the four words ( abc -- cabc ) then ( abc -- aabc ) then ( abc -- babc ) then ( abc -- ). Of course, this doesn’t yield an efficient program. But all of these words are easy to define by shuffling the stack to the return stack until there is the value you need, and then dup‘ing and swap‘ing it around until you move the whole return stack back to the stack in the end.

Word	Effect	Standard Forth	Minimal Forth
W1	( abc – cabc )	dup >r swap >r swap r> r>	t! >r >r t@ r> r> t@
W2	( abc – aabc )	>r >r dup r> r>	>r >r t! t@ t@ r> r>
W3	( abc – babc )	>r dup >r swap r> r>	>r t! >r t@ r> t@ r>
W4	( abc – )	drop drop drop	t! t! t!

In this contemplation, we focused on the runtime of the words (assuming every primitive takes 1 step). It would also be interesting to optimize for size, by factoring out common (sub)words into their own definitions, but this makes the problem a lot more complex.

NP: Pearl Jam—Take The Long Way

leah blogs: April 2020

26apr2020 · Brute forthing minimal programs for stack arrangements

18apr2020 · Shrinking multiple TeX Live installations