leah blogs: October 2022

11oct2022 · 50 blank pages, or: black-box debugging of PDF rendering in printers

I was happily typesetting a book with ConTeXt and its new engine LMTX when I decided to print a few pages to see if I got the sizes right. Since I despise printer daemons, I directly print stuff over the network using a little script.

To my surprise, I just got a blank page out.

As the title of this post suggests, it won’t be the only blank page. This is the story of me debugging PDF generation in LMTX.

Giving it a closer look, the page wasn’t entirely blank. I could see the cutmarks I added to indicate the page size. During creation, I used MuPDF as a previewer—it’s lightweight and stays out of my way. But apparently the PDF was broken, so I tried a few other previewers. Evince and Firefox pdf.js rendered it fine. I looked at Okular and xpdf, and it came out nicely as well. At a later point, I even installed ancient Acrobat 7(!) and it would display as intended.

I tried the other printer in our university office. Another blank page.

Two different vendors, yet they both fail to print a simple PDF?

I secretly hoped some previewer would also render a blank page. Then I could just compile its source code on my machine and throw all kinds of debugging tools at it… but they all worked.

I tried converting the PDF to PDF with Ghostscript, and then it printed fine. So the PDF couldn’t be too wrong. But I wanted to fix it directly.
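For reference, a minimal sketch of such a PDF-to-PDF round trip. The post does not say which options were used, so the invocation below (the `pdfwrite` device with `-o`) is an assumption, and the function names are made up for illustration.

```python
import shutil
import subprocess


def ghostscript_rewrite_cmd(src, dst):
    """Build a Ghostscript command line that re-writes src as a new PDF at dst."""
    # -o sets the output file and implies -dBATCH -dNOPAUSE;
    # pdfwrite is Ghostscript's PDF output device.
    return ["gs", "-o", dst, "-sDEVICE=pdfwrite", src]


def rewrite_pdf(src, dst):
    """Run the conversion; raises if gs is missing or exits with an error."""
    if shutil.which("gs") is None:
        raise FileNotFoundError("Ghostscript (gs) not found on PATH")
    subprocess.run(ghostscript_rewrite_cmd(src, dst), check=True)


print(" ".join(ghostscript_rewrite_cmd("book.pdf", "book-gs.pdf")))
```

Ghostscript re-interprets the whole file and writes fresh font streams, which is why the output can differ from the input in exactly the places that matter here.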

So how do you debug a PDF that gets printed wrongly but otherwise seems to be fine?

My first intuition was to make a PDF that works, and then look at the differences. So I created a simple document and ran it through the previous ConTeXt version, called MKIV. This version uses LuaTeX as an engine. It printed fine. (To nobody’s surprise: I would have discovered this years ago otherwise.)

I put both PDFs through various PDF validators, but they all said both were good.

Time to dig deeper. I disabled PDF compression and looked at both PDF files in a text editor. Sure, there were a lot of little differences. But fundamentally? Pretty much the same.

… I looked at the first printout again. Not only were the cutmarks printed, but the tiny page numbers inside them were too! I checked the PDF with pdffonts and saw that it uses two fonts:

NSOLKP+TeXGyreSchola-Regular         CID Type 0C       Identity-H       yes yes yes      4  0
FFMARX+DejaVuSansMono                CID TrueType      Identity-H       yes yes yes      5  0

The page numbers use the DejaVu Sans font, which is supplied in TrueType format. I changed the main font of my test document to DejaVu Sans, and voilà: it printed fine. I was very happy about this, as it meant LMTX can generate printable PDF files in principle. But for its default font (Latin Modern Roman) and the font I wanted to print in (TeX Gyre Schola), there apparently was an issue.

I knew the basics of PDF from decades ago when I wrote a PDF generator from scratch. (I never got around to doing more than putting a few characters on a page, though.) Now it was time to learn about the PDF font formats.

Both the MKIV and the LMTX engine use the “CID Type 0C” font format these days, which embeds only the actually used glyphs from an OpenType font into the PDF. I pulled out the CID fonts from the PDF (using mutool extract). While file didn’t recognize the file format, luckily FontForge could open it fine. (As I learned later, FontForge can open the PDF directly and import its fonts.)

I noticed a first difference: while MKIV (and thus LuaTeX) spread out the glyphs over the positions, LMTX nicely arranged the used glyphs starting from code point 1. I had already contacted Hans Hagen, the main developer behind ConTeXt, and we wondered whether starting the glyphs from 31 would help… again it rendered nicely on all previewers, but still printed blank pages.

I had the strong suspicion that the font embedding was the problem. To verify this hypothesis, I manually fiddled the LMTX font into the MKIV document (this was easy because it was smaller, so I just had to add some padding to make the document valid again), adjusted some code points in the PDF, and it would render glyphs on the screen. But it would not print. So now I was fairly sure that the font stream was the culprit, and not some other part of the PDF.

After more research, I found a tool to dump a CID font in a readable format: CFFDump. This small Java program turned out to be essential for tracking down the bug.

It generates a dump that looks like this:

% CFF Dump Output
% File: font-0008.cid


Header (0x00000000):
    major: 1
    minor: 0
    hdrSize: 4
    offSize: 4


Name INDEX (0x00000004):
  count: 1, offSize: 1
    [0]: (CLLXEY+LMRoman12-Regular)


Top DICT INDEX (0x00000021):
  count: 1, offSize: 1
  [0] (0x00000026):
    /ROS << /Registry (Adobe) /Ordering (Identity) /Supplement 0 >>
    /CIDCount 15
    /FamilyName (LMRoman12)  % SID 392
    /FullName (LMRoman12-Regular)  % SID 393
    /Weight (Normal)  % SID 394
    /FontBBox [-422 -280 1394 1127]
    /isFixedPitch false
    /ItalicAngle 0
    /UnderlinePosition -175
    /UnderlineThickness 44
    /CharstringType 2
    /FontMatrix [0.001 0 0 0.001 0 0]
    /StrokeWidth 0
    /CharStrings 257  % offset
    /charset 220  % offset
    /FDArray 1751  % offset
    /FDSelect 249  % offset
    /Private [23 1728]  % [size offset]
    % ----- Following entries are missing, so they get default values: -----
    /PaintType 0  % default
    /CIDFontVersion 0  % default
    /CIDFontRevision 0  % default
    /CIDFontType 0  % default

And it goes on and on, detailing all the things specified in the font.

Inevitably, I had to dig into the internals of CFF fonts, that is Adobe’s Technical Note #5176.
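To give a feel for the format, here is a small sketch that reads the two structures at the top of the dump above: the four-byte header and the Name INDEX. The sample bytes are hand-built from the values in the dump rather than taken from the real font file, and the function names are made up.

```python
import struct


def parse_header(data):
    """Read the 4-byte CFF header: major, minor, hdrSize, offSize."""
    major, minor, hdr_size, off_size = struct.unpack_from("4B", data, 0)
    return {"major": major, "minor": minor, "hdrSize": hdr_size, "offSize": off_size}


def parse_index(data, pos):
    """Parse a CFF INDEX at pos; return (items, offset just past the INDEX)."""
    (count,) = struct.unpack_from(">H", data, pos)
    if count == 0:
        return [], pos + 2  # an empty INDEX is just the 2-byte count
    off_size = data[pos + 2]
    offsets_at = pos + 3
    offsets = [
        int.from_bytes(data[offsets_at + i * off_size:offsets_at + (i + 1) * off_size], "big")
        for i in range(count + 1)
    ]
    # Offsets are 1-based, relative to the byte before the object data.
    base = offsets_at + (count + 1) * off_size - 1
    items = [data[base + offsets[i]:base + offsets[i + 1]] for i in range(count)]
    return items, base + offsets[-1]


# Hand-built sample: header plus a one-entry Name INDEX, as in the dump.
name = b"CLLXEY+LMRoman12-Regular"
sample = bytes([1, 0, 4, 4]) + struct.pack(">HB", 1, 1) + bytes([1, 1 + len(name)]) + name

header = parse_header(sample)
names, next_pos = parse_index(sample, header["hdrSize"])
print(header, names, hex(next_pos))
```

The offset just past the Name INDEX comes out as 0x21, which is where the dump places the Top DICT INDEX.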

I carefully compared the dump of the working MKIV font with the broken LMTX font… and didn’t find substantial differences. Sure, one copied a few more metadata fields, and the other had more font fields set, but mostly to values that were the default anyway. Nothing that seemed to be related to our bug. And also, the various PDF viewers rendered the document fine, so there couldn’t have been a major mistake there.

By now I had learned about the design of LMTX, and luckily I saw that all parts of this font embedding were written in quite straightforward Lua code that I could easily modify, so experiments were easy. Unfortunately, I didn’t have a printer at home so I had to annoy some of my friends to do test prints for me. They printed a lot of blank pages…

But I just couldn’t track down the problem. A reasonable person would have given up ages ago and just fed the PDF through Ghostscript before printing, but I wanted to get to the bottom of the thing; and I also wanted this new TeX engine to produce working documents out of the box.

In my time as a software developer, one thing I learned about debugging is that if a thing takes a long time to debug, it can be for two reasons: either the cause is much simpler than you thought, or it’s much more complicated.

I chose violence. I corrupted the CID font in various ways… the printer would stop working and print an error message instead. Some printers have an internal error log, but before these experiments it was empty.

Perhaps the document wasn’t wrong, but the printer software was? But by now we could reproduce the issue with a bunch of printers—how can they all have the same issue?

After some wrong attempts related to font hinting, I was out of ideas and decided to kill all fields one by one and check if it made any difference.

I deleted the /FontMatrix entry and… suddenly it printed nicely.

Now, the font matrix is a feature of CFF fonts to encode their scaling and shearing factors. It’s a 2x3 matrix that encodes an affine transformation (perhaps you know this from SVG). The details don’t matter, but in practice you only have two values set and they determine the font size relative to the sizes used in the font drawing instructions. By default, the font matrix is [0.001 0 0 0.001 0 0], meaning that 1000 units in the glyph outlines correspond to 1 unit of text space, i.e. 1 PostScript point on paper at a font size of 1pt.
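As a worked example, this sketch applies a six-number matrix in the PostScript order [a b c d tx ty] to a point. The function name and the sample coordinate are made up for illustration.

```python
def apply_matrix(matrix, x, y):
    """Apply a PostScript-style matrix [a b c d tx ty] to the point (x, y)."""
    a, b, c, d, tx, ty = matrix
    return (a * x + c * y + tx, b * x + d * y + ty)


DEFAULT_FONT_MATRIX = (0.001, 0, 0, 0.001, 0, 0)

# A glyph coordinate of (1000, 500) font units lands at about (1, 0.5).
print(apply_matrix(DEFAULT_FONT_MATRIX, 1000, 500))
```

With the default matrix only the two scale factors do anything; the shear terms (b, c) and the translation (tx, ty) are all zero.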

I was happy, but I also was very confused: of all things, why exactly did that fix it? I noticed earlier that the MKIV document didn’t have the font matrix set, but I also looked at the Ghostscript output and there it worked fine. Even more so, LMTX set the font matrix to its default value! It shouldn’t make a difference at all!

Having come this far, I wasn’t satisfied without a real answer. I wondered if LMTX encoded the font matrix the wrong way, but after digging into the spec for that (Technical Note #5177) and double checking, it seemed fine. The working Ghostscript PDF used exactly the same byte sequence to encode the font matrix.
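To illustrate what such a byte sequence looks like, here is a sketch of the CFF DICT operand encoding: real numbers are packed as nibbles after a 0x1e byte, small integers take one byte, and the FontMatrix operator is the two-byte 12 7. The post does not show the actual bytes, so the spelling "0.001" is an assumption; ".001" or "1E-3" are equally valid and yield different bytes.

```python
# Nibble values for the characters of a CFF DICT real number.
NIBBLES = {**{str(d): d for d in range(10)}, ".": 0xA, "-": 0xE}


def encode_real(text):
    """Encode a decimal string such as '0.001' or '1E-3' as a DICT real."""
    nibbles = []
    i = 0
    while i < len(text):
        if text[i] in "eE":
            negative = text[i + 1:i + 2] == "-"
            nibbles.append(0xC if negative else 0xB)  # 'E-' or 'E'
            i += 2 if negative else 1
        else:
            nibbles.append(NIBBLES[text[i]])
            i += 1
    nibbles.append(0xF)              # end-of-number marker
    if len(nibbles) % 2:
        nibbles.append(0xF)          # pad to a whole byte
    packed = bytes(hi << 4 | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
    return b"\x1e" + packed          # 30 (0x1e) introduces a real number


def encode_small_int(value):
    """Encode an integer in -107..107 as a single DICT byte."""
    if not -107 <= value <= 107:
        raise ValueError("only the one-byte integer form is sketched here")
    return bytes([value + 139])


def encode_font_matrix(entries):
    """Encode six operands followed by the FontMatrix operator (12 7)."""
    body = b"".join(
        encode_real(e) if isinstance(e, str) else encode_small_int(e)
        for e in entries
    )
    return body + b"\x0c\x07"


print(encode_font_matrix(["0.001", 0, 0, "0.001", 0, 0]).hex(" "))
```

Reals are passed as strings so that the sketch controls the exact digits being packed, instead of depending on how a float happens to be formatted.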

Staring some more at CFFDump output, I finally noticed what Ghostscript did differently: the CFF had two font matrices defined! CFF allows defining a font matrix in the “Top DICT INDEX” as well as the “Font DICT INDEX”.

And while the “Top DICT INDEX” was the same that we used, [0.001 0 0 0.001 0 0], the one in the “Font DICT INDEX” was [1 0 0 1 0 0], i.e. the identity matrix. I added this matrix to LMTX output, and finally the PDF printed properly.

Still, this was a surprise. Why would explicitly setting the font matrix to its default value change the behavior? It turns out the reason for this is an interaction between both of these default values. Unfortunately, it does not seem to be specified by Adobe. I found a similar bug in Ghostscript that explains the reasonable thing to do:

1) If both Top DICT and Font DICT does _not_ have FontMatrix, then Top DICT = [0.001 0 0 0.001 0 0], Font DICT = [1 0 0 1 0 0].  (Or, Top DICT = (absent), Font DICT = [0.001 0 0 0.001 0 0] then let '/CIDFont defineresource' make Top DICT = [0.001 0 0 0.001 0 0], Font DICT = [1 0 0 1 0 0].)

2) If Top DICT has FontMatrix and Font DICT doesn't, then Top DICT = (supplied matrix), Font DICT = [1 0 0 1 0 0].

3) If Top DICT does not have FontMatrix but Font DICT does, then Top DICT = [1 0 0 1 0 0], Font DICT = (supplied matrix).  (Or, Top DICT = (absent), Font DICT = (supplied matrix) then let '/CIDFont defineresource' make Top DICT = [0.001 0 0 0.001 0 0], Font DICT = (supplied matrix 1000 times larger). I think this is better.)

4) If both Top DICT and Font DICT _does_ have FontMatrix, then Top DICT = (supplied matrix), Font DICT = (supplied matrix).

All previewers seem to have adopted this algorithm. But certain older printers botched step 2. They end up with two font matrices [0.001 0 0 0.001 0 0] that are multiplied together, which ends up printing your document at a thousandth of its size; i.e. you get a blank page. But note that it’s a perfectly valid PDF!
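The four rules and the botched step 2 can be sketched as follows. This follows the first variant of each quoted rule; `resolve_buggy` is only a hypothetical model of the misbehaving printers based on the description above, not anything taken from actual firmware.

```python
DEFAULT_SCALE = (0.001, 0, 0, 0.001, 0, 0)
IDENTITY = (1, 0, 0, 1, 0, 0)


def concat(m1, m2):
    """Multiply two [a b c d tx ty] matrices (m1 applied first, then m2)."""
    a1, b1, c1, d1, x1, y1 = m1
    a2, b2, c2, d2, x2, y2 = m2
    return (
        a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
        x1 * a2 + y1 * c2 + x2, x1 * b2 + y1 * d2 + y2,
    )


def resolve(top, font_dict):
    """Fill in missing matrices; None means the entry is absent."""
    if top is None and font_dict is None:
        return DEFAULT_SCALE, IDENTITY   # rule 1
    if font_dict is None:
        return top, IDENTITY             # rule 2
    if top is None:
        return IDENTITY, font_dict       # rule 3
    return top, font_dict                # rule 4


def resolve_buggy(top, font_dict):
    """Hypothetical printer model: step 2 fills Font DICT with the 0.001 default."""
    if top is not None and font_dict is None:
        return top, DEFAULT_SCALE
    return resolve(top, font_dict)


def effective_scale(top, font_dict):
    """Horizontal scale factor after combining both matrices."""
    return concat(font_dict, top)[0]


# LMTX before the fix: Top DICT matrix present, Font DICT matrix absent.
print(effective_scale(*resolve(DEFAULT_SCALE, None)))
print(effective_scale(*resolve_buggy(DEFAULT_SCALE, None)))
```

In this model, a font with no matrix at all and a font with both matrices written out come out at the same scale under either resolver, which matches the two ways out of the bug.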

We thus had two ways to fix the bug: write no font matrix at all, or write both of them. I was first leaning towards the latter, doing it as Ghostscript does, but we found an issue with FontForge: it will render the fonts internally at 1000x the size and thus consume a lot more memory. Since we did not find a need to use a non-default font matrix, we decided to go with the former: no font matrix at all. After all, it worked fine for LuaTeX all those years, too.

(Why did this issue not affect the TrueType font? It’s embedded in a different format that only has a single scaling factor and has no concept of a font matrix.)

A trial print of the PDF on many printers is ongoing but looks very promising so far, so this fix (essentially, the deletion of one line of code) will ship soon in a ConTeXt snapshot for general availability.

I would like to thank Hans Hagen for not giving up on helping me with this, and all my friends who test-printed pages for me and/or had to hear me talking about nothing else for a week or so.

NP: Rites of Spring—All Through A Life

Copyright © 2004–2022