Render PostScript to PDF

We are about to arrive at the pièce de résistance. Adobe started developing PostScript in 1982 for typesetting, and Apple made it popular by using it as the printer language of the LaserWriter in 1985. PostScript was the Rosetta Stone of desktop publishing, and the question arose whether it could have a wider use as a graphics language for documents in general and for display.

But PostScript was somewhat too powerful. Because it is a Turing-complete language, a computer rendering the graphics can never know in advance when the page will be ready.

Therefore Adobe started work on the PDF format in 1991 and published it in 1993. PDF is based on PostScript, but it has a reduced instruction set, mainly the path operators, and drops the general programming language. This, together with the design choice to embed everything in one document, was a wise decision. PDF is now one of the longest-lived document formats with a public specification, which makes it suitable for long-term archiving. It is also readable on virtually any device, except that browsers do not treat it as a standard image format. Actually, it is the only archiving format for documents I would recommend.

Rendering to PDF is, however, quite a challenge. Although PostScript is a text-based format and most of a PDF is text based too, PDF is technically a binary format. PDF was designed for random access, so that the reader application does not have to read the entire file. This random access needs an index with the byte offset of each object in the file. We will have to handle these offsets. Current PDF also uses compression heavily, but we are not obliged to use it. The PDF 1.0 specification allows all elements to be written in clear text.
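Because the index stores byte offsets, the length of a JavaScript string only equals the file size as long as every character ends up as exactly one byte. Here is a minimal sketch of that check, assuming the file is assembled as a string and later encoded with btoa (which writes one byte per Latin-1 character). The helper name latin1ByteLength is hypothetical and not part of the interpreter.

```javascript
// Hypothetical helper: byte length of a string that will be written as
// Latin-1 (one byte per character), which is what btoa() does.
// Throws if a character would not fit into a single byte.
function latin1ByteLength(text) {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 255) {
      throw new RangeError("character at index " + i + " is not Latin-1");
    }
  }
  return text.length;
}

// Signature line plus a comment line with three characters above 127.
const header = "%PDF-1.1\n%\u00A5\u00B1\u00EB rpn\n";

// Offset of whatever is written right after the header.
const offsetOfFirstObject = latin1ByteLength(header);

// For comparison: the same string takes more bytes when encoded as UTF-8,
// so offsets computed from the string length would be wrong in that case.
const utf8Length = new TextEncoder().encode(header).length;

console.log(offsetOfFirstObject, utf8Length);
```

With a one-character line end the header is 18 bytes long, which matches the offset of object 1 in the sample file shown further down.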

The code that follows is the result of a longer iteration of trial and error. I did read the book (Portable Document Format Reference Manual), but I also opened simple PDF files in a text editor, reverse engineered the code, exported files and checked them first against Preview and later against Acrobat Reader.

The PDF file looks like this:

%PDF-1.1
%•±Î rpn
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj

2 0 obj
<< /Type /Pages /Kids [ 3 0 R ] /Count 1 /MediaBox [0 0 590 330] >>
endobj

3 0 obj
<< /Type /Page /Parent 2 0 R /Resources << /Font << >> >> /Contents 4 0 R >>
endobj

4 0 obj << /Length 229322 >>
stream
10.5468696 100 m 10.5468696 109.2285109 l... 75.45897133999999 l 354.9705725800001 75.45897133999999 l 354.9705725800001 75.45897133999999 l h
0.100 0.100 0.100 rg
F
endstream
endobj

xref
0 5
0000000000 65535 f
0000000018 00000 n
0000000069 00000 n
0000000154 00000 n
0000000251 00000 n
trailer << /Root 1 0 R /Size 5 >>
startxref
229629
%%EOF

The PDF file is a list of objects. Each object starts with "id generation obj" and ends with "endobj".

The id number is ordinal. The generation number would allow creating multiple versions of the same object without overwriting; we will not use it. Values are separated by spaces and newlines. Each object is a dictionary, written between double angle brackets (<< and >>) as a simple list of name keys and values.

Each object has a type:

Object 1 (Catalog) is the main object; it just tells us that the pages list is object 2.
Object 2 (Pages) is the list of pages. We have one page, which is object 3, and the MediaBox indicates the size of the page.
Object 3 (Page) may have resources, here an empty dictionary of fonts, and points to the content of the page, which is object 4.
Object 4 is the content stream. Apart from the small dictionary holding its length, it is no dictionary, just reduced PostScript code. The operators are abbreviated:

m move
l lineto
c curveto
h closepath
rg setfillcolor
RG setstrokecolor
w setlinewidth
F fill
S stroke
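To see the abbreviated operators at work, here is a minimal sketch that writes a content stream for a filled and outlined triangle by hand. The function name triangleStream and the fixed colors are my assumptions for illustration. The sketch sets color and line width before the path. The device further down emits them after the path, which Preview and Acrobat Reader accepted as well.

```javascript
// Build a content stream that fills a triangle in red and then outlines it
// in black with a line width of 2.
// Coordinates are in PDF user space: origin at the bottom left of the page.
function triangleStream(x1, y1, x2, y2, x3, y3) {
  const path = [
    x1 + " " + y1 + " m", // move to the first corner
    x2 + " " + y2 + " l", // line to the second corner
    x3 + " " + y3 + " l", // line to the third corner
    "h"                   // close the path back to the first corner
  ].join(" ");
  return [
    "1 0 0 rg", // fill color: red
    path,
    "F",        // fill the path
    "0 0 0 RG", // stroke color: black
    "2 w",      // line width
    path,       // a painted path is consumed, so it is written again
    "S"         // stroke the path
  ].join("\n");
}

const triangle = triangleStream(10, 10, 100, 10, 55, 80);
console.log(triangle);
```

The path has to be repeated because a painting operator such as F or S ends the current path.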

Finally, there is a cross reference table. It has 5 entries, starting at 0. Each entry holds the byte offset, the generation and the status of an object. Object 0 has the maximum generation (65535) and is free (f); the others have generation 0 and are in use (n). At the very end of the file, startxref gives the offset of the xref table.
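The layout of the table can be sketched in a few lines. This is only an illustration under my assumptions: one xref section, a one-character line end, and a hypothetical function name buildXref. Every entry is exactly 20 bytes long, which is why a space precedes the line end.

```javascript
// Build the cross reference table for objects 1..n from their byte offsets.
// Each entry is 20 bytes: 10 digit offset, space, 5 digit generation,
// space, status letter, space, and a one character end of line.
function buildXref(offsets) {
  const lines = [
    "xref",
    "0 " + (offsets.length + 1), // first object id and number of entries
    "0000000000 65535 f "        // object 0: free, maximum generation
  ];
  for (const offset of offsets) {
    lines.push(String(offset).padStart(10, "0") + " 00000 n ");
  }
  return lines.join("\n") + "\n";
}

// The four offsets of the sample file above.
const table = buildXref([18, 69, 154, 251]);
console.log(table);
```

String.prototype.padStart does the zero padding here; the device further down uses Intl.NumberFormat for the same purpose.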

We first build an rpnPDFDevice that just creates an empty page, to check whether it can be opened by Preview, and add a link to postScriptEditor. Then we add fill and stroke.
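Before opening a file in Preview, the offsets can also be checked in code. What follows is a rough sketch, not part of the interpreter: the hypothetical checkXref follows startxref to the table and verifies that every entry in use points at the matching object header. It assumes a single xref section, "\n" line ends and one byte per character.

```javascript
// Hypothetical self check for a PDF assembled as a string.
// Returns a list of error messages; an empty list means all offsets match.
function checkXref(file) {
  const m = /startxref\s+(\d+)\s+%%EOF\s*$/.exec(file);
  if (!m) return ["startxref not found"];
  const start = Number(m[1]);
  if (!file.startsWith("xref", start)) {
    return ["startxref does not point at the xref table"];
  }
  const lines = file.slice(start).split("\n");
  const count = Number(lines[1].split(" ")[1]); // "0 5" -> 5 entries
  const errors = [];
  for (let id = 0; id < count; id++) {
    const entry = lines[2 + id];
    if (entry === undefined) {
      errors.push("entry for object " + id + " is missing");
      break;
    }
    const [offset, generation, status] = entry.trim().split(" ");
    if (status !== "n") continue; // free entries have no object
    const expected = id + " " + Number(generation) + " obj";
    if (!file.startsWith(expected, Number(offset))) {
      errors.push("object " + id + " not found at offset " + Number(offset));
    }
  }
  return errors;
}

// Tiny hand-made file with a single object to try the check.
const head = "%PDF-1.1\n";
const body = head + "1 0 obj\n<< /Type /Catalog >>\nendobj\n";
const sample = body +
  "xref\n0 2\n0000000000 65535 f \n" +
  String(head.length).padStart(10, "0") + " 00000 n \n" +
  "trailer << /Root 1 0 R /Size 2 >>\nstartxref\n" + body.length + "\n%%EOF\n";

console.log(checkXref(sample));
```

A real reader application checks far more than this, so a clean result only says that the index is consistent, not that the file is valid.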

Javascript Editor

rpnPDFDevice = class {
 constructor(urlnode, width, height) { 
 this.urlnode = urlnode;
 this.initgraphics(width, height, 1);
 this.catalog = {};
 this.pages = {};
 this.elements = [];
 } 
 initgraphics(width, height, oversampling) { 
 if (Number.isFinite(width)) this.width = width; 
 if (Number.isFinite(height)) this.height = height; 
 };
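 numberFormat(z) {
 // fixed notation with 3 decimals; grouping is switched off because PDF cannot parse "1,000.000"
 return Intl.NumberFormat('en-US', { minimumFractionDigits: 3, maximumFractionDigits: 3, useGrouping: false }).format(z);
 }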
 objectDict(obj) {
 const elements = [];
 for (var k in obj) {
 const v = obj[k];
 if (typeof v == "object") {
 elements.push("/" + k + " " + this.objectDict(v));
 } else {
 elements.push("/" + k + " " + v)
 } 
 }
 return "<< " + elements.join(" ") + " >> ";
 }
 getPath(context) { 
 const p = [];
 for (var subpath of context.graphics.path) {
 if (!subpath.length) continue;
 p.push(subpath[0][1] + " " + subpath[0][2] + " m");
 for (var line of subpath) {
 if (line[0] == "C") {
 p.push(line[3] + " " + line[4] + " " + line[5] + " " + line[6] + " " +line[7] + " " + line[8] + " c");
 } else {
 p.push(line[3] + " " + line[4] + " l");
 }
 }
 } 
 p.push("h");
 return p.join(" "); 
 }
 fill(context) { 
 if (!(context.device.pdfurl)) return context;
 this.elements.push(this.getPath(context));
 const rs = this.numberFormat(context.graphics.color[0]/255);
 const gs = this.numberFormat(context.graphics.color[1]/255);
 const bs = this.numberFormat(context.graphics.color[2]/255);
 this.elements.push(rs+' '+gs+' '+bs+' rg'); 
 this.elements.push("F");
 return context; 
 }
 stroke(context) { 
 if (!(context.device.pdfurl)) return context;
 this.elements.push(this.getPath(context));
 const rs = this.numberFormat(context.graphics.color[0]/255);
 const gs = this.numberFormat(context.graphics.color[1]/255);
 const bs = this.numberFormat(context.graphics.color[2]/255);
 this.elements.push(rs+' '+gs+' '+bs+' RG'); 
 const ws = this.numberFormat(context.graphics.linewidth);
 this.elements.push(ws+' w'); 
 this.elements.push("S");
 return context; 
 }
 showpage(context) { 
 if (!(context.device.pdfurl)) return context; 
 const xrefoffset = [];
 var file = "%PDF-1.1" + endofline; // signature
 file += "%¥±ë rpn" + endofline; // random binary characters 
 xrefoffset.push(file.length);

// Catalog dictionary
        file += "1 0 obj" + endofline;
        this.catalog.Type = "/Catalog";
        this.catalog.Pages = "2 0 R"; 
        file += this.objectDict(this.catalog) + endofline;
        file += "endobj" + endofline + endofline;
        xrefoffset.push(file.length);

// Pages dictionary
        file += "2 0 obj" + endofline;
        this.pages.Type = "/Pages";
        this.pages.Kids = "[ 3 0 R ]";
        this.pages.Count = "1";
        this.pages.MediaBox = "[0 0 " + this.width + " " + this.height + "]";
        file += this.objectDict(this.pages) + endofline;
        file += "endobj" + endofline + endofline;
        xrefoffset.push(file.length);

// Page dictionary
       file += "3 0 obj" + endofline;
       const dict = {}
       dict.Type = "/Page";
       dict.Parent = "2 0 R";
       const ressourcesdict = {};
       const fontdict= {};
       ressourcesdict.Font = fontdict;
       dict.Resources = ressourcesdict;
       dict.Contents = "4 0 R";
       file +=  this.objectDict(dict) + endofline; 
       file += "endobj" + endofline + endofline;
       xrefoffset.push(file.length);
     
       //stream = 'BT /F1 18 Tf 100 100 Td (Hello World) Tj ET';
        const stream = this.elements.join(endofline);
        const streamdict = {};
        streamdict.Length = stream.length;
        file += "4 0 obj " + this.objectDict(streamdict)+ endofline;
        file += "stream" + endofline;
        file += stream + endofline;
        file += "endstream" + endofline;
        file += "endobj" + endofline + endofline;

const startxref = file.length;

file += "xref" + endofline;
        file += "0 5" + endofline;
        // every entry is 20 bytes including a one character end of line, hence the trailing space
        file += "0000000000 65535 f " + endofline; 
        for (const offset of xrefoffset) {
            const x = String(offset).padStart(10, "0"); // 10 digit byte offset
            file += x + ' 00000 n ' + endofline;
        }

// trailer
        const trailderdict = {};
        trailderdict.Root = "1 0 R";
        trailderdict.Size = 5;
        file += "trailer " + this.objectDict(trailderdict) + endofline;
        file += 'startxref'+ endofline;
        file += startxref + endofline;
        file +='%%EOF' + endofline;

const url = "data:application/pdf;base64," + btoa(file);
        this.urlnode.href = url; 
        this.urlnode.setAttribute("download", "PS.pdf");
        this.urlnode.style.display = (context.device.pdfurl) ? "block" : "none";
        return context; 
    }

}

postScriptEditor = function(code) {
    const id = Math.floor(Math.random() * 1000);  // create unique id for console.log
    const node = document.createElement("DIV");   // build the HTML nodes
    node.id = "id" + id;
    node.className = "psmain";
    const node2 = document.createElement("DIV");
    node2.className = "editzone";
    node.appendChild(node2);
    const node3 = document.createElement("DIV");
    node3.className = "editheader";
    node3.innerHTML = "PostScript";
    node2.appendChild(node3);
    const node4 = document.createElement("FORM");
    node2.appendChild(node4);
    const node5 = document.createElement("BUTTON");
    node5.type = "button";
    node5.id = "button" + id;
    node5.innerHTML = "Run";
    node4.appendChild(node5);
    const node6 = document.createElement("TEXTAREA"); 
    node6.id = "editor" + id; 
    node6.className = "pseditor"; 
    node6.innerHTML = code;
    node6.rows = code.split(endofline).length + 1;
    node4.appendChild(node6);
    const node7 = document.createElement("DIV");
    node7.id = "console" + id;
    node7.className = "jsconsole";
   
    const node8 = document.createElement("CANVAS");
    node8.id = "raw" + id;
    node8.className = "jscanvas";
    node8.style.display = "none";
    node8.width = 590;
    node8.height = 330;
    node.appendChild(node8);

const node9 = document.createElement("A");
    node9.id = "rawurl" + id;
    node9.innerHTML = "PNG raw";
    node9.style.display = "none";
    node.appendChild(node9);

const node10 = document.createElement("CANVAS");
    node10.id = "canvas" + id;
    node10.className = "jscanvas";
    node10.width = 590;
    node10.height = 330;
    node10.style.display = "none";
    node.appendChild(node10);

const node11 = document.createElement("A");
    node11.id = "canvasurl" + id;
    node11.innerHTML = "PNG canvas";
    node11.style.display = "none";
    node.appendChild(node11);

const node12 = document.createElement("SVG");
    node12.id = "svg" + id;
    node12.className = "jssvg";
    node12.setAttribute("width","590px");
    node12.setAttribute("height","330px");
    node12.style.display = "none";
    node.appendChild(node12);

const node13 = document.createElement("A");
    node13.id = "svgurl" + id;
    node13.innerHTML = "SVG";
    node13.style.display = "none";
    node.appendChild(node13);
 
    const node14 = document.createElement("A");
    node14.id = "pdfurl" + id;
    node14.innerHTML = "PDF";
    node14.style.display = "none";
    node.appendChild(node14);

node.appendChild(node7);

console.log(node.outerHTML);  // add the node to the parent element

Javascript Editor

And the justified text

Javascript Editor

postScriptEditor(`10 dict begin /pdfurl 1 def /raw 0 def currentdict setpagedevice end

% prologue
% support procedures
% n1 n2 max result
/max { 2 copy gt { pop } { exch pop } ifelse } def

% text countblanks n
/countblanks { 10 dict begin /c 0 def { ( ) search { /c c 1 add def pop pop } { pop c end exit } ifelse } loop } def

% text width justifyshow -
/justifyshow { exch dup stringwidth pop 3 -1 roll exch sub exch dup countblanks 1 max 3 -1 roll exch div 0 32 4 -1 roll widthshow } def

% text width wraptext post width pre 
/wraptext { 10 dict begin /maxwidth exch def /text exch def
/linewidth 0 def
/linelength 0 def
/spaces 0 def
/spacewidth 0 0 moveto ( ) stringwidth pop def
text
{
% search leaves the post on the stack, so we assign it to currenttext
dup /currenttext exch def
( ) search { 
% add word width and check if linewidth is still not bigger than max
dup stringwidth pop linewidth add dup spaces add maxwidth le 
% add space and count characters
{ /linewidth exch def 
length 1 add linelength add /linelength exch def
/spaces spaces spacewidth add def
pop } 
% check if half space ok
{ 
spaces 2 div add maxwidth le
{ length 1 add linelength add /linelength exch def exit}
{ pop pop pop exit }
ifelse

} ifelse
} 
% no space found we are at the last word
{ pop text length /linelength exch def /currenttext () def exit } ifelse } loop
% remove trailing space character if not empty text
currenttext length { /linelength linelength 1 sub def } if 
% set text if only one word
linelength 0 le { /linelength currenttext length def } if 
% output 
currenttext maxwidth text 0 linelength getinterval end
} def

% text x y w l showjustifyblock y
/showjustifyblock { 
/l exch def
/w exch def
/y exch def
/x exch def w
{ wraptext 3 -1 roll dup 4 1 roll length { newpath x y moveto w justifyshow y l sub /y exch def } { x y moveto show pop pop exit } ifelse } loop y } def

% parameter
/bodystart 300 def
/linespace 33 def
/LM 20 def 
/RM 570 def
(CMUSerif-Roman) run
/CMUSerif-Roman findfont 33 scalefont setfont

% main
/linewidth RM LM sub def
/nextline bodystart def
(Acme Widgets was founded in 1952 by Dippy and Daffy Acme to produce high technology widgets for the growing aerospace market. Acme was quickly recognized as being the best widget works in the country. Continued investment in new technology and manufacturing methods has kept Acme Widgets in the forefront of this industry.)
LM nextline linewidth linespace showjustifyblock pop

showpage

`);

It should be possible to show the PDF directly with the object or embed tag, but I could not get the node to refresh.

We can now create valid SVG and PDF files, but all text is converted to paths. It would be better to handle text as text, so that it can be selected and modified. The files would also be smaller if the path of each character were defined only once. To be able to do that, we will have to add the font to the SVG and to the PDF.
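To give an idea of what text as text looks like in a content stream, here is a minimal sketch based on the commented-out Hello World line in showpage. It is an assumption on my side, not the interpreter's code: it refers to Helvetica, one of the standard fonts a reader is expected to supply, so it sidesteps the embedding that a font like CMUSerif-Roman would need. The names pdfString and textStream are hypothetical.

```javascript
// Escape the characters that are special inside a PDF literal string:
// backslash and both parentheses.
function pdfString(text) {
  return "(" + text.replace(/[\\()]/g, "\\$&") + ")";
}

// Hypothetical text content stream: begin text, select font /F1 at the given
// size, move to x y, show the string, end text.
// /F1 has to be defined in the resources of the page.
function textStream(text, x, y, size) {
  return [
    "BT",
    "/F1 " + size + " Tf",
    x + " " + y + " Td",
    pdfString(text) + " Tj",
    "ET"
  ].join(" ");
}

// Resources dictionary that maps /F1 to a standard font (assumption: Helvetica).
const fontResources =
  "<< /Font << /F1 << /Type /Font /Subtype /Type1 /BaseFont /Helvetica >> >> >>";

console.log(textStream("Hello World", 100, 100, 18));
console.log(pdfString("f(x) = \\n"));
```

The sketch handles only characters that fit the font's built-in encoding; justified text would additionally need the word spacing that widthshow provides in PostScript.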

We now have 2631 lines of JavaScript code (88 KB) with no dependencies, capable of interpreting PostScript and rendering gray level images to canvas, SVG and PDF: ps20240921.js and a reference installation, minimal5.html.

PDF reference manual 1.7
https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf

Bonus: InDesign product manager David Evans on why Adobe created PDF:
https://www.asc.ohio-state.edu/schumacher.60/imageinfo/pdf_ps_eps.html

My Journey to PostScript