Best Of Two Worlds - Acrobat PDF Scripting Using VisualBasic (VBA) & JavaScript
The other day I was asked by a client of mine to create a convenient macro for adding watermarks or letterheads to Word documents. The first thought that came to my mind was putting a graphics object (the letterhead) into the header or footer of the document. This is exactly what Word does automatically, when the user clicks the Format | Background menu item. Sounds simple. There are only two subtleties. For one, the graphics object should be behind the text, such that it doesn't get in the way of other header or page content. The other subtlety is more intrusive: let's assume that you want the letterhead only on the first page. You can put it into the body text of the first page or you can change the section format to have different headers and footers on the first and subsequent pages. If you want the letterhead on all but the first page you can only use the second option. If you get the document from another source, like a customer or a supplier, the chances are that your document already has differing headers or footers for the pages in a section or that the document already has multiple sections and so on. Too many ifs for a reliable solution, don't you think?
Then I discovered that Adobe Acrobat 6.0 has a neat feature that can be used to put one PDF document on top of or behind the pages of another PDF document. The menu item is Document | Add Watermark & Background. I tried it on a few documents and it worked very well. Problem solved. The only thing left to do was instructing my users to convert their Word documents to PDF and then use Acrobat to add the letterhead PDF. Unfortunately, my users aren't very computer literate. They do not understand how things work; instead they memorize the way they are achieved. It's like with cars: hardly anyone knows these days how a car engine works, but everyone knows how to operate one. Anyway, for someone whose brain works that way, the instructions can't just be "Convert to PDF, merge letterhead PDF using Add Watermark & Background and print". As I didn't want to write another two pages of instructions for dummies, I thought I could still write a Word macro that automates the task as much as possible.
A Visual Basic for Applications (VBA) program that scripts Adobe Acrobat? Impossible, I thought. But after a few hours of online research (Google has gotten better in the past months), I came up with a rough idea of how this could be achieved. From now on, it's getting pretty darn technical.
First of all, there is Adobe's Interapplication Communication for Acrobat (IAC). The reference is available on Adobe's ASN pages (see the Online Resources section at the end of this document), provided that you register for a free ASN web account (only a valid email address is needed). On Windows, the Adobe lingo term 'interapplication' means OLE and DDE; on Mac OS it stands for AppleScript. OLE objects can be scripted in VBA. There are three revisions of the IAC, each corresponding to one of Acrobat's major versions: 4, 5 and 6. Unfortunately, none of these revisions provides access to the watermark/background functionality. Well, no access other than executing a menu item, effectively simulating a mouse click:
Dim app As Object
Set app = CreateObject("AcroExch.App")
app.MenuItemExecute "COMP:AddBack"
The method MenuItemExecute is documented in the IAC reference, whereas the menu item id is documented in the Acrobat Core API Reference. The above code opens the dialog box that is normally shown after clicking the Document | Add Watermark & Background menu item. There must be a better way.
On his ByteRyte site, Matthew Fitzgerald explains how to use Acrobat's JavaScript API to overlay two PDF pages from different documents. The Acrobat JavaScript Scripting Reference is also freely available. Matthew's technique employs template pages, a feature that is only accessible through JavaScript, not through IAC. As briefly mentioned on PlanetPDF, Adobe introduced a somewhat ominous JavaScript/IAC bridge with version 6 of Acrobat. Unfortunately, the document that describes how to use this bridge (Programming Acrobat JavaScript Using Visual Basic) is not publicly available. I didn't want to pay € 185,- for a 14-page document. After some more searching, I found the key piece of information in the Acrobat Knowledge Base (also on Adobe's ASN partner site). One KB article contained sample VB source code listing a method not documented in the IAC reference: PDDoc.GetJSObject. It didn't take a rocket scientist to figure out what this JSObject was. It's basically a collection of the top-level objects found in Acrobat's JavaScript API (excuse my poor terminology; I'm not a JavaScript person). At the other end of the world, a Python programmer mentioned JSObject in a mailing list post, saying that it returns Variants and accepts almost every elementary argument type. From that I concluded that the VB/JS bridge should be invoked as follows:
Set app = CreateObject("AcroExch.App")
Set avDoc = app.GetActiveDoc ' get the logical doc
Set pdDoc = avDoc.GetPDDoc ' get the physical doc
Set jso = pdDoc.GetJSObject ' get the bridge
' app is the JS handle to Acrobat's Application top-level object
docs = jso.app.activeDocs ' get array of active docs
For Each doc In docs ' iterate docs
...
Next
And it worked. At that point, I had everything I needed. The necessary steps are:
1. Convert the Word document to PDF by printing it to the "Adobe PDF" printer.
2. Wait until Acrobat opens the distilled PDF.
3. Obtain the JSObject from the distilled PDF.
4. Add the first page of another PDF - the background PDF - to the distilled PDF.
5. Turn the added page into a template page.
6. Instantiate ("spawn" in Adobe lingo) the template page on some or all of the distilled PDF's pages, effectively merging the template's content into the normal pages' contents.
7. Remove the template page from the distilled PDF.
8. Have a glass of milk - for the calcium and vitamins A and D.
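Seen from the JavaScript side, the page-merging steps above (add the background page, make it a template, spawn it, remove it) might look roughly like the sketch below. This is not the code from the download, only an illustration under assumptions: the function name applyBackground and the template name "letterhead" are made up, doc stands for the Doc object of the distilled PDF, and it is an untested guess that removeTemplate alone is enough to get rid of the extra page. The methods insertPages, createTemplate, spawn and removeTemplate are taken from the Acrobat JavaScript Scripting Reference.

```javascript
// Sketch only: merges page 0 of a background PDF into every page of `doc`.
// `doc` is assumed to be the Acrobat JavaScript Doc object of the distilled PDF,
// `backgroundPath` a device-independent path to the letterhead PDF.
function applyBackground(doc, backgroundPath) {
  var templateName = "letterhead"; // hypothetical name, any unique string should do
  var pageCount = doc.numPages;    // number of pages before the background is appended

  // Append the first page (index 0) of the background PDF after the last page.
  doc.insertPages({ nPage: pageCount - 1, cPath: backgroundPath, nStart: 0, nEnd: 0 });

  // The appended page now sits at index pageCount; turn it into a template.
  var template = doc.createTemplate({ cName: templateName, nPage: pageCount });

  // Spawn the template onto each original page instead of inserting new pages.
  for (var page = 0; page < pageCount; page++) {
    template.spawn({ nPage: page, bRename: false, bOverlay: true });
  }

  // Assumption: removing the template is sufficient to drop the extra page.
  doc.removeTemplate(templateName);
  return pageCount; // number of pages that received the background
}
```

To restrict the letterhead to the first page only, the loop would simply run for page 0 instead of all pages.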
Step 4 was tricky because there is no GetActiveDoc() method in the JavaScript API, or at least I didn't find one. The functionally similar JS property app.activeDocs returns an array of all open PDF documents. In order to get a JS handle to the distilled PDF, I had to iterate this array and compare the name of each document to the name of the document obtained through GetActiveDoc(). This is in fact not very reliable, and if you know a better way, drop me a line or leave a comment.
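The name comparison can be illustrated with a few lines of JavaScript. Again, this is a sketch and not the code from the download: findDocByFileName is a made-up helper, and it assumes that the name obtained on the IAC side is the bare file name (for example from PDDoc.GetFileName), so that it can be compared with the documentFileName property of each entry in app.activeDocs.

```javascript
// Sketch only: pick the open document whose file name matches `fileName`.
// `docs` is assumed to be the array returned by app.activeDocs.
function findDocByFileName(docs, fileName) {
  var wanted = String(fileName).toLowerCase();
  for (var i = 0; i < docs.length; i++) {
    // documentFileName is the base name including extension, e.g. "letter.pdf"
    if (String(docs[i].documentFileName).toLowerCase() === wanted) {
      return docs[i]; // first match wins
    }
  }
  return null; // no open document carries that name
}
```

The weakness mentioned above is visible here: two open documents with the same file name in different folders cannot be told apart, and the first match is returned.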
Source Code
Available under GPL here.
Copyright (C) 2004 Hannes Schmidt This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
Online Resources
ByteRyte | Using JavaScript to Apply Templates To PDF File is Matthew Fitzgerald's article on using page templates for PDF backgrounds and effects.
Adobe | Adobe PDF - Acrobat 6.0 SDK Documentation lists available SDK documents. Some can be accessed by getting a free ASN web account. For others an ASN developer membership is required for which you have to pay. The documents mentioned in this article are:
- Adobe | Acrobat Interapplication Communication Reference describes the OLE (e.g. Visual Basic for Applications) and DDE interfaces to Acrobat and Acrobat Viewer.
- Adobe | Acrobat Core API Reference is the low-level API documentation; the Lists section contains a list of language-independent menu item names for use with MenuItemExecute.
- Adobe | Acrobat JavaScript Scripting Reference
Adobe | How To: Call Menuitem Using Javascript/Visual Basic Interface VB sample code with JSObject and a method call on it. Amusing: the workhorse method call is commented out.
PlanetPDF | Developing with Inter-Application Communication (IAC) mentions the VB/JavaScript bridge.
Python-Win32 mailing list | COM, Acrobat and JavaScript by Joshua Reynolds says something about JSObject's return and argument types.