Jan 25 2006

Word To PDF

Why isn’t there a good and simple command line doc2pdf application? I just can’t find any good command line program that can faithfully produce a PDF document from a Word document. There are a lot of commercial and some open source applications that can create a PDF document, but I can’t find a simple command line tool that does this. For example, PDFCreator is an open source application that allows you to create a PDF document from Word by ‘printing’ the document to a virtual PDFCreator printer. Several commercially available Word to PDF solutions do the same thing: they install a ‘printer’ that prints a document as a PDF. This solution is really a hack that exploits the fact that documents sent to the printer need to be transformed to what is essentially PostScript. Once you have a document in its PostScript format you can create a PDF using Adobe Acrobat Distiller or Ghostscript’s ps2pdf.cmd batch file.

PDFCreator does not provide a nice command line interface, but it is easy to get past that limitation with some simple Visual Basic. You can write some simple Visual Basic script code that opens a Word document, sets the active printer to PDFCreator, and ‘prints’ the document, allowing PDFCreator to create a PDF for you. You might want to edit PDFCreator’s auto-save options, otherwise you will be prompted for where to save the new PDF. Here is some sample Visual Basic code that does just what I described above.

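' Word's named constants are not predefined in a script, so declare the one used here
Const wdDoNotSaveChanges = 0

Set word = CreateObject("Word.Application")
Set docs = word.Documents

' Remember current active printer (ActivePrinter is a string, so no Set)
sPrevPrinter = word.ActivePrinter

' Select the PDFCreator as your printer
word.ActivePrinter = "PDFCreator"

' Open the Word document (sMyDocumentFile holds the full path to the document)
Set document = docs.Open(sMyDocumentFile)

' Print the document file to the PDFCreator
word.ActiveDocument.PrintOut

' Close the document, restore the previous printer, and quit Word
document.Close wdDoNotSaveChanges
word.ActivePrinter = sPrevPrinter
word.Quit wdDoNotSaveChanges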
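
To get closer to the command line tool I was looking for, the script can be launched from a small wrapper. What follows is only a sketch in Java under a few assumptions: the script above is saved as doc2pdf.vbs (a hypothetical file name), it has been changed to read the document path from WScript.Arguments(0) into sMyDocumentFile, and cscript is on the PATH of a Windows machine with Word and PDFCreator installed.

```java
import java.util.ArrayList;
import java.util.List;

class Doc2PdfCommand {

    // Build the argument list for running a VBScript file with cscript.
    static List<String> build(String scriptPath, String wordDocumentPath) {
        List<String> command = new ArrayList<String>();
        command.add("cscript");
        command.add("//NoLogo");          // suppress the Windows Script Host banner
        command.add(scriptPath);
        command.add(wordDocumentPath);    // assumed to be read by the script as WScript.Arguments(0)
        return command;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Doc2PdfCommand <word-document>");
            return;
        }
        // doc2pdf.vbs is a hypothetical name for the script shown above
        Process process = new ProcessBuilder(build("doc2pdf.vbs", args[0]))
                .inheritIO()              // show the script's output in this console
                .start();
        System.exit(process.waitFor());
    }
}
```

Keeping the command construction in its own method makes it easy to check the arguments without actually starting Word.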

For completeness’ sake let me mention how to create a PDF document using the Apache POI project. Using the POI API you can create an XSL-FO version of your Word document, which can then be transformed into a PDF using Apache FOP. It has been my experience that the results generated by POI are not perfect, but here is some code to get you started. The POI scratch pad jar contains a WordDocument class that will create an XSL-FO version of the Word document. The WordDocument class might have been intended to be just a command line application, because it throws a NullPointerException if you try to use it in your own code, so you will have to modify this class. Once you fix the exception you can code the following two lines to produce an XSL-FO file for a given Word document:

WordDocument file = new WordDocument(wordDocumentPath);
file.closeDoc();

Of course, once you have the XSL-FO version of your document you can transform it to a PDF using Apache FOP. One word of warning: the WordDocument class is in the scratch pad jar and might not be as stable as you might think.


Jan 13 2006

Decode Java

Sometimes when working on web applications you need to encode GET parameter values that are part of a URL. Recently, while working on a custom URL handler in Java, I ran into a situation where I thought I had to encode part of the URL. I didn’t have to go that route and encode/decode the custom URL, but it reminded me how to do that in Java. If you need to encode a string value to be included in a URL, use the URLEncoder like this:

// The one-argument encode(String) is deprecated, so name the encoding explicitly
String encoded = URLEncoder.encode("value: *(&#%", "UTF-8");
// encoded is now "value%3A+*%28%26%23%25"

Notice that spaces are converted to plus signs, and characters like the pound sign (#) are converted to a percent sign followed by their hexadecimal value. To decode a string value you can use the URLDecoder’s decode(String, String) method, passing the same encoding name.
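
Here is a small round trip to show the two calls together. It is only a sketch: the class name UrlCodecDemo and its two helper methods are made up for illustration, and UTF-8 is assumed as the encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlCodecDemo {

    // Encode a value for use in a URL query string, using UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException(e);
        }
    }

    // Reverse of encode(): "+" becomes a space and %XX becomes the original character.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "value: *(&#%";
        String encoded = encode(original);
        String decoded = decode(encoded);
        System.out.println(encoded);                   // value%3A+*%28%26%23%25
        System.out.println(decoded.equals(original));  // true
    }
}
```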


Jan 8 2006

ARRR

What is up with file extensions ending in ar? On a daily and weekly basis I have to deal with tar, jar, rar, war, and ear files. I think we still need a car extension for Nascar fans, var for the JavaScript web devheads, mar for the Spanish speakers, and par for all the golf fans. Oh wait, EJB 3.0 makes use of a new par extension for its persistence entity beans. To put everything into perspective I think we still need a far file extension. If I were to come up with a new compression algorithm I would use an extension that denotes my favorite hang out: bar.


Dec 29 2005

Rails Hits 1.0

Ruby on Rails 1.0 has been available for some time now, so I thought I would upgrade to the latest release. I don’t know if I did something wrong, but I couldn’t download the latest Rails release using the gem package installer. If I remember correctly, the first time I installed Rails I did not manually download anything; the gem package installer did everything for me. To install Rails 1.0 I manually downloaded rails-1.0.0.gem and ran the following command:

gem install c:\path\to\rails-1.0.0.gem

I hope this is helpful for those upgrading to Ruby on Rails 1.0.


Dec 27 2005

TechKnow Year In Review 2005

It is that time of year when we reflect on the accomplishments of the passing year and look forward to the one to come. Here is a window into the past year in technology through past posts.

TechknowZenze: First Post – How it all started.
Import Script/CSS/PHP
Page Redirect – PHP, HTML, and JavaScript code to redirect an HTML page to another.
MySQL Admin – Quick tutorial for MySQL administrative tasks.
Put JavaScript To Sleep – Tutorial describing how to set JavaScript functions to timeout.
Word Mail Merge – Visual Basic Script code to manipulate Word’s Mail Merge functionality.
JavaScript FX – JavaScript code to hide/show HTML elements.
Style and Class – Working with style attributes on HTML tags using JavaScript.
Search Engine Optimization
The Word is POI – Java library for working with MS Office documents.

Season’s Greetings



Dec 14 2005

Eclipse Tool Tip #6: Quick Find

Most people know that in Windows-based applications you can hit ctrl+f to search within a document. Of course Eclipse supports ctrl+f, but if you want to do a quick search just highlight the word of interest (you can double click to highlight) and hit ctrl+k to find the next instance of said word. This is so convenient because you do not have to interact with any find dialog box; if you do want the dialog, you can still use the ctrl+f hot keys.

I can’t say it enough times: if you are using Eclipse on a daily basis you should know the hot keys to build, run, and debug your code. I have written about Eclipse hot keys before in Eclipse Tool Tip #4: Key Assist, and I will probably write about debug-specific hot keys in a future Eclipse Tool Tip.