Who is this for
This article is for those who want
to learn how to generate PDF documents from HTML files
What you need to know
Basic Perl scripting, HTML, Perl
module installation
Introduction
Adobe's PDF is one of the most popular file formats for transferring
documents. One reason is that a PDF displays the document the way it
will be printed (WYSIWYG). It is also convenient: you do not need the
word processor that created the document, only a PDF reader such as
Adobe Reader.
Perl has a module that will allow
you to convert HTML files to PDF documents. This is done by using the
HTML::HTMLDoc module.
Installation
First download and install the HTMLDOC program from Easy Software
Products. The Perl module is a front end to that program and will not
work without it. Then download the HTML::HTMLDoc module from CPAN and
install it as you would any other Perl module.
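Before going further, it can help to confirm that both pieces are in place. The following is a minimal sketch, assuming a Unix-like system where the program is installed under the name htmldoc; the helper find_in_path is a hypothetical name used only here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return the full path of an executable found on PATH, or undef.
# (find_in_path is a helper written for this sketch, not part of any module.)
sub find_in_path {
    my ($name) = @_;
    for my $dir (File::Spec->path()) {
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# The module calls the htmldoc program, so both must be present.
# Assumption: the binary is named "htmldoc" (on Windows it would be htmldoc.exe).
my $binary = find_in_path('htmldoc');
my $module = eval { require HTML::HTMLDoc; 1 };

print 'htmldoc binary: ', (defined $binary ? $binary : 'not found on PATH'), "\n";
print 'HTML::HTMLDoc:  ', ($module ? 'installed' : 'not installed'), "\n";
```

If either line reports a missing piece, go back and install that piece before trying the scripts in this article.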
Generating your PDF Document
The first thing to do is to load the module and create an instance of
the HTML::HTMLDoc class.
You can then pass a complete HTML document to the object as a string,
or tell it to generate the PDF document from an HTML file.
Sample code
    #!/usr/bin/perl
    use strict;
    use HTML::HTMLDoc;

    ####################################################
    # This script is distributed according to the terms of
    # the Perl Artistic License. Use at your own risk
    # © 2004 Philip L. Yuson
    ####################################################

    my $str = '
    <html>
    <body>
    <p><font size=14pt><b>HTML to PDF Document</b></font></p>
    <p>Let us see how this will work</p>
    <table border=1>
    <tr><td>This is a row in a table</td></tr>
    <tr><td>This is another row</td></tr>
    </table>
    <HR>
    copyright © 2004 Philip L. Yuson
    </body>
    </html>';

    my $html = HTML::HTMLDoc->new();       # Start instance
    $html->set_page_size('letter');        # set page size
    $html->set_bodyfont('Arial');          # set font
    $html->set_left_margin(1, 'in');       # set margin
    $html->set_html_content($str);         # contents to convert

    my $pdf = $html->generate_pdf();       # generate document
    $pdf->to_file('article.pdf');          # save document
Save this as pdf.pl and run it from a command line by typing:
    perl pdf.pl
This should generate a PDF file named article.pdf in the current
directory.
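The sample above passes the HTML to the module as a string. For the other route mentioned earlier, generating the PDF from an HTML file, a sketch might look like the following. It assumes the module's set_input_file method; the file names article.html and article-from-file.pdf and the helper html_file_to_pdf are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file names, chosen only for this sketch.
my $input  = 'article.html';
my $output = 'article-from-file.pdf';

# Convert an HTML file on disk to a PDF file and return the output name.
sub html_file_to_pdf {
    my ($in, $out) = @_;
    require HTML::HTMLDoc;            # loaded only when a conversion is requested
    my $htmldoc = HTML::HTMLDoc->new();
    $htmldoc->set_page_size('letter');
    $htmldoc->set_input_file($in);    # read the HTML from a file, not a string
    my $pdf = $htmldoc->generate_pdf();
    $pdf->to_file($out);
    return $out;
}

if (-e $input) {
    print "Wrote ", html_file_to_pdf($input, $output), "\n";
} else {
    print "No $input here; create one first.\n";
}
```

The formatting calls (page size, font, margins) work the same way in both cases; only the call that supplies the input changes.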
Generating PDF files for download on the web
To generate a PDF that can be downloaded without first saving it to a
file, use the same script as above but change the last line: print a
Content-Type header, then print the string returned by to_string()
instead of calling to_file(). To make the example more interesting, we
will also put the date and time on the document. The script looks like
this:
    #!/usr/bin/perl
    use strict;
    use HTML::HTMLDoc;
    use Date::Calc;

    ####################################################
    # This script is distributed according to the terms of
    # the Perl Artistic License. Use at your own risk
    # © 2004 Philip L. Yuson
    ####################################################

    # Format the current date and time as YYYY-MM-DD HH:MM:SS
    my $str_temp = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                           Date::Calc::Today_and_Now());

    my $str = "
    <html>
    <body>
    <p><font size=14pt><b>HTML to PDF Document</b></font></p>
    <p>Let us see how this will work</p>
    <table border=1>
    <tr><td>This is a row in a table</td></tr>
    <tr><td>This is another row</td></tr>
    </table>
    <HR>
    This document was generated: $str_temp
    copyright © 2004 Philip L. Yuson
    </body>
    </html>";

    my $html = HTML::HTMLDoc->new();       # Start instance
    $html->set_page_size('letter');        # set page size
    $html->set_bodyfont('Arial');          # set font
    $html->set_left_margin(1, 'in');       # set margin
    $html->set_html_content($str);         # contents to convert

    my $pdf = $html->generate_pdf();       # generate document

    # Tell the browser this is a PDF document
    binmode STDOUT;                        # PDF data is binary
    print "Content-Type: application/pdf\n\n";
    print $pdf->to_string();               # Send the document
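The script above sends only a Content-Type header. If you want the browser to offer the document as a download with a suggested file name, a Content-Disposition header can be added as well. The following is a minimal sketch of building such a response; pdf_response and its $filename parameter are hypothetical names for this illustration, and placeholder bytes stand in for the real PDF so the sketch runs without HTMLDOC installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a complete CGI response (headers plus body) for a PDF held in memory.
# $filename is the name the browser suggests when saving the file.
sub pdf_response {
    my ($bytes, $filename) = @_;
    return join("\r\n",
        'Content-Type: application/pdf',
        qq{Content-Disposition: attachment; filename="$filename"},
        'Content-Length: ' . length($bytes),
        '',                       # empty line that ends the headers
        '') . $bytes;
}

# Placeholder bytes, not a real document; a real script would
# pass the result of $pdf->to_string() here instead.
my $bytes = "%PDF-1.4\n% placeholder, not a real document\n";

binmode STDOUT;                   # PDF is binary; avoid newline translation
print pdf_response($bytes, 'article.pdf');
```

With "attachment" the browser saves the file rather than displaying it; leave the Content-Disposition header out if you prefer the PDF to open in the browser window.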
To see what the output looks like, install the script as a CGI program
on your web server and open its URL in a browser.
To learn more about HTML::HTMLDoc, check out its documentation. On the
command line, type:
    man HTML::HTMLDoc
This displays all the functions available in the module. If the man
page is not installed, perldoc HTML::HTMLDoc shows the same
documentation.