Encoding issues with ABCpdf8

I ran across a problem using ABCpdf8 for PDF generation: bullets and the Euro sign are displayed as garbled characters.

What’s going on here?

Character name   Unicode   Displayed as   UTF-8
bullet           U+2022    •              0xE2 0x80 0xA2
Euro sign        U+20AC    €              0xE2 0x82 0xAC
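
The garbling can be reproduced outside ABCpdf. The sketch below assumes the renderer falls back to a single-byte code page such as Windows-1252 when no charset is declared; which fallback the engine really uses is not something I have verified. The class and method names are only illustrative.

```csharp
using System;
using System.Linq;
using System.Text;

public static class MojibakeDemo
{
    static MojibakeDemo()
    {
        // Needed on .NET Core / .NET 5+ so that code page 1252 is available;
        // on the classic .NET Framework this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Format the UTF-8 bytes of a string like "0xE2 0x80 0xA2".
    public static string Utf8Hex(string s)
    {
        return string.Join(" ",
            Encoding.UTF8.GetBytes(s).Select(b => "0x" + b.ToString("X2")));
    }

    // Encode as UTF-8, then decode the same bytes as Windows-1252
    // (the assumed fallback encoding).
    public static string Misread(string s)
    {
        Encoding cp1252 = Encoding.GetEncoding(1252);
        return cp1252.GetString(Encoding.UTF8.GetBytes(s));
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        foreach (string s in new[] { "\u2022", "\u20AC" })
        {
            Console.WriteLine("{0}  {1}  {2}", s, Utf8Hex(s), Misread(s));
        }
    }
}
```

Under that assumption the bullet comes out as "â€¢" and the Euro sign as "â‚¬": three single-byte characters for each three-byte UTF-8 sequence.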

Since .NET is Unicode-based, the problem cannot be caused by .NET string handling.

My conclusion is that the Doc.AddImageHtml method creates a temporary file in UTF-8 encoding, and the HTML rendering engine (Gecko in this case) needs to be told about the correct encoding:

  @"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">
  <html xmlns=""http://www.w3.org/1999/xhtml"">
    <head>
      <meta http-equiv=""content-type"" content=""text/html; charset=utf-8"" />
    </head>
  ... and more HTML content ....
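
To avoid repeating the header in every call, the wrapping can be pulled into a small helper. This is a minimal sketch: HtmlPage and WrapUtf8 are hypothetical names of my own, not part of ABCpdf, and the helper only builds the string.

```csharp
using System;

public static class HtmlPage
{
    // Wrap an HTML body fragment in a complete page whose head
    // declares UTF-8, so the renderer does not have to guess.
    public static string WrapUtf8(string bodyHtml)
    {
        return
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" " +
            "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\n" +
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n" +
            "<head>\n" +
            "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\" />\n" +
            "</head>\n" +
            "<body>\n" + bodyHtml + "\n</body>\n" +
            "</html>";
    }

    public static void Main()
    {
        // \u2022 is the bullet, \u20AC the Euro sign.
        string html = WrapUtf8("<p>\u2022 Price: 10 \u20AC</p>");
        Console.WriteLine(html);
        // The resulting string is what would be handed to Doc.AddImageHtml.
    }
}
```

The meta element sits at the very top of the head, so the charset declaration is read before any non-ASCII character appears in the body.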

While preparing this blog post, I found my assumption confirmed in the Notes section of the documentation:

ABCpdf saves this HTML into a temporary file and renders the file using a ‘file://’ protocol specifier. So this is a convenience function – it doesn’t offer any performance enhancements.

One Response to Encoding issues with ABCpdf8

  1. Saurabh says:

    Hi, I am facing the same issue. Could you tell me how to set the correct encoding for the HTML rendering engine (the MSHTML engine)?
