Using the jsPDF library and the sample code from this answer, I am building a page that uses HTML and JavaScript to generate a PDF file.
The content of the PDF file is the HTML code of this test page [1], which you can download to review its content.
When I open the file in the browser and click the download link, the PDF file is downloaded to the "Downloads" folder [2], but the result contains some wrongly encoded characters.
This is a snippet of the HTML file:
<div class="TituloPagina">
CONTRATO INDIVIDUAL DE TRABAJO A TÉRMINO FIJO INFERIOR A UN AÑO CELEBRADO ENTRE
COMPAÑÍA (A). Y FUTURO EMPLEADO (B) –SALARIO ORDINARIO.
</div>
The character – is rendered in the PDF like this: þÿ
I tried:
- Setting the UTF-8 encoding in the HTML file.
- Saving the HTML file as instructed in the answer. In this case, though, I don't think the problem is in the sample HTML file, but in the result of the conversion to PDF.
- Changing the meta tag to <meta charset="iso-8859-1" />, which causes the HTML file to lose the encoding of characters such as ñ, accented letters, etc.
None of this made any difference in the generated PDF files.
How can I get rid of these wrongly encoded characters in a PDF file generated with jsPDF?
[1] I have created this test file because the actual development contains sensitive information.
[2] The download path varies according to each user's configuration in their preferred browser.
Everything indicates that it is a bug in the "fromHTML" plugin. One solution may be to avoid using characters with a code greater than 255 (restricting the text to characters in Windows-1252).
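As a rough sketch of that first option, the snippet below scans a string for characters whose code is above 255 and swaps them for plain substitutes before the text ever reaches jsPDF. The function names (`findWideChars`, `replaceWideChars`) and the `ASCII_FALLBACKS` table are hypothetical helpers written for this illustration; they are not part of jsPDF, and the choice of substitutes is only an assumption.

```javascript
// Hypothetical helper: list every character whose UTF-16 code unit is above 255.
function findWideChars(text) {
  const found = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 255) {
      found.push({ index: i, char: text[i], code: code });
    }
  }
  return found;
}

// Hypothetical substitution table (an assumption, adjust to your content).
const ASCII_FALLBACKS = {
  "\u2013": "-",   // en dash
  "\u2014": "-",   // em dash
  "\u2018": "'",   // left single quotation mark
  "\u2019": "'",   // right single quotation mark
  "\u201C": "\"",  // left double quotation mark
  "\u201D": "\"",  // right double quotation mark
  "\u2026": "..."  // horizontal ellipsis
};

// Replace every code unit above 255 with its fallback, or "?" if none is listed.
function replaceWideChars(text) {
  return text.replace(/[\u0100-\uFFFF]/g, function (ch) {
    return Object.prototype.hasOwnProperty.call(ASCII_FALLBACKS, ch)
      ? ASCII_FALLBACKS[ch]
      : "?";
  });
}

// The dash from the question's title is U+2013; ñ and í are below 256 and stay as-is.
const sample = "COMPAÑÍA (A). Y FUTURO EMPLEADO (B) \u2013SALARIO ORDINARIO.";
console.log(findWideChars(sample));
console.log(replaceWideChars(sample));
```

Running something like `replaceWideChars` over the text nodes of the page before handing it to the plugin would be one way to apply it, at the cost of losing the typographic dash.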
Another is to use your own modified copy of jsPDF, changing this line, where it says:
by
A third solution: map the characters when generating the 8-bit vector.
Keeping the change in the call to pdfEscape, in the main module, after this line, put:
Do the same with the characters in this table whose Unicode value differs from their value in the charset (they are the ones in rows "8_" and "9_").
(Disclaimer: it looks like the right thing to do, but I haven't tested it.)
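To make the mapping idea more concrete, here is a standalone sketch that converts the characters from rows "8_" and "9_" of Windows-1252 into their single-byte values, so that every resulting character code fits in 8 bits. The table follows the standard Windows-1252 layout; the names `CP1252_HIGH` and `toCp1252CharCodes` are hypothetical, and where exactly such a step would be hooked into jsPDF is an assumption that, like the suggestion above, has not been tested against the library.

```javascript
// Unicode code point -> Windows-1252 byte, for rows 8_ and 9_ only.
// Bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D are undefined in Windows-1252.
const CP1252_HIGH = {
  0x20AC: 0x80, 0x201A: 0x82, 0x0192: 0x83, 0x201E: 0x84,
  0x2026: 0x85, 0x2020: 0x86, 0x2021: 0x87, 0x02C6: 0x88,
  0x2030: 0x89, 0x0160: 0x8A, 0x2039: 0x8B, 0x0152: 0x8C,
  0x017D: 0x8E, 0x2018: 0x91, 0x2019: 0x92, 0x201C: 0x93,
  0x201D: 0x94, 0x2022: 0x95, 0x2013: 0x96, 0x2014: 0x97,
  0x02DC: 0x98, 0x2122: 0x99, 0x0161: 0x9A, 0x203A: 0x9B,
  0x0153: 0x9C, 0x017E: 0x9E, 0x0178: 0x9F
};

// Hypothetical helper: return a string in which each mapped character is
// replaced by the character whose code equals its Windows-1252 byte.
// Characters that are not in the table are left unchanged.
function toCp1252CharCodes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const mapped = CP1252_HIGH[text.charCodeAt(i)];
    out += mapped !== undefined ? String.fromCharCode(mapped) : text[i];
  }
  return out;
}

// The en dash (U+2013) from the question becomes code 0x96.
const converted = toCp1252CharCodes("(B) \u2013SALARIO");
console.log(converted.charCodeAt(4).toString(16));
```

The converted string only displays correctly if the PDF font is interpreted with a Windows-1252 compatible encoding, which is what the answer above presumes.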