Jdoctopdf


Known Issues and Limitations

There are a number of known issues and limitations with Jdoctopdf. Most of them concern the reading of Word format files, so those are listed in their own category below.

General

  1. No support for DOCX files. I am currently working on this, but it may take some time yet. Extracting the raw text from DOCX is easy; the formatting, not so much.
  2. No support for file type X. See the FAQ for the formats I am working to support currently. If you need another format you may want to look into that yourself.
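
To illustrate why the raw text is the easy part of DOCX: a DOCX file is a ZIP archive whose main body is stored in word/document.xml, with paragraphs as w:p elements and the visible text in w:t elements. The following is a minimal sketch using only the Java standard library. It is not Jdoctopdf's code; the class name DocxRawText and its method names are made up for illustration, and it assumes the common (transitional) WordprocessingML namespace. It ignores headers, footers, footnotes, tabs and line breaks, and may duplicate text found in text boxes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
public class DocxRawText {
    // Namespace of WordprocessingML elements in transitional DOCX files.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Collects the text of every <w:t> inside each <w:p>, one line per paragraph.
    static String extractFromXml(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(xml);

        StringBuilder out = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            NodeList texts = paragraph.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                out.append(texts.item(j).getTextContent());
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Opens the DOCX as a ZIP archive and reads the main document part.
    static String extractFromDocx(String path) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("word/document.xml not found in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return extractFromXml(in);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 1) {
            System.out.print(extractFromDocx(args[0]));
        }
    }
}
```

Note that everything that makes the formatting hard (run properties, styles, numbering, table properties) is simply skipped here, which is exactly why this part is short.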

Word 97 Specific

  1. Unicode support. Well. It has not really been tested against Unicode text and, to be honest, judging by the comments in POI, it's not looking good. I believe there could be a number of issues with this. Good luck.
  2. Table formats are not detected. Sadly, the current POI implementation of HWPF (used for reading Word 97 files) does not support getting the styles of tables, i.e. colours, borders, etc. If at all possible, I would like to add support for at least detecting whether a table has borders. Currently this is a problem when a file uses tables for layout, because those get borders drawn around them when they shouldn't. Fixing this will require some patch(es) to Apache POI HWPF; I am working on it, but it may be a long way away.
  3. Colours are not correct. Again, this is a limitation of POI, which does not currently provide the correct colours for elements, especially "custom" colours.
  4. Numbered lists. Numbered lists will appear as bulleted lists instead. Probably forever. Word's handling of lists seems hopelessly complicated.