The pre-processing tools (which I believe you already know a bit about) are at the bottom of page 2 of the original thread. From my limited knowledge, Tesseract seems to use OpenCV, among other libraries, for thresholding, eroding, etc. User joilts provided some valuable inline Java code that is easily modified to suit and seems to fit in well with the original test example. The tools were suggested to help another user who was having trouble recognizing automobile license plates, but it appears that every image will probably benefit from at least some pre-processing. I think it's your OpenCV library that's used in this case.
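For anyone who wants to see what "thresholding" and "eroding" actually do before pulling in OpenCV, here is a minimal plain-Java sketch. To be clear, this is not joilts' code and it does not call OpenCV or Tesseract at all; the class name `PreprocessSketch`, the method names, and the threshold of 128 are all my own placeholders. It only assumes a `BufferedImage` as input and mimics the two operations with simple loops.

```java
import java.awt.image.BufferedImage;

public class PreprocessSketch {

    /** Grayscale + fixed threshold: pixels darker than the threshold become 0, the rest 255. */
    public static int[][] binarize(BufferedImage src, int threshold) {
        int w = src.getWidth(), h = src.getHeight();
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                // Integer approximation of the usual luma weights (0.299, 0.587, 0.114).
                int gray = (299 * r + 587 * g + 114 * b) / 1000;
                out[y][x] = gray < threshold ? 0 : 255;
            }
        }
        return out;
    }

    /**
     * 3x3 minimum filter, which is what image libraries usually call erosion.
     * It shrinks white regions, so dark strokes on a white background get thicker.
     * Neighbours outside the image are simply ignored.
     */
    public static int[][] erode3x3(int[][] img) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int min = 255;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int yy = y + dy, xx = x + dx;
                        if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                        min = Math.min(min, img[yy][xx]);
                    }
                }
                out[y][x] = min;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 5x5 test image: white background with one black pixel in the middle.
        BufferedImage img = new BufferedImage(5, 5, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 5; y++)
            for (int x = 0; x < 5; x++)
                img.setRGB(x, y, 0xFFFFFF);
        img.setRGB(2, 2, 0x000000);

        int[][] eroded = erode3x3(binarize(img, 128));
        // Print the result as a grid: '#' for black, '.' for white.
        for (int[] row : eroded) {
            StringBuilder sb = new StringBuilder();
            for (int v : row) sb.append(v == 0 ? '#' : '.');
            System.out.println(sb);
        }
    }
}
```

In real use you would probably replace the fixed threshold with something adaptive and let OpenCV do the work; the point here is only to show what the two steps do to the pixels before the image is handed to the OCR engine.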
Regarding the updated trained data that I linked to, I think the link may be the wrong one. That is to say, you will see trained data available there, but it is for Tesseract 4, not 3.0X. The updated JavaCPP (v1.2) code may be based on Tesseract 3.0X; I have to check that. In any case, the Tesseract engine initializes successfully with the new data, so it may not be an issue. Plus, in my case, I turn off a number of Tesseract options that use the dictionary. That may or may not be helpful to someone who expects to use a dictionary.