A friend and I were privately discussing the challenges of searching scanned paper books by the Greek and Hebrew words they contain. What follows is one of my replies, with personal references deleted, which may also apply to other DC readers.
“Yep, I know just the garbled mess you’re talking about.
DevonThink searches Greek and Hebrew fine in original documents. However, I think you’re talking about how it handles scanned images of paper books containing English, Greek, and/or Hebrew, which is the hardest case “out there.” That boils down to OCR engines, none of which handle Hebrew very well yet.
The only good news is comparative: DT uses the best OCR engine (ABBYY FineReader). Even so, I don’t see that it can handle niqqud on the Hebrew characters, but neither can any of the alternatives.
If you’re starting with a scanned book, using an OCR engine to convert it to text, and then exporting the resulting PDF to .docx in order to upload to Logos, I’ve found no workaround other than having the publisher do it for us (and charge more), or handing the work to someone who knows what they’re doing.
Ideally, the publisher has a digital copy, makes a deal with FaithLife, and FaithLife begins with the digital copy (side-stepping language issues) and starts tagging.
The more I learn, the more I’ve come to respect the amount of formatting work FaithLife has to do. That’s also why I focus on the 5 or 10% of the Divine Council Bibliography that is most urgent for scholarly work.
Having said all that, if you’ve found a few Divine Council resources that tend to be at the heart of your work, let’s talk about what it would take to get them properly formatted for upload to Logos.
If you’re a Mac user doing research or writing, DevonThink is indispensable. There’s nothing out there that competes. I have DT Pro Office, and it’s one of those “always running” apps. Spotlight, HoudahSpot, EasyFind, Default Folder X, Acrobat, sure. But DevonThink is mandatory, IMO.”