Has anyone installed the following?
antiword - available from http://www.winfield.demon.nl/
pdftotext - part of the xpdf package http://www.foolabs.com/xpdf/home.html
html2text - available from http://www.mbayer.de/html2text/
rtf-converter - available from http://directory.fsf.org/rtf-converter.html
If so, how did you do it?
Jason, is this something only you can do?
Thanks
From the looks of it, those appear to be server-wide installations, though I'll do some research and follow up.
To confirm, these have to be installed server-wide.
I solicited some feedback on them and have been advised against installing them on a shared server for several reasons. It's not that they're individually "unsafe" per se, but they add more items to manage and keep current (for security), can increase resource usage depending on how heavily they're used, aren't standard packages on a cPanel server and thus wouldn't be fully supported by my techs, etc.
Unless it was a request on a dedicated box, I'd rather not add any unnecessary uncertainty or instability to our setups.
Do you (or does anyone :) ) know of any PHP scripts that perform similar functions?
Google Documents provides this kind of conversion. I access it via my Gmail account.
To access Google Documents, go to http://gmail.com/ and set up a Gmail (web-based email) account. When you log in to your new account, the upper left shows the following menu:
Gmail - Calendar - Documents - Photos - Reader - Web - more ▼
:)
Thank you CountryLady. Unfortunately, this does not appear to be what I am looking for.
I am looking for a customizable script that will convert the formats (pdf, rtf, doc, etc.) into plain text in real time for database import and possible parsing.
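For what it's worth, here is a rough sketch of the kind of script-only converter described above. It is written in Python purely for illustration (the same idea should carry over to PHP) and only handles HTML, using nothing beyond the standard library. The names html_to_text, CONVERTERS and convert are made up for this example; pdf, rtf and doc would each need their own converter function registered in the table, which this sketch does not attempt.

```python
import os
from html.parser import HTMLParser

# Tags treated as word boundaries so adjacent blocks do not run together.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "td", "h1", "h2", "h3"}


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip_depth > 0:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def html_to_text(markup):
    """Return the visible text of an HTML string as one line of plain text."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


# Hypothetical dispatch table: file extension -> converter function.
# Only HTML is covered here; other formats are left unregistered on purpose.
CONVERTERS = {".html": html_to_text, ".htm": html_to_text}


def convert(filename, content):
    """Pick a converter by file extension and return plain text."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in CONVERTERS:
        raise ValueError("no converter registered for %r" % ext)
    return CONVERTERS[ext](content)
```

Calling convert("page.html", markup) would give back a plain string ready for a database insert, while an unregistered extension such as .pdf raises ValueError rather than guessing.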
I've asked around and will let you know if I hear of anything useful --
Quote from: Jason on May 05, 2008, 10:48:00 AM
I've asked around and will let you know if I hear of anything useful --
Any update on this? :)
Unfortunately I never heard anything.