Howto: Convert PDF to ODT/TXT

#1 Post by bester69

The following script converts a PDF file into readable ODT and TXT files by rendering each page as an image and running OCR on it.

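Requirements:
convert (ImageMagick), soffice (LibreOffice), unoconv, tesseract (with the Spanish language data, since the script calls it with -l spa)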
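
Before running the script, it may help to confirm that the tools listed above are actually installed. A minimal sketch, assuming a POSIX shell; missing_tools is a hypothetical helper name and is not part of the original script:

```shell
#!/bin/sh
# Print the name of every listed command that cannot be found on PATH.
# (missing_tools is a hypothetical helper, not part of the original script.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The four tools named in the requirements list.
missing=$(missing_tools convert soffice unoconv tesseract)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on a single line.
    echo "Missing tools:" $missing >&2
fi
```

This only checks that the commands exist; it does not check that tesseract has the language data the script asks for. The conversion script itself follows.
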
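#!/bin/bash
# pdf2odt: OCR a PDF into a plain-text file and an ODT file.
# Usage: pdf2odt file.pdf

if [ -z "$1" ]; then
    echo "Usage: $0 file.pdf" >&2
    exit 1
fi

clear
workfol=$(pwd)
ruta=$(readlink -f "$1")    # absolute path of the input PDF
ruta2=$(dirname "$ruta")    # folder that will receive the results
filename=$(basename "$1")

echo "Full path is: $ruta"
echo "Name is: $filename"
echo "Folder is: $ruta2"

# Start from an empty scratch folder so leftovers from earlier runs cannot mix in.
tmp="/tmp/tmpxxzy"
rm -rf "$tmp"
mkdir "$tmp" || exit 1
cd "$tmp" || exit 1

# Render each page as a 300 dpi, A4-sized JPEG. Zero-padded page numbers
# keep the pages in order when the text files are concatenated below.
convert -units pixelsperinch -density 300x300 -resize 2480x3508 -page a4 "$ruta" sal-%04d.jpg
# OCR every page ("-l spa" selects Spanish; change it to match the document's language).
find . -name "*.jpg" -exec tesseract -l spa {} {} \;
cat sal-*.txt > "$filename".txt
soffice --headless --convert-to odt "$filename".txt
#pdfunite *.pdf $1
# Also renders the ODT back to PDF; that file stays in $tmp.
unoconv -f pdf "$filename".odt
#mv $1.* ../
echo "Moving $filename.txt and $filename.odt to $ruta2"
mv "$filename".txt "$ruta2"/
mv "$filename".odt "$ruta2"/

cd "$workfol"/ || exit 1
#rm -rf "$tmp"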

Usage, with the script saved as pdf2odt:
pdf2odt fich.pdf --> creates two files, fich.pdf.txt and fich.pdf.odt, in the same folder as fich.pdf
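
One detail worth checking when the per-page OCR output is concatenated: shell globs sort names lexicographically, so a page numbered 10 sorts before a page numbered 2 unless the page numbers are zero-padded. The sketch below shows one way to join the text files in numeric page order instead. It assumes GNU sort (for -V), find with -maxdepth, and file names without newlines; concat_pages is a hypothetical helper name, and the sal-*.txt pattern simply mirrors the image names used in the script.

```shell
#!/bin/sh
# Concatenate per-page text files from directory $1 into file $2,
# ordered by page number rather than by plain string comparison.
# (concat_pages is a hypothetical helper; sort -V is a GNU extension.)
concat_pages() {
    find "$1" -maxdepth 1 -name 'sal-*.txt' | sort -V |
        while IFS= read -r page; do
            cat "$page"
        done > "$2"
}

# Small demonstration with fake OCR output for pages 0, 1, 2 and 10.
demo=$(mktemp -d)
for n in 0 1 2 10; do
    echo "page $n" > "$demo/sal-$n.jpg.txt"
done
concat_pages "$demo" "$demo/all.txt"
cat "$demo/all.txt"
rm -rf "$demo"
```

A plain glob over the same four files would place the text of page 10 between pages 1 and 2, which only becomes visible on documents with ten or more pages.
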