Searchables OCR's (Very good Solutions!!)

Post by bester69 » 2017-09-07 23:27

I'm very impressed with Google's OCR engine: it's incredibly fast and accurate :shock: , and best of all, we can use it on Linux to get great results from those blurry, small pictures that some OCR tools don't always read accurately.

I found two ways:
1- Using Google Docs:
- Upload a picture or a PDF file to Google Drive
- Open it with Google Docs --> it opens the file as a doc with the extracted text below.

2- My favourite, using Google Keep:
- Drag one or several pictures into a note (or into several separate notes)
- Select "Grab text from image"
---------------

https://opensource.com/life/15/9/open-s ... ext-images
Google's Optical Character Recognition (OCR) software now works for over 248 world languages (including all the major South Asian languages). It's quite simple and easy to use, and can detect most languages with over 90% accuracy.

The technology extracts text from images, scans of printed text, and even handwriting, which means text can be extracted from pretty much any old books, manuscripts, or images.
Last edited by bester69 on 2017-09-09 16:09, edited 2 times in total.

Re: Google OCR Solution (Very good!!)

Post by debiman » 2017-09-08 05:46

OCR with 90% accuracy? meaning, every 10th character is wrong?
that's thick, even for the almighty google.
also, enjoy providing the beast with even more information.

Re: Google OCR Solution (Very good!!)

Post by bester69 » 2017-09-08 08:31

debiman wrote: OCR with 90% accuracy? meaning, every 10th character is wrong?
that's thick, even for the almighty google.
also, enjoy providing the beast with even more information.


We can't stop it, man!! Quantum D-Wave is already here, and between CERN and that D-Wave, who knows what future is coming. :? :? Google's obscure uses of information should be the least of our worries.

We might even already be inside a Matrix, as I've already noticed some weird things going on around me. :?
https://www.youtube.com/watch?v=uwl3h8l4NPI

Re: Google OCR Solution (Very good!!)

Post by bdtc1 » 2017-09-08 16:43

aptitude search tesseract-ocr
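
For anyone following up on this hint: once the tesseract-ocr package is installed, Tesseract can write a searchable PDF on its own through its "pdf" output config. The sketch below is only an illustration under assumptions: the helper names output_base and ocr_to_pdf, the example file name scan_page1.png, and the English language data (-l eng) are placeholders, not something taken from this thread.

```shell
#!/bin/sh
# Hypothetical helpers (the names are assumptions, not part of Tesseract).

# Strip the extension: Tesseract takes an output *base* name and
# appends ".pdf" on its own.
output_base() {
    printf '%s\n' "${1%.*}"
}

# OCR one image into a searchable PDF written next to the image.
# Usage: ocr_to_pdf IMAGE [LANG]   (LANG defaults to eng)
ocr_to_pdf() {
    image=$1
    lang=${2:-eng}
    tesseract "$image" "$(output_base "$image")" -l "$lang" pdf
}
```

Calling ocr_to_pdf scan_page1.png should leave scan_page1.pdf beside the image, with the recognised text as an invisible layer over the picture. How good that layer is will depend on the scan quality and the language data installed.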

Re: Google OCR Solution (Very good!!)

Post by Dai_trying » 2017-09-08 18:14

I generally use OCRFeeder; it works without submitting your work to Google, and in my (not so extensive) use it has been pretty accurate.

Re: Google OCR Solution (Very good!!)

Post by bester69 » 2017-09-08 18:56

Dai_trying wrote: I generally use OCRFeeder; it works without submitting your work to Google, and in my (not so extensive) use it has been pretty accurate.


I use Abbyy FineReader X through PlayOnLinux (Wine); it does an outstanding job. I also use Tesseract with recollindex to index the text inside pictures and PDFs for searching. It also does a great job.

I'm now loving Google Keep with Android and Chrome: you take a picture of a document and turn it into a note by extracting the text and then removing the picture. It's really great! :D

-----------------------------
There are also some great Android apps we can use for OCR.
I found a good one:
- Adobe Scan ---> It works great.
https://video.tv.adobe.com/v/18742t1/?autoplay=true

It produces a real OCR'd PDF: we can upload a scanned document to the cloud, use Adobe Scan to get the OCR'd PDF, and then download the result to our computer.

Re: Google OCR Solution (Very good!!)

Post by bester69 » 2017-09-08 20:52

Here is an OCR'd PDF document I made with Adobe Scan in just two minutes.
You will see the text sits in the background:
https://drive.google.com/open?id=0B-1Wr ... mY2N1g5azg


Steps to get the OCR done:
1- Render the document to pictures and put them all in one folder (limit: 25 files per document)
pdftoppm -png -aa yes -r 300 document_forOCRscan.pdf outfile

2- Upload the folder with all the pictures to Google Drive

3- Go to the Adobe Scan app:
- Select Google Drive as the source --> pick the uploaded folder --> select all PNG files
- Save as a PDF file

4- Merge all the PDF files (Adobe Scan is limited to 25 pages per file in the free version)
pdftk *.pdf cat output finaldoc.pdf

Done. :D
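
The 25-file limit in the steps above can be handled with a small helper before uploading. This is only a sketch: the function name batch_pages and the batch_NN folder naming are made up for illustration, and it assumes the page images are PNG files sitting directly in one folder.

```shell
#!/bin/sh
# Hypothetical helper (name and folder layout are assumptions).
# Move page images into numbered folders of at most 25 files each
# and print how many files were moved.
# Usage: batch_pages DIR [BATCH_SIZE]
batch_pages() {
    dir=$1
    size=${2:-25}
    count=0
    for f in "$dir"/*.png; do
        [ -e "$f" ] || continue                 # no PNG files at all
        batch=$(( count / size + 1 ))           # 1-based batch number
        dest=$(printf '%s/batch_%02d' "$dir" "$batch")
        mkdir -p "$dest"
        mv "$f" "$dest/"
        count=$(( count + 1 ))
    done
    echo "$count"
}
```

Each batch_NN folder can then be uploaded and converted separately, and the resulting PDFs merged with pdftk as in step 4. The shell expands the glob in name order, so the sketch relies on the file names sorting in page order (for example, zero-padded page numbers).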
Last edited by bester69 on 2017-09-09 00:17, edited 2 times in total.

Re: Google OCR Solution (Very good!!)

Post by Dai_trying » 2017-09-08 21:09

It does sound very good, but I will stick to using my off-line version and retain some privacy on my data. :D

Re: Google OCR Solution (Very good!!)

Post by 4D696B65 » 2017-09-08 21:17

Dai_trying wrote:It does sound very good, but I will stick to using my off-line version and retain some privacy on my data. :D

+1

Re: Google OCR Solution (Very good!!)

Post by bester69 » 2017-09-09 00:12

4D696B65 wrote:
Dai_trying wrote: It does sound very good, but I will stick to using my off-line version and retain some privacy on my data. :D

+1


I downloaded a PDF ebook with no text layer from aMule, and I used Adobe Scan for OCR:
- It has a limit of 25 pages per document, so I created folders of 25 pages and then used pdftk to merge the resulting files.

Here, check the professional result I got with Adobe Scan; I uploaded a 25-page file (the limit of the free version):
https://drive.google.com/open?id=0B-1Wr ... UZTdjI2Zkk

8)

Re: Google OCR Solution (Very good!!)

Post by bester69 » 2017-09-09 00:21

Dai_trying wrote: It does sound very good, but I will stick to using my off-line version and retain some privacy on my data. :D


I'm afraid there is no Linux/offline tool that does text-mapping OCR, and the one app that I think tries to map the text does an awful job. In short, Linux still lacks an app with text-mapping OCR.

Re: Google OCR Solution (Very good!!)

Post by debiman » 2017-09-09 06:32

bester69 wrote:
debiman wrote: OCR with 90% accuracy? meaning, every 10th character is wrong?
that's thick, even for the almighty google.
also, enjoy providing the beast with even more information.


We can't stop it, man!! Quantum D-Wave is already here, and between CERN and that D-Wave, who knows what future is coming. :? :? Google's obscure uses of information should be the least of our worries.

We might even already be inside a Matrix, as I've already noticed some weird things going on around me. :?
https://www.youtube.com/watch?v=uwl3h8l4NPI

quoted for posterity, before OP changes his/her mind and edits it.
:lol:

Re: Google OCR Solution (Very good!!)

Post by alan stone » 2017-09-09 12:53

bester69 wrote: who knows what future is coming. :? :?

Unless there are fundamental changes, Silly Valley and the rest of Big Tech will end up as repugnant and despised as Wall Street.
Debian 8.9 32bit, WM: Openbox
Computers are like air conditioners. They work fine until you start opening windows. - Author Unknown
Programming is like sex. One mistake and you have to support it for the rest of your life. - Michael Sinz

Re: Google OCR Solution (Very good!!)

Post by bester69 » 2017-09-09 15:45

I finally found the best and most accessible solutions on Linux for searchable OCR:


- Master PDF Editor (native app, free version) -> It includes searchable OCR in the free version. It gets very good results, but the alignment of the text layer is still not perfect: when you copy some paragraphs and paste them into a text editor, some lines still don't come out properly. But for use as a searchable document, it does a great job.

- Adobe Scan (mobile app, Android)
Being an Adobe app, it does a professional job; the problem is that you need a smartphone to use it. The free version is limited to 25 pages per document. We can scan our documents on Linux, upload them to the cloud, use Adobe Scan to get the searchable OCR, and then use pdftk on Linux to merge the final document. Perfect text-layer alignment, good enough for copy/paste.

- OCRmyPDF (command line)
https://ocrmypdf.readthedocs.io/en/late ... notes.html
It gets the job done, but my testing showed very bad text-layer alignment, much worse than Master PDF Editor, so it's not good enough for copying and pasting text.

- PDF-XChange Viewer
I definitely recommend this app: this small Windows app includes a free OCR that does a professional job. It runs at "Gold" level with most, if not all, Wine versions, so you won't have any problem installing it. The alignment it gets is finally perfect (much better than Master PDF Editor); you will be able to copy/paste paragraphs with the words properly aligned.

So, right now, for me the best and most accessible searchable OCR solution on Linux is PDF-XChange Viewer (Wine), or failing that Master PDF Editor. And for a professional job I would use Adobe Scan.
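
For anyone who wants to repeat the OCRmyPDF test from the list above, a minimal batch run could look like the sketch below. The helper names ocr_name and ocr_all, the _ocr.pdf suffix, and the English default language are assumptions for illustration; only the basic call form (ocrmypdf, options, input file, output file) comes from the tool itself.

```shell
#!/bin/sh
# Hypothetical helpers (names and the _ocr suffix are assumptions).

# Map "dir/book.pdf" to "dir/book_ocr.pdf".
ocr_name() {
    printf '%s_ocr.pdf\n' "${1%.pdf}"
}

# Add a searchable text layer to every PDF in a directory,
# skipping files that already carry the _ocr suffix.
# Usage: ocr_all DIR [LANG]   (LANG defaults to eng)
ocr_all() {
    dir=$1
    lang=${2:-eng}
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue          # no PDFs: nothing to do
        case $pdf in
            *_ocr.pdf) continue ;;         # already processed
        esac
        ocrmypdf -l "$lang" "$pdf" "$(ocr_name "$pdf")"
    done
}
```

Writing to a separate _ocr.pdf file keeps the original scans untouched, so the copy/paste alignment can be compared against the output of the other tools.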

Re: Searchables OCR's (Very good Solutions!!)

Post by Dai_trying » 2017-09-09 16:18

Did you try OCRFeeder?