[Bash] Check if file is binary or text?

thamarok

#1 Post by thamarok »

Hello!

Is it possible to write a Bash script that checks whether a file is binary (ELF executable, PE executable, linked library, etc.) or text (any file that contains plain text, such as /var/log/dmesg)?

Thanks in advance!

Scotti
Moderator Team Member
Posts: 305
Joined: 2005-11-08 01:13

#2 Post by Scotti »

Interesting. I'm curious myself.

I found this, not sure if it's what you're looking for: http://tldp.org/LDP/abs/html/fto.html
:?

thamarok

#3 Post by thamarok »

Scotti wrote:Interesting. I'm curious myself.

I found this, not sure if it's what you're looking for: http://tldp.org/LDP/abs/html/fto.html
:?
Nothing found on that site either :?
Also note that not every application has execution permissions, so watching the permissions of the file won't help much.

lacek
Posts: 764
Joined: 2004-03-11 18:49
Location: Budapest, Hungary

#4 Post by lacek »

Use the 'file' command. It is contained in the package named 'file'.
Its output looks like this:

Code:

~# file *
atisysteminfo-report.txt: ASCII English text
c.txt:                    ASCII text
Mail:                     directory
MyTest.class:             compiled Java class data, version 46.0
MyTest.java:              ISO-8859 Java program text
portdef.props:            ASCII text
vpd.properties:           ASCII text, with very long lines
xiclotl-wake:             Bourne-Again shell script text executable
You can use this information to decide what kind of file you are looking at.

thamarok

#5 Post by thamarok »

Ever heard of "programmatically"? I don't want to be rude, but I wouldn't be asking for a Bash script if I weren't developing something. Also note that there are millions of binary file formats, so writing an extremely slow interpreter with millions of checks wouldn't be very professional.
Last edited by thamarok on 2007-03-16 22:05, edited 2 times in total.

lacek
Posts: 764
Joined: 2004-03-11 18:49
Location: Budapest, Hungary

#6 Post by lacek »

OK, that depends on what counts as "programmatic". I thought that this:

Code:

# -b omits the file name from the output, so a name containing "text" cannot match
if file -b -- "$1" | grep -q text; then
    echo "$1 is a text file"
else
    echo "$1 isn't a text file"
fi
is a programmatic approach. After all, it is a program which makes the guess. If you are against using external programs in Bash scripts, keep in mind that even 'ls' is an external program... :-)

thamarok

#7 Post by thamarok »

"file" doesn't understand every file format. In Windows there is a simple API that check if the file is binary or not, so I am sure it can be done on Linux too. Also, some file formats which contain only plain ASCII text don't have "text" in their "file" description.
Last edited by thamarok on 2007-03-16 22:05, edited 1 time in total.

lacek
Posts: 764
Joined: 2004-03-11 18:49
Location: Budapest, Hungary

#8 Post by lacek »

thamarok wrote:Also, some file formats which contain only plain ASCII text don't have "text" in their "file" description
The -i switch can help here: it makes 'file' print a MIME type instead of a free-form description. It makes 'file' slightly faster as well.

'file' indeed doesn't recognize every binary file format, since that would be an impossible mission... However, for binary formats that aren't recognized, 'file' outputs 'data'. So you can still guess that such a file is a binary one.
I agree that calling 'file' on every file isn't fast. However, if you want speed, you should consider not using a Bash script at all. A non-interpreted language would be much faster.
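
A minimal sketch of that idea, assuming a 'file' build that supports the --mime-encoding long option, which prints only the character encoding and typically reports "binary" when it finds none. The helper name is_text is made up for illustration. Note that an empty file is also reported as "binary", so treat the result as a guess rather than a verdict.

```shell
#!/usr/bin/env bash
# Sketch: classify a file as text or binary from the encoding file(1) reports.
# is_text is a hypothetical helper name, not part of any package.
is_text() {
    local enc
    # file(1) exits 0 even for missing files, so check existence first.
    [ -f "$1" ] || return 2
    # -b drops the "filename:" prefix, so the file name cannot affect the test.
    enc=$(file -b --mime-encoding -- "$1") || return 2
    # Anything other than "binary" (us-ascii, utf-8, ...) is treated as text.
    [ "$enc" != "binary" ]
}

# Usage: is_text /var/log/dmesg && echo "text" || echo "not text"
```

Returning 2 for a missing file keeps "not a regular file" apart from "binary" (status 1).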

Just out of curiosity: why do you need to decide if a file is binary in the first place? Scanning through the filesystem, recording the changes and removing the new files upon user request is all you want to do. Am I wrong?

thamarok

#9 Post by thamarok »

Yup that's what I want to do.

hcgtv
Posts: 500
Joined: 2006-11-17 23:03
Location: Charlotte, NC

#10 Post by hcgtv »

thamarok wrote:The program works like this: You click on "scan system" and it will make a snapshot of the current state of the filesystem. Then the user tweaks and installs whatever (s)he likes and then clicks on "scan system" again, then the program will compare both snapshots and give the user an easy to understand summary of all the new, deleted and updated files.
The snapshot could be made with rsync; for the comparison, run rsync again with this option:

-n, --dry-run show what would have been transferred
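
A rough sketch of that comparison, assuming the snapshot is a plain rsync copy of the tree. The function name snapshot_diff and the example paths are made up for illustration; -i (itemize changes) and --delete are standard rsync options added here so that changed and removed files are listed as well.

```shell
#!/usr/bin/env bash
# Sketch: list what changed between the live tree and a saved snapshot.
# snapshot_diff and its two directory arguments are hypothetical names.
snapshot_diff() {
    local live=$1 snapshot=$2
    # -a        recurse and compare permissions, owners and timestamps
    # -n        dry run: report only, copy nothing
    # -i        itemize: print one line per entry that differs
    # --delete  also report entries that exist only in the snapshot
    rsync -a -n -i --delete -- "$live"/ "$snapshot"/
}

# Take the snapshot once:  rsync -a --delete /etc/ /var/tmp/etc.snap/
# Compare later:           snapshot_diff /etc /var/tmp/etc.snap
```

In the output, lines containing "+++++++++" are files that are new since the snapshot, and lines starting with "*deleting" are files that have been removed since then.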
Bert Garcia - When all you have is a keyboard

thamarok

#11 Post by thamarok »

REMOVED
Last edited by thamarok on 2007-03-16 22:05, edited 2 times in total.

Fluenza
Posts: 236
Joined: 2006-11-22 18:44
Location: Fog of War

#12 Post by Fluenza »

thamarok wrote:Thanks, but:
AND NOTE TO EVERYONE: I DON'T CARE IF THERE IS ALREADY SUCH AN APPLICATION SO DON'T RECOMMEND ME ANYTHING.
Are you just wanting to write this app so that you can learn to code bash scripts? Hmmm, I should spend some time learning to write bash scripts myself. :idea:
Visualize, Describe, Direct (VDD)
Common Operational Picture (COP) --> Common Operational Response (COR) --> Common Operational Effect (COE)

thamarok

#13 Post by thamarok »

A client asked for this so I wanted to help him, but after looking up all the resources I know, I ended up with no solution, so I asked here.

Grifter
Posts: 1554
Joined: 2006-05-04 07:53
Location: Svea Rike

#14 Post by Grifter »

first of all, 'file' isn't slow, it's incredibly fast

second, WHAT!? Ok granted it's been a while since I used windows, but when you renamed a .exe file to .txt in my day, it would happily open the binary file in a text editor

using a script to determine if a file is binary would take up far more resources than using the 'file' command, and while 'file' doesn't have the specs on every conceivable data file ever created, it has the ones that matter (and it's a long list). also, if a file is binary but is unknown to 'file', it will name it "data"
Eagles may soar, but weasels don't get sucked into jet engines...

thamarok

#15 Post by thamarok »

I think I will start with hex values. Thanks anyway.
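
For what it's worth, a small sketch of what a hex-value check could look like for the two executable formats mentioned in the first post. ELF files begin with the bytes 7f 45 4c 46, and PE files begin with a DOS header whose first two bytes are 4d 5a ("MZ"). The function name magic_type is made up for illustration, and a text file that happens to start with "MZ" would be misreported.

```shell
#!/usr/bin/env bash
# Sketch: identify two executable formats by their leading bytes.
# magic_type is a hypothetical helper name.
magic_type() {
    local hex
    # Dump the first four bytes as one lowercase hex string, e.g. 7f454c46.
    hex=$(head -c 4 < "$1" | od -An -tx1 | tr -d ' \n')
    case $hex in
        7f454c46) echo "ELF" ;;      # 0x7f 'E' 'L' 'F'
        4d5a*)    echo "MZ/PE" ;;    # 'M' 'Z', the DOS header of a PE file
        *)        echo "unknown" ;;
    esac
}

# Usage: magic_type /bin/ls
```

Every further format needs its own case line, which is essentially the table that 'file' already maintains.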

Amber
Posts: 70
Joined: 2007-02-22 06:03

#16 Post by Amber »

The unzip command has an option to perform text conversion while extracting. You can study its source code to see if its method of determining ASCII vs. binary is useful to you.
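
I have not checked the unzip source, so the following is not necessarily its method. It is only a sketch of one common content-based heuristic: treat a file as binary if a NUL byte shows up near the beginning. The function name has_nul and the 8000-byte window are assumptions for illustration, and UTF-16 text would be misclassified as binary.

```shell
#!/usr/bin/env bash
# Sketch: report success if the first 8000 bytes of a file contain a NUL byte.
# has_nul and the 8000-byte window are assumptions for illustration.
has_nul() {
    local total kept
    # Number of bytes actually inspected (fewer for short files).
    total=$(head -c 8000 < "$1" | wc -c)
    # tr deletes NUL bytes; a smaller count means at least one was present.
    kept=$(head -c 8000 < "$1" | LC_ALL=C tr -d '\000' | wc -c)
    [ "$total" -ne "$kept" ]
}

# Usage: has_nul /bin/ls && echo "probably binary" || echo "probably text"
```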

thamarok

#17 Post by thamarok »

Amber wrote:The unzip command has an option to perform text conversion while extracting. You can study its source code to see if its method of determining ASCII vs. binary is useful to you.
Thanks for pointing that out!
