[SOLVED] Practicality of Bash with unconventional pathnames

THBlack
Posts: 5
Joined: 2021-02-08 20:45

[SOLVED] Practicality of Bash with unconventional pathnames

#1 Post by THBlack »

SOLUTION

My Bash idiom was immature. Scripting in Bash apparently requires some different habits than entering commands interactively on the command line does. For example, it requires that one terminate command-line options using '--'. It requires that one prefer printf or <<< rather than echo in some circumstances. It requires a clear understanding of the eval builtin. It requires lots of things.
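As a minimal sketch of the printf and <<< point, assume a hypothetical filename that happens to look like an echo option. The variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical value: a filename that looks like an echo option.
name='-n'

# Bash's echo treats "-n" as its own option and prints nothing at all.
with_echo=$(echo "$name")

# printf takes the format first, so the data is never parsed as an option.
with_printf=$(printf '%s\n' "$name")

# A here-string feeds the value to a command's stdin without using echo.
with_herestring=$(cat <<< "$name")

printf 'echo: [%s]  printf: [%s]  here-string: [%s]\n' \
    "$with_echo" "$with_printf" "$with_herestring"
```

The echo result is empty, while the other two preserve the value unchanged.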

In general, Bash has ways of escaping various possible misinterpretations of text, for example via the aforementioned '--' and eval, not to mention quoting. Most of these ways are unfamiliar to Bash command-line users (or at any rate, they were unfamiliar to me), so they must be learned—and, once learned, must frequently be employed. Coming from C++, I had not been used to escaping everything, like this:

Code: Select all

run_by_batch wget_insist "'"'-nc -x -nH --cut-dirs=4 -P '"'"'"'"'"'"'"$TARGET/pool"'"'"'"'"'"'"' --'"'"
It actually makes some sense once one thinks about it in the right way, but it's a little hard to get used to.
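Where a wrapper can accept its arguments as separate words rather than as one string to be re-parsed, a Bash array may avoid the nested quoting altogether. The sketch below is only an illustration: show_args is a hypothetical stand-in, and whether run_by_batch or wget_insist could be restructured this way is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical target directory containing a space and parentheses.
TARGET='/srv/My Mirror (test)'

# One array element per argument; no nested quoting is needed.
wget_opts=(-nc -x -nH --cut-dirs=4 -P "$TARGET/pool" --)

# Hypothetical stand-in for a wrapper: it reports how many arguments it
# received, then prints one per line.
show_args() {
    printf '%d arguments\n' "$#"
    printf '  [%s]\n' "$@"
}

# "${wget_opts[@]}" expands to exactly one word per array element.
show_args "${wget_opts[@]}" 'http://example.org/some file'
```

The directory name survives as a single argument even though it contains a space and parentheses.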

Bash lacks programming support for long scripts, but such support matters only for scripts longer than about 3000 lines. A decent programmer can work without it up to that length, so lack of support was not my problem. My problem was immature Bash idiom. Bash scales poorly when idiom is immature.

I suppose that there is no good remedy for immature scripting idiom other than writing a few thousand lines and failing at it, while occasionally seeking advice. The various advice others have given below has helped.

ORIGINAL QUESTION

In your opinion, how practical is it to write complex Bash shell scripts that handle offbeat filenames and pathnames like, say, 'My Files/-Preferences:(3)'?

I realize that it is possible to write such scripts, but every time one mixes various quoting and expansion mechanisms like (say) echo ${X}{1,2} | sed "s/${A}/${B%$C}/" or uses eval or the like, Bash—or various command-line tools invoked from Bash—seems to remind one how much it dislikes offbeat filenames. The script on which I am now working is a moderate 1500 lines or so in length, yet I have spent so many hours tracking down obscure bugs due to quoting and expansion issues that I am thinking of having the script simply reject all unconventional pathnames.

Some users won't like that.

Meanwhile, I feel that I am fighting Bash. I would instead like to work with Bash, and I would prefer not to give up on Bash just because some users have ported Microsoft-style file-naming habits to Debian.

Is there an easy way to do this? What am I missing, please?
Last edited by THBlack on 2021-02-10 22:21, edited 3 times in total.

reinob
Posts: 1189
Joined: 2014-06-30 11:42
Has thanked: 97 times
Been thanked: 47 times

Re: Practicality of Bash with unconventional pathnames

#2 Post by reinob »

It of course depends on what you're trying to do. What are you trying to do anyway? :)

Normally you don't need to manipulate file names, you just pass them to a program, etc. and as long as they're quoted ("$NAME" instead of $NAME), there's usually no problem.
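A small sketch of the difference between "$NAME" and $NAME, using the pathname from the original question. The count_args helper is hypothetical and exists only to make the word splitting visible.

```shell
#!/usr/bin/env bash
# Hypothetical helper that reports how many arguments it was given.
count_args() { printf '%d\n' "$#"; }

NAME='My Files/-Preferences:(3)'

unquoted=$(count_args $NAME)     # split at the space into two words
quoted=$(count_args "$NAME")     # passed through as one argument

printf 'unquoted: %s argument(s), quoted: %s argument(s)\n' "$unquoted" "$quoted"
```

With the default IFS, the unquoted form arrives as two arguments and the quoted form as one.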

But yes, the general idea is that everything is text, so a command line is parsed as text, commonly separating tokens by spaces, which may be a problem. I hear PowerShell is more "object oriented", but every time I see a PowerShell script I think something is deeply wrong with the concept, and it seems OK only for programming, not for an interactive shell.

So, again, what are you trying to do? Maybe there's an easier/simpler way.

THBlack
Posts: 5
Joined: 2021-02-08 20:45

Re: Practicality of Bash with unconventional pathnames

#3 Post by THBlack »

reinob wrote:What are you trying to do anyway?
  • To select and transfer a lot of files from one server to another.
  • To distribute the files to various directories on the target server.
  • To parse and obey control files fetched from the source server.
  • To verify checksums.
  • To let the user configure various aspects of the process when he or she invokes the script on the command line.
  • Etc.
And to do it all in a way that does not require other maintainers of the script to learn my favorite scripting language (Python or Perl, say) as a precondition of maintaining the script. The exact details are probably unimportant to the present discussion unless you want to see 1500 lines of shell scripting.

I can fix any particular problem when it comes up, but overall, when I am writing lots of lines that resemble,

Code: Select all

touch -d "$(sed -rne 's/^Date:\s*//p;T;q' "$2")" "$1"
or

Code: Select all

ARG="$(eval echo '$ARG1'/{'$ARG2',{MD5,SHA{1,256,512}}SUMS{,.sign}})"
or

Code: Select all

[[ "$FP1" = dists/$(printf '%q' "$DIST1")?(/*) ]] && PREFETCH_FPS="$PREFETCH_FPS $FP1"
... well, I seem to be fighting Bash rather than working with Bash. My script functions, but it's probably bug-prone, is probably insecure, and, generally, just does not feel right. Sometimes, spaces are separators between arguments. Sometimes, spaces are embedded within arguments. Sometimes, spaces are not normally embedded within arguments, but an intruder might embed a space to cause trouble. And that's just spaces: what about embedded parentheses that, if mishandled, launch subprocesses? Or what about the humble '..' that lets the user escape upward from the directory in which he or she is supposed to be working? Weird filenames like -myfile can be misinterpreted as command options, and so forth; not to mention the infinite variety of Unicode characters that might mean anything as far as I know.
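For the brace-expansion case, one possible way to avoid eval is to expand directly into an array. This is a sketch only: it assumes the consumer can take a list of names rather than a single space-joined string, and the values of ARG1 and ARG2 are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical values chosen to contain a space and punctuation.
ARG1='My Files'
ARG2='-Preferences:(3)'

# Brace expansion runs before parameter expansion, so each generated word
# still carries its own quoted "$ARG1" and "$ARG2"; no eval is involved.
files=( "$ARG1"/{"$ARG2",{MD5,SHA{1,256,512}}SUMS{,.sign}} )

printf '%d names\n' "${#files[@]}"
printf '  [%s]\n' "${files[@]}"
```

Each array element is one complete pathname, so embedded spaces and parentheses need no further escaping when the list is passed on as "${files[@]}".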

I am not really asking any of those specific questions. I am asking instead if I am inadvertently trying to force Bash into the wrong paradigm, because the need to handle unconventional pathnames is causing me an inordinate amount of trouble.

Maybe the answer is, "Bash wasn't designed to handle unconventional pathnames."

But maybe there is a better answer?

Bloom
df -h | grep > 90TiB
Posts: 504
Joined: 2017-11-11 12:23
Been thanked: 26 times

Re: Practicality of Bash with unconventional pathnames

#4 Post by Bloom »

I don't know what you consider "unconventional". Windows has a set of rules determining the proper characters that can be used in a full path, as well as a list of characters that may not. Linux has a similar set of rules.
If you want to act on files that use Windows path and filenames from a Linux system, you can indeed run into trouble as most GNU utilities dealing with files (like mv) expect Linux path and filenames.
My suggestion is to not do that. If you want to act on Windows files on one server and move them or some of them to another Windows server, use Windows to do that. PowerShell should allow you to do your magic as well.

Or have you mounted Windows filesystems in Linux? That could work.

Dai_trying
Posts: 1100
Joined: 2016-01-07 12:25
Has thanked: 5 times
Been thanked: 16 times

Re: Practicality of Bash with unconventional pathnames

#5 Post by Dai_trying »

I recently had a problem with file and path names in a Bash script. I solved it by changing IFS to newline only, which allowed spaces and special characters in filenames and paths. It has worked so far (the script has been used daily for the last couple of weeks), but I don't know whether it will fail at some point.

What I had to do was change the IFS value at the beginning of the script and then change it back at the end, something like below:

Code: Select all

#!/bin/bash
OIFS="$IFS"
IFS=$'\n'

# <do all script stuff here>

IFS="$OIFS"
Like I said it worked for what I needed but YMMV.

P.s. Don't forget to quote your variables :wink:
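An alternative to changing IFS for the whole script is to read NUL-terminated names, which also copes with newlines inside filenames. The following is a sketch under stated assumptions: it needs a find that supports -print0 (GNU find does), and the scratch directory and its filenames are hypothetical.

```shell
#!/usr/bin/env bash
# Scratch directory with deliberately awkward names (all hypothetical).
workdir=$(mktemp -d)
touch -- "$workdir/plain" "$workdir/with space" "$workdir/-dash" \
         "$workdir/star*name" "$workdir"/$'line\nbreak'

names=()
# find -print0 ends each path with a NUL byte, which cannot occur inside a
# filename; read -d '' splits on that byte. The empty IFS applies only to
# the read command, so the rest of the script keeps its default IFS.
while IFS= read -r -d '' path; do
    names+=("$path")
done < <(find "$workdir" -type f -print0)

printf '%d files found\n' "${#names[@]}"
rm -rf -- "$workdir"
```

Because the loop reads from a process substitution rather than a pipe, the names array is still populated after the loop ends.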

THBlack
Posts: 5
Joined: 2021-02-08 20:45

Re: Practicality of Bash with unconventional pathnames

#6 Post by THBlack »

Bloom wrote:I don't know what you consider "unconventional".
Convention is not defined by me, of course. As far as I am aware, the following are unconventional: spaces; control characters; punctuation other than the five characters [._~+-]; and, at the start of a word, punctuation other than the underscore [_]. Debian Policy does not define the convention, but all names it mentions adhere to the convention as far as I know. (Someone might mention POSIX at this point, but POSIX is not what I was talking about here.)

There are a few other ASCII punctuators that sometimes get used, like [@], although Make, RCS and (I believe) the runtime loader ld.so treat [@] as special. There is at least one that probably could be used but for some reason isn't: [%]. As I said, the convention is not defined by me.
Bloom wrote:Or have you mounted Windows filesystems in Linux?
I have not.
Bloom wrote:That could work.
It could? Interesting. What do you mean?
Dai_trying wrote:... change the IFS value ...
There's an idea. I had not thought of that.

Head_on_a_Stick
Posts: 14114
Joined: 2014-06-01 17:46
Location: London, England
Has thanked: 81 times
Been thanked: 132 times

Re: Practicality of Bash with unconventional pathnames

#7 Post by Head_on_a_Stick »

The only forbidden filename characters in GNU/Linux (or POSIX for that matter) are the NULL byte and "/" (without the quotation marks). It gets interesting if you start the filename with a dash character but most utilities can deal with that if it is prepended with either "-- " or "./" (without the quotation marks). Any other characters can be escaped with a backslash or quoted.
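A short sketch of both techniques for a name that starts with a dash. The filename and scratch directory are hypothetical.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

name='-myfile'            # hypothetical filename starting with a dash

# "--" is passed as a separate argument; it ends option parsing for most
# utilities, so the following word is treated as a filename.
touch -- "$name"

# Prefixing "./" makes the path start with a dot instead of a dash, which
# also helps with tools that do not honor "--".
case $name in
    -*) safe="./$name" ;;
    *)  safe=$name ;;
esac
ls "$safe"

rm -- "$name"
cd / && rm -rf -- "$workdir"
```

The "./" form has the advantage that the corrected value can be stored once and reused, whereas "--" has to be remembered at every call site.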
deadbang

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: Practicality of Bash with unconventional pathnames

#8 Post by LE_746F6D617A7A69 »

@THBlack: You are trying to fight the *effects* instead of the *source* of the problem - quite typical, unfortunately.
Investing your division's time in educating the users will produce much better effects, and the results will be visible not only in this particular case.

1. You should convince the users that by wisely choosing directory names, they're helping themselves.
2. You should explain why using forbidden characters in directory/file names is harmful.

Fighting with the effects of a problem is a waste of time.
Bill Gates: "(...) In my case, I went to the garbage cans at the Computer Science Center and I fished out listings of their operating system."
The_full_story and Nothing_have_changed

THBlack
Posts: 5
Joined: 2021-02-08 20:45

Re: Practicality of Bash with unconventional pathnames

#9 Post by THBlack »

Head_on_a_Stick wrote:The only forbidden filename characters in GNU/Linux (or POSIX for that matter) are the NULL byte and "/" (without the quotation marks).
That's not the solution, though, is it? That's the problem, rather.
Head_on_a_Stick wrote:It gets interesting if you start the filename with a dash character but most utilities can deal with that if it is prepended with either "-- " or "./" (without the quotation marks).
I had not thought of the "./". That helps.
Head_on_a_Stick wrote:Any other characters can be escaped with a backslash or quoted.
Yes, in principle this is true; but when the name that contains the other characters has been supplied by a user, filtered through two or three command-line utilities, sorted to a temporary file somewhere, retrieved and re-sorted to another file, retrieved again and expanded by Bash's eval builtin, chopped up and used as an element in a pattern match, and so on, the script-writer (me) tends to lose track of how many backslashes and quotes are required. Is it "\\\\" or '\\\\\\\\' or "$(eval "'"${*##\\\\\\\\\\\\\\\\}"'")"? (Don't ask me what the last production means. I have no idea.)
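One way to keep the number of escaping layers at zero when names pass through temporary files is to store them NUL-terminated instead of quoted. This sketch assumes GNU sort (for -z) and Bash 4.4 or later (for mapfile -d), and the pathnames are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pathnames, including a newline, a backslash and a leading dash.
paths=('pool/b file' $'pool/a\nfile' '-pool/c(3)' 'pool/a\file')

tmpfile=$(mktemp)

# Store the names NUL-terminated; no escaping layer is added.
printf '%s\0' "${paths[@]}" > "$tmpfile"

# sort -z treats NUL as the record terminator; mapfile -d '' reads the
# records back, one array element per name.
mapfile -d '' -t sorted < <(LC_ALL=C sort -z -- "$tmpfile")

printf '%d names read back\n' "${#sorted[@]}"
rm -f -- "$tmpfile"
```

The names come back byte for byte as they went in, so no backslashes need to be counted at any stage.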

Does such confusion not overwhelm you when you write 1500-line Bash scripts? I had never before used Bash for a script longer than 500 lines. If you say that a 1500-line Bash script does not overwhelm you, then maybe I just need to return to my keyboard to acquire more practical experience to develop a sound Bash idiom. I just shouldn't be having such trouble with a script this short.

If it were C++ then the compiler's type system would keep track and sort it out, but it isn't. It's Bash.

Head_on_a_Stick
Posts: 14114
Joined: 2014-06-01 17:46
Location: London, England
Has thanked: 81 times
Been thanked: 132 times

Re: Practicality of Bash with unconventional pathnames

#10 Post by Head_on_a_Stick »

THBlack wrote:Does such confusion not overwhelm you when you write 1500-line Bash scripts?
I would *never* write a 1500 line bash script. The "language" is unbearably slow, massively bloated and egregiously buggy. POSIX sh is better but only just. Use another language.

EDIT: I should probably note that I'm useless at scripting, hence my lack of concrete advice.

THBlack
Posts: 5
Joined: 2021-02-08 20:45

Re: Practicality of Bash with unconventional pathnames

#11 Post by THBlack »

LE_746F6D617A7A69 wrote:Fighting with the effects of a problem is a waste of time.
I see.

If there is a story or anecdote you would like to tell about this, I'd be interested to read it.

LE_746F6D617A7A69
Posts: 932
Joined: 2020-05-03 14:16
Has thanked: 7 times
Been thanked: 65 times

Re: Practicality of Bash with unconventional pathnames

#12 Post by LE_746F6D617A7A69 »

THBlack wrote:
LE_746F6D617A7A69 wrote:Fighting with the effects of a problem is a waste of time.
I see.

If there is a story or anecdote you would like to tell about this, I'd be interested to read it.
Anecdote:
A man who decided to work for a helpdesk is now trying to prove that he is worth much more than is stated in his contract - in this way he can feel like "a more valuable person" - of course this is nothing but pure illusion...
