I'm looking for an explanation of how sorting behaves in Debian, and probably in many other GNU tools as well (unfortunately I don't have access to other Unix systems with the pl_PL.UTF-8 locale). Let's say we have a file tosort.txt containing:
atx001b.jpg
atx001l.jpg
atx001k.jpg
atx001j.jpg
atx001m.jpg
atx001h.jpg
atx001.jpg
atx001i.jpg
atx001чернее.jpg
atx001z.jpg
Sorting it under three different LANG settings gives:
user@m4800:/tmp$ echo $LANG
pl_PL.UTF-8
user@m4800:/tmp$ sort tosort.txt
atx001b.jpg
atx001h.jpg
atx001i.jpg
atx001j.jpg
atx001.jpg
atx001k.jpg
atx001l.jpg
atx001m.jpg
atx001z.jpg
atx001чернее.jpg
user@m4800:/tmp$ LANG=C sort tosort.txt
atx001.jpg
atx001b.jpg
atx001h.jpg
atx001i.jpg
atx001j.jpg
atx001k.jpg
atx001l.jpg
atx001m.jpg
atx001z.jpg
atx001чернее.jpg
user@m4800:/tmp$ LANG=en_US.UTF-8 sort tosort.txt
atx001.jpg
atx001b.jpg
atx001h.jpg
atx001i.jpg
atx001j.jpg
atx001k.jpg
atx001l.jpg
atx001m.jpg
atx001z.jpg
atx001чернее.jpg
user@m4800:/tmp$
Is this a bug or specified behavior?
P.S. This is not specific to sort; it "leaks" into many other programs. It can be seen, for example, in GTK file dialogs (especially the hints) and in some file managers, so it may be a characteristic of some library function rather than of sort itself.
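For anyone who wants to reproduce this, here is a minimal sketch, assuming a POSIX shell and a sort that takes its collation order from the locale environment. The scratch directory and the variable names (workdir, sorted_c) are my own choices for illustration. It uses LC_ALL rather than LANG for the forced run, because LC_ALL takes precedence over LANG and over any LC_COLLATE already set in the environment. Only the C-locale result is predictable; the output of the second sort depends on which locales are installed.

```shell
# Recreate the sample file in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
printf '%s\n' \
    atx001b.jpg atx001l.jpg atx001k.jpg atx001j.jpg atx001m.jpg \
    atx001h.jpg atx001.jpg atx001i.jpg atx001чернее.jpg atx001z.jpg \
    > "$workdir/tosort.txt"

# Force plain byte-order comparison. LC_ALL overrides LANG and LC_COLLATE,
# so this does not depend on what the session already has set.
sorted_c=$(LC_ALL=C sort "$workdir/tosort.txt")
printf '%s\n' "$sorted_c"

# For comparison: sort with whatever locale the session currently uses.
# The order printed here varies from system to system.
echo "--- current locale ---"
sort "$workdir/tosort.txt"
```

In the C locale the comparison is byte by byte, so "." (0x2E) sorts before every lowercase letter and atx001.jpg comes first, while the UTF-8 bytes of the Cyrillic name sort after "z".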