unique sorted search from text file

MakeTopSite · #1 Post by **MakeTopSite** » 2024-09-22 09:33

$ cat file.txt
word4 word2
word4 word2
word3 word1
word1 word3
$

How can I get this output from file.txt, please?

Code: Select all

word1
word2
word3
word4

fabien · #2 Post by **fabien** » 2024-09-22 10:08

Code: Select all

$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt
word4
word2
word3
word1

Sorting inside mawk is possible, but it takes hand-written code that is probably slower than piping the output to sort. gawk has built-in sort functions (asort and asorti).

Code: Select all

$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt | sort
word1
word2
word3
word4
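
To illustrate the remark about sorting inside awk, here is a minimal sketch that deduplicates and sorts in a single awk program with a hand-written insertion sort. The function name unique_sorted_awk is made up for this example, and only portable awk features are assumed. It is meant to show why the pipe to sort is the simpler choice, not to replace it; awk's string comparison may also order words differently from sort in some locales.

```shell
# unique_sorted_awk is a hypothetical helper name, not part of any tool.
# Collects each word once, then sorts the list inside awk (insertion sort).
unique_sorted_awk() {
    awk '
    {
        for (i = 1; i <= NF; ++i)
            if (!seen[$i]++)
                words[++n] = $i            # keep the first occurrence only
    }
    END {
        for (i = 2; i <= n; ++i) {
            key = words[i]
            # appending "" forces a string comparison, even for numeric-looking words
            for (j = i - 1; j >= 1 && (words[j] "") > (key ""); --j)
                words[j + 1] = words[j]
            words[j + 1] = key
        }
        for (i = 1; i <= n; ++i)
            print words[i]
    }' "$@"
}
```

Usage would be unique_sorted_awk /tmp/file.txt, with no sort in the pipeline.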

Works for any number of words per line:

Code: Select all

$> cat /tmp/file.txt
word4 word2
word4 word2 word5
word3 word1
word1 word3 word6 word7
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt | sort
word1
word2
word3
word4
word5
word6
word7

Using sort -u:

Code: Select all

$> mawk '{for (i=1;i<=NF;++i){print $i}}' /tmp/file.txt | sort -u
word1
word2
word3
word4

arzgi · #3 Post by **arzgi** » 2024-09-22 10:23

A shorter one-liner:

Code: Select all

arto@dell:~$ awk -v RS=" " '{print}' file.txt | sort -u

word1
word2
word3
word4
arto@dell:~$

fabien · #4 Post by **fabien** » 2024-09-22 11:05

arzgi wrote: 2024-09-22 10:23 A shorter one-liner:
Code: Select all
arto@dell:~$ awk -v RS=" " '{print}' file.txt | sort -u

Yes, this is another solution, but the result may be unexpected if there are tabs:

Code: Select all

$> cat --show-tabs /tmp/file.txt 
word4 word2
word4 word2
word3 word1
word1 word3
$> cat --show-tabs /tmp/file2.txt 
word4^Iword2
word4^Iword2
word3^Iword1
word1^Iword3
$> mawk 'BEGIN {RS=" "} {print}' /tmp/file.txt | sort -u

word1
word2
word3
word4
$> mawk 'BEGIN {RS=" "} {print}' /tmp/file2.txt | sort -u

word1	word3
word3	word1
word4	word2
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file2.txt | sort
word1
word2
word3
word4

This also adds an empty line to the result (the newline is no longer a record separator, so the file's final newline stays inside the last record and print appends another one):

Code: Select all

$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt | wc -l
4
$> mawk 'BEGIN {RS=" "} {print}' /tmp/file.txt | sort -u | wc -l
5
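
One possible way around both the tab problem and the extra empty line is to give awk a regular expression as RS, so that any run of spaces, tabs or newlines ends a record. This is only a sketch under assumptions: the function name split_words_rs is invented here, and a regex RS is an extension (gawk and mawk accept it, while POSIX leaves a multi-character RS unspecified).

```shell
# split_words_rs is a hypothetical helper name.
# Assumes an awk that treats a multi-character RS as a regular expression
# (gawk and mawk do; POSIX leaves this unspecified).
split_words_rs() {
    # every run of spaces, tabs or newlines separates two records;
    # the pattern drops an empty record caused by leading whitespace
    awk -v RS='[ \t\n]+' '$0 != ""' "$@" | sort -u
}
```

Called as split_words_rs /tmp/file2.txt, this should list each word once, without the empty line.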

lindi · #5 Post by **lindi** » 2024-09-22 12:21

Code: Select all

tr " " "\n" < file.txt | sort -u

fabien · #6 Post by **fabien** » 2024-09-22 13:22

lindi wrote: 2024-09-22 12:21
Code: Select all
tr " " "\n" < file.txt | sort -u

Nice. It can even handle tabs (edit: not perfect when there are multiple consecutive spaces, which produce empty lines):

Code: Select all

$> tr " \t" "\n" < /tmp/file2.txt | sort -u
word1
word2
word3
word4
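
For the multiple-spaces case, a possible variant is to let tr squeeze the repeated newlines with -s. This is a sketch, not a tested-everywhere recipe: the function name words_tr is made up, and both sets are written with the same length so the result does not depend on how a given tr pads a shorter second set.

```shell
# words_tr is a hypothetical helper name.
# Space and tab are both mapped to newline; -s then squeezes runs of
# newlines, so repeated blanks no longer produce empty lines.
words_tr() {
    tr -s ' \t' '\n\n' | sort -u
}
```

Used as words_tr < /tmp/file2.txt. One remaining caveat: blanks at the very start of the input still yield a single empty line.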

I just thought of this one:

Code: Select all

$> grep -o "[^[:blank:]]\+" /tmp/file2.txt | sort -u
word1
word2
word3
word4

Also, with sed (edit: not perfect when a line ends with spaces):

Code: Select all

$> sed -E 's/([^[:blank:]]+)([[:blank:]]+)/\1\n/g' /tmp/file2.txt | sort -u
word1
word2
word3
word4
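
A possible fix for the trailing-spaces case is to trim each line before splitting it. This is a sketch that assumes GNU sed, like the command above (-E and \n in the replacement); the function name words_sed is invented for the example.

```shell
# words_sed is a hypothetical helper name.
# Assumes GNU sed (-E, and \n in the replacement text).
# Steps: trim leading blanks, trim trailing blanks, drop lines that are
# now empty, then turn each remaining run of blanks into a newline.
words_sed() {
    sed -E 's/^[[:blank:]]+//; s/[[:blank:]]+$//; /^$/d; s/[[:blank:]]+/\n/g' "$@" | sort -u
}
```

Called as words_sed /tmp/file2.txt.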

m4c-attack · #7 Post by **m4c-attack** » 2024-10-07 00:19

lindi wrote: 2024-09-22 12:21
Code: Select all
tr " " "\n" < file.txt | sort -u

Beat me to it

(I didn't know sort had a unique flag, though, and unnecessarily piped into uniq in my solution.)
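
For comparison, a small sketch of the two variants side by side. The sample data is just the file from the first post stored in a variable (the variable names are made up). When no sort keys are involved, both pipelines should give the same result.

```shell
# Hypothetical sample data: the contents of file.txt from the first post.
sample='word4 word2
word4 word2
word3 word1
word1 word3'

# uniq only removes adjacent duplicates, so the input has to be sorted first.
with_uniq=$(printf '%s\n' "$sample" | tr ' ' '\n' | sort | uniq)

# sort -u sorts and drops duplicates in a single step.
with_sort_u=$(printf '%s\n' "$sample" | tr ' ' '\n' | sort -u)

[ "$with_uniq" = "$with_sort_u" ] && echo "same output"
```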

Debian User Forums
