Code: Select all
$ cat file.txt
word4 word2
word4 word2
word3 word1
word1 word3
$
Code: Select all
word1
word2
word3
word4
Code: Select all
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt
word4
word2
word3
word1
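For anyone puzzled by the `!duplicate[$i]++` part: the array entry starts at 0, so the negation is true only the first time a word shows up, and the post-increment then marks it as seen. Here is the same logic spread out with comments, as a rough sketch. It uses plain `awk` instead of `mawk`, a throwaway temp file instead of `/tmp/file.txt`, and the array name `seen`; all three are my own choices, not from the post above.

```shell
# Recreate the sample file from the first post (temporary path is an assumption).
tmp=$(mktemp)
printf '%s\n' 'word4 word2' 'word4 word2' 'word3 word1' 'word1 word3' > "$tmp"

# Same logic as the one-liner, spread out for readability.
unique_words=$(awk '
{
    for (i = 1; i <= NF; ++i) {   # visit every whitespace-separated field
        if (!seen[$i]++) {        # the count is 0 only on the first occurrence
            print $i              # so each word is printed exactly once
        }
    }
}' "$tmp")

printf '%s\n' "$unique_words"
rm -f "$tmp"
```

The words come out in first-seen order, which is why the thread pipes the result through `sort` afterwards.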
Code: Select all
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt | sort
word1
word2
word3
word4
Code: Select all
$> cat /tmp/file.txt
word4 word2
word4 word2 word5
word3 word1
word1 word3 word6 word7
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt | sort
word1
word2
word3
word4
word5
word6
word7
Code: Select all
$> mawk '{for (i=1;i<=NF;++i){print $i}}' /tmp/file.txt | sort -u
word1
word2
word3
word4
Code: Select all
arto@dell:~$ awk -v RS=" " '{print}' file.txt | sort -u
word1
word2
word3
word4
arto@dell:~$
arzgi wrote: 2024-09-22 10:23
A shorter one line:
Code: Select all
arto@dell:~$ awk -v RS=" " '{print}' file.txt | sort -u
Yes, this is another solution, but the result may be unexpected if there are tabs:
Code: Select all
$> cat --show-tabs /tmp/file.txt
word4 word2
word4 word2
word3 word1
word1 word3
$> cat --show-tabs /tmp/file2.txt
word4^Iword2
word4^Iword2
word3^Iword1
word1^Iword3
$> mawk 'BEGIN {RS=" "} {print}' /tmp/file.txt | sort -u
word1
word2
word3
word4
$> mawk 'BEGIN {RS=" "} {print}' /tmp/file2.txt | sort -u
word1 word3
word3 word1
word4 word2
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file2.txt | sort
word1
word2
word3
word4
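The tab case fails because `RS=" "` ends a record only at a literal space, so a file without spaces is read as one single record. If you want to keep the short `RS` style, one possible tweak is to make `RS` a regular expression covering spaces, tabs and newlines. This is only a sketch: it assumes an awk that accepts a regex as `RS` (gawk and mawk do, but POSIX leaves a multi-character `RS` unspecified), and the temp file stands in for `/tmp/file2.txt`.

```shell
# Tab-separated sample, standing in for /tmp/file2.txt (temporary path is an assumption).
tmp=$(mktemp)
printf 'word4\tword2\nword4\tword2\nword3\tword1\nword1\tword3\n' > "$tmp"

# RS is a regex here: any run of spaces, tabs or newlines ends a record.
# The bare NF pattern prints only non-empty records.
words=$(awk -v RS='[ \t\n]+' 'NF' "$tmp" | sort -u)

printf '%s\n' "$words"
rm -f "$tmp"
```

The field loop shown above stays the more portable choice, since it relies only on default field splitting.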
Code: Select all
$> mawk '{for (i=1;i<=NF;++i){if (!duplicate[$i]++){print $i}}}' /tmp/file.txt | wc -l
4
$> mawk 'BEGIN {RS=" "} {print}' /tmp/file.txt | sort -u | wc -l
5
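The fifth line is most likely an empty one. With `RS=" "` the last record is `word3` plus the file's trailing newline, and `print` adds one more newline after it, so `sort -u` keeps one blank line in addition to the four words. A quick way to check, assuming the file ends with a newline (plain `awk` and a temp file are used here instead of `mawk` and `/tmp/file.txt`):

```shell
# Recreate the sample file (temporary path is an assumption).
tmp=$(mktemp)
printf '%s\n' 'word4 word2' 'word4 word2' 'word3 word1' 'word1 word3' > "$tmp"

# Count the empty lines produced by the RS=" " variant.
empty_lines=$(awk -v RS=" " '{print}' "$tmp" | grep -c '^$')

# Number of distinct lines, as in the post above (tr strips wc padding on some systems).
distinct_lines=$(awk -v RS=" " '{print}' "$tmp" | sort -u | wc -l | tr -d ' ')

printf 'empty: %s, distinct: %s\n' "$empty_lines" "$distinct_lines"
rm -f "$tmp"
```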
Nice. It can even handle tabs (edit: not perfect when there are multiple consecutive spaces):
Code: Select all
$> tr " \t" "\n" < /tmp/file2.txt | sort -u
word1
word2
word3
word4
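For the multiple-spaces caveat, `tr -s` might be enough: it squeezes each run of the translated character into a single one, so repeated blanks no longer turn into empty lines. A small sketch with made-up input (the sample line contents and the temp file are mine, not from the thread):

```shell
# Sample with repeated spaces and repeated tabs (made up for illustration).
tmp=$(mktemp)
printf 'word4   word2\nword3\t\tword1\n' > "$tmp"

# Translate spaces and tabs to newlines, then squeeze runs of newlines into one.
squeezed=$(tr -s ' \t' '\n\n' < "$tmp" | sort -u)

printf '%s\n' "$squeezed"
rm -f "$tmp"
```

A file that starts with blanks would still yield one empty line at the top, so this is not a complete fix either.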
Code: Select all
$> grep -o "[^[:blank:]]\+" /tmp/file2.txt | sort -u
word1
word2
word3
word4
Code: Select all
$> sed -E 's/([^[:blank:]]+)([[:blank:]]+)/\1\n/g' /tmp/file2.txt | sort -u
word1
word2
word3
word4
Beat me to it