Note that comm takes two sorted files as input. Here are our sample input files:
$ cat A.txt
apple
orange
gold
silver
steel
iron
$ cat B.txt
orange
gold
cookies
carrot
$ sort A.txt -o A.txt ; sort B.txt -o B.txt
- First, execute comm without any options:
$ comm A.txt B.txt
apple
        carrot
        cookies
                gold
iron
                orange
silver
steel
The first column of the output contains lines that are only in A.txt. The second column contains lines that are only in B.txt. The third column contains the lines common to A.txt and B.txt. The columns are delimited by the tab (\t) character, so a line in the second column is preceded by one tab and a line in the third column by two.
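As a rough illustration of the tab-delimited layout described above, the following sketch counts how many lines fall into each column. It assumes the same sample data as A.txt and B.txt, recreated in a scratch directory (the directory and the variable name counts are hypothetical), and it assumes no input line is empty or begins with a tab.

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
printf '%s\n' apple orange gold silver steel iron > "$tmpdir/A.txt"
printf '%s\n' orange gold cookies carrot > "$tmpdir/B.txt"

# comm expects sorted input, so sort each file in place.
sort -o "$tmpdir/A.txt" "$tmpdir/A.txt"
sort -o "$tmpdir/B.txt" "$tmpdir/B.txt"

# Split each output line on tabs: text in field 1 means column 1,
# text in field 2 means column 2, anything else is column 3.
counts=$(comm "$tmpdir/A.txt" "$tmpdir/B.txt" |
  awk -F'\t' '{ if ($1 != "") a++; else if ($2 != "") b++; else c++ }
              END { print a+0, b+0, c+0 }')
echo "$counts"

rm -rf "$tmpdir"
```

The three numbers printed are the counts of lines only in A.txt, only in B.txt, and common to both.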
- In order to print the intersection of two files, we need to remove the first and second columns and print the third column. The -1 option removes the first column, and the -2 option removes the second column, leaving the third column:
$ comm A.txt B.txt -1 -2
gold
orange
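The intersection step can also be wrapped in a small script. This is a minimal sketch, assuming the same sample data recreated in a scratch directory; the variable name intersection is hypothetical. It writes the options before the file names and combines them as -12, which is equivalent to -1 -2.

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
printf '%s\n' apple orange gold silver steel iron > "$tmpdir/A.txt"
printf '%s\n' orange gold cookies carrot > "$tmpdir/B.txt"

# comm expects sorted input, so sort each file in place.
sort -o "$tmpdir/A.txt" "$tmpdir/A.txt"
sort -o "$tmpdir/B.txt" "$tmpdir/B.txt"

# Suppress columns 1 and 2, leaving only the lines common to both files.
intersection=$(comm -12 "$tmpdir/A.txt" "$tmpdir/B.txt")
printf '%s\n' "$intersection"

rm -rf "$tmpdir"
```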
- Print only the lines that are not common to both files by removing column 3:
$ comm A.txt B.txt -3
apple
        carrot
        cookies
iron
silver
steel
This output uses two columns, padded with blanks, to show the lines unique to A.txt and B.txt. We can make this more readable as a list of unique lines by merging the two columns into one, like this:
apple
carrot
cookies
iron
silver
steel
- The lines can be merged by removing the tab characters with tr (discussed in Chapter 2, Have a Good Command):
$ comm A.txt B.txt -3 | tr -d '\t'
apple
carrot
cookies
iron
silver
steel
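Note that tr -d '\t' deletes every tab, including any that appear inside the data itself. If the input lines might contain tabs, one possible variation is to strip only the single leading tab that comm adds to column 2. The sketch below assumes the same sample data in a scratch directory; the variable names tab and merged are hypothetical.

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
printf '%s\n' apple orange gold silver steel iron > "$tmpdir/A.txt"
printf '%s\n' orange gold cookies carrot > "$tmpdir/B.txt"

# comm expects sorted input, so sort each file in place.
sort -o "$tmpdir/A.txt" "$tmpdir/A.txt"
sort -o "$tmpdir/B.txt" "$tmpdir/B.txt"

# Hold a literal tab in a variable so the sed expression stays portable.
tab=$(printf '\t')

# With -3, lines unique to B.txt carry exactly one leading tab;
# remove only that tab and leave the rest of each line untouched.
merged=$(comm -3 "$tmpdir/A.txt" "$tmpdir/B.txt" | sed "s/^${tab}//")
printf '%s\n' "$merged"

rm -rf "$tmpdir"
```

For the sample files this prints the same six lines as the tr version, since none of them contain embedded tabs.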
- By removing the unnecessary columns, we can produce the set difference for A.txt and B.txt, as follows:
- Set difference for A.txt:
$ comm A.txt B.txt -2 -3
-2 -3 removes the second and third columns
- Set difference for B.txt:
$ comm A.txt B.txt -1 -3
-1 -3 removes the first and third columns
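The two set differences can be sketched together as follows, assuming the same sample data recreated in a scratch directory. The variable names only_a and only_b are hypothetical, and the combined flags -23 and -13 are equivalent to -2 -3 and -1 -3.

```shell
#!/bin/sh
# Recreate the sample files in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
printf '%s\n' apple orange gold silver steel iron > "$tmpdir/A.txt"
printf '%s\n' orange gold cookies carrot > "$tmpdir/B.txt"

# comm expects sorted input, so sort each file in place.
sort -o "$tmpdir/A.txt" "$tmpdir/A.txt"
sort -o "$tmpdir/B.txt" "$tmpdir/B.txt"

# Lines only in A.txt: suppress columns 2 and 3.
only_a=$(comm -23 "$tmpdir/A.txt" "$tmpdir/B.txt")
# Lines only in B.txt: suppress columns 1 and 3.
only_b=$(comm -13 "$tmpdir/A.txt" "$tmpdir/B.txt")

echo "Only in A.txt:"
printf '%s\n' "$only_a"
echo "Only in B.txt:"
printf '%s\n' "$only_b"

rm -rf "$tmpdir"
```

Because only one column remains in each case, the output has no leading tabs and needs no further cleanup.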