25. Bash Shell - Text Processing: uniq, comm
We can use the uniq and comm commands provided by Linux to deduplicate text or to compare the contents of text files.
Let’s begin the second part of the intermediate level of text processing.
Unique Result
Let’s prepare some repeated and unique contents for the uniq command:
Prepared some repeated and unique contents
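The original file contents are not reproduced in the text, so the following is only a minimal sketch with assumed sample data. The words in the file and the `workdir` scratch directory are hypothetical; only the file name `uniq_file.txt` comes from the captions.

```shell
# Hypothetical sample data: these words are assumptions, not the original file.
workdir=$(mktemp -d)   # scratch directory, so nothing in the current directory is overwritten

# Adjacent repeats (apple, cherry), a case-only variant (banana/Banana), and a unique line (date).
printf '%s\n' apple apple banana Banana cherry cherry cherry date > "$workdir/uniq_file.txt"

cat "$workdir/uniq_file.txt"
```

The repeated lines are placed next to each other on purpose, because uniq only compares adjacent lines.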
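By default, uniq collapses each run of adjacent identical lines into a single line, so consecutive repeats are omitted from the output. Duplicates that are not adjacent are kept unless the input is sorted first: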
Output everything except consecutive repeated lines
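The default behaviour could be tried as follows. This is a sketch with assumed sample data (the words and the `workdir` variable are hypothetical), not the original example.

```shell
# Hypothetical sample data: these words are assumptions, not the original file.
workdir=$(mktemp -d)
printf '%s\n' apple apple banana Banana cherry cherry cherry date > "$workdir/uniq_file.txt"

# Each run of adjacent identical lines is printed once.
uniq "$workdir/uniq_file.txt"
# Expected output:
# apple
# banana
# Banana
# cherry
# date
```

Note that banana and Banana both survive, because the default comparison is case-sensitive.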
Repeated Contents
We can add the -d or --repeated parameter to output only the repeated lines, one copy per group of adjacent duplicates:
Output uniq_file.txt file's repeated contents
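As a rough illustration of -d, assuming the same hypothetical sample data as a stand-in for the original file (the `workdir` variable is also an assumption):

```shell
# Hypothetical sample data: these words are assumptions, not the original file.
workdir=$(mktemp -d)
printf '%s\n' apple apple banana Banana cherry cherry cherry date > "$workdir/uniq_file.txt"

# -d prints one copy of every line that occurs more than once in a row.
uniq -d "$workdir/uniq_file.txt"
# Expected output:
# apple
# cherry
```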
Unique Contents
We can add the -u or --unique parameter to output only the lines that are not repeated:
Output uniq_file.txt file's unique contents
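A small sketch of -u, again with assumed sample data (the words and `workdir` are hypothetical):

```shell
# Hypothetical sample data: these words are assumptions, not the original file.
workdir=$(mktemp -d)
printf '%s\n' apple apple banana Banana cherry cherry cherry date > "$workdir/uniq_file.txt"

# -u prints only the lines that have no adjacent duplicate.
uniq -u "$workdir/uniq_file.txt"
# Expected output:
# banana
# Banana
# date
```

Here -u is the complement of -d: every line of the default uniq output appears in exactly one of the two results.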
Repeated Contents Without Case Sensitivity
We can combine the -d or --repeated parameter with the -i or --ignore-case parameter to output the file’s repeated contents without case sensitivity:
Output uniq_file.txt file's repeated contents without case sensitivity
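Combining the two parameters might look like the sketch below. The sample words and `workdir` are assumptions, and the expected output reflects GNU uniq, which prints the first line of each group.

```shell
# Hypothetical sample data: these words are assumptions, not the original file.
workdir=$(mktemp -d)
printf '%s\n' apple apple banana Banana cherry cherry cherry date > "$workdir/uniq_file.txt"

# -i makes banana and Banana compare equal, so -d now reports them as a repeated group.
uniq -d -i "$workdir/uniq_file.txt"
# Expected output with GNU uniq (first line of each group is shown):
# apple
# banana
# cherry
```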
Contents With Their Occurrence Numbers
We can add the -c or --count parameter to prefix each output line with the number of times it occurred:
Output uniq_file.txt file's contents with their occurrence numbers
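The counting behaviour can be sketched like this, with assumed sample data (the words and `workdir` are hypothetical). The width of the padding before each count depends on the uniq implementation.

```shell
# Hypothetical sample data: these words are assumptions, not the original file.
workdir=$(mktemp -d)
printf '%s\n' apple apple banana Banana cherry cherry cherry date > "$workdir/uniq_file.txt"

# -c prefixes every output line with the length of its run of adjacent duplicates.
uniq -c "$workdir/uniq_file.txt"
# Expected output (leading padding may vary):
#       2 apple
#       1 banana
#       1 Banana
#       3 cherry
#       1 date
```

A common follow-up is to pipe the result through `sort -rn` to rank lines by frequency.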
Compared Result
We need to prepare two files to demonstrate the comm command:
Prints contents to file1 and file2
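The contents of the two files are not reproduced in the text, so this is a minimal sketch with assumed data. The words and the `workdir` scratch directory are hypothetical; the names `file1` and `file2` come from the captions.

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)   # scratch directory, so nothing in the current directory is overwritten

# comm expects sorted input, so both files are written in sorted order.
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

cat "$workdir/file1" "$workdir/file2"
```

With this data, apple exists only in file1, date only in file2, and banana and cherry in both.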
By default, comm outputs three columns of data: the first column contains the lines unique to the first file, the second column contains the lines unique to the second file, and the third column contains the lines that exist in both files. Both input files must be sorted:
Compare file1 with file2
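The three-column layout is easier to see with a concrete run. The sketch below uses assumed sample data (the words and `workdir` are hypothetical); columns are separated by tab characters.

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

comm "$workdir/file1" "$workdir/file2"
# Expected output (<TAB> stands for one tab character):
# apple
# <TAB><TAB>banana
# <TAB><TAB>cherry
# <TAB>date
```

A line with no indentation belongs to column one, one tab means column two, and two tabs mean column three.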
Hide First Column
We can hide the first column with the -1 parameter:
Hide first column
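A sketch of -1 with assumed sample data (the words and `workdir` are hypothetical):

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

# -1 drops the lines unique to file1; the remaining columns shift left by one tab.
comm -1 "$workdir/file1" "$workdir/file2"
# Expected output (<TAB> stands for one tab character):
# <TAB>banana
# <TAB>cherry
# date
```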
Hide Second Column
We can hide the second column with the -2 parameter:
Hide second column
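A sketch of -2 with assumed sample data (the words and `workdir` are hypothetical):

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

# -2 drops the lines unique to file2, leaving file1-only lines and common lines.
comm -2 "$workdir/file1" "$workdir/file2"
# Expected output (<TAB> stands for one tab character):
# apple
# <TAB>banana
# <TAB>cherry
```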
Hide Third Column
We can hide the third column with the -3 parameter:
Hide third column
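A sketch of -3 with assumed sample data (the words and `workdir` are hypothetical):

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

# -3 drops the common lines, leaving only what differs between the two files.
comm -3 "$workdir/file1" "$workdir/file2"
# Expected output (<TAB> stands for one tab character):
# apple
# <TAB>date
```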
Show First Column
We can show only the first column by hiding the second and third columns with the -23 parameter:
Show first column
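Showing only the lines unique to the first file might look like this, with assumed sample data (the words and `workdir` are hypothetical):

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

# -23 hides columns two and three, leaving the lines found only in file1.
comm -23 "$workdir/file1" "$workdir/file2"
# Expected output:
# apple
```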
Show Second Column
We can show only the second column by hiding the first and third columns with the -13 parameter:
Show second column
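Showing only the lines unique to the second file, sketched with assumed sample data (the words and `workdir` are hypothetical):

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

# -13 hides columns one and three, leaving the lines found only in file2.
comm -13 "$workdir/file1" "$workdir/file2"
# Expected output:
# date
```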
Show Third Column
We can show only the third column by hiding the first and second columns with the -12 parameter:
Show third column
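Showing only the lines common to both files, which amounts to an intersection of two sorted files, could be sketched as follows. The sample words and `workdir` are assumptions.

```shell
# Hypothetical sample data: these words are assumptions, not the original files.
workdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$workdir/file1"
printf '%s\n' banana cherry date > "$workdir/file2"

# -12 hides columns one and two, leaving the lines present in both files.
comm -12 "$workdir/file1" "$workdir/file2"
# Expected output:
# banana
# cherry
```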
References 7.3 uniq: Uniquify files, 7.4 comm: Compare two sorted files line by line
Author Dong Chen
LastMod Tue Feb 26 2019