uniq Command in Linux: Remove and Count Duplicate Lines

uniq is a command-line utility that filters adjacent duplicate lines from sorted input and writes the result to standard output. It is most commonly used together with sort to remove or count duplicates across an entire file.

This guide explains how to use the uniq command with practical examples.

uniq Command Syntax

The syntax for the uniq command is as follows:

txt
uniq [OPTIONS] [INPUT [OUTPUT]]

If no INPUT file is specified, uniq reads from standard input. The optional OUTPUT argument writes the result to a file instead of standard output.
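For example, passing both positional arguments makes uniq read one file and write the filtered result to another (the filenames here are hypothetical):

```shell
# Create a small sample input file.
printf 'red\nred\nblue\n' > names.txt

# INPUT is names.txt, OUTPUT is unique.txt; nothing is printed to the terminal.
uniq names.txt unique.txt

cat unique.txt
# red
# blue
```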

Removing Duplicate Lines

By default, uniq removes adjacent duplicate lines, keeping one copy of each. Because it only compares consecutive lines, the input must be sorted first; otherwise, only back-to-back duplicates are removed.

Given a file with repeated entries:

fruits.txt
apple
apple
banana
cherry
banana
cherry
cherry

Running uniq on the unsorted file collapses only the back-to-back duplicates (the repeated apple at the top and the repeated cherry at the end), leaving the non-adjacent second banana and second cherry intact:


Terminal
uniq fruits.txt

output
apple
banana
cherry
banana
cherry

To remove all duplicates regardless of position, sort the file first:


Terminal
sort fruits.txt | uniq

output
apple
banana
cherry

Counting Occurrences

To prefix each output line with the number of times it appears, use the -c (--count) option:


Terminal
sort fruits.txt | uniq -c

output
      2 apple
      2 banana
      3 cherry

The count and the line are separated by whitespace. This is especially useful for finding the most frequent lines in a file. To rank them from most to least common, pipe the output back to sort:


Terminal
sort fruits.txt | uniq -c | sort -rn

output
      3 cherry
      2 banana
      2 apple

Showing Only Duplicate Lines

To print only lines that appear more than once (one copy per group), use the -d (--repeated) option:


Terminal
sort fruits.txt | uniq -d

output
apple
banana
cherry

To print every instance of a repeated line rather than just one, use -D:


Terminal
sort fruits.txt | uniq -D

output
apple
apple
banana
banana
cherry
cherry
cherry

Showing Only Unique Lines

To print only lines that appear exactly once (no duplicates), use the -u (--unique) option:


Terminal
sort fruits.txt | uniq -u

If every line appears more than once, the output is empty. In the example above, all three fruits are duplicated, so nothing is printed.
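To see -u produce output, the data needs at least one line that occurs exactly once. A quick check with hypothetical inline data, where durian appears only once:

```shell
# apple and banana each appear twice; durian appears once.
printf 'apple\napple\nbanana\nbanana\ndurian\n' | sort | uniq -u
# durian
```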

Ignoring Case

By default, uniq comparisons are case-sensitive, so Apple and apple are treated as different lines. To compare lines case-insensitively, use the -i (--ignore-case) option:


Terminal
sort -f file.txt | uniq -i
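As a quick illustration with hypothetical inline data (LC_ALL=C is pinned only to make the sort order reproducible across locales):

```shell
# Apple and apple fold to the same key; sort -f makes them adjacent,
# and uniq -i merges them, leaving two lines: one apple variant and banana.
printf 'apple\nApple\nbanana\n' | LC_ALL=C sort -f | uniq -i
```

Without sort -f, a case-insensitive sort keyed differently could leave Apple and apple non-adjacent, and uniq -i would then miss them.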

Skipping Fields and Characters

uniq can be told to skip leading fields or characters before comparing lines.

To skip the first N whitespace-separated fields, use -f N (--skip-fields=N). This is useful when lines share a common prefix (such as a timestamp) that should not be part of the comparison:

log.txt
2026-01-01 ERROR disk full
2026-01-02 ERROR disk full
2026-01-03 WARNING low memory


Terminal
uniq -f 1 log.txt

output
2026-01-01 ERROR disk full
2026-01-03 WARNING low memory

The first field (the date) is skipped, so the two ERROR disk full lines are treated as duplicates.

To skip the first N characters instead of fields, use -s N (--skip-chars=N). To limit the comparison to the first N characters per line, use -w N (--check-chars=N).
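Both character-based options can be demonstrated with small hypothetical inputs:

```shell
# -w 3: compare only the first three characters per line.
# "err-disk" and "err-net" share the prefix "err", so they count as duplicates.
printf 'err-disk\nerr-net\nwarn-mem\n' | uniq -w 3
# err-disk
# warn-mem

# -s 4: skip the first four characters (here, a numeric prefix and a space).
# "001 ok" and "002 ok" compare equal once the prefix is skipped.
printf '001 ok\n002 ok\n003 fail\n' | uniq -s 4
# 001 ok
# 003 fail
```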

Combining uniq with Other Commands

uniq works well in pipelines with grep, cut, sort, and wc.

To count the number of unique lines in a file:


Terminal
sort file.txt | uniq | wc -l

To find the top 10 most common words in a text file:


Terminal
grep -Eo '[[:alnum:]]+' file.txt | sort | uniq -c | sort -rn | head -10

To list unique IP addresses from an access log:


Terminal
awk '{print $1}' /var/log/nginx/access.log | sort | uniq

To find lines that appear in one file but not another (using -u against merged sorted files):


Terminal
sort file1.txt file2.txt | uniq -u

Quick Reference

Command                               Description
sort file.txt | uniq                  Remove all duplicate lines
sort file.txt | uniq -c               Count occurrences of each line
sort file.txt | uniq -c | sort -rn    Rank lines by frequency
sort file.txt | uniq -d               Show only duplicate lines (one per group)
sort file.txt | uniq -D               Show all instances of duplicate lines
sort file.txt | uniq -u               Show only lines that appear exactly once
uniq -i file.txt                      Compare lines case-insensitively
uniq -f 2 file.txt                    Skip the first 2 fields when comparing
uniq -s 5 file.txt                    Skip the first 5 characters when comparing
uniq -w 10 file.txt                   Compare only the first 10 characters

Troubleshooting

Duplicates are not removed
uniq only removes adjacent duplicate lines. If the file is not sorted, non-consecutive duplicates are not detected. Always sort the input first: sort file.txt | uniq.

-c output has inconsistent spacing
The count is right-aligned and padded with spaces. If you need to process the output further, use awk to normalize spacing while preserving the full line text: sort file.txt | uniq -c | awk '{count=$1; $1=""; sub(/^ +/, ""); print count, $0}'.

Case variants are treated as different lines
Use the -i option to compare case-insensitively. Also sort with -f so that case-insensitive duplicates are adjacent before uniq processes them: sort -f file.txt | uniq -i.

FAQ

What is the difference between sort -u and sort | uniq?
Both produce the same output for simple deduplication. sort -u is slightly more efficient. sort | uniq is more flexible because uniq supports options like -c (count occurrences) and -d (show only duplicates) that sort -u does not.

Does uniq modify the input file?
No. uniq reads from a file or standard input and writes to standard output. The original file is never modified. To save the result, use redirection: sort file.txt | uniq > deduped.txt.

How do I count unique lines in a file?
Pipe through sort, uniq, and wc: sort file.txt | uniq | wc -l.

How do I find lines that only appear in one of two files?
Merge both files and use uniq -u: sort file1.txt file2.txt | uniq -u. Lines shared by both files become duplicates in the merged sorted stream and are suppressed. Lines that exist in only one file remain unique and are printed.

Can uniq work on columns instead of full lines?
Yes. Use -f N to skip the first N fields, -s N to skip the first N characters, or -w N to limit comparison to the first N characters. This lets you deduplicate based on a portion of each line.

Conclusion

The uniq command is a focused tool for filtering and counting duplicate lines. It is most effective when used after sort in a pipeline and pairs naturally with wc, grep, and head for text analysis tasks.

If you have any questions, feel free to leave a comment below.