Linux uniq Command: What It Is and How to Use It

Make short work of hunting through textual data

by Aaron Peters
Aaron Peters is a writer with Lifewire who has 20+ years of experience in technology. His work appears in Linux Journal, MakeUseOf, and others.
Updated on January 09, 2020

Linux (and its predecessor, Unix) was built on plain text. As a result, it has all sorts of useful text processing tools you can use from the terminal. The Linux uniq utility was designed to help you sift through text files for unique values.

What Is Linux uniq and When Would You Use It?

The uniq command comes installed on most Linux distributions out of the box, and belongs to the coreutils package. It's used to identify and 'collapse' lines of adjacent, identical text. Let's unpack this definition a bit.

The basic unit for comparison is a line of text, i.e. all the text from one newline to the next. This can include multiple sentences, provided they're in the same paragraph.

By default, uniq compares adjacent lines only. This means if two lines are exactly the same, but there's a different one in between them, they'll be treated as distinct unless you sort the input first (more on this later).

In this context, "collapsing" means that when uniq displays its output, it includes only the first line of each run of adjacent, identical lines.

The uniq command helps you sift through lots of data, identify which lines are repeated, and remove the repeats from the output.

Basic Usage of Linux uniq Command

At a basic level, using the Linux uniq command is as follows:

    uniq -o value /path/to/inputfile

Here, the "o" above is a placeholder for the short flag of one of its options (not every option takes a value). You can also enter this in its longer form, such as:

    uniq --option=value /path/to/inputfile

The "inputfile" must be a plain text file containing your data.
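The adjacency rule described above can be seen in a quick sketch. The sample lines and the temporary file are made up for illustration and are not part of the article's data.

```shell
# Build a small sample in a temporary file (contents are hypothetical).
sample="$(mktemp)"
printf '%s\n' apple apple banana apple > "$sample"

# uniq collapses the two adjacent "apple" lines into one, but the final
# "apple" survives because "banana" sits between it and the others.
collapsed="$(uniq "$sample")"
printf '%s\n' "$collapsed"

rm -f "$sample"
```

Run as-is, this should print apple, banana, apple on three lines: four input lines become three, not two, because only neighbouring lines are compared.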
There are many options for the uniq command in Linux, but it may not be obvious how to use them to produce useful output. We'll take a deep dive into some of them in the sections below.

Removing Adjacent Duplicates With the uniq Command

In its most basic form, the uniq command will 'collapse' adjacent duplicates and display the results. For example, let's say you're starting a new blog and have a list of people who signed up for your email newsletter (newsletter.txt), but are not yet members.

    Jsmith@example.com
    Jsmith@example.com
    Tmiller@example.com
    Mjones@example.com
    Mjones@example.com

Since you wouldn't want to bother these people more than once, you can de-duplicate this with the following:

    $ uniq newsletter.txt
    Jsmith@example.com
    Tmiller@example.com
    Mjones@example.com

Admittedly, this on its own isn't very exciting. If a third occurrence of "Jsmith@example.com" existed at the end of the file, it would remain, because it isn't adjacent to the first two. So it's important to learn some of the options for this command.

Counting the Number of Occurrences With uniq

Let's suppose your blog takes off and not only are people registering, they're subscribing! For money! And why wouldn't they? The list of payments you're receiving will start to grow.

    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Peters Aaron Apeters@example.com $10.00
    Peters Aaron Apeters@example.com $10.00
    Peters Aaron Apeters@example.com $10.00
    Miller Tim Tmiller@example.com $1.00
    Miller Tim Tmiller@example.com $1.00
    Miller Tim Tmiller@example.com $1.00
    Miller Tim Tmiller@example.com $1.00
    Miller Tim Tmiller@example.com $1.00
    Miller Tim Tmiller@example.com $1.00
    Jones Mary Mjones@example.com $5.00
    Jones Mary Mjones@example.com $5.00
    Jones Mary Mjones@example.com $5.00
    Jones Mary Mjones@example.com $5.00
    Jones Fred Fjones@example.com $4.00
    Jones Fred Fjones@example.com $4.00
    Jones Fred Fjones@example.com $4.00
    Jones Fred Fjones@example.com $4.00
    Jones Fred Fjones@example.com $4.00

At some point, you'll want to take stock of how long some of your subscribers have been with you. Given the above list of their payments to date (payments.txt), you can have uniq count the number of occurrences with the -c flag:

    $ uniq -c payments.txt
    8 Smith John Jsmith@example.com $3.00
    3 Peters Aaron Apeters@example.com $10.00
    6 Miller Tim Tmiller@example.com $1.00
    4 Jones Mary Mjones@example.com $5.00
    5 Jones Fred Fjones@example.com $4.00

However, this again relies on identical lines being adjacent. If any were not, the same line would show up more than once in the output of a program that's designed to de-duplicate! For this reason, uniq is most useful when used in conjunction with the sort command.

Displaying Unique Lines with sort and uniq Commands

The sort command helps us here, as it arranges duplicated lines so they are adjacent, thereby allowing uniq to filter them out.
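A small sketch makes the difference visible before moving on to the full report. The three names below are made up for illustration; only the uniq -c and sort commands come from the article.

```shell
# Hypothetical sample: "alice" appears three times, but one occurrence
# is separated from the other two by "bob".
names="$(printf '%s\n' alice bob alice alice)"

# Without sort, uniq -c starts a new count every time the line changes,
# so "alice" is reported in two separate groups.
unsorted="$(printf '%s\n' "$names" | uniq -c)"

# With sort first, identical lines sit next to each other and each name
# is counted exactly once.
sorted="$(printf '%s\n' "$names" | sort | uniq -c)"

printf 'without sort:\n%s\n\nwith sort:\n%s\n' "$unsorted" "$sorted"
```

The exact padding in front of each count depends on the uniq implementation, so treat the column alignment as cosmetic; the number of groups is what matters.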
Suppose the payment report above didn't come nicely ordered:

    Smith John Jsmith@example.com $3.00
    Jones Fred Fjones@example.com $4.00
    Miller Tim Tmiller@example.com $1.00
    Peters Aaron Apeters@example.com $10.00
    Jones Mary Mjones@example.com $5.00
    Peters Aaron Apeters@example.com $10.00
    Miller Tim Tmiller@example.com $1.00
    Jones Fred Fjones@example.com $4.00
    Smith John Jsmith@example.com $3.00
    Jones Fred Fjones@example.com $4.00
    Peters Aaron Apeters@example.com $10.00
    Jones Fred Fjones@example.com $4.00
    Jones Fred Fjones@example.com $4.00
    Miller Tim Tmiller@example.com $1.00
    Jones Mary Mjones@example.com $5.00
    Smith John Jsmith@example.com $3.00
    Miller Tim Tmiller@example.com $1.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Smith John Jsmith@example.com $3.00
    Jones Mary Mjones@example.com $5.00
    Jones Mary Mjones@example.com $5.00
    Miller Tim Tmiller@example.com $1.00
    Miller Tim Tmiller@example.com $1.00
    Smith John Jsmith@example.com $3.00

In this case, you'd want to first run this list through sort to group all the like items together, then run uniq. This uses the pipe operator on the command line ("|"), which feeds the output of the command before the pipe directly into the command after it. So when we run this on our mixed-up payments (payments-rand.txt), we get the unique results along with their counts:

    $ sort payments-rand.txt | uniq -c
    5 Jones Fred Fjones@example.com $4.00
    4 Jones Mary Mjones@example.com $5.00
    6 Miller Tim Tmiller@example.com $1.00
    3 Peters Aaron Apeters@example.com $10.00
    8 Smith John Jsmith@example.com $3.00

Use the uniq Command for Quick Data Analysis

As you get more familiar with the Linux command line, you'll find tons of useful programs like uniq. Sure, you could open the above in Excel and sort that way, but then you wouldn't start earning any tech cred, now would you?
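As one more quick-analysis sketch, the counted output can itself be piped through sort again to rank lines by how often they occur. This step is not part of the walkthrough above; the shortened payment log and temporary file are made up for illustration, and sort -rn (reverse numeric order) is the only flag not already introduced.

```shell
# Hypothetical, shortened payment log written to a temporary file.
log="$(mktemp)"
printf '%s\n' \
  'Jones Mary Mjones@example.com $5.00' \
  'Smith John Jsmith@example.com $3.00' \
  'Jones Mary Mjones@example.com $5.00' \
  'Miller Tim Tmiller@example.com $1.00' \
  'Jones Mary Mjones@example.com $5.00' \
  'Smith John Jsmith@example.com $3.00' > "$log"

# 1. sort groups identical lines together
# 2. uniq -c collapses each group and prefixes it with its count
# 3. sort -rn orders the result by that count, largest first
ranked="$(sort "$log" | uniq -c | sort -rn)"
printf '%s\n' "$ranked"

rm -f "$log"
```

With this sample, the subscriber with the most payments ends up on the first line and the one with the fewest on the last.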