A grep Tutorial and Primer
grep is one of our most frequently used tools, yet many still grapple with its basics. This tutorial will make it feel like a friend rather than a stranger.
grep is used to search for content within input.
Here are three (3) key grep basics:
You give it your search term first, followed by what you’re searching within
By default it searches by line, and returns those lines that contain what you searched for
You can also send it input from STDIN
[ NOTE: You want to make sure you’re using the right version of grep, i.e. the GNU version and not the BSD version. Check the notes for how to do so on OS X. ]
So if you have a file called names.txt:
cat names.txt
Sarah
John
Michael
Stewart
Christina
…you could search within it like so:
grep John names.txt
…and it would return a single line:
John
…you could also do it this way, from STDIN:
cat names.txt | grep John
…and get the same result:
John
That’s the basic functionality, so now let’s look at some of the main options:
-i: ignore the case of your search term
-v: show lines that don’t match, instead of those that do
-c: instead of returning matches, return the number of matching lines
-x: return only lines where the whole line matches
-E: interpret search as an extended regular expression
-F: interpret search as fixed strings rather than regular expressions, so dots, brackets, etc. match literally (separate multiple strings with newlines)
-f: get the search patterns from this file
-H: print the filename with each match
-m: stop reading file after n number of matches
-n: print the line number of where matches were found
-q: don’t output anything, but exit with status 0 if any match is found (check that status with echo $?).
-A: print n number of lines after the match
-B: print n number of lines before the match
-o: print only the matching part of the line
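-e: use the next argument as the search pattern; lets you give several patterns, and protects patterns starting with a hyphen
-w: match whole words only, i.e. the match can’t have a letter, digit, or underscore directly next to it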
--color: add color to the matched output
--help: get some help
-V: get grep’s version
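Here’s a minimal sketch of a few of these options in action. The sample file and the variable names (workdir, count, and so on) are made up for illustration, and it sticks to options that GNU and BSD grep share.

```shell
# Hypothetical sample file, built in a temp directory so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' Sarah John Michael Stewart Christina john > "$workdir/names.txt"

# -c counts matching lines; with -i, both "John" and "john" count.
count=$(grep -ci john "$workdir/names.txt")
echo "$count"

# -n prefixes each match with its line number.
numbered=$(grep -n Stewart "$workdir/names.txt")
echo "$numbered"

# -o prints only the part of the line that matched.
only=$(grep -o Chris "$workdir/names.txt")
echo "$only"

# -m 1 stops after the first matching line.
first=$(grep -m 1 -i john "$workdir/names.txt")
echo "$first"

# -q prints nothing; the exit status says whether anything matched.
if grep -q Michael "$workdir/names.txt"; then
  found=yes
else
  found=no
fi
echo "$found"

rm -rf "$workdir"
```

The -q form is the one to reach for in scripts, since you usually only care whether a match exists, not what it looks like.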
grep can also use regular expressions. Here are some of the most common ones to know about:
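[ NOTE: In GNU grep, basic and extended regular expressions offer the same functionality. The difference is that in basic syntax the characters ?, +, {, |, (, and ) are literal unless you put a backslash in front of them, while with -E they are special as written. The functionality of egrep and fgrep has also been pulled into grep itself, so there’s no reason to use them anymore. ]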
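To make the basic versus extended point concrete, here’s a small sketch. The input lines and variable names are hypothetical; the idea is just that the same search can be written either way, as long as the backslashes go in the right place.

```shell
# Hypothetical input: "go" followed by one, two, or three o's in total.
input=$(printf '%s\n' go goo gooo)

# Basic syntax (the default): braces only count as a repeat when backslashed.
basic=$(printf '%s\n' "$input" | grep -x 'go\{2\}')
echo "$basic"

# Extended syntax (-E): braces are special as written.
extended=$(printf '%s\n' "$input" | grep -E -x 'go{2}')
echo "$extended"

# Basic syntax without backslashes: the braces are literal characters,
# so nothing in this input matches. (|| true hides grep's "no match" status.)
literal=$(printf '%s\n' "$input" | grep -c 'go{2}' || true)
echo "$literal"
```

Both of the first two searches pick out the same line, which is why choosing between them is mostly a matter of which one is easier to read.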
Regex structure
These are the basic building blocks of a regex.
\: disregard the special meaning of (escape) the next character. Useful when entering carets for code, dots for IP addresses, etc.
[ ]: a bracket is a list of characters, and matches any character in that list. If the first character is ^ then it matches what’s not in the list
-: a hyphen indicates a range, so [a-d] means [abcd]
[^ ]: shows what doesn’t include those characters
^: matches at the beginning of the line
$: matches at the end of the line
.: matches any single character, except end of line
+: matches one or more of the preceding thing (at least once); needs -E, or \+ in GNU basic syntax
*: matches zero or more of the preceding thing
{x,y}: matches x to y occurrences of the preceding thing; like {x} and {x,}, it needs -E, or \{x,y\} in basic syntax
{x}: matches exactly x occurrences of the preceding thing
{x,}: matches x or more occurrences of the preceding thing
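Here’s a quick sketch that runs several of these building blocks against a made-up list of lines. The sample data and variable names are assumptions for illustration only.

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' cat cart scat ct caat 1.5 125)

# ^ and $ pin the match to the start and end of the line.
anchored=$(printf '%s\n' "$sample" | grep '^cat$')

# . matches any single character, so "1.5" and "125" both match.
any_char=$(printf '%s\n' "$sample" | grep '^1.5$')

# \. matches a literal dot only.
literal_dot=$(printf '%s\n' "$sample" | grep '^1\.5$')

# * allows zero or more a's: cat, ct, caat.
star=$(printf '%s\n' "$sample" | grep '^ca*t$')

# + (with -E) requires at least one a: cat, caat.
plus=$(printf '%s\n' "$sample" | grep -E '^ca+t$')

# [^ ] negates the list: lines that don't start with c or a digit.
negated=$(printf '%s\n' "$sample" | grep '^[^c0-9]')

printf '%s\n' "$anchored" "$any_char" "$literal_dot" "$star" "$plus" "$negated"
```

The any_char versus literal_dot pair is the one that bites people most often: an unescaped dot quietly matches far more than intended.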
Predefined shortcuts
A number of common expressions have been defined as universal shortcuts to make your searches easier, and they use double brackets.
[[:alnum:]]: any alphanumeric character
[[:alpha:]]: any alphabetic character
[[:cntrl:]]: any control character
[[:digit:]]: any number
[[:lower:]]: any lower case character
[[:print:]]: any printable character
[[:space:]]: any space character, including space, tab, newline, CR, FF, etc.
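As a rough sketch of how the shortcuts behave, here are three of them run against some hypothetical lines. The data and variable names are invented for the example.

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' abc ABC 123 'a b' x_1)

# Lines made up entirely of lowercase letters.
lower=$(printf '%s\n' "$sample" | grep -x '[[:lower:]]*')

# Lines containing at least one space character.
spaced=$(printf '%s\n' "$sample" | grep '[[:space:]]')

# A shortcut can be negated inside the outer brackets:
# lines containing something that is not a letter or digit.
odd=$(printf '%s\n' "$sample" | grep '[^[:alnum:]]')

printf '%s\n' "$lower" "$spaced" "$odd"
```

Note that the shortcut name lives inside its own pair of brackets, which is why the negated form ends up as [^[:alnum:]].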
Show, don’t tell.
That was a lot of words. Now let’s see some of these in action, going from basic to more complex:
cat file.txt
Stewart
Christina
- 441
<a href="https://www.google.com">Google</a>
Jill
54r4h
Shazbot123 lll
221
Item 1, Item 2, Item 3
TABS TABS TABS
One of the most common mistakes is searching for something and not getting a hit because there was a case mismatch. You can simply ignore case with the -i option:
grep -i jill file.txt
Jill
Another extremely common situation is wanting to get every line that doesn’t have the search pattern in it. You can do that with the -v option:
grep -v Christina file.txt
Stewart
- 441
<a href="https://www.google.com">Google</a>
Jill
54r4h
Shazbot123 lll
221
Item 1, Item 2, Item 3
TABS TABS TABS
It’s easy to search in multiple files. Simply use a wildcard in your file target so that it captures each file you want.
grep -i password file*.txt
[ NOTE: The file*.txt will match file1.txt, file2.txt, etc., as expected. ]
So you could do something like this as well:
grep -i password *.txt
In this case, the *.txt would search within all files ending in .txt in the current directory.
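Here’s a sketch of what the multi-file searches look like end to end. The file names and contents are made up, and everything lives in a temp directory.

```shell
# Hypothetical files in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' 'user=admin' 'Password=hunter2' > "$workdir/file1.txt"
printf '%s\n' 'nothing to see here'           > "$workdir/file2.txt"
printf '%s\n' 'password: swordfish'           > "$workdir/notes.txt"

# The shell expands the wildcard before grep runs. With more than one
# file, grep prefixes each match with the name of the file it came from.
some=$(cd "$workdir" && grep -i password file*.txt)
echo "$some"

all=$(cd "$workdir" && grep -i password *.txt)
echo "$all"

rm -rf "$workdir"
```

The filename prefix only shows up when grep is given more than one file; add -H if you want it even for a single file.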
Here we’re going to switch to a shortcut:
grep '[[:digit:]]' file.txt
- 441
54r4h
Shazbot123 lll
221
Item 1, Item 2, Item 3
Here we’re going to add the ^ to look for all lines that start with a number.
grep '^[[:digit:]]' file.txt
54r4h
221
Sometimes you want to find a line that is exactly what you searched for, rather than lines that have it in it. That’s what -x does, so this returns nothing, because son is only part of the line:
echo "Jason" | grep -x son
Similarly, -w only matches whole words:
grep -w 1 file.txt
Item 1, Item 2, Item 3
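[ NOTE: This matched because the 1 in Item 1 stands alone as a word. The 1s in 441, Shazbot123, and 221 are part of longer words, so those lines don’t match. ]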
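To see -x and -w side by side, here’s a small sketch using hypothetical input lines. The variable names are made up for the example.

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' son Jason 'my son' 'Item 1, Item 2' Shazbot123)

# -x: the whole line has to match, so only the line "son" qualifies.
exact=$(printf '%s\n' "$sample" | grep -x son)
echo "$exact"

# -w: the match has to be a whole word. "my son" now matches as well,
# but "Jason" still doesn't, because "son" is glued to other letters.
words=$(printf '%s\n' "$sample" | grep -w son)
echo "$words"

# The 1 in "Item 1," is a whole word; the 1 inside "Shazbot123" is not.
ones=$(printf '%s\n' "$sample" | grep -w 1)
echo "$ones"
```

Punctuation such as the comma after Item 1 counts as a word boundary, so the match doesn’t have to be surrounded by spaces.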
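Here’s a search for anything shaped like an IP address. The {x,y} syntax needs -E, and the dots are escaped so that they only match literal dots:
grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' net.txt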
23.44.124.67
172.16.23.1
10.10.2.21
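Here’s a sketch of why the escaping matters when hunting for IP addresses. The contents of net.txt are invented for this example, and the pattern only checks the shape of an address (it would happily accept 999.999.999.999).

```shell
# Hypothetical net.txt in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' 'host 23.44.124.67 up' 'version 1x2x3x4' 'gw 10.10.2.21' > "$workdir/net.txt"

# -E enables {x,y}; \. matches only a literal dot; -o prints just the addresses.
escaped=$(grep -E -o '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$workdir/net.txt")
echo "$escaped"

# With unescaped dots, "1x2x3x4" slips through as well, so all three lines match.
loose=$(grep -E -c '[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}' "$workdir/net.txt")
echo "$loose"

rm -rf "$workdir"
```

Adding -o is handy here because log lines usually carry a lot more than the address you’re after.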
Here’s the -A option, which gets lines after the match.
grep -A 2 Three count.txt
Three
Four
Five
[ NOTE: -B shows the lines before, and -C shows the lines both before and after. ]
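Here’s a sketch comparing the three context options. It assumes count.txt holds the words One through Five, one per line, which is a guess based on the output shown.

```shell
# Assumed contents of count.txt, rebuilt in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' One Two Three Four Five > "$workdir/count.txt"

# -A 1: the match plus one line after it.
after=$(grep -A 1 Three "$workdir/count.txt")
echo "$after"

# -B 1: the match plus one line before it.
before=$(grep -B 1 Three "$workdir/count.txt")
echo "$before"

# -C 1: one line of context on both sides.
around=$(grep -C 1 Three "$workdir/count.txt")
echo "$around"

rm -rf "$workdir"
```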
You may have heard that egrep or fgrep should be used instead of grep, for this reason or that reason.
This may have been true at one time, but those commands are actually implemented within grep now, as the -E (egrep) and -F (fgrep) options.
From the man page:
egrep is the same as grep -E. fgrep is the same as grep -F. Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified.
So just use grep with the appropriate switches, and you’ll get everything those used to have.
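Here’s a short sketch of what those two switches change in practice. The sample lines and variable names are hypothetical.

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' 1.5 125 cat dog)

# -F (what fgrep did): the dot is just a dot.
fixed=$(printf '%s\n' "$sample" | grep -F '1.5')
echo "$fixed"

# Default: the dot is a regex wildcard, so "125" matches too.
regex=$(printf '%s\n' "$sample" | grep '1.5')
echo "$regex"

# -E (what egrep did): | means "or" without needing a backslash.
either=$(printf '%s\n' "$sample" | grep -E 'cat|dog')
echo "$either"
```

Reaching for -F is a good habit whenever the search term is plain text that happens to contain dots, brackets, or asterisks.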
Using these components you should be able to find anything you want using grep, but if you are having an issue, put your search in the comments and I’ll answer it and add it to the example section.
Hope this has been helpful.
The GNU Grep manual can be found here.
To install the GNU version of grep on OS X, install Homebrew and then run brew install grep. Current Homebrew installs it under the name ggrep; the older brew tap homebrew/dupes step and the --with-default-names option are no longer available.
grep is different from find in that find is designed to find files and directories on a system, not search within them.
To use shell variables in your search, you need to switch from single quotes to double quotes, e.g. "$PATH".
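A minimal sketch of the quoting difference, using a made-up variable called name and some hypothetical input:

```shell
# Hypothetical variable and sample lines.
name=John
sample=$(printf '%s\n' Sarah John Michael)

# Double quotes: the shell swaps in the value of $name before grep runs.
expanded=$(printf '%s\n' "$sample" | grep "$name")
echo "$expanded"

# Single quotes: grep receives the text $name unexpanded, and nothing
# in this input matches it. (|| true hides grep's "no match" status.)
unexpanded=$(printf '%s\n' "$sample" | grep -c '$name' || true)
echo "$unexpanded"
```

If the variable might contain regex characters such as dots, combining the double quotes with -F keeps them from being treated as wildcards.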