Overview
Teaching: 1 min Exercises: minQuestions
Useful tools for finding commands and files?
Useful tools for directory navigation?
Objectives
Get a decent overview of useful shell tools
As we saw in the shell lecture, the first-order approach is to call said command with the -h or –help flags. A more detailed approach is to use the man
command. Short for manual, man
provides a manual page (called manpage) for a command you specify. For example, man rm
will output the behavior of the rm
command along with the flags that it takes. In fact, what I have been linking so far for every command is the online version of the Linux manpages for the commands. Even non-native commands that you install will have manpage entries if the developer wrote them and included them as part of the installation process.
Sometimes manpages can provide overly detailed descriptions of the commands, making it hard to decipher what flags/syntax to use for common use cases. TLDR pages are a nifty complementary solution that focuses on giving example use cases of a command so you can quickly figure out which options to use. For instance, I find myself referring back to the tldr pages for tar
way more often than the manpages.
One of the most common repetitive tasks that scientist faces is finding files or directories within a project directory. All UNIX-like systems come packaged with find
, a great shell tool to find files. find
will recursively search for files matching some criteria. Some examples:
# Find all directories named src
find . -name src -type d
# Find all python files that have a folder named test in their path
find . -path '*/test/*.py' -type f
# Find all files modified in the last day
find . -mtime -1
# Find all zip files with size in range 500k to 10M
find . -size +500k -size -10M -name '*.tar.gz'
Beyond listing files, find can also perform actions over files that match your query. This property can be incredibly helpful to simplify what could be fairly monotonous tasks.
# Delete all files with .tmp extension
find . -name '*.tmp' -exec rm {} \;
# Find all PNG files and convert them to JPG
find . -name '*.png' -exec convert {} {}.jpg \;
Despite find
’s ubiquitousness, its syntax can sometimes be tricky to remember. For instance, to simply find files that match some pattern PATTERN
you have to execute find -name '*PATTERN*'
(or -iname
if you want the pattern matching to be case insensitive). You could start building aliases for those scenarios, but part of the shell philosophy is that it is good to explore alternatives. For instance, fd
is a simple, fast, and user-friendly alternative to find. It offers some nice defaults like colorized output, default regex matching, and Unicode support. It also has, in my opinion, a more intuitive syntax. For example, the syntax to find a pattern PATTERN
is fd PATTERN
.
Finding files by name is useful, but quite often you want to search based on file content. A common scenario is wanting to search for all files that contain some pattern, along with where in those files said pattern occurs. To achieve this, most UNIX-like systems provide grep
, a generic tool for matching patterns from the input text.
For now, know that grep
has many flags that make it a very versatile tool. Some I frequently use are -C
for getting Context around the matching line and -v
for inverting the match, i.e. print all lines that do not match the pattern. For example, grep -C 5
will print 5 lines before and after the match. When it comes to quickly searching through many files, you want to use -R
since it will Recursively go into directories and look for files for the matching string.
But grep -R
can be improved in many ways, such as ignoring .git
folders, using multi CPU support. Many grep
alternatives have been developed, including ack
, ag
and rg
. All of them are fantastic and pretty much provide the same functionality. For now I am sticking with ripgrep (rg
), given how fast and intuitive it is. Some examples:
# Find all python files where I used the numpy library
rg -t py 'import numpy'
# Find all files (including hidden files) without a shebang line
rg -u --files-without-match "^#!"
# Find all matches of foo and print the following 5 lines
rg foo -A 5
# Print statistics of matches (# of matched lines and files )
rg --stats PATTERN
Note that as with find/fd
, it is important that you know that these problems can be quickly solved using one of these tools, while the specific tools you use are not as important.
So far we have seen how to find files and code, but as you start spending more time in the shell, you may want to find specific commands you typed at some point. The first thing to know is that typing the up arrow will give you back your last command, and if you keep pressing it you will slowly go through your shell history.
The history
command will let you access your shell history programmatically. It will print your shell history to the standard output. If we want to search there we can pipe that output to grep and search for patterns. history | grep find
will print commands that contain the substring “find”.
In most shells, you can make use of Ctrl+R
to perform backwards search through your history. After pressing Ctrl+R
, you can type a substring you want to match for commands in your history. As you keep pressing it, you will cycle through the matches in your history. This can also be enabled with the UP/DOWN arrows in zsh
. A nice addition on top of Ctrl+R
comes with using fzf
bindings. fzf
is a general-purpose fuzzy finder that can be used with many commands. Here it is used to fuzzily match through your history and present results in a convenient and visually pleasing manner.
Another cool history-related trick I really enjoy is history-based autosuggestions. First introduced by the fish shell, this feature dynamically autocompletes your current shell command with the most recent command that you typed that shares a common prefix with it. It can be enabled in zsh and it is a great quality of life trick for your shell.
Key Points
TLDR pages can be more useful than
man
find, fd, rg are all useful tools for finding files or directories
History based suggestions and fuzzy finder are great QOL additions to your work space