The Unix Shell: Finding Stuff¶
Flexible ways to find files of interest.
Using locate¶
Many *nix systems maintain a database of path names that can be searched with locate. This is not available on the Docker container you are using.
LOCATE(1) BSD General Commands Manual LOCATE(1)
NAME
locate -- find filenames quickly
SYNOPSIS
locate [-0Scims] [-l limit] [-d database] pattern ...
DESCRIPTION
The locate program searches a database for all pathnames which match the
specified pattern. The database is recomputed periodically (usually
weekly or daily), and contains the pathnames of all files which are
publicly accessible.
Shell globbing and quoting characters (``*'', ``?'', ``\'', ``['' and
``]'') may be used in pattern, although they will have to be escaped from
the shell. Preceding any character with a backslash (``\'') eliminates
any special meaning which it may have. The matching differs in that no
characters must be matched explicitly, including slashes (``/'').
Using grep¶
grep is used to find regular expression patterns within files. We have covered regular expressions in a previous lecture, but here are the basics as a reminder.
. matches any single character
+ matches one or more of the preceding pattern
* matches zero or more of the preceding pattern
^ matches at the start of a line
$ matches at the end of a line
[abc] matches a or b or c (inside brackets, | is a literal character)
(cat|dog) matches cat or dog
[A-Z] matches any single upper case letter
[0-9] matches any single digit
The -E flag to grep enables extended regular expressions, so that +, ?, |, ( ) and { } act as operators without a preceding backslash.
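The patterns above can be tried out on a small file. This is a minimal sketch, assuming a throwaway directory created with mktemp; the file name words.txt and its contents are made up for illustration.

```shell
# Work in a temporary directory so nothing in the current directory is touched.
demo=$(mktemp -d)
printf '%s\n' 'cat' 'dog' 'caat' 'ct' 'Dog 42' > "$demo/words.txt"

# + needs at least one "a", so "ct" is rejected.
one_or_more=$(grep -E '^ca+t$' "$demo/words.txt")

# Alternation with ( | ) works because -E is given.
either=$(grep -E '^(cat|dog)$' "$demo/words.txt")

# One or more digits at the end of the line.
digits=$(grep -E '[0-9]+$' "$demo/words.txt")

# Without -E the parentheses and | are literal characters, so nothing matches.
basic=$(grep '^(cat|dog)$' "$demo/words.txt" || true)

printf 'ca+t:\n%s\n' "$one_or_more"
printf 'cat|dog:\n%s\n' "$either"
printf 'digits: %s\n' "$digits"
printf 'without -E: [%s]\n' "$basic"
```

The last command shows why -E matters: the same pattern without the flag looks for the literal text (cat|dog).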
In [1]:
cat hello.txt
1 Hello, bash
2 Hello, again
3 Hello
4 again
Recursive searching¶
In [3]:
grep -r "Hello" ./*txt
./hello.txt:1 Hello, bash
./hello.txt:2 Hello, again
./hello.txt:3 Hello
Get filenames only¶
With the -l flag, grep prints only the names of the files whose contents match a regular expression.
In [8]:
grep -l "Hello" *.txt
hello.txt
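The -r and -l flags can be combined to list matching files in a whole directory tree. The following is a small sketch under the assumption of a scratch directory; the file names are hypothetical and only serve the illustration.

```shell
# Build a small tree in a temporary directory.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'Hello, bash\n' > "$demo/hello.txt"
printf 'Goodbye\n' > "$demo/goodbye.txt"
printf 'Hello, nested\n' > "$demo/sub/nested.txt"

# -l: names only, restricted to the .txt files in the top directory.
names=$(cd "$demo" && grep -l "Hello" *.txt)

# -r with -l: names only, descending into subdirectories as well.
recursive=$(cd "$demo" && grep -rl "Hello" . | sort)

printf 'top level:\n%s\n' "$names"
printf 'recursive:\n%s\n' "$recursive"
```

Sorting the recursive result is only there to make the order predictable, since grep reports files in the order it happens to visit them.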
Find only directories¶
In [4]:
ls -d */
data/ scripts/
Using grep¶
In [5]:
ls -l
total 320
-rw-r--r-- 1 cliburn staff 120 Jul 26 10:02 MD5_CHECKSUM
-rw-r--r-- 1 cliburn staff 46843 Jul 26 10:10 The_Unix_Shell_01___File_and_Directory_Management.ipynb
-rw-r--r-- 1 cliburn staff 6930 Jul 26 09:51 The_Unix_Shell_02___Working_with_Text.ipynb
-rw-r--r-- 1 cliburn staff 15644 Jul 26 09:22 The_Unix_Shell_03___Regular_Expresssions.ipynb
-rw-r--r-- 1 cliburn staff 13409 Jul 26 09:20 The_Unix_Shell_04___Finding_Stuff.ipynb
-rw-r--r-- 1 cliburn staff 22120 Jul 26 09:25 The_Unix_Shell_05___Shell_Scripts.ipynb
-rw-r--r-- 1 cliburn staff 1106 Jul 26 10:06 The_Unix_Shell___Exercises.ipynb
-rw-r--r-- 1 cliburn staff 6 Jul 26 10:01 a.txt
-rw-r--r-- 1 cliburn staff 6 Jul 26 10:02 b.txt
-rw-r--r-- 1 cliburn staff 6 Jul 26 10:01 c.txt
drwxr-xr-x 12 cliburn staff 408 Jul 26 09:11 data
-rw-r--r-- 1 cliburn staff 46 Jul 26 09:55 goodbye.md5
-rw-r--r-- 1 cliburn staff 45 Jul 26 09:55 goodbye.txt
-rw-r--r-- 1 cliburn staff 45 Jul 26 09:52 hell.txt
-rw-r--r-- 1 cliburn staff 44 Jul 26 09:49 hello.md5
-rw-r--r-- 1 cliburn staff 45 Jul 26 10:00 hello.txt
drwxr-xr-x 5 cliburn staff 170 Jul 26 09:25 scripts
-rw-r--r-- 1 cliburn staff 42 Jul 26 09:31 stderr.txt
-rw-r--r-- 1 cliburn staff 0 Jul 26 09:31 stdout.txt
-rw-r--r-- 1 cliburn staff 44 Jul 26 10:00 test.md5
In [6]:
ls -l | grep -E '^d'
drwxr-xr-x 12 cliburn staff 408 Jul 26 09:11 data
drwxr-xr-x 5 cliburn staff 170 Jul 26 09:25 scripts
Using the invert -v option to find only files¶
In [26]:
ls -l | grep -Ev '^d'
total 328
-rw-r--r-- 1 cliburn staff 120 Jul 26 10:02 MD5_CHECKSUM
-rw-r--r-- 1 cliburn staff 46843 Jul 26 10:10 The_Unix_Shell_01___File_and_Directory_Management.ipynb
-rw-r--r-- 1 cliburn staff 6930 Jul 26 09:51 The_Unix_Shell_02___Working_with_Text.ipynb
-rw-r--r-- 1 cliburn staff 15644 Jul 26 09:22 The_Unix_Shell_03___Regular_Expresssions.ipynb
-rw-r--r-- 1 cliburn staff 16745 Jul 26 10:16 The_Unix_Shell_04___Finding_Stuff.ipynb
-rw-r--r-- 1 cliburn staff 22120 Jul 26 09:25 The_Unix_Shell_05___Shell_Scripts.ipynb
-rw-r--r-- 1 cliburn staff 1106 Jul 26 10:06 The_Unix_Shell___Exercises.ipynb
-rw-r--r-- 1 cliburn staff 6 Jul 26 10:01 a.txt
-rw-r--r-- 1 cliburn staff 6 Jul 26 10:02 b.txt
-rw-r--r-- 1 cliburn staff 6 Jul 26 10:01 c.txt
-rw-r--r-- 1 cliburn staff 46 Jul 26 09:55 goodbye.md5
-rw-r--r-- 1 cliburn staff 45 Jul 26 09:55 goodbye.txt
-rw-r--r-- 1 cliburn staff 45 Jul 26 09:52 hell.txt
-rw-r--r-- 1 cliburn staff 44 Jul 26 09:49 hello.md5
-rw-r--r-- 1 cliburn staff 45 Jul 26 10:00 hello.txt
-rw-r--r-- 1 cliburn staff 42 Jul 26 09:31 stderr.txt
-rw-r--r-- 1 cliburn staff 0 Jul 26 09:31 stdout.txt
-rw-r--r-- 1 cliburn staff 44 Jul 26 10:00 test.md5
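Filtering ls -l on the first character is handy for a quick look, but the inverted match also keeps the "total" line and would keep other non-directory entries such as symbolic links. As a hedged alternative, find can test the file type directly. The sketch below assumes a scratch directory with made-up names; -mindepth and -maxdepth are extensions found in both GNU and BSD find rather than in POSIX.

```shell
# Scratch directory with two subdirectories and two regular files.
demo=$(mktemp -d)
mkdir "$demo/data" "$demo/scripts"
touch "$demo/hello.txt" "$demo/a.txt"

# -type d keeps directories; -mindepth 1 leaves out "." itself.
dirs=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 -type d | sort)

# -type f keeps regular files only.
files=$(cd "$demo" && find . -mindepth 1 -maxdepth 1 -type f | sort)

printf 'directories:\n%s\n' "$dirs"
printf 'files:\n%s\n' "$files"
```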
Using find¶
While grep finds files whose contents match some regular expression, the find command is used to locate files of interest based on various file properties such as name, type, depth in the directory tree and timestamps. We will show a few examples.
FIND(1) BSD General Commands Manual FIND(1)
NAME
find -- walk a file hierarchy
SYNOPSIS
find [-H | -L | -P] [-EXdsx] [-f path] path ... [expression]
find [-H | -L | -P] [-EXdsx] -f path [path ...] [expression]
DESCRIPTION
The find utility recursively descends the directory tree for each path
listed, evaluating an expression (composed of the ``primaries'' and
``operands'' listed below) in terms of each file in the tree.
The options are as follows:
-E Interpret regular expressions followed by -regex and -iregex
primaries as extended (modern) regular expressions rather than basic
regular expressions (BRE's). The re_format(7) manual page fully
In [9]:
ls -R
The_Unix_Shell_01___File_and_Directory_Management.ipynb
The_Unix_Shell_03___Working_with_Text.ipynb
The_Unix_Shell_04___Regular_Expresssions.ipynb
The_Unix_Shell_05___Finding_Stuff.ipynb
The_Unix_Shell_06___Shell_Scripts.ipynb
data
hello.txt
scripts
./data:
X.txt example.fna iris24.csv
Y.txt food_and_groups.csv titanic.csv
Y1.txt forbes.csv
Y2.txt iris.csv
./scripts:
avg.sh extract_headers.sh
cat_if_exists.sh rename.py
Find is case sensitive by default¶
In [11]:
find . -name "*unix*ipynb"
Use -iname for case-insensitive search
In [12]:
find . -iname "*unix*ipynb"
./.ipynb_checkpoints/The_Unix_Shell_01___File_and_Directory_Management-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_03___Working_with_Text-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_04___Regular_Expresssions-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_05___Finding_Stuff-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_06___Shell_Scripts-checkpoint.ipynb
./The_Unix_Shell_01___File_and_Directory_Management.ipynb
./The_Unix_Shell_03___Working_with_Text.ipynb
./The_Unix_Shell_04___Regular_Expresssions.ipynb
./The_Unix_Shell_05___Finding_Stuff.ipynb
./The_Unix_Shell_06___Shell_Scripts.ipynb
Exclude unwanted directories from search¶
In [13]:
find . -not -path "*ipynb_checkpoints/*" -iname "*unix*ipynb"
./The_Unix_Shell_01___File_and_Directory_Management.ipynb
./The_Unix_Shell_03___Working_with_Text.ipynb
./The_Unix_Shell_04___Regular_Expresssions.ipynb
./The_Unix_Shell_05___Finding_Stuff.ipynb
./The_Unix_Shell_06___Shell_Scripts.ipynb
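The -not -path test hides results from the unwanted directory, but find still walks through it. A common alternative is -prune, which stops find from descending in the first place. This is a minimal sketch in a scratch directory; the notebook names are hypothetical.

```shell
# Scratch directory with one notebook and one checkpoint copy.
demo=$(mktemp -d)
mkdir "$demo/.ipynb_checkpoints"
touch "$demo/Unix_Shell.ipynb" \
      "$demo/.ipynb_checkpoints/Unix_Shell-checkpoint.ipynb"

# Filter by path: the checkpoint directory is visited, its files are discarded.
filtered=$(cd "$demo" && find . -not -path "*ipynb_checkpoints/*" -iname "*unix*ipynb")

# Prune: the checkpoint directory is never entered.
# -print is written out so that only the right-hand side of -o produces output.
pruned=$(cd "$demo" && find . -name ".ipynb_checkpoints" -prune -o -iname "*unix*ipynb" -print)

printf 'filtered: %s\n' "$filtered"
printf 'pruned:   %s\n' "$pruned"
```

Both commands report the same file here; the difference only matters for speed when the excluded directory is large.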
Limiting recursion depth¶
In [14]:
# -name takes a glob, so each extension gets its own test, joined with -o
find . \( -name "*.csv" -o -name "*.txt" \)
./data/food_and_groups.csv
./data/forbes.csv
./data/iris.csv
./data/iris24.csv
./data/titanic.csv
./data/X.txt
./data/Y.txt
./data/Y1.txt
./data/Y2.txt
./hello.txt
In [15]:
find . -maxdepth 1 \( -name "*.csv" -o -name "*.txt" \)
./hello.txt
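A -name pattern is a shell glob rather than a regular expression, so square brackets match exactly one character and there is no (csv|txt) alternation. The sketch below contrasts a bracket pattern with an explicit OR of two -name tests, using a scratch directory with made-up file names.

```shell
# Scratch directory: one .txt file, one .csv file, one shell script.
demo=$(mktemp -d)
mkdir "$demo/data" "$demo/scripts"
touch "$demo/hello.txt" "$demo/data/iris.csv" "$demo/scripts/avg.sh"

# Bracket expression: any name whose LAST character is one of c s v | t x,
# which also picks up the directory "scripts".
bracket=$(cd "$demo" && find . -name "*[csv|txt]" | sort)

# Explicit OR: names ending in .csv or .txt.
either=$(cd "$demo" && find . \( -name "*.csv" -o -name "*.txt" \) | sort)

# The same test limited to the starting directory.
shallow=$(cd "$demo" && find . -maxdepth 1 \( -name "*.csv" -o -name "*.txt" \))

printf 'bracket:\n%s\n' "$bracket"
printf 'either:\n%s\n' "$either"
printf 'shallow:\n%s\n' "$shallow"
```

The parentheses are escaped so that the shell passes them to find, and -maxdepth is placed before the tests because GNU find warns when it follows them.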
Find by time¶
Notebooks whose status changed more than 1 day ago (-ctime tests the inode change time, not the creation time)¶
In [16]:
find . -name "*ipynb" -ctime +1
Notebooks modified within the last day¶
In [17]:
find . -name "*ipynb" -mtime -1
./.ipynb_checkpoints/The_Unix_Shell_01___File_and_Directory_Management-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_03___Working_with_Text-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_04___Regular_Expresssions-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_05___Finding_Stuff-checkpoint.ipynb
./.ipynb_checkpoints/The_Unix_Shell_06___Shell_Scripts-checkpoint.ipynb
./The_Unix_Shell_01___File_and_Directory_Management.ipynb
./The_Unix_Shell_03___Working_with_Text.ipynb
./The_Unix_Shell_04___Regular_Expresssions.ipynb
./The_Unix_Shell_05___Finding_Stuff.ipynb
./The_Unix_Shell_06___Shell_Scripts.ipynb
Notebooks modified in the past 15 minutes¶
In [18]:
find . -name "*ipynb" -mmin -15
./The_Unix_Shell_06___Shell_Scripts.ipynb
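The time tests can be checked on files with known timestamps. This is a small sketch, assuming a scratch directory and two hypothetical notebooks; touch -t sets the modification time by hand, and -mmin is an extension available in both GNU and BSD find.

```shell
# Scratch directory with one fresh file and one back-dated file.
demo=$(mktemp -d)
touch "$demo/new.ipynb"
# Set the modification time of old.ipynb to 1 January 2020, 00:00.
touch -t 202001010000 "$demo/old.ipynb"

# +1: modified more than one day ago.
older=$(cd "$demo" && find . -name "*ipynb" -mtime +1)

# -1: modified less than one day ago.
recent_day=$(cd "$demo" && find . -name "*ipynb" -mtime -1)

# -15 with -mmin: modified less than 15 minutes ago.
recent_min=$(cd "$demo" && find . -name "*ipynb" -mmin -15)

printf 'older than a day: %s\n' "$older"
printf 'within a day:     %s\n' "$recent_day"
printf 'within 15 min:    %s\n' "$recent_min"
```

The change time used by -ctime cannot be set with touch, which is why the sketch sticks to modification times.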