Shell Coding Guidelines
Removing Suffix by Separator
Want to strip off everything after the first or last occurrence of a separator in a string? Suppose your string is $string and the separator is $separator:
Before the first occurrence:
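A minimal sketch, assuming POSIX parameter expansion is the tool intended here. The %% operator removes the longest suffix matching the pattern, so everything from the first separator onward is dropped. The sample value of string is made up for illustration.

```shell
string='alpha.beta.gamma'   # sample value, made up for illustration
separator='.'

# %% strips the longest suffix matching "<separator>*",
# i.e. everything from the FIRST separator onward.
# Quoting $separator makes it match literally rather than as a glob.
before_first=${string%%"$separator"*}
printf '%s\n' "$before_first"   # alpha
```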
Before the last occurrence:
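A matching sketch under the same assumption (POSIX parameter expansion). A single % removes the shortest matching suffix, so only the part from the last separator onward is dropped. The sample value is again made up.

```shell
string='alpha.beta.gamma'   # sample value, made up for illustration
separator='.'

# % strips the shortest suffix matching "<separator>*",
# i.e. everything from the LAST separator onward.
before_last=${string%"$separator"*}
printf '%s\n' "$before_last"   # alpha.beta
```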
Example:
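One possible worked example, assuming the two parameter expansions %% and %. The helper names before_first and before_last are hypothetical, and the input uses a two-character separator to show that the separator need not be a single character.

```shell
# Hypothetical helpers: $1 is the string, $2 is the separator.
before_first() {
  result=${1%%"$2"*}
  printf '%s\n' "$result"
}

before_last() {
  result=${1%"$2"*}
  printf '%s\n' "$result"
}

before_first 'a::b::c' '::'        # a
before_last  'a::b::c' '::'        # a::b
before_first 'no-separator' '::'   # no-separator (pattern does not match)
```

If the separator does not occur at all, the pattern matches nothing and the string is returned unchanged.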
Appending Content to a Command Pipeline
Suppose you have a file and want to process its contents, then append more data before passing it all to the next part of a pipeline. The most robust, portable (POSIX-compliant) solution is:
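A minimal sketch of the brace-group pattern. The temporary file, its contents, and the use of sort as "the next part of the pipeline" are assumptions made for illustration.

```shell
# Sample input file, created only for this illustration.
input=$(mktemp)
printf 'cherry\nbanana\n' > "$input"

# The brace group emits the file's contents followed by one extra line;
# sort (standing in for the next pipeline stage) sees a single stream.
result=$({ cat "$input"; echo 'apple'; } | sort)
printf '%s\n' "$result"

rm -f "$input"
```

Here sort receives three lines, two from the file and one appended by echo, as if they had come from a single command.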
Why this works:
- { ...; } groups commands together so that their output forms a single stream.
- The pipe (|) passes their combined output to the next command.
- This does not request an extra subshell the way ( ... ) does, and it works in all POSIX shells.
Syntax tip: A space is required after {, and the last command must end in a semicolon (;) or a newline before the closing }.
Safely Using find + xargs with Filenames
Using find + xargs with filenames can break when filenames have spaces or special characters. To handle this safely, use null (\0) terminators:
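A minimal sketch of the null-terminated pattern, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The scratch directory, the filenames, and the use of basename as the receiving command are made up for illustration.

```shell
# Scratch directory with one awkward filename, created for illustration.
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt"

# -print0 and -0 keep "with space.txt" as ONE argument.
# basename stands in for whatever command should receive the filenames.
names=$(find "$workdir" -type f -print0 | xargs -0 -n 1 basename | sort)
printf '%s\n' "$names"

rm -rf "$workdir"
```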
- -print0 makes find use a null character at the end of each filename.
- -0 tells xargs to expect null-delimited input.
This combination ensures even weird filenames (with spaces, newlines, quotes, etc.) are handled safely.
Batch Killing Processes
Sometimes we accidentally spawn a series of processes and want to kill them. We can look up their PIDs with ps -aux | grep <process_name> and manually run the kill command on each PID, but how can we automate this tedious task?
First, we can add grep -v grep to the pipe to hide the grep process itself from the output:
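To see the effect without touching real processes, the sketch below feeds two made-up lines shaped like ps output through the same filter. The user name, PIDs, and the my_worker process name are all hypothetical.

```shell
# Two made-up lines in the shape of ps output (not real data).
sample='alice 4242 0.0 0.1 1234 567 pts/0 S 10:00 0:00 my_worker --serve
alice 4250 0.0 0.0 1234 567 pts/0 S 10:01 0:00 grep my_worker'

# grep -v grep drops every line containing "grep",
# which removes the entry for the grep process itself.
filtered=$(printf '%s\n' "$sample" | grep my_worker | grep -v grep)
printf '%s\n' "$filtered"
```

Against a live system the first stage would be ps -aux (or ps aux) instead of printf.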
Then, we can add awk '{print $2}' to the pipe to invoke awk to extract the second whitespace-delimited field of each line (which in this case is the PID). Now we have a list of the PIDs of the processes we want to kill:
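The field extraction can be sketched on made-up input as well. The lines below only imitate the shape of ps output; the user name, PIDs, and my_worker are hypothetical.

```shell
# Made-up lines in the shape of ps output (not real data).
sample='alice 4242 0.0 0.1 1234 567 pts/0 S 10:00 0:00 my_worker --serve
alice 4243 0.0 0.1 1234 567 pts/0 S 10:00 0:00 my_worker --serve
alice 4250 0.0 0.0 1234 567 pts/0 S 10:01 0:00 grep my_worker'

# awk splits each line on runs of whitespace; $2 is the PID column.
pids=$(printf '%s\n' "$sample" | grep my_worker | grep -v grep | awk '{print $2}')
printf '%s\n' "$pids"
```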
Finally, we can iterate over the PIDs in a for-loop to kill them.
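A minimal end-to-end sketch, assuming a ps that accepts the BSD-style aux options and prints the PID in the second column (as procps on Linux and BSD ps do). The three background sleep processes and the duration value are made up; they stand in for the accidentally spawned processes.

```shell
# Spawn three throwaway background processes to act as the "accidental" ones.
duration=31337   # arbitrary value that makes the command line easy to match
sleep "$duration" &
sleep "$duration" &
sleep "$duration" &

pattern="sleep $duration"

# Same pipeline as before, wrapped in a for-loop that kills each PID.
for pid in $(ps aux | grep "$pattern" | grep -v grep | awk '{print $2}'); do
  kill "$pid"
done

wait 2>/dev/null   # reap the killed background jobs

# Re-run the lookup; an empty result means nothing matching is left.
remaining=$(ps aux | grep "$pattern" | grep -v grep | awk '{print $2}')
printf 'remaining PIDs: %s\n' "${remaining:-none}"
```

kill sends SIGTERM by default, which gives each process a chance to exit cleanly.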
Copying Files via cat and dd
cat and dd are standard Unix utilities for handling file data.
- cat outputs the contents of a file to stdout.
- dd reads stdin (if no if= is given) and writes to stdout or a file.
To copy a file, you can use a Unix pipe (|) to send cat's output to dd, then write to a destination file:
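A minimal sketch of the copy, using two temporary files created only for this illustration. The comparison with cmp is an addition to show that the copy is byte-identical.

```shell
# Scratch files created for this illustration.
src=$(mktemp)
dst=$(mktemp)
printf 'some file contents\n' > "$src"

# With no if=, dd reads stdin; of= names the destination file.
# dd reports its transfer statistics on stderr, silenced here.
cat "$src" | dd of="$dst" 2>/dev/null

# cmp -s exits with status 0 only when the two files are byte-identical.
if cmp -s "$src" "$dst"; then
  status=identical
else
  status=different
fi
printf '%s\n' "$status"

rm -f "$src" "$dst"
```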
Potential Advantages of cat and dd Over cp
Better progress/statistics
dd with the status=progress option (GNU dd) shows live copy statistics:
cat bigfile | dd of=outfile status=progress
Working Around cp Limitations
- Some device files, file descriptors, or pseudo-files (like those under /proc or /sys) cannot always be copied with cp, but streaming with cat + dd may work.
Parsing Command-line Options
getopts is a built-in Unix shell command for parsing command-line options. It follows the same conventions as getopt, a POSIX C library function used to parse command-line options of the Unix/POSIX style. Specifically:
- Options are single alphanumeric characters preceded by a - (hyphen-minus) character, e.g. -a, -b, -c.
- Options can take an argument or none.
- Multiple options can be chained together, as long as the non-last ones are not argument-taking. If -a and -b take no arguments while -c takes an argument, -abc foo is the same as -a -b -c foo, but -bca is not the same as -b -c -a: because -c takes an argument, the remainder of the token (a) is read as that argument, as the next rule describes.
- When an option takes an argument, this can be in the same token or in the next one. In other words, if -c takes an argument, -cfoo is the same as -c foo.
optstrings
Both getopt and getopts specify options using an optstring. Specifically:
- Begin an optstring with : to suppress the built-in error messages and have missing arguments and invalid options reported through the parsing result instead.
- To specify an option that does not take an argument, append its name to the optstring.
- To specify an option that takes an argument, append its name and : to the optstring.
For example, the optstring that specifies two options -a, -b that do not take arguments and two options -c, -d that take arguments is :abc:d:.
Using getopts in a Shell Script
In Shell scripts, getopts invoked with an optstring is used with a while-loop to parse command-line options.
Say that our Shell script test_getopts.sh accepts two options -a, -b that do not take arguments and two options -c, -d that take arguments. Our Shell script can look like this:
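A minimal sketch of such a script. The echoed messages are assumptions chosen for illustration, and the loop is wrapped in a hypothetical parse_options function so that it can be called repeatedly; the last line forwards the script's own arguments to it.

```shell
#!/bin/sh
# test_getopts.sh (illustrative sketch; the echoed messages are assumptions)

parse_options() {
  OPTIND=1   # restart parsing in case the function is called more than once
  while getopts ':abc:d:' name; do
    case "$name" in
      a|b) echo "option -$name" ;;
      c|d) echo "option -$name with argument $OPTARG" ;;
      :)   echo "option -$OPTARG requires an argument" ;;
      \?)  echo "invalid option -$OPTARG" ;;
    esac
  done
}

parse_options "$@"
```

The : and \? branches are only reached because the optstring starts with a colon; without it, getopts prints its own diagnostics and sets name to ? in both error cases.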
Here, getopts is invoked with the optstring for specifying our options, :abc:d:. In each iteration of the while-loop, the next option is parsed and the Shell variables name and OPTARG are set to different values depending on the condition encountered.
- If a valid option is detected and that option does not take an argument, the Shell variable name is set to the name of the option.
- If a valid option is detected and that option takes an argument:
  - If we have provided an argument, the Shell variable name is set to the name of the option, and the Shell variable OPTARG is set to the value of the argument.
  - If we haven't provided an argument, the Shell variable name is set to :, and the Shell variable OPTARG is set to the name of the option.
- If an invalid option is detected, the Shell variable name is set to ?, and the Shell variable OPTARG is set to the name of the option.
We can see getopts at work by invoking our Shell script with different command-line options:
- No command-line options.
- Option -a, which does not take an argument.
- Option -a, which does not take an argument, twice.
- Option -c, which takes an argument, with an argument foo.
- Option -c, which takes an argument, with an argument foo, twice.
- Option -c, which takes an argument, without an argument.
- An invalid option -e.
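These invocations can be tried in one go. The sketch below writes a compact stand-in for test_getopts.sh to a temporary file and runs each case through a hypothetical run helper. The messages are assumptions chosen for illustration, and the comments show what this stand-in prints, not the output of the original script.

```shell
script=$(mktemp)
trap 'rm -f "$script"' EXIT   # clean up the temporary script on exit

# Compact stand-in for test_getopts.sh (messages are illustrative).
cat > "$script" <<'EOF'
while getopts ':abc:d:' name; do
  case "$name" in
    a|b) echo "option -$name" ;;
    c|d) echo "option -$name with argument $OPTARG" ;;
    :)   echo "option -$OPTARG requires an argument" ;;
    \?)  echo "invalid option -$OPTARG" ;;
  esac
done
EOF

run() { sh "$script" "$@"; }

run                 # (no output)
run -a              # option -a
run -a -a           # option -a (printed twice)
run -c foo          # option -c with argument foo
run -c foo -c foo   # option -c with argument foo (printed twice)
run -c              # option -c requires an argument
run -e              # invalid option -e
```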
References
- https://www.baeldung.com/linux/grep-exclude-ps-results
- https://stackoverflow.com/questions/46008880/how-to-always-cut-the-pid-from-ps-aux-command
- https://en.wikipedia.org/wiki/Getopts
- https://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html
- https://en.wikipedia.org/wiki/Getopt
- https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html