Shell Coding Guidelines
Removing Suffix by Separator
Want to strip off everything after the first or last occurrence of a separator in a string? Suppose your string is $string and the separator is $separator:
Before the first occurrence:
1 | |
Before the last occurrence:
1 | |
Example:
1 | |
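A minimal sketch of both cases using POSIX parameter expansion. The values of string and separator are illustrative assumptions, not taken from the original example:

```shell
string="archive.tar.gz"   # illustrative value
separator="."

# %% strips the longest suffix matching "<separator>*",
# leaving everything before the FIRST separator.
before_first=${string%%"$separator"*}

# % strips the shortest suffix matching "<separator>*",
# leaving everything before the LAST separator.
before_last=${string%"$separator"*}

printf '%s\n' "$before_first"   # prints: archive
printf '%s\n' "$before_last"    # prints: archive.tar
```

Quoting "$separator" inside the expansion makes the shell treat it literally rather than as a glob pattern. If the separator does not occur in the string, both expansions return the string unchanged.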
Appending Content to a Command Pipeline
Suppose you have a file and want to process its contents, then append more data before passing it all to the next part of a pipeline. The most robust, portable (POSIX-compliant) solution is:
1 | |
Why this works:
- { ...; } groups commands so that their combined output forms a single stream.
- The pipe (|) passes that combined output to the next command.
- Unlike ( ... ), the braces do not request an extra subshell, and the construct works in all POSIX shells.
Syntax tip: A space is required after { and the last command must end in a semicolon (;) before the closing }.
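A minimal sketch of the grouping idiom, assuming a hypothetical temporary input file and sort as the next stage of the pipeline:

```shell
# Hypothetical input file created only for this demonstration.
input_file=$(mktemp)
printf 'banana\napple\n' > "$input_file"

# Emit the file contents, append one more line, and sort the combined stream.
sorted=$({ cat "$input_file"; echo "cherry"; } | sort)
printf '%s\n' "$sorted"

rm -f "$input_file"
```

Here sort receives three lines (two from the file, one appended) as a single input stream.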
Safely Using find + xargs with Filenames
Using find + xargs with filenames can break when filenames have spaces or special characters. To handle this safely, use null (\0) terminators:
1 | |
- -print0 makes find terminate each filename with a null character.
- -0 tells xargs to expect null-delimited input.
This combination ensures even weird filenames (with spaces, newlines, quotes, etc.) are handled safely.
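A small sketch of the null-delimited combination, assuming a throwaway directory with deliberately awkward filenames (all names here are made up for the demonstration):

```shell
# Scratch directory with filenames containing a space, a quote, and a newline.
workdir=$(mktemp -d)
newline_name=$(printf 'two\nlines.txt')
touch "$workdir/plain.txt" "$workdir/with space.txt" \
      "$workdir/it's quoted.txt" "$workdir/$newline_name"

# Null-delimited hand-off: each filename reaches rm as exactly one argument.
find "$workdir" -type f -name '*.txt' -print0 | xargs -0 rm -f --

# Check that nothing was left behind.
if [ -z "$(find "$workdir" -type f)" ]; then
    all_removed=yes
else
    all_removed=no
fi
echo "all removed: $all_removed"

rm -rf "$workdir"
```

Without -print0 and -0, xargs would split these names on whitespace and quotes, and rm would receive fragments that do not exist.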
Batch Killing Processes
Sometimes we accidentally spawn a series of processes and want to kill them all. We can look up their PIDs with ps -aux | grep <process_name> (as shown below) and manually run kill on each PID, but how can we automate this tedious task?
1 | |
First, we can add grep -v grep to the pipe to hide the grep processes from the output:
1 | |
Then, we can add awk '{print $2}' to the pipe to extract the second whitespace-delimited field of each line (which in this case is the PID). Now we have a list of the PIDs of the processes we want to kill:
1 | |
Finally, we can iterate over the PIDs in a for-loop to kill them.
1 | |
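The steps above can be sketched as two small helpers. The function names, the process name my_worker, and the sample ps output are all hypothetical; the sample text stands in for live ps output so the sketch is reproducible:

```shell
# Read ps-style text on stdin and print the PID (second field) of every
# line matching $1, skipping the grep process itself.
pids_matching() {
    grep -- "$1" | grep -v grep | awk '{print $2}'
}

# Send the default signal (TERM) to every PID given as an argument.
kill_all() {
    for pid in "$@"; do
        kill "$pid" 2>/dev/null || echo "could not kill $pid" >&2
    done
}

# Hypothetical sample of ps output.
sample='user  4242  0.0  0.1  1234  567 pts/0  S   10:00  0:00 my_worker --id 1
user  4243  0.0  0.1  1234  567 pts/0  S   10:00  0:00 my_worker --id 2
user  4250  0.0  0.0  1234  567 pts/0  S+  10:01  0:00 grep my_worker'

pids=$(printf '%s\n' "$sample" | pids_matching my_worker)
echo $pids   # prints: 4242 4243

# On a live system the two helpers would be combined like this:
#   kill_all $(ps -aux | pids_matching my_worker)
```

Leaving $(...) unquoted in the last line is intentional: word splitting turns the newline-separated PIDs into separate arguments.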
Copying Files via cat and dd
cat and dd are standard Unix utilities for handling file data.
- cat outputs the contents of a file to stdout.
- dd reads stdin (if no if= is given) and writes to stdout or a file.
To copy a file, you can use a Unix pipe (|) to send cat's output to dd, then write to a destination file:
1 | |
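A minimal sketch of such a copy, assuming two hypothetical temporary files and using cmp to confirm the result:

```shell
# Hypothetical source and destination files for the demonstration.
src=$(mktemp)
dst=$(mktemp)
printf 'hello from cat and dd\n' > "$src"

# dd has no if=, so it reads the piped data from stdin and writes it to of=.
# 2>/dev/null hides the transfer statistics dd prints on stderr.
cat "$src" | dd of="$dst" 2>/dev/null

# cmp -s exits with status 0 only when both files are byte-identical.
if cmp -s "$src" "$dst"; then copy_ok=yes; else copy_ok=no; fi
echo "copy identical: $copy_ok"

rm -f "$src" "$dst"
```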
Potential Advantages of cat and dd Over cp
Better progress/statistics
- dd with the status=progress option (GNU dd) shows live copy statistics:
cat bigfile | dd of=outfile status=progress
Working Around cp Limitations
- Some device files, file descriptors, or pseudo-files (like /proc or /sys) do not support cp, but streaming with cat + dd may work.
Parsing Command-line Options
getopts is a built-in shell command for parsing command-line options. It is modeled on getopt, a POSIX C library function, and parses options that follow the Unix/POSIX style. Specifically:
- Options are single alphanumeric characters preceded by a - (hyphen-minus) character, e.g. -a, -b, -c.
- Options can take an argument or none.
- Multiple options can be chained together, as long as the non-last ones are not argument-taking. If -a and -b take no arguments while -c takes an argument, -abc foo is the same as -a -b -c foo, but -bca is not the same as -b -c -a: because -c takes an argument, the trailing a is consumed as that argument.
- When an option takes an argument, this can be in the same token or in the next one. In other words, if -c takes an argument, -cfoo is the same as -c foo.
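The chaining rules can be checked with a short sketch. The helper show_parse is hypothetical, and it uses the optstring :abc: (-a and -b take no argument, -c takes one); optstrings are explained in the next section:

```shell
# Print how getopts splits the given arguments into options.
show_parse() {
    OPTIND=1          # restart parsing on every call
    parsed=""
    while getopts ':abc:' opt; do
        case "$opt" in
            a|b) parsed="$parsed -$opt" ;;
            c)   parsed="$parsed -c=$OPTARG" ;;
            *)   parsed="$parsed ?" ;;
        esac
    done
    echo "${parsed# }"
}

show_parse -abc foo       # prints: -a -b -c=foo
show_parse -a -b -c foo   # prints: -a -b -c=foo
show_parse -bca           # prints: -b -c=a
show_parse -cfoo          # prints: -c=foo
```

The third call shows why an argument-taking option must come last in a chain: everything after -c in the same token becomes its argument.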
Optstrings
Both getopt and getopts specify options using an optstring. Specifically:
- Begin an optstring with : to select silent error reporting, so that the caller handles invalid options and missing arguments itself.
- To specify an option that does not take an argument, append its name to the optstring.
- To specify an option that takes an argument, append its name and : to the optstring.
For example, the optstring that specifies two options -a, -b that do not take arguments and two options -c, -d that take arguments is :abc:d:.
Using getopts in a Shell Script
In shell scripts, getopts is invoked with an optstring inside a while-loop to parse command-line options.
Say that our shell script test_getopts.sh accepts two options -a, -b that do not take arguments and two options -c, -d that take arguments. Our shell script can look like this:
1 | |
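The original listing is not reproduced here. The following is a hypothetical reconstruction of what test_getopts.sh could look like, assuming the loop variable is called name and that each case simply prints a message (the function name parse_options and the message wording are made up):

```shell
#!/bin/sh
# test_getopts.sh (hypothetical reconstruction)

parse_options() {
    OPTIND=1   # reset so the function can be called more than once
    while getopts ':abc:d:' name; do
        case "$name" in
            a|b) echo "option -$name (no argument)" ;;
            c|d) echo "option -$name, argument: $OPTARG" ;;
            :)   echo "option -$OPTARG requires an argument" ;;
            \?)  echo "invalid option -$OPTARG" ;;
        esac
    done
}

parse_options "$@"
```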
Here, getopts is invoked with the optstring for specifying our options, :abc:d:. In each iteration of the while-loop, the next option is parsed and the shell variables name and OPTARG are set according to the case encountered:
- If a valid option is detected and that option does not take an argument, name is set to the name of the option.
- If a valid option is detected and that option takes an argument:
  - If we have provided an argument, name is set to the name of the option, and OPTARG is set to the value of the argument.
  - If we haven't provided an argument, name is set to :, and OPTARG is set to the name of the option.
- If an invalid option is detected, name is set to ?, and OPTARG is set to the name of the invalid option.
We can see getopts at work by providing different command-line options when invoking our Shell script.
Providing no command-line options:
1 | |
Providing option -a that does not take an argument:
1 | |
Providing option -a that does not take an argument, twice:
1 | |
Providing option -c that takes an argument with an argument foo:
1 | |
Providing option -c that takes an argument with an argument foo twice:
1 | |
Providing option -c that takes an argument without an argument:
1 | |
Providing an invalid option -e:
1 | |
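The invocations above can be replayed with a self-contained sketch. It writes a hypothetical stand-in for test_getopts.sh to a temporary file; that stand-in prints the raw values of name and OPTARG, so the output format is an assumption and not the original script's output:

```shell
# Hypothetical stand-in for test_getopts.sh, written to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
while getopts ':abc:d:' name; do
    case "$name" in
        a|b) echo "name=$name" ;;
        c|d) echo "name=$name OPTARG=$OPTARG" ;;
        :)   echo "name=: OPTARG=$OPTARG (missing argument)" ;;
        \?)  echo "name=? OPTARG=$OPTARG (invalid option)" ;;
    esac
done
EOF

out_none=$(sh "$script")                    # (no output)
out_a=$(sh "$script" -a)                    # name=a
out_a_twice=$(sh "$script" -a -a)           # name=a, printed twice
out_c=$(sh "$script" -c foo)                # name=c OPTARG=foo
out_c_twice=$(sh "$script" -c foo -c foo)   # name=c OPTARG=foo, printed twice
out_c_missing=$(sh "$script" -c)            # name=: OPTARG=c (missing argument)
out_invalid=$(sh "$script" -e)              # name=? OPTARG=e (invalid option)

printf '%s\n' "$out_a" "$out_c" "$out_c_missing" "$out_invalid"
rm -f "$script"
```

Because the optstring starts with :, getopts itself prints no error messages; the missing-argument and invalid-option cases are reported only through name and OPTARG.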
References
- https://www.baeldung.com/linux/grep-exclude-ps-results
- https://stackoverflow.com/questions/46008880/how-to-always-cut-the-pid-from-ps-aux-command
- https://en.wikipedia.org/wiki/Getopts
- https://pubs.opengroup.org/onlinepubs/9699919799/utilities/getopts.html
- https://en.wikipedia.org/wiki/Getopt
- https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html