This appendix describes the features of the awk scripting language.
The syntax for invoking awk has two basic forms:
awk [-v var=value] [-Fre] [--] 'pattern { action }' var=value datafile(s)
awk [-v var=value] [-Fre] -f scriptfile [--] var=value datafile(s)
An awk command line consists of the command, the script, and the input filename(s). Input is read from the files named on the command line. If no input file is named, or if "-" is specified, awk reads standard input. The -F option sets the field separator (FS) to the regular expression re.
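As a minimal sketch of the -F option, the following reads colon-separated lines from standard input and prints the first field of each. The sample data is invented for illustration.

```shell
# Split each input line on ":" and print the first field.
# No datafile is named, so awk reads standard input.
out=$(printf 'root:x:0\ndaemon:x:1\n' | awk -F: '{ print $1 }')
echo "$out"
```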
The -v option sets the variable var to value before the script is executed. This happens even before the BEGIN procedure is run. (See the discussion below on command-line parameters.)
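A short sketch of -v follows. The variable name greeting is hypothetical; the point is only that its value is already set when the BEGIN procedure runs.

```shell
# -v assigns the variable before BEGIN runs, so BEGIN can print it.
# The script has only a BEGIN procedure, so no input is read.
out=$(awk -v greeting=hello 'BEGIN { print greeting }')
echo "$out"
```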
Following POSIX argument parsing conventions, the "--" option marks the end of command-line options. Using this option, for instance, you could specify a datafile that begins with "-", which would otherwise be confused with a command-line option.
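The sketch below shows one case where "--" matters: a scriptfile given with -f, followed by a datafile whose name starts with "-". The filenames number.awk and -data.txt are hypothetical, and the files are created in a temporary directory so the example is self-contained.

```shell
# Create a scratch directory holding a script and a datafile named "-data.txt".
dir=$(mktemp -d)
printf '%s\n' '{ print NR ": " $0 }' > "$dir/number.awk"
printf 'one\ntwo\n' > "$dir/-data.txt"

# "--" ends option processing, so -data.txt is taken as a filename.
out=$(cd "$dir" && awk -f number.awk -- -data.txt)
echo "$out"

rm -rf "$dir"
```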
You can specify a script consisting of pattern and action on the command line, surrounded by single quotes. Alternatively, you can place the script in a separate file and specify the name of the scriptfile on the command line with the -f option.
Parameters can be passed into awk by specifying them on the command line after the script. This includes setting system variables such as FS, OFS, and RS. The value can be a literal, a shell variable ($var) or the result of a command (`cmd`); it must be quoted if it contains spaces or tabs. Any number of parameters can be specified.
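For illustration, the following sketch passes a shell variable, a quoted literal containing a space, and the system variable OFS as parameters after the script. The names min, label, and threshold are hypothetical, and "-" is given explicitly so that standard input is read.

```shell
threshold=2

# min comes from a shell variable, label is a quoted literal with a space,
# and OFS changes the separator that print puts between its arguments.
out=$(printf 'a 1\nb 3\n' |
      awk '$2 > min { print label, $1 }' min="$threshold" label="above threshold" OFS=: -)
echo "$out"
```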
Command-line parameters are not available until the first line of input is read, and thus cannot be accessed in the BEGIN procedure. (Older implementations of awk and nawk would process leading command-line assignments before running the BEGIN procedure. This was contrary to the behavior documented in The AWK Programming Language, which says that assignments are processed when awk would otherwise go to open them as filenames, i.e., after the BEGIN procedure. The Bell Labs awk was changed to correct this in early 1989, and the -v option was added at the same time; -v is now part of POSIX awk.) Parameters are evaluated in the order in which they appear on the command line up until a filename is recognized. Parameters appearing after that filename become available when the next filename is recognized.
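The ordering rules can be seen in a small sketch, assuming a POSIX-conforming awk. The variable tag and the files first.txt and second.txt are hypothetical. The variable is still empty in BEGIN, has the value "one" while the first file is read, and has the value "two" while the second is read.

```shell
# Two one-line datafiles in a scratch directory.
dir=$(mktemp -d)
printf 'x\n' > "$dir/first.txt"
printf 'y\n' > "$dir/second.txt"

# Each assignment takes effect only when awk reaches it in the argument list.
out=$(cd "$dir" && awk 'BEGIN { print "BEGIN: tag=[" tag "]" }
                              { print $0 ": tag=[" tag "]" }' \
        tag=one first.txt tag=two second.txt)
echo "$out"

rm -rf "$dir"
```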
Typing a script at the system prompt is only practical for simple, one-line scripts. Any script that you might invoke as a command and reuse can be put inside a shell script. Using a shell script to invoke awk makes the script easy for others to use.
You can put the command line that invokes awk in a file, giving it a name that identifies what the script does. Make that file executable (using the chmod command) and put it in a directory where local commands are kept. The name of the shell script can then be typed on the command line to execute the awk script. This is the preferred approach for scripts that will be used and reused.
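A minimal sketch of such a wrapper follows. The script name countlines is hypothetical, and a temporary directory stands in for the directory where local commands are kept.

```shell
dir=$(mktemp -d)

# Write a shell script whose only job is to invoke awk.
# "$@" passes any parameters and filenames through to awk.
cat > "$dir/countlines" <<'EOF'
#!/bin/sh
# countlines: print the number of input lines
awk 'END { print NR }' "$@"
EOF

# Make it executable, then run it by name.
chmod +x "$dir/countlines"
out=$(printf 'a\nb\nc\n' | "$dir/countlines")
echo "$out"

rm -rf "$dir"
```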
On modern UNIX systems, including Linux, you can use the #! syntax to create self-contained awk scripts:
#! /usr/bin/awk -f
script
Awk parameters and the input filename can be specified on the command line that invokes the script. Note that the pathname of awk to use in the #! line is system-dependent.
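Because the location of awk varies, the sketch below builds the #! line from whatever path the shell finds for awk rather than hard-coding one. The script name sum1 is hypothetical, and the approach assumes the path to awk contains no spaces.

```shell
dir=$(mktemp -d)
awk_path=$(command -v awk)   # e.g. /usr/bin/awk on many systems

# Write a self-contained awk script: a #! line followed by the awk program.
{
  printf '#!%s -f\n' "$awk_path"
  printf '%s\n' '{ total += $1 }' 'END { print total }'
} > "$dir/sum1"
chmod +x "$dir/sum1"

# The script is now a command that sums the first field of its input.
out=$(printf '1\n2\n3\n' | "$dir/sum1")
echo "$out"

rm -rf "$dir"
```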