sed & awk

Appendix B
Quick Reference for awk

B.3 Command Summary for awk

The following alphabetical list of statements and functions includes all that are available in POSIX awk, nawk, or gawk. See Chapter 11, A Flock of awks, for extensions available in different implementations.

atan2()

atan2(y, x)

Returns the arctangent of y/x in radians.
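A common use of atan2() is to obtain the value of pi. The following is a minimal sketch, assuming a POSIX awk on the command line; the variable name pi is chosen for illustration.

```shell
out=$(awk 'BEGIN {
    pi = atan2(0, -1)            # the angle of the point (-1, 0) is pi radians
    printf("%.5f\n", pi)
}')
echo "$out"
```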

break

Exit from a while, for, or do loop.

close()

close(filename-expr)

close(command-expr)

In most implementations of awk, you can only have a limited number of files and/or pipes open simultaneously. Therefore, awk provides a close() function that allows you to close a file or a pipe. It takes as an argument the same expression that opened the pipe or file. This expression must be identical, character by character, to the one that opened the file or pipe - even whitespace is significant.
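As a sketch of the rule that close() takes the same expression that opened the pipe, the example below keeps the command in a variable. The sort command and the sample words are assumptions made for illustration.

```shell
out=$(awk 'BEGIN {
    cmd = "sort"                 # one string is used both to open and to close
    print "pear"  | cmd
    print "apple" | cmd
    close(cmd)                   # sort sees end of input and writes its result
    print "done"
}')
echo "$out"
```

Holding the command in a variable avoids the character-by-character mismatch that can occur when the string is typed twice.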

continue

Begin next iteration of while, for, or do loop.

cos()

cos(x)

Return cosine of x in radians.

delete

delete array[element]

Delete element of an array.
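A short sketch of delete, using the in operator to check whether an element still exists. The array name fruit and its contents are hypothetical.

```shell
out=$(awk 'BEGIN {
    fruit["a"] = "apple"; fruit["b"] = "banana"
    delete fruit["a"]
    # The "in" test checks membership without creating the element.
    state = ("a" in fruit) ? "present" : "deleted"
    print "a: " state
    state = ("b" in fruit) ? "present" : "deleted"
    print "b: " state
}')
echo "$out"
```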

do

do

body

while (expr)

Looping statement. Execute statements in body then evaluate expr and if true, execute body again.

exit

exit [expr]

Exit from script, reading no new input. The END rule, if it exists, will be executed. An optional expr becomes awk's return value.
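The following sketch illustrates that exit stops input processing, still runs the END rule, and passes its expression back as the exit status. The input lines and the status value 3 are arbitrary choices for the example.

```shell
out=$(printf 'one\nstop\nthree\n' | awk '
$1 == "stop" { exit 3 }          # stop reading input; the END rule still runs
{ print }
END { print "END reached" }' || echo "exit status: $?")
echo "$out"
```

The line "three" is never printed because no input is read after exit.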

exp()

exp(x)

Return exponential of x (e ^ x).

for

for (init-expr; test-expr; incr-expr) statement

C-style looping construct. init-expr assigns the initial value of the counter variable. test-expr is a relational expression that is evaluated each time before executing the statement. When test-expr is false, the loop is exited. incr-expr is used to increment the counter variable after each pass.

for (item in array) statement

Special loop designed for reading associative arrays. For each element of the array, the statement is executed; the element can be referenced by array[item].
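A minimal sketch of the for (item in array) form, counting occurrences of the first field. The order in which elements are visited is not specified, so this example pipes the output through sort to make it stable; the array name count and the input are assumptions.

```shell
out=$(printf 'b\na\nb\n' | awk '
{ count[$1]++ }
END {
    for (word in count)          # word takes each subscript in turn
        print word, count[word] | "sort"
    close("sort")
}')
echo "$out"
```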

getline

Read next line of input.

getline [var] [<file]

command | getline [var]

The first form reads input from file (or, when no file is given, from the current input) and the second form reads the output of command. Both forms read one line at a time; each time the statement is executed, it gets the next line of input. By default the line is assigned to $0 and parsed into fields, setting NF. Plain getline also updates NR and FNR, the command form updates NR, and reading from a file leaves both unchanged. If var is specified, the line is assigned to var instead, and $0 and NF are not changed. getline is actually a function: it returns 1 if it reads a record successfully, 0 if end-of-file is encountered, and -1 if for some reason it is otherwise unsuccessful.
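Because getline can return -1, loops should test for a value greater than 0 rather than for a merely non-zero value. The sketch below reads from a command and then from a file assumed not to exist; the command string and the path /nonexistent/file are hypothetical.

```shell
out=$(awk 'BEGIN {
    cmd = "echo alpha; echo beta"
    while ((cmd | getline line) > 0)   # 1 = record read, 0 = end of input, -1 = error
        n++
    close(cmd)
    print n " lines read, last was " line
    rc = (getline line < "/nonexistent/file")
    print "missing file returns " rc
}')
echo "$out"
```

A loop written as while (getline line < file) would never terminate on an unreadable file, since -1 counts as true.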

gsub()

gsub(r, s, t)

Globally substitute s for each match of the regular expression r in the string t. Return the number of substitutions. If t is not supplied, defaults to $0.
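A small sketch of gsub() with the third argument omitted, so that the substitution applies to $0. The input line is an arbitrary example.

```shell
out=$(echo "a b a" | awk '{
    n = gsub(/a/, "X")           # no third argument, so $0 is edited in place
    print n ": " $0
}')
echo "$out"
```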

if

if (expr) statement1

[ else statement2 ]

Conditional statement. Evaluate expr and, if true, execute statement1; if else clause is supplied, execute statement2 if expr is false.

index()

index(str, substr)

Return position (starting at 1) of substr in str, or 0 if substr does not occur in str.
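For illustration, a sketch of index() with one substring that is present and one that is not; the strings are arbitrary.

```shell
out=$(awk 'BEGIN {
    print index("banana", "nan")     # first occurrence, counting from 1
    print index("banana", "xyz")     # 0 means not found
}')
echo "$out"
```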

int()

int(x)

Return integer value of x by truncating any digits following a decimal point.

length()

length(str)

Return length of string, or the length of $0 if no argument.

log()

log(x)

Return natural logarithm (base e) of x.

match()

match(s, r)

Function that matches the pattern, specified by the regular expression r, in the string s and returns either the position in s where the match begins, or 0 if no occurrences are found. Sets the values of RSTART and RLENGTH to the start and length of the match, respectively.
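RSTART and RLENGTH are convenient arguments to substr() for extracting the matched text. The sketch below assumes a sample string; the second call shows the values left behind by a failed match.

```shell
out=$(awk 'BEGIN {
    s = "order 12345 shipped"
    if (match(s, /[0-9]+/))
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
    match(s, /z/)                    # no match: RSTART is 0 and RLENGTH is -1
    print RSTART, RLENGTH
}')
echo "$out"
```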

next

Read next input line and begin executing script at first rule.

print

print [ output-expr ] [ dest-expr ]

Evaluate the output-expr and direct it to standard output followed by the value of ORS. Each output-expr is separated by the value of OFS. dest-expr is an optional expression that directs the output to a file or pipe. "> file" directs the output to a file, overwriting its previous contents. ">> file" appends the output to a file, preserving its previous contents. In both of these cases, the file will be created if it does not already exist. "| command" directs the output as the input to a system command.
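The roles of OFS and ORS can be seen in a sketch like the following; the separator values are arbitrary choices for the example.

```shell
out=$(awk 'BEGIN {
    OFS = "-"                    # placed between comma-separated expressions
    ORS = "!\n"                  # appended after each print
    print "a", "b"
    print "c" "d"                # no comma: plain concatenation, OFS is not used
}')
echo "$out"
```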

printf

printf (format-expr [, expr-list ]) [ dest-expr ]

An alternative output statement borrowed from the C language. It has the ability to produce formatted output. It can also be used to output data without automatically producing a newline. format-expr is a string of format specifications and constants; see next section for a list of format specifiers. expr-list is a list of arguments corresponding to format specifiers. See the print statement for a description of dest-expr.
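As a brief sketch of printf producing output without an automatic newline, two calls below build a single output line; the strings and values are illustrative.

```shell
out=$(awk 'BEGIN {
    printf("%s: ", "total")      # no newline is added automatically
    printf("%d items\n", 3)      # the newline must be written explicitly
}')
echo "$out"
```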

rand()

rand()

Generate a random number n such that 0 <= n < 1. This function returns the same series of numbers each time the script is executed, unless the random number generator is seeded using the srand() function.

return

return [expr]

Used at end of user-defined functions to exit function, returning value of expression.

sin()

sin(x)

Return sine of x in radians.

split()

split(str, array, sep)

Function that parses string into elements of array using field separator, returning number of elements in array. Value of FS is used if no field separator is specified. Array splitting works the same as field splitting.
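A minimal sketch of split() with an explicit separator; the date string and the array name part are assumptions for the example.

```shell
out=$(awk 'BEGIN {
    n = split("2024-01-15", part, "-")   # elements are numbered from 1
    print n, part[1], part[2], part[3]
}')
echo "$out"
```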

sprintf()

sprintf (format-expr [, expr-list ] )

Function that returns string formatted according to printf format specification. It formats data but does not output it. format-expr is a string of format specifications and constants; see the next section for a list of format specifiers. expr-list is a list of arguments corresponding to format specifiers.
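The sketch below shows sprintf() building a string that is stored rather than printed; the variable name label is hypothetical.

```shell
out=$(awk 'BEGIN {
    label = sprintf("%s=%.2f", "pi", 3.14159)   # nothing is output here
    print length(label), label
}')
echo "$out"
```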

sqrt()

sqrt(x)

Return square root of x.

srand()

srand(expr)

Use expr to set a new seed for random number generator. Default is time of day. Return value is the old seed.
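One way to see the effect of seeding is to reseed with the same value and compare results. This is a sketch only: the seed 42 is arbitrary, and the scaling to a die roll is a common idiom rather than anything prescribed here.

```shell
out=$(awk 'BEGIN {
    srand(42)                    # 42 is an arbitrary seed chosen for this sketch
    a = rand()
    old = srand(42)              # reseed; the return value is the previous seed
    b = rand()
    print "repeatable: " ((a == b) ? "yes" : "no")
    print "previous seed: " old
    roll = int(rand() * 6) + 1   # scale to an integer from 1 to 6
    print "roll in range: " ((roll >= 1 && roll <= 6) ? "yes" : "no")
}')
echo "$out"
```

Calling srand() with no argument instead seeds from the time of day, so the series differs between runs.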

sub()

sub(r, s, t)

Substitute s for first match of the regular expression r in the string t. Return 1 if successful; 0 otherwise. If t is not supplied, defaults to $0.
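For comparison with gsub(), a sketch of sub() replacing only the first match and reporting whether it succeeded; the string is an arbitrary example.

```shell
out=$(awk 'BEGIN {
    s = "a-b-c"
    n = sub(/-/, "+", s)         # only the first hyphen is replaced
    print n, s
    n = sub(/x/, "+", s)         # no match, so s is unchanged
    print n, s
}')
echo "$out"
```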

substr()

substr(str, beg, len)

Return substring of string str at beginning position beg, and the characters that follow to maximum specified length len. If no length is given, use the rest of the string.
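A short sketch of substr() with and without the length argument, using an arbitrary string.

```shell
out=$(awk 'BEGIN {
    s = "abcdefgh"
    print substr(s, 3, 4)        # four characters starting at position 3
    print substr(s, 6)           # no length: the rest of the string
}')
echo "$out"
```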

system()

system(command)

Function that executes the specified command and returns its status. The status of the executed command typically indicates success or failure. A value of 0 means that the command executed successfully. A non-zero value, whether positive or negative, indicates a failure of some sort. The documentation for the command you're running will give you the details. The output of the command is not available for processing within the awk script. Use "command | getline" to read the output of a command into the script.
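The difference between system() and "command | getline" might be sketched as follows. The commands true, exit 2, and echo are stand-ins chosen for the example.

```shell
out=$(awk 'BEGIN {
    rc = system("true")
    print "true returned " rc
    rc = system("exit 2")        # the shell started by system() exits with 2
    print "exit 2 returned " rc
    "echo captured" | getline line   # getline, not system(), captures output
    print "read via getline: " line
}')
echo "$out"
```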

tolower()

tolower(str)

Translate all uppercase characters in str to lowercase and return the new string.[3]

[3] Very early versions of nawk, such as that in SunOS 4.1.x, don't support tolower() and toupper(). However, they are now part of the POSIX specification for awk.

toupper()

toupper(str)

Translate all lowercase characters in str to uppercase and return the new string.
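Both case-conversion functions return a new string and leave their argument unchanged, as this sketch with a hypothetical variable s shows.

```shell
out=$(awk 'BEGIN {
    s = "Hello World"
    print tolower(s)
    print toupper(s)
    print s                      # the original string is not modified
}')
echo "$out"
```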

while

while (expr) statement

Looping construct. While expr is true, execute statement.

B.3.1 Format Expressions Used in printf and sprintf

A format expression can take three optional modifiers following "%" and preceding the format specifier:

%-width.precision format-specifier

The width of the output field is a numeric value. When you specify a field width, the contents of the field will be right-justified by default. You must specify "-" to get left-justification. Thus, "%-20s" outputs a string left-justified in a field 20 characters wide. If the string is less than 20 characters, the field will be padded with spaces to fill.

The precision modifier, used for decimal or floating-point values, controls the number of digits that appear to the right of the decimal point. For string formats, it controls the number of characters from the string to print.
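The effect of the width, "-", and precision modifiers can be seen in a sketch like this one. The square brackets are only there to make the padding visible, and the values are arbitrary.

```shell
out=$(awk 'BEGIN {
    printf("[%10s]\n", "awk")        # right-justified in 10 columns
    printf("[%-10s]\n", "awk")       # left-justified in 10 columns
    printf("[%.2s]\n", "awk")        # precision limits a string to 2 characters
    printf("[%8.3f]\n", 3.14159)     # 8 columns wide, 3 digits after the point
}')
echo "$out"
```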

You can specify both the width and precision dynamically, via values in the printf or sprintf argument list. You do this by specifying asterisks, instead of specifying literal values.

printf("%*.*g\n", 5, 3, myvar);

In this example, the width is 5, the precision is 3, and the value to print will come from myvar. Older versions of nawk may not support this.

Note that the default precision for the output of numeric values is "%.6g". The default can be changed by setting the system variable OFMT. This affects the precision used by the print statement when outputting numbers. For instance, if you are using awk to write reports that contain dollar values, you might prefer to change OFMT to "%.2f".
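A sketch of OFMT in action follows; the variable names and values are illustrative. Values that are integers are printed as integers regardless of OFMT.

```shell
out=$(awk 'BEGIN {
    x = 3.14159265
    print x                      # uses the default OFMT of "%.6g"
    OFMT = "%.2f"
    y = 2.71828
    print y                      # now formatted with two decimal places
    print 17                     # integer values ignore OFMT
}')
echo "$out"
```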

The format specifiers, shown in Table B.6, are used with printf and sprintf statements.

Table B.6: Format Specifiers Used in printf

Character   Description
c           ASCII character.
d           Decimal integer.
i           Decimal integer. Added in POSIX.
e           Floating-point format ([-]d.precisione[+-]dd).
E           Floating-point format ([-]d.precisionE[+-]dd).
f           Floating-point format ([-]ddd.precision).
g           e or f conversion, whichever is shortest, with trailing zeros removed.
G           E or f conversion, whichever is shortest, with trailing zeros removed.
o           Unsigned octal value.
s           String.
x           Unsigned hexadecimal number. Uses a-f for 10 to 15.
X           Unsigned hexadecimal number. Uses A-F for 10 to 15.
%           Literal %.

Often, whatever format specifiers are available in the system's sprintf(3) subroutine are available in awk.

The way printf and sprintf() do rounding will often depend upon the system's C sprintf(3) subroutine. On many machines, sprintf rounding is "unbiased," which means it doesn't always round a trailing ".5" up, contrary to naive expectations. In unbiased rounding, ".5" rounds to even, rather than always up, so 1.5 rounds to 2 but 4.5 rounds to 4. The result is that if you are using a format that does rounding (e.g., "%.0f") you should check what your system does. The following function does traditional rounding; it might be useful if your awk's printf does unbiased rounding.

# round --- do normal rounding
#	Arnold Robbins, [email protected]
#	Public Domain
function round(x,       ival, aval, fraction)
{
	ival = int(x)	# integer part, int() truncates
	# see if fractional part
	if (ival == x)	# no fraction
		return x
	if (x < 0) {
		aval = -x	# absolute value
		ival = int(aval)
		fraction = aval - ival
		if (fraction >= .5)
			return int(x) - 1		# -2.5 --> -3
		else
			return int(x)		# -2.3 --> -2
	} else {
		fraction = x - ival
		if (fraction >= .5)
			return ival + 1
		else
			return ival
	}
}
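For comparison, a more compact way to get the same round-half-away-from-zero behavior is sketched below. The name round_half_up is hypothetical and this is not the function from the text; it relies on int() truncating toward zero.

```shell
out=$(awk '
# round_half_up is a hypothetical helper, not the round() function above.
function round_half_up(x) {
    # Add one half to the magnitude, truncate, then restore the sign.
    return (x < 0) ? -int(-x + 0.5) : int(x + 0.5)
}
BEGIN {
    print round_half_up(1.5), round_half_up(4.5), round_half_up(-2.5), round_half_up(-2.3)
}')
echo "$out"
```

Unlike unbiased rounding, 4.5 rounds to 5 here, and -2.5 rounds to -3 as in the comments of the longer function.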

