Awk -- A Pattern Scanning and Processing Language USD:19-1

Awk -- A Pattern Scanning and Processing Language

(Second Edition)

Alfred V. Aho

Brian W. Kernighan

Peter J. Weinberger

AT&T Bell Laboratories

Murray Hill, New Jersey 07974

ABSTRACT

Awk is a programming language whose basic operation is to
search a set of files for patterns, and to perform specified
actions upon lines or fields of lines which contain instances
of those patterns. Awk makes certain data selection and
transformation operations easy to express; for example, the
awk program

length > 72

prints all input lines whose length exceeds 72 characters; the
program

NF % 2 == 0

prints all lines with an even number of fields; and the program

{ $1 = log($1); print }

replaces the first field of each line by its logarithm.

Awk patterns may include arbitrary boolean combinations of
regular expressions and of relational operators on strings,
numbers, fields, variables, and array elements. Actions may
include the same pattern-matching constructions as in patterns,
as well as arithmetic and string expressions and assignments,
if-else, while, for statements, and multiple output streams.

This report contains a user's guide, a discussion of the design
and implementation of awk, and some timing statistics.

1. Introduction

Awk is a programming language designed to make many common
information retrieval and text manipulation tasks easy to state
and to perform.

The basic operation of awk is to scan a set of input lines in
order, searching for lines which match any of a set of patterns
which the user has specified. For each pattern, an action can be
specified; this action will be performed on each line that
matches the pattern.

Readers familiar with the UNIX program grep[1] will recognize
the approach, although in awk the patterns may be more general
than in grep, and the actions allowed are more involved than
merely printing the matching line. For example, the awk program

{print $3, $2}

prints the third and second columns of a table in that order.
The program

$2 ~ /A|B|C/

prints all input lines with an A, B, or C in the second field.

The program

$1 != prev { print; prev = $1 }

prints all lines in which the first field is different from the
previous first field.

--------------------------------------------------

UNIX is a trademark of AT&T Bell Laboratories.
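The three one-line programs quoted in this introduction can be
tried on a small table. The sketch below is only an illustration:
the four-line sample input and the shell variable names are
assumptions, not part of the original report.

```sh
# Hypothetical three-column sample input (not from the report).
data='x A 1
x B 2
y D 3
y C 4'

# Print the third and second columns, in that order.
cols=$(printf '%s\n' "$data" | awk '{print $3, $2}')

# Print the lines with an A, B, or C in the second field.
abc=$(printf '%s\n' "$data" | awk '$2 ~ /A|B|C/')

# Print the lines whose first field differs from the previous first field.
firsts=$(printf '%s\n' "$data" | awk '$1 != prev { print; prev = $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$cols" "$abc" "$firsts"
```

With this input the second program drops only the "y D 3" line,
and the third keeps the first line of each run of equal first
fields ("x A 1" and "y D 3").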

1.1. Usage

The command

awk program [files]

executes the awk commands in the string program on the set of
named files, or on the standard input if there are no files. The
statements can also be placed in a file pfile, and executed by
the command

awk -f pfile [files]
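As a rough illustration of the two invocation forms, the sketch
below runs the same one-line program first as a command-line
string, then from a program file, and finally on the standard
input. The scratch directory, the file names input.txt and pfile,
and the program NF > 1 are assumptions chosen for the example.

```sh
# Work in a scratch directory; all file names here are hypothetical.
dir=$(mktemp -d)
printf 'short\nthis line is longer\n' > "$dir/input.txt"

# Form 1: the program is given as a string on the command line.
# It is quoted so that the shell passes it to awk unchanged.
inline=$(awk 'NF > 1' "$dir/input.txt")

# Form 2: the same program is stored in a file and run with -f.
printf 'NF > 1\n' > "$dir/pfile"
fromfile=$(awk -f "$dir/pfile" "$dir/input.txt")

# With no file arguments, awk reads the standard input.
stdin=$(awk 'NF > 1' < "$dir/input.txt")

printf '%s\n%s\n%s\n' "$inline" "$fromfile" "$stdin"
rm -r "$dir"
```

All three runs should print the same line, since only the second
input line has more than one field.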

1.2. Program Structure

An awk program is a sequence of statements of the form:

pattern { action }

...

Each line of input is matched against each of the patterns in
turn. For each pattern that matches, the associated action is
executed. When all the patterns have been tested, the next line
is fetched and the matching starts over.

Either the pattern or the action may be left out, but not both.
If there is no action for a pattern, the matching line is simply
copied to the output. (Thus a line which matches several
patterns can be printed several times.) If there is no pattern
for an action, then the action is performed for every input line.
A line which matches no pattern is ignored.

Since patterns and actions are both optional, actions must be
enclosed in braces to distinguish them from patterns.
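The rules above can be seen at work in a short program that mixes
the three statement forms. This is a minimal sketch; the sample
data and the messages printed are invented for illustration.

```sh
# Hypothetical two-column sample input.
data='apple 5
banana 12
cherry 7'

out=$(printf '%s\n' "$data" | awk '
/an/    { print "has an:", $1 }        # pattern and action
$2 > 6                                 # pattern only: the line is copied
        { print "end of record", NR }  # action only: runs for every line
')
printf '%s\n' "$out"

# A line matching several action-less patterns is printed once per match;
# a line matching no pattern ("cd") produces no output.
twice=$(printf 'ab\ncd\n' | awk '/a/
/b/')
printf '%s\n' "$twice"
```

In the first program "banana 12" triggers all three statements,
while "apple 5" reaches only the action that has no pattern. Note
that the brace opening an action must sit on the same line as its
pattern; a brace on the following line starts a new statement
with no pattern, as the third statement does here.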

1.3. Records and Fields

Awk input is divided into "records" terminated by a record
separator. The default record separator is a newline, so by
default awk processes its input a line at a time. The number of
the current record is available in a variable named NR.

Each input record is considered to be divided into "fields."
Fields are normally separated by white space -- blanks or tabs --
but the input field separator may be changed, as described below.
Fields are referred to as $1, $2, and so forth, where $1 is the
first field, and $0 is the whole input record itself. Fields may
be assigned to. The number of fields in the current record is
available in a variable named NF.
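A small sketch may make NR, NF, $0 and field assignment concrete.
The two input lines are made up; the first separates its fields
with a tab and with two blanks to show that either counts as
white space.

```sh
# Hypothetical input: a tab and a double blank both act as field separators.
out=$(printf 'alpha\tbeta  gamma\none two\n' |
    awk '{ $2 = "X"; print NR, NF, $1, $0 }')
printf '%s\n' "$out"
```

Assigning to $2 also changes $0: awk rebuilds the record from its
fields, so the first output line ends in "alpha X gamma", with the
original tab and double blank replaced by single blanks.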

The variables FS and RS refer to the input field and record
separators; they may be changed at any time to any single
character. The optional command-line argument -Fc may also be
used to set FS to the character c.

If the record separator is empty, an empty input line is taken as
the record separator, and blanks, tabs and newlines are treated
as field separators.

The variable FILENAME contains the name of the current input
file.
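The separators and FILENAME can be exercised as follows. This is
a sketch under stated assumptions: the file passwd.txt and its
contents are invented, and RS is set inside a BEGIN action, a
special pattern that this part of the report has not yet
introduced and that runs before any input is read.

```sh
# Scratch directory and sample file; both are hypothetical.
dir=$(mktemp -d)
printf 'root:x:0\ndaemon:x:1\n' > "$dir/passwd.txt"

# -F: sets FS to the single character ":".
names=$(awk -F: '{ print $1 }' "$dir/passwd.txt")

# FILENAME holds the name of the file currently being read.
fname=$(awk 'NR == 1 { print FILENAME }' "$dir/passwd.txt")

# An empty RS makes blank lines separate records, and newlines then
# separate fields in addition to blanks and tabs.
para=$(printf 'a b\nc\n\nd e\nf\n' |
    awk 'BEGIN { RS = "" } { print NR ": " NF }')

printf '%s\n%s\n%s\n' "$names" "$fname" "$para"
rm -r "$dir"
```

Each of the two blank-line-separated records in the last command
spans two lines and contains three fields.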

1.4. Printing

An action may have no pattern, in which case the action is
executed for all lines. The simplest action is to print some or
all of a record; this is accomplished by the awk command print.
The awk program

{ print }

prints each record, thus copying the input to the output intact.
More useful is to print a field or fields from each record. For
instance,

print $2, $1

prints the first two fields in reverse order. Items separated by
a comma in the print statement will be separated by the current
output field separator when output. Items not separated by
commas will be concatenated, so

print $1 $2

runs the first and second fields together.
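The difference between the two forms of print is easy to check on
a single line of input. The input "left right" is an arbitrary
example, and the output separator is assumed to be at its default
value of one blank.

```sh
# The first print uses a comma, the second plain juxtaposition.
out=$(printf 'left right\n' | awk '{ print $2, $1; print $1 $2 }')
printf '%s\n' "$out"
```

The first statement prints "right left" and the second prints
"leftright".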

The predefined variables NF and NR can be used; for example

{ print NR, NF, $0 }

prints each record preceded by the record number and the number
of fields.

Output may be diverted to multiple files; the program

{ print $1 >"foo1"; print $2 >"foo2" }

writes the first field, $1, on the file foo1, and the second
field on file foo2. The >> notation can also be used:

print $1 >>"foo"

appends the output to the file foo. (In each case, the output

...
