Awk -- A Pattern Scanning and Processing Language USD:19-1

Awk -- A Pattern Scanning and Processing Language

(Second Edition)

Alfred V. Aho

Brian W. Kernighan

Peter J. Weinberger

AT&T Bell Laboratories

Murray Hill, New Jersey 07974

ABSTRACT

Awk is a programming language whose basic operation is to
search a set of files for patterns, and to perform specified
actions upon lines or fields of lines which contain instances
of those patterns. Awk makes certain data selection and
transformation operations easy to express; for example, the
awk program

length > 72

prints all input lines whose length exceeds 72 characters; the
program

NF % 2 == 0

prints all lines with an even number of fields; and the program

{ $1 = log($1); print }

replaces the first field of each line by its logarithm.

Awk patterns may include arbitrary boolean combinations of
regular expressions and of relational operators on strings,
numbers, fields, variables, and array elements. Actions may
include the same pattern-matching constructions as in patterns,
as well as arithmetic and string expressions and assignments,
if-else, while, for statements, and multiple output streams.

This report contains a user's guide, a discussion of the design
and implementation of awk, and some timing statistics.

1. Introduction

Awk is a programming language designed to make many common
information retrieval and text manipulation tasks easy to state
and to perform.

The basic operation of awk is to scan a set of input lines in
order, searching for lines which match any of a set of patterns
which the user has specified. For each pattern, an action can be
specified; this action will be performed on each line that
matches the pattern.

Readers familiar with the UNIX program grep[1] will recognize
the approach, although in awk the patterns may be more general
than in grep, and the actions allowed are more involved than
merely printing the matching line. For example, the awk program

{print $3, $2}

prints the third and second columns of a table in that order.
The program

$2 ~ /A|B|C/

prints all input lines with an A, B, or C in the second field.

--------------------------------------------------
UNIX is a trademark of AT&T Bell Laboratories.

The program

$1 != prev { print; prev = $1 }

prints all lines in which the first field is different from the
previous first field.
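
The three programs above can be tried on a small table. The
following is only a sketch: the sample data and the shell
variable names (table, swapped, abc, changed) are assumptions
made for illustration and do not appear in the original report.

```sh
# Hypothetical three-column sample table (not from the original report).
table='x A 1
x B 2
y D 3
y C 4'

# Print the third and second columns, in that order.
swapped=$(printf '%s\n' "$table" | awk '{ print $3, $2 }')

# Print lines whose second field contains an A, B, or C.
abc=$(printf '%s\n' "$table" | awk '$2 ~ /A|B|C/')

# Print lines whose first field differs from the previous first field.
# prev starts out empty, so the first line is always printed.
changed=$(printf '%s\n' "$table" | awk '$1 != prev { print; prev = $1 }')

printf '%s\n' "$swapped" "$abc" "$changed"
```

With this input, the second program drops the line whose second
field is D, and the third keeps only the first line of each run
of equal first fields.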

1.1. Usage

The command

awk program [files]

executes the awk commands in the string program on the set of
named files, or on the standard input if there are no files. The
statements can also be placed in a file pfile, and executed by
the command

awk -f pfile [files]
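
As a hedged illustration of the two invocation forms, the sketch
below runs the same one-line program from a string, from a
program file, and on standard input. The scratch directory and
the file names data.txt and pfile are hypothetical, and mktemp -d
is assumed to be available on the system.

```sh
# Work in a scratch directory; the file names are hypothetical.
dir=$(mktemp -d)
printf 'one two\nthree four\n' > "$dir/data.txt"

# Form 1: the program is given as a string. Single quotes keep the
# shell from expanding $2 before awk sees it.
inline=$(awk '{ print $2 }' "$dir/data.txt")

# Form 2: the same program is read from a file with -f.
printf '{ print $2 }\n' > "$dir/pfile"
fromfile=$(awk -f "$dir/pfile" "$dir/data.txt")

# With no file arguments, awk reads the standard input.
fromstdin=$(awk '{ print $2 }' < "$dir/data.txt")

printf '%s\n' "$inline" "$fromfile" "$fromstdin"
rm -r "$dir"
```

All three invocations should produce the same two lines, since
only the way the program and the input are supplied differs.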

1.2. Program Structure

An awk program is a sequence of statements of the form:

pattern { action }
pattern { action }
...

Each line of input is matched against each of the patterns in
turn. For each pattern that matches, the associated action is
executed. When all the patterns have been tested, the next line
is fetched and the matching starts over.
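
The matching cycle described above can be sketched with a
two-statement program. The input numbers, the labels "big:" and
"small:", and the variable name out are assumptions chosen for
illustration.

```sh
# Hypothetical input: one number per line.
out=$(printf '5\n15\n25\n' | awk '
$1 > 10 { print "big:", $1 }      # first pattern, tested first
$1 < 20 { print "small:", $1 }    # second pattern, tested on the same line
')

printf '%s\n' "$out"
```

The line 15 satisfies both patterns, so both actions run for it,
in program order, before the next line is fetched.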

Either the pattern or the action may be left out, but not both.
If there is no action for a pattern, the matching line is simply
copied to the output. (Thus a line which matches several patterns
can be printed several times.) If there is no pattern for an
action, then the action is performed for every input line. A
line which matches no pattern is ignored.

Since patterns and actions are both optional, actions must be
enclosed in braces to distinguish them from patterns.
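
A minimal sketch of the two abbreviated statement forms follows.
The sample words and the variable names (input, copied, lengths)
are illustrative assumptions, not taken from the report.

```sh
# Hypothetical three-line input.
input='alpha
beta
alphabet'

# Patterns without actions: each matching line is copied to the output.
# "alphabet" matches both patterns, so it is printed twice.
copied=$(printf '%s\n' "$input" | awk '
/alpha/
/bet/
')

# Action without a pattern: performed for every input line.
lengths=$(printf '%s\n' "$input" | awk '{ print length($0) }')

printf '%s\n' "$copied" "$lengths"
```

Because the second program is wrapped in braces, awk reads it as
an action rather than as a pattern.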

1.3. Records and Fields

Awk input is divided into ``records'' terminated by a record
separator. The default record separator is a newline, so by
default awk processes its input a line at a time. The number of
the current record is available in a variable named NR.
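
As a small sketch of how NR behaves with the default record
separator, the example below numbers each record and then uses NR
as a pattern. The sample input and the variable names (input,
numbered, second) are assumptions made for illustration.

```sh
# Hypothetical three-record input; each line is one record.
input='first
second
third'

# Prefix every record with its record number.
numbered=$(printf '%s\n' "$input" | awk '{ print NR ": " $0 }')

# Use NR in a pattern: select only the second record.
second=$(printf '%s\n' "$input" | awk 'NR == 2')

printf '%s\n' "$numbered" "$second"
```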

Each input record is considered to be divided into ``fields.''
Fields are normally separated by white space -- blanks or tabs --
but the input field separator may be changed, as described below.
Fields are referred to as $1, $2, and so forth, where $1 is the
first field, and $0 is the whole input

...
