<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8">
<meta name="generator" content="pdf2htmlEX">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<link rel="stylesheet" href="https://static.pudn.com/base/css/base.min.css">
<link rel="stylesheet" href="https://static.pudn.com/base/css/fancy.min.css">
<link rel="stylesheet" href="https://static.pudn.com/prod/directory_preview_static/62b2a4b4a11cf7345fc74d9a/raw.css">
<script src="https://static.pudn.com/base/js/compatibility.min.js"></script>
<script src="https://static.pudn.com/base/js/pdf2htmlEX.min.js"></script>
<script>
try{
pdf2htmlEX.defaultViewer = new pdf2htmlEX.Viewer({});
}catch(e){}
</script>
<title></title>
</head>
<body>
<div id="sidebar" style="display: none">
<div id="outline">
</div>
</div>
<div id="pf1" class="pf w0 h0" data-page-no="1"><div class="pc pc1 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/62b2a4b4a11cf7345fc74d9a/bg1.jpg"><div class="c x1 y1 w2 h2"><div class="t m0 x2 h3 y2 ff1 fs0 fc0 sc0 ls0 ws0">下载</div><div class="t m0 x3 h4 y3 ff2 fs0 fc1 sc0 ls0 ws0">第<span class="ff3">9</span>章<span class="_ _0"> </span><span class="ff3 ls1">AWK </span>介<span class="_ _0"> </span>绍</div><div class="t m0 x4 h5 y4 ff2 fs1 fc0 sc0 ls2 ws0">如果要格式化报文或从一个大的文本文件中抽取数据包,那么<span class="_ _1"> </span><span class="ff3 ls0 ws1">a w k<span class="_"> </span></span><span class="ls3">可以完成这些任务。它</span></div><div class="t m0 x5 h6 y5 ff2 fs1 fc0 sc0 ls4 ws0">在文本浏览和数据的熟练使用上性能优异。</div><div class="t m0 x4 h5 y6 ff2 fs1 fc0 sc0 ls0 ws0">整体来说,<span class="_ _2"></span><span class="ff3 ws2">a w k</span>是所有<span class="_ _3"></span><span class="ff3 ws2">s h e l l</span><span class="ls5">过滤工具中最难掌握的,不知道为什么,也许是其复杂的语法</span></div><div class="t m0 x5 h5 y7 ff2 fs1 fc0 sc0 ls6 ws0">或含义不明确的错误提示信息。在学习<span class="_ _4"> </span><span class="ff3 ls0 ws3">a w k<span class="_"> </span></span><span class="ls7">语言过程中,就会慢慢掌握诸如<span class="_ _5"> </span><span class="ff3 ls8">Bailing out <span class="_ _2"></span></span><span class="ls0">和</span></span></div><div class="t m0 x5 h5 y8 ff3 fs1 fc0 sc0 ls0 ws4">a w k : c m d . L i n e :<span class="ff2 ls9 ws0">等错误信息。可以说<span class="_ _2"> </span></span>a w k<span class="_"> </span><span class="ff2 lsa ws0">是一种自解释的编程语言,之所以要在<span class="_ _6"> </span></span>s h e l l<span class="ff2 ws0">中使用<span class="_ _3"></span></span>a w k</div><div class="t m0 x5 h5 y9 ff2 fs1 fc0 sc0 ls0 ws0">是因为<span class="_ _3"></span><span class="ff3 ws5">a w k<span class="_"> </span></span><span class="lsb">本身是学习的好例子,但结合<span class="_ _7"> </span></span><span class="ff3 ws5">a w k</span><span class="lsc">与其他工具诸如<span class="_ _2"></span></span><span class="ff3 ws5">g r e p</span>和<span class="_ _8"></span><span class="ff3 ws5">s e d</span>,将会使<span class="_ _2"></span><span class="ff3 ws5">s h e l l</span>编程更加</div><div class="t m0 x5 h6 ya ff2 fs1 fc0 sc0 ls0 ws0">容易。</div><div class="t m0 x4 h5 yb ff2 fs1 fc0 sc0 lsc ws0">本章没有讲述<span class="_ _3"></span><span class="ff3 ls0 ws6">a w k</span><span class="lsd">的全部特性,也不涉及<span class="_ _9"> </span><span class="ff3 ls0 ws6">a w k<span class="_"> </span></span><span class="ls3">的深层次编程,<span class="_ _a"></span><span class="lsd">(这些可以在专门讲述<span class="_ _2"> </span><span class="ff3 ls0 ws6">a w k<span class="_"> </span></span><span class="ls0">的</span></span></span></span></div><div class="t m0 x5 h5 yc ff2 fs1 fc0 sc0 lse ws0">书籍中找到)<span class="_ _a"></span><span class="lsf">。本章仅注重于讲述使用<span class="_ _2"></span><span class="ff3 ls0 ws7">a w k</span><span class="ls10">执行行操作及怎样从文本文件和字符串中抽取信息。</span></span></div><div class="t m0 x4 h6 yd ff2 fs1 fc0 sc0 ls11 ws0">本章内容有:</div><div class="t m0 x4 h5 ye ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2">抽取域。</span></div><div class="t m0 x4 h5 yf ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2 ls12">匹配正则表达式。</span></div><div class="t m0 x4 h5 y10 ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2">比较域。</span></div><div class="t m0 x4 h5 y11 ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2">向<span class="_ _8"></span></span><span class="ws7">a w k<span class="_"> </span></span><span class="ff2">传递参数。</span></div><div class="t m0 x4 h5 y12 ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2">基本的<span class="_ _3"></span></span><span class="ws7">a w k</span><span class="ff2 ls13">行操作和脚本。</span></div><div class="t m0 x4 h5 y13 ff2 fs1 fc0 sc0 ls14 ws0">本书几乎所有包含<span class="_ _7"> </span><span class="ff3 ls0 ws8">a w k<span class="_"> </span></span><span class="lse">命令的脚本都结合了<span class="_ _7"> </span><span class="ff3 ls0 ws8">s e d<span class="_"> </span></span><span class="ls0">和<span class="_ _8"></span><span class="ff3 ws8">g r e p<span class="_ _3"></span></span><span class="ls15">,以从文本文件和字符串中抽取信</span></span></span></div><div class="t m0 x5 h6 y14 ff2 fs1 fc0 sc0 ls16 ws0">息。为获得所需信息,文本必须格式化,意即用域分隔符划分抽取域,分隔符可能是任意字</div><div class="t m0 x5 h5 y15 ff2 fs1 fc0 sc0 ls17 ws0">符,在以后讲述<span class="_ _3"></span><span class="ff3 ls0 ws7">a w k<span class="_"> </span></span>时再详细讨论。</div><div class="t m0 x4 h5 y16 ff3 fs1 fc0 sc0 ls0 ws9">a w k<span class="ff2 ls18 ws0">以发展这种语言的人<span class="_ _7"> </span></span><span class="wsa">A h o . We n i n b e rg e r<span class="_ _b"></span><span class="ff2 ws0">和<span class="_ _8"></span></span></span>K e r n i g h a m<span class="_"> </span><span class="ff2 ls19 ws0">命名。还有<span class="_ _3"></span></span>n a w k<span class="_ _b"></span><span class="ff2 ws0">和<span class="_ _8"></span></span>g a w k<span class="_"> </span><span class="ff2 ls1a ws0">,它们扩</span></div><div class="t m0 x5 h6 y17 ff2 fs1 fc0 sc0 ls1b ws0">展了文本特性,但本章不予讨论。</div><div class="t m0 x4 h5 y18 ff3 fs1 fc0 sc0 ls0 wsb">a w k<span class="_"> </span><span class="ff2 ls1c ws0">语言的最基本功能是在文件或字符串中基于指定规则浏览和抽取信息。<span class="_ _c"> </span></span>a w k<span class="_"> </span><span class="ff2 ls1a ws0">抽取信息</span></div><div class="t m0 x5 h5 y19 ff2 fs1 fc0 sc0 ls1b ws0">后,才能进行其他文本操作。完整的<span class="_ _7"> </span><span class="ff3 ls0 ws7">a w k<span class="_"> </span></span><span class="ls1d">脚本通常用来格式化文本文件中的信息。</span></div><div class="t m0 x5 h7 y1a ff4 fs2 fc1 sc0 ls0 ws0">9.1 <span class="ff5">调用</span>awk</div><div class="t m0 x4 h5 y1b ff2 fs1 fc0 sc0 ls17 ws0">有三种方式调用<span class="_ _3"></span><span class="ff3 ls0 ws7">a w k</span><span class="ls1e">,第一种是命令行方式,如:</span></div><div class="t m0 x4 h5 y1c ff2 fs1 fc0 sc0 ls0 ws0">这里,<span class="_ _b"></span><span class="ff3 ws7">c o m m a n d s<span class="_"> </span></span>是真正的<span class="_ _3"></span><span class="ff3 ws7">a w k<span class="_"> </span></span><span class="ls1f">命令。本章将经常使用这种方法。</span></div><div class="t m0 x4 h5 y1d ff2 fs1 fc0 sc0 ls20 ws0">上面例子中,<span class="_ _2"></span><span class="ff3 ls0 wsc">[ - F</span><span class="ls21">域分隔符<span class="_ _3"></span><span class="ff3">]<span class="_ _8"></span></span><span class="ls22">是可选的,因为<span class="_ _2"> </span><span class="ff3 ls0 wsc">a w k<span class="_"> </span></span><span class="ls23">使用空格作为缺省的域分隔符,因此如果</span></span></span></div><div class="t m0 x5 h5 y1e ff2 fs1 fc0 sc0 ls24 ws0">要浏览域间有空格的文本,不必指定这个选项,但如果要浏览诸如<span class="_ _4"> </span><span class="ff3 ls0 wsd">p a s s w d<span class="_"> </span></span><span class="ls25">文件,此文件各域</span></div><div class="t m0 x5 h5 y1f ff2 fs1 fc0 sc0 ls26 ws0">以冒号作为分隔符,则必须指明<span class="_ _9"> </span><span class="ff3 ls0 ws7">- F<span class="_"> </span></span><span class="ls0">选项,如:</span></div><div class="t m0 x4 h5 y20 ff2 fs1 fc0 sc0 ls27 ws0">第二种方法是将所有<span class="_ _9"> </span><span class="ff3 ls0 wse">a w k<span class="_"> </span></span><span class="ls28">命令插入一个文件,并使<span class="_ _6"> </span><span class="ff3 ls0 wse">a w k<span class="_"> </span></span></span>程序可执行,然后用<span class="_ _9"> </span><span class="ff3 ls0 wse">a w k<span class="_ _b"></span></span><span class="ls1a">命令解释</span></div><div class="t m0 x5 h6 y21 ff2 fs1 fc0 sc0 ls29 ws0">器作为脚本的首行,以便通过键入脚本名称来调用它。</div><div class="t m0 x4 h5 y22 ff2 fs1 fc0 sc0 ls2a ws0">第三种方式是将所有的<span class="_ _2"></span><span class="ff3 ls0 ws7">a w k</span><span class="ls2b">命令插入一个单独文件,然后调用:</span></div></div></div><div class="pi" data-data='{"ctm":[1.860465,0.000000,0.000000,1.860465,0.000000,0.000000]}'></div></div>
</body>
</html>
<div id="pf2" class="pf w0 h0" data-page-no="2"><div class="pc pc2 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/62b2a4b4a11cf7345fc74d9a/bg2.jpg"><div class="t m0 x6 h5 y23 ff3 fs1 fc0 sc0 ls0 wsf">- f<span class="_"> </span><span class="ff2 ls2c ws0">选项指明在文件<span class="_ _9"> </span></span>a w k _ s c r i p t _ f i l e<span class="_"> </span><span class="ff2 ws0">中的<span class="_ _b"></span></span>a w k<span class="_"> </span><span class="ff2 ws0">脚本,<span class="_ _2"></span></span>i n p u t _ f i l e ( s )<span class="_"> </span><span class="ff2 ws0">是使用<span class="_ _3"></span></span>a w k<span class="_ _b"></span><span class="ff2 ls2d ws0">进行浏览的文件</span></div><div class="t m0 x7 h6 y24 ff2 fs1 fc0 sc0 ls0 ws0">名。</div><div class="t m0 x7 h7 y25 ff4 fs2 fc1 sc0 ls0 ws0">9.2 awk<span class="ff5">脚本</span></div><div class="t m0 x6 h5 y26 ff2 fs1 fc0 sc0 ls11 ws0">在命令中调用<span class="_ _b"></span><span class="ff3 ls0 ws7">a w k</span><span class="ls0">时,<span class="_ _b"></span><span class="ff3 ws7">a w k<span class="_ _8"></span></span><span class="ls1e">脚本由各种操作和模式组成。</span></span></div><div class="t m0 x6 h5 y27 ff2 fs1 fc0 sc0 ls0 ws0">如果设置了<span class="_ _2"></span><span class="ff3 ws6">- F<span class="_"> </span></span>选项,则<span class="_ _3"></span><span class="ff3 ws6">a w k<span class="_"> </span></span><span class="ls5">每次读一条记录或一行,并使用指定的分隔符分隔指定域,但</span></div><div class="t m0 x7 h5 y28 ff2 fs1 fc0 sc0 lsc ws0">如果未设置<span class="_ _b"></span><span class="ff3 ls0 ws10">- F<span class="_"> </span></span><span class="ls0">选项,<span class="_ _3"></span><span class="ff3 ws10">a w k</span><span class="ls2e">假定空格为域分隔符,并保持这个设置直到发现一新行。当新行出现</span></span></div><div class="t m0 x7 h5 y29 ff2 fs1 fc0 sc0 ls0 ws0">时,<span class="_ _8"></span><span class="ff3 ws11">a w k<span class="_"> </span></span><span class="ls2f">命令获悉已读完整条记录,然后在下一个记录启动读命令,这个读进程将持续到文件</span></div><div class="t m0 x7 h6 y2a ff2 fs1 fc0 sc0 ls30 ws0">尾或文件不再存在。</div><div class="t m0 x6 h5 y2b ff2 fs1 fc0 sc0 ls0 ws0">参照表<span class="_ _b"></span><span class="ff3 ws12">9 - 1<span class="_ _8"></span></span>,<span class="_ _8"></span><span class="ff3 ws12">a w k<span class="_ _8"></span></span><span class="ls31">每次在文件中读一行,找到域分隔符(这里是符号<span class="_ _d"> </span><span class="ff3">#</span></span>)<span class="_ _a"></span><span class="lsc">,设置其为域<span class="_ _3"></span><span class="ff3">n<span class="_ _8"></span></span><span class="ls0">,直</span></span></div><div class="t m0 x7 h5 y2c ff2 fs1 fc0 sc0 ls32 ws0">至一新行(这里是缺省记录分隔符)<span class="_ _a"></span><span class="ls18">,然后,划分这一行作为一条记录,接着<span class="_ _d"> </span><span class="ff3 ls0 ws13">a w k<span class="_"> </span></span><span class="ls33">再次启动下</span></span></div><div class="t m0 x7 h6 y2d ff2 fs1 fc0 sc0 lse ws0">一行读进程。</div><div class="t m0 x8 h8 y2e ff5 fs3 fc0 sc0 ls0 ws0">表<span class="_ _8"></span><span class="ff4 ls30">9-1 awk</span><span class="ls34">读文件记录的方式</span></div><div class="t m0 x9 h9 y2f ff2 fs4 fc0 sc0 ls0 ws0">域<span class="ff3">1<span class="_ _e"> </span></span>分<span class="_ _f"> </span>隔<span class="_ _7"> </span>符<span class="_ _10"> </span>域<span class="_ _8"></span><span class="ff3">2<span class="_ _11"> </span></span>分<span class="_ _f"> </span>隔<span class="_ _7"> </span>符<span class="_ _12"> </span>域<span class="_ _f"> </span><span class="ff3">3<span class="_ _13"> </span></span>分<span class="_ _f"> </span>隔<span class="_ _7"> </span>符<span class="_ _12"> </span>域<span class="_ _8"></span><span class="ff3">4</span>及换行</div><div class="t m0 x6 h9 y30 ff3 fs4 fc0 sc0 ls0 ws0">P<span class="_ _14"></span><span class="ws14">. B u n n y (<span class="ff2 ws0">记录<span class="_ _8"></span></span>1 )<span class="_ _15"> </span>#<span class="_ _16"> </span>0 2 / 9 9<span class="_ _17"> </span><span class="ws0"># <span class="_ _18"> </span></span>4 8<span class="_ _19"> </span><span class="ws0"># <span class="_ _1a"> </span>Yellow \n</span></span></div><div class="t m0 x6 h9 y31 ff3 fs4 fc0 sc0 ls0 ws15">J . Tr o l l (<span class="ff2 ws0">记录<span class="_ _b"></span></span><span class="ws14">2 )<span class="_ _18"> </span>#<span class="_ _16"> </span>0 7 / 9 9<span class="_ _17"> </span>#<span class="_ _1b"> </span>4 8 4 2<span class="_ _1c"> </span>#<span class="_ _1d"> </span><span class="ls35 ws0">Brown-3 \n</span></span></div><div class="t m0 x7 h6 y32 ff4 fs1 fc1 sc0 ls0 ws0">9.2.1 <span class="ff5">模式和动作</span></div><div class="t m0 x6 h5 y33 ff2 fs1 fc0 sc0 ls0 ws0">任何<span class="_ _b"></span><span class="ff3 wsb">a w k</span><span class="ls35">语句都由模式和动作组成。在一个<span class="_ _6"> </span></span><span class="ff3 wsb">a w k<span class="_ _b"></span></span><span class="ls36">脚本中可能有许多语句。模式部分决定动</span></div><div class="t m0 x7 h6 y34 ff2 fs1 fc0 sc0 ls16 ws0">作语句何时触发及触发事件。处理即对数据进行的操作。如果省略模式部分,动作将时刻保</div><div class="t m0 x7 h6 y35 ff2 fs1 fc0 sc0 lse ws0">持执行状态。</div><div class="t m0 x6 h5 y36 ff2 fs1 fc0 sc0 ls37 ws0">模式可以是任何条件语句或复合语句或正则表达式。模式包括两个特殊字段<span class="_ _0"> </span><span class="ff3 ls0 ws16">B E G I N<span class="_"> </span></span><span class="ls0">和</span></div><div class="t m0 x7 h5 y37 ff3 fs1 fc0 sc0 ls0 ws17">E N D<span class="_"> </span><span class="ff2 ws0">。使用<span class="_ _3"></span></span>B E G I N<span class="_ _b"></span><span class="ff2 ls38 ws0">语句设置计数和打印头。<span class="_ _9"> </span></span>B E G I N<span class="_ _8"></span><span class="ff2 ls36 ws0">语句使用在任何文本浏览动作之前,之后</span></div><div class="t m0 x7 h5 y38 ff2 fs1 fc0 sc0 ls39 ws0">文本浏览动作依据输入文件开始执行。<span class="_ _1e"> </span><span class="ff3 ls0 ws18">E N D<span class="_"> </span></span><span class="ls3a">语句用来在<span class="_ _2"></span><span class="ff3 ls0 ws18">a w k<span class="_"> </span></span><span class="ls3b">完成文本浏览动作后打印输出文</span></span></div><div class="t m0 x7 h5 y39 ff2 fs1 fc0 sc0 ls3c ws0">本总数和结尾状态标志。如果不特别指明模式,<span class="_ _6"> </span><span class="ff3 ls0 ws7">a w k<span class="_"> </span></span><span class="ls3d">总是匹配或打印行数。</span></div><div class="t m0 x6 h5 y3a ff2 fs1 fc0 sc0 lsc ws0">实际动作在大括号<span class="_ _1f"></span><span class="ff3 ls0 ws19">{ }<span class="_"> </span></span><span class="ls3e">内指明。动作大多数用来打印,但是还有些更长的代码诸如<span class="_ _1"> </span><span class="ff3 ls0 ws19">i f<span class="_ _8"></span></span><span class="ls0">和循环</span></span></div><div class="t m0 xa h5 y3b ff2 fs1 fc0 sc0 ls0 ws0">(<span class="_ _8"></span><span class="ff3 ws7">l o o p i n g</span><span class="ls3c">)语句及循环退出结构。如果不指明采取动作,<span class="_ _f"> </span></span><span class="ff3 ws7">a w k<span class="_ _b"></span></span><span class="ls3f">将打印出所有浏览出来的记录。</span></div><div class="t m0 x6 h6 y3c ff2 fs1 fc0 sc0 ls1b ws0">下面将深入讲解这些模式和动作。</div><div class="t m0 x7 h6 y3d ff4 fs1 fc1 sc0 ls0 ws0">9.2.2 <span class="ff5">域和记录</span></div><div class="t m0 x6 h5 y3e ff3 fs1 fc0 sc0 ls0 ws1a">a w k<span class="_"> </span><span class="ff2 ls40 ws0">执行时,其浏览域标记为<span class="_ _2"> </span></span>$ 1<span class="_ _b"></span><span class="ff2 ws0">,<span class="_ _8"></span></span>$ 2 . . . $ n<span class="_"> </span><span class="ff2 ls41 ws0">。这种方法称为域标识。使用这些域标识将更容</span></div><div class="t m0 x7 h6 y3f ff2 fs1 fc0 sc0 ls42 ws0">易对域进行进一步处理。</div><div class="t m0 x6 h5 y40 ff2 fs1 fc0 sc0 ls0 ws0">使用<span class="_ _3"></span><span class="ff3 ws1b">$ 1 , $ 3<span class="_"> </span></span><span class="ls3a">表示参照第<span class="_ _1f"></span><span class="ff3">1<span class="_ _8"></span></span></span>和第<span class="_ _3"></span><span class="ff3">3<span class="_ _8"></span></span><span class="ls2b">域,注意这里用逗号做域分隔。如果希望打印一个有<span class="_ _c"> </span><span class="ff3">5<span class="_ _8"></span></span></span>个域</div><div class="t m0 x7 h5 y41 ff2 fs1 fc0 sc0 ls43 ws0">的记录的所有域,不必指明<span class="_ _f"> </span><span class="ff3 ls0 ws1c">$ 1 , $ 2 , $ 3 , $ 4 , $ 5<span class="_"> </span></span><span class="ls0">,可使用<span class="_ _2"> </span><span class="ff3 ws1c">$ 0<span class="_"> </span></span><span class="ls22">,意即所有域。<span class="_ _2"> </span><span class="ff3">A<span class="_ _20"></span><span class="ls0 ws1c">w k<span class="_"> </span><span class="ff2 ls1a ws0">浏览时,到达一新</span></span></span></span></span></div><div class="t m0 x7 h6 y42 ff2 fs1 fc0 sc0 ls44 ws0">行,即假定到达包含域的记录末尾,然后执行新记录下一行的读动作,并重新设置域分隔。</div><div class="t m0 x7 h5 y43 ff2 fs1 fc0 sc0 lsf ws0">注意执行时不要混淆符号<span class="_ _1f"></span><span class="ff3">$<span class="_ _8"></span></span><span class="ls0">和<span class="_ _8"></span><span class="ff3 ws7">s h e l l</span>提示符<span class="_ _b"></span><span class="ff3">$<span class="_ _8"></span></span><span class="ls45">,它们是不同的。</span></span></div><div class="t m0 x6 h5 y44 ff2 fs1 fc0 sc0 ls46 ws0">为打印一个域或所有域,使用<span class="_ _9"> </span><span class="ff3 ls0 ws7">p r i n t</span><span class="ls17">命令。这是一个<span class="_ _3"></span><span class="ff3 ls0 ws7">a w k<span class="_"> </span></span><span class="ls1f">动作(动作语法用圆括号括起来)<span class="_ _a"></span><span class="ls0">。</span></span></span></div><div class="t m0 xb ha y45 ff6 fs1 fc0 sc0 ls0 ws0">第<span class="ff7">9</span>章<span class="_ _21"> </span><span class="ff7">AWK <span class="_ _3"></span></span>介<span class="_ _22"> </span>绍<span class="_ _23"> </span><span class="ff8 fs5 fc2">67</span></div><div class="c x1 y1 w2 h2"><div class="t m0 xc h3 y2 ff1 fs0 fc0 sc0 ls0 ws0">下载</div></div></div><div class="pi" data-data='{"ctm":[1.860465,0.000000,0.000000,1.860465,0.000000,0.000000]}'></div></div>
<div id="pf3" class="pf w0 h0" data-page-no="3"><div class="pc pc3 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/62b2a4b4a11cf7345fc74d9a/bg3.jpg"><div class="t m0 xd h5 y46 ff3 fs1 fc0 sc0 ls0 ws0">1. <span class="_ _b"></span><span class="ff1">抽取域</span></div><div class="t m0 xd h5 y47 ff2 fs1 fc0 sc0 ls3 ws0">真正执行前看几个例子,现有一文本文件<span class="_ _5"> </span><span class="ff3 ls0 ws1d">g r a d e . t x t<span class="_"> </span></span><span class="ls47">,记录了一个称为柔道数据库的行信</span></div><div class="t m0 xe h6 y48 ff2 fs1 fc0 sc0 ls0 ws0">息。</div><div class="t m0 xd h5 y49 ff2 fs1 fc0 sc0 lsc ws0">此文本文件有<span class="_ _3"></span><span class="ff3">7</span>个域,即(<span class="_ _3"></span><span class="ff3">1</span><span class="ls0">)名字、<span class="_ _24"></span>(<span class="_ _8"></span><span class="ff3">2</span><span class="ls3">)升段日期、<span class="_ _a"></span><span class="ls0">(<span class="ff3">3<span class="_ _b"></span></span><span class="ls3">)学生序号、<span class="_ _a"></span><span class="ls0">(<span class="ff3">4<span class="_ _8"></span></span><span class="ls3">)腰带级别、<span class="_ _a"></span><span class="ls0">(<span class="ff3">5<span class="_ _8"></span></span>)</span></span></span></span></span></span></span></div><div class="t m0 xe h5 y4a ff2 fs1 fc0 sc0 ls0 ws0">年龄、<span class="_ _25"></span>(<span class="_ _8"></span><span class="ff3">6</span><span class="ls45">)目前比赛积分、<span class="_ _a"></span><span class="ls0">(<span class="ff3">7<span class="_ _8"></span></span><span class="ls13">)比赛最高分。</span></span></span></div><div class="t m0 xd h5 y4b ff2 fs1 fc0 sc0 ls1d ws0">因为域间使用空格作为域分隔符,故不必用<span class="_ _f"> </span><span class="ff3 ls0 ws7">- F<span class="_"> </span></span><span class="ls4">选项划分域,现浏览文件并导出一些数据。</span></div><div class="t m0 xd h6 y4c ff2 fs1 fc0 sc0 ls48 ws0">在例子中为了利于显示,将空格加宽使各域看得更清晰。</div><div class="t m0 xd h5 y4d ff3 fs1 fc0 sc0 ls0 ws0">2. <span class="_ _b"></span><span class="ff1">保存<span class="_ _8"></span></span><span class="ws7">a w k</span><span class="ff1">输出</span></div><div class="t m0 xd h5 y4e ff2 fs1 fc0 sc0 ls22 ws0">有两种方式保存<span class="_ _2"> </span><span class="ff3 ls0 ws1e">s h e l l<span class="_"> </span></span><span class="ls21">提示符下<span class="_ _b"></span><span class="ff3 ls0 ws1e">a w k<span class="_ _b"></span></span><span class="ls34">脚本的输出。最简单的方式是使用输出重定向符号<span class="_ _1e"> </span><span class="ff3">></span><span class="ls0">文</span></span></span></div><div class="t m0 xe h5 y4f ff2 fs1 fc0 sc0 ls1b ws0">件名,下面的例子重定向输出到文件<span class="_ _7"> </span><span class="ff3 ls0 ws7">w o w</span><span class="ls0">。</span></div><div class="t m0 xd h6 y50 ff2 fs1 fc0 sc0 ls49 ws0">使用这种方法要注意,显示屏上不会显示输出结果。因为它直接输出到文件。只有在保</div><div class="t m0 xe h6 y51 ff2 fs1 fc0 sc0 ls4a ws0">证输出结果正确时才会使用这种方法。它也会重写硬盘上同名数据。</div><div class="t m0 xd h5 y52 ff2 fs1 fc0 sc0 ls4b ws0">第二种方法是使用<span class="_ _9"> </span><span class="ff3 ls0 ws1f">t e e<span class="_"> </span></span><span class="ls4c">命令,在输出到文件的同时输出到屏幕。在测试输出结果正确与否</span></div><div class="t m0 xe h5 y53 ff2 fs1 fc0 sc0 ls4d ws0">时多使用这种方法。例如输出重定向到文件<span class="_ _1e"> </span><span class="ff3 ls0 ws20">d e l e t e _ m e _ a n d _ d i e</span><span class="ls4e">,同时输出到屏幕。使用这种</span></div><div class="t m0 xe h5 y54 ff2 fs1 fc0 sc0 ls0 ws0">方法,在<span class="_ _3"></span><span class="ff3 ws7">a w k<span class="_"> </span></span><span class="ls11">命令结尾写入<span class="_ _d"> </span><span class="ff3 ls10">| tee delete_me_and_die<span class="_ _8"></span></span></span>。</div><div class="t m0 xd h5 y55 ff3 fs1 fc0 sc0 ls0 ws0">3. <span class="_ _b"></span><span class="ff1 lse">使用标准输入</span></div><div class="t m0 xd h5 y56 ff2 fs1 fc0 sc0 lsc ws0">在深入讲解这一章之前,先对<span class="_ _9"> </span><span class="ff3 ls0 ws1">a w k<span class="_"> </span></span><span class="ls4f">脚本的输入方法简要介绍一下。实际上任何脚本都是从</span></div><div class="t m0 xe h5 y57 ff2 fs1 fc0 sc0 ls3c ws0">标准输入中接受输入的。为运行本章脚本,使用<span class="_ _6"> </span><span class="ff3 ls0 ws7">a w k</span><span class="ls50">脚本输入文件格式,例如:</span></div><div class="t m0 xd h6 y58 ff2 fs1 fc0 sc0 ls42 ws0">也可替代使用下述格式:</div><div class="t m0 xd h6 y59 ff2 fs1 fc0 sc0 ls45 ws0">使用重定向方法:</div><div class="t m0 xd h6 y5a ff2 fs1 fc0 sc0 ls11 ws0">或管道方法:</div><div class="t m0 xd h5 y5b ff3 fs1 fc0 sc0 ls0 ws0">4. <span class="_ _b"></span><span class="ff1 ls11">打印所有记录</span></div><div class="t m0 xd h5 y5c ff3 fs1 fc0 sc0 ls0 ws21">a w k<span class="_"> </span><span class="ff2 lsc ws0">读每一条记录。因为没有模式部分,只有动作部分<span class="_ _6"> </span><span class="ff3 ls51">{print $0}(<span class="_ _b"></span></span><span class="ls11">打印所有记录<span class="_ _b"></span><span class="ff3">)<span class="_ _8"></span></span><span class="ls0">,这个动</span></span></span></div><div class="t m0 xe h6 y5d ff2 fs1 fc0 sc0 ls10 ws0">作必须用花括号括起来。上述命令打印整个文件。</div><div class="t m0 xf ha y45 ff8 fs5 fc2 sc0 ls0 ws0">68<span class="_ _26"> </span><span class="ff6 fs1 fc0">第二部分<span class="_ _21"> </span>文<span class="_ _6"> </span>本<span class="_ _27"> </span>过<span class="_ _6"> </span>滤</span></div><div class="c x1 y1 w2 h2"><div class="t m0 x2 h3 y2 ff1 fs0 fc0 sc0 ls0 ws0">下载</div></div></div><div class="pi" data-data='{"ctm":[1.860465,0.000000,0.000000,1.860465,0.000000,0.000000]}'></div></div>
<div id="pf4" class="pf w0 h0" data-page-no="4"><div class="pc pc4 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/62b2a4b4a11cf7345fc74d9a/bg4.jpg"><div class="t m0 x6 h5 y46 ff3 fs1 fc0 sc0 ls0 ws0">5. <span class="_ _b"></span><span class="ff1 lse">打印单独记录</span></div><div class="t m0 x6 h5 y47 ff2 fs1 fc0 sc0 ls52 ws0">假定只打印学生名字和腰带级别,通过查看域所在列,可知为<span class="_ _28"> </span><span class="ff3 ls0 ws22">f i e l d - 1<span class="_"> </span></span><span class="ls0">和<span class="_ _8"></span><span class="ff3 ws22">f i e l d - 4<span class="_"> </span></span><span class="ls3">,因此可以</span></span></div><div class="t m0 x7 h5 y48 ff2 fs1 fc0 sc0 ls0 ws0">使用<span class="_ _8"></span><span class="ff3 ws7">$ 1<span class="_"> </span></span>和<span class="_ _8"></span><span class="ff3 ws7">$ 4<span class="_"> </span></span><span class="ls26">,但不要忘了加逗号以分隔域。</span></div><div class="t m0 x6 h5 y5e ff3 fs1 fc0 sc0 ls0 ws0">6. <span class="_ _b"></span><span class="ff1">打印报告头</span></div><div class="t m0 x6 h5 y5f ff2 fs1 fc0 sc0 ls28 ws0">上述命令输出在名字和腰带级别之间用一些空格使之更容易划分,也可以在域间使用<span class="_ _21"> </span><span class="ff3 ls0 ws23">t a b</span></div><div class="t m0 x7 h5 y60 ff2 fs1 fc0 sc0 ls9 ws0">键加以划分。为加入<span class="_ _1f"></span><span class="ff3 ls0 ws24">t a b<span class="_ _b"></span></span><span class="ls0">键,使用<span class="_ _1f"></span><span class="ff3 ws24">t a b</span><span class="ls20">键速记引用符<span class="_ _5"> </span></span><span class="ff3 ws24">\ t<span class="_"> </span></span><span class="ls53">,后面将对速记引用加以详细讨论。也可</span></span></div><div class="t m0 x7 h5 y61 ff2 fs1 fc0 sc0 ls54 ws0">以为输出文本加入信息头。本例中加入<span class="_ _d"> </span><span class="ff3 ls0 ws25">n a m e<span class="_"> </span></span><span class="ls0">和<span class="_ _8"></span><span class="ff3 ws25">b e l t<span class="_"> </span></span><span class="ls35">及下划线。下划线使用<span class="_ _f"> </span></span><span class="ff3 ws25">\ n</span><span class="ls55">,强迫启动新行,</span></span></div><div class="t m0 x7 h5 y62 ff2 fs1 fc0 sc0 ls0 ws0">并在<span class="_ _3"></span><span class="ff3 ws26">\ n<span class="_"> </span></span><span class="ls56">下一行启动打印文本操作。打印信息头放置在<span class="_ _29"> </span></span><span class="ff3 ws26">B E G I N<span class="_ _b"></span></span><span class="ls57">模式部分,因为打印信息头被界</span></div><div class="t m0 x7 h5 y63 ff2 fs1 fc0 sc0 ls1d ws0">定为一个动作,必须用大括号括起来。在<span class="_ _7"> </span><span class="ff3 ls0 ws7">a w k<span class="_"> </span></span><span class="ls1b">查看第一条记录前,信息头被打印。</span></div><div class="t m0 x6 h5 y64 ff3 fs1 fc0 sc0 ls0 ws0">7. <span class="_ _b"></span><span class="ff1">打印信息尾</span></div><div class="t m0 x6 h5 y65 ff2 fs1 fc0 sc0 ls17 ws0">如果在末行加入<span class="_ _3"></span><span class="ff3 ls58">end of report</span><span class="lsc">信息,可使用<span class="_ _3"></span><span class="ff3 ls0 ws27">E N D<span class="_"> </span></span><span class="ls0">语句。<span class="_ _3"></span><span class="ff3 ws27">E N D<span class="_"> </span></span><span class="ls1e">语句在所有文本处理动作执行</span></span></span></div><div class="t m0 x7 h5 y66 ff2 fs1 fc0 sc0 ls34 ws0">完之后才被执行。<span class="_ _9"> </span><span class="ff3 ls0 ws28">E N D<span class="_"> </span></span><span class="ls59">语句在脚本中的位置放置在主要动作之后。下面简单打印头信息并告</span></div><div class="t m0 x7 h6 y67 ff2 fs1 fc0 sc0 ls12 ws0">之查询动作完成。</div><div class="t m0 x6 h5 y68 ff3 fs1 fc0 sc0 ls0 ws0">8. awk<span class="_ _1f"></span><span class="ff1 lse">错误信息提示</span></div><div class="t m0 x6 h5 y69 ff2 fs1 fc0 sc0 ls5a ws0">几乎可以肯定,在使用<span class="_ _9"> </span><span class="ff3 ls0 wsb">a w k<span class="_"> </span></span><span class="ls5b">时,将会在命令中碰到一些错误。<span class="_ _27"> </span><span class="ff3 ls0 wsb">a w k</span><span class="ls5c">将试图打印错误行,但</span></span></div><div class="t m0 x7 h6 y6a ff2 fs1 fc0 sc0 ls5d ws0">由于大部分命令都只在一行,因此帮助不大。</div><div class="t m0 x6 h5 y6b ff2 fs1 fc0 sc0 lsb ws0">系统给出的显示错误信息提示可读性不好。使用上述例子,如果丢了一个双引号,<span class="_ _5"> </span><span class="ff3 ls0 ws1">a w k<span class="_ _8"></span></span><span class="ls0">将</span></div><div class="t m0 x7 h6 y6c ff2 fs1 fc0 sc0 ls0 ws0">返回:</div><div class="t m0 xb ha y45 ff6 fs1 fc0 sc0 ls0 ws0">第<span class="ff7">9</span>章<span class="_ _21"> </span><span class="ff7">AWK <span class="_ _3"></span></span>介<span class="_ _2a"> </span>绍<span class="_ _23"> </span><span class="ff8 fs5 fc2">69</span></div><div class="c x1 y1 w2 h2"><div class="t m0 xc h3 y2 ff1 fs0 fc0 sc0 ls0 ws0">下载</div></div></div><div class="pi" data-data='{"ctm":[1.860465,0.000000,0.000000,1.860465,0.000000,0.000000]}'></div></div>
<div id="pf5" class="pf w0 h0" data-page-no="5"><div class="pc pc5 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/62b2a4b4a11cf7345fc74d9a/bg5.jpg"><div class="t m0 xd h5 y6d ff2 fs1 fc0 sc0 ls11 ws0">当第一次使用<span class="_ _3"></span><span class="ff3 ls0 ws1">a w k<span class="_"> </span></span><span class="ls5e">时,可能被错误信息搅得不知所措,但通过长时间和不断的学习,可总</span></div><div class="t m0 xe h5 y6e ff2 fs1 fc0 sc0 ls2a ws0">结出以下规则。在碰到<span class="_ _1f"></span><span class="ff3 ls0 ws7">a w k<span class="_"> </span></span>错误时,可相应查找:</div><div class="t m0 xd h5 y6f ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2">确保整个<span class="_ _3"></span></span><span class="ws7">a w k<span class="_"> </span></span><span class="ff2 ls2a">命令用单引号括起来。</span></div><div class="t m0 xd h5 y70 ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2 ls26">确保命令内所有引号成对出现。</span></div><div class="t m0 xd h5 y71 ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2 ls48">确保用花括号括起动作语句,用圆括号括起条件语句。</span></div><div class="t m0 xd h5 y72 ff3 fs1 fc0 sc0 ls0 ws0">• <span class="_ _8"></span><span class="ff2 ls3c">可能忘记使用花括号,也许你认为没有必要,但<span class="_ _6"> </span></span><span class="ws7">a w k</span><span class="ff2 ls3f">不这样认为,将按之解释语法。</span></div><div class="t m0 xd h6 y73 ff2 fs1 fc0 sc0 ls3c ws0">如果查询文件不存在,将得到下述错误信息:</div><div class="t m0 xd h5 y74 ff3 fs1 fc0 sc0 ls0 ws0">9. awk<span class="_ _1f"></span><span class="ff1">键盘输入</span></div><div class="t m0 xd h5 y75 ff2 fs1 fc0 sc0 ls46 ws0">如果在命令行并没有输入文件<span class="_ _9"> </span><span class="ff3 ls0 ws7">g r a d e . t x t</span><span class="ls11">,将会怎样?</span></div><div class="t m0 xd h5 y76 ff3 fs1 fc0 sc0 ls0 ws21">B E G I N<span class="_"> </span><span class="ff2 lsc ws0">部分打印了文件头,但<span class="_ _1f"></span></span>a w k<span class="_"> </span><span class="ff2 ls1b ws0">最终停止操作并等待,并没有返回<span class="_ _7"> </span></span>s h e l l<span class="_"> </span><span class="ff2 ls13 ws0">提示符。这是因</span></div><div class="t m0 xe h5 y77 ff2 fs1 fc0 sc0 ls0 ws0">为<span class="_ _8"></span><span class="ff3 ws29">a w k<span class="_"> </span></span><span class="ls5a">期望获得键盘输入。因为没有给出输入文件,<span class="_ _d"> </span></span><span class="ff3 ws29">a w k<span class="_"> </span></span><span class="ls1a">假定下面将会给出。如果愿意,顺序</span></div><div class="t m0 xe h5 y78 ff2 fs1 fc0 sc0 ls31 ws0">输入相关文本,并在输入完成后敲<span class="_ _7"> </span><span class="ff3 ls0"><C<span class="_ _b"></span><span class="ws2a">t r l - D ><span class="_"> </span></span></span>键。如果敲入了正确的域分隔符,<span class="_ _7"> </span><span class="ff3 ls0 ws2a">a w k<span class="_ _8"></span></span><span class="ls0">会像第一个</span></div><div class="t m0 xe h6 y79 ff2 fs1 fc0 sc0 ls5f ws0">例子一样正常处理文本。这种处理并不常用,因为它大多应用于大量的打印稿。</div><div class="t m0 xe h6 y7a ff4 fs1 fc1 sc0 ls0 ws0">9.2.3 awk<span class="ff5">中正则表达式及其操作</span></div><div class="t m0 xd h5 y7b ff2 fs1 fc0 sc0 ls0 ws0">在<span class="_ _8"></span><span class="ff3 ws2b">g r e p<span class="_"> </span></span><span class="ls60">一章中,有许多例子用到正则表达式,这里将不使用同样的例子,但可以使用条</span></div><div class="t m0 xe h5 y7c ff2 fs1 fc0 sc0 ls0 ws0">件操作讲述<span class="_ _1f"></span><span class="ff3 ws7">a w k<span class="_ _8"></span></span><span class="ls3d">中正则表达式的用法。</span></div><div class="t m0 xd h5 y7d ff2 fs1 fc0 sc0 ls61 ws0">这里正则表达式用斜线括起来。例如,在文本文件中查询字符串<span class="_ _28"> </span><span class="ff3 ls0 ws12">G r e e n</span><span class="ls0">,使用<span class="_ _3"></span><span class="ff3 ws12">/ G r e e n /<span class="_"> </span></span>可以</span></div><div class="t m0 xe h5 y7e ff2 fs1 fc0 sc0 ls0 ws0">查出单词<span class="_ _3"></span><span class="ff3 ws7">G r e e n</span><span class="ls11">的出现情况。</span></div><div class="t m0 xe h6 y7f ff4 fs1 fc1 sc0 ls0 ws0">9.2.4 <span class="ff5">元字符</span></div><div class="t m0 xd h5 y80 ff2 fs1 fc0 sc0 ls0 ws0">这里是<span class="_ _1f"></span><span class="ff3 ws1f">a w k<span class="_"> </span></span><span class="ls62">中正则表达式匹配操作中经常用到的字符,详细情况请参阅本书第<span class="_ _2b"> </span><span class="ff3">7<span class="_ _8"></span></span><span class="ls1a">章正则表</span></span></div><div class="t m0 xe h6 y81 ff2 fs1 fc0 sc0 ls0 ws0">达式概述。</div><div class="t m0 xd hb y82 ff9 fs6 fc0 sc0 ls63 ws0">\ ^ $ . [] | () * + ?</div><div class="t m0 xd h5 y83 ff2 fs1 fc0 sc0 ls12 ws0">这里有两个字符第<span class="_ _3"></span><span class="ff3">7<span class="_ _8"></span></span><span class="ls26">章没有讲到,因为它们只适用于<span class="_ _9"> </span><span class="ff3 ls0 ws7">a w k</span><span class="ls0">而不适用于<span class="_ _1f"></span><span class="ff3 ws7">g r e p<span class="_"> </span></span>或<span class="_ _8"></span><span class="ff3 ws7">s e d<span class="_"> </span></span>。它们是:</span></span></div><div class="t m0 xd h5 y84 ff3 fs1 fc0 sc0 ls0 ws0">+<span class="_ _2c"> </span><span class="ff2">使用<span class="_ _8"></span></span>+<span class="_ _8"></span><span class="ff2 ls3d">匹配一个或多个字符。</span></div><div class="t m0 xd h5 y85 ff2 fs1 fc0 sc0 ls0 ws0">?<span class="_ _c"> </span><span class="ls46">匹配模式出现频率。例如使用<span class="_ _9"> </span><span class="ff3">/<span class="ls0 ws7">X Y<span class="_"> </span><span class="ws0">?Z/<span class="_ _3"></span></span></span></span></span>匹配<span class="_ _b"></span><span class="ff3 ws7">X Y Z<span class="_"> </span></span>或<span class="_ _8"></span><span class="ff3 ws7">Y Z</span>。</div><div class="t m0 xe h6 y86 ff4 fs1 fc1 sc0 ls0 ws0">9.2.5 <span class="ff5">条件操作符</span></div><div class="t m0 xd h5 y87 ff2 fs1 fc0 sc0 ls0 ws0">表<span class="ff3 ws7">9 - 2<span class="_ _8"></span></span>给出<span class="_ _b"></span><span class="ff3 ws7">a w k</span><span class="ls1f">条件操作符,后面将给出其用法。</span></div><div class="t m0 xf ha y45 ff8 fs5 fc2 sc0 ls0 ws0">70<span class="_ _26"> </span><span class="ff6 fs1 fc0">第二部分<span class="_ _21"> </span>文<span class="_ _6"> </span>本<span class="_ _27"> </span>过<span class="_ _6"> </span>滤</span></div><div class="c x1 y1 w2 h2"><div class="t m0 x2 h3 y2 ff1 fs0 fc0 sc0 ls0 ws0">下载</div></div></div><div class="pi" data-data='{"ctm":[1.860465,0.000000,0.000000,1.860465,0.000000,0.000000]}'></div></div>