ITPub博客

首页 > Linux操作系统 > Linux操作系统 > [SHELL] ONE-LINERS FOR SED (Unix stream editor)

[SHELL] ONE-LINERS FOR SED (Unix stream editor)

原创 Linux操作系统 作者:fancyboy2k 时间:2010-03-05 11:21:26 0 删除 编辑
摘自 http://sed.sourceforge.net/sed1line.txt

示例:
$ sed -e 'd' /etc/services
如果输入该命令,将得不到任何输出。那么,发生了什么?在该例中,用一个编辑命令 'd' 调用 sed。sed 打开 /etc/services 文件,将一行读入其模式缓冲区,执行编辑命令(“删除行”),然后打印模式缓冲区(缓冲区已为空)。然后,它对后面的每一行重复这些步骤。这不会产生输出,因为 "d" 命令除去了模式缓冲区中的每一行!
在该例中,还有几件事要注意。
  • 首先,根本没有修改 /etc/services。这还是因为 sed 只读取在命令行指定的文件,将其用作输入 -- 它不试图修改该文件。
  • 第二件要注意的事是 sed 是面向行的。'd' 命令不是简单地告诉 sed 一下子删除所有输入数据。相反,sed 逐行将 /etc/services 的每一行读入其称为模式缓冲区的内部缓冲区。一旦将一行读入模式缓冲区,它就执行 'd' 命令,然后打印模式缓冲区的内容(在本例中没有内容)。
  • 第三件要注意的事是括起 'd' 命令的单引号的用法。养成使用单引号来括起 sed 命令的习惯是个好注意,这样可以禁用 shell 扩展。
文本间隔:

# double space a file
sed G

# double space a file which already has blank lines in it. Output file
# should contain no more than one blank line between lines of text.
sed '/^$/d;G'

# triple space a file
sed 'G;G'

# undo double-spacing (assumes even-numbered lines are always blank)
sed 'n;d'

# insert a blank line above every line which matches "regex"
sed '/regex/{x;p;x;}'

# insert a blank line below every line which matches "regex"
sed '/regex/G'

# insert a blank line above and below every line which matches "regex"
sed '/regex/{x;p;x;G;}'

编号:

# number each line of a file (simple left alignment). Using a tab (see
# note on '\t' at end of file) instead of space will preserve margins.
sed = filename | sed 'N;s/\n/\t/'

# number each line of a file (number on left, right-aligned)
sed = filename | sed 'N; s/^/ /; s/ *\(.\{6,\}\)\n/\1 /'

# number each line of file, but only print numbers if line is not blank
sed '/./=' filename | sed '/./N; s/\n/ /'

# count lines (emulates "wc -l")
sed -n '$='

文本转换和替代:

# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//' # assumes that all lines end with CR/LF
sed 's/^M$//' # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' # works on ssed, gsed 3.02.80 or higher

# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/" # command line under ksh
sed 's/$'"/`echo \\\r`/" # command line under bash
sed "s/$/`echo \\\r`/" # command line under zsh
sed 's/$/\r/' # gsed 3.02.80 or higher

# IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$//" # method 1
sed -n p # method 2

# IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
# Can only be done with UnxUtils sed, version 4.0.7 or higher. The
# UnxUtils version can be identified by the custom "--text" switch
# which appears when you use the "--help" switch. Otherwise, changing
# DOS newlines to Unix newlines cannot be done with sed in a DOS
# environment. Use "tr" instead.
sed "s/\r//" infile >outfile # UnxUtils sed v4.0.7 or higher
tr -d \r outfile # GNU tr version 1.22 or higher

# delete leading whitespace (spaces, tabs) from front of each line
# aligns all text flush left
sed 's/^[ \t]*//' # see note on '\t' at end of file

# delete trailing whitespace (spaces, tabs) from end of each line
sed 's/[ \t]*$//' # see note on '\t' at end of file

# delete BOTH leading and trailing whitespace from each line
sed 's/^[ \t]*//;s/[ \t]*$//'

# insert 5 blank spaces at beginning of each line (make page offset)
sed 's/^/ /'

# align all text flush right on a 79-column width
sed -e :a -e 's/^.\{1,78\}$/ &/;ta' # set at 78 plus 1 space

# center all text in the middle of 79-column width. In method 1,
# spaces at the beginning of the line are significant, and trailing
# spaces are appended at the end of the line. In method 2, spaces at
# the beginning of the line are discarded in centering the line, and
# no trailing spaces appear at the end of lines.
sed -e :a -e 's/^.\{1,77\}$/ & /;ta' # method 1
sed -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/' # method 2

# substitute (find and replace) "foo" with "bar" on each line
sed 's/foo/bar/' # replaces only 1st instance in a line
sed 's/foo/bar/4' # replaces only 4th instance in a line
sed 's/foo/bar/g' # replaces ALL instances in a line
sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replace the next-to-last case
sed 's/\(.*\)foo/\1bar/' # replace only the last case

# substitute "foo" with "bar" ONLY for lines which contain "baz"
sed '/baz/s/foo/bar/g'

# substitute "foo" with "bar" EXCEPT for lines which contain "baz"
sed '/baz/!s/foo/bar/g'

# change "scarlet" or "ruby" or "puce" to "red"
sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g' # most seds
gsed 's/scarlet\|ruby\|puce/red/g' # GNU sed only

# reverse order of lines (emulates "tac")
# bug/feature in HHsed v1.5 causes blank lines to be deleted
sed '1!G;h;$!d' # method 1
sed -n '1!G;h;$p' # method 2

# reverse each character on the line (emulates "rev")
sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'

# join pairs of lines side-by-side (like "paste")
sed '$!N;s/\n/ /'

# if a line ends with a backslash, append the next line to it
sed -e :a -e '/\\$/N; s/\\\n//; ta'

# if a line begins with an equal sign, append it to the previous line
# and replace the "=" with a single space
sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'

# add commas to numeric strings, changing "1234567" to "1,234,567"
gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta' # GNU sed
sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta' # other seds

# add commas to numbers with decimal points and minus signs (GNU sed)
gsed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'

# add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.)
gsed '0~5G' # GNU sed only
sed 'n;n;n;n;G;' # other seds

选择显示特定的行:

# print first 10 lines of file (emulates behavior. of "head")
sed 10q

# print first line of file (emulates "head -1")
sed q

# print the last 10 lines of a file (emulates "tail")
sed -e :a -e '$q;N;11,$D;ba'

# print the last 2 lines of a file (emulates "tail -2")
sed '$!N;$!D'

# print the last line of a file (emulates "tail -1")
sed '$!d' # method 1
sed -n '$p' # method 2

# print the next-to-the-last line of a file
sed -e '$!{h;d;}' -e x # for 1-line files, print blank line
sed -e '1{$q;}' -e '$!{h;d;}' -e x # for 1-line files, print the line
sed -e '1{$d;}' -e '$!{h;d;}' -e x # for 1-line files, print nothing

# print only lines which match regular expression (emulates "grep")
sed -n '/regexp/p' # method 1
sed '/regexp/!d' # method 2

# print only lines which do NOT match regexp (emulates "grep -v")
sed -n '/regexp/!p' # method 1, corresponds to above
sed '/regexp/d' # method 2, simpler syntax

# print the line immediately before a regexp, but not the line
# containing the regexp
sed -n '/regexp/{g;1!p;};h'

# print the line immediately after a regexp, but not the line
# containing the regexp
sed -n '/regexp/{n;p;}'

# print 1 line of context before and after regexp, with line number
# indicating where the regexp occurred (similar to "grep -A1 -B1")
sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h

# grep for AAA and BBB and CCC (in any order)
sed '/AAA/!d; /BBB/!d; /CCC/!d'

# grep for AAA and BBB and CCC (in that order)
sed '/AAA.*BBB.*CCC/!d'

# grep for AAA or BBB or CCC (emulates "egrep")
sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d # most seds
gsed '/AAA\|BBB\|CCC/!d' # GNU sed only

# print paragraph if it contains AAA (blank lines separate paragraphs)
# HHsed v1.5 must insert a 'G;' after 'x;' in the next 3 scripts below
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'

# print paragraph if it contains AAA and BBB and CCC (in any order)
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'

# print paragraph if it contains AAA or BBB or CCC
sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d' # GNU sed only

# print only lines of 65 characters or longer
sed -n '/^.\{65\}/p'

# print only lines of less than 65 characters
sed -n '/^.\{65\}/!p' # method 1, corresponds to above
sed '/^.\{65\}/d' # method 2, simpler syntax

# print section of file from regular expression to end of file
sed -n '/regexp/,$p'

# print section of file based on line numbers (lines 8-12, inclusive)
sed -n '8,12p' # method 1
sed '8,12!d' # method 2

# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files

# beginning at line 3, print every 7th line
gsed -n '3~7p' # GNU sed only
sed -n '3,${p;n;n;n;n;n;n;}' # other seds

# print section of file between two regular expressions (inclusive)
sed -n '/Iowa/,/Montana/p' # case sensitive

选择删除特定的行:

# print all of file EXCEPT section between 2 regular expressions
sed '/Iowa/,/Montana/d'

# delete duplicate, consecutive lines from a file (emulates "uniq").
# First line in a set of duplicate lines is kept, rest are deleted.
sed '$!N; /^\(.*\)\n\1$/!P; D'

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'

# delete all lines except duplicate lines (emulates "uniq -d").
sed '$!N; s/^\(.*\)\n\1$/\1/; t; D'

# delete the first 10 lines of a file
sed '1,10d'

# delete the last line of a file
sed '$d'

# delete the last 2 lines of a file
sed 'N;$!P;$!D;$d'

# delete the last 10 lines of a file
sed -e :a -e '$d;N;2,10ba' -e 'P;D' # method 1
sed -n -e :a -e '1,10!{P;N;D;};N;ba' # method 2

# delete every 8th line
gsed '0~8d' # GNU sed only
sed 'n;n;n;n;n;n;n;d;' # other seds

# delete lines matching pattern
sed '/pattern/d'

# delete ALL blank lines from a file (same as "grep '.' ")
sed '/^$/d' # method 1
sed '/./!d' # method 2

# delete all CONSECUTIVE blank lines from file except the first; also
# deletes all blank lines from top and end of file (emulates "cat -s")
sed '/./,/^$/!d' # method 1, allows 0 blanks at top, 1 at EOF
sed '/^$/N;/\n$/D' # method 2, allows 1 blank at top, 0 at EOF

# delete all CONSECUTIVE blank lines from file except the first 2:
sed '/^$/N;/\n$/N;//D'

# delete all leading blank lines at top of file
sed '/./,$!d'

# delete all trailing blank lines at end of file
sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' # works on all seds
sed -e :a -e '/^\n*$/N;/\n$/ba' # ditto, except for gsed 3.02.*

# delete the last line of each paragraph
sed -n '/^$/{p;h;};/./{x;/./p;}'

特殊应用:

# remove nroff overstrikes (char, backspace) from man pages. The 'echo'
# command may need an -e switch if you use Unix System V or bash shell.
sed "s/.`echo \\\b`//g" # double quotes required for Unix environment
sed 's/.^H//g' # in bash/tcsh, press Ctrl-V and then Ctrl-H
sed 's/.\x08//g' # hex expression for sed 1.5, GNU sed, ssed

# get Usenet/e-mail message header
sed '/^$/q' # deletes everything after first blank line

# get Usenet/e-mail message body
sed '1,/^$/d' # deletes everything up to first blank line

# get Subject header, but remove initial "Subject: " portion
sed '/^Subject: */!d; s///;q'

# get return address header
sed '/^Reply-To:/q; /^From:/h; /./d;g;q'

# parse out the address proper. Pulls out the e-mail address by itself
# from the 1-line return address header (see preceding script)
sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'

# add a leading angle bracket and space to each line (quote a message)
sed 's/^/> /'

# delete leading angle bracket & space from each line (unquote a message)
sed 's/^> //'

# remove most HTML tags (accommodates multiple-line tags)
sed -e :a -e 's/<[^>]*>//g;/
# extract multi-part uuencoded binaries, removing extraneous header
# info, so that only the uuencoded portion remains. Files passed to
# sed must be passed in the proper order. Version 1 can be entered
# from the command line; version 2 can be made into an executable
# Unix shell script. (Modified from a script. by Rahul Dhesi.)
sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode # vers. 1
sed '/^end/,/^begin/d' "$@" | uudecode # vers. 2

# sort paragraphs of file alphabetically. Paragraphs are separated by blank
# lines. GNU sed uses \v for vertical tab, or any unique char will do.
sed '/./{H;d;};x;s/\n/={NL}=/g' file | sort | sed '1s/={NL}=//;s/={NL}=/\n/g'
gsed '/./{H;d};x;y/\n/\v/' file | sort | sed '1s/\v//;y/\v/\n/'

# zip up each .TXT file individually, deleting the source file and
# setting the name of each .ZIP file to the basename of the .TXT file
# (under DOS: the "dir /b" switch returns bare filenames in all caps).
echo @echo off >zipup.bat
dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat

使用SED:Sed接受一个或多个编辑命令,并且每读入一行后就依次应用这些命令。
当读入第一行输入后,sed对其应用所有的命令,然后将结果输出。接着再读入第二
行输入,对其应用所有的命令……并重复这个过程。上一个例子中sed由标准输入设
备(即命令解释器,通常是以管道输入的形式)获得输入。在命令行给出一个或多
个文件名作为参数时,这些文件取代标准输入设备成为sed的输入。sed的输出将被
送到标准输出(显示器)。因此:

cat filename | sed '10q' # 使用管道输入
sed '10q' filename # 同样效果,但不使用管道输入
sed '10q' filename > newfile # 将输出转移(重定向)到磁盘上

要了解sed命令的使用说明,包括如何通过脚本文件(而非从命令行)来使用这些命
令,请参阅《sed & awk》第二版,作者Dale Dougherty和Arnold Robbins
(O'Reilly,1997;http://www.ora.com),《UNIX Text Processing》,作者
Dale Dougherty和Tim O'Reilly(Hayden Books,1987)或者是Mike Arst写的教
程——压缩包的名称是“U-SEDIT2.ZIP”(在许多站点上都找得到)。要发掘sed
的潜力,则必须对“正则表达式”有足够的理解。正则表达式的资料可以看
《Mastering Regular Expressions》作者Jeffrey Friedl(O'reilly 1997)。
Unix系统所提供的手册页(“man”)也会有所帮助(试一下这些命令
“man sed”、“man regexp”,或者看“man ed”中关于正则表达式的部分),但
手册提供的信息比较“抽象”——这也是它一直为人所诟病的。不过,它本来就不
是用来教初学者如何使用sed或正则表达式的教材,而只是为那些熟悉这些工具的人
提供的一些文本参考。

括号语法:前面的例子对sed命令基本上都使用单引号('...')而非双引号
("...")这是因为sed通常是在Unix平台上使用。单引号下,Unix的shell(命令
解释器)不会对美元符($)和后引号(`...`)进行解释和执行。而在双引号下
美元符会被展开为变量或参数的值,后引号中的命令被执行并以输出的结果代替
后引号中的内容。而在“csh”及其衍生的shell中使用感叹号(!)时需要在其前
面加上转义用的反斜杠(就像这样:\!)以保证上面所使用的例子能正常运行
(包括使用单引号的情况下)。DOS版本的Sed则一律使用双引号("...")而不是
引号来圈起命令。

'\t'的用法:为了使本文保持行文简洁,我们在脚本中使用'\t'来表示一个制表
符。但是现在大部分版本的sed还不能识别'\t'的简写方式,因此当在命令行中为
脚本输入制表符时,你应该直接按TAB键来输入制表符而不是输入'\t'。下列的工
具软件都支持'\t'做为一个正则表达式的字元来表示制表符:awk、perl、HHsed、
sedmod以及GNU sed v3.02.80。

不同版本的SED:不同的版本间的sed会有些不同之处,可以想象它们之间在语法上
会有差异。具体而言,它们中大部分不支持在编辑命令中间使用标签(:name)或分
支命令(b,t),除非是放在那些的末尾。这篇文档中我们尽量选用了可移植性较高
的语法,以使大多数版本的sed的用户都能使用这些脚本。不过GNU版本的sed允许使
用更简洁的语法。想像一下当读者看到一个很长的命令时的心情:

sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d

好消息是GNU sed能让命令更紧凑:

sed '/AAA/b;/BBB/b;/CCC/b;d' # 甚至可以写成
sed '/AAA\|BBB\|CCC/b;d'

此外,请注意虽然许多版本的sed接受象“/one/ s/RE1/RE2/”这种在's'前带有空
格的命令,但这些版本中有些却不接受这样的命令:“/one/! s/RE1/RE2/”。这时
只需要把中间的空格去掉就行了。

速度优化:当由于某种原因(比如输入文件较大、处理器或硬盘较慢等)需要提高
命令执行速度时,可以考虑在替换命令(“s/.../.../”)前面加上地址表达式来
提高速度。举例来说:

sed 's/foo/bar/g' filename # 标准替换命令
sed '/foo/ s/foo/bar/g' filename # 速度更快
sed '/foo/ s//bar/g' filename # 简写形式

当只需要显示文件的前面的部分或需要删除后面的内容时,可以在脚本中使用“q”
命令(退出命令)。在处理大的文件时,这会节省大量时间。因此:

sed -n '45,50p' filename # 显示第45到50行
sed -n '51q;45,50p' filename # 一样,但快得多


Al Aab # founder of "seders" list
Edgar Allen # various
Yiorgos Adamopoulos # various
Dale Dougherty # author of "sed & awk"
Carlos Duarte # author of "do it with sed"
Eric Pement # author of this document
Ken Pizzini # author of GNU sed v3.02
S.G. Ravenhall # great de-html script
Greg Ubben # many contributions & much help

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/58054/viewspace-628676/,如需转载,请注明出处,否则将追究法律责任。

下一篇: AutoHotKey and AutoIt
请登录后发表评论 登录
全部评论

注册时间:2008-09-10

  • 博文量
    65
  • 访问量
    95473