1. awk
1.1 syntax
awk [options] 'BEGIN {cmd1; cmd2; ...} {cmd1; cmd2; ...} END {cmd1; cmd2; ...}' input_file
- The BEGIN block runs once, before any input is read (optional);
- The middle block runs once for each input record (by default, each line of text);
- The END block runs once, after all input has been processed (optional).
You can also put the script in a file:
BEGIN {
    cmd1
    cmd2
    ...
}
{
    cmd1
    cmd2
    ...
}
END {
    cmd1
    cmd2
    ...
}
$ awk -f test.awk input_file
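As a minimal sketch of the three-part structure, the following sums a column of numbers. The sample input and the variable names sum and total are made up for illustration.

```shell
# BEGIN initialises the counter, the middle block runs once per line,
# and END prints the result after the last line has been read.
total=$(printf '3\n4\n5\n' | awk 'BEGIN { sum = 0 } { sum += $1 } END { print sum }')
echo "$total"   # prints 12
```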
1.2 fields
$0    The entire text line
$1    The first data field in the text line
$2    The second data field in the text line
...
The default field separator is any run of whitespace characters. Use the -Fsep option to change the field separator to sep, or set the FS variable instead:
$ awk -F: '{printf "%s\t\t\t%d\n", $1, $3}' /etc/passwd
root            0
daemon          1
...
$ awk 'BEGIN {FS=":"} {printf "%s\t\t\t%d\n", $1, $3}' /etc/passwd
root            0
daemon          1
...
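To try both ways of setting the separator without depending on the contents of /etc/passwd, here is a small sketch that uses a made-up record (the alice entry is hypothetical sample data):

```shell
record='alice:x:1000:1000'

# Same split, once with the -F option and once with the FS variable.
by_option=$(printf '%s\n' "$record" | awk -F: '{ print $1, $3 }')
by_variable=$(printf '%s\n' "$record" | awk 'BEGIN { FS = ":" } { print $1, $3 }')

echo "$by_option"     # prints: alice 1000
echo "$by_variable"   # prints: alice 1000
```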
1.3 variables
Common built-in variables
FS            Input field separator
RS            Input record separator
OFS           Output field separator
ORS           Output record separator
FIELDWIDTHS   Space-separated list of numbers that defines the exact width of each data field (gawk only)
NF            Number of fields in the current record
...
The print command places the value of the OFS variable (a single space by default) between its comma-separated arguments in the output.
The FIELDWIDTHS variable allows you to read records without relying on field separators. Once the FIELDWIDTHS variable is set, awk ignores the FS variable.
$ cat data.txt
100200030000
400500060000
$ awk 'BEGIN {FIELDWIDTHS="3 4 5"} {print $1,$2,$3}' data.txt
100 2000 30000
400 5000 60000
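The effect of OFS and NF can be checked with a short sketch that avoids gawk-only features. The input lines and the variable names joined and counts are illustrative only.

```shell
# OFS is inserted between the comma-separated arguments of print.
joined=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } { print $1, $2, $3 }')
echo "$joined"   # prints: a-b-c

# NF is recomputed for every record: 3 fields on the first line, 2 on the second.
counts=$(printf 'a b c\nd e\n' | awk '{ print NF }')
echo "$counts"
```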
1.4 patterns
Match patterns can be used to limit which records the program script acts on.
regular expression
/pattern/{cmds}
awk matches the regular expression against the entire record ($0), field separators included.
$ cat data.txt
hello world
hello linux
$ awk '/o w/{print $0}' data.txt
hello world
Match operator
Allows you to restrict regular expression matches to specific data fields in the record.
Execute the script when the nth field matches the specified pattern:
$n ~ /pattern/{cmds}
Execute the script when the nth field does not match the specified pattern:
$n !~ /pattern/{cmds}
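A small sketch of the two match operators, using a made-up two-column list of user names and IDs (the data and variable names are assumptions for illustration):

```shell
users='root 0
alice 1000
bob 1001'

# Second field starts with 1.
matching=$(printf '%s\n' "$users" | awk '$2 ~ /^1/ { print $1 }')
# Second field does not start with 1.
others=$(printf '%s\n' "$users" | awk '$2 !~ /^1/ { print $1 }')

echo "$matching"   # alice and bob, one per line
echo "$others"     # root
```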
mathematical expression
You can use mathematical expressions in matching patterns.
x == y    x is equal to y
x != y    x is not equal to y
x <= y    x is less than or equal to y
x <  y    x is less than y
x >= y    x is greater than or equal to y
x >  y    x is greater than y
$ awk -F: '$4 == 0{print $1}' /etc/passwd
root
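The same kind of comparison can be tried on inline data. This sketch uses a hypothetical name:id list rather than the real /etc/passwd:

```shell
accounts='root:0
alice:1000
bob:1001'

# Numeric comparisons on the second colon-separated field.
regular=$(printf '%s\n' "$accounts" | awk -F: '$2 >= 1000 { print $1 }')
superuser=$(printf '%s\n' "$accounts" | awk -F: '$2 == 0 { print $1 }')

echo "$regular"     # alice and bob, one per line
echo "$superuser"   # root
```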
2. sed
2.1 line addressing
By default, sed applies its commands to every line of the input. To apply a command only to a specific line or range of lines, use line addressing.
address command

address {
    command1
    command2
    ...
    commandn
}
The following addressing modes are supported:
Numeric mode
There are two forms:
n        # Line n
n1,n2    # Lines n1 through n2, inclusive
The first line is represented by 1 and the last line is represented by $.
$ sed '2s/dog/cat/' data1.txt
$ sed '2,3s/dog/cat/' data1.txt
$ sed '2,$s/dog/cat/' data1.txt
Text mode
There are two forms:
# All lines matching pattern
/pattern/
# From the line matching pattern1 through the line matching pattern2 (inclusive)
/pattern1/,/pattern2/
$ sed '/MyPattern/s/bash/csh/' /etc/passwd
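Both addressing modes can be tried on inline input. In this sketch the sample lines and the START/END markers are invented for illustration:

```shell
# Numeric range: substitute only from line 2 to the last line.
numeric=$(printf 'dog\ndog\ndog\n' | sed '2,$s/dog/cat/')
echo "$numeric"

# Pattern range: prefix every line from START through END (inclusive).
ranged=$(printf 'a\nSTART\nb\nEND\nc\n' | sed '/START/,/END/s/^/> /')
echo "$ranged"
```

Only the lines inside the addressed range are modified; the rest pass through unchanged.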
2.2 replacement
s/pattern/replacement/flags
By default, only the first occurrence in each line is replaced.
Possible flags:
- A number n: replace only the nth occurrence of the pattern in each line;
- g: replace all occurrences;
- w file: also write each line on which a substitution was made to file.
$ sed 's/test/trial/' data4.txt
$ sed 's/test/trial/2' data4.txt
$ sed 's/test/trial/g' data4.txt
$ sed 's/test/trial/w test.txt' data5.txt
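To see how the flags differ, this sketch runs the same substitution on one made-up input line three ways:

```shell
line='test test test'

first=$(echo "$line" | sed 's/test/trial/')     # no flag: first occurrence only
second=$(echo "$line" | sed 's/test/trial/2')   # numeric flag: second occurrence only
all=$(echo "$line" | sed 's/test/trial/g')      # g flag: every occurrence

echo "$first"    # trial test test
echo "$second"   # test trial test
echo "$all"      # trial trial trial
```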
2.3 delete line
d
$ sed 'd' data1.txt
$ sed '3d' data6.txt
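The d command combines with either addressing mode. A minimal sketch on invented input:

```shell
# Delete line 2 by number.
by_number=$(printf 'one\ntwo\nthree\n' | sed '2d')
# Delete every line that contains the letter t.
by_pattern=$(printf 'one\ntwo\nthree\n' | sed '/t/d')

echo "$by_number"    # one and three, one per line
echo "$by_pattern"   # one
```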
2.4 insert line
# Insert NewLine before the specified line
i\NewLine
# Insert NewLine after the specified line
a\NewLine
To insert multiple lines of text, end every line of the new text except the last with a backslash, so that the text continues onto the next line.
$ cat file
hello1
hello2
$ sed '2i\line1\
> line2' file
hello1
line1
line2
hello2
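The sketch below uses the more portable layout, in which the new text starts on the line after i\ or a\. The input lines and the variable names are made up for illustration.

```shell
# Insert two lines before line 2; every text line except the last ends in a backslash.
inserted=$(printf 'hello1\nhello2\n' | sed '2i\
line1\
line2')
echo "$inserted"

# Append one line after line 1.
appended=$(printf 'hello1\nhello2\n' | sed '1a\
tail')
echo "$appended"
```

The text lines are deliberately not indented, since leading whitespace there may be treated differently by different sed implementations.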
2.5 line replacement
c\NewLine
To replace a line with multiple lines of text, end every line of the new text except the last with a backslash, so that the text continues onto the next line.
$ cat file
hello1
hello2
$ sed '1c\
> line1\
> line2' file
line1
line2
hello2
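A non-interactive version of the same idea, replacing only the middle line of three (sample data invented for illustration):

```shell
# Replace line 2 with the text that follows c\ on the next line.
changed=$(printf 'hello1\nhello2\nhello3\n' | sed '2c\
replaced')
echo "$changed"
```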
2.6 character mapping
y/inchars/outchars/
The first character in inchars is converted to the first character in outchars, the second character in inchars to the second character in outchars, and so on. inchars and outchars must be the same length.
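A quick sketch of the mapping on made-up input. Every occurrence of a mapped character is converted, wherever it appears in the line:

```shell
# 1 -> 7, 2 -> 8, 3 -> 9
mapped=$(echo 'hello 123' | sed 'y/123/789/')
echo "$mapped"   # hello 789

# a -> A, b -> B, c -> C, applied to all occurrences
upper=$(echo 'abc cab' | sed 'y/abc/ABC/')
echo "$upper"    # ABC CAB
```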
2.7 file processing
write file
w filename
Writes all lines in the matching address range to the file specified by filename.
$ sed '1,2w test.txt' data6.txt
read file
r filename
Reads the contents of the file and inserts them after each line in the matching address range.
$ sed '1,2r nums.txt' lines.txt
line 1
1 2 3
line 2
1 2 3
line 3
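The following sketch exercises both commands in a scratch directory. The file names lines.txt, extra.txt and copy.txt are hypothetical, and the -n option (which suppresses sed's normal output) is not covered above.

```shell
tmp=$(mktemp -d)
printf 'line 1\nline 2\nline 3\n' > "$tmp/lines.txt"
printf 'inserted\n' > "$tmp/extra.txt"

# r: insert the contents of extra.txt after line 2.
merged=$(sed "2r $tmp/extra.txt" "$tmp/lines.txt")

# w: copy lines 1 and 2 into copy.txt; -n keeps sed from also printing them.
sed -n "1,2w $tmp/copy.txt" "$tmp/lines.txt"
copied=$(cat "$tmp/copy.txt")

rm -rf "$tmp"
echo "$merged"
echo "$copied"
```

The scripts are double-quoted here so that the shell expands $tmp before sed sees the file name.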
2.8 pattern replacement
In the replacement part of the s command, the & symbol stands for the entire text matched by the pattern.
$ echo "The cat sleeps in his hat." | sed 's/.at/"&"/g'
The "cat" sleeps in his "hat".
\n (where n is a digit from 1 to 9) refers to the text matched by the nth \( ... \) group in the pattern.
$ echo "The System Administrator manual" | sed 's/\(System\) Administrator/\1 User/'
The System User manual
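Two more small cases, on invented input, show & wrapping a match and a pair of groups being swapped:

```shell
# & stands for whatever the pattern matched (here, the run of digits).
wrapped=$(echo 'price 42' | sed 's/[0-9][0-9]*/<&>/')
echo "$wrapped"   # price <42>

# \1 and \2 refer to the first and second \( ... \) groups.
swapped=$(echo 'John Smith' | sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/')
echo "$swapped"   # Smith, John
```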
2.9 using variables
Single quotes stop the shell from expanding variables, so end the single-quoted script, add "$var" in double quotes, and then resume the single quotes: '...'"$var"'...'.
$ msg=hello
$ echo "world" | sed '1i\'"$msg"' '
hello
world
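The same quoting technique works inside an s command. In this sketch the variable names old and new are made up, and the values are assumed not to contain characters that are special to sed, such as / or &.

```shell
old=dog
new=cat

# Close the single quotes around each variable and expand it inside double quotes.
spliced=$(echo 'the dog barks' | sed 's/'"$old"'/'"$new"'/')
# Alternative: double-quote the whole script so the shell expands the variables.
quoted=$(echo 'the dog barks' | sed "s/$old/$new/")

echo "$spliced"   # the cat barks
echo "$quoted"    # the cat barks
```

Double-quoting the whole script is shorter, but then any $ or backslash meant for sed itself has to be escaped from the shell.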
3. grep
3.1 options
-i    Ignore case
-v    Invert the match: select the lines that do not match
-c    Print only the number of matching lines
-n    Prefix each matching line with its line number
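Each option can be tried against a short made-up list (the fruits data and variable names are illustrative only):

```shell
fruits='Apple
banana
apple pie
cherry'

ci=$(printf '%s\n' "$fruits" | grep -i 'apple')        # Apple, apple pie
inverted=$(printf '%s\n' "$fruits" | grep -v 'an')     # every line except banana
count=$(printf '%s\n' "$fruits" | grep -c 'apple')     # case-sensitive, so 1
numbered=$(printf '%s\n' "$fruits" | grep -n 'cherry') # 4:cherry

echo "$ci"
echo "$inverted"
echo "$count"
echo "$numbered"
```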
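3.2 regular expressions
With GNU grep's default basic regular expression syntax, the characters ?, +, {, } and | must be escaped with a backslash to take on their special meaning.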
Special characters
^     Start of line; as the first character inside [], negates the set
$     End of line
.     Any single character
\|    Or (alternation)
\<    Left boundary of a word
\>    Right boundary of a word
$ grep "^abc" data.txt
$ grep "abc\$" data.txt
$ grep "a.c" data.txt
$ grep "adc\|456" data.txt
$ grep "\<hijk" data.txt
$ grep "efg\>" data.txt
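The anchors and the dot can be checked on a few invented lines. Single quotes are used here so that $ needs no escaping from the shell.

```shell
words='abc
xabc
abcx
a-c'

at_start=$(printf '%s\n' "$words" | grep '^abc')    # abc, abcx
at_end=$(printf '%s\n' "$words" | grep 'abc$')      # abc, xabc
any_middle=$(printf '%s\n' "$words" | grep '^a.c$') # abc, a-c

echo "$at_start"
echo "$at_end"
echo "$any_middle"
```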
Repetition, scope
?        Matches the previous character 0 or 1 times
*        Matches the previous character 0 or more times
+        Matches the previous character 1 or more times
{m}      Matches the previous character exactly m times
{m,n}    Matches the previous character from m to n times
{m,}     Matches the previous character at least m times
{,n}     Matches the previous character at most n times
[]       Matches any single character in the specified set or range
$ grep "a\?b" data.txt
$ grep "a*b" data.txt
$ grep "a\+b" data.txt
$ grep "a\{2,\}b" data.txt
$ grep "e[a-zA-Z0-9]" data.txt
$ grep "e[^a-zA-Z0-9]" data.txt
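As an alternative to escaping, the -E option (not covered above) selects extended regular expressions, in which ?, + and {} are special without a backslash. A sketch on made-up input, anchored with ^ and $ so that the whole line has to match:

```shell
samples='b
ab
aab
aaab'

optional=$(printf '%s\n' "$samples" | grep -E '^a?b$')       # b, ab
one_plus=$(printf '%s\n' "$samples" | grep -E '^a+b$')       # ab, aab, aaab
two_plus=$(printf '%s\n' "$samples" | grep -E '^a{2,}b$')    # aab, aaab
# The same interval written in the default (basic) syntax.
two_plus_basic=$(printf '%s\n' "$samples" | grep '^a\{2,\}b$')

echo "$optional"
echo "$one_plus"
echo "$two_plus"
```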
Standard character class
[:alnum:]    Letters and digits; equivalent to [A-Za-z0-9]
[:alpha:]    Letters; equivalent to [A-Za-z]
[:digit:]    Digits; equivalent to [0-9]
[:xdigit:]   Hexadecimal digits; equivalent to [0-9A-Fa-f]
[:blank:]    Space and tab
[:graph:]    Visible characters (ASCII codes 33 to 126)
[:lower:]    Lowercase letters
[:upper:]    Uppercase letters
[:print:]    Printable characters (visible characters plus space)
[:space:]    Whitespace characters; equivalent to [ \t\r\n\v\f]
[:punct:]    Punctuation characters
[:cntrl:]    ASCII control characters (codes 0 to 31, and 127)
$ grep "e[[:alpha:]]" data.txt
$ grep "e[[:alpha:][:digit:]]" data.txt
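Character classes are only valid inside a bracket expression, hence the doubled brackets. A short sketch on invented input:

```shell
mixed='42
4x2
abc'

# Lines made up entirely of digits.
digits_only=$(printf '%s\n' "$mixed" | grep '^[[:digit:]][[:digit:]]*$')
# Lines containing at least one letter.
has_letter=$(printf '%s\n' "$mixed" | grep '[[:alpha:]]')

echo "$digits_only"   # 42
echo "$has_letter"    # 4x2 and abc, one per line
```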