1, awk
1. General
Origin: awk was created at Bell Labs in the 1970s. CentOS 7 ships gawk, the GNU implementation. The name AWK comes from the first letters of the family names of its three creators: Alfred Aho, Peter Weinberger and Brian Kernighan.
Overview: AWK is a programming language designed for text processing and a powerful text analysis tool. It processes its input line by line and is usually used for scanning, filtering and statistical summaries. The data can come from standard input, a pipeline or files.
2. Working principle:
awk reads the first line, tests it against the pattern (condition), executes the specified action if it matches, and then reads the second line and processes it in the same way. A line is only output when an action prints it, or when a pattern is given without an action, in which case the matching line is printed. If no pattern is defined, every line matches. awk therefore implies a loop: the action is executed once for every line that matches the condition.
The text is read line by line. By default each line is split into fields on spaces or tabs, the fields are saved in built-in variables, and the commands are executed according to the pattern or condition.
Difference from sed: sed commands usually process a whole line, while awk prefers to divide a line into multiple "fields" and then process them. awk also reads its input line by line, and results are displayed with the print function, which prints the field data.
In an awk command you can use the logical operators "&&" (and), "||" (or) and "!" (not). Simple arithmetic is also available: +, -, *, /, % and ^ denote addition, subtraction, multiplication, division, remainder and power respectively.
Command format:
awk [options] 'pattern or condition {action}' file1 file2 ...
awk -f script_file file1 file2 ...
Format: the awk keyword, the options, the command part '{...}', then the file name(s)
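The pattern and action cycle described above can be sketched on a small inline sample. The three lines of sample text and the shell variable names are made up for illustration.

```shell
# Sample input (hypothetical data, not taken from a real system file)
sample='alice 30
bob 25
carol 41'

# Pattern with an action: the action runs only for lines whose field 2 is greater than 28
matched=$(printf '%s\n' "$sample" | awk '$2 > 28 {print $1}')

# Action without a pattern: the action runs for every line
all=$(printf '%s\n' "$sample" | awk '{print $1}')

# Pattern without an action: the default action prints the whole matching line
whole=$(printf '%s\n' "$sample" | awk '$1 == "bob"')

echo "$matched"
echo "$all"
echo "$whole"
```

Each awk call loops over all three lines on its own; no explicit loop is written anywhere.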
3. awk built-in variables
FS: field separator for each line of text; defaults to spaces or tabs
NF: number of fields in the line being processed [number of columns]
NR: line number (ordinal number) of the line being processed [row number]
$0: entire content of the line being processed [whole line]
$1: first column
$2: second column
$n: the nth field of the line being processed (the nth column)
FILENAME: name of the file being processed
RS: record separator. When reading data, awk cuts the input into records according to RS and reads only one record at a time for processing. The default is "\n", that is, one record per line.
[root@localhost ~]# awk '{print}' a    # a is the file name
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
[root@localhost ~]# awk '{print $1}' a
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
# awk treats each of these lines as a single column because they contain no spaces. awk separates fields by spaces or tabs by default
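To see NF, NR, $1 and $NF change with the field separator, here is a minimal sketch on two passwd-style lines. The sample records and variable names are assumptions chosen for illustration.

```shell
# Two colon-separated records (made-up sample in passwd style)
records='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# Default separator: the lines contain no whitespace, so each line is one field
default_nf=$(printf '%s\n' "$records" | awk '{print NF}')

# With -F: each line splits into seven fields; $NF is the last one
colon_info=$(printf '%s\n' "$records" | awk -F: '{print NR, NF, $1, $NF}')

echo "$default_nf"
echo "$colon_info"
```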
4. Other built-in variables
Usage of the other built-in variables FS, OFS, NR, FNR, RS, ORS:
OFS: output field separator; defines how the output fields are separated
FNR: line number within the file currently being read
RS: record separator; specifies what to use as the line break. The specified character must exist in the original text
ORS: output record separator; can be used to merge multiple lines into one line of output
awk 'BEGIN{FS=":"}{print $1}' pass.txt    # Define the field separator as a colon before printing
awk 'BEGIN{FS=":";OFS="---"}{print $1,$2}' pass.txt    # OFS defines how the output is separated. $1 and $2 must be separated by a comma, because the comma is mapped to the OFS variable, which is a space by default
root---x
bin---x
awk 'BEGIN{RS=":"}{print $0}' /etc/passwd    # RS specifies what to use as the line break, here a colon. The specified character must exist in the original text
awk 'BEGIN{ORS=" "}{print $0}' /etc/passwd    # Combine multiple lines into one line of output, separating the lines with spaces. The default separator is a newline
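The effect of OFS, RS and ORS can be checked without touching a real file. The following is a small sketch; the input strings and variable names are invented for the demonstration.

```shell
# OFS is inserted wherever the print arguments are separated by a comma
with_ofs=$(echo 'root:x:0:0' | awk 'BEGIN{FS=":"; OFS="---"} {print $1, $2}')

# RS=":" turns every colon-delimited piece into its own record
# (printf is used without a trailing newline so the last record is exactly "c")
with_rs=$(printf 'a:b:c' | awk 'BEGIN{RS=":"} {print NR, $0}')

# ORS=" " ends each output record with a space instead of a newline
with_ors=$(printf '1\n2\n3\n' | awk 'BEGIN{ORS=" "} {print $0}')

echo "$with_ofs"
echo "$with_rs"
echo "$with_ors"
```

Note that with ORS=" " the output ends with a trailing space rather than a newline.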
2, Examples
awk -F: '{print $5}' /etc/passwd    # Use a colon as the separator and display the fifth column
awk -Fx '{print $1}' /etc/passwd    # Use x as the separator
awk '{print $1 $2}' a
awk -F: '{print $1" "$2}' zz    # A space is displayed between the columns. The space must be enclosed in double quotes: text without quotes is treated as a variable, so constants need double quotes
awk '{print $1,$2}' zz    # A comma has the same effect as a space
awk -F: '{print $1"\t"$2}' /etc/passwd    # Output with a tab as the separator
awk -F'[:/]' '{print $9}' zz    # Define multiple separators. Any one of them is treated as a separator
awk -F: '/root/{print $0}' pass.txt    # Print the entire line containing root
awk -F: '/root/{print $1}' pass.txt
awk -F: '/root/{print $1,$6}' pass.txt
awk '/root/' /etc/passwd
awk -F'[:/]' '{print NF}' zz    # Print the number of columns in each line
awk -F'[:/]' '{print NR}' zz    # Print the line numbers
awk -F: '{print NR}' pass.txt    # Print the line numbers
awk -F: '{print NR,$0}' pass.txt    # Print the line number and the content of each line
awk 'NR==2' /etc/passwd    # Print the second line. Without an action, the default is print
awk 'NR==2{print}' /etc/passwd    # Same effect
awk -F: 'NR==2{print $1}' /etc/passwd    # Print the first column of the second line
awk -F: '{print $NF}' /etc/passwd    # Print the last column
awk 'END{print NR}' /etc/passwd    # Print the total number of lines
awk 'END{print $0}' /etc/passwd    # Print the last line of the file
awk -F: '{print "Current line has "NF" columns"}' zz
Current line has 7 columns
awk -F: '{print "Line "NR" has "NF" columns"}' /etc/passwd    # Which line has how many columns
Line 1 has 7 columns
Line 2 has 7 columns
Line 3 has 7 columns
Extended usage: network card IP address and traffic
ifconfig ens33 | awk '/netmask/{print "The local IP address is "$2}'
The local IP address is 192.168.245.211
ifconfig ens33 | awk '/RX p/{print $5" bytes"}'    # /RX p/ is a string lookup
8341053 bytes
Available space on the root partition:
df -h | awk 'NR==2{print $4}'
45G
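When several separators are defined, two adjacent separators produce an empty field, which changes the column numbers. A small sketch on one passwd-style line follows; the sample line and variable names are assumptions.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# -F '[:/]' treats both ":" and "/" as separators;
# the ":/" pairs in the line leave empty fields between them
nf=$(echo "$line" | awk -F '[:/]' '{print NF}')
ninth=$(echo "$line" | awk -F '[:/]' '{print $9}')
last=$(echo "$line" | awk -F '[:/]' '{print $NF}')

echo "$nf $ninth $last"
```

The quotes around the bracket expression keep the shell from treating it as a file name pattern.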
awk executes its main actions line by line; BEGIN specifies what to execute before processing starts, and END what to execute after it ends. BEGIN is generally used for initialization and is executed only once, before the first record is read. END is generally used for summaries and is executed only once, after the last record has been read.
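The order of BEGIN, the per-line action and END can be sketched with a running total. The three numbers fed in and the variable names are made up for illustration.

```shell
# BEGIN runs once before any input, the middle block once per line,
# and END once after the last line has been read
report=$(printf '10\n20\n30\n' | awk '
    BEGIN { print "start"; total = 0 }
    { total += $1 }
    END { print "lines:", NR; print "total:", total }
')

echo "$report"
```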
awk arithmetic operations:
awk 'BEGIN{x=10;print x}'    # Without quotation marks, x is output as a variable, and no $ is needed
awk 'BEGIN{x=10;print x+1}'    # BEGIN runs before any file is processed, so it works even when no file name follows
awk 'BEGIN{x=10;x++;print x}'
awk 'BEGIN{print x+1}'    # If no initial value is specified, a number starts at 0 and a string is empty by default
awk 'BEGIN{print 2.5+3.5}'    # Decimals can also be calculated
awk 'BEGIN{print 2-1}'
awk 'BEGIN{print 3*4}'
awk 'BEGIN{print 3**2}'
awk 'BEGIN{print 2^3}'    # ^ and ** are both power operations (** is accepted by gawk; ^ is the portable form)
awk 'BEGIN{print 1/2}'
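The default values of unset variables and the less common operators can be checked directly in a BEGIN block. This is a minimal sketch; the shell variable names are only for the demonstration.

```shell
# An unset variable acts as 0 in arithmetic and as an empty string in text
unset_num=$(awk 'BEGIN{print x + 1}')
unset_str=$(awk 'BEGIN{print "[" x "]"}')

# % is the remainder and ^ is the power operator
remainder=$(awk 'BEGIN{print 17 % 5}')
power=$(awk 'BEGIN{print 2 ^ 10}')

echo "$unset_num $unset_str $remainder $power"
```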
Fuzzy matching:
Fuzzy matching uses ~ to mean "contains" and !~ to mean "does not contain".
awk -F: '$1~/root/' /etc/passwd
awk -F: '$1~/ro/' /etc/passwd    # Fuzzy matching: any first field containing ro matches
awk -F: '$7!~/nologin$/{print $1,$7}' /etc/passwd
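Because ~ means "contains", a name such as rooter also matches /ro/ or /root/. The sketch below contrasts that with an anchored expression; the three sample users are invented.

```shell
users='root:/bin/bash
daemon:/sbin/nologin
rooter:/bin/sh'

# $1 ~ /ro/ matches every first field that contains "ro"
contains=$(printf '%s\n' "$users" | awk -F: '$1 ~ /ro/ {print $1}')

# The anchors ^ and $ make the expression match the whole field
exact=$(printf '%s\n' "$users" | awk -F: '$1 ~ /^root$/ {print $1}')

# !~ keeps the lines whose second field does not end in nologin
not_nologin=$(printf '%s\n' "$users" | awk -F: '$2 !~ /nologin$/ {print $1}')

echo "$contains"
echo "$exact"
echo "$not_nologin"
```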
Comparison between numeric value and string:
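Comparison operators: == != <= >= < >
awk 'NR==5{print}' /etc/passwd    # Print the fifth line
awk 'NR==5' /etc/passwd
awk 'NR<5' /etc/passwd
awk -F: '$3==0' /etc/passwd    # Print the lines whose third column is 0
awk -F: '$1==root' /etc/passwd    # Without quotes, root is an empty variable, so this does not match the user root
awk -F: '$1=="root"' /etc/passwd    # A string constant must be enclosed in double quotes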
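The difference between comparing with a number, a quoted string and an unquoted name can be sketched as follows. The inputs and variable names are assumptions made for the example.

```shell
# Unquoted root is an unset awk variable (empty), so no user name equals it
unquoted=$(printf 'root:0\nbin:1\n' | awk -F: '$1 == root {print $1}')

# Quoted "root" is a string constant and matches the first line
quoted=$(printf 'root:0\nbin:1\n' | awk -F: '$1 == "root" {print $1}')

# Against a numeric constant the field is compared as a number: 10 < 9 is false
numeric_lt=$(printf '10\n' | awk '$1 < 9')

# Against a string constant the comparison is textual: "10" sorts before "9"
string_lt=$(printf '10\n' | awk '$1 < "9"')

echo "unquoted=[$unquoted] quoted=[$quoted] numeric=[$numeric_lt] string=[$string_lt]"
```

This is why numbers are written without quotes and names with quotes in the comparisons above.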
Logical operators && and ||:
awk -F: '$3<10 || $3>=1000' /etc/passwd
awk -F: '$3>10 && $3<1000' /etc/passwd
awk -F: 'NR>4 && NR<10' /etc/passwd    # Print lines 5-9
# Print all integers between 1 and 200 that are divisible by 7 and contain the digit 7
seq 200 | awk '$1%7==0 && $1~/7/'
7
70
77
147
175
3, Advanced usage of awk
1. Defining and referencing variables
a=100
awk -v b="$a" 'BEGIN{print b}'    # The shell variable a is assigned to the awk variable b, and b is then used
awk 'BEGIN{print "'$a'"}'    # To reference the shell variable directly, wrap it in double quotes and then single quotes
awk -vc=1 'BEGIN{print c}'    # Define a variable directly on the awk command line and reference it
Calling the getline function: getline reads a line, but it does not return the current line; it returns the line after the current one.
df -h | awk 'BEGIN{getline}/root/{print $0}'
/dev/mapper/centos-root   50G  5.2G   45G  11% /
seq 10 | awk '{getline;print $0}'    # Show the even lines
seq 10 | awk '{print $0;getline}'    # Show the odd lines
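Both ideas from this part, passing a shell value in with -v and skipping lines with getline, can be tried on seq output. The variable names limit and threshold are assumptions for the sketch.

```shell
threshold=3

# -v copies the shell value into the awk variable limit before the program starts
above=$(seq 5 | awk -v limit="$threshold" '$1 > limit')

# getline without redirection replaces $0 with the next input line,
# so printing after it shows every second line
evens=$(seq 6 | awk '{getline; print $0}')

# Printing first and calling getline afterwards shows the other half
odds=$(seq 6 | awk '{print $0; getline}')

echo "$above"
echo "$evens"
echo "$odds"
```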
2. Conditional statements
if statement: awk if statements are divided into single-branch, double-branch and multi-branch forms
Single branch: if(){}
Double branch: if(){}else{}
Multiple branches: if(){}else if(){}else{}
awk -F: '{if($3<10){print $0}}' /etc/passwd    # Print the entire line if the third column is less than 10
awk -F: '{if($3<10){print $3}else{print $1}}' /etc/passwd    # If the third column is less than 10, print the third column, otherwise print the first column
awk also supports for loops, while loops, functions, arrays, etc.
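The multi-branch form has no example above, so here is a minimal sketch that labels accounts by UID. The sample accounts, the cut-off of 1000 and the labels are assumptions chosen for illustration.

```shell
# Made-up name:uid pairs in place of /etc/passwd
accounts='root:0
ftp:14
alice:1000'

# if / else if / else picks exactly one label per line
classified=$(printf '%s\n' "$accounts" | awk -F: '{
    if ($2 == 0) {
        kind = "superuser"
    } else if ($2 < 1000) {
        kind = "system"
    } else {
        kind = "regular"
    }
    print $1, kind
}')

echo "$classified"
```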
4, Other examples
awk 'BEGIN{x=0};/\/bin\/bash$/{x++;print x,$0};END{print x}' /etc/passwd    # Count the number of lines ending in /bin/bash; the final count is equivalent to grep -c "/bin/bash$" /etc/passwd
With a BEGIN pattern, the action given in BEGIN is executed before the specified text is processed; awk then processes the text, and finally executes the action given in END. Statements such as printing results are often placed in the END{} block.
awk -F ":" '!($3<200){print}' /etc/passwd    # Output the lines whose third field is not less than 200
awk 'BEGIN{FS=":"};{if($3>=1000){print}}' /etc/passwd    # First process the BEGIN block, then print the matching lines of the text
awk -F ":" '{max=($3>=$4)?$3:$4;{print max}}' /etc/passwd    # ($3>=$4)?$3:$4 is the ternary operator: if the third field is greater than or equal to the fourth field, the third field is assigned to max, otherwise the fourth field is assigned to max
awk -F ":" '{print NR,$0}' /etc/passwd    # Output the number and content of each line. After each record is processed, NR is increased by 1
sed -n '=;p' /etc/passwd
awk -F ":" '$7~"bash"{print $1}' /etc/passwd    # With a colon as the separator, output the first field of the lines whose seventh field contains bash
awk -F: '/bash/{print $1}' /etc/passwd
awk -F ":" '($1~"root") && (NF==7){print $1,$2,$NF}' /etc/passwd    # Output the first, second and last fields of the lines that have 7 fields and whose first field contains root
awk -F ":" '($7!="/bin/bash") && ($7!="/sbin/nologin"){print}' /etc/passwd    # Output the lines whose seventh field is neither /bin/bash nor /sbin/nologin
awk -F: '($NF!="/bin/bash")&&($NF!="/sbin/nologin"){print NR,$0}' passwd
Calling shell commands through a pipe and double quotes:
echo $PATH | awk 'BEGIN{RS=":"};END{print NR}'    # Count the number of text segments separated by colons. Statements such as printing results are often placed in the END{} block
echo $PATH | awk 'BEGIN{RS=":"};{print NR,$0};END{print NR}'
awk -F: '/bash$/{print | "wc -l"}' /etc/passwd    # Call wc -l to count the users that use bash; equivalent to grep -c "bash$" /etc/passwd
awk -F: '/bash$/{print}' passwd | wc -l
free -m | awk '/Mem:/ {print int($3/($3+$4)*100)"%"}'    # View the current memory usage percentage
free -m | awk '/Mem:/ {print $3/$2}'
0.327869
free -m | awk '/Mem:/ {print $3/$2*100}'
32.7869
free -m | awk '/Mem:/ {print int($3/$2*100)}'
32
free -m | awk '/Mem:/ {print int($3/$2*100)"%"}'
32%
free -m | awk '/Mem:/ {print $3/$2*100}' | awk -F. '{print $1 "%"}'
top -b -n1 | grep Cpu | awk -F ',' '{print $4}' | awk '{print $1}'    # View the current CPU idle rate (-b -n1 means only one batch of output is needed)
date -d "$(awk -F "." '{print $1}' /proc/uptime) second ago" +"%F %H:%M:%S"    # Display the last system restart time, comparable to uptime; "second ago" means that many seconds before now, and +"%F %H:%M:%S" is equivalent to the time format +"%Y-%m-%d %H:%M:%S"
awk 'BEGIN{n=0; while ("w" | getline) n++; {print n-2}}'    # Call the w command and use it to count the number of online users
awk 'BEGIN{"hostname" | getline; {print $0}}'    # Call hostname and output the current host name
When getline has no redirection character "<" or "|" on either side, awk first reads the first line, say 1, and getline then fetches the line below it, 2. Because getline changes the corresponding internal variables such as NF, NR, FNR and $0, the value of $0 is no longer 1 but 2 when it is printed.
When getline has a redirection character "<" or "|" on its left or right, it works on the redirected input. Because that file has only just been opened and awk has not yet read any line from it, getline returns the first line of the file rather than every other line.
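The redirected forms of getline described above can be sketched with a temporary file and a harmless command. The file content, the echo command and the variable names are assumptions for the demonstration.

```shell
# A throwaway two-line file to read from
tmpfile=$(mktemp)
printf 'first\nsecond\n' > "$tmpfile"

# With "<", getline reads from the named file, starting at its first line,
# and stores the text in line without touching $0 of the main input
from_file=$(echo "main" | awk -v f="$tmpfile" '{getline line < f; print $0, line}')

# With "command |", getline reads the output of the command
from_cmd=$(awk 'BEGIN{"echo hello" | getline out; print out}')

rm -f "$tmpfile"

echo "$from_file"
echo "$from_cmd"
```

Reading into a named variable (line, out) leaves $0 unchanged, which is often easier to follow than the plain getline form.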
seq 10 | awk '{getline;print $0}'
seq 10 | awk '{print $0;getline}'
CPU utilization:
cpu_us=$(top -b -n1 | grep Cpu | awk '{print $2}')
cpu_sy=$(top -b -n1 | grep Cpu | awk -F ',' '{print $2}' | awk '{print $1}')
cpu_sum=$(awk -v us="$cpu_us" -v sy="$cpu_sy" 'BEGIN{print us+sy}')    # top reports decimals, which $(( )) cannot add, so awk does the sum
echo $cpu_sum
echo "A B C D" | awk '{OFS="|";print $0;$1=$1;print $0}'
A B C D
A|B|C|D
# $1=$1 forces $0 to be rebuilt: assigning to any field from $1 to $NF makes awk recompute $0 using OFS. It is usually needed when $0 has to be output after OFS has been changed
echo "A B C D" | awk 'BEGIN{OFS="|"};{print $0;$1=$1;print $0}'
echo "A B C D" | awk 'BEGIN{OFS="|"};{print $0;$1=$1;print $1,$2}'
echo "A B C D" | awk 'BEGIN{OFS="|"};{$2=$2;print $1,$2}'
awk 'BEGIN{a[0]=10;a[1]=20;print a[1]}'
awk 'BEGIN{a[0]=10;a[1]=20;print a[0]}'
awk 'BEGIN{a["abc"]=10;a["xyz"]=20;print a["abc"]}'
awk 'BEGIN{a["abc"]=10;a["xyz"]=20;print a["xyz"]}'
awk 'BEGIN{a["abc"]="aabbcc";a["xyz"]="xxyyzz";print a["xyz"]}'
awk 'BEGIN{a[0]=10;a[1]=20;a[2]=30;for(i in a){print i,a[i]}}'
PS1: The commands in BEGIN are executed only once
PS2: Besides numbers, awk array subscripts can also be strings; the strings must be enclosed in double quotes
Count TCP connections by state:
netstat -an | awk '/^tcp/ {print $NF}' | sort | uniq -c
or
netstat -nat | awk '/^tcp/{arr[$6]++}END{for(state in arr){print arr[state] ": " state}}'
Use awk to count how many times each client IP appears in the httpd access log:
awk '{ip[$1]++}END{for(i in ip){print ip[i],i}}' /var/log/httpd/access_log | sort -rn
Remarks: this defines an array named ip whose subscript is column 1 of the log file (that is, the client IP address). The ++ increments the counter for a client IP by 1 each time it appears. The instructions in END are executed after the file has been read, and all the statistics are output through a loop: the for loop walks through the subscripts of the array ip.
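The counting idiom in the remarks can be tried on a few inline lines instead of a real access log. The log lines below are fabricated sample data, and hits is just an illustrative array name.

```shell
# Made-up access-log style lines; column 1 is the client address
log='10.0.0.1 GET /
10.0.0.2 GET /a
10.0.0.1 GET /b
10.0.0.1 GET /c'

# hits[$1]++ creates the entry on first sight and increments it afterwards;
# the END loop prints one "count address" line per subscript
counts=$(printf '%s\n' "$log" |
    awk '{hits[$1]++} END{for (ip in hits) print hits[ip], ip}' |
    sort -rn)

echo "$counts"
```

The order of a for (ip in hits) loop is not defined, which is why the result is piped through sort.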
awk '/Failed password/{print $0}' /var/log/secure
awk '/Failed password/{print $11}' /var/log/secure
awk '/Failed/{ip[$11]++}END{for(i in ip){print i", "ip[i]}}' /var/log/secure
awk '/Failed/{ip[$11]++}END{for(i in ip){print i","ip[i]}}' /var/log/secure
awk '/Failed password/{ip[$11]++}END{for(i in ip){print i", "ip[i]}}' /var/log/secure
Script:
#!/bin/bash
x=`awk '/Failed password/{ip[$11]++}END{for(i in ip){print i","ip[i]}}' /var/log/secure`
# x holds one IP,count entry per address, e.g. for 190.168.80.13
for j in $x
do
    ip=`echo $j | awk -F "," '{print $1}'`
    num=`echo $j | awk -F "," '{print $2}'`
    if [ $num -ge 3 ]; then
        echo "Warning! $ip failed to log in to this machine $num times, please deal with it quickly!"
    fi
done
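As a variation on the script above, the threshold check can also be done inside awk itself, which avoids the shell loop. This is only a sketch under assumptions: the log lines are fabricated in the usual sshd layout (where the address is field 11), and the limit of 3 and the message text are illustrative.

```shell
# Fabricated lines in the style of /var/log/secure
log='Jan 1 10:00:01 host sshd[1]: Failed password for root from 10.0.0.5 port 22 ssh2
Jan 1 10:00:02 host sshd[1]: Failed password for root from 10.0.0.5 port 22 ssh2
Jan 1 10:00:03 host sshd[1]: Failed password for root from 10.0.0.9 port 22 ssh2
Jan 1 10:00:04 host sshd[1]: Failed password for root from 10.0.0.5 port 22 ssh2'

# Count failures per address, then report only the addresses at or above the limit
report=$(printf '%s\n' "$log" | awk -v limit=3 '
    /Failed password/ { fails[$11]++ }
    END {
        for (ip in fails)
            if (fails[ip] >= limit)
                print "warning: " ip " failed " fails[ip] " times"
    }')

echo "$report"
```

Passing the limit with -v keeps the threshold out of the awk program text, so it can be changed without editing the program.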