LINUX AWK Tutorial Series | Chapter 4



Hello guys, I hope you are enjoying the series. Remember, only practice will help you grow and learn, so keep practicing!

In this post, I am going to discuss variables and operators in the AWK command.


Variables and Operators

1) User-Defined Variables

Let's discuss user-defined variables. As mentioned earlier, AWK is a combination of a filtering tool and a programming language. So, like other programming languages, it supports variables, constants, operations, loops, etc.

A variable is a name that refers to a value.
AWK has both built-in and user-defined variables.
I am going to focus on user-defined variables here.

Important Notes:
  • No variable declaration is required in AWK, the same as in shell scripting.
  • Variables are automatically initialized to the null string or zero, depending on how they are used.
  • A variable name should begin with a letter or an underscore and can be followed by letters, numbers, and underscores.
  • Variables are case sensitive, so be careful. Example: variable "Him" and variable "him" are treated as two separate variables.
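To see these notes in action, here is a small sketch. The variable names and the echoed text are just assumptions for illustration.

```shell
# Only the lowercase name "him" is assigned; "Him" is a different variable.
echo "one line of input" | awk '{
    him = "lower"
    print "[" Him "]"   # Him was never set, so it is the empty string: []
    print Him + 0       # used as a number, it is 0
    print him           # lower
}'
```

No declaration was needed for either variable, and the unset one quietly behaved as an empty string or as zero.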

Example 1:
Defining variables and printing them




So what did we do here? Any guesses...

I have defined a variable "a" which stores a number, so there is no need for any quotes. (Even if you give quotes, there is no impact on the printed output.)
Variables "b" and "c" store strings, so their values are enclosed within double quotes.
The assignments are separated by semicolons (;).
Then I am printing the variables a, b and c. Remember that I used a comma (,) between them so that print separates the values with the default output delimiter, a single space.
Without the commas, the values are printed with nothing between them. Why? Writing variables next to each other works as concatenation: the numbers and strings are joined together and become one string.
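Here is a minimal sketch of the kind of command discussed above. The file name sample.txt, its two lines, and the values 20, "Hello" and "World" are assumptions chosen for illustration.

```shell
# A small two-line input file (assumed for this sketch)
printf 'first line\nsecond line\n' > sample.txt

# With commas: values are separated by a space, once per input line
awk '{a=20; b="Hello"; c="World"; print a,b,c}' sample.txt
# 20 Hello World
# 20 Hello World

# Without commas: the values are concatenated into one string
awk '{a=20; b="Hello"; c="World"; print a b c}' sample.txt
# 20HelloWorld
# 20HelloWorld
```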




Please note that you need to pass a file name as well. If you don't, AWK will keep waiting for input from the keyboard (standard input).
But why do we have 7 lines in the output?
Reason: AWK processes the file line by line and runs the action once for every line. Here the action prints only the variable values, so they are printed once per line of the file.

Try at home: How would you print only one line in this output? Feel free to write your answer in the comments section.

What happens when I give only print rather than print a,b,c? AWK will not print the variable values; it will just print the file contents as they are.
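A quick sketch of that last point, again assuming a small file named sample.txt created just for this example:

```shell
printf 'first line\nsecond line\n' > sample.txt

# "print" with no arguments prints the current line ($0), not the variable
awk '{a=20; print}' sample.txt
# first line
# second line
```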


Example 2:

See the below screenshot. Awwww, what happened here?




If I do an arithmetic operation between a number and a string, as in "a+b", AWK treats the variable holding a non-numeric string as 0. So in the output, 20+0=20 is displayed.

If both are numbers, then there are no issues; it will perform the arithmetic operation. Please see the below screenshot.
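Both cases can be tried with a sketch like the one below. The value 20 comes from the explanation above; "Hello" and 30 are assumed values for illustration, and echo just supplies one line of input.

```shell
# Number + non-numeric string: the string counts as 0
echo "x" | awk '{a=20; b="Hello"; print a+b}'
# 20

# Number + number: normal addition
echo "x" | awk '{a=20; b=30; print a+b}'
# 50
```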




2) Operators

In the above example, I used the "+" operator. There are multiple operators that can be used for different purposes.


  • Arithmetic Operators: +, -, *, /, %, ^, ++, --
  • Assignment Operators: =, +=, -=, *=, /=, %=, ^=
  • Relational Operators: <, >, >=, <=, ==, !=
  • Logical Operators: &&, ||, !
  • Pattern Match (regular expression) Operators: ~, !~
  • String Concatenation: blank space
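Here is a small sketch that touches one operator from most of these groups. The input "7 2" and the variable names are assumptions made up for this example.

```shell
echo "7 2" | awk '{
    sum = $1 + $2; rem = $1 % $2     # arithmetic
    n = $1; n += 3                   # assignment
    bigger = ($1 > $2)               # relational: 1 for true, 0 for false
    both = ($1 > 5 && $2 > 5)        # logical AND
    digits = ($1 ~ /^[0-9]+$/)       # pattern match
    joined = $1 "-" $2               # concatenation by writing values side by side
    print sum, rem, n, bigger, both, digits, joined
}'
# 9 1 10 1 0 1 7-2
```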

Example 3:

Suppose we want to find all the employees who are male and have a salary of less than 25000.


See the above screenshot for what I did.

-F --> Used because the delimiter for my fields is a comma.
$2 --> My second field contains the M/F column, so I am matching it with "M".
&& --> Logical operator for joining the 2 conditions.
$4 --> This is the salary column in my file, which has to be less than 25000.
print $1 --> Prints the name of the employee, as it is stored in field one.
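Putting those pieces together, the command could look like the sketch below. The file employee.csv and its rows are made-up sample data; only the column positions (name in field 1, M/F in field 2, salary in field 4) follow the explanation above, and the third column is assumed to be a department.

```shell
# Hypothetical sample data: name,gender,department,salary
cat > employee.csv <<'EOF'
Robert,M,Sales,20000
Anita,F,HR,30000
Robin,M,IT,40000
Sunil,M,IT,22000
Roberta,F,Sales,24000
EOF

# Male employees with a salary below 25000
awk -F, '$2=="M" && $4<25000 {print $1}' employee.csv
# Robert
# Sunil
```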


Example 4:

Print the records from the employee file along with the sequence number using user-defined variables.



Explanation:

-F --> Used because the delimiter for my fields is a comma.
++x --> Variable x along with the increment operator. As mentioned earlier, a variable is initialized to 0 by default, and I am adding 1 each time before printing x.
print $0 --> Prints the whole record (all fields).
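A possible form of this command is sketched below, using a made-up three-line employee.csv as sample data.

```shell
# Hypothetical sample data: name,gender,department,salary
cat > employee.csv <<'EOF'
Robert,M,Sales,20000
Anita,F,HR,30000
Robin,M,IT,40000
EOF

# ++x increments x first, then print shows the new value before the record
awk -F, '{print ++x, $0}' employee.csv
# 1 Robert,M,Sales,20000
# 2 Anita,F,HR,30000
# 3 Robin,M,IT,40000
```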


Example 5:

Find the employees whose name starts with Rob and who are male.



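I will not explain all the syntax; I hope by now you have understood the other parts. So what is new here?

I used $1~"Rob*" or $1~/Rob*/ --> If you remember, // was used for pattern search. The same can be used here, or you can use double quotes "". Note that the pattern is a regular expression, not a shell wildcard: Rob* means "Ro" followed by zero or more "b", found anywhere in the field. To match only names that start with Rob, anchor the pattern as /^Rob/.
~ --> The pattern match operator. It checks whether the field matches the regular expression.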
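As a sketch, here is the match written with an anchored pattern. I use /^Rob/ because ~ does a regular expression match and ^ ties the match to the start of the field. The file employee.csv and its rows are made-up sample data.

```shell
# Hypothetical sample data: name,gender,department,salary
cat > employee.csv <<'EOF'
Robert,M,Sales,20000
Anita,F,HR,30000
Robin,M,IT,40000
Sunil,M,IT,22000
Roberta,F,Sales,24000
EOF

# Names starting with Rob, and male
awk -F, '$1 ~ /^Rob/ && $2=="M" {print $1}' employee.csv
# Robert
# Robin
```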


If I want to print all employee names other than those starting with Rob, I can use the negated operator !~ instead.
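A sketch of the negated match with the !~ operator, once more on made-up sample data in employee.csv:

```shell
# Hypothetical sample data: name,gender,department,salary
cat > employee.csv <<'EOF'
Robert,M,Sales,20000
Anita,F,HR,30000
Robin,M,IT,40000
Sunil,M,IT,22000
Roberta,F,Sales,24000
EOF

# Names that do NOT start with Rob
awk -F, '$1 !~ /^Rob/ {print $1}' employee.csv
# Anita
# Sunil
```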



Example 6:

Let's say I want to find the empty lines in my employee data file.
Note: I added a few empty lines to my file using the vi editor.



Explanation:

^$ --> This pattern matches every empty line. ^ --> start of the line and $ --> end of the line, so together they mean there is no data in between.
x=x+1 --> Increases the value of variable x every time an empty line is found. By default, the variable value is 0.
print x --> Prints the value of x.
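The pieces above can be combined as in the sketch below. The file employee_gaps.csv, with three empty lines in it, is made-up sample data for this example.

```shell
# Hypothetical sample data with three empty lines
printf 'Robert,M,Sales,20000\n\nAnita,F,HR,30000\n\n\nRobin,M,IT,40000\n' > employee_gaps.csv

# For every empty line, add 1 to x and print the running count
awk '/^$/ {x=x+1; print x}' employee_gaps.csv
# 1
# 2
# 3
```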

The same can be done via the below as well.





But the above output is not neat. I only want to display the total count once.

That introduces our next concept: BEGIN and END.

BEGIN and END

BEGIN Meaning:

It sets an action for pre-processing. This action is executed first, before AWK starts processing the file.

END Meaning:

It sets an action for post-processing. This action is executed after AWK has finished processing the file.

These are optional blocks and are not required every time. The BEGIN and END keywords have to be written in capital letters only.


Example 7: 


The total number of empty lines is displayed.

BEGIN --> Will just print the string.
/^$/{x=x+1} --> Will add 1 to x for each empty line.
END --> Once the end of the file is reached, print x displays the final value of x.
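A sketch of the three parts working together is below. The heading text and the file employee_gaps.csv (with three empty lines) are assumptions for illustration.

```shell
# Hypothetical sample data with three empty lines
printf 'Robert,M,Sales,20000\n\nAnita,F,HR,30000\n\n\nRobin,M,IT,40000\n' > employee_gaps.csv

# BEGIN runs once before the file, END runs once after it
awk 'BEGIN {print "Total empty lines:"} /^$/ {x=x+1} END {print x}' employee_gaps.csv
# Total empty lines:
# 3
```

Now the count is printed only once, after all the lines have been read.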


Example 8: 

Along the same lines, let's find the total salary of all employees.




I suppose this is self-explanatory. Please try to understand it, and if you have any doubts, feel free to mention them in the comments section.
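One way to write it is sketched here. The variable name total, the label text, and the rows in employee.csv are assumptions; the salary is taken from field 4 as in the earlier examples.

```shell
# Hypothetical sample data: name,gender,department,salary
cat > employee.csv <<'EOF'
Robert,M,Sales,20000
Anita,F,HR,30000
Robin,M,IT,40000
Sunil,M,IT,22000
Roberta,F,Sales,24000
EOF

# Add every salary to total, then print it once at the end
awk -F, '{total += $4} END {print "Total salary:", total}' employee.csv
# Total salary: 136000
```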

We will continue further in the next session.


If you like this post, please follow and comment.