Monday, September 8, 2014

Frequently used Linux commands:

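(1) To remove blank lines (empty, or containing only spaces) from a file:

     sed -e '/^ *$/d' inputfile > outputfile

     To also drop lines that contain only tabs, match any whitespace instead:

     sed -e '/^[[:space:]]*$/d' inputfile > outputfile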

(2) If-else statement in awk:

     awk '{
         if ($1 > 4)
             print $1,$2
         else
             print $1-10,$2
     }' filename
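
To see what the if-else block does, it can be run on a few rows of made-up data. This is only a sketch: the temporary file and its three rows are hypothetical and not part of the original note.

```shell
# Hypothetical two-column sample data, written to a temporary file.
tmpfile=$(mktemp)
printf '2 a\n7 b\n4 c\n' > "$tmpfile"

# Rows with $1 > 4 are printed unchanged; otherwise 10 is subtracted from $1.
result=$(awk '{
    if ($1 > 4)
        print $1, $2
    else
        print $1 - 10, $2
}' "$tmpfile")

echo "$result"
rm -f "$tmpfile"
```

With this input the first and third rows take the else branch (2 and 4 are not greater than 4), so they come out as -8 and -6.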

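(3) Assigning a value to a variable on the command line (sh/bash):

      Do not use :  set I 20    (in bash this sets the positional parameters $1 and $2, not I)
      Use        :  I=20        (no spaces around the '=' sign)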
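
A short sketch of the assignment in a Bourne-style shell (sh or bash). The variable I follows the note above; J is a hypothetical second variable added only to show the value being reused.

```shell
# Assign without spaces around '='; writing 'I = 20' would try to run a command named I.
I=20
echo "I is $I"

# Arithmetic expansion can read the variable directly.
J=$((I + 5))
echo "J is $J"
```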

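(4) If a directory holds a very large number of files, commands such as 'ls *.pdb' or 'mv *.pdb ...' may fail, because the shell expands the wildcard into more arguments than the system allows for a single command. For example, to move all the *.pdb files from one directory to another, we would normally use the following command:

 mv *.pdb destination_directory

When the source directory contains too many *.pdb files, the above command gives the following error message: "mv: Argument list too long".

In such circumstances, you could use the following command to complete the task:

 echo *.pdb | xargs mv -t destination_directory

Here echo is a shell builtin, so it is not subject to the argument limit, and xargs splits the names into batches that mv can accept. This assumes GNU mv (for the -t option) and file names without spaces or quotes.

To move everything except the *.pdb files instead, use the extended pattern !(*.pdb) (in bash, enable it first with 'shopt -s extglob'):

 echo !(*.pdb) | xargs mv -t destination_directory

This command moves all other files (not *.pdb) to the destination_directory, so the source directory would contain only the *.pdb files.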
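
Another way to move the *.pdb files themselves without hitting the argument limit is to let find hand the names to mv in batches, which also copes with spaces in file names. The sketch below assumes GNU find and GNU mv (for -maxdepth and -t); the src and dest directories and the three files are hypothetical, created only for the demonstration.

```shell
# Hypothetical source and destination directories for the demonstration.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dest"
touch "$workdir/src/a.pdb" "$workdir/src/b.pdb" "$workdir/src/notes.txt"

# find passes the matching names to mv in batches that fit the argument limit.
# -maxdepth 1 keeps the search from descending into subdirectories.
# mv -t (target directory given first) is a GNU coreutils option.
find "$workdir/src" -maxdepth 1 -type f -name '*.pdb' \
    -exec mv -t "$workdir/dest" {} +

moved=$(ls "$workdir/dest" | wc -l)
left=$(ls "$workdir/src")
echo "moved: $moved, left behind: $left"
rm -rf "$workdir"
```

Because the pattern is quoted, the shell never expands it; find does the matching itself, so the number of files does not matter.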

(5) To find the minimum and maximum values in a column (here, the first column):

     - minimum value:
         awk '{print $1}' filename | sort -n | head -1

     - maximum value:
         awk '{print $1}' filename | sort -n | tail -1
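
The two pipelines above each read and sort the whole column. When both values are needed, a single awk pass is one possible alternative. This is a sketch under the assumption that the column is numeric; the sample rows are hypothetical.

```shell
# Hypothetical sample data; only the first column is examined.
tmpfile=$(mktemp)
printf '5 x\n-3 y\n12 z\n7 w\n' > "$tmpfile"

# Track the smallest and largest value of column 1 in one pass.
# Adding 0 forces a numeric (not string) comparison.
result=$(awk '{ v = $1 + 0 }
     NR == 1 { min = v; max = v }
     v < min { min = v }
     v > max { max = v }
     END { print "min:", min, "max:", max }' "$tmpfile")

echo "$result"
rm -f "$tmpfile"
```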

(6) Averages and standard deviation:
     Consider a tab-delimited file with n columns and m rows.

     - To take the average of every column:
         awk '{for(i=1; i<=NF; i++){sum[i]+=$i}} END {for(i=1; i<=NF; i++){printf "%s\t", sum[i]/NR} printf "\n"}' file

     - To calculate the average and the (population) standard deviation of the second column:
         awk '{sum += $2; sumsq += ($2)^2} END {printf "%f %f \n", sum/NR, sqrt((sumsq-sum^2/NR)/NR)}' file
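
The standard deviation formula above divides by NR, which gives the population standard deviation; dividing by NR-1 instead gives the sample standard deviation. The sketch below prints both for a small hypothetical data set whose second column is 2, 4, 4, 4, 5, 5, 7, 9 (sum 40, sum of squares 232).

```shell
# Hypothetical tab-delimited data; column 2 holds the values of interest.
tmpfile=$(mktemp)
printf 'a\t2\nb\t4\nc\t4\nd\t4\ne\t5\nf\t5\ng\t7\nh\t9\n' > "$tmpfile"

# Accumulate the sum and the sum of squares, then print
# mean, population std (divide by NR) and sample std (divide by NR-1).
result=$(awk '{ sum += $2; sumsq += $2 * $2 }
     END {
         ss = sumsq - sum * sum / NR      # sum of squared deviations from the mean
         printf "%.2f %.2f %.2f\n", sum / NR, sqrt(ss / NR), sqrt(ss / (NR - 1))
     }' "$tmpfile")

echo "$result"
rm -f "$tmpfile"
```

Here the sum of squared deviations is 232 - 40*40/8 = 32, so the population value is sqrt(32/8) = 2 and the sample value is sqrt(32/7), about 2.14.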
