What grep/awk/sed command should I use to get the output that I want?

I have an input file like this:

COL1: VALUE1 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyyy23, NAME=AUDIT
COL1: VALUE2 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyy23, NAME=generic
XYZ:2, COL1: 289 , TREK:MRP, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X,  NAME=Oil, trial=TREE

I want to get the output like this:

  COL1: VALUE1 , NAME=AUDIT
  COL1: VALUE2 , NAME=generic
  COL1: 289    , NAME=Oil

How can I achieve this with awk, grep, or sed on the command line, without relying on extended awk versions such as gawk or nawk?

Basically, I want to extract the COL1 field (with the text after the colon) and the NAME field (with the text after the equals sign), irrespective of where they appear in the line. Note that the position of the NAME column changes slightly in the last line.

This is what I could come up with:

awk -F"," '{print $1, $6}' file.txt
COL1: VALUE1   NAME=AUDIT
COL1: VALUE2   NAME=generic
XYZ:2   NAME=Oil

Could you please try the following (written and tested in GNU awk)? It uses only match(), substr(), RSTART and RLENGTH, which POSIX awk also provides.

awk '
BEGIN{
  OFS=" , "
}
match($0,/COL[0-9]+: [^,]*/){
  val=substr($0,RSTART,RLENGTH)
  match($0,/NAME[^,]*/)
  print val OFS substr($0,RSTART,RLENGTH)
  val=""
}
'   Input_file

I have combined the matches for COL and NAME in a single block, so a line that does not contain the string COL prints nothing.



If the string COL is not found in a line and you still want to print the NAME match, try the following.

awk '
BEGIN{
  OFS=" , "
}
match($0,/COL[0-9]+: [^,]*/){
  val=substr($0,RSTART,RLENGTH)
}
match($0,/NAME[^,]*/){
  if(val){
    printf "%s%s",val,OFS
  }
  print substr($0,RSTART,RLENGTH)
}
{
  val=""   ##Reset val so a value from one line is not reused on a later line without COL.
}
'    Input_file


Explanation: the first command again, annotated line by line.

awk '                                          ##Start of the awk program.
BEGIN{                                         ##Start of the BEGIN section, which runs before any input is read.
  OFS=" , "                                    ##Set OFS (the output field separator) to space, comma, space.
}                                              ##End of the BEGIN section.
match($0,/COL[0-9]+: [^,]*/){                  ##Use the built-in match() function to find COL, digits, a colon and a space, then everything up to the next comma.
  val=substr($0,RSTART,RLENGTH)                ##On a match, store the matched text in val: the substring that starts at RSTART and is RLENGTH characters long.
  match($0,/NAME[^,]*/)                        ##Call match() again to find NAME and everything up to the next comma.
  print val OFS substr($0,RSTART,RLENGTH)      ##Print val, then OFS, then the substring of the current line that starts at RSTART and is RLENGTH characters long.
  val=""                                       ##Reset val.
}                                              ##End of the main block.
'  Input_file                                  ##Name of the input file.

For reference, from the awk man page:

   RSTART      The index of the first character matched by match(); 0 if no match.  (This implies that character indices start at one.)

   RLENGTH     The length of the string matched by match(); -1 if no match.
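
As a minimal sketch of what these two variables hold, the snippet below calls match() on one shortened sample line and prints RSTART, RLENGTH and the matched text. The helper name show_match and the shortened line are assumptions made for illustration; only POSIX awk features are used.

```shell
# Shortened sample line in the same format as the input in the question.
line='XYZ:2, COL1: 289 , TREK:MRP, NAME=Oil, trial=TREE'

# show_match is a hypothetical helper name, not part of awk.
show_match() {
  printf '%s\n' "$1" | awk '{
    if (match($0, /NAME[^,]*/))
      print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    else
      print RSTART, RLENGTH   # no match: RSTART is 0 and RLENGTH is -1
  }'
}

show_match "$line"                # prints: 30 8 NAME=Oil
show_match 'no such field here'   # prints: 0 -1
```

Because substr($0,RSTART,RLENGTH) takes a start index and a length, it returns exactly the matched text, and it returns an empty string when nothing matched.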

You can try a Perl one-liner:

 perl -lne ' /(COL1:\s*\S+).+(NAME=\w+)/ and print "$1,\t$2" ' input_file

with your inputs:

$ cat sach.txt
COL1: VALUE1 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyyy23, NAME=AUDIT
COL1: VALUE2 , XYZ: 2, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X, proc=0xyy23, NAME=generic
XYZ:2, COL1: 289 , TREK:MRP, OWNER: (DSF) , FLG: DIT /-/-/ OX if 0X,  NAME=Oil, trial=TREE
$ perl -lne ' /(COL1:\s*\S+).+(NAME=\w+)/ and print "$1,\t$2" ' sach.txt
COL1: VALUE1,   NAME=AUDIT
COL1: VALUE2,   NAME=generic
COL1: 289,      NAME=Oil
$

Explanation:

perl -lne  # -n loops over the input lines without printing them automatically; -l strips the newline on input and adds it back on print; -e supplies the program

' /(COL1:\s*\S+).+(NAME=\w+)/  # Match the pattern; the first () is captured in $1 and the second () in $2
                               # First ()  matches COL1:\s*\S+ => COL1: followed by zero or more whitespace characters (\s*) and then one or more non-whitespace characters (\S+)
                               # .+ => matches everything between the first () and the second ()
                               # Second () matches NAME= followed by one or more word characters (\w+)


and                            # run print only if the match /..../ succeeded
print "$1,\t$2"                # print the captured $1 and $2, separated by a comma and a tab

' input_file

With grep, you could try something like this:

while IFS= read -r line; do COL=$(printf '%s\n' "$line" | grep -o "COL1:[^,]*,"); NAME=$(printf '%s\n' "$line" | grep -o "NAME=[a-zA-Z]*"); echo "$COL $NAME" >> new_file.txt; done < your_file.txt

The regex in this example assumes that the value after COL1: is always followed by a "," (it takes every character between the : and the next ,), so you might have to adapt it to fit your file. The same goes for the regex used for NAME.
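
As a small sketch of how the comma handling in the COL1 pattern behaves, the snippet below runs grep -o on one sample line, once with a greedy .*, and once with a bounded [^,]*,. The shortened sample line and the variable names are illustrative assumptions; grep -o is available in GNU, BSD and BusyBox grep but is not required by POSIX.

```shell
# Shortened sample line in the same format as the input in the question.
line='COL1: VALUE1 , XYZ: 2, OWNER: (DSF) , NAME=AUDIT'

# Greedy: .* runs on to the LAST comma in the line.
greedy=$(printf '%s\n' "$line" | grep -o 'COL1:.*,')

# Bounded: [^,]* cannot cross a comma, so the match ends at the FIRST one.
bounded=$(printf '%s\n' "$line" | grep -o 'COL1:[^,]*,')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The greedy form returns everything up to the last comma on the line, while the bounded form stops right after the COL1 value.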

Try this:

$ sed 'H;s/.*NAME=/NAME=/;s/ *,.*//;x;s/^.*COL1/COL1/;s/ *,.*//;G;s/\n/\t, /;' file
COL1: VALUE1    , NAME=AUDIT
COL1: VALUE2    , NAME=generic
COL1: 289       , NAME=Oil

This uses the hold space to keep the NAME part while the COL1 part is extracted, and \t for alignment.
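
The hold-space steps can be followed one command at a time in the sketch below. It keeps the same sed commands as the one-liner, but joins the two pieces with a plain " , " instead of \t, on the assumption that the GNU-style escape in the replacement may not be available; extract_pair is a hypothetical helper name.

```shell
# extract_pair is a hypothetical helper name; it reads the files given as
# arguments, or standard input when called without arguments.
extract_pair() {
  sed '
    # Append the current line to the hold space (after a newline).
    H
    # Pattern space: keep only NAME=... up to the next comma.
    s/.*NAME=/NAME=/
    s/ *,.*//
    # Swap: the NAME part goes to the hold space, the saved line comes back.
    x
    # Pattern space: keep only COL1... up to the next comma.
    s/^.*COL1/COL1/
    s/ *,.*//
    # Append the held NAME part, then join the two pieces on one line.
    G
    s/\n/ , /
  ' "$@"
}

printf '%s\n' 'XYZ:2, COL1: 289 , TREK:MRP,  NAME=Oil, trial=TREE' | extract_pair
```

After x, the hold space contains only the NAME part of the current line, so it does not grow as more lines are read.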

With GNU sed:

$ sed -E 's/^([^,]+,\s*)?(col1:[^,]+).+(,\s*name=\w+).*/\2\3/i' file.txt

Comments
  • sed -r 's/.*(COL1:[^,]+,).*( NAME=[^,]+).*/\1\2/'
  • works perfect, can you please explain it to me, will appreciate if you do
  • @Sach FYI, this awk solution was written with gawk, which stands for GNU awk. awk and gawk can point to the same executable, and in your case I think you are already using GNU awk, since you said it works. It is better to check your version (awk --version) and learn the differences; if you plan to move the code to another platform, bear in mind that it is not guaranteed to work there.
  • awk --version GNU Awk 3.1.7
  • @Sach, IMHO man awk is the BEST reference; after that you could go through SO posts too, there is a lot to learn from this GREAT forum. It is simple: if a match is found, RSTART gives the index where the match starts and RLENGTH gives the length of the matched text. Also, [^,]* matches everything up to the next comma in the current line.
  • on moba this is the version :awk --version awk: unknown option -- version BusyBox v1.22.1 (2015-11-10 11:07:12 ) multi-call binary.
  • Ahh... perl, always a good choice, and so powerful as well as elegant PRE is :)
  • @Tiw.. thank you for the appreciation..yep, Perl is the right tool for regex problems like this..
  • @stack0114106 can you please explain it a bit to me. this is very nice