Use awk to match & store, append the pattern and also split the line having a delimiter

I have the output below from a text file, and I need to reformat it to be more readable.

julian text:case2345
maria  text:case4567
clover text,text,text,text,text,text:case3456
neil   text,text:case09876

I need to reformat the output as follows:

julian text:case2345
maria  text:case4567
clover text:case3456 
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
neil   text:case09876
neil   text:case09876

Using awk, I was trying to match the pattern case[0-9], store it in a variable, split the line on the delimiter ",", and finally print. I tried the command below but couldn't get the desired output:

awk '/match($0,/case[0-9]/){val=substr($0,RSTART,RLENGTH);next}{split($2,k,","); for (i in k) {printf ("%s %s %s\n\n",$1,k[i],val)}}'

The following awk may help here (assuming your actual Input_file matches the sample shown).

awk -F' +|,|:' '$NF~/[cC][aA][sS][eE]/ && NF>2{for(i=2;i<=(NF-1);i++){print $1 OFS $i":"$NF};next} 1' Input_file

Here is a non-one-liner form of the solution too.

awk -F' +|,|:' '
$NF~/[cC][aA][sS][eE]/ && NF>2{
  for(i=2;i<=(NF-1);i++){
    print $1 OFS $i":"$NF};
  next
}
1
'  Input_file

Explanation: here is the same code with comments.

awk -F' +|,|:' '           ##Set the field separator to one or more spaces OR a comma OR a colon.
$NF~/[cCaAsSeE]/ && NF>2{  ##Check that the last field passes the case/CASE test and that the line has more than 2 fields.
  for(i=2;i<=(NF-1);i++){  ##Loop from the 2nd field to the second-to-last field.
    print $1 OFS $i":"$NF};##Print the first field, OFS (a space by default), the current field, a colon and the last field.
  next                     ##next skips the remaining rules for this line and moves on to the next line.
}
1                          ##Lines that did not satisfy the condition above reach here and are printed unchanged.
' Input_file               ##Input_file name goes here.
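
To see what that field separator does to a single line, here is a minimal sketch. The sample line is a shortened version of one from the question and the variable names line and fields are only illustrative; it prints the field count and then each field, which shows why the loop runs from field 2 to field NF-1.

```shell
# Illustrative sample line (shortened from the question's input).
line='clover text,text:case3456'

# Print the field count, then every field with its index.
fields=$(printf '%s\n' "$line" |
  awk -F' +|,|:' '{ print "NF=" NF; for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$fields"
```

Field 1 is the name, the last field is the case value, and everything in between is one of the comma-separated entries.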

Just tweak the answer to your previous question:

$ awk -F'[ ,:]+' '{for (i=2;i<NF;i++) print $1, $i ":" $NF}' file
julian text:case2345
maria text:case4567
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
neil text:case09876
neil text:case09876

# set field separator
awk -F '[: ]+' '/,/{                                # if line/row/record contains comma 
                    split($2,arr,/,/);              # split 2nd field by comma, 
                                                    # store elements in array arr
                    for(i=1; i in arr;i++)          # iterate through array arr
                         print $1, arr[i] ":" $NF;  # print 1st field, array element and last field from record
                    next                            # stop processing go to next line
                                                    # 1 at the end does default operation that is print $0
                }1' infile

Test Results:

$ cat infile
julian text:case2345
maria  text:case 4567
clover text,text,text,text,text,text:case3456
neil   text,text:case09876

$ awk -F '[: ]+' '/,/{split($2,arr,/,/);for(i=1; i in arr;i++)print $1,arr[i]":"$NF;next}1' infile
julian text:case2345
maria  text:case 4567
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
clover text:case3456
neil text:case09876
neil text:case09876
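
The approach described in the question (match the case pattern, store it, then split on commas) can also be made to work. The following is only a sketch under the assumption that every line has the form "name list:caseNNNN"; the variable names val, list and k are illustrative, not taken from any answer above.

```shell
result=$(printf '%s\n' 'julian text:case2345' 'neil   text,text:case09876' |
  awk '
    match($0, /case[0-9]+/) {
      val = substr($0, RSTART, RLENGTH)   # store the matched caseNNNN text
      list = $2                           # second whitespace-separated field
      sub(/:.*/, "", list)                # drop the ":caseNNNN" suffix
      n = split(list, k, ",")             # split what is left on commas
      for (i = 1; i <= n; i++)            # numeric loop keeps the original order
        print $1, k[i] ":" val
    }')

printf '%s\n' "$result"
```

Looping with a numeric index rather than "for (i in k)" matters here, because awk does not guarantee the order in which "in" visits array elements.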

Comments
  • It's a typo; I just reviewed the actual input file and there is no space between 'case' and the numbers. Thanks for noticing.
  • @Ravinder thanks for the explanation, it really helps. I am working on the same and will confirm. Is awk -F ' +| setting the field separator to spaces? What is the significance of '+'?
  • @shirop, + means one or more consecutive spaces are taken together as a single separator.
  • $NF~/[cCaAsSeE]/ in your final script is not the same as $NF~/[cC][aA][sS][eE]/ in your first 2 versions, but in any case consider just calling tolower instead of creating upper and lower case character lists within a bracket expression for every char in a regexp, e.g. use tolower($NF) ~ /case/ instead of $NF~/[cC][aA][sS][eE]/. Also, i<=(NF-1) is the same as i<NF and you don't need the semi-colons at the end of lines. Finally, the next statement and 1 are doing nothing given the OP's posted sample input.
  • How is the split happening here in the for loop, given that we are not explicitly telling split which delimiter to use, e.g. split($2,arr,/,/)? Thanks.
  • You don't need a separate, explicit call to split() since awk already splits each record into fields based on the value of FS, which I'm setting with -F'[ ,:]+', so awk splits the input every time it sees spaces, commas, or colons.
  • Thanks @Akshay. Can you please explain what '[: ]+' is? Also, is $NF storing the last value of the field, and how is the value assigned to $NF?
  • Found the GNU manual, which explains that $NF always represents the last field.
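
The questions in the comments about '+' and $NF can be checked directly. This is a small sketch using one line from the question's sample input; the variable names single and run are illustrative.

```shell
line='maria  text:case4567'   # two spaces after the name

# Without +, every separator character ends a field,
# so the double space produces an empty field.
single=$(printf '%s\n' "$line" | awk -F'[: ]' '{ print NF, $NF }')

# With +, a run of separators counts as one separator.
run=$(printf '%s\n' "$line" | awk -F'[: ]+' '{ print NF, $NF }')

printf '%s\n%s\n' "$single" "$run"
```

The first command reports 4 fields and the second reports 3, while $NF is case4567 in both, because NF holds the number of fields and $NF is therefore whichever field is last.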