Extracting information from a line with specific pattern using awk/sed

I have a file like this:

A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF

Using the command line below, I extract information as a separate column for conf.

sed -Ei 's/(.*conf=)([^;]*)(;.*)/\1\2\3\t\2/g' my_file

However, this works only when the conf value is followed by a ; and fails otherwise. How can I modify the script to extract the value in both cases, like this, and to add just a tab when the value is empty?

A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1  XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF  XF

I used this link as a reference: https://unix.stackexchange.com/questions/414082/extract-part-of-lines-with-specific-pattern-and-store-in-a-new-field-using-awk-o?noredirect=1&lq=1

You can simply drop the ; from the third group:

sed -Ei 's/(.*conf=)([^;]*)(.*)/\1\2\3\t\2/g' my_file

[^;]* is a negated bracket expression: it matches zero or more (because of *) characters other than ;. The match therefore stops at the next ; or at the end of the line, so the ; does not need to be present in the pattern itself; the preceding group is already "restricted".

See the online sed demo:

s="A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF"
sed -E 's/(.*conf=)([^;]*)(.*)/\1\2\3\t\2/g' <<< "$s"

Output:

A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1 XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF XF
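
The question also asks for a tab when the value is empty. The substitution above already yields a tab followed by nothing for `conf=;`, but a line with no conf= at all is left unchanged. A minimal sketch of one way to cover that case too, assuming GNU sed (for `\t` in the replacement) and assuming a missing conf should produce a trailing tab; the function name `append_conf` is hypothetical:

```shell
# Hypothetical helper: reads lines on stdin, appends a tab plus the conf value.
# Assumes GNU sed, which understands \t in the replacement text.
append_conf() {
    # 1st expression: append tab + value when conf= is present.
    # t: if that substitution succeeded, skip the rest of the script.
    # 3rd expression: otherwise append a bare tab (empty column).
    sed -E -e 's/(.*conf=)([^;]*)(.*)/\1\2\3\t\2/' -e t -e 's/$/\t/'
}

# Example: the second line has no conf= attribute.
printf '%s\n' \
    'A 10 20 bob.1 ID=bob.1;conf=XF;Note=bob_v1' \
    'A 30 40 bob.3 ID=bob.3;Note=bob_v1' | append_conf
```

The `t` command keeps lines that already received a value from getting a second tab.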

You could try the following in awk.

awk 'match($0,/conf=[^;]*/){print $0,substr($0,RSTART+5,RLENGTH-5);next} 1' Input_file

Explanation of the code above:

awk '                                        ## Start of the awk program.
match($0,/conf=[^;]*/){                      ## match() looks for "conf=" followed by everything up to the next semicolon (or end of line) and sets RSTART and RLENGTH.
   print $0,substr($0,RSTART+5,RLENGTH-5)    ## Print the current line, then the substring that starts at RSTART+5 (just past "conf=") and is RLENGTH-5 characters long.
   next                                      ## next skips all further statements for this line.
}                                            ## End of the block for lines that matched.
1                                            ## 1 is an always-true condition, so lines without a conf= match are printed unchanged.
'  Input_file                                ## Name of the input file.

Output will be as follows.

A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1 XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF XF
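
The same match() idea can be adapted to the "empty" case from the question. This is only a sketch, assuming that a missing or empty conf should still produce a tab-separated (empty) extra column; the function name `conf_column` is hypothetical:

```shell
# Hypothetical helper: appends a tab and the conf value (possibly empty) to every line.
# Reads the files given as arguments, or stdin when there are none.
conf_column() {
    awk '{
        val = ""                                   # default: empty column
        if (match($0, /conf=[^;]*/))               # sets RSTART and RLENGTH on success
            val = substr($0, RSTART + 5, RLENGTH - 5)   # skip the 5 characters of "conf="
        print $0 "\t" val                          # always emit the tab separator
    }' "$@"
}

printf '%s\n' \
    'A 10 20 bob.1 ID=bob.1;Parent=bob;conf=XF;Note=bob_v1' \
    'A 30 40 bob.3 ID=bob.3;Note=bob_v1' | conf_column
```

Unlike the one-liner above, every output line ends up with the same number of tab-separated columns.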

Whenever you have name=value input data, I find it easiest, most robust, and most flexible to create an array representing that relationship (f[name]=value below), so you can then access the values by their names. Depending on what "in case it is empty to put tab" means:

$ awk -F'[[:space:];=]+' -v OFS='\t' '
    {delete f; for (i=5; i<NF; i+=2) f[$i]=$(i+1); print $0, f["conf"]}
' file
A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1     XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF     XF

or:

$ awk -F'[[:space:];=]+' '
    {delete f; f["conf"]="\t"; for (i=5; i<NF; i+=2) f[$i]=$(i+1); print $0, f["conf"]}
' file
A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1 XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF XF
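
As an illustration of why the name-to-value array is flexible, here is a sketch that builds the same kind of array from the attribute column only and then prints several values by name. It assumes the attributes are always in the 5th whitespace-separated column and contain no whitespace; the function name `attr_values` and the chosen output columns are hypothetical:

```shell
# Hypothetical helper: prints the 4th column plus the conf and Note values, tab-separated.
attr_values() {
    awk '{
        split("", f)                         # portable way to empty the array
        n = split($5, pairs, ";")            # "ID=bob.1", "Parent=bob", ...
        for (i = 1; i <= n; i++) {
            eq = index(pairs[i], "=")        # position of the first "="
            if (eq)
                f[substr(pairs[i], 1, eq - 1)] = substr(pairs[i], eq + 1)
        }
        # Names that were not present simply yield an empty string.
        print $4 "\t" f["conf"] "\t" f["Note"]
    }' "$@"
}

printf '%s\n' \
    'A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1' \
    'A   30  40  bob.3   ID=bob.3;Note=bob_v1' | attr_values
```

Once the array is filled, adding another attribute to the output only means referencing one more name in the print statement.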

You can try Perl one-liner

$ perl -lne ' /conf=(\w+)/ and $_.=" $1"; print ' conf.txt
A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1 XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF XF
$

or even shorter (note that this version prints only the lines that contain conf=)

$ perl -lne ' /conf=(\w+)/ and print "$_ $1" ' conf.txt
A   10  20  bob.1   ID=bob.1;Parent=bob;conf=XF;Note=bob_v1 XF
A   20  30  bob.2   ID=bob.2;Parent=bob;Note=bob_v1;conf=XF XF

We do not need to require the ; in \3, because it is already handled by the negated character list in \2:

sed -Ei 's/(.*conf=)([^;]*)(.*)/\1\2\3\t\2/' my_file

If the value may also be terminated by some character other than ;, add that character to the list in \2. Such a character could be a \t or a space, for example:

sed -Ei 's/(.*conf=)([^;\t ]*)(.*)/\1\2\3\t\2/' my_file
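
Whether `\t` is understood inside a bracket expression depends on the sed implementation. A sketch of the same idea using the POSIX character class `[:space:]` instead, still assuming GNU sed for the `\t` in the replacement; the function name `conf_until_space` is hypothetical:

```shell
# Hypothetical helper: the conf value ends at the first ";" or whitespace character.
conf_until_space() {
    sed -E 's/(.*conf=)([^;[:space:]]*)(.*)/\1\2\3\t\2/'
}

# Example: here the value is followed by a space rather than a semicolon.
printf '%s\n' 'A 10 20 bob.1 ID=bob.1;conf=XF trailing text' | conf_until_space
```

In this example only XF is appended, not the text after the space.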


Comments
  • When you say "in case it is empty to put tab", do you mean a tab instead of XF in your output above, or that the XFs above should be preceded by a tab so that the empty case would be just a tab followed by nothing, or do you mean something else? Include that case in your sample input/output.