Trying to print previous line in awk, but instead appears to print current line twice

I'm trying to use awk to imitate uniq -d on specific fields to print the line currently being read as well as the previous line using the first solution from here, but it appears to print the same line twice.

Here's a sample of the stuff in the file.

130 chr1    7237    7238    0k9imgkt
135 chr1    7637    7637    b9gko
138 chr1    7908    7908    kob9g
139 chr1    8045    8045    34e5rg  4r
151 chr1    8329    8329    b
151 chr1    8346    8346    345y46htyh
151 chr1    8346    8346    76jtuj
152 chr1    8358    8358    asfge

Here's the line I used. I'm trying to compare rows based on the second, third, and fourth fields; if two or more rows are identical in those fields, print the entirety of those rows. Also, it's safe to assume that the rows are sorted based on fields 1, 2, and 3.

awk '{prev = $0; ++array[$2$3$4]; if(array[$2$3$4] == 2) {print; curr = $0; $0 = prev; print; $0 = curr}}' file

Here's what I want the output to be.

151 chr1    8346    8346    345y46htyh
151 chr1    8346    8346    76jtuj

And here's what the output is.

151 chr1    8346    8346    76jtuj
151 chr1    8346    8346    76jtuj

If I understand your question correctly, try the following.

awk 'FNR==NR{a[$2$3$4]++;next} a[($2$3$4)]>1' Input_file Input_file

OR

awk '{k=$2 FS $3 FS $4} FNR==NR{a[k]++;next} a[k]>1'  Input_file Input_file

Output will be as follows.

151 chr1    8346    8346    345y46htyh
151 chr1    8346    8346    76jtuj
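
For readers unfamiliar with the idiom, here is a commented sketch of the same two-pass approach. The rows are taken from the question's sample; the temporary file and the names `tmp`, `out`, and `count` are only illustrative. It assumes a POSIX shell with `mktemp` available.

```shell
# Write a few sample rows (from the question) to a throwaway file.
tmp=$(mktemp)
printf '%s\n' \
  '151 chr1 8329 8329 b' \
  '151 chr1 8346 8346 345y46htyh' \
  '151 chr1 8346 8346 76jtuj' \
  '152 chr1 8358 8358 asfge' > "$tmp"

# The same file is named twice, so awk reads it in two passes.
out=$(awk '
  { k = $2 FS $3 FS $4 }          # key built with separators between the fields
  FNR == NR { count[k]++; next }  # first pass (FNR equals NR): only count keys
  count[k] > 1                    # second pass: print rows whose key repeats
' "$tmp" "$tmp")
rm -f "$tmp"
printf '%s\n' "$out"
```

Because every key is counted before anything is printed, this also works when the duplicate rows are not adjacent; the cost is reading the file twice.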

You are printing the same line twice. It's not entirely clear what you want the logic to be, but surely one of the print statements should be print curr or perhaps print prev. Also the lone prev doesn't do anything, and looks like it was left over from an editing mistake.

Perhaps you are looking for something like

awk '++array[$2$3$4] >= 2 {
        if(prev)print prev;
        print;
        prev = ""; next }
    { prev = $0 }' file

If that doesn't do what you want, maybe edit your question to describe in more detail what you hope your current script should do; code which doesn't do what you want isn't really a good way to communicate what you do want.

Here is another awk solution that doesn't read input file twice and works even if your input is not sorted.

awk '(k = $2 FS $3 FS $4) in a {
  print a[k] $0; a[k] = ""; next
} { a[k] = $0 ORS }' file
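
The one-pass script above is compact, so here is a commented sketch of the same logic run on made-up, unsorted rows. The input data and the array name `seen` are hypothetical and chosen only to show the behavior with a key that occurs three times.

```shell
out=$(printf '%s\n' \
  '1 chr1 10 10 a' \
  '2 chr1 20 20 b' \
  '3 chr1 10 10 c' \
  '4 chr1 10 10 d' |
awk '
  (k = $2 FS $3 FS $4) in seen {  # this key already appeared on an earlier line
    print seen[k] $0              # held first line (if still held), then this line
    seen[k] = ""                  # release the held line so it prints only once
    next
  }
  { seen[k] = $0 ORS }            # first occurrence: hold the line plus a newline
')
printf '%s\n' "$out"
```

With this input the rows ending in a, c, and d are printed and the row ending in b is not. Note that the first row of each group is emitted only when its second occurrence is found, so the output order follows where duplicates are discovered rather than the original line order.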

Awk processes a line at a time. To print the previous line, remember the previous line in a variable. To print the next line, remember that you want to, and print and reset this variable on the next iteration. – tripleee Feb 7 '19 at 18:17
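
A minimal sketch of the "remember the previous line in a variable" idea, assuming (as the question states) that duplicate rows are adjacent. The variable names `prevline`, `prevkey`, and `shown` are illustrative, not taken from any of the answers.

```shell
out=$(printf '%s\n' \
  '151 chr1 8329 8329 b' \
  '151 chr1 8346 8346 345y46htyh' \
  '151 chr1 8346 8346 76jtuj' \
  '152 chr1 8358 8358 asfge' |
awk '
  { key = $2 FS $3 FS $4 }
  NR > 1 && key == prevkey {        # same key as the line just before
    if (!shown) print prevline      # emit the remembered line once per run
    print
    shown = 1
  }
  key != prevkey { shown = 0 }      # a new run of keys starts here
  { prevline = $0; prevkey = key }  # remember this line last, after all comparisons
')
printf '%s\n' "$out"
```

The important detail is the order: the current line is stored only at the end of the program, after it has been compared with the stored one. Assigning the variable at the top of the action makes it equal to the current line, which is how the same line ends up printed twice.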

Comments
  • I made a mistake while entering that line; I amended it.
  • What should the output be if a line like 153 chr1 8045 8045 foo appeared at the end of your posted sample input? Should the earlier 139 chr1 8045 8045 34e5rg 4r be printed and then that new line since both have common $2/$3/$4 values? If so where should it appear - before the 151 lines or after them?
  • @DangIt, could you please try this one and let me know if it helps?
  • I'm not sure why, but for me the output is an empty file.
  • @DangIt, did you pass Input_file twice on the command line? The command reads the file two times.
  • You're right; I forgot to put it in a second time. Why is it that I need to enter the input file twice? As in, would there be any advantages in being able to call two different input files?
  • It checks which keys exist multiple times in the file, and then on the second pass prints those. The two passes are an inefficiency, but they let you handle files where the duplicates may not be adjacent.
  • What is the purpose of prev = ""? To me, it looks like {prev = $0} will overwrite that anyway.
  • It's to avoid printing prev more than once if there is a third or fourth repeat. The next bypasses the block which contains the prev = $0 assignment.
  • a bc -> abc. ab c -> abc. When creating unique keys by concatenating fields you need to include separators: {k=$2 FS $3 FS $4}.
  • That's a good fix if the fields might actually be ambiguous. The sample data looks like the second field would not have a lot of variation.
  • I'm not convinced that chrN will always end in 1 digit vs 2, nor that the numeric columns are always 4 digits vs 3 or 5 or something else, but maybe I'm just being paranoid, idk.
  • Using the full data set, it works correctly for the first duplicate pair, but it only outputs the second duplicate of each duplicate pair after that.
  • @DangIt that's hard to believe unless $2/3/4 keys can recur in later lines. You should include that use case in your posted sample input/output so people can test potential solutions against it.
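
The key-collision concern raised in the comments (a bc and ab c both becoming abc) can be checked with a small sketch. The two input rows and the array names `glued` and `separated` are made up for illustration.

```shell
out=$(printf '%s\n' 'x a bc 1' 'y ab c 1' |
awk '
  {
    glued[$2 $3]++          # plain concatenation: both rows give the key "abc"
    separated[$2 FS $3]++   # with FS in between: "a bc" and "ab c" stay distinct
  }
  END {
    n = 0; for (k in glued) n++
    m = 0; for (k in separated) m++
    print n, m              # number of distinct keys under each scheme
  }')
printf '%s\n' "$out"
```

This prints `1 2`: without separators the two different rows collapse into one key and would be reported as duplicates, while the separated form keeps them apart.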