UNIX group by two values

I have a file with the following lines (values are separated by ";"):

dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1

I know how to group them by one value, such as the count of each ASRx, but how can I group them by two values, for example:

ASR1
    *11.1 - 2
    *12.2 - 1
ASR3 
    *15.1 - 1
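For reference, the one-value grouping I already have is something like this (a count per dev_type; the file name file.csv is just for the sketch):

```shell
# Sample data from above, saved for the demo.
cat > file.csv <<'EOF'
dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1
EOF

# One-column grouping: count of devices per dev_type (field 2),
# skipping the header line.
tail -n +2 file.csv | cut -d';' -f2 | sort | uniq -c
```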
$ cat tst.awk
BEGIN { FS=";"; OFS=" - " }
NR==1 { next }
$2 != prev { prt(); prev=$2 }
{ cnt[$3]++ }
END { prt() }

function prt(   soft) {
    if ( prev != "" ) {
        print prev
        for (soft in cnt) {
            print "    *" soft, cnt[soft]
        }
        delete cnt
    }
}

$ awk -f tst.awk file
ASR1
    *11.1 - 2
    *12.2 - 1
ASR3
    *15.1 - 1
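One assumption worth noting: tst.awk prints a group whenever $2 changes, so it relies on rows with the same dev_type being adjacent (true in the sample). If your input is not already grouped, sort the data rows on column 2 first, keeping the header line in place — a sketch (the unsorted.csv name is just for the demo):

```shell
# tst.awk starts a new group on every change of $2, so rows sharing a
# dev_type must be adjacent. Keep the header first (sorting would move it,
# since "dev_type" sorts after "ASR*"), and sort the rest on field 2:
cat > unsorted.csv <<'EOF'
dev_name;dev_type;soft
name4;ASR3;15.1
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
EOF

{ head -n 1 unsorted.csv; tail -n +2 unsorted.csv | sort -t';' -k2,2; } |
    awk -f tst.awk
```

Within each group the `for (soft in cnt)` traversal order is implementation-defined; in gawk you can set `PROCINFO["sorted_in"] = "@ind_str_asc"` if you need a deterministic order.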

Or if you like pipes....

$ tail -n +2 file | cut -d';' -f2- | sort | uniq -c |
    awk -F'[ ;]+' '{print ($3!=prev ? $3 ORS : "") "    *" $4 " - " $2; prev=$3}'
ASR1
    *11.1 - 2
    *12.2 - 1
ASR3
    *15.1 - 1
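The `-F'[ ;]+'` in that last awk deserves a word: `uniq -c` left-pads its counts with spaces, so splitting on runs of spaces and semicolons makes $1 empty, $2 the count, $3 the dev_type and $4 the soft version. Inspecting the intermediate output shows why (a sketch; the exact padding width varies by implementation):

```shell
cat > file.csv <<'EOF'
dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1
EOF

# Counts are right-aligned with leading spaces, e.g. "  2 ASR1;11.1";
# with FS='[ ;]+' that leading run of spaces yields an empty $1.
tail -n +2 file.csv | cut -d';' -f2- | sort | uniq -c
```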


another awk

$ awk -F';' 'NR>1 {a[$2]; b[$3]; c[$2,$3]++} 
             END  {for(k in a) {print k; 
                                for(p in b) 
                                   if(c[k,p]) print "\t*"p,"-",c[k,p]}}' file
ASR1
        *11.1 - 2
        *12.2 - 1
ASR3
        *15.1 - 1


try something like

awk -F ';' '
   NR==1{next}
   {aRaw[$2"-"$3]++}
   END {
      n = asorti( aRaw, aVal )
      for( Val=1; Val<=n; Val++ ) {
         split( aVal [Val], aTmp, /-/ )
         if ( aTmp[1] != Last ) { Last = aTmp[1]; print Last }
         print "   " aTmp[2] " " aRaw[ aVal[ Val] ]
         }
      }
   ' YourFile

The key here is to use two fields in one array key. The END part, which presents the values, is more involved than the counting itself.
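Joining the two fields with "-" assumes "-" never occurs in the data. A sketch of the same idea using awk's native multi-dimensional subscripts, which are joined with SUBSEP (by default "\034", a control character unlikely to appear in text); the file.csv here-doc is just for the demo:

```shell
cat > file.csv <<'EOF'
dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1
EOF

awk -F';' '
    NR > 1 { cnt[$2,$3]++ }            # key is $2 SUBSEP $3
    END {
        for (k in cnt) {
            split(k, f, SUBSEP)        # recover the two fields
            print f[1], f[2], cnt[k]   # traversal is unordered
        }
    }' file.csv | sort
```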


Using Perl

$ cat bykub.txt
dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1
$ perl -F";" -lane ' $kv{$F[1]}{$F[2]}++ if $.>1;END { while(($x,$y) = each(%kv)) { print $x;while(($p,$q) = each(%$y)){ print "\t\*$p - $q" }}}' bykub.txt
ASR1
        *11.1 - 2
        *12.2 - 1
ASR3
        *15.1 - 1
$

Yet Another Solution, this one using the always useful GNU datamash to count the groups:

$ datamash -t ';' --header-in -sg 2,3 count 3 < input.txt |
   awk -F';' '$1 != curr { curr = $1; print $1 } { print "\t*" $2 " - " $3 }' 
ASR1
    *11.1 - 2
    *12.2 - 1
ASR3
    *15.1 - 1

Comments
  • Insert the csv as a table in a DB, then use some SQL with a "group by" clause :D
  • @funkyjelly There's also the q command.
  • @Socowi Agreed, but might not be available/installed (not that a DB would be more easily available though!)
  • stackoverflow.com/a/2613073/2908724
  • Welcome to SO. Stack Overflow is a question and answer page for professional and enthusiastic programmers. Add your own code to your question. You are expected to show at least the amount of research you have put into solving this question yourself.
  • You should mention it requires GNU awk for asorti().
  • $2"-"$3 assumes "-" is a safe character, which is risky. You may perhaps assume FS doesn't occur inside a field, but not any other character.
  • @karakfa I didn't catch the point: you assume it in your solution but not in mine? Could you give a sample input file where it doesn't work?
  • @EdMorton as often, good remark about limitation of the solution.