SQL query to count rows based on previous values of different column


I'm working in SAS and I have a table that looks like this

ID | Time | Main | lag_1 | lag_2
----------------------------------------------------------------------------
A  |  01  |   0  |   0   |  1  
A  |  03  |   0  |   0   |  1  
A  |  04  |   0  |   0   |  0  
A  |  10  |   1  |   0   |  0  
A  |  11  |   1  |   0   |  0  
A  |  12  |   1  |   0   |  0  
B  |  02  |   1  |   1   |  1  
B  |  04  |   0  |   1   |  1  
B  |  07  |   0  |   0   |  1  
B  |  10  |   1  |   0   |  0  
B  |  11  |   1  |   0   |  0  
B  |  12  |   1  |   0   |  0  

except with multiple IDs. The table is sorted by ID and Time. After calculating the total count of ones in the Main column (call it tot), I am trying to calculate 2 things:

  1. The total count of ones in the Main column only if lag_1 has been equal to 1 at some time before Main became 1, say tot_1; and
  2. The same as 1. but in this case for lag_2, call the variable tot_2

The table of expected calculations would give me that

tot | tot_1 | tot_2
--------------------
 7  |   3   |   6

since tot_1 should be 3 (0 from ID = A + 3 from ID = B), and tot_2 should be 6 (3 from ID = A + 3 from ID = B).

I am a complete beginner in these types of segmentations so any help is greatly appreciated.

Edit: I would expect that tot_2 >= tot_1 because lag_2 is built on events from Main which go longer back in time than lag_1 does.


Much easier to do in a data step. That way you can check for start of new id and reset the flag for whether the lag_x variables were ever true.

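data want ;
  set have end=eof;
  by id time ;
  tot + main ;                      /* running count of all ones in Main */
  /* new id: forget any lag flags seen for the previous id */
  if first.id then call missing(any_lag_1,any_lag_2);
  /* count Main only if the lag flag was 1 on an earlier row of this id */
  if any_lag_1 then tot_1 + main ;
  if any_lag_2 then tot_2 + main ;
  if eof then output;               /* write a single row of totals at the end */
  /* update the flags after counting, so the current row is excluded */
  any_lag_1+lag_1;
  any_lag_2+lag_2;
  keep tot: ;
run;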
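
For readers who do not use SAS, the same single-pass logic can be sketched in Python. This is only an illustration of the data step above, not a replacement for it; the function name `running_totals` and the tuple layout of `ROWS` are assumptions made for the example, and the rows are the sample data from the question.

```python
# Sample rows from the question: (id, time, main, lag_1, lag_2),
# already sorted by id and time.
ROWS = [
    ("A", 1, 0, 0, 1), ("A", 3, 0, 0, 1), ("A", 4, 0, 0, 0),
    ("A", 10, 1, 0, 0), ("A", 11, 1, 0, 0), ("A", 12, 1, 0, 0),
    ("B", 2, 1, 1, 1), ("B", 4, 0, 1, 1), ("B", 7, 0, 0, 1),
    ("B", 10, 1, 0, 0), ("B", 11, 1, 0, 0), ("B", 12, 1, 0, 0),
]


def running_totals(rows):
    """Hypothetical helper: one pass over rows sorted by (id, time)."""
    tot = tot_1 = tot_2 = 0
    prev_id = None
    seen_1 = seen_2 = False
    for id_, _time, main, lag_1, lag_2 in rows:
        if id_ != prev_id:
            # Equivalent of first.id: reset the "ever seen" flags.
            seen_1 = seen_2 = False
            prev_id = id_
        tot += main
        if seen_1:
            tot_1 += main
        if seen_2:
            tot_2 += main
        # Update after counting, so only earlier rows of the id qualify.
        seen_1 = seen_1 or lag_1 == 1
        seen_2 = seen_2 or lag_2 == 1
    return tot, tot_1, tot_2


print(running_totals(ROWS))  # (7, 3, 6)
```

Updating the flags after the counting step is what makes the first row of B (where main and lag_1 are both 1) not count toward tot_1, while the later ones of B do.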



If I understand correctly, you want these sums per id. The key is comparing, within each id, the earliest time at which each flag equals 1, and then doing the sums. This is all conditional aggregation:

select sum(tot) as tot,
       sum(case when time_lag_1 < time_main then tot else 0 end) as tot_1,
       sum(case when time_lag_2 < time_main then tot else 0 end) as tot_2
from (select id, sum(main) as tot,
             min(case when main = 1 then time end) as time_main,
             min(case when lag_1 = 1 then time end) as time_lag_1,
             min(case when lag_2 = 1 then time end) as time_lag_2
      from t 
      group by id
     ) t;

Note that this counts all of an id's ones, and only when the lag flag first equals 1 strictly before that id's first main = 1.
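
To see what a per-id comparison of earliest times does on the sample data, here is a small Python sketch. It assumes the comparison is meant to be between the first time each flag equals 1 within an id (an interpretation of the query above, not something the answer states outright); `first_time_totals` and `ROWS` are names invented for the example.

```python
from collections import defaultdict

# Sample rows from the question: (id, time, main, lag_1, lag_2).
ROWS = [
    ("A", 1, 0, 0, 1), ("A", 3, 0, 0, 1), ("A", 4, 0, 0, 0),
    ("A", 10, 1, 0, 0), ("A", 11, 1, 0, 0), ("A", 12, 1, 0, 0),
    ("B", 2, 1, 1, 1), ("B", 4, 0, 1, 1), ("B", 7, 0, 0, 1),
    ("B", 10, 1, 0, 0), ("B", 11, 1, 0, 0), ("B", 12, 1, 0, 0),
]


def first_time_totals(rows):
    """Hypothetical helper mirroring the grouped query, assuming times are compared."""
    groups = defaultdict(
        lambda: {"tot": 0, "main": None, "lag_1": None, "lag_2": None}
    )
    for id_, time, main, lag_1, lag_2 in rows:
        g = groups[id_]
        g["tot"] += main
        # Track the earliest time at which each flag equals 1.
        for name, flag in (("main", main), ("lag_1", lag_1), ("lag_2", lag_2)):
            if flag == 1 and (g[name] is None or time < g[name]):
                g[name] = time

    def before(a, b):
        # Like SQL: a comparison involving a missing value is not true.
        return a is not None and b is not None and a < b

    tot = sum(g["tot"] for g in groups.values())
    tot_1 = sum(g["tot"] for g in groups.values() if before(g["lag_1"], g["main"]))
    tot_2 = sum(g["tot"] for g in groups.values() if before(g["lag_2"], g["main"]))
    return tot, tot_1, tot_2


print(first_time_totals(ROWS))  # (7, 0, 3)
```

On the sample data this gives 7, 0 and 3 rather than the expected 7, 3 and 6: id B has main = 1 on its very first row (time 02), so no lag flag can precede it and none of B's ones are counted.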



Consider the computation for tot_1 and tot_2

My first step is to look for rows where lag_1 > main, that is, lag_1 = 1 while main is still 0. This captures the case you mentioned: records where lag_1 = 1 at some time before main = 1. I label such rows 'grp_lag_1', and do the same for lag_2 with the label 'grp_lag_2'.

Once the rows are labeled, I "copy" the label down to the later rows of the same id using max() over(partition by id order by id,time1).

select *
      ,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1 
      ,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2 
  from t

So I get the following result:

+----+-------+------+-------+-------+-----------+-----------+
| id | time1 | main | lag_1 | lag_2 |   grp_1   |   grp_2   |
+----+-------+------+-------+-------+-----------+-----------+
| A  |    01 |    0 |     0 |     1 |           | grp_lag_2 |
| A  |    03 |    0 |     0 |     1 |           | grp_lag_2 |
| A  |    04 |    0 |     0 |     0 |           | grp_lag_2 |
| A  |    10 |    1 |     0 |     0 |           | grp_lag_2 |
| A  |    11 |    1 |     0 |     0 |           | grp_lag_2 |
| A  |    12 |    1 |     0 |     0 |           | grp_lag_2 |
| B  |    02 |    1 |     1 |     1 |           |           |
| B  |    04 |    0 |     1 |     1 | grp_lag_1 | grp_lag_2 |
| B  |    07 |    0 |     0 |     1 | grp_lag_1 | grp_lag_2 |
| B  |    10 |    1 |     0 |     0 | grp_lag_1 | grp_lag_2 |
| B  |    11 |    1 |     0 |     0 | grp_lag_1 | grp_lag_2 |
| B  |    12 |    1 |     0 |     0 | grp_lag_1 | grp_lag_2 |
+----+-------+------+-------+-------+-----------+-----------+

After this, summing the main values where grp_1 = 'grp_lag_1' gives tot_1, and likewise summing where grp_2 = 'grp_lag_2' gives tot_2.

 select sum(main) as tot_cnt
       ,sum(case when grp_1='grp_lag_1' then main end) as tot_1
       ,sum(case when grp_2='grp_lag_2' then main end) as tot_2
 from(      
select *
      ,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1 
      ,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2 
  from t
  )x


+---------+-------+-------+
| tot_cnt | tot_1 | tot_2 |
+---------+-------+-------+
|       7 |     3 |     6 |
+---------+-------+-------+

Demo https://dbfiddle.uk/?rdbms=sqlserver_2012&fiddle=c17be111dbc3c516afa2bc3dcd3c9e1c
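
The running max() over each id behaves like a flag that switches on at the first row where the lag column exceeds main and stays on for the rest of that id. A minimal Python sketch of that idea, using the sample rows from the question (the names `window_totals` and `ROWS` are made up for the illustration):

```python
# Sample rows from the question: (id, time1, main, lag_1, lag_2),
# sorted by id and time1.
ROWS = [
    ("A", 1, 0, 0, 1), ("A", 3, 0, 0, 1), ("A", 4, 0, 0, 0),
    ("A", 10, 1, 0, 0), ("A", 11, 1, 0, 0), ("A", 12, 1, 0, 0),
    ("B", 2, 1, 1, 1), ("B", 4, 0, 1, 1), ("B", 7, 0, 0, 1),
    ("B", 10, 1, 0, 0), ("B", 11, 1, 0, 0), ("B", 12, 1, 0, 0),
]


def window_totals(rows):
    """Hypothetical helper imitating the cumulative max() per id."""
    tot = tot_1 = tot_2 = 0
    prev_id = None
    grp_1 = grp_2 = False
    for id_, _time1, main, lag_1, lag_2 in rows:
        if id_ != prev_id:
            # New partition: the running max starts over.
            grp_1 = grp_2 = False
            prev_id = id_
        # The window includes the current row, so update before summing.
        grp_1 = grp_1 or lag_1 > main
        grp_2 = grp_2 or lag_2 > main
        tot += main
        if grp_1:
            tot_1 += main
        if grp_2:
            tot_2 += main
    return tot, tot_1, tot_2


print(window_totals(ROWS))  # (7, 3, 6)
```

One consequence of testing lag > main: a row where the lag column and main are both 1 never switches the flag on. For an id whose only lag_1 = 1 falls on such a row, for example `window_totals([("C", 1, 1, 1, 1), ("C", 2, 1, 0, 0)])`, tot_1 stays 0 even though lag_1 was 1 before the second main = 1.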
