SUM of N non-zero preceding rows in SQL table


I have the following table:

DATE, EMPLOYEE_ID, ILL
1.1.2016, 101, 0
1.1.2016, 102, 1
2.1.2016, 101, 1
2.1.2016, 102, 1
3.1.2016, 101, 0
3.1.2016, 102, 0

I need to write SQL that adds a new column containing the number of preceding (ordered by DATE) non-zero values in the ILL column.

And that must be for each employee separately.

The reason is that I need to find out how many days an employee has been ill (1 in the ILL column means absent) before the given date.

Is this even possible to do in SQL?

I am currently trying to alter the query from https://dba.stackexchange.com/questions/181773/sum-of-previous-n-number-of-columns-based-on-some-category but I am not having success yet.

OUTPUT I WANT:

DATE, EMPLOYEE_ID, PREVIOUS
1.1.2016, 101, 0
1.1.2016, 102, 0
2.1.2016, 101, 0
2.1.2016, 102, 1
3.1.2016, 101, 1
3.1.2016, 102, 2
4.1.2016, 101, 0
4.1.2016, 102, 0

This is data prep for my master's thesis. I am using SAP HANA STUDIO.

You would do this by assigning a group number to each group separated by 0s. Then, you would use row_number() within the group.

You can calculate the group number using a cumulative sum. So, the query looks like:

select t.*,
       (case when ill = 1
             then row_number() over (partition by employee_id, grp, ill order by date)
        end) as ill_day_counter
from (select t.*,
             sum(case when ill = 0 then 1 else 0 end) over (partition by employee_id order by date) as grp
      from t
     ) t;


You could use a cumulative (windowing) count:

SELECT date,
       employee_id,
       ill, 
       COUNT(CASE ill WHEN 1 THEN 1 END) OVER 
            (PARTITION BY employee_id
             ORDER BY date ASC
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM   mytable
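As a rough illustration of what this running count returns, the following Python sketch reproduces it for one employee. The helper name `running_ill_count` is hypothetical, and the input is assumed to be that employee's ILL values in date order.

```python
from itertools import accumulate


def running_ill_count(ill_flags):
    """Number of ill = 1 rows up to and including each row."""
    return list(accumulate(1 if ill == 1 else 0 for ill in ill_flags))


print(running_ill_count([1, 1, 0, 1]))
```

Note that the count includes the current row and keeps growing across healthy days, so it gives the total number of ill days so far rather than the length of the current streak.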


A simple self join solves this problem using only SUM() and GROUP BY, without window functions:

select 
    t1.date, 
    t1.EMPLOYEE_ID, 
    t1.ill, 
    coalesce(sum(t2.ill),0) as previous
from testdata as t1
left join testdata as t2
    on t1.EMPLOYEE_ID = t2.EMPLOYEE_ID and
       t1.date > t2.date
group by t1.date, t1.EMPLOYEE_ID, t1.ill
order by t1.date, t1.EMPLOYEE_ID;

An alternative query using a window function gives the same result:

SELECT date,
       employee_id,
       ill, 
       coalesce(
       SUM(ill) OVER 
            (PARTITION BY employee_id
             ORDER BY date ASC
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
       ), 0) as Previous
FROM   testdata
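If you want to try the self-join query without a HANA system, one option is to run it against an in-memory SQLite database from Python. This is just a sketch under a few assumptions: the table is named `testdata` as in the query, the dates are stored as ISO strings (so that string comparison orders them chronologically), and SQLite stands in for HANA.

```python
import sqlite3

# Sample data from the question, with dates rewritten as ISO strings (assumption).
rows = [
    ("2016-01-01", 101, 0), ("2016-01-01", 102, 1),
    ("2016-01-02", 101, 1), ("2016-01-02", 102, 1),
    ("2016-01-03", 101, 0), ("2016-01-03", 102, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testdata (date TEXT, employee_id INTEGER, ill INTEGER)")
conn.executemany("INSERT INTO testdata VALUES (?, ?, ?)", rows)

# Same self join as in the answer: sum ILL over all earlier rows of the employee.
previous = conn.execute("""
    SELECT t1.date, t1.employee_id, t1.ill,
           COALESCE(SUM(t2.ill), 0) AS previous
    FROM testdata AS t1
    LEFT JOIN testdata AS t2
        ON t1.employee_id = t2.employee_id AND t1.date > t2.date
    GROUP BY t1.date, t1.employee_id, t1.ill
    ORDER BY t1.date, t1.employee_id
""").fetchall()

for row in previous:
    print(row)
```

For the sample data the last column comes out as 0, 0, 0, 1, 1, 2, which matches the first three days of the requested output.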

Just a note: as I understand your question, you seem to be asking for the number of consecutive days the employee was ill immediately before that day. Am I correct?


If you want to find the number of consecutive ill days just before the current date for an employee, the following SQL query could be used:

with newdata as (
    select
        *,
        case
            when ill = 1
                 and lag(ill, 1, 0) over (partition by EMPLOYEE_ID order by date) = 0
            then 1
        end as illdays
    from testdata
), empdata as (
    select
        date, EMPLOYEE_ID, ill,
        case
            when ill = 1
                 and lag(ill, 1, 0) over (partition by EMPLOYEE_ID order by date) = 1
            then coalesce(lag(illdays, 1, 0) over (partition by EMPLOYEE_ID order by date), 0) + 1
            else illdays
        end as illdays
    from newdata
)
select
    date, EMPLOYEE_ID, ill,
    coalesce(lag(illdays, 1, 0) over (partition by EMPLOYEE_ID order by date), 0) as previous
from empdata
order by EMPLOYEE_ID, date;

Before going into details and explaining the SQL query, here is the output of the execution of above SqlScript

First of all, instead of sub-select statements I used common table expressions (CTEs), written with WITH clauses in SQLScript.

I used the SQL LAG() function several times to read the previous record in a given order, defined by the PARTITION BY and ORDER BY clauses that follow LAG(). Since LAG() is a standard SQL function, you can use it on different database platforms.

The newdata CTE finds the starting point of each illness period and stores it in the illdays column.

The empdata CTE then updates this value: it identifies illness days that directly follow one another.

The final SELECT reads the illness days of the previous row and prepares the final output.
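As a reference for checking the result, the intended value (consecutive ill days immediately before each row) can be sketched in a few lines of Python. The function name `previous_ill_streak` is hypothetical, and the input is assumed to be one employee's ILL values in date order.

```python
def previous_ill_streak(ill_flags):
    """For each row, the number of consecutive ill days directly before it."""
    result = []
    streak = 0  # length of the ill run ending on the previous row
    for ill in ill_flags:
        result.append(streak)
        # An ill day extends the run; a healthy day resets it to zero.
        streak = streak + 1 if ill == 1 else 0
    return result


print(previous_ill_streak([1, 1, 0, 0]))  # employee 102 plus one extra healthy day
```

It may also be worth testing the query on a run of three or more ill days: as far as I can tell, the second CTE only picks up illdays from one row back, so longer runs might not keep counting.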

I hope it helps

Comments
  • Please tag your question with the database you are using. Also, edit your question and show the results you want.
  • What database are you using? This is important because the leading databases (Oracle, SQL Server, PostgreSQL) have analytic functions that make queries like this very easy.
  • I did not find out why, but this query outputs 2 in the new column for the first day of absence. However, simply subtracting 1 from the new column solves this issue. Thanks a lot!
  • @SgtMarmite . . . That is one fix. The reason is that the group starts on the last 0 record. I added ill to the partition by. That should fix the problem in a saner way.
  • Thank you again. Solved :)
  • This code almost works. It adds up to total sum of absent days (ILL=1), however the counter does not go to 0 when an employee goes to work again (ILL=0). Could there be a fix?