Pandas - Rolling slope calculation

How can I calculate the slope of each column over a rolling window of 60 rows (rolling(window=60)), stepping by 5 rows?

I only need a value every 5 minutes; I don't need a result for every record.

Here's a sample dataframe and the desired result:

df
Time                A    ...      N
2016-01-01 00:00  1.2    ...    4.2
2016-01-01 00:01  1.2    ...    4.0
2016-01-01 00:02  1.2    ...    4.5
2016-01-01 00:03  1.5    ...    4.2
2016-01-01 00:04  1.1    ...    4.6
2016-01-01 00:05  1.6    ...    4.1
2016-01-01 00:06  1.7    ...    4.3
2016-01-01 00:07  1.8    ...    4.5
2016-01-01 00:08  1.1    ...    4.1
2016-01-01 00:09  1.5    ...    4.1
2016-01-01 00:10  1.6    ...    4.1
....

result
Time                A    ...      N
2016-01-01 00:04  xxx    ...    xxx
2016-01-01 00:09  xxx    ...    xxx
2016-01-01 00:14  xxx    ...    xxx
...

Can df.rolling be applied to this problem?

It's fine if a window contains NaN, meaning the usable subset could have fewer than 60 values.


Try this:

import numpy as np
# Roll over column "A" directly; "A_slope" is just an example name for the output column.
windows = df["A"].rolling(60)
df["A_slope"] = windows.apply(lambda x: np.polyfit(range(60), x, 1)[0], raw=True)



It seems that what you want is rolling with a specific step size. At the time of writing (pandas 1.1), the documentation showed no step size support in rolling; a step parameter was added later, in pandas 1.5.

If the data is not too large, just perform the rolling calculation on all rows and select the results you need by indexing.

Here's a sample dataset. For simplicity, the time column is represented using integers.

import numpy as np
import pandas as pd

data = pd.DataFrame(np.random.rand(500, 1) * 10, columns=['a'])
            a
0    8.714074
1    0.985467
2    9.101299
3    4.598044
4    4.193559
..        ...
495  9.736984
496  2.447377
497  5.209420
498  2.698441
499  3.438271

Then, roll and calculate slopes,

def calc_slope(x):
    slope = np.polyfit(range(len(x)), x, 1)[0]
    return slope

# set min_periods=2 to allow subsets less than 60.
# use [4::5] to select the results you need.
result = data.rolling(60, min_periods=2).apply(calc_slope)[4::5]

The result will be,

            a
4   -0.542845
9    0.084953
14   0.155297
19  -0.048813
24  -0.011947
..        ...
479 -0.004792
484 -0.003714
489  0.022448
494  0.037301
499  0.027189
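Since the question allows NaN inside a window, here is a minimal sketch of a slope function that drops NaN before fitting. The name nan_safe_slope and the small two-column test frame are made up for illustration; the window is shortened to 6 rows only to keep the example small.

```python
import numpy as np
import pandas as pd

def nan_safe_slope(y):
    """Least-squares slope of y against its row positions, ignoring NaN."""
    y = np.asarray(y, dtype=float)
    x = np.arange(len(y))
    mask = ~np.isnan(y)
    if mask.sum() < 2:
        # A line needs at least two valid points.
        return np.nan
    return np.polyfit(x[mask], y[mask], 1)[0]

# Hypothetical data: two perfectly linear columns with one missing value.
data = pd.DataFrame({
    "a": np.arange(20, dtype=float) * 2.0,
    "n": np.arange(20, dtype=float) * -1.0,
})
data.loc[3, "a"] = np.nan

# raw=True passes each window as a NumPy array; [4::5] keeps every fifth row.
result = data.rolling(6, min_periods=2).apply(nan_safe_slope, raw=True)[4::5]
print(result)
```

Because rolling is called on the whole DataFrame, every column gets its own slope, which is what the question asks for.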

Alternatively, see the post "step size in pandas.DataFrame.rolling". Its first answer provides a NumPy way to achieve this.
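For reference, a NumPy-only version could look roughly like the sketch below. This is not the code from the linked answer; it assumes NumPy 1.20 or newer (for sliding_window_view), full windows only, and no NaN in the data. The function name strided_slopes is made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def strided_slopes(values, window, step):
    """Slope of every `step`-th full window of length `window`."""
    values = np.asarray(values, dtype=float)
    # Shape (n_windows, window); keep only every `step`-th window.
    windows = sliding_window_view(values, window)[::step]
    # Centered x positions: slope = sum(x_c * y) / sum(x_c ** 2),
    # which is the least-squares slope because x_c sums to zero.
    x_centered = np.arange(window) - (window - 1) / 2.0
    return windows @ x_centered / (x_centered @ x_centered)

values = 3.0 * np.arange(100) + 1.0   # hypothetical series with slope 3
slopes = strided_slopes(values, window=60, step=5)
print(slopes)
```

Window i here covers rows i to i + window - 1, so the first slope belongs to row window - 1, not to row 4 as in the rolling version with min_periods=2.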



You could use pandas resample. Note that this requires an index with time values, and that it only selects the rows at the 5-minute marks; it does not compute a slope.

df.index = pd.to_datetime(df.Time)
print(df)
result = df.resample('5min').bfill()
print(result)
                                 Time    A    N
Time                                           
2016-01-01 00:00:00  2016-01-01 00:00  1.2  4.2
2016-01-01 00:01:00  2016-01-01 00:01  1.2  4.0
2016-01-01 00:02:00  2016-01-01 00:02  1.2  4.5
2016-01-01 00:03:00  2016-01-01 00:03  1.5  4.2
2016-01-01 00:04:00  2016-01-01 00:04  1.1  4.6
2016-01-01 00:05:00  2016-01-01 00:05  1.6  4.1
2016-01-01 00:06:00  2016-01-01 00:06  1.7  4.3
2016-01-01 00:07:00  2016-01-01 00:07  1.8  4.5
2016-01-01 00:08:00  2016-01-01 00:08  1.1  4.1
2016-01-01 00:09:00  2016-01-01 00:09  1.5  4.1
2016-01-01 00:10:00  2016-01-01 00:10  1.6  4.1
2016-01-01 00:15:00  2016-01-01 00:15  1.6  4.1

Output

                                 Time    A    N
Time                                           
2016-01-01 00:00:00  2016-01-01 00:00  1.2  4.2
2016-01-01 00:05:00  2016-01-01 00:05  1.6  4.1
2016-01-01 00:10:00  2016-01-01 00:10  1.6  4.1
2016-01-01 00:15:00  2016-01-01 00:15  1.6  4.1
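The resample call above returns the values at the 5-minute marks rather than slopes. If slopes are what you need, one possible sketch with a time index is to roll over a 60-minute offset window and then keep every fifth row. The sample frame below is made up, and the fit assumes the rows are evenly spaced one minute apart.

```python
import numpy as np
import pandas as pd

def slope(y):
    # Slope against row position; assumes evenly spaced rows in the window.
    return np.polyfit(np.arange(len(y)), y, 1)[0]

# Hypothetical minute-by-minute data with known slopes (1 and 2 per row).
idx = pd.date_range("2016-01-01 00:00", periods=30, freq="min")
df = pd.DataFrame({"A": np.arange(30.0), "N": 2.0 * np.arange(30.0)}, index=idx)

# "60min" is a time-based window; min_periods=2 allows shorter windows at the start.
slopes = df.rolling("60min", min_periods=2).apply(slope, raw=True)

# Keep 00:04, 00:09, 00:14, ... as in the desired result.
result = slopes.iloc[4::5]
print(result)
```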



Hi, sorry to pull this old question up, but I cannot follow the results:

def calc_slope(x):
    slope = np.polyfit(range(len(x)), x, 1)[0]
    return slope

# min_periods=3 gives NaN until a full 3-row window is available.
data['slope'] = data['a'].rolling(3, min_periods=3).apply(calc_slope)

print(data.to_string())

with a result of:

           a     slope
0   6.902663       NaN
1   2.257267       NaN
2   0.172393 -3.365135
3   9.642700  3.692717
4   1.221879  0.524743
5   1.634674 -4.004013
6   8.274599  3.526360
7   9.800035  4.082681
8   4.577713 -1.848443
9   1.368656 -4.215690
10  9.377983  2.400135
11  9.795934  4.213639
12  3.045406 -3.166288
13  6.063934 -1.866000
14  8.202430  2.578512

any ideas?

thx
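The numbers above do look consistent. With three equally spaced points, the least-squares slope reduces to (last - first) / 2, because the middle point has zero weight. A quick check on the first full window from the table (rows 0 to 2), as a sketch:

```python
import numpy as np

# First full window from the table above (rows 0, 1, 2 of column a).
y = np.array([6.902663, 2.257267, 0.172393])

# Slope from the same fit that calc_slope performs.
fitted = np.polyfit(range(len(y)), y, 1)[0]

# Closed form for three equally spaced points.
shortcut = (y[2] - y[0]) / 2

print(fitted, shortcut)
```

Both values agree with the slope shown at row 2. Each slope is per row step and is written on the last row of its window.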




