https://www.zhihu.com/search?type=content&q=Pandas聚合时间序列数据
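Note: the aggregation below groups rows into fixed bins, so each result covers the period going forward from the bin's start time.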
tmp_group = ori_data.groupby(
    ['cols', pd.Grouper(key='开始', freq=f'{day_delta}D', closed='right')]
).agg({
    'col1': ['sum', 'std'],  # a list keeps the output column order stable; a set does not
    'col2': ['min'],
})
Flatten the multi-level column index
level0 = tmp_group.columns.get_level_values(0)
level1 = tmp_group.columns.get_level_values(1)
tmp_group.columns = level0 + '_' + level1
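An equivalent way to flatten the columns, sketched on a small made-up frame: iterate over the MultiIndex, which yields one (column, function) tuple per column, and join each tuple. The values in the frame are placeholders, not real results.

```python
import pandas as pd

# Hypothetical frame with two-level columns, shaped like the output of .agg with lists.
columns = pd.MultiIndex.from_tuples(
    [('col1', 'sum'), ('col1', 'std'), ('col2', 'min')]
)
tmp_group = pd.DataFrame([[4.0, 1.5, 10], [5.0, 0.5, 30]], columns=columns)

# Each element of a MultiIndex is a tuple of level values; join them with '_'.
tmp_group.columns = ['_'.join(pair) for pair in tmp_group.columns]

print(list(tmp_group.columns))
```

This form also works when there are more than two column levels, since it does not name each level explicitly.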
Notes on rolling
https://www.cnpython.com/qa/505930
import pandas as pd
Location A has two warehouses (A1 and A2), both shipping to B.
df = pd.DataFrame({'1': ['A1', 'A2', 'A1', 'A2', 'A2', 'A1', 'A2'],
                   'num': [1, 2, 1, 3, 4, 2, 1],
                   'time': [pd.Timestamp('20130101 09:00:00'),
                            pd.Timestamp('20130101 09:00:00'),
                            pd.Timestamp('20130301 09:00:00'),
                            pd.Timestamp('20130401 09:00:03'),
                            pd.Timestamp('20130501 09:00:04'),
                            pd.Timestamp('20130701 09:00:05'),
                            pd.Timestamp('20130801 09:00:06')]})
df.set_index('time',inplace = True)
print(df[df['num']==1])
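# Grouping by both '1' and 'time' leaves one row per group here, so this rolling sum just equals num.
print(df.groupby(['1', 'time'])['num'].rolling(2, closed='both', min_periods=1).sum())
# Per warehouse, sum num over the 31 days up to each row; closed='both' also counts a row exactly 31 days earlier.
print(df.groupby(['1']).rolling('31D', closed='both', min_periods=1).agg({'num': 'sum'}))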
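To see what the closed argument changes in a time-based window, here is a small sketch on a made-up daily series (the values are arbitrary). With a '2D' window ending at time t, closed='right' covers (t - 2D, t], while closed='both' covers [t - 2D, t].

```python
import pandas as pd

# Hypothetical daily series, one value per day.
s = pd.Series(
    [1, 2, 4],
    index=pd.to_datetime(['2013-01-01', '2013-01-02', '2013-01-03']),
)

right = s.rolling('2D', closed='right').sum()  # left edge excluded
both = s.rolling('2D', closed='both').sum()    # left edge included

print(right.tolist())  # [1.0, 3.0, 6.0]
print(both.tolist())   # [1.0, 3.0, 7.0]
```

The only difference is on the last row: the value dated exactly two days earlier (2013-01-01) is counted with closed='both' and dropped with closed='right'.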
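Below, each time point is aggregated over a window rolling backward from it. Note that the time column must be set as the index first.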
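ori_data.set_index('开始', inplace=True)
ori_data.sort_index(inplace=True)  # time-based rolling needs a monotonic datetime index
tmp_group = ori_data[ori_data['周期'] == per].groupby(['cols']).rolling(
    f'{days}D', closed='right', min_periods=1
).agg({
    'col1': ['sum', 'std'],  # lists keep the output column order stable
    'col2': ['min'],
})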
level0 = tmp_group.columns.get_level_values(0)
level1 = tmp_group.columns.get_level_values(1)
tmp_group.columns = level0 + '_' + level1
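A self-contained sketch of the backward rolling aggregation with flattened columns, on made-up data. The sample rows and days = 3 are assumptions; the '周期' filter is left out to keep the example short.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the notes.
ori_data = pd.DataFrame({
    'cols': ['a', 'a', 'a', 'b', 'b'],
    '开始': pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-05',
                           '2023-01-01', '2023-01-03']),
    'col1': [1.0, 2.0, 4.0, 10.0, 20.0],
    'col2': [5, 3, 8, 7, 6],
})
days = 3  # assumed window length in days

rolled = (
    ori_data.set_index('开始')
    .sort_index()  # keep the datetime index monotonic before rolling
    .groupby('cols')
    .rolling(f'{days}D', closed='right', min_periods=1)
    .agg({'col1': ['sum'], 'col2': ['min']})
)
# Flatten ('col1', 'sum') style columns to 'col1_sum'.
rolled.columns = ['_'.join(pair) for pair in rolled.columns]

print(rolled)
```

Unlike the binned version, this returns one output row per input row: each row of group a gets the sum of col1 over the 3 days ending at that row, so the row on 2023-01-05 only sees itself.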