python - pandas: calculate rolling_std of the top N DataFrame rows
I have a DataFrame like this:

    date       a
    2015.1.1  10
    2015.1.2  20
    2015.1.3  30
    2015.1.4  40
    2015.1.5  50
    2015.1.6  60

I need to calculate the std of the top n rows, such as:

    date       a   std
    2015.1.1  10   std(10)
    2015.1.2  20   std(10,20)
    2015.1.3  30   std(10,20,30)
    2015.1.4  40   std(10,20,30,40)
    2015.1.5  50   std(10,20,30,40,50)
    2015.1.6  60   std(10,20,30,40,50,60)
pd.rolling_std can be used for this; however, how do I change n dynamically?
df[['a']].apply(lambda x:pd.rolling_std(x,n))
    <class 'pandas.core.frame.DataFrame'>
    Index: 75 entries, 2015-04-16 to 2015-07-31
    Data columns (total 4 columns):
    75 non-null float64
    dtypes: float64(4)
    memory usage: 2.9+ KB
This can be done by calling apply on the df like so:
    In [29]:
    def func(x):
        # x.name is the row label, x.index holds the column name(s)
        return df.iloc[:x.name + 1][x.index].std()

    df['std'] = df[['a']].apply(func, axis=1)
    df

    Out[29]:
           date   a        std
    0  2015.1.1  10        NaN
    1  2015.1.2  20   7.071068
    2  2015.1.3  30  10.000000
    3  2015.1.4  40  12.909944
    4  2015.1.5  50  15.811388
    5  2015.1.6  60  18.708287
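This uses double subscripts [[]] to call apply on a df with a single column, which allows you to pass the param axis=1 so that your function is called row-wise. Inside the function you have access to the row's index label through the attribute name, and to the column name through the attribute index. This allows you to slice the df from the first row up to the current one and calculate the std of that slice.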
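To make the two attributes concrete, here is a minimal sketch that only records what each row exposes during a row-wise apply. The helper name inspect and the list seen are made up for illustration, and the frame is a shortened copy of the one in the question.

```python
import pandas as pd

# Shortened copy of the example frame from the question
df = pd.DataFrame({
    "date": ["2015.1.1", "2015.1.2", "2015.1.3"],
    "a": [10, 20, 30],
})

seen = []

def inspect(row):
    # row.name is the row's index label; row.index holds the column labels
    seen.append((row.name, list(row.index)))
    return row.name

# Double brackets keep a one-column DataFrame, so axis=1 is available
df[["a"]].apply(inspect, axis=1)
print(seen)
```

Each recorded pair holds the row label first and the list of column labels second, which is exactly what the slice in func relies on.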
You can add a window arg to func to modify the window as desired.
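One way the window arg could look is sketched below. This is only an illustration under some assumptions: the parameter name window and its None default are my own choice, the frame uses the default integer index so that the row label equals the row position, and the function returns a scalar rather than a one-element Series.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30, 40, 50, 60]})

def func(row, window=None):
    # Assumes the default integer index, so row.name is the row position
    end = row.name + 1
    # window=None means "all rows so far"; otherwise keep the last `window` rows
    start = 0 if window is None else max(0, end - window)
    col = row.index[0]  # name of the single column passed in
    return df[col].iloc[start:end].std()

# Std over all rows up to and including the current one
expanding_std = df[["a"]].apply(func, axis=1)

# Std over at most the last 3 rows
windowed_std = df[["a"]].apply(lambda row: func(row, window=3), axis=1)
```

With window=None the result matches the output shown above; with a fixed window only the most recent rows contribute to each value.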
EDIT

It looks like your index is a str, so the following should work:
    In [39]:
    def func(x):
        # label-based slice: includes the row labelled x.name
        return df.loc[:x.name][x.index].std()

    df['std'] = df[['a']].apply(lambda x: func(x), axis=1)
    df

    Out[39]:
               a        std
    date
    2015.1.1  10        NaN
    2015.1.2  20   7.071068
    2015.1.3  30  10.000000
    2015.1.4  40  12.909944
    2015.1.5  50  15.811388
    2015.1.6  60  18.708287
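As a cross-check, here is a self-contained sketch of the label-based version next to the built-in expanding window, which pandas provides from version 0.18 on as Series.expanding. It is a sketch rather than the answer's original code: it slices with .loc because .ix is not available in newer pandas releases, and the variable names via_apply and via_expanding are made up for illustration.

```python
import pandas as pd

dates = ["2015.1.1", "2015.1.2", "2015.1.3",
         "2015.1.4", "2015.1.5", "2015.1.6"]
df = pd.DataFrame({"a": [10, 20, 30, 40, 50, 60]},
                  index=pd.Index(dates, name="date"))

def func(row):
    # .loc slices by label and includes the end label row.name
    return df.loc[:row.name, row.index[0]].std()

# Row-wise apply, as in the answer
via_apply = df[["a"]].apply(func, axis=1)

# Built-in expanding window: std over all rows seen so far
via_expanding = df["a"].expanding().std()

df["std"] = via_expanding
print(df)
```

Both approaches use the sample standard deviation, so the first row is NaN in each case. The expanding version avoids re-slicing the frame for every row, which is likely to matter on larger frames.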