python - pandas: calculate rolling_std of top N DataFrame rows


I have a DataFrame like this:

    date       a
    2015.1.1  10
    2015.1.2  20
    2015.1.3  30
    2015.1.4  40
    2015.1.5  50
    2015.1.6  60

I need to calculate the std of the top n rows, such as:

    date       a  std
    2015.1.1  10  std(10)
    2015.1.2  20  std(10,20)
    2015.1.3  30  std(10,20,30)
    2015.1.4  40  std(10,20,30,40)
    2015.1.5  50  std(10,20,30,40,50)
    2015.1.6  60  std(10,20,30,40,50,60)

pd.rolling_std can be used for this; however, how do I change n dynamically?

df[['a']].apply(lambda x:pd.rolling_std(x,n)) 

    <class 'pandas.core.frame.DataFrame'>
    Index: 75 entries, 2015-04-16 to 2015-07-31
    Data columns (total 4 columns):
       75 non-null float64
    dtypes: float64(4)
    memory usage: 2.9+ KB

It can be done by calling apply on the df like so:

    In [29]:
    def func(x):
        return df.iloc[:x.name + 1][x.index].std()

    df['std'] = df[['a']].apply(func, axis=1)
    df

    Out[29]:
           date   a        std
    0  2015.1.1  10        NaN
    1  2015.1.2  20   7.071068
    2  2015.1.3  30  10.000000
    3  2015.1.4  40  12.909944
    4  2015.1.5  50  15.811388
    5  2015.1.6  60  18.708287

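This uses double subscripts [[]] to call apply on a df with a single column, which allows you to pass the param axis=1 so that the function is called row-wise. Inside the function you have access to the row's index label through the name attribute and to the column name through the index attribute, which allows you to slice the df and calculate the rolling std.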
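For the cumulative case asked about in the question, an expanding window is a shorter alternative to the row-wise apply. This is a different technique from the answer's code, offered only as a minimal sketch: it assumes a pandas version that provides Series.expanding (0.18 or later) and that the value column is named a, as in df[['a']]. The sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample data taken from the question; the column name "a" is assumed
# from the df[['a']] call in the original code.
df = pd.DataFrame(
    {
        "date": ["2015.1.1", "2015.1.2", "2015.1.3",
                 "2015.1.4", "2015.1.5", "2015.1.6"],
        "a": [10, 20, 30, 40, 50, 60],
    }
)

# expanding() grows the window by one row at each step, so row i gets
# the sample std (ddof=1) of rows 0..i. The first row is NaN because a
# sample std needs at least two values.
df["std"] = df["a"].expanding().std()
print(df)
```

Because the window is defined by position rather than by label, this should behave the same whether the index is an integer range or the date strings.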

You can add a window arg to func to modify the window as desired.
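One possible reading of that suggestion, as a sketch only: give func a hypothetical window parameter and pass it through the args option of apply. The parameter name, its meaning (None keeps every row from the top, an integer keeps only the last that many rows) and the column names std_all and std_3 are assumptions for illustration. The sketch also assumes the default integer index, so that x.name is the row position.

```python
import pandas as pd

# Sample data taken from the question; the column name "a" is assumed.
df = pd.DataFrame(
    {
        "date": ["2015.1.1", "2015.1.2", "2015.1.3",
                 "2015.1.4", "2015.1.5", "2015.1.6"],
        "a": [10, 20, 30, 40, 50, 60],
    }
)

def func(x, window=None):
    # x is one row; with the default integer index, x.name is its position.
    stop = x.name + 1
    # window=None uses all rows from the top; otherwise only the last
    # `window` rows ending at the current row are used.
    start = 0 if window is None else max(0, stop - window)
    # Returning a scalar makes apply(axis=1) produce a Series.
    return df["a"].iloc[start:stop].std()

df["std_all"] = df[["a"]].apply(func, axis=1)           # top-N std, N grows per row
df["std_3"] = df[["a"]].apply(func, axis=1, args=(3,))  # at most the last 3 rows
print(df)
```

Here the window size is just an ordinary argument, so it can be changed per call without touching the function body.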

Edit

It looks like your index is a str, so the following should work:

    In [39]:
    def func(x):
        # .loc slices by label and includes the end label
        # (the original .ix indexer was removed in pandas 1.0)
        return df.loc[:x.name][x.index].std()

    df['std'] = df[['a']].apply(func, axis=1)
    df

    Out[39]:
               a        std
    date
    2015.1.1  10        NaN
    2015.1.2  20   7.071068
    2015.1.3  30  10.000000
    2015.1.4  40  12.909944
    2015.1.5  50  15.811388
    2015.1.6  60  18.708287
