python - pandas: calculate rolling_std of the top N DataFrame rows


I have a DataFrame like this:

    date        a
    2015.1.1   10
    2015.1.2   20
    2015.1.3   30
    2015.1.4   40
    2015.1.5   50
    2015.1.6   60

I need to calculate the std of the top n rows, such as:

    date        a   std
    2015.1.1   10   std(10)
    2015.1.2   20   std(10,20)
    2015.1.3   30   std(10,20,30)
    2015.1.4   40   std(10,20,30,40)
    2015.1.5   50   std(10,20,30,40,50)
    2015.1.6   60   std(10,20,30,40,50,60)

pd.rolling_std can be used for this; however, how do I change n dynamically?

df[['a']].apply(lambda x:pd.rolling_std(x,n)) 

    <class 'pandas.core.frame.DataFrame'>
    Index: 75 entries, 2015-04-16 to 2015-07-31
    Data columns (total 4 columns):
    a    75 non-null float64
    dtypes: float64(4)
    memory usage: 2.9+ KB

This can be done by calling apply on the df like so:

    In [29]:
    def func(x):
        # x.name is the row's integer index label, so this slices rows 0..x.name
        return df.iloc[:x.name + 1][x.index].std()

    df['std'] = df[['a']].apply(func, axis=1)
    df

    Out[29]:
           date   a        std
    0  2015.1.1  10        NaN
    1  2015.1.2  20   7.071068
    2  2015.1.3  30  10.000000
    3  2015.1.4  40  12.909944
    4  2015.1.5  50  15.811388
    5  2015.1.6  60  18.708287

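This uses double subscripts [[]] to call apply on a df with a single column, which allows you to pass the param axis=1 so that the function is called row-wise. Inside the function you then have access to the row's index label, which is the name attribute, and the column name, which is the index attribute. Together these let you slice the df from the top down to the current row and calculate the std over that slice.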
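For comparison, newer pandas versions offer the same top-to-current-row calculation through expanding(). The sketch below is not part of the original answer; it assumes the value column is named a and rebuilds the sample data by hand.

```python
import pandas as pd

# Sample data from the question; the column name 'a' is an assumption.
df = pd.DataFrame({
    "date": ["2015.1.1", "2015.1.2", "2015.1.3",
             "2015.1.4", "2015.1.5", "2015.1.6"],
    "a": [10, 20, 30, 40, 50, 60],
})

# expanding() grows the window from the first row to the current row,
# so row i holds the sample std (ddof=1) of rows 0..i.
df["std"] = df["a"].expanding().std()
print(df)
```

The first row is NaN because a sample standard deviation needs at least two observations.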

You can add a window arg to func to modify the window as desired.
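One way such a window argument might look is sketched below. The parameter name window and the column names std_all and std_3 are hypothetical, and the frame is assumed to have the default integer index so that x.name is a row position.

```python
import pandas as pd

df = pd.DataFrame({"a": [10, 20, 30, 40, 50, 60]})

def func(x, window=None):
    # x.name is the row's position because the index is the default RangeIndex.
    end = x.name + 1
    # window=None keeps every row from the top; otherwise keep the last `window` rows.
    start = 0 if window is None else max(0, end - window)
    return df["a"].iloc[start:end].std()

# Extra keyword arguments given to apply are forwarded to func.
df["std_all"] = df[["a"]].apply(func, axis=1)
df["std_3"] = df[["a"]].apply(func, axis=1, window=3)
print(df)
```

With window=3 the result matches the expanding version for the first three rows and then stays at the std of the three most recent values.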

Edit

It looks like your index is a str, so the following should work:

    In [39]:
    def func(x):
        # label-based slice: the end label x.name is included
        return df.loc[:x.name][x.index].std()

    df['std'] = df[['a']].apply(lambda x: func(x), axis=1)
    df

    Out[39]:
               a        std
    date
    2015.1.1  10        NaN
    2015.1.2  20   7.071068
    2015.1.3  30  10.000000
    2015.1.4  40  12.909944
    2015.1.5  50  15.811388
    2015.1.6  60  18.708287
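Slicing by label differs from slicing by position in one detail that matters here: a label slice includes its end label, which is why the string-index version needs no + 1. A small sketch using .loc, with a three-row frame made up for illustration:

```python
import pandas as pd

dates = ["2015.1.1", "2015.1.2", "2015.1.3"]
df = pd.DataFrame({"a": [10, 20, 30]}, index=pd.Index(dates, name="date"))

# Label-based slice: the end label "2015.1.2" is included.
by_label = df.loc[:"2015.1.2", "a"]

# Position-based slice: the end position is excluded, hence the + 1.
by_position = df.iloc[:1 + 1]["a"]

print(by_label.tolist())
print(by_label.equals(by_position))
```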
