Python - DataFrame read from Excel (all sheets) not recognizing index and columns
I wrote the following code (see 2) below) to extract and manipulate data from an Excel file that looks the following way:
1) Excel file content, in each sheet:
time            f1          f2         f3
41030.00069444  -83.769997  29.430000  29.400000
41030.00138889  -84.209999  28.940001  28.870001
41030.00208333  -84.339996  28.280001  28.320000
2) Code:
import pandas as pd

raw_data = pd.read_excel(r'/users/linnkloster/desktop/results/01_05_2012 raw results.xls', skiprows=1, header=0, nrows=1440, dayfirst=True, infer_datetime_format='%d/%m/%y %h')
raw_data[u'time'] = pd.to_datetime(raw_data['time'], unit='d')
raw_data.set_index(pd.DatetimeIndex(raw_data[u'time']), inplace=True)
print raw_data
ave_data = raw_data.resample('h', how='mean')
I'm running into 2 issues:
i) I need to read in data from all sheets in the Excel file (all of them have the same format as shown above, but different column names). When I add sheetnames=None as an input to pd.read_excel in the first line of code to achieve this, it stops recognizing the column titles and index from the Excel file, making it impossible for me to take averages or manipulate the raw_data DataFrame the way I need (as seen in the last line of code, where the new ave_data DataFrame is created). Can anyone help me develop code to extract data from all sheets in the Excel file, while still recognizing the column headers and index column so that I can manipulate it?
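A minimal sketch of one way to approach issue i), under stated assumptions. In current pandas the keyword is `sheet_name` (older releases spelled it `sheetname`), and passing `None` makes `pd.read_excel` return a dict that maps each sheet name to its own DataFrame instead of returning a single DataFrame. Methods such as `set_index` and `resample` therefore have to be applied to each sheet in turn. The helper names `read_all_sheets` and `hourly_means` are hypothetical, and the time column is assumed to already hold datetimes.

```python
import pandas as pd


def read_all_sheets(path):
    """Read every sheet of a workbook into a {sheet name: DataFrame} dict."""
    # sheet_name=None asks for all sheets; the result is a dict, not a DataFrame.
    # skiprows, header and nrows are copied from the call in the question.
    return pd.read_excel(path, sheet_name=None, skiprows=1, header=0, nrows=1440)


def hourly_means(sheets, time_col="time"):
    """Return {sheet name: DataFrame of hourly means} for a dict of DataFrames."""
    result = {}
    for name, frame in sheets.items():
        # Assumption: time_col already contains datetime values, so using it
        # as the index yields a DatetimeIndex that resample() can work with.
        indexed = frame.set_index(time_col)
        result[name] = indexed.resample("h").mean()
    return result
```

With these helpers, `hourly_means(read_all_sheets(path))` would give one averaged DataFrame per sheet, each keeping that sheet's own column names.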
ii) raw_data outputs the following:
raw_data:
time                           f1          f2         f3
2082-05-03 00:00:59.961599999  -83.769997  29.430000  29.400000
2082-05-03 00:02:00.009600000  -84.209999  28.940001  28.870001
2082-05-03 00:02:59.971200000  -84.339996  28.280001  28.320000
The date displayed here is wrong - it should be 2012-05-01 - but the hour is correct. Does anyone know how I can change the code to correct this?
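A likely explanation for issue ii), offered as a sketch rather than a confirmed diagnosis: the values in the time column look like Excel serial day numbers, which count days from 1899-12-30 in Excel's 1900 date system, whereas `pd.to_datetime(..., unit='d')` counts days from 1970-01-01 by default. Shifting the origin maps 41030 to 2012-05-01, the date expected above. The function name `excel_serial_to_datetime` is hypothetical, and the example uses only the standard library to show the arithmetic.

```python
from datetime import datetime, timedelta

# Day 0 of Excel's 1900 date system. Assumption: the workbook uses this
# system, which is consistent with 41030 being 2012-05-01.
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial day number to a datetime, rounded to the second."""
    moment = EXCEL_EPOCH + timedelta(days=serial)
    # Fractions of a day are stored inexactly (0.00069444 days is just under
    # 60 seconds), so round to the nearest whole second.
    return (moment + timedelta(microseconds=500_000)).replace(microsecond=0)


first_row = excel_serial_to_datetime(41030.00069444)
```

In pandas 0.20 or later the same shift can be written as `pd.to_datetime(raw_data['time'], unit='D', origin='1899-12-30')`.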
Thank you in advance.