Python - Perform file check based on format of 3 values then perform tasks


Hi all,

I am trying to write a Python script that goes through a crime report file and separates it into the following sections: updates, incidents, and arrests. The reports I receive mark these sections either with the plain names listed above or as **updates**, **incidents**, or **arrests**. I have started writing the script below to split the file based on the ** format. However, I was wondering if there is a better way to check the files for both formats at the same time? Also, sometimes there is no updates or arrests section, which causes my code to break. Is there a check I can do for this case, and if so, how can I still get the incidents section without the other two?

    with open('crimereport20150518.txt', 'r') as f:
        content = f.read()
        print(content.index('**updates**'))
        print(content.index('**incidents**'))
        print(content.index('**arrests**'))
        updatesline = content.index('**updates**')
        incidentsline = content.index('**incidents**')
        arrestsline = content.index('**arrests**')
        #print content[updatesline:incidentsline]
        updates = content[updatesline:incidentsline]
        #print updates
        incidents = content[incidentsline:arrestsline]
        #print incidents
        arrests = content[arrestsline:]
        print(arrests)

You are using .index() to locate the headings in the text. The documentation states:

Like find(), but raise ValueError when the substring is not found.

That means you need to catch the exception in order to handle it. For example:

    try:
        updatesline = content.index('**updates**')
        print("found updates heading at %d" % updatesline)
    except ValueError:
        print("note: no updates")
        updatesline = -1

From here you can determine the correct indexes for slicing the string, based on which sections are present.
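One way to work out those indexes without spelling out every combination is to keep only the headings that were actually found, sort them by position, and slice from each heading to the next. This is a minimal sketch, assuming the heading strings from the question; the function name `split_sections` and the sample text are made up for illustration:

```python
def split_sections(content, headings=('**updates**', '**incidents**', '**arrests**')):
    """Return {heading: text} for every heading present in content."""
    found = []
    for heading in headings:
        try:
            found.append((content.index(heading), heading))
        except ValueError:
            # This heading is missing from the report; skip it.
            continue
    found.sort()  # order the headings by their position in the text

    sections = {}
    for i, (start, heading) in enumerate(found):
        # A section ends where the next found heading starts, or at the end of the text.
        end = found[i + 1][0] if i + 1 < len(found) else len(content)
        sections[heading] = content[start:end]
    return sections


# Hypothetical report with no updates section.
sample = "**incidents**\nburglary on main st\n**arrests**\none arrest made\n"
sections = split_sections(sample)
```

A missing section simply has no key in the result, so `sections.get('**updates**')` returns `None` instead of raising.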


Alternatively, you could use the .find() method referenced in the documentation for .index().

Return -1 if sub is not found.

Using find, you can test the value returned.

    updatesline = content.find('**updates**')
    incidentsline = content.find('**incidents**')
    arrestsline = content.find('**arrests**')

    # The following is straightforward, but unwieldy
    if updatesline != -1:
        if incidentsline != -1:
            updates = content[updatesline:incidentsline]
        elif arrestsline != -1:
            updates = content[updatesline:arrestsline]
        else:
            updates = content[updatesline:]


Either way, you'll have to deal with all the combinations of sections that are and are not present to determine the correct slice boundaries.

I would prefer to approach this using a state machine. Read the file line by line and add each line to the appropriate list. When a header is found, update the state. Here is an untested demonstration of the principle:

    data = {
        'updates': [],
        'incidents': [],
        'arrests': [],
    }

    state = None
    with open('crimereport20150518.txt', 'r') as f:
        for line in f:
            header = line.strip()  # drop the trailing newline before comparing
            if header == '**updates**':
                state = 'updates'
            elif header == '**incidents**':
                state = 'incidents'
            elif header == '**arrests**':
                state = 'arrests'
            else:
                if state is None:
                    print("warn: no header seen; skipping line")
                else:
                    data[state].append(line)

    print(''.join(data['arrests']))
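The same line-by-line loop can also cover the other half of the question: reports whose headings appear either bare or wrapped in asterisks. This is a sketch, assuming a heading always sits on a line by itself and that letter case should be ignored (both are guesses about the report format); `parse_report` and the sample lines are hypothetical:

```python
import re

# Matches a line that holds only a section name, with or without ** markers.
HEADER_RE = re.compile(
    r'^\s*(?:\*\*)?\s*(updates|incidents|arrests)\s*(?:\*\*)?\s*$',
    re.IGNORECASE,
)


def parse_report(lines):
    """Group lines under the most recently seen section header."""
    data = {'updates': [], 'incidents': [], 'arrests': []}
    state = None
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            state = match.group(1).lower()
        elif state is not None:
            data[state].append(line)
        # Lines before the first header are silently skipped.
    return data


# Hypothetical report mixing both heading styles, with no arrests section.
report = ["updates\n", "case 12 closed\n", "**incidents**\n", "theft reported\n"]
parsed = parse_report(report)
```

Because `parse_report` accepts any iterable of lines, an open file object can be passed in directly. A section that is missing from the report is left as an empty list.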
