Python - Perform file check based on format of 3 values then perform tasks
All,
I am trying to write a Python script that goes through a crime report file and separates it into the following sections: updates, incidents, and arrests. The reports I receive show these section headings either as I have listed them or as **updates**, **incidents**, or **arrests**. I have started to write the following script to separate the files based on the ** format. However, I am wondering if there is a better way to check the files for both formats at the same time? Also, sometimes there is no updates or arrests section, which causes my code to break. I am wondering if there is a check I can do for this case, and if so, how can I still get the incidents section without the other two?
    with open('crimereport20150518.txt', 'r') as f:
        content = f.read()

    print(content.index('**updates**'))
    print(content.index('**incidents**'))
    print(content.index('**arrests**'))

    updatesline = content.index('**updates**')
    incidentsline = content.index('**incidents**')
    arrestsline = content.index('**arrests**')

    #print content[updatesline:incidentsline]
    updates = content[updatesline:incidentsline]
    #print updates
    incidents = content[incidentsline:arrestsline]
    #print incidents
    arrests = content[arrestsline:]
    print(arrests)
You are using .index() to locate the headings in the text. The documentation states:

    Like find(), but raise ValueError when the substring is not found.

That means you need to catch the exception in order to handle it. For example:
    try:
        updatesline = content.index('**updates**')
        print("found updates heading at {}".format(updatesline))
    except ValueError:
        print("note: no updates")
        updatesline = -1
From here you can determine the correct indexes for slicing the string, based on which sections are present.
Alternatively, use the .find() method referenced in the documentation for .index(). It returns -1 if sub is not found, so with find() you can simply test the value returned.
    updatesline = content.find('**updates**')
    incidentsline = content.find('**incidents**')
    arrestsline = content.find('**arrests**')

    # the following is straightforward, but unwieldy
    if updatesline != -1:
        if incidentsline != -1:
            updates = content[updatesline:incidentsline]
        elif arrestsline != -1:
            updates = content[updatesline:arrestsline]
        else:
            updates = content[updatesline:]
Either way, you'll have to deal with all the combinations of which sections are and are not present in order to determine the correct slice boundaries.
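One way to avoid spelling out every combination by hand is to collect the offsets of whichever headings were found, sort them, and slice from each heading to the next one. The following is only a minimal sketch of that idea: the function name split_sections and the sample text are made up for illustration, and it assumes each heading occurs at most once in the report.

```python
def split_sections(content, headings=('**updates**', '**incidents**', '**arrests**')):
    """Return {heading: text} for each heading that occurs in content."""
    # Keep only the headings that are present, paired with their offsets,
    # and sort them by position in the text.
    found = sorted(
        (content.find(h), h) for h in headings if content.find(h) != -1
    )
    sections = {}
    for i, (start, heading) in enumerate(found):
        # A section ends where the next found heading starts,
        # or at the end of the text if it is the last one.
        end = found[i + 1][0] if i + 1 < len(found) else len(content)
        sections[heading] = content[start:end]
    return sections


# Hypothetical report with no updates section.
sample = "**incidents**\nburglary on main st\n**arrests**\none arrest made\n"
sections = split_sections(sample)
```

A missing section simply has no key in the result, so sections.get('**updates**') returns None rather than raising an exception.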
I would prefer to approach this using a state machine. Read the file line by line and add each line to the appropriate list. When a header is found, update the state. Here is an untested demonstration of the principle:
    data = {
        'updates': [],
        'incidents': [],
        'arrests': [],
    }
    state = None

    with open('crimereport20150518.txt', 'r') as f:
        for line in f:
            # strip the trailing newline before comparing against a heading
            heading = line.strip()
            if heading == '**updates**':
                state = 'updates'
            elif heading == '**incidents**':
                state = 'incidents'
            elif heading == '**arrests**':
                state = 'arrests'
            else:
                if state is None:
                    print("warn: no header seen; skipping line")
                else:
                    data[state].append(line)

    print(''.join(data['arrests']))
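The question also asks about reports where the headings appear without the asterisks. A regular expression can recognise both forms in the same state machine. This is a sketch under some assumptions that the question does not confirm: a heading sits alone on its own line, letter case may vary, and no ordinary content line consists solely of one of the three words. The names HEADER and parse_report are invented for this example.

```python
import re

# Matches a line that holds only a heading, with or without the
# surrounding asterisks, e.g. "updates" or "**updates**".
HEADER = re.compile(
    r'^\s*(?:\*\*)?(updates|incidents|arrests)(?:\*\*)?\s*$',
    re.IGNORECASE,
)


def parse_report(lines):
    """Group lines under the most recently seen heading."""
    data = {'updates': [], 'incidents': [], 'arrests': []}
    state = None
    for line in lines:
        match = HEADER.match(line)
        if match:
            # Switch state; the heading line itself is not stored.
            state = match.group(1).lower()
        elif state is not None:
            data[state].append(line)
        # Lines before the first heading are silently ignored.
    return data


# Hypothetical report mixing both heading formats, with no updates section.
sample = ["Incidents\n", "burglary on main st\n", "**ARRESTS**\n", "one arrest made\n"]
report = parse_report(sample)
```

Because parse_report accepts any iterable of lines, an open file object can be passed in directly. Sections that never appear are left as empty lists, so the incidents are still available when updates or arrests are missing.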