python - Analyze a text file in parallel with mpi4py


I have a tab-separated input text file:

    0   .4
    1   .9
    2   .2
    3   .12
    4   .55
    5   .98

I analyze it in plain Python like this:

    lines = open("songs.tsv").readlines()

    def extract_hotness(line):
        return float(line.split()[1])

    songs_hotness = map(extract_hotness, lines)
    max_hotness = max(songs_hotness)

How can I perform the same operation in parallel using mpi4py? I started implementing it with scatter, but that won't work straight away, because scatter needs a list whose length equals the number of nodes (processes).
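One way around the length requirement, sketched here under some assumptions, is to regroup the lines on rank 0 into exactly as many chunks as there are processes before calling `comm.scatter`. The helper names `split_into_chunks` and `parallel_max_hotness` are hypothetical, and `comm` is assumed to be an mpi4py communicator such as `MPI.COMM_WORLD`. The local maxima are combined with `comm.allgather`; `comm.allreduce(local_max, op=MPI.MAX)` would be a common alternative.

```python
def extract_hotness(line):
    """Return the second whitespace-separated field of a line as a float."""
    return float(line.split()[1])


def split_into_chunks(items, n_chunks):
    """Split items into exactly n_chunks round-robin slices (some may be empty)."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def parallel_max_hotness(comm, path):
    """Scatter the lines of `path` from rank 0 and return the global maximum.

    `comm` is assumed to be an mpi4py communicator (e.g. MPI.COMM_WORLD).
    """
    rank = comm.Get_rank()
    size = comm.Get_size()

    chunks = None
    if rank == 0:
        with open(path) as f:
            lines = [line for line in f if line.strip()]  # skip blank lines
        # scatter needs one list element per process, so regroup the lines.
        chunks = split_into_chunks(lines, size)

    my_lines = comm.scatter(chunks, root=0)

    # A rank may receive an empty chunk when there are fewer lines than ranks.
    local_max = max(map(extract_hotness, my_lines), default=float("-inf"))

    # Every rank collects all local maxima and reduces them locally.
    return max(comm.allgather(local_max))
```

Assuming this is saved in a script that also runs `from mpi4py import MPI` and calls `parallel_max_hotness(MPI.COMM_WORLD, "songs.tsv")`, it could be launched with something like `mpiexec -n 4 python script.py`.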

Processing a text file in parallel is difficult. Where do you split the file? Are you even reading from a parallel file system? You might consider MPI-IO if you have a large enough input file. If you go that route, these answers, written in a C context, describe challenges that still hold in mpi4py: https://stackoverflow.com/a/31726730/1024740 , https://stackoverflow.com/a/12942718/1024740
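The "where do you split the file" problem can be illustrated with a small sketch. It divides the file into equal byte ranges and assigns each line to the rank whose range contains the line's first byte, so no line is cut in half or counted twice. Ordinary `seek`/`readline` calls stand in for MPI-IO reads here; this is not an MPI-IO implementation, and the function name `read_owned_lines` is hypothetical. In a real program, `rank` and `size` would come from `comm.Get_rank()` and `comm.Get_size()`.

```python
import os


def read_owned_lines(path, rank, size):
    """Return the lines whose first byte falls in this rank's byte range."""
    file_size = os.path.getsize(path)
    start = rank * file_size // size
    end = (rank + 1) * file_size // size

    owned = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard up to the next newline.
            # If `start` is already a line start, this consumes only the
            # preceding "\n"; otherwise it skips the partial line, which
            # belongs to an earlier rank.
            f.seek(start - 1)
            f.readline()
        # Read whole lines as long as they begin before `end`; the last
        # one may extend past `end`, and the next rank will skip it.
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            owned.append(line.decode())
    return owned
```

Each rank can then compute a local maximum over its own lines and combine the results with a reduction. Whether this pays off depends on the file system; on a non-parallel file system, concurrent reads may not be faster than reading once on rank 0.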

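Another approach is not to scatter the data at all, but to read it in on rank 0 and broadcast it to everyone else. This approach requires enough memory to stage the input data all at once; otherwise you need a master-worker scheme in which only part of the data is read in one shot.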
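A minimal sketch of the broadcast variant, assuming the whole file fits in memory on every rank: rank 0 reads the lines, `comm.bcast` sends them to all processes, and each rank then processes only every `size`-th line. The function name `broadcast_max_hotness` is hypothetical, and `comm` is assumed to be an mpi4py communicator such as `MPI.COMM_WORLD`.

```python
def extract_hotness(line):
    """Return the second whitespace-separated field of a line as a float."""
    return float(line.split()[1])


def broadcast_max_hotness(comm, path):
    """Read `path` on rank 0, broadcast it, and return the global maximum."""
    rank = comm.Get_rank()
    size = comm.Get_size()

    lines = None
    if rank == 0:
        with open(path) as f:
            lines = [line for line in f.read().splitlines() if line.strip()]

    # Every rank receives the full list of lines (memory cost: one copy per rank).
    lines = comm.bcast(lines, root=0)

    # Each rank works on its own round-robin slice of the shared data.
    my_lines = lines[rank::size]
    local_max = max(map(extract_hotness, my_lines), default=float("-inf"))

    # Combine the per-rank maxima on every rank.
    return max(comm.allgather(local_max))
```

Compared with scattering, this sends more data than each rank strictly needs, but it avoids having to build one chunk per process on rank 0.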

