mapreduce - Spark: How to design streaming with Spark?


Current architecture

A MySQL database with a REST API abstraction layer on top of it.

The problem

MySQL is not scaling, for various reasons, including a data model design that is hard to fix.

Proposed architecture

Use Cassandra as the NoSQL backend and Spark as the in-memory computation engine, along with Spark Streaming.
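To make the proposal concrete, here is a minimal sketch of the kind of pipeline I have in mind: Spark Streaming reading from a Kafka topic and persisting each micro-batch into Cassandra. The topic, keyspace, table, hosts, and payload format are all placeholders, and it assumes Spark 1.x with the spark-streaming-kafka (0.8) integration plus the DataStax spark-cassandra-connector.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
import com.datastax.spark.connector.SomeColumns
import com.datastax.spark.connector.streaming._

object EventPipeline {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kafka-to-cassandra")
      .set("spark.cassandra.connection.host", "127.0.0.1") // placeholder Cassandra host

    val ssc = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092") // placeholder broker
    val topics      = Set("events")                                   // placeholder topic

    // Direct stream: every batch interval yields a new (immutable) RDD of records.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Assumes "userId,eventType" payloads; write each micro-batch to Cassandra.
    stream
      .map { case (_, value) =>
        val Array(userId, eventType) = value.split(",", 2)
        (userId, eventType, System.currentTimeMillis())
      }
      .saveToCassandra("analytics", "events",
        SomeColumns("user_id", "event_type", "ts")) // placeholder keyspace/table/columns

    ssc.start()
    ssc.awaitTermination()
  }
}
```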

Questions

  • How good is Cassandra's consistency? The decision here would be to have Kafka streams carry the real-time information directly into Cassandra, and then use Spark SQL to query that data.
  • If the consistency in the option above is good enough, how are the RDDs designed around it, given that they are immutable? Do we keep creating new RDDs? (See the sketch after this list.)
  • An alternative design: migrate the existing data from MySQL to Cassandra, use Kafka to send messages directly to Spark, which handles them in real time, and have downstream systems hand the data over to Cassandra in due time.
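Regarding question 2, this is what I understand by "creating new RDDs": transformations never modify an RDD in place, they always derive new ones, and in Spark Streaming every batch interval hands you a fresh RDD. A tiny non-streaming illustration of that, just to check my understanding:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddImmutabilityDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rdd-immutability-demo").setMaster("local[2]"))

    val base = sc.parallelize(Seq(1, 2, 3, 4, 5))

    // Transformations return new RDDs; `base` itself is never modified.
    val evens   = base.filter(_ % 2 == 0)
    val doubled = evens.map(_ * 2)

    println(base.count())                    // 5  -- original RDD unchanged
    println(doubled.collect().mkString(",")) // "4,8"

    // In Spark Streaming the same principle applies per batch interval:
    // each micro-batch arrives as a fresh RDD, so rather than "updating" an RDD
    // you derive new ones and persist results to an external store such as Cassandra.
    sc.stop()
  }
}
```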

In points 1 and 2, consistency depends on Cassandra, while in point 3 it is tied to Spark.

Which design is better? Can anyone throw some light on this?

