mapreduce - Spark: How to design streaming with Spark?
Current architecture
A MySQL database behind a REST API abstraction.
The problem
MySQL is not scaling, for various reasons, including a data model design that is hard to fix.
Proposed architecture
Use Cassandra as the NoSQL backend, with Spark as the in-memory computation engine along with Spark Streaming.
Questions
- How good is Cassandra's consistency? The idea is to have Kafka streams carry the real-time information directly into Cassandra, and then use Spark SQL to query the data (a sketch of this path follows the list).
- If the consistency above is good enough, how are RDDs designed around this, given that they are immutable? Do we keep creating new RDDs?
- An alternative design: migrate the data from MySQL to Cassandra, use Kafka to send messages directly to Spark, which handles them in real time, and have downstream systems hand the data on to Cassandra in time (see the second sketch, after the comparison below).
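For the first two points, here is a minimal sketch of what querying Cassandra from Spark SQL could look like, assuming the DataStax spark-cassandra-connector is on the classpath; the host, keyspace, table and column names are made up for illustration. Note that each query simply materialises a new, immutable DataFrame (backed by new RDDs), so nothing has to be mutated in place:

```scala
import org.apache.spark.sql.SparkSession

object CassandraSqlSketch {
  def main(args: Array[String]): Unit = {
    // Spark session pointed at the Cassandra cluster (host is a placeholder).
    val spark = SparkSession.builder()
      .appName("cassandra-sql-sketch")
      .config("spark.cassandra.connection.host", "cassandra-host") // hypothetical host
      .getOrCreate()

    // Load a hypothetical keyspace/table that the Kafka consumers write into directly.
    val events = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "app", "table" -> "events")) // hypothetical names
      .load()

    // Spark SQL over the Cassandra table. Every query produces a new,
    // immutable DataFrame; earlier results are never modified in place.
    events.createOrReplaceTempView("events")
    spark.sql("SELECT user_id, count(*) AS cnt FROM events GROUP BY user_id").show()

    spark.stop()
  }
}
```

Whether the consistency here is "good enough" depends mostly on the Cassandra consistency level the Kafka consumers write with and the level the Spark reads use, not on Spark itself.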
In points 1 and 2 the consistency is dependent upon Cassandra; in point 3 it is tied to Spark.
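For the third point, here is a rough sketch of Kafka feeding Spark Streaming, which then writes down to Cassandra. It assumes the spark-streaming-kafka-0-10 and spark-cassandra-connector artifacts; the broker address, topic, keyspace, table and columns are placeholders. Each micro-batch arrives as a brand-new RDD, which is how the immutability question from point 2 is handled by the framework:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import com.datastax.spark.connector._ // adds saveToCassandra to RDDs

object KafkaToCassandraSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kafka-spark-cassandra-sketch")
      .set("spark.cassandra.connection.host", "cassandra-host") // hypothetical host
    val ssc = new StreamingContext(conf, Seconds(5)) // 5-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafka-host:9092", // hypothetical broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "spark-sketch",
      "auto.offset.reset" -> "latest"
    )

    // Direct stream from a hypothetical topic. Every micro-batch arrives as a
    // new immutable RDD; earlier RDDs are never modified.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    stream.foreachRDD { rdd =>
      // Transformations yield new RDDs; the result is handed on to Cassandra downstream.
      rdd.map(record => (record.key, record.value))
        .saveToCassandra("app", "events", SomeColumns("user_id", "payload")) // hypothetical schema
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```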
Which design is better? Can anyone throw some light on this?