Apache Spark Basics – Accumulators and Broadcast Variables
In my previous post I talked about RDDs as an abstraction for parallel data processing. Today, I’d like to briefly discuss accumulators and broadcast variables and give an example.

Accumulators are counters or sums that can be reliably used in parallel processing. Numeric types are supported natively, and extensions are possible via the API. Workers can modify…