This blog post digs into the different cluster manager modes in which you can run your Spark application.
Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program, which is called the driver program. Specifically, to run on a cluster, the SparkContext can connect to several types of cluster managers, which allocate resources across applications. Once the connection is established, Spark acquires executors on the nodes in the cluster; these are the processes that run computations and store data for your application. Next, it sends your application code (defined by JAR or Python files passed to SparkContext) to the executors. Finally, the SparkContext sends tasks to the executors to run.
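To make this concrete, here is a minimal sketch of a driver program in Scala. The object name, app name, and master URL are illustrative assumptions, not part of any specific application; the point is only to show where the driver declares which cluster manager to connect to and how work gets distributed to executors.

```scala
import org.apache.spark.sql.SparkSession

object ClusterModeSketch {
  def main(args: Array[String]): Unit = {
    // The SparkSession wraps the SparkContext, which is the driver's entry point.
    // The master URL tells the driver which cluster manager to contact:
    // "local[*]" for local testing, "yarn", or "spark://host:7077" for standalone.
    val spark = SparkSession.builder()
      .appName("cluster-mode-sketch")   // illustrative name
      .master("local[*]")               // replace with your cluster manager URL
      .getOrCreate()

    val sc = spark.sparkContext

    // A trivial job: the driver splits this computation into tasks,
    // ships them to executors, and collects the result back.
    val evens = sc.parallelize(1 to 1000000).filter(_ % 2 == 0).count()
    println(s"Even numbers: $evens")

    spark.stop()
  }
}
```

In practice the master URL is usually not hard-coded; it is supplied at launch time, for example via `spark-submit --master yarn`, so the same application code can run against different cluster managers.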