Defining Amazon EMR connection parameters with Spark Universal
When you run Spark Jobs on a YARN cluster using an Amazon EMR distribution, you must distribute the libraries manually, because Amazon EMR does not use the same classpath on the main and subordinate nodes.
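To illustrate what distributing libraries manually can look like outside of the Studio, here is a minimal sketch that assembles a spark-submit command shipping every JAR from a local directory with the standard --jars option. The function name build_spark_submit, the directory layout, and the class name in the usage note are hypothetical and are not part of the product configuration described here.

```python
from pathlib import Path


def build_spark_submit(app_jar, lib_dir, main_class):
    """Return a spark-submit argument list that ships every JAR found in lib_dir.

    Hypothetical helper for illustration only; adapt paths and options
    to your own cluster.
    """
    # Collect dependency JARs in sorted order so the command is reproducible.
    jars = sorted(str(p) for p in Path(lib_dir).glob("*.jar"))

    cmd = [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--class", main_class,
    ]
    if jars:
        # --jars takes a comma-separated list of JARs that Spark makes
        # available on the driver and executor classpaths.
        cmd += ["--jars", ",".join(jars)]
    cmd.append(app_jar)
    return cmd
```

For example, build_spark_submit("my_job.jar", "lib", "com.example.MyJob") returns a list that can be passed to subprocess.run on a node where spark-submit is installed. Listing the JARs explicitly avoids relying on a classpath that may differ between nodes.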