Defining the HDInsight connection parameters
Complete the HDInsight connection configuration in the Spark configuration tab of the Run view of your Job. This configuration is effective on a per-Job basis.
Only the Yarn cluster mode is available for this type of cluster.
The information in this section is only for users who have subscribed to Talend Data Fabric or to any Talend product with Big Data.
Procedure
Results
- After the connection is configured, you can optionally tune Spark performance by following the process explained in:
  - Tuning Spark for Apache Spark Batch Jobs, for Spark Batch Jobs.
  - Tuning Spark for Apache Spark Streaming Jobs, for Spark Streaming Jobs.
- It is recommended to activate the Spark logging and checkpointing system in the Spark configuration tab of the Run view of your Spark Job, to help debug and resume your Spark Job when issues arise: