Defining the Cloudera Altus connection parameters
Complete the Altus connection configuration in the Spark configuration tab of the Run view of your Job. This configuration is effective on a per-Job basis.
Only the Yarn cluster mode is available for this type of cluster.
The information in this section is only for users who have subscribed to Talend Data Fabric or to any Talend product with Big Data.
Before you begin
Prerequisites:
-
To install the Cloudera Altus CLI on Linux, see Cloudera Altus Client Setup for Linux in the Cloudera documentation.
-
To install the Cloudera Altus CLI on Windows, see Cloudera Altus Client Setup for Windows in the Cloudera documentation.
Procedure
Results
-
After the connection is configured, you can optionally tune Spark performance by following the process explained in:
-
Tuning Spark for Apache Spark Batch Jobs, for Spark Batch Jobs.
-
Tuning Spark for Apache Spark Streaming Jobs, for Spark Streaming Jobs.
-
It is recommended to activate the Spark logging and checkpointing system in the Spark configuration tab of the Run view of your Spark Job. This helps you debug and resume your Spark Job when issues arise.
-
If you need to consult the Altus-related logs, check them in your Cloudera Manager service or on your Altus cluster instances.
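For readers who want to see what Spark logging amounts to at the Spark level, the following is a minimal sketch of the standard Spark event-log properties (spark.eventLog.enabled, spark.eventLog.dir, spark.eventLog.compress). It does not show how the Spark configuration tab works internally. The helper names and the log directory are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: builds the standard Spark event-log properties.
# The function names and the example directory are hypothetical; they are
# not part of Talend Studio or of the Altus CLI.

def event_log_conf(log_dir, compress=False):
    """Return the Spark properties that turn on event logging."""
    conf = {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": log_dir,
    }
    if compress:
        conf["spark.eventLog.compress"] = "true"
    return conf


def to_submit_args(conf):
    """Render properties as spark-submit --conf arguments, sorted by key."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Hypothetical log directory; replace it with a path your cluster can write to.
    for arg in to_submit_args(event_log_conf("hdfs:///tmp/spark-events")):
        print(arg)
```

The directory passed to event_log_conf must exist and be writable by the Job. When the settings are made in the Spark configuration tab, no such code needs to be written by hand.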