Defining Spark Universal connection details in the Spark configuration view
Complete the Spark Universal connection configuration in the Spark configuration tab of the Run view of your Job. This configuration is effective on a per-Job basis.
| Mode or environment | Description |
|---|---|
| Cloudera Data Engineering | Talend Studio submits Jobs and collects the execution information of your Job from the Cloudera Data Engineering service. For more information, see Defining Cloudera Data Engineering connection parameters with Spark Universal. |
| Databricks | Talend Studio submits Jobs and collects the execution information of your Job from Databricks. The Spark driver runs either on a job Databricks cluster or on an all-purpose Databricks cluster on GCP, AWS, or Azure. For more information, see Defining Databricks connection parameters with Spark Universal. |
| Dataproc | Talend Studio submits Jobs and collects the execution information of your Job from Dataproc. For more information, see Defining Dataproc connection parameters with Spark Universal. |
| Kubernetes | Talend Studio submits Jobs and collects the execution information of your Job from Kubernetes. The Spark driver runs on the cluster managed by Kubernetes and can run independently from Talend Studio. For more information, see Defining Kubernetes connection parameters with Spark Universal. |
| Local | Talend Studio builds the Spark environment within itself at runtime to run the Job locally in Talend Studio. In this mode, each processor of the local machine is used as a Spark worker to perform the computations. For more information, see Defining Local connection parameters with Spark Universal. |
| Spark-submit scripts | Talend Studio submits Jobs and collects the execution information of your Job from Yarn and the ApplicationMaster of your cluster, typically an HPE Data Fabric cluster. The Spark driver runs on the cluster and can run independently from Talend Studio. For more information, see Defining Spark-submit scripts connection parameters with Spark Universal. |
| Standalone | Talend Studio connects to a Spark-enabled cluster to run the Job from that cluster. For more information, see Defining Standalone connection parameters with Spark Universal. |
| Synapse | Talend Studio submits Jobs and collects the execution information of your Job from Azure Synapse Analytics. For more information, see Defining Azure Synapse Analytics connection parameters with Spark Universal. |
| Yarn cluster | Talend Studio submits Jobs and collects the execution information of your Job from Yarn and the ApplicationMaster. The Spark driver runs on the cluster and can run independently from Talend Studio. For more information, see Defining Yarn cluster connection parameters with Spark Universal. |