tFixedFlowInput properties for Apache Spark Batch
These properties are used to configure tFixedFlowInput running in the Spark Batch Job framework.
The Spark Batch tFixedFlowInput component belongs to the Misc family.
The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
Basic settings
Schema and Edit Schema |
A schema is a row description: it defines the number of fields that will be processed and passed on to the next component. The schema is either built-in or stored remotely in the Repository. Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:
|
|
Built-in: The schema will be created and stored locally for this component only. For more information about a component schema in its Basic settings tab, see Basic settings tab. |
|
Repository: You have already created the schema and stored it in the Repository, so it can be reused in various projects and Job designs. For more information about a component schema in its Basic settings tab, see Basic settings tab. |
Mode |
From the three options, select the mode that you want to use.
Use Single Table: Enter the data that you want to generate in the relevant value field.
Use Inline Table: Add the row(s) that you want to generate.
Use Inline Content: Enter the data that you want to generate, separated by the separators that you have already defined in the Row and Field Separator fields. |
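To make the Use Inline Content mode more concrete, the following sketch shows how a block of text could be split into rows and fields with a row separator and a field separator. It is plain Python rather than Talend or Spark code. The function name `parse_inline_content`, the default separators (`"\n"` and `";"`), the sample data, and the choice to skip empty rows are all assumptions made for illustration, not documented behavior of the component.

```python
def parse_inline_content(content, row_separator="\n", field_separator=";"):
    """Split inline content into rows, then split each row into fields.

    Assumption: empty rows (for example after a trailing row separator)
    are skipped. The real component may treat them differently.
    """
    rows = []
    for line in content.split(row_separator):
        if line == "":
            continue  # ignore empty rows
        rows.append(line.split(field_separator))
    return rows


# Hypothetical inline content: three rows of two fields each.
content = "1;Alice\n2;Bob\n3;Carol"
rows = parse_inline_content(content)
print(rows)
```

Every field comes back as a string here; in a real Job the schema decides the type of each column.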
Number of rows |
Enter the number of rows to be generated. |
Values |
Between inverted commas, enter the values corresponding to the columns you defined in the schema dialog box via the Edit schema button. |
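As a rough illustration of how Number of rows and Values work together in Use Single Table mode, the sketch below repeats one set of column values the requested number of times. It is plain Python, not Talend or Spark code. The function name `generate_rows` and the columns `id` and `name` are hypothetical, and the sketch assumes constant values rather than expressions evaluated for each row.

```python
def generate_rows(values, number_of_rows):
    """Return number_of_rows copies of the same column-to-value mapping."""
    if number_of_rows < 0:
        raise ValueError("number_of_rows must not be negative")
    # dict(values) copies the mapping so that the rows are independent objects.
    return [dict(values) for _ in range(number_of_rows)]


# Hypothetical schema with two columns and one value per column.
values = {"id": 1, "name": "Alice"}
generated = generate_rows(values, 3)
print(generated)
```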
Advanced settings
Set the number of partitions |
Select this check box and then enter the number of partitions into which you want to dispatch the input rows. If you leave this check box cleared, each input row forms its own partition. For example, with 5 in the Number of rows field, each row is handled as one partition, making 5 partitions in total. |
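The effect of this setting can be sketched in plain Python. With no partition count, every row becomes its own partition, as described above. With a partition count, the rows are dispatched into that many groups. The function name `dispatch_rows` is hypothetical, and the contiguous, near-even slicing used here is an assumption chosen for illustration; this page does not state how the component actually assigns rows to partitions.

```python
def dispatch_rows(rows, num_partitions=None):
    """Group rows into partitions.

    num_partitions=None models a cleared check box: one partition per row.
    Otherwise the rows are cut into num_partitions contiguous slices
    (an assumed strategy, not documented behavior).
    """
    if num_partitions is None:
        return [[row] for row in rows]
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    n = len(rows)
    return [
        rows[i * n // num_partitions:(i + 1) * n // num_partitions]
        for i in range(num_partitions)
    ]


five_rows = ["row1", "row2", "row3", "row4", "row5"]
one_per_row = dispatch_rows(five_rows)      # check box cleared
two_groups = dispatch_rows(five_rows, 2)    # check box selected, value 2
print(len(one_per_row), len(two_groups))
```

A small number of larger partitions usually means fewer Spark tasks to schedule, while one partition per row maximizes parallelism at the cost of per-task overhead.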
Usage
Usage rule |
This component is used as a start component and requires an output link. This component, along with the Spark Batch component Palette it belongs to, appears only when you are creating a Spark Batch Job. Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs. |
Spark Connection |
In the Spark Configuration tab in the Run view, define the connection to a given Spark cluster for the whole Job. In addition, since the Job expects its dependent jar files for execution, you must specify the directory in the file system to which these jar files are transferred so that Spark can access these files:
This connection is effective on a per-Job basis. |