
tMap MapReduce properties (deprecated)

Availability note: Deprecated

These properties are used to configure tMap running in the MapReduce Job framework.

The MapReduce tMap component belongs to the Processing family.

The component in this framework is available in all Talend products with Big Data and Talend Data Fabric.

The MapReduce framework is deprecated from Talend 7.3 onwards. Use Talend Jobs for Apache Spark to accomplish your integration tasks.

Basic settings

Map editor

It allows you to define the tMap routing and transformation properties.

Note: If you do not want to handle execution errors, keep the Die on error check box selected. It is selected by default and is found in the Property Settings dialog box, which you open by clicking the Property Settings button at the top of the input area. When this check box is selected, the Job is killed if an error occurs.

Note: To maximize the data transformation performance in a Job that handles multiple lookup input flows with large amounts of data, you can select the Lookup in parallel check box in the Property Settings dialog box.

However, in a Map/Reduce Job, only one expression key is allowed per mapping component. If you need to use multiple expression keys to join different input tables, use multiple tMap components one after another.
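The following is a minimal plain-Java sketch of the idea behind chaining mapping components, not the code that Talend generates. Each call to the hypothetical joinOn method stands in for one tMap that joins on a single expression key, and the second call consumes the output of the first. The class name, the row layout (orderId, customerId, productId), and the sample data are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChainedJoinSketch {

    // One join step on a single key: looks up column keyIndex of each row
    // and appends the matched value as a new last column.
    static List<String[]> joinOn(List<String[]> rows, int keyIndex, Map<String, String> lookup) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            String match = lookup.get(row[keyIndex]);
            if (match == null) {
                continue; // inner join: rows without a match are dropped
            }
            String[] joined = Arrays.copyOf(row, row.length + 1);
            joined[row.length] = match;
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical main flow: {orderId, customerId, productId}
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] {"o1", "c1", "p1"});
        orders.add(new String[] {"o2", "c2", "p9"});

        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Alice");
        customers.put("c2", "Bob");

        Map<String, String> products = new HashMap<>();
        products.put("p1", "Keyboard");

        // First step joins on customerId, second step joins on productId.
        List<String[]> withCustomer = joinOn(orders, 1, customers);
        List<String[]> withProduct = joinOn(withCustomer, 2, products);

        for (String[] row : withProduct) {
            System.out.println(String.join(";", row));
        }
    }
}
```

In this sketch, each step uses exactly one key, which mirrors the one-expression-key limit described above.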

Mapping links display as

Auto: the default setting, which displays the mapping links as curves.

Curves: the mapping links display as curves.

Lines: the mapping links display as straight lines. This option slightly improves performance.

Temp data directory path

Enter the path where you want to store the temporary data generated for lookup loading. For more information on this folder, see Talend Studio User Guide.

Preview

The preview is a snapshot of the Mapper data. It becomes available once the Mapper properties have been filled in with data. The preview is synchronized only after you save your changes.

Use replicated join

Select this check box to perform a replicated join between the input flows. By replicating each lookup table into memory, this type of join doesn't require an additional shuffle-and-sort step, thus speeding up the whole process.

Ensure that every lookup table fits entirely in memory.
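The replicated join described above can be pictured with a minimal plain-Java sketch. It is a conceptual illustration only, not the Map/Reduce code that Talend generates: the lookup table is held in a HashMap, and each row of the main flow is matched by an in-memory probe, so no shuffle-and-sort step is needed. The class name, the row layout (customerId, amount), and the sample data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicatedJoinSketch {

    // Joins each main row against a lookup table that is fully loaded in memory.
    // mainRows: each row is {customerId, amount}; lookup: customerId -> customerName.
    static List<String> join(List<String[]> mainRows, Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String name = lookup.get(row[0]); // in-memory probe, no shuffle or sort
            if (name != null) {               // inner join: unmatched rows are dropped
                out.add(row[0] + ";" + name + ";" + row[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical lookup table; it must fit in memory for this approach to work.
        Map<String, String> lookup = new HashMap<>();
        lookup.put("c1", "Alice");
        lookup.put("c2", "Bob");

        List<String[]> mainRows = new ArrayList<>();
        mainRows.add(new String[] {"c1", "10.5"});
        mainRows.add(new String[] {"c3", "7.0"});
        mainRows.add(new String[] {"c2", "3.2"});

        for (String line : join(mainRows, lookup)) {
            System.out.println(line);
        }
    }
}
```

The trade-off is visible in the sketch: the join itself is a simple loop over the main flow, but the whole lookup map has to be resident in memory on every node that runs it.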

Advanced settings

Max buffer size (nb of rows)

Type in the size of physical memory, in number of rows, you want to allocate to processed data.

Ignore trailing zeros for BigDecimal

Select this check box to ignore trailing zeros for BigDecimal data.
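To see why trailing zeros matter, note that in Java the BigDecimal equals method compares the scale as well as the numeric value, so 1.50 and 1.5 are not equal under equals even though they compare as equal under compareTo. The sketch below shows this behavior with the standard java.math.BigDecimal API. That the check box relies on stripTrailingZeros internally is an assumption; the hypothetical normalize method only illustrates one way to get the same effect.

```java
import java.math.BigDecimal;

public class TrailingZerosSketch {

    // Normalizes a value so that numbers differing only in trailing zeros become equal.
    static BigDecimal normalize(BigDecimal value) {
        return value.stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.50");
        BigDecimal b = new BigDecimal("1.5");

        System.out.println(a.equals(b));                       // false: scales differ (2 vs 1)
        System.out.println(a.compareTo(b) == 0);               // true: same numeric value
        System.out.println(normalize(a).equals(normalize(b))); // true once zeros are stripped
    }
}
```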

Global Variables


ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the component has a Die on error check box and that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
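As a rough illustration of reading an After variable such as ERROR_MESSAGE, the sketch below uses a plain HashMap as a stand-in for the globalMap object available in a Talend Job. The key format, component name followed by _ERROR_MESSAGE, follows the usual Talend naming, but the component name tMap_1, the sample message, and the errorMessage helper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorMessageSketch {

    // Reads the ERROR_MESSAGE variable of a component, with a fallback when it is unset.
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? "no error recorded" : (String) value;
    }

    public static void main(String[] args) {
        // Stand-in for the Job's globalMap; in a real Job this map is provided for you.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMap_1_ERROR_MESSAGE", "Lookup key is null"); // hypothetical message

        System.out.println(errorMessage(globalMap, "tMap_1"));
        System.out.println(errorMessage(globalMap, "tMap_2"));
    }
}
```

Because ERROR_MESSAGE is an After variable, a read like this is only meaningful once the component has finished executing.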

Usage

Usage rule

In a Talend Map/Reduce Job, this component is used as an intermediate step and other components used along with it must be Map/Reduce components, too. They generate native Map/Reduce code that can be executed directly in Hadoop.

As explained earlier, if you need to use multiple expression keys to join different input tables, use multiple tMap components one after another.

For further information about a Talend Map/Reduce Job, see the sections describing how to create, convert and configure a Talend Map/Reduce Job in the Talend Big Data Getting Started Guide.

Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs, not Map/Reduce Jobs.
