
Setting up the Job

Procedure

  1. Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tMatchPredict and tFileOutputDelimited.
  2. Connect tFileInputDelimited to tMatchPredict using the Main link.
  3. Connect tMatchPredict to tFileOutputDelimited using the Suspect duplicates link.
  4. Verify that the connection to the Spark cluster is defined and that checkpointing is activated in the Run > Spark Configuration view, as described in Computing suspect pairs and suspect sample from source data.
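
As a reading aid, the wiring from the steps above can be written down as plain data. The following is only an illustrative sketch: the Link type, the JOB_LINKS list and the flow_order helper are hypothetical names invented for this example and are not part of any Talend API. Only the component names and link types are taken from the procedure.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """Hypothetical record for one connection in the Job design."""
    source: str
    target: str
    kind: str  # link type as it is named in the procedure


# The two connections described in steps 2 and 3.
JOB_LINKS = [
    Link("tFileInputDelimited", "tMatchPredict", "Main"),
    Link("tMatchPredict", "tFileOutputDelimited", "Suspect duplicates"),
]


def flow_order(links):
    """Return the components in data-flow order (assumes one linear chain)."""
    targets = {link.target for link in links}
    next_of = {link.source: link.target for link in links}
    # The chain starts at the only component that is never a link target.
    current = next(link.source for link in links if link.source not in targets)
    order = [current]
    while current in next_of:
        current = next_of[current]
        order.append(current)
    return order


print(" -> ".join(flow_order(JOB_LINKS)))
# prints "tFileInputDelimited -> tMatchPredict -> tFileOutputDelimited"
```

Listing the links this way makes it easy to confirm that tMatchPredict sits between the input and output files and that the output file receives the Suspect duplicates flow rather than a Main flow.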

Results
