
Creating a Job to divide the input text into tokens in CoNLL format

This Job uses tNLPPreprocessing to split a text sample in XML format into tokens, then uses tNormalize to convert the tokens to the CoNLL format.

Procedure

  1. Drop the following components from the Palette onto the design workspace: tXMLFileInput, tNLPPreprocessing, tFilterColumns, tNormalize and tFileOutputDelimited.
  2. Connect the components using Row > Main connections.
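To make the data flow concrete, the following Python sketch mimics outside Talend what the chain of components does: read text from XML, split it into tokens, keep only the token column, and write one token per row. It is an illustration under assumptions, not the components' actual implementation: the `text` element name, the regex tokenizer, and all function names are hypothetical, and the blank line that CoNLL files usually place between sentences is omitted for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample input; the element name "text" is an assumption.
SAMPLE_XML = "<root><text>Talend runs in Paris. It is fast!</text></root>"


def read_texts(xml_string, tag="text"):
    """Stand-in for tXMLFileInput: extract the text of each <text> element."""
    root = ET.fromstring(xml_string)
    return [el.text or "" for el in root.iter(tag)]


def tokenize(text):
    """Stand-in for tNLPPreprocessing: a naive regex tokenizer that
    separates words from punctuation (the real component may differ)."""
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(tokens_field, separator="\t"):
    """Stand-in for tNormalize: split one delimited field into one row per item."""
    return [item for item in tokens_field.split(separator) if item]


def to_conll_lines(xml_string):
    """Chain the steps and return one token per output line."""
    lines = []
    for text in read_texts(xml_string):
        # Keep only the token column (the role tFilterColumns plays in the Job).
        tokens_field = "\t".join(tokenize(text))
        lines.extend(normalize(tokens_field))
    return lines


conll_lines = to_conll_lines(SAMPLE_XML)
# Stand-in for tFileOutputDelimited: print one token per line.
print("\n".join(conll_lines))
```

With the sample input, each word and punctuation mark ends up on its own line, which is the one-token-per-row layout that the CoNLL format expects.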

Results
