
Configuring the input component

Before you begin

  • You have annotated the named entities in the CoNLL files that will be used to train the model.

Procedure

  1. Double-click the tFileInputDelimited component to open its Basic settings view and define its properties.
    1. Set the Schema as Built-in and click Edit schema to define the desired schema.

      The first column in the output schema must be tokens and the last one must be labels. In between, you can have columns for features you added manually.

    2. In the Folder/file field, specify the path to the training data.
    3. Leave the Die on error check box selected.
  2. In the Advanced settings view of the component, select the Custom encoding check box if you encounter issues when processing the data.
  3. From the Encoding list, select the encoding to use, which is UTF-8 in this example.
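
Before pointing the component at the training data, it can help to confirm outside Talend that the file matches the schema and encoding described above. The following is a minimal sketch in Python, not part of the product. It assumes a tab-separated file (the field separator is not stated on this page), treats blank lines as sentence separators, and uses hypothetical function names and sample labels for illustration only.

```python
import csv
import io


def check_rows(text, delimiter="\t"):
    """Collect tokens (first column) and labels (last column).

    Returns (tokens, labels, problems). The tab delimiter is an
    assumption; adjust it to match the component's field separator.
    """
    tokens, labels, problems = [], [], []
    width = None
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row or all(cell == "" for cell in row):
            continue  # blank line, assumed to separate sentences
        if len(row) < 2:
            problems.append(f"line {line_no}: need a token and a label")
            continue
        if width is None:
            width = len(row)  # first data row fixes the column count
        elif len(row) != width:
            problems.append(f"line {line_no}: expected {width} columns, "
                            f"got {len(row)}")
            continue
        tokens.append(row[0])   # first column: token
        labels.append(row[-1])  # last column: label
    return tokens, labels, problems


def load_training_file(path, encoding="utf-8"):
    """Read the file with an explicit encoding, then check its rows.

    A UnicodeDecodeError here suggests the file is not in the encoding
    you intend to select in the Encoding list.
    """
    with open(path, encoding=encoding, newline="") as handle:
        return check_rows(handle.read())


# Illustrative sample: token, one manual feature column, label.
sample = "John\tNNP\tB-PER\nlives\tVBZ\tO\n\nParis\tNNP\tB-LOC\nbad\n"
tokens, labels, problems = check_rows(sample)
```

If problems is not empty, fix the reported lines before running the Job. Because Die on error is selected, malformed rows would otherwise stop the Job.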
