The following example shows how to improve the Talend Trust Score™ of a dataset using
Talend Cloud Data Inventory
and Talend Cloud Data Preparation.
In this example, you work for an e-commerce company. Some orders have not been shipped
yet. While looking into the order progress, you notice that some country names and tax
identification numbers are wrong.
Here is a sample of the dataset:
Checking the current Talend Trust Score™
Procedure
Go to the Datasets tab.
Filter the datasets to find the one whose Talend Trust Score™ you want to improve.
In this example, use tags to filter the datasets.
The dataset list is filtered. The Talend Trust Score™
is 3.38/5.
In Talend Cloud Data Inventory, go to the Datasets tab.
Your dataset list is filtered according to the filter applied in the previous
section.
Hover over the dataset and click the Preparations
icon.
The Preparations wizard opens.
Click Add.
You are redirected to Talend Cloud Data Preparation
and the preparation is created.
What to do next
You can now configure the preparation.
Configuring the preparation
About this task
This example uses functions from Talend Cloud Data Preparation.
Procedure
To correct the country names, use the fuzzy matching function.
Select the delivery_country column.
In the right panel, select Column and start typing
fuzzy matching.
Select the function Standardize value (fuzzy
matching).
Set the Match threshold to Default (>
80%).
Click Submit. The step is added to the preparation
steps in the left panel and the country names are corrected. For example,
United Staates is replaced by United
States.
To convert the country codes to country names, use a conversion function. The
delivery_country column is still selected.
In the right panel, select Column and start typing
convert.
Select the function Convert country names and
codes.
Set From to ISO country code
and To to English country
name.
Click Submit. The country codes are converted to country names. For
example, CA is replaced by
Canada.
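To make the effect of these two steps concrete, here is a minimal Python sketch of the same idea applied outside the product. It is not how Talend Cloud Data Preparation implements these functions: the similarity measure (difflib from the Python standard library), the reference list REFERENCE_COUNTRIES, the ISO_TO_NAME mapping, and the function names are all assumptions made for illustration. Only the 80% threshold and the two sample values come from the example above.

```python
from difflib import SequenceMatcher

# Hypothetical reference values; the product relies on its own reference data.
REFERENCE_COUNTRIES = ["United States", "Canada", "France", "Germany"]

# Hypothetical excerpt of an ISO 3166-1 alpha-2 code to English name mapping.
ISO_TO_NAME = {"US": "United States", "CA": "Canada", "FR": "France", "DE": "Germany"}


def standardize(value, references=REFERENCE_COUNTRIES, threshold=0.8):
    """Replace value with the closest reference if similarity exceeds threshold."""
    best, best_score = value, 0.0
    for ref in references:
        # ratio() returns a similarity between 0.0 and 1.0.
        score = SequenceMatcher(None, value.lower(), ref.lower()).ratio()
        if score > best_score:
            best, best_score = ref, score
    # Leave the value untouched when no reference is similar enough.
    return best if best_score > threshold else value


def code_to_name(value):
    """Convert a two-letter country code to its English name when it is known."""
    return ISO_TO_NAME.get(value.strip().upper(), value)


def clean_country(value):
    """Apply both steps in the order used in the procedure."""
    return code_to_name(standardize(value))


if __name__ == "__main__":
    for raw in ["United Staates", "CA", "Canada"]:
        print(raw, "->", clean_country(raw))
```

With these assumed reference values, United Staates is close enough to United States to be replaced, while CA falls below the threshold in the first step and is only changed by the code conversion in the second step.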