The example below uses a database table which holds customer information.
Procedure
In the DQ Repository tree view, expand Metadata and browse to the table you want to
analyze.
Right-click the table and select Semantic-aware
Analysis, or right-click a set of columns in the table and select
Semantic-aware Analysis.
Configure the Sampling Options in the related section.
Select or click an option to do the following:
- First N Rows: list in the data preview the first N data records from the
selected columns. You set the number of records in the Number of rows field.
- Reservoir Sampling: list in the data preview N random records from the
selected columns. You set the number of records in the Number of rows field.
- Threshold for category discovery: set the minimum threshold for the matches
to show in the Category lists of the analyzed columns. This threshold filters
out the less probable categories of the analyzed columns.
- Refresh: refresh the data preview after any change in the configuration.
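The difference between the two sampling options can be illustrated with a minimal sketch. It assumes the classic reservoir sampling technique (Algorithm R); the Studio's actual implementation is not described here and may differ. The function names first_n_rows and reservoir_sample are hypothetical and are not part of the Studio.

```python
import random
from itertools import islice


def first_n_rows(rows, n):
    """Return the first n records, in their original order."""
    return list(islice(rows, n))


def reservoir_sample(rows, n, rng=None):
    """Return n records picked uniformly at random in a single pass.

    Classic Algorithm R, shown for illustration only; this is an
    assumption, not the Studio's documented implementation.
    """
    rng = rng or random.Random()
    reservoir = []
    for index, row in enumerate(rows):
        if index < n:
            # Fill the reservoir with the first n records.
            reservoir.append(row)
        else:
            # Replace an existing entry with probability n / (index + 1).
            slot = rng.randint(0, index)  # both bounds are inclusive
            if slot < n:
                reservoir[slot] = row
    return reservoir
```

With this sketch, first_n_rows always returns the same leading records, while reservoir_sample can return records from anywhere in the table without needing to know the row count in advance.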
From the Category field of each of the matched columns,
either:
Select a category of data from the Category list that
best suits the column, or
Enter a meaningful name for the column that best represents its
content.
To edit the name of a column, click in the field twice, type the name and press
Enter on your keyboard to save the changes.
The names you enter are displayed in a different color. This step stores the
categories and the semantic names of the columns locally. If no semantic
names are found, the categories are stored anyway.
This step is not mandatory, but it helps you better match table metadata with the
concepts stored in the ontology repository on the log server.
The percentages of the proposed categories are calculated by analyzing the
data in the columns with the following methods:
regex, data dictionary and
keyword dictionary. The dictionary indexes and
regex categories are embedded in the Studio and are used to decide which
category the data falls into.
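As a rough illustration of how such percentages and the discovery threshold could interact, here is a minimal sketch. Everything in it is an assumption made for the example: the category names, the sample regex and dictionary entries, the matching rules, and the treatment of the threshold as a percentage. It does not reproduce the indexes embedded in the Studio.

```python
import re

# Hypothetical categories, for illustration only.
REGEX_CATEGORIES = {"EMAIL": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")}
DATA_DICTIONARIES = {"COUNTRY": {"france", "germany", "japan"}}
KEYWORD_DICTIONARIES = {"ADDRESS_LINE": {"street", "avenue", "road"}}


def matching_categories(value):
    """Return the set of categories that a single value matches."""
    text = value.strip().lower()
    found = set()
    for name, pattern in REGEX_CATEGORIES.items():
        if pattern.fullmatch(text):          # whole value fits the pattern
            found.add(name)
    for name, entries in DATA_DICTIONARIES.items():
        if text in entries:                  # whole value is a known entry
            found.add(name)
    for name, keywords in KEYWORD_DICTIONARIES.items():
        if keywords & set(text.split()):     # value contains a keyword
            found.add(name)
    return found


def category_percentages(values, threshold=0.0):
    """Return {category: percentage of matching values}, dropping any
    category whose percentage is below the threshold (assumed unit: %)."""
    if not values:
        return {}
    counts = {}
    for value in values:
        for name in matching_categories(value):
            counts[name] = counts.get(name, 0) + 1
    percentages = {name: 100.0 * count / len(values)
                   for name, count in counts.items()}
    return {name: pct for name, pct in percentages.items()
            if pct >= threshold}
```

For a column holding "France", "Germany", "a@b.com" and "Japan", this sketch gives 75% for the hypothetical COUNTRY category and 25% for EMAIL; with a threshold of 50, only COUNTRY would remain in the list.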
Click Next to open a page in the wizard where you can
see the results of matching column metadata and semantic concepts with the
concepts in the ontology repository.