Fields

An entity is composed of records, and each record is composed of fields that are populated with data. Field metadata is critical to the ingest, validation, and profiling of the data. Each field is described by specific metadata that can be viewed and/or edited.

Field Information: General Information

Select the view details icon to display column attributes, metadata, and statistical information about the loaded data.

Select the sample data icon to display sample records for that field. To display sample data for multiple fields, select the fields and then select the top-level Sample Data button.

Display and modify field properties

View details of a field

Select editable fields or select options from the dropdowns.

Field Information – General Information properties
Name	Created when data is ingested.
Business Name	User-defined.
Business Description	User-defined.
Technical Description	User-defined. Freeform field to describe technical characteristics of the data.
Internal Data Type	The data type as stored in Receiving Directory. Supported data types: INTEGER, DOUBLE*, STRING, BOOLEAN, DECIMAL*. *Note that Qlik Catalog will convert DOUBLE/DECIMAL internal fields to scientific notation when the field is very large or very small (more than seven decimal places).
Last Updated at	Auto-generated. Provides the date and time at which the metadata was last updated (ISO standard).
Index	Column sequence number: the column position of the field in the table (ex. 1, 2, 3, 4…).

Field Information: Properties

Source ingest Information (key/value) properties can be added from the second modal tab, Properties.

Key/Value pairs, also known as attribute-value pairs, are specific to the object level at which they are applied.

Select the Add Property (plus) icon to open a drop-down with optional field properties.

Add property to a field in properties tab

Field Information: Lineage

Parent Lineage shows the root source of the field data. Child Lineage shows the source of the field data and identifies any other Qlik Catalog objects using this field. Select the expand/collapse arrow icon to display lineage information.

Lineage tab for fields

Field Information: Assigning tags to a field

Tags assist in locating and organizing data. To assign a tag, open the Tags tab in the Field Information box, type the tag in the Add a tag field, and press Tab or Enter.

Tags tab for fields

Field Information: Comments

Field Information Comments allows authorized users to view and edit details and properties of the selected field. An authorized user can create a Comment Topic and then enter Comment Details in the boxes indicated. To enter additional comments, select + Add Comments, which creates another comment field. Save each comment; a Success message appears above the tabs. Comments are subject to collaborative review and can be saved as Draft or Approved.

Comments tab for fields

Field Information: Data Distribution

Field-level profiling statistics and data distributions are calculated for each field and recalculated with each successive data load.

Data distribution tab for fields

Profile values

Profiling metrics for field data provide the following top-level information:

Cardinality

The number of unique values for that field. Cardinality can be examined by Percent, Count, and Value.
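As a rough illustration of what Percent, Count, and Value mean for a field, the sketch below tallies a small list of values in plain Python. The function name `profile_values` and the sample data are hypothetical; this is not how Qlik Catalog computes its profile internally.

```python
from collections import Counter


def profile_values(values):
    """Return (cardinality, rows); each row is (percent, count, value).

    Rows are ordered from most to least frequent value.
    """
    counts = Counter(values)
    total = len(values)
    rows = [
        (100.0 * count / total, count, value)
        for value, count in counts.most_common()
    ]
    # Cardinality is the number of unique values in the field.
    return len(counts), rows


cardinality, rows = profile_values(["red", "blue", "red", "green", "red", "blue"])
print(cardinality)  # number of unique values
for percent, count, value in rows:
    print(f"{percent:.1f}%  {count}  {value}")
```

Here the field has six records and three unique values, so its cardinality is 3 and "red" accounts for half of the records.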

Survey Count

The number of records in the field.

Survey Type

An indicator of the distribution method used. The following Survey Types are represented:

Census: Every value in the field is counted for an exact distribution.

Sample: There were too many unique values in the field to be efficiently counted; a sample of values was used to estimate the cardinality and distribution.

Log10Survey: A counting method used for distributions with high cardinality (the number of different values in the specified range).

Reading (FIELD) data distribution

Cardinality

Estimated cardinality is denoted by the "approximately equal" symbol (≈). For INTEGER Log10 surveys and STRING samples, exact cardinality cannot be computed, so an estimated cardinality is computed and displayed with the ≈ symbol.

Sample survey type

Estimated cardinality when exact cardinality is not possible

Intervals

Reading intervals: half-open intervals are written with a square bracket for the inclusive bound and a parenthesis for the exclusive bound. For example, '[10.0, 100.0)' indicates an interval from '10.0' to '100.0' that is inclusive of '10.0' but exclusive of '100.0'. In other words, [10.0, 100.0) is the set of all real numbers between 10.0 and 100.0, including 10.0 but not 100.0. Numbers within that interval may come very close to 100.0 (for example, 99.9999999), but 100.0 itself is not included; it falls in the next represented interval (ex. '[100.0, 10000.0)').

Note that intervals are listed in descending order of occurring frequency rather than value.
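The interval notation and the frequency ordering can be checked with a few lines of Python. This is a minimal sketch: the helper `in_half_open` and the interval counts are made up for illustration and are not taken from Qlik Catalog.

```python
def in_half_open(value, low, high):
    """Return True when value lies in [low, high): low included, high excluded."""
    return low <= value < high


print(in_half_open(10.0, 10.0, 100.0))        # lower bound is included
print(in_half_open(99.9999999, 10.0, 100.0))  # just below the upper bound
print(in_half_open(100.0, 10.0, 100.0))       # upper bound is excluded
print(in_half_open(100.0, 100.0, 10000.0))    # it belongs to the next interval

# Hypothetical (interval, count) pairs, listed by descending frequency
# rather than by value.
intervals = [((10.0, 100.0), 12), ((100.0, 10000.0), 40), ((1.0, 10.0), 5)]
by_frequency = sorted(intervals, key=lambda item: item[1], reverse=True)
print(by_frequency)
```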

Log10 intervals

Scientific notation

Data distribution intervals are shown in scientific notation so that very large and very small numbers are easy to read and understand.
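For readers unfamiliar with the notation, the sketch below uses Python's standard string formatting to convert numbers to and from scientific notation. The helper `to_scientific` is hypothetical, and the exact number of digits Qlik Catalog displays may differ.

```python
def to_scientific(value, digits=1):
    """Format a number in scientific notation with the given number of decimals."""
    return f"{float(value):.{digits}E}"


print(to_scientific(10**18))                   # a 1 followed by 18 zeros
print(to_scientific(0.00000012345, digits=4))  # a very small number
print(float("1.0E+38"))                        # parsing the notation back to a number
```

The exponent after `E` is the power of ten: `1.0E+18` means 1.0 × 10^18, and a negative exponent such as `E-07` moves the decimal point seven places to the left.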

Log10 scientific notation

Qlik Catalog covers the following ranges:

INTEGER (-1E+18, 1E+18)
18 digits, negative to positive
DOUBLE or DECIMAL (-1.0E+38, 1.0E+38)
38 digits, negative to positive

Census vs. sample for string fields: Qlik Catalog samples data to effectively build a histogram of the distribution of unique data values. Columns with cardinality < 4001 receive a census that includes every unique observation; columns with higher cardinality are sampled.
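The threshold rule can be pictured as follows. This is a sketch under stated assumptions: the constant and function names are hypothetical, and the documentation does not describe how Qlik Catalog actually determines cardinality before choosing a survey type.

```python
# Taken from the text: columns with cardinality < 4001 get a census.
CENSUS_CARDINALITY_LIMIT = 4001


def choose_string_survey(values, limit=CENSUS_CARDINALITY_LIMIT):
    """Return "Census" if the values have fewer than `limit` unique entries, else "Sample"."""
    distinct = set()
    for value in values:
        distinct.add(value)
        # Stop early once the limit is reached; a census is no longer possible.
        if len(distinct) >= limit:
            return "Sample"
    return "Census"


print(choose_string_survey(str(i) for i in range(4000)))  # 4000 unique values
print(choose_string_survey(str(i) for i in range(4001)))  # 4001 unique values
```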

Census survey string field

Census distribution displays unique values for strings

Sample survey data distribution for a string field

Sample distribution displays sample values for strings

For Integer, Double, and Decimal fields, Qlik Catalog profiling conducts a LOG10_SURVEY, which effectively builds a histogram of log10(numeric_observation). LOG10_SURVEY results are presented with the Survey Count, the Survey Type description, and the survey profile statistics: Percent, Count, and Value (between) [low value, high value].
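To make the idea of a log10 histogram concrete, the sketch below groups positive numbers by the integer part of their base-10 logarithm. Several things here are assumptions: the function name `log10_survey` is hypothetical, each bucket spans exactly one power of ten (the intervals Qlik Catalog reports may be wider), and zero and negative values are rejected because the documentation does not say how they are bucketed.

```python
import math
from collections import Counter


def log10_survey(values):
    """Return rows of (percent, count, (low, high)) for half-open [low, high) buckets.

    Rows are ordered from most to least frequent bucket.
    """
    buckets = Counter()
    for value in values:
        if value <= 0:
            # Simplification for this sketch only: log10 is undefined here.
            raise ValueError("this sketch only handles positive values")
        exponent = math.floor(math.log10(value))
        buckets[exponent] += 1

    total = sum(buckets.values())
    return [
        (100.0 * count / total, count, (10.0 ** exponent, 10.0 ** (exponent + 1)))
        for exponent, count in buckets.most_common()
    ]


rows = log10_survey([5, 42, 99.9, 100, 1500])
for percent, count, (low, high) in rows:
    print(f"{percent:.1f}%  {count}  [{low}, {high})")
```

In this example 42 and 99.9 share the bucket [10.0, 100.0), while 100 starts the next bucket, matching the half-open interval convention described earlier on this page.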

Qlik Catalog does not display Estimated Cardinality for fields of data type Double (such as Decimal). Double fields are continuous rather than discrete, so cardinality is not applicable or meaningful when profiling Double data.

Log10 survey data distribution for integer field

Distribution on field type integer calculates estimated cardinality

Log10 survey data distribution for double field

Distribution on field type double does not calculate cardinality
