Navigating source module
The source module provides external source properties and ingest specifications for building metadata environments and onboarding data.
The object grids, in descending hierarchical order, are source hierarchy | sources | entities | fields. Each grid provides information and configurable properties appropriate to its level.
Select the desired object grid, then navigate to or hover over object rows of interest to display tooltips. To access the level below the current grid (for example, to drill down from the source grid to the entity level), select the (view) icon. For details about the object levels and the search and filter options, refer to Discover: navigation and properties.
Source module: Source hierarchy
Source hierarchy is the highest parent object level. Multiple sources can be added to a source hierarchy.
Source hierarchies can also be nested within other source hierarchies: select the (edit) icon on the source hierarchy to be nested, then select the parent hierarchy to nest it under.
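The object levels and the nesting of hierarchies can be pictured as a simple tree. The sketch below is illustrative only: the class names, attribute names, and the `path` helper are hypothetical and are not part of the application or its API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    name: str


@dataclass
class Entity:
    name: str
    fields: List[Field] = field(default_factory=list)


@dataclass
class Source:
    name: str
    entities: List[Entity] = field(default_factory=list)


@dataclass
class SourceHierarchy:
    name: str
    # Hypothetical attribute: set when this hierarchy is nested under another one.
    parent: Optional["SourceHierarchy"] = None
    sources: List[Source] = field(default_factory=list)

    def path(self) -> str:
        """Return the hierarchy names from the top-level parent down to this one."""
        names = []
        node: Optional[SourceHierarchy] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " / ".join(reversed(names))


# Example names below are made up for illustration.
finance = SourceHierarchy("Finance")
emea = SourceHierarchy("EMEA", parent=finance)
emea.sources.append(Source("billing_db", [Entity("invoices", [Field("invoice_id")])]))
print(emea.path())
```

Each level owns a list of the level below it, which mirrors how the grids drill down from source hierarchy to sources, entities, and fields.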
To create a new Source Hierarchy:
- Select the Source Hierarchy button; the source hierarchies display.
- Select + Create Hierarchy at the bottom of the panel.
- Enter a Name for the new source hierarchy in the popup.
- Select Save.
Source module: Source information
In the source grid, drill in to see the entities within a source by selecting the (view) icon on its row.
To access details at the object level, select the (view and edit) icon. Note that disabled (greyed-out) fields cannot be edited. Editable Source General Information fields include: Name, Business Name, Business Description, Tags, and Source Hierarchy (selected from a dropdown).
Non-editable Source Information fields include: Last updated at, Communication Protocol, File Type, Source Type, and Base Directory. Source Connection can be expanded by selecting the (caret) to display non-editable fields: Source Type, Resource URI, Username, and Password.
To view the actions available for the current object, select the More dropdown. It provides access to information and configurable attributes and properties: Load Logs, Delete (source), Discover (internal source), View/Edit General Info (which brings up the same modal as View Details), and View/Edit Properties.
Source module: Source properties
In the Source: Properties tab, administrators can add, edit, or delete properties.
To add a property to the property panel, select the (add property) button to display the available properties. Select one property at a time and populate its value. To edit a property where editing is enabled, either enter the value manually or, for enum-type properties with predefined values, select a value from the dropdown. To delete a property, select the (delete) icon to the left of that property.
Remember to Save changes before exiting the property panel. Objects must be reloaded to ingest with property changes in place.
Source module: Entities
From the entity grid, drill in to see the fields within an entity by selecting the (view) icon on its row.
To access details at the object level, select the (view and edit) icon. Note that disabled (greyed-out) fields cannot be edited. Editable Entity General Information fields include: Name, Business Name, Business Description, Short Name, Source Connection (the default source-level connection can be overridden for each entity by selecting a connection from the dropdown; this configures the specified connection for ingest), Entity Base Type, and Tags.
Non-editable Entity General Information fields include: Entity Type, Stored Format Type, and Last updated at.
To view the actions available for an entity in the source module, select the More dropdown to access information and configurable attributes and properties:
Load (data), Load Logs (details of every data load), Delete (entity), Discover (switch to internal source view), Source Connection (to change the connection), View/Edit General Info (the same modal as the details accessed by selecting (view and edit)), and View/Edit Properties.
Source (External) Entity: View/Edit Properties
In the Source: Properties tab, administrators can add, edit, or delete properties.
To add a property to the property panel, select the (add property) button to display the available properties. Select one property at a time and populate its value. To edit a property where editing is enabled, either enter the value manually or, for enum-type properties with predefined values, select a value from the dropdown. To delete a property, select the (delete) icon to the left of that property.
Remember to Save changes before exiting the property panel. Entities must be reloaded to ingest with property changes in place.
Source module: Fields
The fields grid in the source module provides a (switch to discover) hotlink to the field object in the discover module, where users can view metadata and sample data for the field.
To view details at the object level, select the (view and edit) icon. Note that disabled (greyed-out) fields cannot be edited.
Editable Field General Information fields include: Business Name, Business Description, Technical Description, Validation Pattern, Null Proxy Regex, Data Type, Index, and Field Information (Required, Encrypted at Source, Encrypt, Primary Key, Foreign Key, Sensitive).
Non-editable Field General Information fields include: Name and Last updated at.
Field information includes five tabs: General Information, Properties, Lineage, Tags, Comments.
Note that while some properties are shared between the source and discover modules, the source module contains dataload-specific attributes and structural properties that are not included in discover.
Field: General information tab
Source module field attributes display on this tab. This information (metadata) describes the external source of data which is parsed into the application data model.
Field | Definition |
---|---|
Name | Name of column, editable in tab. This is a required attribute. |
Business Name | User-defined. |
Business Description | User-defined. |
Technical Description | User-defined. Freeform field to describe technical characteristics of the data. |
Validation Pattern | This expression specifies a control value (configurable via radio buttons as exact STRING or regular expression) to be applied against input data for field-level validation. Default value is empty STRING. |
Null Proxy Regex | This expression specifies a control value (configurable via radio buttons as exact STRING or regular expression) used to evaluate qualifying input data to null. |
Last Updated at | Auto-generated. Provides the date and time the metadata was last updated (ISO standard). |
Data Type | The data type as stored in receiving. Primary data type options: |
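How the Validation Pattern and Null Proxy Regex settings might act on a single input value can be sketched as follows. This is a hypothetical illustration, not the application's implementation: the function names are invented, the use of whole-value matching (`re.fullmatch`) is an assumption, and so is treating an empty pattern (the documented default) as "no validation".

```python
import re
from typing import Optional


def matches(control: str, value: str, is_regex: bool) -> bool:
    """Compare a value to a control value, as an exact string or as a regex."""
    if is_regex:
        # Assumption: the whole value must match the expression.
        return re.fullmatch(control, value) is not None
    return value == control


def apply_null_proxy(value: str, proxy: str, is_regex: bool) -> Optional[str]:
    """Return None when the value qualifies as a null proxy, else the value unchanged."""
    if proxy and matches(proxy, value, is_regex):
        return None
    return value


def passes_validation(value: str, pattern: str, is_regex: bool) -> bool:
    """Check a value against the validation pattern."""
    # Assumption: an empty pattern means no field-level validation is applied.
    if pattern == "":
        return True
    return matches(pattern, value, is_regex)


# Example control values below are made up for illustration.
print(apply_null_proxy("N/A", r"N/A|NULL|-+", is_regex=True))
print(apply_null_proxy("42", r"N/A|NULL|-+", is_regex=True))
print(passes_validation("2024-01-31", r"\d{4}-\d{2}-\d{2}", is_regex=True))
print(passes_validation("31/01/2024", r"\d{4}-\d{2}-\d{2}", is_regex=True))
```

The radio-button choice between exact STRING and regular expression corresponds to the `is_regex` flag in this sketch.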
Field level information | Definition |
---|---|
Required (NOT NULL) | Constraint setting: whether the field is required or not. When checked, fields with null values are bucketed as 'Ugly'. Default is false (not checked). This attribute requires a refresh and reload of data to take effect. |
Encrypted at Source | [Informational only, does not affect validation of field] Indicates whether the value is encrypted at the source. |
Encrypt | Specifies whether the field data is to be encrypted by the system upon ingest. This attribute requires a refresh and reload of data to take effect. |
Primary Key | [Informational only, does not affect validation of field] |
Foreign Key | [Informational only, does not affect validation of field] |
Sensitive | Specifies whether field data is sensitive, with the applicable obfuscation method. This attribute requires a refresh and reload of data to take effect. |
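The effect of the Required (NOT NULL) setting can be sketched as a simple check. This is a minimal sketch under stated assumptions, not the application's logic: the function names are hypothetical, a record is modeled as a dictionary, and a missing key is treated the same as a null value.

```python
from typing import Dict, List, Optional


def null_required_fields(record: Dict[str, Optional[str]], required: List[str]) -> List[str]:
    """Return the names of Required fields whose value is null (None) in the record."""
    # Assumption: a field absent from the record counts as null.
    return [name for name in required if record.get(name) is None]


def is_ugly(record: Dict[str, Optional[str]], required: List[str]) -> bool:
    """True when at least one Required field is null, so the data would be bucketed as 'Ugly'."""
    return len(null_required_fields(record, required)) > 0


# Example record and field names below are made up for illustration.
record = {"customer_id": None, "email": "a@example.com"}
print(is_ugly(record, required=["customer_id"]))
print(is_ugly(record, required=["email"]))
```

With Required unchecked (the default), the field would not appear in `required` and a null value would not trigger this check.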
Enable data type change before ingest or upon reload (JDBC and flat files only)
Change the data type in the source module on the field general information screen, either before the Hive object has been created or after deleting the Hive object and before the entity is reloaded. The external field data type change is propagated to the internal field data type upon load or reload of the data.
- By default, the schema of the Hive tables will not change if the data type change is made after the first ingest:
  - If the data type change is being made after the first ingest, add (from the external entity property dropdown) and apply the new property: 'entity.hive.alterTableToCorrectType=true'
  - Update or remove the following internal field properties:
    - 'field.hive.ddl.data.type=<datatype>'
    - 'numeric.precision.scale.rounding.mode=[x,y]'
- To ensure a matching data type is found, it is best to use standard names such as STRING, DOUBLE, DECIMAL, BOOLEAN, and INTEGER (a few non-standard names, such as INT4, FLOAT, and TEXT, will also work). Regardless of whether a matching internal data type is found, a warning displays in the UI when the data type is changed.
- Prepare dataflows that use pre-existing field data types will fail validation and need to be fixed manually.
- Predefined publish jobs will not be impacted by these changes if publishing to a Hadoop target.
- Hive will not read sample data if the data type format is not supported by Hive. Sample data displays with the HDFS method for data that has been processed with supported data types.
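The steps above can be summarized as a checklist that depends on whether the first ingest has already happened. The sketch below is only a reading aid: the function name and the wording of each step are hypothetical, and grouping both property changes under "after first ingest" is an interpretation of the list above. The property names themselves are taken from this section.

```python
from typing import List

# Standard data type names recommended in this section.
STANDARD_TYPE_NAMES = {"STRING", "DOUBLE", "DECIMAL", "BOOLEAN", "INTEGER"}


def datatype_change_steps(new_type: str, after_first_ingest: bool) -> List[str]:
    """Return an ordered checklist for changing an external field's data type."""
    steps = [f"Set Data Type to {new_type} on the field general information screen"]
    if new_type.upper() not in STANDARD_TYPE_NAMES:
        steps.append("Check that the non-standard type name is accepted (a UI warning displays either way)")
    if after_first_ingest:
        # The Hive table schema does not change by default after the first ingest.
        steps.append("Add external entity property entity.hive.alterTableToCorrectType=true")
        steps.append("Update or remove internal field property field.hive.ddl.data.type")
        steps.append("Update or remove internal field property numeric.precision.scale.rounding.mode")
    steps.append("Load or reload the entity")
    return steps


for step in datatype_change_steps("DECIMAL", after_first_ingest=True):
    print("-", step)
```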
External field: Properties
Use Source (External) Field: View/Edit Properties to add, edit, or delete properties.
To add a property to the property panel, select the (add property) button to display the optional available properties. Select one property at a time and populate its value. To edit a property where editing is enabled, either enter the value manually or, for enum-type properties with predefined values, select a value from the dropdown. To delete a property, select the (delete) icon to the left of that property.
Remember to Save changes before exiting the property panel. Entities must be reloaded to ingest with property changes in place.
External field: Lineage
Parent lineage shows the root source of the field data. (In discover, the internal objects list the external parent source.)
In source, this tab displays the child (internal) objects created from this (external) source.
External field: Tags
Tag the objects with metatags to assist in locating and organizing data.
External field: Comments
Description of Field content, formulas, derivation, analyst notes. Comments are subject to collaborative review and can be saved as Draft or Approved.
External and internal field metadata are different and not interchangeable. Comments created from the source grid do not display in discover and vice versa.
External fields: More actions
To view the actions available for a field in the source module, select the More dropdown. It provides access to the same information that is available as tabs in the field information: View/Edit General Information, View/Edit Properties, View Lineage, View/Edit Tags, View/Edit Comments.