Talend Studio features
Deprecation: This document was last updated in March 2024 and its content might not be up to date. For recent information about feature availability, see Data Integration and Quality Pricing and Product Terms.
Cloud Talend Studio features
Design and productivity features for Cloud licenses
Feature | Available in... |
---|---|
Transform data, including filter, flatten/normalize, aggregate, replicate, look up, join, and time windowing | |
Live preview of sample data | |
Ability to design batch and streaming pipelines in the same interface, using the same connectors and graphical components | |
Ability to standardize, cleanse, and enrich data in the pipeline | |
Schema-on-read support | |
Easily embed Python code | |
Support for data formats including Avro, JSON, Parquet, Excel, and CSV | |
Quickly evaluate the quality of your data sets with the Talend Trust Score™ | |
Store data in shared, common dataset repository across all Talend products | |
Write into Talend Data Stewardship campaigns | |
File management: open, move, compress, decompress without scripting. See the documentation for related components. | |
Graphical design environment. See Designing Jobs and Routes. | |
Control and orchestrate data flows and data integration with master Jobs. See Orchestration (Integration) components. | |
Map, aggregate, sort, enrich and merge data. See Mapping data flows and Processing (Integration) components. | |
Team collaboration with shared repository. See Working collaboratively on Git projects in Talend Cloud. | |
Continuous integration | |
Audit, Job compare, impact analysis, testing, and debugging and tuning. See Talend Project Audit, Comparing Jobs, Impact analysis, Testing Jobs and Services using test cases, and Running a Job in Java Debug mode. | |
Metadata bridge for metadata import/export and centralized metadata management. See Getting Started with the Talend Metadata Bridge. | |
Distant run and parallelization. See Running a Job remotely and Using parallelization to optimize Job performance. | |
Dynamic schema, re-usable Joblets, and reference projects. See Dynamic schema, Designing a Joblet, and Defining project references. | |
Wizards and interactive data viewer. See Managing metadata in Talend Studio. | |
Versioning. See Managing Job and Route versions. | |
Export and execute standalone Jobs in runtime environments | |
Change Data Capture (CDC). See Change Data Capture (CDC). | |
Automatic documentation. See Documenting a Job or a Route. | |
Publish to Talend Cloud. See Publish to Talend Cloud. | |
Controlled patch management. See Updating Talend Studio. | |
Support for Hadoop technologies. See HBase, HCatalog, HDFS and Hive. | |
Support for Apache Spark Batch and Apache Spark Streaming. See How a Talend Job for Apache Spark works. | |
Support for Spark Universal. See Spark Universal support for Hadoop distributions in Talend Studio. | |
Support for Spark on YARN platforms. See AWS EMR, Azure HDInsight, Cloudera, and Google Dataproc. | |
Support for serverless platforms. See Databricks, Delta Lake, and Azure Synapse Analytics with Spark pools. | |
Support for dynamic distributions. See Dynamic support for Hadoop distributions in Talend Studio. | |
Support for Apache Spark Streaming. See How a Talend Job for Apache Spark works. | |
WS policy-based web services security | |
Reliable messaging backbone based on ActiveMQ | |
Service Locator and Service Registry | |
XML key management specification (XKMS) | |
Command line and scripting tools | |
Build and deploy as an OSGi feature | |
Build, deploy, and manage a microservice | |
Deliver and route messages and events based on Enterprise Integration Patterns (EIPs) | |
Drag-and-drop routes and SOAP/REST service creation and simulation | |
Continuous delivery | |
Visual mapping for hierarchical formats. See What is Talend Data Mapper? | |
Repository manager | |
Data quality features for Cloud licenses
Feature | Available in... |
---|---|
Cleanse, mask, and match data on Spark and Hadoop. See the Data quality components for Apache Spark. | |
Profile Big Data. See Profiling Big Data. | |
Analyze databases/delimited files: Redundancy, table, column, correlation, and semantic-aware analyses. See Creating an analysis. | |
Generate reports on analyses. See What are reports?. | |
Store the analyses and reports executed in Talend Studio in the data quality data mart. See Data quality data mart. | |
Data profiling and analytics with graphical charts and drill-down data. See Creating an analysis. | |
Semantic discovery with automatic detection of patterns. See Exploring semantic categories of data columns. | |
Machine learning for data matching and deduplication. See the Machine Learning components. | |
Data privacy with masking and encryption. See the Data privacy components. | |
Data matching. See Data matching with Talend tools. | |
Standardize data. See the Data quality components. | |
Comprehensive survivorship. See the tRuleSurvivorship component. | |
Pattern library. See Patterns. | |
Fraud pattern detection using Benford's Law. See Fraud detection. | |
Advanced statistics with indicator thresholds. See Indicators. | |
On-premises Talend Studio features
Design and productivity features for on-premises licenses
Feature | Available in... |
---|---|
File management: open, move, compress, decompress without scripting. See the documentation for related components. | |
Map, aggregate, sort, enrich and merge data. See Mapping data flows and Processing (Integration) components. | |
Metadata bridge for metadata import/export and centralized metadata management. See Getting Started with the Talend Metadata Bridge. | |
Graphical design environment. See Designing Jobs and Routes. | |
Team collaboration with shared repository. See Working collaboratively on project items. | |
Control and orchestrate data flows and data integration with master Jobs. See Orchestration (Integration) components. | |
Continuous integration | |
Audit, Job compare, impact analysis, testing, and debugging and tuning. See Overview of Talend Project Audit, Comparing Jobs, Impact analysis, Testing Jobs and Services using test cases, and Running a Job in Java Debug mode. | |
Distant run and parallelization. See Running a Job remotely and Using parallelization to optimize Job performance. | |
Dynamic schema, re-usable Joblets, and reference projects. See Dynamic schema, Designing a Joblet, and Defining project references. | |
Wizards and interactive data viewer. See Managing metadata in Talend Studio. | |
Versioning. See Managing Job and Route versions. | |
Export and execute standalone Jobs in runtime environments | |
Change Data Capture (CDC). See Change Data Capture (CDC). | |
Automatic documentation. See Documenting a Job or a Route. | |
Controlled patch management. See Updating Talend Studio. | |
Support for Hadoop technologies. See HBase, HCatalog, HDFS and Hive. | |
Support for Apache Spark Streaming. See How a Talend Job for Apache Spark works. | |
Support for Apache Spark Batch. See How a Talend Job for Apache Spark works. | |
Support for Spark Universal. See Spark Universal support for Hadoop distributions in Talend Studio. | |
Support for Spark on YARN platforms. See AWS EMR, Azure HDInsight, Cloudera, and Google Dataproc. | |
Support for serverless platforms. See Databricks, Delta Lake, and Azure Synapse Analytics with Spark pools. | |
Support for dynamic distributions. See Dynamic support for Hadoop distributions in Talend Studio. | |
Drag-and-drop routes and SOAP/REST service creation and simulation | |
WS policy-based web services security | |
Deliver and route messages and events based on Enterprise Integration Patterns (EIPs) | |
Reliable messaging backbone based on ActiveMQ | |
Service Locator and Service Registry | |
Command line and scripting tools | |
XML key management specification (XKMS) | |
Build and deploy as an OSGi feature | |
Continuous delivery | |
Visual mapping for hierarchical formats. See What is Talend Data Mapper? | |
Repository manager | |
Build a microservice | |
Data quality features for on-premises licenses
Feature | Available in... |
---|---|
Analyze databases/delimited files: Redundancy, table, column, and correlation analyses. See Creating an analysis. | |
Data profiling and analytics with graphical charts and drill-down data. See Creating an analysis. | |
Pattern library. See Patterns. | |
Fraud pattern detection using Benford's Law. See Fraud detection. | |
Advanced statistics with indicator thresholds. See Indicators. | |
Analyze databases/delimited files: Semantic-aware analyses. See Steps to use semantic-aware analysis. | |
Semantic discovery with automatic detection of patterns. See Exploring semantic categories of data columns. | |
Cleanse, mask, and match data on Spark and Hadoop. See the Data quality components for Apache Spark. | |
Profile Big Data. See Profiling Big Data. | |
Generate reports on analyses. See What are reports?. | |
Store the analyses and reports executed in Talend Studio in the data quality data mart. See Data quality data mart. | |
Machine learning for data matching and deduplication. See the Machine Learning components. | |
Data privacy with masking and encryption. See the Data privacy components. | |
Data matching. See Data matching with Talend tools. | |
Standardize data. See the Data quality components. | |
Comprehensive survivorship. See the tRuleSurvivorship component. | |