Talend Data Catalog in active-passive cluster mode
With a Talend Data Catalog Advanced or Advanced Plus license edition, you can install a two-server, active-passive configuration that relies on a distributed database to make the product highly available.
Clustering is the process of grouping together a set of similar physical systems in order to ensure a level of operational continuity and minimize the risk of unplanned downtime, in particular by taking advantage of failover features.
Failover allows you to automatically switch to a secondary server if the primary server is down or temporarily unreachable.
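As a rough illustration of the failover decision, the following Python sketch declares the primary server down only after several consecutive failed probes. It is not part of Talend Data Catalog: the function names, the TCP probe, and the attempt and delay values are assumptions chosen for this example, and a real high availability product applies its own detection rules.

```python
import socket
import time
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def should_fail_over(
    probe: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 5.0,
) -> bool:
    """Return True only if every one of `attempts` probes fails.

    A single successful probe means the primary is reachable,
    so no failover is requested.
    """
    for attempt in range(attempts):
        if probe():
            return False
        # Wait between probes, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return True
```

For example, `should_fail_over(lambda: tcp_probe("tdc-primary.example.com", 8080))` would probe a primary server three times before requesting a switch. The host name and port are placeholders, not values taken from the product.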
Architecture of Talend Data Catalog in active-passive cluster mode
The following diagram illustrates the architecture behind Talend Data Catalog when set up in cluster mode.
This architecture is composed of several functional blocks:
- Two Talend Data Catalog application servers are installed on different machines. Each server instance hosts an identical Apache Tomcat server installation and resides on a shared file server. Only one server, known as the active server, runs at a time. The other server is passive and does not access the shared file server.
You can get a license that works for both servers by providing two HostInfo.xml files, one for each server, in your license request.
- All instances of the application server are connected to the distributed database.
For more information, refer to your corresponding database vendor documentation.
- Third-party high availability software is installed on each instance. This software detects when the primary server is down and starts the secondary server. Before starting the secondary server, the high availability system must unlock all the files in the data directory.
This feature is not provided by Talend and needs to be implemented separately.
- A shared file server is implemented to store and share all application data, including the data directory and log files, between the instances. You can define the data directory with the M_DATA_DIRECTORY parameter in the <TDC_HOME>/conf/conf.properties file or with the Data Directory field in the Setup utility.
The Talend Data Catalog server locks files in the data directory when it accesses them and unlocks them when it is done. If the primary server still holds locks on some files when it goes down, the secondary server fails to start because it must access these files. You can implement a script to unlock the files in the data directory before starting the secondary server.
This feature is not provided by Talend and needs to be implemented separately.
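The takeover sequence described above (locate the data directory, unlock its files, then start the secondary server) could be sketched in Python as follows. This is a minimal sketch, not a Talend-provided tool: it assumes that conf.properties uses plain key=value lines, and `release_locks` and `start_server` are hypothetical callables that you would implement for your own file server and service manager.

```python
from pathlib import Path
from typing import Callable, Optional


def read_data_directory(conf_file: Path) -> Optional[str]:
    """Return the M_DATA_DIRECTORY value from a properties file, if present.

    Assumption: simple key=value lines. Escapes and line continuations
    allowed by the Java properties format are not handled here.
    """
    for raw in conf_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "M_DATA_DIRECTORY":
            return value.strip()
    return None


def take_over(
    conf_file: Path,
    release_locks: Callable[[str], None],
    start_server: Callable[[], None],
) -> str:
    """Unlock the data directory, then start the secondary server.

    `release_locks` and `start_server` are site-specific hooks
    (hypothetical names); how locks are cleared depends on the
    shared file server in use.
    """
    data_dir = read_data_directory(conf_file)
    if data_dir is None:
        raise ValueError(f"M_DATA_DIRECTORY not found in {conf_file}")
    # Locks must be released before the server starts, otherwise startup fails.
    release_locks(data_dir)
    start_server()
    return data_dir
```

Passing the two hooks as parameters keeps the ordering (unlock first, start second) in one place while leaving the environment-specific steps to your own scripts.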