Data for the tMap Job example
Input file and file structure
The input file, California_Clients.csv, lists clients from all over the State of California. It contains the data that will be loaded into the database table.
The file structure, usually called Schema in Talend Studio, includes the following columns:
- First name
- Last name
- Address
- City
First name;Last name;Address;City
Lyndon;Lincoln;644 East 1st Street;GRANADA HILLS
Iyndon;Fillmore;1109 Tanger Blvd;MISSION HILLS
Ronald;Truman;417 Santa Rosa North;SANTA CLARITA
Harry;Carter;1094 El Camino Real;CANYON COUNTRY
Calvin;Johnson;1705 Cabrillo Highway;SUN VALLEY
Benjamin;McKinley;1399 Santa Rosa North;ALISO VIEJO
William;Jefferson;573 Jones Road;TARZANA
Ronald;Washington;1250 San Marcos;WESTLAKE VILLAGE
Theodore;Johnson;957 Cerrillos Road;ANAHEIM
Chester;Monroe;1392 Harbor Dr;STEVENSON RANCH
Ulysses;Truman;367 Carpinteris Avenue;VAN NUYS
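Outside Talend Studio, the semicolon-delimited structure above can be read with a few lines of Python. This is only a sketch: the sample rows are copied from the file excerpt, and the variable names are invented for the example.

```python
import csv
import io

# Three rows copied from the California_Clients.csv excerpt above.
SAMPLE = """First name;Last name;Address;City
Lyndon;Lincoln;644 East 1st Street;GRANADA HILLS
Ronald;Truman;417 Santa Rosa North;SANTA CLARITA
William;Jefferson;573 Jones Road;TARZANA
"""

# The header line supplies the column names; fields are separated by ";".
reader = csv.DictReader(io.StringIO(SAMPLE), delimiter=";")
clients = list(reader)

for client in clients:
    print(client["First name"], client["Last name"], "-", client["City"])
```

Each row becomes a dictionary keyed by the four schema columns, which mirrors how the schema names the fields of every record.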
Output data structure
This scenario loads only the California clients who live in Orange or Los Angeles county into the database table.
The database table structure, listed below, differs slightly from that of the input file, so the data must be transformed before it is loaded.
- Key (key, Type: Integer)
- Name (Type: String, max. length: 40)
- Address (Type: String, max. length: 40)
- County (Type: String, max. length: 40)
Loading this table requires the following mapping:
- The Key column is filled with an auto-incremented integer.
- The Name column is filled with the concatenation of the first and last names.
- The Address column takes its data from the Address column of the input file, converted to upper case before loading.
- The County column is filled with the name of the county where the city is located, looked up in a reference file that also serves to keep only the cities of Orange and Los Angeles counties.
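The four mapping rules can be sketched in plain Python to show what each output column ends up holding. This is not how tMap is configured in Talend Studio; it is an illustration under some assumptions: keys start at 1 and are assigned only to rows that are actually loaded, first and last names are joined with a single space, and the names `map_clients` and `county_by_city` are hypothetical.

```python
from itertools import count

# Hypothetical in-memory stand-ins for the input rows and the reference data.
clients = [
    {"First name": "Lyndon", "Last name": "Lincoln",
     "Address": "644 East 1st Street", "City": "GRANADA HILLS"},
    {"First name": "William", "Last name": "Jefferson",
     "Address": "573 Jones Road", "City": "TARZANA"},
    {"First name": "Theodore", "Last name": "Johnson",
     "Address": "957 Cerrillos Road", "City": "ANAHEIM"},
]
county_by_city = {"GRANADA HILLS": "Los Angeles", "ANAHEIM": "Orange"}


def map_clients(rows, lookup):
    """Apply the four mapping rules; skip rows whose city has no county."""
    keys = count(1)  # assumption: keys start at 1
    output = []
    for row in rows:
        county = lookup.get(row["City"])
        if county is None:
            continue  # city is not in the reference data, so the row is dropped
        output.append({
            "Key": next(keys),
            "Name": f'{row["First name"]} {row["Last name"]}',
            "Address": row["Address"].upper(),
            "County": county,
        })
    return output


loaded = map_clients(clients, county_by_city)
for record in loaded:
    print(record)
```

With these three sample rows, the TARZANA client is dropped because its city has no entry in the lookup, and the two remaining clients receive keys 1 and 2.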
Reference data
Because only clients from Orange and Los Angeles counties should be loaded into the database, a reference file is needed to filter them. The file contains the following city-to-county mappings.
City;County
GRANADA HILLS;Los Angeles
MISSION HILLS;Los Angeles
ALISO VIEJO;Orange
ANAHEIM;Orange
ARCADIA;Los Angeles
The reference file in this Job is named LosAngelesandOrangeCounties.txt.
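To make the filtering role of the reference data concrete, the following Python sketch builds a city-to-county dictionary from the rows shown above and checks two cities from the input file against it. It assumes the reference file uses the same semicolon delimiter and header line as the excerpt; the variable names are illustrative only.

```python
import csv
import io

# Rows copied from the reference file excerpt above.
REFERENCE = """City;County
GRANADA HILLS;Los Angeles
MISSION HILLS;Los Angeles
ALISO VIEJO;Orange
ANAHEIM;Orange
ARCADIA;Los Angeles
"""

# Build a city -> county dictionary from the semicolon-delimited text.
reader = csv.DictReader(io.StringIO(REFERENCE), delimiter=";")
county_by_city = {row["City"]: row["County"] for row in reader}

# Cities taken from the input file excerpt.
for city in ("ANAHEIM", "TARZANA"):
    county = county_by_city.get(city)
    print(city, "->", county if county else "not in reference file, filtered out")
```

A city that is missing from the dictionary has no county to report, which is what excludes its clients from the load.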