Top DataStage Interview Questions And Answers
Here are the top 50 objective-type sample DataStage interview questions, with their answers given just below them. These sample questions are framed by experts from Intellipaat who deliver DataStage training, to give you an idea of the type of questions that may be asked in an interview. We have taken full care to give correct answers for all the questions. Do comment your thoughts. Happy job hunting!
1. Explain DataStage?
DataStage is a tool used to design, develop and execute applications that populate tables in a data warehouse or data marts.
2. Tell how a source file is populated?
We can generate a source file in various ways, such as by writing a SQL query in Oracle or by using a Row Generator stage.
3. Write the command-line functions to import and export DS jobs?
To import DS jobs, dsimport.exe is used, and to export DS jobs, dsexport.exe is used.
4. Differentiate between DataStage 7.5 and 7.0?
In DataStage 7.5, various new stages were added for more robustness and smoother performance, such as the Procedure stage, the Command stage, etc.
5. In DataStage, how can we fix a truncated-data error?
A truncated-data error can be fixed by using the environment variable IMPORT_REJECT_STRING_FIELD_OVERRUN.
6. Define Merge?
Merge combines two or more tables. The tables are merged on the basis of the primary key columns present in both tables.
7. Differentiate between a data file and a descriptor file?
As the names suggest, data files contain the data, and the descriptor file contains information about the data in the data files.
8. Differentiate between DataStage and Informatica?
DataStage has a concept of partitioning and parallelism for node configuration, while Informatica has no such concept of partitioning and parallelism for node configuration. On the other hand, Informatica is more scalable than DataStage, and DataStage is easier to use than Informatica.
9. Explain routines and their types?
Routines are basically groups of functions that are defined in the DS Manager and can be called from the transformer stage. Routines are of three types: parallel routines, server routines and mainframe routines.
10. How can we write parallel routines in DataStage PX?
We can write parallel routines in C or C++. Such routines are also defined in the DS Manager and can be called from the transformer stage.
11. What is the procedure for removing duplicates without the Remove Duplicates stage?
Duplicates can be removed by using the Sort stage with the option Allow Duplicates set to false.
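As a rough analogy in plain Python (not DataStage code), the Sort-stage approach amounts to sorting the rows on the key and keeping only the first row for each key value:

```python
# Hypothetical sketch of a Sort stage with "Allow Duplicates = false":
# sort the rows on the key column, then keep only the first row per key.
def sort_dedup(rows, key):
    seen, out = set(), []
    for row in sorted(rows, key=lambda r: r[key]):
        if row[key] not in seen:   # first occurrence of this key wins
            seen.add(row[key])
            out.append(row)
    return out

rows = [{"id": 2, "v": "b"}, {"id": 1, "v": "a"}, {"id": 2, "v": "c"}]
print(sort_dedup(rows, "id"))  # [{'id': 1, 'v': 'a'}, {'id': 2, 'v': 'b'}]
```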
12. What steps should be taken to improve the performance of DataStage jobs?
In order to improve the performance of DataStage jobs, we have to first establish baselines. Secondly, we should not use only one flow for performance testing. Thirdly, we should work in increments. Then, we should evaluate data skew. Next, we should isolate and solve the problems one by one. After that, we should distribute the file systems to remove bottlenecks, if any. Also, we should not involve the RDBMS at the start of the testing phase. Last but not least, we should understand and assess the available tuning knobs.
13. Compare and contrast the Join, Merge and Lookup stages?
All three differ in the way they use memory, in their input requirements and in how they handle unmatched data. Join and Merge need less memory than the Lookup stage.
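The memory difference can be illustrated with a small Python sketch (an analogy, not DataStage internals): a lookup materializes the whole reference table in memory, while a join/merge only steps through two pre-sorted streams:

```python
# Analogy only: Lookup builds the whole reference table as an in-memory dict,
# which is why it needs more memory than Join/Merge.
def lookup(stream, reference, key):
    ref = {r[key]: r for r in reference}          # entire reference in memory
    return [{**row, **ref[row[key]]} for row in stream if row[key] in ref]

# Join/Merge style: both inputs must be sorted on the key; they are walked
# once, in step, so only the current rows are held in memory at a time.
def merge_sorted(left, right, key):
    out, j = [], 0
    for row in left:
        while j < len(right) and right[j][key] < row[key]:
            j += 1
        if j < len(right) and right[j][key] == row[key]:
            out.append({**row, **right[j]})
    return out
```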
14. Describe the Quality stage?
The Quality stage is also known as the Integrity stage. It assists in integrating various types of data from different sources.
15. Describe job control?
Job control can best be performed by using Job Control Language (JCL). This tool is used to execute multiple jobs simultaneously, without using any kind of loop.
16. Contrast Symmetric Multiprocessing with Massively Parallel Processing?
In Symmetric Multiprocessing (SMP), the hardware resources are shared by the processors, which run a single operating system and communicate through shared memory. In Massively Parallel Processing (MPP), each processor has exclusive access to its own hardware resources. This type of processing is also known as Shared Nothing, since nothing is shared between processors, and it is faster than Symmetric Multiprocessing.
17. Write the steps required to kill a job in DataStage?
To kill a job in DataStage, we have to kill its individual process ID.
18. Contrast validated and compiled jobs in DataStage?
In DataStage, validating a job means executing it: while validating, the DataStage engine checks whether all the necessary properties are provided. While compiling a job, the DataStage engine checks whether all the given properties are valid.
19. How can we perform date conversion in DataStage?
We can use the date conversion functions for this purpose, i.e. Oconv(Iconv(Fieldname, "Existing Date Format"), "Another Date Format").
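The Iconv/Oconv pattern (parse the value in, render it back out) looks like this in Python, using strptime/strftime as stand-ins:

```python
from datetime import datetime

# Python stand-in for Oconv(Iconv(field, "existing"), "target"):
# Iconv parses the external value into an internal date,
# Oconv renders the internal date in the new external format.
def convert_date(value, existing_fmt, target_fmt):
    return datetime.strptime(value, existing_fmt).strftime(target_fmt)

print(convert_date("12/31/2023", "%m/%d/%Y", "%Y-%m-%d"))  # 2023-12-31
```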
20. What is the need for exception activity in DataStage?
All the stages after the exception activity in DataStage are run in case any unknown error occurs while executing the job sequencer.
21. Explain APT_CONFIG in DataStage?
It is the environment variable that is used to identify the *.apt configuration file in DataStage. This file holds the node information, scratch-disk information and disk-storage information.
22. Write the different types of lookups in DataStage?
There are two types of lookups in DataStage: Normal lookup and Sparse lookup.
23. How can we convert a server job to a parallel job?
We can convert a server job into a parallel job by using the Link Collector and IPC stages.
24. Explain repository tables in DataStage?
In DataStage, the repository is another name for a data warehouse. It can be centralized as well as distributed.
25. Describe the OConv() and IConv() functions in DataStage?
In DataStage, the OConv() and IConv() functions are used to convert values from one format to another, i.e. conversions of time, Roman numerals, radix, dates, numeric ASCII, etc. IConv() is mostly used to convert formats for the system to understand, while OConv() is used to convert formats for users to understand.
26. Define Usage Analysis in DataStage?
In DataStage, Usage Analysis is performed in a few clicks: launch the DataStage Manager, right-click on a job, and then select Usage Analysis.
27. How can we find the number of rows in a sequential file?
To count the rows in a sequential file, we can use the system variable @INROWNUM.
28. Contrast a hash file with a sequential file?
The only difference between a hash file and a sequential file is that the hash file stores data based on a hash algorithm and a hash key value, while a sequential file has no key value for storing the data. Thanks to this hash-key feature, searching a hash file is faster than searching a sequential file.
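The speed difference can be sketched in Python: keyed access into a dict mimics a hash file, while a sequential file forces a row-by-row scan:

```python
# Analogy: a hash file resolves a key directly via hashing (O(1) on average),
# while a sequential file has no key and must be scanned row by row (O(n)).
def hash_find(hash_file, key):
    return hash_file.get(key)            # direct keyed access

def sequential_find(seq_file, key):
    for row in seq_file:                 # full scan
        if row[0] == key:
            return row
    return None

hash_file = {10: (10, "x"), 20: (20, "y")}
seq_file = [(10, "x"), (20, "y")]
```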
29. How can we clean the DataStage repository?
We can clean the DataStage repository by using the Clean Up Resources functionality in the DataStage Manager.
30. How can we call a routine in a DataStage job?
We can call a routine from the transformer stage in a DataStage job.
31. Differentiate between an Operational Data Store (ODS) and a data warehouse?
We can say that an ODS is a mini data warehouse. An ODS doesn't hold information for more than one year, while a data warehouse holds detailed information about the entire business.
32. What does NLS stand for in DataStage?
NLS stands for National Language Support. It can be used to incorporate various languages such as French, German and Spanish into the data, as required for processing by the data warehouse.
33. Can you explain how to drop the index before loading data into the target in DataStage?
In DataStage, we can drop the index before loading data into the target by using the Direct Load functionality of the SQL Loader utility.
34. Does DataStage support slowly changing dimensions?
Yes, version 8.5 and above support this feature in DataStage.
35. How can we find bugs in a job sequence?
We can locate bugs in a job sequence by using the DataStage Director.
36. How are complex jobs implemented in DataStage to improve performance?
In order to improve performance in DataStage, it is suggested not to use more than 20 stages in a single job. If you need more than 20 stages, it is advisable to move the extra stages into another job.
37. Name the third-party tools that can be used with DataStage?
The third-party tools that can be used with DataStage are Autosys, TNG and Event Co-ordinator.
38. Describe a project in DataStage?
Whenever we start the DataStage client, we are asked to connect to a DataStage project. A DataStage project contains DataStage jobs, built-in components and DataStage Designer or user-defined components.
39. What types of hash files are there?
There are two types of hash files: the Static Hash File and the Dynamic Hash File.
40. Describe MetaStage?
In DataStage, MetaStage is used to store metadata that is helpful for data lineage and data analysis.
41. Why is the UNIX environment useful in DataStage?
It is useful in DataStage because sometimes one has to write UNIX programs, such as batch scripts for batch processing, etc.
42. Contrast DataStage with DataStage TX?
DataStage is an ETL (Extract, Transform and Load) tool, while DataStage TX is an EAI (Enterprise Application Integration) tool.
43. What do transaction size and array size mean in DataStage?
Transaction size means the number of rows written before a commit is issued on the table. Array size means the number of rows written to or read from the table in one operation.
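Assuming those meanings, the interplay of the two settings can be sketched with Python's sqlite3 module (illustrative only; the table name `t` and the sizes are made up):

```python
import sqlite3

def load_rows(conn, rows, array_size=2, transaction_size=4):
    # Write rows in arrays of `array_size` per call, and commit once every
    # `transaction_size` rows (assumed meanings of the two settings).
    cur, written = conn.cursor(), 0
    for i in range(0, len(rows), array_size):
        batch = rows[i:i + array_size]
        cur.executemany("INSERT INTO t VALUES (?)", batch)
        written += len(batch)
        if written % transaction_size == 0:
            conn.commit()
    conn.commit()                        # flush any remaining rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
load_rows(conn, [(n,) for n in range(10)])
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 10
```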
44. Name the various types of views in the DataStage Director?
There are three types of views in the DataStage Director: the Log view, the Job view and the Status view.
45. What is the use of a surrogate key?
A surrogate key is mostly used for retrieving data faster. It uses an index to perform the retrieval operation.
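A minimal sketch of the idea (the natural-key format below is made up): a surrogate key replaces a wide natural key with a small system-generated integer, which indexes and compares more cheaply:

```python
import itertools

# Hypothetical example: map wide natural keys to compact integer surrogates.
def make_surrogate_keys(natural_keys):
    counter = itertools.count(1)
    return {nk: next(counter) for nk in natural_keys}

keys = make_surrogate_keys(["CUST-US-000123", "CUST-DE-000987"])
print(keys["CUST-US-000123"])  # 1
```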
46. How are rejected rows processed in DataStage?
In DataStage, rejected rows are managed by constraints in the transformer. We can either specify the rejected rows in the properties of a transformer or create temporary storage for rejected rows with the help of the REJECTED command.
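As a sketch in plain Python (not transformer syntax), a constraint with a reject link routes each row to either the output stream or the reject stream:

```python
# Rows that satisfy the constraint go to the output link; the rest go to the
# reject link (analogy for a transformer constraint with a reject link).
def transform(rows, constraint):
    output, rejected = [], []
    for row in rows:
        (output if constraint(row) else rejected).append(row)
    return output, rejected

ok, bad = transform([{"qty": 5}, {"qty": -1}], lambda r: r["qty"] >= 0)
print(ok, bad)  # [{'qty': 5}] [{'qty': -1}]
```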
47. Contrast the ODBC and DRS stages?
The DRS stage is faster than the ODBC stage because it uses native database drivers for connectivity.
48. Describe the Orabulk and BCP stages?
The Orabulk stage is used to load a large amount of data into one target table of an Oracle database. The BCP stage is used to load a large amount of data into one target table of Microsoft SQL Server.
49. Describe the DS Designer?
The DS Designer is used to create the work area and add various links to it.
50. What is the need for the Link Partitioner and Link Collector in DataStage?
In DataStage, the Link Partitioner is used to split data into multiple partitions using certain partitioning methods. The Link Collector is used to gather the data from the many partitions into a single stream and save it in the target table.
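The split-and-gather idea can be sketched in Python, using hash partitioning on an assumed integer key column:

```python
# Link Partitioner analogy: distribute rows across n partitions by key hash.
def partition(rows, key, n):
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts

# Link Collector analogy: gather all partitions back into a single stream.
def collect(parts):
    return [row for part in parts for row in part]

rows = [{"id": n} for n in range(6)]
parts = partition(rows, "id", 2)
print(len(collect(parts)))  # 6
```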