
Top DataStage Interview Questions And Answers

DataStage Interview Questions
Here are the top 51 objective-type sample DataStage interview questions, with their answers given just below each of them. These sample questions were framed by experts from Intellipaat who provide DataStage training, to give you an idea of the type of questions that may be asked in an interview. We have taken full care to give correct answers to all the questions. Do comment your thoughts. Happy job hunting!

Top Answers to DataStage Interview Questions

1. DataStage Characteristics
Criteria                       Result
Support for Big Data Hadoop    Access Big Data on a distributed file system; JSON support and a JDBC integrator
Ease of use                    Improved speed, flexibility, and efficacy for data integration
Deployment                     On-premise or on the cloud, as the need dictates
2. Explain What is IBM DataStage?

DataStage is an extract, transform, and load (ETL) tool that is part of the IBM InfoSphere suite. It is used for working with large data warehouses and data marts, that is, for creating and maintaining such data repositories.

3. How is a DataStage source file filled?

We can develop an SQL query, or we can use a Row Generator stage, through which we can fill the source file in DataStage.
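
The Row Generator idea can be sketched outside DataStage in plain Python (the column names and row count here are made up for illustration; this is not DataStage code):

```python
import csv
import io

# Row Generator-style fill: synthesize sample rows to populate a source
# file when no real data is available. Columns "id"/"name" are hypothetical.
buf = io.StringIO()  # stands in for the source file on disk
writer = csv.writer(buf)
writer.writerow(["id", "name"])
for i in range(1, 4):
    writer.writerow([i, f"name_{i}"])

print(buf.getvalue())
```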

4. How is merging done in DataStage?

Merging is done when two or more tables are to be combined based on their primary key column; this is the basis for merging in DataStage.
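
Conceptually, the key-based merge works like this Python sketch (the table and column names are invented for illustration; this is not DataStage code):

```python
# Two "tables" (lists of dicts) sharing the primary key column "id".
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 1, "total": 250}, {"id": 2, "total": 90}]

# Index one table by its key, then combine matching rows on that key,
# which is what a key-based merge amounts to.
orders_by_id = {row["id"]: row for row in orders}
merged = [{**c, **orders_by_id.get(c["id"], {})} for c in customers]
print(merged)
```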

5. What is a data file and a descriptor file?

As their names indicate, these two files serve different purposes in DataStage. The descriptor file contains all the information, or description, about the data, while the data file just contains the data itself.

6. How is DataStage different from Informatica?

DataStage and Informatica are both powerful ETL tools, but there are a few differences between the two. DataStage has parallelism and partitioning concepts for node configuration, whereas Informatica does not support parallelism in node configuration. DataStage is also simpler to use than Informatica.

7. What is a Routine in Data Stage?

A Routine is a collection of functions defined by the DataStage Manager within the tool. There are basically three types of Routines in DataStage: Job Control Routines, Before/After Subroutines, and Transform Functions.

8. What is the process for removing duplicates in DataStage?

Duplicates in DataStage can be removed using the Sort stage. When running the sort, you need to set the option that allows duplicates to false.
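
The sort-based deduplication can be illustrated in Python (made-up key/value rows; a stand-in for a Sort stage with the allow-duplicates option set to false, not DataStage code):

```python
# Rows as (key, value) pairs; duplicates share the same key.
rows = [("A", 1), ("B", 2), ("A", 3), ("B", 4)]

rows.sort(key=lambda r: r[0])  # sort on the key column (stable sort)
deduped = []
for row in rows:
    if not deduped or deduped[-1][0] != row[0]:
        deduped.append(row)    # keep only the first row per key
print(deduped)  # [('A', 1), ('B', 2)]
```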

9. What is the difference between Join, Merge & Lookup stage?

The fundamental difference between these three stages is the amount of memory they use; how they treat the input requirements and unmatched records is also a differentiating factor. In terms of memory usage, Join and Merge expect sorted inputs and use relatively little memory, whereas the Lookup stage loads its reference data into memory and can therefore consume a large amount of it when the reference dataset is big.
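
Why the Lookup stage is memory-bound can be seen in this Python sketch: the reference data is held in memory up front (the tier data here is invented; this is an analogy, not DataStage code):

```python
# The entire reference dataset is held in memory, like a Lookup stage
# loading its reference link; cheap for small sets, costly for huge ones.
reference = {1: "Gold", 2: "Silver"}            # hypothetical reference data
stream = [{"id": 2}, {"id": 1}, {"id": 3}]      # rows streaming through

for row in stream:
    # Non-matching rows get a default, akin to a lookup-failure rule.
    row["tier"] = reference.get(row["id"], "Unknown")
print(stream)
```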

10. What is QualityStage in DataStage?

QualityStage is used for cleansing data with the DataStage tool. It is a client-server software tool that is provided as part of IBM Information Server.

11. What is job control in DataStage?

Job control is used for controlling a job or executing multiple jobs in a parallel manner. It is deployed using the Job Control Language within the IBM DataStage tool.

12. How to do DataStage jobs performance tuning?

First, you have to select the right configuration file. Then you need to select the right partitioning and buffer memory. You have to handle the sorting of data and the handling of null values. Try to use Modify, Copy, or Filter stages instead of the Transformer, and reduce the propagation of unnecessary metadata between the various stages.

13. What is a repository table in DataStage?

The repository is another name for a data warehouse; it can be centralized or distributed. The repository table is used for answering ad hoc, historical, analytical, or complex queries.

14. Compare the massive parallel processing and symmetric multiprocessing?

In massively parallel processing (MPP), many computers are present in the same chassis, while in symmetric multiprocessing (SMP) many processors share the same hardware resources. MPP is called "shared nothing" because nothing is shared between the various computers, and it is faster than SMP.

15. How can you kill the DataStage job?

To kill a DataStage job, you need to kill its individual process ID; this ensures that the job is killed.

16. How do you compare the validated OK and Compiled processes in DataStage?

The Compiled step ensures that the important stage parameters are mapped and correct, and it creates an executable job. In the Validated OK step, we make sure that the connections are valid.

17. Explain the feature of data type conversion in DataStage?

If you want to do data type conversion in DataStage, you can use the data conversion function. For this to execute successfully, you need to ensure that the input and output to and from the operator are the same and that the record schema is compatible with the operator.
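
The compatibility requirement can be illustrated with a minimal Python conversion helper (a hypothetical sketch; DataStage's own conversion functions differ):

```python
def to_int(value):
    """Convert a string field to an integer, rejecting incompatible input."""
    try:
        return int(value)
    except ValueError:
        return None  # schema-incompatible value: conversion is rejected

assert to_int("42") == 42      # compatible input converts cleanly
assert to_int("abc") is None   # incompatible input is rejected, not coerced
```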

18. What is the significance of the exception activity in DataStage?

Whenever an unfamiliar error occurs while executing the job sequencer, all the stages after the Exception activity are run. This is what makes the Exception activity so important in DataStage.

19. What are the various types of Lookups in DataStage?

There are different types of Lookups in DataStage: Normal, Sparse, Range, and Caseless.

20. When do you use a parallel job and a server job?

Using a parallel job or a server job depends on the processing need, functionality, time to implement, and cost. A server job usually runs on a single node, executes on the DataStage Server Engine, and handles small volumes of data. A parallel job runs on multiple nodes, executes on the DataStage Parallel Engine, and handles large volumes of data.

21. What is usage analysis in DataStage?

If you want to know whether a certain job is part of a sequence, right-click the job in the Manager and choose Usage Analysis.

22. How do you find the number of rows in a sequential file?

For counting the number of rows in a sequential file, we use the @INROWNUM variable.
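
Outside a Transformer, the same running count can be sketched in Python (`io.StringIO` stands in for the flat file; in DataStage the count comes from the system variable @INROWNUM):

```python
import io

# Stand-in for a sequential (flat) file with three rows.
seq_file = io.StringIO("row1\nrow2\nrow3\n")

row_count = sum(1 for _ in seq_file)  # one increment per input row
print(row_count)  # 3
```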

23. What is the difference between a Sequential file and a Hash file?

A Hash file is based on a hash algorithm and can be used with a key value, while a Sequential file does not have any key column value. A Hash file can be used as a reference for a Lookup, whereas a Sequential file cannot. Due to the presence of the hash key, a Hash file is also faster to search than a Sequential file.
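
The difference in searchability can be shown with a Python dict versus a list (the customer keys here are invented; an analogy, not DataStage code):

```python
# Sequential storage: rows in arrival order, no key column to seek on.
sequential = [("C10", "Ann"), ("C20", "Bob")]

# Hashed storage: the same rows keyed on the key column, which is why
# a Hash file can serve as a Look Up reference.
hashed = dict(sequential)

# Sequential file: must scan row by row to find a record.
found = next(name for key, name in sequential if key == "C20")
# Hash file: direct keyed access.
assert hashed["C20"] == found == "Bob"
```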

24. How do you clean a DataStage Repository?

To clean a DataStage repository, go to the DataStage Manager, then to Job in the menu bar, and choose Clean Up Resources. If you want to further remove the logs, go to the respective job and clean its log files.

25. How do you call a Routine in DataStage?

The Routines are stored in the Routine branch of the DataStage Repository. This is where you can create, view or edit all the Routines. The Routines in DataStage could be among the following: Job Control Routine, Before-after Sub-Routine, Transform Function.

26. What is the difference between an Operational DataStage and a Data Warehouse?

An Operational DataStage can be considered a staging area for real-time analysis and user processing; thus, it is a temporary repository. A data warehouse, on the other hand, is used for long-term data storage and holds the complete data of the entire business.

27. What does NLS mean in DataStage?

NLS means National Language Support. It means that the IBM DataStage tool can be used in various languages, including multi-byte character languages like Chinese and Japanese. You can read and write in any language and process it as per the requirement.
