Most Frequently Asked SAS Interview Questions
1. Compare SAP BO with SAS BI.
2. Explain the SUBSTR function.
3. Define the TRANSLATE function.
4. Explain PROC SORT.
5. Describe PROC UNIVARIATE.
6. Elucidate the APPEND procedure.
7. Explain the BMDP procedure.
8. Define RUN-group Processing.
9. Describe BY-group Processing.
10. What does the CALENDAR procedure do?
SAS is among the most popular tools for data analytics today. Not only is it easy to learn, but it also provides a familiar option (PROC SQL) for those who have prior knowledge of SQL. In this SAS Interview Questions blog, we have grouped the frequently asked SAS questions into three difficulty levels. Use this SAS Interview Questions and Answers blog as a guide while preparing for your job interview:
SAS Interview Questions and Answers for Freshers
1. Compare SAP BO with SAS BI.
| Criteria | SAP BO | SAS BI |
|---|---|---|
| Why deploy? | High-level visualization, customer-friendly | Quick data integration with diverse sources |
| Presentation | Excellent | Average |
| Ad-hoc analysis | Excellent | Average |
| Mobile BI | Good | Excellent |
| Analytics | Predictive analytics | Easy analytics |
| Application | Frontend suite to sort, view, and analyze BI data | Combines BI and Analytics to deliver enterprise-grade data |
2. Define the TRANSLATE function.
The TRANSLATE function replaces specific characters in a character string with other characters that the user specifies, one character at a time.
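As a minimal sketch (the variable names are hypothetical), the example below replaces every period in a value with a hyphen. Note that in TRANSLATE the replacement characters ("to") come before the characters being replaced ("from"):

```sas
data _null_;
   phone = '555.123.4567';
   /* TRANSLATE(source, to, from): replace every '.' with '-' */
   dashed = translate(phone, '-', '.');
   put dashed=;   /* writes dashed=555-123-4567 to the log */
run;
```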
3. Explain the SUBSTR function.
The SUBSTR function extracts a substring from a character value. When it is used on the left side of an assignment, it replaces part of the contents of a character variable.
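Both uses of SUBSTR can be sketched as follows, assuming a hypothetical variable named code:

```sas
data _null_;
   code = 'NY-10001';
   /* Right side of an assignment: extract 5 characters starting at position 4 */
   zip = substr(code, 4, 5);
   /* Left side of an assignment: overwrite the first 2 characters */
   substr(code, 1, 2) = 'NJ';
   put zip= code=;   /* writes zip=10001 code=NJ-10001 to the log */
run;
```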
4. Explain PROC SORT.
PROC SORT sorts the observations of a SAS data set by the values of one or more variables so that the sorted data set can be used in later steps.
5. Elucidate the APPEND procedure.
The term 'append' means adding at the end.
In SAS, the APPEND procedure adds the observations of one SAS data set to the end of another SAS data set.
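A small sketch of PROC APPEND, using two hypothetical data sets (work.master and work.new_rows) that have the same variables:

```sas
data work.master;
   input id name $;
   datalines;
1 Ann
2 Bob
;
run;

data work.new_rows;
   input id name $;
   datalines;
3 Cara
;
run;

/* Add the observations of NEW_ROWS to the end of MASTER */
proc append base=work.master data=work.new_rows;
run;
```

After the step runs, work.master holds three observations. Only the DATA= data set is read, which is why PROC APPEND is usually cheaper than rebuilding the base data set with a SET statement.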
6. Describe PROC UNIVARIATE.
PROC UNIVARIATE is used for elementary numeric analysis of a variable; it produces descriptive statistics and examines how the data is distributed.
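For illustration, the sketch below runs PROC UNIVARIATE on the HEIGHT variable of SASHELP.CLASS, a sample data set that ships with SAS; the choice of data set and variable is only an assumption for the example:

```sas
proc univariate data=sashelp.class;
   var height;
   histogram height / normal;   /* overlay a fitted normal curve */
run;
```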
7. Explain the BMDP procedure.
The BMDP procedure is used to call a BMDP program from a SAS program in order to analyze data in a SAS data set.
8. Define RUN-group Processing.
RUN-group processing is used to submit a PROC step using the RUN statement without ending the procedure.
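PROC DATASETS is one procedure that supports RUN-group processing. In this sketch (the data set work.scores is hypothetical), each RUN statement executes the statements submitted so far, and the procedure stays active until QUIT:

```sas
data work.scores;
   x = 1;
run;

proc datasets library=work nolist;
   modify scores;
      label x = 'Test score';
   run;                    /* executes the MODIFY group; the procedure stays active */
   contents data=scores;
   run;                    /* executes the CONTENTS group */
quit;                      /* ends the procedure */
```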
9. Describe BY-group Processing.
BY-group processing uses the BY statement to process data that is ordered, grouped, or indexed according to the values of one or more variables.
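A minimal sketch of BY-group processing in a DATA step, counting the observations in each group of the sample data set SASHELP.CLASS (the output data set names are hypothetical):

```sas
/* BY-group processing in a DATA step needs sorted or indexed input */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

data work.count_by_sex;
   set work.class_sorted;
   by sex;
   if first.sex then n = 0;   /* reset the counter at the start of each BY group */
   n + 1;                     /* sum statement: N is retained across observations */
   if last.sex then output;   /* write one observation per BY group */
   keep sex n;
run;
```

The automatic variables FIRST.SEX and LAST.SEX exist only because of the BY statement; they flag the first and last observation of each group.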
10. What does the CALENDAR procedure do?
The CALENDAR procedure shows data in a monthly calendar format from a SAS data set.
11. What are the functions used for character handling in SAS?
UPCASE and LOWCASE are two of the character functions used for character handling in SAS: UPCASE converts all letters in a value to uppercase, and LOWCASE converts them to lowercase.
12. Explain the BOR function.
The BOR function returns the bitwise logical OR of two arguments.
13. What is the use of the DIVIDE function?
The DIVIDE function returns the result of dividing its first argument by its second argument.
14. What do you mean by CALL PRXFREE Routine?
The CALL PRXFREE routine belongs to the character string matching routines; it frees the memory that was allocated for a Perl regular expression.
15. Explain CALL PRXCHANGE Routine.
The CALL PRXCHANGE routine performs a pattern-matching replacement, i.e., it finds text that matches a Perl regular expression and replaces it.
16. Define the ANYDIGIT function.
The ANYDIGIT function searches a string for the first occurrence of a digit (numeral) and returns the position of that digit. If no digit is found, it returns 0. With an optional parameter, the ANYDIGIT function can begin the search at any given position in the string.
Syntax:
ANYDIGIT(character-value <,start>)
The character-value is any SAS character expression, and the term start is an optional parameter that specifies the position within the string to begin the search.
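The following sketch shows ANYDIGIT with and without the optional start parameter; the text value is made up for the example:

```sas
data _null_;
   text = 'Order A17 shipped in 3 days';
   first_pos = anydigit(text);          /* 8: position of the digit '1' */
   next_pos  = anydigit(text, 10);      /* 22: search starts at position 10 */
   none      = anydigit('no digits');   /* 0: no digit is found */
   put first_pos= next_pos= none=;
run;
```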
17. What do you understand by CALL MISSING Routine?
The CALL MISSING routine assigns missing values to the specified character or numeric variables.
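A short sketch with hypothetical variables, showing that one call can reset character and numeric variables together:

```sas
data _null_;
   name = 'Ann';
   age  = 30;
   call missing(name, age);   /* NAME becomes blank, AGE becomes . */
   put name= age=;
run;
```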
18. Explain the COMPRESS= Data set option.
It specifies how observations are compressed in a new output SAS data set.
19. What do you mean by the ALTER= Data Set option?
It assigns an ALTER password to a SAS file, which prevents users who do not know the password from replacing, deleting, or altering the file.
20. Define Formats.
Instructions used by SAS for writing data values are known as Formats.
Intermediate SAS Interview Questions and Answers
21. How are Variable Formats handled by PROC COMPARE?
PROC COMPARE compares unformatted values. If two matching variables are formatted differently, PROC COMPARE lists the formats of the variables.
22. What is the use of $BASE64X?
The $BASE64X format converts character data into ASCII text by using Base 64 encoding.
23. What are the features of the SAS system?
The SAS system provides features such as IPv6 support, new TrueType fonts, extended time notations, the restart mode, universal printing, the checkpoint mode, and ISO 8601 support.
24. Describe the VFORMATX function.
The VFORMATX function returns the format that is associated with the value of the specified argument.
25. Define the STD function.
The STD function returns the standard deviation of its nonmissing arguments.
26. How can a SAS program be validated?
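A common way to check a SAS program without processing any data is to submit OPTIONS OBS=0 NOREPLACE at the start of the code, so that every step is compiled and run against zero observations, and then to review the log for errors and warnings.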
27. Elucidate the FILECLOSE data set option.
The FILECLOSE= data set option specifies how a tape is positioned when a SAS data set is closed.
28. What is Debugging?
Debugging is the process of finding and correcting errors in program logic; in SAS, this can be done with the help of the DATA step debugger.
29. What does ODS stand for?
ODS stands for the Output Delivery System.
30. What does CDISC stand for?
CDISC stands for Clinical Data Interchange Standards Consortium.
31. Which method is used to copy blocks of data?
The method used for copying blocks of data, instead of one observation at a time, is called the block I/O method.
32. Define the max() function.
The max() function is used to return the largest value.
33. What is the procedure for copying an entire library?
To copy an entire library, we use the COPY statement of PROC DATASETS (or PROC COPY) and specify an input library with IN= and an output library with OUT=.
34. What is the use of the SYSRC function?
The SYSRC function returns a system error number.
35. Explain SAS. What are the functions it performs?
SAS, i.e., Statistical Analysis System, is a combined set of software solutions that helps users analyze data.
- It can change, manipulate, analyze, and retrieve data.
- With SAS, numerical analysis can be done.
- SAS offers several tools for writing programs that analyze data and create reports.
- We get quality data analytics with SAS.
36. Describe the basic structure of a SAS program.
A SAS program consists of:
- A DATA step, which retrieves and manipulates data
- A PROC step, which analyzes the data and reports on it
37. What is DATA Step?
The main function of a DATA step is to create SAS data sets by manipulating data.
38. What is PDV?
Program Data Vector (PDV) is the area of memory where SAS builds a data set, one observation at a time. When a DATA step reads raw data, an input buffer is created to hold each record; the data values are then read from the buffer and assigned to their respective variables in the PDV.
39. In SAS, which statement does not perform automatic conversions in comparisons?
The WHERE statement does not perform automatic type conversions in comparisons, because a WHERE expression operates on variables as they exist in the data set.
40. What is the difference between the NODUPKEY and NODUP options?
The NODUP option checks for and removes observations that are identical across all variables. The NODUPKEY option, on the other hand, compares only the BY variable values; when it finds observations with the same BY values, it keeps the first one and eliminates the rest.
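The difference can be sketched on a small hypothetical data set; NODUPRECS is the documented spelling of the option, and NODUP is its alias:

```sas
data work.visits;
   input id visit $;
   datalines;
1 A
1 A
1 B
2 A
;
run;

/* NODUPKEY: keep the first observation for each value of ID (2 observations) */
proc sort data=work.visits out=work.one_per_id nodupkey;
   by id;
run;

/* NODUPRECS: drop an observation only when it is identical to the
   previous one written to the output (3 observations) */
proc sort data=work.visits out=work.no_dup_rows noduprecs;
   by id;
run;
```

Because NODUPRECS compares each observation only with the previous one, identical observations that are not adjacent after sorting can survive; sorting by all variables avoids that.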
Advanced SAS Interview Questions and Answers For Experienced
41. What is the use of PROC SUMMARY?
PROC SUMMARY computes the same descriptive statistics as PROC MEANS, but it does not display output by default. To display the output, we have to specify the PRINT option.
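As a sketch on the sample data set SASHELP.CLASS (the output data set name is hypothetical), the first step displays the statistics with PRINT, and the second writes them to a data set instead:

```sas
/* PRINT makes PROC SUMMARY display its statistics */
proc summary data=sashelp.class print;
   var height weight;
run;

/* More commonly, the statistics are written to an output data set */
proc summary data=sashelp.class nway;
   class sex;
   var height weight;
   output out=work.class_stats mean= / autoname;
run;
```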
42. What does PROC GLM do?
PROC GLM fits general linear models. Its uses include analysis of variance, analysis of covariance, multivariate analysis of variance, and repeated-measures analysis of variance.
43. What are PROC PRINT and PROC CONTENTS used for?
PROC PRINT lists the values of some or all variables in a SAS data set. PROC CONTENTS describes the structure of the data set (variable names, types, lengths, and so on) rather than the data values.
44. What is SAS Informat?
Informat is an instruction that SAS uses to read data values. It is used to read or input data from the external files.
45. What does the CATX function do?
The CATX function removes leading and trailing blanks from its arguments, inserts a delimiter between them, and returns the concatenated character string.
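A minimal sketch of CATX, using hypothetical variables that carry extra blanks:

```sas
data _null_;
   first = '  Ada ';
   last  = ' Lovelace  ';
   /* The first argument is the delimiter; blanks around the items are removed */
   full = catx(',', last, first);
   put full=;   /* writes full=Lovelace,Ada to the log */
run;
```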
46. Explain the use of PROC GPLOT.
PROC GPLOT produces plots of one variable against another; the PROC GPLOT statement identifies the data set that contains the plot variables. It has more options than PROC PLOT and, therefore, can create more colorful and fancier graphics.
47. What do the PUT and INPUT functions do?
- INPUT function: Character values are converted into numeric values by reading them with an informat.
- PUT function: Numeric values are converted into character values by writing them with a format.
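Both conversions can be sketched as follows; the variable names and values are made up for the example:

```sas
data _null_;
   char_amount = '1,234.50';
   /* INPUT reads a character value with an informat and returns a number */
   num_amount = input(char_amount, comma10.);
   /* PUT writes a value with a format and returns a character string */
   char_date = put('15JAN2024'd, date9.);
   put num_amount= char_date=;   /* num_amount=1234.5 char_date=15JAN2024 */
run;
```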
48. How to sort in descending order?
By using the DESCENDING keyword in the PROC SORT code, we can sort in descending order.
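For example, assuming the sample data set SASHELP.CLASS, the DESCENDING keyword is placed before the BY variable it applies to:

```sas
proc sort data=sashelp.class out=work.by_height;
   by descending height;   /* tallest first */
run;
```

DESCENDING applies only to the variable that follows it, so it must be repeated for each BY variable that should be sorted in descending order.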
49. What is the difference between VAR B1-B3 and VAR B1--B3?
A single hyphen specifies consecutively numbered variables. A double hyphen specifies all the variables between the two named variables, in the order in which they are stored in the data set.
Example:
Data Set: ID NAME B1 B2 C1 B3
- B1-B3 would return B1 B2 B3
- B1--B3 would return B1 B2 C1 B3
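The two kinds of variable list can be tried on a one-row data set with the same variable order as in the example (work.demo is a hypothetical name):

```sas
data work.demo;
   input id name $ b1 b2 c1 b3;
   datalines;
1 Ann 10 20 30 40
;
run;

proc print data=work.demo;
   var b1-b3;    /* numbered range: B1 B2 B3 */
run;

proc print data=work.demo;
   var b1--b3;   /* positional range: B1 B2 C1 B3 */
run;
```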
50. What is the basic syntax style in SAS?
Important points for running a SAS program are:
- A DATA statement names our data set
- An INPUT statement describes the names of the variables in our data set
- Every statement ends with a semicolon (;)
- Words within a statement are separated by spaces
51. What are the special Input Delimiters?
The special input delimiter options are DLM= (also written DELIMITER=) and DSD; both are specified in the INFILE statement.
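A sketch of both options with in-stream data (the data set names and values are hypothetical):

```sas
data work.people;
   /* DSD: comma is the default delimiter, two consecutive delimiters mean
      a missing value, and quotation marks around a value are removed */
   infile datalines dsd;
   input name :$20. city :$20. age;
   datalines;
"Smith, Ann",Boston,34
Bob,,41
;
run;

data work.pipe;
   /* DLM=: use a vertical bar as the delimiter instead of a blank */
   infile datalines dlm='|';
   input name $ age;
   datalines;
Ann|34
Bob|41
;
run;
```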
52. What is the difference between a format and an informat?
- Format: A format is an instruction for writing data, e.g., WORDDATE18. and WEEKDATEw.
- Informat: An informat is an instruction for reading data, e.g., COMMAw.d, DOLLARw.d, MMDDYYw., DATEw., TIMEw., and PERCENTw.d
53. Describe any one SAS function.
TRIM: TRIM removes the trailing blanks from a character expression.
Example:
Str1 = 'my';
Str2 = 'dog';
Result = TRIM(Str1) || Str2;
Result = 'mydog'
54. What is PDV, and what are its functions?
Program Data Vector (PDV) is a logical area in memory.
- SAS creates a data set, one observation at a time.
- An input buffer is created at compilation time to hold a record from an external file.
- The PDV is created after the input buffer.
- SAS builds each observation of the data set in the PDV area of memory.
55. Distinguish between SAS, Stata, and SPSS.
Each package offers its own unique strengths and weaknesses. As a whole, SAS, Stata, and SPSS form a set of tools that can be used for a wide variety of statistical analyses. With Stat/Transfer, it is very easy to convert data files from one package to another in just a matter of seconds or minutes.
Therefore, there can be quite an advantage switching from one analysis package to another depending on the nature of our problem. For example, if we are performing analysis using mixed models, we might choose SAS, but if we are doing logistic regression we might choose Stata. Moreover, if we are doing an analysis of variance, then we might choose SPSS.
If we are frequently performing statistical analysis, it is strongly recommended to consider making each one of these packages part of our toolkit for data analysis.
56. What are the uses of SAS?
SAS/ETS software provides tools for a wide variety of applications in business, government, and academia. Major uses of SAS/ETS procedures are economic analysis, forecasting, economic and financial modeling, time-series analysis, and manipulation of time-series data.
The common theme relating to many applications of the software is time-series data. SAS/ETS software is useful whenever it is necessary to analyze or predict processes that take place over time or to analyze models that involve simultaneous relationships.
Although SAS/ETS software is most closely associated with business, finance, and economics, time-series data also arise in many other fields. SAS/ETS software is useful whenever time dependencies, simultaneous relationships, or dynamic processes complicate data analysis. For example, an environmental quality study might use SAS/ETS software’s time-series analysis tools to analyze pollution emissions data. A pharmacokinetic study might use SAS/ETS software’s features for nonlinear systems to model the dynamics of drug metabolism in different tissues.
57. How do we create a SAS data set with Compressed Observations?
To create a compressed SAS data set, we use the COMPRESS=YES option as an output DATA set option or in an OPTIONS statement. Compressing a data set reduces its size by reducing repeated consecutive characters or numbers to 2-byte or 3-byte representations.
To uncompress observations, we must use a DATA step to copy the data set and use the option COMPRESS=NO for the new data set.
The advantages of using a compressed SAS data set are reduced storage requirements and fewer input/output operations when reading from and writing to the data set during processing.
The disadvantages include not being able to use the SAS observation number to access an observation. The CPU time required to prepare compressed observations for input/output operations is increased because of the overhead of compressing and expanding the observations. We have to remember that if there are few repeated characters, a data set can occupy more space in the compressed form than in the uncompressed form, due to the higher overhead per observation. For more details on SAS compression, see SAS Language: Reference, Version 6, First Edition, Cary, NC: SAS Institute Inc., 1990.
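Compressing and uncompressing can be sketched as follows, using the sample data set SASHELP.CLASS as input; the output data set names are hypothetical:

```sas
/* Create a compressed copy by using the data set option */
data work.class_small (compress=yes);
   set sashelp.class;
run;

/* Uncompress by copying the data set with COMPRESS=NO */
data work.class_plain (compress=no);
   set work.class_small;
run;

/* Or request compression for data sets created later in the session */
options compress=yes;
```

The data set option takes precedence over the system option, so a single data set can be left uncompressed even when OPTIONS COMPRESS=YES is in effect.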
58. How can we minimize the space requirement of a huge data set in SAS for Windows?
When we are working with large data sets, we can do the following steps to reduce space requirements:
- Split the huge data set into smaller data sets
- Clean up our working space as much as possible at each step
- Use data set options (keep= or drop=) or statements (keep or drop) to limit to only the variables needed
- Use a subsetting IF statement or the OBS= option to limit the number of observations
- Use the WHERE= data set option, the WHERE statement, or an index to limit the number of observations read in a PROC step or a DATA step
- Use length to limit the bytes of variables
- Use a _null_ data set name when we don’t need to create a data set
- Compress the data set using system options or data set options (COMPRESS=yes or COMPRESS=binary)
- Use SQL to do merge, summary, sort, etc. rather than a combination of PROC Step and DATA Step with temporary data sets.
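A few of these steps can be combined in one sketch; it assumes the sample data set SASHELP.CLASS as input, and the output data set name and the cutoff value are hypothetical:

```sas
/* Read only the needed variables and observations, and compress the result */
data work.tall (compress=yes);
   set sashelp.class (keep=name sex height where=(height > 60));
run;

/* Check logic on a few observations without creating a data set */
data _null_;
   set sashelp.class (obs=5);
   put name= height=;
run;
```

Applying KEEP= and WHERE= on the input data set, rather than in later statements, means the unwanted variables and observations are never brought into the step at all.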