About SAS Data Sets
A SAS data set is a SAS file that holds data. Data must be in the form of a SAS data set before SAS can process it. A SAS data set contains data values organized as a table of observations and variables.
Rows are called observations and columns are called variables in a SAS data set.
Rules for valid SAS data set names:
- 1-32 characters long
- Must begin with a letter or an underscore
- Can continue with any combination of letters, digits, or underscores
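A few DATA statements can make these rules concrete. This is only a sketch: the data set names and the variable x are made up for illustration, and the invalid names are left in comments so the program still runs.

```sas
/* Valid: begins with a letter, then letters, digits, and underscores */
data work.sales_2024;
    x = 1;
run;

/* Valid: begins with an underscore */
data work._temp;
    x = 1;
run;

/* Invalid examples, commented out so the program still runs:      */
/*   data work.2024sales;   begins with a digit                    */
/*   data work.my-data;     contains a hyphen                      */
```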
A SAS data set consists of two parts:
- Descriptor portion
- Data portion
Descriptor portion:
The descriptor portion of a SAS data set contains attributes of the data set and of its variables. For the data set, it includes:
- Number of observations
- Observation length
- Date and time that the data set was created / last modified
- Name of the data set
- Other attributes
For variables, the descriptor portion contains attributes such as name, type, length, format, and label.
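One common way to look at the descriptor portion is PROC CONTENTS. The sketch below assumes the sample data set SASHELP.CLASS is available, as it is in most SAS installations; any other data set name can be used instead.

```sas
/* Print the descriptor portion: data set attributes first,   */
/* then one row per variable with its type, length, format,   */
/* and label. SASHELP.CLASS is assumed to be installed.       */
proc contents data=sashelp.class;
run;
```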
Data Portion:
The data portion is a collection of data values arranged in the form of a table.
Observations:
Rows are called observations in a SAS data set. An observation is a collection of data values that usually relate to a single object.
Variables:
Columns are called variables in a SAS data set. A variable is a collection of values that describe a particular characteristic.
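As a small illustration, the following sketch builds a data set with three observations (rows) and three variables (columns). The data set name, variable names, and values are invented for the example.

```sas
/* Each line after DATALINES becomes one observation.         */
/* name is a character variable ($); age and salary are       */
/* numeric variables.                                         */
data work.employees;
    input name $ age salary;
    datalines;
Asha 29 52000
Ben 41 61000
Chen 35 58000
;
run;

/* Show the data portion as a table */
proc print data=work.employees;
run;
```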
Missing Value:
If a value is unknown for a particular observation, a missing value is recorded:
- “.” (dot/period) indicates a missing numeric value
- “ ” (blank) indicates a missing character value
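The two kinds of missing value can be sketched as follows. The data set and variable names are hypothetical; the MISSING function returns 1 when its argument is missing and 0 otherwise, for both numeric and character arguments.

```sas
/* Second observation has a missing numeric value (.)         */
/* and a missing character value (blank).                     */
data work.people;
    length name $ 8 city $ 12;
    name = 'Ann'; age = 34; city = 'Pune'; output;
    name = 'Bob'; age = .;  city = ' ';    output;
run;

/* Flag missing values in each observation */
data work.flags;
    set work.people;
    age_missing  = missing(age);
    city_missing = missing(city);
run;

proc print data=work.flags;
run;
```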
Null Data Sets
If you want to execute a DATA step but do not want to create a SAS data set, specify the keyword _NULL_ as the data set name:
data _NULL_;
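A typical use is to compute something and write it to the log without keeping a data set. A minimal sketch, with a made-up variable name:

```sas
/* No data set is created; PUT writes the result to the SAS log */
data _null_;
    total = 2 + 3;
    put total=;
run;
```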
Automatic Naming:
If you specify neither a data set name nor the keyword _NULL_ in the DATA statement, SAS automatically creates data sets named DATA1, DATA2, and so on in the WORK or USER library. This is called the DATAn naming convention:
data;
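The following sketch runs two unnamed DATA steps in a row. Which number SAS assigns depends on what already exists in the session, so the example uses the automatic macro variable SYSLAST and the special name _LAST_ rather than assuming a particular DATAn name. The variable x is made up.

```sas
/* First unnamed step: SAS picks the next free DATAn name */
data;
    x = 1;
run;

/* Second unnamed step: gets the following DATAn name */
data;
    x = 2;
run;

/* SYSLAST holds the name of the most recently created data set */
%put Most recent data set: &syslast;

/* _LAST_ refers to that same data set */
proc print data=_last_;
run;
```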