Public Health Disparities Geocoding Project Monograph
Types of SAS Merging
Although there are many ways in which data can be combined using SAS commands,
one of the following five methods is most frequently used: (1) concatenation,
(2) interleaving, (3) one-to-one merge, (4) matched merge, (5) updating [1].
Concatenation is when SAS data sets are stacked on top of each other. This is most
commonly used when additional observations are added to an existing
dataset. Typically the concatenated data sets have the same variables,
although they may have some or none of the variables in common. Concatenation
is done using the SET statement.
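As a minimal sketch of concatenation, the DATA step below stacks two small data sets with a single SET statement. The data set names (cases_1999, cases_2000, cases_all) and their variables are hypothetical and serve only as illustration.

```sas
/* Hypothetical input data sets with the same variables */
data cases_1999;
    input areakey $ year count;
    datalines;
A001 1999 12
A002 1999 7
;
run;

data cases_2000;
    input areakey $ year count;
    datalines;
A001 2000 15
A003 2000 4
;
run;

/* Concatenation: every observation of cases_1999 is followed by
   every observation of cases_2000, giving 4 observations in total */
data cases_all;
    set cases_1999 cases_2000;
run;
```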
Interleaving is similar to concatenation, but is used for combining datasets when
it is preferable for observations with similar values of the key variables
to appear consecutively in the output dataset. Interleaving uses the same
commands as concatenation, except that a BY statement names the key
variable(s) whose matching observations should appear next to each other.
Prior to interleaving, both data sets should be sorted by the key
variable(s) using the SORT procedure (PROC SORT).
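Interleaving might look like the following sketch: each data set is sorted with PROC SORT, then both are listed on one SET statement followed by a BY statement. The data set and variable names are hypothetical.

```sas
/* Hypothetical input data sets */
data cases_1999;
    input areakey $ year count;
    datalines;
A002 1999 7
A001 1999 12
;
run;

data cases_2000;
    input areakey $ year count;
    datalines;
A003 2000 4
A001 2000 15
;
run;

/* Both data sets must be sorted by the key variable first */
proc sort data=cases_1999;
    by areakey;
run;

proc sort data=cases_2000;
    by areakey;
run;

/* Interleaving: SET plus BY keeps observations with the same
   areakey next to each other in the output.
   Resulting order: A001 (1999), A001 (2000), A002, A003 */
data cases_interleaved;
    set cases_1999 cases_2000;
    by areakey;
run;
```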
A one-to-one merge is done to place datasets with an equal number
of observations side by side when there is no variable that can be used
to match observations. This method should generally be avoided; it is
appropriate only if observations are in the same order in both data sets
and there are no common variables that could be used to combine the
datasets in a matched merge.
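A one-to-one merge can be sketched as a MERGE statement with no BY statement, so that observations are paired purely by position. The data sets below are hypothetical and are assumed to be in the same order.

```sas
/* Hypothetical data sets with no common variable,
   assumed to list the same subjects in the same order */
data names;
    input name $;
    datalines;
Ann
Bob
;
run;

data scores;
    input score;
    datalines;
90
85
;
run;

/* One-to-one merge: no BY statement, so the first observation of
   names is paired with the first observation of scores, and so on */
data combined;
    merge names scores;
run;
```

Because nothing checks that the paired rows belong together, any difference in row order between the two inputs silently produces wrong matches.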
A matched merge is similar to the one-to-one merge, but data from observations in each
dataset are combined based on a common identifying variable. This is
most commonly used when new variables for the same observations are
to be added to an existing dataset.
If, for some observations, a value of the identifying variable is present
in only one dataset, the variables contributed by the other dataset are
set to missing for those observations. Before merging, both datasets must
be sorted by the common identifying variable.
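The matched merge described above might be sketched as follows, using MERGE with a BY statement after sorting both inputs. The data set names, the variables rate and pctpov, and the flag variables are hypothetical; the IN= data set option is used here only to show which input contributed to each observation.

```sas
/* Hypothetical data sets sharing the identifier areakey */
data rates;
    input areakey $ rate;
    datalines;
A002 3.1
A001 5.4
;
run;

data poverty;
    input areakey $ pctpov;
    datalines;
A001 12.5
A003 30.2
;
run;

/* Sort both data sets by the common identifier */
proc sort data=rates;
    by areakey;
run;

proc sort data=poverty;
    by areakey;
run;

/* Matched merge: observations are combined by areakey.
   A002 appears only in rates, so pctpov is missing for it;
   A003 appears only in poverty, so rate is missing for it. */
data merged;
    merge rates (in=in_rates) poverty (in=in_pov);
    by areakey;
    has_rate = in_rates;   /* 1 if the row came from rates   */
    has_pov  = in_pov;     /* 1 if the row came from poverty */
run;
```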
Updating is similar to a matched merge, but non-missing data from the second
dataset overwrite the values for variables that are common to both datasets.
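As an illustration of updating, the sketch below applies a transaction data set to a master data set with the UPDATE statement. The data set and variable names are hypothetical; both inputs are entered already sorted by id, and id is unique in the master data set.

```sas
/* Hypothetical master data set (one observation per id) */
data master;
    input id weight height;
    datalines;
1 70 168
2 82 175
;
run;

/* Hypothetical transaction data set; a period is a missing value */
data trans;
    input id weight height;
    datalines;
1 . 170
;
run;

/* Updating: non-missing values in trans replace those in master.
   For id 1, height becomes 170 while weight stays 70, because the
   missing weight in trans does not overwrite the master value.
   The observation for id 2 is unchanged. */
data updated;
    update master trans;
    by id;
run;
```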
The merge of the numerators and denominators by AGECAT and AREAKEY in step
3 of the case example is a matched merge. The merge of these data with
the ABSM data in step 4, and the merge with the year 2000 standard million
5-category age distribution in step 5, are also examples of a matched merge.
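A matched merge on two key variables, such as the one in step 3, could be sketched as below. This is not the case example's own code: the data set names (numerators, denominators, rate_input) and the count variables are assumptions made for illustration, and only AGECAT and AREAKEY come from the text.

```sas
/* Hypothetical numerator data: case counts by area and age category */
data numerators;
    input areakey $ agecat numer;
    datalines;
A001 1 3
A001 2 5
A002 1 2
;
run;

/* Hypothetical denominator data: population by area and age category */
data denominators;
    input areakey $ agecat denom;
    datalines;
A001 1 1200
A001 2 950
A002 1 800
;
run;

/* Sort both data sets by the same key variables, in the same order */
proc sort data=numerators;
    by agecat areakey;
run;

proc sort data=denominators;
    by agecat areakey;
run;

/* Matched merge on both keys: each output observation carries the
   numerator and denominator for one age category within one area */
data rate_input;
    merge numerators denominators;
    by agecat areakey;
run;
```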
[1] DeIorio, Frank. (1991). SAS applications programming: a gentle introduction.
Pacific Grove, CA: Duxbury Press.