# How to read hlm output?

**Answer # 1 #**

In most cases, data tend to be clustered. Hierarchical linear modeling (HLM) enables you to explore and understand such data and reduces the Type I error rates that arise from ignoring the clustering. This tutorial uses R to demonstrate the basic steps of HLM in social science research.

Before beginning the analysis, let's briefly talk about what HLM is.

HLM (also known as multilevel modeling) analyzes data that are clustered in an organized pattern, such as universities within states, non-white males within tech companies, or clinics within hospitals. HLM is an extension of ordinary least squares (OLS) regression that requires all OLS assumptions to be met (check out my tutorial on OLS assumptions and data screening) except the independence-of-errors assumption. That assumption is typically violated in clustered data, and HLM explicitly allows observations within the same cluster to be correlated.

Predictors in HLM can be categorized into random and fixed effects. Random effects refer to variables that are not the main focus of a study but may impact the dependent variable and therefore need to be included in the model. Fixed effects, on the other hand, are the key predictors of the study. For example, suppose a psychologist wants to predict the impact of adverse childhood trauma on one's tendency to develop borderline personality disorder (BPD) in adulthood. Participants come from collectivist and individualistic cultures, and the two cultures likely interpret parents' behaviors differently. People in individualistic cultures, such as those in America or the U.K., may consider parents' spanking abusive, whereas people in collectivist cultures, such as many Asian and African societies, may consider spanking a way to instill discipline in a child. Thus, participants from different cultures may be impacted differently by the same parental behavior during childhood and may develop BPD symptoms to different degrees. In this example, childhood trauma is treated as the fixed effect, based on the personality literature and the researcher's interest as a psychologist. Culture could be treated as a random effect, as the variable potentially impacts borderline personality development but is not the main focus of the study. It is noteworthy that random effects should be categorical, whereas fixed effects can be dummy variables (categorical variables with two levels) or continuous variables.

Some of you may wonder: why not use a single-level regression model and control for the potential random effects (e.g., culture in the example above)? Doing so may produce wrong standard error estimates, because the residuals of observations in the same group tend to be correlated; for instance, people from the same culture may view a behavior in the same way. A single-level model's error term lumps together clustered errors across levels, preventing us from knowing how much effect the key predictor (e.g., childhood trauma) has on one's tendency to develop BPD after controlling for the cultures in which participants are nested.

Still confused? Let’s look at the equations below:

yi = β0 + β1xi + ei, a single-level regression model (1)

yij = β0 + uj + eij, a variance components model (2)

yij = β0 + β1xij + uj + eij, a mixed model with random intercepts (3)

i indexes the observations (e.g., participant #1, #2, #3, ...).

j indexes the culture group that each observation belongs to (i.e., j = 1 is collectivism, and j = 0 is individualism).

x is adverse childhood trauma, and β1 is its regression coefficient.

y is BPD tendency.

u is the group-level error term: variance in y associated with cultures that is not explained by the predictors.

e is the observation-level error term: variance in y that is not explained by childhood trauma, controlling for the other terms.

According to equation 1, the error term (ei) captures the unexplained variance of the outcome that is not accounted for by the key independent variable (e.g., childhood trauma). Equation 2 shows two error terms: the error term of the random effects (uj) (i.e., cultures) and the error term of the observations nested within the random effects (eij) (childhood trauma scores within different cultures). Equation 3 represents a mixed model that integrates equations 1 and 2, yielding more accurate error estimates relative to the single-level regression model in equation 1.

Now that you have some foundation in HLM, let's see what you need before the analysis.

A fictional data set is used for this tutorial. We will look at whether one's narcissism predicts their intimate relationship satisfaction, assuming that narcissistic symptoms (e.g., self-absorption, lying, a lack of empathy) vary across time as different life events occur. Thus, the fixed effect is narcissistic personality disorder symptoms (NPD). The outcome variable is one's intimate relationship satisfaction (Satisfaction). The random effect is Time, with three levels coded as 1 (before marriage), 2 (1 year after marriage), and 3 (5 years after marriage).

Step 1: Import data

Step 2: Data cleaning

This tutorial assumes that your data has been cleaned. Check out my data preparation tutorial if you would like to learn more about cleaning your data. For my current data set, all of the assumptions, except the independence of errors, are met, consistent with the HLM requirement.

Step 1: An intercept-only model.

An intercept-only model is the simplest form of HLM and is recommended as the first step before adding any other predictive terms. This type of model allows us to understand whether the outcome variable scores (i.e., relationship satisfaction in this tutorial) are significantly different from zero (i.e., participants have indicated some level of relationship satisfaction) without considering other predictors. For an OLS model, the intercept is also known as the constant; in an intercept-only model it equals the mean of the outcome variable, as shown in the equation Satisfactioni = β0 + ei, where β0 is the grand mean of Satisfaction.

We will use the gls function (i.e., generalized least squares) to fit a linear model. The gls function allows errors to be correlated and to have heterogeneous variances, which is likely the case for clustered data. I will name my intercept-only model 'model1.'
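A minimal sketch of the intercept-only fit, assuming the cleaned data live in a data frame named `dat` with the columns Satisfaction, NPD, and Time described above:

```r
library(nlme)

# Intercept-only model: Satisfaction predicted by its grand mean alone.
# method = "ML" requests maximum likelihood so later models can be compared.
model1 <- gls(Satisfaction ~ 1, data = dat, method = "ML")
summary(model1)
```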

Here are the results:

The p-value is significant, indicating that participants’ relationship satisfaction is significantly different from zero.

Step 2: A random intercept model.

This step adds my random effect (i.e., Time) to see whether it explains significantly more variance in the dependent variable relative to the previous intercept-only model (Model 1).

Statistically speaking, if you still remember the earlier equations, the intercept for the overall regression of an intercept-only model is still β0. However, for each group of the random effect (i.e., each point of Time after marriage), the intercept is β0 + uj (where uj is the deviation of each Time group from the overall intercept, i.e., variance in the dependent variable attributable to Time).

To test the random intercept model, I will use the lme function as an alternative to the gls function used above. Like gls, the lme function fits a linear mixed-effects model, allowing nested random effects and correlations among within-group errors. Both lme and gls support maximum likelihood estimation.

Before including Time as random effects, make sure that the variable is categorical:

The output says ‘false,’ so I need to convert Time into a categorical variable.
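The conversion can be done with as.factor() (data frame name `dat` assumed, as before):

```r
# Convert Time from numeric to a categorical (factor) variable
dat$Time <- as.factor(dat$Time)
is.factor(dat$Time)   # should now return TRUE
```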

Modeling the random intercept:
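A sketch with lme(), again assuming the data frame is named `dat`:

```r
library(nlme)

# Random intercept model: the intercept of Satisfaction is allowed
# to vary across the three Time points.
model2 <- lme(Satisfaction ~ 1, random = ~1 | Time, data = dat, method = "ML")
summary(model2)
```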

The results:

Now, you may wonder how I could know whether my random effects (i.e., Time) are significant. There are a couple of ways to look at this.

1. From Model 1's and Model 2's outputs, you will see that Model 1's AIC = 6543.89 and Model 2's AIC = 6533.549. Generally, two AIC values that differ by more than 2 indicate a meaningful difference in model fit; the lower the AIC value, the better the fit. You can see that including Time as a random effect in Model 2 improves on Model 1 (6543.89 - 6533.549 > 2).

2. In addition to AIC, we can compare the intercept only model and the random intercept using the ANOVA function.
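The comparison can be requested like this (model object names as assumed earlier):

```r
# Likelihood-ratio comparison of the intercept-only and random intercept models
anova(model1, model2)
```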

Here are the results:

The p-value, 4e-04, is equal to 4 x 10^-4, indicating that the results are highly significant. Adding the random intercept thus significantly improves the intercept only model.

In addition to the gls and lme functions from the package nlme, we can use lmer from package lme4. In general, both lme and lmer are effective functions for mixed data analysis with some differences to be considered:

To put it simply, I would say for a simple HLM analysis, both lme4 and nlme should provide close parameter values. You may check out this page for comparisons of the packages.

If you want to try lme4, you need to install merTools first:

Let’s run our random intercept model using lmer from lme4
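A sketch of the lme4 version, labeled model2.1 here to match the later text (object names are illustrative):

```r
library(lme4)

# The same random intercept model in lme4 syntax; REML = FALSE requests
# ML estimation so the fit is comparable to the nlme models above.
model2.1 <- lmer(Satisfaction ~ 1 + (1 | Time), data = dat, REML = FALSE)
summary(model2.1)
```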

Results:

You can see that the parameters of model 2 (lme4) and model 2.1 (nlme) are quite close.

We can also compute an ICC (intraclass correlation coefficient) to see how correlated observations are within groups (i.e., relationship satisfaction within each Time point in my case). The ICC can range from 0 to 1, with higher values indicating greater homogeneity within groups (Gelman & Hill, 2007).
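One common way to compute the ICC by hand is from the model's variance components; a sketch using the lme4 fit (object name assumed from the previous step):

```r
# ICC = between-group variance / (between-group + residual variance)
vc <- as.data.frame(VarCorr(model2.1))   # rows: Time intercept, residual
icc <- vc$vcov[1] / sum(vc$vcov)
icc
```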

You can see that my ICC value is approximately .01, indicating that the relationship satisfaction scores of participants nested within the same Time point are quite heterogeneous.

Before moving to the next HLM analysis step, I want to make sure that my fixed effects regression coefficient is accurate. To do so, I will request a 95% confidence interval (CI) using confint.

If you are not familiar with CIs, the term refers to a range of values that is likely to include the true population parameter at a stated level of confidence (most often 95%). The formula is

CI = x̄ ± z × (s / √n), where:

x̄ is the sample mean

z is the critical value for the chosen confidence level

n is the sample size

s is the sample SD

A CI, say at 95%, has two endpoints. We may set a lower 1% limit, meaning that the probability that the true population parameter falls below that limit is only 1%. We may also set an upper 96% limit, meaning that the probability that the true population parameter falls beyond that limit is only 4%. Together, the limits imply that the probability of finding the true population parameter outside the range we set (1% to 96%) is 5% (1% + 4%). So we have a 95% confidence interval that the true parameter will fall within the lower and upper limits estimated from our sample. If you want to learn more about CIs and their relation to the t-distribution, check out this link.

Now, confidence levels are different from a confidence interval. If we re-run a study several times and estimate a parameter of interest with a 95% CI, we will get different 95% CI values each time due to errors in our data caused by several factors, such as participant factors and measurement error. However, 95% of those different CIs will cover the true parameter value, and this concept is the confidence level. If we set a lower limit of our confidence level at 1%, it means that across many repeated experiments, the true parameter value will fall below that 1% limit in only 1% of them. If we set an upper 96% limit, the probability of finding the true parameter value above that limit is 4% across those repeated experiments.

As humans like symmetrical things, people often set a 95% CI as a lower 2.5% limit and an upper 97.5% limit. The true population parameter value will be below the interval in 2.5% of repeated studies and above it in another 2.5% of those studies. Thus, the confidence levels will cover the true parameter in 95% of all conducted studies.

Let’s get back to our example. If I want to know the confidence levels of model 2.1, I will use the following code.
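For a lme4 fit, confint() works directly (model name as assumed earlier):

```r
# 95% confidence intervals for the variance components and the intercept
confint(model2.1)
```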

Results:

The results indicate that if I re-run my study several times, 95% of the time the intercept coefficient (i.e., the true mean of relationship satisfaction in the population, considering the random effect of Time) would fall approximately between 4.98 and 5.21.

Step 3: Fixed effects in the random intercept model

As I am mainly interested in the NPD’s fixed effects, I will include the predictor in my random intercept model (model 2 or model 2.1). I still let the intercept vary, meaning that each point of Time may have different intercepts of relationship satisfaction scores. To generate fixed effects in the random intercept model, I will use lme() from the nlme package.
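A sketch (data frame and object names assumed as before):

```r
library(nlme)

# Fixed effect of NPD, with the intercept still varying across Time
model3 <- lme(Satisfaction ~ NPD, random = ~1 | Time, data = dat, method = "ML")
summary(model3)
```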

Results:

The fixed effects are significant. Let’s compare whether the random intercept model with fixed effects (Model 3) is better than the random intercept model (Model 2).
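The comparison mirrors the earlier one (object names as assumed above):

```r
# Does adding the NPD fixed effect improve on the random intercept model?
anova(model2, model3)
```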

Results:

The results show a significant difference across the two models, indicating that adding fixed effects significantly improved the random intercept model.

An alternative for model fitting in Step 3 is to use the lmer function:
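A sketch of the lme4 equivalent (object name illustrative):

```r
library(lme4)

# lme4 version of Model 3: NPD fixed effect plus a random intercept for Time
model3.1 <- lmer(Satisfaction ~ NPD + (1 | Time), data = dat, REML = FALSE)
summary(model3.1)
```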

Results:

You see that the parameter estimates are quite close across the lme and lmer functions.

Step 4: Adding a random slope term.

In HLM, adding random slopes allows the regression lines across groups of the random effect to vary in their slope coefficients. In my case, the slopes between one's NPD and the outcome (relationship satisfaction) could vary across levels of Time, as people's NPD symptoms may be weakened or strengthened across Time points depending on their life events. To test this, I will nest NPD in Time and allow the slope of NPD on relationship satisfaction to vary across Time levels.
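A sketch of the random slope specification in nlme (names assumed as before):

```r
library(nlme)

# Both the intercept and the NPD slope now vary across Time
model4 <- lme(Satisfaction ~ NPD, random = ~NPD | Time, data = dat, method = "ML")
summary(model4)
```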

Results:

The output suggests that the variation in the intercept of Time is fitted with a larger SD of 0.0724. The variation in NPD slopes in predicting relationship satisfaction is fitted with a smaller SD of 0.0024. The results indicate that participants’ relationship satisfaction likely differs across levels of Time more than the severity of NPD symptoms within each point of Time.

A weak positive correlation (Corr; r = 0.131) between the intercept of Time and the NPD slope means that a more positive intercept is only slightly related to a more positive slope: if participants' intercepts increase by one SD, the slopes increase by only 0.131 SDs. In other words, the intercept of relationship satisfaction clearly differs across Time, whereas the variation in the slope of NPD on relationship satisfaction is subtler. Thus, it is highly likely that Model 4 (adding the random slope term) does not significantly improve Model 3 (the random intercept model). Let's test this.
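The test is another model comparison (object names as assumed above):

```r
# Does the random slope term improve on the random intercept model?
anova(model3, model4)
```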

Results:

As expected, adding the random slope term does not significantly improve the random intercept model and increases the AIC value (i.e., a worse fit). Whether to exclude the random slope term depends on several factors, such as the theories that inform your data, whether excluding or including the random slope makes the models converge, and whether you want a parsimonious or a maximal model. It all depends on your decision and field of study. This article provides additional detail about random effects that is worth reading.

Additional steps:

If you have an interaction term, you may test whether adding the term improves your model. I will test whether adding borderline personality disorder traits (BPD), which are highly comorbid with NPD, as a moderator improves my random intercept model (Model 3). I choose to ignore the random slope model (Model 4), as that term does not improve the model, and studies argue that NPD traits may not change across Time points.
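A sketch of the moderation model, here labeled model5 for illustration (BPD is assumed to be a column in `dat`):

```r
library(nlme)

# NPD * BPD expands to NPD + BPD + NPD:BPD, i.e., main effects plus interaction
model5 <- lme(Satisfaction ~ NPD * BPD, random = ~1 | Time,
              data = dat, method = "ML")
summary(model5)
```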

Results:

The interaction term is significant. We will see whether adding the interaction improves Model 3:
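Assuming the interaction model described above is stored as model5 (name illustrative):

```r
# Does the NPD x BPD interaction improve on Model 3?
anova(model3, model5)
```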

As expected, adding the interaction term significantly improves my random intercept model:

I hope by now you have a sense of how to conduct a simple HLM. Please stay tuned for more complex HLM analyses in the future.

**Answer # 2 #**

In this section you will learn how to:

1. Load a Stata file into R
2. Add a new variable to a dataset
3. Merge data files
4. Write data to a csv file
5. Change data from long format to wide format
6. Change data from wide format to long format

Download the HSB datasets in .dat format. A CSV file is also available at: http://www.ats.ucla.edu/stat/paperexamples/singer/

Save the files into a folder on your computer and set your working directory to the folder the datasets are saved in. For Mac users: if you are unsure what the appropriate file path is, you can look it up by right-clicking on the folder and selecting "Get Info"; a dialogue box will pop up, and the file path is listed under "Where:".

Check that your working directory is set to the correct folder. Running this command should return the file path that you set above.
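For example (the path below is a placeholder to replace with your own):

```r
# Point R at the folder containing the downloaded files
setwd("~/path/to/hsb-data")
getwd()   # should print the path you just set
```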

Sometimes people may share data with you that is not saved in a csv or other delimited file. If you have your data already in a csv file you can skip this step.

Install the 'foreign' package and load it. This package allows you to read data files from SPSS, SAS, and Stata into R.

For complete documentation see: http://cran.r-project.org/web/packages/foreign/foreign.pdf

In this example, we read in a Stata file (.dta) with the function read.dta().

Description: reads a Stata (version 5-12) binary file into a data frame. Note: this function does not support Stata formats after version 12.

Usage: read.dta(file, convert.dates = TRUE, convert.factors = TRUE, missing.type = FALSE, convert.underscore = FALSE, warn.missing.labels = TRUE)

Arguments:

- file: filename or URL as a character string
- convert.dates: convert Stata dates to Date class and date-times to POSIXct class? Default is TRUE (i.e., yes).
- convert.factors: use Stata value labels to create factors (Stata version 6.0 or later)? Default is TRUE.
- missing.type: for Stata version 8 or later, store information about different types of missing data? Default is FALSE (i.e., no).
- convert.underscore: convert "_" in Stata variable names to "." in R names? Default is FALSE.
- warn.missing.labels: warn if a variable is specified with value labels and those value labels are not present in the file? Default is TRUE.

Because we have no missing-data or date information to worry about, we simply supply the file path and leave all other arguments at their defaults.
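A sketch, assuming the downloaded files are named hsb1.dta and hsb2.dta:

```r
library(foreign)

hsb1 <- read.dta("hsb1.dta")   # student-level data
hsb2 <- read.dta("hsb2.dta")   # school-level data
```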

We have now loaded the student data and the school data. The High School and Beyond (HSB) data was collected in 1980. This data is part of a longitudinal study of American youth conducted by the National Opinion Research Center on behalf of the National Center for Education Statistics (NCES).

The student data (hsb1) includes the following variables:

- id: school id number (which school the student attends)
- minority: White student = 0, minority student (e.g., Black) = 1
- female: male = 0, female = 1
- ses: composite SES, the sum of 5 variables (standardized within grade prior to summing): father's occupation, father's education, mother's education, family income, and the additive sum of a scale assessing whether the family had each of the following in their possession: (1) daily newspaper, (2) encyclopedia, (3) typewriter, (4) electric dishwasher, (5) 2 or more cars or trucks, (6) more than 50 books, (7) a room of your own, (8) pocket calculator
- mathach: mathematics test score in senior year of high school (mean = 12.7, SD = 6.8)

We have a school id in the student level data, but we also need to add an identifier for each individual student. Because each row in hsb1 is the data for a single student, we simply create a variable ‘p’ that is numbered 1 through 7185, the total number of students in the dataset.
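For example:

```r
# One row per student, so number them sequentially from 1 to 7185
hsb1$p <- 1:nrow(hsb1)
```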

The school level data (hsb2) includes the following variables:

- id: school id number
- size: school size
- sector: type of school, Catholic = 1, public = 0
- pracad: proportion of students in the academic track
- disclim: continuous measure of the disciplinary climate in the school
- himinty: school enrollment > 40% minority = 1 (n = 44); school enrollment <= 40% minority = 0 (n = 116)
- meanses: average SES for the school, grand mean centered so that 0 = a school with an average SES level

The next step is to combine these two datasets into one, using school id as the matching key. To do this we will use the merge() function.

Description: merge two data frames by common columns or row names, or perform other versions of database "join" operations.

Usage: merge(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all, sort = TRUE, suffixes = c(".x", ".y"), incomparables = NULL, ...)

Arguments:

- x, y: data frames to be merged into one
- by, by.x, by.y: specifications of the columns to be used for merging; these are your id/matching key variables
- all.x: logical; should extra rows be added to the output for each row in x that has no matching row in y? Default is FALSE.
- all.y: logical; same as all.x but for the y file
- all: logical; should extra rows be added to the output for each row in x or y that has no corresponding row in the other file? Default is FALSE.
- sort: logical; should the result be sorted on the 'by' columns? Default is TRUE.
- suffixes: a character vector of length 2 specifying the suffixes used to make unique names for variables that appear in both files but are not specified in 'by'
- incomparables: values that cannot be matched (see ?match); intended for merging on one column, so these are incomparable values of that column
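Something like the following (merged object name illustrative):

```r
# Join on the shared school id; each student row gains its school's variables
hsb <- merge(hsb1, hsb2, by = "id")
```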

This data set is already in long format, so each school has a row for each student. This is the format that we would want our final dataset to be in to conduct multilevel models (MLMs). You might want to export this dataset as a csv to share it with others. To export the dataset we use the function write.table().

Description: prints the argument x, after converting it to a data frame, to a file.

Usage: write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ", eol = "\n", na = "NA", dec = ".", row.names = TRUE, col.names = TRUE, qmethod = c("escape", "double"), fileEncoding = "")

Arguments:

- x: object to be written, preferably a matrix or data frame; if not, the function will attempt to coerce it to a data frame
- file: character string with the desired file name
- append: logical; append output to the file? Only relevant if file is a character string. If TRUE, the output is appended to the file; if FALSE, any existing file of the same name is overwritten. Default is FALSE.
- quote: logical or numeric. If TRUE, any character or factor columns are surrounded by double quotes; if a numeric vector, the numbers index the columns to be quoted. In both cases, the row and column names are quoted if written. If FALSE, nothing is quoted. Default is TRUE.
- sep: the field separator string; values within each row of x are separated by this string. For csv, sep = ",". Default is " ".
- eol: the character(s) to print at the end of each line. For example, eol = "\r\n" will produce Windows line endings on a Unix-alike OS, and eol = "\r" will produce files as expected by Mac Excel (2004). Default is "\n".
- na: the string to use for missing values in the data, e.g. na = "NA", na = ".", or na = "-999". Default is "NA".
- dec: the string to use for decimal points in numeric or complex columns; must be a single character. Default is ".".
- row.names: logical, or a character vector of row names to be written. Default is TRUE.
- col.names: logical indicating whether the column names of x are to be written along with x, or a character vector of the column names to be written (useful if you want to rename all of the variables in the dataset). Default is TRUE.
- qmethod: a character string specifying how to deal with embedded double quote characters when quoting strings; must be "escape" (the quote character is escaped C-style by a backslash) or "double" (it is doubled). You can specify just the initial letter, 'e' or 'd'. Default is "escape".
- fileEncoding: character string; if non-empty, declares the encoding to be used on the file so the character data can be re-encoded as they are written

There is also a function made specifically for writing csv files: write.csv().

Description: another way of writing a csv file. The arguments are the same as the arguments for write.table(), with a few exceptions:

Usage: write.csv(...)

- ...: arguments passed to write.table; append, col.names, sep, dec, and qmethod cannot be altered
- qmethod: default is "double"
- If col.names = NA and row.names = TRUE, a blank column name is added, which is the convention used for CSV files to be read by spreadsheets.
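For example (file and object names illustrative):

```r
# write.csv fixes sep = "," and qmethod = "double" for you
write.csv(hsb, file = "hsb_long.csv", row.names = FALSE)
```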

For the next two sections you need to install and load the ‘reshape2’ package. Complete reference manual here: http://cran.r-project.org/web/packages/reshape/reshape.pdf

Sometimes you may need to convert a file from long format into wide format (and back). While R analyzes repeated measures and nested data (e.g., students nested within schools) in long format, SPSS uses wide format to analyze repeated measures and nested data.

We want data with one row per school, with a column for each student's math achievement, SES, gender, and minority status. In order to do this, first we need to create a student-level id variable that numbers students within each school, from the first student to the ith student in that school.
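One idiomatic way to build such a within-school counter (the original code is not shown, so this is a sketch; object and column names assumed from the merge step):

```r
# Number students within each school: 1, 2, ..., n_i for school i
hsb$student <- ave(hsb$id, hsb$id, FUN = seq_along)
```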

In the next step we use the dcast function to change the data from long format to wide format. We want one row for each school; we keep the school-level variables in the same format but create a new column for each student's math achievement score, gender, minority status, and SES.

We use the dcast() function.

Description: reshapes an existing object into a data frame.

Usage: dcast(data, formula, fun.aggregate = NULL, ..., margins = NULL, subset = NULL, fill = NULL, drop = TRUE, value.var = guess_value(data))

Arguments:

- data: your data frame
- formula: the casting formula, of the form: variable you don't want to split by + other variables you don't want to split by (in our example, all of the school-level variables) ~ the variable you want to split by (in our example, student)
- fun.aggregate: aggregation function, needed if the variables do not identify a single observation for each output cell; defaults to length (with a message) if needed but not specified
- margins: vector of variable names (can include "grand_col" and "grand_row") to compute margins for, or TRUE to compute all margins; any variables that cannot be margined over are silently dropped
- subset: quoted expression used to subset the data prior to reshaping, e.g. subset = .(variable == "length")
- fill: value with which to fill in structural missings; defaults to the value from applying fun.aggregate to a zero-length vector
- drop: should missing combinations be dropped or kept? Default is TRUE.
- value.var: the column whose values are placed under the new split-variable columns
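A sketch for the mathach variable alone (the other three student-level variables follow the same pattern; object and column names are assumptions based on the steps above):

```r
library(reshape2)

# One row per school: school-level variables stay put, and each student's
# mathach score becomes its own column (keyed by the within-school id)
hsb.wide <- dcast(hsb,
                  id + size + sector + pracad + disclim + himinty + meanses ~ student,
                  value.var = "mathach")
```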

The next step is to change our data from wide format back to the original long format. To do this we use the melt() function from the reshape2 package. The melt function has several uses; below are details for its use with data frames.

Usage: melt(data, id.vars, measure.vars, variable.name = "variable", ..., na.rm = FALSE, value.name = "value", factorsAsStrings = TRUE)

Arguments:

- data: the data frame you wish to alter (hsb.wide.final in our example)
- id.vars: the variables you wish to keep as they are (in our example, the school-level variables); can be a numeric vector indexing the columns or a character vector of column names
- measure.vars: the variables you wish to collapse into one long-format column; can be a numeric vector indexing columns or a character vector of column names
- variable.name: the column name for the measured variables (in our example, student.id); the default label is "variable"
- ...: further arguments passed to or from other methods
- na.rm: should NA values be removed from the data set? This converts explicit missings to implicit missings. Default is FALSE.
- value.name: the name of the variable holding the values (in our example, mathach, minority, female, or ses); the default label is "value"
- factorsAsStrings: controls whether factors are converted to character when melted as measure variables; when FALSE, coercion is forced if levels are not identical across measure.vars

This example is pretty unusual in that we want a row for each student with columns for their math achievement score, gender, minority status, and SES, plus their school's variables. Most examples online show only one variable being transformed. Here we transform each of the four student-level variables separately. I tried to write a function to automate this process since it is repetitive, but I couldn't get one to work. You can get the desired result by repeating the melt function for each new variable you want to create and then merging the resulting data frames together to get the data frame in the desired format.
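A sketch for one variable (object and column names are assumptions based on the dcast step; repeat per student-level variable, then merge the results):

```r
library(reshape2)

# Collapse the per-student mathach columns back into one long-format column;
# omitting measure.vars melts every column not listed in id.vars
hsb.long <- melt(hsb.wide.final,
                 id.vars = c("id", "size", "sector", "pracad",
                             "disclim", "himinty", "meanses"),
                 variable.name = "student.id",
                 value.name = "mathach")
```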

**Answer # 3 #**

The data file for this example is made like this

You can download the MDM file here: hsb12hlm

Model 1. SES Group mean centered at level 1, MeanSES Grand mean centered at level 2.

Consider this multilevel model where we are predicting math achievement (MATHACH) based on the student level SES (USES) that has been group mean centered at level 1 and school level average SES (MEANSES) that has been grand mean centered at level 2. This model is shown below.

MATHACHij = β0j + β1j (Group Mean Centered SES) + rij

β0j = γ00 + γ01 (Grand Mean Centered Mean SES) + u0j

β1j = γ10 + u1j

We can specify this in HLM like this.

When we run this in HLM using Maximum Likelihood Estimation, we get these results.

In terms of the model, the results look like this.

MATHACHij = β0j + β1j (USES) + rij

β0j = 12.64 + 5.86 (Grand Mean Centered Mean SES) + u0j

β1j = 2.19

Interpretation. This decomposes the relationship between SES and MATHACH into two components: the level 1 (within-school) component and the level 2 (between-school) component. At level 1, the regression coefficient is 2.19, meaning that for a one-unit increase in a student's SES, their math achievement is predicted to increase by 2.19 units. The level 2 coefficient of 5.86 means that when the average SES for a school increases by 1 unit, the school-level math achievement is expected to increase by 5.86 units.

Model 2. SES Grand mean centered at level 1, MeanSES Grand mean centered at level 2.

Let's now try a model where SES, which was group mean centered in Model 1, is instead grand mean centered. We continue to grand mean center MEANSES at level 2.

MATHACHij = β0j + β1j (Grand Mean Centered SES) + rij

β0j = γ00 + γ01 (Grand Mean Centered Mean SES) + u0j

β1j = γ10

We can specify this in HLM like this.

When we run this in HLM using Maximum Likelihood Estimation, we get these results.

In terms of the model, the results look like this.

MATHACHij = β0j + β1j (USES) + rij

β0j = 12.66 + 3.67 (Grand Mean Centered Mean SES) + u0j

β1j = 2.19

Interpretation. Like Model 1, this decomposes the relationship between SES and MATHACH into two components: the level 1 (within-school) component and the level 2 (between-school) component. Also like Model 1, the level 1 regression coefficient remains 2.19 with the same meaning. But note how the coefficient associated with the grand mean centered mean SES has changed to 3.67. This value reflects the between-school regression coefficient for MEANSES minus the within-school regression coefficient for SES. Referring to the estimates from Model 1: 5.865601 - 2.191172 = 3.674429.

Model 3. Group Centered SES at level 1.

Suppose that we imagine that we are only concerned with the level 1 effect of SES (at the student level) and really do not care about the level 2 effect. We might construct a model like this to examine just the student level effect of SES.

MATHACHij = β0j + β1j (Group Centered SES) + rij

β0j = γ00 + u0j

β1j = γ10

We can specify this in HLM like this.

When we run this in HLM, we get these results.

In terms of the model, the results look like this.

MATHACHij = β0j + β1j (Group Mean Centered SES) + rij
β0j = 12.63 + u0j
β1j = 2.19 + u1j
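Why does group mean centering isolate the within-school effect? After centering, each school’s predictor has mean zero, so differences between schools can no longer contribute to the slope, and the pooled slope is the common within-school slope. A pure-Python illustration with hypothetical data (the school names and values are made up):

```python
# Hypothetical data: (ses, mathach) pairs per school.
schools = {
    "a": [(8.0, 10.0), (10.0, 14.0), (12.0, 18.0)],   # within-school slope 2
    "b": [(14.0, 20.0), (16.0, 24.0), (18.0, 28.0)],  # within-school slope 2
}

# Group-mean-center SES within each school, then pool across schools.
pooled = []
for pts in schools.values():
    m = sum(x for x, _ in pts) / len(pts)
    pooled.extend((x - m, y) for x, y in pts)

# OLS slope on the pooled centered data: cov(x, y) / var(x).
n = len(pooled)
mx = sum(x for x, _ in pooled) / n  # ~0 by construction
my = sum(y for _, y in pooled) / n
sxy = sum((x - mx) * (y - my) for x, y in pooled)
sxx = sum((x - mx) ** 2 for x, _ in pooled)
print(sxy / sxx)  # 2.0 -- the common within-school slope
```

School "b" has both higher SES and higher achievement, but that between-school difference is removed by the centering, so it does not inflate the slope.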

Model 4. Grand Mean Centered SES at level 1.

Suppose again that we are only concerned with the level 1 effect of SES (at the student level) and do not care about the level 2 effect, but this time we grand mean center SES.

MATHACHij = β0j + β1j (Grand Mean Centered SES) + rij
β0j = γ00 + u0j
β1j = γ10 + u1j

We can specify this in HLM like this.

When we run this in HLM, we get these results.

In terms of the model, the results look like this.

MATHACHij = β0j + β1j (Grand Mean Centered SES) + rij
β0j = 12.65 + u0j
β1j = 2.39 + u1j

Model 5. Uncentered SES at level 1.

Finally, suppose we are only concerned with the level 1 effect of SES (at the student level) and do not care about the level 2 effect, and we leave SES uncentered.

MATHACHij = β0j + β1j (Uncentered SES) + rij
β0j = γ00 + u0j
β1j = γ10 + u1j

We can specify this in HLM like this.

When we run this in HLM, we get these results.

In terms of the model, the results look like this.

MATHACHij = β0j + β1j (Uncentered SES) + rij
β0j = -17.23 + u0j
β1j = 2.39 + u1j
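Note that Models 4 and 5 estimate the same slope (2.39); only the intercept changes, because grand mean centering shifts where "SES = 0" falls. The two intercepts are linked by β0(uncentered) = β0(grand centered) − β1 × (grand mean of SES). As a back-of-the-envelope check, we can solve this identity for the grand mean implied by the reported estimates (roughly 12.5 on this dataset’s SES scale; that value is derived here, not reported in the output):

```python
b0_grand_centered = 12.65  # Model 4 intercept (grand-mean-centered SES)
b0_uncentered = -17.23     # Model 5 intercept (uncentered SES)
slope = 2.39               # level 1 slope, identical in both models

# Solving b0_uncentered = b0_grand_centered - slope * grand_mean_ses
# for the grand mean that the two reported intercepts imply:
implied_grand_mean = (b0_grand_centered - b0_uncentered) / slope
print(round(implied_grand_mean, 2))  # 12.5
```

This is why the uncentered intercept (-17.23, "math achievement at SES = 0") can be meaningless while the centered intercept (12.65, "math achievement at average SES") is interpretable.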

**Answer # 4 #**

Hierarchical linear modeling (HLM) is an ordinary least squares (OLS) regression-based analysis that takes the hierarchical structure of the data into account. Hierarchically structured data is nested data, where groups of units are clustered together in an organized fashion, such as students within classrooms within schools. The nested structure of the data violates the independence assumption of OLS regression, because the clusters of observations are not independent of each other. HLM is also called multilevel modeling. It can be used for prediction, for data reduction, and to help support causal inference.

The flexibility of HLM models is reflected in the variety of applications for which it is used. There is substantial application of HLM models for the study of longitudinal data where observations are nested within individuals. Longitudinal HLM models, sometimes described as growth curve models, treat time in a flexible manner that allows the modeling of non-linear and discontinuous change across time and accommodates uneven spacing of time points and unequal numbers of observations across individuals.

HLM models provide a framework that incorporates variables at each level of the model. For example, student characteristics, such as age, and school characteristics, such as graduation rate, can be modeled. HLM models can be extended beyond two levels; for example, students nested within schools nested within school districts. In addition to purely hierarchical structures, there is a class of models called cross-classified models that allow units to be nested within more than one cluster where the clusters are not structurally related. For example, students could be nested within both schools and churches, where there is no relationship between schools and churches.

HLM can address research questions such as:

- Do school-level scores, class-level scores, and student-level scores impact school district scores?

- Do demographic factors and geographic location influence voter scores?

HLM can be explained in two steps. Using the school example, in the first step separate analyses are conducted for every school under consideration using the student-level data. For example, students’ test scores in a particular subject could be regressed on their socio-economic status and gender.

In the second step, the regression parameters obtained from the first step become the outcome variables of interest.
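The two-step logic can be sketched in plain Python: fit an OLS intercept and slope separately for each school (step 1), then treat those per-school coefficients as the outcomes of a second analysis (step 2). The school names and data are hypothetical, and step 2 is reduced here to a simple summary; a full HLM would model the coefficients with level 2 predictors and estimate everything jointly.

```python
def ols(points):
    """Simple OLS fit of y on x; returns (intercept, slope)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sxy / sxx
    return my - slope * mx, slope

# Step 1: one regression per school (made-up (ses, score) data).
schools = {
    "a": [(1.0, 10.0), (2.0, 12.0), (3.0, 14.0)],
    "b": [(1.0, 11.0), (2.0, 14.0), (3.0, 17.0)],
    "c": [(1.0, 9.0), (2.0, 10.0), (3.0, 11.0)],
}
coefs = {s: ols(pts) for s, pts in schools.items()}

# Step 2: the per-school intercepts and slopes become outcomes;
# here we just summarize the slopes across schools.
mean_slope = sum(b for _, b in coefs.values()) / len(coefs)
print(coefs["a"])  # (8.0, 2.0)
print(mean_slope)  # 2.0
```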

HLM relaxes some OLS assumptions but retains others:

- The homogeneity-of-regression-slopes requirement does not need to be met.

- The assumption of independence is not required.

- The data must be linear and normal.

- The assumption of homoscedasticity must be met.

- Linearity: the relationship between the variables is a straight line.

- Normality: the error terms at every level of the model are normally distributed.

- Homoscedasticity: the error variances are equal across groups.