STAT resources for stat geeks in social sciences

MPLUS workshop: introduction

MPLUS was designed to be easy to use.

The syntax is considerably easier to use than that of EQS or LISREL. In EQS there is no easy shortcut for writing the equations from factor to item; in MPLUS one line is enough. EQS also drops the name of every variable, so you have to keep remembering which variable was V1, which was V2, and so on; in MPLUS you can name the variables as you like.

In LISREL you can write a model either by describing the matrices in 0/1 form or link by link, but the variable selection command changes how you have to refer to the variables every time. None of this is a problem in MPLUS.

It's a very versatile program

It can handle regression, path analysis, EFA, CFA, IRT, HLM, LGM, and so on. In general, it is software that lets you estimate different kinds of simultaneous regressions and latent variable models, from continuous or categorical observed variables.

It can change your view on research

Just knowing about the possibilities makes you a better researcher. Learning to do more advanced analyses changes the research questions you can ask, and the way you think about your research topic.

This introduction will cover:

PATH
SEM
HLM
LGM

How to declare Form 29 (F29) for PPM receipts [FONDECYT]

Here you can find a ppt with a screenshot showing on which lines to enter the amounts to pay on Form 29 (F29), the monthly tax declaration to the SII in Chile, for PPM receipts [issued to natural persons].

MERGE databases: complex scenarios, abstract example

This is a very practical syntax for building a database, especially in complex merge scenarios.

The simplest merge scenario is just adding cases from two symmetrical sheets, that is, sheets with the same structure and the same number of columns. Merging different datasets gets more complicated when a few items change between studies or when items are missing. I call scenarios with these characteristics complex merge scenarios.

For this example, I’m going to use two fictional databases. Let’s imagine a study with two measurement occasions, where a case can appear at time 1, at time 2, or at both, and can also appear more than once within time 1 or time 2. To add more complexity, the first measurement occasion differs from the second: the variables are different, but a few of them are shared.

In a complex scenario with more than one measurement occasion, there are two things to do: compare the items between the databases provided, and evaluate how many times each unit of analysis appears per occasion.

ITEM COMPARISON

The item comparison step (see the first 7 minutes of the video) simply identifies the items shared between the two databases. In this example, the same variable name implies the same item, which is not always the case with real data; in this abstract example it is a prerequisite. Once the shared items are identified, we can use the following syntax, with the list of shared variables:

SAVE OUTFILE='C:\Users\dacarras\Desktop\T1 to merge.sav'
/KEEP=UNIQUE
Var1
Var2
Var3
Var4
Var5
Var6
Var7
/COMPRESSED.

The first line of the syntax is the command that saves the new database. The important line is the second one, the KEEP subcommand. It lets you name the variables you want to save from the source database, and in which order. For example, if the variable UNIQUE is declared last in the syntax, it appears last in the saved database. In short, KEEP does at least two things: it selects the variables you want to keep, and it sets the order in which they are saved. This makes it a way to reorder variables in SPSS.
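
For readers who want to check the logic outside SPSS, the item comparison and the select-and-reorder behavior of KEEP can be sketched in a few lines of Python. This is only an illustration under assumptions: the variable lists, the names `OnlyT1` and `OnlyT2`, and the helper functions are hypothetical and are not part of the original example.

```python
# Hypothetical variable lists for the two measurement occasions.
t1_vars = ["UNIQUE", "Var1", "Var2", "Var3", "OnlyT1"]
t2_vars = ["Var3", "Var1", "UNIQUE", "Var2", "OnlyT2"]


def shared_variables(first, second):
    """Return the names present in both lists, in the order of `first`."""
    second_set = set(second)
    return [name for name in first if name in second_set]


def keep(record, names):
    """Mimic KEEP: select the named variables and fix their order."""
    return {name: record[name] for name in names}


shared = shared_variables(t1_vars, t2_vars)

# One hypothetical time 2 record, with its variables in a different order.
row_t2 = {"Var3": 3, "Var1": 1, "UNIQUE": 101, "Var2": 2, "OnlyT2": 9}

print(shared)
print(list(keep(row_t2, shared)))
```

Applying `keep` with the same list to both files is what leaves them symmetrical, which is the condition for the next step.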

Now that both databases to merge (t1 and t2) have the same variables in the same order, merging them with the add cases option in SPSS (video) is no big deal. The second issue is to work out how many measures there are per unit of analysis.
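
As a rough sketch of what adding cases amounts to once the files are symmetrical, the rows of one file are simply appended below the other. The function name and the sample records are hypothetical; the check on column order is an assumption about what "symmetrical" should mean here, not a description of how SPSS itself validates the files.

```python
def add_cases(first, second):
    """Append the rows of `second` below `first` (same columns, same order)."""
    if first and second and list(first[0]) != list(second[0]):
        raise ValueError("files are not symmetrical: columns differ")
    return first + second


# Hypothetical records from each occasion, already reduced to shared variables.
t1 = [{"UNIQUE": 101, "Var1": 4}, {"UNIQUE": 102, "Var1": 2}]
t2 = [{"UNIQUE": 102, "Var1": 3}]

merged = add_cases(t1, t2)
print(len(merged))
```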

APPEARANCE OF THE UNIT OF ANALYSIS PER MEASUREMENT OCCASION

If we already have a person-period database (Singer & Willett, 2003), we can use a few SPSS options to resolve this issue. UNIQUE is the index that identifies each case, that is, each unit of analysis. By using the ‘identify duplicate cases’ option in the SPSS [DATA] menu, together with its match sequence suboption, we can identify how many times each case appears.

This creates two variables. ‘PrimaryFirst’ is a dummy variable that flags the first appearance of the index in the database with a 1 and leaves the remaining appearances as 0, creating a point of reference. The second variable, ‘Matchsequence’, uses that point of reference to count the appearances of the index in the database.

These two variables leave any case that appears only once with the following pattern:

PrimaryFirst = 1 & Matchsequence = 0

Cases that appear more than once have at least one record with the following pattern:

PrimaryFirst = 1 & Matchsequence = 1
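
The two patterns can be reproduced with a small Python sketch, offered only to make the logic explicit. The function and the sample index values are hypothetical. The numbering of later records in a duplicated group (2, 3, and so on, with PrimaryFirst = 0) is my reading of a sequential count; the text above only states the patterns for single cases and for the first record of a duplicated case.

```python
from collections import Counter


def flag_duplicates(ids):
    """Return (PrimaryFirst, Matchsequence) for each id, in input order.

    A case that appears once gets (1, 0). In a duplicated group the first
    record gets (1, 1) and later records get (0, 2), (0, 3), and so on.
    """
    totals = Counter(ids)   # how often each id appears overall
    seen = Counter()        # how often each id has appeared so far
    flags = []
    for case_id in ids:
        seen[case_id] += 1
        primary_first = 1 if seen[case_id] == 1 else 0
        match_sequence = 0 if totals[case_id] == 1 else seen[case_id]
        flags.append((primary_first, match_sequence))
    return flags


# Hypothetical values of the UNIQUE index, sorted by case.
unique_ids = [101, 102, 102, 103, 103, 103]
for case_id, flag in zip(unique_ids, flag_duplicates(unique_ids)):
    print(case_id, flag)
```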

These differences let us create new variables to transpose the database into the form we want, selecting the first and the last appearance of each case as time 1 and time 2, to build a person-level database (Singer & Willett, 2003).
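
A minimal Python sketch of that transposition, assuming the records are already in the desired order within each case. The function name, the `_t1` and `_t2` suffixes, and the choice to leave time 2 empty for cases that appear only once are assumptions made for illustration.

```python
def to_person_level(records, id_name="UNIQUE"):
    """Collapse person-period records to one row per case.

    The first appearance becomes time 1 and the last becomes time 2.
    Cases that appear only once get None for every time 2 value.
    """
    grouped = {}
    for record in records:
        grouped.setdefault(record[id_name], []).append(record)

    wide = []
    for case_id, rows in grouped.items():
        first = rows[0]
        last = rows[-1] if len(rows) > 1 else None
        row = {id_name: case_id}
        for name in first:
            if name == id_name:
                continue
            row[name + "_t1"] = first[name]
            row[name + "_t2"] = last[name] if last is not None else None
        wide.append(row)
    return wide


# Hypothetical person-period data: case 102 appears three times.
long_data = [
    {"UNIQUE": 101, "Var1": 4},
    {"UNIQUE": 102, "Var1": 2},
    {"UNIQUE": 102, "Var1": 3},
    {"UNIQUE": 102, "Var1": 5},
]
print(to_person_level(long_data))
```

Intermediate appearances (the second record of case 102 here) are dropped, which is the trade-off of keeping only the first and last record.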

The downside of this example is that, being fictional, there is no meaning to who comes first or last. It is also incomplete, because the measurement occasions come without a proper time variable to tell when the responses were recorded. Still, it shows five very useful utilities for complex merging:

item comparison
reorder variables
add cases
identify duplicate cases
match sequence measures

In the near future, I hope to document and comment on a real merge scenario with several measurement occasions.

References

Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. Oxford University Press, USA.

How to create References in ZOTERO

Zotero is a plugin for Mozilla Firefox that lets you manage a whole library of references. The Zotero webpage itself has many video tutorials, and you can find other videos on YouTube.

Nonetheless, here you can find a straightforward video that shows how, while writing a simple paragraph, Zotero is used to add the citations and generate a reference list in APA format. I have to warn you, though: the video notes are in Spanish, but what matters for this tutorial are the actions.

How to conduct an EFA in MPLUS with ordinal data

In the previous example we used a simple MPLUS syntax to produce basic descriptives. In the following example we introduce new commands in MPLUS: EFA and PLOT.

This example is purely descriptive and only shows the commands; no special remarks are made on how to interpret the loadings, the scree test, or the output. The main point is to show the command lines [in the future, we, or I, should write a special topic on how to deal with the question of ‘how many factors’ and ‘how to report a factor analysis’].

The syntax to use is:

title: EFA on ordinal data;
data: file = C:\EXAMPLE01.txt;
! if we still have the same data from the previous example,
! everything should work
variable:
    names =
        year NUNICO
        con1 con2 con3
        con4 con5 con6
        con7 con8 con9
        con10 con11 var1;
    ! declare the ordinal items as categorical
    CATEGORICAL are
        con1 con2 con3
        con4 con5 con6
        con7 con8 con9
        con10 con11;
    ! analyze only the items, leaving out year, NUNICO and var1
    USEVARIABLES =
        con1 con2 con3
        con4 con5 con6
        con7 con8 con9
        con10 con11;
    missing = all (-99);
! request the solutions with 1 to 4 factors
ANALYSIS: TYPE = EFA 1 4;
PLOT: TYPE IS PLOT3;
OUTPUT: MODINDICES;

This example requires the data from the previous example. Here are the video tutorial, the notes, the syntax, and the MPLUS output.

How to export a data set from SPSS to MPLUS

In this post you will find: a syntax example, a short database to repeat the example, a video tutorial that shows how to export a database from SPSS to MPLUS, the written notes made in the video, the MPLUS syntax produced, the dataset for MPLUS, and the MPLUS output.

MPLUS cannot yet read databases in every format, such as *.sav files from SPSS, *.dta files from STATA, or *.sd7 files from SAS. It handles csv files and tab files [see the linked examples], which are plain text formats, generally called ASCII (text) files. The first separates the values of each column with a "," and puts each row on its own line of the text file. The second uses a tab instead of a comma as the separator. These are the most standard and simple database formats.
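
To make the difference between the two formats concrete, here is a small Python sketch that writes the same rows once with commas and once with tabs. The values and the helper name are hypothetical; only the standard `csv` and `io` modules are used.

```python
import csv
import io


def to_text(rows, delimiter):
    """Serialize rows of values as delimited plain text, one case per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [[1, 3, 2], [2, 1, -99]]  # hypothetical values, one case per row

print(to_text(rows, ","))   # csv layout
print(to_text(rows, "\t"))  # tab layout
```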

Data in SPSS from cross-sectional or panel studies can have a lot of variables, especially from long survey data collections. The original database from the DIPUC study that we're going to use has 2122 data fields. For the example, though, we're going to use just a short form of it, with only about one hundred data fields. One way to cope with such a large number of variables is to use SPSS syntax to select the variables needed for the analysis and create a shorter database that is easier to handle. Another reason to do this is that MPLUS can handle at most 500 data fields. Although that is plenty, it may not be enough to handle a whole survey database directly, like the one mentioned above [see the image in the paragraph: a screenshot of the original database at 10% view, in Excel format].

The recommended syntax is the following:

SAVE TRANSLATE OUTFILE='C:\data\[put the name of the database here].txt'
  /TYPE=TAB
  /MAP
  /REPLACE
  /CELLS=VALUES
  /TEXTOPTIONS DECIMAL=DOT
  /KEEP=
    variable01
    variable02
    variable03
    variable04
    variable05
    variable06.

In the first line, after 'SAVE TRANSLATE OUTFILE=', you can specify the name and location of the database you want to save. After the KEEP= line, you can also set the order in which the variables are written, by listing them in that order.

Another recommended option (Geiser, 2009) is to recode the missing values in SPSS to an unmistakable value, such as 999 or -99. This can be done with the following syntax:

RECODE variable01 variable02 variable03 variable04 (SYSMIS=-99) .

In general, the order is to RECODE first, then SAVE TRANSLATE to an ASCII file.
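
The reason for recoding before exporting can be seen in a short Python sketch: a missing value would otherwise leave an empty field in the text file, while the recoded value is written as an explicit number. Here `None` stands in for an SPSS system-missing value, and the function names and data are hypothetical.

```python
MISSING_CODE = -99  # same flag used in the RECODE example


def recode_missing(rows, code=MISSING_CODE):
    """Replace None (standing in for system-missing) with `code`."""
    return [[code if value is None else value for value in row] for row in rows]


def export_tab(rows):
    """Write rows as tab-delimited text; None is written as an empty field."""
    lines = []
    for row in rows:
        lines.append("\t".join("" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


data = [[1, 2.5, None], [2, None, 4]]  # hypothetical values

print(repr(export_tab(data)))                  # empty fields where data are missing
print(repr(export_tab(recode_missing(data))))  # -99 where data are missing
```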

Once the data are recoded and exported, they can be read in MPLUS to produce simple descriptives, using this general syntax:

title: Checking if data is well exported and readable by MPLUS;
data: file = C:\EXAMPLE01.txt;
! This is a comment line
! This is yet another ...
variable: names =
    var01
    var02
    var03
    var04
    var05
    var06
    var07
    var08;
    missing = all (-99);
analysis: type = basic;
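
When the variable list is long, typing the names by hand is error-prone. As a convenience, and only as a sketch, the same input can be generated from a list of names in Python. The function name and the `per_line` parameter are hypothetical; the generated text simply follows the layout of the syntax above.

```python
def basic_input(data_file, names, missing_code=-99, per_line=6):
    """Build the text of an MPLUS input with type = basic."""
    # Split the names into short lines so the input stays readable.
    chunks = [names[i:i + per_line] for i in range(0, len(names), per_line)]
    name_lines = "\n".join("    " + " ".join(chunk) for chunk in chunks)
    return (
        "title: Checking if data is well exported and readable by MPLUS;\n"
        f"data: file = {data_file};\n"
        "variable: names =\n"
        f"{name_lines};\n"
        f"    missing = all ({missing_code});\n"
        "analysis: type = basic;\n"
    )


names = [f"var{i:02d}" for i in range(1, 9)]  # var01 ... var08
print(basic_input(r"C:\EXAMPLE01.txt", names))
```

The order of `names` has to match the order of the columns in the exported file, which is the order set on KEEP during the export.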

References

Geiser, C. (2009). Datenanalyse mit Mplus: Eine anwendungsorientierte Einführung. VS Verlag für Sozialw.