What they call workflow

[Password protected version of this post, under resources, with downloadable example files]

Many researchers talk about workflow. This refers to the series of steps involved in completing an investigation: going from intention to paper, to paraphrase the title of Levelt’s (1989) book.

Here, I am going to comment on the middle of the workflow, assuming a design has been agreed in the past and the drafting of the report is in the future. I will focus on the example lexical decision study completed by ML.

We will usually proceed through a series of steps in completing an investigation:

1. Select items and prepare a script

See the materials in the Dropbox R resources folder.

Typically, in conducting a psycholinguistic study, e.g. a study of reading, we will need to decide what items are presented for our participants to read: stimulus selection. How do we do this? It all depends on the design. In the materials provided, you will see that one can proceed by selecting a fairly loosely constrained set of items. I decided to select words for which psycholinguistic normative data were available, words that were monosyllabic and monomorphemic, and so on. I usually progress through a series of steps that I am careful to annotate: this is one part of the methodological information, the recipe for building the stimulus set, that must be recorded for others to reproduce my work.
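
One way to keep that recipe traceable is to write each selection step as its own commented line of R. What follows is only a minimal sketch: the norms table, its column names (word, syllables, morphemes, aoa) and its values are all invented for illustration and are not taken from the .xlsx file.

```r
# Hypothetical norms table: column names and values are invented for illustration
norms <- data.frame(
  word      = c("cat", "table", "cats", "dog", "zeal"),
  syllables = c(1, 2, 1, 1, 1),
  morphemes = c(1, 1, 2, 1, 1),
  aoa       = c(2.5, 3.1, 2.6, 2.4, NA),  # NA = no norm available for this word
  stringsAsFactors = FALSE
)

# Each selection step is a separate, annotated line so the recipe can be retraced
selected <- norms
selected <- selected[selected$syllables == 1, ]  # step 1: monosyllabic words only
selected <- selected[selected$morphemes == 1, ]  # step 2: monomorphemic words only
selected <- selected[!is.na(selected$aoa), ]     # step 3: normative data must exist

selected$word
```

Writing the steps out one at a time, rather than as a single combined condition, means that the number of items surviving each step can be reported if needed.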


Item selection, collation of norms, coding and script building are documented in:

indiv diffs lexical decision 210212 270312.xlsx

See these notes on getting normative data

See the script and bitmap used to collect data:

lexical decision instructions.bmp
lexical decision_320p10_2s_220212.rtf

Note that through these materials you can trace back the creation of the .rtf script to the construction steps in the notes in the .xlsx.

You will need to read materials on DMDX and data collection to see how we use the .rtf script to get the raw data files.

2. We test people (here using DMDX; Forster & Forster, 2003) and collect the data

The DMDX script will output an .azk file.

In lexical decision, or any other task requiring a keypress response, this file is all you need, but see the notes on how word naming script output differs.

lexical decision_320p10_2s_220212.azk

If you do a paper-and-pencil test, you will collect the results in an Excel spreadsheet:

Participants final.xls

We might typically use the Author Recognition Test (ART; see e.g. Stanovich & West, 1989; Masterson & Hayes, 2007), the TOWRE (Torgesen et al., 1999), the Phonological Awareness Battery, and so on.

3. We collate the data, cleaning up files (i.e. removing redundant information) and converting to R-readable formats (.csv or .txt) as necessary

We frequently move back and forth between the data collection and analysis steps to deal with problems such as misspelled subject IDs or incorrectly entered response data. See the notes in:

ML data collation 060612.xlsx
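
A problem like a misspelled subjectID can be caught in R as well as by eye. The following is a rough sketch under stated assumptions: the data frame, the IDs and the vector expected.ids are hypothetical stand-ins, not the contents of the collation file.

```r
# Hypothetical raw data with inconsistently typed subject IDs
raw <- data.frame(
  subjectID = c("A15", "a15 ", "A18", "AI8", "A20"),
  RT        = c(612, 580, 701, 655, 590),
  stringsAsFactors = FALSE
)

# The IDs recorded at testing (an assumed list, for illustration)
expected.ids <- c("A15", "A18", "A20")

# Normalise case and stray whitespace: this fixes some misspellings, not all
raw$subjectID.clean <- toupper(trimws(raw$subjectID))

# List the IDs that still do not match, to be corrected by hand and documented
unmatched <- setdiff(unique(raw$subjectID.clean), expected.ids)
unmatched
```

The cleaned ID is stored in a new column so the original entry stays visible, which keeps the chain from raw to final data intact.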

I think it is very important that we document a clear chain of steps from the raw data file to the final data file used in analysis:

ML data 080612.csv

ML scores 080612.csv

Note the different formats, .xlsx and .csv in use here.

We also often need to pull data about the items into data files – normative data or data about stimulus types and item numbers as used in the DMDX data collection scripts:

item norms 270312.csv
item coding info 270312.csv
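
Pulling item data into the response data can be sketched with merge() in base R. The column names here (item_number, Length) and all values are assumptions made for illustration; the real files may use different names.

```r
# Hypothetical response data: one row per subject-by-item observation
rts <- data.frame(
  subjectID   = rep(c("A15", "A18"), each = 3),
  item_number = c(1, 2, 3, 1, 2, 3),
  RT          = c(612, 580, 640, 701, 655, 590)
)

# Hypothetical item norms: note that item 3 has no entry
norms <- data.frame(
  item_number = c(1, 2),
  Length      = c(3, 4)
)

# all.x = TRUE keeps every observation, even where no norms are found
merged <- merge(rts, norms, by = "item_number", all.x = TRUE)

# Items left without norms after the merge need following up
missing.norms <- unique(merged$item_number[is.na(merged$Length)])
missing.norms
```

Using all.x = TRUE rather than the default means that unmatched items show up as NA instead of silently dropping out of the data.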

We will always need to be careful to check for errors even as we proceed through data analysis steps.
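
A few routine checks of this kind might look as follows. This is a sketch only: the data are invented, and the 200 ms and 2000 ms bounds are illustrative values, not thresholds used in the study.

```r
# Hypothetical collated data containing two deliberate problems
dat <- data.frame(
  subjectID   = c("A15", "A15", "A15", "A18", "A18", "A18", "A18"),
  item_number = c(1, 2, 3, 1, 2, 3, 3),
  RT          = c(612, 580, 45, 701, 655, 590, 590)
)

# Check 1: observations per subject should equal the number of items presented
n.per.subject <- table(dat$subjectID)

# Check 2: the same subject-by-item combination should not appear twice
dups <- dat[duplicated(dat[, c("subjectID", "item_number")]), ]

# Check 3: RTs outside a plausible window (bounds chosen for illustration only)
implausible <- dat[dat$RT < 200 | dat$RT > 2000, ]

n.per.subject
dups
implausible
```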

4. We then run a series of analyses, writing code to do the analysis, with comments to explain it

Possibly superseded code:

example code R 270312 020412 030412 110412 130612.R

We will go on to analyse the data files after they have been input to the workspace for R, as follows (see script):
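# read the collated response data and the participant scores (comma-separated)
lexical.RTs <- read.csv(file="ML data 080612.csv", header=TRUE, sep=",")
subjects <- read.csv(file="ML scores 080612.csv", header=TRUE, sep=",")

# read.table() expects whitespace-delimited text, so the item files are read from .txt versions
item.coding <- read.table(file="item coding info 270312.txt", header=TRUE)
item.norms <- read.table(file="item norms 270312.txt", header=TRUE)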
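
To give a flavour of commented analysis code, here is a minimal sketch of a by-subject summary. It uses small invented data frames in place of the files read in above, and the column names (subjectID, RT, accuracy, TOWRE) are assumptions rather than the names in the real files.

```r
# Hypothetical stand-ins for the data frames read in from file
lexical.RTs <- data.frame(
  subjectID = rep(c("A15", "A18"), each = 3),
  RT        = c(600, 620, 640, 700, 710, 720),
  accuracy  = c(1, 1, 0, 1, 1, 1)   # 1 = correct, 0 = error (assumed coding)
)
subjects <- data.frame(
  subjectID = c("A15", "A18"),
  TOWRE     = c(90, 85)
)

# Step 1: keep correct responses only, since error RTs are usually analysed separately
correct <- lexical.RTs[lexical.RTs$accuracy == 1, ]

# Step 2: compute the mean RT for each subject
subject.means <- aggregate(RT ~ subjectID, data = correct, FUN = mean)

# Step 3: attach each subject's test scores to their mean RT
subject.summary <- merge(subject.means, subjects, by = "subjectID")
subject.summary
```

Numbering the steps in the comments makes it easier to see later which parts of a script have been superseded.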


Forster, K. I., & Forster, J. C. (2003). DMDX: A Windows display program with millisecond accuracy. Behavior Research Methods, Instruments, & Computers, 35, 116-124.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Masterson, J., & Hayes, M. (2007). Development and data for UK versions of an author and title recognition test for adults. Journal of Research in Reading, 30, 212-219.
Stanovich, K. E., & West, R. F. (1989). Exposure to print and orthographic processing. Reading Research Quarterly, 24, 402-433.
Torgesen, J. K., Wagner, R. K., & Rashotte, C. A. (1999). TOWRE: Test of word reading efficiency. Austin, TX: Pro-Ed.
