tristanz / OpenIRT (http://people.fas.harvard.edu/~tzajonc/openirt.html)

Bayesian and Maximum Likelihood Estimation of Item Response Theory (IRT) Models

Clone this repository (size: 589.9 KB): HTTPS / SSH
$ hg clone http://bitbucket.org/tristanz/openirt
commit 25: 09fe34d62567
parent 24: 12cc7127cb73
branch: default
Added slides, TIMSS examples, fixed missing responses.
Tristan Zajonc / tristanz
8 months ago

Changed (Δ61.8 KB):

raw changeset »

Stata/openirt.ado (1 lines added, 1 lines removed)

Stata/openirt.pkg (5 lines added, 1 lines removed)

Stata/openirt.sthlp (11 lines added, 13 lines removed)

Stata/openirt_examples.do (187 lines added, 0 lines removed)

Stata/timss_children.dta (binary file changed)

Stata/timss_items.dta (binary file changed)

docs/Open IRT Slides.lyx (1531 lines added, 0 lines removed)

src/openirt.cpp (1 lines added, 1 lines removed)

Up to file-list Stata/openirt.ado:

@@ -49,7 +49,7 @@ program openirt
49
49
			display as error "Warning: `v' is 100 percent 1, consider dropping item."
50
50
		}
51
51
	}
52
	
52
	qui recode _all (.=-9999)
53
53
	qui gen `group' = 1
54
54
	qui outsheet id `group' `theta' `item_prefix'* using `response_file', replace delim(" ") noquote nolabel nonames
55
55
	* type `response_file'

Up to file-list Stata/openirt.pkg:

@@ -18,6 +18,8 @@ d OpenIRT relies on compiled code that w
18
18
f openirt.ado
19
19
f openirt.sthlp
20
20
G WIN	OpenIRT.exe	openirt.exe
21
G WIN64A	OpenIRT.exe	openirt.exe
22
G WIN64I	OpenIRT.exe	openirt.exe
21
23
G MAC	openirt_osx	openirt.exe
22
24
G MACINTEL	openirt_osx	openirt.exe
23
25
G MACINTEL64	openirt_osx	openirt.exe
@@ -25,4 +27,6 @@ G LINUX openirt_linux openirt.exe
25
27
F openirt.ini
26
28
F naep_children.dta
27
29
F naep_items.dta
28
F naep.do
30
F timss_children.dta
31
F timss_items.dta
32
F openirt_examples.do

Up to file-list Stata/openirt.sthlp:

59
59
{opt model("3PL")} specifies the default model for items.  The default model is the three parameter logistic model (3PL).  {opt model("2PL")} forces the guessing parameter to zero, given the two parameter logistic model.  Fixed items specified in {opt fixed_item_file(filename)} will override the default model type.  This allows a mixture of 3PL and 2PL models, and fixed and free items.
60
60
61
61
{phang}
62
{opt theta(varname)} specifies variable name holding fixed trait or ability parameters.  Any missing entries will be treated as free parameters.  In most cases applications, theta is free and therefore this options should be left out.
62
{opt theta(varname)} specifies variable name holding fixed trait or ability parameters.  Any missing entries will be treated as free parameters.  In most cases applications, theta is free and therefore this options should be left out.{p_end}
63
63
64
64
{phang}
65
{opt fixed_item_file("filename")} specifies filename holding fixed item types and parameters, such as items from the TIMSS or NAEP item bank.  The file must include at least four variables: {it:id}, {it:type}, {it:a}, {it:b}, {it:c}.  {it:id} gives the unique numeric item identifier that matches the {opt item_prefix("prefix")} postfix.  {it:type} should equal 1 for 2PL items and 2 for 3PL items. {it:a} is the item discrimination parameter, {it:b} is the item difficulty parameter, and {it:c} is the  item guessing parameter. Note: {cmd: openirt} assumes all items use the normal ogive metric ({it:D = 1.7}).
65
{opt fixed_item_file("filename")} specifies filename holding fixed item types and parameters, such as items from the TIMSS or NAEP item bank.  The file must include at least four variables: {it:id}, {it:type}, {it:a}, {it:b}, {it:c}.  {it:id} gives the unique numeric item identifier that matches the {opt item_prefix("prefix")} postfix.  {it:type} should equal 1 for 2PL items and 2 for 3PL items. {it:a} is the item discrimination parameter, {it:b} is the item difficulty parameter, and {it:c} is the  item guessing parameter. Note: {cmd: openirt} assumes all items use the normal ogive metric ({it:D = 1.7}).{p_end}
66
66
67
67
{dlgtab:MCMC Options}
68
68
69
69
{phang}
70
{opt samplesize(2000)} specifies the number of post burn in MCMC iterations (default = 2000). Plausible values are drawn at evenly spaced intervals from this sample, and EAP estimates are based on the mean of the entire sample. Larger sample sizes will reduce the monte carlo standard error.  In most applications the standard error of measurement dominates the monte carlo standard error after several thousand iterations, although longer chains should be used in any final analysis.
70
{opt samplesize(2000)} specifies the number of post burn in MCMC iterations (default = 2000). Plausible values are drawn at evenly spaced intervals from this sample, and EAP estimates are based on the mean of the entire sample. Larger sample sizes will reduce the monte carlo standard error.  In most applications the standard error of measurement dominates the monte carlo standard error after several thousand iterations, although longer chains should be used in any final analysis.{p_end}
71
71
72
72
{phang}
73
{opt burnin(1000)} specifies the number of burn in MCMC iterations (default = 1000).  MCMC estimates rely on the chain converging to a stationary distribution.  In most IRT applications this occurs quite quickly -- within several hundred iterations.  If estimates do not appear to be converging, increase the burn in period.
73
{opt burnin(1000)} specifies the number of burn in MCMC iterations (default = 1000).  MCMC estimates rely on the chain converging to a stationary distribution.  In most IRT applications this occurs quite quickly -- within several hundred iterations.  If estimates do not appear to be converging, increase the burn in period.{p_end}
74
74
75
75
{title:Discussion}
76
76
86
86
87
87
{pstd}{it:Note on speed}: Estimation can be slow due to the large number of free  parameters estimated using MCMC simulation.  Users with large data sets may wish to use small subsamples of data before running an analysis on the full sample.  On many systems a built in progress bar does not currently display in Stata.{p_end}
88
88
89
{title:General instruction.}
89
{title:General instructions}
90
90
91
{pstd}The easiest way to learn openirt is to work through the examples below.  Each follows these rough steps:{p_end}
91
92
92
{phang2}1. For complex tests, create a {opt fixed_item_file}.  A fixed item file is required if you have both 2PL and 3PL items or if any of the items are fixed.{p_end}
93
{phang}1. For complex tests, create a {opt fixed_item_file}.  A fixed item file is required if you have both 2PL and 3PL items or if any of the items are fixed.{p_end}
93
94
94
{phang2}2. Load the response data.  Responses should be coded 0/1 (numeric) for incorrect/correct.  For multiple tests forms, each row (unit) should include all possible items; items that a unit did not receive should be set to missing.  Items must all have the same prefix, e.g., item1, item2, etc.{pend}
95
{phang}2. Load the response data.  Responses should be coded 0/1 (numeric) for incorrect/correct.  For multiple tests forms, each row (unit) should include all possible items; items that a unit did not receive should be set to missing.  Items must all have the same prefix, e.g., item1, item2, etc.{p_end}
95
96
96
{phang2}3. Run the appropriate openirt command.  {pend}
97
{phang}3. Run the appropriate openirt command.  {p_end}
97
98
98
{phang2}3. Analyze the data using the saved trait and item parameter files.{pend}
99
99
{phang}3. Analyze the data using the saved trait and item parameter files.{p_end}
100
100
101
101
{title:Examples:  Scoring single form.}
102
102
103
{phang2}{cmd:. sysuse naep_children, clear}{p_end}
103
104
{phang2}{cmd:. openirt, id(id) save_item_parameters("items.dta") save_trait_parameters("traits.dta") item_prefix("item")}{p_end}
104
105
@@ -146,6 +147,3 @@ Van der Linden, W.J. and Hambleton, R.K.
146
147
{pstd}Harvard University, Cambridge, MA 02138.{p_end}
147
148
{pstd}Email: tzajonc@fas.harvard.edu{p_end}
148
149
{pstd}Web: http://www.people.fas.harvard.edu/~tzajonc/{p_end}
149
150
151

Up to file-list Stata/openirt_examples.do:

1
* Example code using OpenIRT
2
3
* Install:
4
cap program drop openirt
5
net install http://www.people.fas.harvard.edu/~tzajonc/stata/openirt/openirt, force
6
7
clear
8
set mem 50m
9
10
* Save 12 NAEP items locally
11
sysuse naep_items, clear
12
save naep_items, replace
13
count
14
15
* Easier to mix and match with c=0 for 2PL
16
recode c (.=0)
17
18
** DEMO 1: Item characteristic curves
19
20
* 2PL ICC (row 3)
21
twoway (function 1/(1+exp(-1.7*a[3]*(x-b[3]))), range(-4 4)), ///
22
	xtitle("Theta") ytitle("P(X=1|Theta)") title("Item Characteristic Curve")
23
24
* overlay two ICCs
25
twoway (function 1/(1+exp(-1.7*a[3]*(x-b[3]))), range(-4 4)) ///
26
	(function 1/(1+exp(-1.7*a[4]*(x-b[4]))), range(-4 4)), ///
27
	xtitle("Theta") ytitle("P(X=1|Theta)") title("Item Characteristic Curve") ///
28
	legend(order(1 "Item 16" 2 "Item 17"))
29
	
30
* 2pl and 3PL curve
31
twoway (function 1/(1+exp(-1.7*a[3]*(x-b[3]))), range(-4 4)) ///
32
	(function c[1] + (1-c[1])/(1+exp(-1.7*a[1]*(x-b[1]))), range(-4 4)), ///
33
	xtitle("Theta") ytitle("P(X=1|Theta)") title("Item Characteristic Curve") ///
34
	legend(order(1 "Item 16" 2 "Item 12"))
35
36
** Demo 2: Item information curves
37
38
* 2PL IIC
39
local i = 3
40
local iic2pl "(1.7*a[`i'])^2*(1/(1+exp(-1.7*a[`i']*(x-b[`i']))))*(1-(1/(1+exp(-1.7*a[`i']*(x-b[`i'])))))"
41
twoway (function `iic2pl', range(-4 4)), ///
42
	xtitle("Theta") ytitle("I(Theta)") title("Item Information Curve")
43
44
* 3PL IIC
45
local i = 2
46
local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))"
47
local iic3pl "((1.7*a[`i'])^2*(1-`p')/`p')*((`p'-c[`i'])/(1-c[`i']))^2"
48
twoway (function `iic3pl', range(-4 4)), ///
49
	xtitle("Theta") ytitle("I(Theta)") title("Item Information Curve")
50
51
* Multiple graphs
52
forvalues i = 1/9 {
53
	local title = "Item " + string(id[`i'])
54
	local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))"
55
	local iic3pl "((1.7*a[`i'])^2*(1-`p')/`p')*((`p'-c[`i'])/(1-c[`i']))^2"
56
	twoway (function `iic3pl', range(-5 5)), ///
57
		xtitle("Theta") ytitle("I(Theta)") title("`title'")
58
	graph save tmp`i', replace
59
}
60
graph combine tmp1.gph tmp2.gph tmp3.gph tmp4.gph tmp5.gph tmp6.gph tmp7.gph tmp8.gph tmp9.gph, ycommon xcommon
61
62
* Test information
63
replace c = 0 if c==.
64
local tif "0"
65
forvalues i = 1/12 {
66
	local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))"
67
	local iic3pl "((1.7*a[`i'])^2*(1-`p')/`p')*((`p'-c[`i'])/(1-c[`i']))^2"
68
	local tif "`tif'+`iic3pl'"
69
}
70
local se "1/sqrt(`tif')"
71
twoway (function `tif', range(-4 4)) /// 
72
	(function `se', range(-4 4) yaxis(2)), ///
73
	xtitle("Theta") ytitle("I(Theta)") title("Test Information") ytitle("Standard Error", axis(2)) ///
74
	legend(order(1 "Test Information" 2 "Expected Standard Error"))
75
76
** Demo 3: Test characteristic curve
77
replace c = 0 if c==.
78
local tcc "0"
79
forvalues i = 1/12 {
80
	local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))"
81
	local tcc "`tcc'+`p'/12"
82
}
83
twoway (function `tcc', range(-4 4)), /// 
84
	xtitle("Theta") ytitle("Percent Correct") title("Test Characteristic Curve")
85
86
** DEMO 4: Estimating IRT models using OpenIRT
87
88
* load response data
89
sysuse naep_children, clear
90
91
* Preliminary exploration: percent correct score
92
egen percent_correct = rowmean(item*)
93
kdensity percent_correct, xtitle("Percent Correct") title(NAEP Percent Correct Score Distribution)
94
95
* Often useful to check that items are positive correlated 
96
* with percent correct score
97
corr percent_correct item*
98
99
* Read help...
100
help openirt
101
102
* Example 1: Estimate both item parameters and ability for single test
103
openirt, id(id) item_prefix(item) save_item_parameters("items.dta") save_trait_parameters("traits.dta")
104
105
* Merge in ability estimates
106
merge id using traits, sort
107
108
* Compare distributions using different estimates
109
	
110
* Multiple plausible values can be combined to form better
111
* kdensity estimate.  Using one plausible value works better though too.
112
kdensity(theta_pv1), bw(.4) gen(x1 d1)
113
kdensity(theta_pv2), bw(.4) gen(x2 d2) at(x1)
114
kdensity(theta_pv3), bw(.4) gen(x3 d3) at(x1)
115
kdensity(theta_pv4), bw(.4) gen(x4 d4) at(x1)
116
kdensity(theta_pv5), bw(.4) gen(x5 d5) at(x1)
117
egen d = rowmean(d*)
118
line(d x1)
119
120
twoway (kdensity theta, bw(.4)) (kdensity theta_eap, bw(.4)) ///
121
	(line d x1) (kdensity theta_mle, bw(.4)), ///
122
	xtitle("Theta") title("True Theta vs EAP, PV, MLE Estimates") ///
123
	legend(order(1 "True" 2 "EAP" 3 "PV" 4 "MLE"))
124
125
drop x* d*
126
127
* QQ plot distribution comparisons
128
qqplot(theta_eap theta), xtitle("Theta (True)") ytitle("Theta (EAP)") title("") saving(tmp1.gph, replace)
129
qqplot(theta_mle theta), xtitle("Theta (True)") ytitle("Theta (MLE)") title("") saving(tmp2.gph, replace)
130
qqplot(theta_pv1 theta), xtitle("Theta (True)") ytitle("Theta (PV1)") title("") saving(tmp3.gph, replace)
131
qqplot(theta_pv2 theta), xtitle("Theta (True)") ytitle("Theta (PV2)") title("") saving(tmp4.gph, replace)
132
graph combine tmp1.gph tmp2.gph tmp3.gph tmp4.gph
133
134
* Graph TRUE vs EAP
135
twoway (scatter theta_eap theta) (function y=x, range(-3 3)), ///
136
	xtitle("Theta (True)") ytitle("Theta (EAP)") title("") ///
137
  text(3 3 "y = x", place(e)) legend(off)
138
139
* Graph TRUE vs MLE
140
twoway (scatter theta_mle theta) (function y=x, range(-3 3)), ///
141
	xtitle("Theta (True)") ytitle("Theta (MLE)") title("") ///
142
  text(3 3 "y = x", place(e)) legend(off)
143
144
* Example 2: Link two test forms
145
sysuse naep_children, clear
146
147
* Simulate two test forms by setting some responses to missing
148
* Must leave some common items.
149
recode item1-item6 (0/1=.) if _n <= 250
150
recode item7-item13 (0/1=.) if _n >250
151
152
* Estimate
153
openirt, id(id) item_prefix(item) save_item_parameters("items.dta") save_trait_parameters("traits.dta")
154
155
* Merge in ability estimates
156
merge id using traits, sort
157
158
* Graph TRUE vs EAP
159
twoway (scatter theta_eap theta) (function y=x, range(-3 3)), ///
160
	xtitle("Theta (True)") ytitle("Theta (EAP)") title("") ///
161
  text(3 3 "y = x", place(e)) legend(off)
162
163
* Graph TRUE vs MLE
164
twoway (scatter theta_mle theta) (function y=x, range(-3 3)), ///
165
	xtitle("Theta (True)") ytitle("Theta (MLE)") title("") ///
166
  text(3 3 "y = x", place(e)) legend(off)
167
168
* Example 3: Link to TIMSS using test formed from TIMSS item bank.
169
sysuse timss_items, clear
170
save fixed_items, replace
171
172
sysuse timss_children, clear
173
openirt, id(id) save_item_parameters("items.dta") save_trait_parameters("traits.dta") ///
174
	fixed_item_file("fixed_items.dta") item_prefix(item)
175
176
* Merge in ability estimates
177
merge id using traits, sort
178
179
* Graph TRUE vs EAP
180
twoway (scatter theta_eap theta) (function y=x, range(-3 3)), ///
181
	xtitle("Theta (True)") ytitle("Theta (EAP)") title("") ///
182
  text(3 3 "y = x", place(e)) legend(off)
183
184
* Graph TRUE vs MLE
185
twoway (scatter theta_mle theta) (function y=x, range(-3 3)), ///
186
	xtitle("Theta (True)") ytitle("Theta (MLE)") title("") ///
187
  text(3 3 "y = x", place(e)) legend(off)

Up to file-list Stata/timss_children.dta:

-
Diff size exceeds threshold (27.2 KB)view raw?

Up to file-list Stata/timss_items.dta:

Binary file has changed or diff was empty.

Up to file-list docs/Open IRT Slides.lyx:

-
Diff size exceeds threshold (26.2 KB)view raw?

Up to file-list src/openirt.cpp:

@@ -42,7 +42,7 @@ int main(int argc, char* argv[]) {
42
42
	cout << "Adding missing responses..." << endl;
43
43
	for(int i = 0; i < responses.num_responses; ++i) {
44
44
		for(int j = 0; j < items.num_items; ++j) {
45
			if(responses.x(i, j)==MISSING) {
45
			if(approx_equal(responses.x(i, j),MISSING)) {
46
46
				if(items.type(j) == TYPE_2PL) {
47
47
					Missing2PLParameter p(false, "missing", i, j);
48
48
					sampler.AddStep(new GibbsStep<Missing2PLParameter, int>(p));