tristanz / OpenIRT (http://people.fas.harvard.edu/~tzajonc/openirt.html)
Bayesian and Maximum Likelihood Estimation of Item Response Theory (IRT) Models
$ hg clone http://bitbucket.org/tristanz/openirt
| commit 25: | 09fe34d62567 |
| parent 24: | 12cc7127cb73 |
| branch: | default |
Changed (Δ61.8 KB):
Stata/openirt.ado (1 lines added, 1 lines removed)
Stata/openirt.pkg (5 lines added, 1 lines removed)
Stata/openirt.sthlp (11 lines added, 13 lines removed)
Stata/openirt_examples.do (187 lines added, 0 lines removed)
Stata/timss_children.dta (binary file changed)
Stata/timss_items.dta (binary file changed)
docs/Open IRT Slides.lyx (1531 lines added, 0 lines removed)
src/openirt.cpp (1 lines added, 1 lines removed)
Up to file-list Stata/openirt.ado:
| … | … | @@ -49,7 +49,7 @@ program openirt |
49 |
49 |
display as error "Warning: `v' is 100 percent 1, consider dropping item." |
50 |
50 |
} |
51 |
51 |
} |
52 |
||
52 |
qui recode _all (.=-9999) |
|
53 |
53 |
qui gen `group' = 1 |
54 |
54 |
qui outsheet id `group' `theta' `item_prefix'* using `response_file', replace delim(" ") noquote nolabel nonames |
55 |
55 |
* type `response_file' |
Up to file-list Stata/openirt.pkg:
| … | … | @@ -18,6 +18,8 @@ d OpenIRT relies on compiled code that w |
18 |
18 |
f openirt.ado |
19 |
19 |
f openirt.sthlp |
20 |
20 |
G WIN OpenIRT.exe openirt.exe |
21 |
G WIN64A OpenIRT.exe openirt.exe |
|
22 |
G WIN64I OpenIRT.exe openirt.exe |
|
21 |
23 |
G MAC openirt_osx openirt.exe |
22 |
24 |
G MACINTEL openirt_osx openirt.exe |
23 |
25 |
G MACINTEL64 openirt_osx openirt.exe |
| … | … | @@ -25,4 +27,6 @@ G LINUX openirt_linux openirt.exe |
25 |
27 |
F openirt.ini |
26 |
28 |
F naep_children.dta |
27 |
29 |
F naep_items.dta |
28 |
F |
|
30 |
F timss_children.dta |
|
31 |
F timss_items.dta |
|
32 |
F openirt_examples.do |
Up to file-list Stata/openirt.sthlp:
59 |
59 |
{opt model("3PL")} specifies the default model for items. The default model is the three parameter logistic model (3PL). {opt model("2PL")} forces the guessing parameter to zero, given the two parameter logistic model. Fixed items specified in {opt fixed_item_file(filename)} will override the default model type. This allows a mixture of 3PL and 2PL models, and fixed and free items. |
60 |
60 |
|
61 |
61 |
{phang} |
62 |
{opt theta(varname)} specifies variable name holding fixed trait or ability parameters. Any missing entries will be treated as free parameters. In most cases applications, theta is free and therefore this options should be left out. |
|
62 |
{opt theta(varname)} specifies variable name holding fixed trait or ability parameters. Any missing entries will be treated as free parameters. In most cases applications, theta is free and therefore this options should be left out.{p_end} |
|
63 |
63 |
|
64 |
64 |
{phang} |
65 |
{opt fixed_item_file("filename")} specifies filename holding fixed item types and parameters, such as items from the TIMSS or NAEP item bank. The file must include at least four variables: {it:id}, {it:type}, {it:a}, {it:b}, {it:c}. {it:id} gives the unique numeric item identifier that matches the {opt item_prefix("prefix")} postfix. {it:type} should equal 1 for 2PL items and 2 for 3PL items. {it:a} is the item discrimination parameter, {it:b} is the item difficulty parameter, and {it:c} is the item guessing parameter. Note: {cmd: openirt} assumes all items use the normal ogive metric ({it:D = 1.7}). |
|
65 |
{opt fixed_item_file("filename")} specifies filename holding fixed item types and parameters, such as items from the TIMSS or NAEP item bank. The file must include at least four variables: {it:id}, {it:type}, {it:a}, {it:b}, {it:c}. {it:id} gives the unique numeric item identifier that matches the {opt item_prefix("prefix")} postfix. {it:type} should equal 1 for 2PL items and 2 for 3PL items. {it:a} is the item discrimination parameter, {it:b} is the item difficulty parameter, and {it:c} is the item guessing parameter. Note: {cmd: openirt} assumes all items use the normal ogive metric ({it:D = 1.7}).{p_end} |
|
66 |
66 |
|
67 |
67 |
{dlgtab:MCMC Options} |
68 |
68 |
|
69 |
69 |
{phang} |
70 |
{opt samplesize(2000)} specifies the number of post burn in MCMC iterations (default = 2000). Plausible values are drawn at evenly spaced intervals from this sample, and EAP estimates are based on the mean of the entire sample. Larger sample sizes will reduce the monte carlo standard error. In most applications the standard error of measurement dominates the monte carlo standard error after several thousand iterations, although longer chains should be used in any final analysis. |
|
70 |
{opt samplesize(2000)} specifies the number of post burn in MCMC iterations (default = 2000). Plausible values are drawn at evenly spaced intervals from this sample, and EAP estimates are based on the mean of the entire sample. Larger sample sizes will reduce the monte carlo standard error. In most applications the standard error of measurement dominates the monte carlo standard error after several thousand iterations, although longer chains should be used in any final analysis.{p_end} |
|
71 |
71 |
|
72 |
72 |
{phang} |
73 |
{opt burnin(1000)} specifies the number of burn in MCMC iterations (default = 1000). MCMC estimates rely on the chain converging to a stationary distribution. In most IRT applications this occurs quite quickly -- within several hundred iterations. If estimates do not appear to be converging, increase the burn in period. |
|
73 |
{opt burnin(1000)} specifies the number of burn in MCMC iterations (default = 1000). MCMC estimates rely on the chain converging to a stationary distribution. In most IRT applications this occurs quite quickly -- within several hundred iterations. If estimates do not appear to be converging, increase the burn in period.{p_end} |
|
74 |
74 |
|
75 |
75 |
{title:Discussion} |
76 |
76 |
|
86 |
86 |
|
87 |
87 |
{pstd}{it:Note on speed}: Estimation can be slow due to the large number of free parameters estimated using MCMC simulation. Users with large data sets may wish to use small subsamples of data before running an analysis on the full sample. On many systems a built in progress bar does not currently display in Stata.{p_end} |
88 |
88 |
|
89 |
{title:General instruction |
|
89 |
{title:General instructions} |
|
90 |
||
90 |
91 |
{pstd}The easiest way to learn openirt is to work through the examples below. Each follows these rough steps:{p_end} |
91 |
92 |
|
92 |
{phang |
|
93 |
{phang}1. For complex tests, create a {opt fixed_item_file}. A fixed item file is required if you have both 2PL and 3PL items or if any of the items are fixed.{p_end} |
|
93 |
94 |
|
94 |
{phang |
|
95 |
{phang}2. Load the response data. Responses should be coded 0/1 (numeric) for incorrect/correct. For multiple tests forms, each row (unit) should include all possible items; items that a unit did not receive should be set to missing. Items must all have the same prefix, e.g., item1, item2, etc.{p_end} |
|
95 |
96 |
|
96 |
{phang |
|
97 |
{phang}3. Run the appropriate openirt command. {p_end} |
|
97 |
98 |
|
98 |
{phang2}3. Analyze the data using the saved trait and item parameter files.{pend} |
|
99 |
||
99 |
{phang}3. Analyze the data using the saved trait and item parameter files.{p_end} |
|
100 |
100 |
|
101 |
101 |
{title:Examples: Scoring single form.} |
102 |
||
102 |
103 |
{phang2}{cmd:. sysuse naep_children, clear}{p_end} |
103 |
104 |
{phang2}{cmd:. openirt, id(id) save_item_parameters("items.dta") save_trait_parameters("traits.dta") item_prefix("item")}{p_end} |
104 |
105 |
|
| … | … | @@ -146,6 +147,3 @@ Van der Linden, W.J. and Hambleton, R.K. |
146 |
147 |
{pstd}Harvard University, Cambridge, MA 02138.{p_end} |
147 |
148 |
{pstd}Email: tzajonc@fas.harvard.edu{p_end} |
148 |
149 |
{pstd}Web: http://www.people.fas.harvard.edu/~tzajonc/{p_end} |
149 |
||
150 |
||
151 |
Up to file-list Stata/openirt_examples.do:
1 |
* Example code using OpenIRT |
|
2 |
||
3 |
* Install: |
|
4 |
cap program drop openirt |
|
5 |
net install http://www.people.fas.harvard.edu/~tzajonc/stata/openirt/openirt, force |
|
6 |
||
7 |
clear |
|
8 |
set mem 50m |
|
9 |
||
10 |
* Save 12 NAEP items locally |
|
11 |
sysuse naep_items, clear |
|
12 |
save naep_items, replace |
|
13 |
count |
|
14 |
||
15 |
* Easier to mix and match with c=0 for 2PL |
|
16 |
recode c (.=0) |
|
17 |
||
18 |
** DEMO 1: Item characteristic curves |
|
19 |
||
20 |
* 2PL ICC (row 3) |
|
21 |
twoway (function 1/(1+exp(-1.7*a[3]*(x-b[3]))), range(-4 4)), /// |
|
22 |
xtitle("Theta") ytitle("P(X=1|Theta)") title("Item Characteristic Curve") |
|
23 |
||
24 |
* overlay two ICCs |
|
25 |
twoway (function 1/(1+exp(-1.7*a[3]*(x-b[3]))), range(-4 4)) /// |
|
26 |
(function 1/(1+exp(-1.7*a[4]*(x-b[4]))), range(-4 4)), /// |
|
27 |
xtitle("Theta") ytitle("P(X=1|Theta)") title("Item Characteristic Curve") /// |
|
28 |
legend(order(1 "Item 16" 2 "Item 17")) |
|
29 |
||
30 |
* 2pl and 3PL curve |
|
31 |
twoway (function 1/(1+exp(-1.7*a[3]*(x-b[3]))), range(-4 4)) /// |
|
32 |
(function c[1] + (1-c[1])/(1+exp(-1.7*a[1]*(x-b[1]))), range(-4 4)), /// |
|
33 |
xtitle("Theta") ytitle("P(X=1|Theta)") title("Item Characteristic Curve") /// |
|
34 |
legend(order(1 "Item 16" 2 "Item 12")) |
|
35 |
||
36 |
** Demo 2: Item information curves |
|
37 |
||
38 |
* 2PL IIC |
|
39 |
local i = 3 |
|
40 |
local iic2pl "(1.7*a[`i'])^2*(1/(1+exp(-1.7*a[`i']*(x-b[`i']))))*(1-(1/(1+exp(-1.7*a[`i']*(x-b[`i'])))))" |
|
41 |
twoway (function `iic2pl', range(-4 4)), /// |
|
42 |
xtitle("Theta") ytitle("I(Theta)") title("Item Information Curve") |
|
43 |
||
44 |
* 3PL IIC |
|
45 |
local i = 2 |
|
46 |
local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))" |
|
47 |
local iic3pl "((1.7*a[`i'])^2*(1-`p')/`p')*((`p'-c[`i'])/(1-c[`i']))^2" |
|
48 |
twoway (function `iic3pl', range(-4 4)), /// |
|
49 |
xtitle("Theta") ytitle("I(Theta)") title("Item Information Curve") |
|
50 |
||
51 |
* Multiple graphs |
|
52 |
forvalues i = 1/9 { |
|
53 |
local title = "Item " + string(id[`i']) |
|
54 |
local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))" |
|
55 |
local iic3pl "((1.7*a[`i'])^2*(1-`p')/`p')*((`p'-c[`i'])/(1-c[`i']))^2" |
|
56 |
twoway (function `iic3pl', range(-5 5)), /// |
|
57 |
xtitle("Theta") ytitle("I(Theta)") title("`title'") |
|
58 |
graph save tmp`i', replace |
|
59 |
} |
|
60 |
graph combine tmp1.gph tmp2.gph tmp3.gph tmp4.gph tmp5.gph tmp6.gph tmp7.gph tmp8.gph tmp9.gph, ycommon xcommon |
|
61 |
||
62 |
* Test information |
|
63 |
replace c = 0 if c==. |
|
64 |
local tif "0" |
|
65 |
forvalues i = 1/12 { |
|
66 |
local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))" |
|
67 |
local iic3pl "((1.7*a[`i'])^2*(1-`p')/`p')*((`p'-c[`i'])/(1-c[`i']))^2" |
|
68 |
local tif "`tif'+`iic3pl'" |
|
69 |
} |
|
70 |
local se "1/sqrt(`tif')" |
|
71 |
twoway (function `tif', range(-4 4)) /// |
|
72 |
(function `se', range(-4 4) yaxis(2)), /// |
|
73 |
xtitle("Theta") ytitle("I(Theta)") title("Test Information") ytitle("Standard Error", axis(2)) /// |
|
74 |
legend(order(1 "Test Information" 2 "Expected Standard Error")) |
|
75 |
||
76 |
** Demo 3: Test characteristic curve |
|
77 |
replace c = 0 if c==. |
|
78 |
local tcc "0" |
|
79 |
forvalues i = 1/12 { |
|
80 |
local p "(c[`i']+(1-c[`i'])/(1+exp(-1.7*a[`i']*(x-b[`i']))))" |
|
81 |
local tcc "`tcc'+`p'/12" |
|
82 |
} |
|
83 |
twoway (function `tcc', range(-4 4)), /// |
|
84 |
xtitle("Theta") ytitle("Percent Correct") title("Test Characteristic Curve") |
|
85 |
||
86 |
** DEMO 4: Estimating IRT models using OpenIRT |
|
87 |
||
88 |
* load response data |
|
89 |
sysuse naep_children, clear |
|
90 |
||
91 |
* Preliminary exploration: percent correct score |
|
92 |
egen percent_correct = rowmean(item*) |
|
93 |
kdensity percent_correct, xtitle("Percent Correct") title(NAEP Percent Correct Score Distribution) |
|
94 |
||
95 |
* Often useful to check that items are positive correlated |
|
96 |
* with percent correct score |
|
97 |
corr percent_correct item* |
|
98 |
||
99 |
* Read help... |
|
100 |
help openirt |
|
101 |
||
102 |
* Example 1: Estimate both item parameters and ability for single test |
|
103 |
openirt, id(id) item_prefix(item) save_item_parameters("items.dta") save_trait_parameters("traits.dta") |
|
104 |
||
105 |
* Merge in ability estimates |
|
106 |
merge id using traits, sort |
|
107 |
||
108 |
* Compare distributions using different estimates |
|
109 |
||
110 |
* Multiple plausible values can be combined to form better |
|
111 |
* kdensity estimate. Using one plausible value works better though too. |
|
112 |
kdensity(theta_pv1), bw(.4) gen(x1 d1) |
|
113 |
kdensity(theta_pv2), bw(.4) gen(x2 d2) at(x1) |
|
114 |
kdensity(theta_pv3), bw(.4) gen(x3 d3) at(x1) |
|
115 |
kdensity(theta_pv4), bw(.4) gen(x4 d4) at(x1) |
|
116 |
kdensity(theta_pv5), bw(.4) gen(x5 d5) at(x1) |
|
117 |
egen d = rowmean(d*) |
|
118 |
line(d x1) |
|
119 |
||
120 |
twoway (kdensity theta, bw(.4)) (kdensity theta_eap, bw(.4)) /// |
|
121 |
(line d x1) (kdensity theta_mle, bw(.4)), /// |
|
122 |
xtitle("Theta") title("True Theta vs EAP, PV, MLE Estimates") /// |
|
123 |
legend(order(1 "True" 2 "EAP" 3 "PV" 4 "MLE")) |
|
124 |
||
125 |
drop x* d* |
|
126 |
||
127 |
* QQ plot distribution comparisons |
|
128 |
qqplot(theta_eap theta), xtitle("Theta (True)") ytitle("Theta (EAP)") title("") saving(tmp1.gph, replace) |
|
129 |
qqplot(theta_mle theta), xtitle("Theta (True)") ytitle("Theta (MLE)") title("") saving(tmp2.gph, replace) |
|
130 |
qqplot(theta_pv1 theta), xtitle("Theta (True)") ytitle("Theta (PV1)") title("") saving(tmp3.gph, replace) |
|
131 |
qqplot(theta_pv2 theta), xtitle("Theta (True)") ytitle("Theta (PV2)") title("") saving(tmp4.gph, replace) |
|
132 |
graph combine tmp1.gph tmp2.gph tmp3.gph tmp4.gph |
|
133 |
||
134 |
* Graph TRUE vs EAP |
|
135 |
twoway (scatter theta_eap theta) (function y=x, range(-3 3)), /// |
|
136 |
xtitle("Theta (True)") ytitle("Theta (EAP)") title("") /// |
|
137 |
text(3 3 "y = x", place(e)) legend(off) |
|
138 |
||
139 |
* Graph TRUE vs MLE |
|
140 |
twoway (scatter theta_mle theta) (function y=x, range(-3 3)), /// |
|
141 |
xtitle("Theta (True)") ytitle("Theta (MLE)") title("") /// |
|
142 |
text(3 3 "y = x", place(e)) legend(off) |
|
143 |
||
144 |
* Example 2: Link two test forms |
|
145 |
sysuse naep_children, clear |
|
146 |
||
147 |
* Simulate two test forms by setting some responses to missing |
|
148 |
* Must leave some common items. |
|
149 |
recode item1-item6 (0/1=.) if _n <= 250 |
|
150 |
recode item7-item13 (0/1=.) if _n >250 |
|
151 |
||
152 |
* Estimate |
|
153 |
openirt, id(id) item_prefix(item) save_item_parameters("items.dta") save_trait_parameters("traits.dta") |
|
154 |
||
155 |
* Merge in ability estimates |
|
156 |
merge id using traits, sort |
|
157 |
||
158 |
* Graph TRUE vs EAP |
|
159 |
twoway (scatter theta_eap theta) (function y=x, range(-3 3)), /// |
|
160 |
xtitle("Theta (True)") ytitle("Theta (EAP)") title("") /// |
|
161 |
text(3 3 "y = x", place(e)) legend(off) |
|
162 |
||
163 |
* Graph TRUE vs MLE |
|
164 |
twoway (scatter theta_mle theta) (function y=x, range(-3 3)), /// |
|
165 |
xtitle("Theta (True)") ytitle("Theta (MLE)") title("") /// |
|
166 |
text(3 3 "y = x", place(e)) legend(off) |
|
167 |
||
168 |
* Example 3: Link to TIMSS using test formed from TIMSS item bank. |
|
169 |
sysuse timss_items, clear |
|
170 |
save fixed_items, replace |
|
171 |
||
172 |
sysuse timss_children, clear |
|
173 |
openirt, id(id) save_item_parameters("items.dta") save_trait_parameters("traits.dta") /// |
|
174 |
fixed_item_file("fixed_items.dta") item_prefix(item) |
|
175 |
||
176 |
* Merge in ability estimates |
|
177 |
merge id using traits, sort |
|
178 |
||
179 |
* Graph TRUE vs EAP |
|
180 |
twoway (scatter theta_eap theta) (function y=x, range(-3 3)), /// |
|
181 |
xtitle("Theta (True)") ytitle("Theta (EAP)") title("") /// |
|
182 |
text(3 3 "y = x", place(e)) legend(off) |
|
183 |
||
184 |
* Graph TRUE vs MLE |
|
185 |
twoway (scatter theta_mle theta) (function y=x, range(-3 3)), /// |
|
186 |
xtitle("Theta (True)") ytitle("Theta (MLE)") title("") /// |
|
187 |
text(3 3 "y = x", place(e)) legend(off) |
Up to file-list Stata/timss_children.dta:
- |
Diff size exceeds threshold (27.2 KB) — view raw? |
Up to file-list Stata/timss_items.dta:
Up to file-list docs/Open IRT Slides.lyx:
- |
Diff size exceeds threshold (26.2 KB) — view raw? |
Up to file-list src/openirt.cpp:
| … | … | @@ -42,7 +42,7 @@ int main(int argc, char* argv[]) { |
42 |
42 |
cout << "Adding missing responses..." << endl; |
43 |
43 |
for(int i = 0; i < responses.num_responses; ++i) { |
44 |
44 |
for(int j = 0; j < items.num_items; ++j) { |
45 |
if( |
|
45 |
if(approx_equal(responses.x(i, j),MISSING)) { |
|
46 |
46 |
if(items.type(j) == TYPE_2PL) { |
47 |
47 |
Missing2PLParameter p(false, "missing", i, j); |
48 |
48 |
sampler.AddStep(new GibbsStep<Missing2PLParameter, int>(p)); |
