tristanz / ScytheMCMC (http://bytebucket.org/tristanz/scythemcmc/wiki/html/index.html)
A Scythe Markov Chain Monte Carlo C++ Framework
Clone this repository (size: 1.0 MB): HTTPS / SSH
$ hg clone http://bitbucket.org/tristanz/scythemcmc
| commit 13: | 9844e0828091 |
| parent 12: | fd1ece924824 |
| branch: | default |
| tags: | tip |
Updates. Switched to Lecuyer and enable seed saving.
ScytheMCMC /
ScytheBugs.txt
| # Introduced |
|---|
1 |
cd0e21380937 |
Scythe Bugs and Missing Features. |
2 |
cd0e21380937 |
|
3 |
cd0e21380937 |
Bugs: |
4 |
782fbff1c0d8 |
- mean, sd, var, matrix methods etc are templated to be the same type as the |
5 |
782fbff1c0d8 |
matrix. However these functions should return double, never int. |
6 |
782fbff1c0d8 |
|
7 |
782fbff1c0d8 |
- Some matrix functions give compile errors because for int. For instance |
8 |
782fbff1c0d8 |
sd() on a int matrix type fails because std:pow(int, int) is not valid. |
9 |
782fbff1c0d8 |
|
10 |
782fbff1c0d8 |
- Something is messed up with rexp(). It's missing in the help file but: |
11 |
cd0e21380937 |
http://scythe.wustl.edu/api/classscythe_1_1rng_55184ef69537f8a47f82c1df4867d5dc.html#55184ef69537f8a47f82c1df4867d5dc |
12 |
782fbff1c0d8 |
Compiling the help using the latest doxygen seems to correct the problem, |
13 |
782fbff1c0d8 |
although macros don't get documented really. |
14 |
cd0e21380937 |
|
15 |
9844e0828091 |
- Docs for scythe::min refer to max. |
16 |
9844e0828091 |
|
17 |
9844e0828091 |
- What is linesearch? Docs are not very clear. |
18 |
9844e0828091 |
|
19 |
9844e0828091 |
- Reading in from file is fragile. outsheet in stata on OSX does not |
20 |
9844e0828091 |
use the proper newline characters. newlines across dos/unix/osx differ. |
21 |
9844e0828091 |
|
22 |
9844e0828091 |
- col_interchange doesn't sound like it is what it is in help file. It swaps |
23 |
9844e0828091 |
rows rather than reordering. |
24 |
9844e0828091 |
|
25 |
9844e0828091 |
- qnorm1 has return type "double double sd double" in docs. |
26 |
9844e0828091 |
|
27 |
782fbff1c0d8 |
Missing / Potential Features: |
28 |
782fbff1c0d8 |
|
29 |
9844e0828091 |
- rtnorm is parameterized using the variance whereas rnorm is parameterized |
30 |
9844e0828091 |
with sd. Possibility of errors. |
31 |
9844e0828091 |
|
32 |
782fbff1c0d8 |
- Fast versions of functions proportional to log density. In most MCMC |
33 |
782fbff1c0d8 |
applications sum(log(scythe::dnorm)), for instance, takes up a significant |
34 |
782fbff1c0d8 |
amount of time. However we usually just need the log of something |
35 |
782fbff1c0d8 |
proportional to density (for Slice or MH). For instance, |
36 |
782fbff1c0d8 |
dnorm(x, mean, sd, fastlog=true) = -(x-mean)^2/(2 * sd^2), which avoids |
37 |
782fbff1c0d8 |
and exp, sqrt, log, and a few other operations. I get a 10 |
38 |
782fbff1c0d8 |
fold increase in speed (of my entire MCMC algorithm) by replacing a |
39 |
782fbff1c0d8 |
sum(log(dnorm)) with |
40 |
782fbff1c0d8 |
|
41 |
782fbff1c0d8 |
// faster sum log dnorm. |
42 |
782fbff1c0d8 |
template <typename T, matrix_order PO, matrix_style PS> |
43 |
782fbff1c0d8 |
double sum_fast_log_dnorm(const Matrix<T,PO,PS> &A, double mean, double sd) |
44 |
782fbff1c0d8 |
{ |
45 |
782fbff1c0d8 |
double result = 0; |
46 |
782fbff1c0d8 |
for(int i = 0; i < A.size(); ++i) { |
47 |
782fbff1c0d8 |
result += fast_log_dnorm(A[i], mean, sd); |
48 |
782fbff1c0d8 |
} |
49 |
782fbff1c0d8 |
return result; |
50 |
782fbff1c0d8 |
} |
51 |
782fbff1c0d8 |
|
52 |
782fbff1c0d8 |
double fast_log_dnorm(double x, double mean, double sd) { |
53 |
782fbff1c0d8 |
return -pow((x - mean),2)/(2*pow(sd,2)); |
54 |
782fbff1c0d8 |
} |
55 |
782fbff1c0d8 |
|
56 |
782fbff1c0d8 |
I haven't figured it out, but this should also be vectorizable using SSE |
57 |
782fbff1c0d8 |
and multiple cores. Could lead to a lead to a 40 fold total increase in |
58 |
782fbff1c0d8 |
speed. |
59 |
782fbff1c0d8 |
|
60 |
782fbff1c0d8 |
- qnorm1 is the only inverse cdf available. It would be nice to have |
61 |
782fbff1c0d8 |
all the q functions available. These should be in R, no? |
62 |
782fbff1c0d8 |
|
63 |
782fbff1c0d8 |
- Missing some common random types, like multinomial. |
64 |
782fbff1c0d8 |
|
65 |
782fbff1c0d8 |
- Upgrade MT to use SSE instructions. Honestly, however, random number |
66 |
782fbff1c0d8 |
generation speed does not seem to matter at all. It's all the density |
67 |
782fbff1c0d8 |
calls, exp, log, pow, sqrt, which is where optimization should focus. |
68 |
782fbff1c0d8 |
|
69 |
782fbff1c0d8 |
- How hard would it be to add SSE instructions to log and exp, at the very |
70 |
782fbff1c0d8 |
least? Might get 5 fold increase in speed, without MKL. |
71 |
782fbff1c0d8 |
See: http://gruntthepeon.free.fr/ssemath/ |
72 |
782fbff1c0d8 |
|
73 |
782fbff1c0d8 |
- The easiest path towards performance would be to fix things allow |
74 |
782fbff1c0d8 |
the Intel MKL to do optimizations. I got 2-3x performance increase using |
75 |
782fbff1c0d8 |
Intel C++ compiler over g++4.01 without changing anything. |
76 |
782fbff1c0d8 |
Using getArray(), it would |
77 |
782fbff1c0d8 |
be easy to wrap MKL functions into Scythe with a day or two of work. |
78 |
782fbff1c0d8 |
I suspect gains would be large, since it takes advantage of SSE, multi cores |
79 |
782fbff1c0d8 |
etc. Then add a -DHAS_MKL flag. See: |
80 |
782fbff1c0d8 |
http://www.intel.com/software/products/mkl/docs/WebHelp/mkl.htm |
81 |
782fbff1c0d8 |
