Scythe Bugs and Missing Features. Bugs: - mean, sd, var, matrix methods etc are templated to be the same type as the matrix. However these functions should return double, never int. - Some matrix functions give compile errors because for int. For instance sd() on a int matrix type fails because std:pow(int, int) is not valid. - Something is messed up with rexp(). It's missing in the help file but: http://scythe.wustl.edu/api/classscythe_1_1rng_55184ef69537f8a47f82c1df4867d5dc.html#55184ef69537f8a47f82c1df4867d5dc Compiling the help using the latest doxygen seems to correct the problem, although macros don't get documented really. - Docs for scythe::min refer to max. - What is linesearch? Docs are not very clear. - Reading in from file is fragile. outsheet in stata on OSX does not use the proper newline characters. newlines across dos/unix/osx differ. - col_interchange doesn't sound like it is what it is in help file. It swaps rows rather than reordering. - qnorm1 has return type "double double sd double" in docs. Missing / Potential Features: - rtnorm is parameterized using the variance whereas rnorm is parameterized with sd. Possibility of errors. - Fast versions of functions proportional to log density. In most MCMC applications sum(log(scythe::dnorm)), for instance, takes up a significant amount of time. However we usually just need the log of something proportional to density (for Slice or MH). For instance, dnorm(x, mean, sd, fastlog=true) = -(x-mean)^2/(2 * sd^2), which avoids and exp, sqrt, log, and a few other operations. I get a 10 fold increase in speed (of my entire MCMC algorithm) by replacing a sum(log(dnorm)) with // faster sum log dnorm. template double sum_fast_log_dnorm(const Matrix &A, double mean, double sd) { double result = 0; for(int i = 0; i < A.size(); ++i) { result += fast_log_dnorm(A[i], mean, sd); } return result; } double fast_log_dnorm(double x, double mean, double sd) { return -pow((x - mean),2)/(2*pow(sd,2)); } I haven't figured it out, but this should also be vectorizable using SSE and multiple cores. Could lead to a lead to a 40 fold total increase in speed. - qnorm1 is the only inverse cdf available. It would be nice to have all the q functions available. These should be in R, no? - Missing some common random types, like multinomial. - Upgrade MT to use SSE instructions. Honestly, however, random number generation speed does not seem to matter at all. It's all the density calls, exp, log, pow, sqrt, which is where optimization should focus. - How hard would it be to add SSE instructions to log and exp, at the very least? Might get 5 fold increase in speed, without MKL. See: http://gruntthepeon.free.fr/ssemath/ - The easiest path towards performance would be to fix things allow the Intel MKL to do optimizations. I got 2-3x performance increase using Intel C++ compiler over g++4.01 without changing anything. Using getArray(), it would be easy to wrap MKL functions into Scythe with a day or two of work. I suspect gains would be large, since it takes advantage of SSE, multi cores etc. Then add a -DHAS_MKL flag. See: http://www.intel.com/software/products/mkl/docs/WebHelp/mkl.htm