I need a simple R function that implements a wild-t bootstrap for a given regression model, the wild-t bootstrap being a method to get appropriate standard errors in the presence of clustering.
Note that I tagged C programming because R was not available, and most C programmers know R, I guess. My apologies if this is misleading. Having said that, if someone wants to carry this out in C with an R frontend, you are welcome.
The function should take an lm object and the name of a cluster variable as input and compute wild-t bootstrapped p-values for the regression coefficients. The cluster variable will be in the data frame used for the regression; it would be nice if it were retrieved automatically from the data frame stored in the lm object. I will provide references for the method, but it is simple: re-estimate the model leaving the regressor of interest out, take those residuals, multiply them by 1 or -1, add them to the predicted dependent variable, re-estimate with the regressor of interest back in, repeat this many times, and compute the p-value from the t-statistic with cluster-robust standard errors. The exact algorithm can be seen on slide 16 in the attached document. I will provide a more detailed paper later.
There is Stata code for this procedure floating around on the web, and I would like to see a verification against it, but this can be discussed later. Moreover, the function should be reasonably efficient: my samples are fairly large (not big data, though), so it should work with around n = 1 million observations and 100 to 150 regressors.
An option over which variables to execute the bootstrap would be beneficial, as well.
To sum up: I need an R function that takes an lm object, the name of a cluster variable, and a list (or vector) of the names (or indices) of the regressors of interest as input, bootstraps wild-t cluster-robust p-values for those regressors, and returns the lm object with the corresponding p-values. The function should work with reasonably large datasets (n = 1 million).
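For illustration, the steps described above could be sketched roughly as follows in base R. This is only a sketch under assumptions, not the procedure from the slides: the names `cluster_t` and `wild_t_pvalues`, the argument `B`, and the returned element `wild_p` are made up here; the +1/-1 signs are drawn once per cluster (the text above only says "multiply with 1 or -1"); the variance is a plain cluster sandwich without small-sample correction; and the cluster variable is looked up in the data frame named in the original `lm()` call, which is assumed to be a plain data.frame that is still available.

```r
# Cluster-robust t-statistic for coefficient j in an OLS fit of y on X.
# Assumption: plain sandwich estimator, no small-sample correction.
cluster_t <- function(X, y, g, j, XtX_inv) {
  beta <- XtX_inv %*% crossprod(X, y)
  u <- as.vector(y - X %*% beta)
  S <- rowsum(X * u, g)                      # one row of summed scores per cluster
  V <- XtX_inv %*% crossprod(S) %*% XtX_inv  # cluster-robust covariance matrix
  beta[j] / sqrt(V[j, j])
}

# Hypothetical interface: lm object, cluster variable name, regressors of
# interest (names or column indices of the model matrix), B bootstrap draws.
wild_t_pvalues <- function(fit, cluster, regressors, B = 999) {
  mf <- model.frame(fit)
  X <- model.matrix(fit)
  y <- as.vector(model.response(mf))
  # Assumption: the data frame from the original lm() call can be re-evaluated.
  data <- eval(fit$call$data, environment(formula(fit)))
  g <- as.integer(factor(data[rownames(mf), cluster]))
  G <- max(g)
  XtX_inv <- solve(crossprod(X))
  idx <- if (is.character(regressors)) match(regressors, colnames(X)) else regressors
  p <- sapply(idx, function(j) {
    t_obs <- cluster_t(X, y, g, j, XtX_inv)
    # Restricted model: re-estimate without the regressor of interest.
    X0 <- X[, -j, drop = FALSE]
    fit0 <- as.vector(X0 %*% solve(crossprod(X0), crossprod(X0, y)))
    u0 <- y - fit0
    t_star <- replicate(B, {
      w <- sample(c(-1, 1), G, replace = TRUE)  # one sign per cluster
      cluster_t(X, fit0 + u0 * w[g], g, j, XtX_inv)
    })
    # Two-sided p-value: share of bootstrap t-stats at least as extreme.
    mean(abs(t_star) >= abs(t_obs))
  })
  names(p) <- colnames(X)[idx]
  fit$wild_p <- p  # p-values are attached to the returned lm object
  fit
}
```

A call might look like `fit <- wild_t_pvalues(fit, "firm_id", c("x1", "x2"))`, where `firm_id`, `x1` and `x2` are placeholder names. The sketch has not been tuned for n = 1 million with 100 to 150 regressors; `X * u` alone makes a full copy of the model matrix in every draw, so that part would need more careful work.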
I am not an R programmer, but I am not an R newbie either. I just don't have time at the moment, and I would appreciate help with this task. The steps involved strike me as quite simple, but I may be overlooking something. I can, however, provide a function that already carries out simple cluster-robust standard error estimation, so that it is easier to see what I have in mind. In any case, I'd suggest maybe around 40 USD for this.
Hi,
I am an R programmer and a master's-level statistician. Having read through your project description, I am able to deliver on your project. I will provide you with a well-commented script so that you fully understand what I have done. I can look for the Stata version of the code that you mention to provide a check for the results from my code.
I am looking forward to working with you. Feel free to ask any questions.
Kind regards.
$60 USD in 3 days
4.8 (10 reviews)
4 freelancers are bidding on average $112 USD for this job
Hi, I am interested in the project. I hold an MSc in Finance and a BSc in Economics. I guess my bid is higher than the utility you get from the code, but there is a fixed cost in understanding the project on my part. Regards.
Hi, I have more than 5 years of experience with R and nearly 10 years with C, since my background is in computational mathematics and statistics. I have used R/C/Matlab for many scientific computing projects.