# LUP Student Papers

## LUND UNIVERSITY LIBRARIES

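### Bootstrap som hjälpmedel att öka noggrannheten och bedöma precisionen vid probitregression, med tillämpning på hörselmätningar (Bootstrap as an aid to increasing accuracy and assessing precision in probit regression, with application to hearing measurements)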

(2006)
Department of Statistics
Abstract
In many biomedical contexts, e.g. when evaluating hearing disorders, the data obtained can be described as pairs (x1, y1), …, (xn, yn), where x is a quantitative variable and y is a 0/1 variable whose probability of taking the value 1 is a monotonic function of x. One way of analysing such data is probit regression, in which two parameters are estimated: b0 (the constant) and b1 (the slope). Interest centres on m = -b0/b1, i.e. the value of x at which the probability that y takes the value 1 is 0.5. General maximum-likelihood (ML) theory asserts that, provided the probit model is correct, the ML estimators are consistent and asymptotically unbiased and that the standard errors are asymptotically correct. The situation is thus reassuring for large samples. What is not clear is how the ML estimator performs in samples that are not quite large. The objective of the present paper is to investigate the biases in b^0, b^1 and m^ and the adequacy of the standard errors provided by the ML method, and to see whether bootstrapping can reduce the biases and produce standard errors that are more relevant than those given by the ML method. There are two types of bootstrap: non-parametric and parametric. In the non-parametric bootstrap, observations are drawn independently from the sample, with replacement and with equal probability. In the parametric bootstrap, the distribution is taken as known except for one or more parameters. We assume that our data come from a probit model, so the parametric bootstrap samples are drawn from the assumed probit model. After a bootstrap sample has been drawn by either method, probit regression is performed and the two parameters are estimated. The present paper uses both simulated and authentic data. The simulated data sets consist of 30 observations, because that is the number of observations in each patient’s examination in our authentic data.
Our authentic data come from hearing examinations of patients with one particular type of hearing impairment. For the simulated data, the mean value of b^1 is improved by both bootstrap methods, whereas the bias of b^0 is improved by the parametric bootstrap. The bias of m^ is smallest with plain probit regression. The standard errors of b^0 and b^1 agree best with the corresponding standard deviations when the non-parametric bootstrap is used. No single method gives standard errors of m^ that agree better with the standard deviation of m^. For the real patients, the standard deviation is most often a little larger with the two bootstrap methods than with plain probit regression. Some patients get very high estimates of the standard deviation with one or both of the bootstrap methods. In addition, the bias-adjusted estimate from the parametric bootstrap for patient 5 is negative, which should not occur because negative frequencies do not exist.
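The two bootstrap schemes described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's own code, and several parts are assumptions made for illustration: the function names (`fit_probit`, `simulate`, `bootstrap_m`), the invented 30-observation example data, the use of Fisher scoring for the ML fit, the choice to skip resamples on which the fit does not converge, and the textbook bias correction 2·m^ − mean(m*), which may differ from the adjustment used in the paper.

```python
import random
from statistics import NormalDist, mean, stdev

_STD_NORMAL = NormalDist()  # standard normal: cdf = Phi, pdf = phi


def fit_probit(xs, ys, max_iter=100, tol=1e-8):
    """ML fit of P(y = 1 | x) = Phi(b0 + b1 * x) by Fisher scoring.

    Returns (b0, b1), or None when the iteration does not settle
    (for example when the 0s and 1s are perfectly separated in x).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x
            p = min(max(_STD_NORMAL.cdf(eta), 1e-10), 1.0 - 1e-10)
            dens = _STD_NORMAL.pdf(eta)
            score = dens * (y - p) / (p * (1.0 - p))  # d loglik / d eta
            weight = dens * dens / (p * (1.0 - p))    # expected information
            g0 += score
            g1 += score * x
            i00 += weight
            i01 += weight * x
            i11 += weight * x * x
        det = i00 * i11 - i01 * i01
        if det <= 1e-12:
            return None
        # Solve the 2x2 system (information matrix) * step = score vector.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(b0) > 1e3 or abs(b1) > 1e3:
            return None
        if abs(step0) + abs(step1) < tol:
            return b0, b1
    return None


def simulate(xs, b0, b1, rng):
    """Draw 0/1 responses from the probit model at the given x values."""
    return [1 if rng.random() < _STD_NORMAL.cdf(b0 + b1 * x) else 0 for x in xs]


def bootstrap_m(xs, ys, n_boot=500, parametric=False, seed=0):
    """Bootstrap m = -b0/b1; resamples whose fit fails are skipped (assumption)."""
    rng = random.Random(seed)
    fit = fit_probit(xs, ys)
    if fit is None:
        raise ValueError("probit fit did not converge on the original data")
    b0, b1 = fit
    m_hat = -b0 / b1
    n = len(xs)
    reps = []
    for _ in range(n_boot):
        if parametric:
            # Keep the x design, redraw y from the fitted probit model.
            bx, by = xs, simulate(xs, b0, b1, rng)
        else:
            # Resample (x, y) pairs with replacement, equal probability.
            idx = [rng.randrange(n) for _ in range(n)]
            bx = [xs[i] for i in idx]
            by = [ys[i] for i in idx]
        rep = fit_probit(bx, by)
        if rep is not None and rep[1] != 0.0:
            reps.append(-rep[0] / rep[1])
    if len(reps) < 2:
        raise ValueError("too few converged bootstrap fits")
    boot_mean = mean(reps)
    return {
        "m_hat": m_hat,
        "boot_mean": boot_mean,
        "bias_corrected": 2.0 * m_hat - boot_mean,  # m_hat - (boot_mean - m_hat)
        "se": stdev(reps),
        "used": len(reps),
    }


# Invented example: 10 stimulus levels, 3 trials each (30 observations).
EXAMPLE_XS = [x for x in range(2, 40, 4) for _ in range(3)]
EXAMPLE_YS = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0,
              1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    for parametric in (False, True):
        print(parametric, bootstrap_m(EXAMPLE_XS, EXAMPLE_YS, parametric=parametric))
```

Because m* is a ratio, a resample with a slope estimate near zero produces an extreme value of m*, which is one plausible way a bootstrap standard deviation can become very large in samples of this size.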
type
H1 - Master's Degree (One Year)
subject
keywords
bootstrap, probitregression, hörselmätningar, Statistics, operations research, programming, actuarial mathematics, Statistik, operationsanalys, programmering, aktuariematematik
language
Swedish
id
1334683
2006-06-19
date last changed
2010-08-03 10:49:19
```
@misc{1334683,
abstract     = {In many biomedical contexts, e. g. when evaluating hearing disorders, the data obtained can be described as pairs (x1, y1), …, (xn, yn) where x is a quantitative variable and y a 0/1 variable whose probability of taking the value 1 is a monotonic function of x. One way of analysing such data is to perform probit regression; thereby two parameters, b0 (= the constant) and b1 (= the slope) are estimated; the interest centres around m = -b0/b1, i. e. the value of x for which the probability is 0,5 that y takes value 1. General maximum-likelihood (ML) theory asserts that, provided the probit model is correct, the ML estimators are consistent and asymptotically unbiased and that the standard errors are asymptotically correct. Thus the situation is comforting for large sample sizes. What is not clear is how the ML estimator performs in not-quite-large samples. Object of the present paper is to investigate the biases in b^0, b^1 and m^ and the adequacy of the standard errors provided by the ML method, and also to see if bootstrapping can reduce the biases and produce standard errors that are more relevant than those given by ML method. There are two types of bootstrap, non-parametric and parametric bootstrap. With non-parametric bootstrap you draw independent observations with replacement and with the same probability. With parametric bootstrap you know the distribution with the exception of one or several parameters. We have the assumption that our data come from a probit model which means that we draw our bootstrap samples from the assumed probit model. After drawing the bootstrap sample in both of the methods of bootstrap we perform probit regression and the two parameters are estimated. In present paper we have both simulated and authentic data. The simulated data consist of 30 observations because that is how many observations there are in each patient’s examination in our authentic data. 
Our authentic data come from hearing examinations on patients with one particular of impairment of hearing. The result from simulated data gives us that the mean value of b^1 is corrected to the better with both of the methods of bootstrap, but the bias of b^0 corrects to the better with parametric bootstrap. The bias of m^ is the smallest with the use of pure probit regression. We also get that the standard errors of b^0 and b^1 are most in accord with the standard deviation when non-parametric bootstrap is being used. There is no specific method where the standard errors of m^ are in more accordance with the standard deviation of m^. With the use of real patients we most often get that the standard deviation is a little larger with the two bootstrap methods than it is for pure probit regression. Some of the patients get very high estimates of the standard deviation with one or with both of the two bootstrap methods. We also get that the bias-adjusted estimate with parametric bootstrap for patient 5 is negative, which should not occur because negative frequencies do not exist.},