
Say I have a nonlinear mathematical model f that maps points (u,v) to (x,y,z): $$ (x,y,z) = f(u,v) $$ This model has 3 parameters: $(\phi,\theta,\psi)$.

I have N correspondences: $$ (x_i,y_i,z_i) = f(u_i,v_i) $$ where $x_i$, $y_i$, $z_i$, $u_i$ and $v_i$ are known.

At the moment I am estimating the parameters from all correspondences by minimizing a summed square error with a Levenberg–Marquardt solver (lmdif1).

However in addition to point estimates I want to quantify the uncertainty.

I assume each $u_i$ and $v_i$ measurement is independent and zero-mean normally distributed around its true value, with the same standard deviation, $\sigma_{uv}$. Likewise each $x_i$, $y_i$ and $z_i$ measurement is independent and zero-mean normally distributed around its true value, with the same standard deviation, $\sigma_{xyz}$.

Is the proposition below a good idea?

I split the N correspondences into 4 sets and calculate $(\phi,\theta,\psi)$ for each set.

So for $\phi$ I now have 4 values: $[\phi_1,\phi_2,\phi_3,\phi_4]$.

I calculate the mean, $\mu_{\phi}$, and standard deviation, $\sigma_{\phi}$, from these 4 values. The mean is my point estimate and $$ \mu_{\phi} - 2\sigma_{\phi} \leqslant \phi \leqslant \mu_{\phi} + 2\sigma_{\phi} $$ is my 95 % confidence interval.
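In code, the scheme I have in mind looks roughly like this (Python sketch; `fit_phi` is just a toy linear stand-in for my Levenberg–Marquardt fit of $(\phi,\theta,\psi)$, and the synthetic data are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the LM solver: estimates a single parameter phi
# (here just the slope of a linear toy model; the real f is nonlinear).
def fit_phi(u, x):
    return np.sum(u * x) / np.sum(u * u)

N = 40
u = rng.uniform(1, 2, N)
x = 3.0 * u + rng.normal(0, 0.1, N)  # synthetic data, true phi = 3

# Split the N correspondences into 4 disjoint sets and fit each one
phis = np.array([fit_phi(u[i::4], x[i::4]) for i in range(4)])

mu_phi = phis.mean()               # point estimate
sigma_phi = phis.std(ddof=1)       # spread of the 4 sub-estimates
ci = (mu_phi - 2 * sigma_phi, mu_phi + 2 * sigma_phi)  # proposed 95% CI
```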

    • "Uncertainty" and "confidence" are not defined (yet), because you haven't posited a probability model for the differences between the observations and the model. Are you perhaps assuming those differences are iid zero-mean spherical multinormal errors? Commented 18 hours ago

1 Answer


No, I see 3 problems with your idea:

  1. Since higher N should correspond to lower standard errors ($\hat \sigma_\phi$), using only N/4 points per fit biases you towards larger values. In a linear model the standard error would be inflated by a factor of $\sqrt{4} = 2$, i.e. double what it should be.

  2. 4 values is just way too little to get a decent estimate of a variance, and by extension of a standard deviation. If the values are normally distributed, you can look at the $\chi^2$-distribution with 3 degrees of freedom for your variance estimate: https://en.wikipedia.org/wiki/Chi-squared_distribution. It implies that 5% of the time you get a standard-deviation estimate below 35% of the true value, and another 5% of the time one that is 60%+ too high. Calculation in R: sqrt(qchisq(c(0.05, 0.95), df = 3)/3)
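You can reproduce those factors by simulation instead of the analytic $\chi^2$ result (Python with numpy, purely illustrative): draw many samples of size 4 from a normal with true sd 1 and look at the 5%/95% quantiles of the sample sd.

```python
import numpy as np

rng = np.random.default_rng(0)

# 200,000 samples of size 4 from N(0, 1); true standard deviation is 1.
# Each row gives one sample-sd estimate, as in the 4-subset scheme.
sds = rng.normal(0, 1, size=(200_000, 4)).std(axis=1, ddof=1)

# Quantiles of the sd estimate; roughly 0.34 and 1.61,
# matching sqrt(qchisq(c(0.05, 0.95), df = 3) / 3) in R.
lo, hi = np.quantile(sds, [0.05, 0.95])
```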

  3. There's really little reason to assume that the symmetric $\pm 2\sigma$ interval is an actual 95% confidence interval in your situation, even if you assume normal errors, because your model isn't linear.

What you want to do instead is bootstrap, i.e. draw samples of size N with replacement from your data and recalculate your estimate on each. You will find plenty of online resources about this, including on this website, e.g. Explaining to laypeople why bootstrapping works

I'd say you have two reasonable approaches. In both, you make one main estimate $\hat \phi$ from your original data plus $k$ bootstrap estimates $\phi_1, \dots, \phi_k$. If running the estimation is cheap, pick $k = 2000$ and use the 50th lowest/highest bootstrap value as your confidence interval limits. If the model is expensive to run, you could pick $k > 10$ and calculate $\hat \sigma_\phi = \sqrt{\sum_i (\phi_i-\hat \phi)^2/k}$. The latter approach runs on hope with regard to the $\sigma$ estimate and assumes the simple symmetric interval is appropriate, but that might be permissible in your field.
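A minimal sketch of both variants (Python with numpy; `estimate_phi` is a stand-in for your LM fit and the data are synthetic, not your model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in estimator: in the real problem this would be the LM fit of
# (phi, theta, psi); here phi_hat is just a nonlinear statistic of toy data.
def estimate_phi(data):
    return np.sqrt(np.mean(data ** 2))

data = rng.normal(2.0, 0.5, size=100)   # synthetic "correspondences"
phi_hat = estimate_phi(data)            # main estimate from all N points

# Approach 1: percentile bootstrap, k = 2000 resamples of size N
k = 2000
boot = np.array([estimate_phi(rng.choice(data, size=data.size, replace=True))
                 for _ in range(k)])
ci_percentile = (np.quantile(boot, 0.025), np.quantile(boot, 0.975))

# Approach 2: cheap version, small k plus a symmetric +/- 2 sigma interval
k_small = 20
boot_small = np.array([estimate_phi(rng.choice(data, size=data.size, replace=True))
                       for _ in range(k_small)])
sigma_phi = np.sqrt(np.sum((boot_small - phi_hat) ** 2) / k_small)
ci_normal = (phi_hat - 2 * sigma_phi, phi_hat + 2 * sigma_phi)
```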

  • Regarding your point 1: are you saying that this process overestimates the uncertainty? Dividing into sets of N/4 makes each solution ca. 2x as noisy? Also, regardless of how the uncertainty estimation is done, should not the point estimate be done on all data first? When averaging 4 solutions one is saying that the solution process is linear, and we have no reason to believe that. Therefore it is much better to make no such assumptions, point estimate from all data, and afterwards estimate uncertainty? Commented 16 hours ago
  • As an extreme example let us assume N=8. Dividing the dataset into 4 makes each sub-solution underdetermined => the solution may be whatever. Averaging 4 random whatever solutions makes no sense. Commented 15 hours ago
  • That's a lot of questions, but I think I have to say yes to all of them. Note whuber's comment about the implied probability model when using this as a confidence interval and, if you really want to engage with it, look up what confidence intervals actually are, e.g.: (stats.stackexchange.com/q/6652/341520) Commented 11 hours ago
