GNU Info

Info Node: (gnuplot.info)practical guidelines

 If you have a basis for assigning weights to each data point, doing so lets
 you make use of additional knowledge about your measurements, e.g., take into
 account that some points may be more reliable than others.  That may affect
 the final values of the parameters.
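
 As a minimal sketch of how weights can shift a fitted parameter, the Python
 snippet below fits the simplest possible model, a constant c, to three
 invented measurements.  The data, the standard deviations, and the variable
 names are assumptions made for illustration; this is not how `fit` works
 internally.

```python
# Fit the constant model f(x) = c to three measurements of the same quantity.
# Values and standard deviations are invented for illustration.
y = [10.0, 10.4, 12.0]
sigma = [0.1, 0.2, 1.0]   # the third point is the least reliable

# Unweighted least squares: every point counts equally, so c is the plain mean.
c_unweighted = sum(y) / len(y)

# Weighted least squares: minimise sum(((y_i - c) / sigma_i)**2),
# which gives the mean weighted by 1/sigma_i**2.
w = [1.0 / s ** 2 for s in sigma]
c_weighted = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

print(f"unweighted c = {c_unweighted:.3f}")
print(f"weighted   c = {c_weighted:.3f}")
```

 The weighted estimate stays close to the two precise points, while the
 unweighted one is pulled toward the unreliable third point.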

 Weighting the data provides a basis for interpreting the additional `fit`
 output after the last iteration (see `fit`).  Even if you weight each point
 equally, estimating an average standard deviation rather than using a weight
 of 1 makes WSSR a dimensionless variable, as chisquare is by definition.

 Each fit iteration will display information which can be used to evaluate
 the progress of the fit.  (An '*' indicates that it did not find a smaller
 WSSR and is trying again.)  The 'sum of squares of residuals', also called
 'chisquare', is the WSSR between the data and your fitted function; `fit`
 has minimized that.  At this stage, with weighted data, chisquare is expected
 to approach the number of degrees of freedom, ndf (data points minus
 parameters).  The WSSR can be used to calculate the reduced chisquare
 (WSSR/ndf) or stdfit, the standard deviation of the fit, sqrt(WSSR/ndf).
 Both of these are reported for the final WSSR.

 If the data are unweighted, stdfit is the rms value of the deviation of the
 data from the fitted function, in user units.
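
 The quantities named above can be worked through by hand.  The following
 Python sketch uses an invented six-point data set and a straight-line model;
 the data, the common standard deviation of 0.8, and all variable names are
 assumptions for illustration only.

```python
import math

# Invented data: six points with a common standard deviation of 0.8.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.5, 4.0, 7.5, 9.5, 10.0, 13.5]
sigma = 0.8

# Model f(x) = a*x + b.  For this data set a = 2, b = 1 is the least-squares
# solution (the residuals sum to zero and are uncorrelated with x).
a, b = 2.0, 1.0
n_params = 2

residuals = [yi - (a * xi + b) for xi, yi in zip(x, y)]
ndf = len(x) - n_params          # degrees of freedom

# Weighted: dividing each residual by sigma removes the units of y.
wssr = sum((r / sigma) ** 2 for r in residuals)
reduced_chisq = wssr / ndf
stdfit = math.sqrt(wssr / ndf)

# Unweighted (weight of 1): the same formula gives the rms deviation
# of the data from the model, in the units of y.
ssr = sum(r ** 2 for r in residuals)
stdfit_unweighted = math.sqrt(ssr / ndf)

print(f"WSSR = {wssr:.4f}, ndf = {ndf}")
print(f"reduced chisquare = {reduced_chisq:.4f}, stdfit = {stdfit:.4f}")
print(f"unweighted stdfit = {stdfit_unweighted:.4f} (user units)")
```

 With these numbers the weighted WSSR is 4.6875 for 4 degrees of freedom, so
 the reduced chisquare is about 1.17; the unweighted stdfit is about 0.87 in
 the units of y.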

 If you supplied valid data errors, the number of data points is large enough,
 and the model is correct, the reduced chisquare should be about unity.  (For
 details, look up the 'chi-squared distribution' in your favourite statistics
 reference.)  If so, there are additional tests, beyond the scope of this
 overview, for determining how well the model fits the data.

 A reduced chisquare much larger than 1.0 may be due to incorrect data error
 estimates, data errors not normally distributed, systematic measurement
 errors, 'outliers', or an incorrect model function.  A plot of the residuals,
 e.g., `plot 'datafile' using 1:($2-f($1))`, may help to show any systematic
 trends.  Plotting both the data points and the function may help to suggest
 another model.
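
 To illustrate what a systematic trend in the residuals looks like, the
 Python sketch below fits a straight line to invented data that really follow
 a parabola.  The data set and the closed-form line fit are assumptions
 chosen for the example, not part of `fit`.

```python
# Invented data that actually follow y = x**2, fitted with a straight line.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [xi ** 2 for xi in x]

# Closed-form unweighted least-squares line y = slope*x + intercept.
n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residuals: data minus fitted function, as in the plot command above.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
signs = "".join("+" if r > 0 else "-" for r in residuals)

print(residuals)
print(signs)
```

 The residuals are positive at both ends and negative in the middle, rather
 than scattered at random around zero.  A pattern like this suggests the
 model function is missing a term.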

 Similarly, a reduced chisquare less than 1.0 indicates WSSR is less than that
 expected for a random sample from the function with normally distributed
 errors.  The data error estimates may be too large, the statistical
 assumptions may not be justified, or the model function may be too general,
 fitting fluctuations in a particular sample in addition to the underlying
 trends.  In the last case, a simpler function may be more appropriate.
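
 The effect of overestimated data errors can be seen with a small numeric
 sketch.  The residuals, the parameter count, and the function name
 `reduced_chisq` below are invented for illustration.

```python
# Invented residuals (data minus fitted function) for six points
# and a model with two parameters.
residuals = [0.5, -1.0, 0.5, 0.5, -1.0, 0.5]
ndf = len(residuals) - 2

def reduced_chisq(sigma):
    """Reduced chisquare when every point is assigned the same sigma."""
    return sum((r / sigma) ** 2 for r in residuals) / ndf

realistic = reduced_chisq(0.8)       # error estimate close to the scatter
overestimated = reduced_chisq(1.6)   # the same data with sigma doubled

print(f"sigma = 0.8: reduced chisquare = {realistic:.3f}")
print(f"sigma = 1.6: reduced chisquare = {overestimated:.3f}")
```

 Doubling every error estimate divides the reduced chisquare by four, so a
 value well below 1.0 can simply reflect error bars that are too generous.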

 You'll have to get used to both `fit` and the kinds of problems you apply it
 to before you can relate the standard errors to some more practical estimates
 of parameter uncertainties or evaluate the significance of the correlation
 matrix.

 Note that `fit`, in common with most NLLS implementations, minimizes the
 weighted sum of squared distances (y-f(x))**2.  It does not provide any means
 to account for "errors" in the values of x, only in y.  Also, any "outliers"
 (data points outside the normal distribution of the model) will have an
 exaggerated effect on the solution.
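
 The influence of a single outlier follows from the squaring of the
 residuals.  As a minimal sketch, the Python snippet below fits a constant
 to invented, equally weighted data, where the least-squares estimate is just
 the mean.  The data and the helper `fit_constant` are assumptions for
 illustration.

```python
# Constant model f(x) = c with equal weights: the least-squares
# estimate of c is the arithmetic mean of the data.
clean = [10.0, 10.1, 9.9, 10.0]
with_outlier = clean + [15.0]      # one invented outlier

def fit_constant(values):
    """Least-squares estimate of c for equally weighted data."""
    return sum(values) / len(values)

c_clean = fit_constant(clean)
c_outlier = fit_constant(with_outlier)

# Share of the sum of squared residuals contributed by the outlier.
squared = [(v - c_outlier) ** 2 for v in with_outlier]
outlier_share = squared[-1] / sum(squared)

print(f"c without outlier = {c_clean:.2f}")
print(f"c with outlier    = {c_outlier:.2f}")
print(f"outlier share of squared residuals = {outlier_share:.0%}")
```

 One point out of five moves the estimate from about 10 to about 11 and
 accounts for roughly 80% of the sum of squared residuals.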

