The sigma image (σx,y) and χν2 are two things that may need some explanation for new users of GALFIT. Both things are defined in the traditional way as found in standard literature on statistics. The concept is important to understand for getting sensible fit results, for interpreting when a fit is acceptable, and for interpreting the relative merits of different fits. Most optimization algorithms use χν2 statistics to figure out how to converge on a better solution, or, once a solution has converged, what is the statistical significance of the fit. GALFIT is no different from a simple line or curve fitting algorithm when it comes to computing χν2 statistics. Therefore, everything you learned about from excellent books on statistics such as "Data Reduction and Error Analysis" by Bevington & Robinson, or "An Introduction to Error Analysis" by Taylor is applicable in GALFIT in a straightforward manner. So, this section is not intended to be a thorough introduction to basic statistics. Instead, I will try to make some abstractions more concrete for beginners of statistics, and to show how the abstractions are implemented in practice. If there are issues that need clarification below, users are very much welcome to contact me.

A question that often comes up when fitting a line to data points, or a function to imaging data, is: "why do you need to know the sigma at each data point at all?" The answer is that it's the only way for a computer to know when a fit is "good enough." But what does "good enough" mean, really? You might say that "good enough" is when a function runs through all the data points "as closely as possible." But, one then might ask, "are all the points equally good, or are some better than others?" Even if the data are perfect -- no cosmic ray hits, no dead/hot pixels, no flatfielding problems -- uncertainty from counting statistics alone means that all points are not equal. What does that mean exactly? It means that in an image, pixels that have high signal are also generally more certain, relatively speaking, compared to pixels that have fewer counts. So our intuition tells us that we don't want our fit to be equally weighted by uncertain data points in the same way as more certain ones. On the other hand, our intuition also says that the more pixels we have in an area, the more certain we are also about the the average flux value in that area, even if the fluxes there are small. This tells us that you'll want to include the number of pixels when considering an overall uncertainty, such that the "effective uncertainty" should go down with the number of data points that you have. In nature, the scheme for what we just described is Poisson statistics, where the weights are precisely Gaussian σ weights. Once you come to terms with this then the use of the sigma image becomes clear: it is a map of how uncertain the flux is at every pixel.

The sigma map is used directly in calculating the χν2 quantity. In words, χν2 is simply a sum of deviations between the data flux and the best fitting model, relative to expected deviation (σ) at each pixel, the whole thing squared. And the form of the equation looks like:

In the above equation Ndof is the number of degrees of freedom in the fit, which is roughly Ndof ~ nx × ny. The definition is approximate because usually the number of free parameters in the fit is subtracted from nx × ny in the definition of Ndof, but that number is generally quite small compared to the number of pixels, and is thus often not so important. GALFIT computes χν2 and uses the definition of sigma in the standard way: in the equation above, fdata(x,y) is your data at pixel (x,y) and fmodel(x,y) is a value that GALFIT generates at pixel (x,y), and σ(x,y) on the denominator is the Poisson (Gaussian) deviate of the flux at (x,y). In other words, σ(x,y) is what is known in GALFIT as the "sigma image" where σ sometimes corresponds to the standard deviation of the flux at that pixel. Just to be compact, I will use the forms datax,y, modelx,y, and σx,y for the three terms, instead of carrying around the f and the parentheses. In a hand-wavy sense, if your model is a perfect fit to your data, each pixel will have difference (datax,y - modelx,y), the term on the numerator, that is roughly the size of σx,y, on the denominator. So given the quantity in the summation is close to one, and you sum it up Ndof time (where Ndof = nx × ny is the number of pixels), then divided by Ndof, it's no surprise χν2 should be close to 1 if a fit is perfect. That's why when you have an appropriate model, and you get a good fit, the rule of thumb is that χν2 is, in principle, on the order of unity. This is only a rule of thumb because... imagine the following situation where every pixel (sky in an image) except for one (the object you're interested in) is fit well. If the number of pixels is large, χν2 will be close to one. Nevertheless, it's still a bad fit because you're not interested in the sky, which is dominating the statistics, and the object (the one pixel) is fit poorly by the model. A second useful rule of thumb to get an idea whether a sigma image is half-way reasonable or just totally wrong is to simply fit the SKY with a sky function in GALFIT, whence χν2 should come out close to unity (if the noise is not correlated). This is a good **sanity check**, but it doesn't guarantee that the sigma image is right. The third thing that should be said about the equation above is that the sigma image needs to have the same units as your data image: if your flux (numerator) is in units of counts/second, but your sigma image (denominator) is in units of counts, then your χν2 is off by factors of (exposure time2). If your χν2's appears really really low, this is most likely the reason why.

Side-bar: You will often hear people say that the above equation is based on the Gaussian noise model. The "Gaussian" reference is with regard to σx,y, because σx,y is the dispersion centered around the mean flux μ of a Gaussian function ~exp[-(f-μ)2/2σ2]. What this means is that if you can repeat observations an infinite number of times and record all the outcome, you'd be able to pin-point μ perfectly, and each individual measurement scatters around μ in a Gaussian distributed way -- meaning 68% of the time each flux measurement will be within ± σ of the mean μ, and 32% of the time the deviation is larger than σ from the mean μ value. Because this hypothetical Gaussian distribution is not something you can observe (you'd have to make an infinite number of observations), you can't observe σx,y directly. What we can do is to *infer* σx,y from our data -- and the more counts you have (i.e. S/N), the closer is your estimate of σ from the truth. We can blindly estimate σx,y at each pixel often without too much fuss, because σx,y can be easily estimated from Poisson statistics, going roughly like the √pixel value. What is much more of a problem in galaxy fitting is the assumption that fluxx,y scatters around the mean (modelx,y) like σ -- this is patently not true most of the time because our models aren't perfect, either because there's a profile mismatch, the galaxies aren't perfect ellipses, we can't fit every single blob in an image, the PSF is imperfect -- there are numerous other reasons. In other words, the residuals are not due to only Poisson noise, but mostly due to structures we can't subtract away perfectly. So the assumption of Gaussianity is why, in galaxy fitting, χ2 doesn't mean very much. Because the rigorous meaning of the χ2 analysis is violated to a large extent, the error bar on parameters which come out of the analysis, don't mean much either, except perhaps in low signal-to-noise situations when the noise dominates over non-Poisson fluctuations.
The saving grace for why the analysis is still meaningful, despite major violation of statistical principles, are two facts: first, in astronomy we are mostly interested in relative comparisons rather than an absolute measurement of each object for its own sake. Secondly, in response to signal-to-noise (i.e. exposure time), the scatter of the parameters around their mean value behaves like random noise (see Barden et al. 2008, ApJS, 175, 105), rather than in some unpredictable way. This latter fact does not have to be true: the model fit could in principle be dominated by the fluctuations on top of the galaxy profile which comes and goes with S/N. Therefore, it is still important to get the sigma image right because it controls the relative weights between the bright and faint pixels in the fit. If you get the sigma image really wrong, it will not be easy to interpret the profile parameters such as the Sérsic index n, for example, or to compare one object to another.

Now that we have the formalities out of the way, we can address the most frequently asked questions:

Now that you think you have to get the sigma image perfectly, the fact is that one only has to get the sigma image roughly right because the fits turn out not to be super sensitive to the sigma map. To the extent that the actual value of χ2 in galaxy fitting is mostly meaningless (because the residuals will mostly not be Gaussian distributed about the mean), except in relative terms, getting the normalization wrong, by messing up the GAIN or NCOMBINE values, is a bad habit that should be actively discouraged, but it doesn't affect how the parameters come out in the fit. It is more important to get the pedestal right, which is another reason for why the fitting region has to be fairly large compared to the galaxy. The fact that the fit is not too sensitive to the sigma image is a really good thing, because often times we don't know it all that well for various reasons. This fact should ease one's mind a little, but not so much that it stops one from trying to get the most accurate sigma map. If the sigma map is wrong, we really don't know our data, and there certainly are situations where getting it right is crucial to the science. Often, wrong sigma images will be most noticeable for galaxies with low S/N but may sometimes show up for even high S/N objects (see here).

Go back to GALFIT Home Page.
Go back to Chien Peng's Home Page.

Chien Peng ()