Since this variation is happening in dry weather, the gauge is apparently being affected by factors other than just precipitation. Here's one method to try to reduce this variation.
The raw data for the gauge is available on the net, and looks like:
Hogg Pass
2005 07 30 00  52.7  54.9
2005 07 30 01  52.5  53.1
2005 07 30 02  52.6  51.3
2005 07 30 03  52.6  49.5
2005 07 30 04  52.6  47.5

If we saved this "page" to a file, fed it to a spreadsheet program, and plotted the "Temperature" and "Precipitation" data columns, we might see something like:
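Reading that raw file back out of a spreadsheet isn't required, but the same parsing can be sketched in a few lines of Python. This is a sketch under assumptions: the file name is hypothetical, the layout is inferred from the excerpt above (a station-name line, then whitespace-separated rows of year, month, day, hour, and two readings), and the nearly constant column is taken to be the accumulated precipitation while the declining one is the overnight air temperature.

```python
def parse_gauge_file(path):
    """Parse whitespace-separated gauge rows of
    'year month day hour precip temp' (a station-name line may precede
    or prefix them). Column meanings are inferred from the excerpt."""
    records = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 6:
                continue  # station-name or blank lines
            try:
                # The last six fields are: year month day hour precip temp.
                year, month, day, hour = (int(x) for x in fields[-6:-2])
                precip, temp = float(fields[-2]), float(fields[-1])
            except ValueError:
                continue  # any other non-data line
            records.append({"year": year, "month": month, "day": day,
                            "hour": hour, "precip": precip, "temp": temp})
    return records
```

Each record then carries one hour's precipitation and temperature reading, ready to be fed into the estimation steps below.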
This technique assumes that no rainfall occurred during the sample data period.
The estimated temperature could be calculated as a weighted average of the current and a few of the previous hours' air temperatures. We need to determine how much "weight" to give to these factors. We can do this by creating an "estimated temperature" equation, plotting its values versus the precipitation values, and looking for a set of weights (or coefficients) which provide the best correlation.
The "cloud" of data points which show the correlation of precipitation to temperature will change as the coefficients change. For better correlations, the cloud will appear to coalesce along a simple curve.
To find these coefficients, add another data column to the spreadsheet, "T_estimated", which is the result of an equation:

T_estimated = c_0*T_0 + c_1*T_1 + c_2*T_2 + c_3*T_3

where T_n is the air temperature from n hours ago (T_0 being the current hour's reading).
The coefficients (c_0 ... c_3) should be defined in cells so that they may be easily changed and the effects of that change observed. When the sum of the values of these coefficients is 1.0, "T_estimated" can be considered to be a real temperature, blended from portions of the current and previous hours' data.
If c_0 = 1.0 and all other coefficients = 0.0, then "T_estimated" = the current hour's temperature.
If c_1 = 1.0 and all other coefficients = 0.0, then "T_estimated" = the previous hour's temperature.
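The weighted blend described above can be sketched directly (the function name and argument conventions here are illustrative, not from the original spreadsheet):

```python
def estimate_temp(temps, coeffs):
    """Weighted blend of the current and previous hours' temperatures.

    temps[0] is the current hour's temperature, temps[1] the previous
    hour's, and so on. coeffs holds (c_0, c_1, c_2, c_3); when they sum
    to 1.0 the result reads as a real, blended temperature.
    """
    return sum(c * t for c, t in zip(coeffs, temps))
```

With coeffs = (1.0, 0.0, 0.0, 0.0) this returns the current hour's temperature; with (0.0, 1.0, 0.0, 0.0) it returns the previous hour's, matching the two special cases above.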
Now change the graph to be a "scatter plot", with "T_estimated" on the X-axis and Precipitation on the Y-axis. Add a second degree polynomial "trend line", and show its "R2" correlation factor. We are going to use R2 to find the optimal coefficient weightings ("c_0" ... "c_3").
An R2 = 1.0 implies a perfect correlation, and is the maximum possible value; in all other cases R2 will be less. The combination of coefficients that yields the highest R2 value is the "best" one to use in the temperature estimation equation. For example:
With c_0 = 1.0 and all other coefficients = 0.0, "R2" = 0.866.
With c_1 = 1.0 and all other coefficients = 0.0, "R2" = 0.9268.
This means that the reported precipitation amount is slightly more correlated to the previous hour's temperature than it is to its own!
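The R2 a spreadsheet reports for a second-degree polynomial trend line can be reproduced with NumPy, if you'd rather check the numbers outside the spreadsheet. This is a minimal sketch; the function name is mine:

```python
import numpy as np

def trendline_r2(x, y, degree=2):
    """R^2 of a least-squares polynomial trend line of y versus x,
    as a spreadsheet reports for its fitted trend line."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    coeffs = np.polyfit(x, y, degree)          # fit the polynomial
    y_fit = np.polyval(coeffs, x)              # evaluate it at each x
    ss_res = np.sum((y - y_fit) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    return 1.0 - ss_res / ss_tot
```

Calling trendline_r2(t_estimated_column, precipitation_column) gives the same figure the spreadsheet shows next to its trend line.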
Start adding non-zero values for the rest of the coefficients, and try to find the set which gives the best "R2" value. The sum of c_0 to c_3 need not be constrained to equal "1.00" until you are ready to finalize their values. At that point, normalize the coefficients. (R2 will not be affected.)
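The hand-tuning described above can also be brute-forced. Below is a minimal grid-search sketch (my own construction, not the original procedure): it tries coefficient combinations on a coarse grid, scores each by the R2 of a degree-2 trend line of precipitation versus the resulting "T_estimated", and normalizes the winner so its weights sum to 1.0.

```python
import itertools
import numpy as np

def best_coefficients(temps, precip, step=0.25):
    """Coarse grid search for the (c_0..c_3) weighting that maximizes
    the R^2 of a degree-2 trend line of precipitation vs. T_estimated.

    temps is a 2-D array: row i holds (T_now, T_-1h, T_-2h, T_-3h).
    """
    temps = np.asarray(temps, float)
    precip = np.asarray(precip, float)
    grid = np.arange(0.0, 1.0 + step / 2, step)
    best_r2, best_c = -np.inf, None
    for c in itertools.product(grid, repeat=4):
        if sum(c) == 0.0:
            continue  # all-zero weights give no temperature at all
        t_est = temps @ np.array(c)
        fit = np.polyfit(t_est, precip, 2)
        resid = precip - np.polyval(fit, t_est)
        r2 = 1.0 - resid.var() / precip.var()
        if r2 > best_r2:
            best_r2, best_c = r2, c
    s = sum(best_c)
    return best_r2, tuple(ci / s for ci in best_c)  # normalize to sum 1.0
```

As the text notes, normalizing the coefficients at the end does not change R2, since scaling "T_estimated" by a constant only rescales the x-axis of the trend-line fit.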
It appears that a combination of the current and two previous hours' data has the best correlation (R2 = 0.9325), for which the estimated temperature's equation becomes:
Different data sets can be expected to yield different results, so if you try this, your results will probably differ slightly. Short of automating and periodically re-generating these results, the above equation is probably "Close Enough".
To generalize this correction factor, we need to scale it by a factor proportional to the current precipitation divided by the sample data's precipitation amount:
If we ask the spreadsheet for the equation of the trend line, the correction equation looks something like:
This correction should probably only be used when "T_estimated" is above 45 degrees.
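Putting the scaling and the threshold together, applying the correction might look like the sketch below. Everything here is an assumption for illustration: the trend-line coefficients (a, b, c) are placeholders for whatever your own fit produces, "p_sample" stands for the sample period's precipitation level, and the exact form of the scaling is one reading of "proportional to the current precipitation divided by the sample data's precipitation amount".

```python
def corrected_precip(p_reported, t_estimated, trend, p_sample,
                     t_threshold=45.0):
    """Remove the temperature-driven component of a gauge reading.

    trend: (a, b, c) of the fitted trend line P ~ a*T^2 + b*T + c over
           the rain-free sample period -- placeholder values, to be
           replaced by your own spreadsheet's fit.
    p_sample: the sample period's precipitation level, used to scale
              the correction to the current reading.
    """
    if t_estimated <= t_threshold:
        return p_reported  # only apply the correction at warmer temperatures
    a, b, c = trend
    predicted = a * t_estimated ** 2 + b * t_estimated + c
    correction = (p_reported / p_sample) * (predicted - p_sample)
    return p_reported - correction
```

When the trend line predicts exactly the sample level, the correction is zero and the reading passes through unchanged; when the estimated temperature pushes the prediction above that level, the excess is subtracted in proportion to the current reading.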
If we use these equations, the "corrected precipitation" has about one-third of the variation seen in the original data. The daily component is mostly gone, leaving some noise and a slowly varying component which looks to be related to atmospheric pressure.
The correlation of the temperature-corrected Precipitation to Atmospheric pressure was relatively poor (R2 = 0.2175), resulting in no observable improvement in the variation when attempting to correct for it.
A simple temperature correction is not perfect, but such is to be expected when using simple models and quantized data.