Hello teachers, I have some questions regarding the corr_ttest function in pyleoclim. This function calculates the effective degrees of freedom, but when I reviewed the calculation method, I noticed that the resulting effective degrees of freedom can exceed the length of the data itself. The formula used is Ne = gmean([Ney1 + Ney2]), and I am curious as to why the geometric mean is used instead of the arithmetic mean. Additionally, the formula used in the article by (Hu, EPSL, 2017) seems different from the one in the corr_ttest function. Therefore, I would like to ask which method provides the correct calculation for effective degrees of freedom?
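To make the concern concrete, here is a minimal sketch that evaluates the expression exactly as quoted above. The numbers are hypothetical and chosen only for illustration, and the standard library's geometric_mean stands in for scipy.stats.gmean. Whether the quoted expression matches the current pyleoclim source is not checked here. The point is only that the geometric mean of a one-element list is that element, so gmean([Ney1 + Ney2]) returns the sum, which can exceed the series length, whereas gmean([Ney1, Ney2]) always lies between the two values and never exceeds the arithmetic mean.

```python
from statistics import geometric_mean  # stdlib stand-in for scipy.stats.gmean

# Hypothetical values, for illustration only
n = 80        # length of each series
ney1 = 40.0   # effective sample size estimated for the first series
ney2 = 60.0   # effective sample size estimated for the second series

# Expression as quoted in the question: a list with ONE element (the sum)
as_quoted = geometric_mean([ney1 + ney2])

# Geometric mean of the two values: sqrt(ney1 * ney2)
two_element = geometric_mean([ney1, ney2])

# Arithmetic mean, for comparison
arithmetic = (ney1 + ney2) / 2

print(f"as quoted      : {as_quoted:.2f} (series length is {n})")
print(f"gmean of both  : {two_element:.2f}")
print(f"arithmetic mean: {arithmetic:.2f}")
```

With these inputs the quoted form exceeds the series length, while the two-element geometric mean stays below both the arithmetic mean and the series length.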
Hi @LiaresCN. I don’t advocate for the modified t-test because it is approximate and tends to fail when autocorrelation is very high (which is very likely the case in your example). Is there any reason why you cannot use phase randomization?
In general, when parametric methods misbehave, it is an indication that they are a poor fit for your data, which is a red flag to heed. That being said, you raise a good point about the discrepancy between the code and the paper. I can no longer remember why I implemented the geometric mean. Can you open an issue and I’ll get to it this week?
Some sense of the type of data you are applying this to would be helpful.
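For readers unfamiliar with the phase randomization suggested above, here is a rough NumPy sketch of the idea. It is not pyleoclim's implementation: the function names phase_randomize, phaseran_pvalue and ar1 are hypothetical, and randomizing only one of the two series is a simplifying assumption. Each surrogate keeps the amplitude spectrum (and therefore the autocorrelation) of the original series but scrambles the Fourier phases, which gives a null distribution of correlations without needing an effective degrees of freedom estimate.

```python
import numpy as np


def phase_randomize(x, rng):
    """Surrogate with the same amplitude spectrum as x but random phases."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    spectrum = np.fft.rfft(x - mean)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0           # zero-frequency term must stay real
    if n % 2 == 0:
        phases[-1] = 0.0      # Nyquist term must stay real when n is even
    surrogate = np.fft.irfft(np.abs(spectrum) * np.exp(1j * phases), n=n)
    return surrogate + mean


def phaseran_pvalue(x, y, n_surrogates=1000, seed=0):
    """Two-sided p-value for corr(x, y) against phase-randomized copies of x."""
    rng = np.random.default_rng(seed)
    r_obs = np.corrcoef(x, y)[0, 1]
    r_null = np.array([
        np.corrcoef(phase_randomize(x, rng), y)[0, 1]
        for _ in range(n_surrogates)
    ])
    # Add-one correction so the estimated p-value is never exactly zero
    p_value = (np.sum(np.abs(r_null) >= abs(r_obs)) + 1) / (n_surrogates + 1)
    return r_obs, p_value


def ar1(n, phi, rng):
    """Synthetic AR(1) series, used here only to build a demo input."""
    noise = rng.standard_normal(n)
    out = np.empty(n)
    out[0] = noise[0]
    for t in range(1, n):
        out[t] = phi * out[t - 1] + noise[t]
    return out


# Demo: two independent, strongly autocorrelated series
demo_rng = np.random.default_rng(42)
x = ar1(200, 0.9, demo_rng)
y = ar1(200, 0.9, demo_rng)
r, p = phaseran_pvalue(x, y, n_surrogates=500)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Because the surrogates inherit the autocorrelation of the data, the null distribution widens automatically when persistence is high, which is the regime where the modified t-test is said above to struggle.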