Great Analysis - But what about the logic showing the contrasting results between

One day after AD to ED plus 1 week [2.51%]

how can this be more than

One day after AD to ED [2.37%]

yet

ED to ED plus 1 week is negative [-0.14%]

There's something not right about the maths here. It doesn't add up.

Am I missing something?

Please let me know

Hi John. That's a great catch! I took another look at the data and figured out why this anomaly shows up. The math isn't wrong, but it does need some further explanation. As a preprocessing step, I capped the returns in a column at its 95th percentile whenever the max value in that column was more than 5 times the 95th percentile value.

For example, consider the returns from ED to ED plus 1W. The max value in that column is 158%, but the 95th percentile value is just 10.94%. To prevent the top 5% of values from skewing the average, every value greater than 10.94% is capped at 10.94%. This also changes how the averages behave for those columns.
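To make the capping rule concrete, here is a minimal sketch of it in Python. It is not the code from the original analysis: the function names `percentile` and `cap_if_skewed`, the linear-interpolation percentile, and the toy numbers are all assumptions made for illustration.

```python
from statistics import mean

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def cap_if_skewed(values, q=95, ratio=5):
    """Cap values at the q-th percentile, but only when max > ratio * percentile.

    Assumption: the percentile is positive, as in the return columns discussed here.
    """
    p = percentile(values, q)
    if max(values) > ratio * p:
        return [min(v, p) for v in values]
    return list(values)

# Hypothetical returns (in %): mostly small moves plus one large outlier.
returns = [-5.0] * 10 + [1.0] * 9 + [100.0]
capped = cap_if_skewed(returns)

print("uncapped mean:", mean(returns))
print("capped mean:  ", mean(capped))
```

With these toy numbers the single outlier is what keeps the uncapped average positive, so the capped average comes out negative, which is the same effect as in the ED to ED plus 1W column.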

Let's look at the returns you are mentioning and how they change with capping. I have shared the data relevant to these returns so that you can take a closer look: https://docs.google.com/spreadsheets/d/1Mz0uFlBlCQTj-pz88wGUylfv1DrkyRzFQwWdHnrHLBU/edit?usp=sharing

AD to ED

Uncapped Average: 3.23

Capped Average: 2.37

AD to ED plus 1W

Uncapped Average: 3.26

Capped Average: 2.51

ED to ED plus 1W

Uncapped Average: 0.26

Capped Average: -0.14

It looks like the ED to ED plus 1W average goes from positive to negative on capping! Let's take a closer look at that. Here is the ED to ED plus 1W distribution before capping:

Max value: 158.46

99th percentile value: 23.31

95th percentile value: 10.94

What percentage of values are negative?

1023 values <= 0, 1002 values > 0. So 50.5% of the values are zero or negative, but the outliers make the average positive! The distribution is heavily skewed to the right.

The zero or negative values range from 0% to -53%. The bottom 10% of the returns range from -7% to -53%.

The top 5% ranges from 11% to 158%.

Capping the top 5 percent of values, which range from 11% to 158%, at 10.94% turns the average from positive to negative. That's why it looks like the math doesn't add up: each column is capped on its own, so the capped averages no longer have to be consistent with each other. But I think the capping gives a more realistic picture of what happens during short term plays, as the outliers account for most of the positive returns.

Without the capping, the average returns roughly line up. AD to ED is 3.23%, ED to ED plus 1W is 0.26%, and AD to ED plus 1W is 3.26%.
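Here is a small sketch of why averages that line up before capping can stop lining up afterwards. Everything in it is hypothetical: the four toy events, the cap levels, and the simplification that the longer window's return is just the sum of the two legs (real returns compound, so with real data the match is only approximate even before capping).

```python
from statistics import mean

def cap(values, level):
    """Cap every value in the list at `level`."""
    return [min(v, level) for v in values]

# Hypothetical per-event returns in %, not taken from the real dataset.
ad_to_ed = [2.0, 3.0, 1.0, 4.0]       # first leg, no outliers
ed_to_1w = [-1.0, -2.0, -1.0, 8.0]    # second leg, one big outlier
ad_to_1w = [a + b for a, b in zip(ad_to_ed, ed_to_1w)]  # assumed additive

# Uncapped: the mean is linear, so the two legs add up to the full window.
print(mean(ad_to_ed), mean(ed_to_1w), mean(ad_to_1w))

# Capped: each column gets its own cap (hypothetical levels), so the
# capped averages no longer have to add up.
capped_leg2 = cap(ed_to_1w, 1.0)
capped_full = cap(ad_to_1w, 10.0)
print(mean(ad_to_ed), mean(capped_leg2), mean(capped_full))
```

In this toy case the capped second leg averages below zero while the capped full window still averages more than the first leg, which is the same pattern as 2.37%, -0.14%, and 2.51%.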

Hope that clarifies the situation a bit. I can share more data if you wish, but the original analysis has a lot of columns and might be overwhelming. Thanks again for the great catch!

I greatly appreciate your detailed, helpful reply. I understand your rationale behind this. The depth of your analysis is remarkable. Thank you again for clarifying!

You are welcome :)
