Research Crumbs: Tail Bound

Sat, Jun 12, 2021 · 2-minute read · tags: crumbs, probability, statistics


The other day I was having my regular meeting with my advisors to discuss a paper, and two theorems involving tail bounds struck me as very intriguing. What is interesting about them is that one could easily derive a similar but fallacious conclusion, which I call a ‘false friend’ (amigo falso in Spanish), if a subtle difference goes unnoticed. The original context is rather involved, so I will simplify the notation and focus on the logic.


🔖 Case 1

Let \(X\) be a random variable, let \(\theta\) be a parameter in the parameter space \(\Theta\), let \(f(X,\theta)\) be a function of \(X\) and \(\theta\), and fix \(\epsilon>0\). The limits \(\rightarrow 0\) below are taken along some asymptotic regime (say, a growing sample size) that I leave implicit, as in the original discussion.

The question is: if we want to prove \(P(|f(X,\theta)|\geq \epsilon)\rightarrow 0\) for every \(\theta\in\Theta\), which of the conditions below is sufficient?

  1. \( P(\underset{\theta}{\max}|f(X,\theta)|\geq \epsilon)\rightarrow 0\)
  2. \(P(|f(X,\theta)|\geq \epsilon)\rightarrow 0\)

The answer is the first one. The second condition looks reasonable at first glance; however, the notation \(\theta\) is misleading. If we rename it \(\theta_1\), the issue becomes apparent.

Recall that the objective is to prove \(P(|f(X,\theta)|\geq \epsilon)\rightarrow 0\) for every \(\theta\in\Theta\). But \(P(|f(X,\theta_1)|\geq \epsilon)\rightarrow 0\) is a statement about one particular parameter value: it does NOT imply \(P(|f(X,\theta_2)|\geq \epsilon)\rightarrow 0\), so the second condition cannot deliver the conclusion of interest. The first condition does suffice, because \(|f(X,\theta')|\leq \underset{\theta}{\max}|f(X,\theta)|\) for every \(\theta'\in\Theta\), hence \(P(|f(X,\theta')|\geq \epsilon)\leq P(\underset{\theta}{\max}|f(X,\theta)|\geq \epsilon)\rightarrow 0\), uniformly in \(\theta'\).
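To see the gap between the pointwise and the uniform statement concretely, here is a small numerical sketch with a toy model of my own (not from the paper in question): take \(X\sim\mathrm{Uniform}(0,1)\), \(\Theta=[0,1]\), and \(f_n(X,\theta)=\mathbf{1}\{|\theta-X|<1/n\}\). For each fixed \(\theta\) the tail probability is at most \(2/n\rightarrow 0\), yet \(\underset{\theta}{\max}|f_n(X,\theta)|=1\) for every realization of \(X\) (just take \(\theta=X\)).

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 5000
theta_grid = np.linspace(0.0, 1.0, 2001)  # fine grid standing in for Theta = [0, 1]

# Toy model (mine, not the paper's): X ~ Uniform(0, 1) and
# f_n(X, theta) = 1{|theta - X| < 1/n}.  Since f_n is 0/1-valued,
# {|f_n| >= eps} = {f_n = 1} for any eps in (0, 1].
for n in [10, 100, 1000]:
    X = rng.uniform(size=reps)
    p_fixed = np.mean(np.abs(0.3 - X) < 1.0 / n)  # P(|f_n(X, theta)| >= eps) at theta = 0.3
    hit_any = (np.abs(theta_grid[None, :] - X[:, None]) < 1.0 / n).any(axis=1)
    p_max = np.mean(hit_any)                      # P(max_theta |f_n(X, theta)| >= eps)
    print(f"n={n:5d}  pointwise={p_fixed:.4f}  uniform={p_max:.4f}")
```

The pointwise column shrinks like \(2/n\) while the uniform column sticks at \(1\): exactly the gap between the two conditions.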


🔖 Case 2

Let \(X_{ij}\), \(1\leq i\leq m\), \(1\leq j\leq n\), be independent random variables with mean \(\mu_{ij}=0\). Given the condition that \(P(|X_{ij}|\geq\epsilon)\rightarrow 0\), which one of the following conclusions is correct?

  1. \(P(\underset{0\leq u_{i}, v_{j} \leq 1}{\max}|\sum_{i=1}^m\sum_{j=1}^nX_{ij}u_iv_j|\geq\epsilon)\rightarrow 0\)
  2. \(P(\underset{0\leq t_{ij} \leq 1}{\max}|\sum_{i=1}^m\sum_{j=1}^nX_{ij}t_{ij}|\geq\epsilon)\rightarrow 0\)

The answer is the first one.

Let’s first demonstrate that the second conclusion fails, using a counterexample. Note that the maximum lets the \(t_{ij}\) depend on the \(X_{ij}\). Assign \(t_{ij}=1\) if \(X_{ij}>0\) and \(t_{ij}=0\) otherwise; then \(\sum_{i=1}^m\sum_{j=1}^nX_{ij}t_{ij}=\sum_{i,j}X_{ij}^{+}\), the sum of the positive parts, which stays strictly positive and in fact grows with \(mn\). Therefore the second conclusion does not hold.
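A quick numerical check of this counterexample (my own sketch; I also divide the sum by \(mn\), since the post leaves the scaling implicit): the adversarial sum, once normalized, settles near \(E[X_{ij}^{+}]>0\) instead of vanishing.

```python
import numpy as np

rng = np.random.default_rng(0)
for size in [10, 100, 1000]:
    X = rng.normal(size=(size, size))   # i.i.d. mean-zero entries (my choice)
    t = (X > 0).astype(float)           # adversarial t_ij = 1{X_ij > 0}
    val = abs((X * t).sum()) / size**2  # |sum_ij X_ij t_ij| / (mn)
    print(f"m=n={size:5d}  normalized sum = {val:.4f}")
# For standard normals this converges to E[X^+] = 1/sqrt(2*pi) ~ 0.3989.
```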

Naturally, one may ask whether the first conclusion survives the same adversarial argument. It does, but why? The key point resides in a concept similar to ‘degrees of freedom’.

In the second statement, the number of \(t_{ij}\) matches the number of \(X_{ij}\), namely \(mn\). Each \(X_{ij}\) is controlled by its own \(t_{ij}\), so we can keep every positive \(X_{ij}\) in the summation while zeroing out every negative one, which keeps the summation strictly positive.

However, in the first statement, the \(u_i\) and \(v_j\) supply only \(m+n\) free parameters to control all \(mn\) terms. Whatever positive terms they pick up are offset by the remaining (roughly \(mn-m-n\)) mean-zero terms, so \(|\sum_{i=1}^m\sum_{j=1}^nX_{ij}u_iv_j|\), suitably normalized, still vanishes as \(m,n\rightarrow\infty\), and the first conclusion holds.
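To contrast with the check above, here is a heuristic sketch of the first statement's maximum, again my own toy setup with Gaussian entries and \(1/(mn)\) normalization. A box-constrained bilinear maximum is attained at vertices \(u_i,v_j\in\{0,1\}\), so I approximate it with alternating ascent; only the trend matters here.

```python
import numpy as np

rng = np.random.default_rng(1)

def max_bilinear(X, iters=50):
    """Heuristic max over u, v in [0,1]^m x [0,1]^n of |sum_ij X_ij u_i v_j|.

    For fixed v the objective is linear in u, so the optimum sits at a
    vertex (u_i = 1 iff sign * (X v)_i > 0); alternating the two updates
    yields a local maximum, which is enough to see the trend.
    """
    best = 0.0
    for sign in (1.0, -1.0):                   # push the sum up, then down
        v = np.ones(X.shape[1])
        for _ in range(iters):
            u = (sign * (X @ v) > 0).astype(float)
            v = (sign * (u @ X) > 0).astype(float)
        best = max(best, abs(float(u @ X @ v)))
    return best

for size in [10, 40, 160, 640]:
    X = rng.normal(size=(size, size))          # i.i.d. mean-zero entries
    print(f"m=n={size:4d}  normalized uv-max ~ {max_bilinear(X) / size**2:.4f}")
```

On my runs the \(t_{ij}\) version above stalls near \(0.40\) while this normalized maximum keeps shrinking: the ‘degrees of freedom’ gap in action, since \(m+n\) knobs cannot suppress the cancellation among \(mn\) mean-zero terms.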