May 30, 2017

Conditional Convergence in Probability

Convergence in probability is the simplest form of convergence for random variables: for any positive ε it must hold that P[ | Xn - X | > ε ] → 0 as n → ∞. This kind of convergence is easy to check, though harder to relate to first-year-analysis convergence than the associated notion of convergence almost surely: P[ Xn → X as n → ∞ ] = 1.

Convergence in probability is implied by convergence almost surely (most direct proof: express convergence almost surely in terms of more elementary events using countable unions and intersections, and do some simple reasoning on the result), but does not imply it. On the other hand, a sequence of random variables that converges in probability always has a sub-sequence that converges almost surely. Moreover, if every sub-sequence of a sequence of random variables contains a sub-sub-sequence that converges almost surely, then the original sequence converges in probability.

While studying the application of Dirichlet forms to Markov chain Monte Carlo (developing the work of Zanella et al., 2017), the following convergence-in-probability question arose:

Question

Work on a given probability space (Ω, F, P). Suppose that random variables X1, X2, ... converge to X in probability.
Given a sub-σ-algebra G ⊆ F, is it the case that the random variables X1, X2, ... converge to X in G-conditional probability?
In other words, is it the case that, for every ε > 0,

P[ | Xn - X | > ε | G ] → 0 almost surely?


If the conditioning is simply conditioning on a single event B of positive probability, then the answer is yes; writing An for the event { | Xn - X | > ε }, consider that

P[ An | B ] = P[ An ∩ B ] / P[ B ] ≤ P[ An ] / P[ B ],

so P[ An | B ] will converge to zero whenever P[ An ] does.
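
As a quick numerical sanity check, here is a minimal simulation sketch with assumed illustrative choices (not from the original post): X = 0, Xn the indicator of [0, 1/n) for a Uniform[0, 1) variable U, and B = {U < 1/2}. The conditional probabilities are visibly squeezed to zero by the bound P[ An ] / P[ B ]:

    # Illustrative sketch (assumed example, not from the original post).
    import numpy as np

    rng = np.random.default_rng(1)
    N = 200_000
    U = rng.uniform(0.0, 1.0, size=N)      # N independent copies of a Uniform[0,1) variable U

    B = U < 0.5                             # conditioning event B = {U < 1/2}, so P[B] = 1/2
    for n in (10, 100, 1000):
        A_n = U < 1.0 / n                   # A_n = { |X_n - X| > eps } for X_n = 1_[0,1/n), X = 0, 0 < eps < 1
        p_A = A_n.mean()
        p_A_given_B = (A_n & B).sum() / B.sum()
        print(f"n = {n:5d}:  P[A_n] ~ {p_A:.4f},  P[A_n | B] ~ {p_A_given_B:.4f},"
              f"  bound P[A_n]/P[B] ~ {p_A / 0.5:.4f}")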

However the general answer is, of course, no. In a nutshell, consider a sequence of random variables that converges in probability but not almost surely. Condition on the entire sequence(!), thus rendering it entirely deterministic. There is a positive chance that the conditioned sequence fails to converge; and if so then it cannot converge in (conditional) probability. We now give an explicit example of a sequence that converges in probability but not almost surely, and spell out the details of why conditional convergence in probability then fails.

Example

Consider a Uniform random variable U defined on [0, 1), using the usual Lebesgue σ-algebra F. Define X1, X2, ... as follows: consider the ensemble of events [(k-1)2^(-m), k·2^(-m)) for k = 1, ..., 2^m and m = 1, 2, ...; order these (level by level in m, say, and within each level by k) and let Xn be the indicator random variable corresponding to the nth event, while X = 0. Then P[ Xn = 1 ] = 2^(-m) if Xn is the indicator random variable corresponding to an event [(k-1)2^(-m), k·2^(-m)) of level m, hence P[ | Xn - X | > ε ] → 0 whenever 0 < ε < 1. So certainly X1, X2, ... converge to X in probability. On the other hand, if G = F then P[ | Xn - X | > ε | G ] is just the indicator random variable [ Xn = 1 ]; but almost surely the sequence X1, X2, ... contains infinitely many 1's as well as infinitely many 0's (at each level exactly one of the indicators equals 1), so almost surely P[ | Xn - X | > ε | G ] can never converge.
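
For concreteness, here is a short simulation sketch of this construction (the indexing helper, index choices and sample sizes are illustrative assumptions of mine): it estimates P[ Xn = 1 ] at the last index of each of the first five levels, then prints a single sample path, which keeps returning to 1 and so cannot converge:

    # Illustrative sketch of the Example (indexing helper and sample sizes are my own choices).
    import numpy as np

    def level_and_position(n):
        """Map n = 1, 2, ... to (m, k): the n-th event among
        [(k-1) 2^(-m), k 2^(-m)), k = 1, ..., 2^m, listed level by level (m = 1, 2, ...)."""
        m = 1
        while n > 2 ** m:
            n -= 2 ** m
            m += 1
        return m, n

    def X(n, u):
        """Value of the indicator X_n at the sample point u in [0, 1)."""
        m, k = level_and_position(n)
        return 1 if (k - 1) * 2.0 ** (-m) <= u < k * 2.0 ** (-m) else 0

    rng = np.random.default_rng(0)
    U = rng.uniform(0.0, 1.0, size=5000)          # independent copies of U, to estimate P[X_n = 1]

    for n in (2, 6, 14, 30, 62):                  # last index of each level m = 1, ..., 5
        m, _ = level_and_position(n)
        est = np.mean([X(n, u) for u in U])
        print(f"n = {n:2d} (level m = {m}): P[X_n = 1] ~ {est:.3f}  (exact 2^(-m) = {2.0 ** (-m):.3f})")

    # A single sample path, by contrast, contains a 1 at every level, so it cannot converge.
    u0 = U[0]
    print("one sample path:", [X(n, u0) for n in range(1, 63)])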

Discussion

More generally, this sort of problem arises whenever there is a G-measurable random variable whose distribution is not purely atomic.

Exactly the same argument shows that Lp convergence does not imply "conditional Lp convergence".

However the facts that

  1. convergence in probability implies existence of almost surely convergent subsequences,
  2. convergence in probability itself is implied by the existence, within every subsequence, of a sub-subsequence that converges almost surely,

can be used to evade the issues raised here. For example, in the case of the application of Dirichlet forms to Markov chain Monte Carlo, even though convergence in probability is not preserved under conditioning, these considerations can be used to prove a strategic conditional CLT ...
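
To make fact 1 concrete in the setting of the Example above (the choice of subsequence n_k = 2^k - 1, the first event of each level, is my own illustration): P[ X_{n_k} = 1 ] = 2^(-k) is summable, so by Borel-Cantelli only finitely many of the X_{n_k} equal 1 along almost every sample path, and the subsequence converges almost surely to 0:

    # Illustrative sketch of the subsequence device, applied to the Example above.
    import numpy as np

    rng = np.random.default_rng(2)

    def X_first_of_level(k, u):
        """X_{n_k} for n_k = 2^k - 1: the indicator of [0, 2^(-k)), the first event of level k."""
        return 1 if u < 2.0 ** (-k) else 0

    # P[X_{n_k} = 1] = 2^(-k) is summable, so (Borel-Cantelli) only finitely many X_{n_k}
    # equal 1 along any sample path: the subsequence converges almost surely to 0.
    for u in rng.uniform(0.0, 1.0, size=5):
        print(f"u = {u:.4f}:", [X_first_of_level(k, u) for k in range(1, 21)])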

Reference

Zanella, G., Bédard, M., & Kendall, W. S. (2017). A Dirichlet Form approach to MCMC Optimal Scaling. Stochastic Processes and Their Applications, to appear, 22pp. http://doi.org/10.1016/j.spa.2017.03.021


Comments

  1. Wilfrid Kendall

    Thanks to Martin Emil Jakobsen for pointing out a typo in the example of conditioning on a single event.

    13 Apr 2020, 09:37

  2. Wilfrid Kendall

    More recently I have noticed the following elementary fact about convergence in probability. Suppose X1, X2, ... is a sequence of random variables defined on the same probability space (Ω, ℱ, P). Suppose further that the sequence X1, X2, ... of random variables converges in probability to a non-random constant c. Then it will still do so (to the same non-random constant c) if the controlling probability measure P is replaced by another probability measure Q which is absolutely continuous with respect to P (that means, if E is in ℱ and P[E]=0 then also Q[E]=0).

    To see why, notice as above that X1, X2, ... converges in probability to c if and only if every subsequence of X1, X2, ... possesses a sub-subsequence that converges almost surely to c. But almost-sure convergence under P (convergence off a specific event E in ℱ for which P[E] = 0) must by absolute continuity imply almost-sure convergence under Q.

    Notice that the same argument applies if c is replaced by a truly random random variable Y. But then the distribution of Y under Q may be different from the distribution of Y under P.
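
    By way of a minimal numerical illustration (my own sketch, assuming Q has density 2u with respect to Lebesgue measure on [0, 1), so Q is absolutely continuous with respect to P): the indicator sequence from the Example above still converges in probability to 0 under Q; sampling U under Q by the inverse-CDF transform shows the relevant probabilities tending to 0 under both measures:

        # Illustrative sketch (assumed change of measure: Q has density 2u on [0, 1)).
        import numpy as np

        rng = np.random.default_rng(3)
        V = rng.uniform(0.0, 1.0, size=200_000)
        U_under_P = V                 # U ~ Uniform[0, 1) under P
        U_under_Q = np.sqrt(V)        # U sampled under Q (density 2u), via inverse CDF

        # X_n = indicator of the last interval [1 - 2^(-m), 1) of level m: the worst case
        # for Q, which concentrates its mass near 1.  Both probabilities still tend to 0.
        for m in (2, 5, 8, 11):
            for name, U in (("P", U_under_P), ("Q", U_under_Q)):
                print(f"level m = {m:2d}, measure {name}: prob[X_n = 1] ~ {np.mean(U >= 1.0 - 2.0 ** (-m)):.5f}")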

    Please email me if you can supply a reference for this fact!

    12 Mar 2022, 11:48

