Probabilistic Analysis

Probabilistic analysis is used to analyze running time of the algorithm. For this we must use knowledge of or make assumptions about, the distribution of inputs. Then we analyze our algorithm, computing an average running time, where we take average over the distribution of the possible inputs. Thus, in effect, we are averaging the running time over all possible inputs. This we call as “average case running time”.

We must be very careful in deciding the distribution of inputs. We may reasonably assume about possible set of inputs, and then use probabilistic analysis as a technique for designing the efficient algorithm, and means for gaining insight into the problem. Where we cannot describe the distribution of inputs, we cannot use probabilistic analysis.

In the hiring problem, the candidates came in random order means they can be in any order in the permutation of $\langle 1, \dots, c\rangle$.

Example 7.5   A Trivial example: Find the Lady.

Let us consider a variant of the classic card game “Find the Lady”. Here a dealer puts down three cards and we want to find a specific card among the three (say, the Queen of Spades). In this version, the dealer will let us turn over as many cards as we want, but each card we turn over will cost us a dollar, irrespective of whether you get a queen. If we find the queen, we get two dollars.

Because this is a toy example, we assume that the dealer is not cheating.

A deterministic algorithm tries the cards in some fixed order. A clever dealer will place the Queen in the last place we look: so the worst-case payoff is a loss of a dollar.

In the average case, if we assume that all three positions are equally likely to hold the target card, then we turn over one card a third of the time, two cards a third of the time, and all three cards a third of the time. This gives an expected payoff of

$\displaystyle \frac{1}{3}(1 + 2 + 3) - 2 = 0.
$

But this requires making assumptions about the distribution of the cards, and we are no longer doing worst-case analysis.

The trick to randomized algorithms is that we can can obtain the same expected payoff even in the worst case by supplying the randomness ourselves. If we turn over cards in a random order, then the same analysis for the average case tells us we get the same expected payoff of 0—but unlike the average case, we get this expected performance no matter where the dealer places the cards. $\Box$

A randomized algorithm is one that makes random choices during its execution. The behavior of such an algorithm may thus, be random even on a fixed input. The design and analysis of a randomized algorithm focuses on establishing that it is likely to behave “well” on every input; the likelihood in such a statement depends only on the probabilistic choices made by the algorithm during execution and not on any assumptions about the input.

It is especially important to distinguish a randomized algorithm from the average-case analysis of algorithms, where one analyzes an algorithm assuming that its input is drawn from a fixed probability distribution. With a randomized algorithm, in contrast, no assumption is made about the input.

Throughout this discussion we assume the RAM model of computation, in which we have a machine that can perform the following operations involving registers and main memory: input-output operations, memory-register transfers, indirect addressing, and branching and arithmetic operations. Each register or memory location may hold an integer that can be accessed as a unit, but an algorithm has no access to the representation of the number. The arithmetic instructions permitted are $+, −, \times,$ and $/$. In addition, an algorithm can compare two numbers, and evaluate the square root of a positive number. In this chapter $E[X]$ will denote the expectation of a random variable $X$, and $P_r[A]$ will denote the probability of an event $A$.