Probability of a run of k successes in a sequence of n Bernoulli trials

13

I'm trying to find the probability of getting 8 trials in a row correct in a block of 25 trials. You have 8 blocks in total (of 25 trials each) in which to get 8 trials correct in a row. The probability of getting any single trial correct by guessing is 1/3. Once 8 in a row are correct, the blocks end (so getting more than 8 in a row correct is technically not possible). How would I go about finding the probability of this occurring? I've been thinking along the lines of using (1/3)^8 as the probability of getting 8 in a row correct. There are 17 possible chances to get 8 in a row in a block of 25 trials, and multiplying 17 possibilities by 8 blocks gives 136. Would 1-(1-(1/3)^8)^136 give me the likelihood of getting 8 in a row correct in this situation, or am I missing something fundamental here?

probability binomial

— AcidNynex

1

I believe the problem with the argument given is that the events considered are not independent. For example, consider a single block. If I tell you that (a) there is no run of eight starting at position 6, (b) there is a run starting at position 7 and (c) there is no run starting at position 8, what does that tell you about the probability of a run starting at any of positions, say, 9 through 15?

— cardinal

14

By keeping track of things you can get an exact formula.

Let $p=1/3$ be the probability of success and $k=8$ be the number of successes in a row you want to count. These are fixed for the problem. The variable values are $m$, the number of trials left in the block, and $j$, the number of successive successes already observed. Let the chance of eventually achieving $k$ successes in a row before the $m$ trials are exhausted be written $f_{p,k}(j,m)$. We seek $f_{1/3,8}(0,25)$.

Suppose we have just seen our $j^\text{th}$ success in a row with $m\gt0$ trials to go. The next trial is either a success, with probability $p$, in which case $j$ increases to $j+1$; or it is a failure, with probability $1-p$, in which case $j$ is reset to $0$. In either case, $m$ decreases by $1$. Whence

$f_{p,k}(j,m) = p f_{p,k}(j+1,m-1) + (1-p)f_{p,k}(0,m-1).$

As starting conditions we have the obvious results $f_{p,k}(k,m)=1$ for $m \ge 0$ (i.e., we have already seen $k$ in a row) and $f_{p,k}(j,m)=0$ for $k-j \gt m$ (i.e., there are not enough trials left to get $k$ in a row). It is now fast and straightforward (using dynamic programming or, because this problem's parameters are so small, recursion) to compute

$f_{p,8}(0,25) = 18p^8 - 17p^9 - 45p^{16} + 81p^{17}-36p^{18}.$

When $p=1/3$ this yields $80897 / 43046721 \approx 0.0018793$.
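
The recurrence and its boundary conditions translate almost line for line into a memoized recursion. The following is a minimal Python sketch, not part of the original answer: the function name `run_probability` is an illustrative choice, and exact rational arithmetic via `fractions.Fraction` is assumed so that the result can be compared with the fraction quoted above.

```python
from fractions import Fraction
from functools import lru_cache


def run_probability(p, k, m):
    """Chance of at least one run of k successes within m Bernoulli(p) trials."""

    @lru_cache(maxsize=None)
    def f(j, trials_left):
        if j == k:                 # already saw k successes in a row
            return Fraction(1)
        if k - j > trials_left:    # not enough trials left to complete a run
            return Fraction(0)
        # Next trial: a success extends the streak, a failure resets it to 0.
        return p * f(j + 1, trials_left - 1) + (1 - p) * f(0, trials_left - 1)

    return f(0, m)


exact = run_probability(Fraction(1, 3), 8, 25)
print(exact, float(exact))
```

With only $(k+1)(m+1)$ distinct states, the cache keeps the recursion small for parameters of this size.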

Relatively fast R code to simulate this is

hits8 <- function() {
    x <- rbinom(26, 1, 1/3)                # 25 Binomial trials
    x[1] <- 0                              # ... and a 0 to get started with `diff`
    if(sum(x) >= 8) {                      # Are there at least 8 successes?
        max(diff(cumsum(x), lag=8)) >= 8   # Are there 8 successes in a row anywhere?
    } else {
        FALSE                              # Not enough successes for 8 in a row
    }
}
set.seed(17)
mean(replicate(10^5, hits8()))

After 3 seconds of calculation, the output is $0.00213$. Although this looks high, it is only 1.7 standard errors off. I ran another $10^6$ iterations, yielding $0.001867$: only $0.3$ standard errors less than expected. (As a double-check, because an earlier version of this code had a subtle bug, I also ran 400,000 iterations in Mathematica, obtaining an estimate of $0.0018475$.)
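
For readers without R, the same check can be sketched in plain Python. This is an illustrative re-implementation rather than a port of the code above: it counts the current streak directly instead of differencing cumulative sums, and the names `has_run` and `estimate`, the seed, and the number of replications are all arbitrary assumptions.

```python
import random


def has_run(rng, k=8, n=25, p=1/3):
    """Simulate one block of n trials; True if k successes occur in a row."""
    streak = 0
    for _ in range(n):
        if rng.random() < p:       # success: extend the current streak
            streak += 1
            if streak == k:
                return True
        else:                      # failure: the streak starts over
            streak = 0
    return False


def estimate(reps, seed=17):
    """Fraction of simulated blocks that contain a run of 8."""
    rng = random.Random(seed)      # fixed seed only for reproducibility
    hits = sum(has_run(rng) for _ in range(reps))
    return hits / reps


est = estimate(200_000)
print(est)
```

With 200,000 replications the standard error is roughly $10^{-4}$, so estimates should scatter around $0.00188$ at about that scale.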

This result is less than one-tenth of the estimate $1-(1-(1/3)^8)^{136} \approx 0.0205$ in the question. But perhaps I have not fully understood it: another interpretation of "you have 8 total blocks ... to get 8 trials correct in a row" is that the answer being sought equals $1 - (1 - f_{1/3,8}(0,25))^8 = 0.0149358...$.
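
The arithmetic behind this comparison is easy to reproduce. The short Python sketch below is an illustration only: it takes the exact single-block value $80897/43046721$ as given and assumes the 8 blocks are independent; the variable names are hypothetical.

```python
from fractions import Fraction

p_block = Fraction(80897, 43046721)             # exact single-block probability
eight_blocks = 1 - (1 - p_block) ** 8           # at least one of 8 independent blocks has a run
naive = 1 - (1 - Fraction(1, 3) ** 8) ** 136    # the question's estimate: 136 windows treated as independent

print(float(p_block), float(eight_blocks), float(naive))
```

Both the single-block and the eight-block values come out below the question's estimate, which is what the overlap between windows would suggest.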

— whuber

13

While @whuber's excellent dynamic programming solution is well worth a read, its runtime is $\mathcal O(k^2m)$ with respect to the total number of trials $m$ and the desired run length $k$, whereas the matrix exponentiation method is $\mathcal O(k^3\log(m))$. If $m$ is much larger than $k$, the following method is faster.

Both solutions consider the problem as a Markov chain with states representing the number of correct trials at the end of the string so far, and a state for achieving the desired number of correct trials in a row. The transition matrix is such that a failure, with probability $1-p$, sends you back to state 0, and a success, with probability $p$, advances you to the next state (the final state is an absorbing state). After raising this matrix to the $m$th power, where $m$ is the number of trials, the value in the first row and last column is the probability of seeing $k=8$ heads in a row. In Python:

import numpy as np

def heads_in_a_row(flips, p, want):
    a = np.zeros((want + 1, want + 1))
    for i in range(want):
        a[i, 0] = 1 - p
        a[i, i + 1] = p
    a[want, want] = 1.0
    return np.linalg.matrix_power(a, flips)[0, want]

print(heads_in_a_row(flips=25, p=1.0 / 3.0, want=8))

yields 0.00187928367413 as desired.

— Neil G

10

According to this answer, I will explain the Markov-Chain approach by @Neil G a bit more and provide a general solution to such problems in R. Let's denote the desired number of correct trials in a row by $k$ , the number of trials as $n$ and a correct trial by $W$ (win) and an incorrect trial by $F$ (fail). In the process of keeping track of the trials, you want to know whether you already had a streak of 8 correct trials and the number of correct trials at the end of your current sequence. There are 9 states ( $k+1$ ):

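$A$: We have not yet had $8$ correct trials in a row, and the last trial was $F$.

$B$: We have not yet had $8$ correct trials in a row, and the last two trials were $FW$.

$C$: We have not yet had $8$ correct trials in a row, and the last three trials were $FWW$.

$\ldots$

$H$: We have not yet had $8$ correct trials in a row, and the last eight trials were $FWWWWWWW$.

$I$: We have had $8$ correct trials in a row.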

The probability of moving to state $B$ from state $A$ is $p=1/3$, and with probability $1-p=2/3$ we stay in state $A$. From state $B$, the probability of moving to state $C$ is $1/3$, and with probability $2/3$ we move back to $A$. And so on. If we are in state $I$, we stay there.

From this, we can construct a $9\times9$ transition matrix $M$ (as each column of $M$ sums to $1$ and all entries are nonnegative, $M$ is called a left stochastic matrix):

$M= \begin{pmatrix} 2/3 & 2/3 & 2/3 & 2/3 & 2/3 & 2/3 & 2/3 & 2/3 & 0 \\ 1/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1/3 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1/3 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1/3 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1/3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1/3 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1/3 & 1 \end{pmatrix}$

Each column and row corresponds to one state. The entries of $M^{n}$ give the probability of getting from state $j$ (column) to state $i$ (row) in $n$ trials. The rightmost column corresponds to state $I$, and its only nonzero entry is the $1$ in the lower right corner. This means that once we are in state $I$, the probability of staying in $I$ is $1$. We are interested in the probability of getting to state $I$ from state $A$ in $n=25$ steps, which corresponds to the lower left entry of $M^{25}$ (i.e. $M^{25}_{91}$). All we have to do now is calculate $M^{25}$. We can do that in R with the matrix power function from the expm package:

library(expm)

k <- 8   # desired number of correct trials in a row
p <- 1/3 # probability of getting a correct trial
n <- 25  # Total number of trials 

# Set up the transition matrix M
M <- matrix(0, k+1, k+1)
M[ 1, 1:k ] <- (1-p)
M[ k+1, k+1 ] <- 1
for( i in 2:(k+1) ) {
  M[i, i-1] <- p
}

# Name the columns and rows according to the states (A-I)

colnames(M) <- rownames(M) <- LETTERS[ 1:(k+1) ]

round(M,2)

     A    B    C    D    E    F    G    H I
A 0.67 0.67 0.67 0.67 0.67 0.67 0.67 0.67 0
B 0.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0
C 0.00 0.33 0.00 0.00 0.00 0.00 0.00 0.00 0
D 0.00 0.00 0.33 0.00 0.00 0.00 0.00 0.00 0
E 0.00 0.00 0.00 0.33 0.00 0.00 0.00 0.00 0
F 0.00 0.00 0.00 0.00 0.33 0.00 0.00 0.00 0
G 0.00 0.00 0.00 0.00 0.00 0.33 0.00 0.00 0
H 0.00 0.00 0.00 0.00 0.00 0.00 0.33 0.00 0
I 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.33 1

# Calculate M^25

Mn <- M%^%n
Mn[ (k+1), 1 ]
[1] 0.001879284

The probability of getting from state $A$ to state $I$ in 25 steps is $0.001879284$ , as established by the other answers.
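
The same column-stochastic convention can be sketched in Python without a matrix-power routine, by pushing the state distribution through $M$ one trial at a time. This is a hedged illustration, not a translation of the R code: the function name `prob_reach_absorbing` is hypothetical, and plain lists are used in place of a matrix library.

```python
def prob_reach_absorbing(k=8, p=1/3, n=25):
    """Propagate the state distribution n steps through the (k+1)-state chain."""
    size = k + 1
    # M[i][j] = probability of moving from state j (column) to state i (row)
    M = [[0.0] * size for _ in range(size)]
    for j in range(k):
        M[0][j] = 1 - p      # an incorrect trial returns to state A
        M[j + 1][j] = p      # a correct trial advances to the next state
    M[k][k] = 1.0            # state I is absorbing

    v = [1.0] + [0.0] * k    # start in state A with certainty
    for _ in range(n):
        # one trial: new distribution is M times the old one
        v = [sum(M[i][j] * v[j] for j in range(size)) for i in range(size)]
    return v[k]              # probability mass sitting in state I after n trials


print(prob_reach_absorbing())
```

Repeated matrix-vector products cost about $k^2$ operations per trial, which is fine here; the matrix power is the better choice when $n$ is very large.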

— COOLSerdash

3

Here is some R code that I wrote to simulate this:

tmpfun <- function() {
     x <- rbinom(25, 1, 1/3)  
     rx <- rle(x)
     any( rx$lengths[ rx$values==1 ] >= 8 )
}

tmpfun2 <- function() {
    any( replicate(8, tmpfun()) )
}

mean(replicate(100000, tmpfun2()))

I am getting values a little smaller than your formula, so one of us may have made a mistake somewhere.

— Greg Snow

Does your function include trials where it is impossible to get 8 in a row right, e.g. where the "run" started on trial 20?

— Michelle

Most likely me, my R simulation is giving me smaller values as well. I'm just curious if there is an algebraic solution to solve this as a simple probability issue in case someone disputes a simulation.

— AcidNynex

1

I think this answer would be improved by providing the output you obtained so that it can be compared. Of course, including something like a histogram in addition would be even better! The code looks right to me at first glance. Cheers. :)

— cardinal

3

Here is a Mathematica simulation for the Markov chain approach, note that Mathematica indexes by $1$ not $0$ :

M = Table[e[i, j] /. {
    e[9, 1] :> 0,
    e[9, 9] :> 1,
    e[_, 1] :> (1 - p),
    e[_, _] /; j == i + 1 :> p,
    e[_, _] :> 0
  }, {i, 1, 9}, {j, 1, 9}];

x = MatrixPower[M, 25][[1, 9]] // Expand

This would yield the analytical answer:

$18 p^8 - 17 p^9 - 45 p^{16} + 81 p^{17} - 36 p^{18}$

Evaluating at $p=\frac{1.0}{3.0}$

x /. p -> 1/3 // N

Will return $0.00187928$

This can also be evaluated directly using builtin Probability and DiscreteMarkovProcess Mathematica functions:

Probability[k[25] == 9, Distributed[k, DiscreteMarkovProcess[1, M /. p -> 1/3]]] // N

Which will get us the same answer: $0.00187928$
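
For readers without Mathematica, the expanded polynomial in $p$ can also be recovered with integer arithmetic alone. The Python sketch below is an illustration under stated assumptions: it runs the recurrence $f(j,m) = p\,f(j+1,m-1) + (1-p)\,f(0,m-1)$ on coefficient lists (index = power of $p$) rather than using the matrix above, and the helper names `poly_add`, `times_p`, `times_q` and `run_polynomial` are hypothetical.

```python
def poly_add(a, b):
    """Add two polynomials stored as coefficient lists (index = power of p)."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def times_p(a):
    """Multiply a polynomial by p."""
    return [0] + a


def times_q(a):
    """Multiply a polynomial by (1 - p)."""
    return poly_add(a, [-c for c in times_p(a)])


def run_polynomial(k=8, n=25):
    """Nonzero coefficients {power: coefficient} of P(run of k in n trials)."""
    # f[j] = polynomial for current streak j with no trials left:
    # zero unless the run is already complete (j == k).
    f = [[0] for _ in range(k)] + [[1]]
    for _ in range(n):
        # add one more available trial to every state
        new = [poly_add(times_p(f[j + 1]), times_q(f[0])) for j in range(k)]
        new.append([1])        # a completed run stays completed
        f = new
    return {power: c for power, c in enumerate(f[0]) if c != 0}


print(run_polynomial())
```

The returned dictionary maps each power of $p$ to its integer coefficient, so it can be compared term by term with the expansion shown above.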

— Hossam Karim