
Baum-Welch Algorithm

The Baum-Welch algorithm is used in computer science and statistics to find the unknown parameters of a hidden Markov model (HMM). It relies on the forward-backward algorithm and is a special case of the generalized EM algorithm.

The Baum-Welch algorithm for estimating a hidden Markov model

The hidden Markov model is a probabilistic model of a collection of random variables $\{O_1,\ldots,O_t,Q_1,\ldots,Q_t\}$. The variables $O_t$ are known discrete observations, and the variables $Q_t$ are "hidden" discrete quantities. The model rests on two conditional-independence assumptions, which ensure the convergence of this algorithm:

  1. the $t$-th hidden variable, given the $(t-1)$-th hidden variable, is independent of all earlier variables, i.e. $P(Q_t \mid Q_{t-1}, O_{t-1}, \ldots, Q_1, O_1) = P(Q_t \mid Q_{t-1})$;
  2. the $t$-th observation depends only on the $t$-th state, and on nothing earlier, i.e. $P(O_t \mid Q_t, Q_{t-1}, O_{t-1}, \ldots, Q_1, O_1) = P(O_t \mid Q_t)$.

Below, an expectation-maximization procedure is described for finding the maximum-likelihood estimate of the parameters of a hidden Markov model from a given set of observations. This procedure is known as the Baum-Welch algorithm.

$Q_t$ is a discrete random variable taking one of $N$ values $(1,\ldots,N)$. We assume that the underlying Markov chain defined by $P(Q_t \mid Q_{t-1})$ is homogeneous in time, i.e. independent of $t$. Then $P(Q_t \mid Q_{t-1})$ can be written as a time-independent stochastic transition matrix $A = \{a_{ij}\} = p(Q_t = j \mid Q_{t-1} = i)$. The special case $t = 1$ is determined by the initial distribution $\pi_i = P(Q_1 = i)$.

We say that the chain is in state $j$ at time $t$ if $Q_t = j$. A sequence of states is written as $q = (q_1,\ldots,q_T)$, where $q_t \in \{1,\ldots,N\}$ is the state at time $t$.

An observation can take one of $L$ possible values, $O_t \in \{o_1,\ldots,o_L\}$. The probability of a given observation at time $t$ in state $j$ is defined as $b_j(o_t) = P(O_t = o_t \mid Q_t = j)$ ($B = \{b_{ij}\}$ is an $L \times N$ matrix). A given observation sequence $O$ is written as $O = (O_1 = o_1, \ldots, O_T = o_T)$.

Thus, a hidden Markov model is described by $\lambda = (A, B, \pi)$. For a given observation vector $O$, the Baum-Welch algorithm finds $\lambda^{*} = \arg\max_{\lambda} P(O \mid \lambda)$, i.e. the $\lambda$ that maximizes the likelihood of the observations $O$.
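
To make the notation concrete, here is a minimal sketch of the model parameters in Python with numpy. Everything in it (the library choice, the helper name random_stochastic, the toy sizes N and L) is an illustrative assumption, not part of the source; note also that B is stored here as an N x L array indexed as B[j, k] = b_j(o_k).

    import numpy as np

    rng = np.random.default_rng(0)

    N, L = 3, 4  # N hidden states, L observation symbols (toy sizes)

    def random_stochastic(rows, cols, rng):
        """Random row-stochastic matrix: every row sums to 1."""
        m = rng.random((rows, cols))
        return m / m.sum(axis=1, keepdims=True)

    A = random_stochastic(N, N, rng)      # A[i, j] = P(Q_{t+1} = j | Q_t = i)
    B = random_stochastic(N, L, rng)      # B[j, k] = b_j(o_k) = P(O_t = o_k | Q_t = j)
    pi = random_stochastic(1, N, rng)[0]  # pi[i] = P(Q_1 = i)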

Algorithm

Initial data: $\lambda = (A, B, \pi)$ with random initial conditions.

The algorithm iteratively updates the parameters $\lambda$ until they converge to a fixed point.

Forward procedure

Define $\alpha_i(t) = p(O_1 = o_1, \ldots, O_t = o_t, Q_t = i \mid \lambda)$, the probability of observing the sequence $o_1,\ldots,o_t$ and being in state $i$ at time $t$.

$\alpha_i(t)$ can be calculated recursively:

  1. $\alpha_i(1) = \pi_i \cdot b_i(O_1)$;
  2. $\alpha_j(t+1) = b_j(O_{t+1}) \sum_{i=1}^{N} \alpha_i(t) \cdot a_{ij}$.
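
A sketch of this recursion, continuing the numpy conventions introduced above (and switching to 0-based indexing, so alpha[t - 1] in code corresponds to $\alpha(t)$ in the formulas):

    import numpy as np

    def forward(A, B, pi, obs):
        """alpha[t, i] = P(O_1..O_{t+1}, Q_{t+1} = i | lambda), with 0-based t."""
        T, N = len(obs), A.shape[0]
        alpha = np.zeros((T, N))
        alpha[0] = pi * B[:, obs[0]]  # alpha_i(1) = pi_i * b_i(O_1)
        for t in range(1, T):
            # alpha_j(t+1) = b_j(O_{t+1}) * sum_i alpha_i(t) * a_ij
            alpha[t] = B[:, obs[t]] * (alpha[t - 1] @ A)
        return alpha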

Backward procedure

This procedure computes $\beta_i(t) = p(O_{t+1} = o_{t+1}, \ldots, O_T = o_T \mid Q_t = i, \lambda)$, the probability of the remaining observation sequence $o_{t+1},\ldots,o_T$ given that the chain is in state $i$ at time $t$.

$\beta_i(t)$ can be calculated recursively:

  1. $\beta_i(T) = 1$;
  2. $\beta_i(t) = \sum_{j=1}^{N} \beta_j(t+1)\, a_{ij}\, b_j(O_{t+1})$.
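
A matching sketch of the backward recursion, under the same assumed numpy conventions:

    import numpy as np

    def backward(A, B, obs):
        """beta[t, i] = P(O_{t+2}..O_T | Q_{t+1} = i, lambda), with 0-based t."""
        T, N = len(obs), A.shape[0]
        beta = np.zeros((T, N))
        beta[T - 1] = 1.0  # base case: beta_i(T) = 1
        for t in range(T - 2, -1, -1):
            # beta_i(t) = sum_j a_ij * b_j(O_{t+1}) * beta_j(t+1)
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        return beta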

Using $\alpha$ and $\beta$, the following quantities can be calculated:

  • $\gamma_i(t) \equiv p(Q_t = i \mid O, \lambda) = \dfrac{\alpha_i(t)\,\beta_i(t)}{\sum_{j=1}^{N} \alpha_j(t)\,\beta_j(t)},$
  • $\xi_{ij}(t) \equiv p(Q_t = i,\; Q_{t+1} = j \mid O, \lambda) = \dfrac{\alpha_i(t)\, a_{ij}\, \beta_j(t+1)\, b_j(O_{t+1})}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i(t)\, a_{ij}\, \beta_j(t+1)\, b_j(O_{t+1})}.$
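
These posteriors translate directly into code; here is a sketch reusing the forward and backward outputs above (the function name posteriors and variable names are illustrative):

    import numpy as np

    def posteriors(A, B, alpha, beta, obs):
        """gamma[t, i] = p(Q_t = i | O); xi[t, i, j] = p(Q_t = i, Q_{t+1} = j | O)."""
        T, N = alpha.shape
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)  # normalize over states
        xi = np.zeros((T - 1, N, N))
        for t in range(T - 1):
            # numerator: alpha_i(t) * a_ij * b_j(O_{t+1}) * beta_j(t+1)
            x = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
            xi[t] = x / x.sum()  # normalize over all (i, j) pairs
        return gamma, xi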

Having $\gamma$ and $\xi$, one can determine the updated parameters:

  • $\bar{\pi}_i = \gamma_i(1),$
  • $\bar{a}_{ij} = \dfrac{\sum_{t=1}^{T-1} \xi_{ij}(t)}{\sum_{t=1}^{T-1} \gamma_i(t)},$
  • $\bar{b}_i(k) = \dfrac{\sum_{t=1}^{T} \delta_{O_t,\,o_k}\, \gamma_i(t)}{\sum_{t=1}^{T} \gamma_i(t)},$

where $\delta_{O_t,\,o_k}$ equals 1 when $O_t = o_k$ and 0 otherwise.
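
A sketch of this re-estimation step under the same assumed conventions; the Kronecker delta becomes a boolean mask over the observation sequence:

    import numpy as np

    def reestimate(gamma, xi, obs, L):
        """One Baum-Welch re-estimation step from the formulas above."""
        obs = np.asarray(obs)
        pi_new = gamma[0]  # pi_i = gamma_i(1)
        # a_ij: expected i->j transitions over expected visits to i (t = 1..T-1)
        a_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        # b_i(k): expected time in i while emitting o_k over expected time in i
        b_new = np.zeros((gamma.shape[1], L))
        for k in range(L):
            b_new[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
        return pi_new, a_new, b_new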

Using the new values of $A$, $B$ and $\pi$, the iterations continue until convergence.
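
Putting the sketches above together gives the whole iteration, reusing A, B, pi from the setup sketch; the observation sequence is an arbitrary toy example. In practice the forward and backward passes are usually rescaled per time step (or computed in log space) to avoid numerical underflow on long sequences:

    obs = [0, 1, 2, 3, 1, 0, 2]  # toy observation sequence over L = 4 symbols

    for _ in range(100):  # fixed budget; a likelihood-based stopping rule also works
        alpha = forward(A, B, pi, obs)
        beta = backward(A, B, obs)
        gamma, xi = posteriors(A, B, alpha, beta, obs)
        pi, A, B = reestimate(gamma, xi, obs, L)

    # likelihood of the observations under the fitted model: sum_i alpha_i(T)
    print("P(O | lambda) =", forward(A, B, pi, obs)[-1].sum())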


