PLSA+EM
- With the latent topic variable $z_k$ added, the joint and conditional probabilities are:

$$p(d_i, z_k, w_j) = p(d_i)\, p(z_k \mid d_i)\, p(w_j \mid z_k)$$

$$\begin{array}{c} P(w_j \mid d_i) = \sum_{k=1}^{K} P(z_k \mid d_i)\, P(w_j \mid z_k) \\ P(d_i, w_j) = P(d_i) \sum_{k=1}^{K} P(w_j \mid z_k)\, P(z_k \mid d_i) \end{array}$$
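The mixture formula for $P(w_j \mid d_i)$ can be illustrated with a minimal pure-Python sketch. The function name `word_given_doc`, the nested-list layout (`p_z_d[i][k]`, `p_w_z[k][j]`), and the toy probabilities are assumptions made for this illustration only.

```python
def word_given_doc(p_z_d, p_w_z):
    """P(w_j | d_i) = sum_k P(z_k | d_i) * P(w_j | z_k).

    p_z_d[i][k] holds P(z_k | d_i); p_w_z[k][j] holds P(w_j | z_k).
    """
    n_topics = len(p_w_z)
    n_words = len(p_w_z[0])
    return [
        [sum(row[k] * p_w_z[k][j] for k in range(n_topics)) for j in range(n_words)]
        for row in p_z_d
    ]


# Toy parameters (illustrative values): 2 documents, 2 topics, 3 words.
p_z_d = [[0.5, 0.5], [1.0, 0.0]]
p_w_z = [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]]
p_w_d = word_given_doc(p_z_d, p_w_z)
# p_w_d[0] is [0.25, 0.375, 0.375]; p_w_d[1] is [0.5, 0.25, 0.25]
```

Each row of `p_w_d` is itself a probability distribution over words, since it is a convex combination of the topic-word distributions.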
- This gives the log-likelihood, where $n(d_i, w_j)$ is the number of times word $w_j$ occurs in document $d_i$:

$$L = \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ n(d_i, w_j) \log P(d_i) + n(d_i, w_j) \log \sum_{k=1}^{K} P(w_j \mid z_k)\, P(z_k \mid d_i) \right]$$
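As a rough sketch, the log-likelihood can be evaluated directly from a count matrix. The function name `log_likelihood`, the list layout, and the toy numbers below are assumptions for illustration; pairs with a zero count are skipped because they contribute nothing to the sum.

```python
import math


def log_likelihood(counts, p_d, p_z_d, p_w_z):
    """L = sum_ij n(d_i, w_j) * [log P(d_i) + log sum_k P(w_j|z_k) P(z_k|d_i)]."""
    n_topics = len(p_w_z)
    total = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            if n_ij == 0:
                continue  # zero counts contribute nothing to the sum
            mix = sum(p_z_d[i][k] * p_w_z[k][j] for k in range(n_topics))
            total += n_ij * (math.log(p_d[i]) + math.log(mix))
    return total


# Toy data (illustrative): counts[i][j] = n(d_i, w_j).
counts = [[2, 1, 1], [0, 2, 2]]
p_d = [0.5, 0.5]
p_z_d = [[0.5, 0.5], [0.0, 1.0]]
p_w_z = [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]]
ll = log_likelihood(counts, p_d, p_z_d, p_w_z)
```

The value is always non-positive because every term is a count multiplied by the log of a probability.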
- E-step: compute the posterior probability of the latent variable. From the joint probability in the first step:

$$\gamma(z_{ijk}) = p(z_k \mid d_i, w_j) = \frac{p(d_i)\, p(z_k \mid d_i)\, p(w_j \mid z_k)}{\sum_{k=1}^{K} p(d_i)\, p(z_k \mid d_i)\, p(w_j \mid z_k)}$$

The factor $p(d_i)$ does not depend on $k$, so it cancels between the numerator and the denominator:

$$\gamma(z_{ijk}) = \frac{p(z_k \mid d_i)\, p(w_j \mid z_k)}{\sum_{k=1}^{K} p(z_k \mid d_i)\, p(w_j \mid z_k)}$$
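A minimal sketch of the E-step follows, assuming the parameters are stored as nested lists (`p_z_d[i][k]`, `p_w_z[k][j]`). The function name `e_step` and the uniform fallback for a zero denominator are choices of this sketch, not part of the derivation.

```python
def e_step(p_z_d, p_w_z):
    """Return gamma[i][j][k] = P(z_k | d_i, w_j)."""
    n_topics = len(p_w_z)
    n_words = len(p_w_z[0])
    gamma = []
    for row in p_z_d:
        doc_gamma = []
        for j in range(n_words):
            joint = [row[k] * p_w_z[k][j] for k in range(n_topics)]
            norm = sum(joint)
            if norm > 0:
                doc_gamma.append([x / norm for x in joint])
            else:
                # No topic can generate w_j in this document: fall back to uniform
                # (an assumption of this sketch; the derivation does not cover it).
                doc_gamma.append([1.0 / n_topics] * n_topics)
        gamma.append(doc_gamma)
    return gamma


# Toy parameters (illustrative): 1 document, 2 topics, 3 words.
gamma = e_step([[0.5, 0.5]], [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]])
# gamma[0][0] is [1.0, 0.0]: only the first topic can emit the first word.
```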
- M-step

(1) Form the Q function. For a single $(d_i, w_j)$ pair, the expected complete-data log-likelihood is:

$$\sum_{k=1}^{K} \gamma(z_{ijk}) \log p(d_i, z_k, w_j) = \sum_{k=1}^{K} \gamma(z_{ijk}) \left( \log p(z_k \mid d_i)\, p(w_j \mid z_k) + \log p(d_i) \right)$$

Because $\sum_{k=1}^{K} \gamma(z_{ijk}) = 1$, the last term reduces to $\log p(d_i)$, which does not involve $p(z_k \mid d_i)$ or $p(w_j \mid z_k)$. It is a constant for the optimization and can be dropped, leaving:

$$\sum_{k=1}^{K} \gamma(z_{ijk}) \left( \log p(z_k \mid d_i)\, p(w_j \mid z_k) \right)$$

(2) Over all samples:

$$Q = \sum_{i=1}^{N} \sum_{j=1}^{M} n(d_i, w_j) \sum_{k=1}^{K} \gamma(z_{ijk}) \left( \log p(z_k \mid d_i)\, p(w_j \mid z_k) \right)$$
(3) Maximize the Q function subject to the constraints $\sum_{k=1}^{K} p(z_k \mid d) = 1$ and $\sum_{w \in V} p(w \mid z_k) = 1$. Each constraint yields one update, as follows.

1) For $p(z_k \mid d_i)$, use the method of Lagrange multipliers:

$$Lg = Q(\theta, \theta^{old}) + \lambda \left( \sum_{k=1}^{K} p(z_k \mid d_i) - 1 \right)$$

2) Take the partial derivative with respect to $p(z_k \mid d_i)$, set it to zero, and multiply through by $p(z_k \mid d_i)$:

$$-\sum_{j=1}^{M} n(d_i, w_j)\, \gamma(z_{ijk}) = \lambda\, p(z_k \mid d_i)$$

3) Sum both sides over $k$. Since $\sum_{k=1}^{K} \gamma(z_{ijk}) = 1$ and $\sum_{k=1}^{K} p(z_k \mid d_i) = 1$, this gives:

$$\lambda = -\sum_{j=1}^{M} n(d_i, w_j)$$

4) Substitute $\lambda$ back into the equation from step 2) to obtain the expression for $p(z_k \mid d_i)$:

$$p(z_k \mid d_i) = \frac{\sum_{j=1}^{M} n(d_i, w_j)\, \gamma(z_{ijk})}{\sum_{j=1}^{M} n(d_i, w_j)}$$
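The update for $p(z_k \mid d_i)$ is a count-weighted average of the responsibilities within one document. Here is a minimal sketch, assuming `counts[i][j]` holds $n(d_i, w_j)$, `gamma[i][j][k]` holds $\gamma(z_{ijk})$, and no document is empty; the function name `update_p_z_d` and the toy values are hypothetical.

```python
def update_p_z_d(counts, gamma):
    """P(z_k | d_i) = sum_j n(d_i, w_j) * gamma_ijk / sum_j n(d_i, w_j)."""
    n_topics = len(gamma[0][0])
    p_z_d = []
    for i, row in enumerate(counts):
        doc_len = sum(row)  # sum_j n(d_i, w_j); assumed > 0 (no empty documents)
        p_z_d.append([
            sum(n_ij * gamma[i][j][k] for j, n_ij in enumerate(row)) / doc_len
            for k in range(n_topics)
        ])
    return p_z_d


# Toy data (illustrative): 1 document, 3 words, 2 topics.
counts = [[2, 1, 1]]
gamma = [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]]
p_z_d = update_p_z_d(counts, gamma)
# p_z_d[0] is [0.625, 0.375]
```

The result is normalized over topics automatically, because the responsibilities of each word sum to one.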
Similarly, the method of Lagrange multipliers gives the expression for $p(w_j \mid z_k)$:

1) The Lagrangian, where the constraint normalizes over the words for a fixed topic $z_k$:

$$Lg = Q(\theta, \theta^{old}) + \lambda \left( \sum_{j=1}^{M} p(w_j \mid z_k) - 1 \right)$$

2) Take the partial derivative with respect to $p(w_j \mid z_k)$, set it to zero, and multiply through by $p(w_j \mid z_k)$:

$$-\sum_{i=1}^{N} n(d_i, w_j)\, \gamma(z_{ijk}) = \lambda\, p(w_j \mid z_k)$$

3) Sum both sides over the word index $j$ and use $\sum_{j=1}^{M} p(w_j \mid z_k) = 1$:

$$\lambda = -\sum_{i=1}^{N} \sum_{j=1}^{M} n(d_i, w_j)\, \gamma(z_{ijk})$$

4) Substitute back into the equation from step 2) to obtain:

$$p(w_j \mid z_k) = \frac{\sum_{i=1}^{N} n(d_i, w_j)\, \gamma(z_{ijk})}{\sum_{i=1}^{N} \sum_{j=1}^{M} n(d_i, w_j)\, \gamma(z_{ijk})}$$
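The update for $p(w_j \mid z_k)$ can be sketched the same way: accumulate the responsibility-weighted counts of each word for one topic, then normalize over words. The function name `update_p_w_z`, the list layout, and the toy values are assumptions for illustration, and each topic is assumed to receive a non-zero total weight.

```python
def update_p_w_z(counts, gamma):
    """P(w_j | z_k) = sum_i n(d_i, w_j) * gamma_ijk / sum_ij n(d_i, w_j) * gamma_ijk."""
    n_docs = len(counts)
    n_words = len(counts[0])
    n_topics = len(gamma[0][0])
    p_w_z = []
    for k in range(n_topics):
        weighted = [
            sum(counts[i][j] * gamma[i][j][k] for i in range(n_docs))
            for j in range(n_words)
        ]
        total = sum(weighted)  # assumed > 0: topic k explains at least one token
        p_w_z.append([x / total for x in weighted])
    return p_w_z


# Toy data (illustrative): 2 documents, 3 words, 2 topics.
counts = [[2, 1, 1], [0, 2, 2]]
gamma = [
    [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
    [[0.5, 0.5], [0.0, 1.0], [0.0, 1.0]],
]
p_w_z = update_p_w_z(counts, gamma)
```

Note that the responsibilities of a pair with a zero count (here the first word of the second document) have no effect on the result.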
- To summarize, each iteration of the optimization consists of the following steps:

E-step, compute the posterior probability:

$$\gamma(z_{ijk}) = \frac{p(z_k \mid d_i)\, p(w_j \mid z_k)}{\sum_{k=1}^{K} p(z_k \mid d_i)\, p(w_j \mid z_k)}$$

M-step:

$$p(z_k \mid d_i) = \frac{\sum_{j=1}^{M} n(d_i, w_j)\, \gamma(z_{ijk})}{\sum_{j=1}^{M} n(d_i, w_j)}$$

$$p(w_j \mid z_k) = \frac{\sum_{i=1}^{N} n(d_i, w_j)\, \gamma(z_{ijk})}{\sum_{i=1}^{N} \sum_{j=1}^{M} n(d_i, w_j)\, \gamma(z_{ijk})}$$
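Putting the E-step and the two M-step updates together, a small end-to-end sketch might look like the following. The function names (`plsa_em`, `normalize`), the random initialization, the fixed iteration count, the uniform fallback for an all-zero vector, and the toy count matrix are all assumptions of this sketch, not something the derivation prescribes. The tracked quantity omits the $\log P(d_i)$ term, which is constant with respect to the updated parameters.

```python
import math
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they are all zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)  # fallback chosen for this sketch
    return [v / total for v in values]


def plsa_em(counts, n_topics, n_iter=30, seed=0):
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization (an assumption; the text does not specify one).
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    history = []
    for _ in range(n_iter):
        # E-step: gamma[i][j][k] = P(z_k | d_i, w_j)
        gamma = [[normalize([p_z_d[i][k] * p_w_z[k][j] for k in range(n_topics)])
                  for j in range(n_words)] for i in range(n_docs)]
        # M-step for P(z_k | d_i). Normalizing over k equals dividing by
        # sum_j n(d_i, w_j), because gamma sums to 1 over k.
        p_z_d = [normalize([sum(counts[i][j] * gamma[i][j][k] for j in range(n_words))
                            for k in range(n_topics)]) for i in range(n_docs)]
        # M-step for P(w_j | z_k): normalize the weighted counts over words.
        p_w_z = [normalize([sum(counts[i][j] * gamma[i][j][k] for i in range(n_docs))
                            for j in range(n_words)]) for k in range(n_topics)]
        # Log-likelihood without the constant log P(d_i) term.
        history.append(sum(
            counts[i][j] * math.log(sum(p_z_d[i][k] * p_w_z[k][j]
                                        for k in range(n_topics)))
            for i in range(n_docs) for j in range(n_words) if counts[i][j] > 0))
    return p_z_d, p_w_z, history


# Toy count matrix (illustrative): 4 documents, 4 words, two word groups.
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
p_z_d, p_w_z, history = plsa_em(counts, n_topics=2)
```

Since EM never decreases the likelihood, `history` should be non-decreasing up to floating-point error, which gives a simple sanity check for an implementation.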

