Open
add answer to the question: What are the two key tricks of DQN?
Author
add answer to the question: What are the two key tricks of DQN?

@MonkeyCode-AI review
MonkeyCode-AI
left a comment
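Pull Request Overview
- This PR modifies the file "强化学习.md", filling in the description of DQN's two key tricks by replacing the previous TODO item with a concrete explanation.
Pull Request Change Details
| File path | Change type | Change description |
|---|---|---|
| docs/强化学习.md | Modified | Completed the description of DQN's two key tricks, adding explanations of Replay buffer and Fixed Q-targets |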
docs/强化学习.md
Outdated
The description of DQN's two key tricks is accurate, but the wording could be refined further to improve readability.
Suggested change
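| - Fixed Q-targets: when updating the Q-network parameters, the parameters used to compute $q_{target}$ are the parameters $\theta_{i-1}$ from the previous iteration, while the current q value is produced by the Q network with parameters $\theta_{i}$; this is also a mechanism for breaking correlation. | |
| Replay buffer (experience replay): during training, experiences are stored in a replay pool and sampled at random to update the network parameters. This breaks the correlation between samples and also improves sample efficiency (a single sample may be used in several parameter updates). | |
| Fixed Q-targets (fixed Q targets): when updating the Q-network parameters, the parameters used to compute $q_{target}$ are the parameters $\theta_{i-1}$ from the previous iteration, while the current q value is produced by the Q network with parameters $\theta_{i}$. This is also a mechanism for breaking correlation. |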
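
As a rough illustration of the two tricks described in the suggestion above, here is a minimal sketch. It is not taken from the PR or from "强化学习.md". It assumes a tabular Q function (a dict mapping each state to a list of action values) as a stand-in for the Q network, and the names `ReplayBuffer`, `td_targets`, `train_step`, and `sync` are hypothetical. The target table plays the role of $\theta_{i-1}$ and the online table the role of $\theta_{i}$.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal ordering of transitions,
        # and a stored transition can be drawn again in later batches.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def td_targets(batch, target_q, gamma=0.99):
    """Compute r + gamma * max_a' Q(s', a') from the frozen target table."""
    targets = []
    for _state, _action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target_q[next_state])
        targets.append(reward + bootstrap)
    return targets


def train_step(online_q, target_q, batch, lr=0.1, gamma=0.99):
    """Move the online estimates toward targets; target_q is left unchanged."""
    targets = td_targets(batch, target_q, gamma)
    for (state, action, _r, _s2, _done), y in zip(batch, targets):
        online_q[state][action] += lr * (y - online_q[state][action])


def sync(online_q, target_q):
    """Copy the online values into the target table (done every few steps)."""
    for state, values in online_q.items():
        target_q[state] = list(values)
```

In a typical loop under these assumptions, transitions are pushed into the buffer as the agent acts, `train_step` is called on a sampled batch once the buffer holds enough data, and `sync` is called only every fixed number of steps so the targets stay stable between syncs.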