(10,000-Hour Plan) December 1 Summary

December 1 study roundup

Code:

Deep Reinforcement Learning Course: https://github.com/simoninithomas/Deep_reinforcement_learning_Course/tree/master/PPO with Sonic the Hedgehog,

https://medium.com/deep-reinforcement-learning-course/launching-deep-reinforcement-learning-course-v2-0-38fa3c24bcbc

Deep reinforcement learning with PyTorch: https://github.com/sweetice/Deep-reinforcement-learning-with-pytorch, https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch

REINFORCE explained: https://blog.csdn.net/lrt366/article/details/91359230

Related code: https://github.com/chingyaoc/pytorch-REINFORCE/blob/master/assets/algo.png

Algorithm blog posts:

DDPG algorithm explained: https://blog.csdn.net/kenneth_yu/article/details/78478356

Policy gradients: https://developer.ibm.com/zh/articles/ba-lo-deep-introduce-policy-gradient/

Difference between on-policy and off-policy: https://www.zhihu.com/question/57159315#:~:text=On-policy和off-policy,策略，后者则不是。&text=-greedy，则是on-policy。&text=)%EF%BC%8C%E6%9B%B4%E6%96%B0%E7%9A%84%E6%97%B6%E5%80%99%E6%98%AF0,%EF%BC%8C%E5%88%99%E6%98%AFoff%2Dpolicy%E3%80%82

TD algorithm explained: https://zhuanlan.zhihu.com/p/25913410

DQN algorithm: https://blog.csdn.net/qq_30615903/article/details/80744083, https://zhuanlan.zhihu.com/p/21421729

Fix for "No module named ...": https://github.com/openai/spinningup/issues/60

Courses

MIT probability: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041-probabilistic-systems-analysis-and-applied-probability-fall-2010/video-lectures/

Applied Stochastic Processes: Introduction to Probability Models

Probability in Electrical Engineering and Computer Science: An Application-Driven Course

Convex optimization and stochastic processes, CS285.

Related math background: https://www.msra.cn/zh-cn/news/features/book-recommendation-machine-learning-math

Related paper: https://arxiv.org/abs/1701.08936
Original article: https://www.cnblogs.com/ethancode/p/14070612.html
