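RLFromHumanPrefrences: Reinforcement learning from human preferences to produce behavior that is not in line with the environment reward, learning from human preferences through the Garner tool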

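[File properties]:

File name: RLFromHumanPrefrences: Reinforcement learning from human preferences to produce behavior that is not in line with the environment reward, learning from human preferences through the Garner tool

File size: 400KB

File format: ZIP

Updated: 2024-03-05 01:58:22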

Python

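RLFromHumanPrefrences: reinforcement learning from human preferences, producing behavior that is not in line with the environment reward by learning from human preferences through the Garner tool.

Requirements: Python 3 (it may be compatible with Python 2, but I have not tested it).

To install the requirements, run:

# PyTorch
conda install pytorch torchvision -c soumith

# Baselines for Atari preprocessing
git clone https://github.com/openai/baselines.git
cd baselines
pip install -e .

# Other requirements
pip install -r requirements.txt

Code adapted from the following repository:

Paper: Deep reinforcement learning from human preferences

Uses the Garner tool to replicate OpenAI and DeepMind's project, to have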
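The paper named above trains a reward predictor from pairwise human comparisons of trajectory segments. As a rough illustration only, the following is a minimal sketch of that preference loss in plain Python. It is not taken from this repository's reward_predictor.py; the function names `preference_probability` and `preference_loss` are hypothetical, and the per-step rewards are assumed to come from some reward model that is not shown here.

```python
import math


def preference_probability(rewards_a, rewards_b):
    """Probability that segment A is preferred over segment B.

    Assumes a Bradley-Terry style model over the summed predicted
    rewards of each segment (hypothetical helper, not the repo's API).
    """
    sum_a = sum(rewards_a)
    sum_b = sum(rewards_b)
    # Subtract the larger sum before exponentiating for numerical stability.
    shift = max(sum_a, sum_b)
    exp_a = math.exp(sum_a - shift)
    exp_b = math.exp(sum_b - shift)
    return exp_a / (exp_a + exp_b)


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy between the predicted preference and a human label.

    label is a pair (mu_a, mu_b): (1, 0) if the human preferred A,
    (0, 1) if B, and (0.5, 0.5) if the segments were judged equal.
    """
    p_a = preference_probability(rewards_a, rewards_b)
    p_b = 1.0 - p_a
    eps = 1e-12  # guards against log(0)
    mu_a, mu_b = label
    return -(mu_a * math.log(p_a + eps) + mu_b * math.log(p_b + eps))


if __name__ == "__main__":
    # Two short segments with made-up predicted rewards.
    segment_a = [0.5, 1.0, 0.5]
    segment_b = [0.0, 0.2, -0.1]
    print(preference_probability(segment_a, segment_b))
    print(preference_loss(segment_a, segment_b, (1, 0)))
```

In a full training loop this loss would be minimized with respect to the reward model's parameters, and the policy would then be trained against the predicted reward instead of the environment reward.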


[File preview]:
RLFromHumanPrefrences-main
----.ipynb_checkpoints()
--------reward_predictor-checkpoint.py(6KB)
--------pref_db-checkpoint.py(7KB)
----evaluation.py(2KB)
----baselines()
----main.py(7KB)
----wandb()
--------run-20201119_204157-3ouumq8n()
--------run-20201119_204621-2ohukqv8()
----requirements.txt(15B)
----main-old.py(8KB)
----training.ipynb(13KB)
----LICENSE(1KB)
----reward_predictor.py(6KB)
----README.md(1KB)
----pref_db.py(7KB)
----download.gif(209KB)
----Garner-python()
----pref_work.ipynb(19KB)
----.gitignore(1KB)
----a2c_ppo_acktr()
--------kfac.py(8KB)
--------arguments.py(5KB)
--------utils.py(2KB)
--------model.py(7KB)
--------__init__.py(0B)
--------storage.py(10KB)
--------a2c_acktr.py(3KB)
--------envs.py(8KB)
--------distributions.py(3KB)


