• Analysis of the run.py module in the baselines algorithm library


    Repository of the baselines algorithm library:

    https://gitee.com/devilmaycry812839668/baselines

    ========================================

    Analysis of the code in the run.py module of the baselines library:

    Record all gym game environments, grouped by environment type:

    _game_envs = defaultdict(set)
    for env in gym.envs.registry.all():
        # TODO: solve this with regexes
        env_type = env.entry_point.split(':')[0].split('.')[-1]
        _game_envs[env_type].add(env.id)

    Test, printing each environment type together with its set of environment ids.

    Printed output:

    algorithmic {'ReversedAddition-v0', 'RepeatCopy-v0', 'DuplicatedInput-v0', 'ReversedAddition3-v0', 'Copy-v0', 'Reverse-v0'}
    classic_control {'CartPole-v0', 'CartPole-v1', 'Acrobot-v1', 'Pendulum-v0', 'MountainCarContinuous-v0', 'MountainCar-v0'}
    box2d {'BipedalWalker-v3', 'LunarLander-v2', 'LunarLanderContinuous-v2', 'CarRacing-v0', 'BipedalWalkerHardcore-v3'}
    toy_text {'KellyCoinflipGeneralized-v0', 'KellyCoinflip-v0', 'GuessingGame-v0', 'Taxi-v3', 'CliffWalking-v0', 'NChain-v0', 'FrozenLake-v0', 'FrozenLake8x8-v0', 'HotterColder-v0', 'Roulette-v0', 'Blackjack-v0'}
    mujoco {'Ant-v2', 'Hopper-v2', 'Reacher-v2', 'InvertedDoublePendulum-v2', 'Humanoid-v2', 'HumanoidStandup-v2', 'Walker2d-v2', 'Thrower-v2', 'InvertedPendulum-v2', 'HalfCheetah-v2', 'Swimmer-v2', 'Striker-v2', 'Pusher-v2'}
    half_cheetah_v3 {'HalfCheetah-v3'}
    hopper_v3 {'Hopper-v3'}
    swimmer_v3 {'Swimmer-v3'}
    walker2d_v3 {'Walker2d-v3'}
    ant_v3 {'Ant-v3'}
    humanoid_v3 {'Humanoid-v3'}
    robotics {'HandManipulateBlockRotateZTouchSensorsDense-v1', 'FetchReach-v1', 'FetchPush-v1', 'HandManipulateBlockTouchSensors-v0', 'HandManipulatePenFull-v0', 'FetchPickAndPlace-v1', 'HandManipulateBlockTouchSensors-v1', 'HandManipulateBlockRotateParallelTouchSensorsDense-v1', 'HandManipulateBlockRotateParallel-v0', 'HandManipulateBlockRotateXYZ-v0', 'FetchSlideDense-v1', 'HandManipulatePenRotateTouchSensors-v1', 'HandManipulateBlockFullDense-v0', 'HandManipulateEggFullDense-v0', 'HandManipulateBlockRotateZTouchSensors-v0', 'HandManipulatePenDense-v0', 'HandManipulateEggDense-v0', 'HandManipulatePenRotateTouchSensors-v0', 'HandReachDense-v0', 'HandManipulateBlockRotateZTouchSensorsDense-v0', 'HandManipulateEggTouchSensors-v1', 'FetchReachDense-v1', 'HandManipulateEggRotateTouchSensorsDense-v1', 'HandManipulatePenRotateTouchSensorsDense-v1', 'HandManipulateEggRotate-v0', 'HandManipulateEggRotateDense-v0', 'HandManipulateBlockRotateXYZTouchSensors-v0', 'HandManipulateBlockTouchSensorsDense-v1', 'HandManipulateEggTouchSensorsDense-v0', 'HandManipulateBlockRotateXYZDense-v0', 'HandManipulateBlock-v0', 'HandReach-v0', 'HandManipulateBlockRotateParallelDense-v0', 'HandManipulateBlockRotateParallelTouchSensors-v0', 'HandManipulateBlockRotateZTouchSensors-v1', 'HandManipulateEggFull-v0', 'HandManipulatePen-v0', 'HandManipulateEggRotateTouchSensors-v1', 'HandManipulateBlockRotateZDense-v0', 'HandManipulateBlockRotateXYZTouchSensorsDense-v1', 'HandManipulatePenRotateDense-v0', 'HandManipulatePenRotate-v0', 'HandManipulateBlockRotateParallelTouchSensorsDense-v0', 'HandManipulatePenTouchSensorsDense-v0', 'HandManipulatePenTouchSensors-v1', 'HandManipulateEggTouchSensorsDense-v1', 'HandManipulatePenTouchSensorsDense-v1', 'HandManipulateBlockRotateXYZTouchSensors-v1', 'HandManipulateBlockRotateParallelTouchSensors-v1', 'HandManipulatePenFullDense-v0', 'HandManipulateEggTouchSensors-v0', 'HandManipulateEgg-v0', 'HandManipulateBlockRotateZ-v0', 
'HandManipulateBlockTouchSensorsDense-v0', 'FetchSlide-v1', 'HandManipulatePenRotateTouchSensorsDense-v0', 'FetchPushDense-v1', 'HandManipulateBlockFull-v0', 'HandManipulateBlockDense-v0', 'HandManipulateEggRotateTouchSensors-v0', 'HandManipulateEggRotateTouchSensorsDense-v0', 'FetchPickAndPlaceDense-v1', 'HandManipulatePenTouchSensors-v0', 'HandManipulateBlockRotateXYZTouchSensorsDense-v0'}
    atari {'ElevatorAction-ramDeterministic-v0', 'RoadRunner-ramDeterministic-v0', 'HeroDeterministic-v4', 'Bowling-ramNoFrameskip-v4', 'Bowling-ramDeterministic-v0', 'Assault-ramDeterministic-v0', 'MsPacman-v4', 'MsPacmanDeterministic-v0', 'BowlingNoFrameskip-v4', 'Atlantis-ram-v4', 'Boxing-v4', 'ChopperCommand-ramDeterministic-v4', 'NameThisGame-ramDeterministic-v4', 'JourneyEscapeNoFrameskip-v4', 'Solaris-ramDeterministic-v4', 'CrazyClimber-ramNoFrameskip-v4', 'NameThisGame-ramNoFrameskip-v4', 'QbertNoFrameskip-v4', 'Gravitar-v4', 'ZaxxonNoFrameskip-v4', 'DoubleDunkDeterministic-v0', 'KangarooDeterministic-v4', 'Pitfall-ram-v0', 'BattleZoneDeterministic-v4', 'Amidar-v4', 'Enduro-ramDeterministic-v0', 'StarGunner-ram-v0', 'BankHeistDeterministic-v4', 'BattleZone-ramDeterministic-v0', 'ChopperCommand-ram-v0', 'Jamesbond-ramNoFrameskip-v0', 'DoubleDunk-v4', 'Pong-v0', 'Qbert-v0', 'Tutankham-ramNoFrameskip-v0', 'BattleZone-ramNoFrameskip-v0', 'Defender-ramDeterministic-v4', 'Kangaroo-ram-v0', 'MontezumaRevengeNoFrameskip-v4', 'Frostbite-ramNoFrameskip-v4', 'Hero-ramDeterministic-v0', 'QbertDeterministic-v4', 'EnduroDeterministic-v0', 'RobotankDeterministic-v4', 'SolarisDeterministic-v4', 'Enduro-ramDeterministic-v4', 'Asterix-ram-v4', 'Asterix-ramDeterministic-v4', 'CrazyClimberDeterministic-v4', 'KrullNoFrameskip-v4', 'Freeway-ram-v4', 'JourneyEscape-v4', 'Asterix-ramNoFrameskip-v0', 'DemonAttack-v0', 'Boxing-ram-v4', 'SpaceInvaders-ram-v4', 'Kangaroo-v0', 'Solaris-v4', 'Breakout-ram-v4', 'KungFuMasterDeterministic-v4', 'CrazyClimber-ramNoFrameskip-v0', 'IceHockey-ramDeterministic-v0', 'Adventure-ramNoFrameskip-v4', 'HeroNoFrameskip-v0', 'PrivateEyeNoFrameskip-v0', 'SpaceInvaders-ramDeterministic-v4', 'Freeway-ramNoFrameskip-v0', 'AirRaid-ram-v4', 'Zaxxon-v0', 'Riverraid-v0', 'Robotank-ramNoFrameskip-v4', 'AmidarDeterministic-v0', 'WizardOfWor-ram-v4', 'VideoPinball-v4', 'BeamRider-ramNoFrameskip-v0', 'DefenderDeterministic-v0', 'Centipede-ram-v0', 
'AirRaidNoFrameskip-v4', 'BankHeist-ramDeterministic-v4', 'JourneyEscape-ramDeterministic-v4', 'TimePilot-ramNoFrameskip-v4', 'MsPacman-ramDeterministic-v4', 'Skiing-v0', 'UpNDownDeterministic-v4', 'Pong-ramNoFrameskip-v0', 'Centipede-ramNoFrameskip-v0', 'AirRaidDeterministic-v0', 'KungFuMaster-ramNoFrameskip-v0', 'PrivateEye-ramNoFrameskip-v4', 'AssaultNoFrameskip-v4', 'WizardOfWor-ramNoFrameskip-v0', 'BeamRiderDeterministic-v4', 'PhoenixDeterministic-v0', 'PongDeterministic-v4', 'Jamesbond-v0', 'VideoPinball-ram-v0', 'RoadRunner-ram-v0', 'GopherNoFrameskip-v0', 'Assault-v4', 'Defender-ram-v4', 'UpNDown-ram-v0', 'MontezumaRevenge-ram-v0', 'Defender-ram-v0', 'BankHeist-ramNoFrameskip-v4', 'DoubleDunk-ramNoFrameskip-v0', 'VentureDeterministic-v0', 'MontezumaRevenge-ramDeterministic-v4', 'Adventure-ramDeterministic-v0', 'MsPacman-ramNoFrameskip-v0', 'Krull-ram-v4', 'EnduroNoFrameskip-v4', 'JamesbondNoFrameskip-v0', 'Atlantis-ramNoFrameskip-v4', 'ElevatorAction-ramNoFrameskip-v4', 'DemonAttackNoFrameskip-v4', 'AdventureNoFrameskip-v4', 'Carnival-ramNoFrameskip-v4', 'PitfallDeterministic-v4', 'TimePilot-v4', 'Solaris-ramNoFrameskip-v0', 'PitfallDeterministic-v0', 'DoubleDunkNoFrameskip-v0', 'UpNDown-ramDeterministic-v0', 'JourneyEscapeNoFrameskip-v0', 'Asteroids-v0', 'NameThisGame-v0', 'BattleZone-v0', 'Tennis-v0', 'ChopperCommandNoFrameskip-v4', 'YarsRevenge-ram-v4', 'Hero-v4', 'PongNoFrameskip-v4', 'NameThisGame-ram-v0', 'Assault-ramNoFrameskip-v4', 'RiverraidDeterministic-v4', 'ZaxxonDeterministic-v4', 'VentureNoFrameskip-v0', 'SpaceInvadersNoFrameskip-v4', 'Tutankham-ram-v0', 'BreakoutNoFrameskip-v0', 'AirRaid-ramDeterministic-v4', 'AsteroidsDeterministic-v0', 'BankHeistNoFrameskip-v4', 'Bowling-ramNoFrameskip-v0', 'Alien-ramDeterministic-v4', 'Alien-ramNoFrameskip-v0', 'IceHockey-v0', 'AirRaid-v0', 'SeaquestNoFrameskip-v4', 'Atlantis-ramDeterministic-v0', 'SpaceInvaders-ram-v0', 'Jamesbond-v4', 'Kangaroo-ramNoFrameskip-v0', 'AlienNoFrameskip-v4', 'Phoenix-ram-v0', 
'GravitarNoFrameskip-v0', 'PitfallNoFrameskip-v4', 'DoubleDunk-ram-v4', 'StarGunner-v0', 'CrazyClimber-ramDeterministic-v0', 'Robotank-ram-v0', 'Breakout-ramNoFrameskip-v0', 'AsterixNoFrameskip-v0', 'AtlantisNoFrameskip-v4', 'Boxing-ramNoFrameskip-v0', 'GopherDeterministic-v4', 'Gravitar-ram-v0', 'YarsRevenge-ramDeterministic-v0', 'KungFuMaster-ramDeterministic-v0', 'VideoPinballDeterministic-v4', 'Asteroids-ramNoFrameskip-v0', 'PhoenixNoFrameskip-v0', 'Jamesbond-ramNoFrameskip-v4', 'RiverraidNoFrameskip-v0', 'Pong-ramDeterministic-v4', 'TimePilot-ramDeterministic-v0', 'IceHockeyDeterministic-v4', 'Freeway-ram-v0', 'Adventure-ram-v4', 'FishingDerby-v4', 'Hero-ramNoFrameskip-v0', 'Seaquest-ramNoFrameskip-v4', 'SpaceInvadersDeterministic-v0', 'WizardOfWor-ram-v0', 'SpaceInvadersNoFrameskip-v0', 'Zaxxon-ram-v4', 'SpaceInvaders-v4', 'FreewayNoFrameskip-v0', 'BoxingNoFrameskip-v4', 'TimePilot-ram-v0', 'MontezumaRevengeNoFrameskip-v0', 'BreakoutDeterministic-v4', 'Defender-ramNoFrameskip-v4', 'TutankhamDeterministic-v0', 'Venture-ram-v0', 'Pong-v4', 'Robotank-v4', 'Atlantis-v4', 'Skiing-ram-v4', 'MsPacman-ramNoFrameskip-v4', 'AssaultNoFrameskip-v0', 'NameThisGame-ramDeterministic-v0', 'BattleZone-ramNoFrameskip-v4', 'Krull-ramDeterministic-v4', 'Breakout-ram-v0', 'MontezumaRevenge-ramNoFrameskip-v0', 'AssaultDeterministic-v0', 'Qbert-ram-v0', 'Seaquest-ram-v4', 'Seaquest-v4', 'Solaris-ram-v4', 'Solaris-ram-v0', 'RoadRunnerDeterministic-v4', 'CarnivalDeterministic-v4', 'Hero-ramDeterministic-v4', 'Amidar-ramNoFrameskip-v0', 'YarsRevenge-ramNoFrameskip-v4', 'GopherNoFrameskip-v4', 'FishingDerbyNoFrameskip-v0', 'AirRaidDeterministic-v4', 'Asterix-v4', 'MsPacmanNoFrameskip-v4', 'Bowling-v0', 'KungFuMaster-ram-v4', 'GopherDeterministic-v0', 'AirRaidNoFrameskip-v0', 'BankHeist-ram-v0', 'SkiingDeterministic-v0', 'StarGunnerDeterministic-v0', 'FishingDerby-v0', 'StarGunner-ramDeterministic-v0', 'RobotankNoFrameskip-v4', 'KangarooNoFrameskip-v0', 'TutankhamNoFrameskip-v0', 
'Robotank-ramNoFrameskip-v0', 'AirRaid-ram-v0', 'Assault-ramDeterministic-v4', 'FishingDerby-ramNoFrameskip-v4', 'RoadRunner-v0', 'Krull-ram-v0', 'Pooyan-ramDeterministic-v0', 'Pitfall-ramNoFrameskip-v0', 'UpNDownDeterministic-v0', 'AsterixDeterministic-v4', 'Kangaroo-v4', 'SkiingNoFrameskip-v0', 'Gopher-v4', 'CarnivalDeterministic-v0', 'Solaris-ramNoFrameskip-v4', 'VideoPinball-ramNoFrameskip-v0', 'FrostbiteNoFrameskip-v4', 'RoadRunnerNoFrameskip-v4', 'ElevatorAction-ram-v0', 'Defender-ramNoFrameskip-v0', 'Frostbite-ramDeterministic-v4', 'EnduroNoFrameskip-v0', 'Bowling-ram-v0', 'IceHockey-ramNoFrameskip-v0', 'Defender-v0', 'CarnivalNoFrameskip-v4', 'Zaxxon-ramNoFrameskip-v0', 'Frostbite-ramDeterministic-v0', 'ZaxxonDeterministic-v0', 'IceHockey-ramDeterministic-v4', 'AsteroidsNoFrameskip-v0', 'Phoenix-v0', 'DoubleDunk-v0', 'Gravitar-ramNoFrameskip-v4', 'NameThisGameDeterministic-v0', 'BeamRiderNoFrameskip-v0', 'JamesbondDeterministic-v0', 'BattleZoneNoFrameskip-v4', 'YarsRevenge-ramDeterministic-v4', 'Breakout-v4', 'Adventure-ramDeterministic-v4', 'FrostbiteNoFrameskip-v0', 'Freeway-v4', 'ElevatorActionDeterministic-v0', 'BankHeist-v4', 'Frostbite-ram-v0', 'PongDeterministic-v0', 'AirRaid-v4', 'GravitarDeterministic-v4', 'UpNDown-ramNoFrameskip-v4', 'DefenderNoFrameskip-v4', 'Pitfall-ramDeterministic-v4', 'IceHockeyNoFrameskip-v0', 'JourneyEscape-ramNoFrameskip-v0', 'Pitfall-v4', 'Qbert-ram-v4', 'YarsRevenge-ram-v0', 'BeamRider-ramNoFrameskip-v4', 'BeamRider-ram-v0', 'PhoenixNoFrameskip-v4', 'KrullNoFrameskip-v0', 'Seaquest-v0', 'AmidarDeterministic-v4', 'SpaceInvaders-v0', 'SolarisNoFrameskip-v0', 'Venture-v4', 'Pong-ramNoFrameskip-v4', 'Enduro-v4', 'Pooyan-ram-v4', 'AlienDeterministic-v4', 'MsPacman-ram-v0', 'Phoenix-ram-v4', 'Seaquest-ramDeterministic-v0', 'Amidar-ramDeterministic-v0', 'Bowling-v4', 'SpaceInvadersDeterministic-v4', 'Gopher-ram-v0', 'JourneyEscape-ramNoFrameskip-v4', 'Phoenix-ramNoFrameskip-v4', 'RoadRunner-ramNoFrameskip-v0', 'Boxing-ram-v0', 
'DoubleDunk-ramDeterministic-v4', 'JamesbondNoFrameskip-v4', 'Breakout-ramDeterministic-v4', 'Phoenix-v4', 'SeaquestDeterministic-v0', 'QbertDeterministic-v0', 'Krull-v0', 'Jamesbond-ramDeterministic-v0', 'Riverraid-ramNoFrameskip-v0', 'IceHockeyDeterministic-v0', 'Phoenix-ramNoFrameskip-v0', 'PrivateEye-v4', 'Centipede-v0', 'CrazyClimberNoFrameskip-v0', 'Jamesbond-ram-v4', 'SkiingDeterministic-v4', 'Freeway-ramNoFrameskip-v4', 'BattleZone-v4', 'Carnival-ramDeterministic-v4', 'IceHockeyNoFrameskip-v4', 'KrullDeterministic-v0', 'Venture-ramDeterministic-v4', 'Kangaroo-ramDeterministic-v4', 'YarsRevengeNoFrameskip-v4', 'IceHockey-ram-v4', 'DemonAttack-ramNoFrameskip-v0', 'PrivateEye-ram-v4', 'Alien-v0', 'AirRaid-ramNoFrameskip-v4', 'UpNDown-v0', 'SpaceInvaders-ramNoFrameskip-v4', 'VideoPinball-ramDeterministic-v0', 'Frostbite-v4', 'Tutankham-v0', 'SeaquestDeterministic-v4', 'Atlantis-ramDeterministic-v4', 'Gopher-ramDeterministic-v0', 'Hero-ram-v4', 'UpNDown-ramDeterministic-v4', 'NameThisGameNoFrameskip-v4', 'TimePilot-v0', 'FrostbiteDeterministic-v0', 'Amidar-ram-v4', 'ChopperCommandNoFrameskip-v0', 'PitfallNoFrameskip-v0', 'CrazyClimber-v4', 'RoadRunner-ramNoFrameskip-v4', 'BankHeistNoFrameskip-v0', 'Tutankham-v4', 'KungFuMasterNoFrameskip-v4', 'YarsRevengeDeterministic-v0', 'HeroDeterministic-v0', 'AdventureDeterministic-v0', 'Berzerk-ram-v4', 'Centipede-ram-v4', 'Skiing-ramDeterministic-v4', 'VideoPinball-ramNoFrameskip-v4', 'AdventureNoFrameskip-v0', 'DefenderDeterministic-v4', 'Alien-ramNoFrameskip-v4', 'ElevatorAction-v0', 'PrivateEyeDeterministic-v4', 'MontezumaRevenge-ram-v4', 'Asterix-ramNoFrameskip-v4', 'Alien-v4', 'TennisNoFrameskip-v0', 'Atlantis-v0', 'Krull-ramNoFrameskip-v4', 'BerzerkDeterministic-v4', 'Tennis-v4', 'VideoPinballNoFrameskip-v0', 'Zaxxon-ramDeterministic-v0', 'NameThisGameNoFrameskip-v0', 'JourneyEscape-ramDeterministic-v0', 'Pooyan-ramNoFrameskip-v4', 'Pitfall-v0', 'Riverraid-ramDeterministic-v0', 'VideoPinballDeterministic-v0', 
'Tutankham-ramDeterministic-v4', 'Phoenix-ramDeterministic-v0', 'Tennis-ramNoFrameskip-v4', 'Hero-ram-v0', 'Frostbite-ram-v4', 'TennisNoFrameskip-v4', 'Carnival-ramDeterministic-v0', 'DemonAttack-ram-v4', 'JourneyEscape-ram-v4', 'UpNDown-v4', 'PooyanNoFrameskip-v4', 'Centipede-ramDeterministic-v4', 'Seaquest-ramDeterministic-v4', 'Pooyan-v4', 'AlienNoFrameskip-v0', 'Pong-ram-v4', 'UpNDownNoFrameskip-v0', 'AsterixNoFrameskip-v4', 'Breakout-ramDeterministic-v0', 'Venture-ramNoFrameskip-v4', 'Pong-ram-v0', 'Adventure-ramNoFrameskip-v0', 'Asteroids-ram-v4', 'Carnival-v0', 'PooyanNoFrameskip-v0', 'PrivateEye-ramNoFrameskip-v0', 'Qbert-v4', 'TutankhamNoFrameskip-v4', 'Skiing-ramNoFrameskip-v0', 'Riverraid-v4', 'VentureNoFrameskip-v4', 'CentipedeNoFrameskip-v4', 'Adventure-v4', 'MsPacmanNoFrameskip-v0', 'TimePilotDeterministic-v0', 'BattleZoneDeterministic-v0', 'BankHeist-ramNoFrameskip-v0', 'FishingDerbyDeterministic-v4', 'Boxing-ramDeterministic-v4', 'Asteroids-ramDeterministic-v0', 'PooyanDeterministic-v4', 'CrazyClimber-ram-v4', 'RoadRunner-v4', 'Amidar-v0', 'DemonAttackDeterministic-v0', 'Riverraid-ram-v0', 'PrivateEye-ramDeterministic-v0', 'RiverraidDeterministic-v0', 'RobotankNoFrameskip-v0', 'Seaquest-ram-v0', 'CarnivalNoFrameskip-v0', 'FishingDerby-ramDeterministic-v0', 'KangarooNoFrameskip-v4', 'FishingDerbyNoFrameskip-v4', 'ElevatorActionDeterministic-v4', 'Enduro-v0', 'AtlantisDeterministic-v0', 'KungFuMasterDeterministic-v0', 'Breakout-ramNoFrameskip-v4', 'PrivateEyeDeterministic-v0', 'BowlingNoFrameskip-v0', 'DoubleDunkNoFrameskip-v4', 'AsteroidsNoFrameskip-v4', 'CrazyClimberDeterministic-v0', 'Assault-ram-v0', 'Venture-ramNoFrameskip-v0', 'Gravitar-v0', 'GravitarNoFrameskip-v4', 'StarGunner-v4', 'KungFuMaster-ramDeterministic-v4', 'Tennis-ram-v4', 'Asterix-ramDeterministic-v0', 'Venture-ramDeterministic-v0', 'Tennis-ramDeterministic-v0', 'SolarisDeterministic-v0', 'VentureDeterministic-v4', 'VideoPinball-ramDeterministic-v4', 'Gravitar-ramDeterministic-v0', 
'IceHockey-ramNoFrameskip-v4', 'ChopperCommand-ramDeterministic-v0', 'Asteroids-v4', 'ElevatorActionNoFrameskip-v0', 'BeamRider-ram-v4', 'Qbert-ramDeterministic-v0', 'Robotank-ramDeterministic-v4', 'Hero-ramNoFrameskip-v4', 'Skiing-ramNoFrameskip-v4', 'SkiingNoFrameskip-v4', 'Berzerk-v4', 'ChopperCommand-ramNoFrameskip-v0', 'BoxingDeterministic-v4', 'Venture-ram-v4', 'KungFuMaster-v0', 'PrivateEyeNoFrameskip-v4', 'HeroNoFrameskip-v4', 'Asteroids-ramDeterministic-v4', 'Krull-ramDeterministic-v0', 'GravitarDeterministic-v0', 'Kangaroo-ramDeterministic-v0', 'Gopher-ramNoFrameskip-v0', 'Breakout-v0', 'TimePilotDeterministic-v4', 'Jamesbond-ram-v0', 'Freeway-ramDeterministic-v4', 'JamesbondDeterministic-v4', 'JourneyEscapeDeterministic-v4', 'Gopher-v0', 'Tutankham-ramNoFrameskip-v4', 'KungFuMasterNoFrameskip-v0', 'SpaceInvaders-ramNoFrameskip-v0', 'Enduro-ram-v0', 'YarsRevengeNoFrameskip-v0', 'AirRaid-ramNoFrameskip-v0', 'Tutankham-ram-v4', 'Kangaroo-ram-v4', 'FreewayDeterministic-v4', 'RoadRunnerNoFrameskip-v0', 'RoadRunnerDeterministic-v0', 'CentipedeNoFrameskip-v0', 'AssaultDeterministic-v4', 'WizardOfWorDeterministic-v0', 'AsterixDeterministic-v0', 'SolarisNoFrameskip-v4', 'FishingDerbyDeterministic-v0', 'ChopperCommand-ramNoFrameskip-v4', 'Enduro-ramNoFrameskip-v4', 'Bowling-ramDeterministic-v4', 'YarsRevenge-ramNoFrameskip-v0', 'UpNDownNoFrameskip-v4', 'BankHeistDeterministic-v0', 'Tennis-ramNoFrameskip-v0', 'CrazyClimber-ram-v0', 'EnduroDeterministic-v4', 'TennisDeterministic-v0', 'Riverraid-ram-v4', 'Amidar-ram-v0', 'MsPacman-ram-v4', 'AirRaid-ramDeterministic-v0', 'Adventure-v0', 'Enduro-ramNoFrameskip-v0', 'KangarooDeterministic-v0', 'BreakoutNoFrameskip-v4', 'AtlantisNoFrameskip-v0', 'TimePilot-ramDeterministic-v4', 'StarGunner-ramDeterministic-v4', 'UpNDown-ram-v4', 'Carnival-v4', 'Frostbite-ramNoFrameskip-v0', 'ZaxxonNoFrameskip-v0', 'MontezumaRevengeDeterministic-v4', 'Gopher-ramDeterministic-v4', 'Asterix-v0', 'Riverraid-ramNoFrameskip-v4', 'Hero-v0', 
'QbertNoFrameskip-v0', 'Jamesbond-ramDeterministic-v4', 'KungFuMaster-v4', 'Seaquest-ramNoFrameskip-v0', 'ChopperCommandDeterministic-v4', 'StarGunnerNoFrameskip-v4', 'BeamRider-v0', 'BattleZone-ram-v0', 'AsteroidsDeterministic-v4', 'FishingDerby-ram-v4', 'WizardOfWorNoFrameskip-v0', 'Alien-ram-v0', 'AtlantisDeterministic-v4', 'Qbert-ramNoFrameskip-v0', 'AlienDeterministic-v0', 'Kangaroo-ramNoFrameskip-v4', 'DoubleDunkDeterministic-v4', 'WizardOfWor-v4', 'MontezumaRevenge-ramNoFrameskip-v4', 'BerzerkNoFrameskip-v4', 'YarsRevenge-v4', 'DemonAttack-v4', 'NameThisGameDeterministic-v4', 'DoubleDunk-ram-v0', 'Zaxxon-ramDeterministic-v4', 'Robotank-v0', 'Pooyan-ramDeterministic-v4', 'DoubleDunk-ramDeterministic-v0', 'Krull-ramNoFrameskip-v0', 'TimePilot-ram-v4', 'DemonAttackDeterministic-v4', 'Pitfall-ramDeterministic-v0', 'PooyanDeterministic-v0', 'Carnival-ram-v0', 'Pooyan-v0', 'ChopperCommand-ram-v4', 'AmidarNoFrameskip-v0', 'BerzerkNoFrameskip-v0', 'ElevatorAction-ramDeterministic-v4', 'KrullDeterministic-v4', 'TimePilotNoFrameskip-v4', 'Enduro-ram-v4', 'Tennis-ram-v0', 'Carnival-ram-v4', 'ChopperCommand-v4', 'MontezumaRevenge-v4', 'TimePilotNoFrameskip-v0', 'Skiing-ram-v0', 'BankHeist-ram-v4', 'BankHeist-ramDeterministic-v0', 'JourneyEscape-v0', 'Krull-v4', 'Tutankham-ramDeterministic-v0', 'MsPacmanDeterministic-v4', 'Alien-ramDeterministic-v0', 'ChopperCommandDeterministic-v0', 'Pooyan-ramNoFrameskip-v0', 'Bowling-ram-v4', 'Riverraid-ramDeterministic-v4', 'MontezumaRevenge-ramDeterministic-v0', 'PrivateEye-ram-v0', 'ElevatorAction-ram-v4', 'SpaceInvaders-ramDeterministic-v0', 'Pooyan-ram-v0', 'Berzerk-ramNoFrameskip-v4', 'CentipedeDeterministic-v0', 'AdventureDeterministic-v4', 'PhoenixDeterministic-v4', 'YarsRevengeDeterministic-v4', 'ElevatorActionNoFrameskip-v4', 'PrivateEye-v0', 'FreewayNoFrameskip-v4', 'Asterix-ram-v0', 'NameThisGame-ram-v4', 'ElevatorAction-ramNoFrameskip-v0', 'BeamRiderDeterministic-v0', 'ChopperCommand-v0', 'DemonAttack-ram-v0', 
'BowlingDeterministic-v4', 'StarGunnerDeterministic-v4', 'JourneyEscapeDeterministic-v0', 'StarGunnerNoFrameskip-v0', 'Zaxxon-ram-v0', 'Freeway-v0', 'Defender-v4', 'Centipede-ramNoFrameskip-v4', 'Robotank-ramDeterministic-v0', 'IceHockey-ram-v0', 'NameThisGame-ramNoFrameskip-v0', 'StarGunner-ramNoFrameskip-v0', 'Pitfall-ramNoFrameskip-v4', 'WizardOfWor-ramDeterministic-v0', 'BeamRider-ramDeterministic-v0', 'MontezumaRevengeDeterministic-v0', 'RoadRunner-ramDeterministic-v4', 'Solaris-v0', 'Venture-v0', 'Amidar-ramNoFrameskip-v4', 'Assault-ram-v4', 'Alien-ram-v4', 'Atlantis-ramNoFrameskip-v0', 'RoadRunner-ram-v4', 'Assault-ramNoFrameskip-v0', 'Qbert-ramNoFrameskip-v4', 'BattleZone-ramDeterministic-v4', 'Pong-ramDeterministic-v0', 'MsPacman-v0', 'FishingDerby-ramNoFrameskip-v0', 'BattleZone-ram-v4', 'YarsRevenge-v0', 'Gopher-ramNoFrameskip-v4', 'FishingDerby-ram-v0', 'DoubleDunk-ramNoFrameskip-v4', 'BeamRider-v4', 'DefenderNoFrameskip-v0', 'Gopher-ram-v4', 'Qbert-ramDeterministic-v4', 'Gravitar-ram-v4', 'VideoPinballNoFrameskip-v4', 'TennisDeterministic-v4', 'BeamRiderNoFrameskip-v4', 'WizardOfWorDeterministic-v4', 'Frostbite-v0', 'TutankhamDeterministic-v4', 'ElevatorAction-v4', 'MontezumaRevenge-v0', 'Amidar-ramDeterministic-v4', 'Skiing-ramDeterministic-v0', 'Zaxxon-v4', 'Skiing-v4', 'Assault-v0', 'Berzerk-v0', 'Asteroids-ramNoFrameskip-v4', 'CrazyClimber-v0', 'DemonAttack-ramDeterministic-v4', 'WizardOfWor-ramNoFrameskip-v4', 'FishingDerby-ramDeterministic-v4', 'Boxing-v0', 'KungFuMaster-ram-v0', 'VideoPinball-ram-v4', 'BattleZoneNoFrameskip-v0', 'BankHeist-v0', 'Berzerk-ram-v0', 'Carnival-ramNoFrameskip-v0', 'MsPacman-ramDeterministic-v0', 'Asteroids-ram-v0', 'PongNoFrameskip-v0', 'DemonAttack-ramNoFrameskip-v4', 'WizardOfWorNoFrameskip-v4', 'Atlantis-ram-v0', 'Freeway-ramDeterministic-v0', 'BoxingNoFrameskip-v0', 'WizardOfWor-ramDeterministic-v4', 'KungFuMaster-ramNoFrameskip-v4', 'Zaxxon-ramNoFrameskip-v4', 'WizardOfWor-v0', 'Berzerk-ramNoFrameskip-v0', 
'Phoenix-ramDeterministic-v4', 'Gravitar-ramNoFrameskip-v0', 'CentipedeDeterministic-v4', 'DemonAttackNoFrameskip-v0', 'BreakoutDeterministic-v0', 'NameThisGame-v4', 'AmidarNoFrameskip-v4', 'RiverraidNoFrameskip-v4', 'Tennis-ramDeterministic-v4', 'VideoPinball-v0', 'BerzerkDeterministic-v0', 'Boxing-ramDeterministic-v0', 'Berzerk-ramDeterministic-v0', 'UpNDown-ramNoFrameskip-v0', 'DemonAttack-ramDeterministic-v0', 'Gravitar-ramDeterministic-v4', 'TimePilot-ramNoFrameskip-v0', 'FrostbiteDeterministic-v4', 'BeamRider-ramDeterministic-v4', 'Robotank-ram-v4', 'Defender-ramDeterministic-v0', 'Centipede-ramDeterministic-v0', 'FreewayDeterministic-v0', 'IceHockey-v4', 'StarGunner-ram-v4', 'Berzerk-ramDeterministic-v4', 'RobotankDeterministic-v0', 'Centipede-v4', 'PrivateEye-ramDeterministic-v4', 'Adventure-ram-v0', 'CrazyClimber-ramDeterministic-v4', 'Solaris-ramDeterministic-v0', 'BowlingDeterministic-v0', 'SeaquestNoFrameskip-v0', 'Pitfall-ram-v4', 'JourneyEscape-ram-v0', 'Boxing-ramNoFrameskip-v4', 'BoxingDeterministic-v0', 'StarGunner-ramNoFrameskip-v4', 'CrazyClimberNoFrameskip-v4'}
    unittest {'CubeCrashSparse-v0', 'CubeCrash-v0', 'CubeCrashScreenBecomesBlack-v0', 'MemorizeDigits-v0'}
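
    Keys such as half_cheetah_v3 appear next to mujoco because of how the type is cut out of entry_point. The following is a minimal sketch, assuming entry points of the form 'package.module:ClassName'. The three (id, entry_point) pairs are written by hand for illustration and are not read from a gym installation:

```python
from collections import defaultdict

# Hand-written (env_id, entry_point) pairs; illustrative assumptions,
# not read from an installed gym registry.
sample_registry = [
    ('CartPole-v0', 'gym.envs.classic_control:CartPoleEnv'),
    ('Reacher-v2', 'gym.envs.mujoco:ReacherEnv'),
    ('HalfCheetah-v3', 'gym.envs.mujoco.half_cheetah_v3:HalfCheetahEnv'),
]

game_envs = defaultdict(set)
for env_id, entry_point in sample_registry:
    module_path = entry_point.split(':')[0]   # drop the ':ClassName' part
    env_type = module_path.split('.')[-1]     # keep only the last module component
    game_envs[env_type].add(env_id)

for env_type, ids in game_envs.items():
    print(env_type, ids)
```

    With these inputs the third entry lands under half_cheetah_v3 rather than mujoco, since only the last dotted component of the module path is kept.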

    ================================

    Function:

    def get_default_network(env_type):
        if env_type in {'atari', 'retro'}:
            return 'cnn'
        else:
            return 'mlp'

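    Returns the name of the network architecture to use, based on the input argument env_type: 'cnn' (CNN) for atari and retro, 'mlp' (MLP) for everything else.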

    ===========================================

    Function:

    def get_alg_module(alg, submodule=None):
        submodule = submodule or alg
        try:
            # first try to import the alg module from baselines
            alg_module = import_module('.'.join(['baselines', alg, submodule]))
        except ImportError:
            # then from rl_algs
            alg_module = import_module('.'.join(['rl_' + 'algs', alg, submodule]))
    
        return alg_module

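    get_alg_module takes the algorithm name alg as a string and imports the matching module dynamically. For example, with alg set to 'deepq' it effectively runs import baselines.deepq.deepq and returns that module object; if that import fails, it falls back to rl_algs.deepq.deepq.

    At program start-up this function identifies which algorithm to run and loads it from the algorithm library by name.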
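
    The import-by-name pattern with a fallback can be tried without baselines installed. This is a minimal sketch using only standard-library packages; the helper name import_with_fallback and the package name no_such_package_xyz are hypothetical and chosen only for the demonstration:

```python
from importlib import import_module

def import_with_fallback(primary_pkg, fallback_pkg, name, submodule=None):
    """Import '<pkg>.<name>.<submodule>', trying primary_pkg first."""
    submodule = submodule or name
    try:
        return import_module('.'.join([primary_pkg, name, submodule]))
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # primary package ends up here and the fallback package is tried.
        return import_module('.'.join([fallback_pkg, name, submodule]))

# 'no_such_package_xyz' is assumed not to exist, so the fallback 'xml' is used.
mod = import_with_fallback('no_such_package_xyz', 'xml', 'etree', 'ElementTree')
print(mod.__name__)
```

    The returned value is an ordinary module object, so attributes such as learn can be read from it with normal attribute access.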

    Function:

    def get_learn_function(alg):
        return get_alg_module(alg).learn

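    Returns the learn function of the selected reinforcement learning algorithm module (the function is returned here, not called).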

    Function:

    def get_learn_function_defaults(alg, env_type):
        try:
            alg_defaults = get_alg_module(alg, 'defaults')
            kwargs = getattr(alg_defaults, env_type)()
        except (ImportError, AttributeError):
            kwargs = {}
        return kwargs

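    alg is the algorithm to invoke, for example deepq, a2c or acer, given as a string.

    Every algorithm in baselines has its own folder named after the algorithm.

    env_type is the environment type, that is, one of the keys of the _game_envs dictionary built at the top.

    The environment types we use most often are atari and mujoco.

    The function get_learn_function_defaults imports the defaults submodule of the selected algorithm module, calls the function in it whose name matches env_type, and returns the resulting dictionary of algorithm parameters. If the submodule or the function does not exist, it returns an empty dictionary. For deepq, for example, this is the defaults submodule inside the deepq folder.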
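
    The lookup by function name can be sketched as follows. The module object fake_defaults, the helper lookup_defaults and the parameter values are hypothetical stand-ins and are not copied from baselines:

```python
import types

# Stand-in for an algorithm's defaults submodule; the contents are
# hypothetical values used only to demonstrate the lookup.
fake_defaults = types.ModuleType('fake_defaults')

def atari():
    return dict(network='conv_only', lr=1e-4)

fake_defaults.atari = atari

def lookup_defaults(defaults_module, env_type):
    """Call the function named after env_type, or return {} if it is missing."""
    try:
        return getattr(defaults_module, env_type)()
    except AttributeError:
        return {}

print(lookup_defaults(fake_defaults, 'atari'))   # dictionary from atari()
print(lookup_defaults(fake_defaults, 'mujoco'))  # no such function: empty dict
```

    Returning an empty dictionary for unknown environment types means the algorithm simply runs with the default values of its own learn signature.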

    Function:

    def get_env_type(args):
        env_id = args.env
    
        if args.env_type is not None:
            return args.env_type, env_id
    
        # Re-parse the gym registry, since we could have new envs since last time.
        for env in gym.envs.registry.all():
            env_type = env.entry_point.split(':')[0].split('.')[-1]
            _game_envs[env_type].add(env.id)  # This is a set so add is idempotent
    
        if env_id in _game_envs.keys():
            env_type = env_id
            env_id = [g for g in _game_envs[env_type]][0]
        else:
            env_type = None
            for g, e in _game_envs.items():
                if env_id in e:
                    env_type = g
                    break
            if ':' in env_id:
                env_type = re.sub(r':.*', '', env_id)
            assert env_type is not None, 'env_id {} is not recognized in env types'.format(env_id, _game_envs.keys())
    
        return env_type, env_id

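    Determines the environment type env_type that the given environment name env_id belongs to. If --env_type is passed explicitly it is used as is. If env_id is itself the name of an environment type, one environment of that type is picked. An id of the form 'module:EnvName' takes the part before the colon as its type.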
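
    The resolution rules can be exercised on a tiny hand-made registry. This is a simplified sketch, not the baselines function itself: game_envs and resolve_env_type are hypothetical names, the registry contents are made up, and sorted() is used to pick a deterministic environment where the original takes an arbitrary element of the set:

```python
import re
from collections import defaultdict

# Hypothetical miniature registry standing in for _game_envs.
game_envs = defaultdict(set)
game_envs['atari'].update({'PongNoFrameskip-v4', 'BreakoutNoFrameskip-v4'})
game_envs['mujoco'].update({'Reacher-v2', 'Hopper-v2'})

def resolve_env_type(env_id, env_type=None):
    """Return (env_type, env_id) following the rules described above."""
    if env_type is not None:              # explicit type wins
        return env_type, env_id
    if env_id in game_envs:               # a type name was given as the id
        return env_id, sorted(game_envs[env_id])[0]
    found = None
    for group, ids in game_envs.items():  # search every known type
        if env_id in ids:
            found = group
            break
    if ':' in env_id:                     # 'module:EnvName' form
        found = re.sub(r':.*', '', env_id)
    if found is None:
        raise ValueError('env_id {} is not recognized'.format(env_id))
    return found, env_id

print(resolve_env_type('Reacher-v2'))
print(resolve_env_type('atari'))
print(resolve_env_type('mypkg:MyEnv-v0'))
```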

    Parsing of the command-line arguments:

    def arg_parser():
        """
        Create an empty argparse.ArgumentParser.
        """
        import argparse
        return argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    
    def common_arg_parser():
        """
        Create an argparse.ArgumentParser for run_mujoco.py.
        """
        parser = arg_parser()
        parser.add_argument('--env', help='environment ID', type=str, default='Reacher-v2')
        parser.add_argument('--env_type', help='type of environment, used when the environment type cannot be automatically determined', type=str)
        parser.add_argument('--seed', help='RNG seed', type=int, default=None)
        parser.add_argument('--alg', help='Algorithm', type=str, default='ppo2')
        parser.add_argument('--num_timesteps', type=float, default=1e6)
        parser.add_argument('--network', help='network type (mlp, cnn, lstm, cnn_lstm, conv_only)', default=None)
        parser.add_argument('--gamestate', help='game state to load (so far only used in retro games)', default=None)
        parser.add_argument('--num_env', help='Number of environment copies being run in parallel. When not specified, set to number of cpus for Atari, and to 1 for Mujoco', default=None, type=int)
        parser.add_argument('--reward_scale', help='Reward scale factor. Default: 1.0', default=1.0, type=float)
        parser.add_argument('--save_path', help='Path to save trained model to', default=None, type=str)
        parser.add_argument('--save_video_interval', help='Save video every x steps (0 = disabled)', default=0, type=int)
        parser.add_argument('--save_video_length', help='Length of recorded video. Default: 200', default=200, type=int)
        parser.add_argument('--log_path', help='Directory to save learning curve data.', default=None, type=str)
        parser.add_argument('--play', default=False, action='store_true')
        return parser
    
    def parse_unknown_args(args):
        """
        Parse arguments not consumed by arg parser into a dictionary
        """
        retval = {}
        preceded_by_key = False
        for arg in args:
            if arg.startswith('--'):
                if '=' in arg:
                    key = arg.split('=')[0][2:]
                    value = arg.split('=')[1]
                    retval[key] = value
                else:
                    key = arg[2:]
                    preceded_by_key = True
            elif preceded_by_key:
                retval[key] = arg
                preceded_by_key = False
    
        return retval
    
    def parse_cmdline_kwargs(args):
        '''
        convert a list of '='-spaced command-line arguments to a dictionary, evaluating python objects when possible
        '''
        def parse(v):
    
            assert isinstance(v, str)
            try:
                return eval(v)
            except (NameError, SyntaxError):
                return v
    
        return {k: parse(v) for k,v in parse_unknown_args(args).items()}
    
    
    arg_parser = common_arg_parser()
    args, unknown_args = arg_parser.parse_known_args()
    extra_args = parse_cmdline_kwargs(unknown_args)
    
    print(args)
    print(unknown_args)
    print(extra_args)

    Run:

    python test.py --aaa=me --xxx=11.11  --abc=True   --cde=1+99

    Output:

    Namespace(env='Reacher-v2', env_type=None, seed=None, alg='ppo2', num_timesteps=1000000.0, network=None, gamestate=None, num_env=None, reward_scale=1.0, save_path=None, save_video_interval=0, save_video_length=200, log_path=None, play=False)
    ['--aaa=me', '--xxx=11.11', '--abc=True', '--cde=1+99']
    {'aaa': 'me', 'xxx': 11.11, 'abc': True, 'cde': 100}

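    The function parse_unknown_args converts the list of strings that the parser did not consume into a dict. Here it turns the list:

    ['--aaa=me', '--xxx=11.11', '--abc=True', '--cde=1+99']

    into the dictionary (all values are still strings):

    {'aaa': 'me', 'xxx': '11.11', 'abc': 'True', 'cde': '1+99'}

    The expression {k: parse(v) for k,v in parse_unknown_args(args).items()}

    then converts each string value into the corresponding Python object via eval, keeping it as a string when eval raises NameError or SyntaxError. This gives the final dictionary:

    {'aaa': 'me', 'xxx': 11.11, 'abc': True, 'cde': 100}

    That said, making this string parsing so elaborate does not buy much here. Parsing undeclared arguments is a niche operation to begin with, and it has to be said that this part of baselines is not especially concise or streamlined.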
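
    For comparison, here is a compact sketch of the same two steps that also covers the space-separated '--key value' form, which the test run above does not exercise. It is not the baselines implementation: parse_pairs and to_literal are hypothetical names, values are split on the first '=' only, and ast.literal_eval is swapped in for eval so that only Python literals are converted:

```python
import ast

def parse_pairs(argv):
    """Collect '--key=value' and '--key value' pairs into a dict of strings."""
    result = {}
    pending_key = None
    for arg in argv:
        if arg.startswith('--'):
            if '=' in arg:
                key, value = arg[2:].split('=', 1)  # split on the first '=' only
                result[key] = value
                pending_key = None
            else:
                pending_key = arg[2:]               # value expected in the next item
        elif pending_key is not None:
            result[pending_key] = arg
            pending_key = None
    return result

def to_literal(text):
    """Convert a string to a Python literal when possible, else keep the string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text

raw = parse_pairs(['--lr', '3e-4', '--nsteps=128', '--network=mlp'])
typed = {k: to_literal(v) for k, v in raw.items()}
print(raw)
print(typed)
```

    Unlike eval, ast.literal_eval does not execute arbitrary expressions, so this variant trades the ability to pass computed values for stricter handling of command-line input.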

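    baselines is also compatible with the retro module, which means console games such as Nintendo titles can be installed. The names of the retro games therefore have to be added to the dictionary of supported environments: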

    _game_envs['retro'] = {
        'BubbleBobble-Nes',
        'SuperMarioBros-Nes',
        'TwinBee3PokoPokoDaimaou-Nes',
        'SpaceHarrier-Nes',
        'SonicTheHedgehog-Genesis',
        'Vectorman-Genesis',
        'FinalFight-Snes',
        'SpaceInvaders-Snes',
    }

    Print the complete set of environment types and environment names:

    import gym
    from collections import defaultdict
    
    _game_envs = defaultdict(set)
    for env in gym.envs.registry.all():
        # TODO: solve this with regexes
        env_type = env.entry_point.split(':')[0].split('.')[-1]
        _game_envs[env_type].add(env.id)
    
    _game_envs['retro'] = {
        'BubbleBobble-Nes',
        'SuperMarioBros-Nes',
        'TwinBee3PokoPokoDaimaou-Nes',
        'SpaceHarrier-Nes',
        'SonicTheHedgehog-Genesis',
        'Vectorman-Genesis',
        'FinalFight-Snes',
        'SpaceInvaders-Snes',
    }
    
    for env_type in _game_envs:
        print(env_type, '\n', list(_game_envs[env_type]))
        print('......')

    Printed output:

    algorithmic 
     ['DuplicatedInput-v0', 'ReversedAddition-v0', 'Copy-v0', 'RepeatCopy-v0', 'Reverse-v0', 'ReversedAddition3-v0']
    ......
    classic_control 
     ['CartPole-v1', 'CartPole-v0', 'MountainCar-v0', 'Acrobot-v1', 'MountainCarContinuous-v0', 'Pendulum-v0']
    ......
    box2d 
     ['LunarLander-v2', 'LunarLanderContinuous-v2', 'CarRacing-v0', 'BipedalWalkerHardcore-v3', 'BipedalWalker-v3']
    ......
    toy_text 
     ['FrozenLake-v0', 'KellyCoinflip-v0', 'Blackjack-v0', 'NChain-v0', 'GuessingGame-v0', 'FrozenLake8x8-v0', 'Taxi-v3', 'HotterColder-v0', 'CliffWalking-v0', 'Roulette-v0', 'KellyCoinflipGeneralized-v0']
    ......
    mujoco 
     ['Walker2d-v2', 'Swimmer-v2', 'Humanoid-v2', 'InvertedDoublePendulum-v2', 'Reacher-v2', 'Hopper-v2', 'HalfCheetah-v2', 'Striker-v2', 'InvertedPendulum-v2', 'Thrower-v2', 'HumanoidStandup-v2', 'Ant-v2', 'Pusher-v2']
    ......
    half_cheetah_v3 
     ['HalfCheetah-v3']
    ......
    hopper_v3 
     ['Hopper-v3']
    ......
    swimmer_v3 
     ['Swimmer-v3']
    ......
    walker2d_v3 
     ['Walker2d-v3']
    ......
    ant_v3 
     ['Ant-v3']
    ......
    humanoid_v3 
     ['Humanoid-v3']
    ......
    robotics 
     ['FetchPush-v1', 'HandManipulateBlockRotateXYZTouchSensorsDense-v1', 'HandManipulateBlockRotateParallelTouchSensorsDense-v0', 'HandManipulateEggRotateTouchSensors-v1', 'HandManipulateBlockTouchSensors-v0', 'HandManipulateEggTouchSensorsDense-v0', 'HandManipulateEggFullDense-v0', 'HandManipulatePenDense-v0', 'HandManipulateBlockTouchSensorsDense-v1', 'HandManipulateBlockRotateParallelTouchSensors-v1', 'HandManipulateEggFull-v0', 'FetchPickAndPlaceDense-v1', 'HandManipulateEggRotateTouchSensors-v0', 'HandManipulateBlockRotateXYZTouchSensors-v0', 'HandManipulateBlockRotateParallelTouchSensors-v0', 'HandManipulateEggTouchSensors-v1', 'FetchPushDense-v1', 'HandManipulateBlockRotateZTouchSensorsDense-v1', 'HandReach-v0', 'HandManipulatePenTouchSensors-v1', 'HandManipulateBlockRotateXYZDense-v0', 'HandManipulateEggRotateTouchSensorsDense-v1', 'HandManipulateBlockRotateXYZTouchSensorsDense-v0', 'HandManipulatePenFullDense-v0', 'FetchPickAndPlace-v1', 'HandManipulatePenRotateTouchSensors-v0', 'HandManipulateBlockRotateZDense-v0', 'HandManipulateEggRotateTouchSensorsDense-v0', 'HandManipulateBlockRotateParallelDense-v0', 'HandManipulateBlock-v0', 'HandManipulateEggTouchSensorsDense-v1', 'HandManipulateBlockRotateXYZTouchSensors-v1', 'FetchReachDense-v1', 'HandManipulatePenTouchSensorsDense-v0', 'HandManipulatePenTouchSensorsDense-v1', 'HandManipulateEgg-v0', 'HandManipulateEggDense-v0', 'HandManipulateBlockRotateZTouchSensors-v1', 'FetchReach-v1', 'HandManipulatePen-v0', 'HandManipulateBlockFullDense-v0', 'HandManipulateBlockFull-v0', 'HandManipulateBlockRotateZ-v0', 'HandManipulateBlockRotateZTouchSensors-v0', 'HandManipulatePenTouchSensors-v0', 'HandManipulateEggRotate-v0', 'HandReachDense-v0', 'HandManipulateBlockRotateXYZ-v0', 'HandManipulateBlockDense-v0', 'FetchSlide-v1', 'HandManipulateBlockRotateParallel-v0', 'FetchSlideDense-v1', 'HandManipulatePenRotate-v0', 'HandManipulatePenFull-v0', 'HandManipulateBlockRotateParallelTouchSensorsDense-v1', 
'HandManipulatePenRotateDense-v0', 'HandManipulateEggRotateDense-v0', 'HandManipulateBlockTouchSensors-v1', 'HandManipulatePenRotateTouchSensorsDense-v0', 'HandManipulateBlockTouchSensorsDense-v0', 'HandManipulatePenRotateTouchSensors-v1', 'HandManipulateEggTouchSensors-v0', 'HandManipulatePenRotateTouchSensorsDense-v1', 'HandManipulateBlockRotateZTouchSensorsDense-v0']
    ......
    atari 
     ['RoadRunner-ramDeterministic-v4', 'YarsRevenge-v4', 'TimePilot-ram-v0', 'Zaxxon-ramDeterministic-v0', 'AsteroidsNoFrameskip-v0', 'MsPacmanNoFrameskip-v0', 'SpaceInvaders-ramDeterministic-v4', 'Gopher-ramNoFrameskip-v0', 'Venture-v4', 'TutankhamNoFrameskip-v0', 'Skiing-ramDeterministic-v0', 'Phoenix-ram-v4', 'Asterix-v4', 'Pong-ram-v4', 'TimePilotNoFrameskip-v0', 'Assault-ramDeterministic-v4', 'Riverraid-ramDeterministic-v0', 'BerzerkDeterministic-v0', 'DefenderDeterministic-v4', 'Gopher-ramDeterministic-v0', 'Gravitar-ramNoFrameskip-v0', 'Skiing-ramDeterministic-v4', 'ChopperCommand-ramDeterministic-v0', 'Atlantis-ramDeterministic-v0', 'DefenderDeterministic-v0', 'BankHeistNoFrameskip-v0', 'ZaxxonDeterministic-v4', 'Solaris-ramNoFrameskip-v4', 'AmidarNoFrameskip-v0', 'HeroDeterministic-v0', 'AsterixDeterministic-v4', 'TimePilot-ramDeterministic-v4', 'IceHockeyDeterministic-v0', 'Centipede-ram-v0', 'BeamRiderNoFrameskip-v0', 'BeamRider-ramDeterministic-v4', 'AtlantisDeterministic-v0', 'MsPacmanDeterministic-v0', 'DefenderNoFrameskip-v4', 'Jamesbond-ramDeterministic-v4', 'Venture-ram-v4', 'QbertDeterministic-v0', 'Solaris-ramDeterministic-v0', 'Solaris-ramNoFrameskip-v0', 'Jamesbond-ramNoFrameskip-v0', 'Krull-v4', 'Tutankham-v4', 'WizardOfWor-ramDeterministic-v0', 'BattleZone-ramNoFrameskip-v4', 'FishingDerbyDeterministic-v4', 'RoadRunner-v4', 'TimePilot-ramDeterministic-v0', 'YarsRevengeDeterministic-v0', 'AsterixNoFrameskip-v4', 'BeamRiderDeterministic-v0', 'Bowling-ramNoFrameskip-v0', 'JourneyEscape-v0', 'TimePilot-v0', 'GopherDeterministic-v4', 'PhoenixDeterministic-v0', 'StarGunner-v4', 'Alien-v0', 'Pong-ramNoFrameskip-v4', 'Frostbite-ram-v4', 'Pooyan-ram-v4', 'Bowling-ramDeterministic-v0', 'RoadRunner-ram-v4', 'TimePilotNoFrameskip-v4', 'Phoenix-ram-v0', 'Assault-ramDeterministic-v0', 'Seaquest-ramNoFrameskip-v4', 'AlienNoFrameskip-v0', 'Defender-ram-v0', 'SolarisDeterministic-v4', 'Tutankham-v0', 'Kangaroo-ramDeterministic-v0', 'PhoenixDeterministic-v4', 
'VideoPinballDeterministic-v0', 'Freeway-ramDeterministic-v4', 'Asteroids-v0', 'JourneyEscape-ram-v0', 'StarGunner-ramNoFrameskip-v4', 'VideoPinball-v4', 'Carnival-ramNoFrameskip-v0', 'Assault-ram-v4', 'PrivateEye-ramDeterministic-v0', 'Venture-v0', 'PrivateEyeDeterministic-v0', 'Alien-ram-v4', 'ElevatorActionNoFrameskip-v0', 'MsPacman-ram-v0', 'Asterix-ram-v0', 'Bowling-ram-v4', 'MontezumaRevenge-ramNoFrameskip-v0', 'VentureDeterministic-v0', 'AmidarDeterministic-v0', 'ElevatorActionNoFrameskip-v4', 'SpaceInvaders-ram-v0', 'ChopperCommandNoFrameskip-v0', 'SeaquestDeterministic-v4', 'StarGunner-ramDeterministic-v0', 'SpaceInvadersDeterministic-v4', 'VideoPinballNoFrameskip-v0', 'MontezumaRevenge-ram-v4', 'SpaceInvaders-ram-v4', 'Frostbite-v4', 'Pong-ramDeterministic-v4', 'Adventure-ramNoFrameskip-v4', 'AdventureDeterministic-v0', 'KrullNoFrameskip-v4', 'NameThisGame-ramNoFrameskip-v0', 'Centipede-v0', 'WizardOfWor-ram-v0', 'AtlantisNoFrameskip-v4', 'AsteroidsDeterministic-v0', 'BeamRiderDeterministic-v4', 'Centipede-ramDeterministic-v4', 'YarsRevenge-ramDeterministic-v0', 'PooyanDeterministic-v0', 'DemonAttack-ramDeterministic-v4', 'RoadRunnerDeterministic-v4', 'PongNoFrameskip-v4', 'CrazyClimber-ram-v4', 'IceHockey-v4', 'Adventure-v0', 'Krull-v0', 'BerzerkNoFrameskip-v0', 'Freeway-v0', 'Amidar-ramNoFrameskip-v0', 'MsPacmanDeterministic-v4', 'GopherDeterministic-v0', 'YarsRevenge-v0', 'Pong-ramDeterministic-v0', 'Robotank-ramDeterministic-v0', 'AtlantisNoFrameskip-v0', 'TennisNoFrameskip-v4', 'VentureNoFrameskip-v4', 'BankHeist-ramDeterministic-v4', 'Gravitar-ramDeterministic-v4', 'PrivateEye-v4', 'StarGunner-ramNoFrameskip-v0', 'VideoPinball-v0', 'BreakoutDeterministic-v4', 'StarGunnerDeterministic-v4', 'KungFuMaster-ramNoFrameskip-v0', 'RoadRunner-v0', 'Krull-ramNoFrameskip-v0', 'AirRaidNoFrameskip-v4', 'Gravitar-ram-v0', 'Tutankham-ramNoFrameskip-v0', 'Adventure-ram-v4', 'FishingDerby-ramDeterministic-v0', 'PrivateEye-ram-v0', 'Boxing-ramNoFrameskip-v4', 
'KungFuMaster-ram-v4', 'ZaxxonNoFrameskip-v4', 'YarsRevenge-ram-v4', 'JourneyEscapeDeterministic-v4', 'BoxingNoFrameskip-v0', 'PrivateEye-ramNoFrameskip-v0', 'Zaxxon-ram-v4', 'RobotankNoFrameskip-v4', 'Amidar-v0', 'SolarisDeterministic-v0', 'Gopher-ram-v0', 'WizardOfWorDeterministic-v0', 'DoubleDunkNoFrameskip-v0', 'CentipedeNoFrameskip-v4', 'Berzerk-ramDeterministic-v0', 'Skiing-ram-v4', 'Gopher-ramDeterministic-v4', 'Amidar-ram-v4', 'Breakout-ramDeterministic-v4', 'Freeway-ramNoFrameskip-v0', 'NameThisGame-ramDeterministic-v4', 'Hero-v4', 'PrivateEye-ram-v4', 'Riverraid-v4', 'MontezumaRevenge-ramNoFrameskip-v4', 'Seaquest-ram-v0', 'CentipedeDeterministic-v0', 'Phoenix-ramNoFrameskip-v4', 'JamesbondDeterministic-v0', 'IceHockey-ramNoFrameskip-v0', 'Atlantis-ram-v4', 'Seaquest-ramDeterministic-v4', 'Adventure-ramDeterministic-v0', 'AssaultNoFrameskip-v0', 'BeamRider-ramNoFrameskip-v4', 'BerzerkNoFrameskip-v4', 'NameThisGameDeterministic-v4', 'SkiingNoFrameskip-v4', 'CrazyClimberDeterministic-v0', 'ZaxxonDeterministic-v0', 'Phoenix-ramDeterministic-v0', 'Boxing-ramDeterministic-v4', 'Pong-v4', 'SpaceInvadersNoFrameskip-v4', 'DoubleDunk-v4', 'PooyanNoFrameskip-v4', 'MsPacman-ramDeterministic-v4', 'RoadRunner-ramNoFrameskip-v4', 'Enduro-ramNoFrameskip-v0', 'Freeway-ram-v4', 'Berzerk-v0', 'Zaxxon-ramNoFrameskip-v4', 'Enduro-v0', 'Freeway-ram-v0', 'FishingDerby-ramDeterministic-v4', 'MsPacman-ram-v4', 'TutankhamDeterministic-v0', 'UpNDown-ram-v4', 'Venture-ram-v0', 'Assault-v0', 'DemonAttack-v0', 'TimePilotDeterministic-v0', 'RoadRunner-ramNoFrameskip-v0', 'BattleZoneDeterministic-v0', 'DemonAttack-ramNoFrameskip-v4', 'Boxing-ramDeterministic-v0', 'MontezumaRevengeNoFrameskip-v4', 'DemonAttack-v4', 'PrivateEyeDeterministic-v4', 'MontezumaRevenge-v0', 'AssaultDeterministic-v4', 'SpaceInvadersNoFrameskip-v0', 'Robotank-ram-v0', 'WizardOfWorNoFrameskip-v0', 'Asteroids-ramDeterministic-v0', 'CrazyClimberNoFrameskip-v0', 'DoubleDunk-ramNoFrameskip-v4', 
'Zaxxon-ramNoFrameskip-v0', 'PooyanDeterministic-v4', 'RobotankDeterministic-v4', 'Krull-ram-v0', 'Carnival-ramDeterministic-v0', 'BankHeistDeterministic-v4', 'IceHockey-ramDeterministic-v0', 'Jamesbond-ram-v0', 'StarGunnerNoFrameskip-v0', 'BoxingNoFrameskip-v4', 'Gopher-v4', 'AlienDeterministic-v0', 'Zaxxon-ramDeterministic-v4', 'VideoPinball-ramNoFrameskip-v4', 'TimePilot-v4', 'NameThisGame-ram-v0', 'PitfallDeterministic-v0', 'GravitarDeterministic-v4', 'DoubleDunk-ramNoFrameskip-v0', 'YarsRevenge-ram-v0', 'PhoenixNoFrameskip-v4', 'Gravitar-v4', 'Pooyan-ramDeterministic-v4', 'Berzerk-ramDeterministic-v4', 'RiverraidDeterministic-v0', 'CrazyClimber-ramNoFrameskip-v0', 'KungFuMasterDeterministic-v0', 'Phoenix-ramNoFrameskip-v0', 'StarGunner-ram-v0', 'BeamRider-ram-v4', 'Defender-ramDeterministic-v0', 'MsPacman-ramNoFrameskip-v0', 'Pitfall-ramDeterministic-v0', 'BoxingDeterministic-v0', 'Enduro-ram-v4', 'Adventure-ramDeterministic-v4', 'Riverraid-ramNoFrameskip-v0', 'Enduro-ramDeterministic-v0', 'Boxing-v4', 'Skiing-v0', 'BattleZoneNoFrameskip-v0', 'Pitfall-ramNoFrameskip-v0', 'BowlingNoFrameskip-v0', 'DoubleDunkDeterministic-v0', 'ElevatorAction-ramDeterministic-v0', 'Boxing-ram-v0', 'WizardOfWor-ramNoFrameskip-v4', 'MsPacman-ramNoFrameskip-v4', 'Defender-v4', 'Bowling-v4', 'Berzerk-ramNoFrameskip-v4', 'MontezumaRevenge-ram-v0', 'AtlantisDeterministic-v4', 'GravitarDeterministic-v0', 'FreewayDeterministic-v0', 'Breakout-ram-v4', 'CrazyClimberDeterministic-v4', 'Zaxxon-v4', 'EnduroDeterministic-v0', 'KungFuMaster-ramNoFrameskip-v4', 'NameThisGame-ram-v4', 'Breakout-ramNoFrameskip-v0', 'ElevatorAction-ramNoFrameskip-v4', 'Carnival-v0', 'Pooyan-ram-v0', 'BankHeist-v0', 'NameThisGameDeterministic-v0', 'MontezumaRevenge-ramDeterministic-v4', 'DoubleDunk-ram-v4', 'Asteroids-v4', 'Alien-ramNoFrameskip-v0', 'Breakout-ramNoFrameskip-v4', 'AmidarDeterministic-v4', 'SpaceInvaders-v4', 'Boxing-v0', 'Riverraid-ram-v0', 'UpNDownDeterministic-v0', 'Defender-ramNoFrameskip-v0', 
'StarGunner-ram-v4', 'MsPacmanNoFrameskip-v4', 'Skiing-ramNoFrameskip-v4', 'Tennis-v4', 'SolarisNoFrameskip-v4', 'TutankhamNoFrameskip-v4', 'ChopperCommand-ramNoFrameskip-v0', 'Gravitar-ramNoFrameskip-v4', 'VideoPinball-ram-v4', 'Qbert-ramDeterministic-v4', 'Pitfall-ramNoFrameskip-v4', 'Pooyan-ramDeterministic-v0', 'MsPacman-ramDeterministic-v0', 'Enduro-v4', 'Freeway-ramNoFrameskip-v4', 'JourneyEscape-ramDeterministic-v0', 'Alien-ram-v0', 'FreewayDeterministic-v4', 'Qbert-v0', 'Kangaroo-ramNoFrameskip-v4', 'Atlantis-ramNoFrameskip-v0', 'Frostbite-ramNoFrameskip-v0', 'Hero-ramNoFrameskip-v4', 'YarsRevengeNoFrameskip-v4', 'SkiingDeterministic-v0', 'Gopher-v0', 'PitfallDeterministic-v4', 'Carnival-v4', 'Riverraid-ram-v4', 'FishingDerby-ramNoFrameskip-v0', 'AirRaidDeterministic-v4', 'Pitfall-ram-v4', 'BattleZone-v4', 'Tennis-ram-v0', 'YarsRevengeNoFrameskip-v0', 'BattleZone-ram-v0', 'AirRaid-ramNoFrameskip-v0', 'Qbert-ramNoFrameskip-v0', 'TennisDeterministic-v4', 'Asteroids-ram-v4', 'AsteroidsDeterministic-v4', 'BowlingDeterministic-v4', 'Asterix-ramNoFrameskip-v0', 'Qbert-ramNoFrameskip-v4', 'ElevatorAction-v4', 'MontezumaRevenge-ramDeterministic-v0', 'BeamRider-ramNoFrameskip-v0', 'BankHeist-ramNoFrameskip-v4', 'ElevatorActionDeterministic-v0', 'FishingDerby-v0', 'FishingDerbyNoFrameskip-v0', 'Frostbite-ram-v0', 'JamesbondDeterministic-v4', 'Phoenix-v4', 'Pooyan-v0', 'NameThisGame-ramDeterministic-v0', 'Riverraid-ramDeterministic-v4', 'JamesbondNoFrameskip-v4', 'Kangaroo-ram-v0', 'Gopher-ramNoFrameskip-v4', 'Centipede-ramDeterministic-v0', 'Carnival-ram-v0', 'AlienNoFrameskip-v4', 'BowlingNoFrameskip-v4', 'Breakout-v4', 'KungFuMasterNoFrameskip-v4', 'Amidar-ramDeterministic-v4', 'Asterix-v0', 'PongDeterministic-v0', 'UpNDown-ramDeterministic-v0', 'FreewayNoFrameskip-v4', 'Gravitar-v0', 'BattleZone-ramDeterministic-v4', 'Bowling-ramDeterministic-v4', 'Frostbite-ramNoFrameskip-v4', 'Gopher-ram-v4', 'Asteroids-ramNoFrameskip-v4', 'BankHeist-ram-v4', 'Centipede-ram-v4', 
'Kangaroo-ramDeterministic-v4', 'Pong-v0', 'SeaquestNoFrameskip-v0', 'Jamesbond-v4', 'FishingDerby-v4', 'Krull-ram-v4', 'BeamRider-v0', 'TennisDeterministic-v0', 'ChopperCommandDeterministic-v4', 'QbertNoFrameskip-v4', 'BattleZone-ramDeterministic-v0', 'RoadRunnerNoFrameskip-v4', 'Atlantis-v0', 'BankHeist-v4', 'TennisNoFrameskip-v0', 'SpaceInvaders-v0', 'DemonAttackNoFrameskip-v4', 'BattleZoneDeterministic-v4', 'Tutankham-ramDeterministic-v4', 'KungFuMaster-v4', 'Boxing-ram-v4', 'SkiingNoFrameskip-v0', 'Pong-ram-v0', 'Asterix-ramDeterministic-v0', 'JamesbondNoFrameskip-v0', 'StarGunner-v0', 'WizardOfWor-ramDeterministic-v4', 'FrostbiteNoFrameskip-v0', 'FishingDerbyDeterministic-v0', 'QbertNoFrameskip-v0', 'KungFuMaster-ram-v0', 'Atlantis-v4', 'CrazyClimberNoFrameskip-v4', 'Defender-ram-v4', 'JourneyEscape-v4', 'Zaxxon-v0', 'DoubleDunk-ramDeterministic-v0', 'YarsRevenge-ramNoFrameskip-v0', 'BattleZone-ram-v4', 'Hero-ram-v4', 'KungFuMaster-ramDeterministic-v4', 'CentipedeDeterministic-v4', 'Seaquest-ram-v4', 'ChopperCommand-ram-v0', 'ElevatorAction-v0', 'Atlantis-ram-v0', 'Skiing-v4', 'Robotank-ramDeterministic-v4', 'Pitfall-v0', 'Solaris-ram-v4', 'Tennis-ramNoFrameskip-v4', 'Venture-ramDeterministic-v0', 'KrullNoFrameskip-v0', 'DemonAttackNoFrameskip-v0', 'MontezumaRevengeDeterministic-v0', 'EnduroNoFrameskip-v0', 'Freeway-v4', 'CrazyClimber-ram-v0', 'FishingDerby-ramNoFrameskip-v4', 'RiverraidNoFrameskip-v4', 'Berzerk-ramNoFrameskip-v0', 'Venture-ramNoFrameskip-v0', 'Adventure-v4', 'PrivateEye-ramNoFrameskip-v4', 'UpNDown-ramNoFrameskip-v0', 'BreakoutNoFrameskip-v4', 'ElevatorActionDeterministic-v4', 'BeamRiderNoFrameskip-v4', 'DemonAttackDeterministic-v4', 'Kangaroo-ram-v4', 'Phoenix-ramDeterministic-v4', 'AirRaidDeterministic-v0', 'Hero-v0', 'UpNDown-ramDeterministic-v4', 'Defender-ramNoFrameskip-v4', 'DoubleDunk-ramDeterministic-v4', 'Centipede-v4', 'ChopperCommand-v4', 'Frostbite-ramDeterministic-v0', 'SeaquestNoFrameskip-v4', 'SolarisNoFrameskip-v0', 
'AirRaid-v4', 'HeroNoFrameskip-v4', 'FishingDerby-ram-v0', 'Gravitar-ramDeterministic-v0', 'KungFuMasterDeterministic-v4', 'Pong-ramNoFrameskip-v0', 'IceHockey-ramDeterministic-v4', 'Kangaroo-ramNoFrameskip-v0', 'UpNDown-ram-v0', 'BeamRider-ram-v0', 'Hero-ramDeterministic-v0', 'Jamesbond-v0', 'Tennis-ramNoFrameskip-v0', 'Freeway-ramDeterministic-v0', 'WizardOfWor-v0', 'Tennis-ram-v4', 'FrostbiteNoFrameskip-v4', 'Skiing-ram-v0', 'NameThisGameNoFrameskip-v4', 'Alien-ramDeterministic-v0', 'Pitfall-ramDeterministic-v4', 'IceHockey-ramNoFrameskip-v4', 'Solaris-v0', 'Asterix-ram-v4', 'Adventure-ram-v0', 'HeroNoFrameskip-v0', 'Frostbite-ramDeterministic-v4', 'Bowling-ramNoFrameskip-v4', 'VideoPinballDeterministic-v4', 'VideoPinball-ramDeterministic-v0', 'DefenderNoFrameskip-v0', 'TimePilotDeterministic-v4', 'UpNDown-v0', 'Seaquest-v0', 'QbertDeterministic-v4', 'FreewayNoFrameskip-v0', 'VideoPinball-ramDeterministic-v4', 'Kangaroo-v4', 'Alien-ramNoFrameskip-v4', 'KrullDeterministic-v4', 'PongDeterministic-v4', 'RoadRunnerDeterministic-v0', 'GravitarNoFrameskip-v4', 'AlienDeterministic-v4', 'KungFuMasterNoFrameskip-v0', 'Hero-ramDeterministic-v4', 'Asterix-ramNoFrameskip-v4', 'KrullDeterministic-v0', 'AirRaid-ram-v4', 'MontezumaRevengeDeterministic-v4', 'Qbert-ramDeterministic-v0', 'Asteroids-ramNoFrameskip-v0', 'StarGunner-ramDeterministic-v4', 'GravitarNoFrameskip-v0', 'WizardOfWor-ram-v4', 'YarsRevenge-ramDeterministic-v4', 'DemonAttack-ramDeterministic-v0', 'NameThisGame-ramNoFrameskip-v4', 'PooyanNoFrameskip-v0', 'Enduro-ram-v0', 'Robotank-ramNoFrameskip-v0', 'Asteroids-ramDeterministic-v4', 'Carnival-ramDeterministic-v4', 'Berzerk-v4', 'JourneyEscape-ramDeterministic-v4', 'Kangaroo-v0', 'BattleZone-ramNoFrameskip-v0', 'NameThisGame-v0', 'Assault-ramNoFrameskip-v0', 'StarGunnerNoFrameskip-v4', 'CentipedeNoFrameskip-v0', 'Venture-ramDeterministic-v4', 'Frostbite-v0', 'JourneyEscape-ramNoFrameskip-v0', 'Krull-ramDeterministic-v0', 'VideoPinball-ramNoFrameskip-v0', 
'IceHockeyDeterministic-v4', 'AdventureDeterministic-v4', 'Carnival-ram-v4', 'YarsRevenge-ramNoFrameskip-v4', 'CarnivalDeterministic-v4', 'UpNDownNoFrameskip-v4', 'FishingDerbyNoFrameskip-v4', 'Riverraid-v0', 'Robotank-ramNoFrameskip-v4', 'CrazyClimber-v0', 'RiverraidNoFrameskip-v0', 'AsterixDeterministic-v0', 'WizardOfWor-v4', 'AdventureNoFrameskip-v0', 'Solaris-ram-v0', 'Seaquest-ramNoFrameskip-v0', 'Tennis-ramDeterministic-v0', 'DoubleDunkNoFrameskip-v4', 'VentureDeterministic-v4', 'AsterixNoFrameskip-v0', 'Qbert-ram-v4', 'MontezumaRevengeNoFrameskip-v0', 'RoadRunner-ramDeterministic-v0', 'Phoenix-v0', 'BreakoutDeterministic-v0', 'WizardOfWorDeterministic-v4', 'ZaxxonNoFrameskip-v0', 'KangarooDeterministic-v0', 'Alien-v4', 'MsPacman-v4', 'Assault-ramNoFrameskip-v4', 'RobotankNoFrameskip-v0', 'MontezumaRevenge-v4', 'Solaris-v4', 'Tennis-v0', 'Amidar-ram-v0', 'SpaceInvadersDeterministic-v0', 'TutankhamDeterministic-v4', 'DoubleDunk-ram-v0', 'NameThisGameNoFrameskip-v0', 'JourneyEscapeNoFrameskip-v0', 'PitfallNoFrameskip-v4', 'TimePilot-ram-v4', 'Tutankham-ram-v4', 'CrazyClimber-v4', 'CrazyClimber-ramNoFrameskip-v4', 'Enduro-ramNoFrameskip-v4', 'Breakout-ramDeterministic-v0', 'BankHeistDeterministic-v0', 'ChopperCommandNoFrameskip-v4', 'Defender-v0', 'Pitfall-v4', 'BeamRider-v4', 'AmidarNoFrameskip-v4', 'Solaris-ramDeterministic-v4', 'ElevatorAction-ram-v0', 'SpaceInvaders-ramDeterministic-v0', 'BankHeistNoFrameskip-v4', 'JourneyEscape-ramNoFrameskip-v4', 'EnduroNoFrameskip-v4', 'Tutankham-ramNoFrameskip-v4', 'KungFuMaster-ramDeterministic-v0', 'Berzerk-ram-v0', 'UpNDownNoFrameskip-v0', 'IceHockey-v0', 'SpaceInvaders-ramNoFrameskip-v4', 'Jamesbond-ramNoFrameskip-v4', 'Bowling-ram-v0', 'PrivateEyeNoFrameskip-v4', 'ElevatorAction-ram-v4', 'WizardOfWor-ramNoFrameskip-v0', 'Seaquest-ramDeterministic-v0', 'Pooyan-ramNoFrameskip-v4', 'Zaxxon-ram-v0', 'Atlantis-ramNoFrameskip-v4', 'UpNDownDeterministic-v4', 'CarnivalNoFrameskip-v4', 'ChopperCommand-v0', 
'JourneyEscapeDeterministic-v0', 'FrostbiteDeterministic-v4', 'TimePilot-ramNoFrameskip-v4', 'Assault-v4', 'Enduro-ramDeterministic-v4', 'ChopperCommand-ramNoFrameskip-v4', 'KungFuMaster-v0', 'DemonAttack-ram-v0', 'BattleZone-v0', 'Centipede-ramNoFrameskip-v4', 'Venture-ramNoFrameskip-v4', 'UpNDown-v4', 'VideoPinballNoFrameskip-v4', 'BreakoutNoFrameskip-v0', 'AirRaid-ramNoFrameskip-v4', 'Centipede-ramNoFrameskip-v0', 'ChopperCommand-ram-v4', 'Alien-ramDeterministic-v4', 'IceHockeyNoFrameskip-v4', 'AirRaidNoFrameskip-v0', 'Qbert-v4', 'SpaceInvaders-ramNoFrameskip-v0', 'Asteroids-ram-v0', 'Pooyan-ramNoFrameskip-v0', 'Qbert-ram-v0', 'CrazyClimber-ramDeterministic-v4', 'DemonAttackDeterministic-v0', 'AirRaid-v0', 'BerzerkDeterministic-v4', 'Tennis-ramDeterministic-v4', 'KangarooDeterministic-v4', 'FishingDerby-ram-v4', 'CarnivalDeterministic-v0', 'VideoPinball-ram-v0', 'HeroDeterministic-v4', 'ElevatorAction-ramNoFrameskip-v0', 'Krull-ramNoFrameskip-v4', 'UpNDown-ramNoFrameskip-v4', 'Pooyan-v4', 'ChopperCommandDeterministic-v0', 'IceHockey-ram-v0', 'StarGunnerDeterministic-v0', 'TimePilot-ramNoFrameskip-v0', 'ElevatorAction-ramDeterministic-v4', 'Jamesbond-ram-v4', 'Hero-ramNoFrameskip-v0', 'Bowling-v0', 'BowlingDeterministic-v0', 'EnduroDeterministic-v4', 'Riverraid-ramNoFrameskip-v4', 'RiverraidDeterministic-v4', 'Breakout-ram-v0', 'Robotank-v0', 'Atlantis-ramDeterministic-v4', 'Assault-ram-v0', 'Seaquest-v4', 'PhoenixNoFrameskip-v0', 'Defender-ramDeterministic-v4', 'Krull-ramDeterministic-v4', 'BankHeist-ramNoFrameskip-v0', 'NameThisGame-v4', 'AirRaid-ram-v0', 'Berzerk-ram-v4', 'Boxing-ramNoFrameskip-v0', 'AssaultDeterministic-v0', 'Asterix-ramDeterministic-v4', 'Jamesbond-ramDeterministic-v0', 'Breakout-v0', 'DemonAttack-ramNoFrameskip-v0', 'DoubleDunkDeterministic-v4', 'CrazyClimber-ramDeterministic-v0', 'CarnivalNoFrameskip-v0', 'PrivateEyeNoFrameskip-v0', 'BeamRider-ramDeterministic-v0', 'Robotank-v4', 'SkiingDeterministic-v4', 'AirRaid-ramDeterministic-v4', 
'PitfallNoFrameskip-v0', 'Skiing-ramNoFrameskip-v0', 'AsteroidsNoFrameskip-v4', 'Carnival-ramNoFrameskip-v4', 'RoadRunnerNoFrameskip-v0', 'PrivateEye-ramDeterministic-v4', 'KangarooNoFrameskip-v0', 'AirRaid-ramDeterministic-v0', 'JourneyEscapeNoFrameskip-v4', 'BattleZoneNoFrameskip-v4', 'DemonAttack-ram-v4', 'JourneyEscape-ram-v4', 'Tutankham-ram-v0', 'ChopperCommand-ramDeterministic-v4', 'GopherNoFrameskip-v4', 'IceHockey-ram-v4', 'RoadRunner-ram-v0', 'Robotank-ram-v4', 'WizardOfWorNoFrameskip-v4', 'Hero-ram-v0', 'SeaquestDeterministic-v0', 'PrivateEye-v0', 'Gravitar-ram-v4', 'Amidar-v4', 'AssaultNoFrameskip-v4', 'GopherNoFrameskip-v0', 'MsPacman-v0', 'VentureNoFrameskip-v0', 'KangarooNoFrameskip-v4', 'YarsRevengeDeterministic-v4', 'Tutankham-ramDeterministic-v0', 'Adventure-ramNoFrameskip-v0', 'BankHeist-ram-v0', 'DoubleDunk-v0', 'Pitfall-ram-v0', 'PongNoFrameskip-v0', 'AdventureNoFrameskip-v4', 'IceHockeyNoFrameskip-v0', 'Amidar-ramNoFrameskip-v4', 'FrostbiteDeterministic-v0', 'BoxingDeterministic-v4', 'RobotankDeterministic-v0', 'Amidar-ramDeterministic-v0', 'BankHeist-ramDeterministic-v0']
    ......
    unittest 
     ['MemorizeDigits-v0', 'CubeCrash-v0', 'CubeCrashScreenBecomesBlack-v0', 'CubeCrashSparse-v0']
    ......
    retro 
     ['SuperMarioBros-Nes', 'Vectorman-Genesis', 'FinalFight-Snes', 'SonicTheHedgehog-Genesis', 'TwinBee3PokoPokoDaimaou-Nes', 'SpaceHarrier-Nes', 'SpaceInvaders-Snes', 'BubbleBobble-Nes']
    ......

    The main function mainly calls the train function.

    Inside train, the call that actually runs the training is:

        model = learn(
            env=env,
            seed=seed,
            total_timesteps=total_timesteps,
            **alg_kwargs
        )

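    If the deepq algorithm is being run, this learn function is the learn function of the deepq module. In other words, whichever algorithm module is specified, the learn that gets called is the one defined in that module.

    The learn function trains the algorithm for the specified number of timesteps and then returns the trained network model, which is the model variable here. model is itself a class instance; its step function takes an observation as input and returns an action as an np.array.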
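
    The name-based dispatch described above can be sketched as follows. This is only an illustration under stated assumptions: the dict LEARN_FUNCTIONS, the two toy learn functions, and ToyModel are all hypothetical stand-ins, not the baselines code, and a plain list stands in for the np.array that a real model's step would return.

```python
class ToyModel:
    """Hypothetical stand-in for the model object returned by learn."""

    def __init__(self, alg_name):
        self.alg_name = alg_name

    def step(self, observation):
        # A real model would run its network here and return an np.array.
        return [0]


def deepq_learn(env, seed, total_timesteps, **kwargs):
    # Toy replacement for the deepq module's learn function.
    return ToyModel('deepq')


def ppo2_learn(env, seed, total_timesteps, **kwargs):
    # Toy replacement for the ppo2 module's learn function.
    return ToyModel('ppo2')


# Hypothetical registry: algorithm name -> that algorithm's learn function.
LEARN_FUNCTIONS = {'deepq': deepq_learn, 'ppo2': ppo2_learn}


def train(alg, env, seed=None, total_timesteps=0, **alg_kwargs):
    # Pick the learn function that belongs to the requested algorithm,
    # then call it with the same keyword arguments as in the text above.
    learn = LEARN_FUNCTIONS[alg]
    return learn(env=env, seed=seed, total_timesteps=total_timesteps, **alg_kwargs)


model = train('deepq', env=None, seed=0, total_timesteps=10)
action = model.step([0.1, 0.2])
```

    The caller of train never needs to know which algorithm it got back; it only relies on the returned object having a step method.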

    The train function sets up video recording of the game screen as follows:

        if args.save_video_interval != 0:
            env = VecVideoRecorder(env, osp.join(logger.get_dir(), "videos"), record_video_trigger=lambda x: x % args.save_video_interval == 0, video_length=args.save_video_length)
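
    To make the trigger condition concrete, here is a small sketch of which step counts would start a recording for a given interval. The helper name recording_starts is hypothetical; only the expression x % save_video_interval == 0 and the != 0 guard are taken from the code above.

```python
def recording_starts(save_video_interval, total_steps):
    """List the step counts at which the trigger would return True."""
    if save_video_interval == 0:
        # Mirrors the `!= 0` guard: an interval of 0 disables recording.
        return []
    trigger = lambda x: x % save_video_interval == 0
    return [step for step in range(total_steps) if trigger(step)]


starts = recording_starts(100, 350)
```

    With an interval of 100 and 350 steps, the trigger fires at steps 0, 100, 200 and 300.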

    Because gym's video recording is not very convenient to use, I recommend recording frames as images with OpenCV instead.

    ========================================

       

    Another very important operation in this module is creating the game environment, which is done by the build_env function.

    def build_env(args):
        ncpu = multiprocessing.cpu_count()
        if sys.platform == 'darwin': ncpu //= 2
        nenv = args.num_env or ncpu
        alg = args.alg
        seed = args.seed
    
        env_type, env_id = get_env_type(args)
    
        if env_type in {'atari', 'retro'}:
            if alg == 'deepq':
                env = make_env(env_id, env_type, seed=seed, wrapper_kwargs={'frame_stack': True})
            elif alg == 'trpo_mpi':
                env = make_env(env_id, env_type, seed=seed)
            else:
                frame_stack_size = 4
                env = make_vec_env(env_id, env_type, nenv, seed, gamestate=args.gamestate, reward_scale=args.reward_scale)
                env = VecFrameStack(env, frame_stack_size)
    
        else:
            config = tf.ConfigProto(allow_soft_placement=True,
                                   intra_op_parallelism_threads=1,
                                   inter_op_parallelism_threads=1)
            config.gpu_options.allow_growth = True
            get_session(config=config)
    
            flatten_dict_observations = alg not in {'her'}
            env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)
    
            if env_type == 'mujoco':
                env = VecNormalize(env, use_tf=True)
    
        return env
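
    The branching in build_env can be summarized with a small sketch that needs neither gym nor tensorflow. The function plan_env and the dictionary it returns are hypothetical; they only describe which constructor and wrappers the code above would select, and ncpu is passed in as a parameter instead of being read from multiprocessing.cpu_count().

```python
def plan_env(env_type, alg, num_env=None, ncpu=4):
    """Describe what build_env would construct for the given inputs."""
    if env_type in {'atari', 'retro'}:
        if alg == 'deepq':
            # Single environment, frame stacking requested via wrapper_kwargs.
            return {'maker': 'make_env', 'nenv': 1,
                    'flatten_dict_observations': True,  # left at its default
                    'wrappers': ['frame_stack']}
        if alg == 'trpo_mpi':
            return {'maker': 'make_env', 'nenv': 1,
                    'flatten_dict_observations': True,  # left at its default
                    'wrappers': []}
        # All other algorithms: vectorized env, num_env falls back to ncpu.
        return {'maker': 'make_vec_env', 'nenv': num_env or ncpu,
                'flatten_dict_observations': True,  # left at its default
                'wrappers': ['VecFrameStack(4)']}

    # Everything else: num_env falls back to 1, and 'her' disables flattening.
    wrappers = ['VecNormalize'] if env_type == 'mujoco' else []
    return {'maker': 'make_vec_env', 'nenv': num_env or 1,
            'flatten_dict_observations': alg not in {'her'},
            'wrappers': wrappers}


atari_plan = plan_env('atari', 'ppo2', ncpu=8)
mujoco_plan = plan_env('mujoco', 'ppo2')
her_plan = plan_env('robotics', 'her')
```

    Note the two different fallbacks for the number of environments: the CPU count in the atari/retro branch, but 1 in the other branch.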

    If the environment type is not in {'atari', 'retro'} and is not 'mujoco', and the reinforcement learning algorithm is not 'her', the environment is created with:

            flatten_dict_observations = True
            env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)

    If the environment type is not in {'atari', 'retro'} and is not 'mujoco', and the reinforcement learning algorithm is 'her', the environment is created with:

            flatten_dict_observations = False
            env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)

    A gym game that is not atari, retro, or mujoco is treated here as a plain native gym environment, and all such environments are created through the same call:

            env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)

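    The only difference is whether the reinforcement learning algorithm is 'her': if it is, flatten_dict_observations is set to False, otherwise it is set to True.

    I do not fully understand why flatten_dict_observations is set this way, but the effect is that when a native gym environment is used without the 'her' algorithm, the observation is flattened. The flattening is done by gym's FlattenObservation wrapper class.

    The FlattenObservation wrapper and the helper functions it relies on are: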

    import numpy as np
    import gym.spaces as spaces
    from gym import ObservationWrapper
    
    
    class FlattenObservation(ObservationWrapper):
        r"""Observation wrapper that flattens the observation."""
        def __init__(self, env):
            super(FlattenObservation, self).__init__(env)
    
            flatdim = spaces.flatdim(env.observation_space)
            self.observation_space = spaces.Box(low=-float('inf'), high=float('inf'), shape=(flatdim,), dtype=np.float32)
    
        def observation(self, observation):
            return spaces.flatten(self.env.observation_space, observation)
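    
    
    # Helper functions that the wrapper calls as spaces.flatdim and spaces.flatten
    from gym.spaces import Box, Discrete, Tuple, Dict, MultiBinary, MultiDiscrete
    
    
    def flatdim(space):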
        if isinstance(space, Box):
            return int(np.prod(space.shape))
        elif isinstance(space, Discrete):
            return int(space.n)
        elif isinstance(space, Tuple):
            return int(sum([flatdim(s) for s in space.spaces]))
        elif isinstance(space, Dict):
            return int(sum([flatdim(s) for s in space.spaces.values()]))
        elif isinstance(space, MultiBinary):
            return int(space.n)
        elif isinstance(space, MultiDiscrete):
            return int(np.prod(space.shape))
        else:
            raise NotImplementedError
    
    
    def flatten(space, x):
        if isinstance(space, Box):
            return np.asarray(x, dtype=np.float32).flatten()
        elif isinstance(space, Discrete):
            onehot = np.zeros(space.n, dtype=np.float32)
            onehot[x] = 1.0
            return onehot
        elif isinstance(space, Tuple):
            return np.concatenate([flatten(s, x_part) for x_part, s in zip(x, space.spaces)])
        elif isinstance(space, Dict):
            return np.concatenate([flatten(s, x[key]) for key, s in space.spaces.items()])
        elif isinstance(space, MultiBinary):
            return np.asarray(x).flatten()
        elif isinstance(space, MultiDiscrete):
            return np.asarray(x).flatten()
        else:
            raise NotImplementedError
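
    A worked example may help to see what flatdim and flatten produce for a Dict space. The sketch below is a simplified, stdlib-only re-implementation: the classes Box, Discrete and Dict are toy stand-ins for the gym.spaces classes, plain lists replace numpy arrays, and only three space types are handled.

```python
class Box:
    def __init__(self, shape):
        self.shape = shape


class Discrete:
    def __init__(self, n):
        self.n = n


class Dict:
    def __init__(self, spaces):
        self.spaces = spaces  # mapping: key -> sub-space


def flatdim(space):
    """Length of the flat vector for a space."""
    if isinstance(space, Box):
        size = 1
        for dim in space.shape:
            size *= dim
        return size
    if isinstance(space, Discrete):
        return space.n
    if isinstance(space, Dict):
        return sum(flatdim(s) for s in space.spaces.values())
    raise NotImplementedError


def _flatten_nested(x):
    # Recursively flatten nested lists into a flat list of floats.
    if isinstance(x, (list, tuple)):
        out = []
        for item in x:
            out.extend(_flatten_nested(item))
        return out
    return [float(x)]


def flatten(space, x):
    """Flatten one observation x drawn from space."""
    if isinstance(space, Box):
        return _flatten_nested(x)
    if isinstance(space, Discrete):
        onehot = [0.0] * space.n  # a Discrete value becomes a one-hot vector
        onehot[x] = 1.0
        return onehot
    if isinstance(space, Dict):
        out = []
        for key, sub_space in space.spaces.items():
            out.extend(flatten(sub_space, x[key]))  # concatenate in key order
        return out
    raise NotImplementedError


space = Dict({'position': Box((2, 2)), 'gear': Discrete(3)})
obs = {'position': [[1, 2], [3, 4]], 'gear': 1}
flat = flatten(space, obs)
```

    The 2x2 Box contributes four numbers and the Discrete(3) contributes a one-hot vector of length three, so the flat observation has seven entries, matching flatdim(space).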

    This step flattens the observation, in particular observations whose space is a Tuple or a Dict.

    It can flatten observations whose space is any of Box, Discrete, Tuple, Dict, MultiBinary, or MultiDiscrete.

    However, the code that actually applies the flattening is:

        if flatten_dict_observations and isinstance(env.observation_space, gym.spaces.Dict):
            env = FlattenObservation(env)

    So in practice only observations whose space is a Dict are flattened.
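
    The combined effect of the algorithm check and the isinstance guard can be tabulated with a short sketch. The classes Box and Dict are empty toy stand-ins for the gym.spaces classes, and will_flatten is a hypothetical helper that only evaluates the two conditions quoted in this article.

```python
class Box:
    pass


class Dict:
    pass


def will_flatten(alg, observation_space):
    """True only when the flag is on AND the space is a Dict."""
    flatten_dict_observations = alg not in {'her'}
    return bool(flatten_dict_observations and isinstance(observation_space, Dict))


results = {
    ('ppo2', 'Dict'): will_flatten('ppo2', Dict()),
    ('ppo2', 'Box'): will_flatten('ppo2', Box()),
    ('her', 'Dict'): will_flatten('her', Dict()),
}
```

    Only the first combination results in a FlattenObservation wrapper; a Box observation is passed through unchanged, and 'her' keeps its Dict observation.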

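    Note that both make_env and make_vec_env default to flatten_dict_observations=True, so flattening is skipped only when build_env passes False explicitly, which it does when the 'her' algorithm is used on a native gym environment. I am not familiar with the 'her' algorithm, so I do not analyze it further here.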

    =================================================

    If the environment type is not in {'atari', 'retro'} and is 'mujoco', the environment is created with:

            flatten_dict_observations = alg not in {'her'}
            env = make_vec_env(env_id, env_type, args.num_env or 1, seed, reward_scale=args.reward_scale, flatten_dict_observations=flatten_dict_observations)
    
            if env_type == 'mujoco':
                env = VecNormalize(env, use_tf=True)
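
    The article does not go into what VecNormalize does. As a rough illustration of the general idea (normalizing observations with a running mean and variance, then clipping), here is a simplified scalar sketch. It is not the baselines implementation: the class name, the default values for epsilon and clip, and the scalar treatment are all assumptions made for the example.

```python
import math


class RunningMeanStd:
    """Scalar running mean and variance, updated one batch at a time."""

    def __init__(self, epsilon=1e-4):
        self.mean = 0.0
        self.var = 1.0
        self.count = epsilon  # tiny prior count avoids division by zero

    def update(self, batch):
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        delta = batch_mean - self.mean
        total = self.count + n
        # Combine the old statistics with the batch statistics.
        new_mean = self.mean + delta * n / total
        m2 = (self.var * self.count + batch_var * n
              + delta ** 2 * self.count * n / total)
        self.mean, self.var, self.count = new_mean, m2 / total, total


def normalize(x, rms, clip=10.0, epsilon=1e-8):
    """Standardize x with the running statistics, then clip to [-clip, clip]."""
    z = (x - rms.mean) / math.sqrt(rms.var + epsilon)
    return max(-clip, min(clip, z))


rms = RunningMeanStd()
rms.update([1.0, 2.0, 3.0, 4.0])
typical = normalize(2.5, rms)
outlier = normalize(1000.0, rms)
```

    After one batch the running mean is close to 2.5, so an observation of 2.5 maps to roughly zero while an extreme value is clipped to the upper bound.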

    ===========================================

  • Original article: https://www.cnblogs.com/devilmaycry812839668/p/16021404.html