Unverified Commit 2d5155da authored by Benjamin Beyret, committed by GitHub

merge Dev v1.1.0 - curriculum with yaml files

Dev v1.1.0
parents 6ebfa72e 5fe24394
@@ -172,6 +172,9 @@ features with the agent's frames in order to have frames in line with the config
 ## Version History
+- v1.1.0
+    - Add curriculum learning to `animalai-train` to use yaml configurations
 - v1.0.5
     - ~~Adds customisable resolution during evaluation~~ (removed, evaluation is only `84x84`)
     - Update `animalai-train` to tf 1.14 to fix `gin` broken dependency
......
@@ -2,7 +2,7 @@ from setuptools import setup
 setup(
     name='animalai',
-    version='1.0.5',
+    version='1.1.0',
     description='Animal AI competition interface',
     url='https://github.com/beyretb/AnimalAI-Olympics',
     author='Benjamin Beyret',
......
@@ -5,6 +5,7 @@ You can find here the following documentation:
 - [The quickstart guide](quickstart.md)
 - [How to design configuration files](configFile.md)
 - [How training works](training.md)
+- [Add a curriculum to your training using animalai-train](curriculum.md)
 - [All the objects you can include in the arenas as well as their specifications](definitionsOfObjects.md)
 - [How to submit your agent](submission.md)
 - [A guide to train on AWS](cloudTraining.md)
......
# Curriculum Learning
The `animalai-train` package contains a curriculum learning feature with which you can specify a set of configuration files
that constitute the lessons of the curriculum. See the
[ml-agents documentation](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Curriculum-Learning.md)
on curriculum learning for an overview of the technique. Our implementation is adapted from the ml-agents one to use
configuration files rather than environment parameters (which don't exist in `animalai`).
## Meta Curriculum
To define a curriculum you will need to provide the following:

- lessons (or levels), generally of increasing difficulty, that your agent will train on, switching from easier to harder ones
- a metric to monitor in order to decide when to switch from one lesson to the next
- the threshold values of that metric that trigger each switch

In practice, you place these parameters in a `json` file named after the brain in the environment (`Learner.json` in
our case), and place this file in a folder together with all the configuration files you wish to use. This constitutes what we call
a meta-curriculum.
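A meta-curriculum folder therefore looks like this (file names taken from the example described below; the tree itself is illustrative):

```
curriculum/
├── Learner.json    # measure, thresholds and ordered list of lesson files
├── 0.yaml          # lesson 0 (easiest)
├── 1.yaml
├── 2.yaml
├── 3.yaml
├── 4.yaml
└── 5.yaml          # lesson 5 (hardest)
```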
## Example
An example is provided in [the example folder](../examples/configs/curriculum). The idea of this curriculum is to train
an agent to navigate a maze by creating maze-like structures of perpendicular walls, starting with a single wall and a piece of food,
and adding one more wall at each level. Below are samples from the 6 different levels.
![](Curriculum/0.png) |![](Curriculum/1.png)|![](Curriculum/2.png)|
:--------------------:|:-------------------:|:-------------------:
![](Curriculum/3.png) |![](Curriculum/4.png)|![](Curriculum/5.png)|
To produce such a curriculum, we define the meta-curriculum in the following `json` format:
```json
{
    "measure": "reward",
    "thresholds": [
        1.5,
        1.4,
        1.3,
        1.2,
        1.1
    ],
    "min_lesson_length": 100,
    "signal_smoothing": true,
    "configuration_files": [
        "0.yaml",
        "1.yaml",
        "2.yaml",
        "3.yaml",
        "4.yaml",
        "5.yaml"
    ]
}
```
All parameters are the same as in [ml-agents](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Curriculum-Learning.md),
except for the `configuration_files`. From the ml-agents documentation:
* `measure` - What to measure learning progress, and advancement in lessons by.
    * `reward` - Uses a measure of received reward.
    * `progress` - Uses ratio of steps/max_steps.
* `thresholds` (float array) - Points in value of `measure` where lesson should
  be increased.
* `min_lesson_length` (int) - The minimum number of episodes that should be
  completed before the lesson can change. If `measure` is set to `reward`, the
  average cumulative reward of the last `min_lesson_length` episodes will be
  used to determine if the lesson should change. Must be nonnegative.

  __Important__: the average reward that is compared to the thresholds is
  different than the mean reward that is logged to the console. For example,
  if `min_lesson_length` is `100`, the lesson will increment after the average
  cumulative reward of the last `100` episodes exceeds the current threshold.
  The mean reward logged to the console is dictated by the `summary_freq`
  parameter in the
  [trainer configuration file](../examples/configs/trainer_config.yaml).
* `signal_smoothing` (true/false) - Whether to weight the current progress
  measure by previous values.
    * If `true`, weighting will be 0.75 (new) 0.25 (old).
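As a concrete sketch of the `signal_smoothing` weighting described above (the helper function is ours, not part of the `animalai-train` API):

```python
def smooth(new_measure, previous_smoothed):
    """Blend the latest measure with the previous smoothed value,
    using the 0.75 (new) / 0.25 (old) weighting described above."""
    return 0.75 * new_measure + 0.25 * previous_smoothed

# A noisy reward of 2.0 arriving after a smoothed value of 1.0:
print(smooth(2.0, 1.0))  # 1.75
```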
The `configuration_files` parameter is simply a list of file names containing the lessons, in the order they should be loaded.
Note that if you have `n` lessons, you need to define `n-1` thresholds.
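To make the lessons/thresholds relationship concrete, here is a minimal stand-alone sketch of how a lesson switch is decided (names are illustrative; the real logic lives in `Curriculum.increment_lesson` in `animalai-train`):

```python
# Minimal sketch of lesson advancement. With n = 6 lessons there are
# n-1 = 5 thresholds; crossing thresholds[i] moves from lesson i to i+1.
thresholds = [1.5, 1.4, 1.3, 1.2, 1.1]
max_lesson_num = len(thresholds)  # lessons are indexed 0..5

def maybe_increment(lesson_num, measure_val):
    """Return the (possibly incremented) lesson index for a measured value."""
    if lesson_num < max_lesson_num and measure_val > thresholds[lesson_num]:
        return lesson_num + 1
    return lesson_num

print(maybe_increment(0, 1.6))   # 1 (reward 1.6 > 1.5, move to lesson 1)
print(maybe_increment(1, 1.2))   # 1 (reward 1.2 <= 1.4, stay)
print(maybe_increment(5, 99.0))  # 5 (last lesson, never advances)
```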
## Training
Once the folder is created, training is done in the same way as before, except that we now pass a `MetaCurriculum` object to the
`meta_curriculum` argument of a `TrainerController`.
We provide an example using the above curriculum in [examples/trainCurriculum.py](../examples/trainCurriculum.py).
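Under the hood, `MetaCurriculum` discovers the brain's `json` file and the lesson `yaml` files by scanning the curriculum folder; a stand-alone sketch of that discovery step (stdlib only, using a throwaway temp folder):

```python
import os
import tempfile

# Build a throwaway curriculum folder; 'notes.txt' shows that
# files matching neither pattern are simply ignored by the scan.
folder = tempfile.mkdtemp()
for name in ('Learner.json', '0.yaml', '1.yaml', 'notes.txt'):
    open(os.path.join(folder, name), 'w').close()

# Same filtering as in MetaCurriculum: one json file per brain,
# plus the yaml lesson configurations.
json_files = [file for file in os.listdir(folder) if '.json' in file.lower()]
yaml_files = [file for file in os.listdir(folder) if
              ('.yaml' in file.lower() or '.yml' in file.lower())]

print(json_files)          # ['Learner.json']
print(sorted(yaml_files))  # ['0.yaml', '1.yaml']
```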
While training this agent, you can watch the lessons switch in TensorBoard:
![](Curriculum/learning.png)
![](Curriculum/lessons.png)
@@ -3,6 +3,7 @@ import json
 import math
 from .exception import CurriculumError
+from animalai.envs.arena_config import ArenaConfig
 import logging
@@ -10,10 +11,11 @@ logger = logging.getLogger('mlagents.trainers')
 class Curriculum(object):
-    def __init__(self, location):
+    def __init__(self, location, yaml_files):
         """
         Initializes a Curriculum object.
         :param location: Path to JSON defining curriculum.
+        :param yaml_files: A list of configuration files for each lesson
         """
         self.max_lesson_num = 0
         self.measure = None
@@ -32,7 +34,7 @@ class Curriculum(object):
             raise CurriculumError('There was an error decoding {}'
                                   .format(location))
         self.smoothing_value = 0
-        for key in ['parameters', 'measure', 'thresholds',
+        for key in ['configuration_files', 'measure', 'thresholds',
                     'min_lesson_length', 'signal_smoothing']:
             if key not in self.data:
                 raise CurriculumError("{0} does not contain a "
@@ -43,18 +45,25 @@ class Curriculum(object):
         self.min_lesson_length = self.data['min_lesson_length']
         self.max_lesson_num = len(self.data['thresholds'])
-        parameters = self.data['parameters']
-        for key in parameters:
+        configuration_files = self.data['configuration_files']
+        # for key in configuration_files:
         #     if key not in default_reset_parameters:
         #         raise CurriculumError(
         #             'The parameter {0} in Curriculum {1} is not present in '
         #             'the Environment'.format(key, location))
-            if len(parameters[key]) != self.max_lesson_num + 1:
-                raise CurriculumError(
-                    'The parameter {0} in Curriculum {1} must have {2} values '
-                    'but {3} were found'.format(key, location,
-                                                self.max_lesson_num + 1,
-                                                len(parameters[key])))
+        if len(configuration_files) != self.max_lesson_num + 1:
+            raise CurriculumError(
+                'The parameter {0} in Curriculum {1} must have {2} values '
+                'but {3} were found'.format('configuration_files', location,
+                                            self.max_lesson_num + 1,
+                                            len(configuration_files)))
+        folder = os.path.dirname(location)
+        folder_yaml_files = os.listdir(folder)
+        if not all([file in folder_yaml_files for file in configuration_files]):
+            raise CurriculumError(
+                'One or more configuration file(s) in curriculum {0} could not be found'.format(location)
+            )
+        self.configurations = [ArenaConfig(os.path.join(folder, file)) for file in configuration_files]
     @property
     def lesson_num(self):
@@ -79,15 +88,13 @@ class Curriculum(object):
         if self.lesson_num < self.max_lesson_num:
             if measure_val > self.data['thresholds'][self.lesson_num]:
                 self.lesson_num += 1
-                config = {}
-                parameters = self.data['parameters']
-                for key in parameters:
-                    config[key] = parameters[key][self.lesson_num]
-                logger.info('{0} lesson changed. Now in lesson {1}: {2}'
-                            .format(self._brain_name,
-                                    self.lesson_num,
-                                    ', '.join([str(x) + ' -> ' + str(config[x])
-                                               for x in config])))
+                # config = {}
+                # parameters = self.data['parameters']
+                # for key in parameters:
+                #     config[key] = parameters[key][self.lesson_num]
+                logger.info('{0} lesson changed. Now in lesson {1}'
+                            .format(self._brain_name,
+                                    self.lesson_num))
                 return True
         return False
@@ -103,8 +110,8 @@ class Curriculum(object):
         if lesson is None:
             lesson = self.lesson_num
         lesson = max(0, min(lesson, self.max_lesson_num))
-        config = {}
-        parameters = self.data['parameters']
-        for key in parameters:
-            config[key] = parameters[key][lesson]
+        config = self.configurations[lesson]
+        # parameters = self.data['parameters']
+        # for key in parameters:
+        #     config[key] = parameters[key][lesson]
         return config
@@ -20,34 +20,41 @@ class MetaCurriculum(object):
         Args:
             curriculum_folder (str): The relative or absolute path of the
                 folder which holds the curriculums for this environment.
-                The folder should contain JSON files whose names are the
-                brains that the curriculums belong to.
+                The folder should contain one JSON file whose name is the
+                same as the brain in the academy (e.g. Learner) and contains
+                the parameters for the curriculum, as well as all the YAML
+                files for each curriculum lesson.
         """
-        used_reset_parameters = set()
+        # used_reset_parameters = set()
         self._brains_to_curriculums = {}
+        self._configuration_files = []
         try:
-            for curriculum_filename in os.listdir(curriculum_folder):
+            json_files = [file for file in os.listdir(curriculum_folder) if '.json' in file.lower()]
+            yaml_files = [file for file in os.listdir(curriculum_folder) if
+                          ('.yaml' in file.lower() or '.yml' in file.lower())]
+            for curriculum_filename in json_files:
                 brain_name = curriculum_filename.split('.')[0]
                 curriculum_filepath = \
                     os.path.join(curriculum_folder, curriculum_filename)
-                curriculum = Curriculum(curriculum_filepath)
+                curriculum = Curriculum(curriculum_filepath, yaml_files)
+                # ===== TO REMOVE ??? ===========
                 # Check if any two curriculums use the same reset params.
-                if any([(parameter in curriculum.get_config().keys())
-                        for parameter in used_reset_parameters]):
-                    logger.warning('Two or more curriculums will '
-                                   'attempt to change the same reset '
-                                   'parameter. The result will be '
-                                   'non-deterministic.')
-
-                used_reset_parameters.update(curriculum.get_config().keys())
+                # if any([(parameter in curriculum.get_config().keys())
+                #         for parameter in used_reset_parameters]):
+                #     logger.warning('Two or more curriculums will '
+                #                    'attempt to change the same reset '
+                #                    'parameter. The result will be '
+                #                    'non-deterministic.')
+                #
+                # used_reset_parameters.update(curriculum.get_config().keys())
+                # ===== end of to remove =========
                 self._brains_to_curriculums[brain_name] = curriculum
         except NotADirectoryError:
             raise MetaCurriculumError(curriculum_folder + ' is not a '
                                       'directory. Refer to the ML-Agents '
                                       'curriculum learning docs.')
     @property
     def brains_to_curriculums(self):
@@ -83,7 +90,7 @@ class MetaCurriculum(object):
             increment its lesson.
         """
         return reward_buff_size >= (self.brains_to_curriculums[brain_name]
                                     .min_lesson_length)
     def increment_lessons(self, measure_vals, reward_buff_sizes=None):
         """Attempts to increment all the lessons of all the curriculums in this
@@ -108,14 +115,13 @@ class MetaCurriculum(object):
                 if self._lesson_ready_to_increment(brain_name, buff_size):
                     measure_val = measure_vals[brain_name]
                     ret[brain_name] = (self.brains_to_curriculums[brain_name]
                                        .increment_lesson(measure_val))
         else:
             for brain_name, measure_val in measure_vals.items():
                 ret[brain_name] = (self.brains_to_curriculums[brain_name]
                                    .increment_lesson(measure_val))
         return ret
     def set_all_curriculums_to_lesson_num(self, lesson_num):
         """Sets all the curriculums in this meta curriculum to a specified
         lesson number.
@@ -127,7 +133,6 @@ class MetaCurriculum(object):
         for _, curriculum in self.brains_to_curriculums.items():
             curriculum.lesson_num = lesson_num
     def get_config(self):
         """Get the combined configuration of all curriculums in this
         MetaCurriculum.
@@ -135,10 +140,10 @@ class MetaCurriculum(object):
         Returns:
             A dict from parameter to value.
         """
-        config = {}
+        # config = {}
         for _, curriculum in self.brains_to_curriculums.items():
             curr_config = curriculum.get_config()
-            config.update(curr_config)
-        return config
+            # config.update(curr_config)
+        return curr_config
@@ -180,11 +180,11 @@ class TrainerController(object):
             environment.
         """
         if self.meta_curriculum is not None:
-            return env.reset(config=self.meta_curriculum.get_config())
+            return env.reset(arenas_configurations=self.meta_curriculum.get_config())
         else:
             if self.update_config:
                 self.update_config = False
                 return env.reset(arenas_configurations=self.config)
             else:
                 return env.reset()
......
@@ -2,7 +2,7 @@ from setuptools import setup
 setup(
     name='animalai_train',
-    version='1.0.5',
+    version='1.1.0',
     description='Animal AI competition training library',
     url='https://github.com/beyretb/AnimalAI-Olympics',
     author='Benjamin Beyret',
......
!ArenaConfig
arenas:
    0: !Arena
        t: 250
        items:
        - !Item
          name: Wall
          positions:
          - !Vector3 {x: -1, y: 0, z: 10}
          colors:
          rotations: [90]
          sizes:
          - !Vector3 {x: 1, y: 5, z: 9}
        - !Item
          name: GoodGoal
          positions:
          - !Vector3 {x: -1, y: 0, z: 35}
          sizes:
          - !Vector3 {x: 2, y: 2, z: 2}
        - !Item
          name: Agent
          positions:
          - !Vector3 {x: -1, y: 1, z: 5}
\ No newline at end of file
!ArenaConfig
arenas:
    0: !Arena
        t: 300
        items:
        - !Item
          name: Wall
          positions:
          - !Vector3 {x: -1, y: 0, z: 10}
          - !Vector3 {x: -1, y: 0, z: 20}
          colors:
          rotations: [90,90]
          sizes:
          - !Vector3 {x: 1, y: 5, z: 9}
          - !Vector3 {x: 1, y: 5, z: 9}
        - !Item
          name: GoodGoal
          positions:
          - !Vector3 {x: -1, y: 0, z: 35}
          sizes:
          - !Vector3 {x: 2, y: 2, z: 2}
        - !Item
          name: Agent
          positions:
          - !Vector3 {x: -1, y: 1, z: 5}
\ No newline at end of file
!ArenaConfig
arenas:
    0: !Arena
        t: 350
        items:
        - !Item
          name: Wall
          positions:
          - !Vector3 {x: -1, y: 0, z: 10}
          - !Vector3 {x: -1, y: 0, z: 20}
          - !Vector3 {x: -1, y: 0, z: 30}
          colors:
          rotations: [90,90,90]
          sizes:
          - !Vector3 {x: 1, y: 5, z: 9}
          - !Vector3 {x: 1, y: 5, z: 9}
          - !Vector3 {x: 1, y: 5, z: 9}
        - !Item
          name: GoodGoal
          positions:
          - !Vector3 {x: -1, y: 0, z: 35}
          sizes:
          - !Vector3 {x: 2, y: 2, z: 2}
        - !Item
          name: Agent
          positions:
          - !Vector3 {x: -1, y: 1, z: 5}
\ No newline at end of file
!ArenaConfig
arenas:
    0: !Arena
        t: 400
        items:
        - !Item
          name: Wall
          positions:
          - !Vector3 {x: -1, y: 0, z: 10}
          - !Vector3 {x: -1, y: 0, z: 20}
          - !Vector3 {x: -1, y: 0, z: 30}
          - !Vector3 {x: 10, y: 0, z: -1}
          colors:
          rotations: [90,90,90,0]
          sizes:
          - !Vector3 {x: 1, y: 5, z: 9}
          - !Vector3 {x: 1, y: 5, z: 9}
          - !Vector3 {x: 1, y: 5, z: 9}
          - !Vector3 {x: 1, y: 5, z: 9}
        - !Item
          name: GoodGoal
          positions:
          - !Vector3 {x: -1, y: 0, z: 35}
          sizes:
          - !Vector3 {x: 2, y: 2, z: 2}
        - !Item
          name: Agent
          positions:
          - !Vector3 {x: -1, y: 1, z: 5}
\ No newline at end of file