Commit 4b17ea66 authored by Benjamin

training jupyter notebook

parent f8fccea4
......@@ -5,6 +5,8 @@ examples/env/*
 models/
 summaries/
 logs/
+examples/models/*
+!examples/models/self_control_curriculum_pre_trained
 # Environment logfile
 *Project.log
......
......@@ -2,7 +2,7 @@ from setuptools import setup, find_packages
 setup(
     name="animalai",
-    version="2.0.0b0",
+    version="2.0.0b3",
     description="Animal AI environment Python API",
     url="https://github.com/beyretb/AnimalAI-Olympics",
     author="Benjamin Beyret",
......
......@@ -62,13 +62,22 @@ def run_training_aai(run_seed: int, options: RunOptionsAAI) -> None:
         options.arena_config,
         options.resolution,
     )
-    engine_config = EngineConfig(
-        options.width,
-        options.height,
-        AnimalAIEnvironment.QUALITY_LEVEL.train,
-        AnimalAIEnvironment.TIMESCALE.train,
-        AnimalAIEnvironment.TARGET_FRAME_RATE.train,
-    )
+    if options.train_model:
+        engine_config = EngineConfig(
+            options.width,
+            options.height,
+            AnimalAIEnvironment.QUALITY_LEVEL.train,
+            AnimalAIEnvironment.TIMESCALE.train,
+            AnimalAIEnvironment.TARGET_FRAME_RATE.train,
+        )
+    else:
+        engine_config = EngineConfig(
+            AnimalAIEnvironment.WINDOW_WIDTH.play,
+            AnimalAIEnvironment.WINDOW_HEIGHT.play,
+            AnimalAIEnvironment.QUALITY_LEVEL.play,
+            AnimalAIEnvironment.TIMESCALE.play,
+            AnimalAIEnvironment.TARGET_FRAME_RATE.play,
+        )
     env_manager = SubprocessEnvManagerAAI(
         env_factory, engine_config, options.num_envs
     )
......
......@@ -2,7 +2,7 @@ from setuptools import setup, find_packages
 setup(
     name="animalai_train",
-    version="2.0.0b0",
+    version="2.0.0b3",
     description="Animal AI training library",
     url="https://github.com/beyretb/AnimalAI-Olympics",
     author="Benjamin Beyret",
......@@ -16,6 +16,6 @@ setup(
     ],
     packages=find_packages(exclude=["*.tests", "*.tests.*", "tests.*", "tests"]),
     zip_safe=False,
-    install_requires=["animalai==2.0.0b0", "mlagents==0.15.0"],
+    install_requires=["animalai==2.0.0b3", "mlagents==0.15.0"],
     python_requires=">=3.6.1",
 )
-TODO
-# Examples
-Detail the various files in this folder
-- load_and_play:
-- train_ml_agents
-- ...
\ No newline at end of file
## Notebooks
To run the notebooks, simply install the requirements by running (we recommend using a virtual environment):
```
pip install -r requirements.txt
```
Then you can start a jupyter notebook by running `jupyter notebook` from your terminal.
## Designing arenas
You can use `load_config_and_play.py` to visualize a `yml` configuration for an environment arena. Make sure `animalai`
is [installed](../README.md#requirements), then run `python load_config_and_play.py your_configuration_file.yml`, which will open the environment in
play mode (control with W, A, S, D or the arrow keys). Close the environment by pressing CTRL+C in the terminal.
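Under the hood, the script boils down to something like the following minimal sketch (the keyword arguments and the `env/AnimalAI` binary path are assumptions, so check `load_config_and_play.py` for the exact API):
```python
# Minimal sketch of playing an arena configuration yourself.
# Keyword names and the binary path are assumptions, not the script's exact code.
import sys

from animalai.envs.arena_config import ArenaConfig
from animalai.envs.environment import AnimalAIEnvironment

env = AnimalAIEnvironment(
    file_name="env/AnimalAI",  # path to the environment binary (assumed)
    arenas_configurations=ArenaConfig(sys.argv[1]),  # your_configuration_file.yml
    play=True,  # play mode: drive the agent with W, A, S, D or the arrows
)
try:
    while True:
        pass  # the window stays open until you interrupt
except KeyboardInterrupt:
    env.close()  # CTRL+C in the terminal closes the environment
```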
## Animalai-train examples
We provide two scripts which show how to use `animalai_train` to train agents (a rough sketch of the entry point follows this list):
- `train_ml_agents.py` uses ml-agents' PPO implementation (or SAC) and can run multiple environments in parallel to speed up
the training process
- `train_curriculum.py` shows how you can add a curriculum to your training loop
To run either of these make sure you have `animalai-train` [installed](../README.md#requirements).
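For orientation, `train_ml_agents.py` essentially builds a `RunOptionsAAI` and passes it to `run_training_aai`, the function patched in the diff above. A minimal sketch follows; only `arena_config`, `num_envs` and `train_model` are confirmed by that diff, and the other fields, paths and module locations are assumptions:
```python
# Hedged sketch of launching training with animalai_train; only arena_config,
# num_envs and train_model are confirmed by the diff above, the other
# fields and module paths are assumptions.
from animalai.envs.arena_config import ArenaConfig
from animalai_train.run_options_aai import RunOptionsAAI
from animalai_train.run_training_aai import run_training_aai

options = RunOptionsAAI(
    trainer_config_path="configurations/train_ml_agents_config_ppo.yaml",  # assumed
    env_path="env/AnimalAI",  # assumed
    run_id="self_control_ppo",  # assumed
    arena_config=ArenaConfig("configurations/arenas/self_control.yml"),  # assumed path
    num_envs=4,  # several environments in parallel speed up training
    train_model=True,  # selects the train-time EngineConfig (see diff above)
)
run_training_aai(run_seed=0, options=options)
```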
## OpenAI Gym and Baselines
You can use the OpenAI Gym interface to train using Baselines or other similar libraries (including
[Dopamine](https://github.com/google/dopamine) and [Stable Baselines](https://github.com/hill-a/stable-baselines)). To
do so you'll need to install the dependencies below.
On Linux:
```
sudo apt-get update && sudo apt-get install cmake libopenmpi-dev python3-dev zlib1g-dev &&
pip install tensorflow==1.14 &&
pip install git+https://github.com/openai/baselines.git@master#egg=baselines-0.1.6
```
On Mac: TODO
You can then run `train_baselines_dqn.py` or `train_baselines_ppo2.py` for examples.
\ No newline at end of file
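As a taste of the Baselines route, here is a hedged sketch in the spirit of `train_baselines_dqn.py`; the `AnimalAIGym` wrapper name, its module path and its keyword arguments are assumptions, so refer to the script itself for the real API:
```python
# Hedged sketch: baselines' DQN on an AnimalAI Gym environment.
# AnimalAIGym, its module path and keyword arguments are assumptions.
from animalai.envs.arena_config import ArenaConfig
from animalai.envs.gym.environment import AnimalAIGym  # assumed module path
from baselines import deepq

env = AnimalAIGym(
    environment_filename="env/AnimalAI",  # assumed binary path
    arenas_configurations=ArenaConfig("configurations/arenas/self_control.yml"),  # assumed
)
model = deepq.learn(env, network="cnn", total_timesteps=100_000)
model.save("dqn_animalai.pkl")
env.close()
```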
+!ArenaConfig
+arenas:
+  0: !Arena
+    pass_mark: 0
+    t: 250
+    items:
+    - !Item
+      name: CylinderTunnelTransparent
+      positions:
+      - !Vector3 {x: 20, y: 0, z: 20}
+      rotations: [90]
+      sizes:
+      - !Vector3 {x: 10, y: 10, z: 10}
+    - !Item
+      name: GoodGoal
+      positions:
+      - !Vector3 {x: 20, y: 0, z: 20}
+      sizes:
+      - !Vector3 {x: 1, y: 1, z: 1}
+    - !Item
+      name: Agent
+      positions:
+      - !Vector3 {x: 20, y: 0, z: 1}
+      rotations: [180]
\ No newline at end of file
 !ArenaConfig
 arenas:
-  0: !Arena
+  -1: !Arena
     pass_mark: 0
     t: 250
     items:
......
 !ArenaConfig
 arenas:
-  0: !Arena
+  -1: !Arena
     pass_mark: 0
     t: 250
     items:
......
 !ArenaConfig
 arenas:
-  0: !Arena
+  -1: !Arena
     pass_mark: 0
     t: 250
     items:
......
 !ArenaConfig
 arenas:
-  0: !Arena
+  -1: !Arena
     pass_mark: 0
     t: 250
     items:
......
 !ArenaConfig
 arenas:
-  0: !Arena
+  -1: !Arena
     pass_mark: 0
     t: 250
     items:
......
 !ArenaConfig
 arenas:
-  0: !Arena
+  -1: !Arena
     pass_mark: 0
     t: 250
     items:
......
......@@ -4,8 +4,8 @@
     0.8,
     0.8,
     0.8,
-    0.8,
-    0.8
+    0.6,
+    0.2
   ],
   "min_lesson_length": 100,
   "signal_smoothing": true,
......
......@@ -32,11 +32,11 @@
```
%% Cell type:markdown id: tags:
-## Can your agent self control?
+## Can your agent self control? - Part I
Self control is hard; we've all been there (looking at you, chocolate bar). But don't worry, this is something a lot of species struggle with. In [The evolution of self-control](https://www.pnas.org/content/111/20/E2140) MacLean et al. tested this ability in **36 different species**! In a very simple experiment, animals are offered food they can easily reach. Then they're shown the same food behind a transparent wall: they need to go around the wall to grab it. They can see the food just as well as before, but they need to refrain from reaching straight for it.
Below are videos of such animals, as well as two participants' submissions to our competition, exhibiting similar behaviors (remember, these agents never encountered this task during training):
......@@ -78,10 +78,23 @@
%% Cell type:markdown id: tags:
This file contains the configuration of one arena (`!Arena`), with only the agent on the ground (`y=0`) in the center (`x=20`, `z=20`) and a `GoodGoal` (green sphere) of size 1 in front of it (`x=20`, `z=22`). Pretty simple, right?
One _little trick_ we used here: one environment can contain several arenas during training, each with its own configuration. This allows your training algorithm to collect more observations at once. You can just place the configurations one after the other, like this:
```
!ArenaConfig
arenas:
0: !Arena
......
1: !Arena
......
2: !Arena
......
```
But if you want all the arenas in the environment to share the same configuration, do as we did above: define a single configuration with key `-1`.
You can now use this to load an environment and play yourself ([this script does that for you](./load_config_and_play.py)). Make sure you have followed the [installation guide](../README.md#requirements) and then create an `AnimalAIEnvironment` in play mode:
%% Cell type:code id: tags:
``` python
......@@ -177,10 +190,12 @@
%% Cell type:markdown id: tags:
This tells you that we'll switch from one level to the next once the reward per episode is above 0.8. Easy, right?
In the next notebook we'll use the above curriculum example to train an agent that can solve the tube task we saw in the videos earlier. Before that, it is worth looking at an extra feature of the environment (blackouts) and using it to interact with the environment from Python rather than playing manually.
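To make the structure concrete, here is the shape of such a curriculum expressed as a Python dictionary (`thresholds`, `min_lesson_length` and `signal_smoothing` appear in the diff above; the remaining key follows ml-agents' curriculum format and is illustrative):
``` python
# Illustrative curriculum structure; "thresholds", "min_lesson_length" and
# "signal_smoothing" mirror the diff above, "measure" is an assumption
# based on ml-agents' curriculum format.
curriculum = {
    "measure": "reward",                      # lesson progress is measured by reward
    "thresholds": [0.8, 0.8, 0.8, 0.6, 0.2],  # reward needed to leave each lesson
    "min_lesson_length": 100,                 # episodes to average before switching
    "signal_smoothing": True,                 # smooth the reward signal over time
}
```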
%% Cell type:markdown id: tags:
## Interacting with the environment + bonus light switch!
In this final part, we look at the API for interacting with the environment; namely, we want to take steps and collect observations and rewards. For this part we'll load an environment which tests for a cognitive skill called **object permanence**: the capacity of an agent to understand that an object still exists even when it moves out of sight. Think of a car turning a corner: we all know the car hasn't vanished from existence. This test introduces another feature of the environment, **the light switch**, which lets you turn the light in the environment on and off. Let's have a look at the experiment:
......
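A rough sketch of such an interaction loop, written against the low-level API that ships with mlagents 0.15 (method and attribute names are assumptions recalled from that release, and the paths are made up; the notebook's actual cells are authoritative):
``` python
# Hedged sketch of a step/observe loop using the mlagents 0.15 low-level API.
# The configuration path and binary path are assumptions.
import numpy as np

from animalai.envs.arena_config import ArenaConfig
from animalai.envs.environment import AnimalAIEnvironment

env = AnimalAIEnvironment(
    file_name="env/AnimalAI",  # assumed binary path
    arenas_configurations=ArenaConfig("configurations/arenas/object_permanence.yml"),  # assumed
)
env.reset()
group = env.get_agent_groups()[0]  # a single agent group in this setup

for _ in range(250):
    result = env.get_step_result(group)  # observations and rewards for the group
    visual_obs = result.obs[0]           # first observation is the camera view
    reward = result.reward[0]
    # two discrete action branches (forward/backward, left/right), chosen at random
    actions = np.random.randint(0, 3, size=(result.n_agents(), 2))
    env.set_actions(group, actions)
    env.step()

env.close()
```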
 tensorflow>=1.7,<2.0 # if you wish to run examples using tf>=2.0 change the baselines requirement accordingly
-animalai==2.0.0b0
-animalai_train==2.0.0b0
-baselines # replace with git+https://github.com/openai/baselines.git@tf2 to use tf2
+animalai==2.0.0b1
+animalai_train==2.0.0b2
+jupyter
+matplotlib
\ No newline at end of file