ReplayBuffers¶
All replay buffers are derived from nnabla_rl.replay_buffer.ReplayBuffer
ReplayBuffer¶
- class nnabla_rl.replay_buffer.ReplayBuffer(capacity: Optional[int] = None)[source]¶
- append(experience: Tuple[Type[numpy.array], Type[numpy.array], float, float, Type[numpy.array], Dict[str, Any]])[source]¶
Add new experience to the replay buffer.
- Parameters
experience (array-like) – Experience includes transitions such as state, action, reward, and whether the episode has terminated. See the [Replay buffer documents](replay_buffer.md) for more information.
Notes
If the replay buffer is full, the oldest experience (at the head of the buffer) will be dropped and the given experience will be appended to the tail of the buffer.
- append_all(experiences: Sequence[Tuple[Type[numpy.array], Type[numpy.array], float, float, Type[numpy.array], Dict[str, Any]]])[source]¶
Add list of experiences to the replay buffer.
- Parameters
experiences (Sequence[Experience]) – Sequence of experiences to insert into the buffer
Notes
If the replay buffer is full, the oldest experience (at the head of the buffer) will be dropped and the given experiences will be appended to the tail of the buffer.
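The drop-oldest behavior described in the notes above can be sketched with a fixed-size deque. This is an illustrative sketch, not the library's internals; the capacity and the experience tuple layout are assumptions for the example:

```python
from collections import deque

# Minimal sketch of a capacity-bounded replay buffer: appending to a
# full buffer silently drops the oldest (head) experience.
capacity = 3
buffer = deque(maxlen=capacity)

# Each experience is assumed to be a
# (state, action, reward, non_terminal, next_state, info) tuple.
for step in range(5):
    experience = (f"s{step}", f"a{step}", 0.0, 1.0, f"s{step + 1}", {})
    buffer.append(experience)

print(len(buffer))    # → 3 (capacity is never exceeded)
print(buffer[0][0])   # → 's2' (the two oldest experiences were dropped)
```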
- property capacity: Optional[int]¶
Capacity (max length) of this replay buffer, or None if the buffer is unbounded
- sample(num_samples: int = 1) → Tuple[Sequence[Tuple[Type[numpy.array], Type[numpy.array], float, float, Type[numpy.array], Dict[str, Any]]], Dict[str, Any]][source]¶
Randomly sample num_samples experiences from the replay buffer.
- Parameters
num_samples (int) – Number of samples to sample from the replay buffer.
- Returns
num_samples randomly sampled experiences, together with info (Dict[str, Any]), a dictionary of information about the experiences.
- Return type
experiences (Sequence[Experience])
Notes
Sampling strategy depends on the underlying implementation.
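A sketch of the documented (experiences, info) return shape, using uniform sampling with replacement for simplicity. The helper function and buffer contents are illustrative assumptions, not the library's implementation:

```python
import random

# Sketch of uniform sampling from a replay buffer. The (experiences, info)
# return shape mirrors the documented signature; subclasses may fill info
# with e.g. importance weights.
def sample(buffer, num_samples=1, rng=random):
    # Sample with replacement for simplicity; the actual strategy depends
    # on the underlying implementation.
    indices = rng.choices(range(len(buffer)), k=num_samples)
    experiences = [buffer[i] for i in indices]
    info = {}
    return experiences, info

buffer = [("s0", "a0", 1.0, 1.0, "s1", {}), ("s1", "a1", 0.5, 1.0, "s2", {})]
experiences, info = sample(buffer, num_samples=4, rng=random.Random(0))
print(len(experiences))  # → 4
```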
- sample_indices(indices: Sequence[int]) → Tuple[Sequence[Tuple[Type[numpy.array], Type[numpy.array], float, float, Type[numpy.array], Dict[str, Any]]], Dict[str, Any]][source]¶
Sample experiences for given indices from the replay buffer.
- Parameters
indices (array-like) – indices of the experiences to sample
- Returns
Sample of experiences for given indices.
- Return type
experiences (array-like)
- Raises
ValueError – If indices are empty
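Index-based lookup and the documented ValueError on empty indices can be sketched as follows; the helper function is an illustrative assumption, not the library's implementation:

```python
# Sketch of index-based sampling: empty indices are rejected, mirroring
# the documented ValueError; everything else is illustrative.
def sample_indices(buffer, indices):
    if len(indices) == 0:
        raise ValueError("indices must not be empty")
    experiences = [buffer[i] for i in indices]
    return experiences, {}

buffer = [("s0",), ("s1",), ("s2",)]
experiences, _ = sample_indices(buffer, [2, 0])
print([e[0] for e in experiences])  # → ['s2', 's0']
```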
List of ReplayBuffer¶
- class nnabla_rl.replay_buffers.DecorableReplayBuffer(capacity, decor_fun)[source]¶
Bases:
nnabla_rl.replay_buffer.ReplayBuffer
Buffer which can decorate the experience with external decoration function
This buffer decorates each experience before it is used to build a batch. The decoration function is called whenever __getitem__ is called. You can use this buffer to augment the data or to add noise to the experience.
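The decorate-on-access pattern can be sketched as below. The class, its internals, and the reward-scaling decoration are illustrative assumptions; only the idea that decor_fun runs on each read matches the description above:

```python
# Illustrative sketch of the decorate-on-access pattern: the stored
# experience stays untouched, and decor_fun runs each time an item is read.
class DecoratedBuffer:
    def __init__(self, capacity, decor_fun):
        self._buffer = []
        self._capacity = capacity
        self._decor_fun = decor_fun

    def append(self, experience):
        if len(self._buffer) == self._capacity:
            self._buffer.pop(0)  # drop the oldest experience
        self._buffer.append(experience)

    def __getitem__(self, index):
        # Decoration happens on access, not on insertion.
        return self._decor_fun(self._buffer[index])

# Hypothetical decoration: scale the reward (index 2 of the tuple).
double_reward = lambda e: (e[0], e[1], e[2] * 2.0, e[3], e[4], e[5])
buffer = DecoratedBuffer(capacity=10, decor_fun=double_reward)
buffer.append(("s0", "a0", 1.0, 1.0, "s1", {}))
print(buffer[0][2])  # → 2.0
```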
- class nnabla_rl.replay_buffers.MemoryEfficientAtariBuffer(capacity)[source]¶
Bases:
nnabla_rl.replay_buffer.ReplayBuffer
Buffer designed to compactly save experiences of Atari environments used in DQN.
DQN (and other training algorithms) requires a large replay buffer when training on Atari games. If you naively save the experiences, you will need more than 100GB to save them (assuming 1M experiences), which usually does not fit in the machine’s memory (unless you have money:). This replay buffer reduces the size of each experience by casting the images to uint8 and by not duplicating the old frames concatenated to the observation. By using this buffer, you can hold 1M experiences using only 20GB (approx.) of memory.
Note that this class is designed only for DQN-style training on Atari environments (i.e. the state consists of 4 concatenated grayscale frames and its values are normalized between 0 and 1).
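The savings can be estimated with rough arithmetic. The sketch below assumes 84x84 observations stacked 4 frames deep, with the naive layout storing float32 state and next state per experience and the compact layout storing one uint8 frame per step (stacks are rebuilt on access); the quoted figures above additionally include per-experience bookkeeping, so these are ballpark numbers only:

```python
# Back-of-envelope memory sizes for 1M Atari experiences.
frames, height, width = 4, 84, 84
experiences = 1_000_000

float32_bytes = 4
# Naive: state + next_state, each a 4-frame float32 stack, per experience.
naive = experiences * 2 * frames * height * width * float32_bytes
# Compact: one uint8 frame per step; frame stacks are rebuilt on access.
compact = experiences * height * width

print(naive / 1e9)    # → 225.792 (GB)
print(compact / 1e9)  # → 7.056 (GB)
```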
- append(experience)[source]¶
Add new experience to the replay buffer.
- Parameters
experience (array-like) – Experience includes transitions such as state, action, reward, and whether the episode has terminated. See the [Replay buffer documents](replay_buffer.md) for more information.
Notes
If the replay buffer is full, the oldest experience (at the head of the buffer) will be dropped and the given experience will be appended to the tail of the buffer.
- append_all(experiences)[source]¶
Add list of experiences to the replay buffer.
- Parameters
experiences (Sequence[Experience]) – Sequence of experiences to insert into the buffer
Notes
If the replay buffer is full, the oldest experience (at the head of the buffer) will be dropped and the given experiences will be appended to the tail of the buffer.
- class nnabla_rl.replay_buffers.PrioritizedReplayBuffer(capacity, alpha=0.6, beta=0.4, betasteps=10000, epsilon=1e-08)[source]¶
Bases:
nnabla_rl.replay_buffer.ReplayBuffer
- append(experience)[source]¶
Add new experience to the replay buffer.
- Parameters
experience (array-like) – Experience includes transitions such as state, action, reward, and whether the episode has terminated. See the [Replay buffer documents](replay_buffer.md) for more information.
Notes
If the replay buffer is full, the oldest experience (at the head of the buffer) will be dropped and the given experience will be appended to the tail of the buffer.
- sample(num_samples=1)[source]¶
Randomly sample num_samples experiences from the replay buffer.
- Parameters
num_samples (int) – Number of samples to sample from the replay buffer.
- Returns
num_samples randomly sampled experiences, together with info (Dict[str, Any]), a dictionary of information about the experiences.
- Return type
experiences (Sequence[Experience])
Notes
Sampling strategy depends on the underlying implementation.
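The alpha, beta, and epsilon parameters suggest the standard proportional prioritization scheme: sampling probability follows priority^alpha, and importance-sampling weights (N * P(i))^(-beta) correct the resulting bias. The sketch below implements that standard scheme as an assumption; the class's exact internals (e.g. its tree structure and beta annealing over betasteps) may differ:

```python
import random

# Sketch of proportional prioritized sampling: probabilities follow
# priority^alpha, and importance-sampling weights (N * P(i))^(-beta),
# normalized by their maximum, correct the sampling bias.
def prioritized_sample(buffer, priorities, num_samples, alpha=0.6, beta=0.4,
                       epsilon=1e-8, rng=random):
    scaled = [(p + epsilon) ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]
    indices = rng.choices(range(len(buffer)), weights=probs, k=num_samples)
    n = len(buffer)
    weights = [(n * probs[i]) ** -beta for i in indices]
    max_w = max((n * p) ** -beta for p in probs)
    weights = [w / max_w for w in weights]  # normalize so weights <= 1
    experiences = [buffer[i] for i in indices]
    return experiences, {"weights": weights, "indices": indices}

buffer = ["e0", "e1", "e2"]
experiences, info = prioritized_sample(buffer, [0.1, 1.0, 5.0], num_samples=2,
                                       rng=random.Random(0))
print(all(w <= 1.0 for w in info["weights"]))  # → True
```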