reagent.gym package

Subpackages

Submodules

reagent.gym.types module

class reagent.gym.types.GaussianSamplerScore(loc: torch.Tensor, scale_log: torch.Tensor)

Bases: reagent.base_dataclass.BaseDataClass
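The scale is stored as a log, presumably so that any real-valued score maps to a strictly positive standard deviation. A minimal sketch of how such a score could be consumed, using plain Python floats in place of the documented `torch.Tensor` fields (the `sample_gaussian` helper is hypothetical, not part of the API):

```python
import math
import random
from dataclasses import dataclass

@dataclass
class GaussianSamplerScore:
    # Plain floats stand in for the documented torch.Tensor fields
    loc: float        # mean of the Gaussian
    scale_log: float  # log of the standard deviation

def sample_gaussian(score: GaussianSamplerScore) -> float:
    # exp() guarantees a strictly positive standard deviation
    scale = math.exp(score.scale_log)
    return random.gauss(score.loc, scale)
```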

class reagent.gym.types.Sampler

Bases: abc.ABC

Given scores, select the action.

abstract log_prob(scores: Any, action: torch.Tensor) → torch.Tensor
abstract sample_action(scores: Any) → reagent.types.ActorOutput
update() → None

Call to update internal parameters (e.g., decay epsilon).
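A concrete Sampler subclass might look like the following epsilon-greedy sketch. It is illustrative only: plain lists and floats stand in for `torch.Tensor` and `reagent.types.ActorOutput`, and the name `EpsilonGreedySampler` is hypothetical, not part of the library.

```python
import abc
import math
import random
from typing import Any, List

class Sampler(abc.ABC):
    """Given scores, select the action."""

    @abc.abstractmethod
    def sample_action(self, scores: Any) -> Any: ...

    @abc.abstractmethod
    def log_prob(self, scores: Any, action: Any) -> float: ...

    def update(self) -> None:
        """Call to update internal parameters (e.g., decay epsilon)."""

class EpsilonGreedySampler(Sampler):
    # With probability epsilon pick uniformly at random,
    # otherwise pick the argmax of the scores.
    def __init__(self, epsilon: float = 1.0, decay: float = 0.99):
        self.epsilon = epsilon
        self.decay = decay

    def sample_action(self, scores: List[float]) -> int:
        if random.random() < self.epsilon:
            return random.randrange(len(scores))
        return max(range(len(scores)), key=scores.__getitem__)

    def log_prob(self, scores: List[float], action: int) -> float:
        # P(action) = epsilon/n for any action, plus (1 - epsilon)
        # concentrated on the greedy action.
        n = len(scores)
        greedy = max(range(n), key=scores.__getitem__)
        p = self.epsilon / n + (1.0 - self.epsilon) * (action == greedy)
        return math.log(p)

    def update(self) -> None:
        # Decay epsilon, as suggested by the update() docstring
        self.epsilon *= self.decay
```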

reagent.gym.types.TrainerPreprocessor = typing.Callable[[typing.Any], reagent.types.PreprocessedTrainingBatch]

Called after env.step(action). Args: (state, action, reward, terminal, log_prob).
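A TrainerPreprocessor collates raw experience into a training batch. A rough sketch of the idea, using a plain dict in place of `reagent.types.PreprocessedTrainingBatch` (the tuple layout follows the args listed above; `batch_preprocessor` and `RawTransition` are hypothetical names):

```python
from typing import Any, Dict, List, Tuple

# Hypothetical raw transition: (state, action, reward, terminal, log_prob)
RawTransition = Tuple[Any, Any, float, bool, float]

def batch_preprocessor(transitions: List[RawTransition]) -> Dict[str, list]:
    # Collate per-field lists; a real TrainerPreprocessor would instead
    # return a reagent.types.PreprocessedTrainingBatch of tensors.
    states, actions, rewards, terminals, log_probs = zip(*transitions)
    return {
        "state": list(states),
        "action": list(actions),
        "reward": list(rewards),
        "terminal": list(terminals),
        "log_prob": list(log_probs),
    }
```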

class reagent.gym.types.Trajectory(transitions: List[reagent.gym.types.Transition] = <factory>)

Bases: reagent.base_dataclass.BaseDataClass

add_transition(transition: reagent.gym.types.Transition)
calculate_cumulative_reward(gamma: float = 1.0)

Return (discounted) sum of rewards.
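The two methods above can be sketched as follows, with Transition trimmed to the fields used here and plain floats in place of tensors. The discounted sum is accumulated right to left, so each step multiplies the running total by gamma exactly once:

```python
import dataclasses
from typing import Any, List

@dataclasses.dataclass
class Transition:
    # Trimmed to the fields this sketch needs
    observation: Any
    action: Any
    reward: float
    terminal: bool

@dataclasses.dataclass
class Trajectory:
    transitions: List[Transition] = dataclasses.field(default_factory=list)

    def add_transition(self, transition: Transition) -> None:
        self.transitions.append(transition)

    def calculate_cumulative_reward(self, gamma: float = 1.0) -> float:
        # Discounted sum of rewards: r_0 + gamma*r_1 + gamma^2*r_2 + ...
        total = 0.0
        for t in reversed(self.transitions):
            total = t.reward + gamma * total
        return total
```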

class reagent.gym.types.Transition(mdp_id: int, sequence_number: int, observation: Any, action: Any, reward: float, terminal: bool, log_prob: Optional[float] = None, possible_actions: Optional[List[int]] = None, possible_actions_mask: Optional[List[int]] = None)

Bases: reagent.base_dataclass.BaseDataClass

asdict()
log_prob = None
possible_actions = None
possible_actions_mask = None
reagent.gym.types.get_optional_fields(cls) → List[str]

Return a list of the fields annotated as Optional.
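One way to implement such a helper (a sketch, not necessarily ReAgent's implementation) is to inspect the dataclass annotations with `typing.get_origin`/`get_args`, since `Optional[X]` expands to `Union[X, None]`:

```python
import dataclasses
from typing import List, Optional, Union, get_args, get_origin

def get_optional_fields(cls) -> List[str]:
    # A field counts as optional when its annotation is Union[..., None],
    # which is what Optional[X] expands to.
    return [
        f.name
        for f in dataclasses.fields(cls)
        if get_origin(f.type) is Union and type(None) in get_args(f.type)
    ]

@dataclasses.dataclass
class Transition:
    # Trimmed version of the documented class, for illustration
    reward: float
    log_prob: Optional[float] = None
    possible_actions: Optional[List[int]] = None
```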

reagent.gym.utils module

Module contents