reagent.model_managers.discrete package

Submodules

reagent.model_managers.discrete.discrete_c51dqn module

class reagent.model_managers.discrete.discrete_c51dqn.DiscreteC51DQN(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.C51TrainerParameters = <factory>, net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
build_serving_module(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module

Returns a TorchScript predictor module

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.c51_trainer.C51Trainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union
net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union
property rl_parameters
trainer_param: reagent.training.parameters.C51TrainerParameters
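
Every field of DiscreteC51DQN has a default (the <factory> values above), so a manager can typically be instantiated directly and then asked to build its trainer and its TorchScript serving module. A minimal sketch, assuming a normalization_data_map has already been produced by the preprocessing workflow (its keys and contents are not specified on this page):

    from typing import Dict
    from reagent.core.parameters import NormalizationData
    from reagent.model_managers.discrete.discrete_c51dqn import DiscreteC51DQN

    manager = DiscreteC51DQN()  # all constructor fields fall back to their factory defaults

    # Assumption: built elsewhere by the data preprocessing / normalization workflow.
    normalization_data_map: Dict[str, NormalizationData] = ...

    # Build a C51Trainer from the config and the normalization data.
    trainer = manager.build_trainer(
        normalization_data_map=normalization_data_map,
        use_gpu=False,
        reward_options=None,
    )

    # After training, export a TorchScript predictor module for serving.
    serving_module = manager.build_serving_module(trainer, normalization_data_map)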

reagent.model_managers.discrete.discrete_crr module

class reagent.model_managers.discrete.discrete_crr.ActorDQN(actor)

Bases: reagent.models.base.ModelBase

forward(state)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

input_prototype()

This function provides the input for ONNX graph tracing.

The return value should be what is expected by forward().

training: bool
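
ActorDQN adapts an actor network to the ModelBase interface so the actor can be scored and traced like a DQN. A minimal sketch, assuming actor is an existing ReAgent actor network; whether the prototype needs to be unpacked before the call depends on the concrete actor:

    from reagent.model_managers.discrete.discrete_crr import ActorDQN

    actor = ...  # assumption: an existing actor network (reagent.models.base.ModelBase)
    actor_dqn = ActorDQN(actor)

    # input_prototype() supplies an example input for ONNX/TorchScript graph tracing.
    example_input = actor_dqn.input_prototype()

    # Call the module instance rather than .forward() so registered hooks run.
    output = actor_dqn(example_input)
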
class reagent.model_managers.discrete.discrete_crr.ActorPolicyWrapper(actor_network)

Bases: reagent.gym.policies.policy.Policy

The actor's forward function is our act().

act(obs: reagent.core.types.FeatureData, possible_actions_mask: Optional[torch.Tensor] = None) reagent.core.types.ActorOutput

Performs the composition described above. These are the actions being put into the replay buffer, not necessarily the actions taken by the environment!
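
ActorPolicyWrapper exposes the actor's forward pass as a gym Policy whose act() produces the ActorOutput written to the replay buffer. A minimal sketch; the FeatureData construction and the state dimension are assumptions made for illustration:

    import torch
    from reagent.core import types as rlt
    from reagent.model_managers.discrete.discrete_crr import ActorPolicyWrapper

    actor_network = ...  # assumption: an actor network built by the model manager
    policy = ActorPolicyWrapper(actor_network)

    state_dim = 4  # illustrative state dimension
    obs = rlt.FeatureData(float_features=torch.randn(1, state_dim))

    actor_output = policy.act(obs, possible_actions_mask=None)
    # actor_output is a reagent.core.types.ActorOutput; it carries the action that
    # would be stored in the replay buffer, not necessarily the action the
    # environment executes.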

class reagent.model_managers.discrete.discrete_crr.DiscreteCRR(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.CRRTrainerParameters = <factory>, actor_net_builder: reagent.net_builder.unions.DiscreteActorNetBuilder__Union = <factory>, critic_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
actor_net_builder: reagent.net_builder.unions.DiscreteActorNetBuilder__Union
build_actor_module(trainer_module: reagent.training.discrete_crr_trainer.DiscreteCRRTrainer, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module
build_serving_modules(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData])

actor_dqn is the actor module wrapped in the DQN predictor wrapper. This allows the actor to be used wherever a DQN predictor wrapper is expected. If the policy is greedy, this wrapper behaves correctly.

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.discrete_crr_trainer.DiscreteCRRTrainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
create_policy(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, serving: bool = False, normalization_data_map: Optional[Dict[str, reagent.core.parameters.NormalizationData]] = None) reagent.gym.policies.policy.Policy

Create an online actor-critic policy.

critic_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
eval_parameters: reagent.core.parameters.EvaluationParameters
get_reporter()
property rl_parameters
serving_module_names()

Returns the keys that build_serving_modules() would return. This method is required because entity IDs must be reserved for these serving modules before training starts.

trainer_param: reagent.training.parameters.CRRTrainerParameters
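
DiscreteCRR combines an actor net builder with a critic net builder, and its serving side therefore exposes several modules rather than one. A sketch of the intended flow, again assuming a pre-built normalization_data_map:

    from typing import Dict
    from reagent.core.parameters import NormalizationData
    from reagent.model_managers.discrete.discrete_crr import DiscreteCRR

    manager = DiscreteCRR()  # factory defaults for the actor, critic and CPE net builders

    normalization_data_map: Dict[str, NormalizationData] = ...  # assumption: from preprocessing

    trainer = manager.build_trainer(normalization_data_map, use_gpu=False)

    # Entity IDs are reserved from these names before training starts ...
    names = manager.serving_module_names()
    # ... and the matching TorchScript modules (including the actor wrapped in the
    # DQN predictor wrapper) are built once training has finished.
    modules = manager.build_serving_modules(trainer, normalization_data_map)

    # Online actor-critic policy for simulation or evaluation loops.
    policy = manager.create_policy(
        trainer, serving=False, normalization_data_map=normalization_data_map
    )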

reagent.model_managers.discrete.discrete_dqn module

class reagent.model_managers.discrete.discrete_dqn.DiscreteDQN(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.DQNTrainerParameters = <factory>, net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
build_serving_module(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module

Returns a TorchScript predictor module

build_serving_modules(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData])

Returns TorchScript modules for serving in production

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.dqn_trainer.DQNTrainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
get_reporter()
net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
property rl_parameters
serving_module_names()

Returns the keys that build_serving_modules() would return. This method is required because entity IDs must be reserved for these serving modules before training starts.

trainer_param: reagent.training.parameters.DQNTrainerParameters
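
DiscreteDQN follows the same pattern; besides the single TorchScript predictor it also provides a reporter via get_reporter(). A minimal sketch under the same assumptions as above:

    from typing import Dict
    from reagent.core.parameters import NormalizationData
    from reagent.model_managers.discrete.discrete_dqn import DiscreteDQN

    manager = DiscreteDQN()
    normalization_data_map: Dict[str, NormalizationData] = ...  # assumption: from preprocessing

    trainer = manager.build_trainer(normalization_data_map, use_gpu=False)
    reporter = manager.get_reporter()  # obtain the manager's reporter

    # Single TorchScript predictor module for production serving.
    predictor = manager.build_serving_module(trainer, normalization_data_map)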

reagent.model_managers.discrete.discrete_qrdqn module

class reagent.model_managers.discrete.discrete_qrdqn.DiscreteQRDQN(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.QRDQNTrainerParameters = <factory>, net_builder: reagent.net_builder.unions.QRDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
build_serving_module(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module

Returns a TorchScript predictor module

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.qrdqn_trainer.QRDQNTrainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
net_builder: reagent.net_builder.unions.QRDQNNetBuilder__Union
property rl_parameters
trainer_param: reagent.training.parameters.QRDQNTrainerParameters
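
DiscreteQRDQN differs from DiscreteDQN mainly in its configuration types: trainer_param is a QRDQNTrainerParameters and net_builder comes from the QRDQNNetBuilder__Union, while the CPE network still uses the DiscreteDQNNetBuilder__Union. The build/serve flow is the same sketch as above:

    from typing import Dict
    from reagent.core.parameters import NormalizationData
    from reagent.model_managers.discrete.discrete_qrdqn import DiscreteQRDQN

    manager = DiscreteQRDQN()  # QRDQN net builder for the Q-network, DQN builder for CPE
    normalization_data_map: Dict[str, NormalizationData] = ...  # assumption: from preprocessing

    trainer = manager.build_trainer(normalization_data_map, use_gpu=False)  # QRDQNTrainer
    serving_module = manager.build_serving_module(trainer, normalization_data_map)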

Module contents

class reagent.model_managers.discrete.DiscreteC51DQN(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.C51TrainerParameters = <factory>, net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
build_serving_module(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module

Returns a TorchScript predictor module

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.c51_trainer.C51Trainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union
eval_parameters: EvaluationParameters
net_builder: reagent.net_builder.unions.CategoricalDQNNetBuilder__Union
property rl_parameters
state_feature_config_provider: ModelFeatureConfigProvider__Union
trainer_param: reagent.training.parameters.C51TrainerParameters
class reagent.model_managers.discrete.DiscreteCRR(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.CRRTrainerParameters = <factory>, actor_net_builder: reagent.net_builder.unions.DiscreteActorNetBuilder__Union = <factory>, critic_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
actor_net_builder: reagent.net_builder.unions.DiscreteActorNetBuilder__Union
build_actor_module(trainer_module: reagent.training.discrete_crr_trainer.DiscreteCRRTrainer, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module
build_serving_modules(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData])

actor_dqn is the actor module wrapped in the DQN predictor wrapper. This allows the actor to be used wherever a DQN predictor wrapper is expected. If the policy is greedy, this wrapper behaves correctly.

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.discrete_crr_trainer.DiscreteCRRTrainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
create_policy(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, serving: bool = False, normalization_data_map: Optional[Dict[str, reagent.core.parameters.NormalizationData]] = None) reagent.gym.policies.policy.Policy

Create an online actor-critic policy.

critic_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
eval_parameters: reagent.core.parameters.EvaluationParameters
get_reporter()
property rl_parameters
serving_module_names()

Returns the keys that build_serving_modules() would return. This method is required because entity IDs must be reserved for these serving modules before training starts.

state_feature_config_provider: ModelFeatureConfigProvider__Union
trainer_param: reagent.training.parameters.CRRTrainerParameters
class reagent.model_managers.discrete.DiscreteDQN(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.DQNTrainerParameters = <factory>, net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
build_serving_module(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module

Returns a TorchScript predictor module

build_serving_modules(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData])

Returns TorchScript modules for serving in production

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.dqn_trainer.DQNTrainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
eval_parameters: EvaluationParameters
get_reporter()
net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
property rl_parameters
serving_module_names()

Returns the keys that build_serving_modules() would return. This method is required because entity IDs must be reserved for these serving modules before training starts.

state_feature_config_provider: ModelFeatureConfigProvider__Union
trainer_param: reagent.training.parameters.DQNTrainerParameters
class reagent.model_managers.discrete.DiscreteQRDQN(target_action_distribution: Optional[List[float]] = None, state_feature_config_provider: reagent.workflow.types.ModelFeatureConfigProvider__Union = <factory>, preprocessing_options: Optional[reagent.workflow.types.PreprocessingOptions] = None, reader_options: Optional[reagent.workflow.types.ReaderOptions] = None, eval_parameters: reagent.core.parameters.EvaluationParameters = <factory>, trainer_param: reagent.training.parameters.QRDQNTrainerParameters = <factory>, net_builder: reagent.net_builder.unions.QRDQNNetBuilder__Union = <factory>, cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union = <factory>)

Bases: reagent.model_managers.discrete_dqn_base.DiscreteDQNBase

property action_names
build_serving_module(trainer_module: reagent.training.reagent_lightning_module.ReAgentLightningModule, normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData]) torch.nn.modules.module.Module

Returns a TorchScript predictor module

build_trainer(normalization_data_map: Dict[str, reagent.core.parameters.NormalizationData], use_gpu: bool, reward_options: Optional[reagent.workflow.types.RewardOptions] = None) reagent.training.qrdqn_trainer.QRDQNTrainer

Implement this to build the trainer, given the config

TODO: This function should return ReAgentLightningModule & the dictionary of modules created

cpe_net_builder: reagent.net_builder.unions.DiscreteDQNNetBuilder__Union
eval_parameters: EvaluationParameters
net_builder: reagent.net_builder.unions.QRDQNNetBuilder__Union
property rl_parameters
state_feature_config_provider: ModelFeatureConfigProvider__Union
trainer_param: reagent.training.parameters.QRDQNTrainerParameters
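
The package re-exports the four managers documented above, so they can be imported directly from the package namespace; a short sketch:

    from reagent.model_managers.discrete import (
        DiscreteC51DQN,
        DiscreteCRR,
        DiscreteDQN,
        DiscreteQRDQN,
    )

    # Each class is the same manager documented in its submodule above.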