
ReAgent: Applied Reinforcement Learning Platform
Overview
ReAgent is an open-source, end-to-end platform for applied reinforcement learning (RL) developed and used at Facebook. ReAgent is built in Python; it uses PyTorch for modeling and training and TorchScript for model serving. The platform contains workflows to train popular deep RL algorithms and includes data preprocessing, feature transformation, distributed training, counterfactual policy evaluation, and optimized serving. For more detailed information about ReAgent, see the white paper here: Platform.
The source code is available here: Source code.
The platform was previously named “Horizon”, but we recently adopted the name “ReAgent” to emphasize its broader scope in decision making and reasoning.
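The overview lists counterfactual policy evaluation among the platform's features. As a rough illustration of the idea only, the sketch below computes a one-step doubly robust estimate of a target policy's value from logged data. It is not ReAgent's implementation; the function name `doubly_robust_value` and all of its parameters are hypothetical and chosen for this example.

```python
from typing import Sequence


def doubly_robust_value(
    logged_actions: Sequence[int],
    logged_rewards: Sequence[float],
    logged_propensities: Sequence[float],
    target_probs: Sequence[Sequence[float]],
    model_rewards: Sequence[Sequence[float]],
) -> float:
    """One-step doubly robust estimate of a target policy's value.

    All names here are illustrative assumptions, not ReAgent APIs.
    logged_propensities[i] is the logging policy's probability of the
    action it actually took; target_probs[i][a] is the target policy's
    probability of action a; model_rewards[i][a] is a reward model's
    prediction for action a in the same context.
    """
    total = 0.0
    samples = zip(
        logged_actions,
        logged_rewards,
        logged_propensities,
        target_probs,
        model_rewards,
    )
    for action, reward, propensity, pi, q_hat in samples:
        # Direct-method term: expected model reward under the target policy.
        direct = sum(p * q for p, q in zip(pi, q_hat))
        # Importance-weighted correction for the model's error on the
        # action that was actually logged.
        correction = pi[action] / propensity * (reward - q_hat[action])
        total += direct + correction
    return total / len(logged_actions)
```

When the reward model is exact, the correction term vanishes and the estimate equals the direct-method value; when the model predicts zero everywhere, the estimate reduces to plain inverse propensity scoring.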
Algorithms Supported
Discrete-Action DQN
Parametric-Action DQN
Twin Delayed DDPG (TD3)
Soft Actor-Critic (SAC)
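For readers new to the first algorithm in the list above, the following minimal sketch shows the standard one-step bootstrapped target used in discrete-action Q-learning, written in plain Python. It is an illustration under stated assumptions rather than ReAgent's trainer code; `dqn_targets` and its parameters are hypothetical names, and the discount factor default is arbitrary.

```python
from typing import List, Sequence


def dqn_targets(
    rewards: Sequence[float],
    next_q_values: Sequence[Sequence[float]],
    terminals: Sequence[bool],
    gamma: float = 0.99,
) -> List[float]:
    """Compute r + gamma * max_a' Q(s', a') for each transition.

    next_q_values[i] holds the (target network's) Q-value estimates for
    every discrete action in the next state of transition i. Terminal
    transitions do not bootstrap from the next state.
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, terminals):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets
```

In a training loop, the Q-network's prediction for the logged action would typically be regressed toward these targets.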
Installation
ReAgent can be installed via Docker or manually. Detailed installation instructions can be found here: Installation.
Usage
The ReAgent Serving Platform (RASP) tutorial covers serving and training models and is available here: ReAgent Serving Platform (RASP).
Detailed instructions on how to use ReAgent can be found here: Usage.
Citing
@article{gauci2018horizon,
  title={Horizon: Facebook's Open Source Applied Reinforcement Learning Platform},
  author={Gauci, Jason and Conti, Edoardo and Liang, Yitao and Virochsiri, Kittipat and Chen, Zhengxing and He, Yuchen and Kaden, Zachary and Narayanan, Vivek and Ye, Xiaohui},
  journal={arXiv preprint arXiv:1811.00260},
  year={2018}
}
Table of Contents
Getting Started
Advanced Topics
Package Reference
- Evaluation
- Submodules
- ml.rl.evaluation.cpe module
- ml.rl.evaluation.doubly_robust_estimator module
- ml.rl.evaluation.evaluation_data_page module
- ml.rl.evaluation.evaluator module
- ml.rl.evaluation.ranking_evaluator module
- ml.rl.evaluation.sequential_doubly_robust_estimator module
- ml.rl.evaluation.weighted_sequential_doubly_robust_estimator module
- ml.rl.evaluation.world_model_evaluator module
- Module contents
- Models
- Submodules
- ml.rl.models.actor module
- ml.rl.models.base module
- ml.rl.models.bcq module
- ml.rl.models.categorical_dqn module
- ml.rl.models.cem_planner module
- ml.rl.models.convolutional_network module
- ml.rl.models.dqn module
- ml.rl.models.dueling_q_network module
- ml.rl.models.dueling_quantile_dqn module
- ml.rl.models.example_sequence_model module
- ml.rl.models.fully_connected_network module
- ml.rl.models.mdn_rnn module
- ml.rl.models.no_soft_update_embedding module
- ml.rl.models.parametric_dqn module
- ml.rl.models.quantile_dqn module
- ml.rl.models.seq2slate module
- ml.rl.models.world_model module
- Module contents
- Prediction
- Preprocessing
- Submodules
- ml.rl.preprocessing.batch_preprocessor module
- ml.rl.preprocessing.feature_extractor module
- ml.rl.preprocessing.identify_types module
- ml.rl.preprocessing.normalization module
- ml.rl.preprocessing.postprocessor module
- ml.rl.preprocessing.preprocessor module
- ml.rl.preprocessing.preprocessor_net module
- ml.rl.preprocessing.sparse_to_dense module
- Module contents
- Readers
- Simulators
- Training
- Subpackages
- Submodules
- ml.rl.training.c51_trainer module
- ml.rl.training.cem_trainer module
- ml.rl.training.dqn_predictor module
- ml.rl.training.dqn_trainer module
- ml.rl.training.dqn_trainer_base module
- ml.rl.training.imitator_training module
- ml.rl.training.loss_reporter module
- ml.rl.training.off_policy_predictor module
- ml.rl.training.on_policy_predictor module
- ml.rl.training.parametric_dqn_trainer module
- ml.rl.training.qrdqn_trainer module
- ml.rl.training.rl_dataset module
- ml.rl.training.rl_trainer_pytorch module
- ml.rl.training.sac_trainer module
- ml.rl.training.sandboxed_predictor module
- ml.rl.training.td3_trainer module
- ml.rl.training.training_data_page module
- Module contents
- Workflow
- Submodules
- ml.rl.workflow.base_workflow module
- ml.rl.workflow.create_normalization_metadata module
- ml.rl.workflow.dqn_workflow module
- ml.rl.workflow.helpers module
- ml.rl.workflow.page_handler module
- ml.rl.workflow.parametric_dqn_workflow module
- ml.rl.workflow.preprocess_handler module
- ml.rl.workflow.transitional module
- Module contents
- All Modules
- reagent package
- Subpackages
- reagent.evaluation package
- Submodules
- reagent.evaluation.cpe module
- reagent.evaluation.doubly_robust_estimator module
- reagent.evaluation.evaluation_data_page module
- reagent.evaluation.evaluator module
- reagent.evaluation.ope_adapter module
- reagent.evaluation.ranking_listwise_evaluator module
- reagent.evaluation.ranking_policy_gradient_evaluator module
- reagent.evaluation.reward_net_evaluator module
- reagent.evaluation.seq2reward_evaluator module
- reagent.evaluation.sequential_doubly_robust_estimator module
- reagent.evaluation.weighted_sequential_doubly_robust_estimator module
- reagent.evaluation.world_model_evaluator module
- Module contents
- reagent.gym package
- Subpackages
- reagent.gym.agents package
- reagent.gym.envs package
- reagent.gym.policies package
- reagent.gym.preprocessors package
- reagent.gym.runners package
- reagent.gym.tests package
- Submodules
- reagent.gym.types module
- reagent.gym.utils module
- Module contents
- Subpackages
- reagent.models package
- Submodules
- reagent.models.actor module
- reagent.models.base module
- reagent.models.bcq module
- reagent.models.categorical_dqn module
- reagent.models.cem_planner module
- reagent.models.containers module
- reagent.models.convolutional_network module
- reagent.models.critic module
- reagent.models.dqn module
- reagent.models.dueling_q_network module
- reagent.models.embedding_bag_concat module
- reagent.models.fully_connected_network module
- reagent.models.mdn_rnn module
- reagent.models.model_feature_config_provider module
- reagent.models.no_soft_update_embedding module
- reagent.models.seq2reward_model module
- reagent.models.seq2slate module
- reagent.models.seq2slate_reward module
- reagent.models.world_model module
- Module contents
- reagent.ope package
- Subpackages
- reagent.ope.datasets package
- reagent.ope.estimators package
- reagent.ope.test package
- reagent.ope.trainers package
- Submodules
- reagent.ope.utils module
- Module contents
- Subpackages
- reagent.optimizer package
- reagent.prediction package
- reagent.preprocessing package
- Submodules
- reagent.preprocessing.batch_preprocessor module
- reagent.preprocessing.identify_types module
- reagent.preprocessing.normalization module
- reagent.preprocessing.postprocessor module
- reagent.preprocessing.preprocessor module
- reagent.preprocessing.sparse_preprocessor module
- reagent.preprocessing.sparse_to_dense module
- reagent.preprocessing.transforms module
- reagent.preprocessing.types module
- Module contents
- reagent.replay_memory package
- reagent.training package
- Subpackages
- Submodules
- reagent.training.c51_trainer module
- reagent.training.cem_trainer module
- reagent.training.dqn_trainer module
- reagent.training.dqn_trainer_base module
- reagent.training.imitator_training module
- reagent.training.loss_reporter module
- reagent.training.parameters module
- reagent.training.parametric_dqn_trainer module
- reagent.training.qrdqn_trainer module
- reagent.training.reinforce module
- reagent.training.reward_network_trainer module
- reagent.training.rl_trainer_pytorch module
- reagent.training.sac_trainer module
- reagent.training.slate_q_trainer module
- reagent.training.td3_trainer module
- reagent.training.trainer module
- reagent.training.utils module
- Module contents
- reagent.workflow package
- Submodules
- reagent.workflow.cli module
- reagent.workflow.data_fetcher module
- reagent.workflow.env module
- reagent.workflow.gym_batch_rl module
- reagent.workflow.identify_types_flow module
- reagent.workflow.result_registries module
- reagent.workflow.result_types module
- reagent.workflow.spark_utils module
- reagent.workflow.tagged_union module
- reagent.workflow.training module
- reagent.workflow.training_reports module
- reagent.workflow.types module
- reagent.workflow.utils module
- Module contents
- Submodules
- reagent.base_dataclass module
- reagent.debug_on_error module
- reagent.json_serialize module
- reagent.parameters module
- reagent.parameters_seq2slate module
- reagent.tensorboardX module
- reagent.torch_utils module
- reagent.types module
- Module contents