reagent.models package
Submodules
reagent.models.actor module
- class reagent.models.actor.DirichletFullyConnectedActor(state_dim, action_dim, sizes, activations, use_batch_norm=False)
Bases:
reagent.models.base.ModelBase
- EPSILON = 1e-06
- forward(state)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- get_log_prob(state, action)
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
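The note on forward() above (call the Module instance rather than forward() directly) can be illustrated with a dependency-free sketch. `TinyModule` is a hypothetical stand-in written for this example, not part of ReAgent or PyTorch; it only mimics the idea that calling the instance runs registered hooks while calling forward() directly does not.

```python
class TinyModule:
    """Hypothetical stand-in for a Module, reduced to forward hooks."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        # The "recipe" for the forward pass lives here.
        return 2 * x

    def __call__(self, x):
        # Calling the instance runs forward() and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


seen = []
module = TinyModule()
module.register_forward_hook(lambda mod, inp, out: seen.append(out))

module(3)          # goes through __call__, so the hook fires
module.forward(4)  # bypasses __call__, so the hook is skipped
```

After both calls, only the output of the first call has been recorded by the hook.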
- class reagent.models.actor.FullyConnectedActor(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], use_batch_norm: bool = False, action_activation: str = 'tanh', exploration_variance: Optional[float] = None)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData) reagent.core.types.ActorOutput
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.actor.GaussianFullyConnectedActor(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], scale: float = 0.05, use_batch_norm: bool = False, use_layer_norm: bool = False, use_l2_normalization: bool = False)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- get_log_prob(state: reagent.core.types.FeatureData, squashed_action: torch.Tensor)
Action is expected to be squashed with tanh
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.actor.StochasticActor(scorer, sampler)
Bases:
reagent.models.base.ModelBase
- forward(state)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- get_distributed_data_parallel_model()
Return DistributedDataParallel version of this model
This needs to be implemented explicitly because: 1) a model with an EmbeddingBag module is not compatible with vanilla DistributedDataParallel; 2) the exporting logic needs structured data, and DistributedDataParallel doesn’t work with structured data.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
reagent.models.base module
- class reagent.models.base.ModelBase
Bases:
torch.nn.modules.module.Module
A base class to support exporting through ONNX
- cpu_model()
Override this in DistributedDataParallel models
- feature_config() Optional[reagent.core.types.ModelFeatureConfig]
If the model needs additional preprocessing, e.g., using sequence features, return the config here.
- get_distributed_data_parallel_model()
Return DistributedDataParallel version of this model
This needs to be implemented explicitly because: 1) a model with an EmbeddingBag module is not compatible with vanilla DistributedDataParallel; 2) the exporting logic needs structured data, and DistributedDataParallel doesn’t work with structured data.
- get_target_network()
Return a copy of this network to be used as target network
Subclass should override this if the target network should share parameters with the network to be trained.
- input_prototype() Any
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
reagent.models.bcq module
- class reagent.models.bcq.BatchConstrainedDQN(state_dim, q_network, imitator_network, bcq_drop_threshold)
Bases:
reagent.models.base.ModelBase
- forward(state)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
reagent.models.categorical_dqn module
- class reagent.models.categorical_dqn.CategoricalDQN(distributional_network: reagent.models.base.ModelBase, *, qmin: float, qmax: float, num_atoms: int)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- log_dist(state: reagent.core.types.FeatureData) torch.Tensor
- training: bool
reagent.models.cem_planner module
A network which implements a cross entropy method-based planner
The planner plans the best next action based on simulation data generated by an ensemble of world models.
The idea is inspired by: https://arxiv.org/abs/1805.12114
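As a rough illustration of the cross entropy method itself, the sketch below maximizes a toy score function in plain Python. It is not the planner's implementation: there are no world models here, and the roles given to `alpha` (smoothing between old and new distribution parameters) and `epsilon` (stop once the variance is small) are assumptions made for this example. The function name `cem_maximize` and the initial mean and variance are hypothetical.

```python
import random
import statistics


def cem_maximize(score, dim, num_iterations=20, population_size=50,
                 num_elites=5, alpha=0.25, epsilon=1e-3, seed=0):
    """Toy CEM loop: sample, keep the elites, refit a diagonal Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    var = [4.0] * dim
    for _ in range(num_iterations):
        # Assumed stopping rule: the search distribution has collapsed.
        if max(var) < epsilon:
            break
        population = [
            [rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
            for _ in range(population_size)
        ]
        # Keep the highest-scoring candidates.
        elites = sorted(population, key=score, reverse=True)[:num_elites]
        for d in range(dim):
            values = [e[d] for e in elites]
            # Assumed smoothing: blend old parameters with the elite statistics.
            mean[d] = alpha * mean[d] + (1 - alpha) * statistics.fmean(values)
            var[d] = alpha * var[d] + (1 - alpha) * statistics.pvariance(values)
    return mean


# Toy objective with its maximum at (3, 3).
best = cem_maximize(lambda a: -sum((x - 3.0) ** 2 for x in a), dim=2)
```

In the planner described above, the score of a candidate would instead come from simulated rewards produced by the ensemble of world models.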
- class reagent.models.cem_planner.CEMPlannerNetwork(mem_net_list: List[reagent.models.world_model.MemoryNetwork], cem_num_iterations: int, cem_population_size: int, ensemble_population_size: int, num_elites: int, plan_horizon_length: int, state_dim: int, action_dim: int, discrete_action: bool, terminal_effective: bool, gamma: float, alpha: float = 0.25, epsilon: float = 0.001, action_upper_bounds: Optional[numpy.ndarray] = None, action_lower_bounds: Optional[numpy.ndarray] = None)
Bases:
torch.nn.modules.module.Module
- acc_rewards_of_all_solutions(state: reagent.core.types.FeatureData, solutions: torch.Tensor) float
Calculate accumulated rewards of solutions.
- Parameters
state – the input which contains the starting state
solutions – its shape is (cem_pop_size, plan_horizon_length, action_dim)
- Returns
a vector of size cem_pop_size, which is the reward of each solution
- acc_rewards_of_one_solution(init_state: torch.Tensor, solution: torch.Tensor, solution_idx: int)
ensemble_pop_size trajectories will be sampled to evaluate a CEM solution. Each trajectory is generated by one world model
- Parameters
init_state – its shape is (state_dim, )
solution – its shape is (plan_horizon_length, action_dim)
solution_idx – the index of the solution
- Return reward
Reward of each of ensemble_pop_size trajectories
- constrained_variance(mean, var)
- continuous_planning(state: reagent.core.types.FeatureData) torch.Tensor
- discrete_planning(state: reagent.core.types.FeatureData) Tuple[int, numpy.ndarray]
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- sample_reward_next_state_terminal(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData, mem_net: reagent.models.world_model.MemoryNetwork)
Sample one-step dynamics based on the provided world model
- training: bool
reagent.models.containers module
- class reagent.models.containers.Sequential(*args: torch.nn.modules.module.Module)
- class reagent.models.containers.Sequential(arg: collections.OrderedDict[str, torch.nn.modules.module.Module])
Bases:
torch.nn.modules.container.Sequential, reagent.models.base.ModelBase
Use this instead of torch.nn.Sequential to automate model tracing
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
reagent.models.convolutional_network module
- class reagent.models.convolutional_network.ConvolutionalNetwork(cnn_parameters, layers, activations, use_layer_norm)
Bases:
torch.nn.modules.module.Module
- conv_forward(input)
- forward(input) torch.FloatTensor
Forward pass for generic convnet DNNs. Assumes activation names are valid PyTorch activation names.
- Parameters
input – image tensor
- training: bool
reagent.models.critic module
- class reagent.models.critic.FullyConnectedCritic(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], use_batch_norm: bool = False, use_layer_norm: bool = False, output_dim: int = 1)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
reagent.models.dqn module
- class reagent.models.dqn.FullyConnectedDQN(state_dim, action_dim, sizes, activations, *, output_activation: str = 'linear', num_atoms: Optional[int] = None, use_batch_norm: bool = False, dropout_ratio: float = 0.0, normalized_output: bool = False, use_layer_norm: bool = False)
Bases:
reagent.models.fully_connected_network.FloatFeatureFullyConnected
- forward(state: reagent.core.types.FeatureData, possible_actions_mask: Optional[torch.Tensor] = None) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
reagent.models.dueling_q_network module
- class reagent.models.dueling_q_network.DuelingQNetwork(*, shared_network: reagent.models.base.ModelBase, advantage_network: reagent.models.base.ModelBase, value_network: reagent.models.base.ModelBase)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, possible_actions_mask: Optional[torch.Tensor] = None) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- classmethod make_fully_connected(state_dim: int, action_dim: int, layers: List[int], activations: List[str], num_atoms: Optional[int] = None, use_batch_norm: bool = False)
- training: bool
- class reagent.models.dueling_q_network.ParametricDuelingQNetwork(*, shared_network: reagent.models.base.ModelBase, advantage_network: reagent.models.base.ModelBase, value_network: reagent.models.base.ModelBase)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- classmethod make_fully_connected(state_dim: int, action_dim: int, layers: List[int], activations: List[str], use_batch_norm: bool = False)
- training: bool
reagent.models.embedding_bag_concat module
- class reagent.models.embedding_bag_concat.EmbeddingBagConcat(state_dim: int, model_feature_config: reagent.core.types.ModelFeatureConfig, embedding_dim: int)
Bases:
reagent.models.base.ModelBase
Concatenating embedding with float features before passing the input to DQN
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- property output_dim: int
- training: bool
reagent.models.fully_connected_network module
- class reagent.models.fully_connected_network.FloatFeatureFullyConnected(state_dim, output_dim, sizes, activations, *, output_activation: str = 'linear', num_atoms: Optional[int] = None, use_batch_norm: bool = False, dropout_ratio: float = 0.0, normalized_output: bool = False, use_layer_norm: bool = False)
Bases:
reagent.models.base.ModelBase
A fully connected network that takes FloatFeatures input and supports distributional prediction.
- forward(state: reagent.core.types.FeatureData) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.fully_connected_network.FullyConnectedNetwork(layers, activations, *, use_batch_norm: bool = False, min_std: float = 0.0, dropout_ratio: float = 0.0, use_layer_norm: bool = False, normalize_output: bool = False, orthogonal_init: bool = False)
Bases:
reagent.models.base.ModelBase
- forward(input: torch.Tensor) torch.Tensor
Forward pass for generic feed-forward DNNs. Assumes activation names are valid PyTorch activation names.
- Parameters
input – tensor
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.fully_connected_network.SlateBatchNorm1d(*args, **kwargs)
Bases:
torch.nn.modules.module.Module
Same as nn.BatchNorm1d if the input has shape (batch_size, feat_dim). But if the input has shape (batch_size, num_candidates, item_feats), as in LearnedVM, we transpose it, since nn.BatchNorm1d computes Batch Normalization over the 1st dimension, while we want to compute it over item_feats.
NOTE: this is different from nn.BatchNorm2d which is for CNNs, and expects 4D inputs
- forward(x: torch.Tensor)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
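To make the effect of the transpose in SlateBatchNorm1d concrete, here is a dependency-free sketch of what normalizing over item_feats means for a 3D input: statistics are computed per item feature, pooled over the batch and candidate axes. The helper name `normalize_over_item_feats` is hypothetical, and the sketch omits the learnable scale and shift as well as running statistics.

```python
def normalize_over_item_feats(x, eps=1e-5):
    """x is a nested list shaped (batch_size, num_candidates, item_feats)."""
    item_feats = len(x[0][0])
    # Pool the batch and candidate axes: (batch_size * num_candidates, item_feats).
    rows = [cand for slate in x for cand in slate]
    means = [sum(r[f] for r in rows) / len(rows) for f in range(item_feats)]
    variances = [
        sum((r[f] - means[f]) ** 2 for r in rows) / len(rows)
        for f in range(item_feats)
    ]
    # Normalize each feature with its own mean and (biased) variance.
    return [
        [
            [(cand[f] - means[f]) / (variances[f] + eps) ** 0.5
             for f in range(item_feats)]
            for cand in slate
        ]
        for slate in x
    ]


x = [[[1.0, 10.0], [3.0, 30.0]], [[5.0, 50.0], [7.0, 70.0]]]
y = normalize_over_item_feats(x)
```

The output keeps the input shape, and each item feature has approximately zero mean across all candidates in the batch.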
- reagent.models.fully_connected_network.gaussian_fill_w_gain(tensor, gain, dim_in, min_std=0.0) None
Gaussian initialization with gain.
reagent.models.linear_regression module
- class reagent.models.linear_regression.LinearRegressionUCB(input_dim: int, *, l2_reg_lambda: float = 1.0, predict_ucb: float = False, ucb_alpha: float = 1.0)
Bases:
reagent.models.base.ModelBase
A linear regression model for LinUCB. Note that instead of being trained by a PyTorch optimizer, we explicitly update attributes A and b (according to the LinUCB formulas implemented in reagent.training.cb.linucb_trainer.LinUCBTrainer).
Since computing the regression coefficients involves matrix inversion (an expensive op), we save time by only computing the coefficients when necessary (when doing inference).
- Parameters
input_dim – Dimension of input data
l2_reg_lambda – The weight on L2 regularization
predict_ucb – If True, the model outputs an Upper Confidence Bound (UCB). If False, the model outputs the point estimate
ucb_alpha – The coefficient on the standard deviation in UCB formula. Only used if predict_ucb=True.
- forward(inp: torch.Tensor, ucb_alpha: Optional[float] = None) torch.Tensor
Forward can return the mean or a UCB. If returning a UCB, the CI width is stddev*ucb_alpha. If ucb_alpha is not passed in, a fixed alpha from init is used.
- input_prototype() torch.Tensor
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
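The LinearRegressionUCB description above (explicit updates of A and b, coefficients computed only at inference, UCB equal to the mean plus ucb_alpha times the standard deviation) can be sketched in plain Python for a 2-dimensional input. This uses the textbook LinUCB update, which is assumed here rather than taken from LinUCBTrainer; `TinyLinUCB` and `inv2` are hypothetical names for this example.

```python
import math


def inv2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


class TinyLinUCB:
    """Hypothetical 2-feature LinUCB model (not the ReAgent class)."""

    def __init__(self, l2_reg_lambda=1.0, ucb_alpha=1.0):
        # A starts as lambda * I, b starts at zero.
        self.A = [[l2_reg_lambda, 0.0], [0.0, l2_reg_lambda]]
        self.b = [0.0, 0.0]
        self.ucb_alpha = ucb_alpha

    def update(self, x, reward):
        # A += x x^T and b += reward * x; no optimizer is involved.
        for i in range(2):
            self.b[i] += reward * x[i]
            for j in range(2):
                self.A[i][j] += x[i] * x[j]

    def predict(self, x, ucb_alpha=None):
        alpha = self.ucb_alpha if ucb_alpha is None else ucb_alpha
        # The expensive inversion happens only at inference time.
        inv_A = inv2(self.A)
        coefs = [sum(inv_A[i][j] * self.b[j] for j in range(2)) for i in range(2)]
        mean = sum(c * xi for c, xi in zip(coefs, x))
        var = sum(x[i] * inv_A[i][j] * x[j] for i in range(2) for j in range(2))
        return mean, mean + alpha * math.sqrt(var)


model = TinyLinUCB()
model.update([1.0, 0.0], reward=1.0)
mean_seen, ucb_seen = model.predict([1.0, 0.0])
mean_unseen, ucb_unseen = model.predict([0.0, 1.0])
```

The direction that has never been observed keeps a wider confidence interval than the one that has, which is what drives exploration.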
- reagent.models.linear_regression.batch_quadratic_form(x: torch.Tensor, A: torch.Tensor) torch.Tensor
Compute the quadratic form x^T * A * x for a batched input x. Inspired by https://stackoverflow.com/questions/18541851/calculate-vt-a-v-for-a-matrix-of-vectors-v
This is a vectorized implementation of out[i] = x[i].t() @ A @ x[i]
x shape: (B, N)
A shape: (N, N)
output shape: (B)
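For reference, the per-row definition out[i] = x[i].t() @ A @ x[i] can be written as a plain-Python loop. This is the unvectorized form of the same computation on nested lists, not the library's tensor implementation; the name `batch_quadratic_form_py` is hypothetical.

```python
def batch_quadratic_form_py(x, A):
    """x: (B, N) nested list, A: (N, N) nested list, returns a list of length B."""
    out = []
    for row in x:
        # A @ row
        Ax = [sum(A[i][j] * row[j] for j in range(len(row))) for i in range(len(A))]
        # row^T @ (A @ row)
        out.append(sum(row[i] * Ax[i] for i in range(len(row))))
    return out


x = [[1.0, 2.0], [0.0, 1.0]]
A = [[2.0, 0.0], [0.0, 3.0]]
result = batch_quadratic_form_py(x, A)
```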
reagent.models.mdn_rnn module
- class reagent.models.mdn_rnn.MDNRNN(state_dim, action_dim, num_hiddens, num_hidden_layers, num_gaussians)
Bases:
torch.nn.modules.module.Module
Mixture Density Network - Recurrent Neural Network
- forward(actions: torch.Tensor, states: torch.Tensor, hidden=None)
Forward pass of MDN-RNN
- Parameters
actions – (SEQ_LEN, BATCH_SIZE, ACTION_DIM) torch tensor
states – (SEQ_LEN, BATCH_SIZE, STATE_DIM) torch tensor
- Returns
parameters of the GMM prediction for the next state, gaussian prediction of the reward and logit prediction of non-terminality, and the RNN’s outputs:
mus: (SEQ_LEN, BATCH_SIZE, NUM_GAUSSIANS, STATE_DIM) torch tensor
sigmas: (SEQ_LEN, BATCH_SIZE, NUM_GAUSSIANS, STATE_DIM) torch tensor
logpi: (SEQ_LEN, BATCH_SIZE, NUM_GAUSSIANS) torch tensor
reward: (SEQ_LEN, BATCH_SIZE) torch tensor
not_terminal: (SEQ_LEN, BATCH_SIZE) torch tensor
last_step_hidden_and_cell: TUPLE((NUM_LAYERS, BATCH_SIZE, HIDDEN_SIZE), (NUM_LAYERS, BATCH_SIZE, HIDDEN_SIZE)) torch tensor
all_steps_hidden: (SEQ_LEN, BATCH_SIZE, HIDDEN_SIZE) torch tensor
- training: bool
- class reagent.models.mdn_rnn.MDNRNNMemoryPool(max_replay_memory_size)
Bases:
object
- deque_sample(indices)
- insert_into_memory(state, action, next_state, reward, not_terminal)
- property memory_size
- sample_memories(batch_size, use_gpu=False) reagent.core.types.MemoryNetworkInput
- Parameters
batch_size – number of samples to return
use_gpu – whether to put samples on gpu
State’s shape is SEQ_LEN x BATCH_SIZE x STATE_DIM, for example. By default, MDN-RNN consumes data with SEQ_LEN as the first dimension.
- class reagent.models.mdn_rnn.MDNRNNMemorySample(state, action, next_state, reward, not_terminal)
Bases:
NamedTuple
- action: numpy.ndarray
Alias for field number 1
- next_state: numpy.ndarray
Alias for field number 2
- not_terminal: float
Alias for field number 4
- reward: float
Alias for field number 3
- state: numpy.ndarray
Alias for field number 0
- reagent.models.mdn_rnn.gmm_loss(batch, mus, sigmas, logpi, reduce=True)
Computes the gmm loss.
Compute minus the log probability of batch under the GMM model described by mus, sigmas, pi. Precisely, with bs1, bs2, … the sizes of the batch dimensions (several batch dimensions are useful when you have both a batch axis and a time step axis), gs the number of mixtures and fs the number of features.
- Parameters
- Returns
loss(batch) = - mean_{i1=0..bs1, i2=0..bs2, …} log(
    sum_{k=1..gs} pi[i1, i2, …, k] * N(
        batch[i1, i2, …, :] | mus[i1, i2, …, k, :], sigmas[i1, i2, …, k, :]))
NOTE: The loss is not reduced along the feature dimension (i.e. it should scale linearly with fs).
Adapted from: https://github.com/ctallec/world-models
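The loss formula above can be sketched in plain Python for a single batch axis. The shapes in the docstring below and the use of diagonal Gaussians (a product of per-feature normal densities) are assumptions made for this example; `gmm_nll` and `gaussian_log_pdf` are hypothetical names, and the mixture sum is evaluated with a log-sum-exp for numerical stability.

```python
import math


def gaussian_log_pdf(x, mu, sigma):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2


def gmm_nll(batch, mus, sigmas, logpi):
    """Assumed shapes: batch (bs, fs); mus, sigmas (bs, gs, fs); logpi (bs, gs)."""
    losses = []
    for x, mu_k, sigma_k, lp in zip(batch, mus, sigmas, logpi):
        # log(pi_k) + log N(x | mu_k, sigma_k), summed over the feature axis.
        log_terms = [
            lp[k] + sum(gaussian_log_pdf(xi, m, s)
                        for xi, m, s in zip(x, mu_k[k], sigma_k[k]))
            for k in range(len(lp))
        ]
        # Stable log of the sum over mixture components.
        top = max(log_terms)
        log_prob = top + math.log(sum(math.exp(t - top) for t in log_terms))
        losses.append(-log_prob)
    # Mean over the batch axis.
    return sum(losses) / len(losses)


# One sample, one standard-normal component, evaluated at its mean.
loss = gmm_nll(batch=[[0.0]], mus=[[[0.0]]], sigmas=[[[1.0]]], logpi=[[0.0]])
```

Because the per-feature log densities are summed rather than averaged, the result grows with the number of features, in line with the note above.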
- reagent.models.mdn_rnn.transpose(*args)
reagent.models.mlp_scorer module
- class reagent.models.mlp_scorer.MLPScorer(mlp: torch.nn.modules.module.Module, has_user_feat: bool = False)
Bases:
reagent.models.base.ModelBase
Log-space in and out
- forward(obs: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
reagent.models.model_feature_config_provider module
- class reagent.models.model_feature_config_provider.ModelFeatureConfigProvider
Bases:
object
- REGISTRY = {'raw': <class 'reagent.models.model_feature_config_provider.RawModelFeatureConfigProvider'>}
- REGISTRY_FROZEN = True
- REGISTRY_NAME = 'ModelFeatureConfigProvider'
- abstract get_model_feature_config() reagent.core.types.ModelFeatureConfig
- class reagent.models.model_feature_config_provider.RawModelFeatureConfigProvider(float_feature_infos: List[reagent.core.types.FloatFeatureInfo] = <factory>, id_mapping_config: Dict[str, reagent.core.types.IdMappingUnion] = <factory>, id_list_feature_configs: List[reagent.core.types.IdListFeatureConfig] = <factory>, id_score_list_feature_configs: List[reagent.core.types.IdScoreListFeatureConfig] = <factory>)
Bases:
reagent.models.model_feature_config_provider.ModelFeatureConfigProvider, reagent.core.types.ModelFeatureConfig
- get_model_feature_config() reagent.core.types.ModelFeatureConfig
reagent.models.no_soft_update_embedding module
- class reagent.models.no_soft_update_embedding.NoSoftUpdateEmbedding(num_embeddings: int, embedding_dim: int, padding_idx: Optional[int] = None, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, _weight: Optional[torch.Tensor] = None, device=None, dtype=None)
Bases:
torch.nn.modules.sparse.Embedding
Use this instead of vanilla Embedding module to avoid soft-updating the embedding table in the target network.
- embedding_dim: int
- max_norm: Optional[float]
- norm_type: float
- num_embeddings: int
- padding_idx: Optional[int]
- scale_grad_by_freq: bool
- sparse: bool
- weight: torch.Tensor
reagent.models.seq2reward_model module
- class reagent.models.seq2reward_model.Seq2RewardNetwork(state_dim, action_dim, num_hiddens, num_hidden_layers)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData, valid_reward_len: Optional[torch.Tensor] = None)
Forward pass of Seq2Reward
Takes in the current state and uses it as the initial hidden state. The input sequence consists of actions only. Outputs the predicted reward after each time step.
- Parameters
actions – (SEQ_LEN, BATCH_SIZE, ACTION_DIM) torch tensor
states – (SEQ_LEN, BATCH_SIZE, STATE_DIM) torch tensor
valid_reward_len – (BATCH_SIZE,) torch tensor
- Returns
predicted accumulated rewards at the last step for the given sequence
acc_reward: (BATCH_SIZE, 1) torch tensor
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
reagent.models.seq2slate module
- class reagent.models.seq2slate.BaselineNet(state_dim, dim_feedforward, num_stacked_layers)
Bases:
torch.nn.modules.module.Module
- forward(input: reagent.core.types.PreprocessedRankingInput)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.Decoder(layer, num_layers)
Bases:
torch.nn.modules.module.Module
Generic num_layers layer decoder with masking.
- forward(x, memory, tgt_src_mask, tgt_tgt_mask)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.DecoderLastLayerPytorch(d_model, nhead, dim_feedforward=2048, dropout=0.1, activation=<function relu>, layer_norm_eps=1e-05, batch_first=False, norm_first=False, device=None, dtype=None)
Bases:
torch.nn.modules.transformer.TransformerDecoderLayer
The last layer of Decoder. Modified from PyTorch official code: instead of attention embedding, return attention weights which can be directly used to sample items
- forward(tgt, memory, tgt_mask, memory_mask)
Pass the inputs (and mask) through the decoder layer.
- Parameters
tgt – the sequence to the decoder layer (required).
memory – the sequence from the last layer of the encoder (required).
tgt_mask – the mask for the tgt sequence (optional).
memory_mask – the mask for the memory sequence (optional).
tgt_key_padding_mask – the mask for the tgt keys per batch (optional).
memory_key_padding_mask – the mask for the memory keys per batch (optional).
- Shape:
see the docs in Transformer class.
- training: bool
- class reagent.models.seq2slate.DecoderLayer(size, self_attn, src_attn, feed_forward)
Bases:
torch.nn.modules.module.Module
Decoder is made of self-attn, src-attn, and feed forward
- forward(x, m, tgt_src_mask, tgt_tgt_mask)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.DecoderPyTorch(dim_model, num_heads, dim_feedforward, num_layers)
Bases:
torch.nn.modules.module.Module
Transformer-based decoder based on PyTorch official implementation
- forward(tgt_embed, memory, tgt_src_mask, tgt_tgt_mask)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.Embedder(dim_in, dim_out)
Bases:
torch.nn.modules.module.Module
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.Encoder(layer, num_layers)
Bases:
torch.nn.modules.module.Module
Core encoder is a stack of num_layers layers
- forward(x, mask)
Pass the input (and mask) through each layer in turn.
- training: bool
- class reagent.models.seq2slate.EncoderLayer(dim_model, self_attn, feed_forward)
Bases:
torch.nn.modules.module.Module
Encoder is made up of self-attn and feed forward
- forward(src_embed, src_mask)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.EncoderPyTorch(dim_model, num_heads, dim_feedforward, num_layers)
Bases:
torch.nn.modules.module.Module
Transformer-based encoder based on PyTorch official implementation
- forward(src)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.Generator
Bases:
torch.nn.modules.module.Module
Candidate generation
- forward(probs: torch.Tensor, greedy: bool)
Decode one-step
- Parameters
probs – probability distributions of decoder. Shape: batch_size, tgt_seq_len, candidate_size
greedy – whether to greedily pick or sample the next symbol
- training: bool
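The greedy flag of Generator.forward() above distinguishes two ways of choosing the next symbol from a probability distribution. The sketch below shows the difference for a single decoding step, simplified to inputs of shape (batch_size, candidate_size) in plain Python. `decode_one_step` is a hypothetical helper, not the library method.

```python
import random


def decode_one_step(probs, greedy, rng=None):
    """probs: one probability distribution over candidates per batch row."""
    rng = rng or random.Random(0)
    picks = []
    for dist in probs:
        if greedy:
            # Greedy: take the most probable candidate.
            picks.append(max(range(len(dist)), key=dist.__getitem__))
        else:
            # Sampling: draw a candidate index in proportion to its probability.
            picks.append(rng.choices(range(len(dist)), weights=dist, k=1)[0])
    return picks


probs = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
greedy_picks = decode_one_step(probs, greedy=True)
sampled_picks = decode_one_step(probs, greedy=False)
```

Greedy decoding is deterministic, while sampling can return any candidate with non-zero probability.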
- class reagent.models.seq2slate.MultiHeadedAttention(num_heads, dim_model)
Bases:
torch.nn.modules.module.Module
- forward(query, key, value, mask=None)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.PositionalEncoding(dim_model)
Bases:
torch.nn.modules.module.Module
A special, non-learnable positional encoding for handling variable (possibly longer) lengths of inputs. We simply add an ordinal number as an additional dimension for the input embeddings, and then project them back to the original number of dimensions
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
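The PositionalEncoding description above (append an ordinal number as an extra dimension, then project back to the original number of dimensions) can be sketched as follows. The use of the raw position index as the ordinal value and the hand-picked projection weights are assumptions for illustration only; `add_ordinal_and_project` is a hypothetical helper.

```python
def add_ordinal_and_project(embeddings, weight, bias):
    """embeddings: (seq_len, dim); weight: (dim, dim + 1); bias: (dim,)."""
    out = []
    for position, emb in enumerate(embeddings):
        # Append the position as one extra input dimension.
        augmented = list(emb) + [float(position)]
        # Linear projection from dim + 1 back to dim.
        out.append([
            sum(w * a for w, a in zip(row, augmented)) + b
            for row, b in zip(weight, bias)
        ])
    return out


embeddings = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
weight = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
bias = [0.0, 0.0]
encoded = add_ordinal_and_project(embeddings, weight, bias)
```

Identical input embeddings end up with different encodings once their position is mixed in, and nothing in the scheme depends on a maximum sequence length.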
- class reagent.models.seq2slate.PositionwiseFeedForward(dim_model, dim_feedforward)
Bases:
torch.nn.modules.module.Module
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate.Seq2SlateNet(state_dim: int, candidate_dim: int, num_stacked_layers: int, dim_model: int, max_src_seq_len: int, max_tgt_seq_len: int, output_arch: reagent.model_utils.seq2slate_utils.Seq2SlateOutputArch, temperature: float)
Bases:
reagent.models.base.ModelBase
- candidate_dim: int
- dim_model: int
- forward(input: reagent.core.types.PreprocessedRankingInput, mode: reagent.model_utils.seq2slate_utils.Seq2SlateMode, tgt_seq_len: Optional[int] = None, greedy: Optional[bool] = None)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- get_distributed_data_parallel_model()
Return DistributedDataParallel version of this model
This needs to be implemented explicitly because: 1) a model with an EmbeddingBag module is not compatible with vanilla DistributedDataParallel; 2) the exporting logic needs structured data, and DistributedDataParallel doesn’t work with structured data.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- max_src_seq_len: int
- max_tgt_seq_len: int
- num_stacked_layers: int
- state_dim: int
- temperature: float
- class reagent.models.seq2slate.Seq2SlateTransformerModel(state_dim: int, candidate_dim: int, num_stacked_layers: int, num_heads: int, dim_model: int, dim_feedforward: int, max_src_seq_len: int, max_tgt_seq_len: int, output_arch: reagent.model_utils.seq2slate_utils.Seq2SlateOutputArch, temperature: float = 1.0, state_embed_dim: Optional[int] = None)
Bases:
torch.nn.modules.module.Module
A Seq2Slate network with a Transformer. The network is essentially an encoder-decoder structure. The encoder takes as input a sequence of candidate feature vectors and a state feature vector, and the decoder outputs an ordered list of candidate indices. The output order is learned through the REINFORCE algorithm to optimize sequence-wise reward.
One application example is to rank candidate feeds to a specific user such that the final list of feeds as a whole optimizes the user’s engagement.
Seq2Slate paper: https://arxiv.org/abs/1810.02019
Transformer paper: https://arxiv.org/abs/1706.03762
The model architecture can also adapt to some variations: (1) the decoder can be autoregressive; (2) the decoder can take encoder scores and perform iterative softmax (aka Frechet sort); (3) there is no decoder and the output order is based solely on encoder scores.
- decode(memory, state, tgt_in_idx, tgt_in_seq)
- encode(state, src_seq)
- encoder_output_to_scores(state: torch.Tensor, src_seq: torch.Tensor, tgt_out_idx: torch.Tensor) reagent.models.seq2slate.Seq2SlateTransformerOutput
- forward(mode: str, state: torch.Tensor, src_seq: torch.Tensor, tgt_in_idx: Optional[torch.Tensor] = None, tgt_out_idx: Optional[torch.Tensor] = None, tgt_in_seq: Optional[torch.Tensor] = None, tgt_seq_len: Optional[int] = None, greedy: Optional[bool] = None) reagent.models.seq2slate.Seq2SlateTransformerOutput
- Parameters
input – model input
mode – a string indicating which mode to perform:
“rank”: return ranked actions and their generative probabilities.
“per_seq_log_probs”: return generative log probabilities of the given tgt sequences (used for REINFORCE training).
“per_symbol_log_probs”: return generative log probabilities of each symbol in the given tgt sequences (used in TEACHER FORCING training).
tgt_seq_len – the length of the output sequence to be decoded. Only used in rank mode.
greedy – whether to decode greedily or to sample based on the softmax distribution. Only used in rank mode.
- training: bool
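The "iterative softmax" variation and the greedy flag of rank mode can be sketched in plain PyTorch. This is an illustration under stated assumptions, not the Seq2SlateTransformerModel code: the function name rank_by_iterative_softmax and the (batch_size, num_candidates) score layout are hypothetical.

```python
import torch


def rank_by_iterative_softmax(scores: torch.Tensor, tgt_seq_len: int, greedy: bool = True):
    """Pick candidates one at a time from a softmax over the remaining ones.

    scores: (batch_size, num_candidates) per-candidate scores.
    Returns (ranked_idx, per_seq_log_probs).
    """
    batch_size, _ = scores.shape
    rows = torch.arange(batch_size, device=scores.device)
    remaining = torch.ones_like(scores, dtype=torch.bool)
    per_seq_log_probs = scores.new_zeros(batch_size)
    ranked_idx = []
    for _ in range(tgt_seq_len):
        # Already-chosen candidates get probability zero.
        masked = scores.masked_fill(~remaining, float("-inf"))
        log_probs = torch.log_softmax(masked, dim=1)
        if greedy:
            choice = log_probs.argmax(dim=1)
        else:
            choice = torch.multinomial(log_probs.exp(), num_samples=1).squeeze(1)
        # The sequence log probability is the sum of per-symbol log probabilities.
        per_seq_log_probs = per_seq_log_probs + log_probs[rows, choice]
        remaining[rows, choice] = False
        ranked_idx.append(choice)
    return torch.stack(ranked_idx, dim=1), per_seq_log_probs


scores = torch.tensor([[0.1, 2.0, 1.0]])
ranked, seq_log_prob = rank_by_iterative_softmax(scores, tgt_seq_len=3, greedy=True)
```

With greedy=True the result is simply the candidates sorted by score; with greedy=False the same loop samples an ordering, and the accumulated log probability is the quantity a REINFORCE-style update would need.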
- class reagent.models.seq2slate.Seq2SlateTransformerNet(state_dim: int, candidate_dim: int, num_stacked_layers: int, dim_model: int, max_src_seq_len: int, max_tgt_seq_len: int, output_arch: reagent.model_utils.seq2slate_utils.Seq2SlateOutputArch, temperature: float, num_heads: int, dim_feedforward: int, state_embed_dim: Optional[int] = None)
Bases:
reagent.models.seq2slate.Seq2SlateNet
- dim_feedforward: int
- num_heads: int
- state_embed_dim: Optional[int] = None
- class reagent.models.seq2slate.Seq2SlateTransformerOutput(ranked_per_symbol_probs, ranked_per_seq_probs, ranked_tgt_out_idx, per_symbol_log_probs, per_seq_log_probs, encoder_scores)
Bases:
NamedTuple
- encoder_scores: Optional[torch.Tensor]
Alias for field number 5
- per_seq_log_probs: Optional[torch.Tensor]
Alias for field number 4
- per_symbol_log_probs: Optional[torch.Tensor]
Alias for field number 3
- ranked_per_seq_probs: Optional[torch.Tensor]
Alias for field number 1
- ranked_per_symbol_probs: Optional[torch.Tensor]
Alias for field number 0
- ranked_tgt_out_idx: Optional[torch.Tensor]
Alias for field number 2
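The "Alias for field number N" entries mean that each attribute can also be read by position. A minimal sketch with a hypothetical stand-in tuple (RankingOutput, using plain lists instead of tensors) that has the same field order:

```python
from typing import List, NamedTuple, Optional


class RankingOutput(NamedTuple):
    """Hypothetical stand-in with the same field order as the class above."""

    ranked_per_symbol_probs: Optional[List]
    ranked_per_seq_probs: Optional[List]
    ranked_tgt_out_idx: Optional[List]
    per_symbol_log_probs: Optional[List]
    per_seq_log_probs: Optional[List]
    encoder_scores: Optional[List]


# Only the fields relevant to the requested mode are filled; the rest stay None.
out = RankingOutput(None, None, [[2, 0, 1]], None, None, None)
by_name = out.ranked_tgt_out_idx
by_position = out[2]
```

Access by name is usually preferable, since the positional order is easy to get wrong with six optional fields.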
- class reagent.models.seq2slate.SublayerConnection(dim_model)
Bases:
torch.nn.modules.module.Module
A residual connection followed by a layer norm.
- forward(x, sublayer)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
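A literal reading of "a residual connection followed by a layer norm" can be sketched as below. The class name ResidualThenNorm is hypothetical, and the library may order the normalization differently (for example, normalizing before the sublayer), so treat this as an illustration of the description only.

```python
import torch
from torch import nn


class ResidualThenNorm(nn.Module):
    """Hypothetical sketch: add the sublayer output to x, then layer-normalize."""

    def __init__(self, dim_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim_model)

    def forward(self, x: torch.Tensor, sublayer) -> torch.Tensor:
        # sublayer is any callable mapping a tensor to a tensor of the same shape.
        return self.norm(x + sublayer(x))


block = ResidualThenNorm(dim_model=8)
x = torch.randn(2, 5, 8)
out = block(x, lambda t: torch.zeros_like(t))
```

Passing the sublayer as an argument to forward(x, sublayer) lets one connection module wrap attention and feed-forward sublayers alike.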
reagent.models.seq2slate_reward module
- class reagent.models.seq2slate_reward.Seq2SlateGRURewardNet(state_dim: int, candidate_dim: int, num_stacked_layers: int, dim_model: int, max_src_seq_len: int, max_tgt_seq_len: int)
Bases:
reagent.models.seq2slate_reward.Seq2SlateRewardNetBase
- embed(state, tgt_in_seq)
- forward(input: reagent.core.types.PreprocessedRankingInput)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate_reward.Seq2SlateRewardNetBase(state_dim: int, candidate_dim: int, dim_model: int, num_stacked_layers: int, max_src_seq_len: int, max_tgt_seq_len: int)
Bases:
reagent.models.base.ModelBase
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.seq2slate_reward.Seq2SlateRewardNetEnsemble(models: List[reagent.models.base.ModelBase])
Bases:
reagent.models.base.ModelBase
- forward(state: torch.Tensor, src_seq: torch.Tensor, tgt_out_seq: torch.Tensor, src_src_mask: torch.Tensor, tgt_out_idx: torch.Tensor) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.seq2slate_reward.Seq2SlateRewardNetJITWrapper(model: reagent.models.seq2slate_reward.Seq2SlateRewardNetBase)
Bases:
reagent.models.base.ModelBase
- forward(state: torch.Tensor, src_seq: torch.Tensor, tgt_out_seq: torch.Tensor, src_src_mask: torch.Tensor, tgt_out_idx: torch.Tensor) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype(use_gpu=False)
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.seq2slate_reward.Seq2SlateTransformerRewardNet(state_dim: int, candidate_dim: int, num_stacked_layers: int, num_heads: int, dim_model: int, dim_feedforward: int, max_src_seq_len: int, max_tgt_seq_len: int)
Bases:
reagent.models.seq2slate_reward.Seq2SlateRewardNetBase
- decode(memory, state, tgt_src_mask, tgt_in_seq, tgt_tgt_mask, tgt_seq_len)
- encode(state, src_seq, src_mask)
- forward(input: reagent.core.types.PreprocessedRankingInput)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
reagent.models.synthetic_reward module
- class reagent.models.synthetic_reward.Concat
Bases:
torch.nn.modules.module.Module
- forward(state: torch.Tensor, action: torch.Tensor)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.synthetic_reward.NGramConvolutionalNetwork(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], last_layer_activation: str, context_size: int, conv_net_params: reagent.core.parameters.ConvNetParameters, use_layer_norm: bool = False)
Bases:
torch.nn.modules.module.Module
- forward(state: torch.Tensor, action: torch.Tensor) torch.Tensor
Forward pass of the NGram conv net.
- Parameters
input shape – seq_len, batch_size, feature_dim
- training: bool
- class reagent.models.synthetic_reward.NGramFullyConnectedNetwork(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], last_layer_activation: str, context_size: int, use_layer_norm: bool = False)
Bases:
torch.nn.modules.module.Module
- forward(state: torch.Tensor, action: torch.Tensor) torch.Tensor
Forward pass of the NGram fully connected net.
- Parameters
input shape – seq_len, batch_size, feature_dim
- training: bool
- class reagent.models.synthetic_reward.PETransformerEncoderLayer(d_model, nhead, dim_feedforward=2048, dropout=0.0, activation='relu', layer_norm_eps=1e-05, max_len=100, use_ff=True, pos_weight=0.5, batch_first=False, device=None, dtype=None)
Bases:
torch.nn.modules.module.Module
PETransformerEncoderLayer is made up of Positional Encoding (PE), residual connections, self-attention and a feedforward network. The major differences between this implementation and the official PyTorch torch.nn.TransformerEncoderLayer are:
1. The input data is augmented with positional encoding: hat{x} = x + PE{x}.
2. Two parallel residual blocks are applied to the raw input data (x) and the encoded input data (hat{x}), respectively, i.e. z = Residual(x), hat{z} = Residual(hat{x}).
3. z is treated as the Value input, and hat{z} as the Query and Key input, of a self-attention block.
- Main Args:
d_model: the number of expected features in the input (required).
nhead: the number of heads in the multiheadattention models (required).
dim_feedforward: the dimension of the feedforward network model (default=2048).
activation: the activation function of the intermediate layer, relu or gelu (default=relu).
layer_norm_eps: the eps value in layer normalization components (default=1e-5).
batch_first: If True, then the input and output tensors are provided as (batch, seq, feature). Default: False.
max_len: argument passed to the Positional Encoding module, see more details in the PositionalEncoding class.
- forward(src, src_mask=None, src_key_padding_mask=None)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
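The three differences listed in the class description can be sketched as a data flow. Everything below is an assumption made for illustration: the sinusoidal table, the FeedForwardResidual block, and the PEAttentionSketch name are hypothetical, and the real layer has further components (normalization, dropout, an optional feedforward stage) that are omitted here.

```python
import math

import torch
from torch import nn


def sinusoidal_pe(seq_len: int, d_model: int) -> torch.Tensor:
    """Assumed PE: standard sinusoidal table, shape (seq_len, 1, d_model), d_model even."""
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32) * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(seq_len, 1, d_model)
    pe[:, 0, 0::2] = torch.sin(pos * div)
    pe[:, 0, 1::2] = torch.cos(pos * div)
    return pe


class FeedForwardResidual(nn.Module):
    """Assumed residual block: x + FF(x)."""

    def __init__(self, d_model: int, dim_feedforward: int):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(d_model, dim_feedforward),
            nn.ReLU(),
            nn.Linear(dim_feedforward, d_model),
        )

    def forward(self, x):
        return x + self.ff(x)


class PEAttentionSketch(nn.Module):
    def __init__(self, d_model: int, nhead: int, dim_feedforward: int):
        super().__init__()
        self.res_raw = FeedForwardResidual(d_model, dim_feedforward)
        self.res_enc = FeedForwardResidual(d_model, dim_feedforward)
        self.attn = nn.MultiheadAttention(d_model, nhead)

    def forward(self, x):
        # x: (seq_len, batch_size, d_model), i.e. the batch_first=False layout.
        x_hat = x + sinusoidal_pe(x.shape[0], x.shape[2]).to(x)  # 1. hat{x} = x + PE
        z = self.res_raw(x)                                      # 2. z = Residual(x)
        z_hat = self.res_enc(x_hat)                              #    hat{z} = Residual(hat{x})
        out, _ = self.attn(query=z_hat, key=z_hat, value=z)      # 3. Q, K from hat{z}; V from z
        return out


layer = PEAttentionSketch(d_model=8, nhead=2, dim_feedforward=16)
out = layer(torch.randn(5, 2, 8))
```

The point of the split is that positional information influences which steps attend to each other (through Query and Key) without being mixed into the values that are aggregated.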
- class reagent.models.synthetic_reward.PositionalEncoding(feature_dim=128, dropout=0.0, max_len=100)
Bases:
torch.nn.modules.module.Module
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.synthetic_reward.ResidualBlock(d_model=64, dim_feedforward=128)
Bases:
torch.nn.modules.module.Module
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.synthetic_reward.SequenceSyntheticRewardNet(state_dim: int, action_dim: int, lstm_hidden_size: int, lstm_num_layers: int, lstm_bidirectional: bool, last_layer_activation: str)
Bases:
torch.nn.modules.module.Module
- forward(state: torch.Tensor, action: torch.Tensor)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.synthetic_reward.SequentialMultiArguments(*args: torch.nn.modules.module.Module)
- class reagent.models.synthetic_reward.SequentialMultiArguments(arg: collections.OrderedDict[str, torch.nn.modules.module.Module])
Bases:
torch.nn.modules.container.Sequential
A Sequential container whose forward function can take more than one argument.
- forward(*inputs)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
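One way such a container could work is sketched below. This is not the library's implementation: MultiArgSequential and ConcatLast are hypothetical names, and the rule of unpacking tuple outputs into the next module is an assumption.

```python
import torch
from torch import nn


class MultiArgSequential(nn.Sequential):
    """Hypothetical sketch: like nn.Sequential, but forward accepts several inputs."""

    def forward(self, *inputs):
        for module in self:
            if isinstance(inputs, tuple):
                # Several values (the initial call, or a tuple output): unpack them.
                inputs = module(*inputs)
            else:
                inputs = module(inputs)
        return inputs


class ConcatLast(nn.Module):
    """Hypothetical two-argument module: concatenate along the feature dimension."""

    def forward(self, state, action):
        return torch.cat((state, action), dim=-1)


net = MultiArgSequential(ConcatLast(), nn.Linear(6, 1))
reward = net(torch.randn(4, 3, 4), torch.randn(4, 3, 2))
```

A plain torch.nn.Sequential would reject the two-argument call, because its forward accepts a single input.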
- class reagent.models.synthetic_reward.SingleStepSyntheticRewardNet(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], last_layer_activation: str, use_batch_norm: bool = False, use_layer_norm: bool = False)
Bases:
torch.nn.modules.module.Module
- forward(state: torch.Tensor, action: torch.Tensor)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.synthetic_reward.SyntheticRewardNet(net: torch.nn.modules.module.Module)
Bases:
reagent.models.base.ModelBase
This base class provides basic operations to consume inputs and call a synthetic reward net.
A synthetic reward net (self.net) assumes the input contains only torch.Tensors.
- Expected input shape:
state: seq_len, batch_size, state_dim
action: seq_len, batch_size, action_dim
- Expected output shape:
reward: batch_size, seq_len
- export_mlp()
Export a PyTorch nn to feed to the predictor wrapper.
- forward(training_batch: reagent.core.types.MemoryNetworkInput)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
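The shape contract stated for SyntheticRewardNet can be checked with a toy per-step network. The MLP below is a hypothetical stand-in for self.net; only the input and output shapes are taken from the description above.

```python
import torch
from torch import nn

seq_len, batch_size, state_dim, action_dim = 5, 3, 4, 2
state = torch.randn(seq_len, batch_size, state_dim)
action = torch.randn(seq_len, batch_size, action_dim)

# Hypothetical per-step reward model applied to each (state, action) pair.
mlp = nn.Sequential(
    nn.Linear(state_dim + action_dim, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)

per_step = mlp(torch.cat((state, action), dim=-1))  # (seq_len, batch_size, 1)
reward = per_step.squeeze(2).transpose(0, 1)        # (batch_size, seq_len)
```

Note the transpose: inputs are sequence-first, while the reward is batch-first.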
- class reagent.models.synthetic_reward.TransformerSyntheticRewardNet(state_dim: int, action_dim: int, d_model: int, nhead: int = 2, num_encoder_layers: int = 2, dim_feedforward: int = 128, dropout: float = 0.0, activation: str = 'relu', last_layer_activation: str = 'leaky_relu', layer_norm_eps: float = 1e-05, max_len: int = 10)
Bases:
torch.nn.modules.module.Module
- forward(state: torch.Tensor, action: torch.Tensor)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- reagent.models.synthetic_reward.ngram(input: torch.Tensor, context_size: int, ngram_padding: torch.Tensor)
reagent.models.world_model module
- class reagent.models.world_model.MemoryNetwork(state_dim, action_dim, num_hiddens, num_hidden_layers, num_gaussians)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
Module contents
- class reagent.models.BatchConstrainedDQN(state_dim, q_network, imitator_network, bcq_drop_threshold)
Bases:
reagent.models.base.ModelBase
- forward(state)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.CategoricalDQN(distributional_network: reagent.models.base.ModelBase, *, qmin: float, qmax: float, num_atoms: int)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- log_dist(state: reagent.core.types.FeatureData) torch.Tensor
- training: bool
- class reagent.models.DirichletFullyConnectedActor(state_dim, action_dim, sizes, activations, use_batch_norm=False)
Bases:
reagent.models.base.ModelBase
- EPSILON = 1e-06
- forward(state)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- get_log_prob(state, action)
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.DuelingQNetwork(*, shared_network: reagent.models.base.ModelBase, advantage_network: reagent.models.base.ModelBase, value_network: reagent.models.base.ModelBase)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, possible_actions_mask: Optional[torch.Tensor] = None) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- classmethod make_fully_connected(state_dim: int, action_dim: int, layers: List[int], activations: List[str], num_atoms: Optional[int] = None, use_batch_norm: bool = False)
- training: bool
- class reagent.models.EmbeddingBagConcat(state_dim: int, model_feature_config: reagent.core.types.ModelFeatureConfig, embedding_dim: int)
Bases:
reagent.models.base.ModelBase
Concatenates embeddings with float features before passing the input to the DQN.
- feat2table: Dict[str, str]
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- property output_dim: int
- training: bool
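The concatenation described for EmbeddingBagConcat can be sketched with torch.nn.EmbeddingBag directly. The table size, pooling mode, and concatenation order below are assumptions chosen for the illustration, not values taken from the library.

```python
import torch
from torch import nn

# One assumed embedding table: 10 ids, 3-dimensional embeddings, sum pooling.
bag = nn.EmbeddingBag(num_embeddings=10, embedding_dim=3, mode="sum")

# Two examples with variable-length id lists: [1, 2] and [4, 5, 4, 3].
ids = torch.tensor([1, 2, 4, 5, 4, 3])
offsets = torch.tensor([0, 2])
pooled = bag(ids, offsets)               # (2, 3): one pooled vector per example

float_features = torch.randn(2, 4)       # dense features, state_dim = 4
dqn_input = torch.cat((float_features, pooled), dim=1)  # (2, 7)
```

Under these assumptions the width seen by the downstream network is state_dim plus embedding_dim per table, which is presumably what the output_dim property reports.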
- class reagent.models.FullyConnectedActor(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], use_batch_norm: bool = False, action_activation: str = 'tanh', exploration_variance: Optional[float] = None)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData) reagent.core.types.ActorOutput
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.FullyConnectedCritic(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], use_batch_norm: bool = False, use_layer_norm: bool = False, output_dim: int = 1)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.FullyConnectedDQN(state_dim, action_dim, sizes, activations, *, output_activation: str = 'linear', num_atoms: Optional[int] = None, use_batch_norm: bool = False, dropout_ratio: float = 0.0, normalized_output: bool = False, use_layer_norm: bool = False)
Bases:
reagent.models.fully_connected_network.FloatFeatureFullyConnected
- forward(state: reagent.core.types.FeatureData, possible_actions_mask: Optional[torch.Tensor] = None) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class reagent.models.FullyConnectedNetwork(layers, activations, *, use_batch_norm: bool = False, min_std: float = 0.0, dropout_ratio: float = 0.0, use_layer_norm: bool = False, normalize_output: bool = False, orthogonal_init: bool = False)
Bases:
reagent.models.base.ModelBase
- forward(input: torch.Tensor) torch.Tensor
Forward pass for generic feed-forward DNNs. Assumes activation names are valid PyTorch activation names.
- Parameters
input – input tensor
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.GaussianFullyConnectedActor(state_dim: int, action_dim: int, sizes: List[int], activations: List[str], scale: float = 0.05, use_batch_norm: bool = False, use_layer_norm: bool = False, use_l2_normalization: bool = False)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- get_log_prob(state: reagent.core.types.FeatureData, squashed_action: torch.Tensor)
Action is expected to be squashed with tanh
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
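For get_log_prob, "squashed with tanh" suggests the usual change-of-variables correction for a tanh-squashed Gaussian. The sketch below shows that standard formula; the function name, the eps clamp, and the (batch_size, 1) output shape are assumptions and may differ from what GaussianFullyConnectedActor actually does.

```python
import torch


def squashed_gaussian_log_prob(loc, scale, squashed_action, eps=1e-6):
    """Log density of a = tanh(u), with u ~ Normal(loc, scale), summed over action dims."""
    # Keep the action strictly inside (-1, 1) so atanh stays finite.
    clipped = squashed_action.clamp(-1 + eps, 1 - eps)
    raw_action = torch.atanh(clipped)
    normal = torch.distributions.Normal(loc, scale)
    # Change of variables: log p(a) = log p(u) - log(1 - tanh(u)^2).
    log_prob = normal.log_prob(raw_action) - torch.log(1 - clipped.pow(2) + eps)
    return log_prob.sum(dim=1, keepdim=True)


loc = torch.zeros(1, 2)
scale = torch.ones(1, 2)
log_prob = squashed_gaussian_log_prob(loc, scale, torch.zeros(1, 2))
```

Passing an unsquashed action (outside the open interval from -1 to 1) would be clamped here rather than rejected, which is one reason the caller is expected to supply the squashed value.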
- class reagent.models.MLPScorer(mlp: torch.nn.modules.module.Module, has_user_feat: bool = False)
Bases:
reagent.models.base.ModelBase
Log-space in and out
- forward(obs: reagent.core.types.FeatureData)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.ModelBase
Bases:
torch.nn.modules.module.Module
A base class to support exporting through ONNX
- cpu_model()
Override this in DistributedDataParallel models
- feature_config() Optional[reagent.core.types.ModelFeatureConfig]
If the model needs additional preprocessing, e.g., using sequence features, returns the config here.
- get_distributed_data_parallel_model()
Return DistributedDataParallel version of this model
This needs to be implemented explicitly because: 1) a model with an EmbeddingBag module is not compatible with vanilla DistributedDataParallel; 2) the exporting logic needs structured data, and DistributedDataParallel doesn’t work with structured data.
- get_target_network()
Return a copy of this network to be used as the target network.
Subclasses should override this if the target network should share parameters with the network being trained.
- input_prototype() Any
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
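What "a copy of this network" means for a target network can be sketched with copy.deepcopy. Whether ModelBase uses deepcopy internally is an assumption; the sketch only shows the behaviour a default, non-sharing copy would have.

```python
import copy

import torch
from torch import nn

online = nn.Linear(4, 2)
# Assumed default behaviour: an independent copy with its own parameters.
target = copy.deepcopy(online)
initially_equal = torch.equal(online.weight, target.weight)

# Simulate a training update on the online network only.
with torch.no_grad():
    online.weight.add_(1.0)

still_equal = torch.equal(online.weight, target.weight)
```

After the update the two networks diverge, which is the point of a target network; a subclass that wants shared parameters would have to override the method, as the description says.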
- class reagent.models.ParametricDuelingQNetwork(*, shared_network: reagent.models.base.ModelBase, advantage_network: reagent.models.base.ModelBase, value_network: reagent.models.base.ModelBase)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData) torch.Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- classmethod make_fully_connected(state_dim: int, action_dim: int, layers: List[int], activations: List[str], use_batch_norm: bool = False)
- training: bool
- class reagent.models.Seq2RewardNetwork(state_dim, action_dim, num_hiddens, num_hidden_layers)
Bases:
reagent.models.base.ModelBase
- forward(state: reagent.core.types.FeatureData, action: reagent.core.types.FeatureData, valid_reward_len: Optional[torch.Tensor] = None)
Forward pass of Seq2Reward
Takes in the current state and uses it as the initial hidden state. The input sequence consists of pure actions only. Outputs the predicted reward after each time step.
- Parameters
actions – (SEQ_LEN, BATCH_SIZE, ACTION_DIM) torch tensor
states – (SEQ_LEN, BATCH_SIZE, STATE_DIM) torch tensor
valid_reward_len – (BATCH_SIZE,) torch tensor
- Returns
predicted accumulated rewards at the last step for the given sequence
acc_reward: (BATCH_SIZE, 1) torch tensor
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().
- training: bool
- class reagent.models.Sequential(*args: torch.nn.modules.module.Module)
- class reagent.models.Sequential(arg: collections.OrderedDict[str, torch.nn.modules.module.Module])
Bases:
torch.nn.modules.container.Sequential, reagent.models.base.ModelBase
Use this instead of torch.nn.Sequential to automate model tracing.
- input_prototype()
This function provides the input for ONNX graph tracing.
The return value should be what is expected by forward().