reagent.preprocessing package

Submodules

reagent.preprocessing.batch_preprocessor module

class reagent.preprocessing.batch_preprocessor.BatchPreprocessor

Bases: torch.nn.modules.module.Module

training: bool
class reagent.preprocessing.batch_preprocessor.DiscreteDqnBatchPreprocessor(num_actions: int, state_preprocessor: reagent.preprocessing.preprocessor.Preprocessor, use_gpu: bool = False)

Bases: reagent.preprocessing.batch_preprocessor.BatchPreprocessor

forward(batch: Dict[str, torch.Tensor]) → reagent.core.types.DiscreteDqnInput

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
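The note above can be illustrated with a plain torch.nn.Module: hooks registered on a module fire when the instance is called, but are silently skipped when forward() is invoked directly. A minimal, self-contained sketch (not ReAgent-specific):

```python
import torch

# Hooks run via __call__, not via a direct .forward() invocation.
calls = []
layer = torch.nn.Linear(2, 2)
layer.register_forward_hook(lambda module, inp, out: calls.append("hook"))

x = torch.zeros(1, 2)
layer(x)           # hook fires
layer.forward(x)   # hook silently skipped
```

After both calls, `calls` contains a single entry, confirming only the instance call ran the hook.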

training: bool
class reagent.preprocessing.batch_preprocessor.ParametricDqnBatchPreprocessor(state_preprocessor: reagent.preprocessing.preprocessor.Preprocessor, action_preprocessor: reagent.preprocessing.preprocessor.Preprocessor, use_gpu: bool)

Bases: reagent.preprocessing.batch_preprocessor.BatchPreprocessor

forward(batch: Dict[str, torch.Tensor]) → reagent.core.types.ParametricDqnInput

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class reagent.preprocessing.batch_preprocessor.PolicyNetworkBatchPreprocessor(state_preprocessor: reagent.preprocessing.preprocessor.Preprocessor, action_preprocessor: reagent.preprocessing.preprocessor.Preprocessor, use_gpu: bool = False)

Bases: reagent.preprocessing.batch_preprocessor.BatchPreprocessor

forward(batch: Dict[str, torch.Tensor]) → reagent.core.types.PolicyNetworkInput

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
reagent.preprocessing.batch_preprocessor.batch_to_device(batch: Dict[str, torch.Tensor], device: torch.device)

reagent.preprocessing.identify_types module

reagent.preprocessing.identify_types.identify_type(values, enum_threshold=10)

reagent.preprocessing.normalization module

reagent.preprocessing.normalization.construct_action_scale_tensor(action_norm_params, action_scale_overrides)

Construct tensors that will rescale each action value on each dimension i from [min_serving_value[i], max_serving_value[i]] to [-1, 1] for training.
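The rescaling described above is a standard min-max map onto [-1, 1]. A minimal sketch of the arithmetic (the bounds here are hypothetical, not read from real action_norm_params):

```python
import torch

# Hypothetical serving-range bounds for a 2-dimensional action space.
min_serving = torch.tensor([0.0, -5.0])
max_serving = torch.tensor([10.0, 5.0])

def rescale_to_unit(action: torch.Tensor) -> torch.Tensor:
    """Map each action dimension from [min, max] to [-1, 1]."""
    return 2.0 * (action - min_serving) / (max_serving - min_serving) - 1.0

# The midpoint of each serving range maps to 0.
mid = rescale_to_unit(torch.tensor([5.0, 0.0]))
```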

reagent.preprocessing.normalization.deserialize(parameters_json) → Dict[int, reagent.core.parameters.NormalizationParameters]
reagent.preprocessing.normalization.get_feature_config(float_features: Optional[List[Tuple[int, str]]]) → reagent.core.types.ModelFeatureConfig
reagent.preprocessing.normalization.get_feature_norm_metadata(feature_name, feature_value_list, norm_params)
reagent.preprocessing.normalization.get_feature_start_indices(sorted_features: List[int], normalization_parameters: Dict[int, reagent.core.parameters.NormalizationParameters])

Returns the starting index for each feature in the output feature vector

reagent.preprocessing.normalization.get_num_output_features(normalization_parameters: Dict[int, reagent.core.parameters.NormalizationParameters]) → int
reagent.preprocessing.normalization.identify_parameter(feature_name, values, max_unique_enum_values=10, quantile_size=20, quantile_k2_threshold=1000.0, skip_box_cox=False, skip_quantiles=False, feature_type=None)
reagent.preprocessing.normalization.no_op_feature()
reagent.preprocessing.normalization.serialize(parameters)
reagent.preprocessing.normalization.serialize_one(feature_parameters)
reagent.preprocessing.normalization.sort_features_by_normalization(normalization_parameters: Dict[int, reagent.core.parameters.NormalizationParameters]) → Tuple[List[int], List[int]]

Helper function to return a sorted list from a normalization map. Also returns the starting index for each feature type
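A simplified sketch of the idea behind sorting features and computing start indices in the concatenated output vector. The feature ids, types, and output widths below are hypothetical, and the grouping key is a plain (type, id) sort rather than the library's actual ordering:

```python
# Hypothetical metadata: feature_id -> (feature_type, output width).
# An ENUM feature here occupies one output slot per possible value.
feature_meta = {
    101: ("CONTINUOUS", 1),
    7:   ("ENUM", 3),
    42:  ("CONTINUOUS", 1),
}

# Sort by (feature_type, feature_id), then accumulate widths to get
# each feature's starting index in the output feature vector.
sorted_ids = sorted(feature_meta, key=lambda fid: (feature_meta[fid][0], fid))
start_indices, offset = [], 0
for fid in sorted_ids:
    start_indices.append(offset)
    offset += feature_meta[fid][1]
```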

reagent.preprocessing.postprocessor module

class reagent.preprocessing.postprocessor.Postprocessor(normalization_parameters: Dict[int, reagent.core.parameters.NormalizationParameters], use_gpu: bool)

Bases: torch.nn.modules.module.Module

Inverts the action normalization, mapping normalized actions back to their serving-range values.

forward(input: torch.Tensor) → torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

input_prototype() → Tuple[torch.Tensor]
training: bool

reagent.preprocessing.preprocessor module

class reagent.preprocessing.preprocessor.Preprocessor(normalization_parameters: Dict[int, reagent.core.parameters.NormalizationParameters], use_gpu: Optional[bool] = None, device: Optional[torch.device] = None)

Bases: torch.nn.modules.module.Module

forward(input: torch.Tensor, input_presence_byte: torch.Tensor) → torch.Tensor

Preprocess the input matrix. Parameters: input – the raw input tensor; input_presence_byte – a byte mask indicating which features are present.

input_prototype() → Tuple[torch.Tensor, torch.Tensor]
training: bool

reagent.preprocessing.sparse_preprocessor module

class reagent.preprocessing.sparse_preprocessor.ExplicitMapIDList(id2index: Dict[int, int])

Bases: reagent.preprocessing.sparse_preprocessor.MapIDList

forward(raw_values: torch.Tensor) → torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class reagent.preprocessing.sparse_preprocessor.ExplicitMapIDScoreList(id2index: Dict[int, int])

Bases: reagent.preprocessing.sparse_preprocessor.MapIDScoreList

forward(raw_keys: torch.Tensor, raw_values: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class reagent.preprocessing.sparse_preprocessor.MapIDList

Bases: torch.nn.modules.module.Module

abstract forward(raw_values: torch.Tensor) → torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class reagent.preprocessing.sparse_preprocessor.MapIDScoreList

Bases: torch.nn.modules.module.Module

abstract forward(raw_keys: torch.Tensor, raw_values: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class reagent.preprocessing.sparse_preprocessor.ModuloMapIDList(modulo: int)

Bases: reagent.preprocessing.sparse_preprocessor.MapIDList

forward(raw_values: torch.Tensor) → torch.Tensor

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class reagent.preprocessing.sparse_preprocessor.ModuloMapIDScoreList(modulo: int)

Bases: reagent.preprocessing.sparse_preprocessor.MapIDScoreList

forward(raw_keys: torch.Tensor, raw_values: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
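The Modulo mappers above hash arbitrary sparse ids into a fixed-size embedding table by taking id % modulo. A minimal sketch of that idea:

```python
import torch

# Hash arbitrary sparse ids into a table of `modulo` embedding rows.
modulo = 8
raw_ids = torch.tensor([3, 11, 42, 7], dtype=torch.long)
indices = torch.remainder(raw_ids, modulo)  # e.g. 11 -> 3, 42 -> 2
```

The trade-off is that distinct ids can collide (3 and 11 both map to index 3 here), in exchange for a bounded table size.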
class reagent.preprocessing.sparse_preprocessor.SparsePreprocessor(id2name: Dict[int, str], name2id: Dict[str, int], id_list_mappers: Dict[int, reagent.preprocessing.sparse_preprocessor.MapIDList], id_score_list_mappers: Dict[int, reagent.preprocessing.sparse_preprocessor.MapIDScoreList], device: torch.device)

Bases: torch.nn.modules.module.Module

Performs preprocessing for sparse features (i.e. id_list, id_score_list)

Functionality includes:
  1. changes keys from feature_id to feature_name, for better debuggability
  2. maps sparse ids to embedding table indices based on id_mapping
  3. filters out ids that are not in id2name

preprocess_id_list(id_list: Dict[int, Tuple[torch.Tensor, torch.Tensor]]) → Dict[str, Tuple[torch.Tensor, torch.Tensor]]

Input: rlt.ServingIdListFeature. Output: rlt.IdListFeature.

preprocess_id_score_list(id_score_list: Dict[int, Tuple[torch.Tensor, torch.Tensor, torch.Tensor]]) → Dict[str, Tuple[torch.Tensor, torch.Tensor, torch.Tensor]]

Input: rlt.ServingIdScoreListFeature. Output: rlt.IdScoreListFeature.

training: bool
reagent.preprocessing.sparse_preprocessor.make_sparse_preprocessor(feature_config: reagent.core.types.ModelFeatureConfig, device: torch.device)

Helper to initialize a SparsePreprocessor for TorchScript scripting.

reagent.preprocessing.sparse_to_dense module

class reagent.preprocessing.sparse_to_dense.PythonSparseToDenseProcessor(sorted_features: List[int], set_missing_value_to_zero: bool = False)

Bases: reagent.preprocessing.sparse_to_dense.SparseToDenseProcessor

process(sparse_data: List[Dict[int, float]]) → Tuple[torch.Tensor, torch.Tensor]
class reagent.preprocessing.sparse_to_dense.SparseToDenseProcessor(sorted_features: List[int], set_missing_value_to_zero: bool = False)

Bases: object

class reagent.preprocessing.sparse_to_dense.StringKeySparseToDenseProcessor(sorted_features: List[int], set_missing_value_to_zero: bool = False)

Bases: reagent.preprocessing.sparse_to_dense.SparseToDenseProcessor

Handles the case where the input data is keyed by string.

process(sparse_data: List[Dict[str, float]]) → Tuple[torch.Tensor, torch.Tensor]
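A rough sketch of what a sparse-to-dense processor does: each input row is a dict of present feature values, and the output is a dense value tensor plus a matching presence tensor. This is a simplified Python loop, not the library's implementation:

```python
import torch

# Features appear in a fixed sorted order in the dense output.
sorted_features = [1, 2, 5]
sparse_data = [{1: 0.5, 5: 2.0}, {2: 1.0}]

dense = torch.zeros(len(sparse_data), len(sorted_features))
presence = torch.zeros_like(dense)
for row, features in enumerate(sparse_data):
    for col, fid in enumerate(sorted_features):
        if fid in features:
            dense[row, col] = features[fid]
            presence[row, col] = 1.0
```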

reagent.preprocessing.transforms module

class reagent.preprocessing.transforms.AppendConstant(keys: List[str], dim: int = -1, const: float = 1.0)

Bases: object

Appends a column of constant value at the beginning of the specified dimension. Can be used to add a column of "1" to linear regression input data to capture the intercept/bias term.
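A minimal sketch of the operation with plain torch (prepending a bias column of ones along the last dimension):

```python
import torch

# Prepend a constant column (1.0, for a bias/intercept term) along dim=-1.
x = torch.tensor([[2.0, 3.0], [4.0, 5.0]])
const_col = torch.full((x.shape[0], 1), 1.0)
with_bias = torch.cat([const_col, x], dim=-1)
```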

class reagent.preprocessing.transforms.Cat(input_keys: List[str], output_key: str, dim: int, broadcast: bool = True)

Bases: object

This transform concatenates the tensors along a specified dim

class reagent.preprocessing.transforms.ColumnVector(keys: List[str])

Bases: object

Ensure that the keys are column vectors

class reagent.preprocessing.transforms.Compose(*transforms)

Bases: object

Applies an iterable collection of transform functions
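A minimal sketch of the composition pattern: each transform maps a data dict to a data dict, and they are applied in order. This mirrors the behavior described above but is not the library's implementation:

```python
# Chain dict-to-dict transforms, applying them left to right.
class ComposeSketch:
    def __init__(self, *transforms):
        self.transforms = transforms

    def __call__(self, data: dict) -> dict:
        for t in self.transforms:
            data = t(data)
        return data

double = lambda d: {k: v * 2 for k, v in d.items()}
add_one = lambda d: {k: v + 1 for k, v in d.items()}
pipeline = ComposeSketch(double, add_one)
result = pipeline({"reward": 3})  # double first, then add_one
```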

class reagent.preprocessing.transforms.DenseNormalization(keys: List[str], normalization_data: reagent.core.parameters.NormalizationData, device: Optional[torch.device] = None)

Bases: object

Normalize the keys using normalization_data. The keys are expected to be Tuple[torch.Tensor, torch.Tensor], where the first element is the value and the second element is the presence mask. This transform replaces the keys in the input data.

class reagent.preprocessing.transforms.Filter(*, keep_keys: Optional[List[str]] = None, remove_keys: Optional[List[str]] = None)

Bases: object

Remove some keys from the dict. Can specify keep_keys (they will be kept) or remove_keys (they will be removed)

class reagent.preprocessing.transforms.FixedLengthSequenceDenseNormalization(keys: List[str], sequence_id: int, normalization_data: reagent.core.parameters.NormalizationData, expected_length: Optional[int] = None, device: Optional[torch.device] = None)

Bases: object

Combines the FixedLengthSequences, DenseNormalization, and SlateView transforms

class reagent.preprocessing.transforms.FixedLengthSequences(keys: List[str], sequence_id: int, expected_length: Optional[int] = None, *, to_keys: Optional[List[str]] = None)

Bases: object

Does two things:
  1. makes sure each sequence in the list of keys has the expected fixed length
  2. if to_keys is provided, copies the relevant sequence_id to the new key, otherwise overwrites the old key

Expects each data[key] to be Dict[int, Tuple[Tensor, T]], where:
  - key is the feature id
  - sequence_id is the key of the dict data[key]
  - the first element of the tuple is the offset for each example, which is expected to be at a fixed interval
  - the second element is the data at each step in the sequence

This is mainly for FB internal use, see fbcode/caffe2/caffe2/fb/proto/io_metadata.thrift for the data format extracted from SequenceFeatureMetadata

NOTE: this is not a product between the two lists (keys and to_keys); it sets keys[sequence_id] to to_keys element-wise, in parallel.

class reagent.preprocessing.transforms.GetEye(key: str, size: int)

Bases: object

Places an identity (diagonal) tensor of the given size into the data dictionary.

class reagent.preprocessing.transforms.Lambda(keys: List[str], fn: Callable)

Bases: object

Applies an arbitrary callable transform

class reagent.preprocessing.transforms.MapIDListFeatures(id_list_keys: List[str], id_score_list_keys: List[str], feature_config: reagent.core.types.ModelFeatureConfig, device: torch.device)

Bases: object

Applies a SparsePreprocessor (see sparse_preprocessor.SparsePreprocessor)

class reagent.preprocessing.transforms.MaskByPresence(keys: List[str])

Bases: object

Expect data to be (value, presence) and return value * presence. This zeros out values that aren’t present.
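The masking described above is elementwise multiplication by the presence tensor. A minimal sketch:

```python
import torch

# Multiply values by the presence mask so absent entries become zero.
value = torch.tensor([1.5, -2.0, 3.0])
presence = torch.tensor([1.0, 0.0, 1.0])
masked = value * presence
```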

class reagent.preprocessing.transforms.OneHotActions(keys: List[str], num_actions: int)

Bases: object

Keys should be in the set {0,1,2,…,num_actions}, where a value equal to num_actions denotes that it’s not valid.
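A sketch of the encoding described above, using plain torch: actions equal to num_actions (the invalid marker) end up as an all-zero row. This mimics the described behavior rather than reproducing the library's code:

```python
import torch
import torch.nn.functional as F

# One-hot encode actions in {0, ..., num_actions}; the value num_actions
# marks an invalid action and encodes to an all-zero row.
num_actions = 3
actions = torch.tensor([0, 2, 3])  # 3 == num_actions -> invalid
one_hot = F.one_hot(actions, num_classes=num_actions + 1)[:, :num_actions]
```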

class reagent.preprocessing.transforms.OuterProduct(key1: str, key2: str, output_key: str, drop_inputs: bool = False)

Bases: object

This transform creates a tensor with an outer product of elements of 2 tensors. The outer product is stored under the new key. The 2 input tensors might be dropped, depending on input arguments
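A minimal sketch of a batched outer product with plain torch: for each row, form x ⊗ y and flatten the result into a single feature vector:

```python
import torch

# Batched outer product: (B, m) x (B, n) -> (B, m * n).
x = torch.tensor([[1.0, 2.0]])           # shape (B, m)
y = torch.tensor([[3.0, 4.0, 5.0]])      # shape (B, n)
outer = torch.einsum("bi,bj->bij", x, y).flatten(start_dim=1)
```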

class reagent.preprocessing.transforms.Rename(old_names: List[str], new_names: List[str])

Bases: object

Change key names

class reagent.preprocessing.transforms.SelectValuePresenceColumns(source: str, dest: str, indices: List[int])

Bases: object

Select columns from value-presence source key

class reagent.preprocessing.transforms.SlateView(keys: List[str], slate_size: int)

Bases: object

Assuming that the keys are flattened fixed-length sequences of length slate_size, unflattens them by inserting slate_size as the 1st dim, i.e., turns the input from shape [B * slate_size, D] into [B, slate_size, D].
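The reshape described above is a simple view with plain torch:

```python
import torch

# Unflatten a [B * slate_size, D] tensor into [B, slate_size, D].
slate_size, d = 3, 2
flat = torch.arange(12, dtype=torch.float32).reshape(6, d)  # B * slate_size = 6
slates = flat.view(-1, slate_size, d)
```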

class reagent.preprocessing.transforms.StackDenseFixedSizeArray(keys: List[str], size: int, dtype=torch.float32)

Bases: object

If data is a tensor, ensures it has the correct shape. If data is a list of (value, presence) discards the presence tensors and concatenates the values to output a tensor of shape (batch_size, feature_dim).

class reagent.preprocessing.transforms.UnsqueezeRepeat(keys: List[str], dim: int, num_repeat: int = 1)

Bases: object

This transform adds an extra dimension to the tensor and repeats the tensor along that dimension.
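A minimal sketch of the operation with plain torch (dim and repeat count chosen arbitrarily):

```python
import torch

# Insert a new dimension at position 1 and repeat 4 times along it.
x = torch.tensor([[1.0, 2.0]])             # shape (1, 2)
repeated = x.unsqueeze(1).repeat(1, 4, 1)  # shape (1, 4, 2)
```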

class reagent.preprocessing.transforms.ValuePresence

Bases: object

For every key x, looks for x_presence; if x_presence exists, replaces x with the tuple (x, x_presence) and deletes the x_presence key.
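A sketch of that key-pairing convention in plain Python (this mimics the described behavior, not the library's code):

```python
# Pair each key "x" with its "x_presence" companion and drop the companion.
def pair_value_presence(data: dict) -> dict:
    out = {}
    for key, value in data.items():
        if key.endswith("_presence"):
            continue  # companion keys are folded into their base key
        presence_key = key + "_presence"
        out[key] = (value, data[presence_key]) if presence_key in data else value
    return out

paired = pair_value_presence({"state": [1.0], "state_presence": [1]})
```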

reagent.preprocessing.types module

class reagent.preprocessing.types.InputColumn

Bases: object

ACTION = 'action'
ACTION_PROBABILITY = 'action_probability'
CANDIDATE_FEATURES = 'candidate_features'
EXTRAS = 'extras'
ITEM_MASK = 'item_mask'
ITEM_PROBABILITY = 'item_probability'
MDP_ID = 'mdp_id'
METRICS = 'metrics'
NEXT_ACTION = 'next_action'
NEXT_CANDIDATE_FEATURES = 'next_candidate_features'
NEXT_ITEM_MASK = 'next_item_mask'
NEXT_ITEM_PROBABILITY = 'next_item_probability'
NEXT_STATE_FEATURES = 'next_state_features'
NEXT_STATE_ID_LIST_FEATURES = 'next_state_id_list_features'
NEXT_STATE_ID_SCORE_LIST_FEATURES = 'next_state_id_score_list_features'
NEXT_STATE_SEQUENCE_FEATURES = 'next_state_sequence_features'
NOT_TERMINAL = 'not_terminal'
POSITION_REWARD = 'position_reward'
POSSIBLE_ACTIONS = 'possible_actions'
POSSIBLE_ACTIONS_MASK = 'possible_actions_mask'
POSSIBLE_NEXT_ACTIONS = 'possible_next_actions'
POSSIBLE_NEXT_ACTIONS_MASK = 'possible_next_actions_mask'
REWARD = 'reward'
REWARD_MASK = 'reward_mask'
SCORES = 'scores'
SEQUENCE_NUMBER = 'sequence_number'
SLATE_REWARD = 'slate_reward'
STATE_FEATURES = 'state_features'
STATE_ID_LIST_FEATURES = 'state_id_list_features'
STATE_ID_SCORE_LIST_FEATURES = 'state_id_score_list_features'
STATE_SEQUENCE_FEATURES = 'state_sequence_features'
STEP = 'step'
TIME_DIFF = 'time_diff'
TIME_SINCE_FIRST = 'time_since_first'
VALID_STEP = 'valid_step'
WEIGHT = 'weight'

Module contents