harl.utils package¶
Submodules¶
harl.utils.configs_tools module¶
Tools for loading and updating configs.
- harl.utils.configs_tools.convert_json(obj)[source]¶
Convert obj to a version which can be serialized with JSON.
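As a rough illustration of what "a version which can be serialized with JSON" can mean, here is a minimal sketch. The helper name `to_json_safe` and its fallback rules are assumptions made for this example and are not taken from the `harl` source.

```python
import json


def to_json_safe(obj):
    """Hypothetical helper: recursively convert obj into JSON-serializable data."""
    # Anything json.dumps already accepts is returned unchanged.
    try:
        json.dumps(obj)
        return obj
    except TypeError:
        pass
    # Containers are converted element by element; dict keys become strings.
    if isinstance(obj, dict):
        return {str(to_json_safe(k)): to_json_safe(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_json_safe(x) for x in obj]
    # Functions and classes are recorded by name (an assumption of this sketch).
    if hasattr(obj, "__name__"):
        return obj.__name__
    # Last resort: fall back to the string form of the object.
    return str(obj)
```

With a helper of this kind, a config dict holding functions, classes, or sets can be passed to `json.dumps` without raising `TypeError`.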
- harl.utils.configs_tools.get_defaults_yaml_args(algo, env)[source]¶
Load config file for user-specified algo and env.
- Parameters:
algo: (str) Algorithm name.
env: (str) Environment name.
- Returns:
algo_args: (dict) Algorithm config.
env_args: (dict) Environment config.
- harl.utils.configs_tools.init_dir(env, env_args, algo, exp_name, seed, logger_path)[source]¶
Init directory for saving results.
harl.utils.discrete_util module¶
- harl.utils.discrete_util.gumbel_softmax(logits, device, temperature=1.0, hard=False)[source]¶
Sample from the Gumbel-Softmax distribution and optionally discretize.
- Parameters:
logits: [batch_size, n_class] unnormalized log-probs
temperature: positive scalar
hard: if True, take argmax, but differentiate w.r.t. soft sample y
- Returns:
[batch_size, n_class] sample from the Gumbel-Softmax distribution. If hard=True, the returned sample is one-hot; otherwise it is a probability distribution that sums to 1 across classes.
- harl.utils.discrete_util.gumbel_softmax_sample(logits, temperature, device)[source]¶
Draw a sample from the Gumbel-Softmax distribution.
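The sampling procedure can be sketched in plain Python for a single row of logits. This is an illustrative sketch only: the name `gumbel_softmax_row` and the `rng` parameter are hypothetical, it works on lists rather than tensors, and it does not reproduce the gradient behaviour of the `hard=True` path, which needs an autograd framework.

```python
import math
import random


def gumbel_softmax_row(logits, temperature=1.0, hard=False, rng=random):
    """Hypothetical single-row sketch; logits is a list of floats."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    eps = 1e-20  # guards the logarithms against log(0)
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        u = rng.random()
        g = -math.log(-math.log(u + eps) + eps)
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]
    if not hard:
        return soft
    # Discretize: one-hot vector at the position of the largest probability.
    best = soft.index(max(soft))
    return [1.0 if i == best else 0.0 for i in range(len(soft))]
```

Lower temperatures push the soft sample toward a one-hot vector, while higher temperatures flatten it toward a uniform distribution.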
harl.utils.envs_tools module¶
Tools for HARL.
- harl.utils.envs_tools.check(value)[source]¶
Check if value is a numpy array, if so, convert it to a torch tensor.
- harl.utils.envs_tools.get_num_agents(env, env_args, envs)[source]¶
Get the number of agents in the environment.
- harl.utils.envs_tools.get_shape_from_act_space(act_space)[source]¶
Get shape from action space.
- Parameters:
act_space: (gym.spaces) action space
- Returns:
act_shape: (tuple) action shape
- harl.utils.envs_tools.get_shape_from_obs_space(obs_space)[source]¶
Get shape from observation space.
- Parameters:
obs_space: (gym.spaces or list) observation space
- Returns:
obs_shape: (tuple) observation shape
- harl.utils.envs_tools.make_eval_env(env_name, seed, n_threads, env_args)[source]¶
Make env for evaluation.
harl.utils.models_tools module¶
Tools for HARL.
- harl.utils.models_tools.get_active_func(activation_func)[source]¶
Get the activation function.
- Parameters:
activation_func: (str) activation function
- Returns:
(torch.nn) activation function
- harl.utils.models_tools.get_init_method(initialization_method)[source]¶
Get the initialization method.
- Parameters:
initialization_method: (str) initialization method
- Returns:
(torch.nn) initialization method
- harl.utils.models_tools.init(module, weight_init, bias_init, gain=1)[source]¶
Init module.
- Parameters:
module: (torch.nn) module
weight_init: (torch.nn) weight init
bias_init: (torch.nn) bias init
gain: (float) gain
- Returns:
module: (torch.nn) module
- harl.utils.models_tools.init_device(args)[source]¶
Init device.
- Parameters:
args: (dict) arguments
- Returns:
device: (torch.device) device
- harl.utils.models_tools.update_linear_schedule(optimizer, epoch, total_num_epochs, initial_lr)[source]¶
Decrease the learning rate linearly.
- Parameters:
optimizer: (torch.optim) optimizer
epoch: (int) current epoch
total_num_epochs: (int) total number of epochs
initial_lr: (float) initial learning rate
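One common reading of a linear decrease is that the rate starts at `initial_lr` at epoch 0 and reaches zero at `total_num_epochs`. The sketch below assumes that formula; the function name `linear_decay_lr` and the stand-in optimizer are hypothetical, and the only optimizer feature relied on is a `param_groups` list of dicts, which `torch.optim` optimizers also expose.

```python
from types import SimpleNamespace


def linear_decay_lr(optimizer, epoch, total_num_epochs, initial_lr):
    """Hypothetical sketch: set a linearly decayed learning rate on every param group."""
    # Assumed schedule: initial_lr at epoch 0, falling to 0 at total_num_epochs.
    lr = initial_lr * (1.0 - epoch / float(total_num_epochs))
    for group in optimizer.param_groups:
        group["lr"] = lr
    return lr


# Stand-in for a real optimizer, used only to keep the example dependency-free.
fake_optimizer = SimpleNamespace(param_groups=[{"lr": 5e-4}])
for epoch in range(3):
    linear_decay_lr(fake_optimizer, epoch, total_num_epochs=10, initial_lr=5e-4)
```

Calling the function once per epoch, before the parameter update, keeps every parameter group on the same schedule.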
harl.utils.trans_tools module¶
Tools for HARL.
harl.utils.trpo_util module¶
TRPO utility functions.
- harl.utils.trpo_util.conjugate_gradient(actor, obs, rnn_states, action, masks, available_actions, active_masks, b, nsteps, device, residual_tol=1e-10)[source]¶
Conjugate gradient algorithm. Refer to https://github.com/openai/baselines/blob/master/baselines/common/cg.py
- harl.utils.trpo_util.fisher_vector_product(actor, obs, rnn_states, action, masks, available_actions, active_masks, p)[source]¶
Fisher vector product.
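To show how these two utilities usually fit together, the following is a generic conjugate gradient sketch. It approximately solves A x = b using only matrix-vector products. In TRPO-style methods A is the Fisher information matrix, b is typically the policy gradient, and the product A p comes from a Fisher vector product routine. The name `conjugate_gradient_solve` and the `matvec` callback are hypothetical stand-ins, and the sketch uses plain lists instead of the actor, observation, and mask arguments in the signatures above.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient_solve(matvec, b, nsteps=10, residual_tol=1e-10):
    """Hypothetical sketch: approximately solve A x = b, given matvec(p) = A p."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, which equals b because x starts at zero
    p = list(b)  # first search direction
    rdotr = dot(r, r)
    for _ in range(nsteps):
        if rdotr < residual_tol:
            break  # squared residual norm is small enough
        Ap = matvec(p)
        alpha = rdotr / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        new_rdotr = dot(r, r)
        beta = new_rdotr / rdotr
        # The next direction is conjugate to the previous ones with respect to A.
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rdotr = new_rdotr
    return x


# Usage with a small symmetric positive-definite matrix standing in for the Fisher matrix.
A = [[4.0, 1.0], [1.0, 3.0]]
solution = conjugate_gradient_solve(lambda p: [dot(row, p) for row in A], [1.0, 2.0])
```

The solver never needs the matrix itself, only the callback. That is why a Fisher vector product is enough and the full Fisher matrix never has to be built.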