Implementations of PF-LEX, OM-LEX, and NOM-LEX for the multi-armed bandit with lexicographically ordered objectives (Lex-MAB).
In all implementations:
- `A` is the number of arms and `D` is the number of objectives.
- `means` is an AxD matrix, where `means[a,i]` is the expected reward of arm a in objective i; arm 0 is assumed to be lexicographic optimal.
- `K` is the number of individual runs and `T` is the number of rounds.
- `reg` is a DxKxT matrix, where `reg[i,k,t]` is the regret incurred in objective i, individual run k, and round t.
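As a concrete illustration of these conventions, the sketch below sets up `means` and `reg` with the shapes described above. The numerical values, the random arm choices, and the gap-to-arm-0 regret convention are assumptions for illustration, not taken from the implementations.

```python
import numpy as np

# Illustrative sizes; the shapes follow the conventions described above.
A, D, K, T = 3, 2, 5, 1000
rng = np.random.default_rng(0)

# means[a,i]: expected reward of arm a in objective i; here arm 0 is the
# lexicographic optimal arm (hypothetical numbers).
means = np.array([[0.8, 0.6],
                  [0.5, 0.9],
                  [0.7, 0.4]])

# reg[i,k,t]: regret in objective i, individual run k, round t.  The gap to
# arm 0's mean in each objective is one plausible convention (an assumption);
# it can be negative in lower-priority objectives.
reg = np.zeros((D, K, T))
for k in range(K):
    for t in range(T):
        arm = rng.integers(A)            # stand-in for the algorithm's choice
        reg[:, k, t] = means[0] - means[arm]

print(reg.mean(axis=(1, 2)))             # average regret per objective
```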
In the implementation of PF-LEX:
- `dlt` is the confidence term.
- `eps` is proportional to the suboptimality that the learner aims to tolerate.
- `TT` is the period in which the regrets are recorded.

Note that this script takes an argument called `uid`, which is a unique identifier with which the final results are saved. This is to allow for parallel execution of the script.
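A minimal sketch of how the `uid` argument might be wired up is shown below. Only the names `uid`, `dlt`, `eps`, and `TT` come from the description above; the output file name, the default values, and the use of `argparse`/`numpy.save` are assumptions for illustration.

```python
import argparse
import numpy as np

# Hypothetical command-line handling for the PF-LEX script; only the argument
# name `uid` comes from the description above, the rest is assumed.
parser = argparse.ArgumentParser()
parser.add_argument("uid", type=str, help="unique identifier for saving results")
args = parser.parse_args()

dlt = 0.01   # confidence term (illustrative value)
eps = 0.05   # proportional to the tolerated suboptimality (illustrative value)
TT = 100     # regrets recorded every TT rounds (illustrative value)

# ... run PF-LEX here and fill `reg`, recording regrets every TT rounds ...
reg = np.zeros((2, 1, 1000))  # placeholder with the DxKxT layout described above

# Saving under a uid-specific name lets many copies of the script run in
# parallel without overwriting each other's output (file name is assumed).
np.save(f"pf_lex_reg_{args.uid}.npy", reg)
```

Launching several copies of the script with distinct `uid` values then produces independent result files that can be merged afterwards.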
In the implementation of NOM-LEX:
- `eta` is a 1xD matrix, where `eta[0,i]` is the near lexicographic optimal expected reward in objective i.
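One plausible way `eta` could be obtained from `means` is sketched below; relaxing arm 0's expected rewards by a tolerance `eps` is an assumption for illustration, not necessarily how the implementation defines near lexicographic optimality.

```python
import numpy as np

# Hypothetical means matrix; arm 0 is the lexicographic optimal arm.
means = np.array([[0.8, 0.6],
                  [0.5, 0.9],
                  [0.7, 0.4]])

# Assumed construction: relax the optimal arm's expected rewards by eps to get
# the near lexicographic optimal targets (1xD, as described above).
eps = 0.05
eta = (means[0] - eps).reshape(1, -1)    # eta[0,i] per the convention above
print(eta)                               # [[0.75 0.55]]
```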
In addition to algorithms for Lex-MAB, 'satisficing.py' includes an adapted version of NOM-LEX for the multi-armed bandit with satisficing objectives (Sat-MAB), along with an implementation of Satisficing-In-Mean-Rewards UCL (Reverdy et al., "Satisficing in multi-armed bandit problems," 2017).
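For orientation, the sketch below illustrates one common reading of satisficing objectives, in which an arm incurs regret in an objective only when its expected reward falls below a satisfaction threshold. The threshold values and the exact regret form are assumptions, not the repository's definitions.

```python
import numpy as np

# Hypothetical expected rewards and per-objective satisfaction thresholds.
means = np.array([[0.8, 0.6],
                  [0.5, 0.9],
                  [0.7, 0.4]])
thresholds = np.array([0.7, 0.5])

def sat_regret(arm):
    """Assumed per-objective satisficing regret: the shortfall below the
    threshold, and zero once the threshold is met."""
    return np.maximum(0.0, thresholds - means[arm])

print(sat_regret(1))   # arm 1 meets the objective-1 threshold but not objective 0
```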