Robust Probabilistic Imitation Learning (RPIL) is a method I conceived in which a set of expert demonstrations is modeled as having two sources: a true expert and an adversary. Using logistic regression, the problem can be posed as a mixture of multinomial logistic regression models, and the resulting non-convex optimization can be solved with an Expectation-Maximization (EM) style algorithm. Experimentally, I show that this algorithm can detect and remove adversarial demonstrations from the training set and thus perform much better than if all demonstrations are assumed to be correct.
Imitation learning (IL) attempts to teach an autonomous agent a task from demonstrations. Many conventional IL frameworks assume these demonstrations are expert (correct) and homogeneous (optimizing the same reward function). This assumption, however, often does not hold in real-world applications. It is therefore of interest to create an IL method that is robust to adversarial demonstrations, i.e., one that identifies and removes incorrect demonstrations from the dataset. This project shows that it is possible to simultaneously and autonomously identify and remove adversarial demonstrations from the training set while learning the task. It is shown that this can be done through a probabilistic re-weighting of the demonstrations using an Expectation-Maximization-like algorithm. The results of this method are demonstrated on several common IL baseline problems. A more in-depth discussion of the theory and results can be found in the paper and presentation. The paper for this project, along with the code, can be found on GitHub.
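To make the EM-style re-weighting concrete, here is a minimal sketch of a two-component mixture of logistic regression policies on synthetic one-dimensional data. The data-generating process, the 25% corruption rate, the component names, and the use of scikit-learn are all illustrative assumptions for this sketch, not the project's actual implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for demonstrations (illustrative only): the expert
# picks action 1 when x > 0; the adversary flips that action.
X = rng.normal(size=(200, 1))
y = (X[:, 0] > 0).astype(int)
adv = rng.random(200) < 0.25          # 25% of samples are adversarial
y[adv] = 1 - y[adv]

# w[i] = posterior probability that sample i came from the expert.
# Random initialization breaks the symmetry between the two components.
w = rng.random(len(y))
pi = 0.5                               # prior weight of the expert component
expert = LogisticRegression()
adversary = LogisticRegression()

for _ in range(50):
    # M-step: fit each policy on responsibility-weighted samples.
    expert.fit(X, y, sample_weight=w + 1e-6)
    adversary.fit(X, y, sample_weight=(1 - w) + 1e-6)
    # E-step: recompute responsibilities from each policy's likelihood
    # of the observed action.
    p_e = expert.predict_proba(X)[np.arange(len(y)), y]
    p_a = adversary.predict_proba(X)[np.arange(len(y)), y]
    w = pi * p_e / (pi * p_e + (1 - pi) * p_a)
    pi = w.mean()

# Resolve label switching: here the expert is the majority component.
if w.mean() < 0.5:
    w = 1 - w

# Low-responsibility samples are flagged as adversarial and can be
# removed before fitting the final policy.
flagged = w < 0.5
```

After EM converges, training only on the unflagged samples gives the "robust" policy; samples near the decision boundary are ambiguous between the two components and tend to stay with the expert, which is the expected behavior of a soft mixture model.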