6 changes: 4 additions & 2 deletions tutorials/W2D1_Macrocircuits/W2D1_Intro.ipynb
Original file line number Diff line number Diff line change
@@ -57,7 +57,9 @@
"source": [
"## Prerequisites\n",
"\n",
"Materials of this day assume you have had the experience of model building in `pytorch` earlier. It would be beneficial too if you had the basics of Linear Algebra before as well as if you had played around with Actor-Critic model in Reinforcement Learning setup."
"In order to get the most out of today's tutorials, it helps to have experience building (simple) neural network models in PyTorch. We will also draw on basic Linear Algebra, so some familiarity with that domain will come in handy. Finally, we will study a specific Reinforcement Learning (RL) algorithm, the Actor-Critic model, so prior exposure to RL is useful. We touched a little bit on RL in W1D2 (\"Comparing Tasks\"), specifically in Tutorial 3 (\"Reinforcement Learning Across Temporal Scales\"); it may be worth referring back to that tutorial and checking out its two videos on Meta-RL.\n",
"\n",
"Today is a little more technical and theory-driven, but it will equip you with the skills, and the appreciation, to work with these very interesting ideas in NeuroAI. Keep in mind how this knowledge helps you appreciate the concept of generalization, the over-arching theme of this entire course: many points today show how learning dynamics arrive at solutions that **generalize well**!"
]
},
{
@@ -210,7 +212,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
"version": "3.9.22"
}
},
"nbformat": 4,
99 changes: 63 additions & 36 deletions tutorials/W2D1_Macrocircuits/W2D1_Tutorial1.ipynb

Large diffs are not rendered by default.

88 changes: 46 additions & 42 deletions tutorials/W2D1_Macrocircuits/W2D1_Tutorial2.ipynb

Large diffs are not rendered by default.

66 changes: 35 additions & 31 deletions tutorials/W2D1_Macrocircuits/W2D1_Tutorial3.ipynb
@@ -23,9 +23,9 @@
"\n",
"__Content creators:__ Ruiyi Zhang\n",
"\n",
"__Content reviewers:__ Xaq Pitkow, Hlib Solodzhuk, Patrick Mineault\n",
"__Content reviewers:__ Xaq Pitkow, Hlib Solodzhuk, Patrick Mineault, Alex Murphy\n",
"\n",
"__Production editors:__ Konstantine Tsafatinos, Ella Batty, Spiros Chavlis, Samuele Bolotta, Hlib Solodzhuk"
"__Production editors:__ Konstantine Tsafatinos, Ella Batty, Spiros Chavlis, Samuele Bolotta, Hlib Solodzhuk, Alex Murphy"
]
},
{
@@ -96,7 +96,8 @@
"# @title Install and import feedback gadget\n",
"\n",
"!pip install vibecheck datatops --quiet\n",
"!pip install pandas~=2.0.0 --quiet\n",
"!pip install pandas --quiet\n",
"!pip install scikit-learn --quiet\n",
"\n",
"from vibecheck import DatatopsContentReviewContainer\n",
"def content_review(notebook_section: str):\n",
@@ -1789,7 +1790,7 @@
"---\n",
"# Section 2: Evaluate agents in the training task\n",
"\n",
"Estimated timing to here from start of tutorial: 25 minutes"
"*Estimated timing to here from start of tutorial: 25 minutes*"
]
},
{
@@ -1798,7 +1799,7 @@
"execution": {}
},
"source": [
"With the code for the environment and agents done, we will now write an evaluation function allowing the agent to interact with the environment."
"With the code for the environment and agents complete, we will now write an evaluation function that lets the agent interact with the environment so that the quality of the model can be assessed."
]
},
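The evaluation function itself is in one of the cells not shown in this diff, so here is a minimal sketch of the kind of agent–environment rollout loop such a function implements. `ToyEnv`, `ToyAgent`, and the 1-D steering task below are stand-ins invented for this sketch, not the tutorial's actual classes.

```python
# Illustrative sketch only: a generic rollout/evaluation loop.
# ToyEnv and ToyAgent are invented stand-ins, not the tutorial's classes.

class ToyEnv:
    """Minimal 1-D stand-in task: steer the position to a target."""
    def __init__(self, target, max_steps=50, tolerance=0.1):
        self.target, self.max_steps, self.tolerance = target, max_steps, tolerance

    def reset(self):
        self.pos, self.t = 0.0, 0
        return (self.pos, self.target)

    def step(self, action):
        self.pos += action
        self.t += 1
        on_target = abs(self.pos - self.target) < self.tolerance
        done = on_target or self.t >= self.max_steps
        reward = 1.0 if on_target else 0.0
        return (self.pos, self.target), reward, done

class ToyAgent:
    """Stand-in policy: move halfway toward the target each step."""
    def act(self, obs):
        pos, target = obs
        return 0.5 * (target - pos)

def evaluate(agent, env):
    """Roll out one episode; return (rewarded, number_of_steps)."""
    obs = env.reset()
    done, reward = False, 0.0
    while not done:
        obs, reward, done = env.step(agent.act(obs))
    return reward > 0, env.t

rewarded, steps = evaluate(ToyAgent(), ToyEnv(target=3.0))
```

The same reset/act/step/done pattern applies regardless of how complex the agent and environment are; only the observation and action spaces change.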
{
@@ -1816,7 +1817,7 @@
"execution": {}
},
"source": [
"We first sample 1000 targets for agents to steer to."
"We first sample 1,000 targets for the RL agent to steer towards."
]
},
{
@@ -2061,7 +2062,7 @@
"execution": {}
},
"source": [
"Since training RL agents takes a lot of time, here we load the pre-trained modular and holistic agents and evaluate these two agents on the same sampled 1000 targets. We will then store the evaluation data in pandas dataframes."
"Since training RL agents takes a lot of time, here we load the pre-trained modular and holistic agents and evaluate both on the same sample of 1,000 targets. We will then store the evaluation data in `pandas` DataFrame objects."
]
},
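As a sketch of what "store the evaluation data in `pandas` DataFrames" can look like, the snippet below collects per-trial records into a DataFrame. The column names and the random placeholder outcomes are assumptions made for this illustration, not the tutorial's actual schema or data.

```python
# Illustrative sketch: collecting per-trial evaluation results into a pandas
# DataFrame. Column names and placeholder outcomes are assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n_trials = 1000

records = [
    {
        "trial": i,
        "agent": agent_name,
        "rewarded": bool(rng.random() < 0.9),  # placeholder outcome
        "steps": int(rng.integers(10, 40)),    # placeholder episode length
    }
    for agent_name in ("modular", "holistic")
    for i in range(n_trials)
]
df = pd.DataFrame.from_records(records)

# One convenience of the DataFrame format: easy per-agent summaries.
fraction_rewarded = df.groupby("agent")["rewarded"].mean()
```

Keeping one row per trial makes later analyses (grouping by agent, filtering rewarded trials, plotting episode lengths) one-liners.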
{
@@ -2499,9 +2500,11 @@
"execution": {}
},
"source": [
"It is well known that an RL agent's performance can vary significantly with different random seeds. Therefore, no conclusions can be drawn based on one training run with a single random seed.\n",
"It is well known that an RL agent's performance can vary significantly with different random seeds, so no conclusions can be drawn from one training run with a single random seed. To draw more convincing conclusions, we must repeat the experiment across different random initializations and check that the result holds robustly across them.\n",
"\n",
"Both agents were trained with eight random seeds, and all of them were evaluated using the same sample of $1000$ targets. Let's load this saved trajectory data."
"Both agents were trained across 8 random seeds. All of them were evaluated using the same sample of 1,000 targets.\n",
"\n",
"Let's load this saved trajectory data."
]
},
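The per-seed analysis described above can be sketched as follows. The data layout and the random placeholder outcomes are assumptions standing in for the saved trajectory data.

```python
# Sketch (assumed data layout): rewarded/unrewarded outcomes per seed;
# we compute the rewarded fraction per seed and summarize across seeds.
import random
import statistics

random.seed(0)
n_seeds, n_trials = 8, 1000

# Placeholder outcomes standing in for the loaded trajectory data.
outcomes = {
    seed: [random.random() < 0.9 for _ in range(n_trials)]
    for seed in range(n_seeds)
}

fractions = [sum(outcomes[s]) / n_trials for s in range(n_seeds)]
mean_fraction = statistics.mean(fractions)
seed_spread = statistics.stdev(fractions)  # variability across seeds
```

Reporting `mean_fraction` together with `seed_spread` (or the individual per-seed points) is what lets a reader judge whether a difference between agents is robust to initialization.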
{
@@ -2536,7 +2539,7 @@
"execution": {}
},
"source": [
"We first compute the fraction of rewarded trials in the total $1000$ trials for all training runs with different random seeds for the modular and holistic agents. We visualize this using a bar plot, with each red dot denoting the performance of a random seed."
"We first compute the fraction of rewarded trials in the total 1,000 trials for all training runs with different random seeds for the modular and holistic agents. We visualize this using a bar plot, with each red dot denoting the performance of a random seed."
]
},
{
@@ -2592,9 +2595,9 @@
"execution": {}
},
"source": [
"Despite similar performance measured by a rewarded fraction, we dis observe qualitative differences in the trajectories of the two agents in the previous sections. It is possible that the holistic agent's more curved trajectories, although reaching the target, are less efficient, i.e., they waste more time.\n",
"Despite similar performance measured by a rewarded fraction, we did observe qualitative differences in the trajectories of the two agents in the previous sections. It is possible that the holistic agent's more curved trajectories, although reaching the target, are less efficient, i.e., they waste more time.\n",
"\n",
"Therefore, we also plot the time spent by both agents for the same 1000 targets."
"Therefore, we also plot the time spent by both agents for the same 1,000 targets."
]
},
{
@@ -2774,7 +2777,9 @@
"---\n",
"# Section 3: A novel gain task\n",
"\n",
"Estimated timing to here from start of tutorial: 50 minutes"
"*Estimated timing to here from start of tutorial: 50 minutes*\n",
"\n",
"The prior task used a fixed joystick gain, which mapped joystick input to consistent linear and angular velocities. We will now look at a novel task that tests the generalization capabilities of these models by varying this gain between training and testing. Will the model generalize well?"
]
},
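The gain manipulation can be sketched in a few lines. The specific gain values and the doubling of the gain at test time are illustrative assumptions, not the tutorial's actual numbers.

```python
# Sketch of the joystick-gain idea (numbers are illustrative assumptions):
# joystick deflection is mapped to velocities through a gain factor, and the
# novel task changes that gain between training and testing.

def velocities(joystick, gain):
    """Map joystick deflection (v_in, w_in) to (linear, angular) velocity."""
    v_in, w_in = joystick
    return gain * v_in, gain * w_in

train_gain = 1.0
novel_gain = 2.0 * train_gain  # assumed larger test-time gain, for illustration

joystick = (0.5, 0.25)
v_train = velocities(joystick, train_gain)
v_novel = velocities(joystick, novel_gain)
```

The same joystick input now produces different velocities, so an agent that memorized input–outcome mappings fails, while one that learned the task's structure can adapt.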
{
@@ -3208,22 +3213,7 @@
"execution": {}
},
"source": [
"---\n",
"# Summary\n",
"\n",
"*Estimated timing of tutorial: 1 hour*\n",
"\n",
"In this tutorial, we explored the difference in agents' performance based on their architecture. We revealed that modular architecture, with separate modules for learning different aspects of behavior, is superior to a holistic architecture with a single module. The modular architecture with stronger inductive bias achieves good performance faster and has the capability to generalize to other tasks as well. Intriguingly, this modularity is a property we also observe in the brains, which could be important for generalization in the brain as well."
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Bonus Section 1: Decoding analysis"
"## Decoding analysis"
]
},
{
@@ -3355,7 +3345,21 @@
},
"source": [
"---\n",
"# Bonus Section 2: Generalization, but no free lunch\n",
"# The Big Picture\n",
"\n",
"*Estimated timing of tutorial: 1 hour*\n",
"\n",
"In this tutorial, we explored the difference in agents' performance based on their architecture. We revealed that a modular architecture, with separate modules for learning different aspects of behavior, is superior to a holistic architecture with a single module. The modular architecture, with its stronger inductive bias, achieves good performance faster and generalizes to other tasks as well. Intriguingly, this modularity is a property we also observe in the brain, where it could likewise be important for generalization."
]
},
{
"cell_type": "markdown",
"metadata": {
"execution": {}
},
"source": [
"---\n",
"# Addendum: Generalization, but no free lunch\n",
"\n",
"The No Free Lunch theorems prove that no inductive bias can excel across all tasks. This [paper](https://www.science.org/doi/10.1126/sciadv.adk1256) showed that agents with a modular architecture can acquire the underlying structure of the training task. In contrast, holistic agents tend to acquire different knowledge during training, such as forming beliefs based on unreliable information sources or exhibiting less efficient control actions. The novel gain task has a structure similar to the training task; consequently, a modular agent that accurately learns the training task's structure can leverage that knowledge in the novel task.\n",
"\n",
@@ -3406,7 +3410,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
"version": "3.9.22"
}
},
"nbformat": 4,
6 changes: 4 additions & 2 deletions tutorials/W2D1_Macrocircuits/instructor/W2D1_Intro.ipynb
@@ -57,7 +57,9 @@
"source": [
"## Prerequisites\n",
"\n",
"Materials of this day assume you have had the experience of model building in `pytorch` earlier. It would be beneficial too if you had the basics of Linear Algebra before as well as if you had played around with Actor-Critic model in Reinforcement Learning setup."
"In order to get the most out of today's tutorials, it helps to have experience building (simple) neural network models in PyTorch. We will also draw on basic Linear Algebra, so some familiarity with that domain will come in handy. Finally, we will study a specific Reinforcement Learning (RL) algorithm, the Actor-Critic model, so prior exposure to RL is useful. We touched a little bit on RL in W1D2 (\"Comparing Tasks\"), specifically in Tutorial 3 (\"Reinforcement Learning Across Temporal Scales\"); it may be worth referring back to that tutorial and checking out its two videos on Meta-RL.\n",
"\n",
"Today is a little more technical and theory-driven, but it will equip you with the skills, and the appreciation, to work with these very interesting ideas in NeuroAI. Keep in mind how this knowledge helps you appreciate the concept of generalization, the over-arching theme of this entire course: many points today show how learning dynamics arrive at solutions that **generalize well**!"
]
},
{
@@ -210,7 +212,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.19"
"version": "3.9.22"
}
},
"nbformat": 4,