Week 6: ML with Python #94
Replies: 5 comments 1 reply
The Colab for reinforcement learning: https://colab.research.google.com/drive/1IlrlS3bB8t1Gd5Pogol4MIwUxlAjhWOQ?usp=sharing
Y'all may have noticed I haven't really been posting in these discussions. Unfortunately, the time I was hoping to devote to this cohort ended up being swallowed up, so I had to drop out from following along. I'm still excited to learn more about this stuff, and I hope to get back to the content someday.
This took me back a few years to my uni days. I used to love software agents. Give SARSA a go, if you haven't already.

I took a look at this tutorial (https://www.datacamp.com/tutorial/introduction-q-learning-beginner-tutorial) and had a play with it. It should be noted that the OpenAI FrozenLake environment creates extra problems for Q-learning. Since falling into a hole ends the session with no reward at all, every time your agent fails to reach the goal it essentially wastes the training episode, which makes it really inefficient to get your agent to learn the environment.

So I created a "walled" environment in which it is not possible for the agent to step into a hole: if it attempts to, it uses up a step and stays where it is. This way the agent can always reach the end (given enough steps in the training episode), and the reward for reaching the end can be distributed back across the whole route. It is SO MUCH more efficient to train your agent this way.

Before this change, it was difficult to get the agent to learn a route on an 8x8 map containing 2 rows that each had only 1 frozen square. After modifying the environment, I was able to get a 150x150 map, with a significant number of holes, learnt reasonably quickly.

I've attached a GIF of the map and the agent navigating it. Oh, I also added diagonal movement to make it more interesting. You will need to zoom in to see the agent moving.
Getting Started
The goal this week is to get to the "Conclusion" lesson of freeCodeCamp's ML with Python course and check in here.
Learning Resources
Check-in