Inspirit AI Deep Dive - Self Driving Car Project (Mar 2022)

William Feng, William Kim, Yudhiishbala Senthilkumar, Emily Joseph, Joshua Li, Sean Hwang

Key terms
- State: All the information that is required to make a decision.
- Action: Commands or tasks performed by the program code.
- Rewards: A certain reaction for an action in the reinforcement learning algorithm.
- Q-Function: Q is a function that determines the quality of a certain (state, action) pair, or an estimate of the best possible total reward I get for taking an action at a certain state. So Q(s, a) gives me the quality of taking action a from state s and then behaving optimally.

State space, Action space, and Rewards (William)
- State space - all the possible states or configurations the self-driving car can be in. In our project, each state is a 96x96x3 image.
- Action space - all the possible actions the self-driving car can take: direction, step on the gas, brake.
- Rewards - basically what we want the self-driving car to do (do we want the car to be in the left lane? The right lane?). If the car is doing what we want it to do, that is a reward.

A self-driving car has an environment where it can perform multiple actions and be in certain states to get rewards.

Rule-based approach
Using our road image, we separate road pixel colours from all other colours, turning "what we see" into "what the computer sees as road". If the front of the car is road (state):
1. Count the road pixels on the left and right half of the grid
2. Turn in the direction with more road pixels (action)
3. Slow down + turn until the front is road

Behavioral cloning
Behavioral cloning mimics what a human would do: in behavioral cloning, the policy tries to imitate the human's decisions. A policy takes in a state and outputs an action. Instead of a list of manual rules, we used a convolutional neural network.
- CNN data: a human played the driving simulation, and the computer remembers the human's decisions at each time step.
- CNN output: an action (accelerate, turn left, turn right, stay).
The behavioral cloning approach is not perfect, but it is a solid starting point. It is a good way to transfer some human intuition to complex tasks, but the policy does not know what to do when the car veers off the track.

→ Let's move on to RL algorithms to learn a policy from scratch, without any human teacher at all. These systems do not start off with any knowledge.

Exploring versus Exploiting (Epsilon Greedy)
When the Q-learning agent is training, what policy should it follow?
- Exploring - sampling from a set of actions in the action space.
- Exploiting - taking advantage of what the agent has already learned.

Deep Q-Learning
Deep Q-Learning is a kind of learning process that requires 2 neural networks: it replaces the regular Q-table with neural networks. Deep Q-Learning agents require:
1. An environment to give us observations/rewards/actions
2. A neural network used to approximate the Q-Function
Because they start with no knowledge, Deep Q-Learning systems take a lot longer to train compared to behavioral cloning.
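The rule-based steps above can be sketched in a few lines. The colour threshold for "road" is an assumption for illustration (the project's actual colour separation is not shown in the post), and the action names are hypothetical labels:

```python
import numpy as np

def road_mask(image):
    """Boolean mask of 'road' pixels in a 96x96x3 RGB image.
    Assumed threshold: mid-grey pixels count as road (illustrative only)."""
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    greyish = (abs(r.astype(int) - g) < 20) & (abs(g.astype(int) - b) < 20)
    return greyish & (r > 80) & (r < 180)

def rule_based_action(image):
    """1. Count road pixels on the left and right half of the grid.
       2. Turn in the direction with more road pixels."""
    mask = road_mask(image)
    left = mask[:, :48].sum()    # left half of the 96-pixel-wide grid
    right = mask[:, 48:].sum()   # right half
    if left > right:
        return "turn_left"
    if right > left:
        return "turn_right"
    return "accelerate"          # road is straight ahead

# Toy frame: grey "road" only on the right half, so the car should turn right.
frame = np.zeros((96, 96, 3), dtype=np.uint8)
frame[:, 48:] = 120
print(rule_based_action(frame))  # -> turn_right
```

Counting pixels per half-image is exactly the state-to-action rule described: the state is "where the road pixels are", the action is "steer toward them".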
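Behavioral cloning as described (record the human's decision at each time step, then have the policy copy it) can be shown in miniature. The two-number states and the nearest-neighbour lookup are assumptions to keep the sketch self-contained; the real project used a CNN over 96x96x3 images:

```python
# (state, human action) pairs logged at each time step of the simulation.
# Each toy state is (fraction of road on the left, fraction on the right).
demonstrations = [
    ((0.9, 0.1), "turn_left"),   # road mostly on the left
    ((0.1, 0.9), "turn_right"),  # road mostly on the right
    ((0.5, 0.5), "accelerate"),  # road straight ahead
]

def cloned_policy(state):
    """Mimic the human: reuse the action from the most similar recorded state."""
    nearest = min(demonstrations,
                  key=lambda d: sum((a - b) ** 2 for a, b in zip(d[0], state)))
    return nearest[1]

print(cloned_policy((0.8, 0.2)))  # -> turn_left
```

This also shows the approach's stated weakness: a state far from anything the human ever visited (e.g. the car off the track) still gets matched to some demonstration, and the copied action may be wrong.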
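The Q-function definition ("the quality of taking action a from state s and then behaving optimally") is learned with the standard tabular Q-learning update; the update rule, learning rate, and discount below are the textbook form, assumed rather than quoted from the post:

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9                 # learning rate and discount (illustrative)
ACTIONS = ["left", "right", "gas", "brake"]

Q = defaultdict(float)                  # Q[(state, action)] -> estimated quality

def update(state, action, reward, next_state):
    """Move Q(s, a) toward reward + gamma * max_a' Q(s', a'),
    i.e. toward 'this reward plus behaving optimally afterwards'."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

update("on_road", "gas", 1.0, "on_road")
print(Q[("on_road", "gas")])  # 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5
```

Repeated updates from many (state, action, reward, next state) transitions make Q(s, a) converge to the best possible total reward the definition describes.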
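The epsilon-greedy answer to "what policy should the agent follow while training?" is a coin flip between the two options named above; the epsilon value and action names are illustrative:

```python
import random

def epsilon_greedy(Q, state, actions, epsilon=0.1):
    """Exploring: with probability epsilon, sample a random action.
    Exploiting: otherwise, take advantage of what the agent has learned
    by picking the action with the highest current Q-value."""
    if random.random() < epsilon:
        return random.choice(actions)                                # explore
    return max(actions, key=lambda a: Q.get((state, a), 0.0))        # exploit

Q = {("s0", "left"): 0.1, ("s0", "right"): 0.9}
print(epsilon_greedy(Q, "s0", ["left", "right"], epsilon=0.0))  # -> right
```

With epsilon = 0 the agent always exploits; early in training a larger epsilon is typically used so the agent keeps discovering actions its Q-estimates undervalue.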
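"A neural network used to approximate the Q-Function" means a network that maps an observation to one Q-value per action, replacing the Q-table. A minimal NumPy stand-in is below; the dense layer (instead of a CNN), layer sizes, and random weights are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 4  # e.g. accelerate, turn left, turn right, brake

# One hidden layer mapping a flattened 96x96x3 observation to Q-values.
W1 = rng.normal(scale=0.01, size=(96 * 96 * 3, 32))
W2 = rng.normal(scale=0.01, size=(32, N_ACTIONS))

def q_values(observation):
    """Approximate Q(s, .) for every action in one forward pass."""
    x = observation.reshape(-1) / 255.0   # flatten and scale the image state
    h = np.maximum(0.0, x @ W1)           # ReLU hidden layer
    return h @ W2                         # one Q-value per action

obs = rng.integers(0, 256, size=(96, 96, 3))  # a random 96x96x3 "state"
q = q_values(obs)
print(q.shape)        # (4,)
print(int(np.argmax(q)))  # index of the greedy action
```

Training would repeatedly nudge these weights with Q-learning-style targets from environment transitions, which is why, starting from no knowledge, Deep Q-Learning takes much longer than cloning a human.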