A key part of getting transitions to happen when we want them to is the design of reasonable cost functions.
We want to penalize and reward the right things. I am going to work through an example of one way you might think about
designing a cost function. Let's consider how we would design a cost function for vehicle speed. On one hand, we want to
get to our destination quickly, but on the other hand, we do not want to break the law. An essential quantity we have to control is the desired velocity of the car.
Some velocities are more beneficial, and some are even illegal. Let's fill in this graph and try to assign a cost to every velocity.
For the sake of simplicity, let's assume that all of the cost functions will have an output between zero and one.
We will adjust the importance of each cost function later by adjusting the weights.
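As a rough sketch of what adjusting the weights could look like, one common approach (assumed here, not stated in the lecture) is to take a weighted sum of the individual costs. The function name `total_cost`, the cost names, and the weight values below are all hypothetical.

```python
def total_cost(costs, weights):
    """Combine individual costs into one number.

    Each cost is assumed to lie between zero and one.
    A weighted sum is an assumption made for illustration.
    """
    return sum(weights[name] * costs[name] for name in costs)


# Hypothetical cost names and weights, chosen only for illustration.
costs = {"speed": 0.5, "lane_change": 0.2}
weights = {"speed": 10.0, "lane_change": 1.0}
combined = total_cost(costs, weights)
```

Because every cost function shares the same zero-to-one range, the weights alone decide how much each one matters.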
Let's say the speed limit for the road we are on is here. Well, we know that if we are going well above the speed limit, that should be maximum cost.
And maybe we want to set an ideal, zero-cost speed that is slightly below the speed limit so that we have some buffer.
And then we can think about how much we want to penalize not moving at all. Obviously, not moving is bad, but maybe not as bad as breaking the speed limit, so we would put it here.
To keep it simple, we could just say there is a linear cost between zero and the target speed.
And since breaking the law is a binary thing, let's just say any speed greater than or equal to the speed limit has maximal cost. And again, we can arbitrarily connect these points with a linear function and a flat maximum cost for anything above
the speed limit. Now, in practice, we might actually want to parametrize some of these quantities so that we could later adjust them until we got the right behavior.
So first, we might define a parameter called stop cost for the zero-velocity case and a parameter called buffer velocity, which would probably be a few miles per hour.
Then, our overall cost function has three domains. If we are going less than the target speed, the cost function would look like this.
If we are at or above the speed limit, the cost is just one. And if we are in between, the cost would look like this.
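The three domains can be sketched in code as follows. This is a minimal sketch. The function name `speed_cost` and the default values for `buffer_v` and `stop_cost` are assumptions. The two linear segments are inferred from the description (a line from the stop cost at zero speed down to zero cost at the target speed, and a line from zero cost at the target speed up to one at the speed limit), not copied from the graph.

```python
def speed_cost(v, speed_limit, buffer_v=2.0, stop_cost=0.8):
    """Cost between zero and one for driving at speed v.

    Assumes v >= 0 and 0 < buffer_v < speed_limit.
    The default parameter values are placeholders to be tuned.
    """
    # Ideal zero-cost speed sits slightly below the limit.
    target_speed = speed_limit - buffer_v

    if v < target_speed:
        # Linear from stop_cost at v = 0 down to 0 at the target speed.
        return stop_cost * (target_speed - v) / target_speed
    if v >= speed_limit:
        # At or above the speed limit: flat maximum cost.
        return 1.0
    # Between target speed and limit: linear from 0 up to 1.
    return (v - target_speed) / buffer_v
```

With a speed limit of 30 and the defaults above, the target speed is 28. Standing still costs about 0.8, driving at 28 costs 0, driving at 29 costs 0.5, and anything at or above 30 costs 1. Tuning `stop_cost` and `buffer_v` changes the shape without touching the logic.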