Let's balance exploration and goal-directed traversal (AIMA ch. 4).

Last week we demonstrated that a randomly exploring agent can score net-positive in the partially observable world; this time, let's try to maximize our score by combining exploration with goal-directed traversal.

One policy might be to explore randomly until a known path is encountered and then follow that path to the goal (a sketch follows below); one can imagine situations, however, in which it would be better to reject a suboptimal path and keep exploring.
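Here is a minimal sketch of that explore-then-commit policy, assuming a hypothetical agent interface with random_step(), known_path_to_goal(), and follow() helpers (these names are illustrative, not taken from any code we've written):

    def explore_then_commit(agent, max_steps=1000):
        """Randomly explore until the agent's map contains a path to the goal,
        then commit to that path.

        Hypothetical agent API (assumed for this sketch):
          random_step()         -- take one random exploratory action
          known_path_to_goal()  -- return a list of moves, or None if no path is known
          follow(path)          -- execute a known sequence of moves
        """
        for _ in range(max_steps):
            path = agent.known_path_to_goal()
            if path is not None:
                # A path to the goal is already known; commit and follow it.
                agent.follow(path)
                return True
            # No known path yet: keep exploring at random.
            agent.random_step()
        # Gave up without ever discovering a path.
        return False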

We might borrow an idea from simulated annealing: start with a high probability of rejecting known paths in favor of further random exploration, and let that probability decay with time; a sketch of such a schedule follows.
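One way to realize that schedule, again as a rough sketch on the same hypothetical agent interface, is an exponentially decaying rejection probability (the cooling constant tau below is an arbitrary assumption, not something we've tuned):

    import math
    import random

    def should_keep_exploring(step, tau=50.0):
        # Annealing-style schedule: the probability of rejecting a known path
        # in favor of more exploration decays from 1 toward 0 as steps accumulate.
        p_reject = math.exp(-step / tau)
        return random.random() < p_reject

    def annealed_policy(agent, max_steps=1000, tau=50.0):
        for step in range(max_steps):
            path = agent.known_path_to_goal()
            if path is not None and not should_keep_exploring(step, tau):
                # Late in the run we almost always accept a known path.
                agent.follow(path)
                return True
            # Early in the run we usually reject known paths and keep exploring.
            agent.random_step()
        return False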

I'm sure there are other interesting solutions, too.


Peter D. (February 21, 2013): Good conversation; a little non-functioning code.
