
I am currently reading Wooldridge's An Introduction to MultiAgent Systems (published by Wiley), and I was hoping somebody could clarify the following for me. When speaking about utility functions, the author states:

A utility is a numeric value representing how "good" the state is: the higher the utility, the better.

The task of the agent is then to bring about states that maximize utility - we do not specify to the agent how this is to be done. In this approach, a task specification would simply be a function

u:E -> R 

which associates a real value with every environment state.

Given such a performance measure, we can then define the overall utility of an agent in some particular environment in several different ways. One (pessimistic) way is to define the utility of the agent as the utility of the worst state that might be encountered by the agent; another might be to define the overall utility as the average utility of all states encountered. There is no right or wrong way: the measure depends upon the kind of task you want your agent to carry out.

The main disadvantage of this approach is that it assigns utilities to local states; it is difficult to specify a long-term view when assigning utilities to individual states.

I am having problems understanding the disadvantage and what exactly a local state is. Could somebody clarify this?

1 Answer


In that section of the book, the problem with assigning utilities to local states is illustrated with a classic example called Tile World.

It is a two-dimensional grid world, in which we have an agent, tiles, obstacles, and holes.

An agent can move in four directions (up, down, left, right) and if it is located next to a tile, it can push it in the appropriate direction. Holes have to be filled up with tiles by the agent. The aim is to fill all holes with tiles.


Environment State

The state of the environment can be described using the following variables:

The agent's current position (a_x, a_y)

The current positions of the four tiles (t1_x, t1_y), (t2_x, t2_y), (t3_x, t3_y), (t4_x, t4_y)

State Transition

Suppose that, in the current state, the agent pushes the tile directly beneath it downward. The system then moves to the next state, in which every variable stays the same except the agent's position and the position of the tile being pushed.
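
The transition described above can be sketched in Python. This is only an illustration under stated assumptions: the names State, MOVES and step are hypothetical, y is assumed to grow downward (so "down" is (0, +1)), and walls, obstacles, holes and tile-on-tile collisions are ignored.

```python
from typing import NamedTuple, Tuple

Pos = Tuple[int, int]


class State(NamedTuple):
    """Environment state: the agent's position plus the tile positions."""
    agent: Pos
    tiles: Tuple[Pos, ...]


# Assumption: y grows downward, so "down" increases y.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state: State, action: str) -> State:
    """Move the agent one cell; a tile in the target cell is pushed one cell further.

    Simplification: no bounds, obstacles or collisions are checked.
    """
    dx, dy = MOVES[action]
    ax, ay = state.agent
    target = (ax + dx, ay + dy)
    tiles = tuple(
        (tx + dx, ty + dy) if (tx, ty) == target else (tx, ty)
        for tx, ty in state.tiles
    )
    return State(agent=target, tiles=tiles)
```

For example, with the agent at (1, 1) and a tile at (1, 2), calling step(state, "down") changes only the agent's position and that one tile's position; the other tiles are left untouched.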

Utility function

Our utility function can be defined as the fraction of holes that have been filled, i.e.,

   u = (# of holes filled) / (# of total holes)

It's apparent that:

If the agent fills all holes, utility = 1

If the agent fills zero holes, utility = 0
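
As a rough sketch, this utility could be computed as follows. The function name utility and the holes argument are hypothetical, and the code assumes that a hole counts as filled when a tile sits on the same grid position; the answer itself does not say how filled holes are tracked.

```python
from typing import Iterable, Tuple

Pos = Tuple[int, int]


def utility(tiles: Iterable[Pos], holes: Iterable[Pos]) -> float:
    """Fraction of holes that currently have a tile on them.

    Assumption: a hole is 'filled' when a tile occupies its position.
    """
    hole_set = set(holes)
    if not hole_set:
        raise ValueError("utility is undefined when there are no holes")
    filled = len(hole_set & set(tiles))
    return filled / len(hole_set)
```

Note that the function only looks at the current positions. It has no access to what can still happen from that state, which is exactly the limitation discussed in this answer.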

Assigning utilities to states

Now, look at the two states below.

[Figure: two Tile World states, s1 (left) and s2 (right)]

It's easy to see that:

  • Both states have the same utility value, 1/3 (because 1 out of 3 holes is filled)

  • The left (state s1) is a dead end: from here it is no longer possible to move all the tiles into holes

  • The right (state s2) is a good position, in which you have options to move the remaining two tiles into holes.

So the conclusions are:

  1. If you assign utilities only to local states, e.g., u(s1) or u(s2), you cannot tell the two states apart by utility: u(s1)=u(s2)=1/3.

  2. You need a global or long-term view of the states. This can be represented by a run, which is a sequence of interleaved environment states and actions taken by the agent.

  3. You can assign a utility not to individual states but to runs. Such an approach inherently takes a long-term view.

u: run -> real value
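
To make the contrast concrete, here is a small sketch that scores whole runs rather than single states. The two runs are hypothetical: they list only made-up per-state utilities (actions are omitted because these measures look only at states), with the run from s1 staying stuck at 1/3 and the run from s2 going on to fill every hole. The function names are illustrative, and the worst-case and average measures follow the two options mentioned in the quoted passage.

```python
from fractions import Fraction
from typing import Sequence

# Hypothetical utilities of the states visited along two runs.
run_from_s1 = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]  # dead end
run_from_s2 = [Fraction(1, 3), Fraction(2, 3), Fraction(1)]     # all holes filled


def final_utility(run: Sequence[Fraction]) -> Fraction:
    """Utility of the last state reached in the run."""
    return run[-1]


def worst_utility(run: Sequence[Fraction]) -> Fraction:
    """Pessimistic measure: utility of the worst state encountered."""
    return min(run)


def average_utility(run: Sequence[Fraction]) -> Fraction:
    """Average utility over all states encountered."""
    return sum(run) / len(run)


# The starting states look identical to a state-based utility...
assert run_from_s1[0] == run_from_s2[0]
# ...but a run-based utility separates them.
assert final_utility(run_from_s1) < final_utility(run_from_s2)
assert average_utility(run_from_s1) < average_utility(run_from_s2)
```

With these made-up numbers the pessimistic measure still gives 1/3 for both runs, which illustrates the point in the quoted passage that the right measure depends on the task.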
