
AI01 - A sokoban solving robot

Daryosh Derakhshan, Edwin Ranjitkumar, Kasper Sæderup

Group 2

October 25, 2009

1 Introduction

This report is based on an assignment given during the AI01 course at the University of Southern Denmark. It introduces the assignment and presents the iterations the group went through to solve it. The assignment is to build a robot out of Lego Mindstorms that is able to solve a Sokoban puzzle, which requires the robot to follow lines and to push objects. The report discusses, among other issues, how the physical structure of the robot has been designed and where the sensors are placed. It also describes the architecture used to create the behaviors needed to solve the Sokoban puzzle in this specific kind of environment, and presents how the robot’s solving capabilities have been designed and implemented.

2 Tools

Throughout the project, different tools were used to solve the project’s main task. Besides the tools from Lego Mindstorms, the group chose a number of key elements to help overcome the obstacles in the project. Some tools were introduced in class and some were discovered by the group members. Not every trivial element is mentioned in this section; it covers the key tools that turned out to be efficient and good choices for the project.

While BricxCC was introduced as a development tool for the NXT, the group chose to use the more familiar Java as programming language. Before Java could be used, a firmware that includes a Java Virtual Machine was installed on the brick, which was then programmed through an Eclipse plug-in. Both were developed by LeJOS¹.

For optimization purposes, a custom Sokoban map was made out of cardboard and tape. This map made it possible to run several tests, which led to the optimization of several aspects of the robot’s behavior.

3 The physical structure of the robot

This section contains information about the physical structure of the Sokoban robot and the considerations behind its construction.

3.1 The robot’s wheels

It is essential in a labyrinth that the robot’s turning radius is as small as possible. Two types of construction were considered: one with caterpillar tracks and the Tribot² structure. The Tribot construction was chosen because it was simple and easy to extend with sensors.

The two wheels in front have a diameter of 5.6 cm. The wheels are big enough to lift the robot off the ground, yet small enough to keep the center of mass low, which makes the robot stable so that it does not fall over while turning. Arguments for using bigger wheels are, among others, that they would have more grip on the surface and would increase the robot’s top speed. There are also negative consequences: an increase in wheel size results in less accuracy when turning a specific number of degrees, although this is not necessarily a critical factor in this scenario. Since bigger wheels also increase the contact area between the wheels and the ground, more energy is required to accelerate the robot, which results in greater battery usage.

The third wheel in the back makes the robot capable of turning in place, so the robot has the smallest turning radius possible. This wheel is mounted so that it swivels to follow the robot’s movements, driven simply by the force of the driving wheels in front. The robot’s structure is illustrated in figure 1.

¹ http://lejos.sourceforge.net/nxt/nxj/tutorial/index.htm
² http://mindstorms.lego.com/Overview/MTR_TriBot.aspx

Figure 1: The physical construction of the Sokoban Robot

3.2 The robot’s sensors

The robot finds its way through the labyrinth by following black lines on a white surface. The lines can be tracked with light sensors because of the contrast in luminance between the two colors. Two light sensors are sufficient to reliably keep the robot on the track. The distance between these sensors should be equal to the width of the line, to prevent the robot from swaying. A third light sensor is placed about 10 cm in front of the axle of the front wheels, for detecting intersections and placing the tomato cans (gems) precisely. The sensor positions are illustrated in figure 2. It is possible to detect the intersections with the two line following light sensors, but the front sensor gives a more stable detection, since it only reacts to intersections, whereas the line following sensors react both to intersections and to the robot drifting off the track.

Figure 2: The placement of the light sensors on the Sokoban Robot

3.2.1 Line following light sensors

For optimal line following, the light sensors are placed close together, one on each side of the line. They are also placed close to the axle of the two front wheels, to avoid severe side-to-side movement while following the line. The sensors are placed about 0.5 cm above the surface. If a sensor is placed too close to or too far away from the surface, it produces unreliable luminosity values.

3.2.2 Intersection detector light sensor

The light sensor is placed on the right hand side of the robot, so far to the right that it only detects intersections, which is critical for not misguiding the robot. Whether the sensor is located on the right or the left side makes no difference. The sensor is placed far enough in front of the robot that it can be used for placing a tomato can precisely on top of an intersection. There is, however, a downside to having only one intersection detector: a T-cross cannot be detected if its branch appears on the robot’s left side. It is therefore assumed that every crossing is a full intersection, even though one way may lead to a dead end. One way to solve this problem is to place another intersection detector on the other side of the robot.

4 Basic behaviours

This section presents how the robot’s basic behaviors are implemented. The basic behaviors include following a line, turning right and left, and turning around. To follow a line, it is necessary to know whether the robot is on the right track or on its way off it. This is done with the two line following sensors. The follow line algorithm is quite simple: when the right sensor turns black, the robot knows it is drifting to the left, and vice versa. It then corrects its course by steering the opposite way, slowing down both wheels, but one of them more than the other. This turn continues until both sensors show the ground color again. When this happens, the robot increases its speed to the speed it had before the correction. The result is a smooth turn which gets the robot back on track.
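The follow line rule described above can be sketched in a few lines of Java. This is a minimal sketch and not the group's code: the class name, the method names and the two reduced speeds are assumptions made for illustration, and the motor and sensor calls of the real robot are left out so that the decision logic stands on its own.

```java
// Sketch of the two-sensor line following decision. All names and the
// reduced speeds are illustrative assumptions, not the group's code.
final class LineFollowSketch {
    static final int CRUISE_SPEED = 480; // driving speed chosen in section 6.3
    static final int MILD_SLOW = 360;    // assumed speed of the less-slowed wheel
    static final int HARD_SLOW = 240;    // assumed speed of the more-slowed wheel

    /** A reading below the threshold is treated as black (the line). */
    static boolean seesLine(int reading, double threshold) {
        return reading < threshold;
    }

    /** Returns {leftWheelSpeed, rightWheelSpeed} for one control step. */
    static int[] wheelSpeeds(boolean leftSeesLine, boolean rightSeesLine) {
        if (rightSeesLine && !leftSeesLine) {
            // Robot has drifted left: steer right by slowing the right wheel most.
            return new int[] {MILD_SLOW, HARD_SLOW};
        }
        if (leftSeesLine && !rightSeesLine) {
            // Robot has drifted right: steer left by slowing the left wheel most.
            return new int[] {HARD_SLOW, MILD_SLOW};
        }
        // Both sensors see the ground color (or both see a crossing line,
        // which is left to the intersection handling): drive straight.
        return new int[] {CRUISE_SPEED, CRUISE_SPEED};
    }
}
```

Called once per control step with the current sensor readings, the method returns the cruise speed again as soon as both sensors are back over the ground color, which matches the "speed up again after the correction" step in the text.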

Another interesting case is when the robot has to make a turn. Two different implementations of the turn have been developed, and section 6.2 presents an experiment with these two algorithms, to determine which one should be used on the competition day. Figure 3 illustrates the visual difference between the two algorithms.

Figure 3: The two examined turns - Corner rotation and Corner cut

The first turn algorithm (far left) is called the Corner rotation algorithm. Consider a right turn when a crossing line has been detected. The robot starts at the bottom and drives forward (upwards). When sensor 3 detects a line, the robot drives a preset distance forward, which leaves it standing exactly on the line, as the figure illustrates. Afterwards it rotates the wheels in opposite directions with equal speed, as the red arrows indicate, to make a right turn. This procedure is the general idea behind the Corner rotation algorithm and also the most intuitive one. Sensors 1 and 2 are used to follow a line and are not relevant for the turning algorithms.

The second turn algorithm in figure 3 (far right) is called the Corner cut algorithm. The robot starts exactly as before; the only difference is when and how it begins the rotation. When sensor 3 detects the line, the robot calculates the difference in speed between the two wheels (red arrows on the wheels) needed to drive a 90 degree arc with the right radius, which depends on the wheel size and the distance between the wheels. This results in a smoother rotation, and the end position is closer to the next crossing lines than with the Corner rotation algorithm, because the rotation starts earlier. As illustrated by the black arrows, the Corner rotation only rotates the robot in place, so the robot is still located at the center of the intersection when the rotation is completed, whereas the Corner cut algorithm rotates while driving and finishes closer to the next crossing lines. This requires heavier calculations, but the benefit in performance might still be greater.
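The wheel speed calculation behind the Corner cut can be illustrated with standard differential drive geometry. The sketch below is an assumption about how such a calculation could look, not the group's implementation: the class and method names are made up, and the arc radius (10 cm, the sensor-to-axle distance from section 3.2) and the 12 cm distance between the wheels are example values only. The 5.6 cm wheel diameter is taken from section 3.1.

```java
// Differential drive geometry for a 90 degree arc. Names and example
// dimensions are assumptions for illustration.
final class CornerCutSketch {
    /**
     * Speed of the inner wheel so that the midpoint between the wheels
     * follows an arc of the given radius while the outer wheel runs at
     * outerSpeed. A radius of 0 gives -outerSpeed, i.e. rotation in place.
     */
    static double innerSpeed(double outerSpeed, double radiusCm, double trackWidthCm) {
        if (radiusCm < 0 || trackWidthCm <= 0) {
            throw new IllegalArgumentException("radius must be >= 0 and track width > 0");
        }
        double half = trackWidthCm / 2.0;
        return outerSpeed * (radiusCm - half) / (radiusCm + half);
    }

    /** Degrees the outer wheel must rotate to complete a quarter circle. */
    static double outerWheelDegrees(double radiusCm, double trackWidthCm, double wheelDiameterCm) {
        double outerArcCm = (Math.PI / 2.0) * (radiusCm + trackWidthCm / 2.0);
        double wheelCircumferenceCm = Math.PI * wheelDiameterCm;
        return outerArcCm / wheelCircumferenceCm * 360.0;
    }
}
```

With the example values, an outer wheel speed of 480 gives an inner wheel speed of 120. Setting the radius to 0 yields equal and opposite wheel speeds, which is exactly the Corner rotation, so the two turns can be seen as two settings of the same formula.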

Fortunately, it is not all simple turns and straight-ahead driving: there are also more complicated maneuvers, such as placing a can at crossed lines, turning around and driving away. In this kind of maneuver there is not only the robot to take into consideration, but another object as well. It is important that the can, once placed, is not hit when the robot turns around. Figure 4 illustrates how this is done in practice. The robot drives forward until sensor 3 detects a line. It then drives a few cm back to release the can from the grip, makes a 180 degree turn by driving the wheels in opposite directions, and drives away as figure 4 shows.

Figure 4: How the robot turns around after placing a can

The robot does not know whether it is pushing a can or not. It therefore assumes that the only time it has to turn around is when it is pushing a can, since turning around would otherwise make no sense. It is also assumed that the robot does not plan while solving the puzzle, but already has the solution and simply executes the instructions that make up the solution. How the robot solves the puzzle and creates the driving instructions is discussed in section 8.

5 Agent architecture

This section presents the internal behaviors of the robot and the architecture chosen to form them. The robot cannot solve puzzles by itself, and cannot drive around unless it is told to do so. It is therefore necessary to find a suitable way to tell the robot to perform the necessary behaviors. Since this robot is not supposed to interact with other robots, it does not have to be very social. It should behave mostly like a reactive agent, yet still have some planning behaviors. For these reasons, Brooks’ subsumption architecture [1] is believed to be the best behavior based architecture for this assignment. The architecture allows a subsumption hierarchy that prioritizes different behaviors depending on the sensor values. It also makes it easy to add new behaviors as the robot gets more complex. Figure 5 illustrates how this subsumption hierarchy is designed and applied.

Figure 5: The subsumption architecture of the NXT robot

As shown, the Turn off behavior is the highest behavior in the hierarchy, meaning that it has the highest priority. When the robot’s middle button is pressed, the robot turns off no matter what its current behavior is, which also exits the program. The lower a behavior is located in the hierarchy, the more abstract the behavior usually is. The lowest behavior in the hierarchy is the Follow Line behavior, which makes sure that the robot follows the line at all times. This includes correction, i.e. getting back on track if the robot is drifting off. As long as there is a line to follow, the robot follows it. Since the environment is not just straight lines, the robot has to detect when a crossing line is present. When a line crossing is detected, control switches to the Instruction Executer behavior, see figure 5. In this behavior, the robot takes the next instruction from the instruction queue and executes it as described in section 4. The instruction can be turning left, turning right, driving straight ahead or turning around. When the instruction has been executed, the Follow Line behavior regains control and the robot moves forward while following the line again.

This subsumption solution is both flexible and scalable, since the hierarchy can be extended with behaviors as needed. Furthermore, the priority of a behavior can be changed simply by moving it up or down in the hierarchy. Most importantly, the solution can also be used in other applications, which makes it rather versatile.
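The priority scheme of the hierarchy can be sketched as a list of behaviors ordered from highest to lowest priority, where the first behavior that wants control wins. The sketch below is self-contained and hypothetical: the class, the `Behavior` type and the `select` method are illustrative names, and the two conditions stand in for the real button and sensor checks. LeJOS provides its own behavior and arbitrator classes for this pattern; the report does not state whether they were used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Minimal priority arbitration in the spirit of figure 5. All names are
// illustrative assumptions.
final class SubsumptionSketch {
    /** A behavior is a name plus a condition telling whether it wants control. */
    static final class Behavior {
        final String name;
        final BooleanSupplier wantsControl;

        Behavior(String name, BooleanSupplier wantsControl) {
            this.name = name;
            this.wantsControl = wantsControl;
        }
    }

    /** The hierarchy from the text, highest priority first. */
    static List<Behavior> hierarchy(BooleanSupplier buttonPressed, BooleanSupplier crossingDetected) {
        return Arrays.asList(
                new Behavior("TurnOff", buttonPressed),
                new Behavior("InstructionExecuter", crossingDetected),
                new Behavior("FollowLine", () -> true)); // always willing to run
    }

    /** Returns the name of the highest-priority behavior that wants control. */
    static String select(List<Behavior> byPriority) {
        for (Behavior behavior : byPriority) {
            if (behavior.wantsControl.getAsBoolean()) {
                return behavior.name;
            }
        }
        throw new IllegalStateException("no behavior wants control");
    }
}
```

Changing a priority amounts to moving an entry in the list, and a new behavior is added by inserting one more entry, which is the flexibility the text describes.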

6 Testing and reliability

This section describes the test scenarios chosen for the robot. The purpose of the tests is to reveal potentially weak design, so the group can ensure that the robot is reliable enough to solve the competition map on the competition day. The tests also provide useful information about the correct adjustment of parameters such as the optimal speed and the light sensors’ threshold values.

6.1 Testing the light sensors in different light conditions

To determine whether the light sensors are suitable for detecting black lines in a Sokoban game, the sensors are tested by noting their readings in different light conditions. Three light sources have been used for the tests:

• Ceiling lamp (CL) with an 11 W fluorescent light bulb

• Reading lamp (RL) with a 12 W fluorescent light bulb

• Work lamp (WL) with a 20 W halogen light bulb

6.1.1 The tests

In the early development stages it was found that the robot is easily misled by the values of the front light sensor in extreme lighting conditions. One way to minimize this problem is to fit the sensor with a light shield that reduces the amount of ambient light registered.

Five different light conditions have been tested. For each of them, the values of the left, right and front light sensors were noted over both a black line and the white ground. Only the darkest and brightest environments are relevant, so only these cases are presented and evaluated. The darkest environment was created by having no light source other than a bit of sunlight coming through the curtains. The brightest environment was created by positioning the work lamp parallel to the ground. The results of the tests are presented in table 1.

Light conditions         White area              Black line
                         Left   Right   Front    Left   Right   Front
Darkest environment      44     49      55       20     21      33
Brightest environment    57     61      61       33     35      47

Table 1: Light values from the light sensors in different lighting conditions

The static threshold value should be the mean of the lightest black and the darkest white, to ensure the greatest possible contrast from the threshold to both a black line and a white area. The threshold values for the different sensors should be:

• Left light sensor: (33 + 44)/2 = 38.5

• Right light sensor: (35 + 49)/2 = 42

• Front light sensor: (47 + 55)/2 = 51

In these lighting conditions, it is possible to detect the lines using a static threshold value, because the lightest black value is not greater than the darkest white value for any sensor. More powerful light sources might give other results. If static threshold values are not possible, a dynamic threshold value, based on the ambient lighting and updated continuously while the robot solves the Sokoban game, should be implemented.
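The threshold rule above is small enough to write down directly. The following sketch uses made-up class and method names; it computes the midpoint between the lightest black and the darkest white reading and refuses to produce a threshold when the two ranges overlap, which is the case where the text recommends a dynamic threshold instead.

```java
// Static threshold as the mean of the lightest black and the darkest white
// reading. Class and method names are illustrative assumptions.
final class ThresholdSketch {
    static double staticThreshold(int lightestBlack, int darkestWhite) {
        if (lightestBlack >= darkestWhite) {
            // The ranges overlap, so no single value separates line from ground.
            throw new IllegalArgumentException("black and white readings overlap");
        }
        return (lightestBlack + darkestWhite) / 2.0;
    }

    /** A reading below the threshold is classified as the black line. */
    static boolean isBlack(int reading, double threshold) {
        return reading < threshold;
    }
}
```

With the values from table 1, `staticThreshold(33, 44)`, `staticThreshold(35, 49)` and `staticThreshold(47, 55)` give the thresholds for the left, right and front sensor respectively.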

6.2 Rotation Testing

The robot’s turn algorithm is very interesting from a productivity point of view. While the robot is taking a bearing on its next movement, it does not get any closer to its next goal. It is therefore preferable to minimize the time spent on taking a bearing.

Section 4 describes two different turning methods. The two turn algorithms differ in many ways, among them code size, amount of mathematical calculation, speed of code execution and space complexity. This section concentrates solely on testing which turning algorithm makes the robot go faster.


6.2.1 The experiment

Figure 6 illustrates how the experiment is carried out. An important factor when designing the scenario is to make it as realistic as possible, so the robot was put through a scenario it could actually run across when solving the puzzle.

Figure 6: The task the experiment is built upon - the task the robot will try to accomplish

The goal is to move the can (red circle) down via path A. The robot is only able to push the can in straight lines; it can neither carry it along while turning nor drag it. The robot therefore follows route 1, so that the can is moved from BJ to AJ, then turns around and follows route 2, so that it is able to push the can down the desired path. Besides moving a can, this scenario includes all the possible behaviors: straight-ahead driving, right turn, left turn, and placing a can and turning around. To summarize the scenario, the robot starts in position AK, drives along route 1, pushing the can to position AJ, turns around, follows route 2 and ends up in position AK where it started. When this position has been reached and the final left turn has been made, so that the robot is heading down towards the can, the timer is stopped. The robot performs this task 10 times with each of the two presented algorithms, to make sure that the experiment gives reliable results.

6.2.2 Paired T-test

As described, the comparison of the two algorithms is based on the completion times of a test scenario from the Sokoban game. The samples are listed here, separated by commas, with times in milliseconds:

Corner cut: (16227, 15763, 15707, 16360, 15754, 16459, 16159, 16298, 16621, 16222)
Corner rotation: (19070, 19272, 18740, 18548, 19126, 18824, 19132, 19240, 18714, 18552)


A paired sample t-test has been performed. Since the Corner cut appears to be the faster algorithm, the hypotheses are chosen to be:

H0: Corner rotation is the faster algorithm, or the two algorithms are equally fast

H1: Corner cut is the faster algorithm

The t-test indicates, at a confidence level of 99.5%, that the Corner cut algorithm is the faster of the two algorithms for turning the robot.
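The test statistic can be recomputed from the samples listed above. The sketch below is not the group's calculation; the class and method names are assumptions, and the samples are paired in the order they are listed. It computes the mean of the pairwise differences and the paired t statistic, mean / (s / sqrt(n)), where s is the sample standard deviation of the differences.

```java
// Paired sample t statistic for the two turn algorithms. Names are
// illustrative; the data are the completion times listed above (ms).
final class PairedTTestSketch {
    static final double[] CORNER_CUT = {
        16227, 15763, 15707, 16360, 15754, 16459, 16159, 16298, 16621, 16222};
    static final double[] CORNER_ROTATION = {
        19070, 19272, 18740, 18548, 19126, 18824, 19132, 19240, 18714, 18552};

    /** Mean of the pairwise differences slower[i] - faster[i]. */
    static double meanDifference(double[] slower, double[] faster) {
        double sum = 0;
        for (int i = 0; i < slower.length; i++) {
            sum += slower[i] - faster[i];
        }
        return sum / slower.length;
    }

    /** Paired t statistic with n - 1 degrees of freedom. */
    static double tStatistic(double[] slower, double[] faster) {
        int n = slower.length;
        double mean = meanDifference(slower, faster);
        double squaredDeviations = 0;
        for (int i = 0; i < n; i++) {
            double deviation = (slower[i] - faster[i]) - mean;
            squaredDeviations += deviation * deviation;
        }
        double standardDeviation = Math.sqrt(squaredDeviations / (n - 1));
        return mean / (standardDeviation / Math.sqrt(n));
    }
}
```

For these samples the mean difference is 2764.8 ms and the t statistic comes out at roughly 17.6 with 9 degrees of freedom. The one-sided critical value at the 0.5% level for 9 degrees of freedom is about 3.25, so H0 is rejected by a wide margin.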

The mean of the differences indicates that the Corner cut algorithm could be 2-3 seconds faster than the Corner rotation on this scenario. Since the test scenario is only a fraction of the complete puzzle, the accumulated difference can amount to minutes when completing the Sokoban game.

When choosing which algorithm to use in the competition, factors other than speed become relevant. The robot’s behaviors must be reliable in the sense that the robot must not miss a line detection or drive off the track, since this would misguide the robot and prevent it from solving the puzzle. It is better to have a slower robot that solves the whole puzzle than a robot that solves half the puzzle 5 times faster but then fails in the middle of the process. While running the tests, there was no indication that either of the turning algorithms was unreliable. The Corner cut algorithm is therefore used for turning in the competition. The reliability of the robot is tested further in section 6.3, with focus on the robot’s driving speed.

6.3 Speed improvement testing

Since the course ends with a competition in which different robots compete to be the fastest Sokoban solving robot, it is natural to want to improve the speed of the robot’s movements. This can be done in many different ways. One of them has already been discussed: as section 6.2 concluded, changing the turn algorithm makes the robot turn faster and saves a significant amount of time in the long run. Simple changes can also improve the robot’s speed, and one of them is to increase the driving speed. This is not as simple as it sounds, since the robot must still drive in a stable way and must not leave the track, as it would then fail to complete the game. It is therefore not possible simply to set the robot’s speed to its maximum, since that would most likely cause the robot to fail the game in most cases, which is not acceptable. The goal is to find the optimal speed for the robot without compromising stability. This is done in a rather simple fashion by starting at a slow speed and then increasing the speed until the robot fails the game. The result of this process is illustrated in figure 7. The experiment was done on the competition map itself, and each sample shown is the mean of five runs at the same speed, to make sure that the robot was completely stable and to avoid accidental results.

Figure 7: The increasing speed’s impact on game completion time

As predicted, the completion time decreases as the speed increases. What was not predicted was the steepness of the slope: the robot improved by many seconds simply by increasing the speed, more than anticipated. When the speed reached 520 (red line) the robot repeatedly failed, a clear indication that the robot had reached its limit and could not perform any better with these properties. Even though the highest stable speed was 500, the chosen speed was set to 480, to slightly increase the margin to the limit. The reason the robot fails at these speeds is that the follow line behavior was not designed to handle such high speeds in the first place. Optimizing how the robot corrects itself back on track would move the red line further to the right, making the robot able to complete the map at higher speeds.

7 Problem representation

This section describes how the problem, the Sokoban game, is represented. This includes how the map and the different states are represented and modeled. The representation matters, since there are many factors to consider, such as memory usage and performance, and not every representation of the problem will necessarily work well once it is implemented.

7.1 Map representation

The representations should be as simple as possible, so that nothing is more complicated than it has to be. Since the map is grid based, one of the simplest solutions is to use a matrix or a 2-dimensional array as the map representation. The two are much alike, and either uses less memory than modeling the map through a collection of objects. Figure 8 illustrates an example of a rather simple map representation.

Figure 8: An example of how the map is represented

The map contains different elements, as shown. These are called identifiers and, as the name implies, identify what is at a specific location on the map. The map is thus built as a 2-dimensional identifier array. Identifiers can have the following values: WALL, GOAL, GEM, MAN and EMPTY. EMPTY represents the empty spots on the map, which the man can move to. In fact, the map only shows where the walls and the goals are. It knows the initial positions of the man and each of the gems, but these are not modeled directly in the map, because it is not the map’s responsibility to know where the dynamic elements are placed. The map should be considered like a real map. An ordinary map shows where the roads are and the structure of an environment. It does not show where people are located, where traffic is heavy and so on. It only has information about the static aspects of the environment, and so does this map.

Even though this is quite simple, it can also be used for something very useful. On an ordinary map, one can easily see whether a road is a dead end or whether it leads to another road. This dead end aspect can be used in this map representation as well. Independently of anything else, there are various locations that become a dead end if a gem is placed on them. The modeled map therefore also has information about the locations that are dead ends. Figure 9 illustrates how this works.

Figure 9: How the dead end model works

If a gem is to be moved either north or south, both red locations, relative to the gem, must be available for the man to move to. That is only the case if neither a wall nor another gem is located there. Since the map has no information about the gem positions other than the initial ones, only the wall case is handled by the map. The same applies if the man is to move the gem either east or west. However, if a gem is placed at a location where at least one of the red locations and at least one of the blue locations are blocked by a wall, it is impossible for the man to get the gem out of the dead end, since he is only allowed to push the gems. Unless that location is also a goal, it is marked as a "dead end location". If one of the gems is moved to a dead end location, the map is no longer solvable.

There are many other cases in which a dead end situation can be reached, but these all depend on more than just the walls. One case is four gems forming a 2x2 square. This is also a dead end situation, no matter where the gems are located, but since it depends on dynamic objects, it is not the map’s responsibility to have information about those states. Since the dead end locations depend neither on the man nor on the current positions of the gems, they can be found in the initial state, before the solving process is started.
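The wall-only dead end rule can be precomputed in one pass over the map. The sketch below is an illustration under assumptions: the text map format ('#' for WALL, '.' for GOAL, anything else EMPTY), the class name and the method names are invented here, and the red and blue locations of figure 9 are taken to be the north/south and east/west neighbors respectively. Cells outside the array are treated as walls.

```java
// Precomputes wall-based dead end locations on a 2-dimensional identifier
// array. Map format and names are illustrative assumptions.
final class DeadEndSketch {
    enum Identifier { WALL, GOAL, EMPTY }

    /** Parses rows where '#' is a wall, '.' a goal and anything else empty. */
    static Identifier[][] parse(String... rows) {
        Identifier[][] map = new Identifier[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            map[r] = new Identifier[rows[r].length()];
            for (int c = 0; c < rows[r].length(); c++) {
                char ch = rows[r].charAt(c);
                map[r][c] = ch == '#' ? Identifier.WALL
                          : ch == '.' ? Identifier.GOAL
                          : Identifier.EMPTY;
            }
        }
        return map;
    }

    /** Cells outside the map count as walls. */
    static boolean isWall(Identifier[][] map, int r, int c) {
        return r < 0 || r >= map.length || c < 0 || c >= map[r].length
                || map[r][c] == Identifier.WALL;
    }

    /** Marks empty cells blocked on at least one vertical and one horizontal side. */
    static boolean[][] deadEnds(Identifier[][] map) {
        boolean[][] dead = new boolean[map.length][];
        for (int r = 0; r < map.length; r++) {
            dead[r] = new boolean[map[r].length];
            for (int c = 0; c < map[r].length; c++) {
                if (map[r][c] != Identifier.EMPTY) {
                    continue; // walls are irrelevant and goals are never dead ends
                }
                boolean verticalBlocked = isWall(map, r - 1, c) || isWall(map, r + 1, c);
                boolean horizontalBlocked = isWall(map, r, c - 1) || isWall(map, r, c + 1);
                dead[r][c] = verticalBlocked && horizontalBlocked;
            }
        }
        return dead;
    }
}
```

Because the result depends only on walls and goals, it can be computed once before the search starts and then consulted for every state, as the text suggests.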

7.2 States representation

To make a solver for the Sokoban game, a state representation is needed. A state represents what the environment looks like at a given moment. The first state is the initial state, in which the man’s location and the gems’ locations are known. Again, it is important to make the state representation as simple as possible and not to use more memory than necessary. There are at least two different ways to represent a state. It could contain all the information in the environment, so that walls, gems, goals and the man’s position are all represented by the state. However, the only things that ever change are the positions of the man and the gems, since the man pushes the gems around, so a state can instead be represented by just the position of the man and of each gem. This solution requires less memory and is much simpler than the first one. The state can use the map, as described earlier, to check whether gems are located at positions that are considered dead ends. It is also the state’s responsibility to make sure that it is valid with respect to the locations of the gems relative to each other, since several gems located next to each other in various ways create dead end situations, as mentioned earlier. The state therefore has the responsibility of determining whether it is valid, using both the map and the information the state itself contains.

Furthermore, a state should be able to compare itself to another state, so that two states can be tested for equality simply by comparing the positions of the man and the gems. Another requirement is that it should be possible to find out how a state was reached and how far it is from the initial state. This can be done in various ways. One simple way is to give the state a path, which represents the actions the man has taken to get to that state. In the initial state, the path is empty. When the man then moves, e.g., north, a new state emerges with the path north, containing the new location of the man and the locations of the gems. This makes it easy to trace where a state has come from: when the goal state has been reached, the state’s path shows which actions the man has taken to get there. The path then represents the instruction set the robot executes to solve the game.
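A state of this kind could look as follows in Java. This is a sketch under assumptions rather than the group's class: the cell encoding (row * WIDTH + col), the single-letter actions and all names are invented for illustration, and validity checks against walls and dead ends are deliberately left out. The point it shows is that equality uses only the positions of the man and the gems, while the path is carried along for reading out the solution.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// A state holding only the dynamic elements plus the path that led to it.
// Encoding and names are illustrative assumptions.
final class StateSketch {
    static final int WIDTH = 32;   // assumed upper bound on the map width

    final int man;                 // man's cell, encoded as row * WIDTH + col
    final Set<Integer> gems;       // gem cells, same encoding
    final String path;             // actions taken so far, e.g. "nne"

    StateSketch(int man, Set<Integer> gems, String path) {
        this.man = man;
        this.gems = new HashSet<>(gems);
        this.path = path;
    }

    static int cell(int row, int col) {
        return row * WIDTH + col;
    }

    /** State after one step 'n', 's', 'e' or 'w'; walls are not checked here. */
    StateSketch step(char action) {
        int delta;
        switch (action) {
            case 'n': delta = -WIDTH; break;
            case 's': delta = WIDTH; break;
            case 'e': delta = 1; break;
            case 'w': delta = -1; break;
            default: throw new IllegalArgumentException("unknown action: " + action);
        }
        int target = man + delta;
        Set<Integer> moved = new HashSet<>(gems);
        if (moved.remove(target)) {
            moved.add(target + delta);   // the man pushes the gem one cell ahead
        }
        return new StateSketch(target, moved, path + action);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof StateSketch)) {
            return false;
        }
        StateSketch s = (StateSketch) other;
        return man == s.man && gems.equals(s.gems);   // the path is ignored
    }

    @Override
    public int hashCode() {
        return Objects.hash(man, gems);
    }
}
```

Ignoring the path in `equals` and `hashCode` is what lets a search recognise that two different action sequences have led to the same situation.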

8 The planner for solving the sokoban game

This chapter describes the implemented Sokoban planner. When deciding on a planner, several factors of the game need to be considered. There are elements in the game that need to be controlled. The gems are the whole basis of the game, since the final goal is to move the gems from their starting positions to the goal fields. The planner has to be designed with the fact in mind that it is the robot that is being controlled, and that the robot should move in an optimal way, meaning that it should solve the game with the lowest possible number of movements.

The Breadth-First Search (BFS) is a complete but undirected search. The BFS is implemented with an open and a closed list, to prevent the planner from examining identical states several times. The planning process is written in pseudo code in figure 10. The first state to be examined is the initial state (line 1), with the start positions of the man and the gems, so it is added to the open list (line 2). The search algorithm runs continuously until a result has been found (lines 9-12), or there are no more states in the open list to be examined (line 4). The first item in the open list is examined and moved to the closed list (lines 5-7). When the search algorithm examines a state, it first checks whether the state is the result (line 9), that is, whether every gem is positioned on a goal. If the state is not the result, the planner finds all the possible neighbor states (line 14). A possible neighbor state is a state in which the man has moved one step north, south, west or east, and which is at the same time a valid state, as described in section 7. The possible neighbor states are then checked against the open and closed lists (line 15). If a neighbor state is in neither, it is added to the open list (line 16), meaning that the neighbor state is a valid move and has not been evaluated before. At this point one iteration of the planner is complete, and the planner continues its search for a result by examining the next element in the open list.

 1  state = initialState;
 2  add state to openList;
 3
 4  while (openList has more states) {
 5      state = openList.get(0);
 6      remove state from openList;
 7      add state to closedList;
 8
 9      if (state is result) {
10          println("Search finished");
11          println(the path to the state);
12          stop the search;
13      } else {
14          for each (Possible Neighbor State) {
15              if (neighborState is not in closedList AND neighborState is not in openList) {
16                  add neighborState to openList;
17              }
18          }
19      }
20  }

Figure 10: Pseudo code for the implemented planner with use of breadth-first search and the use of open and closed lists
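The pseudo code can be turned into a small generic Java method. This is a sketch, not the group's planner: the class and parameter names are assumptions, the state type only needs sensible `equals` and `hashCode` methods, and one detail is swapped in deliberately. Instead of scanning two lists, a single hash set records every state that is or has been in the open list, which answers the "not in closedList AND not in openList" question in one lookup.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Generic breadth-first search with open and closed bookkeeping.
// Names are illustrative assumptions.
final class BfsSketch {
    static <S> Optional<S> search(S initialState,
                                  Predicate<S> isResult,
                                  Function<S, List<S>> possibleNeighbors) {
        Deque<S> openList = new ArrayDeque<>();   // FIFO queue of states to examine
        Set<S> seen = new HashSet<>();            // every state in the open or closed list
        openList.add(initialState);
        seen.add(initialState);

        while (!openList.isEmpty()) {
            S state = openList.removeFirst();     // take the oldest state
            if (isResult.test(state)) {
                return Optional.of(state);        // the state carries its own path
            }
            for (S neighbor : possibleNeighbors.apply(state)) {
                if (seen.add(neighbor)) {         // true only if never seen before
                    openList.addLast(neighbor);
                }
            }
        }
        return Optional.empty();                  // open list exhausted: no solution
    }
}
```

For Sokoban, `possibleNeighbors` would return the valid states one step north, south, east and west of the current one, and `isResult` would check that every gem is on a goal.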

Even though the BFS algorithm always finds a solution if there is one, it is far from ideal for solving the Sokoban puzzle if, e.g., planning time or space complexity is a factor. Since planning time and memory usage are not success criteria for either the competition or the course in general, alternatives have not been developed. If those factors were critical, a better solution would be to plan where the man is going instead of just trying every possible move. Such a planner could use the A* algorithm for pathfinding.

The A* algorithm uses a heuristic to estimate how many moves the man has left to complete the sokoban game. The ideal heuristic is the correct number of remaining moves, which can only be set if the solution is already known. This is obviously not the case in this project, since there would then be no reason to create a planner in the first place. If the heuristic produced the correct number of moves, the planner would only evaluate the steps to the goal and make no wrong moves, which is why the computation of the heuristic is such an important part of using the A* algorithm. The main difference between the BFS planner and the A* algorithm is that A* uses the heuristic and the traveled distance when picking the next state from the open list to evaluate, instead of just evaluating the first state in the FIFO queue.

The group has implemented a planner using the A* algorithm. Even though the heuristic can be optimized further, the computation speed of the solver has increased. The heuristic is computed by adding the following values:

1. The Manhattan distance from each goal to its nearest gem, added together. This will produce only four distances in a case with four gems and goals.

2. The Manhattan distance from the man to every gem that is not already located at a goal.
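The two values listed above, and the way A* picks the next state, could be sketched in Java along these lines. This is an illustration under stated assumptions rather than the group's implementation: the class `AStarHeuristicSketch`, the `Node` type and the representation of positions as `{row, col}` pairs are hypothetical.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Sketch of the described heuristic and an A*-ordered open list; names are assumptions. */
class AStarHeuristicSketch {
    /** Manhattan distance between two {row, col} positions. */
    static int manhattan(int[] a, int[] b) {
        return Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]);
    }

    /** Sums the two values from the list above. */
    static int heuristic(int[] man, int[][] gems, int[][] goals) {
        int h = 0;
        // Value 1: distance from each goal to its nearest gem, summed over all goals.
        for (int[] goal : goals) {
            int nearest = Integer.MAX_VALUE;
            for (int[] gem : gems) {
                nearest = Math.min(nearest, manhattan(goal, gem));
            }
            if (gems.length > 0) {
                h += nearest;
            }
        }
        // Value 2: distance from the man to every gem that is not yet on a goal.
        for (int[] gem : gems) {
            boolean onGoal = false;
            for (int[] goal : goals) {
                onGoal |= manhattan(goal, gem) == 0;
            }
            if (!onGoal) {
                h += manhattan(man, gem);
            }
        }
        return h;
    }

    /** Search node: g is the traveled distance, h the heuristic estimate. */
    static final class Node {
        final int g, h;
        final String path;

        Node(int g, int h, String path) {
            this.g = g;
            this.h = h;
            this.path = path;
        }

        int f() {
            return g + h;
        }
    }

    /** Open list that yields the node with the lowest g + h first, unlike a FIFO queue. */
    static PriorityQueue<Node> newOpenList() {
        return new PriorityQueue<>(Comparator.comparingInt(Node::f));
    }
}
```

For a man at (1, 1), a gem at (1, 2) and a goal at (1, 3), `heuristic` returns 2 although a single push solves the puzzle. A sum of this form can therefore overestimate the remaining moves, so shortest paths are not guaranteed in general.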

The planner using the A* algorithm (the A* planner) finds exactly the same solution (path) to the goal state as the BFS planner does. On the competition map, however, the A* planner was faster than the BFS planner at solving the game with 1 to 3 gems. With all 4 gems it gets a bit more complicated, since the gems more often block each other's paths, which makes the solution more complicated than in the case with 3 gems. In the situation with 4 gems the A* planner is not faster than the BFS planner at solving the game. This indicates that the heuristic used is not well suited to cases where gems are often blocked by other gems.

9 Conclusion

The group has chosen to write the robot's behaviors in Java by using the LeJOS framework. This means that the group can use their favorite development environment for the assignment. The physical structure of the robot had to be designed carefully to optimize the speed and precision with which it can drive. The Tribot construction was preferred because of its simple construction and the scalability of sensors. The wheels were chosen carefully to ensure that the center of mass was located at a stable position. Two light sensors were chosen to make sure the robot follows the line. An additional light sensor was used to detect intersections, which turned out to be a better choice than detecting intersections with the two inner sensors.

The robot's basic behaviors have been developed, which include line following, turning right and left, driving straight ahead and turning around after placing a can. These behaviors, combined in different sequences, make up the solution at the very end. The agent architecture used is Brooks' subsumption architecture, which allows the creation of different behaviors placed at various levels in the subsumption hierarchy. This architecture makes it simple to add new behaviors if needed.

To solve the sokoban competition map, a planner has been developed. For the planner, the map has been modelled using a 2-dimensional array, and the states contain only the positions of the dynamic elements (the man and the gems) and the path to the state, to reduce memory usage. The planner has been developed using the principles of BFS while keeping track of which states have already been examined, to bring down planning time and memory usage. The planner finds the best possible solution for solving the sokoban competition map. In the competition the planner was used to solve the sokoban competition map, and the robot solved the game by placing all of the cans on a goal.


References

[1] Michael Wooldridge. An Introduction to Multiagent Systems. John Wiley & Sons, Ltd, 2002.
