UW-Madison ECE 539 - Neural Networking in simple games

Neural Networking in simple games
ECE 539
Theodore Humpal

Introduction

As a note to the reader, this paper is based on [1], which gives an excellent talk on this subject.

The object of this paper is to create a neural network able to play simple games; in this case, tic tac toe was picked. The idea behind this is that there is large consumer demand for fun, interactive gaming, particularly computer gaming. I predict this market will only grow over time: as more children grow up in this environment, more will stay involved as they grow older. In the industry, adaptable computerized opponents have yet to be implemented well. Many games claim a variety of AI features, but these features are often quite weak. For example, companies have specifically advertised the ability of computerized opponents to out-flank the player, an essentially simple idea. Turn-based games, where the computer's blazing, inhuman speed cannot compensate for its lack of a brain, tend to have the weakest AI.

Because of the increase in networking and computing power, it would be quite possible to set up many neural networks within a game (most likely coupled with conventional AI), have people interact with the game online, collect information from the matches, and update the neural networks through patches and updates. In this way, the game gains a large amount of value for its purchaser.

To make games more challenging, most game makers simply stack the odds against the player; by comparison, the AI is often surprisingly weak. There are practical reasons for this, but the idea of cutting AI development and replacing it with an Internet-based, worldwide neural training network is a powerful one. Though this project bears only a very basic relation to that idea, it is a starting point: can we create a neural network that, with little to no hand-crafted AI, adapts to human play?

This project approaches this question in two phases.
First, a basic decision-making entity is needed to set against human players. Then, training data must be extracted from it and used to create an MLP network. The first phase is done through reinforcement learning and is implemented in C++. The second phase is done through MLP back propagation and is implemented in Matlab.

Part I

Initially, a lookup table is created. Given a tic tac toe board state, the lookup table provides the desirability of that state. When a move needs to be made, the computerized opponent looks at the current board state, looks up the desirability of all possible moves, and picks the most desirable one. Every entry in the lookup table is initialized to 0.5.

As the lookup table plays a duplicate of itself, matches are won and lost. On a match result, the desirability of each visited state is updated. On a win or a tie, the last board state is assigned a 1; on a loss, it is assigned a 0. The board states that led up to this conclusion are then updated as follows:

D_new(n-1) = D_old(n-1) + 0.5 * ( D(n) - D_old(n-1) )

Where:
- D_new(n-1) - the new lookup table desirability of the (n-1)th board state.
- D_old(n-1) - the old lookup table desirability of the (n-1)th board state.
- D(n) - the desirability of the state one play closer to the end.

The states are therefore updated from the final play backward to the first play, and the 'strength' of the update weakens as the state moves away from the last play.

The computer opponent is set up against a duplicate opponent, and the two players use their own lookup tables to make moves. Random moves are also injected every once in a while so that all possibilities are explored. Following this process, a master tic tac toe player can be constructed. Because tic tac toe is a solvable game, this is not that exciting a conclusion; however, for complex games which are not solvable, this approach has great potential.
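The backward update rule above can be sketched in a few lines of C++. This is a minimal illustration, not the paper's actual source (which is not shown): `updateDesirabilities` and its arguments are hypothetical names, with the visited states' desirabilities collected in play order for one match.

```cpp
#include <vector>

// Sketch of the backward desirability update described above.
// `states` holds the desirability values of the board states visited during
// one match, in play order; `finalValue` is 1.0 for a win/tie, 0.0 for a loss.
void updateDesirabilities(std::vector<double>& states, double finalValue) {
    if (states.empty()) return;
    states.back() = finalValue;  // last state is assigned 1 or 0
    // Walk backward: each earlier state moves halfway toward its successor,
    // so the update's 'strength' weakens away from the last play.
    for (int i = static_cast<int>(states.size()) - 2; i >= 0; --i) {
        states[i] = states[i] + 0.5 * (states[i + 1] - states[i]);
    }
}
```

For a three-state winning game initialized at 0.5 everywhere, this yields desirabilities 0.625, 0.75, 1.0 from first play to last, showing the weakening effect.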
As an example of this potential, [3] explains a related method for computing an expert-level (but not master-level) checkers player, a game that has only been solved through many years of research.

In the simulations performed, one million matches were found to produce a competitive opponent, and ten million matches were found to produce an opponent that could not be beaten. After 100 million games, and a human inability to beat the computerized opponent, the lookup table was printed out in an MLP training data format. In particular, the input is the 9 board spaces (0 for empty, 1 for X, and -1 for O) plus the player's identity (1 for X, -1 for O). The output layer has 9 outputs (one for each board space) and produces a 1 for the most desirable move. In this way, the MLP is set up to solve a classification problem.

Overall, this part of the project was quite solid. The program's ability to adapt not only to another computer opponent but also to a human opponent worked excellently. The concluding lookup table could not be beaten, which shows success.

Part II

The first step is to insert the MLP into the tic tac toe program as the decision maker. The next step is to use the training data from Part I to train the MLP network through back propagation. The final step is to run the MLP network and evaluate its performance.

That critical middle step was the most difficult part of this project and, for the most part, what failed in it. An 'acceptable' MLP was not found. Training the MLP was problematic from the start. The back propagation method was largely based on the given bp.m. However, there were several constraints on the MLP parameters; for example, the alpha value needed to be strictly limited, as too large a value broke the back propagation method (all weights went to 0) while too small a value made almost no progress. Despite playing with the parameters, the number of mispredictions in the confusion matrix was always quite high.
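The training data format described above (10 inputs, 9 one-hot outputs) can be made concrete with a small C++ sketch. The struct and function names here are illustrative assumptions, not the paper's actual export code, but the encoding follows the format the paper states.

```cpp
#include <array>

// Hypothetical encoding of one MLP training sample: 9 inputs for the board
// (0 empty, 1 for X, -1 for O) plus one input for the player's identity,
// and 9 outputs with a single 1 marking the most desirable move.
struct TrainingSample {
    std::array<int, 10> input;   // 9 board cells + player identity
    std::array<int, 9>  output;  // one-hot: 1 at the chosen move
};

TrainingSample encodeSample(const std::array<int, 9>& board,
                            int player,      // 1 for X, -1 for O
                            int bestMove) {  // index 0..8 of best move
    TrainingSample s{};
    for (int i = 0; i < 9; ++i) s.input[i] = board[i];
    s.input[9] = player;
    for (int i = 0; i < 9; ++i) s.output[i] = (i == bestMove) ? 1 : 0;
    return s;
}
```

One-hot output vectors like this are what make the task a classification problem: the MLP picks one of nine move classes rather than regressing a desirability value.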
An even remotely feasible MLP network was not found. The input layer took in the board state and whether the player was X's or O's; the output layer gave the next move. If the move given by the output layer does not make sense, a random move is selected instead.

I think the solution with the largest potential to solve this MLP training problem is one that would reduce the lookup table substantially. First of all, the tic tac toe program was never given a sense of symmetry. Adding symmetry reduces the lookup table by a large percentage. This would allow for the
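The symmetry reduction proposed above could be implemented by mapping every board to a canonical representative of its symmetry class before lookup. A tic tac toe board has 8 symmetries (4 rotations, each optionally mirrored), so storing only canonical states shrinks the table by up to a factor of 8. The following C++ sketch (my own illustration, not from the paper) picks the lexicographically smallest variant as canonical:

```cpp
#include <array>
#include <algorithm>

using Board = std::array<int, 9>;  // row-major 3x3; 0 empty, 1 X, -1 O

// Rotate 90 degrees clockwise: new cell (r,c) takes old cell (2-c, r).
Board rotate(const Board& b) {
    Board out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out[3 * r + c] = b[3 * (2 - c) + r];
    return out;
}

// Mirror horizontally: new cell (r,c) takes old cell (r, 2-c).
Board mirror(const Board& b) {
    Board out{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out[3 * r + c] = b[3 * r + (2 - c)];
    return out;
}

// Canonical form: the lexicographically smallest of the 8 symmetric
// variants. Equivalent boards map to the same table entry.
Board canonical(const Board& b) {
    Board best = b, cur = b;
    for (int flip = 0; flip < 2; ++flip) {
        for (int rot = 0; rot < 4; ++rot) {
            best = std::min(best, cur);
            cur = rotate(cur);
        }
        cur = mirror(cur);  // after 4 rotations cur is back to start
    }
    return best;
}
```

With this in place, both the lookup table and the extracted MLP training set would key on `canonical(board)` instead of the raw board, collapsing all symmetric duplicates into one entry.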

