Artificial Intelligence

Stuart Jonathan Russell, Peter Norvig

Mentioned 10

Presents a guide to artificial intelligence, covering such topics as intelligent agents, problem-solving, logical agents, planning, uncertainty, learning, and robotics.


Mentioned in questions and answers.

I'm really interested in Artificial Neural Networks, but I'm looking for a place to start.

What resources are out there and what is a good starting project?

If you don't mind spending money, The Handbook of Brain Theory and Neural Networks is very good. It contains 287 articles covering research in many disciplines. It starts with an introduction and theory and then highlights paths through the articles to best cover your interests.

As for a first project, Kohonen maps are interesting for categorization: find hidden relationships in your music collection, build a smart robot, or solve the Netflix prize.

First of all, give up any notions that artificial neural networks have anything to do with the brain but for a passing similarity to networks of biological neurons. Learning biology won't help you effectively apply neural networks; learning linear algebra, calculus, and probability theory will. You should at the very least make yourself familiar with the idea of basic differentiation of functions, the chain rule, partial derivatives (the gradient, the Jacobian and the Hessian), and understanding matrix multiplication and diagonalization.

Really what you are doing when you train a network is optimizing a large, multidimensional function (minimizing your error measure with respect to each of the weights in the network), and so an investigation of techniques for nonlinear numerical optimization may prove instructive. This is a widely studied problem with a large base of literature outside of neural networks, and there are plenty of lecture notes in numerical optimization available on the web. To start, most people use simple gradient descent, but this can be much slower and less effective than more nuanced methods like

Once you've got the basic ideas down you can start to experiment with different "squashing" functions in your hidden layer, adding various kinds of regularization, and various tweaks to make learning go faster. See this paper for a comprehensive list of "best practices".
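To make the "training is just optimization" point above concrete, here is a minimal sketch of plain gradient descent on a single sigmoid neuron. The task (logical AND), the learning rate, and the names `train_neuron` and `predict` are illustrative assumptions, not something taken from the answer; a real network has many such units, but each weight update follows the same chain-rule pattern.

```python
import math

def sigmoid(z):
    """A common "squashing" function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(data, lr=0.5, epochs=2000):
    """Per-example gradient descent on squared error for one 2-input neuron."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, target in data:
            out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            # Chain rule for E = 0.5 * (out - target)^2:
            # dE/dz = (out - target) * sigmoid'(z), with sigmoid'(z) = out * (1 - out)
            delta = (out - target) * out * (1.0 - out)
            w[0] -= lr * delta * x[0]   # dE/dw0 = delta * x0
            w[1] -= lr * delta * x[1]   # dE/dw1 = delta * x1
            b -= lr * delta             # dE/db  = delta
    return w, b

# Hypothetical toy task: learn logical AND.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_neuron(AND_DATA)

def predict(x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)
```

After training, `predict((1, 1))` should land above 0.5 and the other three inputs below it. Swapping in a different squashing function only changes the `out * (1.0 - out)` derivative term.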

One of the best books on the subject is Chris Bishop's Neural Networks for Pattern Recognition. It's fairly old by this stage but is still an excellent resource, and you can often find used copies online for about $30. The neural network chapter in his newer book, Pattern Recognition and Machine Learning, is also quite comprehensive. For a particularly good implementation-centric tutorial, see this one on which implements a clever sort of network called a convolutional network, which constrains connectivity in such a way as to make it very good at learning to classify visual patterns.

Support vector machines and other kernel methods have become quite popular because you can apply them without knowing what the hell you're doing and often get acceptable results. Neural networks, on the other hand, are huge optimization problems which require careful tuning, although they're still preferable for lots of problems, particularly large scale problems in domains like computer vision.

I found Fausett's Fundamentals of Neural Networks a straightforward and easy-to-get-into introductory textbook.

Two books that were used during my studies:

Introductory course: An Introduction to Neural Computing by Igor Aleksander and Helen Morton.

Advanced course: Neurocomputing by Robert Hecht-Nielsen

Raul Rojas' book is a very good start (it's also free). Also, the 3rd edition of Haykin's book, although large, is very well explained.

Neural Networks are kind of declasse these days. Support Vector Machines and kernel methods are better for more classes of problems than back propagation. Neural networks and genetic algorithms capture the imagination of people who don't know much about modern machine learning but they are not state of the art.

If you want to learn more about AI/Machine learning, I recommend buying and reading Russell and Norvig's Artificial Intelligence: A Modern Approach. It's a broad survey of AI and lots of modern technology. It goes over the history and older techniques too, and will give you a more complete grounding in the basics of AI/Machine Learning.

Neural networks are pretty easy, though, especially if you use a genetic algorithm to determine the weights rather than proper back propagation.

I can recommend where not to start. I bought An Introduction to Neural Networks by Kevin Gurney which has good reviews on Amazon and claims to be a "highly accessible introduction to one of the most important topics in cognitive and computer science". Personally, I would not recommend this book as a start. I can comprehend only about 10% of it, but maybe it's just me (English is not my native language). I'm going to look into other options from this thread.

I've always been a largely independent learner gleaning what I can from Wikipedia and various books. However, I fear that I may have biased my self-education by inadvertent omission of topics and concepts. My goal is to teach myself the equivalent of an undergraduate degree in Computer Science from a top university (doesn't matter which one).

To that end, I've purchased and started reading a few academic textbooks:

As well as a few textbooks I have left over from classes I've taken at a mediocre-at-best state university:

My questions are:

  • What topics aren't covered by this collection?
  • Are there any books that are more rigorous or thorough (or even easier to read) than a book listed here?
  • Are there any books that are a waste of my time?
  • In what order should I read the books?
  • What does an MIT or Stanford (or UCB or CMU ...) undergrad learn that I might miss?

Software engineering books are welcome, but in the context of academic study only please. I'm aware of Code Complete and the Pragmatic Programmer, but I'm looking for a more theoretical approach. Thanks!

I think you can use most of the other books for reference and just absorb Programming Pearls in its entirety. Doing so would make you better than 90% of the programmers I've ever met.

The "Gang of Four" Design Patterns book. The Design Patterns course I took in college was probably the most beneficial class I've ever taken.

First, I wouldn't worry about it. But if you'd like a book to learn some of the abstract CS ideas, I'd recommend The Turing Omnibus or Theoretical Introduction to Programming.

If I were deciding between hiring two programmers and neither had much experience, but one had a CS degree and the other didn't, I'd hire the one with the CS degree. But when you get to comparing two programmers with a dozen years of experience, the degree hardly matters.

I'm in the same boat: studying computer science in my free time after work. These are some of the books I have on my shelf right now:

  1. Applying UML and patterns - Larman
  2. Introduction to algorithms - Cormen
  3. Discrete mathematics and its applications - Rosen
  4. Software Engineering
  5. Advanced Programming in the UNIX Environment

Will update this list further as soon as I finish them... :-)

File Structures: An object oriented approach with C++

A lot of good info about block devices and file structuring which you won't find in any of the books you listed. It got a few critical reviews on Amazon because people didn't like his code examples, but the point of the book is to teach the concepts, not give cut and paste code examples.

Also make sure to get a book on compilers

Biggest two omissions I see:

For operating systems I prefer the Tanenbaum instead of the Silberschatz but both are good:

And about the order, that would depend on your interests. There aren't many prerequisites, automata for compilers is the most obvious one. First read the automata book and then the dragon one.

I don't know all the books you have, but the ones I know are good enough so that may mean the others are decent as well.

You are missing some logic and discrete math books as well.

And let's not forget some database theory books!

I've been asked to help out on an XNA project with the AI. I'm not totally new to the concepts (pathfinding, flocking, etc.) but this would be the first "real" code. I'd be very thankful for any resources (links or books); I want to make sure I do this right.

The standard textbook and a great place to start is Russell and Norvig's Artificial Intelligence: A Modern Approach. You can also get MIT's Intro AI course via OpenCourseWare.

I'm looking to learn some fundamentals of Cartesian geometry or coordinate-based game programming. Platform is irrelevant, although I'm most proficient in JavaScript, C, and Objective-C. Ultimately, being able to create something such as dots or checkers would be ideal. The idea is for me to learn how sprites work and how pathing works programmatically. My question to you folks is: where is the best place to learn the fundamentals? I'd like something that isn't math heavy because, to be quite frank, anything more advanced than calculus is hazy to me at this point and would require refreshing my memory.

If there is a particular book, site, or open source project -- that would probably help me the most.

Thanks for any ideas.

Othello, and the book is of course the renowned PAIP by Peter Norvig.

Sprite animation is going to differ significantly based on what platform you choose to do your program on, and any generic reference for animating on that platform will get you through that. If you want to shoot for Java, Yoely's references look pretty good.

For the game AI, though, I recommend you check out Artificial Intelligence: A Modern Approach by Russell and Norvig. It looks intimidating, and understanding much of the book will take a working knowledge of high-level math concepts. However, it is engaging and well-written, and you can probably make it through the first dozen chapters or so without hitting any math landmines. The algorithms and concepts in that book will be more than enough to help you program AI for a simple game, and might even help you decide on one.

I think there are a few more steps to accomplishing your objective, which is understanding the basics of game programming. You mentioned understanding sprites and pathing, which are imperative to game programming, but I think that initially you should spend a little time understanding the programming and methodology behind general graphical user interaction.

Regardless of what language you will eventually program your game in, I think that learning in a modern language like Java or C# will provide you with a vast number of libraries and will allow you to accomplish tasks like animation and Event Listeners much more simply.

Here is a list of guides and tutorials that I think will be extremely helpful to you just as they were to me and others:

  1. This is an extremely-detailed tutorial for a Java Game Framework that includes full source code and a full walk through (with source code) of writing the infamous "Snake" game in Java, complete with a control panel, score board, and sound effects!
  2. The book "Beginning Java 5 Game Programming" by Jonathan S. Harbour will introduce you to concepts such as 2D vector graphics and bitmap including sprite animation. Plus you can get it used on Amazon Marketplace for $12!
  3. Here is an unbelievable tutorial on Sprite Animation that has more than 5 parts to it! Plus it's written by Richard Baldwin, a Professor of CompSci and an extremely reliable and knowledgeable source. For more tutorials by him, this is his site.

Between these sources you're going to possess the methodology of the parts that go into a game, which are applicable in any language, as well as the knowledge of how those parts can actually be implemented.


I have thought of some heuristics for a big (higher dimensions) tic-tac-toe game. How do I check which of them are actually consistent?

What is meant by consistency anyways?

EDITED: This answer confused admissibility and consistency. I have corrected it to refer to admissibility, but the original question was about consistency, and this answer does not fully answer the question.

You could do it analytically, by distinguishing all different cases and thereby proving that your heuristic is indeed admissible.

For informed search, a heuristic is admissible for a search problem (say, the search for the best move in a game) if and only if it never overestimates the 'distance' to a suitable state.

EXAMPLE: Search for the shortest route to a target city via a network of highways between cities. Here, one could use the Euclidean distance as a heuristic: the length of a straight line to the goal is never longer than the best possible route.

Admissibility is required by algorithms like A*, which then guarantee optimality (i.e. they will find the best 'route' to a goal state if one exists).
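Since the question asked about consistency and this answer covers admissibility, here is a minimal sketch that checks both on a small explicit graph. It assumes the state space is small enough to enumerate, which may not hold for a large tic-tac-toe variant. It uses the standard definitions: a heuristic h is admissible if h(n) never exceeds the true cheapest cost from n to the goal, and consistent if h(goal) = 0 and h(n) <= c(n, n') + h(n') for every edge from n to n'. The graph, the heuristic tables, and all function names are hypothetical.

```python
import heapq

def true_costs_to_goal(graph, goal):
    """Dijkstra from the goal over reversed edges: cheapest cost from each node to goal."""
    reverse = {n: [] for n in graph}
    for u, edges in graph.items():
        for v, cost in edges.items():
            reverse.setdefault(v, []).append((u, cost))
    dist = {goal: 0}
    queue = [(0, goal)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for prev, cost in reverse.get(node, []):
            nd = d + cost
            if nd < dist.get(prev, float("inf")):
                dist[prev] = nd
                heapq.heappush(queue, (nd, prev))
    return dist

def is_admissible(graph, h, goal):
    """h never overestimates the true remaining cost (checked for nodes that can reach goal)."""
    dist = true_costs_to_goal(graph, goal)
    return all(h[n] <= dist[n] for n in dist)

def is_consistent(graph, h, goal):
    """h(goal) == 0 and h obeys the triangle inequality on every edge."""
    if h[goal] != 0:
        return False
    return all(h[u] <= cost + h[v]
               for u, edges in graph.items()
               for v, cost in edges.items())

# Hypothetical directed graph: graph[u][v] is the step cost from u to v.
GRAPH = {"A": {"B": 1, "G": 3}, "B": {"G": 1}, "G": {}}
H_CONSISTENT = {"A": 2, "B": 1, "G": 0}
H_ADMISSIBLE_ONLY = {"A": 2, "B": 0, "G": 0}   # violates h(A) <= c(A,B) + h(B)
H_OVERESTIMATE = {"A": 3, "B": 1, "G": 0}      # true cost from A is 2
```

In this example `H_ADMISSIBLE_ONLY` passes the admissibility check but fails the consistency check, which shows that the two properties are not the same. Every consistent heuristic is admissible, but not the other way around.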

I would recommend looking the topic up in an AI textbook.

I've been trying to learn about neural networks for a while now, and I can understand some basic tutorials online. Now I want to develop online handwriting recognition using a neural network, but I have no idea where to start, and I need very good instructions. Finally, I'm a Java programmer.

What do you suggest I do?

Russell and Norvig's Artificial Intelligence: A Modern Approach is a good book on general AI and explains a lot about the basics, and there is a section on Back Propagation neural networks.

To train your neural network you'll need datasets.

There's THE MNIST DATABASE of handwritten digits, or the Pen-Based Recognition of Handwritten Digits Data Set at the UCI Machine Learning Repository

The UCI ML repository has lots of great datasets, many of which would be good to train neural networks. Even if you don't know what they're about you can grab some and see if your ML system can do the classification tasks. Look at Classification tasks with a large number of attributes and instances, although you can try smaller ones too when you start out.

By the way, there are a lot more techniques besides neural networks, including Support Vector Machines, which are popular.

I wanted to ask Stack Overflow users for a nice idea for a project that could entertain a fellow student programmer during a semester. Computer vision might look interesting, although I couldn't say if a project on that field is something that could be achievable in 4 months. What do you think?

Can't tell without knowing more about you, your friend, and the project. My guess is "no".

I'd point you towards two sources. The first is Russell and Norvig's "Artificial Intelligence"; the second is "Programming Collective Intelligence". Maybe they'll inspire you.

There is a story that, during the early days of AI research when significant progress was being made on "hard" logic problems via mechanical theorem provers, a professor assigned one of his graduate students the "easy" problem of solving how vision provided meaningful input to the brain. Obviously, things turned out to be far more difficult than the professor anticipated. So, no, not vision in the general sense.

If you are just starting in AI, there are a couple of directions. The classic AI problems - logic puzzles - are solved with a mechanical theorem prover (usually written in Lisp - see here for the classic text on solving logical puzzles). If you don't want to create your own, you can pick up a copy of Prolog (it is essentially the same thing).

You can also go with pattern recognition problems, although you'll want to keep the initial problems pretty simple to avoid getting swamped in detail. My dissertation involved the use of stochastic processes for letter recognition in free-floating space, so I'm kind of partial to this approach (don't start with stochastic processes though, unless you really like math). Right next door is the subfield of neural networks. This is popular because you almost can't learn NN without building some interesting projects. Across this entire domain (pattern processing), the cool thing is that you can solve real problems rather than toy puzzles.

A lot of people like Natural Language Processing as it is easy to get started but almost infinite in complexity. One very definable problem is to build an NLP program for processing language in a specific domain (discussing a chess game, for example). This makes it easy to see progress while still being complex enough to fill a semester.

Hope that gives you some ideas!

Apologies in advance if this is too vague.

My list so far:

  • statistical arbitrage
  • actuarial science
  • manufacturing process control
  • image processing (security, manufacturing, medical imaging)
  • computational biology/drug design
  • sabermetrics
  • yield management
  • operations research/logistics (I'll include business intelligence with this)
  • marketing (preference prediction, survey design/analysis, online ad serving)
  • computational linguistics (Google, information retrieval, ...)
  • educational testing
  • epidemiology
  • criminology (fraud detection, counterterrorism, ...)
  • consumer credit scoring
  • spam detection
  • bug finding, virus detection, computer security

Are there any articles, books or journals that address this question? The only book I've seen is Super Crunchers, which focuses on consumer preferences and not much else.

There are a ton of fields which utilize machine learning:

  • Predictive text input (Support Vector Machines)
  • Computer Vision
  • Game A.I.
  • Robotic perception (classification and detection)
  • Genomics
  • Handwriting recognition (the U.S. Postal service uses neural networks for mail sorting, for instance)
  • Credit card fraud detection
  • Localization (Kalman filters, particle filters)
  • Preference Prediction (Netflix, Amazon)


If you're looking to laundry list all the applications of machine learning, I think you'll find the problem is intractable. Machine learning as a field is largely focused on the task of using data to build a model which can map inputs to a desired set of outputs. The set of fields which utilize it grows constantly, as folks imagine new applications for machine learning. If it helps, machine learning is typically most powerful when the mapping between inputs and outputs cannot be well described, when the mapping space is too high-dimensional to process in a reasonable fashion, and/or when the mapping needs to adapt over time.

If you're simply looking for places to read up on machine learning applications, you can take a look at the following:

Another good bet would be to hit up university websites that have strong A.I., CS, Math, or Robotics programs and see if they have course materials of interest. I know, for instance, that CMU, MIT, and Stanford all typically have lots of course notes online which will often mention applications for various techniques.

I am currently working on an AI for the Card Game Wizard. Wizard is a trick-taking game, in which each player states how many tricks he believes he will take, before the actual game begins.

After reading some papers and some parts of the book Artificial Intelligence: A Modern Approach, I decided to first design my algorithm for the game with open cards, so that each player has complete information. So I just started and implemented a Monte Carlo Tree Search algorithm, using a UCB selection policy. I have implemented everything in Java, and it seems to be running pretty well, but my bots are not playing optimally yet. Predicting the number of tricks you will take seems to be an especially hard task, for which I used the same MCTS as for the playing.

So basically my algorithm expands the current state of the game (e.g. 2 players have placed their bid), creating one new node (e.g. 3 players have placed their bid), and then just plays randomly until the game is over. Then the scores are evaluated and backed up through the nodes.

I think the next step to improve the algorithm would be to add some heuristic to the tree search, so that branches that will most likely result in a loss will be ignored from the start.

My question is: Do you think that this is a good approach? What other approaches would be promising, or do you have any other tips?
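For readers unfamiliar with the UCB selection policy mentioned above, a common variant is UCB1, sketched below. The question does not say which variant or exploration constant was used, so the constant `c = sqrt(2)`, the dictionary layout of a node, and the function names are assumptions for illustration only. Rewards are assumed to be scaled to the range [0, 1].

```python
import math

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus for rarely tried moves."""
    if visits == 0:
        return float("inf")  # always try an unvisited child first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children):
    """Return the index of the child with the highest UCB1 score.

    Each child is a dict with 'reward' (sum of rollout results) and 'visits'.
    The parent's visit count is approximated here by the sum of child visits.
    """
    parent_visits = sum(child["visits"] for child in children)
    scores = [ucb1(child["reward"], child["visits"], parent_visits)
              for child in children]
    return max(range(len(children)), key=scores.__getitem__)
```

A heuristic like the one proposed in the question could enter this scheme either as a prior added to the score or as a replacement for purely random playouts. Both are design choices rather than part of UCB1 itself.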

I'm trying to take a group of twenty people (labeled 1 - 20) and split them into five subgroups of 4 each based upon expressed preferences of who those people wish to be with.

Each person in the group of 20 could express 0, 1, 2, 3, or 4 preferences. For example, person1 could select 0 (no preference of who they are with), or 14 (in a group with person14) or could express to be in a group with persons 14, 20, 6, and 7. Ideally each person with a preference would be in a group with at least one choice.

Ideas on an algorithm?

The problem you are having is not really related to C#; the algorithm is independent of the language.

A classic approach for this kind of problem is backtracking.
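As a rough illustration of the backtracking idea, the sketch below builds groups one at a time and undoes a choice as soon as it leads to a dead end. It treats "everyone with a preference gets at least one choice" as a hard constraint, which is stricter than the "ideally" in the question. The function name, the `prefs` layout, and the small example are assumptions, and the number of people is assumed to be divisible by the group size.

```python
from itertools import combinations

def assign_groups(people, prefs, group_size):
    """Backtracking search: return a list of groups, or None if no valid split exists.

    prefs maps a person to the set of people they would like to be with.
    A split is valid when everyone with at least one preference shares a
    group with at least one of their choices.
    """
    def satisfied(group):
        members = set(group)
        return all(not prefs.get(p) or prefs[p] & members for p in group)

    def build(remaining, groups):
        if not remaining:
            return groups  # everyone has been placed
        first, rest = remaining[0], remaining[1:]
        # Try every way of completing a group around the first unplaced person.
        for mates in combinations(rest, group_size - 1):
            group = (first,) + mates
            if satisfied(group):
                left = [p for p in rest if p not in mates]
                result = build(left, groups + [group])
                if result is not None:
                    return result
            # Otherwise fall through and try the next combination (backtrack).
        return None

    return build(list(people), [])

# Hypothetical small instance: six people in pairs.
example = assign_groups(range(1, 7), {1: {4}, 2: {3}, 5: {6}}, 2)
```

For the full problem you would call it with `range(1, 21)` and `group_size=4`. If no perfect split exists it returns `None`, at which point a softer scoring approach (such as the genetic algorithm suggested in this answer) may be a better fit.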

More info:

Another approach (I would go for this): Genetic Algorithms.