MovingAI pathfinding benchmark parser in Rust

As you may know, I have worked a lot on pathfinding. In academia, the MovingAI benchmark, created by the Moving AI Lab at the University of Denver, is a must for benchmarking pathfinding algorithms. It includes both synthetic maps and maps from commercial videogames.

Parsing the benchmark data and the maps, building the map data structures, and so on is one of the most tedious things I had to do when testing my algorithms. For this reason, I think a common library for working with the map specification is a must.

For this reason, and because I enjoy coding in Rust a lot, I wrote a MovingAI map parser for Rust.

The repository is here. The library is also on crates.io. It is still unstable because I want to be sure that the public API is consistent with the requirements. I am also not very confident about what makes a good Rust API, so I welcome some help here. :)
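To give an idea of how the parser is meant to be used, here is a rough sketch of loading a map and querying it. The module paths and function names (parse_map_file, the Map2D trait) are my assumptions based on the crate's documentation and may not match the published API exactly.

// Hypothetical usage sketch: parse a .map file and query the result.
// `movingai::parser::parse_map_file` and the `Map2D` trait are assumptions
// and may differ from the actual crate API.
use std::path::Path;

use movingai::parser::parse_map_file;
use movingai::Map2D;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse one of the benchmark maps (the file name is just an example).
    let map = parse_map_file(Path::new("./maps/arena.map"))?;

    // Basic queries on the parsed map.
    println!("Map size: {}x{}", map.width(), map.height());

    // Enumerate the walkable neighbors of a tile, the same call used
    // by the A* example below.
    for neigh in map.neighbors((10, 10)) {
        println!("Neighbor: {:?}", neigh);
    }

    Ok(())
}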

Example

However, look at how convenient it is for writing pathfinding algorithms! All the important pieces (neighbors, map data, and so on) are available out of the box. This is an A* algorithm I wrote in literally five minutes.

// A* shortest path algorithm on a MovingAI map.
// (Imports assumed; the exact module paths may differ between crate versions.)
use std::collections::BinaryHeap;

use movingai::{Coords2D, Map2D, MovingAiMap};

fn shortest_path(map: &MovingAiMap, start: Coords2D, goal: Coords2D) -> Option<f64> {

    // Open set, ordered by f-score, and the list of already expanded nodes.
    let mut heap = BinaryHeap::new();
    let mut visited = Vec::<Coords2D>::new();

    // The start node has g = 0, so f = h.
    let h0 = distance(start, goal);
    heap.push(SearchNode { f: h0, g: 0.0, h: h0, current: start });

    while let Some(SearchNode { f: _f, g, h: _h, current }) = heap.pop() {

        // The first time we pop the goal, g is the cost of the shortest path.
        if current == goal { return Some(g); }

        // Skip nodes that have already been expanded.
        if visited.contains(&current) {
            continue;
        }

        visited.push(current);

        // Expand all traversable neighbors provided by the map.
        for neigh in map.neighbors(current) {
            let new_h = distance(neigh, goal);
            let step = distance(neigh, current);
            let next = SearchNode { f: g + step + new_h, g: g + step, h: new_h, current: neigh };
            heap.push(next);
        }
    }

    // Goal not reachable.
    None
}
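For completeness, here is one possible definition of the pieces the snippet above assumes (SearchNode and distance). This is my own sketch, not necessarily what the repository uses: the ordering on SearchNode is reversed on the f-score so that Rust's max-heap BinaryHeap pops the most promising node first, and distance is plain Euclidean distance, assuming Coords2D is a (usize, usize) pair as in the crate.

use std::cmp::Ordering;

// Search node carrying the usual A* bookkeeping: f = g + h.
#[derive(Copy, Clone, PartialEq)]
struct SearchNode {
    f: f64,
    g: f64,
    h: f64,
    current: Coords2D,
}

impl Eq for SearchNode {}

impl Ord for SearchNode {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed comparison on f: the smaller the f-score, the higher the
        // priority, turning BinaryHeap (a max-heap) into a min-heap.
        other.f.partial_cmp(&self.f).unwrap_or(Ordering::Equal)
    }
}

impl PartialOrd for SearchNode {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Euclidean distance between two grid coordinates.
fn distance(a: Coords2D, b: Coords2D) -> f64 {
    let dx = a.0 as f64 - b.0 as f64;
    let dy = a.1 as f64 - b.1 as f64;
    (dx * dx + dy * dy).sqrt()
}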

 

Questions about Deep Learning and the nature of knowledge

Rush of Knowledge

If there is one thing that can be taken as a fact in the AI and Machine Learning domain, it is that recent years have been dominated by Deep Learning and other neural-network-based techniques. When I say dominated, I mean that it looks like the only way to achieve anything in Machine Learning, and it is absorbing most of AI enthusiasts' energy and attention.

This is undoubtedly a good thing. Having a strong AI technique that can solve so many hard challenges is a huge step forward for humanity. However, like everything in life, Deep Learning, despite being highly successful in some applications, carries with it several limitations that, in other applications, make its use unfeasible or even dangerous.

Continue reading “Questions about Deep Learning and the nature of knowledge”

Not every classification error is the same

In this article, I would like to talk about a common mistake that people new to Machine Learning and classification algorithms often make. In particular, when we evaluate (and thus train) a classification algorithm, we tend to consider every misclassification equally important and equally bad. We are so deep into our mathematical version of the world that we forget about the consequences of classification errors in the real world.
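As a minimal sketch of the idea (the labels, costs, and data below are entirely made up for illustration), we can attach a different cost to each kind of error instead of counting them all the same:

// A tiny, illustrative cost-sensitive evaluation (all numbers are invented).
// Instead of plain accuracy, each kind of misclassification gets its own cost.

#[derive(Copy, Clone, PartialEq)]
enum Label { Positive, Negative }

// Hypothetical cost matrix: missing a positive case (false negative)
// is assumed to be ten times worse than a false alarm (false positive).
fn error_cost(predicted: Label, actual: Label) -> f64 {
    match (predicted, actual) {
        (Label::Negative, Label::Positive) => 10.0, // false negative
        (Label::Positive, Label::Negative) => 1.0,  // false positive
        _ => 0.0,                                   // correct prediction
    }
}

fn total_cost(predictions: &[Label], actuals: &[Label]) -> f64 {
    predictions
        .iter()
        .zip(actuals.iter())
        .map(|(&p, &a)| error_cost(p, a))
        .sum()
}

fn main() {
    use Label::*;
    // Two classifiers with the same number of errors (two each)...
    let actual = [Positive, Positive, Negative, Negative];
    let model_a = [Negative, Negative, Negative, Negative]; // two false negatives
    let model_b = [Positive, Positive, Positive, Positive]; // two false positives
    // ...but very different total costs under the assumed cost matrix.
    println!("A: {}, B: {}", total_cost(&model_a, &actual), total_cost(&model_b, &actual));
}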

Continue reading “Not every classification error is the same”

How hidden variables in statistical models affect social inequality

The use of machine learning is becoming ubiquitous and, even with its fancy name, it remains a tool in the statistical modeler's belt. Every day, we leak billions of data points about ourselves to companies ready to use them for their own affairs. Modeling through data gets more common every day, and mathematical models are the rulers of our lives: they decide where we can work, whether we can get a loan, how many years of jail we deserve, and more.

While this is ethically problematic by itself, a deeper, simpler problem is polluting mathematical modeling around the world: hidden variables, that is, variables that are a common root cause of some of the data we are sampling. To understand why they are such a big problem, let's start with an example. Suppose we want to write a program to decide whether a certain candidate is a good worker for our company. We want to use an automated system because we don't want human weaknesses and prejudices to affect our hiring process! Right?

So we start collecting data about the candidates with a questionnaire. We fill it with many common-sense questions. For simplicity, assume we have just three questions: "How good was your school curriculum?", "Have you ever had problems with the law?" and "Have you ever missed a payment to your creditors?". They seem like good questions. After all, our model is quite clear: being good at school, being a good citizen, and paying debts on time are clearly variables correlated with the variable "is a good worker".

[Figure: the causality diagram for the "job candidate" example. The three sampled variables "School", "Law Problems", and "Debts" are not independent: the hidden variable "Race" is a common cause for all of them, which has huge implications for the fairness of the model.]

However, after some time we discover that the system is hiring mostly middle-class white men. Apparently, being a white Caucasian man is directly correlated with being a good worker. It makes no sense. That's the effect of confounding hidden variables. Even if we have not put race in as an explicit variable, it can still affect all the other variables we are sampling. Race affects all of the above: it influences, on average, the wealth of your family. In turn, this affects the quality of your education, the neighborhood you grew up in, and how much you get targeted by the police.
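To make the mechanism concrete, here is a toy simulation (my own sketch, with invented numbers) of how a hidden variable can leak into a score that never observes it:

// A toy simulation of a hidden variable leaking into a "fair" score.
// A hidden "group" shifts the averages of the three observed features,
// so a score built only on those features still differs across groups.

// Minimal pseudo-random generator so the sketch has no external dependencies.
struct Lcg(u64);
impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

fn main() {
    let mut rng = Lcg(42);
    let mut mean_score = [0.0f64; 2]; // average score per hidden group

    for group in 0..2 {
        // The hidden variable shifts every observed feature by the same bias.
        let bias = if group == 0 { 0.2 } else { -0.2 };
        let n = 10_000;
        let mut total = 0.0;
        for _ in 0..n {
            // Three observed features: school, law record, debt history.
            let school = rng.next_f64() + bias;
            let law = rng.next_f64() + bias;
            let debts = rng.next_f64() + bias;
            // A score that only looks at the observed features.
            total += (school + law + debts) / 3.0;
        }
        mean_score[group] = total / n as f64;
    }

    // The gap exists even though `group` never appears in the score.
    println!("mean score by hidden group: {:?}", mean_score);
}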

The overall effect is that you are screwing individuals on the basis of the indirect effect of their race on the average outcome of your sampling variables.

But wait, it gets worse.

The second problem with this kind of model is that it is self-validating. If we use this model to select good job candidates, some people will have fewer job opportunities and therefore less money and, in the end, less chance to pay debts on time, pushing them and their families into bad neighborhoods with more law problems and worse schools. In short, the model amplifies the same issues it got wrong in the first place and, in doing so, validates itself, a problem called retrocausality.

Sometimes, we in the AI community get too caught up in the charm of our models, and we forget the effect such models can have on people. Machine Learning is not immune to these problems. Machine Learning can learn the world's inequalities and use them to confirm its internal model. And when we apply those machine learning algorithms, we contribute to amplifying such inequalities.

This and many other problems are discussed in the book "Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy" by Cathy O'Neil. While I think the book is overly pessimistic in some parts, it is a good read for looking at statistical modeling from a different angle. It definitely helped me consider the implications of bad mathematical modeling for people's lives. Sure, the book is often too focused on the risks of models with respect to their benefits. But when we are talking about people's lives, I think even one innocent victim is too many.

Update 19th September 2017

A few days ago, a friend of mine showed me a recent real-life example of what I described here. I said that Machine Learning is not immune to discriminatory biases. As an example, let's look at this tweet (image copy):

This image shows what a machine learning algorithm (word2vec) learns when trained on a Google News corpus. In particular, it shows which adjectives are associated with the word "he" and which are associated with the word "she". As you can see, we are in stereotype land!

The point is that the algorithm is trained on an already biased world, and therefore learns to be biased itself. It is just math and algorithms, but it is sexist. If we are not aware of this possibility and we apply such ML algorithms, we may end up amplifying the very inequalities we were trying to avoid by using math and algorithms in the first place!
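Just to make the mechanism concrete, word associations like these are usually measured with cosine similarity between word vectors. The sketch below uses made-up three-dimensional vectors; real word2vec embeddings have hundreds of dimensions learned from the corpus.

// Toy illustration of measuring word associations with cosine similarity.
// The vectors below are invented, not actual word2vec embeddings.

fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f64 = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b: f64 = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Made-up embeddings for "he", "she", and some adjective.
    let he = [0.9, 0.1, 0.3];
    let she = [0.1, 0.9, 0.3];
    let adjective = [0.8, 0.2, 0.4];

    // If the training corpus is biased, the adjective ends up closer to one pronoun.
    println!("sim(adj, he)  = {:.3}", cosine_similarity(&adjective, &he));
    println!("sim(adj, she) = {:.3}", cosine_similarity(&adjective, &she));
}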

Artificial Anxiety and the problem of "Mental Issues" in AI

[Image: Claptrap from Borderlands 2]

Anxiety is a bug of the human mind. This may seem a strange claim, but I cannot find a better explanation for anxiety disorders. In fact, we can see pathological anxiety as the undesired consequence of our ability to think about the future. Being scared of a life-threatening event in the near future is a valuable ability: it helps us survive and avoid danger and, in short, it keeps our species alive. That is one of the reasons our species has been so successful in nature[1].

Human Anxiety

However, sometimes our mind goes too far into the future, or it dwells too much on the negative outcomes of future events. And that's where anxiety is deeply rooted. For example, let's look at my cat. When my cat is scared, it is because of something happening in the present moment or, at most, in the imminent future: a noise, a black bag, or something only she can perceive and that makes us laugh at her. Nevertheless, when the threat is gone, the cat quickly comes back to her normal state. There is no "anxiety", at least not in the same form we feel it.

But humans can think far into the future. We can imagine future situations, plan ahead, and form all kinds of hypotheses about future events. And this is where anxiety builds up: by increasing the time horizon we can perceive, we overwhelmingly increase the number of threats that feel imminent, thereby feeding our "scared condition".

But this is unavoidable. Planning ahead, thinking about the future, thinking about the consequences of our actions, our fate, our mortality: this is what made us the dominant species on our planet. It is undoubtedly a "feature" we cannot give up. Foresight is both a blessing and a curse.

What about AI, then? My idea is that, because "anxiety" is a design drawback of the "plan for the future" feature, the same may hold true for Artificial Intelligence. Our goal is to design smarter AIs that can be completely autonomous, and to achieve this we must build AIs that can look at and plan around future events like we do. This forces us to face the "anxiety bug" problem. Can AIs fall into the same pathological pattern? Or are they immune "by design"?

Artificial Anxiety

We are probably talking about a different kind of anxiety. We can define "anxiety" as fear of some future event. In our case, "fear" is "fear of death", because death is what can block us from achieving our final goal: reproduction. That's what thousands of years of natural selection pressure trained us for. I am simplifying the problem enormously, I know, but let's ignore all its nuances for now.

In AI, we can set the agent's final goal, and we may think we can choose a goal that avoids "artificial anxiety". But, whatever goal we set, an agent cannot achieve it if it is deactivated or incapacitated. Either the goal is so simple that it is practically unavoidable (and then, what is the point of programming an agent for it?), or there will be some situations that make the goal unreachable. As a consequence, the agent will try its best to avoid such situations; it will think about them; it will fear them.

This problem is very similar to the Stop Button Problem: whatever system we use to introduce an "emergency stop button" into a general AI will either make the problem worse or make the AI useless.

With "artificial anxiety" we may be in a better position. After all, we want our agents to "fear" potential failure scenarios; this is what drives the agent towards the defined goal. What we want, however, is to find a balance between an agent paralysed by anxiety and a reckless one. We know this is possible because we know people who live without pathological anxiety disorders. Unfortunately, how to do that is far from clear and may require a deeper understanding of the human mind and mental disorders.

I know that talking about mental disorders in AIs may seem ridiculous considering the limited capabilities of our current artificial agents. However, I have a feeling that this may become a relevant aspect of AI development in the far (or near?) future. All the techniques we have developed to treat human anxiety may give us hints on how to face similar issues in the artificial domain.

But more importantly, even if not everybody agrees that what I call "fear" in the domain of AI is even comparable to human fear, I find it fascinating that humans and AI can develop common patterns. In some sense, it makes us feel more like machines, and machines more like humans.

 


[1] However, it is interesting how this amazing ability to predict the future, so valuable for each of us individually, falls apart when we try to predict our future as a species. We make so many bad choices as a collective!