In my current graduate AI class (Gatech CS 6601) we had to perform two mini research projects. The course structure is very nice and refreshing: you read the textbook [1] chapter by chapter at home and do some of the homework from the book, while the actual lectures cover the state of the art in AI. On top of that, you are required to perform three research projects (proposal + implementation + evaluation + paper) in a biweekly cycle.
In the first project I evaluated how to find influential users in a Twitter network, and in the second we predicted how a group of people would like a dish using a Bayes net. I thought both were worth a blog post.
The first project I performed as a team with Patrick Mize.
Reach for Friends
Our goal in the project was to identify influential users in a social graph. A social graph represents friendships between users: each node in such a graph is an individual and each edge represents a friendship between two people. In order to generate a social graph, we used the Twitter API, extracted 1000 nodes, and stored them in a database. The database holds two tables, one for users (with their Twitter id and name) and one for friendships (mapping a user to another user). We "crawled" the Twitter network by recursively searching the friends of a user. In order to control the growth of the tree we restricted the depth of the search to 5 and the breadth of the tree to 4, so we estimated the total number of nodes to be at most 4096. Because some users had fewer than 4 friends and because we did not continue searching duplicate users, we collected 1600 nodes in the end. A sketch of the crawl is shown below.
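This is a minimal sketch of such a depth- and breadth-limited crawl in Python; get_friends is a hypothetical stand-in for the actual Twitter API call that returns the friend ids of a user:

```python
from collections import deque

MAX_DEPTH = 5    # limit the depth of the search
MAX_BREADTH = 4  # follow at most 4 friends per user

def crawl(seed_id, get_friends):
    """Breadth-first crawl of the Twitter friend graph from seed_id.

    get_friends(user) is assumed to return a list of friend ids;
    duplicate users are not expanded again.
    """
    visited = {seed_id}
    edges = []
    queue = deque([(seed_id, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth >= MAX_DEPTH:
            continue
        for friend in get_friends(user)[:MAX_BREADTH]:
            edges.append((user, friend))
            if friend not in visited:  # do not continue on duplicates
                visited.add(friend)
                queue.append((friend, depth + 1))
    return visited, edges
```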
Our method for influence is the reach of a node (user). Reach is computed as follows: for a node v, consider all shortest paths that go through v. Each such path is divided by v into two halves, and the length of the shorter half is inserted into a set. The largest distance in that set is the reach of v (see Figure 1).
Figure 1: In the upper graph, the reach of v is determined by the shortest path between s and g.
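In code, a brute-force version of this definition could look as follows (a sketch using networkx, feasible for a graph of our size; note that networkx returns only one shortest path per pair, so ties between equally short paths are ignored):

```python
import itertools
import networkx as nx

def reach(G):
    """Compute the reach of every node in G over all node pairs."""
    r = {v: 0 for v in G}
    for s, g in itertools.combinations(G.nodes, 2):
        try:
            path = nx.shortest_path(G, s, g)
        except nx.NetworkXNoPath:
            continue
        # every interior node v splits the path into two halves;
        # the shorter half is a candidate for the reach of v
        for i, v in enumerate(path[1:-1], start=1):
            r[v] = max(r[v], min(i, len(path) - 1 - i))
    return r
```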
To search faster, we use a landmark-based heuristic. The landmarks for our heuristic are N distinct nodes from our social graph, each showing more than 4 connections. From those N nodes we computed the shortest path to all other nodes in the graph using uniform cost search (Dijkstra's algorithm). Once we have computed the landmark tables, we can speed up future searches by using the landmarks as a heuristic [2]. The landmark heuristic is the distance from the current node to a landmark minus the distance from that landmark to the goal node; both distances can be looked up in the table we built while computing the landmarks. Since we want to get as close as possible to the actual distance, we take the maximum over all landmarks as the value of the heuristic function:
h(x, y) = max_i ( dist(x, landmark_i)−dist(landmark_i, y) )
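A sketch of the precomputation and the heuristic on top of networkx (the landmark count and the degree threshold follow the description above):

```python
import networkx as nx

def build_landmark_tables(G, n_landmarks=16):
    """Pick N well-connected nodes as landmarks and precompute their
    shortest-path distances to all nodes with Dijkstra's algorithm."""
    landmarks = [v for v in G if G.degree(v) > 4][:n_landmarks]
    return {l: nx.single_source_dijkstra_path_length(G, l)
            for l in landmarks}

def landmark_heuristic(tables):
    """h(x, y) = max_i (dist(x, l_i) - dist(l_i, y)): a lower bound on
    dist(x, y) by the triangle inequality, hence usable with A*."""
    def h(x, y):
        return max(0, max(d.get(x, 0) - d.get(y, 0)
                          for d in tables.values()))
    return h

# usage, e.g.: nx.astar_path(G, s, g, heuristic=landmark_heuristic(tables))
```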
In the end we calculated the influence of some people in my network and took the ones showing the highest reach. We ended up with 3 of my professors, which makes sense. However, a few friends away I am connected to Obama, and he scored lower than the professors. The explanation for this puzzling result is that the network was gathered starting from me, so it is a local network centered around me.
In the second project, Woody Folsom and I explored people's preferences towards food.
A Bayesian Approach to Collaborative Dish Selection
Bayesian catering, a use case: imagine you run a catering service and have to plan an event with a customer. You can create a variety of dishes and now you want to discuss with your clients which ones to serve. In order to get a better idea of your clients' preferences and needs, you let them fill out a survey in advance, where they rate a small number of your dishes on a scale from 1 to 10 and inform you about hard constraints like allergies, religious restrictions, or vegetarianism. You then use those results to predict the ratings for the rest of your dishes and present the clients the top k results. If such a system works, it saves time and leads to better customer satisfaction, since you can present the clients dishes they will most probably like but still surprise them (you do not present what they already rated). After the dinner, participants could rate the dishes served at the party, which would improve your predictions for that client in the future.
We model the diners' various taste preferences using a Bayes net. The net consists of three node types, which we call "control nodes", "taste nodes", and "rating nodes". A taste node models the probability of a diner's preference towards an ingredient (P(likes tomato), P(likes potato)) or a category (P(likes meat)). These variables are discrete. The ingredients are conditionally independent of each other but conditioned on the food category they belong to (see the two top layers in Figure 2). A control node can definitely reject a dish by evaluating to 0 under certain conditions. For example, if someone is vegetarian and the presented dish contains meat, the control variable for vegetarian evaluates to 0 and so the probability for the whole dish becomes 0; the vegetarian variable is therefore conditioned on meat. The third type, the rating node, is continuous and models the dish rating given a set of ingredients. The overall net is shown in Figure 2.
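The structure can be sketched as a parent list per variable (the concrete variables here are illustrative; the actual net is the one in Figure 2):

```python
# Parents of each variable; every taste node models P(likes <x>).
STRUCTURE = {
    "meat": [],                      # category taste nodes (roots)
    "vegetables": [],
    "pork": ["meat"],                # ingredient taste nodes, conditioned
    "beef": ["meat"],                # on the category they belong to
    "potato": ["vegetables"],
    "tomato": ["vegetables"],
    "vegetarian": ["meat"],          # control node
    "rating": ["pork", "beef", "potato", "tomato", "vegetarian"],
}

def vegetarian_control(is_vegetarian: bool, dish_contains_meat: bool) -> float:
    """Control node: evaluates to 0 (vetoing the whole dish) when a
    vegetarian is offered a dish containing meat, 1 otherwise."""
    return 0.0 if is_vegetarian and dish_contains_meat else 1.0
```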
In order to estimate the model parameters, the system is trained with statistics about tastes and preferences, given a set of dishes with ratings from multiple users. The training set is generated from the questionnaires we distributed. An example survey output could look like this (ingredients, rating): "Pork, Potatoes, 8". In order to perform standard Maximum Likelihood Learning [3] we need information about all variables ("Pork, Potatoes, Tomatoes, Beef, Meat, Vegetables, Rating"), so we perform several steps to transform the survey input into a training instance. First we discretize the values such that all given variables (in our case Pork and Potatoes) are set to "true" if the rating is above a certain threshold (5 in our experiments) and "false" otherwise. This way, liked items (rated > 5) appear more often in the training set and are assigned a higher probability. We then add categories by including the category of each ingredient from the survey: if the ingredient is liked, the category is liked too, and if it is not, neither is the category. The last step is to add all variables that are not in the recipe as "false" to the training instance. From a set of those preprocessed assignments, we can directly calculate the probabilities for the ingredients using Maximum Likelihood Learning [4].

Having estimated the probabilities of such a net, we can infer the maximum likelihood rating of an unseen dish while observing only its set of ingredients. To do so, we iterate over all possible ratings (1 to 10) and compute the probability of each; the rating with the maximum probability is the maximum likelihood rating for that dish. We use the enumerateAll algorithm [1] for the probability calculations. In a final experiment we showed that we are able to predict users' ratings with a root mean square error of 1.92.
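A sketch of the preprocessing and the rating prediction (the inference itself is stood in by a placeholder p_rating, since the real computation uses enumerateAll [1]):

```python
CATEGORY = {"pork": "meat", "beef": "meat",
            "potato": "vegetables", "tomato": "vegetables"}
THRESHOLD = 5  # ratings above this mean "likes"

def to_training_instance(ingredients, rating):
    """Turn a survey answer like (["pork", "potato"], 8) into a
    complete assignment over all taste variables plus the rating."""
    liked = rating > THRESHOLD
    instance = {var: False for var in CATEGORY}                 # ingredients
    instance.update({cat: False for cat in CATEGORY.values()})  # categories
    for ing in ingredients:
        instance[ing] = liked            # discretized ingredient preference
        instance[CATEGORY[ing]] = liked  # liked ingredient => liked category
    instance["rating"] = rating
    return instance

def ml_rating(ingredients, p_rating):
    """Try every rating 1..10 and keep the most probable one;
    p_rating(r, ingredients) stands in for the enumeration-based
    inference over the trained net."""
    return max(range(1, 11), key=lambda r: p_rating(r, ingredients))
```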
References
[1] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2010.
[2] A. V. Goldberg and C. Harrelson. Computing the Shortest Path: A* Search Meets Graph Theory. Technical report, Microsoft Research, 2003.
[3] C. Wang, J. C. Knight, and M. C. Elder. On Computer Viral Infection and the Effect of Immunization. In Proceedings of ACSAC '00, 2000.
[4] K. Murphy. A Brief Introduction to Graphical Models and Bayesian Networks.