## Monday, July 13, 2015

### Social Network Sampling

I will start working for a social network in August. My new job is data scientist at the business network Xing. I decided to play with their open API a little. The Xing API gives access to users in the network as well as their contact list. So an external app can analyze the social graph. Well actually there are restrictions on the number of queries so it is possible to
analyze the graph partly. My goal is to extract an unbiased sample from the complete graph by performing a random walk from a start node. That means at every node the algorithm expands all
the node's successors and then picks a random one. The chosen successor is the next node. With
a probability of 0.15 the algorithm returns to the start node. This will ensure that we explore the graph around the start node. The going back is ensuring a trade off between breadth of the exploration and the depth of the exploration. I use the ruby version of the API.
A code snippet for a random walk step in the Xing API
is shown below:

##
# Random walk step:
#   1) With probability 0.15 go back to start
#   2) Randomly choose a node from the successors
def random_walk_step(user_id, start)
reset = false
if Random.rand < 0.15
user_id = start
reset = true
end
num_contacts = 5 # control branching
contacts     = XingApi::Contact.list(user_id, limit: num_contacts)[:contacts][:users]
next_user    = Random.rand(num_contacts)
contacts.map! {|x| x[:id]}
if reset
return random_walk_step(contacts[next_user], start)
end
return [contacts[next_user], contacts, user_id]
end

The random step function takes a user id as the current node and a node we started the random walk at. First we decide if we restart the search at the start node, then it samples a neighbor. Repeating the process and updating the user id with the sampled neighbor results in a graph. The function below, explores the graph for 25 nodes and returns the graph as a dot file:
##
# Print graph as graphviz file
def dot(path)
expended = []
File.open(path, 'w') do |file|
file.write "digraph my_neighbors {\n"
start_id = XingApi::User.me[:users].first[:id]
ids = random_walk_step(start_id, start_id)
45.times do
id = ids[2]