Stay hungry,stay foolish.

Random Walk with Restart

Posted on By Evictor

Introduction

A random walk with restart is a stochastic process to traverse a graph, resulting in a probability distribution over the vertices corresponding to the likelihood those vertices are visited. This probability can be interpreted as the relatedness between nodes in the graph. The random walk starts with an initial distribution over the nodes in the graph, propagating the distribution to adjacent vertices proportionally, until convergence.

Formula

Let T be the transition matrix of the graph, with Machine_learning_image! being the probability of reaching entity Machine_learning_image! from entity Machine_learning_image!,which can be computed as follows::
Machine_learning_image!
in which Machine_learning_image! is the set of entities directly reachable from Machine_learning_image!, and Machine_learning_image! is the weight of the edge between Machine_learning_image! and Machine_learning_image! ,defined as the number of their co-occurrences in the knowledge base. Let Machine_learning_image! be the probability distribution at iteration t, then Machine_learning_image! is computed as follows:
Machine_learning_image!
where λ is restart probability S is the preference vector (in order to avoid the issues caused by sinks and guarantee convergence).
Formally, the random walk model can be modeled as:
Machine_learning_image!

Demo

import random
import numpy as np
def EntityRWS(G, restart_rate, root, max_step): 
'''
Computer Simulate Random Walk with Restart
''' 
    rank = dict()  
    rank = {x:0 for x in G.keys()}  
    rank[root] = 1.0 
    for k in range(max_step):  
        tmp = {x:0 for x in G.keys()}
        total={x:0 for x in G.keys()}
        for i, ri in G.items():
            tmp[root]+=rank[i]*restart_rate
            total[i]=sum(ri.values())
            for j, wij in ri.items():
                  tmp[j] += rank[i] * (wij/total[i])*(1-restart_rate)
        rank = tmp  
  
        print('iter: ' + str(k) + "\t"+str(rank)) 
    return rank  

def Cal_feature(G,restart_rate,root):
'''
Use Formula,however always "Singular Matrix Error"
'''
    probability=np.zeros(shape=(len(G),len(G)))
    total={x:0 for x in G.keys()}
    Name=[x for x in G.keys()]
    for i, ri in G.items():
            total[i]=sum(ri.values())
            for j, wij in ri.items():
                probability[Name.index(i)][Name.index(j)]=wij/total[i]
    c=1-restart_rate
    #for k in range(len(G)):
        #probability[k][k]=0.5;
    print(probability)
    new_probability=1-c*probability
    print(new_probability)
    S=[]
    for x in G.keys():
        if x == root:
            S.append(1)
        else:
            S.append(0)
    new_probability=abs(np.linalg.inv(new_probability))
    print(new_probability)
    print(S)
    feature=restart_rate*np.dot(new_probability,S)
    
    return feature

if __name__ == '__main__' :  
    G={"A":{"a1":1},
       "B":{"b1":1,"b2":1},
       "C":{"c1":1,"c2":1,"c3":1},
       "a1":{"A":1,"d1":1},
       "b1":{"B":1,"d2":1},
       "b2":{"B":1},
       "c1":{"C":1,"d1":1,"d2":1,"d3":1},
       "c2":{"C":1,"d3":1},
       "c3":{"C":1,"d3":1},
       "d1":{"a1":1,"c1":1},
       "d2":{"b1":1,"c1":1},
       "d3":{"c1":1,"c2":1,"c3":1}}
    G1={"A":{"B":1,},
        "B":{"A":1,"C":2},
        "C":{"B":2}}
  
    print(EntityRWS(G1, 0.5, 'A', 100))
    print(Cal_feature(G1,0.5,'A'))