Random Walk with Restart - Evictor's Blog

Introduction
Formula
Demo

Introduction

A random walk with restart is a stochastic process to traverse a graph, resulting in a probability distribution over the vertices corresponding to the likelihood those vertices are visited. This probability can be interpreted as the relatedness between nodes in the graph. The random walk starts with an initial distribution over the nodes in the graph, propagating the distribution to adjacent vertices proportionally, until convergence.

Formula

Let T be the transition matrix of the graph, with Machine_learning_image! being the probability of reaching entity from entity ,which can be computed as follows::

in which is the set of entities directly reachable from , and is the weight of the edge between and ,defined as the number of their co-occurrences in the knowledge base. Let be the probability distribution at iteration t, then Machine_learning_image! is computed as follows:

where λ is restart probability S is the preference vector (in order to avoid the issues caused by sinks and guarantee convergence).
Formally, the random walk model can be modeled as:

Demo

import random
import numpy as np
def EntityRWS(G, restart_rate, root, max_step): 
'''
Computer Simulate Random Walk with Restart
''' 
    rank = dict()  
    rank = {x:0 for x in G.keys()}  
    rank[root] = 1.0 
    for k in range(max_step):  
        tmp = {x:0 for x in G.keys()}
        total={x:0 for x in G.keys()}
        for i, ri in G.items():
            tmp[root]+=rank[i]*restart_rate
            total[i]=sum(ri.values())
            for j, wij in ri.items():
                  tmp[j] += rank[i] * (wij/total[i])*(1-restart_rate)
        rank = tmp  
  
        print('iter: ' + str(k) + "\t"+str(rank)) 
    return rank  

def Cal_feature(G,restart_rate,root):
'''
Use Formula,however always "Singular Matrix Error"
'''
    probability=np.zeros(shape=(len(G),len(G)))
    total={x:0 for x in G.keys()}
    Name=[x for x in G.keys()]
    for i, ri in G.items():
            total[i]=sum(ri.values())
            for j, wij in ri.items():
                probability[Name.index(i)][Name.index(j)]=wij/total[i]
    c=1-restart_rate
    #for k in range(len(G)):
        #probability[k][k]=0.5;
    print(probability)
    new_probability=1-c*probability
    print(new_probability)
    S=[]
    for x in G.keys():
        if x == root:
            S.append(1)
        else:
            S.append(0)
    new_probability=abs(np.linalg.inv(new_probability))
    print(new_probability)
    print(S)
    feature=restart_rate*np.dot(new_probability,S)
    
    return feature

if __name__ == '__main__' :  
    G={"A":{"a1":1},
       "B":{"b1":1,"b2":1},
       "C":{"c1":1,"c2":1,"c3":1},
       "a1":{"A":1,"d1":1},
       "b1":{"B":1,"d2":1},
       "b2":{"B":1},
       "c1":{"C":1,"d1":1,"d2":1,"d3":1},
       "c2":{"C":1,"d3":1},
       "c3":{"C":1,"d3":1},
       "d1":{"a1":1,"c1":1},
       "d2":{"b1":1,"c1":1},
       "d3":{"c1":1,"c2":1,"c3":1}}
    G1={"A":{"B":1,},
        "B":{"A":1,"C":2},
        "C":{"B":2}}
  
    print(EntityRWS(G1, 0.5, 'A', 100))
    print(Cal_feature(G1,0.5,'A'))

Stay hungry,stay foolish.

Index

Introduction

Formula

Demo