On Edge Classification in Networks with Structure and Content
Abstract
The problem of node classification has been widely studied in a variety of network-based scenarios. In this paper, we study the more challenging scenario in which some of the edges in a content-based network are labeled, and the goal is to use this information to determine the labels of other, arbitrary edges. Each edge is also associated with text content, which may correspond to either communication or relationship information between the nodes it connects. This problem often arises in social and communication networks, in which an edge represents communication between two nodes and the text is the content of that communication. Examples include online chat messengers and email networks, where each edge carries the actual content of the chats or emails exchanged. Edge classification is much more challenging than node classification from a scalability point of view, because the number of edges is typically significantly larger than the number of nodes in the network. We design a holistic classification approach that combines content and structure for effective edge classification.