Over the past few years, peer-to-peer based information retrieval systems have attracted considerable interest from computer scientists. While such applications are promising, the underlying technology is challenging: without complete, up-to-date information about the states of other nodes in the network, it is difficult to direct users' queries to ideal destinations effectively and efficiently. The presence of concurrent search sessions adds another level of complication, since bandwidth and capacity limitations may prevent nodes from promptly forwarding, and performing local searches for, all the queries they receive. This thesis frames the peer-to-peer information retrieval (P2P IR) problem in a multi-agent framework and attacks it from an organizational perspective by exploring various adaptive, self-organizing topological organizations, designing appropriate coordination strategies, and exploiting learning techniques to create more accurate routing policies for large-scale agent organizations. Specifically, two protocols have been designed to create semantic-based, implicitly clustered agent organizations and explicit multi-level hierarchical agent organizations, respectively. Several coordination strategies are also proposed to direct distributed search sessions by taking advantage of agents' degree and similarity information. Furthermore, to handle multiple concurrent search sessions, an agent control mechanism is proposed to engineer the query flow in the entire network based only on agents' local observations of network traffic and agent load, so as to improve the mean effective propagation speed of search queries. The elements of this control mechanism include resource selection, local search scheduling, and feedback-based load control.
In particular, with the feedback-based load control unit, an agent considers not only the capacity of its own communication channels but also the service rates of its neighboring agents, which it acquires from them dynamically. Based on this novel agent control mechanism, a balanced distributed search algorithm is designed to reduce potential hot spots in the network. In addition, a reinforcement-learning-based approach is developed in this thesis to take advantage of the run-time characteristics of P2P IR systems, including environmental parameters, bandwidth usage, and historical information about past search sessions. In the learning process, agents refine their content routing policies by constructing relatively accurate routing tables based on a Q-learning algorithm. Experimental results show that this learning algorithm considerably improves the performance of distributed search sessions in P2P IR systems.
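To make the idea of a Q-learning-based routing table more concrete, the following is a minimal sketch, not the thesis's algorithm. The abstract does not specify the state, action, or reward definitions, so everything here is an assumption for illustration: the `RoutingAgent` class, the use of a query topic as the state, the choice of neighbor as the action, and a reward of 1.0 when the chosen neighbor holds relevant content.

```python
import random
from collections import defaultdict


class RoutingAgent:
    """Hypothetical agent that learns which neighbor to forward a query to."""

    def __init__(self, neighbors, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.neighbors = list(neighbors)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Routing table: q[topic][neighbor] is the estimated utility of
        # forwarding a query on `topic` to `neighbor`.
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best(self, topic):
        """Return the neighbor with the highest estimated utility."""
        row = self.q[topic]
        return max(self.neighbors, key=lambda n: row[n])

    def choose(self, topic):
        """Epsilon-greedy choice: mostly exploit, occasionally explore."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.neighbors)
        return self.best(topic)

    def update(self, topic, neighbor, reward, downstream_best=0.0):
        """Standard Q-learning update for one forwarding decision.

        `downstream_best` stands in for the best value the chosen neighbor
        reports for the same topic; it defaults to 0.0 in this toy setting.
        """
        old = self.q[topic][neighbor]
        target = reward + self.gamma * downstream_best
        self.q[topic][neighbor] = old + self.alpha * (target - old)


def simulate(rounds=500):
    """Toy environment: only neighbor "b" holds documents about "music"."""
    agent = RoutingAgent(["a", "b", "c"])
    for _ in range(rounds):
        chosen = agent.choose("music")
        reward = 1.0 if chosen == "b" else 0.0
        agent.update("music", chosen, reward)
    return agent


if __name__ == "__main__":
    learned = simulate()
    print(learned.best("music"), dict(learned.q["music"]))
```

In this toy setting the agent starts with no preference, discovers through occasional exploration that one neighbor yields relevant results, and thereafter routes queries on that topic to it. A realistic system would also need to fold in the bandwidth and load signals the abstract mentions, which this sketch omits.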
Index Terms
- Learning based organizational approaches for peer-to-peer based information retrieval systems