سال انتشار: ۱۳۸۴

محل انتشار: یازدهمین کنفرانس سالانه انجمن کامپیوتر ایران

تعداد صفحات: ۵

نویسنده(ها):

Alireza Mahdian – Computer Engineering Department Sharif University of Technology,
Hamid Khalili –
Mohammad Ghodsi –

چکیده:

We introduce a new algorithm for compressing the link structure of the web graph by means of re-indexing the nodes in some web communities in order to decrease the differences between the numerical values of indices of these nodes. This is especially done for the nodes that participate in the adjacency lists of many nodes. Our algorithm will then partition these nodes into groups, each represented by a group-indicator node. We then remove the edges directed to nodes in one group and replace them with an edge to the group indicator. This algorithm preserves the overall characteristics of the graph and also increases similarity between link adjacency lists of nodes. So it can be used as a preprocessing algorithm to compress the web graph prior to compression by Huffman or other algorithms, as a result the final compression ratio is considerably improved. We will show in the paper that the time complexity of our compression algorithm is O(n2 log n) for each community.