Project 3 Influence Maximization Problem(IMP) Solution

Starting from:

~~$30~~

$24

Home

Overview

In this project, you have two computational tasks for Influence Maximization Problems (IMPs) in social networks. The first is to implement an estimation algorithm for the influence spread and the second is to design and implement a search algorithm for IMPs. An introduction to IMP (including problem formulations and influence spread definition) can be found in the package we provide (IMP.pdf). The IMP is NP-hard and the influence spread computation is #P-hard under the definitions shown in the introduction. Overall, your estimation algorithm needs to give a good estimation of the influence spread and your search algorithm needs to find a high-quality solution to IMPs as fast as possible within a limited time. The scores you get in this project will be given according to the performance of your algorithms in our test.

General rules

After your submission, we will test your influence spread estimator and IMP solver on different IMP problem instances. To make this process as smooth as possible, the package you submit must satisfy the following requirements.

Report

The report should be submitted in pdf format, in which you should describe the core idea of your design and give the pseudo code. Please see the report template and evaluation standard (Released on sakai).

Programming aspects

To get rid of the operating system related issues and the execution efficiency issues of different programming languages, your algorithm must be implemented using Python 3.6 and the only allowed library is numpy.

The executable estimator must be named by ISE.py and the executable solver must be named by IMP.py

Task 1: influence spread computation

Input: the format of the estimator call should be as follows:

python ISE.py –i <social network -s <seed set -m <diffusion

model -t <time budget

<social network is the absolute path of the social network file <seed set is the absolute path of the seed set file <diffusion model can only be IC or LT

<time budget is a positive number which indicates how many seconds (in Wall clock time, range: [60s, 1200s]) your algorithm can spend on this instance.

Output: the value of the estimated influence spread.

Task 2: influence maximization

Input: the format of the solver call should be as follows:

python IMP.py –i <social network -k <predefined size of the seed set -m <diffusion model -t <time budget

<social network is the absolute path of the social network file <predefined size of the seed set is a positive integer <diffusion model can only be IC or LT

<time budget is a positive number which indicates how many seconds (in Wall clock time, range: [60s, 1200s]) your algorithm can spend on this instance.

Output: the seed set found by your algorithm.

The format of the seed set output should be as follows: each line contains a node index. An example is also included in the package.

Supplementary instruction

The social networks are given in a uniform format and an example is provided in the package. The first line contains the number of nodes n and number of edges m, each of the next m lines describes an edge following the format: <src <dest <weight. Node index starts from 1.

The seed set are given in a uniform format and an example is provided in the package. Each line contains a node index. Your algorithm should give the output in this format as well.

Evaluation

We will evaluate the performance of your estimator and your solver, respectively.

For each task, the test is divided into three parts.

Usability test (40%)

We will use some IMP problem instances and seed sets to check whether your solver or estimator are usable.

Here the usability means that your estimator or solver can satisfy the input and the output requirements shown in Section 2.

Once your estimator and solver have passed a test, you will get all the corresponding scores.

Efficacy test (60%)

We will focus on how good your estimators and solvers are. In this test, we will select different IMP problem instances and set a common cut-off time for all participants.

Influence spread computation (20%): We will select different seed sets with different sizes and test the estimation error of your estimator on the instances with these seed set.

Influence maximization (40%): we will specify different sizes of the seed set and test your solver on the instances. The solution quality of all participant solvers is compared in terms of the influence spread and consumed time.

The scores you get in this test depend on the rankings your estimator or solver obtain.

Scalability test (Bonus:0~20%)

We will focus on how well your estimators and solvers can scale up to large-scale IMP problems. In this test, we will select some social networks with different scales, either in the number of nodes or the number of edges, and set a common cut-off time for all participants and compare their solution quality.

Like the efficacy test, we will test your estimator and solver on the instances and compare their solution quality.

The scores you get in this test depend on the rankings your estimator or solver obtain.

Test Environment

Operation System: Debian 10

Server CPU：2.2GHz*2, 8-core total

Python version: 3.6.6