Challenge Overview
Prize
1st - $3000
2nd - $1500
3rd - $1000
4th - $500
Challenge Overview
We are looking for predictive models that better evaluate the impact of various events, particularly elections, on social systems.
Task Detail
We have collected the following initial datasets. RunDataTable describes the demographic information and policy-related votes of each individual, and RelationshipDataTable describes the social relations between these individuals.
RunDataTable
-
income : The value of Income.
-
location: An integer from 0 to 3 denoting the Location.
-
age: An integer describing the Age.
-
avg_confusion: A fractional number denoting the Average Confusion value. There are 10 possible values, i.e., "[1-10]/10".
-
Avg_excitement: A fractional number denoting the Average Excitement value. There are 10 possible values, i.e., "[1-10]/10".
-
avg_group_work_percentage: The value range of the Average Group Work Percentage.
-
avg_happiness: A fractional number denoting the Average Happiness value. There are 5 possible values, i.e., "[1-5]/5".
-
confusion: A fractional number denoting the Confusion value. There are 10 possible values, i.e., "[1-10]/10".
-
excitement: A fractional number denoting the Excitement value. There are 10 possible values, i.e., "[1-10]/10".
-
expense_range: The value range of the Expense.
-
factor_1: A fractional number denoting the answer to the question "I like to take charge of things." There are 10 possible values, i.e., "[1-10]/10".
-
factor_2: A fractional number denoting the answer to the question "I am sociable." There are 10 possible values, i.e., "[1-10]/10".
-
factor_3: A fractional number denoting the score on a series of logical reasoning tasks. There are 10 possible values, i.e., "[1-10]/10".
-
gender: Male/Female
-
groups: Comma separated string of groups.
-
Happiness: A fractional number denoting the Happiness value. There are 5 possible values, i.e., "[1-5]/5".
-
income_range: The value range of Income.
-
num_interactions: An integer number
-
Policy1Vote: Yes/No
-
Policy2Vote: Yes/No
-
Policy3Vote: Yes/No
-
Policy4Vote: Yes/No
-
recently_joined_group: Yes/No
-
recently_left_group: Yes/No
-
voted: Yes/No
-
group_contest: Comma separated string of "WinningGroup,LosingGroup,ContestLocation".
-
group_created: A row with this entry indicates that a group was created at that time.
-
group_disbanded: A row with this entry indicates that a group was disbanded at that time.
-
group_membership: Comma separated string of agents in group.
-
group_work_productivity: A non-negative real number
-
percent_population_voted: A real number between 0 and 1.
-
policy_activated: An integer from 1 to 4 identifying a policy. A row with this entry indicates that the policy was activated at that time. A policy of "none" indicates that all policies were deactivated for that location at that time.
-
job_income: A real number giving the income from the job.
-
fixed_income: A real number giving the fixed income.
-
fixed_expense: A real number giving the fixed expense.
-
distribution_income: A real number giving the distribution income.
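Several of the fields above (fractional values, Yes/No flags, comma-separated strings) need to be converted before modeling. The following is a minimal Python sketch, assuming that fractional values are stored as strings such as "7/10", that flags are stored as "Yes"/"No", and that each row is available as a dictionary keyed by column name. None of this is confirmed by the dataset description; the helper names and the sample row are hypothetical.

```python
from fractions import Fraction


def parse_fraction(text):
    """Convert a fraction string such as "7/10" to a float."""
    return float(Fraction(text.strip()))


def parse_yes_no(text):
    """Map "Yes"/"No" (any letter case) to True/False."""
    return text.strip().lower() == "yes"


def parse_list(text):
    """Split a comma-separated string into a list of trimmed items."""
    return [item.strip() for item in text.split(",") if item.strip()]


# Hypothetical sample row; the real file layout may differ.
sample_row = {
    "confusion": "7/10",
    "Happiness": "3/5",
    "voted": "Yes",
    "groups": "g1, g4",
}

parsed_row = {
    "confusion": parse_fraction(sample_row["confusion"]),
    "Happiness": parse_fraction(sample_row["Happiness"]),
    "voted": parse_yes_no(sample_row["voted"]),
    "groups": parse_list(sample_row["groups"]),
}
```

If the actual files encode these fields differently (for example, as decimals), only the helper bodies would need to change.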
RelationshipDataTable
-
avg_group_identification: A fractional number denoting the Avg Group Identification value. There are only 5 different values, i.e., "[1-5]/5"
-
individual_interaction: A string describing the individual.
-
social_network_edge: True/False
-
percent_yes_votes: A real number between 0 and 1 denoting the percentage of the “Yes” votes.
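The social relations can be turned into an undirected network for analysis. The sketch below assumes each relationship row is a dictionary that names its two endpoints in columns called `agent_a` and `agent_b`; these column names are hypothetical, since the description above does not state how the two individuals are identified. Only the `social_network_edge` field comes from the table.

```python
from collections import defaultdict


def build_network(rows):
    """Build an undirected adjacency map from rows flagged as network edges."""
    graph = defaultdict(set)
    for row in rows:
        # Accept both boolean True and the string "True".
        is_edge = str(row["social_network_edge"]).strip().lower() == "true"
        if is_edge:
            a, b = row["agent_a"], row["agent_b"]  # hypothetical column names
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Hypothetical sample rows for illustration only.
sample_rows = [
    {"agent_a": "a1", "agent_b": "a2", "social_network_edge": "True"},
    {"agent_a": "a1", "agent_b": "a3", "social_network_edge": "False"},
    {"agent_a": "a2", "agent_b": "a3", "social_network_edge": True},
]

network = build_network(sample_rows)
# Number of neighbors per agent, a simple measure of connectedness.
degree = {agent: len(neighbors) for agent, neighbors in network.items()}
```

Agents that appear only in rows where the edge flag is false do not appear in the resulting map.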
Final Predictive Goals: (1) How are groups impacting the social world? (2) How can we track agents participating in events? (3) What influences agents to participate in an event (for example, groups or policies)? Note that an agent can be part of only one event at any point in time.
Goal of This Challenge: First, analyze the data and formulate the problem: how can the impact be evaluated quantitatively? Then build models to answer the questions in the “Final Predictive Goals” section. Your solution will be judged on its novelty and on the quality of the conclusions it generates.
Models that use or relate to dependency graphs are strongly recommended. Submitting a causal graph is also a plus.
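As an illustration of what a dependency graph over the dataset's variables could look like, here is a minimal Python sketch (Python 3.9+ for `graphlib`). The edges are purely hypothetical placeholders, not findings from the data and not a structure prescribed by this challenge; a submission would derive its own structure from analysis.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency structure: each key is assumed to depend on
# the variables in its set. These edges are illustrative only.
dependencies = {
    "num_interactions": {"groups"},
    "Policy1Vote": {"policy_activated", "groups"},
    "voted": {"num_interactions", "policy_activated"},
}

# A valid ordering exists only if the graph is acyclic;
# static_order() raises graphlib.CycleError otherwise.
order = list(TopologicalSorter(dependencies).static_order())
```

The resulting order lists every variable after the variables it is assumed to depend on, which is the order in which such a model would be fitted or simulated.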
The full dataset can be downloaded here.
Final Submission Guidelines
Submission
The final submission must include the following items.
-
A write-up of your proposed model, i.e., how you use the input to evaluate the impact quantitatively.
-
Data analysis, for example, in a Jupyter notebook.
-
A proof-of-concept (PoC) solution.
Judging Criteria
Winners will be determined based on the following aspects:
-
Model Effectiveness (60%)
-
Did you formulate the problem in an intuitive and scientific way?
-
Did you use data analysis to justify your choices?
-
What’s the quality of the generated conclusion?
-
Model Novelty (20%)
-
Are you using a novel model?
-
Model Feasibility (10%)
-
How easy is it to deploy your model?
-
Is training your model time-consuming?
-
Clarity of the Report (10%)
-
Do you explain your proposed method clearly?