The challenge is finished.

Challenge Overview

Topcoder is working with a group of researchers, organized by the University of Chicago, who are competing to understand a series of simulated environments. In the Power World program, we are looking for data analysis that better evaluates how various events, such as an election, affect social systems and actors, and how wealth is distributed and changes in the data. This challenge focuses on the client's questions about how the world described in the data will change if certain parameters for agents and groups are adjusted.



Power World, broadly, is all about social dynamics, policy elections, and group competitions. Other, more minor factors are also included, and you should consider their effects as well.

We’ve had two editions of the world, Explain (WE) and Predict (WP). These worlds share the same underlying causal model but are different samples: agent101 in WE and agent101 in WP won’t have the same values, although within a single world the agent stays the same. We note this because incoming data (later) should be sampled from both WE and WP, so the two worlds can be used to test generalization or to validate data analyses, since broader patterns found in WP should also exist in WE. For example, if factor X drives factor Y, the exact values may differ between worlds, but the causal direction should exist in both, with approximately the same “amount of driving.”
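
One simple way to check whether the “amount of driving” is similar in both worlds is to fit the same linear relationship in each and compare the slopes. The sketch below is only an illustration under stated assumptions: the helper names, the 25% tolerance, and the numbers are made up and are not taken from the data pack. A regression slope on its own also does not establish causal direction, so treat this as a consistency check rather than a causal test.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y on x, i.e. the linear effect of x on y."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def slopes_agree(slope_a, slope_b, rel_tol=0.25):
    """True if both slopes have the same sign and differ by at most rel_tol."""
    if slope_a * slope_b <= 0:
        return False
    return abs(slope_a - slope_b) <= rel_tol * max(abs(slope_a), abs(slope_b))


# Toy values only (hypothetical factor X and factor Y in each world).
we_x, we_y = [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
wp_x, wp_y = [1, 2, 3, 4, 5], [1.8, 4.2, 5.9, 8.3, 9.7]

slope_we = ols_slope(we_x, we_y)
slope_wp = ols_slope(wp_x, wp_y)
consistent = slopes_agree(slope_we, slope_wp)
```

If the slopes disagree in sign or differ widely, the pattern found in one world probably should not be trusted in the other.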

 

Task Detail

Some initial datasets are provided and described below. The RunDataTable describes the demographic information and policy-related vote information of each individual, and the RelationshipDataTable describes the social relations between these individuals.
 
RunDataTable
This Initial Data Package contains data collected from five isolated societies: one main society and four neighboring societies. Each of these societies has essentially the same causal factors for individual, group, and societal behaviors despite some minor cultural differences.

The Prescribe Questions ask about the main society only. Data collected from the main society is reported in data table files without any appended society number, e.g. RunDataTable-Census.tsv, RelationshipDataTable-GSS.tsv, etc. Entities such as agents and groups in this society have unique names that also do not have any appended numbers, e.g. agent0, agent1, agent2, etc. and group0, group1, etc.

Data collected from the neighboring societies is reported in data tables with the society number attached, e.g. RunDataTable-Census-Society-2.tsv, RelationshipDataTable-GSS-Society-2.tsv, etc. Entities such as agents and groups in these societies also have their society number appended to their names to distinguish them from entities in other societies. For example, agents in Society 2 are named agent0-2, agent1-2, agent2-2, etc. and groups in Society 2 are named group0-2, group1-2, etc.
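
The naming convention above can be turned into a small parser that tags every entity and file with its society. This is a minimal sketch, not part of the data pack: the function names are hypothetical, and it assumes the main society is numbered 1 and that names follow exactly the patterns shown in the examples.

```python
import re

# Assumption: entities and files without a suffix belong to the main society,
# which is treated here as Society 1.
MAIN_SOCIETY = 1

_ENTITY = re.compile(r"^(agent|group)(\d+)(?:-(\d+))?$")
_FILE_SOCIETY = re.compile(r"-Society-(\d+)\.tsv$")


def parse_entity(name):
    """Split a name such as 'agent0-2' into (kind, index, society)."""
    match = _ENTITY.match(name)
    if match is None:
        raise ValueError(f"unrecognized entity name: {name!r}")
    kind, index, society = match.groups()
    return kind, int(index), (int(society) if society else MAIN_SOCIETY)


def society_of_file(filename):
    """Return the society number encoded in a data table file name."""
    match = _FILE_SOCIETY.search(filename)
    return int(match.group(1)) if match else MAIN_SOCIETY
```

For example, `parse_entity("group1-2")` yields `("group", 1, 2)`, while `parse_entity("agent0")` is assigned to the main society.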

There are no interactions between entities that are in different societies. For example, agents in the main society do not know about or interact with agents in Society 2. Similarly, agents in Society 3 do not know about and cannot join or interact with groups in Society 4, etc.
 
The full dataset can be downloaded from the forum.
 
In addition, we have provided data from a number of research requests that the client has fulfilled. You will see those in the “Research Requests” folder of the data pack. A document describing the data collected in each research request will be provided in the forum as well. These Research Requests are strictly for Society 1 (the main society); no further data is provided for the satellite societies.
 
Final Prescribe Goals: Given all the data of a number of societies, we want to answer questions the client has provided about how the world will change based on adjustments made to the data. You may want to build separate models for different questions.


Question 1:
Which location should Agent 55 live in from Days 1001 to 1200 to maximize the average group identification among all the groups they are members of on Day 1200? (Assume that Agent 55 can move to any location at Day 1001, but then must stay at that location until Day 1200.)

Question 2:
If Agent 112 quits all their current groups on Day 1001 and then joins one group, which group would result in the lowest happiness on Day 1200, assuming no other group joins/leaves occur in any group during this time period? (Agent 112 can choose any group, including any that they quit on Day 1001.)

Question 3:
Suppose Group 1 can recruit all of the members of another group to join Group 1. Which group should they recruit at Day 1001 in order to maximize the average group identification that all members of Group 1 (including the original members and the recruited members) feel toward that group at Day 1100?

Question 4:
If Group 8 recruited all individuals who were not in any other group to join Group 8 at Day 1001, which policy should Group 8 endorse (if they endorse only one policy) at the next election after Day 1001, to maximize the number of contests that Group 8 wins between Day 1001 and Day 1100?
 
Question 5:
Consider the transfer of wealth from the richest 10% of individuals to the poorest 10%. If this transfer takes place once on Day 1001, how much money should be transferred, per person, assuming the same amount is taken from or given to each individual, to create the greatest average happiness on Day 1021 of all individuals involved in the wealth transfer? (Range of options for money amount is [$0, $2,000] in increments of $100)
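
The intervention in Question 5 can be written down before any happiness model is built. The sketch below only enumerates the 21 allowed amounts and applies one transfer to a wealth mapping. The function names are hypothetical, and it assumes that "10%" means 10% of individuals by count (rounded down), that ties are broken by sort order, and that wealth may go negative. Predicting Day 1021 happiness from the adjusted wealth still needs a model of your own.

```python
def candidate_amounts(low=0, high=2000, step=100):
    """Per-person transfer options: $0 to $2,000 in $100 increments."""
    return list(range(low, high + 1, step))


def apply_transfer(wealth, amount, fraction=0.10):
    """Move `amount` per person from the richest to the poorest `fraction`.

    `wealth` maps agent name -> wealth. Returns the updated mapping and the
    list of agents involved in the transfer (poorest first, then richest).
    """
    ranked = sorted(wealth, key=wealth.get)  # poorest first
    k = int(len(ranked) * fraction)          # assumption: round down
    poorest = ranked[:k]
    richest = ranked[len(ranked) - k:]
    updated = dict(wealth)
    for agent in poorest:
        updated[agent] += amount
    for agent in richest:
        updated[agent] -= amount
    return updated, poorest + richest


# Toy wealth values only, not taken from the data pack.
toy_wealth = {f"agent{i}": 1000 * (i + 1) for i in range(10)}
toy_updated, toy_involved = apply_transfer(toy_wealth, 500)
```

Looping over `candidate_amounts()` and scoring each adjusted mapping with a happiness model then gives one candidate answer per amount.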
 
Question 6:
If the MaxHappiness Policy was activated globally and in every location on Day 1001 and stayed enacted until Day 1100, in which location should $100 be distributed to each agent on Day 1001, to increase wealth disparity in the world the most by Day 1100?
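
The question does not say how wealth disparity is measured, so any answer should state the metric it uses. The sketch below uses the Gini coefficient purely as an assumption; it is one common inequality measure, not one named by the client, and it presumes non-negative wealth values.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly equal."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n
```

Comparing `gini` of the predicted Day 1100 wealth under each candidate location gives a ranking of locations by the disparity they produce.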
 

Goal of This Challenge:


You are asked to build models and investigation code to fully document the answers to the questions above.  Please do this in a Jupyter notebook.

Jupyter notebook

To make the Jupyter notebook easy to review, please ensure that you meet the following requirements:
  • The data should be loaded from the exact file and folder structure provided in the data pack.
  • Please provide a single, clearly named configuration variable that lets us change the location of the data on reviewer systems. Don’t hard-code the paths to individual files.
  • Please provide comments, in markdown format, in the Jupyter notebook explaining your thought process and what was being explored and why you went in certain directions. 
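
The first two requirements above can be met with one path variable at the top of the notebook and a loader that builds every file path from it. This is a minimal sketch using only the standard library: `DATA_DIR`, its placeholder value, and `read_tsv` are hypothetical names, and a pandas-based loader would follow the same pattern.

```python
import csv
from pathlib import Path

# The single setting a reviewer needs to edit (placeholder value).
DATA_DIR = Path("./data")


def read_tsv(relative_path, data_dir=None):
    """Load a tab-separated file, given its path relative to the data pack root.

    Returns a list of dicts keyed by the column names in the header row.
    """
    base = Path(data_dir) if data_dir is not None else DATA_DIR
    with (base / relative_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

A call such as `read_tsv("RunDataTable-Census.tsv")` then works unchanged on any machine once `DATA_DIR` points at the unpacked data pack.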

Answer document

You should also provide a separate answer document, in addition to the Jupyter notebook(s). Do not leave anything to be assumed here, no matter how trivial. This will be part of the review at the end of the challenge, so the more information you provide, and the better your documentation is, the better your chances of winning will be.

The answer document should clearly describe:
  • Any assumptions you made with regard to the data, and why you decided to make those assumptions
  • How you decided what data to use to answer each of the questions
  • The actual answer to each question, in as detailed a manner as possible.  Don’t be vague here - if the question is asking for a specific group or location, please clearly say the group or location that is your answer
  • How you came up with the answer and the justification behind why you think it’s the best answer to the question.  Please back this up with data obtained via data analysis and through the Jupyter Notebook code.
 
 

Submission

The final submission must include the following items.
  • A Jupyter notebook detailing:
    • How the data from the TSV files is prepared and cleaned
    • How individual answers are analysed

Judging Criteria

Winners will be determined based on the following aspects:
  • Model Effectiveness and data analysis (40%)
    • Your submission will receive a subjective evaluation from the client team.
  • Model Feasibility (40%)
    • How easy is it to understand your assumptions and analysis?
    • How well can your model/approach be applied to other problems?
  • Clarity of the Report (20%)
    • Do you explain your proposed method clearly?
    • Are assumptions correct and is documentation clear and precise?

 

Final Submission Guidelines

Please see above

Review style

Final Review

Community Review Board

Approval

User Sign-Off

ID: 30109230