Challenge Overview
Previously, in Topcoder - Create CronJob For Populating Changed Challenges To Elasticsearch, we created the cronjob for populating changed challenges.
For this challenge, we'd like to follow the same approach to populate the srms and mmatches indexes. Please use the cronjob_for_syncing_challenges_index branch.
1. The job for marathon matches should find the rounds whose registration phase started within the last 180 days (see the query and batching sketch after this list):
registration_segment.start_time > (sysdate - 180 units day)
2. The job for single round matches (srms) should find the rounds whose registration phase started within the last 60 days:
registration_segment.start_time > (sysdate - 60 units day)
3. The job should run on a schedule; the interval should be configurable in the YAML file and overridable by environment variables (a small sketch of the override follows this list).
The following is already implemented in the existing cronjob, so you can learn from that.
4. The service will possibly be deployed on several machines behind a load balancer, so several jobs could run simultaneously. Use a distributed lock to make sure only one cronjob runs at a time; the jobs are identical, so there is no need for them to run at the same time.
You can use Redisson to achieve this (see the lock sketch after this list): https://github.com/redisson/redisson/wiki/8.-Distributed-locks-and-synchronizers
5. It is possible that the query in item 1 returns a big list of challenge ids (like the initial load), so the job should do the update in batches, retrieving a configurable number of ids (for example, 100) each time and updating them in Elasticsearch.
When listing the challenge ids, be sure to use descending order so the newer data is updated first, which is more important for the platform.
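As a rough illustration of items 1, 2 and 5, the sketch below pages through the matching round ids in descending order and indexes them in configurable batches. It assumes plain JDBC against the transactional database; every table and column name except registration_segment.start_time, as well as the indexBatch helper, is a placeholder for illustration, not the actual schema or the existing feeder code.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class RoundIdBatchSync {

    // Illustrative query: rounds whose registration phase started within the
    // window (180 days for marathon matches, 60 days for srms), newest first.
    private static final String ROUND_IDS_SQL =
        "SELECT r.round_id FROM round r "
        + "JOIN registration_segment rs ON rs.round_id = r.round_id "
        + "WHERE rs.start_time > (sysdate - %d units day) "
        + "ORDER BY r.round_id DESC";

    public static void syncRounds(Connection conn, int windowDays, int batchSize) throws Exception {
        List<Long> batch = new ArrayList<>(batchSize);
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(String.format(ROUND_IDS_SQL, windowDays))) {
            while (rs.next()) {
                batch.add(rs.getLong(1));
                if (batch.size() == batchSize) {
                    indexBatch(batch); // push this chunk to Elasticsearch
                    batch.clear();
                }
            }
        }
        if (!batch.isEmpty()) {
            indexBatch(batch); // flush the final partial batch
        }
    }

    private static void indexBatch(List<Long> roundIds) {
        // Placeholder: the real job would load the round data here and
        // bulk-update the mmatches / srms indexes.
    }
}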
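For item 3, the override could look roughly like the snippet below. The variable name SYNC_JOB_INTERVAL_SECONDS is only an example; the configuration mechanism already used by the existing cronjob should be reused rather than this snippet.

public class IntervalConfig {
    // Resolve the job interval: the YAML value is the default and an
    // environment variable (illustrative name) can override it.
    public static long resolveIntervalSeconds(long yamlDefaultSeconds) {
        String override = System.getenv("SYNC_JOB_INTERVAL_SECONDS");
        return override != null ? Long.parseLong(override) : yamlDefaultSeconds;
    }
}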
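For item 4, here is a minimal sketch of guarding the job with a Redisson lock so that an instance that cannot acquire the lock simply skips that run. The lock name, the Redis address, the timeouts and the runSyncJob method are illustrative assumptions; in the real service the Redisson client would be built from the YAML configuration.

import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;
import java.util.concurrent.TimeUnit;

public class SyncJobLockExample {

    public static void main(String[] args) throws InterruptedException {
        Config config = new Config();
        // In the real service the Redis address would come from the YAML configuration.
        config.useSingleServer().setAddress("redis://127.0.0.1:6379");
        RedissonClient redisson = Redisson.create(config);

        RLock lock = redisson.getLock("mm-srm-sync-job-lock");
        // Wait up to 5 seconds for the lock and auto-release it after 10 minutes
        // so a crashed instance cannot hold it forever. If another instance
        // already holds the lock, skip this run - the jobs are identical.
        if (lock.tryLock(5, 600, TimeUnit.SECONDS)) {
            try {
                runSyncJob(); // the actual MM/SRM indexing work
            } finally {
                lock.unlock();
            }
        }
        redisson.shutdown();
    }

    private static void runSyncJob() {
        // placeholder for the real job body
    }
}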
Final Submission Guidelines
- Code Changes
- Verification Steps