Challenge Overview

Welcome to the Pioneer Event Streaming Ideation Challenge

Create an overview document of (i) applicable architectural patterns and (ii) open-source frameworks/libraries that can help us architect and implement our event streaming platform.

Background

We are building a scalable event streaming platform that will provide customers with real-time notifications about events that occurred in the system. The specific use case we are targeting is in the financial sector, but the platform will be designed as a generic event streaming solution. Scalability is a major concern, as the solution will be used to process millions of events daily.

The event streaming platform will consist of two parts:

  1. Producer - aggregates the source data streams and produces the events, and

  2. Consumer - consumes the generated events and provides integration points for downstream applications.

Most of the source data is generated in real time (e.g. Bob sent $5 to Alice), and some is generated during a nightly batch process (e.g. the balance of Bob's account is $10). Regardless of how it is generated, the data is available in Kafka topics and will be used by our producer to send event notifications.

The consumer is a tool that will be installed in a customer's environment; it will pick up events from the producer and store them on the customer's infrastructure (e.g. in flat files).

Customers will be able to subscribe to the event types they are interested in and receive only those events through their consumer instance. Also, since the source data comes from a multi-tenant platform, the producer will need to filter the data for each particular customer.

Event data should be delivered to the consumer within a relatively short time - a few seconds in the common case (no network issues or other edge cases). Each event should be transmitted only once, to avoid flooding the network, and all event notifications must be streamed in the order they happened (out-of-order delivery is not acceptable).

Both the producer and the consumer must tolerate platform downtimes of up to 72 hours (i.e. no events lost in that time) and must handle disaster recovery scenarios (e.g. retransmitting all events from the last 24 hours).

The producer must handle a total load of ~100M events per day and will need to scale appropriately as the number of events increases (note that not all events will need to be streamed - that depends on the event types each customer is subscribed to).

The consumer will need to use minimal infrastructure, as it will be installed in client environments where we will have minimal access, and different clients have very different infrastructure in their data centers. It will define a generic interface so that customers can implement their own event handlers (e.g. sending data to APIs or message queues), and we will implement a default handler that simply stores the events in flat files.

Data seeding will be supported by both the producer and the consumer - e.g. the producer exporting events for a time interval (say, 3 months) into a flat file and the consumer loading those events from that file, to avoid transmitting such a large amount of data over the network.
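
On the consumer side, seeding can reuse the same handler contract, since the exported file is just another event source. A sketch assuming one JSON event per line (matching the flat-file format above):

```python
import json

def seed_from_file(path: str, handler: "EventHandler") -> None:
    """Replay a producer-exported flat file through an event handler."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                handler.handle(json.loads(line))
```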


Task Details

Your task is to research and document the architectural patterns and available frameworks/technologies that we can use to implement an event notification platform that supports the above requirements. Your primary focus should be on the producer/consumer architecture and their interaction - for example: what the producer does with the events (does it store them, and how?), how it scales (does it have to be centralized, or can it be distributed?), how the events are sent to the consumer (API, existing messaging technology, custom protocol, ...), and how the subscriptions are managed. We will focus on the exact architecture for the other requirements in future challenges.

Note that using third-party hosted/cloud/SaaS solutions is NOT allowed (primarily due to data confidentiality requirements) - all tools will need to be supported on local infrastructure. This means that, for example, Firebase isn't an option, while RabbitMQ, Kafka, etc. are fine.

Your submission should be a document with the following sections:

  • General info - your view of the requirements and how they map to existing solutions/frameworks

  • Details of appropriate frameworks/technologies and patterns, with comparisons (using charts/diagrams to showcase the data flow is highly encouraged)

  • Other technologies or frameworks that you considered but that don't satisfy some of the requirements

  • Summary and a recommendation for the platform architecture



Final Submission Guidelines

See above

ELIGIBLE EVENTS:

2021 Topcoder(R) Open

REVIEW STYLE:

Final Review: Community Review Board

Approval: User Sign-Off

ID: 30142913