Challenge Overview

In this challenge series we are looking to develop a file service: a reusable service where a Source system uploads a file over HTTP(S) and in return gets a technical handle (URL). The technical handle is transmitted via Integration to the Destination, which can use the handle to download the file. The service will have a security model that allows the uploader to set access permissions for the uploaded file. Destination services will have the option to register for webhook events for new files.
 

In this challenge we want you to come up with a proposal for the service architecture (frameworks, databases, file storage, event notifications, etc.), finalize the service API design (Swagger), and finalize the database design. No coding is required. You don't need to design the API from scratch: we have sample API endpoints defined (see below), and you should suggest how to support the features missing from the samples and create the final Swagger specification. The service will be implemented using Spring Boot and Apache Camel. Usage of Apache Camel is optional, but it will make our future combined maintenance easier; you can include it in the service architecture where appropriate, or suggest alternatives.

Here is the expected system diagram:


It is up to you to propose where to store the actual files (filesystem, S3, MongoDB, or something else) and which database will keep file metadata, access logs, file access settings, and event notification data, but make sure those choices don't break any of these non-functional requirements:
  • File size shall only be limited by the underlying file-storage system
  • MD5 checksums shall be calculated at file upload
  • MD5 checksums shall be compared if provided at upload
  • File retention (the number of days the file should be kept) can be set by the source system at upload, but can never exceed x days
  • Filenames stored on the underlying storage MUST be decoupled from source filenames (i.e. technically generated)
  • If a file system is used for file storage, folders MUST NOT contain more than 1000 files each (Linux folder operations become progressively slower when NFS shared folders contain more than 1000 files)
  • Easy reconciliation between the database and the files stored on the file system MUST be possible, so that housekeeping can be performed if auto-deletion fails
  • Files whose retention period has expired SHALL be removed automatically
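Several of the requirements above (checksums, the retention cap, generated filenames and folder fan-out) can be illustrated with a small sketch. It is written in Python purely for brevity, although the service itself targets Spring Boot, and it is not a required design. The names `MAX_RETENTION_DAYS`, `md5_of_stream`, `checksum_matches`, `effective_retention_days` and `storage_path` are hypothetical, and the value 30 is only a stand-in for the unspecified "x days".

```python
import hashlib
import io
import uuid
from pathlib import PurePosixPath

MAX_RETENTION_DAYS = 30  # assumption: stand-in for the unspecified "x days" cap


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Hash the upload chunk by chunk so the file never has to fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(calculated, provided):
    """A missing client checksum is accepted; a provided one must match."""
    return provided is None or calculated == provided.strip().lower()


def effective_retention_days(requested):
    """Clamp the uploader's requested retention to the service-wide maximum."""
    if requested is None:
        return MAX_RETENTION_DAYS
    return max(1, min(requested, MAX_RETENTION_DAYS))


def storage_path(file_id):
    """Derive a sharded path from a generated UUID, never from the source filename."""
    hex_id = file_id.hex
    return PurePosixPath(hex_id[0:2], hex_id[2:4], hex_id)


# Example: hash an in-memory upload and derive where it would be stored.
print(md5_of_stream(io.BytesIO(b"hello")))  # 5d41402abc4b2a76b9719d911017c592
print(storage_path(uuid.uuid4()))
```

Two levels of hex prefixes give 65,536 leaf folders, which keeps folders small on average but does not strictly guarantee the 1000-file limit; a counter-based or date-based folder allocation would be needed for a hard guarantee. Because the stored name equals the technical identifier, reconciling the database against the file system reduces to comparing two sets of identifiers.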
The service will support two modes of file sharing:
  • Integration platform - In this scenario the source/sender system uploads a file to the file service and transmits the download information to the destination/receiver system using the integration platform (independent of the file service)
  • Events - In this scenario the source/sender system uploads a file to the file service, and the destination/receiver system is informed by a webhook call after the upload has completed successfully. Stable event delivery should be supported (retry on failure). You should suggest how the event delivery should be implemented (custom jobs, JMS, or something else). Make sure to take this into account when designing the database model: event delivery should be tracked through the database
In both scenarios, file sharing can be one to one (optionally, a "delete after successful download" flag can be set at upload by the source/sender system) or one to many (this will work for the recipient systems within the file-retention period).
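One possible way to track event delivery through the database is a row per event and listener that records the attempt count and the time of the next retry. The sketch below models such a row in Python; the table shape, the `EventDelivery` and `record_attempt` names, the status values, and the limits `MAX_ATTEMPTS` and `BASE_DELAY_SECONDS` are all assumptions made for illustration, not part of the challenge specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MAX_ATTEMPTS = 5          # assumption: give up after five failed calls
BASE_DELAY_SECONDS = 60   # assumption: first retry after one minute


@dataclass
class EventDelivery:
    """Mirrors one row of a hypothetical event_delivery table."""
    event_id: str
    listener_id: str
    status: str = "PENDING"   # PENDING -> DELIVERED or FAILED
    attempts: int = 0
    next_attempt_at: Optional[datetime] = None


def record_attempt(delivery, succeeded, now):
    """Update the row after one webhook call, scheduling a retry if needed."""
    delivery.attempts += 1
    if succeeded:
        delivery.status = "DELIVERED"
        delivery.next_attempt_at = None
    elif delivery.attempts >= MAX_ATTEMPTS:
        delivery.status = "FAILED"
        delivery.next_attempt_at = None
    else:
        # Exponential backoff: 60 s, 120 s, 240 s, ...
        delay = BASE_DELAY_SECONDS * 2 ** (delivery.attempts - 1)
        delivery.next_attempt_at = now + timedelta(seconds=delay)
    return delivery
```

A scheduled job (or a JMS consumer with redelivery, if that route is preferred) would pick up rows that are still PENDING and whose next attempt time has passed. Keeping this state in the database means pending deliveries survive a service restart.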

The security model supported by the service is defined as follows:
  • IntegrationId - an identifier which allows source uploaders to share files with a group of destination downloaders (it will be used to delegate management to teams owning and operating their file integrations)
  • SystemID - Identity of a system (uploader, downloader or event listener)
  • SecurityToken (password in Basic Authentication model) - Token that is used to identify the SystemID within the IntegrationId, issued at system creation. SystemID and security token will be used for authentication
  • Permission - The permission(s) associated with the SystemID (Upload xor (Download and/or EventListener))
  • Download is by default tied to the IntegrationId, and all downloaders have permission to download any file associated with an IntegrationId, but this can be overridden by the source uploader granting download access only to specific SystemIDs within the IntegrationId realm
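The permission rule and the download override described above can be expressed compactly. The following Python sketch is only an illustration of those two rules; the `System` and `StoredFile` types, the function names, and the use of `None` to mean "no override" are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

UPLOAD, DOWNLOAD, EVENT_LISTENER = "Upload", "Download", "EventListener"


def valid_permissions(perms):
    """Upload xor (Download and/or EventListener)."""
    perms = set(perms)
    if not perms or not perms <= {UPLOAD, DOWNLOAD, EVENT_LISTENER}:
        return False
    return perms == {UPLOAD} or UPLOAD not in perms


@dataclass(frozen=True)
class System:
    integration_id: str
    system_id: str
    permissions: FrozenSet[str]


@dataclass(frozen=True)
class StoredFile:
    integration_id: str
    # None means "every downloader in the integration"; a set is the uploader's override.
    allowed_system_ids: Optional[FrozenSet[str]] = None


def can_download(system, stored_file):
    """Apply the default IntegrationId rule, then the optional per-file override."""
    if DOWNLOAD not in system.permissions:
        return False
    if system.integration_id != stored_file.integration_id:
        return False
    allowed = stored_file.allowed_system_ids
    return allowed is None or system.system_id in allowed
```

For example, a file uploaded with `allowed_system_ids=frozenset({"sys-b"})` is visible to `sys-b` only, while a file uploaded without an override is visible to every system in the same integration that holds the Download permission.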

High-level list of endpoints supported by the service (/v1/fileservice/ path prefix):
  • /mgmnt – Manage integrations and systems within each integration
  • /auth – Authentication – gives JWT bearer tokens
  • /upload – file uploads
  • /download - file downloads
  • /event – Event management API (registering for webhook callbacks)
  • /audit – audit logs

Management API
/v1/fileservice/mgmnt
  • GET – a list of JSON {{“integration-id”:”<integrationid>”};….}
  • PUT – JSON {“integration-id”:””;”business-contact”:””;”technical-contact”:””}
    − Creates a new integration realm and the critical contacts
    − Response: 200 ok if all checks out
  • POST – N/A
  • DELETE – N/A
/v1/fileservice/mgmnt/
  • GET – Response: 200 ok + JSON {“integration-id”:””;”business-contact”:””;”technical-contact”:””}
  • PUT – N/A
  • POST – JSON {”business-contact”:””;”technical-contact”:””}
    − Updates the existing integration realm
  • DELETE – Removes the existing integration realm and all associated system IDs, together with their files and event listeners (but not audit data)
/v1/fileservice/mgmnt//clients
  • GET – N/A or a list of JSON {{“client-id”:””};….} 
  • PUT – JSON {“client-id”:”clientid”;”permission”:””;”business-contact”:””;”technical-contact”:””}
    − Creates a new client (system) within the integration realm, with its critical contacts
    − Response: 200 ok + JSON {“security-token”:””}
    − Alternatively, the security token could be mailed automatically to the technical-contact
  • POST – N/A
  • DELETE – N/A
/v1/fileservice/mgmnt//clients/
  • GET – Response: 200 ok + JSON {“client-id”:””;”permission”:”permission”;”business-contact”:””;”technical-contact”:””}
  • PUT – N/A
  • POST – JSON {”business-contact”:””;”technical-contact”:””}
    − Updates the existing client
    − Note: permissions cannot be updated through this endpoint
  • DELETE – Removes the existing client-id, its event listeners and its uploaded files (if it has upload permission), but not audit logs
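Since the security token is returned only once at client creation, one option is to persist just a hash of it. The Python sketch below shows that idea; the function names are hypothetical, and the choice of a 32-byte random token hashed with SHA-256 is an assumption, not something the specification prescribes.

```python
import hashlib
import hmac
import secrets


def hash_token(token):
    """SHA-256 is adequate here because the token is long and random, unlike a password."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_security_token():
    """Return (plaintext token for the one-time response, hash to store in the database)."""
    token = secrets.token_urlsafe(32)
    return token, hash_token(token)


def verify_security_token(presented, stored_hash):
    """Compare in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(hash_token(presented), stored_hash)
```

With this approach a lost token cannot be recovered, only reissued, which may influence whether the "mail the token to the technical contact" alternative is attractive.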
Authentication API
/v1/fileservice/auth 
  • Basic authentication : Username:<IntegrationID>, Password:<SecurityToken>
  • Response: 
    − 200 ok + http-header WWW-Authenticate: Bearer 
    − 301 + redirect back to referrer + http-header WWW-Authenticate: Bearer 
    − The JWT should be encoded with payload { “integration-id”:””;”client-id”:””;”permission”:””}
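To make the authentication flow concrete, the sketch below parses a Basic authentication header and produces an HS256-signed JWT carrying the payload fields named above. It uses only the Python standard library so that the token structure is visible; the function names, the HS256 algorithm, and the shared secret are assumptions, and a real implementation would normally use a maintained JWT library and add an expiry claim.

```python
import base64
import hashlib
import hmac
import json


def parse_basic_auth(header_value):
    """Split 'Basic <base64(username:password)>' into (username, password)."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authentication header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


def _b64url(data):
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_jwt(payload, secret):
    """Build header.payload.signature using HMAC-SHA256 (alg HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


# Example payload with the three fields from the endpoint description.
token = encode_jwt(
    {"integration-id": "int-1", "client-id": "sys-a", "permission": "Upload"},
    b"demo-secret",  # assumption: a symmetric signing key held by the service
)
```

The upload, download and event endpoints would then verify the signature and read the integration, client and permission from the payload instead of querying the database on every call.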
File upload API
/v1/fileservice/upload/
  • GET N/A
  • POST: 
    Mandatory fields
    − Authentication: Bearer 
    − Content-Type: 
    − Content-Length: − 
    Response 
    - http-header: Location: <external url for download>
    - JSON {“download-url-internal”:”<internal download url>”; “download-url-external”:”<external download url>”;”file-expirytimestamp”:”<YYYYMMdd:HHmmss+TZ>”;”size”:”<size in bytes>”;”md5checksum”:”md5checksum”;”technical-fileidentifier”:”<technical file identifier>”}
  • PUT n/a 
  • DELETE: N/A
  • NOTE: The API needs to be updated with a good way of optionally specifying the allowed download SystemIDs and the file retention
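As an illustration of the sample upload response, the following Python sketch assembles the JSON body, including the expiry timestamp in the YYYYMMdd:HHmmss+TZ layout. The function names, the base URLs, and the assumption that the technical file identifier is appended to the download path are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_expiry(uploaded_at, retention_days):
    """Render upload time plus retention in the YYYYMMdd:HHmmss+TZ layout."""
    expiry = uploaded_at + timedelta(days=retention_days)
    return expiry.strftime("%Y%m%d:%H%M%S%z")


def upload_response(file_id, size, md5, uploaded_at, retention_days,
                    internal_base, external_base):
    """Build the response body using the field names from the sample."""
    path = f"/v1/fileservice/download/{file_id}"  # assumption: id is the last path segment
    return {
        "download-url-internal": internal_base + path,
        "download-url-external": external_base + path,
        "file-expirytimestamp": format_expiry(uploaded_at, retention_days),
        "size": str(size),  # the sample shows size as a quoted value
        "md5checksum": md5,
        "technical-fileidentifier": file_id,
    }


example = upload_response(
    "0f8fad5bd9cb469fa16570867728950e", 5, "5d41402abc4b2a76b9719d911017c592",
    datetime(2018, 3, 1, 12, 0, 0, tzinfo=timezone.utc), 7,
    "http://fileservice.internal.example", "https://files.example.com",
)
print(example["file-expirytimestamp"])  # 20180308:120000+0000
```

The external download URL from this body is also what the Location header would carry.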
File download API
/v1/fileservice/download/
  • GET (normal http file download) 
    − http-header: Authenticate: Bearer 
    − Response 200 OK + file 
  • PUT N/A
  • POST N/A
  • DELETE N/A
  • NOTE: The API needs to be updated with a good way of specifying the MD5 checksum and the original filename
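One candidate answer to the note above is to carry both values as response headers: Content-Disposition for the original filename and Content-MD5 for the checksum. The Python sketch below shows how those header values could be built; this is a suggestion only, and the function name is hypothetical. Note that Content-MD5 (RFC 1864) was dropped from the HTTP/1.1 specification in RFC 7231, so a custom header might be preferred.

```python
import base64
from urllib.parse import quote


def download_headers(original_filename, md5_hex, size):
    """Expose the source filename and checksum without using them for storage."""
    # Plain ASCII fallback for old clients; quotes and backslashes are neutralised.
    fallback = original_filename.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace('"', "_").replace("\\", "_")
    # RFC 6266 extended form so non-ASCII filenames survive intact.
    encoded = quote(original_filename, safe="")
    return {
        "Content-Disposition":
            f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}",
        # Content-MD5 carries the base64 of the raw 16-byte digest, not the hex string.
        "Content-MD5": base64.b64encode(bytes.fromhex(md5_hex)).decode("ascii"),
        "Content-Length": str(size),
    }
```

Because the stored filename is technically generated, the original name has to come from the metadata database; it is never derived from the path on disk.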
Event management API
The endpoints below should support the following event types:
  • FileUploadStarted
  • FileUploadCompleted
  • FileUploadInterupted
  • FileDownloadStarted
  • FileDownloadCompleted
  • FileDownloadInterupted
Event payload will have the following content:
  • OriginalFilename
  • FileLength - The file-length or -1 if not known (FileUploadStarted and FileUploadInterupted)
  • EventTimestamp
  • IntegrationId
  • DeleteAfterDownload - Flag stating if the file will be deleted after a successful download
  • FileExperiyTimestamp - The timestamp when the file is no longer available for download 
  • SystemId - The SystemId of the system that triggered the event
  • IPAdress - The IP address of the system that triggered the event
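The payload fields listed above can be assembled as in the following Python sketch. The field names are copied as spelled in the list; the function name, the use of ISO 8601 strings for timestamps, and the choice to leave the event type out of the payload (it does not appear in the field list) are assumptions.

```python
from datetime import datetime, timezone

EVENT_TYPES = {
    "FileUploadStarted", "FileUploadCompleted", "FileUploadInterupted",
    "FileDownloadStarted", "FileDownloadCompleted", "FileDownloadInterupted",
}


def build_event_payload(event_type, original_filename, file_length, event_timestamp,
                        integration_id, delete_after_download, expiry_timestamp,
                        system_id, ip_address):
    """Return the webhook body; an unknown file length is reported as -1."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unsupported event type: {event_type}")
    return {
        "OriginalFilename": original_filename,
        "FileLength": -1 if file_length is None else file_length,
        "EventTimestamp": event_timestamp.isoformat(),
        "IntegrationId": integration_id,
        "DeleteAfterDownload": delete_after_download,
        "FileExperiyTimestamp": expiry_timestamp.isoformat(),
        "SystemId": system_id,
        "IPAdress": ip_address,
    }


payload = build_event_payload(
    "FileUploadStarted", "report.csv", None,
    datetime(2018, 3, 1, 12, 0, tzinfo=timezone.utc), "int-1", False,
    datetime(2018, 3, 8, 12, 0, tzinfo=timezone.utc), "sys-a", "192.0.2.10",
)
```

If listeners need to distinguish event types from the body alone, the final design would have to add a field for it or convey the type some other way, for example in a header.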
/v1/fileservice/event
  • GET − get list of registered event webhooks
    − http-header: Authenticate: Bearer 
    − Response: 200 ok + JSON list {“event-listener-id”:””;…}; the list is filtered to the calling systemid only
    − 403 unauthorized if systemid is not permitted for events
  • PUT: register a new event webhook
    − http-header: Authenticate: Bearer 
    − JSON {“event-call-back-url”:””;”business-contact”:””;””} 
    − Response 200 ok + JSON {“event-listener-id”:””}
  • POST: N/A
  • DELETE: N/A
/v1/fileservice/event/<eventListenerId>
  • GET − get details of the event webhook
    − http-header: Authenticate: Bearer 
    − Response − 200 ok + JSON {“event-call-back-url”:””;”business-contact”:””;””} 
    − 403 unauthorized if systemid is not permitted for events or eventlistenerid is not registered by systemid 
  • PUT: N/A
  • POST: N/A
  • DELETE - remove this webhook
    − http-header: Authenticate: Bearer 
    − Response − 200 ok 
    − 403 unauthorized if systemid is not permitted for events or eventlistenerid is not registered by systemid
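The ownership rules for event listeners (list only your own, and get a 403 when touching someone else's) can be sketched with an in-memory registry. This Python example is illustrative only: the class and method names are hypothetical, a dictionary stands in for the database table, and the `Forbidden` exception stands in for the 403 response.

```python
import uuid


class Forbidden(Exception):
    """Stands in for the 403 responses listed above."""


class EventListenerRegistry:
    def __init__(self):
        # listener id -> {"system-id": ..., "event-call-back-url": ...}
        self._listeners = {}

    @staticmethod
    def _require_event_permission(permissions):
        if "EventListener" not in permissions:
            raise Forbidden("systemid is not permitted for events")

    def register(self, system_id, permissions, callback_url):
        """PUT /event: store the webhook and return its generated id."""
        self._require_event_permission(permissions)
        listener_id = str(uuid.uuid4())
        self._listeners[listener_id] = {
            "system-id": system_id,
            "event-call-back-url": callback_url,
        }
        return listener_id

    def list_for(self, system_id, permissions):
        """GET /event: return only the listeners registered by the caller."""
        self._require_event_permission(permissions)
        return [lid for lid, row in self._listeners.items()
                if row["system-id"] == system_id]

    def remove(self, system_id, permissions, listener_id):
        """DELETE /event/<eventListenerId>: only the owner may remove a webhook."""
        self._require_event_permission(permissions)
        row = self._listeners.get(listener_id)
        if row is None or row["system-id"] != system_id:
            raise Forbidden("eventlistenerid is not registered by systemid")
        del self._listeners[listener_id]
```

Treating an unknown listener id the same as a foreign one avoids revealing which ids exist to callers that do not own them.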

NOTE: Along with the Swagger API specification, please provide brief implementation notes where an endpoint's implementation is not obvious. Just explain how the data flows through the system and which database tables should be updated.
 

Final Submission Guidelines

Submit the service architecture document
Submit the service swagger specification

ELIGIBLE EVENTS:

2018 Topcoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off

ID: 30065443