Challenge Overview
Challenge Objectives
- Target environment: Bitbucket Git Repo
- Basic Requirements: Given a list of strings and repo names, write a script that generates a report of any repos that contain those strings.
Project Background
Our clients work with Bitbucket Git repositories. They want to check whether anyone has committed credentials, such as names and passwords, into the repos, and they need to prevent that from happening.
Technology Stack
Git
Bitbucket API
Scripting languages
- Python
- NodeJS (ES6 or TypeScript)
- Or any other language you like
Individual requirements
Given a list of strings and repo names, write a script that scans all the given repos and generates a report showing any repos that contain those strings.
1. Prepare demo repos and a mock file that contains strings.
Clients won’t provide demo repos for development, so we have to prepare demo repos ourselves.
Please prepare at least 5 repos. Each repo should contain:
- at least 10 commits
- at least 4 pull requests
- at least 2 branches
I suggest transferring some popular open source projects from GitHub to Bitbucket, so that the repos are closer to a production environment.
Here is a tutorial on how to transfer a project from GitHub to Bitbucket: https://befused.com/git/github-bitbucket-move
You should prepare a mock file that contains:
- at least 5 strings
- at least one string that contains a space
- at least one string that contains special characters like !@#$%^&
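A minimal sketch of reading such a mock file, assuming it holds one search string per line in UTF-8. The one-per-line layout and the name load_search_strings are assumptions for illustration, not part of the requirements. Lines are not trimmed, so inner spaces and special characters survive unchanged.

```python
from pathlib import Path


def load_search_strings(path):
    """Read one search string per line, keeping spaces and special characters."""
    strings = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        # Skip blank lines, but never trim or alter a non-blank line.
        if raw.strip():
            strings.append(raw)
    return strings
```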
2. Write a script to scan all repos and generate the report
You should write a script that scans all the demo repos you prepared, checks whether any of them contains any string listed in the mock file, and generates a good-looking HTML report file.
2.1 Script
The check rules are:
- All the given repos are in scope.
- All the branches of a repo are in scope.
- All pull requests are in scope.
- The check supports partial match. For example, given a string “abcde”, if a repo contains a string “xxxabcdefg”, that means the repo contains the string “abcde”.
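The partial-match rule can be sketched as a plain substring test. This is one possible approach, assuming case-sensitive matching (the requirements do not say either way); find_matches is a hypothetical helper name.

```python
def find_matches(line, search_strings):
    """Return every search string that occurs anywhere in the line."""
    # The `in` operator does a literal substring test, so characters such as
    # !@#$%^& need no escaping (unlike a regular expression).
    return [s for s in search_strings if s in line]


print(find_matches("token = xxxabcdefg", ["abcde", "p@ss!"]))  # ['abcde']
```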
The script will run periodically, or after receiving any new commits or pull requests. It should not scan all files every time. Instead, it should:
- scan all files on the first run
- then scan only the files contained in the incremental changes (new commits or new pull requests).
At the first run, the script will clone the repo first, then scan all files.
At subsequent runs, the script will only check the incremental changes (new commits or new pull requests).
Note that you need to persist the state of the script properly between runs.
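One way to persist the state is a small JSON file that records the last scanned commit per repo and branch. The sketch below makes several assumptions: the file name scan_state.json, the state layout, and the helper names are all hypothetical. The returned "old..new" string is the standard Git revision-range syntax, which commands such as git log and git diff accept.

```python
import json
import os

STATE_FILE = "scan_state.json"  # hypothetical file name


def load_state(path=STATE_FILE):
    """Return {repo: {branch: last_scanned_commit}}; empty on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_state(state, path=STATE_FILE):
    """Write to a temporary file, then rename, so a crash cannot leave a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(state, f, indent=2)
    os.replace(tmp, path)


def revision_range(state, repo, branch, head):
    """Return None when a full scan is needed, else an 'old..new' revision range."""
    last = state.get(repo, {}).get(branch)
    if last is None:
        return None  # first run for this branch: scan every file
    return f"{last}..{head}"  # an empty range when nothing is new
```

With this layout, deleting the state file is enough to return the script to its initial state.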
2.2 Report
The report file should be in HTML format. It should look good, and it should open with a double-click, without running any local HTTP server to host it.
I suggest using Bootstrap to make the UI look good, but you are free to choose any other HTML UI library.
The report should include the following items:
- The repo that contains a string listed in the given file. Specifically, it includes:
  - the repo name
  - the remote link
  - the branch name
- The name of the file in the repo that contains the string. Specifically, it includes:
  - the line number
  - the content of the line
  - which string is contained in the line
  - the author of the line
  - the related commit hash (should link to Bitbucket)
  - the related pull requests (if any exist)
- All the items should be organized in a table-like view.
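A table-like report of this kind can be sketched as a single self-contained HTML string. The sketch assumes each finding is a dict with the hypothetical keys listed in COLUMNS plus an optional commit_url, and it loads Bootstrap from a public CDN (the exact version and URL are an assumption). Every value is HTML-escaped so that a matched line containing < or & cannot break the page.

```python
import html

# Assumed CDN location for Bootstrap; pin whichever version you prefer.
BOOTSTRAP_CSS = "https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css"

# Hypothetical field names, one per report column.
COLUMNS = ["repo", "remote", "branch", "file", "line_no", "content",
           "matched", "author", "commit", "pull_requests"]


def render_report(findings):
    """Build a standalone HTML page with one table row per finding."""
    head = "".join(f"<th>{html.escape(c)}</th>" for c in COLUMNS)
    rows = []
    for f in findings:
        cells = []
        for col in COLUMNS:
            value = html.escape(str(f.get(col, "")))
            # Turn the commit hash into a link when a URL was supplied.
            if col == "commit" and f.get("commit_url"):
                url = html.escape(f["commit_url"], quote=True)
                value = f'<a href="{url}">{value}</a>'
            cells.append(f"<td>{value}</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<link rel='stylesheet' href='{BOOTSTRAP_CSS}'>"
        "<title>Scan report</title></head><body class='container'>"
        f"<table class='table table-striped'><thead><tr>{head}</tr></thead>"
        f"<tbody>{''.join(rows)}</tbody></table></body></html>"
    )
```

Writing the returned string to a .html file gives a report that opens with a double-click. Because the stylesheet comes from a CDN, the styling needs an internet connection; inlining the CSS would avoid that.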
Important notes
- Please make sure all your demo repos are accessible by the reviewers, so they can verify your script easily. You can make them publicly accessible or grant access to them in the review phase.
- Reviewers are expected to add new commits, strings, and repos & modify existing strings and repos to test the robustness of the script.
- Please keep in mind the script should handle all the edge cases in
- In this PoC, the script doesn’t need to listen for new commits and pull requests; it can be run multiple times manually.
- Please document how to reset the script to the initial state.
- The script can just run as a CLI tool.
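A command-line entry point along these lines could be sketched with argparse. All option names here (--repos, --strings, --report, --reset) are hypothetical choices for illustration; the requirements only ask for a CLI tool and a documented way to reset.

```python
import argparse


def build_parser():
    """Define the command-line options for the scanner (names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Scan Bitbucket repos for listed strings.")
    parser.add_argument("--repos", required=True,
                        help="file listing the repos to scan")
    parser.add_argument("--strings", required=True,
                        help="file listing the strings to search for")
    parser.add_argument("--report", default="report.html",
                        help="path of the HTML report to write")
    parser.add_argument("--reset", action="store_true",
                        help="discard saved state so the next run scans everything")
    return parser
```

For example, build_parser().parse_args(["--repos", "repos.txt", "--strings", "strings.txt", "--reset"]) yields a namespace whose reset attribute is True.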
Final Submission Guidelines
Please submit all the following items in a zip archive.
- The source code of the script
- A file listing all the demo repositories that you prepared
- A mock file containing the strings
- A detailed README.md showing how to deploy and run your script
- A sample HTML report