I've decided to take on Amazon S3 and have figured out how to build the ultimate object storage service.
Conditions:
- Several servers for storing the chunks and one server to rule them all.
- REST API for uploading and downloading files.
- The file should be split into chunks and stored on different servers.
- Storage servers can be added at any time, but no, you can't remove them.
- The storage load is evenly distributed among the servers.
Refer to the Architecture document for the main line of thinking and the reasons behind the architectural decisions.
This command builds and launches a cluster consisting of one frontend server and ten chunk servers:
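To make the conditions above concrete, here is a minimal Python sketch of splitting a file into chunks and placing each chunk on a different server, preferring the servers that currently hold the least data. It is an illustration under my own assumptions, not the project's actual code: the names `split_into_chunks` and `place_chunks`, the `used_bytes` map, and the "least-loaded first" rule are all hypothetical. The Architecture document remains the source of truth for the real placement logic.

```python
import heapq


def split_into_chunks(data: bytes, parts: int) -> list[bytes]:
    """Split data into `parts` contiguous chunks whose sizes differ by at most one byte."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(len(data), parts)
    chunks, offset = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional byte each.
        size = base + (1 if i < extra else 0)
        chunks.append(data[offset:offset + size])
        offset += size
    return chunks


def place_chunks(chunks: list[bytes], used_bytes: dict[str, int]) -> dict[int, str]:
    """Map chunk index -> server name, one distinct server per chunk.

    Assumption: the frontend knows how many bytes each server already stores.
    The least-loaded servers are picked first, so load evens out over time.
    """
    if len(chunks) > len(used_bytes):
        raise ValueError("need at least one server per chunk")
    targets = heapq.nsmallest(len(chunks), used_bytes, key=used_bytes.get)
    return dict(enumerate(targets))
```

Picking the least-loaded servers also means a freshly added server starts receiving chunks right away, which fits the "servers can be added at any time" condition.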
docker-compose up
Upload a file:
curl -X PUT -F [email protected] 'http://localhost:13090/put'
Download a file:
curl -X GET 'http://localhost:13090/get?uuid=69d973de-c7ba-4856-9e54-773bb0e58546' > example_result.pdf
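If you would rather script the download than call curl, the same GET request can be sketched in Python with only the standard library. The `/get` path and the `uuid` query parameter come from the command above; the function names `download_url` and `download` are hypothetical helpers of mine, and error handling is left out on purpose.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen


def download_url(base_url: str, file_uuid: str) -> str:
    """Build the download URL in the same shape as the curl example."""
    return f"{base_url}/get?{urlencode({'uuid': file_uuid})}"


def download(base_url: str, file_uuid: str, dest_path: str) -> None:
    """Stream the response body to dest_path without loading it all into memory."""
    with urlopen(download_url(base_url, file_uuid)) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

With the cluster running, `download("http://localhost:13090", "69d973de-c7ba-4856-9e54-773bb0e58546", "example_result.pdf")` would do the same job as the curl command.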
Frontend server:
- A set of endpoints for collecting statistics from the frontend server: for example, data distribution across nodes.
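As an example of what such a statistics endpoint might report, the sketch below turns per-node byte counts into each node's share of the total. The function name `distribution` and the `used_bytes` input are assumptions for illustration; how the frontend actually tracks usage is not described here.

```python
def distribution(used_bytes: dict[str, int]) -> dict[str, float]:
    """Return each node's fraction of all stored bytes (0.0 for an empty cluster)."""
    total = sum(used_bytes.values())
    if total == 0:
        return {node: 0.0 for node in used_bytes}
    return {node: used / total for node, used in used_bytes.items()}
```

A perfectly even cluster of ten chunk servers would show 0.1 for every node, so any value far from that points at an imbalance.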
Chunk server:
- Checking for available space before uploading a chunk.
- If an error occurs when writing a chunk to the chunk server, we can try writing this chunk to another chunk server.
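The two chunk server items above could be combined roughly as follows. This is a sketch with loud assumptions: each chunk server is modelled as a local directory, and `has_space` and `write_with_fallback` are hypothetical names. A real chunk server would sit behind a network call, but the retry logic would have the same shape.

```python
import shutil
from pathlib import Path


def has_space(directory: Path, chunk_size: int) -> bool:
    """Check that the filesystem holding `directory` has room for the chunk."""
    return shutil.disk_usage(directory).free >= chunk_size


def write_with_fallback(chunk: bytes, name: str, candidates: list[Path]) -> Path:
    """Try each candidate in order and return the path that accepted the chunk."""
    errors = []
    for directory in candidates:
        try:
            if not has_space(directory, len(chunk)):
                raise OSError(f"not enough free space in {directory}")
            target = directory / name
            target.write_bytes(chunk)
            return target
        except OSError as exc:
            # Remember the failure and move on to the next chunk server.
            errors.append(exc)
    raise OSError(f"all chunk servers failed: {errors}")
```

The candidate order matters: passing the servers sorted by current load keeps the fallback from undoing the even distribution.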
Chunk server and frontend server:
- Health and liveness checks.
- Store file metadata.
- Checksums.
- Replication and data recovery in case of chunk server node failure.
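Checksums and replication fit together naturally: a checksum stored with the metadata tells you which replica of a chunk is still intact. The sketch below assumes SHA-256 and in-memory replicas purely for illustration. The names `chunk_checksum`, `verify_chunk`, and `pick_healthy_replica` are hypothetical, and the choice of hash is mine.

```python
import hashlib
from typing import Optional


def chunk_checksum(chunk: bytes) -> str:
    """Hex SHA-256 digest, computed at upload time and stored with the metadata."""
    return hashlib.sha256(chunk).hexdigest()


def verify_chunk(chunk: bytes, expected: str) -> bool:
    """True if the chunk still matches the checksum recorded at upload time."""
    return chunk_checksum(chunk) == expected


def pick_healthy_replica(replicas: list[bytes], expected: str) -> Optional[bytes]:
    """Return the first replica that passes verification, or None if all are corrupt."""
    for replica in replicas:
        if verify_chunk(replica, expected):
            return replica
    return None
```

Recovery after a node failure then comes down to copying a verified replica to a healthy chunk server.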