Allow mongofs to be mounted as any other file system
Vincent Raman authored and gilles-degols committed Apr 9, 2019
1 parent 365878d commit 78fa872
Showing 3 changed files with 72 additions and 20 deletions.
57 changes: 46 additions & 11 deletions readme.md
@@ -1,23 +1,27 @@
### Requirements:

Tested Operating Systems: Centos 7

Tested MongoDB: 3.6. Should be able to run easily since 3.0. Works with sharding and replica set.

Tested Python version: 3.4/3.6.

### General information

Mount a Mongo database as a FUSE file system. The purpose of this implementation is the following:

1. Scaling is limited only by your MongoDB installation.
2. Avoid limitations of basic file systems by allowing the following:
- You can put millions of files in the same directory
- Infinite hierarchy
- Automatic redundancy
- Automatic compression
- Easier setup than HDFS (and more appropriate for small files)
- Faster creation/deletion of millions of files
- No "advanced" problems that you could find with inodes, ... once you start to have millions of files.

Features development:

- [x] Directory: creation, deletion, listing of files
- [x] File: creation, writing, reading, deletion
- [x] Symbolic link: creation, deletion
@@ -37,30 +41,32 @@ What is not possible or recommended with MongoFS:

1. Any limitation related to the underlying FUSE file system, and the fusepy library on top of it, which includes:

1.1. Hard links: https://github.com/libfuse/libfuse/issues/79

2. Expecting a write speed of 100MB/s. There is an overhead to decode the data and then store it in MongoDB. However, the slowest part is fusepy, so we cannot improve the current code much further. Check the benchmarks below if you want some numbers.

### Installation guide

1. Install the different packages

```
yum -y install https://github.com/gilles-degols/mongofs/releases/download/v1.2.2/mongofs-1.2.2-0.noarch.rpm
```

2. Mount the file system with the default parameters in /etc/mongofs/mongofs.json

For more information about the configuration parameters, check the appropriate section below.

```
sudo mongofs-mount /mnt/data
```


### Developer's guide

1. First installation

We assume that you already have a MongoDB installation, otherwise, follow the procedure described here: https://docs.mongodb.com/manual/installation/

```
git clone git@github.com:gilles-degols/mongofs.git
yum -y install python36 fuse fuse-libs
@@ -87,17 +93,19 @@ python3 -m unittest -v test.core.test_GenericFile.TestGenericFile.test_basic_sav

By default the configuration file is in /etc/mongofs/mongofs.json. You can give an alternative path directly on the command line,
as the first argument, before the mount point.

```
mkdir -p /mnt/data
python3 -m src.main /mnt/data
# With a specific configuration filepath (absolute or relative)
python3 -m src.main /mnt/data conf/mongofs.json
python3 -m src.main conf/mongofs.json /mnt/data
```

4. Troubleshooting

If there was a problem during the mount, the mounting directory might have some problems (impossible to delete it, re-use it, ...):

```
rmdir /mnt/data
# rmdir: failed to remove ‘/mnt/att’: Device or resource busy
@@ -109,31 +117,60 @@ rmdir /mnt/data
5. Create a new release

Add a tag for your version, and push it to the remote Git repository.

```
git tag -a v1.2.2 -m "Version 1.2.2"
git push origin v1.2.2
```

Set up your environment:

```
yum install -y rpm-build
mkdir -p ~/rpmbuild/{BUILD,BUILDROOT,RPMS,SOURCES,SPECS,SRPMS}
```

Generate a new archive containing all the sources from GitHub in the appropriate directory, then generate the RPM.

```
cp -r mongofs mongofs-1.2.2
tar -zcvf ~/rpmbuild/SOURCES/mongofs-1.2.2.tar.gz mongofs-1.2.2
rpmbuild -ba mongofs-1.2.2/spec/mongofs.spec
```

6. Benchmarks
6. Mount mongofs with package installed

Once the package is installed, mongofs is recognized as its own mount type. You can just run

```
mount -t mongofs <config-file> <mount-point> -o <mount-options>
```

You could also create a systemd mount unit. Don't forget to name your unit file after your mount point (e.g. `/usr/lib/systemd/system/mnt-mongofs.mount`)

```
[Unit]
Description=M³ mongofs mounting point
Requires=mongod.service
After=mongod.service
[Mount]
What=/etc/mongofs/mongofs.json
Where=<mounting-point>
Type=mongofs
[Install]
WantedBy=multi-user.target
```
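
The naming rule mentioned above (unit file named after the mount point) can be sketched in Python. This is a simplified approximation of systemd's path escaping, not part of mongofs: the function name `mount_unit_name` is hypothetical, and it does not normalize `.` or `..` path components. On a real system, `systemd-escape -p --suffix=mount <mount-point>` is the authoritative way to get the name.

```python
import string


def mount_unit_name(mount_point):
    """Return the systemd .mount unit name for an absolute mount point.

    Hypothetical helper: a simplified version of systemd's path escaping.
    """
    if not mount_point.startswith('/'):
        raise ValueError('mount point must be an absolute path')

    # Drop leading/trailing slashes and collapse duplicate slashes.
    parts = [p for p in mount_point.split('/') if p]
    if not parts:
        return '-.mount'  # the root file system

    # Characters kept as-is; everything else becomes \xNN (including "-").
    allowed = set(string.ascii_letters + string.digits + ':_.')
    escaped_parts = []
    for part in parts:
        out = []
        for byte in part.encode('utf-8'):
            ch = chr(byte)
            out.append(ch if ch in allowed else '\\x%02x' % byte)
        escaped_parts.append(''.join(out))

    # Path separators are written as dashes in the unit name.
    name = '-'.join(escaped_parts)
    if name.startswith('.'):
        name = '\\x2e' + name[1:]
    return name + '.mount'


if __name__ == '__main__':
    print(mount_unit_name('/mnt/mongofs'))
```

With this sketch, a mount point of `/mnt/mongofs` maps to `mnt-mongofs.mount`, which is the file name used in the example above. A dash inside a directory name is escaped, so `/mnt/my-data` would need a unit named `mnt-my\x2ddata.mount`.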

7. Benchmarks

If you want to test the performance of MongoFS versus your file system, you can easily test the handling of big files. The numbers given below were generated on a VM with 3GB of RAM & 4vCPU hosted on a Desktop computer with 16GB of RAM & SSD. So the reading speed is not strictly related to the disk as everything might be in cache on the OS level.
Benchmarking is a difficult subject, so be careful when you compare numbers.
Be aware that FUSE + fusepy (library used in mongofs) is in fact the slowest part of writing, and unfortunately we cannot improve the performance a lot more until it is improved on their side.

```
python3 -m src.main /mnt/data conf/mongofs.json
python3 -m src.main conf/mongofs.json /mnt/data
# Test local file system
yes "a" | dd of=output.dat bs=4k count=2500000 iflag=fullblock && time cat output.dat > /dev/null
@@ -166,5 +203,3 @@ Default configuration parameters can be seen in conf/mongofs.json, every one of
11. host: Current hostname of the machine itself (so, it should be unique), to manage file locks.
12. lock.access_attempt_s: Number of seconds we try to access a locked file before giving up and returning an error to the client. Put 0 for infinity.
13. lock.timeout_s: Maximum number of seconds we consider a lock valid. To avoid a deadlock if a server is down, we delete the lock after that amount of time. Put 0 for infinity.
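
The semantics of the two lock parameters above can be illustrated with a small Python sketch. It is not the mongofs implementation: the function names, the polling approach, and the `poll_interval_s` parameter are assumptions made for illustration. It only shows how "0 means infinity" might be handled for both settings.

```python
import time


def lock_is_valid(lock_created_at, timeout_s, now=None):
    """Hypothetical check for lock.timeout_s: 0 means the lock never expires."""
    now = time.time() if now is None else now
    if timeout_s == 0:
        return True
    return (now - lock_created_at) < timeout_s


def wait_for_unlock(is_locked, access_attempt_s, poll_interval_s=0.1,
                    clock=time.monotonic, sleep=time.sleep):
    """Hypothetical wait loop for lock.access_attempt_s.

    Polls is_locked() until the file is free. Returns True once it is free,
    or False when access_attempt_s seconds have elapsed (0 waits forever).
    """
    deadline = None if access_attempt_s == 0 else clock() + access_attempt_s
    while is_locked():
        if deadline is not None and clock() >= deadline:
            return False  # the caller would return an error to the client
        sleep(poll_interval_s)
    return True


if __name__ == '__main__':
    # A lock created 45 seconds ago with a 30 second timeout is stale.
    print(lock_is_valid(time.time() - 45, timeout_s=30))
```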


8 changes: 5 additions & 3 deletions spec/mongofs.spec
@@ -43,15 +43,16 @@ getent passwd mongofs >/dev/null || useradd -r -g mongofs -d / -s /sbin/nologin
rm -rf ${RPM_BUILD_ROOT}
install -d -m 0755 ${RPM_BUILD_ROOT}/usr/lib/mongofs
install -d -m 0755 ${RPM_BUILD_ROOT}/usr/bin
install -d -m 0755 ${RPM_BUILD_ROOT}/usr/sbin
cp -r src/ ${RPM_BUILD_ROOT}/usr/lib/mongofs

# Security to avoid creating an rpm with invalid end-of-line
find ${RPM_BUILD_ROOT}/usr/lib/mongofs/src -type f -print0 | xargs -0 dos2unix

install -D -m 0644 conf/mongofs.json ${RPM_BUILD_ROOT}/etc/mongofs/mongofs.json
install -D -m 0644 run ${RPM_BUILD_ROOT}/usr/lib/mongofs/run
/usr/bin/ln -s /usr/lib/mongofs/run ${RPM_BUILD_ROOT}/usr/bin/mongofs-mount
chmod +x ${RPM_BUILD_ROOT}/usr/lib/mongofs/run
install -D -m 0755 run ${RPM_BUILD_ROOT}/usr/lib/mongofs/run
/usr/bin/ln -s /usr/lib/mongofs/run ${RPM_BUILD_ROOT}/usr/bin/mongofs-mount
/usr/bin/ln -s /usr/lib/mongofs/run ${RPM_BUILD_ROOT}/usr/sbin/mount.mongofs

%define VPATH ${RPM_BUILD_ROOT}/usr/lib/mongofs/environment
%define REQUIREMENTS_PATH requirements.txt
@@ -85,6 +86,7 @@ rm -rf ${RPM_BUILD_ROOT}
%defattr(-,root,root)
/usr/lib/mongofs
/usr/bin/mongofs-mount
/usr/sbin/mount.mongofs
%config /etc/mongofs/mongofs.json

%changelog
27 changes: 21 additions & 6 deletions src/main.py
@@ -275,14 +275,28 @@ def flush(self, path, fh):

if __name__ == '__main__':
    if len(argv) < 2:
        print('usage: %s <mountpoint> (<configuration_filepath>)' % argv[0])
        print('usage: %s (<configuration_filepath>) <mountpoint> (-o <fuse_mount_options>)' % argv[0])
        exit(1)

    if len(argv) == 3:
        configuration_filepath = argv[2]
        Configuration.FILEPATH = configuration_filepath
    fuse_options = {}

    mounting_point = str(argv[1])
    mongofs_argv_size = len(argv)

    if argv.count('-o') > 0:
        # let's construct a dictionary for fuse options
        mongofs_argv_size = argv.index('-o')
        for fuse_opt_arg in argv[mongofs_argv_size + 1].split(','):
            fuse_opt_arg_parts = fuse_opt_arg.split('=')
            if (len(fuse_opt_arg_parts) == 1):
                fuse_opt_arg_parts.append(True)
            fuse_options[fuse_opt_arg_parts[0]] = fuse_opt_arg_parts[1]

    if mongofs_argv_size >= 3:
        configuration_filepath = argv[1]
        Configuration.FILEPATH = configuration_filepath
        mounting_point = str(argv[2])

    if not mounting_point.startswith('/'):
        mounting_point = os.getcwd() + '/' + mounting_point

@@ -303,7 +317,8 @@ def flush(self, path, fh):
    configuration = Configuration()
    if configuration.is_development():
        logging.basicConfig(level=logging.DEBUG)
        fuse = FUSE(MongoFS(), mounting_point, foreground=True, nothreads=True, allow_other=allow_other)
        fuse = FUSE(MongoFS(), mounting_point, foreground=True, nothreads=True, allow_other=allow_other, **fuse_options)
    else:
        logging.basicConfig(level=logging.ERROR)
        fuse = FUSE(MongoFS(), mounting_point, foreground=False, nothreads=False, allow_other=allow_other)
        fuse = FUSE(MongoFS(), mounting_point, foreground=False,
                    nothreads=False, allow_other=allow_other, **fuse_options)
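
The argument handling added in this hunk can be sketched as a standalone function for easier testing. This is an illustration under assumptions, not the committed code: the name `parse_mount_argv` is hypothetical, it raises an error on a trailing `-o` without a value, and it uses `str.partition`, so a value containing `=` is kept whole (the committed code uses `split('=')` and keeps only the second field).

```python
def parse_mount_argv(argv):
    """Split argv into (config_path, mount_point, fuse_options).

    Hypothetical helper following the layout used above:
    prog [config_file] mount_point [-o opt1,opt2=value]
    """
    fuse_options = {}
    positional = list(argv[1:])

    if '-o' in positional:
        o_index = positional.index('-o')
        # Guard against "-o" being the last argument.
        if o_index + 1 >= len(positional):
            raise ValueError("'-o' requires a comma-separated option list")
        for item in positional[o_index + 1].split(','):
            if not item:
                continue
            key, sep, value = item.partition('=')
            # A bare flag such as "ro" becomes True, "uid=1000" keeps its value.
            fuse_options[key] = value if sep else True
        # Everything before "-o" is treated as positional arguments.
        positional = positional[:o_index]

    if len(positional) >= 2:
        config_path, mount_point = positional[0], positional[1]
    elif len(positional) == 1:
        config_path, mount_point = None, positional[0]
    else:
        raise ValueError('a mount point is required')
    return config_path, mount_point, fuse_options


if __name__ == '__main__':
    print(parse_mount_argv(['mount.mongofs', '/etc/mongofs/mongofs.json',
                            '/mnt/data', '-o', 'ro,uid=1000']))
```

The resulting dictionary has the same shape as the `fuse_options` passed to `FUSE(...)` via `**fuse_options` in the hunk above: flags map to `True` and `key=value` options map to strings.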
