Merge pull request openvinotoolkit#1670 from eaidova/ea/nuy_fcrn

added nyu fcrn depth prediction model
DariaMityagina · Oct 20, 2020 · 350702f
2 parents d20e1b4 + ab0712f
commit 350702f

Showing 8 changed files with 182 additions and 1 deletion.
diff --git a/demos/python_demos/monodepth_demo/models.lst b/demos/python_demos/monodepth_demo/models.lst
@@ -1,2 +1,3 @@
 # This file can be used with the --list option of the model downloader.
+fcrn-dp-nyu-depth-v2-tf
 midasnet
diff --git a/demos/python_demos/monodepth_demo/monodepth_demo.py b/demos/python_demos/monodepth_demo/monodepth_demo.py
@@ -73,7 +73,7 @@ def main():
 
     # processing output blob
     log.info("processing output blob")
-    disp = res[out_blob][0]
+    disp = np.squeeze(res[out_blob][0])
 
     # resize disp to input resolution
     disp = cv2.resize(disp, (input_width, input_height), cv2.INTER_CUBIC)
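
The change above wraps the first batch item in `np.squeeze`, which drops any remaining singleton axes before the map is resized. The diff does not say which model outputs carry extra axes, so the 4-D shape below is purely an assumption for illustration, and `to_depth_map` is a hypothetical helper, not part of the demo.

```python
import numpy as np


def to_depth_map(output: np.ndarray) -> np.ndarray:
    """Take the first batch item and drop any remaining singleton axes."""
    disp = np.squeeze(output[0])
    if disp.ndim != 2:
        raise ValueError(f"expected a 2-D map, got shape {disp.shape}")
    return disp


# Shape documented for this model: (B, H, W) = (1, 128, 160).
documented = to_depth_map(np.zeros((1, 128, 160), dtype=np.float32))

# Hypothetical output with an extra singleton axis (assumption, not from the diff).
extra_axis = to_depth_map(np.zeros((1, 1, 128, 160), dtype=np.float32))

print(documented.shape, extra_axis.shape)
```

For an output that is already `(1, H, W)`, `np.squeeze` is a no-op after indexing, so the change should be harmless for models that did not need it.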

diff --git a/models/public/fcrn-dp-nyu-depth-v2-tf/accuracy-check.yml b/models/public/fcrn-dp-nyu-depth-v2-tf/accuracy-check.yml
@@ -0,0 +1,20 @@
+models:
+  - name: fcrn-dp-nyu-depth-v2-tf
+    launchers:
+      - framework: dlsdk
+        adapter: mono_depth
+    datasets:
+      - name: NYU_Depth_V2
+        preprocessing:
+          - type: resize
+            use_pillow: true
+            dst_height: 228
+            dst_width: 304
+        postprocessing:
+          - type: resize_prediction_depth_map
+        metrics:
+          - type: rmse
+          - type: log10_error
+            name: log10
+          - type: mape
+            name: rel
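
The three metrics configured above (`rmse`, `log10_error` reported as `log10`, and `mape` reported as `rel`) can be sketched in plain Python. This is a rough illustration, not the Accuracy Checker implementation: averaging over pixels and skipping non-positive depths are assumptions, and `depth_metrics` is a hypothetical name.

```python
import math


def depth_metrics(gt, pred):
    """Compute RMSE, mean log10 error and mean relative error over flat sequences."""
    # Assumption: pixels with non-positive depth are invalid and skipped,
    # which also keeps log10 and the division by gt well defined.
    pairs = [(g, p) for g, p in zip(gt, pred) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    log10 = sum(abs(math.log10(g) - math.log10(p)) for g, p in pairs) / n
    rel = sum(abs(g - p) / g for g, p in pairs) / n
    return {"rmse": rmse, "log10": log10, "rel": rel}


print(depth_metrics([2.0, 4.0], [1.0, 4.0]))
```

All three are errors, so lower values are better.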
diff --git a/models/public/fcrn-dp-nyu-depth-v2-tf/fcrn-dp-nyu-depth-v2-tf.md b/models/public/fcrn-dp-nyu-depth-v2-tf/fcrn-dp-nyu-depth-v2-tf.md
@@ -0,0 +1,111 @@
+# fcrn-dp-nyu-depth-v2-tf
+
+## Use Case and High-Level Description
+
+This is a model for monocular depth estimation trained on the NYU Depth V2 dataset,
+  as described in the paper [Deeper Depth Prediction with Fully Convolutional Residual Networks](https://arxiv.org/abs/1606.00373), where it is referred to as ResNet-UpProj. 
+  The model input is a single color image.
+  The model output is an inverse depth map that is defined up to an unknown scale factor. More details can be found in the [following repository](https://github.com/iro-cp/FCRN-DepthPrediction).
+
+
+## Specification
+
+| Metric            | Value         |
+|-------------------|---------------|
+| Type              | Monodepth     |
+| GFLOPs            | 63.5421       |
+| MParams           | 34.5255       |
+| Source framework  | TensorFlow\*  |
+
+## Accuracy
+
+| Metric | Value |
+| ------ | ----- |
+| [RMSE](https://en.wikipedia.org/wiki/Root-mean-square_deviation)   | 0.573 |
+| log10  | 0.055 |
+| rel    | 0.127 |
+
+Accuracy numbers were obtained on the NYU Depth V2 dataset.
+The `log10` metric is the logarithmic absolute error, defined as `abs(log10(gt) - log10(pred))`,
+where `gt` is the ground truth depth map and `pred` is the predicted depth map.
+The `rel` metric is the relative absolute error, defined as the absolute error normalized by the ground truth depth map values
+(`abs(gt - pred) / gt`, where `gt` is the ground truth depth map and `pred` is the predicted depth map).
+
+
+## Input
+
+### Original Model
+
+Image, name - `Placeholder`, shape - `1,228,304,3`, format is `B,H,W,C` where:
+
+- `B` - batch size
+- `C` - channel
+- `H` - height
+- `W` - width
+
+Channel order is `RGB`.
+
+### Converted Model
+
+Image, name - `Placeholder`, shape - `1,3,228,304`, format is `B,C,H,W` where:
+
+- `B` - batch size
+- `C` - channel
+- `H` - height
+- `W` - width
+
+Channel order is `BGR`.
+
+## Output
+
+### Original Model
+
+Inverse depth map, name - `ConvPred/ConvPred`, shape - `1,128,160`, format is `B,H,W` where:
+
+- `B` - batch size
+- `H` - height
+- `W` - width
+
+Inverse depth map is defined up to an unknown scale factor.
+
+### Converted Model
+
+Inverse depth map, name - `ConvPred/ConvPred`, shape - `1,128,160`, format is `B,H,W` where:
+
+- `B` - batch size
+- `H` - height
+- `W` - width
+
+Inverse depth map is defined up to an unknown scale factor.
+
+## Legal Information
+
+The original model is released under the following [license](https://raw.githubusercontent.com/iro-cp/FCRN-DepthPrediction/master/LICENSE):
+
+```
+Copyright (c) 2016, Iro Laina
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright notice, this
+  list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+```
+
+[*] Other names and brands may be claimed as the property of others.
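
The input layouts documented above differ between the original model (`B,H,W,C`, RGB) and the converted model (`B,C,H,W`, BGR). A minimal sketch of preparing a blob for the converted model follows, assuming the frame is already a `228x304` BGR array in `H,W,C` order (the order OpenCV uses when reading images); `to_model_input` is a hypothetical helper, and resizing is left out.

```python
import numpy as np

INPUT_HEIGHT, INPUT_WIDTH = 228, 304  # from the documented input shape


def to_model_input(image_hwc_bgr: np.ndarray) -> np.ndarray:
    """Convert an H,W,C image into a 1,C,H,W blob without touching channel order."""
    expected = (INPUT_HEIGHT, INPUT_WIDTH, 3)
    if image_hwc_bgr.shape != expected:
        raise ValueError(f"expected shape {expected}, got {image_hwc_bgr.shape}")
    chw = image_hwc_bgr.transpose(2, 0, 1)  # H,W,C -> C,H,W
    return np.expand_dims(chw, axis=0)      # add the batch axis


frame = np.zeros((INPUT_HEIGHT, INPUT_WIDTH, 3), dtype=np.uint8)
print(to_model_input(frame).shape)
```

No channel swap is needed in this sketch because the converted model is documented as taking BGR input.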
diff --git a/models/public/fcrn-dp-nyu-depth-v2-tf/model.yml b/models/public/fcrn-dp-nyu-depth-v2-tf/model.yml
@@ -0,0 +1,38 @@
+# Copyright (c) 2020 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+description: >-
+  FCRN ResNet50 UpProj is a model for monocular depth estimation trained on the NYU Depth V2 dataset,
+  as described in the paper "Deeper Depth Prediction with Fully Convolutional Residual Networks"
+  <https://arxiv.org/abs/1606.00373>.
+  The model input is a blob that consists of a single image of "1x228x304x3" in RGB order.
+  The model output is an inverse depth map that is defined up to an unknown scale factor.
+task_type: monocular_depth_estimation
+framework: tf
+files:
+  - name: NYU_FCRN-checkpoint.zip
+    size: 472588519
+    sha256: 9d97ed165c4a5b3f085eb83b8814de1e883c6348da60da4b2568ddd64bb2d5c4
+    source: http://campar.in.tum.de/files/rupprecht/depthpred/NYU_FCRN-checkpoint.zip
+postprocessing:
+  - $type: unpack_archive
+    format: zip
+    file: NYU_FCRN-checkpoint.zip
+model_optimizer_args:
+  - --input=Placeholder
+  - --reverse_input_channels
+  - --input_shape=[1,228,304,3]
+  - --output=ConvPred/ConvPred
+  - --input_meta=$dl_dir/NYU_FCRN.ckpt.meta
+license: https://raw.githubusercontent.com/iro-cp/FCRN-DepthPrediction/master/LICENSE
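
The `size` and `sha256` fields in the `files` entry above allow a downloaded archive to be checked before it is unpacked. The sketch below shows one way such a check could be done with the standard library; it is not the model downloader's own code, and `verify_file` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def verify_file(path, expected_size, expected_sha256, chunk_size=1 << 20):
    """Return True if the file matches both the expected size and SHA-256 digest."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a large file.
    if path.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as stream:
        # Read in chunks so a multi-hundred-megabyte archive is never fully in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()
```

With the values from this file, the call would be `verify_file("NYU_FCRN-checkpoint.zip", 472588519, "9d97ed165c4a5b3f085eb83b8814de1e883c6348da60da4b2568ddd64bb2d5c4")`, where the local path is an assumption about where the archive was saved.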
diff --git a/models/public/index.md b/models/public/index.md
@@ -170,6 +170,7 @@ Since this task contains - in the general setting - some ambiguity, the resultin
 | Model Name                  | Implementation | OMZ Model Name                | Accuracy | GFlops    | mParams |
 | --------------------------- | -------------- | ----------------------------- | -------- | --------- | ------- |
 | midasnet                    | PyTorch\*      | [midasnet](./midasnet/midasnet.md)| 7.5878| 207.4915  |    104.0814     |
+| FCRN ResNet50-Upproj          | TensorFlow\*   | [fcrn-dp-nyu-depth-v2-tf](./fcrn-dp-nyu-depth-v2-tf/fcrn-dp-nyu-depth-v2-tf.md)| 0.573 | 63.5421 | 34.5255 |
 
 ## Image Inpainting
 

diff --git a/tools/accuracy_checker/configs/fcrn-dp-nyu-depth-v2-tf.yml b/tools/accuracy_checker/configs/fcrn-dp-nyu-depth-v2-tf.yml
@@ -0,0 +1 @@
+../../../models/public/fcrn-dp-nyu-depth-v2-tf/accuracy-check.yml
diff --git a/tools/accuracy_checker/dataset_definitions.yml b/tools/accuracy_checker/dataset_definitions.yml
@@ -957,3 +957,12 @@ datasets:
       - type: pad_with_eos
         eos_index: 1
         sequence_len: 192
+
+  - name: NYU_Depth_V2
+    data_source: nyudepthv2/val/converted/images
+    additional_data_source: nyudepthv2/val/converted/depth
+    annotation_conversion:
+      converter: nyu_depth_v2
+      images_dir: nyudepthv2/val/converted/images
+      depth_map_dir: nyudepthv2/val/converted/depth
+      data_dir: nyudepthv2/val/official