
Inference uses too much GPU memory #14

Open
ivannson opened this issue Feb 7, 2020 · 3 comments

ivannson commented Feb 7, 2020

Hi,

I have trained a model on my own data and am now trying to run inference. When running infer_img.py on a single image (640x480), I see my GPU (GeForce GTX 1080) usage jump to 7.5 GB, which seems excessive for one image. Is this expected behaviour?

Do you have any suggestions for decreasing GPU usage during inference? I only have 8 GB of GPU memory and need to run a simulation, as well as a couple of other inference scripts, at the same time as the data comes in.

I would also just like to say thanks for open sourcing your work. From what I've seen, this is one of the best detection/segmentation projects out there in terms of code quality and readability, as well as good explanations of how to get everything working.

tano297 (Contributor) commented Feb 10, 2020

Hi,

Thanks for the props :)
Is the GPU consumption peaking at the beginning, or is it that high the whole time during inference? Which model are you using?
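
One quick way to check this (a minimal sketch outside the repo's scripts; model and batch are placeholders for whatever infer_img.py builds internally) is to read PyTorch's own memory counters around a single forward pass:

import torch

# Placeholders: `model` and `batch` stand in for whatever infer_img.py builds.
def report_memory(model, batch, device="cuda"):
    # Reset the peak counter so only this forward pass is measured.
    torch.cuda.reset_max_memory_allocated(device)
    with torch.no_grad():
        _ = model(batch.to(device))
    current = torch.cuda.memory_allocated(device) / 2**20   # MiB held right now
    peak = torch.cuda.max_memory_allocated(device) / 2**20  # MiB at the worst point
    print("current: %.0f MiB, peak: %.0f MiB" % (current, peak))

Note that nvidia-smi will read higher than these counters, since it also includes the CUDA context and PyTorch's caching allocator.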

ivannson (Author) commented

Hi,

I trained a segmentation model on my own dataset based on the coco config file (mobilenetsV2). I then ran the infer_img.py script to get predictions for one image (for some reason it wasn't working when I pointed it to a folder of images). The GPU usage increased very quickly to 7 GB and then dropped back to 0 once inference had finished.

I have also trained another model on a similar dataset, but using the cityscapes config file (ERFNet). Running inference with this model only used around 2.7 GB of GPU memory.

I have also been adapting the inference script to handle multiple images, and when I run that, the GPU usage is a lot lower, around 1.5 GB I think. I'm not sure why that happens.

tano297 (Contributor) commented Feb 17, 2020

Hi,

It may be that at the beginning of inference cuDNN is trying lots of different convolution strategies, and some of them use a lot of memory.
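
If the cuDNN autotuner turns out to be the cause, one way to test that hypothesis (a sketch, not something the scripts do by default) is to turn benchmark mode off before running inference:

import torch

# Disabling the cuDNN autotuner keeps it from trying several convolution
# algorithms (some with large temporary workspaces) on the first forward
# pass, at the cost of possibly slower convolutions afterwards.
torch.backends.cudnn.benchmark = False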

Another possibility (which I have fixed in our internal version) is this line.

You may want to change it to:

with torch.no_grad():
  _, skips = self.backbone(stub)

That line is there to profile the backbone and see the size of the skip connections. It allows me to adapt the decoder to any type of encoder, which makes backbone and decoder design significantly easier. However, without the no_grad guard, that call also saves every activation as if you needed it for training, which is not desired. Give this a shot and let me know if it still spikes; it is likely the culprit.
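
For context, the profiling pattern with the guard in place looks roughly like the following self-contained sketch (a toy backbone stands in for the real one in the repo; only the shapes of the skip tensors are kept):

import torch
import torch.nn as nn

# Toy encoder standing in for the real backbone: returns features plus skips.
class TinyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Conv2d(3, 16, 3, stride=2, padding=1)
        self.stage2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)

    def forward(self, x):
        s1 = self.stage1(x)
        s2 = self.stage2(s1)
        return s2, [s1, s2]

backbone = TinyBackbone().eval()
stub = torch.zeros(1, 3, 480, 640)  # dummy input, only used to measure shapes

# Profile the skip-connection sizes without building the autograd graph,
# so no activations are kept alive after this call returns.
with torch.no_grad():
    _, skips = backbone(stub)

skip_dims = [s.shape[1] for s in skips]  # channel counts used to size the decoder
print(skip_dims)  # [16, 32]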

BTW, the infer_img.py script can already take multiple images.

For example:

./infer_img.py -i /path/to/all/images/*.png

would infer all images in that directory.

As a final comment, I suggest you give the TensorRT runtime a try. It will make your model WAY faster.
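
If you go the TensorRT route, the usual first step is exporting the trained network to ONNX. A minimal sketch, assuming a standard PyTorch model (the repo has its own export tooling, so treat the names and file paths here as placeholders):

import torch

# Assumptions: `model` is the trained segmentation network and the input
# matches the 640x480 images mentioned above.
model = model.eval().cuda()
dummy = torch.zeros(1, 3, 480, 640, device="cuda")

torch.onnx.export(
    model, dummy, "segmentation.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=11,
)
# The resulting .onnx file can then be parsed by TensorRT (for example with
# the trtexec tool) to build a faster inference engine.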
