This repository has been archived by the owner on May 1, 2020. It is now read-only.

Wrong code! #1

Open
Guocode opened this issue Apr 30, 2019 · 6 comments

Comments

@Guocode

Guocode commented Apr 30, 2019

num = torch.sum(torch.mul(predict, target)) + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth

The addition of smooth is done on every index! It should be like this:

self.smooth = 1
...
num = torch.sum(torch.mul(predict, target)) + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth
...
return 1 - num/den

@hubutui
Owner

hubutui commented Apr 30, 2019

Emm, a typo? I don't see any differences.

@Guocode
Author

Guocode commented Apr 30, 2019

@hubutui
Sorry, it's your code:

        num = torch.sum(torch.mul(predict, target), dim=1) + self.smooth
        den = torch.sum(predict.pow(self.p) + target.pow(self.p), dim=1) + self.smooth

In PyTorch, "+" between a tensor and a scalar is applied at every index, so if you then sum up num, the numerator becomes the summed intersection plus smooth multiplied by the number of elements in the reduced tensor (not 1, but maybe millions).

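A quick numeric sketch of that broadcasting behaviour (toy shapes chosen for illustration only, not the repo's actual class):

    import torch

    predict = torch.rand(16, 4)                   # e.g. [batch, flattened pixels]
    target = (torch.rand(16, 4) > 0.5).float()
    smooth = 1.0

    # Per-sample reduction: smooth is broadcast onto all 16 entries of num.
    num = torch.sum(predict * target, dim=1) + smooth   # shape [16]

    # If num is later summed/averaged over the batch, smooth has effectively
    # been added 16 times (once per sample), not once.
    print((num - torch.sum(predict * target, dim=1)).sum())   # tensor(16.)
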
@Guocode
Author

Guocode commented May 1, 2019

and it misses the multiplication by 2:

self.smooth = 1
...
num = torch.sum(torch.mul(predict, target))*2 + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth
...
return 1 - num/den
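
For reference, the Dice coefficient is usually written as 2·|X∩Y| / (|X| + |Y|), which is where the factor of 2 in the numerator comes from. A minimal sketch of the fully-reduced variant with both changes applied (a standalone function for illustration, not the repo's actual loss class):

    import torch

    def dice_loss_global(predict, target, smooth=1.0, p=2):
        # Reduce over all elements first, then add smooth exactly once,
        # and keep the factor of 2 from the Dice definition.
        num = 2 * torch.sum(predict * target) + smooth
        den = torch.sum(predict.pow(p) + target.pow(p)) + smooth
        return 1 - num / den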

@asciidiego

@Guocode Can you just do the following?

  1. Reference the lines in this repo's code where the calculation is potentially wrong.
  2. Explain what should happen (mathematically) and what the current code is doing erroneously.
  3. Propose new code.

All in the same comment? It is hard for me to follow and understand what you are trying to tell me.

@hubutui Can you review this issue? I was planning to use your loss on a segmentation deep learning pipeline.

@Guocode
Author

Guocode commented Aug 13, 2019

@diegovincent
I have pointed out the difference, and I have released my code on my page; you can have a look at that.

@wassname

wassname commented Dec 1, 2019

Let me try to work this out. This repo's code:

num = torch.sum(torch.mul(predict, target), dim=1) + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p), dim=1) + self.smooth

Guocode's suggestion:

num = torch.sum(torch.mul(predict, target))*2 + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth

So it adds a *2 and removes dim=1.

I think Guocode is saying that if you sum over dimension 1 only, but leave the other dimensions, you get a tensor of shape, say, [16, 4]. Now if you add smooth, PyTorch will broadcast it to shape [16, 4] and add it to each element. The result is that you've added it many times.

The alternative is to sum over all elements first, and then add smooth. But this changes the equation: you're not actually calculating the dice loss for each element, but for the sum of all elements in all batches. There may be disadvantages for backprop if you sum too early, as the gradient isn't traced through the dice loss of each element. The only disadvantage I see in adding smooth to each element is that it smooths more. But you can [actually measure] the effect smoothing has on the output, and it gives outputs similar to MSELoss.
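
To make the two reductions concrete, a small side-by-side sketch (per-sample Dice averaged over the batch vs. one Dice over everything; shapes are illustrative, and the factor of 2 is omitted to match the code above):

    import torch

    predict = torch.rand(16, 4)
    target = (torch.rand(16, 4) > 0.5).float()
    smooth, p = 1.0, 2

    # Per-sample: one Dice term per batch element, smooth added to each, then averaged.
    num = torch.sum(predict * target, dim=1) + smooth               # shape [16]
    den = torch.sum(predict.pow(p) + target.pow(p), dim=1) + smooth
    per_sample_loss = (1 - num / den).mean()

    # Global: sum everything first, add smooth once, a single Dice term.
    num_g = torch.sum(predict * target) + smooth
    den_g = torch.sum(predict.pow(p) + target.pow(p)) + smooth
    global_loss = 1 - num_g / den_g

    print(per_sample_loss.item(), global_loss.item())   # generally not equal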

Also, I implemented dice loss in the same way in this gist, and quite a few people reviewed it.

I think it's fine as it is.
