Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose and dev fixes #18245

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jenshannoschwalm
Copy link
Collaborator

Only expose if module has gui_post_expose

Avoid calling _module_gui_post_expose() if the module in focus doesn't have that. Improve readability of -d expose logs.


Improved handling of dt_dev_pixelpipe_change()

dt_dev_pixelpipe_change() calls the costly dt_dev_pixelpipe_get_dimensions() in all cases in addition to possibly syncing pipe nodes.

In dt_dev_process_image_job() we check for situations where we must call dt_dev_pixelpipe_change() as we don't have the pipe dimensions available in addition to pipe->changed != DT_DEV_PIPE_UNCHANGED leading to a significant performance gain due to less "pixelpipe-runs" just done to get the dimension.

In dt_dev_pixelpipe_change() we reduce the amount of syncing if possible as there is another active dt_dev_pixelpipe_change_t flag that includes work to be done. Locking history is also done only if required as nodes/parameters have been changed.

Some log improvements reducing misleading noise.

@jenshannoschwalm jenshannoschwalm added bugfix pull request fixing a bug scope: performance doing everything the same but faster labels Jan 19, 2025
@jenshannoschwalm
Copy link
Collaborator Author

@dterrahe @ralfbrown you both have been working a lot on dev stuff. Would you have a look?

@dterrahe
Copy link
Member

I really only understand it a little for 5 mins after a deep dive that took a few days. Not something you can dip in and out of.

@wpferguson
Copy link
Member

@jenshannoschwalm I'm running a clean environment with this PR and it still gets stuck in a loop.

What flags would you like me to run with to try and isolate this?

@jenshannoschwalm
Copy link
Collaborator Author

Likely -d pipe -d expose -d lua - i guess you are using lua :-)

raw and xmp might be helping too.

Do you observe some "jitter" while being in that loop? main canvas slightly moving "around" ?

Could you possibly do a video?

@wpferguson
Copy link
Member

Do you observe some "jitter" while being in that loop? main canvas slightly moving "around" ?

No, no movement at all. That's usually the first clue unless it's the working message flickering.

Could you possibly do a video?

Maybe, I'd have to figure it out.

@wpferguson
Copy link
Member

Looking back over the conversation from the last time I tried to figure this out I came across an entry I made saying that it didn't occur with opencl disabled.

I turned off opencl and processed 18 images without a hiccup. With opencl I can almost always reproduce in less than 10 images. So, I'll add opencl to the flags. I also added more logging to the script to try and tell you what it's doing.

@wpferguson
Copy link
Member

Enabled opencl. Reproduced the process image, move to next sequence that results in the processed image being reloaded with bad data. Got the insane data message for colorbalancergb (I think, it was fast and I wasn't looking for it). I stopped darktable after that occurred, so it's the end of the log. I'll attach the XMP and the log.

image_reset_on_change_image.txt

Here's the XMP from the reset image,
129EOSR7_2R4A0451.CR3.xmp.txt

If you need the raw too, let me know and I'll load it somewhere and post a link.

@wpferguson
Copy link
Member

Got a loop. Approximately line 3170 in the log.

loop.txt

The last entry before the loop had to do with channelmixerrgb. The one above could have been channelmixerrgb too. Possiblly something with the channelmixerrgb opencl code?

@wpferguson
Copy link
Member

Another loop. Last before was channelmixerrgb. Log at 72.8997 seconds.

loop2.txt

The channelmixerrgb preset I'm using is attached.

channelmixerrgb_monarch fieldhouse.dtpreset.txt

I'll stop now until you've had a chance to look at these.

@ralfbrown
Copy link
Collaborator

I don't yet have a good enough grasp of the pixelpipe processing to give much feedback, but your reasoning seems sound and I did a similar skip of work if no gui_post_expose in 1a8450a for #18132.

@jenshannoschwalm
Copy link
Collaborator Author

Don't know yet what's happening but for me it doesn't look like something related to OpenCL - that just processes the pipe in a slightly different way. What we have from your logs are loops that look slightly different - some examples

    73.3586 pipe starting             CL0 [full]                                       (   0/   0) 2755x1838 scale=0.3945 --> (   0/   0) 2755x1838 scale=0.3945  ID=33, nvidiacudanvidiageforcertx3070tilaptopgpu
    73.3693 dt_view_paint_surface         [full]                                        viewport zoom_scale  2.075 backbuf_scale  0.394 (x=3636.00 y=2402.00) -> (x=+0.000 y=+0.000)
    73.3745   painting                    [full]                                        size 2779x1899 processed 6984x4660 buf 2755x1838 scale=0.394 zoom (x=3636.00 y=2401.69) -> offset (x=+0.000 y=-0.000)
    73.3764 pipe state changing           [full]                                        dimension
    73.3764 get dimensions                [full]                                       (   0/   0) 7128x4732 scale=1.0000 ID=33
    73.3764 modify roi OUT                [full]           rawprepare              100 (   0/   0) 7128x4732 scale=1.0000 --> (   0/   0) 6984x4660 scale=1.0000  
    73.3765 pipe cache check              [full]                                        64 lines (important=20, used=0). Freed 0MB. Using using 0MB, limit=498MB
    73.3765 pipe starting             CL0 [full]                                       (   0/   0) 2755x1838 scale=0.3945 --> (   0/   0) 2755x1838 scale=0.3945  ID=33, nvidiacudanvidiageforcertx3070tilaptopgpu
   143.0358 pipe starting             CL0 [full]                                       (   0/   0) 1251x1875 scale=0.2685 --> (   0/   0) 1251x1875 scale=0.2685  ID=30, nvidiacudanvidiageforcertx3070tilaptopgpu
   143.0459 dt_view_paint_surface         [full]                                        viewport zoom_scale  1.412 backbuf_scale  0.268 (x=3636.00 y=2402.00) -> (x=+0.000 y=-0.000)
   143.0507   painting                    [full]                                        size 2779x1899 processed 4660x6984 buf 1251x1875 scale=0.268 zoom (x=3636.00 y=2401.86) -> offset (x=-0.000 y=-0.000)
   143.0520 pipe state changing           [full]                                        dimension
   143.0520 get dimensions                [full]                                       (   0/   0) 7128x4732 scale=1.0000 ID=30
   143.0520 modify roi OUT                [full]           rawprepare              100 (   0/   0) 7128x4732 scale=1.0000 --> (   0/   0) 6984x4660 scale=1.0000  
   143.0520 modify roi OUT                [full]           flip                   1700 (   0/   0) 6984x4660 scale=1.0000 --> (   0/   0) 4660x6984 scale=1.0000  
   143.0521 pipe cache check              [full]                                        64 lines (important=19, used=0). Freed 0MB. Using using 0MB, limit=498MB
   143.0521 pipe starting             CL0 [full]                                       (   0/   0) 1251x1875 scale=0.2685 --> (   0/   0) 1251x1875 scale=0.2685  ID=30, nvidiacudanvidiageforcertx3070tilaptopgpu
    17.9956 dt_view_paint_surface         [full/fast]                                   viewport zoom_scale  1.438 backbuf_scale  0.273 (x=3635.39 y=2401.13) -> (x=+0.000 y=+0.000)
    18.0008   painting                    [full/fast]                                   size 2779x1899 processed 4473x6863 buf 1222x1875 scale=0.273 zoom (x=3635.39 y=2401.06) -> offset (x=-0.000 y=+0.000)
    18.0020 expose module                 [full/fast]      crop                   3300  2779x1899, px=2744 py=1123
    18.0143 dt_view_paint_surface         [full/fast]                                   viewport zoom_scale  1.438 backbuf_scale  0.273 (x=3635.39 y=2401.13) -> (x=+0.000 y=+0.000)
    18.0215   painting                    [full/fast]                                   size 2779x1899 processed 4473x6863 buf 1222x1875 scale=0.273 zoom (x=3635.39 y=2401.06) -> offset (x=-0.000 y=+0.000)
    18.0229 expose module                 [full/fast]      crop                   3300  2779x1899, px=2718 py=1159
    18.0351 dt_view_paint_surface         [full/fast]                                   viewport zoom_scale  1.438 backbuf_scale  0.273 (x=3635.39 y=2401.13) -> (x=+0.000 y=+0.000)
    18.0401   painting                    [full/fast]                                   size 2779x1899 processed 4473x6863 buf 1222x1875 scale=0.273 zoom (x=3635.39 y=2401.06) -> offset (x=-0.000 y=+0.000)
    18.0412 expose module                 [full/fast]      crop                   3300  2779x1899, px=2640 py=1235

I don't know yet if the changed px/py in example 3 should give us a clue

     9.4276 dt_view_paint_surface         [full]                                        viewport zoom_scale  1.412 backbuf_scale  0.268 (x=3636.00 y=2402.00) -> (x=+0.000 y=-0.000)
     9.4327   painting                    [full]                                        size 2779x1899 processed 4660x6984 buf 1251x1875 scale=0.268 zoom (x=3635.98 y=2401.85) -> offset (x=-0.000 y=+0.000)
     9.4446 dt_view_paint_surface         [full]                                        viewport zoom_scale  1.412 backbuf_scale  0.268 (x=3636.00 y=2402.00) -> (x=+0.000 y=-0.000)
     9.4494   painting                    [full]                                        size 2779x1899 processed 4660x6984 buf 1251x1875 scale=0.268 zoom (x=3635.98 y=2401.85) -> offset (x=-0.000 y=+0.000)

Does this mean that dt_view_paint_surface somehow is the bad boy? painting changing something in port/dev data that triggers a re- dt_view_paint_surface ? That could also explain the other example loops - there we just recalc dimensions - no pixelpipe work - for re-painting ...

Bill, if you run into such a loop,

  1. What ways to get out of that loop did you have?
  2. What happens if you just change the window size?
  3. If you can't provide a video, maybe a screenshot?
  4. Any sliders for main darkroom canvas?
  5. A screenshot of darkroom preferences?

@jenshannoschwalm
Copy link
Collaborator Author

@TurboGit ae42f03 would be a cherry-pick candidate for 5.0.1

@jenshannoschwalm
Copy link
Collaborator Author

I don't yet have a good enough grasp of the pixelpipe processing to give much feedback, but your reasoning seems sound and I did a similar skip of work if no gui_post_expose in 1a8450a for #18132.

Aah, i just had a short glance on that. Seems to be "pretty related" to this pr :-) One section in darkroom is pretty identical, i think you missed the logs to be included in the check though :-)

@TurboGit TurboGit added this to the 5.2 milestone Jan 20, 2025
@TurboGit
Copy link
Member

TurboGit commented Jan 20, 2025

@jenshannoschwalm

Avoid calling _module_gui_post_expose() if the module in focus doesn't have that.

But calling it has a no effect as the first line on this routine is:

if(!module || !module->gui_post_expose || width < 1.0f || height < 1.0f) return;

Also, there is other call to _module_gui_post_expose() just above the fixed one here, why isn't it necessary to add the same check? Sorry for the possible stupid question but I'm a bit lost reading the code here and trying to understand the intent.

@jenshannoschwalm
Copy link
Collaborator Author

But calling it has a no effect as the first line on this routine is:

Right. But it avoids the wrong log message :-)

Also, there is other call to _module_gui_post_expose() just above the fixed one here

The cropper always has that function implemented.

@wpferguson
Copy link
Member

What ways to get out of that loop did you have?

Return to lighttable breaks the loop.

Haven't tried it after it occurs. Have tried panning and that doesn't work. Just recorded a video in a smaller screen size and could still reproduce.

Just did a video that shows the current edit being overwritten when I attempt to change images. I pop back out to lighttable and hover over the image that should have been developed but has a blue cast and the history stack has been reset.

Here's a link with the log file, https://drive.google.com/drive/folders/1FpOiePYU9f_MpLNVTIjGLmoNupEtN_Fd?usp=sharing

@wpferguson
Copy link
Member

I think Boris has nothing to fear from my screencasts 🤣.

I meant to have the log scrolling underneath the darktable window, but I was trying to manage too many new things at once.

Any recommendations about a better way to create videos will be happily accepted.

@TurboGit
Copy link
Member

TurboGit commented Jan 20, 2025

Right. But it avoids the wrong log message :-)

Ah ok, just that, so commit is good for 5.0.x branch.

@wpferguson
Copy link
Member

run8 is another image reset, but this time it had the insane values message.

Also I've noticed lately that when I crop I'm set to original image cropping uncropped images, and the crop window doesn't cover the whole image. Another new "feature"?

@wpferguson
Copy link
Member

wpferguson commented Jan 20, 2025

run9 is the loop. I had to clean up the mess from run8, then started processing again. Notice the flickering "working" message on the screen and lack of response to the applied processing. Exited to lighttable and then back to darkroom to finish processing.

EDIT: I went back to full screen because I wasn't getting the loop. In full screen after 2 or 3 images, I reproduced. A resource issue?

EDIT 2:Changed resources from default to small and still got the loop.

@TurboGit
Copy link
Member

Commit ae42f03 merged in branch 5.0.x.

@wpferguson
Copy link
Member

but for me it doesn't look like something related to OpenCL

Except that I can't reproduce if I turn off opencl

@jenshannoschwalm
Copy link
Collaborator Author

but for me it doesn't look like something related to OpenCL

Except that I can't reproduce if I turn off opencl

I understood that. I just meant, it doesn't seem to be an OpenCL code issue at all, from my understanding of the logs it could be related to "no-opencl-just-one-device" so less chances for a pipe race problem.

In fact in thos loops there is no processing at all, it could be a dev gtk redrawing issue.

Avoid calling _module_gui_post_expose() if the module in focus doesn't have that.
Improve readability of '-d expose' logs.
dt_dev_pixelpipe_change() calls the possibly costly dt_dev_pixelpipe_get_dimensions()
in all cases in addition to possibly syncing pipe nodes.

In dt_dev_process_image_job() we check for situations where we *must* call dt_dev_pixelpipe_change()
as we don't have the pipe dimensions available in addition to pipe->changed != DT_DEV_PIPE_UNCHANGED.

This leads to a significant performance gain due to less "pixelpipe-runs" just done to get the dimension.

In dt_dev_pixelpipe_change() we reduce the amount of syncing if possible as there is another active
dt_dev_pixelpipe_change_t flag that includes work to be done. Locking history is also done only if
required as nodes/parameters have been changed.

Some log improvements reducing misleading noise.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix pull request fixing a bug scope: performance doing everything the same but faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants