Project

General

Profile

Actions

Bug #22754

closed

3.1.0 workbench workflow page extreme slowness

Added by Ivan Diaz Alvarez 12 months ago. Updated 11 months ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Workbench2
Target version:
Story points:
-
Release relationship:
Auto

Description

Since ver 3.1.0, workflow pages tab in chrome exhibit hangs and get stuck:

  • This is mainly CPU based, checked the memory consumption, but it was normal (in this case, for this ticket)
    Doesn't seem related to cluster size or wokflog / log size
  • It tends to happen when the logs / outputs are empty, specially when seting up basic compute on the upgrade so that workflows failed and having no logs was triggered.
  • closing the tab and opening other with the same url solves the issue momentarily, but it crops again after some minutes.

Attached image with elevated cpu consumption in chrome


Files

Screenshot 2025-04-01 090118.png (9.09 KB) Screenshot 2025-04-01 090118.png Ivan Diaz Alvarez, 04/04/2025 12:55 PM
clipboard-202504041512-xx9gh.png (11 KB) clipboard-202504041512-xx9gh.png Ivan Diaz Alvarez, 04/04/2025 01:12 PM
process-panel-stuck.png (324 KB) process-panel-stuck.png Peter Amstutz, 04/27/2025 01:07 AM
profiler-process-page-stuck.png (440 KB) profiler-process-page-stuck.png Peter Amstutz, 04/28/2025 07:30 PM
queued-state-jank.png (348 KB) queued-state-jank.png Peter Amstutz, 05/07/2025 09:10 PM

Subtasks 1 (0 open1 closed)

Task #22804: ReviewResolvedPeter Amstutz05/09/2025Actions
Actions #1

Updated by Ivan Diaz Alvarez 12 months ago

Forgot to add the page gets unresponsive in this condition

Actions #2

Updated by Peter Amstutz 12 months ago

  • Assigned To set to Stephen Smith
  • Category set to Workbench2
Actions #3

Updated by Peter Amstutz 12 months ago

  • Target version set to Development 2025-04-16
Actions #4

Updated by Brett Smith 12 months ago

I am also seeing this in Firefox 137. The most reliable way to see it seems to be to start a non-trivial workflow and then leave a browser tab open on the workflow process page. You can tab away but for me when I hear my laptop fan kick in that tells me the tab has wedged itself and needs to be stopped.

Actions #5

Updated by Brett Smith 11 months ago

Ivan confirmed in a meeting that this is seen on queued/running workflows, not completed-and-failed ones. In his particular case, the user was monitoring a workflow that was impossible to dispatch, so it was stuck in the queued state.

Actions #6

Updated by Peter Amstutz 11 months ago

  • Target version changed from Development 2025-04-16 to Development 2025-04-30
Actions #7

Updated by Peter Amstutz 11 months ago

In the "queued" state, it polls for container status. This may have some unreasonable resource leak.

In the "running" state, it collects runtime logs. This also may have some unreasonable resource leak, but the polling mechanism for logs is different from the polling mechanism for container status used in the "queued" state.

I suggest the following tests to reproduce/diagnose the problem:

  • Leave the browser on a page in "Queued" state for several minutes with the javascript profiler active.
  • Leave the browser on a page in "Running" state for several minutes with the javascript profiler active.
Actions #8

Updated by Peter Amstutz 11 months ago

  • Subtask #22804 added
Actions #9

Updated by Peter Amstutz 11 months ago

I had this happen and I had the wherewithal to get a look at it in the Firefox debugger:

The stack trace on the right continues to scroll up for hundreds of method calls, but basically it seem that doing deep equality on MPVPanelContentProps is either extremely expensive or actually stuck in an infinite loop (it is hard to say which) -- it appears, among other things, that it is probably doing a line-by-line comparison of the contents of the log viewer.

Actions #10

Updated by Peter Amstutz 11 months ago

Here's the profiler:

Actions #11

Updated by Peter Amstutz 11 months ago

  • Release set to 78
Actions #12

Updated by Peter Amstutz 11 months ago

  • Target version changed from Development 2025-04-30 to Development 2025-05-14
Actions #13

Updated by Stephen Smith 11 months ago

  • Status changed from New to In Progress

Changes pushed to 22754-process-panel-slowness - removed memoization from MPVPanelContent.

I realized that it wouldn't be feasible to keep partial memoization based on the other parameters since it would memoize entirely based on the MPV props, so I removed it.

Actions #14

Updated by Peter Amstutz 11 months ago

Stephen Smith wrote in #note-13:

Changes pushed to 22754-process-panel-slowness - removed memoization from MPVPanelContent.

I realized that it wouldn't be feasible to keep partial memoization based on the other parameters since it would memoize entirely based on the MPV props, so I removed it.

So, I ran a workflow and I didn't have any issues. However based on the previous evidence I think we should merge the change which removes the memoization and keep an eye on it to see if there are any other issues.

I did notice some stutter while in the "Queued" phase. I think this is un-related slowness that was there before.

It is hard to tell from the stack trace what is happening but my best guess is that it is doing a pretty expensive re-render (700ms, twice, so that's like 1.5s), that probably isn't really necessary (because everything looks fine before it happens) but I would have to match this with a redux event and I don't have time to do that right now. My test process is to just start a process run, start the profiler, and scroll up and down with the mouse wheel while it is in the "queued" phase and watch for significant delays in scrolling that make the app feel slow.

Actions #15

Updated by Stephen Smith 11 months ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF