Bug #23450

closed

Error in cluster-activity.cwl latest version (commit bcf417eb0f3c1cb6904f7753f582adcb958231b1)

Added by Lucas Di Pentima about 1 month ago. Updated about 1 month ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Deployment
Target version:
Story points: -
Release relationship: Auto

Description

Tried to run the latest cluster-activity.cwl workflow and got the following error:

2026-02-11T20:33:40.842911064Z INFO:root:Got workflow steps 64000 - 65000
2026-02-11T20:33:46.260308687Z INFO:root:Got workflow steps 65000 - 66000
2026-02-11T20:33:50.892294548Z INFO:root:Got workflow steps 66000 - 67000
2026-02-11T20:34:00.437488273Z INFO:root:Got workflow steps 67000 - 68000
2026-02-11T20:34:01.026716350Z INFO:root:Got workflow steps 68000 - 68108
2026-02-11T20:34:03.007592256Z INFO:root:Exporting workflow runs 18000 - 18400
2026-02-11T20:34:04.374271657Z INFO:root:Getting workflow steps
2026-02-11T20:34:22.415643865Z INFO:root:Got workflow steps 0 - 1000
2026-02-11T20:34:24.213218796Z INFO:root:Got workflow steps 1000 - 1398
2026-02-11T20:34:24.297887513Z INFO:root:Getting container hours time series
2026-02-11T20:34:24.776052483Z Traceback (most recent call last):
2026-02-11T20:34:24.776063703Z File "/opt/arvados-py/bin/arv-cluster-activity", line 8, in <module>
2026-02-11T20:34:24.776068723Z sys.exit(main())
2026-02-11T20:34:24.776069963Z ^^^^^^
2026-02-11T20:34:24.776071063Z File "/opt/arvados-py/lib/python3.11/site-packages/arvados_cluster_activity/main.py", line 192, in main
2026-02-11T20:34:24.776874574Z f.write(reporter.html_report(since, to, args.exclude, args.include_workflow_steps))
2026-02-11T20:34:24.776956036Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.776958786Z File "/opt/arvados-py/lib/python3.11/site-packages/arvados_cluster_activity/report.py", line 163, in html_report
2026-02-11T20:34:24.777713915Z self.graphs[containers_graph] = self.collect_graph(since, to,
2026-02-11T20:34:24.777717055Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.777718365Z File "/opt/arvados-py/lib/python3.11/site-packages/arvados_cluster_activity/report.py", line 124, in collect_graph
2026-02-11T20:34:24.777829867Z for series in get_metric_usage(self.prom_client, since, to, metric % self.cluster, resampleTo=resample_to):
2026-02-11T20:34:24.777832718Z File "/opt/arvados-py/lib/python3.11/site-packages/arvados_cluster_activity/prometheus.py", line 40, in get_metric_usage
2026-02-11T20:34:24.778401502Z rs = series.resample(resampleTo).max(1).ffill()
2026-02-11T20:34:24.778404592Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.778509645Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/resample.py", line 1322, in max
2026-02-11T20:34:24.779871169Z return self._downsample("max", numeric_only=numeric_only, min_count=min_count)
2026-02-11T20:34:24.779943181Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.780030963Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/resample.py", line 2102, in _downsample
2026-02-11T20:34:24.780274099Z result = obj.groupby(self._grouper).aggregate(how, **kwargs)
2026-02-11T20:34:24.780277609Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.780343781Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/groupby/generic.py", line 2291, in aggregate
2026-02-11T20:34:24.781987122Z result = op.agg()
2026-02-11T20:34:24.781990932Z ^^^^^^^^
2026-02-11T20:34:24.781992552Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/apply.py", line 291, in agg
2026-02-11T20:34:24.782193547Z return self.apply_str()
2026-02-11T20:34:24.782327270Z ^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.782330060Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/apply.py", line 701, in apply_str
2026-02-11T20:34:24.782436673Z return self._apply_str(obj, func, *self.args, **self.kwargs)
2026-02-11T20:34:24.782439503Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.782608427Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/apply.py", line 792, in _apply_str
2026-02-11T20:34:24.782686459Z return f(*args, **kwargs)
2026-02-11T20:34:24.782689040Z ^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.782690229Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/groupby/groupby.py", line 3381, in max
2026-02-11T20:34:24.784464304Z return self._agg_general(
2026-02-11T20:34:24.784467394Z ^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.784468564Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/groupby/groupby.py", line 1708, in _agg_general
2026-02-11T20:34:24.784707900Z result = self._cython_agg_general(
2026-02-11T20:34:24.784710910Z ^^^^^^^^^^^^^^^^^^^^^^^^^
2026-02-11T20:34:24.784712060Z File "/opt/arvados-py/lib/python3.11/site-packages/pandas/core/groupby/groupby.py", line 1777, in _cython_agg_general
2026-02-11T20:34:24.784934046Z raise ValueError("numeric_only accepts only Boolean values")
2026-02-11T20:34:24.784936946Z ValueError: numeric_only accepts only Boolean values
2026-02-11T20:34:25.036438858Z Container exited with status code 1
2026-02-11T20:34:25.072656507Z Total CPU usage was 81.194144 user and 4.043214 sys on 1.00 CPUs
2026-02-11T20:34:25.072737979Z Total disk I/O on 259:5 was 0 bytes written and 4755456 bytes read
2026-02-11T20:34:25.072765410Z Total disk I/O on 252:0 was 339603456 bytes written and 4755456 bytes read
2026-02-11T20:34:25.072878863Z Maximum disk usage was 0%, 1451016192/210237366272 bytes
2026-02-11T20:34:25.072915383Z Maximum container memory pgmajfault usage was 18 faults
2026-02-11T20:34:25.072942054Z Maximum container memory rss usage was 70%, 565473280/805306368 bytes
2026-02-11T20:34:25.072969105Z Total network I/O on eth0 was 235588462 bytes written and 1108675529 bytes read
2026-02-11T20:34:25.631374677Z copying "cost.csv" (339648947 bytes)
2026-02-11T20:34:27.440935988Z copying "report.html" (0 bytes)
2026-02-11T20:34:29.025763250Z Maximum arv-mount memory rss usage was 553496576 bytes
2026-02-11T20:34:29.025800731Z Maximum crunch-run memory rss usage was 463638528 bytes
2026-02-11T20:34:29.025815651Z Maximum keepstore memory rss usage was 308731904 bytes
2026-02-11T20:34:29.025830202Z Complete

Subtasks 1 (0 open, 1 closed)

Task #23456: Review 23450-pandas-compat | Resolved | Lucas Di Pentima | 02/19/2026
Actions #1

Updated by Brett Smith about 1 month ago

  • Release set to 84
  • Target version changed from Development 2026-02-18 to Development 2026-03-04
  • Assigned To set to Brett Smith
  • Category set to Deployment

This is not the same error I saw before, so it'll require a little more investigation.

Actions #2

Updated by Brett Smith about 1 month ago

prometheus-api-client only depends on pandas >= 1.4.0. pandas 3.0.0 came out on January 21, 2026, so there has been another library change since the last time I worked on this.
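For reference, the failing frame in the traceback is `series.resample(resampleTo).max(1).ffill()` in prometheus.py. Judging by the error message, the positional `1` appears to be bound to the `numeric_only` parameter of the resampler's `max()`, which the installed pandas rejects because it is not a boolean. The following is only a minimal sketch of the same resample, max, forward-fill chain without the positional argument, assuming the intent is a per-bin maximum. The helper name `resample_max_ffill` and the sample data are hypothetical, and this is not the change made on the branch, which pins the pandas version instead.

```python
import pandas as pd


def resample_max_ffill(series: pd.Series, resample_to: str) -> pd.Series:
    """Return the per-bin maximum, filling empty bins from the previous bin."""
    # max() is called with no positional argument. Its first parameter is
    # numeric_only, so a stray positional value would be bound to it.
    return series.resample(resample_to).max().ffill()


# Hypothetical sample data: three points in the first 5-minute bin,
# none in the second bin, and one in the third.
index = pd.to_datetime([
    "2026-02-11 00:00",
    "2026-02-11 00:01",
    "2026-02-11 00:02",
    "2026-02-11 00:11",
])
series = pd.Series([1.0, 3.0, 2.0, 5.0], index=index)

result = resample_max_ffill(series, "5min")
print(result)
```

The empty middle bin comes out of `max()` as NaN and is then filled by `ffill()` with the previous bin's maximum, which matches what the original chain seems intended to do.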

Actions #3

Updated by Brett Smith about 1 month ago

  • Subtask #23456 added
Actions #4

Updated by Brett Smith about 1 month ago

  • Status changed from New to In Progress
Actions #5

Updated by Brett Smith about 1 month ago

23450-pandas-compat @ eb5099a3712686aad93ee105cdf5dc24a69c58ed - developer-run-tests: #5031

  • All agreed upon points are implemented / addressed. Describe changes from pre-implementation design.
    • Pins our pandas version for the reason given in the new comment.
  • Anything not implemented (discovered or discussed during work) has a follow-up story.
    • N/A
  • Code is tested and passing, both automated and manual, what manual testing was done is described.
    • See above
    • Generated tordo-4zz18-opftpxe93pstl4s by running:
      PROMETHEUS_HOST=https://prometheus.curii.com/ PROMETHEUS_APIKEY=… arv-cluster-activity --cluster=tordo --days=100 --cost-report tordo.csv --html-report tordo.html
  • Tested code incorporates recent main branch changes.
    • Yes
  • New or changed UI/UX and has gotten feedback from stakeholders.
    • N/A
  • Documentation has been updated.
    • N/A
  • Behaves appropriately at the intended scale (describe intended scale).
    • N/A
  • Considered backwards and forwards compatibility issues between client and server.
    • N/A
  • Follows our coding standards and GUI style guidelines.
    • N/A (pure metadata change)
Actions #6

Updated by Lucas Di Pentima about 1 month ago

This LGTM, thanks!

Actions #7

Updated by Brett Smith about 1 month ago

  • Status changed from In Progress to Resolved