Story #11139
Updated by Peter Amstutz almost 8 years ago
There's a discrepancy between the advertised RAM of a VM size, which node manager uses to choose what size node to boot for a job, and the actual amount of memory available to the job on the booted node. A job whose request falls in this "donut hole" (actual RAM < requested RAM <= advertised RAM) will be unable to run because the request is larger than the memory actually available, but node manager won't boot a properly sized node because it believes the job is already satisfied.

Memory reported by Linux on a Standard_D1_v2 node:

<pre>
tetron@compute3.c97qk:/usr/local/share/arvados-compute-ping-controller.d$ awk '($1 == "MemTotal:"){print ($2 / 1024)}' </proc/meminfo
3440.54
</pre>

Scratch space on /tmp in MiB:

<pre>
df -m /tmp | perl -e '
> my $index = index(<>, " 1M-blocks ");
> substr(<>, 0, $index + 10) =~ / (\d+)$/;
> print "$1\n";
> '
51170
</pre>

What SLURM sees:

<pre>
tetron@compute3.c97qk:/usr/local/share/arvados-compute-ping-controller.d$ sinfo -n compute3 --format "%c %m %d"
CPUS MEMORY TMP_DISK
1 3440 51169
</pre>

What the cloud driver advertises:

<pre>
>>> szd["Standard_D1_v2"]
<NodeSize: id=Standard_D1_v2, name=Standard_D1_v2, ram=3584 disk=50 bandwidth=0 price=0 driver=Azure Virtual machines ...>
>>>
</pre>

For Standard_D1_v2 there is a ~144 MiB discrepancy between the advertised RAM size (3584 MiB) and the amount of RAM Linux considers available (3440 MiB).

The same comparison for Standard_D2_v2:

<pre>
CPUS MEMORY TMP_DISK
2 6968 102344
</pre>

<pre>
<NodeSize: id=Standard_D2_v2, name=Standard_D2_v2, ram=7168 disk=100 bandwidth=0 price=0 driver=Azure Virtual machines ...>
</pre>

For Standard_D2_v2 the discrepancy is 200 MiB.

Based on discussion: node manager should reduce each size's RAM by 5% from the "sticker value" in the ServerCalculator (jobqueue.py), so jobs are matched against a conservative estimate of usable memory. The scale factor should be settable in the configuration file.
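A minimal sketch of the proposed change, assuming a size wrapper in the style of ServerCalculator's size handling (the names ScaledNodeSize, node_mem_scaling, and smallest_size_for are illustrative, not the shipped API):

<pre>
from collections import namedtuple

# Stand-in for libcloud's NodeSize; only the fields used below.
NodeSize = namedtuple('NodeSize', ['id', 'name', 'ram', 'disk', 'price'])

class ScaledNodeSize(object):
    """Wraps a cloud size, presenting a conservative RAM figure."""

    def __init__(self, real_size, node_mem_scaling=0.95):
        self.real = real_size               # the "sticker value" size
        self.id = real_size.id
        self.name = real_size.name
        self.disk = real_size.disk
        # Scale advertised RAM down so a job that fits on paper also
        # fits in the memory Linux actually reports on the booted node.
        self.ram = int(real_size.ram * node_mem_scaling)

def smallest_size_for(sizes, want_ram_mib):
    """Return the smallest size whose scaled RAM satisfies the request."""
    candidates = [s for s in sizes if s.ram >= want_ram_mib]
    return min(candidates, key=lambda s: s.ram) if candidates else None

d1 = NodeSize('Standard_D1_v2', 'Standard_D1_v2', 3584, 50, 0)
d2 = NodeSize('Standard_D2_v2', 'Standard_D2_v2', 7168, 100, 0)
sizes = [ScaledNodeSize(s, node_mem_scaling=0.95) for s in (d1, d2)]

# 3584 * 0.95 = 3404, below the 3440 MiB Linux reports, so a job asking
# for 3500 MiB no longer lands in the donut hole: it gets a D2 instead.
print(smallest_size_for(sizes, 3500).name)   # -> Standard_D2_v2
</pre>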
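Node manager reads an ini-style configuration, so the scale factor could be exposed as a setting there; the section and key name below are assumptions for illustration, not the shipped option:

<pre>
[Cloud]
# Fraction of each size's advertised RAM assumed to be usable by jobs.
node_mem_scaling = 0.95
</pre>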