Bug #13649
closedc-d-s doesn't request a preemptible instance when it should
100%
Description
When a CR having the preemptible=true
scheduling parameter set gets picked up by crunch-dispatch-slurm
, it seems that the preemptible
parameter gets ignored because c-d-s runs sbatch with the non-preemptible version of the instance type.
Updated by Lucas Di Pentima over 6 years ago
- Related to Story #7478: [Node Manager] Creates compute nodes using AWS spot instances added
Updated by Lucas Di Pentima over 6 years ago
- Status changed from New to In Progress
Updated by Lucas Di Pentima over 6 years ago
Updates at b3f6c1442 - branch 13649-cds-preemptible-sbatch
This issue seems to not be related to a crunch-dispatch-slurm
bug, after some more investigation I saw that although the container request included the preemptible
scheduling parameter, its container didn't get it, so it seems that this is the reason for c-d-s
to not be able to see it.
The issue is related to the other bugfix at branch 7478-auto-preemptible-cr-fix
, having to do with the order that some callbacks are called: set_container
is set to be run before the callback that set the default scheduling parameters value, so the newly created container copies an empty scheduling_parameters
field from its request.
On this branch I first merged 7478-auto-preemptible-cr-fix
because it solves a similar issue, and then wrote a test that exposes the bug. I'm not sure which is the proper way to solve this, I tried reordering the callbacks, but lots of tests fail.
Updated by Lucas Di Pentima over 6 years ago
Updates at 0e94528878341faa941247eb3d990230c941868c - branch 13649-preemptible-committed-child-cr
(rebased, renamed branch)
Test run: https://ci.curoverse.com/job/developer-run-tests/773/
Changed the entire approach, reverted the set_requesting_container_uuid
callback to be called on before_create
, but added a private method that returns the requesting container even though it isn't assigned to the CR (yet) so it can be used from another callback before it's assigned.
Moved set_default_preemptible_scheduling_parameter
to be run after set_container
so that the resolved container gets the correct scheduling parameters.
Updated by Peter Amstutz over 6 years ago
Lucas Di Pentima wrote:
Updates at 0e94528878341faa941247eb3d990230c941868c - branch
13649-preemptible-committed-child-cr
(rebased, renamed branch)
Test run: https://ci.curoverse.com/job/developer-run-tests/773/Changed the entire approach, reverted the
set_requesting_container_uuid
callback to be called onbefore_create
, but added a private method that returns the requesting container even though it isn't assigned to the CR (yet) so it can be used from another callback before it's assigned.
Movedset_default_preemptible_scheduling_parameter
to be run afterset_container
so that the resolved container gets the correct scheduling parameters.
Couple of small testing suggestions:
assert_not_nil cr.requesting_container_uuid
Should probably be
assert_equal 'zzzzz-dz642-runningcontainr', cr.requesting_container_uuid
You added two tests, but they appear to be nearly identical. Could you merge them into one test?
Rest LGTM.
Updated by Lucas Di Pentima over 6 years ago
Branch 13649-spot-instances-doc
- e9e440847c574d1152b128cfd508cf473f49121b
Documentation page on Admin section that describes the required configuration changes to be able to use AWS Spot instances.
Updated by Lucas Di Pentima over 6 years ago
For cost tracking, AWS offers a spot instance data feed that creates hourly logs on an S3 bucket: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-data-feeds.html
I've done some tests with t2.micro instances (price being $0.0116 for on-demand instances), asked t2.micro spot instances with max_price=0.005
(price history on us-west regions show 0.0035 to be the lowest price) and got this tab separated values file on the S3 bucket:
#Version: 1.0 #Fields: Timestamp UsageType Operation InstanceID MyBidID MyMaxPrice MarketPrice Charge Version 2018-06-25 19:45:04 UTC USW2-SpotUsage:t2.micro RunInstances:SV001 i-0fc4474c7952e98d4 sir-qqci9gxn 0.005 USD 0.004 USD 0.004 USD 1
Updated by Tom Morris over 6 years ago
I thought we said that we were going to bid the default (ie on-demand) price rather than trying to figure out any exotic bidding strategy. We get the spot market price even if we bid "too high."
Updated by Lucas Di Pentima over 6 years ago
Tom Morris wrote:
I thought we said that we were going to bid the default (ie on-demand) price rather than trying to figure out any exotic bidding strategy. We get the spot market price even if we bid "too high."
Yes, that's what we're doing. The price data feed is to know the actual price that Amazon's charges for every instance after every running hour, so that we can do cost analysis on our tests.
Updated by Nico César over 6 years ago
review at e9e440847c574d1152b128cfd508cf473f49121b
the phrase
"If attempting to use this feature with an already working AWS account and getting an error message on nodemanager's log like this one:"
sounds long and confusing, maybe "If this is the case check nodemanager's log for:" is better?
Changing the "you" reference: "...linked role created. For this, you can log into your AWS account, go to " by "This can be done by logging int the AWS account..."
besides that small change I feel that the documentation should be ready to merge
Updated by Lucas Di Pentima over 6 years ago
- Status changed from In Progress to Resolved