Hey, @ksonney@redwombat.social @snipe@hackers.town and anyone else who's done... - Random

daemionfox, 1 year ago

Hey, @ksonney @snipe and anyone else who's done any kind of mucking w/ Docker. Got a weirdness. We have a setup where we're using docker via Jenkins for load testing, running selenium hub and pytest. The weirdness is this. AS SOON AS one of the tests completes, it shuts all of the containers off. It's very confusing to me.

c0dec0dec0de, 1 year ago

@daemionfox @ksonney @snipe what in the hell? What kind of pipeline gives you that behavior?

daemionfox, 1 year ago

@c0dec0dec0de @ksonney @snipe So apparently it's a thing w/ Jenkins. As soon as it registers a test complete (process end) from one of the testing containers, it starts cleanup and spindown steps. This of course kills all the other containers running. We've solved the problem by adding a never-ending tail on /dev/null but that has its own complications. Also, we can run this w/ Docker w/o Jenkins no problem, which takes away our one-button pushstart but, compromises.

c0dec0dec0de, 1 year ago

@daemionfox @ksonney @snipe weird, I use multi-stage, multi-container Jenkins jobs routinely, and I’m not sure why that would happen. Are you doing these things in parallel?

daemionfox, 1 year ago

@c0dec0dec0de @ksonney @snipe yeah, parallel. We've been told we have to attempt to replicate our high-end load (about 500 users/second) over a very complex set of selenium paths. So lots of randomly set delays in the test script, all trying to replicate opening day

c0dec0dec0de, 1 year ago

@daemionfox @ksonney @snipe does that process end with a nonzero status? I can see that triggering clean-up depending on some other factors.

daemionfox, 1 year ago

@c0dec0dec0de First process to finish is 0, everything else ends up with 137 when it gets killed. We're launching 41 containers in parallel (1 Selenium hub, 10 Selenium chrome, 30 users), first user to complete wins, anyone else has to complete before they shut down in order to get results.
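
For readers following along, the 137 is worth decoding: by shell convention an exit status above 128 means the process was killed by a signal, with the signal number being the status minus 128. So 137 is signal 9, SIGKILL, which fits containers being stopped forcibly from outside rather than failing on their own. A minimal Python sketch of that arithmetic, where the function name `describe_exit` is made up for illustration:

```python
import signal


def describe_exit(status: int) -> str:
    """Explain an exit status using the 128 + N 'killed by signal N' convention."""
    if status == 0:
        return "exited cleanly"
    if status > 128:
        signum = status - 128
        try:
            # Map the number to a name such as SIGKILL where the platform knows it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error status {status}"


if __name__ == "__main__":
    # The two statuses reported in the thread.
    for status in (0, 137):
        print(status, "->", describe_exit(status))
```

This only interprets the number; it does not say who sent the signal.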

c0dec0dec0de, 1 year ago

@daemionfox guessing that the hub should be outside the rest of your parallel steps with the rest nested within its withRun() block to guarantee that it lives long enough for all the other things to finish. I don’t know enough about selenium to say whether the chrome containers need to outlive the clients (probably?).

daemionfox, 1 year ago

@c0dec0dec0de Okay, so I'm thinking I'm not doing this the way I should be doing this. We're doing one step. Which is just a docker-compose up. Unfortunately this is one of those things that's outside my wheelhouse that I got pulled onto because I'm the team lead. Our docker/jenkins expert left 6 months back for better money, and I haven't been able to get the PTB to put out for new hires.

c0dec0dec0de, 1 year ago

@daemionfox I haven’t used Compose in Jenkins. Are you just running a shell step then that invokes docker-compose up? And running the same command on a developer machine goes to completion?

daemionfox, 1 year ago

@c0dec0dec0de Slightly more than that, but yeah, basically that's it. We've parameterized the shell step so we can build a custom docker-compose.yaml based on our needs (# browsers, # users, etc) and a couple of other minor setup steps, but then just starting the docker.

c0dec0dec0de, 1 year ago

@daemionfox well, that’s super weird then because I would think that Jenkins would wait for your whole command to exit (which might be never with docker-compose up).

daemionfox, 1 year ago

@c0dec0dec0de As I said, this is way outside my wheelhouse, so we're kind of muddling through. My current thought, honestly, is to track the complete status of each user-container and have a shutdown script watching to see when everyone reports as complete before letting us hit the end of the process.

c0dec0dec0de, 1 year ago

@daemionfox maybe this is something like what you need?

https://stackoverflow.com/questions/57241307/how-to-make-docker-compose-stop-after-x-containers-have-stopped

daemionfox, 1 year ago

@c0dec0dec0de I dunno. Healthchecks might be the way to go. We're not explicitly doing the docker compose down command, leaving that to Jenkins. (We don't even have a post-test step to spin down, it just does)

c0dec0dec0de, 1 year ago

@daemionfox if you’re detaching (using the -d flag for compose), then Jenkins doesn’t care about the daemon process at all and will just keep running your pipeline. If you’re in a spun up VM or using Docker-in-Docker, I guess that’ll clean up for you, but it might do it really early. Waiting explicitly for your client containers to exit would let Jenkins know when you’re actually ready to have the test cleaned up.
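
One way to sketch "waiting explicitly for your client containers to exit" is with `docker wait`, which blocks until a container stops and then prints its exit code. The Python below is a rough illustration under assumptions, not the setup from this thread: the helper name `wait_for_containers` and the container names in the comment are hypothetical, and the `run` parameter exists only so the logic can be exercised without a Docker daemon.

```python
import subprocess
from typing import Callable, Dict, Iterable


def wait_for_containers(
    names: Iterable[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Dict[str, int]:
    """Block until every named container has stopped; return name -> exit code."""
    codes: Dict[str, int] = {}
    for name in names:
        # `docker wait NAME` blocks until the container stops, then prints
        # its exit code. An already-stopped container returns immediately.
        result = run(
            ["docker", "wait", name],
            capture_output=True,
            text=True,
            check=True,
        )
        codes[name] = int(result.stdout.strip())
    return codes


# Hypothetical usage, after a detached `docker-compose up -d`:
#   users = [f"loadtest_user_{i}" for i in range(1, 31)]
#   results = wait_for_containers(users)
#   failed = {n: c for n, c in results.items() if c != 0}
```

Waiting one container at a time is slower to report than a single `docker wait` call with many names, but it keeps each exit code unambiguously tied to its container.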

daemionfox, 1 year ago

@c0dec0dec0de Good to know. I'll take a look, should be relatively straightforward to add the docker-wait for the users. We're not detaching as far as I know. But mostly we're copypasta-ing the settings our ex-docker guy left us.

c0dec0dec0de, 1 year ago

@daemionfox good luck

daemionfox, 1 year ago

@c0dec0dec0de After digging in further, I have found the problem.

The docker-compose up has an --abort-on-container-exit flag. Of course it does.
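
For context, `--abort-on-container-exit` tells `docker-compose up` to stop all containers as soon as any one container stops, which matches the behavior described at the top of the thread. `--exit-code-from SERVICE` implies the same thing. When inheriting a command line from someone else, a quick scan for either flag can save some digging. The sketch below is only an illustration; the helper name `aborting_flags` and the sample command are assumptions, not the actual job configuration.

```python
import shlex
from typing import List

# `up` flags that make Compose stop every container when any one exits.
# --exit-code-from implies --abort-on-container-exit.
ABORTING_FLAGS = ("--abort-on-container-exit", "--exit-code-from")


def aborting_flags(command: str) -> List[str]:
    """Return the flags in a compose command line that cause stop-on-first-exit."""
    found = []
    for token in shlex.split(command):
        # Strip a value attached with '=' (e.g. --exit-code-from=SERVICE).
        flag = token.split("=", 1)[0]
        if flag in ABORTING_FLAGS:
            found.append(flag)
    return found


if __name__ == "__main__":
    # Sample command line, assumed for illustration.
    print(aborting_flags("docker-compose up --abort-on-container-exit"))
```

Dropping the flag means `up` no longer stops on its own when a test finishes, so something else has to decide when the run is over.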
