Message boards : Theory Application : Docker on Windows
Author | Message |
---|---|
Joined: 24 Oct 19 Posts: 208 Credit: 581,115 RAC: 828 |
In reply to Crystal Pellet's message of 26 Mar 2025: "I tried a Theory docker task on Windows 10 where I didn't have WSL installed (so from scratch)." Maybe it would be a good idea to write a short guide (perhaps a thread in the forum) that sums up everything needed to start from scratch? |
Joined: 28 Jul 16 Posts: 519 Credit: 400,710 RAC: 11 |
+1 The interesting part of the log is this: `Mounted CVMFS in the container.` `job: htmld=/var/www/lighttpd` `job: unpack exitcode=0` `job: run exitcode=1` `job: diskusage=6828` `job: logsize=16 k` `job: times= 0m0.010s 0m0.000s 0m19.678s 0m7.623s` `job: cpuusage=27` `===> [runRivet] Wed Mar 26 13:03:36 UTC 2025 [boinc pp z1j 8000 180 - pythia8 8.313 tune-monash13 100000 92]` `Job Finished` This is the log output of the scientific app. The task was very short because the scientific app failed with "job: run exitcode=1". ATM there are lots of those around, but they are not related to Docker. The next Docker version will tail runRivet.log to the BOINC slot (already available on Linux) for easier monitoring. |
Joined: 28 Jul 16 Posts: 519 Credit: 400,710 RAC: 11 |
In reply to boboviz's message of 26 Mar 2025: You will find most of it in a few threads here. It's currently a moving target, so it makes no sense to write complete documentation now. What works on platform A today may not work on platform B, so modifications will have to be tested first. |
Joined: 12 Sep 14 Posts: 1129 Credit: 339,230 RAC: 3 |
In reply to Crystal Pellet's message of 26 Mar 2025: "I tried a Theory docker task on Windows 10 where I didn't have WSL installed (so from scratch)." This is great! Hopefully most of the setup will be done by the Windows installer for BOINC, so we have to wait for the upstream release of the client. |
Joined: 24 Oct 19 Posts: 208 Credit: 581,115 RAC: 828 |
In reply to computezrmle's message of 26 Mar 2025: "You find most of it in a few threads here." OK, thank you! |
Joined: 22 Apr 16 Posts: 731 Credit: 2,205,280 RAC: 2,384 |
Where on GitHub is the link to install BOINC 8.1.0 with Docker support for Windows? ATM I have 8.0.4. |
Joined: 13 Feb 15 Posts: 1223 Credit: 933,122 RAC: 1,135 |
Some remarks on the Docker version: 1. Suspending a task works with or without "Leave non-GPU tasks in memory while suspended" ticked. 2. After a BOINC restart the task survives, but starts from scratch. 3. High CPU usage of the main process vmmem during the event-processing part: on a quad-core, 1 task uses 25% with VBox, while with Docker it jumps between 27% and 42%. |
Joined: 13 Feb 15 Posts: 1223 Credit: 933,122 RAC: 1,135 |
2 error tasks: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3391508 https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3391569 Reason: `Errors during downloading metadata for repository 'epel'` `Status code: 503 for https://mirrors.fedoraproject.org/metalink?repo=epel-9&arch=x86_64&infra=container&content=$contentdir (IP: 18.159.254.57)` |
Joined: 28 Jul 16 Posts: 519 Credit: 400,710 RAC: 11 |
"Some remarks to the docker version:" Thanks. Good to know. As for (2.), that's at least no worse than before when using native. As for (3.): does it mean you ran a scenario 'A' with 1 vbox task and no docker tasks, and later ran 'B1' with no vbox tasks and n (how many?) docker tasks? Or did you run vbox beside docker in a scenario 'B2'? If you monitor the docker containers, e.g. by running 'docker stats' or 'podman stats', you may notice that some containers use far more than 100% CPU. This is because each task runs 2 processes, the mc-generator and rivetvm. "Errors during downloading metadata for repository 'epel'" This is a temporary glitch affecting the CDN where Fedora hosts the mirror list. As a result the image build can't complete. This is not under CERN's control. It should work again after a few minutes, when the DNS records time out and the next request gets the IP of a 'good' server. |
Joined: 24 Oct 19 Posts: 208 Credit: 581,115 RAC: 828 |
In reply to Crystal Pellet's message of 27 Mar 2025: "2. After a BOINC restart the task survives, but starts from scratch." +1 But the WUs are not that long, so the missing checkpoint is only a minor problem. "3. High CPU-usage during event processing part of the main process vmmem:" Strange. On my PC, running 2 WUs uses 20% of the CPU... |
Joined: 13 Feb 15 Posts: 1223 Credit: 933,122 RAC: 1,135 |
In reply to computezrmle's message of 27 Mar 2025: "Does it mean you ran a scenario 'A' with 1 vbox task and no docker tasks and later you ran 'B1' with no vbox tasks and n (how many?) docker tasks?" The comparison is with standalone tasks. "you may notice that some containers use far more than 100% CPU. This is because each task runs 2 processes, the mc-generator and rivetvm." That explains the higher CPU usage. I have seen the same when using 2 CPUs for a Theory VM. It depends on the generator used (Pythia, Herwig, Sherpa etc.) and how fast the events are processed. In between, rivetvm has to do some processing. In the past we also had plotter.exe running every now and then. Example of a 2-core VBox task: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3389481 : Run time 1 hours 8 min 35 sec <==> CPU time 1 hours 32 min 38 sec. Users who want a responsive system could make use of `<app_version> <app_name>Theory</app_name> <plan_class>docker</plan_class> <avg_ncpus>2</avg_ncpus> </app_version>` in app_config.xml. |
Joined: 13 Feb 15 Posts: 1223 Credit: 933,122 RAC: 1,135 |
In reply to boboviz's message of 27 Mar 2025: "3. High CPU-usage during event processing part of the main process vmmem: Strange. On my pc, running 2 wus uses 20% of cpu..." The "during event processing" part of the sentence is important. There are a lot of short tasks ending before the event processing starts. |
Joined: 28 Jul 16 Posts: 519 Credit: 400,710 RAC: 11 |
In reply to boboviz's message of 27 Mar 2025: Depends on mcplots. ATM the queue sends lots of tasks that fail early. In the future there will be short tasks as well as tasks running for days, as usual, because the scientific payload is more or less the same. "3. High CPU-usage during event processing part of the main process vmmem:" 25% of 4 cores => 1 core; 20% of 12 cores => 2.4 cores => 2 tasks with 120% each; 20% of 16 cores => 3.2 cores => 2 tasks with 160% each. At least for the 12-core case this is roughly within the normal range. |
Joined: 24 Oct 19 Posts: 208 Credit: 581,115 RAC: 828 |
In reply to Crystal Pellet's message of 27 Mar 2025: "The 'during event processing' part of the sentence is important." Yep. I now see some very short WUs. But these are validated anyway. |
Joined: 28 Jul 16 Posts: 519 Credit: 400,710 RAC: 11 |
In reply to boboviz's message of 27 Mar 2025: "But these are, anyway, validated." On dev the number of invalids/errors is limited to 32 in a row (IIRC per core per computer). If a computer exceeds this limit it will not get further work for 24h (or maybe until midnight). Since those errors are usually not what we test here, the tasks report a success back to BOINC. This may change once the app_version moves to prod (or maybe if people misuse dev as a prod-like project). |
Joined: 22 Apr 16 Posts: 731 Credit: 2,205,280 RAC: 2,384 |
No new tasks ATM. |
Joined: 24 Oct 19 Posts: 208 Credit: 581,115 RAC: 828 |
After a lot of successful WUs, now some errors (after 10 minutes of run time) <message> |
Joined: 13 Feb 15 Posts: 1223 Credit: 933,122 RAC: 1,135 |
A valid task from BOINC's point of view, but run exit code = 1: `87800 events processed` `87900 events processed` `./rungen.sh: line 2669: 2499 Segmentation fault (core dumped) /scratch/pythia8/pythia8.exe /scratch/tmp/tmp.4llDBUdfz2/generator.params /scratch/tmp/tmp.4llDBUdfz2/generator.hepmc` `ERROR: failed to run pythia8 8.313` `terminate called after throwing an instance of 'HepMC::IO_Exception'` `what(): input stream encountered invalid data, stream is now corrupt` `[1]- 1883 Exit 1 ( env $origEnv $generatorExecString; exit $? )` `[2]+ 1884 Running ( $rivetExecString; exit $? ) & (wd: /scratch/tmp/tmp.4llDBUdfz2)` `ERROR: fail to run pythia8 8.313 or Rivet (error exit code)` |
Joined: 22 Apr 16 Posts: 731 Credit: 2,205,280 RAC: 2,384 |
Computer 4639 is in MC production, but 5337 is not. UserId 378. |
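
As a follow-up to the log excerpt from 26 Mar 2025: a BOINC-valid task can still carry a failed payload, so it may help to check the `job:` lines directly. The sketch below is only an illustration. It assumes the log keeps the `job: key=value` shape shown in that post, the sample is abridged from it, and the function names `parse_job_lines` and `payload_failed` are made up for this example.

```python
import re

# Abridged sample copied from the log quoted in the thread (26 Mar 2025).
SAMPLE_LOG = """\
Mounted CVMFS in the container.
job: htmld=/var/www/lighttpd
job: unpack exitcode=0
job: run exitcode=1
job: diskusage=6828
job: logsize=16 k
job: cpuusage=27
Job Finished
"""

# Assumption: every line of interest looks like "job: <key>=<value>".
JOB_LINE = re.compile(r"^job:\s*(?P<key>[^=]+)=(?P<value>.*)$")


def parse_job_lines(log_text):
    """Collect 'job: key=value' lines into a dict of stripped strings."""
    fields = {}
    for line in log_text.splitlines():
        match = JOB_LINE.match(line.strip())
        if match:
            fields[match.group("key").strip()] = match.group("value").strip()
    return fields


def payload_failed(fields):
    """Return True when the scientific app reported a non-zero 'run exitcode'."""
    return fields.get("run exitcode", "0") != "0"


if __name__ == "__main__":
    parsed = parse_job_lines(SAMPLE_LOG)
    print("run exitcode:", parsed.get("run exitcode"))
    print("payload failed:", payload_failed(parsed))
```

Lines that do not start with `job:` (such as the `[runRivet]` banner) are simply skipped, so the same function could be pointed at a full runRivet.log if the format holds.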
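
Regarding the app_config.xml suggestion earlier in the thread: the post shows only the inner `<app_version>` element. The sketch below writes it out as a complete file, assuming (this is not stated in the thread) that BOINC expects it wrapped in an `<app_config>` root element. The helper name `build_app_config` is hypothetical, and the values are the ones from the post.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name="Theory", plan_class="docker", avg_ncpus=2):
    """Return app_config.xml content as a string.

    Assumption: the <app_version> element from the post sits inside an
    <app_config> root element.
    """
    root = ET.Element("app_config")
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "plan_class").text = plan_class
    ET.SubElement(app_version, "avg_ncpus").text = str(avg_ncpus)
    # ET.indent exists from Python 3.9 on; older versions emit one long line.
    if hasattr(ET, "indent"):
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_app_config())
```

With `avg_ncpus` set to 2, the client budgets two cores per Docker task, which matches the observation that each task runs two processes (the generator and rivetvm).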
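
The CPU arithmetic from the 27 Mar 2025 post (25% of 4 cores, 20% of 12 cores, 20% of 16 cores) can be reproduced with a few lines. This is a sketch under the assumption the post's numbers imply: the host-wide figure is a percentage of all cores, while the per-task figure is a percentage of a single core. The function names are made up for this example.

```python
def cores_in_use(host_cpu_percent, host_cores):
    """Convert a host-wide CPU percentage into a number of busy cores."""
    return host_cpu_percent * host_cores / 100


def per_task_percent(host_cpu_percent, host_cores, n_tasks):
    """Per-task load as a percentage of ONE core, assuming equal sharing."""
    return host_cpu_percent * host_cores / n_tasks


if __name__ == "__main__":
    # (host-wide %, cores, running tasks) as discussed in the thread.
    scenarios = [(25, 4, 1), (20, 12, 2), (20, 16, 2)]
    for pct, cores, tasks in scenarios:
        busy = cores_in_use(pct, cores)
        each = per_task_percent(pct, cores, tasks)
        print(f"{pct}% of {cores} cores = {busy:.1f} cores, "
              f"{tasks} task(s) at {each:.0f}% each")
```

A per-task value above 100% is expected here, since each task runs two processes that can occupy more than one core between them.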
©2025 CERN