Message boards : Theory Application : New cranky version explained

computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 478
Credit: 394,720
RAC: 473
Message 8171 - Posted: 7 Sep 2023, 7:41:07 UTC
Last modified: 8 Sep 2023, 20:41:34 UTC

New cranky version explained


Legacy version

Cranky is an interface script between BOINC and a software container app (runc) that runs complex scientific processes (Theory Simulation) in a native Linux environment (that is, without VirtualBox).

Its most recent major change introduced suspend/resume, which BOINC users expect but the scientific processes do not support.
That suspend/resume method is based on a freeze/thaw request to runc, which forwards it to systemd.
The method works fine with cgroups v1 but requires distinct cgroup directories to be prepared in advance by the computer's admin (root).
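
To see which cgroup setup a host is using, the filesystem type mounted at /sys/fs/cgroup can be inspected. The following is only a minimal sketch: cgroup_mode_from_fstype is a hypothetical helper name and not part of cranky, and a tmpfs result does not distinguish pure v1 from hybrid mode.

```shell
#!/bin/sh
# Hypothetical helper (not part of cranky): map the filesystem type
# mounted at /sys/fs/cgroup to a cgroup mode.
cgroup_mode_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;
        tmpfs)     echo "v1 or hybrid" ;;
        *)         echo "unknown" ;;
    esac
}

# GNU stat: -f reports on the filesystem, %T prints its type by name.
# Falls back to "none" if the path does not exist or stat fails.
fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || echo "none")
echo "cgroup mode: $(cgroup_mode_from_fstype "$fstype")"
```

A result of "v2 (unified)" means the host uses cgroups v2 only.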



New version

Since all major Linux distributions nowadays use cgroups v2 or a hybrid v1/v2 mode (not recommended, even deprecated), cranky needs to be made ready for cgroups v2.
The most natural point for the split is to keep the scientific container under the control of runc and move the cgroups interface (including suspend/resume) to systemd.
This results in a single command line like this (within cranky):
sudo [sudo options] systemd-run [systemd options] runc [runc options] [container]
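
As a rough illustration of that command shape, here is a dry-run sketch that only prints commands and executes nothing. The unit name, the container id and the chosen options are assumptions made for this example; the options cranky really passes are not listed in this post. The freeze/thaw lines assume a systemd version that provides those systemctl verbs.

```shell
#!/bin/sh
# Hypothetical names for illustration only; cranky's real unit name,
# container id and options are not shown in this post.
unit="theory-demo-task.scope"
container="demo-container"

# Build the nested command: sudo -> systemd-run (transient scope) -> runc.
# $1 = scope unit name, $2 = container id
build_cmd() {
    printf 'sudo -n systemd-run --scope --unit=%s runc run %s\n' "$1" "$2"
}

# Dry run: print what would be executed.
echo "start:   $(build_cmd "$unit" "$container")"
echo "suspend: sudo -n systemctl freeze $unit"
echo "resume:  sudo -n systemctl thaw $unit"
```

The point of the nesting is that systemd owns the cgroup of the scope, so suspend/resume can be requested from systemd while runc only manages the container.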

Similar to the legacy version the new cranky requires permission to access cgroup's freezer (now v2).
In addition it requires permission to create a temporary systemd scope per task via systemd-run.
On Linux, a standard method to grant such permissions is sudo, which checks configuration files (sudoers files) that contain the permission definitions.

The new cranky version comes with a setup script (must be run once) that creates a well-formed sudoers file, saves it to the right place and sets the correct access rights. Using that script instead of creating the sudoers file manually is highly recommended, since a typo or wrong permission settings would cause sudo to reject the commands issued from within cranky.

Since the sudoers file makes use of regular expressions, the sudo version must be at least 1.9.10. Older sudo versions do not support regular expressions.
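
The installed sudo version can be compared against that minimum before running the setup script. This is a minimal sketch, assuming GNU sort (for its -V version ordering); version_ge is a hypothetical helper name and not something shipped with cranky.

```shell
#!/bin/sh
# version_ge A B: succeeds if version A is greater than or equal to B.
# Relies on GNU sort -V (version sort); hypothetical helper for this example.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="1.9.10"

# The first line of "sudo -V" has the form "Sudo version <version>".
installed=$(sudo -V 2>/dev/null | sed -n '1s/^Sudo version //p')

if [ -z "$installed" ]; then
    echo "sudo not found"
elif version_ge "$installed" "$required"; then
    echo "sudo $installed: new enough for regular expressions in sudoers"
else
    echo "sudo $installed: older than $required"
fi
```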

This one-liner gets the setup script from CERN and executes it.
In case of any errors, post them here.
sudo /bin/bash -c "export script=\"prepare_theory_native_environment\" && wget https://lhcathomedev.cern.ch/lhcathome-dev/download/$script -O /tmp/$script && chmod u+x /tmp/$script && /tmp/$script && rm /tmp/$script"

<edit> Corrected one-liner with each '$' escaped. Use this one instead of the version above.
sudo /bin/bash -c "export script=\"prepare_theory_native_environment\" && wget https://lhcathomedev.cern.ch/lhcathome-dev/download/\$script -O /tmp/\$script && chmod u+x /tmp/\$script && /tmp/\$script && rm /tmp/\$script"
</edit>
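
Why the escaping matters: inside double quotes, the calling shell expands $script before bash -c ever sees the command string, and at that point the variable is still unset. A small demonstration, with the function names unescaped and escaped made up for this example:

```shell
#!/bin/sh
unset script

# Unescaped: the calling shell replaces $script (empty here) while
# building the string, so the inner shell only sees "echo /tmp/".
unescaped() { bash -c "script=demo && echo /tmp/$script"; }

# Escaped: \$script reaches the inner shell literally and is expanded
# there, after script=demo has been executed.
escaped() { bash -c "script=demo && echo /tmp/\$script"; }

echo "unescaped: $(unescaped)"
echo "escaped:   $(escaped)"
```

With the unescaped form, the download URL and all /tmp paths in the first one-liner end up without the script name.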


What happens on systems that do not meet the requirements?

New cranky will try to run the task in legacy mode.
Be aware that this mode will not be developed further, since cgroups v1 will eventually disappear.


Minor changes

New cranky prints more information to the logfile (stderr.txt).
This allows users to see whether basic requirements are missing or which options are recommended, e.g. for the local CVMFS client.
It also prints a hint on how to get information about the running task via systemctl; just copy and paste the command.


Microsoft Windows WSL2

According to Microsoft, Linux guests under WSL2 can be configured to use systemd as the init process.
It may be worth testing whether Theory native can run in such an environment.
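
A quick way to check from inside the guest whether systemd is the running init is to look for the directory /run/systemd/system, which systemd creates at boot. This is a minimal sketch; systemd_is_init is a hypothetical helper name, and its optional path argument exists only to make the function easy to try out.

```shell
#!/bin/sh
# Succeeds if the given directory exists (default: /run/systemd/system,
# which is present only when systemd is the running init).
# Hypothetical helper for this example.
systemd_is_init() {
    [ -d "${1:-/run/systemd/system}" ]
}

if systemd_is_init; then
    echo "systemd is running as init"
else
    echo "systemd is not running as init"
fi
```

On WSL2, systemd is enabled by adding a [boot] section with systemd=true to /etc/wsl.conf inside the guest and then restarting WSL with "wsl --shutdown" from Windows.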
m
Volunteer tester

Joined: 20 Mar 15
Posts: 243
Credit: 886,442
RAC: 181
Message 8172 - Posted: 7 Sep 2023, 10:27:45 UTC - in response to Message 8171.  
Last modified: 7 Sep 2023, 10:29:40 UTC

New cranky version explained

Legacy version
Its most recent major change introduced suspend/resume, which BOINC users expect but the scientific processes do not support.
That suspend/resume method is based on a freeze/thaw request to runc, which forwards it to systemd.
The method works fine with cgroups v1 but requires distinct cgroup directories to be prepared in advance by the computer's admin (root).

I, and I'm sure others, have put a fair bit of effort into arranging things around the lack of suspend/resume for certain applications (Theory, Atlas and others), and I am happy with that, so I have not needed cgroups etc. The system has worked well for a long time. This latest change therefore involves a major update/reconfiguration for no useful purpose (even assuming suspend/resume will now work to disk, allowing the host to be shut down). So, unless there is some sort of fallback option to a "basic, non-suspendable" version (it looks as though any of the previous versions would do), then, sorry to say, it's a show stopper for me, at least in the short to medium term.

What happens on systems that do not meet the requirements?
New cranky will try to run the task in legacy mode.
Be aware that there is no further development to improve that mode since cgroups v1 will disappear sometime.

Didn't work for me.

