1) Message boards : ATLAS Application : ATLAS vbox v.1.27 (Message 7900)
Posted 30 Nov 2022 by David Cameron
Post:
v1.27 contains a very minor change to pass information from the bootstrap script to the wrapper script.
2) Message boards : ATLAS Application : ATLAS vbox v.1.26 (Message 7890)
Posted 24 Nov 2022 by David Cameron
Post:
ATLAS vbox 1.26 was just released. It contains some small improvements to the handling of error conditions in the bootstrap script. Versions 1.20 to 1.25 were already taken by native versions, which is why the numbering jumps from 1.19 :)
3) Message boards : ATLAS Application : ATLAS vbox v.1.19 (Message 7889)
Posted 22 Nov 2022 by David Cameron
Post:
v1.19 was just released. It contains some improvements to the bootstrap scripts and CVMFS configuration.
4) Message boards : ATLAS Application : ATLAS vbox v.1.18 (Message 7882)
Posted 16 Nov 2022 by David Cameron
Post:
ATLAS v1.18 is now released. This version uses the new vboxwrapper version 26206 and also contains various CVMFS configuration improvements made by computezrmle.
5) Message boards : ATLAS Application : ATLAS vbox v.1.17 (Message 7826)
Posted 19 Oct 2022 by David Cameron
Post:
Hi all,

v1.17 of vbox contains some fixes for CVMFS configuration provided by computezrmle, which should address some of the problems people see with CVMFS getting stuck or not working at the start of tasks. We are testing it here on dev to make sure it works OK before releasing it on the prod server.
6) Message boards : ATLAS Application : ATLAS native 1.25 (Message 7780)
Posted 5 Sep 2022 by David Cameron
Post:
This seemed to solve the problems with tmp dirs so this version is now deployed on the production server.
7) Message boards : ATLAS Application : ATLAS native 1.23 (Message 7770)
Posted 31 Aug 2022 by David Cameron
Post:
I wonder if this could be a side effect of hardening options set in BOINC's systemd service unit.

I have not tested it yet, but the tmp dir forwarded to apptainer should not be the system-wide tmp.
The tmp below the slot should be used instead.


Thanks for this tip, it looks like this is indeed the problem. The unit file has

ProtectSystem=strict
ReadWritePaths=-/var/lib/boinc -/etc/boinc-client

which makes /tmp and /var/tmp read-only.

In v1.25 I set APPTAINERENV_TMPDIR to a dir inside the slots and this seems to fix the problem.
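
For anyone curious what that looks like in practice, here is a minimal Python sketch of the idea, not the actual wrapper code. It assumes the slot directory is known and uses a hypothetical subdirectory name "tmp"; the function name slot_tmp_env is also made up for illustration. Apptainer passes variables prefixed with APPTAINERENV_ into the container with the prefix stripped.

```python
import os
import tempfile
from pathlib import Path


def slot_tmp_env(slot_dir, base_env=None):
    """Return a copy of the environment with APPTAINERENV_TMPDIR inside slot_dir."""
    env = dict(os.environ if base_env is None else base_env)
    # "tmp" is a hypothetical name; the real wrapper may use something else.
    tmp_dir = Path(slot_dir) / "tmp"
    tmp_dir.mkdir(parents=True, exist_ok=True)
    # Apptainer exposes APPTAINERENV_TMPDIR to the container as TMPDIR.
    env["APPTAINERENV_TMPDIR"] = str(tmp_dir)
    return env


if __name__ == "__main__":
    # A throwaway directory stands in for a slot such as /var/lib/boinc/slots/0.
    with tempfile.TemporaryDirectory() as fake_slot:
        env = slot_tmp_env(fake_slot)
        print(env["APPTAINERENV_TMPDIR"])
```

Because the slot lives under /var/lib/boinc, which the unit file lists in ReadWritePaths, a tmp dir created there stays writable under ProtectSystem=strict.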
8) Message boards : ATLAS Application : ATLAS native 1.25 (Message 7769)
Posted 31 Aug 2022 by David Cameron
Post:
This version sets TMPDIR to a directory inside the slots dir instead of /tmp.
9) Message boards : ATLAS Application : ATLAS native 1.24 (Message 7764)
Posted 23 Aug 2022 by David Cameron
Post:
This version adds some debugging statements to try to figure out the problems with read-only tmp dirs.
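
As an illustration of the kind of check that helps here, the following Python sketch probes whether a directory accepts a new file. This is an assumption about what a useful debugging statement could look like, not the statements added in 1.24; tmp_dir_status is a hypothetical helper name.

```python
import tempfile


def tmp_dir_status(path):
    """Try to create (and immediately delete) a file in path and report the result."""
    try:
        with tempfile.NamedTemporaryFile(dir=path, prefix="probe_"):
            pass
        return "writable"
    except OSError as exc:
        # A read-only mount raises OSError with errno EROFS.
        return f"not writable: {exc.strerror}"


if __name__ == "__main__":
    for candidate in ("/tmp", "/var/tmp"):
        print(candidate, "->", tmp_dir_status(candidate))
```

Actually creating a file is more reliable than checking permission bits, since a read-only mount can still show write permission on the directory.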
10) Message boards : ATLAS Application : 8 core atlas native uses only 1 core. (Message 7763)
Posted 23 Aug 2022 by David Cameron
Post:
Also note that the tasks running here are very short and only process 2 events compared to 200 per task in production. This is to test things with a quick turnaround and not waste people's resources producing data that is not useful for science. Since the events are split between cores this means these short tasks will never use more than 2 CPUs.
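
The arithmetic can be sketched in a couple of lines of Python. This assumes each event is processed by a single core at a time, which is a simplification of how the events are split; max_busy_cores is a made-up name.

```python
def max_busy_cores(n_events, n_cores):
    """Upper bound on cores doing useful work when each event occupies one core."""
    return min(n_events, n_cores)


print(max_busy_cores(2, 8))    # dev test task: 2 events on 8 cores -> 2
print(max_busy_cores(200, 8))  # production-sized task: 200 events on 8 cores -> 8
```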
11) Message boards : ATLAS Application : ATLAS native 1.23 (Message 7754)
Posted 18 Aug 2022 by David Cameron
Post:
This version explicitly mounts /tmp and /var/tmp into the container, to see if this fixes the errors seen in production.
12) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7752)
Posted 18 Aug 2022 by David Cameron
Post:
It looks like there are a lot of failures with this version that were not picked up in testing, so I reverted it in production and will try to debug here.

On one of my own hosts I have a mix of successful (https://lhcathome.cern.ch/lhcathome/result.php?resultid=363399068) and failed (https://lhcathome.cern.ch/lhcathome/result.php?resultid=363399242) tasks.

The change in bind mounts seems to make some tmp directories read-only giving errors like:

Failed to execute payload:mktemp: failed to create file via template '/tmp/asetup_XXXXXX.sh': Read-only file system
13) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7749)
Posted 18 Aug 2022 by David Cameron
Post:
Thanks for all the testing and feedback here. I just released this version as v2.90 on the production server.
14) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7748)
Posted 18 Aug 2022 by David Cameron
Post:
Thanks, so this confirms that the changes in 1.22 fix the errors seen in production. I will try to deploy this to production today.
15) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7741)
Posted 17 Aug 2022 by David Cameron
Post:
I already have ATLAS native running and would be happy to help test apptainer here.

At the moment, singularity is installed on my Ubuntu 20.04.4 LTS.

Can you please give me the exact instructions for installing apptainer on these boxes?

Is Apptainer the same as CentOS..... ?

yeti


Hi Yeti,

Apptainer from CVMFS works on Ubuntu, at least on one of my machines with Ubuntu 21.10. So you should not have to install anything locally, just let the tasks use the version from CVMFS. The fallback to local singularity is only in case apptainer from CVMFS does not work.
16) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7734)
Posted 16 Aug 2022 by David Cameron
Post:
I have also tested with CentOS Stream 9 and it works fine with CVMFS and BOINC installed from standard packages and apptainer from CVMFS: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3108671
17) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7733)
Posted 15 Aug 2022 by David Cameron
Post:
This fixes the problem for one of my computers.
18) Message boards : ATLAS Application : ATLAS native 1.22 (Message 7732)
Posted 15 Aug 2022 by David Cameron
Post:
Version 1.22 attempts to fix errors like "failed to create /var/lib/condor directory: mkdir /var/lib/condor: permission denied", which are seen in some situations with certain apptainer versions.

The change is to mount only the current working directory (e.g. /var/lib/boinc/slots/0) into the container rather than the top-level directory (e.g. /var).
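
To make the difference concrete, here is a small Python sketch that builds an apptainer command line binding only the working directory. It is an illustration under assumptions, not the real wrapper: the function name, the image path and the payload command are all hypothetical. Only the "apptainer exec --bind" form is taken from apptainer itself.

```python
import os


def apptainer_exec_args(image, payload, work_dir=None):
    """Build an 'apptainer exec' command that binds only work_dir into the container."""
    work_dir = os.path.abspath(work_dir or os.getcwd())
    # --bind with a single path mounts it at the same location inside the container.
    return ["apptainer", "exec", "--bind", work_dir, image, *payload]


if __name__ == "__main__":
    # Image path and payload are placeholders for illustration only.
    args = apptainer_exec_args(
        "/path/to/image", ["sh", "run_payload.sh"], "/var/lib/boinc/slots/0"
    )
    print(" ".join(args))
```

Binding the slot directory instead of /var means apptainer never has to create anything under /var/lib outside the slot, which is where the permission error came from.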
19) Message boards : ATLAS Application : ATLAS native 1.20 (Message 7726)
Posted 12 Aug 2022 by David Cameron
Post:
I made a silly mistake when releasing this version: I forgot to make the wrapper script executable, so all tasks were failing. Version 1.21 fixes this.
20) Message boards : ATLAS Application : ATLAS native 1.20 (Message 7725)
Posted 11 Aug 2022 by David Cameron
Post:
ATLAS native 1.20 was just released. It uses apptainer instead of singularity. At the moment apptainer functionality is identical to singularity. If apptainer does not work there is still a fallback to singularity, so tasks should work as normal for those who have a locally installed singularity. This fallback will be removed at some point in the future, so we recommend that people who cannot use singularity/apptainer from CVMFS install a local version of apptainer instead of singularity. apptainer provides a backwards-compatible "singularity" command, so installing it will not break production tasks that still rely on singularity.
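
The fallback order can be sketched roughly as below. This is a simplified Python illustration, not the wrapper's logic: it only checks that an executable exists, whereas the real check of whether apptainer "works" is presumably stricter. The function name and the CVMFS path shown are placeholders, not the real location.

```python
import shutil


def pick_container_runtime(candidates, is_usable=shutil.which):
    """Return the first candidate that resolves to an executable, or None."""
    for candidate in candidates:
        resolved = is_usable(candidate)
        if resolved:
            return resolved
    return None


if __name__ == "__main__":
    # Preferred first: apptainer from CVMFS (placeholder path), then local installs.
    order = ["/path/on/cvmfs/apptainer", "apptainer", "singularity"]
    runtime = pick_container_runtime(order)
    print(runtime or "no container runtime found")
```

With a local apptainer installed, the last entry would resolve too, since apptainer ships a "singularity" command.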



