ATLAS native 1.22
Author | Message |
---|---|
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | Version 1.22 attempts to fix errors like "failed to create /var/lib/condor directory: mkdir /var/lib/condor: permission denied", which are seen in some situations with certain apptainer versions. The change is to mount only the current working directory (e.g. /var/lib/boinc/slots/0) into the container rather than the top-level directory (e.g. /var). |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | This fixes the problem for one of my computers. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | I have also tested with CentOS Stream 9 and it works fine with CVMFS and BOINC installed from standard packages and apptainer from CVMFS: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3108671 |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | This sounds good. For the last two days I got no tasks for the CentOS8-VM. My fault, I had not made native available. Edit: on the CentOS9-VM with epel-release-9-3.el9, sudo yum install -y cvmfs had no success, and sudo yum install boinc-client boinc-manager had no success either. On the CentOS8-VM, singularity is installed locally: https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4354 |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | Edit: for the CentOS9-VM (epel-release-9-3.el9), the download must be https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest.noarch.rpm (my second fault for today, sorry). cvmfs now works on the CentOS9-VM! Tomorrow I will take a deeper look at installing BOINC (pre-release 7.20.2 in the CentOS9-VM). |
Joined: 29 May 15 Posts: 147 Credit: 2,842,484 RAC: 0 | I already have Atlas-Native running and would be happy to help with testing apptainer here. At the moment, singularity is installed on my Ubuntu 20.04.4 LTS. Can you please tell me the exact instructions for installing apptainer on the boxes? Is Apptainer the same as CentOS..... ? yeti |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | Apptainer is the new application that replaces singularity. David posted some instructions here and in production today. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | No success with BOINC 7.20.2 (pre-release). The EPEL repo for CentOS Stream 9 comes from the link on the Red Hat Customer Portal (sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm). The BOINC pre-release is now installed on one Threadripper 3995wx, but neither lhc@home nor lhc@home-dev can be reached. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | cvmfs is now active in the CentOS9-VMs on both Threadripper 3995wx machines. The first -native task from production is now running: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10806627 Now -dev is integrated as well: https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=4689 But there is only one CPU for this CentOS9-VM, so I have to wait for the -native task from production to finish. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | "I already have Atlas-Native running and would be happy to help with testing apptainer here." Hi Yeti, Apptainer from CVMFS works on Ubuntu, at least on one of my machines with Ubuntu 21.10. So you should not have to install anything locally, just let the tasks use the version from CVMFS. The fallback to local singularity is only used in case apptainer from CVMFS does not work. |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | Both have identical epel-release downloads. One CentOS9-VM has BOINC pre-release 7.20.2 installed, but the other CentOS9-VM doesn't install it? |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=4690 This is now the second CentOS9-VM. I installed BOINC from the link at dl...fedoraproject....next.... |
Joined: 29 May 15 Posts: 147 Credit: 2,842,484 RAC: 0 | So far, I don't get any ATLAS WUs. My BOINC client is 7.16.6. Is this recent enough, or do I need 7.20.x? |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | BOINC is OK, but there are only small numbers of -native tasks. I am also seeing no new tasks. |
Joined: 29 May 15 Posts: 147 Credit: 2,842,484 RAC: 0 | Meanwhile I have got 18 WUs, and it looks to me as if they have all run fine so far: https://lhcathomedev.cern.ch/lhcathome-dev/results.php?userid=250 |
Joined: 22 Apr 16 Posts: 677 Credit: 2,002,766 RAC: 3 | I made a test in production with a CentOS9-VM and NO singularity installed. I got this error: [2022-08-18 09:01:27] CVMFS is ok [2022-08-18 09:01:27] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 [2022-08-18 09:01:27] Checking for singularity binary... [2022-08-18 09:01:27] which: no singularity in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin) [2022-08-18 09:01:27] Singularity is not installed, using version from CVMFS [2022-08-18 09:01:27] Checking singularity works with /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname [2022-08-18 09:01:39] TRCOS9 [2022-08-18 09:01:39] Singularity works [2022-08-18 09:01:42] Starting ATLAS job with PandaID=5565589513 [2022-08-18 09:01:42] Running command: /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec --pwd /var/lib/boinc/slots/0 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh [2022-08-18 09:01:42] Job failed [2022-08-18 09:01:42] FATAL: container creation failed: hook function for tag prelayer returns error: failed to create /var/lib/condor directory: mkdir /var/lib/condor: read-only file system [2022-08-18 09:01:42] ./runtime_log [2022-08-18 09:01:42] ./runtime_log.err 09:11:42 (665950): run_atlas exited; CPU time 0.198227 09:11:42 (665950): app exit status: 0x1 09:11:42 (665950): called boinc_finish(195) After installing singularity it now works. OK, this was only a test with production. The new version in -dev runs well, with or without singularity installed in the CentOS9-VM. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | Thanks, so this confirms that the changes in 1.22 fix the errors seen in production. I will try to deploy this to production today. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | Thanks for all the testing and feedback here. I have just released this version as v2.90 on the production server. |
Joined: 20 Apr 16 Posts: 180 Credit: 1,355,327 RAC: 0 | It looks like there are a lot of failures with this version that were not picked up in testing, so I reverted it in production and will try to debug here. On one of my own hosts I have a mix of successful (https://lhcathome.cern.ch/lhcathome/result.php?resultid=363399068) and failed (https://lhcathome.cern.ch/lhcathome/result.php?resultid=363399242) tasks. The change in bind mounts seems to make some tmp directories read-only, giving errors like: Failed to execute payload:mktemp: failed to create file via template '/tmp/asetup_XXXXXX.sh': Read-only file system |
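
As a rough illustration of the bind-mount change discussed in this thread, here is a minimal shell sketch. It is not the actual ATLAS wrapper code, which is not shown in the thread; the function names bind_list_top_level, bind_list_workdir_only and can_create_tmp_file are made up for this example. The first two functions build the comma-separated list that would be passed to -B (the production log shows -B /cvmfs,/var), and the third probes whether a temporary file can be created in a directory, which is the operation that failed with "Read-only file system".

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; none of these names come
# from the real ATLAS wrapper script.

# Bind list as seen in the production log: /cvmfs plus the top-level
# directory of the working directory (e.g. /var for /var/lib/boinc/slots/0).
bind_list_top_level() {
    workdir=$1
    rest=${workdir#/}      # drop the leading slash: var/lib/boinc/slots/0
    top="/${rest%%/*}"     # keep only the first path component: /var
    printf '/cvmfs,%s\n' "$top"
}

# Bind list as described for 1.22: /cvmfs plus only the working directory.
bind_list_workdir_only() {
    printf '/cvmfs,%s\n' "$1"
}

# Return 0 if a temporary file can be created in the given directory,
# non-zero otherwise (for example on a read-only file system or when the
# directory does not exist).
can_create_tmp_file() {
    dir=$1
    probe=$(mktemp "${dir}/probe_XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
    return 0
}
```

For /var/lib/boinc/slots/0 the first function prints /cvmfs,/var and the second prints /cvmfs,/var/lib/boinc/slots/0. Running something like can_create_tmp_file /tmp inside the container might be one way to check for the read-only problem before the payload starts, though whether that matches the real failure path is an assumption.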
©2024 CERN