Message boards : ATLAS Application : Native app using Singularity from CVMFS
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6520 - Posted: 8 Aug 2019, 12:21:20 UTC
Last modified: 12 Aug 2019, 14:19:09 UTC

(EDIT: instructions updated following comments below. Thanks to everyone who tested this out!)

ATLAS now maintains its own deployment of Singularity in the atlas.cern.ch CVMFS repository. This means that in theory volunteers running the native app no longer have to install Singularity themselves and the only requirement is CVMFS.

To see if this works I've created native version 0.70. If Singularity is not detected it will fall back to using the version on CVMFS. I have tested this version myself and it works OK, but it would be good if some of you could try running native tasks on machines without Singularity and share your experiences.

The version in CVMFS should work on most operating systems but one important point to note is that it requires user namespaces to be enabled and this is not done by default on some platforms (e.g. CentOS 7). You can easily test this by trying this:

# /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
ERROR  : Failed to create user namespace


If you see this error, run the following as root:

On RedHat-like machines:

echo "user.max_user_namespaces = 15000" > /etc/sysctl.d/90-max_user_namespaces.conf
sysctl -p /etc/sysctl.d/90-max_user_namespaces.conf


On Debian-like machines:

sed -i '$ a\kernel.unprivileged_userns_clone = 1' /etc/sysctl.conf
sysctl -p


On some other systems a change in kernel arguments is required followed by a reboot:

grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
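After applying one of the settings above, the relevant kernel knobs can be inspected before rerunning the Singularity test. The following is a minimal sketch and not part of the native wrapper: the function names userns_sysctl_report and userns_smoke_test are made up for illustration, and it assumes unshare from util-linux is available. Not every kernel exposes both files, so a missing one is reported rather than treated as an error.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ATLAS native wrapper.

# Prints the user-namespace sysctls found under a given /proc/sys
# directory (defaults to the real one).
userns_sysctl_report() {
    procsys="${1:-/proc/sys}"
    for knob in user/max_user_namespaces kernel/unprivileged_userns_clone; do
        if [ -r "$procsys/$knob" ]; then
            echo "$knob = $(cat "$procsys/$knob")"
        else
            echo "$knob = (not present)"
        fi
    done
}

# Tries to create a user namespace as the current user.
userns_smoke_test() {
    if unshare --user true 2>/dev/null; then
        echo "user namespaces: OK"
    else
        echo "user namespaces: FAILED"
    fi
}

userns_sysctl_report
userns_smoke_test
```

If the smoke test fails even though the sysctl values look right, the kernel argument route described above may be the one that applies.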


The native wrapper also checks that the container is usable before starting the tasks and prints information to the stderr log. All feedback is welcome!
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6521 - Posted: 8 Aug 2019, 18:18:34 UTC - in response to Message 6520.  
Last modified: 8 Aug 2019, 18:31:03 UTC

It is looking good!

Tested it on a system with no Singularity installed. The logs show:
...
Checking for CVMFS
CVMFS is installed
OS:cat: /etc/redhat-release: Datei oder Verzeichnis nicht gefunden

This is not SLC6, need to run with Singularity....
Checking Singularity...
Singularity is not installed, using version from CVMFS
Testing the function of Singularity...
Checking singularity with cmd:/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
Singularity Works...
...

The task is currently running, and there should be no reason why it should fail.

N.B.: The command
 /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity version
shows 3.2.1-1. The most current version of singularity is 3.3.0.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6522 - Posted: 8 Aug 2019, 18:21:06 UTC
Last modified: 8 Aug 2019, 18:55:10 UTC

I have a CentOS 7 machine running ATLAS native in production.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10594509

echo 640 > /proc/sys/user/max_user_namespaces is set because of Theory native. Also openhtc.io...

The task says CVMFS is not installed?
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2797688
Edit: This computer has both projects, production and -dev, attached in BOINC.
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6523 - Posted: 9 Aug 2019, 4:58:43 UTC - in response to Message 6521.  

The task is currently running, and there should be no reason why it should fail.
It finished successfully:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2797696
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6524 - Posted: 9 Aug 2019, 13:03:51 UTC - in response to Message 6523.  

The task is currently running, and there should be no reason why it should fail.
It finished successfully:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2797696


Great! Thanks a lot for testing it. I'll wait for more feedback before deploying in production. One nice thing about this is that the requirements for ATLAS native and Theory native will be the same.

I just released version 0.71 with some cosmetic changes and the fix for the problem discussed here
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6525 - Posted: 9 Aug 2019, 17:47:19 UTC

What if Singularity is already installed? One task tested, and it failed: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2798232
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6527 - Posted: 9 Aug 2019, 23:26:22 UTC - in response to Message 6524.  
Last modified: 9 Aug 2019, 23:31:53 UTC

One nice thing about this is that the requirements for ATLAS native and Theory native will be the same.

The best ever... but Laurence has the BOINC VM app in testing here in -dev ;-))
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6528 - Posted: 10 Aug 2019, 8:09:07 UTC - in response to Message 6525.  

What if singularity is already installed. One task tested and that failed: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2798232
Are you sure that your local singularity version is working? BTW, your singularity version is quite old.

Tested the new version with singularity installed and everything is working fine. The log shows:

Checking for CVMFS
CVMFS is installed
OS:cat: /etc/redhat-release: Datei oder Verzeichnis nicht gefunden

This is not SLC6, need to run with Singularity....
Checking Singularity...
Singularity is installed, version singularity version 3.3.0-614.gf0cd4b488
Testing the function of Singularity...
Checking singularity with cmd:singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
Singularity Works...

copy /home/boinc/boinc1/slots/0/shared/ATLAS.root_0
copy /home/boinc/boinc1/slots/0/shared/RTE.tar.gz
copy /home/boinc/boinc1/slots/0/shared/input.tar.gz
copy /home/boinc/boinc1/slots/0/shared/start_atlas.sh
export ATHENA_PROC_NUMBER=2;start atlas job with grep: pandaJobData.out: Datei oder Verzeichnis nicht gefunden
cmd = singularity exec --pwd /home/boinc/boinc1/slots/0 -B /cvmfs,/home /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 sh start_atlas.sh > runtime_log 2> runtime_log.err
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6529 - Posted: 10 Aug 2019, 10:41:28 UTC - in response to Message 6528.  

What if singularity is already installed. One task tested and that failed: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2798232
Are you sure that your local singularity version is working?
Theory native tasks from both the LHC-dev and LHC-production sites were working well.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6530 - Posted: 10 Aug 2019, 20:59:59 UTC
Last modified: 10 Aug 2019, 21:23:10 UTC

After downloading 260 MByte at 95 kbps,
stderr.txt from SL 7.6 shows:
22:49:01 (29663): wrapper (7.7.26015): starting
22:49:01 (29663): wrapper: running run_atlas (--nthreads 1)
singularity image is /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6
sys.argv = ['run_atlas', '--nthreads', '1']
THREADS=1
Checking for CVMFS
check cvmfs return values are 0, 256
CVMFS not found, aborting the job

running start_atlas return value is 1

Yet cvmfs_config probe is OK, and cvmfs_config showconfig -s is also OK?
Probing /cvmfs/atlas.cern.ch... OK
Probing /cvmfs/grid.cern.ch... OK
Probing /cvmfs/cernvm-prod.cern.ch... OK
Probing /cvmfs/sft.cern.ch... OK
Probing /cvmfs/alice.cern.ch... OK

cvmfs_config stat atlas.cern.ch
VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
2.6.0.0 2689 4 28976 52686 3 1 633009 4194304 0 65024 0 0 n/a 20106 4908 http://s1cern-cvmfs.openhtc.io/cvmfs/atlas.cern.ch DIRECT 1
cvmfs_config stat atlas-condb.cern.ch
atlas-condb.cern.ch not mounted

Tomorrow I will check default.local.
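By default, cvmfs_config probe only covers the repositories listed in the local configuration, so a repository that a job needs can be missing even when every probe reports OK. A minimal sketch for checking specific repositories one by one follows. The function name check_cvmfs_repos is hypothetical, and the repository list is an assumption based on the output above, not the wrapper's actual list.

```shell
#!/bin/sh
# Hypothetical helper, not part of the ATLAS native wrapper.
# Lists each repository directory under a CVMFS mount root.
# With autofs, listing the directory triggers the mount.
check_cvmfs_repos() {
    root="$1"; shift
    rc=0
    for repo in "$@"; do
        if ls "$root/$repo" >/dev/null 2>&1; then
            echo "$repo: OK"
        else
            echo "$repo: NOT ACCESSIBLE"
            rc=1
        fi
    done
    return "$rc"
}

# Repository names taken from the output above; adjust as needed.
check_cvmfs_repos /cvmfs atlas.cern.ch atlas-condb.cern.ch \
    || echo "at least one repository is not accessible"
```

The function returns non-zero as soon as one repository cannot be listed, which makes it easy to use in a pre-flight script.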
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6531 - Posted: 11 Aug 2019, 6:52:31 UTC - in response to Message 6530.  
Last modified: 11 Aug 2019, 6:53:37 UTC

atlas-condb.cern.ch not being mounted was the reason this task failed.
That repository is now mounted, and this new task
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2798852
08:11:46 (3553): wrapper (7.7.26015): starting
08:11:46 (3553): wrapper: running run_atlas (--nthreads 1)
singularity image is /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6
sys.argv = ['run_atlas', '--nthreads', '1']
THREADS=1
Checking for CVMFS
CVMFS is installed
OS:Scientific Linux release 7.6 (Nitrogen)

This is not SLC6, need to run with Singularity....
Checking Singularity...
Singularity is installed, version 2.2.99
Testing the function of Singularity...
Checking singularity with cmd:singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
Singularity Works...

copy /root/Downloads/BOINC/slots/0/shared/ATLAS.root_0
copy /root/Downloads/BOINC/slots/0/shared/input.tar.gz
copy /root/Downloads/BOINC/slots/0/shared/RTE.tar.gz
copy /root/Downloads/BOINC/slots/0/shared/start_atlas.sh
start atlas job with grep: pandaJobData.out: Datei oder Verzeichnis nicht gefunden
cmd = singularity exec --pwd /root/Downloads/BOINC/slots/0 -B /cvmfs,/root /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 sh start_atlas.sh > runtime_log 2> runtime_log.err
running cmd return value is 32512
shows a Panda error. The VM is SL 7.6.
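The value 32512 looks like a raw wait status rather than an exit code. Assuming the wrapper reports what Python's os.system returns on Linux (an assumption; the sys.argv line in the log suggests a Python script), the exit code sits in the high byte: 32512 / 256 = 127, which shells use for "command not found". The same decoding turns the 256 from the earlier CVMFS check into exit code 1. A small sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: splits a 16-bit wait status into the exit code
# (high byte) and the terminating signal (low 7 bits).
decode_wait_status() {
    st="$1"
    echo "exit_code=$(( st >> 8 )) signal=$(( st & 127 ))"
}

decode_wait_status 32512   # prints: exit_code=127 signal=0
decode_wait_status 256     # prints: exit_code=1 signal=0
```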
Magic Quantum Mechanic

Joined: 8 Apr 15
Posts: 734
Credit: 11,558,055
RAC: 2,030
Message 6532 - Posted: 11 Aug 2019, 8:05:22 UTC

*File or directory not found*
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6535 - Posted: 11 Aug 2019, 19:57:53 UTC

I have set up a new VM with Debian 10, installed BOINC and compiled CVMFS. cvmfs_config probe shows all 'ok'.
To enable user namespaces, I used the following commands:
sudo sed -i '$ a\kernel.unprivileged_userns_clone = 1' /etc/sysctl.conf
sudo sysctl -p

Tested a 0.70 native task, but it failed because Singularity is not working:
THREADS=1
Checking for CVMFS
CVMFS is installed
OS:cat: /etc/redhat-release: No such file or directory

This is not SLC6, need to run with Singularity....
Checking Singularity...
Singularity is not installed, using version from CVMFS
Testing the function of Singularity...
Checking singularity with cmd:/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname

Singularity isnt working: 

running start_atlas return value is 3
Any ideas why it is failing?
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6538 - Posted: 12 Aug 2019, 8:42:25 UTC - in response to Message 6535.  

Can you run the command manually and post the output?

/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6539 - Posted: 12 Aug 2019, 8:47:33 UTC - in response to Message 6532.  

*File or directory not found*


This should have been fixed in 0.71. However, I see that I forgot to deprecate 0.70, so everyone is still getting that version. I've fixed that now. The error itself has no effect on the running task; it only affects the log message to stderr.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6540 - Posted: 12 Aug 2019, 13:20:58 UTC
Last modified: 12 Aug 2019, 13:26:41 UTC

CentOS 7 does not have user namespaces enabled by default. To enable them, I found this command:
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
A reboot is needed afterwards.

/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
Error: Failed to create mount namespace: Operation not permitted
David Cameron
Project administrator
Project developer
Project tester
Project scientist

Joined: 20 Apr 16
Posts: 180
Credit: 1,355,327
RAC: 0
Message 6541 - Posted: 12 Aug 2019, 14:13:34 UTC - in response to Message 6540.  

Thanks for the info, I will add this information to the original post. But which version of CentOS 7 do you have? As far as I know CentOS 7.6 doesn't require updating the kernel arguments or reboot. Setting a number for max user namespaces with sysctl as I showed in the original post should be enough.
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6542 - Posted: 12 Aug 2019, 14:33:20 UTC - in response to Message 6541.  

Yes David,
CentOS 7.6. But I have a problem getting -native (ATLAS and Theory) running.
No problem with SL 7.6.
gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 6543 - Posted: 12 Aug 2019, 16:08:39 UTC - in response to Message 6538.  
Last modified: 12 Aug 2019, 16:09:43 UTC

Can you run the command manually and post the output?

/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
This command gives no output.

My first thought was that user namespaces are not working properly. However, in the kernel config file it says "CONFIG_USER_NS=y", within the /etc/sysctl.conf file "kernel.unprivileged_userns_clone=1" is written, and the file /proc/sys/user/max_user_namespaces shows the value "19655".

Are there other parameters that have to be adapted in order for namespaces to work?

I am not sure, but I think that someone here or on the production site wrote that he had to compile a custom kernel for debian testing (which is now stable 10), but I can't find the post. But since the CONFIG_USER_NS is set to yes within the kernel config file there should be no need to compile a custom kernel, or are there other config options needed to be set to yes?
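When the test command prints nothing, the exit status and anything written to stderr are the next things to look at. A minimal sketch follows; run_and_report is a hypothetical helper, not part of the wrapper.

```shell
#!/bin/sh
# Hypothetical helper: runs a command, then prints its stdout, stderr
# and exit status separately so that an empty result is still visible.
run_and_report() {
    errfile=$(mktemp)
    out=$("$@" 2>"$errfile")
    rc=$?
    echo "stdout: ${out:-<empty>}"
    echo "stderr: $(cat "$errfile")"
    echo "exit status: $rc"
    rm -f "$errfile"
    return "$rc"
}

# Harmless demonstration; replace with the Singularity test command.
run_and_report uname -s
```

For the case above the call would be run_and_report /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname. As far as I know, Singularity also accepts a global --debug option before exec, which may show where it stops.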
maeax

Joined: 22 Apr 16
Posts: 659
Credit: 1,719,912
RAC: 3,195
Message 6550 - Posted: 14 Aug 2019, 17:49:06 UTC

I ran a new test with SL 7.6.
Two Theory native tasks from production were running in parallel with this ATLAS native 0.71 task:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2800367
This is not SLC6, need to run with Singularity....
Checking Singularity...
Singularity is installed, version 2.2.99
Testing the function of Singularity...
Checking singularity with cmd:singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
Singularity Works...

copy /root/Downloads/BOINC/slots/2/shared/ATLAS.root_0
copy /root/Downloads/BOINC/slots/2/shared/input.tar.gz
copy /root/Downloads/BOINC/slots/2/shared/RTE.tar.gz
copy /root/Downloads/BOINC/slots/2/shared/start_atlas.sh
start atlas job with PandaID=4447864536
cmd = singularity exec --pwd /root/Downloads/BOINC/slots/2 -B /cvmfs,/root /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 sh start_atlas.sh > runtime_log 2> runtime_log.err
running cmd return value is 32512

runtime_log.err shows these entries:
WARNING: OverlayFS not supported by host build
WARNING: Non existant 'bind point' directory in container: '/boot'
WARNING: Not mounting home directory: bind point does not exist in container: /root
WARNING: Not mounting requested bind point (already mounted in container): /cvmfs
WARNING: Skipping user bind, non existant bind point (directory) in container: '/root'
WARNING: Could not chdir to home directory: /root
sh: start_atlas.sh: Datei oder Verzeichnis nicht gefunden