Message boards : Theory Application : Docker on Linux

computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 520
Credit: 400,710
RAC: 0
Message 8614 - Posted: 25 Mar 2025, 12:53:28 UTC

Adjustments that need to be confirmed

I'm using podman instead of docker, so some options may be different.

The systemd service file for boinc needs this line:
[pre]RuntimeDirectory=user/%n[/pre]
It avoids errors like:
[pre]running docker command: ps --all --filter "name=boinc__lhcathomedev.cern.ch_lhcathome-dev__theory_2848-4462726-32_0"
time="2025-03-25T06:57:08+01:00" level=warning msg="RunRoot is pointing to a path (/run/user/1001/containers) which is not writable. Most likely podman will fail."[/pre]
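If you would rather not edit the packaged unit file directly, the same line can go into a systemd drop-in. This is only a minimal sketch: write_boinc_dropin is a helper name made up for this example, and the unit name boinc-client.service mentioned below is an assumption that may differ on your distro.

```shell
#!/usr/bin/env bash
# Sketch: write a systemd drop-in that adds the RuntimeDirectory line.
# write_boinc_dropin is a hypothetical helper; the target directory is
# passed in so the function can be tried on a scratch path first.
write_boinc_dropin() {
    local dropin_dir="$1"
    mkdir -p "$dropin_dir"
    # printf '%s\n' keeps the literal "%n" specifier intact for systemd.
    printf '%s\n' '[Service]' 'RuntimeDirectory=user/%n' > "${dropin_dir}/override.conf"
}
```

On a real host this would be called as root with a directory such as /etc/systemd/system/boinc-client.service.d (assumed unit name), followed by "systemctl daemon-reload" and a restart of the BOINC service.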


In job*.toml I added/modified:
[pre]build_args = "--layers --squash-all"
create_args = "-v /cvmfs:/cvmfs:shared"[/pre]

The build_args avoid the local image being built from scratch every time after a short break between tasks.
Not yet tested if there are unwanted side effects.

create_args works without "--cap-add=SYS_ADMIN" and "--device /dev/fuse" if "chmod go+w /cvmfs" has been run on the host.
This works if CVMFS from the host is used.
Not yet tested whether it also works when CVMFS inside the container is used.
Not tested on Windows.

Toby Broom
Joined: 19 Aug 15
Posts: 72
Credit: 3,643,465
RAC: 33
Message 8617 - Posted: 26 Mar 2025, 8:49:32 UTC
Last modified: 26 Mar 2025, 8:52:20 UTC


computezrmle
Message 8618 - Posted: 26 Mar 2025, 9:17:28 UTC - in response to Message 8617.  

In reply to Toby Broom's message of 26 Mar 2025:
got 2 that worked:

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3391288
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3391371

didn't use proxy though

+1

Your log tells you this:
"Using CVMFS on the host."

Hence, you need to configure the CVMFS on the host to use your local Squid.
Check if this is set in /etc/cvmfs/default.local:
CVMFS_HTTP_PROXY="http://your_proxy_name_or_IP:port;DIRECT"
Then (while no container is running) run on the host "sudo cvmfs_config reload".
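The host-side edit can be scripted. The following is a rough sketch under some assumptions: set_cvmfs_proxy is a hypothetical helper name, and it takes the file path as an argument so it can be tried on a copy before touching /etc/cvmfs/default.local.

```shell
#!/usr/bin/env bash
# Sketch: set CVMFS_HTTP_PROXY in a default.local-style file.
# set_cvmfs_proxy is a hypothetical helper name.
set_cvmfs_proxy() {
    local file="$1" proxy_url="$2"
    local tmp
    tmp="$(mktemp)"
    touch "$file"
    # Keep every line except an existing CVMFS_HTTP_PROXY assignment.
    grep -v '^CVMFS_HTTP_PROXY=' "$file" > "$tmp" || true
    # Append the new value, with DIRECT as the fallback.
    printf 'CVMFS_HTTP_PROXY="%s;DIRECT"\n' "$proxy_url" >> "$tmp"
    # Overwrite in place so the file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Run as root against /etc/cvmfs/default.local with your own proxy URL (including the http:// prefix), then run "sudo cvmfs_config reload" while no container is running.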

To forward the proxy to your containers, set the container environment as described here:
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=682&postid=8607

computezrmle
Message 8619 - Posted: 26 Mar 2025, 9:25:11 UTC - in response to Message 8614.  

In reply to computezrmle's message of 25 Mar 2025:
create_args works without "--cap-add=SYS_ADMIN" and "--device /dev/fuse" if "chmod go+w /cvmfs" has been run on the host.
This works if CVMFS from the host is used.
Not yet tested whether it also works when CVMFS inside the container is used.

Was testing this back and forth.
Unfortunately we can't avoid "--cap-add=SYS_ADMIN" and "--device /dev/fuse" when CVMFS inside the container should be used.
Hence, to simplify deployment, both options should remain in the *.toml file.
[pre]create_args = "--cap-add=SYS_ADMIN --device /dev/fuse -v /cvmfs:/cvmfs:shared"[/pre]
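A quick way to confirm that a local job*.toml still carries all three options is a small check like the one below. Treat it as a sketch: has_required_create_args is a made-up helper name, and it assumes create_args sits on a single line as in the example above.

```shell
#!/usr/bin/env bash
# Sketch: check that a job*.toml still carries all three create_args options.
# has_required_create_args is a hypothetical helper name.
has_required_create_args() {
    local toml="$1" line opt
    # Take the first create_args line; fail if there is none.
    line="$(grep -m1 '^[[:space:]]*create_args[[:space:]]*=' "$toml")" || return 1
    for opt in '--cap-add=SYS_ADMIN' '--device /dev/fuse' '-v /cvmfs:/cvmfs:shared'; do
        # Fixed-string match; "--" stops grep from parsing opt as a flag.
        printf '%s\n' "$line" | grep -qF -- "$opt" || return 1
    done
}
```

The function returns 0 only when all three options are present, so it can be used in an "if" before starting tasks.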

Toby Broom
Message 8627 - Posted: 26 Mar 2025, 18:21:08 UTC - in response to Message 8618.  
Last modified: 26 Mar 2025, 18:22:23 UTC

I guess auto does not work then.

I did the second part:

env = [
    "http_proxy=192.168.1.179:3128",
    "https_proxy=192.168.1.179:3128",
     "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
]

computezrmle
Message 8628 - Posted: 26 Mar 2025, 18:39:52 UTC - in response to Message 8627.  
Last modified: 26 Mar 2025, 19:12:50 UTC

In reply to Toby Broom's message of 26 Mar 2025:
I guess auto does not work then.

I did the second part:

env = [
    "http_proxy=192.168.1.179:3128",
    "https_proxy=192.168.1.179:3128",
     "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
]

Right, "auto" is not supported. At least not yet.

ATM only the classical method is supported:
1. Define an environment variable, here: http_proxy=http://proxy:port
2. Export that variable, here via containers.conf
3. Create a script inside the container that reads the variable and does the necessary steps,
   here: add CVMFS_HTTP_PROXY="http://proxy:port;DIRECT" to /etc/cvmfs/default.local

The script is already available in the Linux app_version.
The Windows app_version just needs an update.
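To illustrate what step 3 amounts to, here is a minimal sketch. It is not the script shipped in the app_version; configure_cvmfs_from_env is a hypothetical name, and the target file is passed as an argument so the function can be tried outside a container.

```shell
#!/usr/bin/env bash
# Sketch of step 3: read http_proxy from the environment and append the
# matching CVMFS_HTTP_PROXY line to a CVMFS config file.
# configure_cvmfs_from_env is a hypothetical name, not the project's script.
configure_cvmfs_from_env() {
    local target="$1"
    # Steps 1 and 2 happen outside: the variable must already be exported.
    if [ -z "${http_proxy:-}" ]; then
        echo "http_proxy is not set" >&2
        return 1
    fi
    printf 'CVMFS_HTTP_PROXY="%s;DIRECT"\n' "$http_proxy" >> "$target"
}
```

Inside a container the target would be /etc/cvmfs/default.local; when the variable is missing the function returns non-zero and leaves the file untouched.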

Edit:
@Toby Broom

You set "http_proxy=192.168.1.179:3128" instead of "http_proxy=http://192.168.1.179:3128".
For https it must be "https_proxy=http://192.168.1.179:3128" (sic!).
Due to the missing protocol CVMFS can't use the proxy and falls back to DIRECT.
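A small check like this can catch the missing protocol before a task starts. It is only a sketch based on the rule stated above (the value must begin with http://), and proxy_value_ok is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: accept a proxy value only if it carries the http:// protocol prefix.
# proxy_value_ok is a hypothetical helper name.
proxy_value_ok() {
    case "$1" in
        http://?*) return 0 ;;   # protocol present, something follows it
        *)         return 1 ;;   # bare host:port or anything else
    esac
}
```

The same check applies to the values of both http_proxy and https_proxy, since both are expected to start with http:// here.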

Toby Broom
Message 8647 - Posted: 27 Mar 2025, 18:18:24 UTC

I changed the format and it detects the proxy now.

There is a sort of weird mix of results: they all seem to have worked, but some have errors.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3391532
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3391544

Toby Broom
Message 8769 - Posted: 23 Apr 2025, 19:47:44 UTC

This didn't work for some reason:

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3398968

Toby Broom
Message 8837 - Posted: 18 Jun 2025, 22:27:41 UTC


Toby Broom
Message 8843 - Posted: 20 Jun 2025, 17:42:57 UTC

Any ideas?

running docker command: logs boinc__lhcathomedev.cern.ch_lhcathome-dev__theory_2922-4812957-20_0
command output:
Local webserver did not start.
boinc_shutdown called with exit code 206

Using custom CVMFS.
Proxy configuration failed.
boinc_shutdown called with exit code 206

stderr from container:
Local webserver did not start.
boinc_shutdown called with exit code 206

Using custom CVMFS.
Proxy configuration failed.
boinc_shutdown called with exit code 206

stderr end

maeax
Joined: 22 Apr 16
Posts: 759
Credit: 3,313,543
RAC: 22,332
Message 8852 - Posted: 25 Jun 2025, 7:16:53 UTC - in response to Message 8843.  
Last modified: 25 Jun 2025, 7:20:36 UTC

In reply to Toby Broom's message of 20 Jun 2025:
Any ideas?

running docker command: logs boinc__lhcathomedev.cern.ch_lhcathome-dev__theory_2922-4812957-20_0
command output:
Local webserver did not start.
boinc_shutdown called with exit code 206


On Win11 Pro, Docker tasks work with hardware acceleration turned off.
https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=4861
https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=4639

Toby Broom
Message 8853 - Posted: 25 Jun 2025, 16:42:35 UTC - in response to Message 8852.  

I think you probably want it on, though, for best performance?

biodoc
Joined: 25 Jul 22
Posts: 8
Credit: 2,608,806
RAC: 4,380
Message 8898 - Posted: 11 Jul 2025, 14:21:57 UTC
Last modified: 11 Jul 2025, 14:23:11 UTC

I’ve been running theory docker tasks using a squid proxy server for the last few days. This morning none of the completed tasks were uploaded. I can’t connect to any of the CERN websites from my home network. I’m trying to figure out if my ISP blocked access or if CERN blocked my IP address. Any help would be appreciated. I’m posting this with my cell phone.

biodoc
Message 8900 - Posted: 11 Jul 2025, 18:12:38 UTC - in response to Message 8898.  

In reply to biodoc's message of 11 Jul 2025:
I’ve been running theory docker tasks using a squid proxy server for the last few days. This morning none of the completed tasks were uploaded. I can’t connect to any of the CERN websites from my home network. I’m trying to figure out if my ISP blocked access or if CERN blocked my IP address. Any help would be appreciated. I’m posting this with my cell phone.


Problem solved: there was an issue with my ISP's DNS servers. I switched to Google Public DNS servers on my router and everything is back to normal.

biodoc
Message 8927 - Posted: 19 Jul 2025, 0:44:37 UTC

I have been running the theory docker app for linux and have made the following observations:

- The error rate is about 7-10% and all of the errors are "1 (0x00000001) Unknown error code".
- I can run up to 10 tasks simultaneously on a computer with 64 GB RAM. If I go up to 16 tasks, I get "waiting for memory" notifications from BOINC Manager.
- Run times are accurate but CPU times are highly inflated.
- I checked other users' computers and found that those running the Windows Docker app have very low error rates.
- I thought there may be a hardware issue on my computer. SMART shows my drive is healthy. I have non-registered, unbuffered ECC RAM in the computer and edac-util shows no corrected or uncorrected memory errors.

I decided to switch to the theory vbox app (linux) and found the following:
- So far only 2 of 420+ completed tasks had computation errors.
- Run times and cpu times are accurate.
- I can run 16 tasks simultaneously with plenty of RAM still available.

Clearly the docker app for linux still needs work. Should I continue to run the vbox app or should I switch back to the docker app?
Computer details: [url]https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=5025[/url]
podman version 4.9.3
Squid Cache: Version 6.13

maeax
Message 8928 - Posted: 19 Jul 2025, 1:28:40 UTC - in response to Message 8927.  
Last modified: 19 Jul 2025, 1:29:00 UTC

It's really good to have both options, vbox and Docker.
Laurence from CERN IT and his team gave us this option with Docker.
We have to wait for his answer on which way they want to go.
He wrote that Docker, unlike podman, is a licence problem.
It's up to you whether you test one or both in -dev.

mmonnin
Joined: 20 Jun 17
Posts: 31
Credit: 5,684,604
RAC: 13,031
Message 8929 - Posted: 19 Jul 2025, 10:24:12 UTC - in response to Message 8927.  

In reply to biodoc's message of 19 Jul 2025:
I have been running the theory docker app for linux and have made the following observations:

- The error rate is about 7-10% and all of the errors are "1 (0x00000001) Unknown error code".
- I can run up to 10 tasks simultaneously on a computer with 64 GB RAM. If I go up to 16 tasks, I get "waiting for memory" notifications from BOINC Manager.
- Run times are accurate but CPU times are highly inflated.
- I checked other users' computers and found that those running the Windows Docker app have very low error rates.
- I thought there may be a hardware issue on my computer. SMART shows my drive is healthy. I have non-registered, unbuffered ECC RAM in the computer and edac-util shows no corrected or uncorrected memory errors.

I decided to switch to the theory vbox app (linux) and found the following:
- So far only 2 of 420+ completed tasks had computation errors.
- Run times and cpu times are accurate.
- I can run 16 tasks simultaneously with plenty of RAM still available.

Clearly the docker app for linux still needs work. Should I continue to run the vbox app or should I switch back to the docker app?
Computer details: [url]https://lhcathomedev.cern.ch/lhcathome-dev/show_host_detail.php?hostid=5025[/url]
podman version 4.9.3
Squid Cache: Version 6.13


Is the Docker memory usage reported for the Theory app correct? I've seen it be only as accurate as its CPU usage: BOINC might see xxxxxxxx memory used when in reality it's only x. I see the wrong memory reported to BOINC in the BOINC Central Docker app.

TeeVeeEss
Joined: 2 Jul 25
Posts: 3
Credit: 105,123
RAC: 1,175
Message 8950 - Posted: 27 Jul 2025, 11:53:03 UTC
Last modified: 27 Jul 2025, 12:00:09 UTC

Of my last 8 T4T tasks, the short ones ended in error, e.g. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3472514
I checked the task result and found this line:
head: cannot open '/scratch/runRivet.log' for reading: No such file or directory
After some debugging I decided to create a scratch directory in the project folder (/var/lib/boinc/projects/lhcathomedev.cern.ch_lhcathome-dev/)
sudo mkdir scratch
sudo chown boinc:boinc scratch
The last running task on my host completed as valid: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3472519
Not sure if this missing folder is only a problem on my host, that's why I am posting here.
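For anyone who wants to try the same workaround repeatably, the two commands can be wrapped like this. It is just a sketch: ensure_scratch_dir is a made-up helper name, and the optional owner argument is there because changing ownership needs root.

```shell
#!/usr/bin/env bash
# Sketch: create the scratch directory inside a BOINC project folder.
# ensure_scratch_dir is a hypothetical helper name.
ensure_scratch_dir() {
    local project_dir="$1" owner="${2:-}"
    # -p makes this safe to repeat if the directory already exists.
    mkdir -p "${project_dir}/scratch"
    # Changing ownership needs root, so only do it when an owner is given.
    if [ -n "$owner" ]; then
        chown "$owner" "${project_dir}/scratch"
    fi
}
```

On my host the call would be made as root with /var/lib/boinc/projects/lhcathomedev.cern.ch_lhcathome-dev as the project folder and boinc:boinc as the owner; other installs may use a different path or user.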
Running BOINC 8.2.4, Squid 6.13, podman 4.9.3, compose: Docker Compose (Docker Inc.), Version: v2.36.2
©2025 CERN