1) Message boards : Sixtrack Application : Xtrack beam simulation (Message 8273)
Posted 12 Jan 2024 by mmonnin
Post:
I received one on a PC that has been running for 17 hours.
Another PC received several that errored immediately with a segmentation violation.

All while someone's been running 3 tasks since Nov, per WUProp hour data.

Edit:
Now the PC with the 17-hour task received some more, but all failed:

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
SIGSEGV: segmentation violation
Stack trace (8 frames):
[0x42f3c0]
[0x44b5b0]
[0x408b46]
[0x40900e]
[0x429157]
[0x429e7f]
[0x4cd58e]
[0x402265]

Exiting...

</stderr_txt>
]]>
2) Message boards : Sixtrack Application : Xtrack beam simulation v0.01 windows_intelx86 (Message 8228)
Posted 24 Nov 2023 by mmonnin
Post:
All the Linux tasks were canceled by the server, even the ones in progress.
The Windows tasks are still running beyond 11.5 hours at the moment; 12 of 12 cores are still using CPU. The remaining tasks haven't been canceled.
I updated the project. All canceled.
3) Message boards : Sixtrack Application : Xtrack beam simulation v0.01 windows_intelx86 (Message 8223)
Posted 24 Nov 2023 by mmonnin
Post:
The tasks I received earlier were aborted by the server.
The ones I received today run to 100% in around 5 minutes and then keep running. So far over 2:35 (hr:min) and still using CPU, on both Windows and Linux.
4) Message boards : CMS Application : CMS network test are getting more strict (Message 7226)
Posted 7 Jul 2021 by mmonnin
Post:
Yes, you know how that goes here for some reason, so I always have to check by looking here:
https://lhcathomedev.cern.ch/lhcathome-dev/top_hosts.php
And that of course only shows which computer, not whose it is. Several people never seem to check it even when they get the credit for those failed tasks, and they never seem to look here either.

We shouldn't have to do that here since this is for testing and hiding a computer is ..........

for example:
Volunteer: mmonnin (451)


Most of those are not mine so go bark at someone else.

Why doesn't LHC just remove the stats export option required by GDPR as well then?
5) Message boards : CMS Application : New Version 60.30 (Message 7216)
Posted 3 Jul 2021 by mmonnin
Post:
I set up 1 task with 2 cores. No app_config in use.
The received task created a dual core VM with 2792 MB RAM. So far so good.
However, there is 'only' 1 cmsRun running and it uses 1 core.


I saw the same, so I went back to 1 core via the account preferences on the website.
6) Message boards : CMS Application : New Version 60.20 (Message 7211)
Posted 2 Jul 2021 by mmonnin
Post:
Another new version, 60.30. I just happened to catch the vdi download.
7) Message boards : CMS Application : New Version 60.10 (Message 7207)
Posted 24 Jun 2021 by mmonnin
Post:
Efficiency of using 2 cores sucks, though. Your links there show less than 1 thread used while occupying two threads. Mine so far are at best using 1.1 threads, but are only 6% complete. I'll be going back to single-thread tasks.
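For anyone who wants to reproduce this measurement: one rough way to get a number like "1.1 threads" on Linux is cumulative CPU seconds divided by wall-clock seconds for the task's process. This is just a sketch; the PID below is a stand-in (the shell itself), so substitute the actual BOINC task PID on a real host.

```shell
# Effective thread usage ≈ cumulative CPU time / elapsed wall time.
# A value of 110 here means roughly "1.1 threads".
pid=$$   # stand-in PID; replace with the BOINC task's PID
cpu=$(ps -o cputimes= -p "$pid" | tr -d ' ')      # CPU seconds used
elapsed=$(ps -o etimes= -p "$pid" | tr -d ' ')    # wall-clock seconds alive
cpu=${cpu:-0}
[ "$elapsed" -gt 0 ] 2>/dev/null || elapsed=1     # avoid divide-by-zero
echo "effective threads x100: $(( cpu * 100 / elapsed ))"
```

Values well under 200 for a 2-core task would match the poor efficiency described above.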
8) Message boards : CMS Application : New Version 60.10 (Message 7205)
Posted 24 Jun 2021 by mmonnin
Post:
Cool, I've been running 1 thread as set by the site preferences. I'm trying 2 threads now.
9) Message boards : ATLAS Application : ATLAS very long simulation v1.01 download errors (Message 7136)
Posted 21 Mar 2021 by mmonnin
Post:
I am getting the same thing.
10) Message boards : ATLAS Application : ATLAS long simulation 1.01 (Message 7124)
Posted 20 Mar 2021 by mmonnin
Post:
Unknown image format/type: /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img

try a version of singularity from the server

/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname


maybe a hint will appear, e.g. "unsquashfs not found" or "mkdir /home/boinc: permission denied"

PS if it works - just delete the installed singularity


That command returns my host name. No errors.

Delete it? Like
sudo apt-get remove --auto-remove singularity 


Edit: 2 PCs are getting this. A 3rd, where I just went through the setup thread on LHC, now works after installing gawk; at least it's starting to use some memory/CPU. The two failing PCs may have had singularity installed from a repository some time ago instead of built via cmake etc.
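A quick way to check whether the singularity on PATH came from a repo package (and would therefore be the one to remove with the apt-get line above) is to ask dpkg which package owns the binary. This is a sketch assuming a Debian/Ubuntu host, matching the apt-get command quoted earlier; it's read-only and safe to run.

```shell
# Find the singularity binary that would be used, then check whether a
# dpkg package owns it. No match suggests a source/cmake install instead.
BIN=$(command -v singularity || true)
echo "singularity on PATH: ${BIN:-none}"
if [ -n "$BIN" ]; then
    dpkg -S "$BIN" 2>/dev/null || echo "not owned by any dpkg package (source install?)"
fi
```

If it does report a package, the `sudo apt-get remove --auto-remove singularity` suggested above should clear it so the CVMFS copy gets used.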
11) Message boards : ATLAS Application : WU's Error out @ 10:02 Min's (Message 7114)
Posted 20 Mar 2021 by mmonnin
Post:
You must complete the setup that typically comes with the vbox vdi file.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4840

You also linked to a page the rest of us have no access to.
12) Message boards : ATLAS Application : ATLAS long simulation 1.01 (Message 7113)
Posted 20 Mar 2021 by mmonnin
Post:
<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
23:47:51 (28346): wrapper (7.7.26015): starting
23:47:51 (28346): wrapper: running run_atlas (--nthreads 4)
[2021-03-19 23:47:51] Arguments: --nthreads 4
[2021-03-19 23:47:51] Threads: 4
[2021-03-19 23:47:51] Checking for CVMFS
[2021-03-19 23:47:52] Probing /cvmfs/atlas.cern.ch... OK
[2021-03-19 23:47:53] Probing /cvmfs/atlas-condb.cern.ch... OK
[2021-03-19 23:47:54] Probing /cvmfs/grid.cern.ch... OK
[2021-03-19 23:47:56] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2021-03-19 23:47:56] 2.5.2.0 28470 0 23296 81113 3 1 2621483 4194304 0 65024 0 0 n/a 0 0 http://cvmfs-s1bnl.opensciencegrid.org/cvmfs/atlas.cern.ch DIRECT 1
[2021-03-19 23:47:56] CVMFS is ok
[2021-03-19 23:47:56] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2021-03-19 23:47:56] Checking for singularity binary...
[2021-03-19 23:47:56] Using singularity found in PATH at /usr/bin/singularity
[2021-03-19 23:47:56] Running /usr/bin/singularity --version
[2021-03-19 23:47:56] 2.4.2-dist
[2021-03-19 23:47:56] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
[2021-03-19 23:47:56] Singularity isnt working: ERROR : Unknown image format/type: /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2021-03-19 23:47:56] ABORT : Retval = 255
[2021-03-19 23:47:56] 
23:57:56 (28346): run_atlas exited; CPU time 0.301484
23:57:56 (28346): app exit status: 0x1
23:57:56 (28346): called boinc_finish(195)

</stderr_txt>
]]>

singularity --version
This returns a version, as in the check listed on the main LHC forums. I've run the native app before, but it's been a while.
13) Message boards : Theory Application : New version 5.00 (Message 6674)
Posted 23 Sep 2019 by mmonnin
Post:
With v5.02 the jobs start to run.


Wait, so v5.00 and v5.01 the tasks do nothing? It looks like none of mine have returned yet.
14) Message boards : Theory Application : New version 5.00 (Message 6669)
Posted 23 Sep 2019 by mmonnin
Post:
Wow, 24.75 GB of memory usage on a 32-thread system. Is that necessary?

I have some v5.00 and v5.01 tasks. Are the 5.00 ones still needed?
15) Message boards : Sixtrack Application : The Sixtrack Application (Message 5625)
Posted 10 Nov 2018 by mmonnin
Post:
Looks like admins are still giving us the middle finger :(
16) Message boards : Sixtrack Application : The Sixtrack Application (Message 5613)
Posted 7 Nov 2018 by mmonnin
Post:
Ya mean like how I mentioned it before in this post? Not a word was said about it then.
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=415

Now nearly everything is being canceled by the server. This was working fine before the server went down.
17) Message boards : LHCb Application : No jobs? (Message 5543)
Posted 26 Sep 2018 by mmonnin
Post:
Stop sending tasks without jobs!! FFS here too.
18) Message boards : CMS Application : Here we go again..... (Message 5541)
Posted 23 Sep 2018 by mmonnin
Post:
... they don't use much of the CPU ...

Well, during job processing LHCb should use 100 % of 1 core (per internal slot).
Usually for around 80 min on my hosts.
Each job should produce an intermediate result of about 60-80 MB that should be uploaded immediately to lbboinc01.cern.ch, TCP port 9148.
During this upload the CPU usage drops to nearly idle.

If your hosts are idle and you don't notice an upload the jobs got stuck either during job setup or during stageout.
I suspect it's not an error on the volunteer's side but a wrong/missing connection to a backend system at CERN.


They're using about half of a CPU core on my 3570K and run for around 12-13 hours. Much longer than 80 min.
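If the jobs really are stalling at stageout as described in the quote above, a simple check is whether the stage-out endpoint it names (lbboinc01.cern.ch, TCP 9148) accepts connections at all, to tell a stuck local upload apart from a backend outage. A minimal sketch using bash's /dev/tcp pseudo-device:

```shell
# Succeeds if a TCP connect to $1:$2 completes within 5 seconds.
port_open() {
    timeout 5 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null
}

# Host and port are taken from the post above.
if port_open lbboinc01.cern.ch 9148; then
    echo "stage-out port reachable"
else
    echo "stage-out port NOT reachable"
fi
```

If the port is unreachable from a host whose tasks are stuck, that would point at the CERN-side connection rather than the volunteer's machine.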
19) Message boards : CMS Application : Here we go again..... (Message 5537)
Posted 23 Sep 2018 by mmonnin
Post:
Server status said 1 user for CMS, and now it's 2 with me, so there's not much to compare to.

I thought CMS tasks ended in about 10 min if there was no Condor connection, and if there wasn't, there would be no CPU time during those 10 min. These actually had more CPU time than run time, and went welllll past 10 min.

I'm running LHCb here. They're just using a hefty 6 GB of RAM, but they are completing.
20) Message boards : CMS Application : Here we go again..... (Message 5535)
Posted 22 Sep 2018 by mmonnin
Post:
2018-09-22 08:42:29 (5764): Guest Log: [ERROR] Condor exited after 30947s without running a job.
2018-09-22 08:42:29 (5764): Guest Log: [INFO] Shutting Down.
2018-09-22 08:42:29 (5764): VM Completion File Detected.
2018-09-22 08:42:29 (5764): VM Completion Message: Condor exited after 30947s without running a job.

How does it go so long and use more than 1 CPU core without a job? Why are tasks available when there are no jobs? I fail to see why, after years and years, this is still an issue.


©2024 CERN