Message boards : CMS Application : New Version 60.63
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1051
Credit: 294,070
RAC: 5
Message 7630 - Posted: 26 Jul 2022, 8:14:45 UTC
Last modified: 26 Jul 2022, 8:15:07 UTC

This new version provides a new image along with an updated version of the vboxwrapper (v26205). The cvmfs reload and proxy setting functions have been temporarily disabled. This will be revised in a future update.
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 653
Credit: 10,839,884
RAC: 3,513
Message 7631 - Posted: 26 Jul 2022, 9:13:54 UTC - in response to Message 7630.  

The download is 1.57 GB.
I guess I will download just one here, since 2 am is when I get my ISP's high-speed and I would rather not use it all up on this. I will do the rest during the day, when I have already used up all my high-speed data in 4 days.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1147
Credit: 753,200
RAC: 923
Message 7632 - Posted: 26 Jul 2022, 9:18:52 UTC

In this version the ALT-F# keystrokes, except ALT-F3 (top), no longer display what they did in the previous version. They show only placeholder text:
ALT-F2: Running job output should appear here
ALT-F4: Output of the job wrapper may appear here.
ALT-F5: Error messages may appear here.

BOINC's Show graphics displays a webpage with: Scientific Linux Test Page ..... and not the log files.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1147
Credit: 753,200
RAC: 923
Message 7633 - Posted: 26 Jul 2022, 9:22:30 UTC
Last modified: 26 Jul 2022, 9:30:58 UTC

My first task crashed after 20 minutes because of: VM Heartbeat file specified, but missing

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102023


No heartbeat file in the shared folder
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 653
Credit: 10,839,884
RAC: 3,513
Message 7634 - Posted: 26 Jul 2022, 9:36:29 UTC - in response to Message 7633.  
Last modified: 26 Jul 2022, 9:54:04 UTC

VM Heartbeat file specified, but missing.
VM Heartbeat file specified, but missing file system status. (errno = '2')

{This machine does not have any snapshots}, preserve=false aResultDetail=0
(Well, I'm glad I only did the download once.) Goodnight.
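The errno value of 2 in the log above is ENOENT, "no such file or directory". As a rough illustration of what a heartbeat check of this kind looks like, here is a minimal Python sketch. It is not vboxwrapper's actual code: the function name `heartbeat_state` and the 1200-second threshold (picked only to match the 20 minutes mentioned earlier in the thread) are assumptions.

```python
import os
import time


def heartbeat_state(path, max_age_s=1200.0, now=None):
    """Classify a heartbeat file as 'missing', 'stale' or 'alive'.

    Hypothetical helper for illustration; the real wrapper may differ.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        # FileNotFoundError corresponds to errno 2 (ENOENT).
        return "missing"
    # The file exists: it counts as alive only if it was touched recently.
    return "alive" if now - mtime <= max_age_s else "stale"
```

A watchdog built on this would poll the file at intervals and stop the VM once the state is anything other than "alive".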
maeax

Joined: 22 Apr 16
Posts: 601
Credit: 1,383,910
RAC: 2,199
Message 7635 - Posted: 26 Jul 2022, 9:53:17 UTC - in response to Message 7632.  

Crystal Pellet wrote: "BOINC's Show graphics displays a webpage with: Scientific Linux Test Page ..... and not the log files."

+1
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 653
Credit: 10,839,884
RAC: 3,513
Message 7636 - Posted: 26 Jul 2022, 9:54:47 UTC - in response to Message 7635.  

maeax wrote:
  Crystal Pellet wrote: "BOINC's Show graphics displays a webpage with: Scientific Linux Test Page ..... and not the log files."
  +1

Same here
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1051
Credit: 294,070
RAC: 5
Message 7637 - Posted: 26 Jul 2022, 10:19:06 UTC - in response to Message 7636.  

Thanks for testing. The fix for the heartbeat issue should be live in a few minutes. There are a number of things that need to be fixed before this version is ready. The main thing we are testing is an updated connection method to the CMS job pool and the vboxwrapper.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1147
Credit: 753,200
RAC: 923
Message 7638 - Posted: 26 Jul 2022, 11:32:23 UTC - in response to Message 7637.  
Last modified: 26 Jul 2022, 11:59:46 UTC

Laurence wrote: "Thanks for testing. The fix for the heartbeat issue should be live in a few minutes. There are a number of things that need to be fixed before this version is ready. The main thing we are testing is an updated connection method to the CMS job pool and the vboxwrapper."
Confirmed. Heartbeat once a minute. Cool!

Process 'cmsRun' running fine.

The differencing disk is 'only' 241 MB so far, after 44 minutes of runtime.
In the past the whole 4 GB vdi was copied into the slot folder for every CMS task.
Profile ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 1,122
Message 7640 - Posted: 26 Jul 2022, 15:23:08 UTC
Last modified: 26 Jul 2022, 21:31:10 UTC

Task seems to be running fine on my Windows 10 box. I also see the Apache home page rather than logs in "Show graphics". In the console window, Ctrl-Alt-F1 brings up the console output, Ctrl-Alt-F3 brings up the "top" output and Ctrl-Alt-F6 shows the console login page. With F2, F4 and F5 I just get the dummy messages that job output/job wrapper/error messages may appear, but they don't.
computezrmle
Joined: 28 Jul 16
Posts: 402
Credit: 374,791
RAC: 32
Message 7641 - Posted: 27 Jul 2022, 4:54:23 UTC

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102051
This task does not shut down correctly.

During normal operation, "cmsRun" was the leading process in the top output, and in BOINC the CPU time was always a bit ahead of the runtime (since it is a 2-core VM).

Now "cmsRun" is no longer shown and the VM is idle (I don't know since when).
Top shows load averages close to 0.
Runtime is continuously increasing (currently 16:43:10) but CPU-time is sitting at 13:57:12.

Will let it run until the hard runtime limit to see whether this will end the task.
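The gap between the two timers can be put into numbers. The following Python sketch is only an illustration and the helper names `to_seconds` and `cpu_per_wall` are made up here. It converts the two values quoted above and reports how far CPU time trails runtime. Because CPU time was running ahead of runtime while cmsRun was busy, the difference is a lower bound on how long the VM has been idle, not the exact idle time.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_per_wall(runtime, cpu_time):
    """CPU seconds used per wall-clock second (average number of busy cores)."""
    return to_seconds(cpu_time) / to_seconds(runtime)


runtime, cpu_time = "16:43:10", "13:57:12"  # values quoted in the post
lag = to_seconds(runtime) - to_seconds(cpu_time)
print(f"CPU time trails runtime by {lag} s")
print(f"average busy cores: {cpu_per_wall(runtime, cpu_time):.2f}")
```

On a healthy 2-core task the ratio would sit a little above 1; a value that keeps falling while runtime grows is the symptom described here.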
maeax

Joined: 22 Apr 16
Posts: 601
Credit: 1,383,910
RAC: 2,199
Message 7642 - Posted: 27 Jul 2022, 5:27:16 UTC

Seeing the same.
6 hours idle and 17:30 hours runtime.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1147
Credit: 753,200
RAC: 923
Message 7643 - Posted: 27 Jul 2022, 5:31:17 UTC

The same here. VM-lifetime > 13 hours.
I suppose it's the same as the missing heartbeat file.

computezrmle will wait until the shutdown is done by vboxwrapper. I'm sure it will.

I'll help the task a bit.
computezrmle
Joined: 28 Jul 16
Posts: 402
Credit: 374,791
RAC: 32
Message 7644 - Posted: 27 Jul 2022, 6:50:53 UTC - in response to Message 7643.  

As foreseen by CP the vboxwrapper watchdog correctly shut down the task.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102051
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1051
Credit: 294,070
RAC: 5
Message 7646 - Posted: 27 Jul 2022, 8:21:42 UTC - in response to Message 7644.  

Thanks for the information. I think I know what the problems are. Will try to fix them.
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1051
Credit: 294,070
RAC: 5
Message 7647 - Posted: 27 Jul 2022, 8:58:21 UTC - in response to Message 7646.  

I made a couple of changes. Let's see if that improves things.
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 653
Credit: 10,839,884
RAC: 3,513
Message 7648 - Posted: 27 Jul 2022, 9:31:15 UTC - in response to Message 7647.  

This is the first one I ran yesterday: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3102096
Run time 18 hours 1 min 6 sec
CPU time 4 hours 18 min 13 sec

I just started a new one on that host. Today I got the vdi and wrapper on another host, so I just started that one too.
I will add the next host tomorrow and then the 4th on the following day, since I would rather not use up all of my 2 am fast speed.
You are all lucky you are not stuck with Hughes satellite as your ISP. I started downloading the new vdi and wrapper at 5:30 pm and by 2 am had only 51% of the download, but once I got full speed again at 2 am it finished the rest in 5 minutes.
So tomorrow I will start the next download at around noon, but I imagine my late-night "bonus" high speed will soon be gone. Once I run out of whatever I have left, it will be hard to get even 4 hosts running a single task. Right now the Hughes website will not even load, so I cannot see what I have left, and my new month doesn't start until Aug. 13th at midnight.
computezrmle
Joined: 28 Jul 16
Posts: 402
Credit: 374,791
RAC: 32
Message 7649 - Posted: 27 Jul 2022, 10:07:44 UTC - in response to Message 7648.  

A (correctly configured) local proxy ensures that you have to download the vdi just once, regardless of the number of computers in your LAN using it.
Profile ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1126
Credit: 7,409,829
RAC: 1,122
Message 7650 - Posted: 27 Jul 2022, 15:28:14 UTC
Last modified: 27 Jul 2022, 15:40:52 UTC

Has anyone else run into a problem with the vboxwrapper under Linux? I get the message:
../../projects/lhcathomedev.cern.ch_lhcathome-dev/vboxwrapper_26205_x86_64-pc-linux-gnu: /lib64/libm.so.6: version `GLIBC_2.29' not found (required by ../../projects/lhcathomedev.cern.ch_lhcathome-dev/vboxwrapper_26205_x86_64-pc-linux-gnu)
It seems vboxwrapper_26205_x86_64-pc-linux-gnu was built on a system with glibc 2.29 but I'm using Rocky Linux 8.6 (derived from RHEL 8) which still uses 2.28.
It's not immediately obvious to me how to build my own version of vboxwrapper (it seems you have to build the whole BOINC tree, but that always trips over the wxWidgets version unless you are lucky). I tried copying the v26204 wrapper into my directory tree and renaming it to v26205, but I have yet to find out whether that works: BOINC won't serve me new tasks because I had so many failures last night before I realised something was wrong.
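A quick way to check for this mismatch before any tasks fail is to compare the installed glibc version with the one the binary asks for. The Python sketch below is just an illustration: `parse_version` and `glibc_satisfies` are made-up helper names, and the required version 2.29 is taken from the error message above. It relies on the standard library's `platform.libc_ver()`, which may return empty strings on systems where glibc cannot be detected.

```python
import platform


def parse_version(text):
    """Turn '2.28' into (2, 28) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(found, required="2.29"):
    """True if the installed glibc version is at least `required`."""
    return parse_version(found) >= parse_version(required)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        verdict = "ok" if glibc_satisfies(version) else "too old for GLIBC_2.29"
        print(f"glibc {version}: {verdict}")
    else:
        print("could not detect glibc on this system")
```

The versions are compared as integer tuples rather than strings, since a plain string comparison would rank "2.9" above "2.29".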
computezrmle
Joined: 28 Jul 16
Posts: 402
Credit: 374,791
RAC: 32
Message 7651 - Posted: 27 Jul 2022, 16:33:03 UTC - in response to Message 7650.  

Vboxwrapper 26205 was compiled on the official GitHub platform, using the library versions available there.
The best option would be to upgrade your system libraries.


If you compile it yourself, you can switch off building the BOINC Manager, since it is the only component that uses wxWidgets.
It is also the component that takes most of the compilation time.


Suggested steps to configure/compile the BOINC client (including required helpers) and vboxwrapper:

1. cd to the base directory of your source code
2. run "make distclean"
3. run ./configure with the options "--disable-server --disable-manager --enable-apps-vbox --enable-optimize"
4. run make
5. cd to <base directory>/samples/vboxwrapper
6. run make
7. run "strip vboxwrapper"



This is just a workaround.
Be aware that under certain circumstances the BOINC client requests a fresh copy from the project server.
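The numbered steps above could be scripted roughly as follows. This Python sketch only mirrors the commands and options listed in the post. The function names `build_plan` and `run_plan` and the `boinc` directory are placeholders, and by default it prints the commands without running them. One caveat, stated as an assumption about a typical tree: "make distclean" fails when the tree has never been configured, and with `check=True` that would stop a real run at the first step.

```python
import shlex
import subprocess
from pathlib import Path

# Options exactly as listed in step 3 of the post.
CONFIGURE_OPTIONS = [
    "--disable-server",
    "--disable-manager",
    "--enable-apps-vbox",
    "--enable-optimize",
]


def build_plan(source_dir):
    """Return (working directory, command) pairs mirroring steps 1 to 7."""
    base = Path(source_dir)
    wrapper_dir = base / "samples" / "vboxwrapper"
    return [
        (base, ["make", "distclean"]),
        (base, ["./configure", *CONFIGURE_OPTIONS]),
        (base, ["make"]),
        (wrapper_dir, ["make"]),
        (wrapper_dir, ["strip", "vboxwrapper"]),
    ]


def run_plan(plan, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, command in plan:
        print(f"(cd {cwd} && {shlex.join(command)})")
        if not dry_run:
            subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # "boinc" is a placeholder for the source checkout, not a path from the post.
    run_plan(build_plan("boinc"))
```

Calling `run_plan(build_plan("boinc"), dry_run=False)` would execute the steps for real; the dry run is the default so the commands can be reviewed first.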


©2022 CERN