Message boards : CMS Application : New Version 60.60
Profile Laurence
Project administrator
Project developer
Project tester
Message 7328 - Posted: 14 Jun 2022, 12:12:21 UTC

This update provides a new version of the VboxWrapper which supports multiattach mode. Please let me know if there are any issues.
Profile ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Message 7333 - Posted: 14 Jun 2022, 21:03:50 UTC - in response to Message 7328.  
Last modified: 14 Jun 2022, 21:38:23 UTC

This update provides a new version of the VboxWrapper which supports multiattach mode. Please let me know if there are any issues.

For what it's worth, this version dies immediately on both my Windows 10 machine and a Rocky Linux 8.6 box[1]. I didn't have much time to investigate this afternoon (too many meetings...). Please let us know whether or not your new tasks are running normally.
[1] Tasks link if anyone has the time and inclination to take a look overnight UK time.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7334 - Posted: 15 Jun 2022, 4:26:42 UTC - in response to Message 7333.  

I suspect it is caused by a vdi registration error (on Windows as well as on Linux):
VBoxManage.exe: error: Cannot register the hard disk 'C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev\CMS_2021_07_07.vdi' {f888c51e-0503-4495-8794-fd67809dc4e8} because a hard disk 'C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2021_07_07.vdi' with UUID {f888c51e-0503-4495-8794-fd67809dc4e8} already exists

The old app version as well as the new app version both refer to the same vdi file "CMS_2021_07_07.vdi".
To allow both app versions to coexist the new vdi file must have a different name and a different UUID.
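For anyone who wants to spot this condition in their own logs, here is a minimal Python sketch. It assumes the error line always has the shape quoted above; the regular expression and the function name `parse_register_error` are illustrative and not part of VirtualBox or BOINC.

```python
import re

# Pattern modelled on the single error line quoted in this thread; other
# VirtualBox versions may word the message differently (assumption).
REGISTER_ERROR = re.compile(
    r"Cannot register the hard disk '(?P<new_path>[^']+)' "
    r"\{(?P<new_uuid>[0-9a-fA-F-]+)\} because a hard disk "
    r"'(?P<old_path>[^']+)' with UUID \{(?P<old_uuid>[0-9a-fA-F-]+)\} "
    r"already exists"
)


def parse_register_error(line):
    """Return the conflicting paths and UUIDs, or None if the line does not match."""
    match = REGISTER_ERROR.search(line)
    return match.groupdict() if match else None


# The error line from this thread, split across raw strings for readability.
sample = (
    r"VBoxManage.exe: error: Cannot register the hard disk "
    r"'C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev\CMS_2021_07_07.vdi' "
    r"{f888c51e-0503-4495-8794-fd67809dc4e8} because a hard disk "
    r"'C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2021_07_07.vdi' "
    r"with UUID {f888c51e-0503-4495-8794-fd67809dc4e8} already exists"
)

info = parse_register_error(sample)
if info is not None:
    print("same UUID:", info["new_uuid"] == info["old_uuid"])
    print("different paths:", info["new_path"] != info["old_path"])
```

The two paths differ only in the project directory, which is easy to miss in the long line; comparing the parsed fields makes the collision obvious.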


@Laurence
Please clone the old vdi file using a command line like this:
vboxmanage clonemedium CMS_2021_07_07.vdi CMS_2022_06_15.vdi

Then create the new app version with CMS_2022_06_15.vdi instead of CMS_2021_07_07.vdi.
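If the clone step is ever scripted, it could look roughly like the Python sketch below. The helper names are hypothetical, the date-based file name pattern is only inferred from the two names in this thread, and the executable name `VBoxManage` is an assumption about how it is spelled on the build machine.

```python
import datetime
import subprocess


def dated_vdi_name(prefix, day):
    """Build a name like CMS_2022_06_15.vdi (pattern inferred from this thread)."""
    return f"{prefix}_{day:%Y_%m_%d}.vdi"


def clone_command(source, target, vboxmanage="VBoxManage"):
    """Return the argument list for the clonemedium call suggested above."""
    return [vboxmanage, "clonemedium", source, target]


def clone_vdi(source, target):
    """Run the clone. Needs VirtualBox installed, so it is not called here."""
    subprocess.run(clone_command(source, target), check=True)


target = dated_vdi_name("CMS", datetime.date(2022, 6, 15))
print(" ".join(clone_command("CMS_2021_07_07.vdi", target)))
```

Building the argument list separately from running it keeps the command easy to inspect before anything touches the media registry.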



BTW
BOINC first published an unstripped Linux version of vboxwrapper.
They recently updated the executable with a stripped version which is slightly smaller:
https://boinc.berkeley.edu/dl/vboxwrapper_26204_x86_64-pc-linux-gnu
Crystal Pellet
Volunteer tester

Message 7335 - Posted: 15 Jun 2022, 6:05:25 UTC

My CMS test with the new vboxwrapper is running fine. It is currently on its 4th cmsRun and should finish in about 3 hours.

MasterLog	2022-06-15 07:44	245K	 
StartdLog	2022-06-15 07:41	286K	 
StarterLog	2022-06-14 21:00	215	 
finished_0.log	2022-06-14 21:03	39	 
finished_1.log	2022-06-15 00:43	1.4M	 
finished_2.log	2022-06-15 03:50	1.4M	 
finished_3.log	2022-06-15 07:04	1.4M	 
running.log	2022-06-15 07:46	239K	 
stderr.log	2022-06-15 07:42	23K	 
stdout.log	2022-06-15 07:01	24K	 
wmagentJob.log	2022-06-15 07:42	6.7K	 
wmagentJob_1.log	2022-06-15 00:43	23K	 
wmagentJob_2.log	2022-06-15 03:50	23K	 
wmagentJob_3.log	2022-06-15 07:04	23K

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3092830

The snapshot differencing file is currently 1460 MB.

stderr.txt so far:

2022-06-14 20:52:55 (13408): Detected: vboxwrapper 26204
2022-06-14 20:52:55 (13408): Detected: BOINC client v7.19.0
2022-06-14 20:52:55 (13408): Detected: VirtualBox VboxManage Interface (Version: 6.1.34)
2022-06-14 20:52:56 (13408): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds)
2022-06-14 20:52:56 (13408): Successfully copied 'init_data.xml' to the shared directory.
2022-06-14 20:52:56 (13408): Create VM. (boinc_0514318c2c914930, slot#0)
2022-06-14 20:52:57 (13408): Setting Memory Size for VM. (2048MB)
2022-06-14 20:52:57 (13408): Setting CPU Count for VM. (1)
2022-06-14 20:52:57 (13408): Setting Chipset Options for VM.
2022-06-14 20:52:58 (13408): Setting Graphics Controller Options for VM.
2022-06-14 20:52:58 (13408): Setting Boot Options for VM.
2022-06-14 20:52:58 (13408): Setting Network Configuration for NAT.
2022-06-14 20:52:58 (13408): Enabling VM Network Access.
2022-06-14 20:52:59 (13408): Disabling USB Support for VM.
2022-06-14 20:52:59 (13408): Disabling COM Port Support for VM.
2022-06-14 20:52:59 (13408): Disabling LPT Port Support for VM.
2022-06-14 20:53:00 (13408): Disabling Audio Support for VM.
2022-06-14 20:53:00 (13408): Disabling Clipboard Support for VM.
2022-06-14 20:53:00 (13408): Disabling Drag and Drop Support for VM.
2022-06-14 20:53:00 (13408): Adding storage controller(s) to VM.
2022-06-14 20:53:01 (13408): Adding virtual disk drive to VM. (CMS_2021_07_07.vdi)
2022-06-14 20:53:02 (13408): Adding VirtualBox Guest Additions to VM.
2022-06-14 20:53:02 (13408): Adding network bandwidth throttle group to VM. (Defaulting to 1024GB)
2022-06-14 20:53:02 (13408): forwarding host port 53846 to guest port 80
2022-06-14 20:53:03 (13408): Enabling remote desktop for VM.
2022-06-14 20:53:03 (13408): Enabling shared directory for VM.
2022-06-14 20:53:04 (13408): Starting VM using VBoxManage interface. (boinc_0514318c2c914930, slot#0)
2022-06-14 20:53:11 (13408): Successfully started VM. (PID = '9800')
2022-06-14 20:53:11 (13408): Reporting VM Process ID to BOINC.
2022-06-14 20:53:11 (13408): Guest Log: BIOS: VirtualBox 6.1.34
2022-06-14 20:53:11 (13408): Guest Log: CPUID EDX: 0x178bfbff
2022-06-14 20:53:11 (13408): Guest Log: BIOS: No PCI IDE controller, not probing IDE
2022-06-14 20:53:11 (13408): Guest Log: BIOS: AHCI 0-P#0: PCHS=16383/16/63 LCHS=1024/255/63 0x0000000002800000 sectors
2022-06-14 20:53:11 (13408): VM state change detected. (old = 'poweredoff', new = 'running')
2022-06-14 20:53:11 (13408): Detected: Web Application Enabled (http://localhost:53846)
2022-06-14 20:53:11 (13408): Detected: Remote Desktop Enabled (localhost:53847)
2022-06-14 20:53:11 (13408): Preference change detected
2022-06-14 20:53:11 (13408): Setting CPU throttle for VM. (100%)
2022-06-14 20:53:12 (13408): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 150 seconds) or (Vbox_job.xml: 600 seconds))
2022-06-14 20:53:13 (13408): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032
2022-06-14 20:53:13 (13408): Guest Log: BIOS: Booting from Hard Disk...
2022-06-14 20:53:16 (13408): Guest Log: BIOS: KBD: unsupported int 16h function 03
2022-06-14 20:53:16 (13408): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000
2022-06-14 20:53:44 (13408): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds
2022-06-14 20:53:44 (13408): Guest Log: vboxguest: misc device minor 56, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
2022-06-14 20:53:46 (13408): Guest Log: VBoxService 5.2.6 r120293 (verbosity: 0) linux.amd64 (Jan 15 2018 14:51:00) release log
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000193 main Log opened 2022-06-14T18:53:45.742080000Z
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000423 main OS Product: Linux
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000545 main OS Release: 4.14.232-19.cernvm.x86_64
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000587 main OS Version: #1 SMP Fri Apr 30 17:12:25 CEST 2021
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000639 main Executable: /usr/sbin/VBoxService
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000642 main Process ID: 2169
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.000644 main Package type: LINUX_64BITS_GENERIC
2022-06-14 20:53:46 (13408): Guest Log: 00:00:00.005730 main 5.2.6 r120293 started. Verbose level = 0
2022-06-14 20:54:05 (13408): Guest Log: [INFO] Mounting the shared directory
2022-06-14 20:54:05 (13408): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2022-06-14 20:54:05 (13408): Guest Log: [INFO] Sourcing essential functions from /cvmfs/grid.cern.ch
2022-06-14 20:54:05 (13408): Guest Log: [INFO] Testing connection to cern.ch
2022-06-14 20:54:05 (13408): Guest Log: [INFO] Testing connection to VCCS
2022-06-14 20:54:06 (13408): Guest Log: [INFO] Testing connection to HTCondor
2022-06-14 20:54:06 (13408): Guest Log: [INFO] Testing connection to WMAgent
2022-06-14 20:54:06 (13408): Guest Log: [INFO] Testing connection to EOSCMS
2022-06-14 20:54:06 (13408): Guest Log: [INFO] Testing connection to CMS-Frontier
2022-06-14 20:54:06 (13408): Guest Log: [INFO] Testing connection to Frontier
2022-06-14 20:54:07 (13408): Guest Log: [INFO] Could not find a local HTTP proxy
2022-06-14 20:54:07 (13408): Guest Log: [INFO] CVMFS and Frontier will have to use DIRECT connections
2022-06-14 20:54:07 (13408): Guest Log: [INFO] This makes the application less efficient
2022-06-14 20:54:07 (13408): Guest Log: [INFO] It also puts higher load on the project servers
2022-06-14 20:54:07 (13408): Guest Log: [INFO] Setting up a local HTTP proxy is highly recommended
2022-06-14 20:54:07 (13408): Guest Log: [INFO] Advice can be found in the project forum
2022-06-14 20:54:07 (13408): Guest Log: [INFO] Reloading and probing the CVMFS configuration
2022-06-14 20:54:21 (13408): Guest Log: [INFO] Probing /cvmfs/cvmfs-config.cern.ch... OK
2022-06-14 20:54:21 (13408): Guest Log: [INFO] Probing /cvmfs/grid.cern.ch... OK
2022-06-14 20:54:30 (13408): Guest Log: [INFO] Probing /cvmfs/oasis.opensciencegrid.org... OK
2022-06-14 20:54:30 (13408): Guest Log: [INFO] Probing /cvmfs/singularity.opensciencegrid.org... OK
2022-06-14 20:54:31 (13408): Guest Log: [INFO] Probing /cvmfs/cms.cern.ch... OK
2022-06-14 20:54:31 (13408): Guest Log: [INFO] Probing /cvmfs/cms-ib.cern.ch... OK
2022-06-14 20:54:32 (13408): Guest Log: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
2022-06-14 20:54:32 (13408): Guest Log: [INFO] 2.7.2.0 http://s1fnal-cvmfs.openhtc.io:8080 DIRECT
2022-06-14 20:54:32 (13408): Guest Log: [INFO] Reading volunteer information
2022-06-14 20:54:46 (13408): Guest Log: [INFO] Requesting an X509 credential from LHC@home
2022-06-14 20:54:47 (13408): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
2022-06-14 20:54:49 (13408): Guest Log: [INFO] CMS application starting. Check log files.
2022-06-14 21:40:17 (13408): Preference change detected
2022-06-14 21:40:17 (13408): Setting CPU throttle for VM. (100%)
2022-06-14 21:40:17 (13408): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 150 seconds) or (Vbox_job.xml: 600 seconds))
2022-06-14 21:42:36 (13408): Preference change detected
2022-06-14 21:42:36 (13408): Setting CPU throttle for VM. (100%)
2022-06-14 21:42:37 (13408): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 150 seconds) or (Vbox_job.xml: 600 seconds))
2022-06-14 22:04:54 (13408): Stopping VM.
2022-06-14 22:05:10 (13408): Successfully stopped VM.
2022-06-14 22:07:21 (2904): Detected: vboxwrapper 26204
2022-06-14 22:07:21 (2904): Detected: BOINC client v7.19.0
2022-06-14 22:07:21 (2904): Detected: VirtualBox VboxManage Interface (Version: 6.1.34)
2022-06-14 22:07:21 (2904): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds)
2022-06-14 22:07:22 (2904): Guest Log: BIOS: VirtualBox 6.1.34
2022-06-14 22:07:22 (2904): Guest Log: CPUID EDX: 0x178bfbff
2022-06-14 22:07:22 (2904): Guest Log: BIOS: No PCI IDE controller, not probing IDE
2022-06-14 22:07:22 (2904): Guest Log: BIOS: AHCI 0-P#0: PCHS=16383/16/63 LCHS=1024/255/63 0x0000000002800000 sectors
2022-06-14 22:07:22 (2904): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032
2022-06-14 22:07:22 (2904): Guest Log: BIOS: Booting from Hard Disk...
2022-06-14 22:07:22 (2904): Guest Log: BIOS: KBD: unsupported int 16h function 03
2022-06-14 22:07:22 (2904): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000
2022-06-14 22:07:22 (2904): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds
2022-06-14 22:07:22 (2904): Guest Log: vboxguest: misc device minor 56, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000)
2022-06-14 22:07:22 (2904): Guest Log: VBoxService 5.2.6 r120293 (verbosity: 0) linux.amd64 (Jan 15 2018 14:51:00) release log
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000193 main Log opened 2022-06-14T18:53:45.742080000Z
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000423 main OS Product: Linux
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000545 main OS Release: 4.14.232-19.cernvm.x86_64
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000587 main OS Version: #1 SMP Fri Apr 30 17:12:25 CEST 2021
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000639 main Executable: /usr/sbin/VBoxService
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000642 main Process ID: 2169
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.000644 main Package type: LINUX_64BITS_GENERIC
2022-06-14 22:07:22 (2904): Guest Log: 00:00:00.005730 main 5.2.6 r120293 started. Verbose level = 0
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Mounting the shared directory
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Sourcing essential functions from /cvmfs/grid.cern.ch
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to cern.ch
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to VCCS
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to HTCondor
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to WMAgent
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to EOSCMS
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to CMS-Frontier
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Testing connection to Frontier
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Could not find a local HTTP proxy
2022-06-14 22:07:22 (2904): Guest Log: [INFO] CVMFS and Frontier will have to use DIRECT connections
2022-06-14 22:07:22 (2904): Guest Log: [INFO] This makes the application less efficient
2022-06-14 22:07:22 (2904): Guest Log: [INFO] It also puts higher load on the project servers
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Setting up a local HTTP proxy is highly recommended
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Advice can be found in the project forum
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Reloading and probing the CVMFS configuration
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Probing /cvmfs/cvmfs-config.cern.ch... OK
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Probing /cvmfs/grid.cern.ch... OK
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Probing /cvmfs/oasis.opensciencegrid.org... OK
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Probing /cvmfs/singularity.opensciencegrid.org... OK
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Probing /cvmfs/cms.cern.ch... OK
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Probing /cvmfs/cms-ib.cern.ch... OK
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
2022-06-14 22:07:22 (2904): Guest Log: [INFO] 2.7.2.0 http://s1fnal-cvmfs.openhtc.io:8080 DIRECT
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Reading volunteer information
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Requesting an X509 credential from LHC@home
2022-06-14 22:07:22 (2904): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
2022-06-14 22:07:22 (2904): Guest Log: [INFO] CMS application starting. Check log files.
2022-06-14 22:07:22 (2904): Starting VM using VBoxManage interface. (boinc_0514318c2c914930, slot#0)
2022-06-14 22:07:38 (2904): Successfully started VM. (PID = '2500')
2022-06-14 22:07:38 (2904): Reporting VM Process ID to BOINC.
2022-06-14 22:07:38 (2904): VM state change detected. (old = 'poweredoff', new = 'running')
2022-06-14 22:07:38 (2904): Detected: Web Application Enabled (http://localhost:53846)
2022-06-14 22:07:38 (2904): Detected: Remote Desktop Enabled (localhost:53847)
2022-06-14 22:07:38 (2904): Preference change detected
2022-06-14 22:07:38 (2904): Setting CPU throttle for VM. (100%)
2022-06-14 22:07:39 (2904): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 150 seconds) or (Vbox_job.xml: 600 seconds))
2022-06-14 22:37:53 (2904): Status Report: Job Duration: '64800.000000'
2022-06-14 22:37:53 (2904): Status Report: Elapsed Time: '6000.649441'
2022-06-14 22:37:53 (2904): Status Report: CPU Time: '5724.281250'
2022-06-15 00:18:12 (2904): Status Report: Job Duration: '64800.000000'
2022-06-15 00:18:12 (2904): Status Report: Elapsed Time: '12000.649441'
2022-06-15 00:18:12 (2904): Status Report: CPU Time: '11701.828125'
2022-06-15 01:58:15 (2904): Status Report: Job Duration: '64800.000000'
2022-06-15 01:58:15 (2904): Status Report: Elapsed Time: '18000.649441'
2022-06-15 01:58:15 (2904): Status Report: CPU Time: '17559.312500'
2022-06-15 03:38:32 (2904): Status Report: Job Duration: '64800.000000'
2022-06-15 03:38:32 (2904): Status Report: Elapsed Time: '24000.649441'
2022-06-15 03:38:32 (2904): Status Report: CPU Time: '23545.921875'
2022-06-15 05:18:36 (2904): Status Report: Job Duration: '64800.000000'
2022-06-15 05:18:36 (2904): Status Report: Elapsed Time: '30000.649441'
2022-06-15 05:18:36 (2904): Status Report: CPU Time: '29417.093750'
2022-06-15 06:59:36 (2904): Status Report: Job Duration: '64800.000000'
2022-06-15 06:59:36 (2904): Status Report: Elapsed Time: '36000.765282'
2022-06-15 06:59:36 (2904): Status Report: CPU Time: '35387.703125'
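The Status Report lines in a log like this can be turned into a rough CPU efficiency figure. The Python sketch below assumes the line format shown above and takes the last reported values; the function name `cpu_efficiency` is made up for this example.

```python
import re

# Matches the three Status Report fields as they appear in the stderr.txt above.
STATUS = re.compile(
    r"Status Report: (?P<key>Job Duration|Elapsed Time|CPU Time): '(?P<value>[0-9.]+)'"
)


def cpu_efficiency(log_text):
    """Return CPU time / elapsed time from the latest Status Report, or None."""
    latest = {}
    for match in STATUS.finditer(log_text):
        # Later reports overwrite earlier ones, so the last values win.
        latest[match.group("key")] = float(match.group("value"))
    elapsed = latest.get("Elapsed Time")
    cpu = latest.get("CPU Time")
    if not elapsed or cpu is None:
        return None
    return cpu / elapsed


excerpt = """\
2022-06-15 06:59:36 (2904): Status Report: Job Duration: '64800.000000'
2022-06-15 06:59:36 (2904): Status Report: Elapsed Time: '36000.765282'
2022-06-15 06:59:36 (2904): Status Report: CPU Time: '35387.703125'
"""

efficiency = cpu_efficiency(excerpt)
print(f"CPU efficiency: {efficiency:.1%}")
```

For the last report above this comes out at roughly 98%, which fits a single-core VM that is busy almost all of the time.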
Profile ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Message 7338 - Posted: 15 Jun 2022, 8:54:49 UTC - in response to Message 7334.  

I suspect it is caused by a vdi registration error (on Windows as well as on Linux):
VBoxManage.exe: error: Cannot register the hard disk 'C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev\CMS_2021_07_07.vdi' {f888c51e-0503-4495-8794-fd67809dc4e8} because a hard disk 'C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2021_07_07.vdi' with UUID {f888c51e-0503-4495-8794-fd67809dc4e8} already exists

The old app version as well as the new app version both refer to the same vdi file "CMS_2021_07_07.vdi".
To allow both app versions to coexist the new vdi file must have a different name and a different UUID.

Ah, thanks, I'd missed that difference because of the long line and my eyesight problems -- I thought it was the same file.
Profile Laurence
Project administrator
Project developer
Project tester
Message 7341 - Posted: 15 Jun 2022, 11:41:48 UTC - in response to Message 7338.  

It is working fine for me with the Theory app.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3092467

Maybe it is an issue with the app version upgrade. Does a project reset fix it?
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7342 - Posted: 15 Jun 2022, 12:26:42 UTC - in response to Message 7341.  

It's not caused directly by BOINC.
The issue appears whenever the VirtualBox media manager tries to register a vdi with a UUID that is already in its list.
If that vdi is attached to a VM it can't be switched to multiattach mode.

Once switched to multiattach mode a vdi can be used by many VMs.


The old method copies the original vdi to a slot and sets a new random UUID.
That's why all of them can be named "vm_image.vdi".

See the comments in the BOINC source code:
https://github.com/BOINC/boinc/blob/master/samples/vboxwrapper/vbox_vboxmanage.cpp#L519-L593
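To make the difference between the two methods concrete, here is a toy Python model of a registry keyed by UUID. It is not VirtualBox or vboxwrapper code; the class `ToyMediaRegistry`, the helper `copy_with_new_uuid` and the short paths are all invented for illustration.

```python
import uuid


class ToyMediaRegistry:
    """Toy stand-in for a media registry keyed by UUID (not VirtualBox code)."""

    def __init__(self):
        self._by_uuid = {}

    def register(self, path, disk_uuid):
        existing = self._by_uuid.get(disk_uuid)
        if existing is not None and existing != path:
            raise ValueError(
                f"cannot register {path}: UUID {disk_uuid} already used by {existing}"
            )
        self._by_uuid[disk_uuid] = path


def copy_with_new_uuid(slot_path):
    """Old method: each slot gets its own copy with a fresh random UUID."""
    return slot_path, str(uuid.uuid4())


registry = ToyMediaRegistry()

# Old method: identical file names are fine because every copy has its own UUID.
registry.register(*copy_with_new_uuid("slots/0/vm_image.vdi"))
registry.register(*copy_with_new_uuid("slots/1/vm_image.vdi"))

# New method with an unchanged image: same UUID under two project directories.
shared_uuid = "f888c51e-0503-4495-8794-fd67809dc4e8"
registry.register("prod/CMS_2021_07_07.vdi", shared_uuid)
try:
    registry.register("dev/CMS_2021_07_07.vdi", shared_uuid)
except ValueError as error:
    print(error)
```

In this model the file name plays no role at all; only the UUID decides whether a second registration is accepted, which matches the behaviour described in this post.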
Profile ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Message 7346 - Posted: 15 Jun 2022, 15:55:51 UTC - in response to Message 7342.  

Hmm, yes, I got the new wrapper running on both my machines by making sure CMS@Home wasn't running (I had to manually remove the BOINC VMs on Windows as they hung around in VirtualBox after I did a pause and abort on the tasks).
Profile Magic Quantum Mechanic
Message 7348 - Posted: 15 Jun 2022, 17:33:41 UTC

No problems here (Windows 10)
maeax

Message 7350 - Posted: 15 Jun 2022, 19:03:47 UTC - in response to Message 7348.  

https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=2189965
Windows 10 Pro is downloading CMS_2021_07_07.vdi (3.7 GB)
Profile Laurence
Project administrator
Project developer
Project tester
Message 7355 - Posted: 16 Jun 2022, 12:50:16 UTC - in response to Message 7350.  

What is the consensus on this? Is it ready for the prod server next week?
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7357 - Posted: 16 Jun 2022, 13:36:23 UTC - in response to Message 7355.  

I checked some logs from my computers as well as from other volunteers (Windows and Linux).
Looks like the new vboxwrapper works as expected.

A heavy-load scenario on computers with many cores has not been tested yet, but I'm sure the new vboxwrapper isn't less robust than the old one.
I tested it with a self-compiled version, not with the version provided by BOINC.

Please use a cloned CMS vdi file (new name + new UUID) when you prepare the app version for the production server.
It's a few seconds of work but ensures that both app versions can coexist on the clients until older tasks are finished.
Profile Magic Quantum Mechanic
Message 7359 - Posted: 16 Jun 2022, 20:02:41 UTC - in response to Message 7355.  

What is the consensus on this? Is it ready for the prod server next week?


I say YES Ivan

(at least as far as Windows 10 goes) and (Version: 6.1.32)
Richie_unstable

Message 7366 - Posted: 17 Jun 2022, 11:02:51 UTC

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093277
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093276
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093290
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093339

My host had some sort of problem with all of these CMS tasks. But it is running BOINC 7.20.0 (development version) + Windows 11 + VirtualBox 6.1.34
... so maybe that has something to do with it. Or is it clearly something else?


ATLAS tasks had "Outcome : Success"...
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093565

... but
Run time 24 min 48 sec
CPU time 2 min 35 sec
... and "No HITS file was produced" for all three of them and these lines in Stderr output :

2022-06-17 02:07:07 (2204): Guest Log: *** Job finished ***
2022-06-17 02:07:07 (2204): Guest Log: *** The last 20 lines of the pilot log: ***
2022-06-17 02:07:07 (2204): Guest Log: *** Error codes and diagnostics ***
2022-06-17 02:07:07 (2204): Guest Log: "exeErrorCode": 65,
2022-06-17 02:07:07 (2204): Guest Log: "exeErrorDiag": "Non-zero return code from EVNTtoHITS (33); Logfile error in log.EVNTtoHITS: \"IOVDbSvc FATAL Conditions database connection COOLOFL_TRT/OFLP200 cannot be opened - STOP\"",
2022-06-17 02:07:07 (2204): Guest Log: "pilotErrorCode": 1165,
2022-06-17 02:07:07 (2204): Guest Log: "pilotErrorDiag": "Local output file is missing"


The Theory task ran without problems.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3092782
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7368 - Posted: 17 Jun 2022, 11:36:31 UTC - in response to Message 7366.  

The good news:
Your CMS tasks correctly configure the differencing image and boot the VM.

The bad news:
There are lots of network errors when the bootstrap script inside the VM runs its network tests.
Might be a firewall issue.

Example https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093277:
2022-06-17 10:05:30 (1696): Guest Log: [INFO] Testing connection to HTCondor
2022-06-17 10:05:45 (1696): Guest Log: [DEBUG] Status run 1 of up to 3: 1
2022-06-17 10:06:06 (1696): Guest Log: [DEBUG] Status run 2 of up to 3: 1
2022-06-17 10:06:36 (1696): Guest Log: [DEBUG] Status run 3 of up to 3: 1
2022-06-17 10:06:36 (1696): Guest Log: [DEBUG] run 1
.
.
.
<and many lines below>
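To see at a glance which services fail in a long stderr output, the lines can be grouped per tested service. The Python sketch below relies only on the two line formats visible in the excerpt above; reading the trailing number as a status where non-zero means failure is an assumption, and `collect_statuses` is a made-up name.

```python
import re

TEST_START = re.compile(r"\[INFO\] Testing connection to (?P<service>.+?)\s*$")
STATUS_RUN = re.compile(
    r"\[DEBUG\] Status run (?P<run>\d+) of up to (?P<max>\d+): (?P<status>\d+)"
)


def collect_statuses(lines):
    """Map each tested service to the list of status values logged for it."""
    results = {}
    current = None
    for line in lines:
        start = TEST_START.search(line)
        if start:
            current = start.group("service")
            results.setdefault(current, [])
            continue
        run = STATUS_RUN.search(line)
        if run and current is not None:
            results[current].append(int(run.group("status")))
    return results


excerpt = [
    "2022-06-17 10:05:30 (1696): Guest Log: [INFO] Testing connection to HTCondor",
    "2022-06-17 10:05:45 (1696): Guest Log: [DEBUG] Status run 1 of up to 3: 1",
    "2022-06-17 10:06:06 (1696): Guest Log: [DEBUG] Status run 2 of up to 3: 1",
    "2022-06-17 10:06:36 (1696): Guest Log: [DEBUG] Status run 3 of up to 3: 1",
]

for service, statuses in collect_statuses(excerpt).items():
    # Assumption: a non-zero status marks a failed attempt.
    failed = sum(1 for status in statuses if status != 0)
    print(f"{service}: {failed} of {len(statuses)} attempts reported non-zero")
```

A service that passes on the first attempt logs no DEBUG lines in the healthy log earlier in this thread, so it simply ends up with an empty list here.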



Didn't look into the ATLAS example yet.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7373 - Posted: 17 Jun 2022, 12:34:55 UTC - in response to Message 7366.  

Your ATLAS VM boots fine and uses a differencing image.
See:
2022-06-17 01:42:33 (2204): Adding virtual disk drive to VM. (ATLAS_vbox_0.84_image.vdi)



The error happens much deeper inside the running VM in one of the ATLAS scripts:
2022-06-17 02:07:07 (2204): Guest Log:     "exeErrorDiag": "Non-zero return code from EVNTtoHITS (33); Logfile error in log.EVNTtoHITS: \"IOVDbSvc            FATAL Conditions database connection COOLOFL_TRT/OFLP200 cannot be opened - STOP\"",
Richie_unstable

Message 7382 - Posted: 17 Jun 2022, 17:33:03 UTC - in response to Message 7368.  

The bad news:
There are lots of network errors when the bootstrap script inside the VM runs its network tests.
Might be a firewall issue.


Okay, I believe you are right.

I fired up another host that runs Windows 10 + BOINC 7.20.0 + VirtualBox 6.1.34. This host produced the same errors too:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093178

Then I downgraded BOINC from 7.20.0 to 7.16.20. Same errors again.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093397

Then I downgraded VirtualBox from 6.1.34 to 6.1.32. Same errors again.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3093379

2022-06-17 20:05:13 (4272): VM Completion Message: Could not connect to all required network services

I wish I knew what to change and where. But I think I'll just pause trying these CMS tasks for now so that I won't flood this board with my messages. This network thing seems to be a problem on my hosts only.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7383 - Posted: 17 Jun 2022, 18:17:28 UTC - in response to Message 7382.  

It may help to know which script causes the errors.
It can be found here:
https://gitlab.cern.ch/vc/vm/-/blob/master/bin/basic_network_tests

The test command is just 1 line in that script:
https://gitlab.cern.ch/vc/vm/-/blob/master/bin/basic_network_tests#L20



The script is called a couple of times from inside the VM when bootstrap-cms is executed.
https://gitlab.cern.ch/vc/vm/blob/master/sbin/bootstrap-cms#L52-L76
Hence, it's CMS only.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Message 7395 - Posted: 19 Jun 2022, 7:25:02 UTC

Just checked the UUIDs of the CMS vdis from the dev server and the prod server.
They are the same since the vdis are identical.

This causes problems when a volunteer runs tasks from dev and prod concurrently.
See:
https://lhcathomedev.cern.ch/lhcathome-dev/forum_thread.php?id=563&postid=7390


@Laurence
Please ensure each vdi that is sent out has a unique UUID.
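A uniqueness check of this kind is easy to automate before a release. The Python sketch below takes a mapping of image path to UUID, however that mapping is obtained, and reports UUIDs shared by more than one path. The function name `find_shared_uuids` and the third entry in the example data are hypothetical.

```python
from collections import defaultdict


def find_shared_uuids(images):
    """Return {uuid: [paths]} for every UUID that more than one image uses."""
    by_uuid = defaultdict(list)
    for path, disk_uuid in images.items():
        # Compare case-insensitively, since UUIDs may be printed in either case.
        by_uuid[disk_uuid.lower()].append(path)
    return {u: sorted(paths) for u, paths in by_uuid.items() if len(paths) > 1}


images = {
    # The two paths and the UUID reported earlier in this thread.
    r"C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev\CMS_2021_07_07.vdi":
        "f888c51e-0503-4495-8794-fd67809dc4e8",
    r"C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2021_07_07.vdi":
        "f888c51e-0503-4495-8794-fd67809dc4e8",
    # Hypothetical cloned image with a made-up UUID.
    "CMS_2022_06_15.vdi": "00000000-0000-4000-8000-000000000001",
}

for disk_uuid, paths in find_shared_uuids(images).items():
    print(f"UUID {disk_uuid} is shared by {len(paths)} images:")
    for path in paths:
        print("  ", path)
```

Note that a new file name alone does not change the outcome of this check; only a new UUID does.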
Profile Laurence
Project administrator
Project developer
Project tester
Message 7405 - Posted: 20 Jun 2022, 9:56:24 UTC - in response to Message 7395.  


@Laurence
Please ensure each vdi that is sent out has a unique UUID.


I will release a new version later with a name change. Hopefully that will be enough.