Message boards : LHCb Application : New version v1.02
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 5274 - Posted: 14 Dec 2017, 10:46:52 UTC

A new version has been requested by LHCb.
Profile Dingo
Joined: 10 May 15
Posts: 4
Credit: 39,333
RAC: 0
Message 5291 - Posted: 22 Dec 2017, 3:43:18 UTC
Last modified: 22 Dec 2017, 4:15:12 UTC

I have run three of these and they all end in error code 5. This is an example: LHCb Simulation v1.02 (vbox64_mt_mcore)
x86_64-pc-linux-gnu

Name	LHCb_25826_1513818422.243445_0
Workunit	360294
Created	21 Dec 2017, 1:07:05 UTC
Sent	21 Dec 2017, 12:33:30 UTC
Report deadline	28 Dec 2017, 12:33:30 UTC
Received	21 Dec 2017, 12:57:06 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	5 (0x00000005) Unknown error code
Computer ID	2332
Run time	30 sec
CPU time	
Validate state	Invalid
Credit	0.00
Device peak FLOPS	34.79 GFLOPS
Application version	LHCb Simulation v1.02 (vbox64_mt_mcore)
x86_64-pc-linux-gnu
Peak working set size	16.36 MB
Peak swap size	317.56 MB
Peak disk usage	2,382.03 MB
Stderr output
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process exited with code 5 (0x5, -251)
</message>
<stderr_txt>
2017-12-21 07:36:44 (11935): vboxwrapper (7.7.26196): starting
2017-12-21 07:36:44 (11935): Feature: Checkpoint interval offset (308 seconds)
2017-12-21 07:36:44 (11935): Detected: VirtualBox VboxManage Interface (Version: 5.2.4)
2017-12-21 07:36:44 (11935): Detected: Minimum checkpoint interval (600.000000 seconds)
2017-12-21 07:36:44 (11935): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds)
2017-12-21 07:36:44 (11935): Successfully copied 'init_data.xml' to the shared directory.
2017-12-21 07:36:44 (11935): Create VM. (boinc_d9679f03a02df903, slot#35)
2017-12-21 07:37:00 (11935): Setting Memory Size for VM. (1830MB)
2017-12-21 07:37:00 (11935): Setting CPU Count for VM. (12)
2017-12-21 07:37:00 (11935): Setting Chipset Options for VM.
2017-12-21 07:37:00 (11935): Setting Boot Options for VM.
2017-12-21 07:37:00 (11935): Setting Network Configuration for NAT.
2017-12-21 07:37:00 (11935): Enabling VM Network Access.
2017-12-21 07:37:13 (11935): Disabling USB Support for VM.
2017-12-21 07:37:13 (11935): Disabling COM Port Support for VM.
2017-12-21 07:37:13 (11935): Disabling LPT Port Support for VM.
2017-12-21 07:37:13 (11935): Disabling Audio Support for VM.
2017-12-21 07:37:13 (11935): Disabling Clipboard Support for VM.
2017-12-21 07:37:14 (11935): Disabling Drag and Drop Support for VM.
2017-12-21 07:37:14 (11935): Adding storage controller(s) to VM.
2017-12-21 07:37:14 (11935): Adding virtual disk drive to VM. (vm_image.vdi)
2017-12-21 07:37:19 (11935): Error in storage attach (fixed disk) for VM: -2135228411
Command:
VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:19 (11935): Powering off VM.
2017-12-21 07:37:19 (11935): Deregistering VM. (boinc_d9679f03a02df903, slot#35)
2017-12-21 07:37:19 (11935): Removing network bandwidth throttle group from VM.
2017-12-21 07:37:19 (11935): Removing storage controller(s) from VM.
2017-12-21 07:37:19 (11935): Removing VM from VirtualBox.
2017-12-21 07:37:20 (11935): Removing virtual disk drive from VirtualBox.

    Hypervisor System Log:


    VM Execution Log:


    VM Startup Log:


    VM Trace Log:

UUID: 24a70507-9078-46e2-abe8-80c3a9795403
Settings file: '/usr/bin/slots/35/boinc_d9679f03a02df903/boinc_d9679f03a02df903.vbox'

2017-12-21 07:37:00 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --description "LHCb_25826_1513818422.243445_0" 
Exit Code: 0
Output:

2017-12-21 07:37:00 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --memory 1830 
Exit Code: 0
Output:

2017-12-21 07:37:00 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --cpus 12 
Exit Code: 0
Output:

2017-12-21 07:37:00 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --acpi on --ioapic on 
Exit Code: 0
Output:

2017-12-21 07:37:00 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --boot1 disk --boot2 dvd --boot3 none --boot4 none 
Exit Code: 0
Output:

2017-12-21 07:37:00 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --nic1 nat --natdnsproxy1 on --cableconnected1 off 
Exit Code: 0
Output:

2017-12-21 07:37:13 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --cableconnected1 on 
Exit Code: 0
Output:

2017-12-21 07:37:13 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --usb off 
Exit Code: 0
Output:

2017-12-21 07:37:13 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --uart1 off --uart2 off 
Exit Code: 0
Output:

2017-12-21 07:37:13 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --lpt1 off --lpt2 off 
Exit Code: 0
Output:

2017-12-21 07:37:13 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --audio none 
Exit Code: 0
Output:

2017-12-21 07:37:14 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --clipboard disabled 
Exit Code: 0
Output:

2017-12-21 07:37:14 (11935): 
Command: VBoxManage -q modifyvm "boinc_d9679f03a02df903" --draganddrop disabled 
Exit Code: 0
Output:

2017-12-21 07:37:14 (11935): 
Command: VBoxManage -q storagectl "boinc_d9679f03a02df903" --name "Hard Disk Controller" --add "ide" --controller "PIIX4" 
Exit Code: 0
Output:

2017-12-21 07:37:14 (11935): 
Command: VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:15 (11935): 
Command: VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:16 (11935): 
Command: VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:17 (11935): 
Command: VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:18 (11935): 
Command: VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:19 (11935): 
Command: VBoxManage -q storageattach "boinc_d9679f03a02df903" --storagectl "Hard Disk Controller" --port 0 --device 0 --type hdd --setuuid "" --medium "/usr/bin/slots/35/vm_image.vdi" 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp
VBoxManage: error: Invalid UUID or filename "/usr/bin/slots/35/vm_image.vdi"

2017-12-21 07:37:19 (11935): 
Command: VBoxManage -q snapshot "boinc_d9679f03a02df903" list 
Exit Code: 0
Output:
This machine does not have any snapshots

2017-12-21 07:37:19 (11935): 
Command: VBoxManage -q bandwidthctl "boinc_d9679f03a02df903" remove "boinc_d9679f03a02df903_net" 
Exit Code: -2135228415
Output:
VBoxManage: error: Could not find a bandwidth group named 'boinc_d9679f03a02df903_net'
VBoxManage: error: Details: code VBOX_E_OBJECT_NOT_FOUND (0x80bb0001), component BandwidthControlWrap, interface IBandwidthControl, callee nsISupports
VBoxManage: error: Context: "DeleteBandwidthGroup(name.raw())" at line 259 of file VBoxManageBandwidthControl.cpp

2017-12-21 07:37:19 (11935): 
Command: VBoxManage -q storagectl "boinc_d9679f03a02df903" --name "Hard Disk Controller" --remove 
Exit Code: 0
Output:

2017-12-21 07:37:20 (11935): 
Command: VBoxManage -q unregistervm "boinc_d9679f03a02df903" --delete 
Exit Code: 0
Output:
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%

2017-12-21 07:37:20 (11935): 
Command: VBoxManage -q closemedium disk "/usr/bin/slots/35/vm_image.vdi" --delete 
Exit Code: -2135228411
Output:
VBoxManage: error: Could not get the storage format of the medium '/usr/bin/slots/35/vm_image.vdi' (VERR_NOT_SUPPORTED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MediumWrap, interface IMedium, callee nsISupports
VBoxManage: error: Context: "OpenMedium(Bstr(pszFilenameOrUuid).raw(), enmDevType, enmAccessMode, fForceNewUuidOnOpen, pMedium.asOutParam())" at line 179 of file VBoxManageDisk.cpp

07:37:25 (11935): called boinc_finish(-2135228411)

</stderr_txt>
]]>
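The exit code -2135228411 and the VirtualBox error code 0x80bb0005 (VBOX_E_IPRT_ERROR) in the log above appear to be the same 32-bit value, read once as a signed and once as an unsigned integer. A minimal Python sketch of that conversion follows; the function names are hypothetical and not part of BOINC or VirtualBox, and the idea that the reported exit status 5 is simply the low byte of that value is an assumption.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def low_byte(code: int) -> int:
    """Return the lowest 8 bits of the code (assumed to be what the exit status keeps)."""
    return code & 0xFF


storage_attach_exit = -2135228411  # "Error in storage attach" in the log
bandwidth_exit = -2135228415       # failed "bandwidthctl ... remove" step

print(hex(to_unsigned32(storage_attach_exit)))  # 0x80bb0005
print(hex(to_unsigned32(bandwidth_exit)))       # 0x80bb0001
print(low_byte(storage_attach_exit))            # 5
```

Read this way, the task-level "error code 5" points back to the failed storageattach call on vm_image.vdi rather than to a separate fault.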


gyllic

Joined: 10 Mar 17
Posts: 40
Credit: 108,345
RAC: 0
Message 5417 - Posted: 29 Apr 2018, 8:09:57 UTC - in response to Message 5291.  

computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 484
Credit: 394,839
RAC: 1
Message 5425 - Posted: 9 May 2018, 9:48:31 UTC

LHCb_2017_05_05.xml configures a RAM size of 2048 MB via "<memory_size_mb>2048</memory_size_mb>".
The VM (2-core) effectively runs with 830 MB, which is sent via sched_reply_lhcathomedev.cern.ch_lhcathome-dev.xml.

830 MB is much less than the 2048 MB of the normal 1-core VMs over at -prod AND it is exactly the same value that is used for Theory.

Is it a server side configuration error?
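One way to check for such a mismatch is to read `<memory_size_mb>` from the job XML and compare it with the size that vboxwrapper reports in its log. This is a minimal sketch: only the `memory_size_mb` element and the "Setting Memory Size for VM." log format come from this thread, while the `vbox_job` root element, the sample strings, and the helper names are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed minimal job XML; only <memory_size_mb> is quoted in the post.
JOB_XML = "<vbox_job><memory_size_mb>2048</memory_size_mb></vbox_job>"
# Sample wrapper log line in the format seen in this thread's logs.
LOG_LINE = "2018-07-11 14:05:29 (2580): Setting Memory Size for VM. (830MB)"


def configured_memory_mb(xml_text: str) -> int:
    """Return the memory size requested in the job XML, in MB."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("memory_size_mb"))


def effective_memory_mb(log_line: str) -> Optional[int]:
    """Return the memory size the wrapper actually set, or None if absent."""
    match = re.search(r"Setting Memory Size for VM\. \((\d+)MB\)", log_line)
    return int(match.group(1)) if match else None


configured = configured_memory_mb(JOB_XML)
effective = effective_memory_mb(LOG_LINE)
print(f"configured={configured} MB, effective={effective} MB")
if effective is not None and effective < configured:
    print(f"VM runs with {configured - effective} MB less than configured")
```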
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,325,404
RAC: 1,361
Message 5461 - Posted: 7 Jul 2018, 22:12:08 UTC

I decided to give these another test, and they seem to be producing Valids again. I only have them running on one old 3-core host, so it runs just one 2-core task at a time, but the last 2 were Valid, so I will let that host continue running these multi-core tasks. (It also looks like the single-core version over at LHC is working, so I loaded a few onto one of my 8-core hosts for a few days.)

I can only tell whether any of these are even running here by looking at the server status page, but that doesn't tell us how they are doing, and our stats pages haven't worked or been updated for a LONG time, so I can only go by what my hosts are doing here.
Mad Scientist For Life
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,325,404
RAC: 1,361
Message 5462 - Posted: 10 Jul 2018, 18:50:26 UTC
Last modified: 10 Jul 2018, 18:55:58 UTC

Well, so far all 7 of my tasks are Valid, but when I look at the Valid 2-core and 3-core tasks, for several of them the CPU time is less than the Run time.

I also checked a Linux host running these: some tasks run for 10 hours, but most run for only around 15 minutes.

https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=2206&offset=0&show_names=0&state=4&appid=3
maeax

Joined: 22 Apr 16
Posts: 677
Credit: 2,002,766
RAC: 2
Message 5463 - Posted: 11 Jul 2018, 13:31:00 UTC

I have one task running with two CPUs:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2128880
Six jobs have finished so far.
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,325,404
RAC: 1,361
Message 5464 - Posted: 12 Jul 2018, 3:25:10 UTC - in response to Message 5463.  

I just looked at yours and it is a good Valid, Axel.

Run time 13 hours 14 min 31 sec
CPU time 16 hours 7 min 14 sec
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 484
Credit: 394,839
RAC: 1
Message 5465 - Posted: 12 Jul 2018, 6:12:27 UTC - in response to Message 5463.  

I have one task running with two CPUs:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2128880
Six jobs have finished so far.


Did you limit the RAM setting somewhere or is it the project's default value?
It seems to be much too low for LHCb, especially in a 2-core setup.

2018-07-11 14:05:29 (2580): Setting Memory Size for VM. (830MB)
2018-07-11 14:05:29 (2580): Setting CPU Count for VM. (2)


It may be a result of that low RAM setting that your CPU efficiency is only 61 % and your log shows lines like this:
2018-07-12 03:19:44 (2580): VM did not power off when requested.
2018-07-12 03:19:44 (2580): VM was successfully terminated.
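The 61 % figure is consistent with dividing the CPU time by the run time multiplied by the number of cores, using the run and CPU times quoted for this task earlier in the thread. A minimal Python sketch of that arithmetic follows; the helper names are hypothetical, and the formula itself is an assumption about how the value was obtained.

```python
def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(cpu_seconds: float, run_seconds: float, cores: int) -> float:
    """Fraction of the available core time that was actually spent on CPU."""
    return cpu_seconds / (run_seconds * cores)


run_time = to_seconds(13, 14, 31)  # Run time 13 hours 14 min 31 sec
cpu_time = to_seconds(16, 7, 14)   # CPU time 16 hours 7 min 14 sec

print(f"{cpu_efficiency(cpu_time, run_time, cores=2):.0%}")  # 61%
```

By the same measure, a multi-core task whose CPU time is below its run time is using less than one core's worth of CPU on average.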
maeax

Joined: 22 Apr 16
Posts: 677
Credit: 2,002,766
RAC: 2
Message 5466 - Posted: 12 Jul 2018, 7:18:23 UTC

I have no settings of my own, only the defaults.
It is running alongside Atlas from Production on this Ryzen.

The LHCb log contains this line:
CPUID EDX: 0x178bfbff
According to Google, this means the hardware is not detected correctly.

VirtualBox 5.2.14, including the extension pack.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 484
Credit: 394,839
RAC: 1
Message 5467 - Posted: 12 Jul 2018, 8:26:03 UTC - in response to Message 5466.  

Just a guess:
A corrupt vdi?

You may want to check/clean your vbox environment and/or reset the project to get a fresh vdi.
maeax

Joined: 22 Apr 16
Posts: 677
Credit: 2,002,766
RAC: 2
Message 5468 - Posted: 12 Jul 2018, 8:43:39 UTC - in response to Message 5467.  

Just a guess:
A corrupt vdi?

You may want to check/clean your vbox environment and/or reset the project to get a fresh vdi.

LHCb is running for the first time on this PC (it is four weeks old), so the .vdi was downloaded yesterday.
The current task is running jobs 29 and 30 on two CPUs at the moment.
