Message boards : Theory Application : Windows Version
Author | Message |
---|---|
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
https://lhcathomedev.cern.ch/lhcathome-dev/results.php?userid=192 Your link only works when you have a MAGIC login. Link to the hostid in question. Probably the quota started low (2 maybe) when version 4.18 was launched. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
OK or not OK. The tail message during each startup is exactly the same. It seems to be a sherpa job. It's shown with each job, although the job description is different. These are my 6 running jobs now (6 seems to be the limit now, although I have a max of 8 defined): echo "runspec=boinc pp ue 900 - - pythia8 8.235 tune-CUETP8S1 100000 21" echo "runspec=boinc pp w1j 7000 150 - pythia6 6.428 373 100000 21" echo "runspec=boinc pp jets 7000 500 - herwig7 7.1.4 softTune 100000 21" echo "runspec=boinc ppbar mb-inelastic 546 - - pythia6 6.428 360 100000 21" echo "runspec=boinc pp jets 7000 25,-,300 - pythia8 8.235 default-noCR 100000 21" echo "runspec=boinc pp mb-inelastic 900 - - pythia6 6.428 377 100000 21" I think in the vdi coming with v4.18 this sherpa log is already present and blocking the creation of a new tail log. |
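A minimal sketch for checking the observation above, namely that none of the six running jobs is actually a sherpa job. It splits each runspec line into fields and counts the generators. The field names in `RunSpec` and the helper `parse_runspec` are assumptions made for illustration; the thread itself does not name the columns.

```python
from collections import Counter
from typing import NamedTuple


class RunSpec(NamedTuple):
    # Field names are assumed for illustration; the thread does not define them.
    mode: str
    beam: str
    process: str
    energy: str
    params: str
    specific: str
    generator: str
    version: str
    tune: str
    nevts: int
    seed: int


def parse_runspec(line: str) -> RunSpec:
    """Parse one 'echo "runspec=..."' line into its 11 space-separated fields."""
    _, _, value = line.partition("runspec=")
    fields = value.strip().strip('"').split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    return RunSpec(*fields[:9], int(fields[9]), int(fields[10]))


# The six job descriptions quoted in the post above.
JOBS = [
    'echo "runspec=boinc pp ue 900 - - pythia8 8.235 tune-CUETP8S1 100000 21"',
    'echo "runspec=boinc pp w1j 7000 150 - pythia6 6.428 373 100000 21"',
    'echo "runspec=boinc pp jets 7000 500 - herwig7 7.1.4 softTune 100000 21"',
    'echo "runspec=boinc ppbar mb-inelastic 546 - - pythia6 6.428 360 100000 21"',
    'echo "runspec=boinc pp jets 7000 25,-,300 - pythia8 8.235 default-noCR 100000 21"',
    'echo "runspec=boinc pp mb-inelastic 900 - - pythia6 6.428 377 100000 21"',
]

specs = [parse_runspec(job) for job in JOBS]
generators = Counter(spec.generator for spec in specs)

for name, count in sorted(generators.items()):
    print(f"{name}: {count}")
print("sherpa among running jobs:", "sherpa" in generators)
```

If the sherpa tail message appears while no parsed job uses sherpa, that would be consistent with a stale log shipped in the image rather than output from the current job.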
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,349,881 RAC: 3,224 |
Well CP, I think after all these years that just means you are welcome to check all of my tasks or any hosts I have running here, since I don't have to *hide* them like some members do (but then that was for Laurence to check if he wanted to, since I don't need help from members). As I said, I ran 12 of the version 4.18 tasks on that first day and then it stopped sending me any... until just now, when I got 4 new ones. I have 3 running right now, since they don't take very long, Valid or Invalid. Once they start working semi-dependably I will run some on other computers. |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
Question: Is the science outcome valid? Was the job accepted by MCPlots? |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
"I think in the vdi coming with v4.18 this sherpa log is already present and blocking the creation of a new tail log." I think you might be right and there is an old log file present. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
"Question: Is the science outcome valid?" Hard to check, but all 4 jobs from my Linux VM are there. Of 66 valid Windows tasks, only 1 is in MCPlots. |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
I think there is a problem: because an old result exists in the image, the task is considered successful but MCPlots rejects it as a duplicate. Will put the Windows version on hold until I can fix it. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
"Will put the Windows version on hold until I can fix it." Until the fix, you could replace it with v4.16 (Theory_2019_02_20.vdi.gz) |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
Done |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
For my Windows machine I don't get new tasks. Server Status: 66 tasks Unsent. The log shows: "lhcathome-dev 25 Feb 08:50:08 No tasks are available for Theory Simulation". On Linux I get work. Is there something wrong or is this on purpose? |
Send message Joined: 13 Apr 15 Posts: 138 Credit: 2,969,210 RAC: 0 |
Picked up 7 tasks over 3 hosts yesterday. All except one completed and were validated and credited, with runtimes of 1 - 18 hrs. There was no graphics output through BOINC from any of them and only the Alt-F1 initial setup screen (no F2 events or Top), so there was no way to check how any jobs were progressing other than the CPU usage. No MCPlots for any of them 8~( No debris left in any slots or in VBox 8~) |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
"For my Windows machine I don't get new tasks. Server Status: 66 tasks Unsent." The server needed a restart for it to pick up the old version. |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,349,881 RAC: 3,224 |
Running and d/ling version 4.16 (vbox64_mt_mcore) now |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
After 36 minutes of "Running Container": https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2755279 [ERROR] Container 'runc' failed. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
Another inexplicable error after 1hr12m run time: [ERROR] Container 'runc' failed. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2755360 |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
Next is an error with "Unknown error code": 2019-02-26 18:46:22 (7064): Guest Log: 17:46:21 2019-02-26: cranky: [INFO] Running Container 'runc'. . . 2019-02-26 19:26:44 (7064): Guest Log: 18:26:07 2019-02-26: cranky: [ERROR] Container 'runc' failed. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2755414 |
Send message Joined: 8 Apr 15 Posts: 781 Credit: 12,349,881 RAC: 3,224 |
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2755042 2019-02-25 17:33:03 (1844): Guest Log: 01:33:02 2019-02-26: cranky: [INFO] Running Container 'runc'. 2019-02-25 19:10:58 (1844): Status Report: Job Duration: '64800.000000' 2019-02-25 19:10:58 (1844): Status Report: Elapsed Time: '6000.000000' 2019-02-25 19:10:58 (1844): Status Report: CPU Time: '13076.484375' 2019-02-25 19:34:53 (1844): Guest Log: 03:34:53 2019-02-26: cranky: [ERROR] Container 'runc' failed. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2755451 2019-02-26 19:30:39 (5404): Guest Log: 18:30:38 2019-02-26: cranky: [INFO] Running Container 'runc'. 2019-02-26 20:32:04 (5404): Guest Log: 19:31:09 2019-02-26: cranky: [ERROR] Container 'runc' failed. |
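To compare how long each task ran inside the container before failing, the guest-side timestamps can be subtracted. The following is a rough sketch that assumes the log lines look exactly like the excerpts quoted in this thread; the helper names (`guest_time`, `container_runtime`) are made up for illustration. It uses only the guest timestamps, because the host and guest clocks differ by a fixed offset in these logs.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Guest timestamps appear as "HH:MM:SS YYYY-MM-DD" right after "Guest Log: ".
STAMP = r"(\d{2}:\d{2}:\d{2} \d{4}-\d{2}-\d{2})"
START = re.compile(r"Guest Log: " + STAMP + r": cranky: \[INFO\] Running Container")
FAIL = re.compile(r"Guest Log: " + STAMP + r": cranky: \[ERROR\] Container '\w+' failed")


def guest_time(stamp: str) -> datetime:
    """Convert a guest-side timestamp string into a datetime."""
    return datetime.strptime(stamp, "%H:%M:%S %Y-%m-%d")


def container_runtime(log: str) -> Optional[timedelta]:
    """Return the guest-side time between container start and failure, if both are logged."""
    start = START.search(log)
    fail = FAIL.search(log)
    if start is None or fail is None:
        return None
    return guest_time(fail.group(1)) - guest_time(start.group(1))


# Excerpts copied from the posts for results 2755451 and 2755042.
LOG_2755451 = (
    "2019-02-26 19:30:39 (5404): Guest Log: 18:30:38 2019-02-26: cranky: [INFO] Running Container 'runc'. "
    "2019-02-26 20:32:04 (5404): Guest Log: 19:31:09 2019-02-26: cranky: [ERROR] Container 'runc' failed."
)
LOG_2755042 = (
    "2019-02-25 17:33:03 (1844): Guest Log: 01:33:02 2019-02-26: cranky: [INFO] Running Container 'runc'. "
    "2019-02-25 19:10:58 (1844): Status Report: Elapsed Time: '6000.000000' "
    "2019-02-25 19:34:53 (1844): Guest Log: 03:34:53 2019-02-26: cranky: [ERROR] Container 'runc' failed."
)

for name, log in [("2755451", LOG_2755451), ("2755042", LOG_2755042)]:
    print(name, container_runtime(log))
```

Because the failures occur after quite different run times, a table of these durations per job description might help show whether the failure depends on the job or on the image.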
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
In this error task, https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2755480, I saw that the Console display was frozen, but the CPU usage was 115% of 1 core. Jobname = [boinc pp softqcdall 7000 - - pythia6 6.426 360 100000 22] 2019-02-27 07:57:32 (3592): Status Report: CPU Time: '7143.660192' 2019-02-27 08:42:12 (3592): Guest Log: 07:41:20 2019-02-27: cranky: [ERROR] Container 'runc' failed. 2019-02-27 08:42:12 (3592): VM Completion File Detected. |
Send message Joined: 20 Mar 15 Posts: 243 Credit: 886,442 RAC: 0 |
I've been unable to get any of these tasks to start on a couple of Win7 hosts: https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=246 and https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=245 The failures have been slightly different as Laurence has made changes (4.16 to 4.18 etc.) at the server end and I've made changes (memory, task priorities, Vbox 5.1.x to 5.2.x) to the hosts. At present (v4.16) the VM setup fails at the message "Started update UTMP about system Runlevel changes". At this point the RDP connection to the VM is lost. The wrapper runs on for a minute or so before failing. The vbox log shows various file read failures from the slot with "can't find path" errors. The slot contents seem OK, although this is after the failure. The VMs are left as orphans afterwards. Testing is very difficult since both hosts are now down to 1 task per day, although they normally get two at once. I'm sure I'm missing something obvious but have run out of ideas. I have a host shut down at the moment with two failures unreported, so the logs should still be there. |