The Theory Application
Author | Message |
---|---|
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
The test job may not be a good one. Tomorrow hopefully more jobs will be submitted. |
Joined: 20 May 15 Posts: 217 Credit: 6,133,398 RAC: 1,091 |
The machine that couldn't get an LHCb earlier (when Rasputin broke it) has now just received an LHCb task. |
Joined: 13 Apr 15 Posts: 138 Credit: 2,969,210 RAC: 1 |
Three of the new Theory 0.01 tasks so far: two errored out after about 7 minutes, and the other is waiting for some SixTrack WUs to clear. The VM starts up but doesn't get all the way in. I took a screengrab of the console window at the point of error; I hope it's helpful. I'll set No New Tasks overnight rather than continually trashing tasks. |
Joined: 20 May 15 Posts: 217 Credit: 6,133,398 RAC: 1,091 |
Having watched mine for longer, I can see they are looping rather than crashing out like Ray's. I terminated one via a shutdown file after 37 minutes. |
Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 56 |
The same error as Ray's. The VM halts by itself, not via the shutdown file, so the task ends up as a computation error from vboxwrapper's point of view. To refresh memories: the same happened on vLHCathome, but only on Windows boxes. Wait! I just found a Linux box with the same premature VM shutdown -> http://lhcathomedev.cern.ch/vLHCathome-dev/results.php?hostid=1002 |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
We are out of jobs but, as you pointed out, the shutdown method is not correct. |
Joined: 12 Feb 15 Posts: 2 Credit: 78,775 RAC: 0 |
I don't know if this is related to what you guys are talking about, but I just encountered several tasks that errored out on my machine. The logs all point to "(unknown error) - exit code 194 (0xc2)" (see example-wu), and the tasks mostly ended after around 7 minutes. Unfortunately this produced too many errored WUs, and now my daily quota has gone down to 2, hmpf. :-\ |
Joined: 12 Sep 14 Posts: 65 Credit: 544 RAC: 0 |
To Theory-dev testers: This is very early days, so please bear with us. Here is some information: 1. The VM is 64-bit (yes!) 2. The job feeding system is now Condor-based (like CMS) instead of CoPilot. (This means that job initiation, suspend/resume timeouts, and other things may be less robust for now than you are used to with CoPilot.) 3. The VM screens and Web logs are being worked on, so don't complain about them yet. The idea is to standardise the whole series of CERN VM-based apps as much as possible… Ben, Laurence, Leonardo and team |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Would you please inform us when an updated version is fed into the system? That way, we would not unnecessarily waste time on tasks that will not work. |
Joined: 12 Sep 14 Posts: 65 Credit: 544 RAC: 0 |
"Would you please inform us, when an updated version is fed into the system?" OK, will do! By the way, the current test jobs are identical and fail after 1-2 minutes. This is intentional to test the system setup and recovery features. Thanks for all your help! |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
"This is intentional to test the system setup and recovery features." That is the kind of thing we need to know. Thanks, Ben. |
Joined: 20 May 15 Posts: 217 Credit: 6,133,398 RAC: 1,091 |
"OK, will do! By the way, the current test jobs are identical and fail after 1-2 minutes. This is intentional to test the system setup and recovery features." It takes 4 minutes to loop round on my i3 laptop. Should I leave it running if it is doing something useful for you at your end? Or should I just terminate it and wait for a new version? |
Joined: 11 Mar 16 Posts: 23 Credit: 68,680 RAC: 0 |
My last WU ran for 1 hour with 45 minutes of CPU time: an endless loop of Sherpa initialisation with exit code 2. http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=137729 |
Joined: 20 May 15 Posts: 217 Credit: 6,133,398 RAC: 1,091 |
"OK, will do! By the way, the current test jobs are identical and fail after 1-2 minutes. This is intentional to test the system setup and recovery features." It terminated itself after an hour because... 2016-04-12 10:14:14 (15928): Guest Log: [INFO] Theory application starting. Check log files. 2016-04-12 11:16:48 (15928): Guest Log: [ERROR] App is not supported. Shutting down! |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Tasks are actually validating. http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=137802 |
Joined: 20 May 15 Posts: 217 Credit: 6,133,398 RAC: 1,091 |
"Tasks are actually validating." All of my Theory tasks have validated :-) |
Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
My very first ones errored out. |
Joined: 11 Mar 16 Posts: 23 Credit: 68,680 RAC: 0 |
"Tasks are actually validating." Mine too, but only the first valid task did real work (CPU time = 1 hour); the other tasks ran for only 7 minutes. It may be worth adding a "Theory Jobs" item to the main menu. |
Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
Please consider this application broken until stated otherwise. It is being worked on internally and should improve over the next few days. |
Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 56 |
The problem of the VM shutting itself down seems to be solved. Very good! The Sherpa jobs don't run well, but my first Pythia6 and Pythia8 jobs do. ===> [runRivet] Tue Apr 12 16:31:15 CEST 2016 [boinc ppbar uemb-hard 1800 - - pythia6 6.428 391 100000 188] ===> [runRivet] Tue Apr 12 16:52:55 CEST 2016 [boinc ppbar uemb-hard 1800 15 - pythia8 8.186 tune-4c 100000 188] |
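
For anyone who wants to tally which generators their tasks ran, the `[runRivet]` lines quoted in the last post can be picked apart with a few lines of Python. This is only a rough sketch: `parse_runrivet` is a hypothetical helper name, and the field positions (generator at index 6, version at index 7) are guessed from the two lines above, not taken from any runRivet documentation.

```python
import re

# Matches "===> [runRivet] <timestamp> [<space-separated arguments>]".
LINE_RE = re.compile(r"===> \[runRivet\] (?P<stamp>.+?) \[(?P<args>[^\]]*)\]")

# The two log lines quoted in the post above, as they appear in the job log.
SAMPLE = (
    "===> [runRivet] Tue Apr 12 16:31:15 CEST 2016 "
    "[boinc ppbar uemb-hard 1800 - - pythia6 6.428 391 100000 188] "
    "===> [runRivet] Tue Apr 12 16:52:55 CEST 2016 "
    "[boinc ppbar uemb-hard 1800 15 - pythia8 8.186 tune-4c 100000 188]"
)


def parse_runrivet(text):
    """Return one dict per runRivet line found in text."""
    jobs = []
    for match in LINE_RE.finditer(text):
        args = match.group("args").split()
        jobs.append({
            "stamp": match.group("stamp"),
            "args": args,
            # Assumption: positions inferred from the sample lines only.
            "generator": args[6] if len(args) > 6 else None,
            "version": args[7] if len(args) > 7 else None,
        })
    return jobs


for job in parse_runrivet(SAMPLE):
    print(job["stamp"], job["generator"], job["version"])
```

If the argument layout turns out to differ for other generators such as Sherpa, the raw `args` list is still there to inspect.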
©2024 CERN