Message boards : Theory Application : The Theory Application

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1043
Credit: 283,440
RAC: 0
Message 2663 - Posted: 11 Apr 2016, 21:31:47 UTC - in response to Message 2660.  

The test job may not be a good one. Tomorrow hopefully more jobs will be submitted.
ID: 2663 · Rating: 0
PDW
Joined: 20 May 15
Posts: 217
Credit: 2,228,948
RAC: 0
Message 2664 - Posted: 11 Apr 2016, 21:35:27 UTC - in response to Message 2663.  

The machine that couldn't get an LHCb task earlier (when Rasputin broke it) has just received one.
ID: 2664 · Rating: 0
Ray Murray
Joined: 13 Apr 15
Posts: 131
Credit: 2,325,273
RAC: 905
Message 2665 - Posted: 11 Apr 2016, 21:39:26 UTC
Last modified: 11 Apr 2016, 21:42:26 UTC

Three of the new Theory 0.01 tasks so far: two errored out after about 7 minutes, and the other is waiting for some Sixtrack WUs to clear. The VM starts up but doesn't get all the way in. I took a screengrab of the console window at the point of error:

[Screenshot of the VM console at the point of error; image not preserved.]
Hope it's helpful. I'll set No New Tasks overnight rather than continually trashing tasks.
ID: 2665 · Rating: 0
PDW
Joined: 20 May 15
Posts: 217
Credit: 2,228,948
RAC: 0
Message 2666 - Posted: 11 Apr 2016, 21:44:08 UTC - in response to Message 2665.  

Having watched mine for longer, I can see they are looping rather than crashing out like Ray's. I terminated one via a shutdown file after 37 minutes.
ID: 2666 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1136
Credit: 715,455
RAC: 152
Message 2668 - Posted: 12 Apr 2016, 7:04:06 UTC
Last modified: 12 Apr 2016, 7:13:30 UTC

The same error as Ray.
The VM halts by itself, not via the shutdown file, so the task ends up as a computation error from vboxwrapper's point of view.
To refresh memories: the same happened on vLHCathome, but only on Windows boxes.

Wait! Just found a Linux box with the same premature VM-shutdown -> http://lhcathomedev.cern.ch/vLHCathome-dev/results.php?hostid=1002

ID: 2668 · Rating: 0
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1043
Credit: 283,440
RAC: 0
Message 2669 - Posted: 12 Apr 2016, 7:59:34 UTC - in response to Message 2668.  

We are out of jobs but, as you pointed out, the shutdown method is not correct.
ID: 2669 · Rating: 0
DoctorNow
Joined: 12 Feb 15
Posts: 2
Credit: 78,669
RAC: 0
Message 2671 - Posted: 12 Apr 2016, 8:21:23 UTC

I don't know if this is related to what you guys are talking about, but I just had several tasks error out on my machine. The logs all point to "(unknown error) - exit code 194 (0xc2)" (example-wu), and most also finished after around 7 minutes.
Unfortunately they produced too many errored WUs, and now my daily quota has gone down to 2, hmpf. :-\
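For anyone comparing logs: the two numbers in that error string are the same value, once in decimal and once in hexadecimal. A minimal Python sketch to pull both out and check them; the regular expression and the helper name `parse_exit_code` are illustrative assumptions, not part of BOINC or vboxwrapper:

```python
import re

# Matches text of the form "exit code 194 (0xc2)" as quoted in the post.
# The pattern and the helper name are hypothetical, for illustration only.
EXIT_CODE_RE = re.compile(r"exit code (\d+) \((0x[0-9a-fA-F]+)\)")


def parse_exit_code(message):
    """Return (decimal_value, hex_value) parsed from an 'exit code N (0xHH)' string."""
    match = EXIT_CODE_RE.search(message)
    if match is None:
        raise ValueError("no exit code found in message")
    decimal_value = int(match.group(1))
    hex_value = int(match.group(2), 16)  # base 16 accepts the "0x" prefix
    return decimal_value, hex_value


decimal_value, hex_value = parse_exit_code("(unknown error) - exit code 194 (0xc2)")
print(decimal_value == hex_value)  # True: 194 and 0xc2 are the same number
```

This only confirms the two notations agree; it says nothing about what the code means on the BOINC side.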
ID: 2671 · Rating: 0
Ben Segal
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 12 Sep 14
Posts: 65
Credit: 544
RAC: 0
Message 2672 - Posted: 12 Apr 2016, 8:25:33 UTC
Last modified: 12 Apr 2016, 8:27:18 UTC

To Theory-dev testers:

This is very early days so please bear with us. Here is some information:

1. The VM is 64-bit (yes!)

2. The job feeding system is now Condor-based (like CMS) instead of CoPilot.
(This means that job initiation, suspend/resume timeouts, and other things may be less robust for now than you are used to with CoPilot.)

3. The VM screens and Web logs are being worked on, so don't complain about them yet.

The idea is to standardise the whole series of CERN VM-based apps as much as possible…

Ben, Laurence, Leonardo and team
ID: 2672 · Rating: 0
Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2673 - Posted: 12 Apr 2016, 8:41:33 UTC - in response to Message 2672.  
Last modified: 12 Apr 2016, 8:42:25 UTC

Would you please inform us when an updated version is fed into the system?
That way, we would not unnecessarily waste time on tasks that will not work.
ID: 2673 · Rating: 0
Ben Segal
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 12 Sep 14
Posts: 65
Credit: 544
RAC: 0
Message 2674 - Posted: 12 Apr 2016, 8:47:46 UTC - in response to Message 2673.  
Last modified: 12 Apr 2016, 9:19:56 UTC

Would you please inform us when an updated version is fed into the system?
That way, we would not unnecessarily waste time on tasks that will not work.

OK, will do! By the way, the current test jobs are identical and fail after 1-2 minutes. This is intentional to test the system setup and recovery features.

Thanks for all your help!
ID: 2674 · Rating: 0
Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2675 - Posted: 12 Apr 2016, 9:33:47 UTC - in response to Message 2674.  

This is intentional to test the system setup and recovery features.


That is the kind of thing we need to know.
Thanks, Ben.
ID: 2675 · Rating: 0
PDW
Joined: 20 May 15
Posts: 217
Credit: 2,228,948
RAC: 0
Message 2676 - Posted: 12 Apr 2016, 9:35:27 UTC - in response to Message 2674.  

OK, will do! By the way, the current test jobs are identical and fail after 1-2 minutes. This is intentional to test the system setup and recovery features.

Thanks for all your help!

It takes 4 minutes to loop round on my i3 laptop.
Should I leave it running if it is doing something useful at your end?
Or just terminate it and wait for a new version?
ID: 2676 · Rating: 0
Sergey Kovalchuk
Joined: 11 Mar 16
Posts: 23
Credit: 68,680
RAC: 0
Message 2677 - Posted: 12 Apr 2016, 9:35:29 UTC - in response to Message 2674.  
Last modified: 12 Apr 2016, 9:40:52 UTC

The last WU ran for 1 h with 45 min of CPU time.
It was an endless loop of Sherpa initialisation with exit code 2.

http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=137729
ID: 2677 · Rating: 0
PDW
Joined: 20 May 15
Posts: 217
Credit: 2,228,948
RAC: 0
Message 2678 - Posted: 12 Apr 2016, 10:20:30 UTC - in response to Message 2676.  

OK, will do! By the way, the current test jobs are identical and fail after 1-2 minutes. This is intentional to test the system setup and recovery features.

Thanks for all your help!

It takes 4 minutes to loop round on my i3 laptop.
Should I leave it running if it is doing something useful at your end?
Or just terminate it and wait for a new version?

It terminated itself after an hour because...

2016-04-12 10:14:14 (15928): Guest Log: [INFO] Theory application starting. Check log files.
2016-04-12 11:16:48 (15928): Guest Log: [ERROR] App is not supported. Shutting down!
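
The "after an hour" figure can be read straight off the two timestamps. A small Python sketch that computes the gap, assuming (as in the two lines above) that every log line starts with a 19-character `YYYY-MM-DD HH:MM:SS` stamp; the helper name `timestamp` is just for illustration:

```python
from datetime import datetime

# The two guest-log lines quoted in the post above.
LOG_LINES = [
    "2016-04-12 10:14:14 (15928): Guest Log: [INFO] Theory application starting. Check log files.",
    "2016-04-12 11:16:48 (15928): Guest Log: [ERROR] App is not supported. Shutting down!",
]


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' stamp (first 19 characters) of a log line."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


elapsed = timestamp(LOG_LINES[1]) - timestamp(LOG_LINES[0])
print(elapsed)  # 1:02:34
```

So the VM ran for a little over 62 minutes between the start message and the shutdown message.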
ID: 2678 · Rating: 0
Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2679 - Posted: 12 Apr 2016, 11:40:57 UTC

ID: 2679 · Rating: 0
PDW
Joined: 20 May 15
Posts: 217
Credit: 2,228,948
RAC: 0
Message 2680 - Posted: 12 Apr 2016, 11:42:43 UTC - in response to Message 2679.  

Tasks are actually validating.
http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=137802

All of my Theory tasks have validated :-)
ID: 2680 · Rating: 0
Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 2681 - Posted: 12 Apr 2016, 13:29:37 UTC - in response to Message 2680.  

My very first ones errored out.
ID: 2681 · Rating: 0
Sergey Kovalchuk
Joined: 11 Mar 16
Posts: 23
Credit: 68,680
RAC: 0
Message 2682 - Posted: 12 Apr 2016, 13:36:45 UTC - in response to Message 2680.  

Tasks are actually validating.
http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=137802

All of my Theory tasks have validated :-)


Mine too, but only the first valid task did real work (CPU time = 1 h). The other tasks ran for only 7 minutes.


It may be worth adding a "Theory Jobs" item to the main menu.
ID: 2682 · Rating: 0
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1043
Credit: 283,440
RAC: 0
Message 2683 - Posted: 12 Apr 2016, 13:50:57 UTC - in response to Message 2682.  

Please consider this application broken until stated otherwise. It is being worked on internally and should improve over the next few days.
ID: 2683 · Rating: 0
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1136
Credit: 715,455
RAC: 152
Message 2686 - Posted: 12 Apr 2016, 14:55:54 UTC

The VM shutting itself down seems to be solved. Very good!

The Sherpa jobs don't run well, but my first Pythia6 and Pythia8 jobs do.

===> [runRivet] Tue Apr 12 16:31:15 CEST 2016 [boinc ppbar uemb-hard 1800 - - pythia6 6.428 391 100000 188]

===> [runRivet] Tue Apr 12 16:52:55 CEST 2016 [boinc ppbar uemb-hard 1800 15 - pythia8 8.186 tune-4c 100000 188]
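
For comparing jobs across hosts it can help to split those bracketed parameter lists into named fields. A rough Python sketch follows. The field names in `FIELDS` are my guess at the argument order and are not stated anywhere in this thread, and `parse_run_line` is a hypothetical helper; only the two sample lines come from the logs above.

```python
import re

# ASSUMPTION: these labels are a guess at what each position means;
# the thread itself only shows the raw values.
FIELDS = ["mode", "beam", "process", "energy", "params", "specific",
          "generator", "version", "tune", "nevts", "seed"]

# Captures the contents of the last [...] group on a "[runRivet]" line.
LINE_RE = re.compile(r"\[runRivet\].*\[(.+)\]")


def parse_run_line(line):
    """Split the bracketed parameter list of a runRivet log line into a dict."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("not a runRivet line")
    values = match.group(1).split()
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of parameters: %d" % len(values))
    return dict(zip(FIELDS, values))


SAMPLES = [
    "===> [runRivet] Tue Apr 12 16:31:15 CEST 2016 [boinc ppbar uemb-hard 1800 - - pythia6 6.428 391 100000 188]",
    "===> [runRivet] Tue Apr 12 16:52:55 CEST 2016 [boinc ppbar uemb-hard 1800 15 - pythia8 8.186 tune-4c 100000 188]",
]

for sample in SAMPLES:
    job = parse_run_line(sample)
    print(job["generator"], job["version"])
```

With a dict per job it is easy to filter by generator, for example to list only the jobs that are not Sherpa.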
ID: 2686 · Rating: 0
©2022 CERN