Message boards : Theory Application : New Version (v3.14)
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 6410 - Posted: 2 Jul 2019, 13:04:05 UTC

This new version contains a CVMFS configuration fix.
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,969,210
RAC: 0
Message 6414 - Posted: 2 Jul 2019, 19:14:46 UTC - in response to Message 6410.  

I let a 3.14 task run long enough to start a job, then did the checkpoint edit to see how it would react. I got the same -240 file transfer error as before, so I don't know how these will do if they are left to run to term.
The oldest one is 4 hours in and running its 3rd job. The others are around the 1-hour mark, running their 1st or 2nd jobs. So start-up is OK and the intermediate job-completion uploads are OK, but there is still doubt as to TASK completion.
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,324,905
RAC: 1,506
Message 6415 - Posted: 2 Jul 2019, 19:41:42 UTC - in response to Message 6414.  

Well, it will be a while before 3.14 finishes this download and I can run the first one.

With version 3.13 the first 2 failed, as you know, but the next 3 are still running after 9 hours. I might as well let them continue, since I can't run this new one for many hours (15.50% after 2 hours so far).
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,324,905
RAC: 1,506
Message 6416 - Posted: 3 Jul 2019, 10:01:31 UTC

I only have one task of this version running, since the server is telling me there are no more Theory tasks (but the server status says there are 136, and I know nobody else is running these).

The only finished ones I see are done by Ray and as before they all finished and ended with upload failure:

So I will just run this one, since I am sure the same thing will happen.

(I sure hope this doesn't happen with the new ones at LHC, since I have 8 computers downloading that update.)
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 6417 - Posted: 3 Jul 2019, 11:50:05 UTC - in response to Message 6416.  

> The only finished ones I see are done by Ray and as before they all finished and ended with upload failure:
> . . . .
> (I sure hope this doesn't happen with the new ones at LHC since I have 8 computers d/ling that update)

With manual intervention I managed to finish 2 tasks with success here:
3.13 -> https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789555
3.14 -> https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789413

I also forced 1 to finish early on LHC@home to check that a task validates OK: https://lhcathome.cern.ch/lhcathome/result.php?resultid=236588542
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,969,210
RAC: 0
Message 6418 - Posted: 3 Jul 2019, 12:19:50 UTC - in response to Message 6417.  
Last modified: 3 Jul 2019, 12:26:34 UTC

Hi CP
Could you describe your "manual intervention"? I have tried editing the elapsed time in the checkpoint file, but that method doesn't work with these. I see "completion file detected" in the log of those tasks. What should I create and where should I put it?
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 6419 - Posted: 3 Jul 2019, 14:07:02 UTC - in response to Message 6418.  
Last modified: 3 Jul 2019, 14:09:57 UTC

> Hi CP
> Could you describe your "manual intervention". I have tried the edit of elapsed time in checkpoint but that method doesn't work with these. I see "completion file detected" in the log of those tasks. What should I create and where should I put it?

You should create a result file in the project folder. The content is not important (it could even be empty), but the name is.
You will find the required file name in client_state.xml. It looks something like Theory_2279-779662-75_1_r765374446_result
Do this before the task ends (by machine or gracefully).
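
As a rough sketch of the first step (finding the file name), the snippet below scans the text of client_state.xml for names shaped like the example above. The regular expression and the helper name find_result_names are assumptions made for illustration only; they are modelled on that single example name, so check any match by eye before relying on it.

```python
import re

# Assumed pattern, modelled only on the example name quoted above
# (Theory_2279-779662-75_1_r765374446_result); adjust it if your names differ.
RESULT_NAME = re.compile(r"Theory_[0-9A-Za-z-]+_\d+_r\d+_result")


def find_result_names(client_state_text):
    """Return the distinct result-file names found in the text, in order."""
    seen = []
    for name in RESULT_NAME.findall(client_state_text):
        if name not in seen:
            seen.append(name)
    return seen


# Tiny stand-in for the real file contents (hypothetical snippet).
sample = "<name>Theory_2279-779662-75_1_r765374446_result</name>"
print(find_result_names(sample))
```

To use it on a real installation, read client_state.xml into a string first and pass that string to find_result_names.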
Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,969,210
RAC: 0
Message 6420 - Posted: 3 Jul 2019, 18:33:00 UTC - in response to Message 6419.  

I thought I'd followed your instructions, but either I didn't create the file correctly or I didn't put it in the right place, as I still got the same error.
I can't get any more of those tasks to experiment on, so hopefully that means the adjustments tested have fixed the credentials problem and my 6 days of uncredited CPU time were useful.
In fact, I can't get ANY Theory tasks here just now.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 2
Message 6421 - Posted: 3 Jul 2019, 19:34:41 UTC - in response to Message 6420.  

> I thought I'd followed your instructions but either I didn't create the file correctly or I didn't put it in the right place as I still got the same error.

The files needed on your 2 machines for those last 2 tasks should have been

Theory_2279-794574-75_1_r450481893_result and Theory_2279-797643-75_1_r530068321_result

without an extension, and placed (if you have the default BOINC installation) in the folder C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev
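
For anyone who prefers to script this, here is a minimal sketch of creating such an empty, extension-less file. The helper name create_result_file is made up for this example, and the folder is simply the default Windows location mentioned above; this only illustrates the manual workaround and is not something the project provides.

```python
from pathlib import Path


def create_result_file(project_dir, result_name):
    """Create an empty file called result_name inside project_dir."""
    target = Path(project_dir) / result_name
    # touch() creates an empty file; exist_ok=True leaves an existing one alone.
    target.touch(exist_ok=True)
    return target


# Example call with the default Windows folder given above (not run here):
# create_result_file(
#     r"C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev",
#     "Theory_2279-794574-75_1_r450481893_result",
# )
```

The file has to exist before the task ends, so run this while the task is still in progress.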
[AF>Le_Pommier] Jerome_C2005

Joined: 17 Mar 15
Posts: 51
Credit: 602,329
RAC: 0
Message 6422 - Posted: 3 Jul 2019, 21:29:36 UTC
Last modified: 3 Jul 2019, 21:30:10 UTC

Hi

I got 2 Theory 3.14 tasks that failed after 16/17 hours of calculation on my iMac :(

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789488
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789371

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>Theory_2279-783192-75_0_r1937531163_result</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>
]]>
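
To compare failures across reports like the one above, a short script can pull the file name and error code out of the file_xfer_error block. This is only a sketch: parse_xfer_error is a made-up helper name, and the regular expressions assume the message looks exactly like the excerpt quoted here.

```python
import re


def parse_xfer_error(message):
    """Return (file_name, code, reason) from a <file_xfer_error> block, or None."""
    name = re.search(r"<file_name>(.*?)</file_name>", message)
    # The reason may itself contain parentheses, e.g. "stat() failed".
    code = re.search(r"<error_code>(-?\d+) \((.*?)\)</error_code>", message)
    if not (name and code):
        return None
    return name.group(1), int(code.group(1)), code.group(2)


# Excerpt copied from the task output quoted above.
excerpt = """upload failure: <file_xfer_error>
<file_name>Theory_2279-783192-75_0_r1937531163_result</file_name>
<error_code>-161 (not found)</error_code>
</file_xfer_error>"""

print(parse_xfer_error(excerpt))
```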
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 781
Credit: 12,324,905
RAC: 1,506
Message 6423 - Posted: 4 Jul 2019, 2:23:59 UTC

Just got home, and this is always the first PC I check. I saw that the one 3.14 task I had running was done and turned in, so I went to see if it would be as expected... and sure enough: Run time 14 hours 57 min, CPU time 1 day 4 hours 4 min 19 sec.

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>Theory_2279-804477-75_1_r757929331_result</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>
]]>


Of course we can't have them doing that over at LHC, since the public isn't going to want to watch them and turn them in like back in the T4T days 7 years ago.

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789571

I do see I have two of the single-core version of Theory Simulation v263.95 finished and valid on my old 3-core Phenom with Win 10, but the other PCs have not finished any yet (Win 7 and Win 10 machines running the 2-core and 4-core versions of v263.95).