New Version (v3.14)
Author | Message |
---|---|
Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
This new version contains a CVMFS configuration fix. |
Joined: 13 Apr 15 Posts: 138 Credit: 2,969,210 RAC: 0 |
I let a 3.14 task run long enough to start a job, then did the checkpoint edit to see how it would react. I got the same -240 file transfer error as before, so I don't know how these will do if they are left to run to term. The oldest one is 4 hours in and running its 3rd job. The others are around the 1-hour mark, running their 1st or 2nd jobs. So start-up is OK and the intermediate job-completion uploads are OK, but there is still some doubt about TASK completion. |
Joined: 8 Apr 15 Posts: 781 Credit: 12,406,321 RAC: 6,369 |
Well, it will be a while before 3.14 finishes this download and I can run the first one. With the 3.13 version the first 2 failed, as you know, but the next 3 are still running after 9 hours, so I might as well let them continue since I can't run the new one for many hours (15.50% after 2 hours so far). |
Joined: 8 Apr 15 Posts: 781 Credit: 12,406,321 RAC: 6,369 |
I only have one task of this version running, since the server is telling me there are no more Theory tasks (although the server status says there are 136, and I know nobody else is running these). The only finished ones I see were done by Ray, and as before they all finished and ended with an upload failure. So I will just run this one, since I am sure the same thing will happen. (I sure hope this doesn't happen with the new ones at LHC, since I have 8 computers downloading that update.) |
Joined: 13 Feb 15 Posts: 1188 Credit: 862,257 RAC: 70 |
Quote: "The only finished ones I see are done by Ray and as before they all finished and ended with upload failure:" With manual intervention I managed to finish 2 tasks successfully here: 3.13 -> https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789555 3.14 -> https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789413 and I forced 1 to end early for LHC@home to check that a task is validated OK: https://lhcathome.cern.ch/lhcathome/result.php?resultid=236588542 |
Joined: 13 Apr 15 Posts: 138 Credit: 2,969,210 RAC: 0 |
Hi CP, could you describe your "manual intervention"? I have tried editing the elapsed time in the checkpoint, but that method doesn't work with these tasks. I see "completion file detected" in the log of those tasks. What should I create and where should I put it? |
Joined: 13 Feb 15 Posts: 1188 Credit: 862,257 RAC: 70 |
Quote: "Hi CP" You should create a result file in the project folder. The content is not important (the file can even be empty), but the name is. You will find the required file name in client_state.xml; it looks something like Theory_2279-779662-75_1_r765374446_result. Create the file before the task ends (whether by the machine or gracefully). |
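As a rough illustration of the lookup described above, the following Python sketch scans the text of client_state.xml for names shaped like the example. The regular expression is an assumption inferred only from the file names quoted in this thread, and the Windows path in the demo section is the usual default BOINC data directory, which may differ on your machine.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred only from the examples quoted in this thread.
RESULT_NAME = re.compile(r"Theory_\d+-\d+-\d+_\d+_r\d+_result")


def find_result_names(state_text: str) -> list[str]:
    """Return unique Theory result file names, in order of first appearance."""
    seen: list[str] = []
    for name in RESULT_NAME.findall(state_text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Assumed default location on Windows; adjust for your installation.
    state_file = Path(r"C:\ProgramData\BOINC\client_state.xml")
    if state_file.exists():
        for name in find_result_names(state_file.read_text(errors="replace")):
            print(name)
```

The sketch deliberately treats the file as plain text instead of parsing the XML, so it does not depend on the exact element layout of client_state.xml.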
Joined: 13 Apr 15 Posts: 138 Credit: 2,969,210 RAC: 0 |
I thought I'd followed your instructions, but either I didn't create the file correctly or I didn't put it in the right place, as I still got the same error. I can't get any more of those tasks to experiment on, so hopefully that means the adjustments being tested have fixed the credentials problem and my 6 days of uncredited CPU time were useful. In fact, I can't get ANY Theory tasks here just now. |
Joined: 13 Feb 15 Posts: 1188 Credit: 862,257 RAC: 70 |
Quote: "I thought I'd followed your instructions but either I didn't create the file correctly or I didn't put it in the right place as I still got the same error." The files needed on your 2 machines for those last 2 tasks should have been Theory_2279-794574-75_1_r450481893_result and Theory_2279-797643-75_1_r530068321_result, without an extension, placed (if you have the default BOINC installation) in the folder C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev |
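The workaround described above amounts to creating an empty, extension-less file with the exact result name in the project folder. A minimal Python sketch of that step follows; the helper name `create_placeholder` is hypothetical, and the folder shown in the usage comment is the default Windows location quoted in the post.

```python
from pathlib import Path


def create_placeholder(project_dir: Path, result_name: str) -> Path:
    """Create an empty file named result_name (no extension) in project_dir."""
    # Refuse anything that is not a bare file name, e.g. "sub/dir/name".
    if Path(result_name).name != result_name:
        raise ValueError("result_name must be a bare file name")
    target = project_dir / result_name
    # Creates the file if missing; an existing file only gets a new timestamp.
    target.touch(exist_ok=True)
    return target


# Example call (path taken from the post above, default BOINC installation):
# create_placeholder(
#     Path(r"C:\ProgramData\BOINC\projects\lhcathomedev.cern.ch_lhcathome-dev"),
#     "Theory_2279-794574-75_1_r450481893_result",
# )
```

As the posts note, the file has to exist before the task ends, so a script like this would need to be run while the task is still in progress.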
Joined: 17 Mar 15 Posts: 51 Credit: 602,329 RAC: 0 |
Hi, I got 2 Theory 3.14 tasks that failed after 16 to 17 hours of calculation on my iMac :( https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789488 https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789371 </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>Theory_2279-783192-75_0_r1937531163_result</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
Joined: 8 Apr 15 Posts: 781 Credit: 12,406,321 RAC: 6,369 |
Just got home, and this is always the first PC I check. I saw that the one 3.14 task I had running was done and turned in, so I went to see if it would be as expected... and sure enough: Run time: 14 hours 57 min, CPU time: 1 day 4 hours 4 min 19 sec </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>Theory_2279-804477-75_1_r757929331_result</file_name> <error_code>-240 (stat() failed)</error_code> </file_xfer_error> </message> ]]> Of course we can't have them doing that over at LHC, since the public isn't going to want to watch them and turn them in like back in the T4T days 7 years ago. https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2789571 I do see that two of the single-core Theory Simulation v263.95 tasks have finished and are Valid on my old 3-core Phenom with Win 10, but the other PCs have not finished any yet (Win 7 and Win 10 machines running the 2-core and 4-core versions of v263.95). |
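For comparing failures like the -161 and -240 cases quoted in this thread, a small Python sketch can pull the file name, numeric code, and reason out of a pasted task log. The function name `parse_xfer_errors` is hypothetical, and the pattern assumes the fragment layout shown in the posts above; other log layouts may not match.

```python
import re

# Matches the <file_name> / <error_code> pair as it appears in the quoted logs.
# The reason is matched lazily because it may itself contain parentheses,
# as in "stat() failed".
XFER_ERROR = re.compile(
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)\s*\((?P<reason>.*?)\)</error_code>"
)


def parse_xfer_errors(log_text: str) -> list[tuple[str, int, str]]:
    """Return (file name, error code, reason) for each upload failure found."""
    return [
        (m["name"], int(m["code"]), m["reason"])
        for m in XFER_ERROR.finditer(log_text)
    ]
```

Running it over several saved logs would make it easy to see at a glance which tasks ended with which of the two error codes.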
©2024 CERN