New Experimental ATLAS Application
Author | Message |
---|---|
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
Console 4 (stdout) should now report when a job starts and stops. There should be two gfal-copy calls per job: one for the input and one for the output. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
I have now tested it for the third time. Console F4 shows "Bandwidth 117683" during the upload process. Upload for the vboxheadless process: 128MB. The upload ran at full speed for 18min (checked with task monitor). IT IS UPLOADING 120-150MB! You can say whatever you want, but it is happening. |
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
If we are observing something different, we should dig a little deeper. The stdout and stderr Web logs (show graphics) are now appended to for each job, so they should contain all the information. If you search for gfal-copy in stderr, you should see the file being copied. Go to the URL below and search for that file. http://data-bridge-test.cern.ch/myfed/atlas-boinc/output/ The size of that file should be displayed and you can download it to verify. |
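For anyone who would rather check the size without downloading the file, here is a minimal sketch. It assumes the server answers HEAD requests and reports a Content-Length header, which the thread does not confirm; `remote_size` and `OUTPUT_BASE` are names made up for this example.

```python
import urllib.request

# Base URL quoted in the post above; the file name must come from your own stderr log.
OUTPUT_BASE = "http://data-bridge-test.cern.ch/myfed/atlas-boinc/output/"


def remote_size(url, opener=urllib.request.urlopen):
    """Return the size in bytes the server reports for url, or None if it reports none.

    Assumption: the server supports HEAD and sends a Content-Length header.
    """
    request = urllib.request.Request(url, method="HEAD")
    with opener(request) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None
```

Calling `remote_size(OUTPUT_BASE + name)` with the file name taken from the gfal-copy line in stderr would then give a number to compare against what the task monitor shows.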
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
From stdout.log: Copying 19992536 bytes file:////var/lib/condor/execute/dir_3618/result.tar.gz => https://data-bridge-test.cern.ch/myfed/atlas-boinc/output/3727707_ATLAS_result Bandwidth: 1134948 |
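To pull the size out of a line like this automatically, a small sketch follows. The regular expression is an assumption based only on the single line quoted above, not on any documented gfal-copy output format, and `parse_copy_line` is a made-up helper name.

```python
import re

# The line quoted from stdout.log above.
LINE = (
    "Copying 19992536 bytes "
    "file:////var/lib/condor/execute/dir_3618/result.tar.gz => "
    "https://data-bridge-test.cern.ch/myfed/atlas-boinc/output/3727707_ATLAS_result"
)

# Assumed pattern: "Copying <bytes> bytes <source> => <destination>".
COPY_RE = re.compile(r"Copying (\d+) bytes (\S+) => (\S+)")


def parse_copy_line(line):
    """Return (size_in_bytes, source, destination), or None if the line does not match."""
    match = COPY_RE.search(line)
    if match is None:
        return None
    size, source, destination = match.groups()
    return int(size), source, destination


size, source, destination = parse_copy_line(LINE)
print(f"{size / 1e6:.1f} MB to {destination}")
```

For the quoted line this gives a result file of about 20 MB (decimal megabytes).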
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Thanks, I will. One indication may be that the upload continued for a long time AFTER the "Bandwidth xxxxxx" display showed on Console F4. Once the upload completed, it showed "Starting new task". |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 859,751 RAC: 36 |
Maybe you saw the upload and, directly after it, the download for the new job. As far as I have noticed, the downloads are 53.3MB per job. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Thanks for the tip. It was JUST the upload. Task manager showed max upload on the graph (1Mbit/s) for 18min. |
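As a rough cross-check of these figures, the sketch below converts a sustained upload rate and duration into a transferred volume. It assumes the link really was saturated at 1 Mbit/s for the full 18 minutes and uses decimal units (1 Mbit = 1,000,000 bits, 1 MB = 1,000,000 bytes); `megabytes_sent` is a made-up helper name.

```python
def megabytes_sent(rate_mbit_per_s, minutes):
    """Volume in decimal MB sent at a constant rate (Mbit/s) over a duration (minutes)."""
    bits = rate_mbit_per_s * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000


print(megabytes_sent(1, 18))  # prints 135.0
```

About 135 MB falls inside the 120-150MB range reported earlier in the thread, and is far more than the roughly 20 MB result file seen in stdout.log.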
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
CP is right. Search for the curl command to see the input file and look for it here. http://data-bridge-test.cern.ch/myfed/atlas-boinc/input/ So with just the input and output file there is ~75MB per job. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Task ending after 7 min. Out of jobs again? |
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
More jobs submitted. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Thanks, Laurence. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Now I can see whether a job passed or failed. EDIT: Sorry, wrong thread. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
I did some more testing. When the job finished (after about 4h50min), it started uploading. The first 25 or so MB were transmitted to cephrgw10.cern.ch. Then it continued transmitting to alicondorce01.cern.ch for another 110MB. The main IP addresses were 188.184.129.127:9618 and 188.184.187.167:9618. Why, and what is it transmitting? This is way too specific to be an accident. EDIT: the job terminated shortly after the end of the upload. http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=159615 |
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
That's not good. Shouldn't be doing that. It looks like it is transferring the whole scratch directory back. Will investigate tomorrow. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
"That's not good. Shouldn't be doing that. It looks like it is transferring the whole scratch directory back. Will investigate tomorrow." Any progress? |
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
Sorry, got a little sidetracked with the day job, as Ivan would say :) There was only one running job left in the queue, so I submitted some more. These should not transfer the output back to alicondorxxx. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
"These should not transfer the output back to alicondorxxx." Not working. The upload size is still >100MB, and it still goes to the host above. |
Send message Joined: 12 Sep 14 Posts: 1067 Credit: 334,882 RAC: 0 |
Try again. I hope you don't have a data cap from your ISP! |
Send message Joined: 6 Mar 15 Posts: 19 Credit: 142,109 RAC: 0 |
Estimated duration: ~75 hours. Actual duration: ~3 minutes. Tasks say Running High Priority but show no elapsed time and stay stuck at 0% for at least a few minutes. |
Send message Joined: 11 Mar 16 Posts: 23 Credit: 68,680 RAC: 0 |
Should we expect the ATLAS application to resume, or will the application be tested on your own test server? |
©2024 CERN