Message boards : News : Progress!
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1069
Credit: 334,882
RAC: 0
Message 893 - Posted: 28 Aug 2015, 10:58:33 UTC
Last modified: 28 Aug 2015, 10:58:48 UTC

We are making great progress and are just chasing up the last remaining issues. One of the recent improvements was to create the link to the CMS monitoring infrastructure. This means that we can generate nice plots similar to what ATLAS@home have for their project.

Hendrik
Project developer
Project tester
Joined: 1 Aug 14
Posts: 14
Credit: 884
RAC: 0
Message 894 - Posted: 28 Aug 2015, 11:38:57 UTC

That's great!

I can't wait to see those plots :)
Especially those that compare CMS@Home to the "traditional" CMS resources.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 895 - Posted: 28 Aug 2015, 12:03:16 UTC - in response to Message 893.  

Looks great.
What is the unit on the Y-axis?
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 3
Message 896 - Posted: 28 Aug 2015, 12:14:31 UTC - in response to Message 895.  

Looks like received jobs/hour.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 897 - Posted: 28 Aug 2015, 12:23:45 UTC - in response to Message 896.  

Thanks.
Are these the "jobs"?

Begin processing the 1st record. Run 1, Event 1266, LumiSection 254 at 28-Aug-2015 04:00:09.508 CEST
Begin processing the 2nd record. Run 1, Event 1267, LumiSection 254 at 28-Aug-2015 04:06:00.353 CEST
Begin processing the 3rd record. Run 1, Event 1268, LumiSection 254 at 28-Aug-2015 04:11:30.804 CEST
Begin processing the 4th record. Run 1, Event 1269, LumiSection 254 at 28-Aug-2015 04:17:37.433 CEST
Begin processing the 5th record. Run 1, Event 1270, LumiSection 254 at 28-Aug-2015 04:21:17.386 CEST
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 861,475
RAC: 3
Message 898 - Posted: 28 Aug 2015, 12:29:37 UTC - in response to Message 897.  
Last modified: 28 Aug 2015, 12:42:21 UTC

No.

A job is one run of the cmsRun process.
The files for each job (one cmsRun) are under your http://localhost:?????/logs/run-?/glide_??????/dir_????/
Fill in the right values for the question marks.
Ivan can configure how many records (at CERN they call them events) he wants to fit in one cmsRun.
Last week we had jobs with 200 events (records).
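
To check how many records a cmsRun log excerpt contains, and how far apart they start, the "Begin processing" lines can be parsed directly. What follows is a minimal Python sketch, not a project tool: the line format is inferred only from the five lines quoted earlier in this thread, and the names `parse_records` and `seconds_between_records` are made up for illustration.

```python
import re
from datetime import datetime

# Line format inferred from the log lines quoted in this thread
# (an assumption, not an official cmsRun specification).
RECORD_RE = re.compile(
    r"Begin processing the (\d+)(?:st|nd|rd|th) record\. "
    r"Run (\d+), Event (\d+), LumiSection (\d+) "
    r"at (\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{1,6})"
)

SAMPLE_LOG = """\
Begin processing the 1st record. Run 1, Event 1266, LumiSection 254 at 28-Aug-2015 04:00:09.508 CEST
Begin processing the 2nd record. Run 1, Event 1267, LumiSection 254 at 28-Aug-2015 04:06:00.353 CEST
Begin processing the 3rd record. Run 1, Event 1268, LumiSection 254 at 28-Aug-2015 04:11:30.804 CEST
Begin processing the 4th record. Run 1, Event 1269, LumiSection 254 at 28-Aug-2015 04:17:37.433 CEST
Begin processing the 5th record. Run 1, Event 1270, LumiSection 254 at 28-Aug-2015 04:21:17.386 CEST
"""


def parse_records(log_text):
    """Return one dict per 'Begin processing' line found in log_text."""
    records = []
    for match in RECORD_RE.finditer(log_text):
        index, run, event, lumi, stamp = match.groups()
        records.append({
            "index": int(index),
            "run": int(run),
            "event": int(event),
            "lumi": int(lumi),
            # Assumes English month abbreviations; the time zone label
            # after the timestamp is ignored.
            "time": datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S.%f"),
        })
    return records


def seconds_between_records(records):
    """Seconds elapsed between the starts of consecutive records."""
    return [
        (later["time"] - earlier["time"]).total_seconds()
        for earlier, later in zip(records, records[1:])
    ]


if __name__ == "__main__":
    records = parse_records(SAMPLE_LOG)
    gaps = seconds_between_records(records)
    print(len(records), "records in this log excerpt")
    print("mean seconds between records:", round(sum(gaps) / len(gaps), 1))
```

The excerpt only shows the first five records of one log, so it says nothing about how many records the whole job contained.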

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 899 - Posted: 28 Aug 2015, 12:46:09 UTC - in response to Message 898.  

At the moment I'm using 5 events/job to get quick turnaround for Laurence and Hassen to debug. Expect that to increase tonight. :-) Basically, I'm looking for jobs that run in 30-60 minutes and create 20-30 MB output each.
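
The sizing arithmetic behind a 30-60 minute target can be sketched as follows. This is only an illustration: the figure of roughly 317 seconds per event is the average gap between the "Begin processing" timestamps quoted earlier in this thread, measured on one volunteer's machine, and `events_per_job_range` and `job_minutes` are hypothetical helpers, not part of CRAB or cmsRun.

```python
import math


def events_per_job_range(seconds_per_event, min_minutes=30, max_minutes=60):
    """Fewest and most events per job whose run time fits the window.

    If a single event is very slow, the first value can exceed the
    second, meaning no whole number of events fits the window.
    """
    if seconds_per_event <= 0:
        raise ValueError("seconds_per_event must be positive")
    fewest = math.ceil(min_minutes * 60 / seconds_per_event)
    most = math.floor(max_minutes * 60 / seconds_per_event)
    return fewest, most


def job_minutes(events_per_job, seconds_per_event):
    """Estimated run time of one job, in minutes."""
    return events_per_job * seconds_per_event / 60


if __name__ == "__main__":
    # Assumption: average taken from one host's log excerpt in this thread.
    SECONDS_PER_EVENT = 317
    print("events per job for 30-60 min:", events_per_job_range(SECONDS_PER_EVENT))
    print("minutes for 5 events:", round(job_minutes(5, SECONDS_PER_EVENT), 1))
```

On that one host, 5 events would take about 26 minutes, and roughly 6 to 11 events would land in the 30-60 minute window. Faster or slower hosts, or different event types, would shift the range.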
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 900 - Posted: 28 Aug 2015, 12:47:14 UTC - in response to Message 898.  

Thank you.
So, the number of "records" per run needs to be the same; otherwise the "jobs" are not comparable in the statistics.

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 902 - Posted: 28 Aug 2015, 15:43:55 UTC - in response to Message 900.  

Yes. There are other metrics, like the wall-clock time, etc.
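
One way to compare batches with different job sizes is to convert jobs/hour into events/hour. A small Python sketch of that normalisation follows; the job rates are invented for illustration, while 5 and 200 events per job are the batch sizes mentioned in this thread. The function name `events_per_hour` is hypothetical and is not taken from the monitoring system.

```python
def events_per_hour(jobs_per_hour, events_per_job):
    """Convert a job rate into an event rate for a batch with a fixed job size."""
    if jobs_per_hour < 0 or events_per_job <= 0:
        raise ValueError("job rate must be non-negative and job size positive")
    return jobs_per_hour * events_per_job


if __name__ == "__main__":
    # Hypothetical job rates; only the events-per-job values come from the thread.
    debug_batch = events_per_hour(jobs_per_hour=40, events_per_job=5)
    large_batch = events_per_hour(jobs_per_hour=2, events_per_job=200)
    print("debug batch:", debug_batch, "events/hour")
    print("large batch:", large_batch, "events/hour")
```

In this made-up case the debug batch shows 20 times the job rate but half the event rate, which is why jobs/hour alone can mislead when the job size changes.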

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 903 - Posted: 28 Aug 2015, 15:45:57 UTC

OK, the last test was a great success, so I'll dump a load of larger jobs in a little while. I'm just doing a pilot run on the GRID proper before starting the test. Thanks for your help, in the past and in the future.

ivan

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 904 - Posted: 28 Aug 2015, 18:14:49 UTC - in response to Message 903.  

Unfortunately the next lot of jobs will be a bit later than planned -- the CRAB server was upgraded this afternoon, which broke "our" patch. I'm told it will be fixed in about two hours, so I'll try again then.

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 905 - Posted: 28 Aug 2015, 20:16:17 UTC - in response to Message 904.  

Not yet...
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 906 - Posted: 28 Aug 2015, 22:22:00 UTC

Is there going to be any work over the weekend?
If not, I am going to shut it down till Monday.

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 907 - Posted: 28 Aug 2015, 22:24:32 UTC - in response to Message 905.  

Looks like it's not going to be fixed tonight. Sorry...
Meanwhile, 1969 of the supposed parallel jobs submitted to the GRID have finished, with 30 errors and one still to finish... :-(

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 908 - Posted: 28 Aug 2015, 22:27:49 UTC - in response to Message 906.  

It's not looking good, I'm afraid. Feel free to switch off.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 909 - Posted: 28 Aug 2015, 22:30:28 UTC

Thanks for letting me know.
m
Volunteer tester

Joined: 20 Mar 15
Posts: 243
Credit: 886,442
RAC: 0
Message 910 - Posted: 28 Aug 2015, 23:32:32 UTC - in response to Message 904.  
Last modified: 29 Aug 2015, 0:06:11 UTC

Oh dear, oh dear, oh dear. Bad move, that.

Thanks for the info.

ivan
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 20 Jan 15
Posts: 1139
Credit: 8,310,612
RAC: 1,082
Message 911 - Posted: 30 Aug 2015, 13:40:48 UTC

Sorry to say that this isn't resolved yet. In fact, the problem has raised some "differences of perspective" amongst the leading players. I'm staying out of that as far as possible; I'm as much a volunteer as you lot. But there's hope we can get it sorted soon. It's a long weekend here, so I'll be buying lots of popcorn to watch tomorrow's round of emails...
Yeti
Joined: 29 May 15
Posts: 147
Credit: 2,842,484
RAC: 0
Message 912 - Posted: 30 Aug 2015, 14:56:39 UTC - in response to Message 911.  

This reminds me of the early years of LHC@Home and sixtrack; it must have been similar ;-)


©2024 CERN