Thread 'Migrating to vLHC@home'

Author	Message
Laurence CERN Project administrator Project developer Project tester Joined: 12 Sep 14 Posts: 1161 Credit: 342,328 RAC: 0	Message 1746 - Posted: 30 Jan 2016, 20:44:56 UTC - in response to Message 1734.
Project      Users     Users with a post
CMS-Dev      189       57 (30%)
vLHC@Home    14,458    1,144 (8%)
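
The percentages quoted above can be checked with a short calculation. This is only a sketch: the user counts come from the post, while the function name `post_rate` is made up for illustration.

```python
def post_rate(users_with_post: int, total_users: int) -> float:
    """Return the percentage of users who have posted at least once."""
    return 100.0 * users_with_post / total_users


# Counts as given in the post above.
cms_dev = post_rate(57, 189)
vlhc = post_rate(1144, 14458)

print(f"CMS-Dev:   {cms_dev:.1f}% of users have posted")
print(f"vLHC@home: {vlhc:.1f}% of users have posted")
```

Rounded to whole numbers, the results match the 30% and 8% stated in the post.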

Rasputin42 Volunteer tester Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 0	Message 1748 - Posted: 30 Jan 2016, 21:14:13 UTC - in response to Message 1746. You have to take into account that you need an invitation code. That means there is a pre-selection of volunteers before they join. vLHC does not have that.

ivan Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 20 Jan 15 Posts: 1156 Credit: 8,453,729 RAC: 25	Message 1753 - Posted: 31 Jan 2016, 11:23:46 UTC - in response to Message 1748.
Quote: "You have to take into account that you need an invitation code."
That was out of necessity; some trollers/scammers/ne'er-do-wells had discovered the project before it was announced and signed up a couple of hundred accounts before Laurence noticed and put the block in.
Quote: "That means there is a pre-selection of volunteers before they join. vLHC does not have that."
That is, of course, still a valid point.

Yeti Joined: 29 May 15 Posts: 163 Credit: 3,574,902 RAC: 8,696	Message 1775 - Posted: 1 Feb 2016, 12:53:52 UTC Okay guys, here is one more point for your to-do list on the way to vLHC. In vLHC I very often see my CMS task sitting there doing nothing because "BOINC_User-ID is not an integer" (it is blank or NULL). I didn't see this here in CMS-Dev.

Steve Hawker* Joined: 6 Mar 15 Posts: 19 Credit: 142,109 RAC: 0	Message 1793 - Posted: 2 Feb 2016, 2:41:00 UTC - in response to Message 1702. Last modified: 2 Feb 2016, 2:41:55 UTC
Quote: "Hi Steve, I think that we found the reason why you got two tasks for the beta and changed the configuration so hopefully you will only get one in the future."
Thanks for acting on this, I will check.
Quote: "The direction that we would like to go is to have one LHC@home BOINC project which has different applications for the different experiments. With this in mind we have a number of options for the testing and development environment:
- beta apps in the LHC@home project
- a test/devel LHC@home project
- separate projects per application
Which option is preferred?"
As I mentioned in the "poll" thread, I would like to see sub-projects. If you have sub-projects, then two distinct and separate projects for prod and dev would be perfect.
Quote: "I agree that this app is not quite ready for primetime but I hope that by starting a process of continuous improvement we can get there soon. Unfortunately we need the virtualized approach and VBox is what the work has been based on up to now. We can try to address the issues but if this becomes a blocking issue we can review our options."
Well, I don't know why, but I'm not gonna argue as it's your show to run. I've had a bad time at ATLAS and most of that is due to VBox. CMS has been a smooth run so I'll stick with it.
In a different thread you wonder about community representation. You have it right here on the boards. The crunchers who care read the forums and the crunchers who care enough will post. The crunchers who would volunteer would be the ones who also care enough to post. Your stats will tell you what % that is. Just make a separate forum folder for such discussions.
Cheers! S.

Tern Joined: 21 Sep 15 Posts: 89 Credit: 383,017 RAC: 0	Message 1854 - Posted: 4 Feb 2016, 17:26:12 UTC
Regarding vLHC message boards (the need to clean up, before dumping more applications on vLHC):
http://lhcathome2.cern.ch/vLHCathome/forum_thread.php?id=1703, title "Virtualbox not installed", posted 15 Aug 2015, 19:30:45 UTC, yet still on page 1. NEVER ANSWERED. At all. By anyone.
Rang a bell with me because I had the exact same problem here. (See my very first posting, http://boincai05.cern.ch/CMS-dev/forum_thread.php?id=89 - which, by the way, was ALSO never answered, other than by volunteer "m", who at least TRIED to help... I still have no answer for the problem, it just "started working" at some point.) Mine is still the one-and-only post over in Q&A:Mac, so it's not like it's gotten buried... AFAIK, nobody at either project has ever even looked at the problem. No telling how many people DID give up, as I was going to if it hadn't fixed itself just before I did.
More problematic: the very first stickied thread in Number Crunching is "How to end a run on task gracefully". It started in 2012. They still have the problem today (as we do in CMS-Dev). However, the "How to" (post #1) is no longer correct; the "new correct" (but questionable) fix is way down on page 4. Stickied threads 1500+ days old, if still relevant, indicate a major project problem. If not still relevant, they indicate a major project communication (boards) problem!
Also noted from random checking: most of the problems currently being reported over there, especially by new users, seem VERY familiar. With lots of people saying "THIS NEEDS TO BE COMMUNICATED AT SIGNUP, not buried somewhere in the message boards!" Gee, where have I heard that before?
CMS-Dev message board checking problems exist also, BTW, so it's "CERN-wide", not just vLHC: poor MarkRBright has been waiting since 16 May 2015 for an answer to his question in Q&A:Windows, and "Agus" has been waiting 7 days now.
Understaffing everywhere? Prioritization problems?
Oh yeah, the "CMS Simulation Beta Tasks" thread at vLHC... Yep, CMS was ready to move over there all right! Smooth move, CERN! Looks like they "shut down" (but not really) CMS-Dev, fired up at vLHC, and then immediately MADE MAJOR CHANGES TO THE APP!!!!! WTF?
"Let's spend months in development, alpha and beta testing, get a version 0.4 that almost works (other than a few dozen known bugs we can't seem to fix), decide overnight to renumber it 1.0 and release it. Then the next day, take what was going to be version 0.5, with a ton of totally untested changes, and without any communication whatsoever, call it 1.1 and send it out to thousands of users." 8-(

Rasputin42 Volunteer tester Joined: 16 Aug 15 Posts: 967 Credit: 1,216,795 RAC: 0	Message 1855 - Posted: 4 Feb 2016, 18:18:32 UTC Last modified: 4 Feb 2016, 18:27:57 UTC I think nobody is, or feels that they are, in charge of the message boards or anything else. That is the problem. EDIT: There is a new admin person on the vLHC message boards, so maybe they are starting to make some improvements.

Laurence CERN Project administrator Project developer Project tester Joined: 12 Sep 14 Posts: 1161 Credit: 342,328 RAC: 0	Message 1862 - Posted: 4 Feb 2016, 23:09:43 UTC - in response to Message 1854. Last modified: 4 Feb 2016, 23:13:22 UTC
Hi Bill, I would suggest that issues relating to the vLHC forum be directed there. Communication and support via the message boards are indeed important and something that is often overlooked.
As a general observation, people tend to make logical decisions. If a decision is perceived as illogical then someone is missing some information. The purpose of discourse should be to identify that information and share it.
I would like to clarify a few details about CERN and its purpose, as it is often misunderstood. CERN is an organization that has 21 member states and is also a physical laboratory. Its main purpose is to provide infrastructure, such as the particle accelerator, that is needed for high-energy physics research. As a result it is a focal point for numerous experiments that are the result of international collaborations. As an example, you can take a look at the paper behind MCPLOTS and vLHC@home: http://arxiv.org/pdf/1306.3436v2.pdf
Whilst the people listed as authors may have been working at CERN, they are actually employees of their respective institutions. So even though the hostname is mcplots.cern.ch, the responsibility falls on an international collaboration of people. If you find any of this confusing, imagine what it is like to work in such an environment :)

Tern Joined: 21 Sep 15 Posts: 89 Credit: 383,017 RAC: 0	Message 1907 - Posted: 7 Feb 2016, 13:43:01 UTC - in response to Message 1862.
Quote: "Hi Bill, I would suggest that issues relating to the vLHC forum be directed there. Communication and support via the message boards are indeed important and something that is often overlooked."
My point was missed. I don't care about the vLHC forum, as I don't use it, except to find VBox answers that might be there and not here. And therefore, not being attached to vLHC, I can't post there. The point was that doing away with the idea of a "dev" project and doing a beta app THERE would mean dealing with the issues there. Plenty of issues here, why add more? I was giving ammunition in support of having two different projects.
<snipped lots of interesting info about CERN>
My second point was also missed. While I think the communication HERE is FAR better than the communication at vLHC, there is also a shortage of "customer service" going on here. The fact that NO ONE is looking at and responding to the "Q&A" section was my example. Lots of Q's, no A's. Even after my posting, nobody has gone to look there.
My third point was that CERN (whichever sub-part is irrelevant) just created a massive mess at two projects, vLHC and CMS-Dev, with the ill-timed whatever-it-was. I started to say "release of CMS at vLHC", but no, CMS has been there for a while. Then I started to say "shutdown of CMS-Dev", but that didn't fit either. I'm not sure what was done by who or why; I am just seeing sudden numerous bizarre problems in multiple places. (Which have even multiplied since my original posting.) SOMEBODY (one or more), employer irrelevant, made a really, really poor decision, or series of decisions, or untested code changes, somewhere, this last week or so. We, the volunteers, at both projects, are having to deal with the fallout.
Having been through situations like this before, when chaos starts to take over at a project, MY reaction is generally just to back off and wait for the project to get their act together. If I had the time to chase one of the problems so I could contribute, fine, but I have no "spare" time at the moment to be able to help, so better just to stay out of the way. My machines are running unattended; if they are producing data that can be used by someone, great.

Laurence CERN Project administrator Project developer Project tester Joined: 12 Sep 14 Posts: 1161 Credit: 342,328 RAC: 0	Message 1909 - Posted: 7 Feb 2016, 22:11:38 UTC - in response to Message 1907. To address your second point first, the shortage of customer service has been recognised and is an area where significant improvements can be made. This brings us to your third point, which is linked to the first. It was thought that the best customer service could be provided by having a single project. This direction was known from day one of the project and the beta was enabled in vLHC@home last year. However, when the announcement to move was made, feedback was received (including from yourself) and was considered. Thus the move is on hold until a conclusion is reached on what to do. The outcome of the poll suggests that having a production and a development project is preferred and that the PrimeGrid model is maybe what we should be following. The positive that I would take away from the situation is that this community's opinion is valued highly and has the potential to influence the direction.

Yeti Joined: 29 May 15 Posts: 163 Credit: 3,574,902 RAC: 8,696	Message 1989 - Posted: 13 Feb 2016, 16:04:09 UTC Okay, project guys, please bring some light into my darkness. I understood from your postings that you wanted us to switch to vLHC and run CMS tasks there. But I see that you are still running and serving CMS tasks here, while vLHC only sends out a few workunits. So, what do you prefer? Running CMS here or at vLHC?

ivan Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 20 Jan 15 Posts: 1156 Credit: 8,453,729 RAC: 25	Message 1990 - Posted: 13 Feb 2016, 16:19:40 UTC - in response to Message 1989.
Quote: "Okay, project guys, please bring some light into my darkness. I understood from your postings that you wanted us to switch to vLHC and run CMS tasks there. But I see that you are still running and serving CMS tasks here, while vLHC only sends out a few workunits. So, what do you prefer? Running CMS here or at vLHC?"
Speaking only for myself -- you realise I'm really only a small cog in a larger machine -- I'd be happy for you to maintain a small presence here. The move to vLHC may have been premature, especially as WMAgent jobs seem to have their difficulties there. I haven't checked specifically, but from the number of jobs running we must have picked up some vLHC users; however, from my (and I emphasise _my_) perspective, perhaps vLHC is a bit more ~~experimental~~ untried than our efforts here.

Laurence CERN Project administrator Project developer Project tester Joined: 12 Sep 14 Posts: 1161 Credit: 342,328 RAC: 0	Message 1996 - Posted: 13 Feb 2016, 21:53:41 UTC - in response to Message 1989. The current situation is that there seems to be consensus on having both a production and a dev project. Having credit per app also seems to be preferable. This project will now most probably stay around as the dev project. At the moment the CMS app is the same in both projects, so it depends on where you want to accumulate credit and which boards you prefer. Personally I would see vLHC@home as the crunching project and this one as the place for those who want to contribute more than just cycles. A failed task with a good bug report here is probably worth 1000 successful tasks. So if you have many machines I would use vLHC@home, but if you only have one or two machines and intend to remain active on the boards, stay here. btw, can you explain to me the difference between having credit per app and a sub-project?

Development for LHC@home