1) Message boards : CMS Application : No Tasks
Message 5479
Posted 11 Aug 2018 by Ben Segal
So, is the CMS-project now completely dead? (dev and production)

No, the CMS people are still trying to get their production job submission working. Don't know how hard it is or how hard they are working - vacation time...
2) Message boards : Number crunching : New Applications
Message 5470
Posted 18 Jul 2018 by Ben Segal
These test apps are for now just placeholders for a summer student project which is looking at some problems involving Machine Learning.

Will post if/when we have some real work to crunch. For now please don't waste time wondering, OK?

Ben and Laurence
3) Message boards : CMS Application : Dip?
Message 4315
Posted 14 Nov 2016 by Ben Segal
BOINC's feeder is not running: https://lhcathome.cern.ch/vLHCathome-dev/server_status.php

At least when using the new secure project URL.

Thanks for the heads up!
It's been restarted.

Ben and Nils
4) Message boards : Theory Application : Authentication errors - error 206.
Message 4249
Posted 28 Oct 2016 by Ben Segal
Yes, we had a firewall issue with the Condor job feeder. Now being fixed so things should recover slowly.
5) Message boards : News : CMS Servers up again
Message 4238
Posted 26 Oct 2016 by Ben Segal
My hosts got a couple of WUs from the non-dev project although it was clear they would run into an error.
Why can't you stop sending out WUs until the patches are installed?

The VMs are also Linux machines.
If they use a COW filesystem they are also affected by that bug.
I'm sure I am the very first thinking about that :-))

The patch for this bug was issued yesterday and will be applied automagically when your current task expires and your CernVM reboots.

As I stated a couple of times in this message board my hosts still do not get the most recent application versions (CMS, CMS-dev).
The older apps download/boot older VM images, e.g. CMS_2016_08_08.vdi in case of CMS-dev, which are not patched.

Resetting the projects or rebooting the hosts does not solve the problem.

Well actually you do get the security patches whatever .vdi version gets loaded. CernVM is connected to its file system CVMFS and it is this which does automagical kernel and library updates right after booting the vdi image.
6) Message boards : CMS Application : New Version v47.60
Message 4237
Posted 26 Oct 2016 by Ben Segal
Do you mean vLHC is hosted by an "external" laboratory ? I would have thought this was "inside" LHC infrastructure, somehow...

...

The vLHC BOINC servers are at CERN but the Condor servers which supply jobs can be elsewhere. In this case, CMS jobs are being sent from RAL where Ivan is partly based.
7) Message boards : News : CMS Servers up again
Message 4232
Posted 25 Oct 2016 by Ben Segal
My hosts got a couple of WUs from the non-dev project although it was clear they would run into an error.
Why can't you stop sending out WUs until the patches are installed?

The VMs are also Linux machines.
If they use a COW filesystem they are also affected by that bug.
I'm sure I am the very first thinking about that :-))

The patch for this bug was issued yesterday and will be applied automagically when your current task expires and your CernVM reboots.
8) Message boards : CMS Application : New Version v47.60
Message 4188
Posted 17 Oct 2016 by Ben Segal
This new version enables Web Proxy Auto-Discovery (WPAD), which means the CVMFS traffic should be directed to either CERN or FNAL depending on where you are.

By the way, FNAL means Fermi National Accelerator Lab which is near Chicago USA.
9) Message boards : Theory Application : Multicore settings for 20 cor machine
Message 3892
Posted 30 Jul 2016 by Ben Segal
At least without app_config and no tasks, an empty cache and asking 0 days of work, I got 8 tasks for my 8 threaded (4 core HT) machine.
1 task started, created a VM with 8 processors, running 8 jobs with far too low memory. At least the swap is now used ;)

Yes, looks like Work In Progress (:-))
10) Message boards : Theory Application : Multicore settings for 20 cor machine
Message 3890
Posted 30 Jul 2016 by Ben Segal
- the new project preference options set to 1 task max, 3 CPUs max

Is this setting only for (ex-)CERN-employees??

I only see a new "Limit the number of tasks per host?" with a tick-box

I got at least 3 tasks with this option set to yes.

Aaaargh!!! Looks like Laurence is doing development under our feet. Those preferences were set by me yesterday, before today's testing. The page with them displayed was still on my screen but I'd not refreshed it today. Now, as you say, they aren't shown any more. But hopefully they are still active somewhere..

Laurence, what goes ???
11) Message boards : Number crunching : Performance Tuning
Message 3887
Posted 30 Jul 2016 by Ben Segal
Is there ANY way I could run 3 jobs with 4 cores?

The load average running 4 jobs is about 7.
That means that there is much more work per job than a single core can process.
Before the multi-core version I could do it; now I can only use 1 core per job, which is substantially overloaded and therefore slow.

If the aim of the game is to run as fast as possible, this should be implemented.
Alternatively, there could be 2 versions:
one as is and the other as it was before (through plan classes).

EDIT: Theory jobs
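
The load figure quoted above can be turned into a rough per-job estimate. This is a minimal Python sketch, assuming the load average of about 7 comes entirely from the 4 running jobs; the function name is hypothetical and not part of BOINC.

```python
def runnable_per_job(load_average: float, jobs: int) -> float:
    """Average number of runnable processes per job, given the system load average."""
    if jobs <= 0:
        raise ValueError("jobs must be positive")
    return load_average / jobs


# Figures from the post: load average of about 7 while running 4 jobs.
per_job = runnable_per_job(7.0, 4)
print(f"{per_job:.2f} runnable processes per job")  # prints "1.75 runnable processes per job"
```

A value above 1 would suggest that each job has more runnable work than a single core can serve, which is the point the poster is making.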

I am currently doing this on my i7 Macbook Pro (4CPUs and 8 threads), just setting the new project preference options to 1 task max and 3 CPUs max. Try it!

See my post http://lhcathomedev.cern.ch/vLHCathome-dev/forum_thread.php?id=280&postid=3885#3885
12) Message boards : Theory Application : Multicore settings for 20 cor machine
Message 3885
Posted 30 Jul 2016 by Ben Segal
Just FYI, with my MacBook Pro (i7, 4 CPUs, 8 threads) and:

- no app_config.xml file
- the new project preference options set to 1 task max, 3 CPUs max
- only the Theory app checked

and with the latest updated server code, I get (YES!):

1 task (Theory) and a 3 CPU VM

and it works correctly so far… I will check for progress today, including some suspends/resumes.

Ben

EDIT: So far all goes well:
4 jobs ran to completion, all 3 Condor job threads active. Suspending for now (15:50 CET).
13) Message boards : Number crunching : Respect My Limits!
Message 3865
Posted 29 Jul 2016 by Ben Segal
Hi Bryan, it's only very recently that Theory has been able to use multiple cores sensibly and we are slowly gaining experience with that. The code itself is not multi-threaded to any serious extent, so one is forced to use either multiple BOINC tasks with a VM in each (ugly), or multiple VMs per task, or multiple jobs per VM (which may be tricky), or a mixture of all that. If the code were designed for multicore and/or multiple threads, life would be simpler (like ATLAS, I believe). In any case, going over about 8 threads at a time is application dependent, and in general high energy physics code doesn't lend itself to that sort of thing. In the old days we tried and failed to use vector hardware on supercomputers like the Cray and so on. Sorry not to be able to ace out your super installation!

Ben
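
The "multiple jobs per VM" option can be pictured as several job streams, each pulling the next single-threaded job as soon as it is free. The Python sketch below is only an illustration under that assumption: the job lengths are made up, and `makespan` is a hypothetical helper, not part of BOINC, Condor or the Theory app.

```python
import heapq


def makespan(job_hours, slots):
    """Finish time when independent single-threaded jobs are fed, in order,
    to `slots` job streams, each taking the next job as soon as it is free."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    free_at = [0.0] * slots  # time at which each stream next becomes idle
    heapq.heapify(free_at)
    for hours in job_hours:
        start = heapq.heappop(free_at)  # the earliest-free stream takes the job
        heapq.heappush(free_at, start + hours)
    return max(free_at)


jobs = [2.0, 1.0, 3.0, 1.5, 2.5, 1.0]  # hypothetical job lengths in hours
print(makespan(jobs, 1))  # 11.0 (one job at a time)
print(makespan(jobs, 3))  # 4.5 (three job streams side by side)
```

The gain comes purely from running independent jobs side by side; no single job gets faster, which matches the remark that the code itself is not multi-threaded.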
14) Message boards : Theory Application : Task startup issue
Message 3850
Posted 29 Jul 2016 by Ben Segal
Thanks, Rom.
How is it determined now how many cores are to be used?
By the project preference file? What if it is undefined?

...

Anyway be careful as Laurence hasn't yet fully implemented the new project preferences for number of cores and tasks… You may not get what you ask for until he's finished coding it.
15) Message boards : Theory Application : Task startup issue
Message 3829
Posted 27 Jul 2016 by Ben Segal
I have a case where I am running a 4-core task. For whatever reason, after 3 hours or so it decided to discontinue 3 of them and now runs on only one.
(As if it had hit the 12-hour mark, which it has not.)

I had the same thing yesterday with a 2 core task when one of the two job streams went quiet while the other continued to run jobs.
16) Message boards : Theory Application : Theory Application job errors
Message 3779
Posted 22 Jul 2016 by Ben Segal
The Theory Application looks like it is getting a batch of bad input jobs from the MCPlots server. This is causing many task errors.
We are investigating… Please be patient.
17) Message boards : Theory Application : New Version v47.22
Message 3773
Posted 22 Jul 2016 by Ben Segal
Since about 5:44 UTC there seem to be no jobs available.

BTW, BOINC tasks should not error out if no work is available.

EDIT: And this message shows at the top of stderr.

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
The ring 2 stack is in use.
(0xcf) - exit code 207 (0xcf)
</message>

Yes, there was a problem over the last few hours with both job supply and result processing on the Theory Application. This has now been fixed so things should clear up steadily.

@CP, this also explains your lack of jobs after resuming...
18) Message boards : Theory Application : Task shutting down prematurly
Message 3604
Posted 23 Jun 2016 by Ben Segal
As before, a task is now running, maybe 1 job, and then starting a new task.

ARE WE OUT OF JOBS?

Thanks for the detailed reply to my first question.


Hi Rasputin, Laurence is busy for about 10 days so may not reply to you for a while…

Thanks for helping us test this!
19) Message boards : News : Scheduler and vbox update to detect 64-bit enabled computers
Message 3603
Posted 23 Jun 2016 by Ben Segal
Hi maeax, Laurence is off for about 10 days so may not reply to you for a while…

Thanks for helping us test this!
20) Message boards : Number crunching : VBox issues
Message 3548
Posted 8 Jun 2016 by Ben Segal
Rom just released vboxwrapper 26186 at the weekend to support VirtualBox 5.1 on Windows. The Theory app was updated so that the release on Monday would go with that version but the CMS app wasn't updated. It has been done now so please try again.


Does this apply as well to the production work at vLHC?

Yes it does.
©2025 CERN