Message boards : General Discussion : Scheduler Change
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6845 - Posted: 26 Nov 2019, 14:51:42 UTC

As far as I understand the scheduler code, the issue is that the project preferences setting is limiting the ncpus and that this value is used to set the number of threads. We probably don't want to touch ncpus and just set the number of threads.
ID: 6845
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6846 - Posted: 26 Nov 2019, 15:04:28 UTC - in response to Message 6845.  
Last modified: 26 Nov 2019, 15:04:42 UTC

I have disabled the Max # CPUs setting, so changing it should have no effect.
ID: 6846
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 467
Credit: 389,411
RAC: 503
Message 6847 - Posted: 26 Nov 2019, 15:54:39 UTC

Here is what sched_send.cpp does.

The base CPU count is set to the host's processor count:
n = g_reply->host.p_ncpus;

It is reduced if the user has set a CPU percentage limit:
if (g_request->global_prefs.max_ncpus_pct && g_request->global_prefs.max_ncpus_pct < 100) {
    n = (int)((n*g_request->global_prefs.max_ncpus_pct)/100.);
}

It is reduced again to config.max_ncpus if that is lower than the percentage-limited value (my guess is that this comes from <ncpus> in cc_config.xml, but I am not sure):
if (n > config.max_ncpus) n = config.max_ncpus;

Sanity checks (not sure where MAX_CPUS is set; looks like a global setting):
if (n < 1) n = 1;
if (n > MAX_CPUS) n = MAX_CPUS;

So far this makes sense: the settings above limit the total number of cores this BOINC client is allowed to use, so the server can estimate how much work can be sent.




if (project_prefs.max_cpus) {
    if (n > project_prefs.max_cpus) {
        n = project_prefs.max_cpus;
    }
}

This code is there without any doubt, but why here?
Suppose a user wants to run 2-core ATLAS tasks and sets the web preferences to "2 CPUs".
This code will then make every brand-new 256-core-amdintel_ripper look like Fred Flintstone's 2-core machine.
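
To make the effect concrete, the quoted steps can be combined into one function. This is a sketch only, not the actual sched_send.cpp: the Inputs struct, the function name effective_ncpus, and the kMaxCpus value of 1024 are placeholders introduced here, since the real MAX_CPUS value is not quoted in this thread.

```cpp
// Placeholder for the scheduler's MAX_CPUS; the real value is defined in a
// BOINC header and is not quoted in this thread.
constexpr int kMaxCpus = 1024;

// Hypothetical container for the values the quoted snippets read.
struct Inputs {
    int p_ncpus;            // g_reply->host.p_ncpus
    double max_ncpus_pct;   // g_request->global_prefs.max_ncpus_pct (0 = unset)
    int config_max_ncpus;   // config.max_ncpus
    int project_max_cpus;   // project_prefs.max_cpus (0 = unset)
};

constexpr int effective_ncpus(const Inputs& in) {
    int n = in.p_ncpus;
    // Percentage limit from the user's global preferences.
    if (in.max_ncpus_pct > 0 && in.max_ncpus_pct < 100) {
        n = static_cast<int>((n * in.max_ncpus_pct) / 100.);
    }
    // Cap from config.max_ncpus.
    if (n > in.config_max_ncpus) n = in.config_max_ncpus;
    // Sanity checks.
    if (n < 1) n = 1;
    if (n > kMaxCpus) n = kMaxCpus;
    // The clamp questioned above: the project preference overwrites the
    // host-wide CPU count instead of only limiting threads per task.
    if (in.project_max_cpus) {
        if (n > in.project_max_cpus) {
            n = in.project_max_cpus;
        }
    }
    return n;
}

// A 256-core host with the web preference "2 CPUs" is treated as a 2-core host.
static_assert(effective_ncpus(Inputs{256, 0.0, 1024, 2}) == 2, "clamped to 2");
// Without the project preference the full core count survives.
static_assert(effective_ncpus(Inputs{256, 0.0, 1024, 0}) == 256, "unclamped");
```

With the last clamp removed, the same host would report 256 cores and the project preference could be applied to the thread count alone.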
ID: 6847
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6848 - Posted: 26 Nov 2019, 16:14:08 UTC - in response to Message 6846.  

I have disabled the Max # CPUs setting, so changing it should have no effect.
Max # jobs 2
Max # CPUs 1

I got 2 tasks, just what a 'normal' user would expect.
ID: 6848
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6849 - Posted: 26 Nov 2019, 18:16:24 UTC

Setting
Max # jobs 3
Max # CPUs 2

no app_config in use - I got a 3rd task and avg_ncpus stays 1, creating a single core VM.
Requesting new tasks, I get: lhcathome-dev 26 Nov 19:06:18 This computer has reached a limit on tasks in progress
When Max # CPUs was misused as a task limit, the message was 'No tasks available'.
ID: 6849
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6850 - Posted: 26 Nov 2019, 21:32:31 UTC - in response to Message 6849.  

Setting
Max # jobs 3
Max # CPUs 2

no app_config in use - I got a 3rd task and avg_ncpus stays 1, creating a single core VM.
Requesting new tasks, I get: lhcathome-dev 26 Nov 19:06:18 This computer has reached a limit on tasks in progress
When Max # CPUs was misused as a task limit, the message was 'No tasks available'.

This is what I would expect for Theory as it is now a single core app. Try CMS which is multi-core.
ID: 6850
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6851 - Posted: 26 Nov 2019, 21:36:16 UTC - in response to Message 6847.  
Last modified: 26 Nov 2019, 21:36:35 UTC

(not sure where MAX_CPUS is set; looks like a global setting)

In the header file.

if (project_prefs.max_cpus) {
    if (n > project_prefs.max_cpus) {
        n = project_prefs.max_cpus;
    }
}

This code is there without any doubt, but why here?
Suppose a user wants to run 2-core ATLAS tasks and sets the web preferences to "2 CPUs".
This code will then make every brand-new 256-core-amdintel_ripper look like Fred Flintstone's 2-core machine.


These lines have now been deleted. Will look where they should be put.
ID: 6851
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 738
Credit: 11,558,798
RAC: 1,847
Message 6852 - Posted: 27 Nov 2019, 2:26:02 UTC

Well, I see this didn't work for me.

I just got a 4-core task on a quad-core host that is set to run only 2-core tasks, like all my other host locations.

I could run the 4-core task on my 8-core PCs, but I haven't done an update there yet since I am waiting for CMS to actually have jobs again. I sure hope it doesn't send me 8-core CMS tasks.

This 4-core task will have to be aborted.
ID: 6852
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6854 - Posted: 27 Nov 2019, 10:20:25 UTC - in response to Message 6850.  

This is what I would expect for Theory as it is now a single core app. Try CMS which is multi-core.
I did with settings:
Max # jobs 2
Max # CPUs 2

and I got a 3-core task:
    <avg_ncpus>3.000000</avg_ncpus>
    <flops>43178750546.003098</flops>
    <plan_class>vbox64_mt_mcore_cms</plan_class>
    <api_version>7.7.0</api_version>
    <cmdline>--memory_size_mb 3688 --nthreads 3</cmdline>
I changed to
Max # jobs 2
Max # CPUs 4
and nothing in the above output changed??
I could reset the -dev project and try again, but first I have to finish a Theory task that is running a bit longer.
ID: 6854
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6855 - Posted: 27 Nov 2019, 13:15:36 UTC - in response to Message 6854.  

I think I have found a solution. Please can you see if the settings work as expected. You should get the following:

Max 2 CPU, Max 1 Job => 2 threaded job 
Max 1 CPU, Max 2 Job => 2 single threaded jobs
ID: 6855
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6856 - Posted: 27 Nov 2019, 14:13:56 UTC - in response to Message 6855.  

There is still a small issue left to be solved. If I have 3 CPUs and have set Max 2 CPUs, the request comes back with two 2-CPU jobs, whereas I would like one 2-CPU job and one 1-CPU job. If these are separate requests, everything is fine. Anyway, I have opened an issue on the topic to hopefully get input from others.
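
The desired split can be illustrated with a small sketch. This is not scheduler code and not a proposed patch: cpus_for_next_job, plan_request and the Plan struct are hypothetical names, and the idea of tracking free CPUs across the jobs of a single request is an assumption about one way to get the behaviour described above.

```cpp
#include <algorithm>

// Hypothetical helper: size the next job from the CPUs that earlier jobs in
// the same request have not already claimed.
constexpr int cpus_for_next_job(int free_cpus, int max_cpus_per_job) {
    return std::max(0, std::min(free_cpus, max_cpus_per_job));
}

// Summary of one request: number of jobs, CPUs handed out in total, and the
// size of the last job.
struct Plan {
    int jobs;
    int cpus_used;
    int last_job_cpus;
};

// Walk job by job until the job limit is reached or no CPU is left.
constexpr Plan plan_request(int host_cpus, int max_cpus_per_job, int max_jobs) {
    Plan p{0, 0, 0};
    int free_cpus = host_cpus;
    while (p.jobs < max_jobs) {
        const int c = cpus_for_next_job(free_cpus, max_cpus_per_job);
        if (c == 0) break;          // nothing left to assign
        free_cpus -= c;
        p.cpus_used += c;
        p.last_job_cpus = c;
        ++p.jobs;
    }
    return p;
}

// 3 CPUs with Max 2 CPUs: a 2-CPU job followed by a 1-CPU job, 3 CPUs in total.
static_assert(plan_request(3, 2, 2).jobs == 2, "two jobs");
static_assert(plan_request(3, 2, 2).cpus_used == 3, "no oversubscription");
static_assert(plan_request(3, 2, 2).last_job_cpus == 1, "second job is 1 CPU");
```

Sizing every job independently as min(3, 2) would instead give two 2-CPU jobs, which is the behaviour reported for a single request.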
ID: 6856
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6857 - Posted: 27 Nov 2019, 14:17:33 UTC - in response to Message 6854.  

I could reset the -dev project and try again, but first I have to finish a Theory task that is running a bit longer.
Finally that Theory job was ready and I have reset the project.

First try:
Max # jobs 1
Max # CPUs 2

    <avg_ncpus>2.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_cms</plan_class>
    <cmdline>--memory_size_mb 2792 --nthreads 2</cmdline>


Second try:
Max # jobs 1
Max # CPUs 4
    <avg_ncpus>1.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_cms</plan_class>
    <cmdline>--memory_size_mb 1896 --nthreads 1</cmdline>
ID: 6857
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6858 - Posted: 27 Nov 2019, 14:45:03 UTC - in response to Message 6857.  
Last modified: 27 Nov 2019, 14:53:04 UTC

Second try:
Max # jobs 1
Max # CPUs 4
    <avg_ncpus>1.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_cms</plan_class>
    <cmdline>--memory_size_mb 1896 --nthreads 1</cmdline>
I probably know why!

I made my second request while "Use at most 12.5% of the CPUs" was set locally on a host with 8 threads. Sigh. Retrying with 100% of the CPUs:
    <avg_ncpus>4.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_cms</plan_class>
    <cmdline>--memory_size_mb 4584 --nthreads 4</cmdline>
That is as expected, but I don't like that reducing my local CPU limit affects the number of cores set in my project preferences.
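
The arithmetic behind the two results can be sketched as follows. The percentage step follows the sched_send.cpp lines quoted earlier in this thread; taking the thread count as the smaller of the usable CPUs and Max # CPUs is an assumption that fits the results above, since the actual scheduler change is not shown here. The names usable_cpus and nthreads are made up for the example.

```cpp
#include <algorithm>

// CPUs left after the local "Use at most X% of the CPUs" preference,
// never less than 1.
constexpr int usable_cpus(int p_ncpus, double max_ncpus_pct) {
    int n = p_ncpus;
    if (max_ncpus_pct > 0 && max_ncpus_pct < 100) {
        n = static_cast<int>((n * max_ncpus_pct) / 100.);
    }
    return std::max(n, 1);
}

// Assumption: threads per job = min(usable CPUs, project "Max # CPUs").
constexpr int nthreads(int p_ncpus, double max_ncpus_pct, int project_max_cpus) {
    return std::min(usable_cpus(p_ncpus, max_ncpus_pct), project_max_cpus);
}

// 8 threads at 12.5% leave 1 usable CPU, so Max # CPUs 4 still yields 1 thread.
static_assert(nthreads(8, 12.5, 4) == 1, "12.5% of 8 threads");
// At 100% all 8 are usable and the project preference caps the job at 4.
static_assert(nthreads(8, 100.0, 4) == 4, "100% of 8 threads");
```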
ID: 6858
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1178
Credit: 810,985
RAC: 2,009
Message 6859 - Posted: 27 Nov 2019, 14:51:32 UTC - in response to Message 6855.  

I think I have found a solution. Please can you see if the settings work as expected. You should get the following:

Max 2 CPU, Max 1 Job => 2 threaded job 
Max 1 CPU, Max 2 Job => 2 single threaded jobs
Both settings work as expected.
ID: 6859
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1064
Credit: 325,950
RAC: 249
Message 6860 - Posted: 27 Nov 2019, 15:46:05 UTC - in response to Message 6859.  

Thanks for testing. I am going to update the production scheduler, as I think it is an improvement over what we have now.
ID: 6860



©2024 CERN