Message boards : CMS Application : New Version v47.40

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 4007 - Posted: 8 Aug 2016, 18:34:32 UTC

Upgrading vboxwrapper to 26197 on Windows, which enables support for VirtualBox 5.1.2.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4008 - Posted: 8 Aug 2016, 18:47:18 UTC - in response to Message 4007.  

Multi-core?

Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 540
Credit: 7,616,583
RAC: 1,500
Message 4010 - Posted: 8 Aug 2016, 19:03:11 UTC - in response to Message 4008.  

Multi-core?


It appears to be multi-core.

I have it downloading right now.

http://lhcathomedev.cern.ch/vLHCathome-dev/results.php?hostid=612

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4012 - Posted: 8 Aug 2016, 19:08:32 UTC

Thanks!

I will give it a go in a bit.

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 4015 - Posted: 8 Aug 2016, 19:41:03 UTC - in response to Message 4008.  

Yes, but with the new project preferences you can set the number of CPUs to 1 to make it single-core.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4019 - Posted: 8 Aug 2016, 21:37:10 UTC
Last modified: 8 Aug 2016, 21:42:20 UTC

Tried to run a 4-core task.

Failed twice --> VM Completion Message: No jobs were available to run.


EDIT: Tried a 2-core task --> seems to work.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,653
RAC: 2
Message 4022 - Posted: 9 Aug 2016, 6:44:27 UTC - in response to Message 4019.  

Tried to run a 4-core task.

Failed twice --> VM Completion Message: No jobs were available to run.


EDIT: Tried a 2-core task --> seems to work.

See my message -> http://lhcathomedev.cern.ch/vLHCathome-dev/forum_thread.php?id=291&postid=4000#4000

So maybe it is no coincidence.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4023 - Posted: 9 Aug 2016, 6:53:30 UTC - in response to Message 4022.  

Maybe not. I assigned 5 GB of memory, so a lack of memory is not the cause.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4027 - Posted: 9 Aug 2016, 9:20:36 UTC
Last modified: 9 Aug 2016, 9:26:19 UTC

A 2-core task runs fine.

A 3-core task runs only 2 jobs, in slot 1 and slot 3.

EDIT: One has to keep in mind that when multiple jobs upload large amounts of data at the same time, it is going to be a problem, as we had in the past.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,653
RAC: 2
Message 4029 - Posted: 9 Aug 2016, 10:09:36 UTC

I twice tested 2 CMS tasks with 2 processors in each VM.
All 4 tasks errored out after about 9 minutes with EXIT_NO_SUB_TASKS (CMS jobs were available).
Starting 2 single-core CMS tasks works fine.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4030 - Posted: 9 Aug 2016, 10:30:19 UTC - in response to Message 4029.  

You are using vbox 5.0.26?
I am using 5.1.2.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4031 - Posted: 9 Aug 2016, 10:35:47 UTC

A 3-core task runs only 2 jobs, in slot 1 and slot 3.


It picked up a 3rd job 29 min later than the other two.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,653
RAC: 2
Message 4032 - Posted: 9 Aug 2016, 11:34:51 UTC - in response to Message 4030.  

You are using vbox 5.0.26?
I am using 5.1.2.

Yeah, I'll upgrade to 5.1.2 to test it with vboxwrapper version 26197 (only available for Windows so far).

After the upgrade I'll start 1 dual-core CMS task without a memory extension (no app_config), so with the default 2048 MB of RAM.

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,653
RAC: 2
Message 4033 - Posted: 9 Aug 2016, 12:15:56 UTC - in response to Message 4032.  

Yeah, I'll upgrade to 5.1.2 to test it with vboxwrapper version 26197 (only available for Windows so far).

After the upgrade I'll start 1 dual-core CMS task without a memory extension (no app_config), so with the default 2048 MB of RAM.

Done.
1 dual-core CMS task with default RAM doesn't get jobs either. I'll use an app_config now with RAM set to 3072 MB.
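
For anyone wanting to reproduce this: the post does not show the app_config itself, so the following is only a sketch of how such a file could be generated. The app name `CMS` and the use of a `--memory_size_mb` option in `<cmdline>` to set the VM's RAM are assumptions, not something confirmed in this thread; check the app name your client reports and the options your vboxwrapper version accepts before relying on it. The helper name `build_app_config` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, ncpus: int, memory_mb: int) -> str:
    """Return app_config.xml text for a single app version."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)
    # Assumption: the wrapper reads the VM's RAM size from this option.
    ET.SubElement(version, "cmdline").text = f"--memory_size_mb {memory_mb}"
    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# "CMS" is an assumed app name; 2 cores and 3072 MB are the values from this post.
APP_CONFIG_XML = build_app_config("CMS", ncpus=2, memory_mb=3072)
print(APP_CONFIG_XML)
```

The printed text would be saved as app_config.xml in the project's directory, after which the client has to re-read its config files.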

http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=235756

Condor StartLog:

08/09/16 13:54:48 ******************************************************
08/09/16 13:54:48 ** condor_startd (CONDOR_STARTD) STARTING UP
08/09/16 13:54:48 ** /usr/sbin/condor_startd
08/09/16 13:54:48 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
08/09/16 13:54:48 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
08/09/16 13:54:48 ** $CondorVersion: 8.4.8 Jun 30 2016 BuildID: 373513 $
08/09/16 13:54:48 ** $CondorPlatform: x86_64_RedHat6 $
08/09/16 13:54:48 ** PID = 4156
08/09/16 13:54:48 ** Log last touched time unavailable (No such file or directory)
08/09/16 13:54:48 ******************************************************
08/09/16 13:54:48 Using config source: /etc/condor/condor_config
08/09/16 13:54:48 Using local config sources:
08/09/16 13:54:48 /etc/condor/config.d/10_security.config
08/09/16 13:54:48 /etc/condor/config.d/14_network.config
08/09/16 13:54:48 /etc/condor/config.d/20_workernode.config
08/09/16 13:54:48 /etc/condor/config.d/30_lease.config
08/09/16 13:54:48 /etc/condor/config.d/35_cms.config
08/09/16 13:54:48 /etc/condor/config.d/40_ccb.config
08/09/16 13:54:48 /etc/condor/condor_config.local
08/09/16 13:54:48 config Macros = 153, Sorted = 153, StringBytes = 5980, TablesBytes = 5604
08/09/16 13:54:48 CLASSAD_CACHING is ENABLED
08/09/16 13:54:48 Daemon Log is logging: D_ALWAYS D_ERROR
08/09/16 13:54:48 Daemoncore: Listening at <10.0.2.15:29199> on TCP (ReliSock).
08/09/16 13:54:48 DaemonCore: command socket at <10.0.2.15:29199?addrs=10.0.2.15-29199&noUDP>
08/09/16 13:54:48 DaemonCore: private command socket at <10.0.2.15:29199?addrs=10.0.2.15-29199>
08/09/16 13:55:09 CCBListener: registered with CCB server lcggwms02.gridpp.rl.ac.uk:9623 as ccbid 130.246.180.120:9623#521719
08/09/16 13:55:10 HibernationSupportedStates invalid '' in ad from hibernation plugin /usr/libexec/condor/condor_power_state
08/09/16 13:55:10 VM-gahp server reported an internal error
08/09/16 13:55:10 VM universe will be tested to check if it is available
08/09/16 13:55:10 History file rotation is enabled.
08/09/16 13:55:10 Maximum history file size is: 20971520 bytes
08/09/16 13:55:10 Number of rotated history files is: 2
08/09/16 13:55:10 Allocating auto shares for slot type 0: Cpus: auto, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1.000000, Memory: 1500, Swap: 50.00%, Disk: 50.00%
slot type 0: Cpus: 1.000000, Memory: 1500, Swap: 50.00%, Disk: 50.00%
08/09/16 13:55:10 slot1: New machine resource allocated
08/09/16 13:55:10 Setting up slot pairings
08/09/16 13:55:10 slot2: New machine resource allocated
08/09/16 13:55:10 Setting up slot pairings
08/09/16 13:55:10 CronJobList: Adding job 'mips'
08/09/16 13:55:10 CronJobList: Adding job 'kflops'
08/09/16 13:55:10 CronJob: Initializing job 'mips' (/usr/libexec/condor/condor_mips)
08/09/16 13:55:10 CronJob: Initializing job 'kflops' (/usr/libexec/condor/condor_kflops)
08/09/16 13:55:10 slot1: State change: IS_OWNER is false
08/09/16 13:55:10 slot1: Changing state: Owner -> Unclaimed
08/09/16 13:55:10 State change: RunBenchmarks is TRUE
08/09/16 13:55:10 slot1: Changing activity: Idle -> Benchmarking
08/09/16 13:55:10 BenchMgr:StartBenchmarks()
08/09/16 13:55:10 slot2: State change: IS_OWNER is false
08/09/16 13:55:10 slot2: Changing state: Owner -> Unclaimed
08/09/16 13:55:10 State change: RunBenchmarks is TRUE
08/09/16 13:55:10 slot2: Changing activity: Idle -> Benchmarking
08/09/16 13:55:10 slot2: Changing activity: Benchmarking -> Idle
08/09/16 13:55:28 State change: benchmarks completed
08/09/16 13:55:28 slot1: Changing activity: Benchmarking -> Idle
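
A log like the one above is easier to compare between runs once the slot lines are pulled out. The following is a rough sketch, not an official tool: the regular expressions are written against the line shapes visible in this StartLog only, and the function name `summarize_startlog` is made up for illustration.

```python
import re

# Matches the numeric allocation lines, e.g.
# "slot type 0: Cpus: 1.000000, Memory: 1500, Swap: 50.00%, Disk: 50.00%"
# (the "Cpus: auto" line is skipped because "auto" is not a number).
SLOT_TYPE_RE = re.compile(r"slot type (\d+): Cpus: ([\d.]+), Memory: (\d+)")
# Matches e.g. "slot1: Changing activity: Benchmarking -> Idle"
CHANGE_RE = re.compile(r"(slot\d+): Changing (state|activity): (\w+) -> (\w+)")


def summarize_startlog(text):
    """Return (slot allocations, last known state/activity per slot)."""
    allocations = [
        (int(slot_type), float(cpus), int(memory))
        for slot_type, cpus, memory in SLOT_TYPE_RE.findall(text)
    ]
    last = {}
    for slot, kind, _old, new in CHANGE_RE.findall(text):
        last.setdefault(slot, {})[kind] = new
    return allocations, last


SAMPLE = """\
08/09/16 13:55:10 Allocating auto shares for slot type 0: Cpus: auto, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1.000000, Memory: 1500, Swap: 50.00%, Disk: 50.00%
slot type 0: Cpus: 1.000000, Memory: 1500, Swap: 50.00%, Disk: 50.00%
08/09/16 13:55:10 slot1: Changing state: Owner -> Unclaimed
08/09/16 13:55:10 slot1: Changing activity: Idle -> Benchmarking
08/09/16 13:55:10 slot2: Changing state: Owner -> Unclaimed
08/09/16 13:55:10 slot2: Changing activity: Idle -> Benchmarking
08/09/16 13:55:10 slot2: Changing activity: Benchmarking -> Idle
08/09/16 13:55:28 slot1: Changing activity: Benchmarking -> Idle
"""

ALLOCATIONS, LAST = summarize_startlog(SAMPLE)
print(ALLOCATIONS)
print(LAST)
```

In this excerpt both slots end up Unclaimed and Idle, i.e. the log stops before any job has been claimed by either slot.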

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,653
RAC: 2
Message 4034 - Posted: 9 Aug 2016, 12:23:24 UTC - in response to Message 4033.  

I'll use an app_config now with RAM set to 3072 MB.

It immediately started with 2 cmsRun processes.

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 4035 - Posted: 9 Aug 2016, 14:19:40 UTC - in response to Message 4034.  

The memory is divided by the number of slots, and there is probably a memory requirement in the job description that is used by the matchmaking. Memory scaling is required, but I am still waiting to hear back on how to do this for a project with multiple applications.
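
To make the arithmetic concrete, here is a small sketch of the reasoning, under stated assumptions. It assumes the VM's memory is split evenly across slots and that a job matches only if one slot's share covers the job's memory request. The request value of 1400 MB is purely hypothetical (the real figure is not given in this thread), and the memory Condor detects inside the VM may differ from the configured value.

```python
def memory_per_slot(vm_memory_mb: float, slots: int) -> float:
    """Split the VM's memory evenly across its slots."""
    if slots < 1:
        raise ValueError("slots must be at least 1")
    return vm_memory_mb / slots


def slot_can_match(vm_memory_mb: float, slots: int, job_request_mb: float) -> bool:
    """True if one slot's share of memory covers the job's request."""
    return memory_per_slot(vm_memory_mb, slots) >= job_request_mb


# Hypothetical job memory request; the real value is not given in this thread.
ASSUMED_JOB_REQUEST_MB = 1400

# (VM memory in MB, number of cores/slots) as reported by testers in this thread.
for vm_mb, cores in [(2048, 2), (3072, 2), (5632, 4)]:
    share = memory_per_slot(vm_mb, cores)
    ok = slot_can_match(vm_mb, cores, ASSUMED_JOB_REQUEST_MB)
    print(f"{cores} cores, {vm_mb} MB -> {share:.0f} MB per slot, match: {ok}")
```

With a request somewhere in that range, a dual-core VM at the 2048 MB default would get no jobs while the same VM at 3072 MB would, which is at least consistent with what the testers report.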

Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,653
RAC: 2
Message 4036 - Posted: 9 Aug 2016, 16:22:06 UTC - in response to Message 4035.  

The memory is divided by the number of slots and there is probably a memory requirement in the job description...

2 cmsRun processes running concurrently in 1 VM use almost 3 GB of memory, but no swap is used at all.

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4045 - Posted: 10 Aug 2016, 10:56:42 UTC

I got a 4-core VM started with 5632 MB of memory.

First it started 2 jobs, and 20 min later another two.

Is that delay deliberate?

Rasputin42
Volunteer tester
Joined: 16 Aug 15
Posts: 965
Credit: 1,201,500
RAC: 5
Message 4064 - Posted: 12 Aug 2016, 9:06:52 UTC
Last modified: 12 Aug 2016, 9:14:35 UTC

I have a 4-core task running where slot 2 does not start a new job, even though the 12h mark has not been reached.

EDIT: Has the cutoff time been changed? I now have only 1 job running!