Message boards : ALICE Application : The ALICE Application

Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 329,449
RAC: 199
Message 4353 - Posted: 23 Nov 2016, 18:18:23 UTC - in response to Message 4352.  

11/23/16 15:39:11 (pid:4115) DockerProc::Detect()
11/23/16 15:39:11 (pid:4115) DOCKER is undefined.
11/23/16 15:39:11 (pid:4115) DockerAPI::detect() failed to detect the Docker version; assuming absent.


Starter.log..


Task hangs at stderr.txt --> " HTCondor ping --->0"

Tested several times---ideas?


There are no jobs due to a problem with our HTCondor servers. Things were running well, but the server was running out of memory due to the number of jobs. Tomorrow we will add another, bigger machine.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 4354 - Posted: 23 Nov 2016, 18:42:28 UTC - in response to Message 4353.  

Thanks for the info.
Much appreciated.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 4357 - Posted: 24 Nov 2016, 19:23:45 UTC
Last modified: 24 Nov 2016, 20:00:56 UTC

Some observations:

It uses very little memory: about 1 GB for a 4-core task. Disk space is about 0.75 GB per task, plus 450 MB for the image file.

However, the startup/shutdown of each job takes about 2.5 minutes.
Compared to the run time of about 15 to 20 min (in my case), it seems very wasteful.
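
As a rough illustration of the overhead described above, the following sketch computes the share of wall-clock time spent on startup/shutdown. It assumes the 15 to 20 min run time does not already include the 2.5 min of overhead, which the post does not state; the function name is hypothetical.

```python
def overhead_fraction(overhead_min: float, runtime_min: float) -> float:
    """Share of total wall-clock time spent on startup/shutdown.

    Assumption: runtime_min excludes the startup/shutdown overhead.
    """
    return overhead_min / (runtime_min + overhead_min)


# Figures quoted in the post: 2.5 min overhead, 15 to 20 min run time.
for runtime in (15.0, 20.0):
    share = overhead_fraction(2.5, runtime)
    print(f"{runtime:.0f} min job: {share:.1%} overhead")
# prints:
# 15 min job: 14.3% overhead
# 20 min job: 11.1% overhead
```

Under that assumption, roughly 11% to 14% of each job's wall-clock time goes to startup and shutdown.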

These jobs are too short to run efficiently on multi-core tasks.

Will see how they behave as single-core tasks.

"Finished_x.log" and "running.log" do not contain any info (just dummy argument).

BTW the ...
Guest Log: [INFO] Job finished in slot2 with unknown exit code.
in the stderr.txt is not particularly helpful, either.
Rasputin42
Volunteer tester

Joined: 16 Aug 15
Posts: 966
Credit: 1,211,816
RAC: 0
Message 4358 - Posted: 24 Nov 2016, 20:45:20 UTC

Efficiency is very bad on single-core tasks as well.

It is about 66%. For about 1/3 of the job duration the CPU is below 5%.

There is no significant disk or network activity, either.
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1185
Credit: 825,079
RAC: 1,059
Message 4359 - Posted: 24 Nov 2016, 21:30:14 UTC

I did a short test: 2 jobs done on a single-core VM.
Run time 57 min 51 sec
CPU time 47 min 14 sec
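
For reference, the CPU efficiency implied by these two figures can be worked out as in this minimal sketch. It assumes efficiency is simply CPU time divided by run time on a single core; the helper names are illustrative only.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'min sec' pair to whole seconds."""
    return minutes * 60 + seconds


def cpu_efficiency(cpu_s: int, wall_s: int) -> float:
    """CPU time as a fraction of wall-clock run time (single core assumed)."""
    return cpu_s / wall_s


run_time = to_seconds(57, 51)  # 3471 s
cpu_time = to_seconds(47, 14)  # 2834 s
print(f"CPU efficiency: {cpu_efficiency(cpu_time, run_time):.1%}")
# prints: CPU efficiency: 81.6%
```

That works out to roughly 82% for this two-job test.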

http://lhcathomedev.cern.ch/vLHCathome-dev/result.php?resultid=288738

I noticed that the process 'bc' is running several times during 1 job.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 329,449
RAC: 199
Message 4360 - Posted: 24 Nov 2016, 22:16:57 UTC - in response to Message 4357.  

These are just test jobs to exercise the system. ALICE is currently busy with the p-Pb heavy-ion run, so it will not be able to make any progress until next year.
maeax

Joined: 22 Apr 16
Posts: 671
Credit: 1,881,299
RAC: 7,218
Message 4884 - Posted: 3 May 2017, 7:56:42 UTC
Last modified: 3 May 2017, 7:58:19 UTC

BOINC 7.7.2 - VirtualBox 5.1.22

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=324669

maxCPUs=1 and maxJobs=1

206 (0x000000CE) EXIT_INIT_FAILURE

Failed after 11 min of run time. RDP (console) was shown.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 329,449
RAC: 199
Message 4893 - Posted: 4 May 2017, 12:11:21 UTC - in response to Message 4884.  

ALICE will be taking a rest for a while.
Ray Murray

Joined: 13 Apr 15
Posts: 138
Credit: 2,945,852
RAC: 0
Message 5471 - Posted: 26 Jul 2018, 9:14:30 UTC

I have a large number of OLD ALICE, ATLAS, LHCb, Benchmark, etc., results still listed under my Tasks. Some of these are over 2 years old, so surely they can't still be useful. Is it not about time for a purge of these to free up some server space? I know some purges in the past haven't gone well and have been overaggressive, but perhaps a 6-month limit (?) might allow enough records for task comparison by the user and remove redundant tasks.
Magic Quantum Mechanic

Joined: 8 Apr 15
Posts: 751
Credit: 11,610,444
RAC: 1,210
Message 5472 - Posted: 26 Jul 2018, 15:18:47 UTC - in response to Message 5471.  

Laurence will be back around August 10
Mad Scientist For Life


©2024 CERN