New version v3.12
Author | Message |
---|---|
Joined: 12 Sep 14 Posts: 1064 Credit: 327,073 RAC: 133 |
This version updates the CVMFS configuration to include the new repository /cvmfs/alice.cern.ch, from which we plan to use some software and data. |
Joined: 8 Apr 15 Posts: 750 Credit: 11,603,490 RAC: 1,713 |
Thanks Laurence Mad Scientist For Life |
Joined: 10 Mar 17 Posts: 40 Credit: 108,345 RAC: 0 |
How can we check that it actually works? The log does not show any hint that the alice CVMFS repository is used:
2018-11-27 11:18:51 (7412): Guest Log: [DEBUG] VM is running outside WLCG
2018-11-27 11:19:54 (7412): Guest Log: [DEBUG] Probing CVMFS ...
2018-11-27 11:19:55 (7412): Guest Log: Probing /cvmfs/grid.cern.ch... OK
2018-11-27 11:19:56 (7412): Guest Log: VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
2018-11-27 11:19:56 (7412): Guest Log: 2.4.4.0 3518 1 25332 7789 3 1 7409 10240001 2 65024 0 3 100 0 0 http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 1
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2740709 |
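One way to answer this mechanically is to pull the probe results out of the Guest Log. The following is a minimal Python sketch, not part of the Theory application: the helper name `probed_repositories` and the regular expression are illustrative assumptions, and the sample text is just two lines copied from the log above. It only reports what the log says; a repository that is absent from the probe output is not thereby proven missing from the configuration.

```python
import re

# Two lines copied from the Guest Log quoted above.
LOG = """\
2018-11-27 11:19:54 (7412): Guest Log: [DEBUG] Probing CVMFS ...
2018-11-27 11:19:55 (7412): Guest Log: Probing /cvmfs/grid.cern.ch... OK
"""

# Matches "Probing /cvmfs/<repository>... <status>" (pattern is an assumption
# based on the single probe line visible in the log).
PROBE_RE = re.compile(r"Probing /cvmfs/(\S+?)\.\.\. (\w+)")


def probed_repositories(log_text):
    """Return {repository: status} for every probe line found in log_text."""
    return {m.group(1): m.group(2) for m in PROBE_RE.finditer(log_text)}


repos = probed_repositories(LOG)
print(repos)
print("alice.cern.ch probed:", "alice.cern.ch" in repos)
```

Run against a full stderr log, the same function would list every repository the probe step reported.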
Joined: 22 Apr 16 Posts: 664 Credit: 1,791,620 RAC: 3,116 |
Laurence wrote that they PLAN TO USE ALICE software, so we have to wait. |
Joined: 8 Apr 15 Posts: 750 Credit: 11,603,490 RAC: 1,713 |
Well, so far not much luck with this version. I do have one that has been running for over an hour, but I watched this happen just now:
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2740720
It was running faster than these VBs usually do, and when it got up to about 98% in 11 minutes it suddenly went to *computer error*:
Error in host info for VM: -182
WARNING: Communication with VM Hypervisor failed. (Possibly Out of Memory).
2018-11-27 12:42:51 (7260): WARNING: Communication with VM Hypervisor failed.
2018-11-27 12:42:51 (7260): ERROR: VBoxManage list hostinfo failed
I know it wasn't a memory problem: the same host is running a v3.11 task with no problem that is almost finished (over 18 hours), and it is having no problem with the 2-core Theory over at LHC. I just started another 3.12, so I will see what happens. (I updated to the newest versions of Oracle VB and BOINC before I downloaded this new v3.12 last night.)
Mad Scientist For Life |
Joined: 22 Apr 16 Posts: 664 Credit: 1,791,620 RAC: 3,116 |
Have two running (one for 12 hours and one for 2 hours). VB 5.2.22 and BOINC 7.14.2. Will see tomorrow how they end. |
Joined: 8 Apr 15 Posts: 750 Credit: 11,603,490 RAC: 1,713 |
"Have two running (one since 12 hours and one since 2 hours)."
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2740694
It looks like your first one worked, Axel. I should have one before 3am, so I will watch for that. |
Joined: 13 Feb 15 Posts: 1180 Credit: 815,336 RAC: 431 |
"(I just updated to the newest version of Oracle VB and Boinc before I d/l this new v 3.12 last night)"
Did you also install the right extension pack, Oracle_VM_VirtualBox_Extension_Pack-5.2.22? |
Joined: 8 Apr 15 Posts: 750 Credit: 11,603,490 RAC: 1,713 |
"(I just updated to the newest version of Oracle VB and Boinc before I d/l this new v 3.12 last night)"
I always install the extension pack (for all 8 years) and reboot.
https://www.virtualbox.org/wiki/Downloads
Mad Scientist For Life |
Joined: 13 Feb 15 Posts: 1180 Credit: 815,336 RAC: 431 |
"Always do install the extension pack (all 8 years) .......and reboot"
OK, I just asked because I saw this in your result: "Error in guest additions for VM: -182"
BTW: my first and only task was a success, just in time before the 18-hour limit would have kicked the task out.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2740707 |
Joined: 22 Apr 16 Posts: 664 Credit: 1,791,620 RAC: 3,116 |
"Have two running (one since 12 hours and one since 2 hours)."
Both ended successfully: one in 17:30 hours and one in 13:30 hours. Neither was shut down by the server after 18:30. A new one is running now. |
Joined: 10 Mar 17 Posts: 40 Credit: 108,345 RAC: 0 |
"Laurence wrote,"
He also wrote that THIS version updates the CVMFS configuration to include the alice repository... |
Joined: 8 Apr 15 Posts: 750 Credit: 11,603,490 RAC: 1,713 |
"Always do install the extension pack (all 8 years) .......and reboot"
"OK, just thought because I saw this in your result: 'Error in guest additions for VM: -182'"
Yeah, I think it was just one of those *VB* things that can happen, CP. I have 5 that are running like they should (on 3 different PCs). I looked at their current logs; it is 2:22am here, and they will be done by the time I get up later today. |
Joined: 22 Apr 16 Posts: 664 Credit: 1,791,620 RAC: 3,116 |
Hi Gyllic, yes, you are right. It is always -dev.... |
Joined: 22 Apr 16 Posts: 664 Credit: 1,791,620 RAC: 3,116 |
"Have two running (one since 12 hours and one since 2 hours)."
Sorry, the one with 17:30 was from the old software v3.11. The v3.12 one ended just now, after 16:45. I have only ONE CPU with ONE task. There are some Sherpas running well in these tasks. Could it be that Sherpas run better with ONE CPU than with more CPUs?
https://lhcathomedev.cern.ch/lhcathome-dev/workunit.php?wuid=1865657 |
Joined: 13 Feb 15 Posts: 1180 Credit: 815,336 RAC: 431 |
"There are some Sherpa's running well in this tasks."
Theory with 2 CPUs will run 2 jobs. Quite often a Sherpa job will run longer than the VM lifetime (18 hrs) or even loop endlessly. With 2 CPUs configured, the second job has probably been finished for hours by then, leaving a core idle. In former days (T4T), Theory used the 2nd core for the same job, which sped the job up, but never to 200%. A large part of a Sherpa job is initializing and optimizing before the real event processing starts. |
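The idle-core effect described above can be put into numbers. This is a rough Python sketch under stated assumptions: the 18 h and 6 h job durations are made-up illustration values, the function name `idle_core_hours` is hypothetical, and it assumes that no new job is started on a core once its job has finished.

```python
def idle_core_hours(job_hours, vm_lifetime_hours=18.0):
    """Return (idle core-hours, utilization) for one job per core.

    Assumes the VM stays up until its longest job ends or the lifetime
    limit is reached, and that a finished core is not given new work.
    """
    # Each job is cut off at the VM lifetime limit.
    busy = [min(h, vm_lifetime_hours) for h in job_hours]
    wall = max(busy)              # how long the VM stays up
    total = wall * len(busy)      # core-hours available in that time
    idle = total - sum(busy)
    return idle, sum(busy) / total


# Hypothetical case: a Sherpa job runs into the 18 h limit while the
# second job finishes after 6 h.
idle, utilization = idle_core_hours([18.0, 6.0])
print(f"idle core-hours: {idle:.1f}, utilization: {utilization:.0%}")
```

With a single core configured there is no second job to wait for, so under these assumptions the idle time is zero by construction.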
Joined: 28 Jul 16 Posts: 473 Credit: 389,411 RAC: 62 |
I also got a task from Theory Simulation v3.12 that ran overnight and finished successfully after nearly 14 h.
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2740701
My proxy logs show that there were no requests to a repository like "alice.cern.ch" or similar. There were only 2 VM internal logs that mentioned ALICE, but the files were fetched from the sft repository.
Log snippet:
===> [runRivet] Tue Nov 27 23:39:32 CET 2018 [boinc pp mb-inelastic 7000 - - pythia6 6.428 z2 100000 6]
Setting environment...
MCGENERATORS=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c
gcc = /cvmfs/sft.cern.ch/lcg/external/gcc/4.8.4/x86_64-slc6/bin/gcc
gcc version = 4.8.4
RIVET=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c/rivet/2.5.4/x86_64-slc6-gcc48-opt
RIVET_REF_PATH=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c/rivet/2.5.4/x86_64-slc6-gcc48-opt/share/Rivet
RIVET_ANALYSIS_PATH=/var/lib/condor/execute/dir_14006/analyses
GSL=/cvmfs/sft.cern.ch/lcg/external/GSL/1.10/x86_64-slc6-gcc48-opt
HEPMC=/cvmfs/sft.cern.ch/lcg/external/HepMC/2.06.08/x86_64-slc6-gcc48-opt
FASTJET=/cvmfs/sft.cern.ch/lcg/external/fastjet/3.0.3/x86_64-slc6-gcc48-opt
PYTHON=/cvmfs/sft.cern.ch/lcg/external/Python/2.7.4/x86_64-slc6-gcc48-opt
ROOTSYS=/cvmfs/sft.cern.ch/lcg/app/releases/ROOT/5.34.26/x86_64-slc6-gcc48-opt/root
Input parameters: mode=boinc beam=pp process=mb-inelastic energy=7000 params=- specific=- generator=pythia6 version=6.428 tune=z2 nevts=100000 seed=6
Prepare temporary directories and files ...
workd=/var/lib/condor/execute/dir_14006
tmpd=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM
tmp_params=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/generator.params
tmp_hepmc=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/generator.hepmc
tmp_yoda=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/generator.yoda
tmp_jobs=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/jobs.log
tmpd_flat=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/flat
tmpd_dump=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/dump
tmpd_html=/var/lib/condor/execute/dir_14006/tmp/tmp.534ZLYPmUM/html
Prepare Rivet parameters ...
analysesNames=ALICE_2010_S8625980 ALICE_2012_I1116147 ALICE_2012_I1181770 ALICE_2014_I1300380 ALICE_2015_I1357424 ATLAS_2010_S8918562 ATLAS_2011_I894867 ATLAS_2012_I1084540 ATLAS_2012_I1183818 ATLAS_2014_I1282441 CMS_2012_I1184941 CMS_2012_I1193338 CMS_2013_I1218372 LHCB_2011_I917009 LHCB_2011_I919315 LHCB_2012_I1119400 LHCB_2013_I1218996 LHCF_2012_I1115479 MC_GAPS TOTEM_2012_I1115294
Unpack data histograms...
dataFiles = /cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c/rivet/2.5.4/x86_64-slc6-gcc48-opt/share/Rivet/ALICE_2010_S8625980.yoda /cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c/rivet/2.5.4/x86_64-slc6-gcc48-opt/share/Rivet/ALICE_2012_I1116147.yoda /cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c/rivet/2.5.4/x86_64-slc6-gcc48-opt/share/Rivet/ALICE_2012_I1181770.yoda /cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c/rivet/2.5.4/x86_64-slc6-gcc48-opt/share/Rivet/ALICE_2014_I1300380.yoda
etc. etc. etc. |
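The same check can be done on the VM internal log by counting which `/cvmfs/<repository>` prefixes appear in it. This is a small Python sketch, assuming only that repository names are the first path component after `/cvmfs/`; the function name `repository_counts` is hypothetical and the sample is four lines taken from the snippet above. It also shows that an analysis name containing "ALICE" does not imply a path under the alice repository.

```python
import re
from collections import Counter

# Four lines copied from the runRivet log snippet above.
SNIPPET = """\
MCGENERATORS=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_lcgcmt67c
gcc = /cvmfs/sft.cern.ch/lcg/external/gcc/4.8.4/x86_64-slc6/bin/gcc
RIVET_ANALYSIS_PATH=/var/lib/condor/execute/dir_14006/analyses
analysesNames=ALICE_2010_S8625980 ALICE_2012_I1116147
"""

# The repository name is taken to be the path component right after /cvmfs/.
REPO_RE = re.compile(r"/cvmfs/([^/\s]+)")


def repository_counts(text):
    """Count how often each CVMFS repository is referenced in text."""
    return Counter(REPO_RE.findall(text))


counts = repository_counts(SNIPPET)
print(dict(counts))
print("references to alice.cern.ch:", counts["alice.cern.ch"])
```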
Joined: 12 Sep 14 Posts: 1064 Credit: 327,073 RAC: 133 |
Thanks for all the feedback. It seems OK, so I will put it on the production server tomorrow. You will not be able to see anything related to the alice repository in the output of the tasks. |
Joined: 8 Apr 15 Posts: 750 Credit: 11,603,490 RAC: 1,713 |
I have some single-core and some 2-core tasks running (3 valid now) and 7 more running right now with no problems. |
Joined: 12 Sep 14 Posts: 1064 Credit: 327,073 RAC: 133 |
Now available on prod. Thanks for testing and the feedback. |
©2024 CERN