Errors in log
Author | Message |
---|---|
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Please look at this. It seems a process is interrupted when the event computation starts, and the event output is then interrupted in turn by the original process.

    CKIN(1) changed from 2.00000 to 0.00000
    CKIN(2) changed from -1.00000 to 7000.00000
    MSTJ(22) changed from 1 to 2
    PARJ(71) changed from 10.00000 to 10.00000
    ******************************************************************************
    * *
    * PYTUNE : Presets for underlying-event (and min-bias) *
    * Last Change : Mar 2011 - P. Skands *
    * *
    * 324 Perugia NOCR *
    * Tuned by P. Skands, hep-ph/1005.3457 *
    * Physics Model: T. Sjostrand & P. Skands, hep-ph/0408302 *
    * CR by M. Sandhoff & P. Skands, in hep-ph/0604120 *
    * LEP parameters tuned by Professor, hep-ph/0907.2973 *
    * *
    * MSTP(51) = 7 PDF set *
    * MSTP(52) = 1 PDF set internal (=1) or pdflib (=2) *
    * MSTP( 3) = 2 INT switch for choice of LambdaQCD *
    * PARJ(81) = 0.2570 FSR LambdaQCD (inside resonance decays) *
    * MSTP(64) = 3 ISR alphaS type *
    * PARP(64) = 1.0000 ISR renormalization scale prefactor *
    * MSTP(67) = 2 ISR coherence option for 1st emission *
    * MSTP(68) = 3 ISR phase space choice & ME corrections *
    * (Note: MSTP(68) is not explicitly (re-)set by PYTUNE) *
    * PARP(67) = 1.0000 ISR Q2max factor *
    * MSTP(72) = 1 IFSR scheme for non-decay FSR *
    * PARP(71) = 2.0000 IFSR Q2max factor in non-s-channel procs *
    * MSTP(70) = 2 ISR IR regularization scheme *
    * PARJ(82) = 0.8000 FSR IR cutoff *
    * MSTP(33) = 0 "K" switch for K-factor on/off & type *
    * MSTP(81) = 21 UE model *
    * PARP(82) = 1.9500 UE IR cutoff at reference ecm *
    * (Note: PARP(82) replaces PARP(62).) *
    * PARP(89) = 1800.0000 UE IR cutoff reference ecm *
    * PARP(90) = 0.2400 UE IR cutoff ecm scaling power *
    * MSTP(82) = 5 UE hadron transverse mass distribution *
    * PARP(83) = 1.8000 UE mass distribution parameter *
    * MSTP(88) = 0 BR composite scheme *
    * MSTP(89) = 2 BR color scheme *
    * PARP(79) = 2.0000 BR composite x enhancement *
    * PARP(80) = 0.0100 BR breakup suppression *
    * MSTP(91) = 1 BR primordial kT distribution *
    * PARP(91) = 2.0000 BR primordial kT width <|kT|> *
    * PARP(93) = 10.0000 BR primordial kT UV cutoff *
    * MSTP(95) = 0 FSI color (re-)connection model *
    * ---------------------------------------------------------------------- *
    * MSTJ(11) = 5 HAD choice of fragmentation function(s) *
    * PARJ( 1) = 0.0730 HAD diquark suppression *
    * PARJ( 2) = 0.2000 HAD strangeness suppression *
    * PARJ( 3) = 0.9400 HAD strange diquark suppression *
    #--------------------------------------------------------------------------
    # FastJet release 3.0.3
    # M. Cacciari, G.P. Salam and G. Soyez
    # A software package for jet finding and analysis at colliders
    # http://fastjet.fr
    #
    # Please cite EPJC72(2012)1896 [arXiv:1111.6097] if you use this package
    # for scientific work and optionally PLB641(2006)57 [hep-ph/0512210].
    #
    # FastJet is provided without warranty under the terms of the GNU GPLv2.
    # It uses T. Chan's closest pair algorithm, S. Fortune's Voronoi code
    # and 3rd party plugin jet algorithms. See COPYING file for details.
    #--------------------------------------------------------------------------
    100 events processed
    200 events processed
    300 events processed
    400 events processed
    500 events processed
    600 events processed
    700 events processed
    800 events processed
    Updating display...
    Display update finished (0 histograms, 0 events).
    900 events processed
    1000 events processed
    dumping histograms...
    1100 events processed
    1200 events processed
    1300 events processed
    . . .
    14100 events processed
    14200 events processed.
    14300 events processed
    14400 events processed
    14500 events processed
    14600 events processed
    14700 events processed
    14800 events processed
    14900 events processed
    15000 events processed
    dumping histograms...
    Updating display...
    15100 events processed
    Display update finished (127 histograms, 15000 events).
    15200 events processed
    15300 events processed
    15400 events processed
    15500 events processed
    15600 events processed
    * PARJ( 4) = 0.0320 HAD vector diquark suppression *
    * PARJ( 5) = 0.5000 HAD P(popcorn) *
    * PARJ( 6) = 0.5000 HAD extra popcorn B(s)-M-B(s) supp *
    * PARJ( 7) = 0.5000 HAD extra popcorn B-M(s)-B supp *
    * PARJ(11) = 0.3100 HAD P(vector meson), u and d only *
    * PARJ(12) = 0.4000 HAD P(vector meson), contains s *
    * PARJ(13) = 0.5400 HAD P(vector meson), heavy quarks *
    * PARJ(21) = 0.3130 HAD fragmentation pT *
    * PARJ(25) = 0.6300 HAD eta0 suppression *
    * PARJ(26) = 0.1200 HAD eta0' suppression *
    * PARJ(41) = 0.4900 HAD string parameter a(Meson) *
    * PARJ(42) = 1.2000 HAD string parameter b *
    * PARJ(45) = 0.5000 HAD string a(Baryon)-a(Meson) *
    * PARJ(46) = 1.0000 HAD Lund(=0)-Bowler(=1) rQ (rc) *
    * PARJ(47) = 1.0000 HAD Lund(=0)-Bowler(=1) rb *
    * *
    ******************************** END OF PYTUNE *******************************
    MSTP(5) changed from 0 to 0
    1****************** PYINIT: initialization of PYTHIA routines *****************
    ==============================================================================
    I I
    I PYTHIA will be initialized for a p+ on p+ collider I
    I at 7000.000 GeV center-of-mass energy I
    I I
    ==============================================================================
    ******** PYMAXI: summary of differential cross-section maximum search ********
    ==========================================================
    I I I
    I ISUB Subprocess name I Maximum value I
    I I I
    ==========================================================
    I I I
    I 11 f + f' -> f + f' (QCD) I 2.3418D-04 I
    I 12 f + fbar -> f' + fbar' I 1.7659D-06 I
    I 13 f + fbar -> g + g I 1.9398D-06 I
    I 28 f + g -> f + g I 1.4489D-03 I
    I 53 g + g -> f + fbar I 1.7145D-05 I
    I 68 g + g -> g + g I 5.9918D-04 I
    I 96 Semihard QCD 2 -> 2 I 1.1214D+04 I
    I I I
    ==========================================================
    ****** PYMULT: initialization of multiple interactions for MSTP(82) = 5 ******
    pT0 = 2.70 GeV gives sigma(parton-parton) = 5.33D+02 mb: accepted
    ****** PYMIGN: initialization of multiple interactions for MSTP(82) = 5 ******
    pT0 = 2.70 GeV gives sigma(parton-parton) = 2.14D+02 mb: accepted
    ********************** PYINIT: initialization completed **********************
    Error type 4 has occured after 36 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Error type 4 has occured after 628 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Error type 4 has occured after 649 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Error type 4 has occured after 862 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Error type 4 has occured after 1004 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Advisory warning type 9 given after 1138 PYEXEC calls: (PYPTIS:) Sorry, I got a heavy companion quark here. Not handled yet, giving up!
    Advisory warning type 9 given after 1522 PYEXEC calls: (PYPTIS:) Sorry, I got a heavy companion quark here. Not handled yet, giving up!
    Error type 4 has occured after 1797 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Error type 4 has occured after 1839 PYEXEC calls: (PYSTRF:) caught in infinite loop
    Advisory warning type 9 given after 1891 PYEXEC calls: (PYPTIS:) Sorry, I got a heavy companion quark here. Not handled yet, giving up! |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
This looks like an application failure. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Do you need more details to fix this? |
Send message Joined: 4 Mar 16 Posts: 31 Credit: 44,320 RAC: 0 |
We cannot fix this kind of error. It should be reported to one of the contacts listed in the Support section at the bottom of this website: http://mcplots.cern.ch/ |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Thanks. If there is a problem, isn't it up to the project admin to fix it, or to report it to someone who can? |
Send message Joined: 12 Sep 14 Posts: 1069 Credit: 334,882 RAC: 0 |
Yes and no. At the moment we are focusing on the infrastructure, so the job is a black box. It doesn't really matter whether it works or not; what matters is that we can execute it and see those errors in a log file. As Ivan pointed out in another thread, these jobs and the code are created by scientists, and sometimes they make mistakes. In this case it looks like the code does not yet handle the situation that the random event produced: "Sorry, I got a heavy companion quark here. Not handled yet, giving up!" These are real Test4Theory jobs, so you should see similar issues with jobs in the production project. Also, failed jobs like this may not really be 'failed' jobs: if you count the number of failed jobs of this type, you get statistics on how frequently those events occur. |
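The counting idea in the post above can be tried on a single log. What follows is only a rough sketch, not part of the Theory application or of any project tooling: it assumes the messages always follow the two shapes seen in the log earlier in this thread, and the names `MESSAGE` and `count_messages` are made up for illustration.

```python
import re
from collections import Counter

# Matches messages shaped like the ones in the posted log, e.g.
#   "Error type 4 has occured after 36 PYEXEC calls: (PYSTRF:) caught in infinite loop"
#   "Advisory warning type 9 given after 1138 PYEXEC calls: (PYPTIS:) Sorry, ..."
# The spelling "occured" is what the log itself prints, so it is kept here.
MESSAGE = re.compile(
    r"(Error|Advisory warning) type (\d+) (?:has occured|given) "
    r"after (\d+) PYEXEC calls: \((\w+):\)"
)


def count_messages(log_text):
    """Return a Counter keyed by (kind, type number, routine name)."""
    counts = Counter()
    for kind, number, _calls, routine in MESSAGE.findall(log_text):
        counts[(kind, int(number), routine)] += 1
    return counts


# Three lines copied from the log posted in this thread.
sample = """\
Error type 4 has occured after 36 PYEXEC calls: (PYSTRF:) caught in infinite loop
Error type 4 has occured after 628 PYEXEC calls: (PYSTRF:) caught in infinite loop
Advisory warning type 9 given after 1138 PYEXEC calls: (PYPTIS:) Sorry, I got a heavy companion quark here. Not handled yet, giving up!
"""

if __name__ == "__main__":
    for (kind, number, routine), n in sorted(count_messages(sample).items()):
        print(f"{kind} type {number} in {routine}: {n}")
```

Because the pattern does not rely on line breaks, it also works on a log that has been pasted as one long line. Dividing the counts by the number of events processed would give a rough frequency, but whether that number is physically meaningful is a question for the MCPlots people.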
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
I got this. There are a number of failed condor starts:

    04/11/16 16:18:40 Got activate_claim request from shadow (188.184.187.167)
    04/11/16 16:18:40 Remote job ID is 260339.0
    04/11/16 16:18:40 Got universe "VANILLA" (5) from request classad
    04/11/16 16:18:40 State change: claim-activation protocol successful
    04/11/16 16:18:40 Changing activity: Idle -> Busy
    04/11/16 16:18:41 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: DAEMON authorization policy contains no matching ALLOW entry for this request; identifiers used for this host: 10.0.2.15,10.0.2.15, hostname size = 1, original ip address = 10.0.2.15
    04/11/16 16:18:53 State change: benchmarks completed
    04/11/16 16:21:46 Called deactivate_claim_forcibly()
    04/11/16 16:21:46 Starter pid 4557 exited with status 0
    04/11/16 16:21:46 State change: starter exited
    04/11/16 16:21:46 Changing activity: Busy -> Idle
    04/11/16 16:21:47 Got activate_claim request from shadow (188.184.187.167)
    04/11/16 16:21:47 Remote job ID is 260340.0
    04/11/16 16:21:47 Got universe "VANILLA" (5) from request classad
    04/11/16 16:21:47 State change: claim-activation protocol successful
    04/11/16 16:21:47 Changing activity: Idle -> Busy
    04/11/16 16:21:48 PERMISSION DENIED to condor@localhost from host 10.0.2.15 for command 60008 (DC_CHILDALIVE), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
    04/11/16 16:24:15 Called deactivate_claim_forcibly()
    04/11/16 16:24:15 Starter pid 5159 exited with status 0
    04/11/16 16:24:15 State change: starter exited
    04/11/16 16:24:15 Changing activity: Busy -> Idle
    04/11/16 16:24:16 Got activate_claim request from shadow (188.184.187.167) |
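To see how long each claim in the log above actually stayed busy, the timestamps can be paired up. This is only a sketch under assumptions: it reads the timestamps as MM/DD/YY, it relies on the "Remote job ID is ..." and "Changing activity: ..." lines exactly as they appear in the post, and the names `EVENT` and `job_runtimes` are invented for this example. It measures durations only; it does not decide whether a start "failed".

```python
import re
from datetime import datetime

# Timestamp followed by one of the two message shapes we care about.
EVENT = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?:Remote job ID is (?P<job>[\d.]+)"
    r"|Changing activity: (?P<src>\w+) -> (?P<dst>\w+))"
)


def job_runtimes(log_text):
    """Return [(job_id, seconds)] for each Idle -> Busy -> Idle cycle."""
    job, started = None, None
    results = []
    for m in EVENT.finditer(log_text):
        # Assumption: the log uses MM/DD/YY dates.
        when = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
        if m.group("job"):
            job = m.group("job")
        elif (m.group("src"), m.group("dst")) == ("Idle", "Busy"):
            started = when
        elif (m.group("src"), m.group("dst")) == ("Busy", "Idle"):
            if started is not None:
                results.append((job, (when - started).total_seconds()))
            job, started = None, None
    return results


# A subset of the lines posted above.
sample = """\
04/11/16 16:18:40 Remote job ID is 260339.0
04/11/16 16:18:40 Changing activity: Idle -> Busy
04/11/16 16:21:46 Starter pid 4557 exited with status 0
04/11/16 16:21:46 Changing activity: Busy -> Idle
04/11/16 16:21:47 Remote job ID is 260340.0
04/11/16 16:21:47 Changing activity: Idle -> Busy
04/11/16 16:24:15 Starter pid 5159 exited with status 0
04/11/16 16:24:15 Changing activity: Busy -> Idle
"""

if __name__ == "__main__":
    for job_id, seconds in job_runtimes(sample):
        print(f"job {job_id}: busy for {seconds:.0f} s")
```

For the two jobs in the post this gives busy periods of about three minutes and about two and a half minutes, both ending with the starter exiting with status 0.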
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Console F2 now shows a number of "EXT4-fs error inode doubly allocated?" messages. This happens mostly with pythia8 jobs. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
console F2 now shows a number of "EXT4-fs error inode doubly allocated?" Those EXT4-fs errors and similar messages are displayed on whichever screen is active at the moment they pop up. All consoles are used like a 'System Console' (Console 0). |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
I have this only on F2. F3, for example, shows top (CPU stats). |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
I have this only on F2. It will also be displayed on ALT-F3, but because the top screen is refreshed every few seconds, you must be 'lucky' to catch such a message. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
In any case, I hope it will be fixed soon. |
Send message Joined: 4 Mar 16 Posts: 31 Credit: 44,320 RAC: 0 |
The PERMISSION DENIED message is related to the ALLOW_DAEMON rule; it is not a critical error. We are investigating. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
There are a large number of different "apps", e.g. Pythia 6.xxx and 8.xxx. Are they all doing more or less the same thing? They have quite different run-times (for the same number of events). Are you trying to find the best one, or is this mix going to stay as it is? |
Send message Joined: 4 Mar 16 Posts: 31 Credit: 44,320 RAC: 0 |
If the different "apps" belong to different job types (PYTHIA, Sherpa, ...), then it is not in our hands. Otherwise, can you please post a screenshot of the log? Thank you. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
OK, they are just different types of jobs. Is there a way to put the outcome of each job in the stderr of the result? That way we could monitor whether jobs are failing and how many are done. |
Send message Joined: 4 Mar 16 Posts: 31 Credit: 44,320 RAC: 0 |
A job application failure is different from a job failure. The former is not on our side (it can be reported to MCPlots support: http://mcplots.cern.ch/), and the job status will still be reported as successful. The latter can be monitored here: http://www.citizencyberscience.net/t4t-webapp/stats/ and http://mcplots-dev.cern.ch/production.php?view=status and maybe in some other user-related statistics. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Thanks, Leonardo. |
Send message Joined: 13 Feb 15 Posts: 1188 Credit: 861,475 RAC: 2 |
Ok, they are just different types of jobs. Jobs rarely fail. Of the 831 jobs you ran for vLHCathome on the machine you're testing here, 5 failed (about 0.6%): http://mcplots-dev.cern.ch/cache/stats/host-8879-82695.txt. As you know, Ben Segal wrote that the team is investigating a way to also report the Theory jobs from vLHCathome-dev to MCPlots. |
Send message Joined: 16 Aug 15 Posts: 966 Credit: 1,211,816 RAC: 0 |
Thanks Crystal. |
©2024 CERN