Name | cl1NDmoelcxn7Olcko1bjSoqABFKDmABFKDmv9XMDm2BFKDmE2Zgjm_0 |
Workunit | 2033243 |
Created | 16 Sep 2020, 12:24:05 UTC |
Sent | 16 Sep 2020, 14:36:32 UTC |
Report deadline | 23 Sep 2020, 14:36:32 UTC |
Received | 16 Sep 2020, 16:42:19 UTC |
Server state | Over |
Outcome | Validate error |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 4228 |
Run time | 4 min 35 sec |
CPU time | 17 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 13.39 GFLOPS |
Application version | ATLAS Simulation v1.03 (native_mt) x86_64-pc-linux-gnu |
Peak working set size | 138.68 MB |
Peak swap size | 1.32 GB |
Peak disk usage | 825.48 MB |
<core_client_version>7.16.6</core_client_version>
<![CDATA[
<stderr_txt>
17:37:10 (9993): wrapper (7.7.26015): starting
17:37:10 (9993): wrapper: running run_atlas (--nthreads 4)
[2020-09-16 17:37:10] Arguments: --nthreads 4
[2020-09-16 17:37:10] Threads: 4
[2020-09-16 17:37:10] Checking for CVMFS
[2020-09-16 17:37:11] Probing /cvmfs/atlas-condb.cern.ch... OK
[2020-09-16 17:37:11] Probing /cvmfs/atlas.cern.ch... OK
[2020-09-16 17:37:11] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2020-09-16 17:37:11] 2.5.1.0 25166 7777 53272 69883 1 56 15530655 20480000 0 65024 0 1225632 99.9438 4890521 26259 http://cvmfs-stratum-one.cern.ch/cvmfs/atlas.cern.ch http://188.184.28.244:3128 1
[2020-09-16 17:37:11] CVMFS is ok
[2020-09-16 17:37:11] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2020-09-16 17:37:11] Checking for singularity binary...
[2020-09-16 17:37:11] Using singularity found in PATH at /usr/bin/singularity
[2020-09-16 17:37:11] Running /usr/bin/singularity --version
[2020-09-16 17:37:11] singularity version 3.6.2-1.el7
[2020-09-16 17:37:11] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
[2020-09-16 17:37:12] dcameron05.cern.ch
[2020-09-16 17:37:12] Singularity works
[2020-09-16 17:37:12] Set ATHENA_PROC_NUMBER=4
[2020-09-16 17:37:12] Starting ATLAS job with PandaID=4002876565
[2020-09-16 17:37:12] Running command: /usr/bin/singularity exec --pwd /var/lib/boinc/slots/135 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh
[2020-09-16 17:41:42] *** The last 200 lines of the pilot log: ***
[2020-09-16 17:41:42] 2020-09-16 15:40:45,991 | WARNING | queue_monitor | pilot.control.job | add_timing_and_extracts |
[2020-09-16 17:41:42] XXXXXXXXXXXXXXXXXXXXX[begin log extracts]
[2020-09-16 17:41:42] - Log from pilotlog.txt -
[2020-09-16 17:41:42] 2020-09-16 15:40:45,964 | WARNING | queue_monitor | pilot.api.analytics | get_fitted_data | wrong length of table data, x=[1600270670.0], y=[1614.0] (must be same and length>=4)
[2020-09-16 17:41:42] 2020-09-16 15:40:45,964 | DEBUG | queue_monitor | pilot.util.auxiliary.4002876565 | get_job_metrics | job metrics="coreCount=4 actualCoreCount=1"
[2020-09-16 17:41:42] 2020-09-16 15:40:45,965 | INFO | queue_monitor | pilot.control.job.4002876565 | get_data_structure | payload/TRF did not report the number of read events
[2020-09-16 17:41:42] 2020-09-16 15:40:45,966 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_values | using path: /var/lib/boinc/slots/135/PanDA_Pilot-4002876565/memory_monitor_summary.json (trf name=prmon)
[2020-09-16 17:41:42] 2020-09-16 15:40:45,967 | DEBUG | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | summary_dictionary={'Max':
{'rx_packets': 40, 'nprocs': 1, 'nthreads': 1, 'rx_bytes': 3897, 'wtime': 5, 'rss': 2660, 'write_bytes': 184320, 'vmem': 14004, 'read_bytes': 4820992, 'stime': 2, 'tx_bytes': 0, 'pss': 1614, 'wchar': 124364, 'rchar': 13181705, 'tx_packets': 0, 'swap': 0, 'utime': 1}, 'Avg': {'write_bytes': 36000.0, 'nprocs': 1.0, 'nthreads': 1.0, 'rx_bytes': 761.132, 'rx_packets': 7.812, 'vmem': 14004.0, 'read_bytes': 941600.0, 'swap': 0.0, 'tx_bytes': 0.0, 'pss': 1614.0, 'wchar': 24289.0, 'rchar': 2574551.0, 'tx_packets': 0.0, 'rss': 2660.0}, 'HW': {'mem': {'MemTotal': 7310336}, 'cpu': {'CoresPerSocket': 1, 'ModelName': 'Intel Core Processor (Broadwell, IBRS)', 'ThreadsPerCore': 1, 'CPUs': 4, 'Sockets': 4}}} [2020-09-16 17:41:42] 2020-09-16 15:40:45,967 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | extracted standard info from prmon json [2020-09-16 17:41:42] 2020-09-16 15:40:45,967 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | extracted standard memory fields from prmon json [2020-09-16 17:41:42] 2020-09-16 15:40:45,967 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | .............................. [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . Timing measurements: [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . get job = 0 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . initial setup = 1 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . payload setup = 0 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . 
total setup = 1 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . stage-in = 0 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . payload execution = 69 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | . stage-out = 1 s [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | timing_report | .............................. [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | INFO | queue_monitor | pilot.util.auxiliary.4002876565 | get_log_extracts | building log extracts (sent to the server as 'pilotLog') [2020-09-16 17:41:42] 2020-09-16 15:40:45,968 | DEBUG | queue_monitor | pilot.util.auxiliary.4002876565 | get_panda_tracer_log | PanDA tracer log does not exist: /var/lib/boinc/slots/135/PanDA_Pilot-4002876565/pandatracerlog.txt (ignoring) [2020-09-16 17:41:42] 2020-09-16 15:40:45,969 | INFO | queue_monitor | pilot.util.container | execute | executing command: tail -n 20 /var/lib/boinc/slots/135/PanDA_Pilot-4002876565/pilotlog.txt [2020-09-16 17:41:42] XXXXXXXXXXXXXXXXXXXXX[end log extracts] [2020-09-16 17:41:42] 2020-09-16 15:40:45,991 | DEBUG | queue_monitor | pilot.control.job.4002876565 | send_state | is_harvester_mode(args) : False [2020-09-16 17:41:42] 2020-09-16 15:40:45,992 | DEBUG | queue_monitor | pilot.control.job.4002876565 | send_state | heartbeat dictionary: {'pilotErrorCode': 0, 'rateWBYTES': 36000.0, 'pilotID': 'unknown|PR|2.7.1 (4)', 'totRBYTES': 4820992, 'siteName': 'BOINC-TEST', 'avgVMEM': 14004.0, 'coreCount': 4, 'totWCHAR': 124364, 'rateRCHAR': 2574551.0, 'jobId': '4002876565', 'totRCHAR': 13181705, 'exeErrorCode': 66, 'rateWCHAR': 24289.0, 'metaData': '{\n "cmdLine": 
"\'/cvmfs/atlas.cern.ch/repo/sw/software/21.0/AtlasOffline/21.0.15/InstallArea/x86_64-slc6-gcc49-opt/share/Sim_tf.py\' \'--inputEVNTFile=EVNT.14296418._001447.pool.root.1\' \'--maxEvents=10\' \'--postInclude\' \'default:RecJobTransforms/UseFrontier.py\' \'--preExec\' \'EVNTtoHITS:simFlags.SimBarcodeOffset.set_Value_and_Lock(200000)\' \'EVNTtoHITS:simFlags.TRTRangeCut=30.0;simFlags.TightMuonStepping=True\' \'--preInclude\' \'EVNTtoHITS:SimulationJobOptions/preInclude.BeamPipeKill.py,SimulationJobOptions/preInclude.FrozenShowersFCalOnly.py\' \'--skipEvents=9600\' \'--firstEvent=13069601\' \'--outputHITSFile=HITS.000649-10479-23551._078090.pool.root.1\' \'--physicsList=FTFP_BERT_ATL_VALIDATION\' \'--randomSeed=65349\' \'--DBRelease=all:current\' \'--conditionsTag\' \'default:OFLCOND-MC16-SDR-14\' \'--geometryVersion=default:ATLAS-R2-2016-01-00-01_VALIDATION\' \'--runNumber=423301\' \'--AMITag=s3126\' \'--DataRunNumber=284500\' \'--simulator=FullG4\' \'--truthStrategy=MC15aPlus\'", \n "created": "2020-09-16T17:37:57", \n "executor": [\n {\n "errMsg": null, \n "exeConfig": {\n "script": "athena.py", \n "substep": "sim"\n }, \n "name": "EVNTtoHITS", \n "rc": -1, \n "statusOK": false, \n "validation": false\n }\n ], \n "exitAcronym": "TRF_EXEC_VALIDATION_FAIL", \n "exitCode": 66, \n "exitMsg": "File EVNT.14296418._001447.pool.root.1 did not pass corruption test", \n "exitMsgExtra": "", \n "files": {\n "input": [\n {\n "dataset": null, \n "nentries": null, \n "subFiles": [\n {\n "file_guid": null, \n "name": "EVNT.14296418._001447.pool.root.1"\n }\n ], \n "type": "EVNT"\n }\n ], \n "output": [\n {\n "argName": "outputHITSFile", \n "dataset": null, \n "subFiles": [\n {\n "file_guid": null, \n "file_size": null, \n "name": "HITS.000649-10479-23551._078090.pool.root.1", \n "nentries": null\n }\n ], \n "type": "HITS"\n }\n ]\n }, \n "name": "Sim_tf", \n "reportVersion": "2.0.7", \n "resource": {\n "executor": {\n "EVNTtoHITS": {\n "cpuTime": null, \n "postExe": {\n "cpuTime": 
null, \n "wallTime": null\n }, \n "preExe": {\n "cpuTime": null, \n "wallTime": null\n }, \n "total": {\n "cpuTime": null, \n "wallTime": null\n }, \n "validation": {\n "cpuTime": null, \n "wallTime": null\n }, \n "wallTime": null\n }\n }, \n "machine": {\n "cpu_family": "6", \n "linux_distribution": [\n "CentOS Linux", \n "7.7.1908", \n "Core"\n ], \n "model": "61", \n "model_name": "Intel Core Processor (Broadwell, IBRS)", \n "node": "dcameron05.cern.ch", \n "platform": "Linux-3.10.0-1062.4.3.el7.x86_64-x86_64-with-centos-7.7.1908-Core"\n }, \n "transform": {\n "cpuTime": 1, \n "cpuTimeTotal": 0, \n "externalCpuTime": 0, \n "trfPredata": null, \n "wallTime": 0\n }\n }\n}', 'xml': '{"log.000649-10479-23551._078090.job.log.tgz.1": {"adler32": "055a6096", "surl": "srm://srm.ndgf.org:8443/srm/managerv2?SFN=/atlas/disk/atlasdatadisk/rucio/mc16_13TeV/da/ab/log.000649-10479-23551._078090.job.log.tgz.1", "guid": "f4aef7c6-81b0-4050-ad42-22865c360d47", "fsize": 20030}}', 'maxVMEM': 14004, 'cpuConversionFactor': 1.0, 'avgSWAP': 0.0, 'state': 'failed', 'transExitCode': 66, 'pilotErrorDiag': '', 'node': 'dcameron05.cern.ch', 'avgRSS': 2660.0, 'avgPSS': 1614.0, 'timestamp': '2020-09-16T17:40:45+01:00', 'pilotTiming': '0|0|69|1|1', 'attemptNr': 1, 'totWBYTES': 184320, 'rateRBYTES': 941600.0, 'pilotLog': '- Log from pilotlog.txt -\n2020-09-16 15:40:45,964 | WARNING | queue_monitor | pilot.api.analytics | get_fitted_data | wrong length of table data, x=[1600270670.0], y=[1614.0] (must be same and length>=4)\n2020-09-16 15:40:45,964 | DEBUG | queue_monitor | pilot.util.auxiliary.4002876565 | get_job_metrics | job metrics="coreCount=4 actualCoreCount=1"\n2020-09-16 15:40:45,965 | INFO | queue_monitor | pilot.control.job.4002876565 | get_data_structure | payload/TRF did not report the number of read events\n2020-09-16 15:40:45,966 | INFO | queue_monitor | pilot.user.atlas.utilities | get_memory_values | using path: 
/var/lib/boinc/slots/135/PanDA_Pilot-4002876565/memory_monitor_summary.json (trf name=prmon)\n2020-09-16 15:40:45,967 | DEBUG | queue_monitor | pilot.user.atlas.utilities | get_memory_monitor_info | summary_dictionary={\'Max\': {\'rx_packets\': 40, \'nprocs\': 1, \'nthreads\': 1, \'rx_bytes\': 3897, \'wtime\':', 'cpuConsumptionTime': 16, 'startTime': 1600270643.77164, 'cpuConsumptionUnit': 's+Intel Core Processor (Broadwell, IBRS) 16384 KB', 'exeErrorDiag': u'File EVNT.14296418._001447.pool.root.1 did not pass corruption test', 'maxSWAP': 0, 'jobMetrics': 'coreCount=4 actualCoreCount=1', 'maxRSS': 2660, 'schedulerID': 'unknown', 'endTime': 1600270845.991501, 'maxPSS': 1614} [2020-09-16 17:41:42] 2020-09-16 15:40:45,992 | DEBUG | queue_monitor | pilot.control.job.4002876565 | send_state | wrote heartbeat to file /var/lib/boinc/slots/135/heartbeat.json [2020-09-16 17:41:42] 2020-09-16 15:40:45,992 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | job 4002876565 was dequeued from the monitored payloads queue [2020-09-16 17:41:42] 2020-09-16 15:40:45,992 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | tmp job object deleted [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | job summary report [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | -------------------------------------------------- [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | PanDA job id: 4002876565 [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | task id: 000649-10479-23551 [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | 
errors: (none) [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | status: LOG_TRANSFER = DONE [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | pilot state: failed [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | transexitcode: 66 [2020-09-16 17:41:42] 2020-09-16 15:40:46,017 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | exeerrorcode: 66 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | exeerrordiag: File EVNT.14296418._001447.pool.root.1 did not pass corruption test [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | exitcode: 66 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | exitmsg: File EVNT.14296418._001447.pool.root.1 did not pass corruption test [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | cpuconsumptiontime: 16 s [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | nevents: 0 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | neventsw: 0 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | pid: 16752 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | pgrp: 16752 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | corecount: 4 [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | 
make_job_report | event service: False [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | -------------------------------------------------- [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | INFO | retrieve | pilot.util.auxiliary.4002876565 | make_job_report | [2020-09-16 17:41:42] 2020-09-16 15:40:46,018 | DEBUG | retrieve | pilot.control.job.4002876565 | has_job_completed | ls -lF /var/lib/boinc/slots/135: [2020-09-16 17:41:42] [2020-09-16 17:41:42] 2020-09-16 15:40:46,019 | INFO | retrieve | pilot.util.container | execute | executing command: ls -lF /var/lib/boinc/slots/135 [2020-09-16 17:41:42] 2020-09-16 15:40:46,040 | DEBUG | retrieve | pilot.control.job.4002876565 | has_job_completed | total 427320 [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 7893267 Sep 16 17:37 agis_ddmendpoints.json [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 2747842 Sep 16 17:37 agis_schedconf.cvmfs.json [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 0 Sep 16 17:37 boinc_lockfile [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 8192 Sep 16 17:40 boinc_mmap_file [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 530 Sep 16 17:40 boinc_task_state.xml [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 141 Sep 16 17:37 cl1NDmoelcxn7Olcko1bjSoqABFKDmABFKDmv9XMDm2BFKDmE2Zgjm.diag [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 425921745 Sep 16 17:37 EVNT.22499207._000770.pool.root.1 [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 5865 Sep 16 17:40 heartbeat.json [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 6081 Sep 16 17:37 init_data.xml [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 301232 Sep 16 17:37 input.tar.gz [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 112 Sep 16 17:37 job.xml [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 117578 Sep 16 17:40 log.000649-10479-23551._078090.job.log.1 [2020-09-16 17:41:42] -rw-------. 
1 boinc boinc 20030 Sep 16 17:38 log.000649-10479-23551._078090.job.log.tgz.1 [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 910 Sep 16 17:38 memory_monitor_summary.json [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 261 Sep 16 17:38 output.list [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 2938 Sep 16 17:37 pandaJob.out [2020-09-16 17:41:42] drwxrwx---. 2 boinc boinc 222 Sep 16 17:38 PanDA_Pilot-4002876565/ [2020-09-16 17:41:42] drwx------. 3 boinc boinc 248 Sep 16 17:37 pilot2/ [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 293491 Sep 16 14:18 pilot2.tar.gz [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 101661 Sep 16 17:40 pilotlog.txt [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 4365 Sep 16 14:20 queuedata.json [2020-09-16 17:41:42] -rwxr-xr-x. 1 boinc boinc 5573 Sep 16 17:37 run_atlas* [2020-09-16 17:41:42] -rwx------. 1 boinc boinc 12641 Sep 16 14:23 runpilot2-wrapper.sh* [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 407 Sep 16 17:37 runtime_log [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 5575 Sep 16 17:37 runtime_log.err [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 408 Sep 16 17:37 setup.sh.local [2020-09-16 17:41:42] drwxrwx--x. 2 boinc boinc 68 Sep 16 17:37 shared/ [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 15768 Sep 16 17:37 start_atlas.sh [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 1703 Sep 16 17:37 stderr.txt [2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 107 Sep 16 17:37 wrapper_26015_x86_64-pc-linux-gnu [2020-09-16 17:41:42] -rw-r--r--. 
1 boinc boinc 22 Sep 16 17:40 wrapper_checkpoint.txt [2020-09-16 17:41:42] 2020-09-16 15:40:46,040 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue jobs has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,040 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue payloads has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue data_in has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue data_out has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue current_data_in has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue validated_jobs has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue validated_payloads has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue monitored_payloads has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_jobs has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_payloads has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_data_in has 1 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue finished_data_out has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_jobs has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,041 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue 
failed_payloads has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,042 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_data_in has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,042 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue failed_data_out has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,042 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue completed_jobs has 0 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,042 | INFO | retrieve | pilot.util.queuehandling | queue_report | queue completed_jobids has 1 job(s) [2020-09-16 17:41:42] 2020-09-16 15:40:46,042 | INFO | retrieve | pilot.control.job.4002876565 | has_job_completed | job 4002876565 has completed (purged errors) [2020-09-16 17:41:42] 2020-09-16 15:40:46,042 | INFO | retrieve | pilot.util.processes | cleanup | overall cleanup function is called [2020-09-16 17:41:42] 2020-09-16 15:40:46,043 | DEBUG | retrieve | pilot.util.processes | cleanup | work directory was removed: /var/lib/boinc/slots/135/PanDA_Pilot-4002876565 [2020-09-16 17:41:42] 2020-09-16 15:40:47,048 | INFO | retrieve | pilot.info.jobdata | collect_zombies | --- collectZombieJob: --- 10, [16752] [2020-09-16 17:41:42] 2020-09-16 15:40:47,048 | INFO | retrieve | pilot.info.jobdata | collect_zombies | zombie collector trying to kill pid 16752 [2020-09-16 17:41:42] 2020-09-16 15:40:47,048 | INFO | retrieve | pilot.info.jobdata | collect_zombies | harmless exception when collecting zombies: [Errno 10] No child processes [2020-09-16 17:41:42] 2020-09-16 15:40:48,054 | INFO | retrieve | pilot.util.processes | cleanup | collected zombie processes [2020-09-16 17:41:42] 2020-09-16 15:40:48,054 | INFO | retrieve | pilot.util.processes | cleanup | will now attempt to kill all subprocesses of pid=16752 [2020-09-16 17:41:42] 2020-09-16 15:40:48,106 | INFO | retrieve | pilot.util.processes | kill_processes | process IDs to be killed: [16752] (in reverse order) [2020-09-16 
17:41:42] 2020-09-16 15:40:48,141 | WARNING | retrieve | pilot.util.processes | kill_processes | found no corresponding commands to process id(s) [2020-09-16 17:41:42] 2020-09-16 15:40:48,141 | INFO | retrieve | pilot.util.processes | kill_orphans | Do not look for orphan processes in BOINC jobs [2020-09-16 17:41:42] 2020-09-16 15:40:48,141 | INFO | retrieve | pilot.control.job | retrieve | ready for new job [2020-09-16 17:41:42] 2020-09-16 15:40:48,142 | INFO | retrieve | root | retrieve | pilot has finished for previous job - re-establishing logging [2020-09-16 17:41:42] 2020-09-16 15:40:48,142 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | *************************************** [2020-09-16 17:41:42] 2020-09-16 15:40:48,142 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | *** PanDA Pilot version 2.7.1 (4) *** [2020-09-16 17:41:42] 2020-09-16 15:40:48,142 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | *************************************** [2020-09-16 17:41:42] 2020-09-16 15:40:48,143 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | [2020-09-16 17:41:42] 2020-09-16 15:40:48,143 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | pilot is running in a VM [2020-09-16 17:41:42] 2020-09-16 15:40:48,143 | INFO | retrieve | pilot.util.auxiliary | display_architecture_info | architecture information: [2020-09-16 17:41:42] 2020-09-16 15:40:48,216 | INFO | retrieve | pilot.util.auxiliary | display_architecture_info | [2020-09-16 17:41:42] LSB Version: :core-4.1-amd64:core-4.1-noarch [2020-09-16 17:41:42] Distributor ID: CentOS [2020-09-16 17:41:42] Description: CentOS Linux release 7.7.1908 (Core) [2020-09-16 17:41:42] Release: 7.7.1908 [2020-09-16 17:41:42] Codename: Core [2020-09-16 17:41:42] 2020-09-16 15:40:48,216 | INFO | retrieve | pilot.util.auxiliary | pilot_version_banner | *************************************** [2020-09-16 17:41:42] 2020-09-16 15:40:48,719 | DEBUG | retrieve 
| pilot.util.monitoring | check_local_space | checking local space on /var/lib/boinc/slots/135 [2020-09-16 17:41:42] 2020-09-16 15:40:48,736 | INFO | retrieve | pilot.util.monitoring | check_local_space | sufficient remaining disk space (18879610880 B) [2020-09-16 17:41:42] 2020-09-16 15:40:48,736 | WARNING | retrieve | pilot.control.job | proceed_with_getjob | since timefloor is set to 0, pilot was only allowed to run one job [2020-09-16 17:41:42] 2020-09-16 15:40:48,736 | DEBUG | retrieve | pilot.control.job | retrieve | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:48,737 | DEBUG | retrieve | pilot.control.job | retrieve | [job] retrieve thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:48,742 | WARNING | monitor | pilot.control.monitor | control | aborting monitor loop since graceful_stop has been set [2020-09-16 17:41:42] 2020-09-16 15:40:48,742 | INFO | monitor | pilot.control.monitor | control | [monitor] control thread has ended [2020-09-16 17:41:42] 2020-09-16 15:40:48,786 | DEBUG | execute_payloads | pilot.control.payload | execute_payloads | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:48,787 | INFO | execute_payloads | pilot.control.payload | execute_payloads | [payload] execute_payloads thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:48,909 | INFO | failed_post | pilot.control.payload | failed_post | [payload] failed_post thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:48,936 | DEBUG | data | pilot.control.data | control | data control ending since graceful_stop has been set [2020-09-16 17:41:42] 2020-09-16 15:40:48,936 | DEBUG | data | pilot.control.data | control | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:48,936 | DEBUG | data | pilot.control.data | control | [data] control thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:49,048 | WARNING | copytool_out | pilot.util.common | should_abort | data:copytool_out:received graceful stop - abort 
after this iteration [2020-09-16 17:41:42] 2020-09-16 15:40:49,542 | WARNING | queue_monitoring | pilot.util.common | should_abort | data:queue_monitoring:received graceful stop - abort after this iteration [2020-09-16 17:41:42] 2020-09-16 15:40:49,549 | DEBUG | payload | pilot.control.payload | control | payload control ending since graceful_stop has been set [2020-09-16 17:41:42] 2020-09-16 15:40:49,550 | DEBUG | payload | pilot.control.payload | control | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:49,550 | DEBUG | payload | pilot.control.payload | control | [payload] control thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:49,660 | DEBUG | create_data_payload | pilot.control.job | create_data_payload | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:49,661 | DEBUG | create_data_payload | pilot.control.job | create_data_payload | [job] create_data_payload thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:49,663 | DEBUG | validate | pilot.control.job | validate | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:49,663 | DEBUG | validate | pilot.control.job | validate | [job] validate thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:49,723 | DEBUG | validate_post | pilot.control.payload | validate_post | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:49,723 | INFO | validate_post | pilot.control.payload | validate_post | [payload] validate_post thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:49,730 | DEBUG | validate_pre | pilot.control.payload | validate_pre | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:49,730 | INFO | validate_pre | pilot.control.payload | validate_pre | [payload] validate_pre thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:49,753 | DEBUG | job | pilot.control.job | control | job control ending since graceful_stop has been set [2020-09-16 17:41:42] 2020-09-16 15:40:49,753 | DEBUG | job | 
pilot.control.job | control | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:49,753 | DEBUG | job | pilot.control.job | control | [job] control thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:50,048 | DEBUG | copytool_out | pilot.control.data | copytool_out | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:50,049 | DEBUG | copytool_out | pilot.control.data | copytool_out | [data] copytool_out thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:50,210 | DEBUG | copytool_in | pilot.control.data | copytool_in | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:50,210 | DEBUG | copytool_in | pilot.control.data | copytool_in | [data] copytool_in thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:51,002 | WARNING | queue_monitor | pilot.util.common | should_abort | job:queue_monitor:received graceful stop - abort after this iteration [2020-09-16 17:41:42] 2020-09-16 15:40:51,002 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:51,002 | DEBUG | queue_monitor | pilot.control.job | queue_monitor | [job] queue monitor thread has finished [2020-09-16 17:41:42] 2020-09-16 15:40:52,543 | DEBUG | queue_monitoring | pilot.control.data | queue_monitoring | will not set job_aborted yet [2020-09-16 17:41:42] 2020-09-16 15:40:52,543 | DEBUG | queue_monitoring | pilot.control.data | queue_monitoring | [data] queue_monitor thread has finished [2020-09-16 17:41:42] 2020-09-16 15:41:41,801 | WARNING | job_monitor | pilot.control.job | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 61 s) [2020-09-16 17:41:42] 2020-09-16 15:41:41,802 | DEBUG | job_monitor | pilot.util.processes | threads_aborted | aborting since the last relevant thread is about to finish [2020-09-16 17:41:42] 2020-09-16 15:41:41,802 | DEBUG | job_monitor | pilot.control.job | job_monitor | will proceed to set job_aborted 
[2020-09-16 17:41:42] 2020-09-16 15:41:41,802 | DEBUG | job_monitor | pilot.control.job | job_monitor | [job] job monitor thread has finished
[2020-09-16 17:41:42] 2020-09-16 15:41:41,839 | INFO | MainThread | pilot.workflow.generic | run | end of generic workflow (traces error code: 0)
[2020-09-16 17:41:42] 2020-09-16 15:41:41,840 | INFO | MainThread | root | wrap_up | traces error code: 0
[2020-09-16 17:41:42] 2020-09-16 15:41:41,840 | INFO | MainThread | root | wrap_up | pilot has finished
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] ==== pilot stdout END ====
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] ==== wrapper stdout RESUME ====
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] Pilot exit status: 0
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] STATUSCODE: 0
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] apfmon messages muted
[2020-09-16 17:41:42] ---- find pandaID.out ----
[2020-09-16 17:41:42] total 64
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 300 Jun 4 15:10 __init__.py
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 11357 Jul 25 2019 LICENSE
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 20 Sep 9 2019 MANIFEST.IN
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 11 Sep 16 17:37 pandaIDs.out
[2020-09-16 17:41:42] drwx------. 14 boinc boinc 216 Sep 16 17:37 pilot
[2020-09-16 17:41:42] -rwx------. 1 boinc boinc 22342 Aug 13 17:00 pilot.py
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 7 Aug 13 17:00 PILOTVERSION
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 2212 Nov 14 2019 README.md
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 1019 Jun 4 15:10 setup.py
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 221 Jul 25 2019 TODO.md
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 11 Sep 16 17:37 /var/lib/boinc/slots/135/pilot2/pandaIDs.out
[2020-09-16 17:41:42] 4002876565
[2020-09-16 17:41:42]
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] Test setup, not cleaning
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] ==== wrapper stdout END ====
[2020-09-16 17:41:42] 2020-09-16 15:41:41 UTC [wrapper] ==== wrapper stderr END ====
[2020-09-16 17:41:42] 2020-09-16 15:41:42 UTC [wrapper] wrapper wrapperexiting ec=0, duration=268
[2020-09-16 17:41:42] 2020-09-16 15:41:42 UTC [wrapper] apfmon messages muted
[2020-09-16 17:41:42] *** Error codes and diagnostics ***
[2020-09-16 17:41:42] "exeErrorCode": 66,
[2020-09-16 17:41:42] "exeErrorDiag": "File EVNT.14296418._001447.pool.root.1 did not pass corruption test",
[2020-09-16 17:41:42] "pilotErrorCode": 0,
[2020-09-16 17:41:42] "pilotErrorDiag": "",
[2020-09-16 17:41:42] *** Listing of results directory ***
[2020-09-16 17:41:42] total 427420
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 293491 Sep 16 14:18 pilot2.tar.gz
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 4365 Sep 16 14:20 queuedata.json
[2020-09-16 17:41:42] -rwx------. 1 boinc boinc 12641 Sep 16 14:23 runpilot2-wrapper.sh
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 107 Sep 16 17:37 wrapper_26015_x86_64-pc-linux-gnu
[2020-09-16 17:41:42] -rwxr-xr-x. 1 boinc boinc 5573 Sep 16 17:37 run_atlas
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 112 Sep 16 17:37 job.xml
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 6081 Sep 16 17:37 init_data.xml
[2020-09-16 17:41:42] drwxrwx--x. 2 boinc boinc 68 Sep 16 17:37 shared
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 0 Sep 16 17:37 boinc_lockfile
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 425921745 Sep 16 17:37 EVNT.22499207._000770.pool.root.1
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 15768 Sep 16 17:37 start_atlas.sh
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 301232 Sep 16 17:37 input.tar.gz
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 2938 Sep 16 17:37 pandaJob.out
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 408 Sep 16 17:37 setup.sh.local
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 2747842 Sep 16 17:37 agis_schedconf.cvmfs.json
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 7893267 Sep 16 17:37 agis_ddmendpoints.json
[2020-09-16 17:41:42] drwx------. 3 boinc boinc 248 Sep 16 17:37 pilot2
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 910 Sep 16 17:38 memory_monitor_summary.json
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 20030 Sep 16 17:38 log.000649-10479-23551._078090.job.log.tgz.1
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 5865 Sep 16 17:40 heartbeat.json
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 22 Sep 16 17:41 wrapper_checkpoint.txt
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 8192 Sep 16 17:41 boinc_mmap_file
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 530 Sep 16 17:41 boinc_task_state.xml
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 8670 Sep 16 17:41 pilotlog.txt
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 134828 Sep 16 17:41 log.000649-10479-23551._078090.job.log.1
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 261 Sep 16 17:41 output.list
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 9097 Sep 16 17:41 runtime_log.err
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 664 Sep 16 17:41 runtime_log
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 174080 Sep 16 17:41 result.tar.gz
[2020-09-16 17:41:42] -rw-------. 1 boinc boinc 569 Sep 16 17:41 cl1NDmoelcxn7Olcko1bjSoqABFKDmABFKDmv9XMDm2BFKDmE2Zgjm.diag
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 38009 Sep 16 17:41 stderr.txt
[2020-09-16 17:41:42] No HITS result produced
[2020-09-16 17:41:42] *** Contents of shared directory: ***
[2020-09-16 17:41:42] total 416424
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 425921745 Sep 16 17:37 ATLAS.root_0
[2020-09-16 17:41:42] -rw-r--r--. 1 boinc boinc 15768 Sep 16 17:37 start_atlas.sh
[2020-09-16 17:41:42] -rw-r--r--.
1 boinc boinc 301232 Sep 16 17:37 input.tar.gz [2020-09-16 17:41:42] -rw-------. 1 boinc boinc 174080 Sep 16 17:41 result.tar.gz 17:41:44 (9993): run_atlas exited; CPU time 17.294118 17:41:44 (9993): called boinc_finish(0) </stderr_txt> ]]>
©2024 CERN