Name 3xMODmHgsRvnShfckohDCDFpABFKDmABFKDmQ6FKDmECFKDmW41Yzn_2
Workunit 1935550
Created 26 Sep 2019, 8:12:26 UTC
Sent 26 Sep 2019, 9:50:10 UTC
Report deadline 3 Oct 2019, 9:50:10 UTC
Received 26 Sep 2019, 11:32:42 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 3682
Run time 13 min 58 sec
CPU time 28 min 45 sec
Validate state Valid
Credit 34.60
Device peak FLOPS 17.83 GFLOPS
Application version ATLAS Simulation v0.73 (native_mt) x86_64-pc-linux-gnu
Peak working set size 1.79 GB
Peak swap size 2.54 GB
Peak disk usage 720.27 MB

Stderr output

<core_client_version>7.16.1</core_client_version>
<![CDATA[
<stderr_txt>
12:18:33 (1254891): wrapper (7.7.26015): starting
12:18:33 (1254891): wrapper: running run_atlas (--nthreads 4)
2019-09-26 12:18:33,754: singularity image is /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6
2019-09-26 12:18:33,754: sys.argv = ['run_atlas', '--nthreads', '4']
2019-09-26 12:18:33,755: THREADS=4

2019-09-26 12:18:33,755: Checking for CVMFS
2019-09-26 12:18:34,784: CVMFS is installed
2019-09-26 12:18:34,784: Checking Singularity...
2019-09-26 12:18:34,800: Singularity is installed, version singularity version 3.4.0-1.2.el7
2019-09-26 12:18:34,801: Testing the function of Singularity...
2019-09-26 12:18:34,801: Checking singularity with cmd:singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 hostname
2019-09-26 12:18:34,909: Singularity Works...
2019-09-26 12:18:34,909: copy /home/dcameron/boinc/slots/0/shared/ATLAS.root_0
2019-09-26 12:18:37,296: copy /home/dcameron/boinc/slots/0/shared/input.tar.gz
2019-09-26 12:18:37,296: copy /home/dcameron/boinc/slots/0/shared/RTE.tar.gz
2019-09-26 12:18:37,297: copy /home/dcameron/boinc/slots/0/shared/start_atlas.sh
2019-09-26 12:18:37,297: export ATHENA_PROC_NUMBER=4;
2019-09-26 12:18:37,310: start atlas job with PandaID=4002876565
2019-09-26 12:18:37,310: cmd = singularity exec --pwd /home/dcameron/boinc/slots/0 -B /cvmfs,/home /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-slc6 sh start_atlas.sh > runtime_log 2> runtime_log.err
2019-09-26 12:32:29,335: running cmd return value is 0
2019-09-26 12:32:29,335: Moving ./HITS.000649-2009753-26370._078090.pool.root.1 to shared/HITS.pool.root.1
2019-09-26 12:32:29,336: HITS result file:
2019-09-26 12:32:29,342: -rw-------. 1 dcameron zp 9151313 Sep 26 12:30 shared/HITS.pool.root.1
2019-09-26 12:32:29,342: *****************The last 100 lines of the pilot log******************
2019-09-26 12:32:29,345: 2019-09-26 10:31:36,755 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | event service: False
2019-09-26 10:31:36,755 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-09-26 10:31:36,755 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-09-26 10:31:36,755 | INFO     | retrieve            | pilot.control.job                | has_job_completed         | job 4002876565 has completed
2019-09-26 10:31:36,755 | INFO     | retrieve            | pilot.util.processes             | cleanup                   | overall cleanup function is called
2019-09-26 10:31:36,760 | DEBUG    | retrieve            | pilot.util.processes             | cleanup                   | work directory was removed: /home/dcameron/boinc/slots/0/PanDA_Pilot-4002876565
2019-09-26 10:31:37,765 | INFO     | retrieve            | pilot.info.jobdata               | collect_zombies           | --- collectZombieJob: --- 10, [1261770]
2019-09-26 10:31:37,765 | INFO     | retrieve            | pilot.info.jobdata               | collect_zombies           | zombie collector trying to kill pid 1261770
2019-09-26 10:31:37,765 | INFO     | retrieve            | pilot.info.jobdata               | collect_zombies           | harmless exception when collecting zombies: [Errno 10] No child processes
2019-09-26 10:31:38,770 | INFO     | retrieve            | pilot.util.processes             | cleanup                   | collected zombie processes
2019-09-26 10:31:38,771 | INFO     | retrieve            | pilot.util.processes             | cleanup                   | will now attempt to kill all subprocesses of pid=1261770
2019-09-26 10:31:38,861 | INFO     | retrieve            | pilot.util.processes             | kill_processes            | process IDs to be killed: [1261770] (in reverse order)
2019-09-26 10:31:38,906 | WARNING  | retrieve            | pilot.util.processes             | kill_processes            | found no corresponding commands to process id(s)
2019-09-26 10:31:38,906 | INFO     | retrieve            | pilot.util.processes             | kill_orphans              | Do not look for orphan processes in BOINC jobs
2019-09-26 10:31:38,906 | INFO     | retrieve            | pilot.control.job                | retrieve                  | ready for new job
2019-09-26 10:31:38,906 | INFO     | retrieve            | root                             | retrieve                  | pilot has finished for previous job - re-establishing logging
No handlers could be found for logger "pilot.util.mpi"
2019-09-26 10:31:38,909 | INFO     | retrieve            | pilot.util.auxiliary             | pilot_version_banner      | *****************************************
2019-09-26 10:31:38,909 | INFO     | retrieve            | pilot.util.auxiliary             | pilot_version_banner      | ***  PanDA Pilot version 2.1.25 (11)  ***
2019-09-26 10:31:38,909 | INFO     | retrieve            | pilot.util.auxiliary             | pilot_version_banner      | *****************************************
2019-09-26 10:31:38,909 | INFO     | retrieve            | pilot.util.auxiliary             | pilot_version_banner      | 
2019-09-26 10:31:38,909 | INFO     | retrieve            | pilot.util.auxiliary             | display_architecture_info | architecture information:
2019-09-26 10:31:38,985 | INFO     | retrieve            | pilot.util.auxiliary             | display_architecture_info | 
LSB Version:	:base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
Distributor ID:	ScientificCERNSLC
Description:	Scientific Linux CERN SLC release 6.10 (Carbon)
Release:	6.10
Codename:	Carbon
2019-09-26 10:31:38,986 | INFO     | retrieve            | pilot.util.auxiliary             | pilot_version_banner      | *****************************************
2019-09-26 10:31:39,488 | DEBUG    | retrieve            | pilot.util.monitoring            | check_local_space         | checking local space on /home/dcameron/boinc/slots/0
2019-09-26 10:31:39,506 | INFO     | retrieve            | pilot.util.monitoring            | check_local_space         | sufficient remaining disk space (70039633920 B)
2019-09-26 10:31:39,506 | WARNING  | retrieve            | pilot.control.job                | proceed_with_getjob       | since timefloor is set to 0, pilot was only allowed to run one job
2019-09-26 10:31:39,506 | DEBUG    | retrieve            | pilot.control.job                | retrieve                  | [job] retrieve thread has finished
2019-09-26 10:31:39,509 | INFO     | monitor             | pilot.control.monitor            | control                   | [monitor] control thread has ended
2019-09-26 10:31:39,556 | DEBUG    | data                | pilot.control.data               | control                   | data control ending since graceful_stop has been set
2019-09-26 10:31:39,556 | DEBUG    | data                | pilot.control.data               | control                   | [data] control thread has finished
2019-09-26 10:31:39,594 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 14 threads
2019-09-26 10:31:39,594 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job, started 140406927873792)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(queue_monitor, started 140406272538368)>, <ExcThread(validate_pre, started 140406809409280)>, <ExcThread(validate_post, started 140406289323776)>, <ExcThread(validate, started 140406919481088)>, <ExcThread(failed_post, started 140406264145664)>, <ExcThread(payload, started 140406826194688)>, <ExcThread(execute_payloads, started 140406255752960)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>, <ExcThread(copytool_in, started 140406297716480)>, <ExcThread(create_data_payload, started 140406817801984)>]
2019-09-26 10:31:39,862 | INFO     | validate_post       | pilot.control.payload            | validate_post             | [payload] validate_post thread has finished
2019-09-26 10:31:39,896 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 13 threads
2019-09-26 10:31:39,896 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job, started 140406927873792)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(queue_monitor, started 140406272538368)>, <ExcThread(validate_pre, started 140406809409280)>, <ExcThread(validate, started 140406919481088)>, <ExcThread(failed_post, started 140406264145664)>, <ExcThread(payload, started 140406826194688)>, <ExcThread(execute_payloads, started 140406255752960)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>, <ExcThread(copytool_in, started 140406297716480)>, <ExcThread(create_data_payload, started 140406817801984)>]
2019-09-26 10:31:39,917 | INFO     | execute_payloads    | pilot.control.payload            | execute_payloads          | [payload] execute_payloads thread has finished
2019-09-26 10:31:40,002 | DEBUG    | job                 | pilot.control.job                | control                   | job control ending since graceful_stop has been set
2019-09-26 10:31:40,002 | DEBUG    | job                 | pilot.control.job                | control                   | [job] control thread has finished
2019-09-26 10:31:40,070 | DEBUG    | payload             | pilot.control.payload            | control                   | payload control ending since graceful_stop has been set
2019-09-26 10:31:40,070 | DEBUG    | payload             | pilot.control.payload            | control                   | [payload] control thread has finished
2019-09-26 10:31:40,104 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 10 threads
2019-09-26 10:31:40,104 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(queue_monitor, started 140406272538368)>, <ExcThread(validate_pre, started 140406809409280)>, <ExcThread(validate, started 140406919481088)>, <ExcThread(failed_post, started 140406264145664)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>, <ExcThread(copytool_in, started 140406297716480)>, <ExcThread(create_data_payload, started 140406817801984)>]
2019-09-26 10:31:40,229 | INFO     | validate_pre        | pilot.control.payload            | validate_pre              | [payload] validate_pre thread has finished
2019-09-26 10:31:40,270 | DEBUG    | copytool_in         | pilot.control.data               | copytool_in               | [data] copytool_in thread has finished
2019-09-26 10:31:40,305 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 8 threads
2019-09-26 10:31:40,306 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(queue_monitor, started 140406272538368)>, <ExcThread(validate, started 140406919481088)>, <ExcThread(failed_post, started 140406264145664)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>, <ExcThread(create_data_payload, started 140406817801984)>]
2019-09-26 10:31:40,613 | INFO     | failed_post         | pilot.control.payload            | failed_post               | [payload] failed_post thread has finished
2019-09-26 10:31:40,695 | WARNING  | copytool_out        | pilot.util.common                | should_abort              | data:copytool_out:received graceful stop - abort after this iteration
2019-09-26 10:31:40,708 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 7 threads
2019-09-26 10:31:40,708 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(queue_monitor, started 140406272538368)>, <ExcThread(validate, started 140406919481088)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>, <ExcThread(create_data_payload, started 140406817801984)>]
2019-09-26 10:31:40,726 | DEBUG    | create_data_payload | pilot.control.job                | create_data_payload       | [job] create_data_payload thread has finished
2019-09-26 10:31:40,754 | WARNING  | queue_monitor       | pilot.util.common                | should_abort              | job:queue_monitor:received graceful stop - abort after this iteration
2019-09-26 10:31:40,754 | DEBUG    | queue_monitor       | pilot.control.job                | queue_monitor             | [job] queue monitor thread has finished
2019-09-26 10:31:40,809 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 5 threads
2019-09-26 10:31:40,809 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(validate, started 140406919481088)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>]
2019-09-26 10:31:40,997 | DEBUG    | validate            | pilot.control.job                | validate                  | [job] validate thread has finished
2019-09-26 10:31:41,010 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 4 threads
2019-09-26 10:31:41,011 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(copytool_out, started 140406834587392)>, <ExcThread(queue_monitoring, started 140406801016576)>]
2019-09-26 10:31:41,695 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | [data] copytool_out thread has finished
2019-09-26 10:31:41,715 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 3 threads
2019-09-26 10:31:41,715 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>, <ExcThread(queue_monitoring, started 140406801016576)>]
2019-09-26 10:31:42,377 | WARNING  | queue_monitoring    | pilot.util.common                | should_abort              | data:queue_monitoring:received graceful stop - abort after this iteration
2019-09-26 10:31:45,378 | DEBUG    | queue_monitoring    | pilot.control.data               | queue_monitoring          | [data] queue_monitor thread has finished
2019-09-26 10:31:45,436 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | thread count now at 2 threads
2019-09-26 10:31:45,436 | DEBUG    | MainThread          | pilot.workflow.generic           | run                       | enumerate: [<_MainThread(MainThread, started 140407147632384)>, <ExcThread(job_monitor, started 140406280931072)>]
2019-09-26 10:32:29,054 | WARNING  | job_monitor         | pilot.control.job                | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 62 s)
2019-09-26 10:32:29,054 | DEBUG    | job_monitor         | pilot.control.job                | job_monitor               | [job] job monitor thread has finished
2019-09-26 10:32:29,095 | INFO     | MainThread          | pilot.workflow.generic           | run                       | end of generic workflow (traces error code: 0)
2019-09-26 10:32:29,096 | INFO     | MainThread          | root                             | wrap_up                   | traces error code: 0
2019-09-26 10:32:29,096 | INFO     | MainThread          | root                             | wrap_up                   | pilot has finished
2019-09-26 10:32:29 UTC [wrapper] ==== pilot stdout END ====
2019-09-26 10:32:29 UTC [wrapper] ==== wrapper stdout RESUME ====
2019-09-26 10:32:29 UTC [wrapper] Pilot exit status: 0
2019-09-26 10:32:29 UTC [wrapper] STATUSCODE: 0
2019-09-26 10:32:29 UTC [wrapper] apfmon messages muted
---- find pandaID.out ----
total 56
-rw-------.  1 dcameron zp 11357 Jul 25 16:38 LICENSE
-rw-------.  1 dcameron zp    20 Sep  9 13:04 MANIFEST.IN
-rw-------.  1 dcameron zp    11 Sep 26 12:18 pandaIDs.out
drwx------. 14 dcameron zp   216 Sep 26 12:18 pilot
-rwx------.  1 dcameron zp 20136 Sep  9 13:04 pilot.py
-rw-------.  1 dcameron zp     9 Sep  9 13:04 PILOTVERSION
-rw-------.  1 dcameron zp  2251 Jul 25 16:38 README.md
-rw-------.  1 dcameron zp   760 Aug 22 11:01 setup.py
-rw-------.  1 dcameron zp   221 Jul 25 16:38 TODO.md
-rw-------. 1 dcameron zp 11 Sep 26 12:18 /home/dcameron/boinc/slots/0/pilot2/pandaIDs.out
4002876565

2019-09-26 10:32:29 UTC [wrapper] Test setup, not cleaning
2019-09-26 10:32:29 UTC [wrapper] ==== wrapper stdout END ====
2019-09-26 10:32:29 UTC [wrapper] ==== wrapper stderr END ====
2019-09-26 10:32:29 UTC [wrapper] wrapper wrapperexiting ec=0, duration=832
2019-09-26 10:32:29 UTC [wrapper] apfmon messages muted
2019-09-26 12:32:29,348: ***************diag file************
2019-09-26 12:32:29,348: runtimeenvironments=APPS/HEP/ATLAS-SITE;
Processors=1
WallTime=831.68s
KernelTime=51.79s
UserTime=1724.77s
CPUUsage=213%
MaxResidentMemory=1944844kB
AverageResidentMemory=0kB
AverageTotalMemory=0kB
AverageUnsharedMemory=0kB
AverageUnsharedStack=0kB
AverageSharedMemory=0kB
PageSize=4096B
MajorPageFaults=174824
MinorPageFaults=9114100
Swaps=0
ForcedSwitches=65885
WaitSwitches=3329135
Inputs=40189424
Outputs=84280
SocketReceived=0
SocketSent=0
Signals=0

nodename=David_Cameron@pcoslo5.cern.ch
exitcode=0
2019-09-26 12:32:29,352: ******************************WorkDir***********************
2019-09-26 12:32:29,353: total 369604
drwxrwx--x. 7 dcameron zp      4096 Sep 26 12:32 .
drwxrwx--x. 3 dcameron zp        15 Sep 16 13:48 ..
-rw-------. 1 dcameron zp       506 Sep 26 12:32 3xMODmHgsRvnShfckohDCDFpABFKDmABFKDmQ6FKDmECFKDmW41Yzn.diag
-rw-------. 1 dcameron zp   7579510 Sep 26 12:18 agis_ddmendpoints.json
-rw-------. 1 dcameron zp   4044847 Sep 26 12:18 agis_schedconf.cvmfs.json
drwx------. 2 dcameron zp         6 Sep 26 12:18 .alrb
drwxr-xr-x. 3 dcameron zp        17 Sep 26 12:18 APPS
-rw-------. 1 dcameron zp       548 Sep 26 12:18 .asetup
-rw-------. 1 dcameron zp       826 Sep 26 12:19 .asetup.save
drwx------. 2 dcameron zp         6 Sep 26 12:19 .asetup-sysbin_1258984
-rw-r--r--. 1 dcameron zp         0 Sep 26 12:18 boinc_lockfile
-rw-r--r--. 1 dcameron zp      8192 Sep 26 12:32 boinc_mmap_file
-rw-r--r--. 1 dcameron zp       534 Sep 26 12:27 boinc_task_state.xml
-rw-r--r--. 1 dcameron zp 365251149 Sep 26 12:18 EVNT.14296418._001447.pool.root.1
-rw-------. 1 dcameron zp     14669 Sep 26 12:31 heartbeat.json
-rw-r--r--. 1 dcameron zp      6290 Sep 26 12:18 init_data.xml
-rw-r--r--. 1 dcameron zp    266069 Sep 26 12:18 input.tar.gz
-rw-r--r--. 1 dcameron zp       112 Sep 26 12:18 job.xml
-rw-------. 1 dcameron zp    710818 Sep 26 12:31 log.000649-2009753-26370._078090.job.log.tgz.1
-rw-------. 1 dcameron zp    179432 Sep 26 12:32 log.14568781._078090.job.log.1
-rw-------. 1 dcameron zp       499 Sep 26 12:32 output.list
-rw-------. 1 dcameron zp      2958 Sep 10 12:35 pandaJobData.out
drwx------. 3 dcameron zp       229 Sep 26 12:18 pilot2
-rw-r--r--. 1 dcameron zp    259319 Sep 10 12:31 pilot2.tar.gz
-rw-------. 1 dcameron zp     12133 Sep 26 12:32 pilotlog.txt
-rw-------. 1 dcameron zp      5077 Sep 26 12:18 queuedata.json
-rw-r--r--. 1 dcameron zp       815 Sep 26 12:18 RTE.tar.gz
-rwxr-xr-x. 1 dcameron zp      7941 Sep 26 12:18 run_atlas
-rwx------. 1 dcameron zp     12641 Sep 10 12:35 runpilot2-wrapper.sh
-rw-r--r--. 1 dcameron zp       618 Sep 26 12:32 runtime_log
-rw-r--r--. 1 dcameron zp      8100 Sep 26 12:32 runtime_log.err
-rw-------. 1 dcameron zp       240 Sep 26 12:18 setup.sh.local
drwxrwx--x. 2 dcameron zp       131 Sep 26 12:32 shared
-rw-r--r--. 1 dcameron zp      8494 Sep 26 12:18 start_atlas.sh
-rw-r--r--. 1 dcameron zp     18560 Sep 26 12:32 stderr.txt
-rw-r--r--. 1 dcameron zp       107 Sep 26 12:18 wrapper_26015_x86_64-pc-linux-gnu
-rw-r--r--. 1 dcameron zp        25 Sep 26 12:32 wrapper_checkpoint.txt

2019-09-26 12:32:29,353: running start_atlas return value is 0
2019-09-26 12:32:29,353: Parent exit 0
2019-09-26 12:32:29,353: child process exit 0
12:32:29 (1254891): run_atlas exited; CPU time 1725.124906
12:32:29 (1254891): called boinc_finish(0)

</stderr_txt>
]]>
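
Note on the diag figures: the CPUUsage=213% value in the diag file above appears consistent with (UserTime + KernelTime) / WallTime, which is an assumption about how the field is derived rather than something the log states. The following minimal sketch checks that arithmetic on a few lines copied from the diag block; the helper names (parse_diag, seconds, cpu_usage_percent) are hypothetical and not part of any ATLAS or BOINC tool.

```python
# Lines copied from the diag section of the log above.
DIAG_SAMPLE = """\
Processors=1
WallTime=831.68s
KernelTime=51.79s
UserTime=1724.77s
CPUUsage=213%
MaxResidentMemory=1944844kB
"""


def parse_diag(text):
    """Return a dict mapping each key to its raw string value for key=value lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def seconds(value):
    """Convert a value such as '831.68s' to a float number of seconds."""
    return float(value.rstrip("s"))


def cpu_usage_percent(fields):
    """Assumed definition: CPU-busy time (user + kernel) over wall-clock time."""
    busy = seconds(fields["UserTime"]) + seconds(fields["KernelTime"])
    return 100.0 * busy / seconds(fields["WallTime"])


fields = parse_diag(DIAG_SAMPLE)
usage = cpu_usage_percent(fields)
print(f"CPU usage: {usage:.1f}% of one core")  # prints "CPU usage: 213.6% of one core"
```

Under this assumption the result truncates to the reported 213%, meaning the job kept slightly more than two of its four threads busy on average over the 831.68 s wall time.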

