Name vjLKDm3kZqunShfckohDCDFpABFKDmABFKDmdZHRDmABFKDmsJX6Fm_0
Workunit 1901373
Created 30 May 2019, 12:55:12 UTC
Sent 30 May 2019, 15:25:52 UTC
Report deadline 6 Jun 2019, 15:25:52 UTC
Received 31 May 2019, 1:47:26 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 2244
Run time 10 hours 21 min 34 sec
CPU time 10 hours 40 min 52 sec
Validate state Valid
Credit 172.62
Device peak FLOPS 2.00 GFLOPS
Application version ATLAS Simulation v0.62 (native_mt)
x86_64-pc-linux-gnu
Peak working set size 1.91 GB
Peak swap size 2.44 GB
Peak disk usage 838.64 MB

Stderr output

<core_client_version>7.5.1</core_client_version>
<![CDATA[
<stderr_txt>
05:17:36 (10003): wrapper (7.7.26015): starting
05:17:36 (10003): wrapper: running run_atlas (--nthreads 1)
singularity image is /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-slc6.img
sys.argv = ['run_atlas', '--nthreads', '1']
THREADS=1
Checking for CVMFS
CVMFS is installed
OS:Scientific Linux release 6.10 (Carbon)

This is SLC or CentOS release 6, run the atlas job without Singularity
copy /root/Downloads/BOINC/slots/25/shared/input.tar.gz
copy /root/Downloads/BOINC/slots/25/shared/start_atlas.sh
copy /root/Downloads/BOINC/slots/25/shared/ATLAS.root_0
copy /root/Downloads/BOINC/slots/25/shared/RTE.tar.gz
start atlas job with 
cmd = sh start_atlas.sh > runtime_log 2> runtime_log.err
running cmd return value is 0
Moving ./HITS.18099307._036993.pool.root.1 to shared/HITS.pool.root.1
*****************The last 100 lines of the pilot log******************
2019-05-31 13:53:52,396 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_job_metrics           | job metrics="coreCount=1 leak=3.81"
2019-05-31 13:53:52,396 | INFO     | queue_monitor       | pilot.control.job                | get_data_structure        | payload/TRF did not report the number of read events
2019-05-31 13:53:52,397 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_values         | using path: /root/Downloads/BOINC/slots/25/PanDA_Pilot-4364286066/memory_monitor_summary.json
2019-05-31 13:53:52,398 | DEBUG    | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | summary_dictionary={'Max': {'maxRSS': 2111744, 'maxSwap': 0, 'totRCHAR': 1639886823, 'totRBYTES': 1634512896, 'totWCHAR': 115661588, 'totWBYTES': 125935616, 'maxVMEM': 3222540, 'maxPSS': 2103296}, 'Avg': {'avgRSS': 2074284, 'rateWCHAR': 3046, 'avgVMEM': 3175032, 'avgPSS': 2065534, 'rateRCHAR': 43187, 'avgSwap': 0, 'rateRBYTES': 43046, 'rateWBYTES': 3316}}
2019-05-31 13:53:52,398 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard info from memory monitor json
2019-05-31 13:53:52,398 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard memory fields from memory monitor json
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . Timing measurements:
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . get job = 8 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . initial setup = 16 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload setup = 0 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . total setup = 16 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-in = 8 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload execution = 38064 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-out = 28 s
2019-05-31 13:53:52,400 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-31 13:53:52,400 | INFO     | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | building log extracts (sent to the server as 'pilotLog')
2019-05-31 13:53:52,400 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_panda_tracer_log      | PanDA tracer log does not exist: /root/Downloads/BOINC/slots/25/PanDA_Pilot-4364286066/pandatracerlog.txt (ignoring)
2019-05-31 13:53:52,400 | INFO     | queue_monitor       | pilot.util.container             | execute                   | executing command: tail -n 20 /root/Downloads/BOINC/slots/25/PanDA_Pilot-4364286066/pilotlog.txt
2019-05-31 13:53:52,518 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_pilot_log_extracts    | dumping warning messages from pilotlog.txt:

2019-05-31 13:53:52,588 | WARNING  | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | detected the following tail of warning/fatal messages in the pilot log:
- Log from pilotlog.txt -
2019-05-31 13:53:52,395 | INFO     | queue_monitor       | pilot.api.analytics              | get_fitted_data           | current memory leak: 3.81 B/s (using 612 data points, chi2=-36730081127.376816)
2019-05-31 13:53:52,396 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_job_metrics           | job metrics="coreCount=1 leak=3.81"
2019-05-31 13:53:52,396 | INFO     | queue_monitor       | pilot.control.job                | get_data_structure        | payload/TRF did not report the number of read events
2019-05-31 13:53:52,397 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_values         | using path: /root/Downloads/BOINC/slots/25/PanDA_Pilot-4364286066/memory_monitor_summary.json
2019-05-31 13:53:52,398 | DEBUG    | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | summary_dictionary={'Max': {'maxRSS': 2111744, 'maxSwap': 0, 'totRCHAR': 1639886823, 'totRBYTES': 1634512896, 'totWCHAR': 115661588, 'totWBYTES': 125935616, 'maxVMEM': 3222540, 'maxPSS': 2103296}, 'Avg': {'avgRSS': 2074284, 'rateWCHAR': 3046, 'avgVMEM': 3175032, 'avgPSS': 2065534, 'rateRCHAR': 43187, 'avgSwap': 0, 'rateRBYTES': 43046, 'rateWBYTES': 3316}}
2019-05-31 13:53:52,398 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard info from memory monitor json
2019-05-31 13:53:52,398 | INFO     | queue_monitor       | pilot.user.atlas.utilities       | get_memory_monitor_info   | extracted standard memory fields from memory monitor json
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . Timing measurements:
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . get job = 8 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . initial setup = 16 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload setup = 0 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . total setup = 16 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-in = 8 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . payload execution = 38064 s
2019-05-31 13:53:52,399 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | . stage-out = 28 s
2019-05-31 13:53:52,400 | INFO     | queue_monitor       | pilot.util.auxiliary             | timing_report             | ..............................
2019-05-31 13:53:52,400 | INFO     | queue_monitor       | pilot.util.auxiliary             | get_log_extracts          | building log extracts (sent to the server as 'pilotLog')
2019-05-31 13:53:52,400 | DEBUG    | queue_monitor       | pilot.util.auxiliary             | get_panda_tracer_log      | PanDA tracer log does not exist: /root/Downloads/BOINC/slots/25/PanDA_Pilot-4364286066/pandatracerlog.txt (ignoring)
2019-05-31 13:53:52,400 | INFO     | queue_monitor       | pilot.util.container             | execute                   | executing command: tail -n 20 /root/Downloads/BOINC/slots/25/PanDA_Pilot-4364286066/pilotlog.txt
- Error messages from pilotlog.txt -
          "CRITICAL": 0, 

          "ERROR": 0, 

          "FATAL": 0, 


2019-05-31 13:53:52,592 | DEBUG    | queue_monitor       | pilot.control.job                | send_state                | wrote heartbeat to file /root/Downloads/BOINC/slots/25/heartbeat.json
2019-05-31 13:53:52,592 | INFO     | queue_monitor       | pilot.control.job                | queue_monitor             | job 4364286066 was dequeued from the monitored payloads queue
2019-05-31 13:53:53,606 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-05-31 13:53:53,607 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | job summary report
2019-05-31 13:53:53,607 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-05-31 13:53:53,607 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | PanDA job id: 4364286066
2019-05-31 13:53:53,607 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | task id: 18099307
2019-05-31 13:53:53,607 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | errors: (none)
2019-05-31 13:53:53,608 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | status: LOG_TRANSFER = DONE 
2019-05-31 13:53:53,613 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pilot state: finished 
2019-05-31 13:53:53,613 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | transexitcode: 0
2019-05-31 13:53:53,614 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exeerrorcode: 0
2019-05-31 13:53:53,614 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exeerrordiag: 
2019-05-31 13:53:53,614 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exitcode: 0
2019-05-31 13:53:53,615 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | exitmsg: OK
2019-05-31 13:53:53,615 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | cpuconsumptiontime: 38209 s
2019-05-31 13:53:53,616 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | nevents: 0
2019-05-31 13:53:53,616 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | neventsw: 0
2019-05-31 13:53:53,616 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pid: 16191
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | pgrp: None
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | corecount: 1
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | event service: False
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | --------------------------------------------------
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.util.auxiliary             | make_job_report           | 
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.control.job                | has_job_completed         | job 4364286066 has completed
2019-05-31 13:53:53,617 | INFO     | retrieve            | pilot.control.job                | retrieve                  | ready for new job
2019-05-31 13:53:53,617 | DEBUG    | retrieve            | pilot.control.job                | retrieve                  | getjob_requests=1
2019-05-31 13:54:03,635 | DEBUG    | retrieve            | pilot.control.job                | proceed_with_getjob       | proceed_with_getjob called with getjob_requests=1
2019-05-31 13:54:03,636 | DEBUG    | retrieve            | pilot.util.monitoring            | check_local_space         | checking local space on /root/Downloads/BOINC/slots/25
2019-05-31 13:54:03,659 | INFO     | retrieve            | pilot.util.monitoring            | check_local_space         | sufficient remaining disk space (14301528064 B)
2019-05-31 13:54:03,659 | WARNING  | retrieve            | pilot.control.job                | proceed_with_getjob       | since timefloor is set to 0, pilot was only allowed to run one job
2019-05-31 13:54:03,787 | DEBUG    | data                | pilot.control.data               | control                   | data control ending since graceful_stop has been set
2019-05-31 13:54:03,989 | WARNING  | copytool_out        | pilot.util.common                | should_abort              | data:copytool_out:received graceful stop - abort after this iteration
2019-05-31 13:54:03,990 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | will abort 
2019-05-31 13:54:04,187 | INFO     | validate_post       | pilot.control.payload            | validate_post             | validate_post has finished
2019-05-31 13:54:04,336 | DEBUG    | payload             | pilot.control.payload            | control                   | payload control ending since graceful_stop has been set
2019-05-31 13:54:04,336 | DEBUG    | job                 | pilot.control.job                | control                   | job control ending since graceful_stop has been set
2019-05-31 13:54:04,604 | DEBUG    | copytool_in         | pilot.control.data               | copytool_in               | copytool_in ended since graceful_stop has been set
2019-05-31 13:54:04,638 | INFO     | monitor             | pilot.control.monitor            | control                   | monitor control has ended
2019-05-31 13:54:05,052 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | aborting
2019-05-31 13:54:05,055 | DEBUG    | copytool_out        | pilot.control.data               | copytool_out              | copytool_out has finished
2019-05-31 13:54:05,417 | WARNING  | queue_monitoring    | pilot.util.common                | should_abort              | data:queue_monitoring:received graceful stop - abort after this iteration
2019-05-31 13:54:05,729 | WARNING  | queue_monitor       | pilot.util.common                | should_abort              | job:queue_monitor:received graceful stop - abort after this iteration
2019-05-31 13:54:05,730 | INFO     | queue_monitor       | pilot.control.job                | queue_monitor             | [job] queue monitor has finished
2019-05-31 13:54:08,610 | INFO     | queue_monitoring    | pilot.control.data               | queue_monitoring          | [data] queue monitor has finished
2019-05-31 13:54:25,468 | WARNING  | job_monitor         | pilot.control.job                | check_job_monitor_waiting_time | no jobs in monitored_payloads queue (waited for 61 s)
2019-05-31 13:54:25,470 | INFO     | job_monitor         | pilot.control.job                | job_monitor               | job monitor has finished
2019-05-31 13:54:25,470 | INFO     | MainThread          | pilot.workflow.generic           | run                       | end of generic workflow (traces error code: 0)
2019-05-31 13:54:25,471 | INFO     | MainThread          | root                             | wrap_up                   | traces error code: 0
2019-05-31 13:54:25,471 | INFO     | MainThread          | root                             | wrap_up                   | pilot has finished
***************diag file************
runtimeenvironments=APPS/HEP/ATLAS-SITE;
Processors=1
WallTime=38198.48s
KernelTime=904.17s
UserTime=38142.81s
CPUUsage=102%
MaxResidentMemory=2000260kB
AverageResidentMemory=0kB
AverageTotalMemory=0kB
AverageUnsharedMemory=0kB
AverageUnsharedStack=0kB
AverageSharedMemory=0kB
PageSize=4096B
MajorPageFaults=9573
MinorPageFaults=15698627
Swaps=0
ForcedSwitches=397640
WaitSwitches=18950619
Inputs=3336744
Outputs=303784
SocketReceived=0
SocketSent=0
Signals=0

nodename=maeax@APU8S
exitcode=0
******************************WorkDir***********************
total 382996
drwxrwx--x.  7 root root      4096 May 31 15:54 .
drwxr-x--x. 28 root root      4096 May 31 05:17 ..
-rw-------.  1 root root   7284719 May 31 05:18 agis_ddmendpoints.json
-rw-------.  1 root root   4801684 May 31 05:18 agis_schedconf.cvmfs.json
drwx------.  2 root root      4096 May 31 05:17 .alrb
drwxr-xr-x.  3 root root      4096 May 31 05:17 APPS
-rw-------.  1 root root       550 May 31 05:17 .asetup
-rw-------.  1 root root      2986 May 31 05:18 .asetup.save
-rw-r--r--.  1 root root         0 May 31 05:17 boinc_lockfile
-rw-r--r--.  1 root root      8192 May 31 15:54 boinc_mmap_file
-rw-r--r--.  1 root root       537 May 31 15:53 boinc_task_state.xml
-rw-r--r--.  1 root root 375535403 May 31 05:17 EVNT.17323937._000680.pool.root.1
-rw-------.  1 root root     57291 May 31 15:53 heartbeat.json
-rw-r--r--.  1 root root      5470 May 31 05:17 init_data.xml
-rw-r--r--.  1 root root    249673 May 31 05:17 input.tar.gz
-rw-r--r--.  1 root root       112 May 31 05:17 job.xml
-rw-------.  1 root root   1757110 May 31 15:54 log.18099307._036993.job.log.1
-rw-------.  1 root root    324576 May 31 15:53 log.18099307._036993.job.log.tgz.1
-rw-------.  1 root root       306 May 31 15:53 memory_monitor_summary.json
-rw-------.  1 root root       391 May 31 15:54 output.list
-rw-------.  1 4871 1028      2886 May 30 14:54 pandaJobData.out
drwxrwx---.  2 root root      4096 May 31 15:53 PanDA_Pilot-4364286066
drwxr-xr-x.  3 1700 1332      4096 May 20 09:54 pilot2
-rw-r--r--.  1 root root    240831 May 24 16:04 pilot2.tar.gz
-rw-------.  1 root root   1742022 May 31 15:54 pilotlog.txt
-rw-r--r--.  1 root root      4444 May 30 14:54 queuedata.json
-rw-r--r--.  1 root root       786 May 31 05:17 RTE.tar.gz
-rwxr-xr-x.  1 root root      8512 May 31 05:17 run_atlas
-rwx------.  1 4871 1028     14753 May 30 14:54 runpilot2-wrapper.sh
-rw-r--r--.  1 root root       692 May 31 15:54 runtime_log
-rw-r--r--.  1 root root      7355 May 31 15:54 runtime_log.err
drwxrwx--x.  2 root root      4096 May 31 15:54 shared
-rw-r--r--.  1 root root      8714 May 31 05:17 start_atlas.sh
-rw-r--r--.  1 root root     16744 May 31 15:54 stderr.txt
-rw-------.  1 root root       493 May 31 15:54 vjLKDm3kZqunShfckohDCDFpABFKDmABFKDmdZHRDmABFKDmsJX6Fm.diag
-rw-r--r--.  1 root root       107 May 31 05:17 wrapper_26015_x86_64-pc-linux-gnu
-rw-r--r--.  1 root root        28 May 31 15:54 wrapper_checkpoint.txt
running start_atlas return value is 0
Parent exit 0
child process exit 0
15:54:26 (10003): run_atlas exited; CPU time 38143.286339
15:54:26 (10003): called boinc_finish(0)

</stderr_txt>
]]>

