41)
Message boards :
ATLAS Application :
ATLAS vbox v.1.18
(Message 7884)
Posted 17 Nov 2022 by Crystal Pellet Post: Yesterday the tasks used the new wrapper v26206 as announced, but were still running the old version 1.17 of the application. Today the new application 1.18-vdi was downloaded. |
42)
Message boards :
News :
Server Release 1.4.0
(Message 7883)
Posted 16 Nov 2022 by Crystal Pellet Post: I get this line at the top of the Server status page: set_cached_data(): can't open ../cache/f2/server_status.php_job_status |
43)
Message boards :
News :
Server Release 1.4.0
(Message 7871)
Posted 10 Nov 2022 by Crystal Pellet Post: When forum posts are set to sort newest first, the newest post is shown below the previous (older) post. |
44)
Message boards :
ATLAS Application :
ATLAS vbox v.1.17
(Message 7867)
Posted 8 Nov 2022 by Crystal Pellet Post: @Laurence and/or David, any plans to update vboxwrapper to the officially released version 26206 here at dev and in production? |
45)
Message boards :
CMS Application :
New Version 60.68
(Message 7850)
Posted 4 Nov 2022 by Crystal Pellet Post: A new CMS version was deployed this morning. No new files were downloaded to my system. What is the difference to the previous version? Is it more than avoiding requesting multiple idtokens? |
46)
Message boards :
CMS Application :
New Version 60.67
(Message 7847)
Posted 3 Nov 2022 by Crystal Pellet Post: I tested 1 CMS task v60.67 and it returned fine: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3136136
Combination: BOINC 7.20.2 and VBox 7.0.2.
Somewhere in the middle of the task I suspended it for a few minutes, with the state saved to disk. Towards the end I did one task suspend with "keep in memory" active.
Nothing to do with this vboxwrapper, but I noticed that VBox 7 keeps the used hard disks in the VirtualBox.xml file even after a reboot and with no BOINC VMs in use.
<MediaRegistry>
  <HardDisks>
    <HardDisk uuid="{6f08958e-7bfd-4804-8dd7-c7b4408cb126}" location="D:/Boinc1/projects/lhcathome.cern.ch_lhcathome/ATLAS_vbox_2.02_image.vdi" format="VDI" type="MultiAttach"/>
    <HardDisk uuid="{997e0796-142b-4278-9763-0bceb3ac71bc}" location="D:/Boinc1/projects/lhcathomedev.cern.ch_lhcathome-dev/ATLAS_vbox_1.17_image.vdi" format="VDI" type="MultiAttach"/>
    <HardDisk uuid="{dae25e8f-de18-4971-b11c-eca764ede402}" location="D:/Boinc1/projects/lhcathome.cern.ch_lhcathome/CMS_2022_09_07_prod.vdi" format="VDI" type="MultiAttach"/>
    <HardDisk uuid="{8fb925ef-3497-4bfb-88e3-bbab2930787f}" location="D:/Boinc1/projects/lhcathomedev.cern.ch_lhcathome-dev/CMS_2022_09_07.vdi" format="VDI" type="MultiAttach"/>
  </HardDisks> |
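To check which disks are still registered without opening the file by hand, something like the following might help. It is only a sketch: the sample string copies two entries from the post and adds a closing </MediaRegistry> tag so that it parses, and the helper names (`local_name`, `list_hard_disks`) are made up for illustration. The real file may wrap these elements in a namespaced root, so the sketch matches on local tag names.

```python
import xml.etree.ElementTree as ET

# Two entries copied from the post; the closing </MediaRegistry> tag is
# added here only so the sample is well-formed (assumption).
SAMPLE = """<MediaRegistry>
  <HardDisks>
    <HardDisk uuid="{6f08958e-7bfd-4804-8dd7-c7b4408cb126}" location="D:/Boinc1/projects/lhcathome.cern.ch_lhcathome/ATLAS_vbox_2.02_image.vdi" format="VDI" type="MultiAttach"/>
    <HardDisk uuid="{8fb925ef-3497-4bfb-88e3-bbab2930787f}" location="D:/Boinc1/projects/lhcathomedev.cern.ch_lhcathome-dev/CMS_2022_09_07.vdi" format="VDI" type="MultiAttach"/>
  </HardDisks>
</MediaRegistry>"""


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def list_hard_disks(xml_text: str) -> list[dict]:
    """Return uuid, location and type for every HardDisk element found."""
    root = ET.fromstring(xml_text)
    disks = []
    for elem in root.iter():
        if local_name(elem.tag) == "HardDisk":
            disks.append(
                {
                    "uuid": elem.get("uuid"),
                    "location": elem.get("location"),
                    "type": elem.get("type"),
                }
            )
    return disks


if __name__ == "__main__":
    for disk in list_hard_disks(SAMPLE):
        print(disk["type"], disk["location"])
```

Running it against the text of an actual VirtualBox.xml (read with `open(...).read()`) should list the same kind of entries, provided the file has the structure shown in the post.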
47)
Message boards :
ATLAS Application :
ATLAS vbox v.1.17
(Message 7832)
Posted 21 Oct 2022 by Crystal Pellet Post: Tested 1 ATLAS-task with the newest VirtualBox version 7.0.2: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3126629 |
48)
Message boards :
General Discussion :
Xtrack beam simulation
(Message 7831)
Posted 20 Oct 2022 by Crystal Pellet Post: This afternoon I got 5 Xtrack beam simulation tasks, all 5 of them resends: https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4600&offset=0&show_names=0&state=0&appid=16
The tasks are not shown in BOINC's task list. This is what is shown in BOINC's log:
598 lhcathome-dev 20 Oct 17:00:50 Requesting new tasks for CPU
599 lhcathome-dev 20 Oct 17:00:52 [error] Can't parse file info in scheduler reply: file name is empty or has '..'
600 lhcathome-dev 20 Oct 17:00:52 Scheduler request completed: got 1 new tasks
601 lhcathome-dev 20 Oct 17:00:52 Project requested delay of 61 seconds
602 lhcathome-dev 20 Oct 17:00:52 [error] State file error: missing file ../xboinc_input.bin
603 lhcathome-dev 20 Oct 17:00:52 [error] State file error: missing input file ../xboinc_input.bin
604 lhcathome-dev 20 Oct 17:00:52 [error] Can't handle task Xtrack_3838109_1664549883.039677 in scheduler reply
605 lhcathome-dev 20 Oct 17:00:52 [error] State file error: missing task Xtrack_3838109_1664549883.039677
606 lhcathome-dev 20 Oct 17:00:52 [error] Can't handle task Xtrack_3838109_1664549883.039677_2 in scheduler reply
607 lhcathome-dev 20 Oct 17:00:54 Started download of xboinc_011-windows_x86_64.exe
608 lhcathome-dev 20 Oct 17:00:59 Finished download of xboinc_011-windows_x86_64.exe
620 lhcathome-dev 20 Oct 17:19:05 Sending scheduler request: To fetch work.
622 lhcathome-dev 20 Oct 17:19:05 Requesting new tasks for CPU
623 lhcathome-dev 20 Oct 17:19:07 [error] Can't parse file info in scheduler reply: file name is empty or has '..'
624 lhcathome-dev 20 Oct 17:19:07 [error] Can't parse file info in scheduler reply: file name is empty or has '..'
625 lhcathome-dev 20 Oct 17:19:07 Scheduler request completed: got 2 new tasks
626 lhcathome-dev 20 Oct 17:19:07 Project requested delay of 61 seconds
627 lhcathome-dev 20 Oct 17:19:07 [error] State file error: missing file ../xboinc_input.bin
628 lhcathome-dev 20 Oct 17:19:07 [error] State file error: missing input file ../xboinc_input.bin
629 lhcathome-dev 20 Oct 17:19:07 [error] Can't handle task Xtrack_3838316_1664550120.666061 in scheduler reply
630 lhcathome-dev 20 Oct 17:19:07 [error] State file error: missing file ../xboinc_input.bin
631 lhcathome-dev 20 Oct 17:19:07 [error] State file error: missing input file ../xboinc_input.bin
632 lhcathome-dev 20 Oct 17:19:07 [error] Can't handle task Xtrack_3838326_1664550122.364841 in scheduler reply
633 lhcathome-dev 20 Oct 17:19:07 [error] State file error: missing task Xtrack_3838316_1664550120.666061
634 lhcathome-dev 20 Oct 17:19:07 [error] Can't handle task Xtrack_3838316_1664550120.666061_2 in scheduler reply
635 lhcathome-dev 20 Oct 17:19:07 [error] State file error: missing task Xtrack_3838326_1664550122.364841
636 lhcathome-dev 20 Oct 17:19:07 [error] Can't handle task Xtrack_3838326_1664550122.364841_2 in scheduler reply
637 lhcathome-dev 20 Oct 17:34:24 Sending scheduler request: To fetch work.
638 lhcathome-dev 20 Oct 17:34:24 Requesting new tasks for CPU
639 lhcathome-dev 20 Oct 17:34:25 [error] Can't parse file info in scheduler reply: file name is empty or has '..'
640 lhcathome-dev 20 Oct 17:34:25 [error] Can't parse file info in scheduler reply: file name is empty or has '..'
641 lhcathome-dev 20 Oct 17:34:25 Scheduler request completed: got 2 new tasks
642 lhcathome-dev 20 Oct 17:34:25 Project requested delay of 61 seconds
643 lhcathome-dev 20 Oct 17:34:25 [error] State file error: missing file ../xboinc_input.bin
644 lhcathome-dev 20 Oct 17:34:25 [error] State file error: missing input file ../xboinc_input.bin
645 lhcathome-dev 20 Oct 17:34:25 [error] Can't handle task Xtrack_3838302_1664550118.050580 in scheduler reply
646 lhcathome-dev 20 Oct 17:34:25 [error] State file error: missing file ../xboinc_input.bin
647 lhcathome-dev 20 Oct 17:34:25 [error] State file error: missing input file ../xboinc_input.bin
648 lhcathome-dev 20 Oct 17:34:25 [error] Can't handle task Xtrack_3838322_1664550121.701115 in scheduler reply
649 lhcathome-dev 20 Oct 17:34:25 [error] State file error: missing task Xtrack_3838302_1664550118.050580
650 lhcathome-dev 20 Oct 17:34:25 [error] Can't handle task Xtrack_3838302_1664550118.050580_2 in scheduler reply
651 lhcathome-dev 20 Oct 17:34:25 [error] State file error: missing task Xtrack_3838322_1664550121.701115
652 lhcathome-dev 20 Oct 17:34:25 [error] Can't handle task Xtrack_3838322_1664550121.701115_2 in scheduler reply |
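The log above suggests that the input file name ../xboinc_input.bin trips the "file name is empty or has '..'" condition, after which the tasks that reference it cannot be handled. The sketch below illustrates this under stated assumptions: `name_is_rejected` only mirrors the wording of the log message and is not BOINC's actual code, and `summarize` is a hypothetical helper that pulls the missing file names and the unhandled task names out of a log excerpt.

```python
import re

# Three lines copied from the log in the post.
LOG_EXCERPT = """\
599 lhcathome-dev 20 Oct 17:00:52 [error] Can't parse file info in scheduler reply: file name is empty or has '..'
602 lhcathome-dev 20 Oct 17:00:52 [error] State file error: missing file ../xboinc_input.bin
604 lhcathome-dev 20 Oct 17:00:52 [error] Can't handle task Xtrack_3838109_1664549883.039677 in scheduler reply
"""

MISSING_FILE = re.compile(r"State file error: missing file (\S+)")
UNHANDLED_TASK = re.compile(r"Can't handle task (\S+) in scheduler reply")


def name_is_rejected(name: str) -> bool:
    """Mirror the condition named in the log: empty, or contains '..'."""
    return name == "" or ".." in name


def summarize(log_text: str) -> tuple[list[str], list[str]]:
    """Return (missing file names, unhandled task names), de-duplicated."""
    missing = sorted(set(MISSING_FILE.findall(log_text)))
    tasks = sorted(set(UNHANDLED_TASK.findall(log_text)))
    return missing, tasks


if __name__ == "__main__":
    missing, tasks = summarize(LOG_EXCERPT)
    for name in missing:
        print(name, "rejected" if name_is_rejected(name) else "accepted")
    print("unhandled tasks:", tasks)
```

Note that a task name such as Xtrack_3838109_1664549883.039677 contains a single dot only, so it would pass this check; the relative path in the input file name is what matches.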
49)
Message boards :
ATLAS Application :
ATLAS vbox v.1.17
(Message 7827)
Posted 19 Oct 2022 by Crystal Pellet Post: I tested several tasks. I'm not sure what problem should be solved. https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4547&offset=0&show_names=0&state=0&appid=5 https://lhcathomedev.cern.ch/lhcathome-dev/results.php?hostid=4600&offset=0&show_names=0&state=0&appid=5 On host ID 4547 I had three errors. One was among the first 5 tasks, which all started at once; on that task I got a popup of a vboxheadless.exe application error. With 2 other tasks I tested the network connection problem by interrupting my internet connection when starting an ATLAS task. I got "Testing CVMFS ....", of course without a response. The tasks keep running endlessly. After reconnecting to the internet nothing positive happened either: the tasks kept running while doing nothing. So I suspended the tasks, removed the saved states, rebooted the VMs and saved the state after more than 5 minutes of runtime. Then I resumed the tasks in BOINC. One task got the same VBoxHeadless application error as above and the other was canceled by the server although it was running well. |
50)
Message boards :
General Discussion :
Server error
(Message 7817)
Posted 21 Sep 2022 by Crystal Pellet Post:
lhcathome-dev 21 Sep 13:02:14 Sending scheduler request: To report completed tasks.
lhcathome-dev 21 Sep 13:02:14 Reporting 1 completed tasks
lhcathome-dev 21 Sep 13:02:14 Not requesting tasks: "no new tasks" requested via Manager
lhcathome-dev 21 Sep 13:02:15 Scheduler request completed
lhcathome-dev 21 Sep 13:02:15 Server error: recompile needed
lhcathome-dev 21 Sep 13:02:15 Project requested delay of 3600 seconds |
51)
Message boards :
CMS Application :
New Version 60.66
(Message 7813)
Posted 17 Sep 2022 by Crystal Pellet Post: Thought it was an interrupt at 13 UTC in the CMS-Servers (WM-Agent upgrade).
No, there were enough jobs. We have been out of CMS jobs since 05:30 UTC this morning. |
52)
Message boards :
CMS Application :
New Version 60.66
(Message 7811)
Posted 17 Sep 2022 by Crystal Pellet Post: Since 2 hours (13 UTC) task is doing nothing. 9. task have 1.4 MByte Data, but did not finished.
Your task also ended after the hard-coded job duration of 64800 seconds (18 hours). So did Ivan's v60.66 tasks:
Runtime (seconds)   CPU (seconds)
64,967.01           39,813.72
64,966.98           39,689.98
64,885.72           5,572.52
64,886.27           5,588.20
64,907.93           5,049.78
The shutdown after 12 hours of runtime and a finished job is not working. On LHC@home there is even a more sophisticated method to calculate whether it's worth requesting a new job before the first 12 hours are over. |
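The figures in the table can be put into perspective by dividing CPU time by runtime. A minimal sketch, assuming the first column is wall-clock runtime and the second is CPU time, both in seconds; the names `ROWS`, `HARD_LIMIT_S` and `cpu_efficiency` are made up for illustration:

```python
# (runtime_s, cpu_s) pairs copied from the table in the post.
ROWS = [
    (64967.01, 39813.72),
    (64966.98, 39689.98),
    (64885.72, 5572.52),
    (64886.27, 5588.20),
    (64907.93, 5049.78),
]

# Hard-coded job duration mentioned in the post: 18 hours in seconds.
HARD_LIMIT_S = 18 * 3600


def cpu_efficiency(runtime_s: float, cpu_s: float) -> float:
    """Fraction of the wall-clock runtime that was spent on the CPU."""
    return cpu_s / runtime_s


if __name__ == "__main__":
    for runtime_s, cpu_s in ROWS:
        over = runtime_s - HARD_LIMIT_S
        print(
            f"runtime {runtime_s:>10.2f} s  "
            f"efficiency {cpu_efficiency(runtime_s, cpu_s):6.1%}  "
            f"{over:6.2f} s past the 18 h limit"
        )
```

Every task ran slightly past 64800 seconds, and the last three spent less than a tenth of that time on the CPU, which is consistent with tasks idling until the hard limit instead of shutting down after 12 hours.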
53)
Message boards :
CMS Application :
New Version 60.66
(Message 7808)
Posted 16 Sep 2022 by Crystal Pellet Post: The task I started yesterday evening is still in a running state, but is not doing a CMS job. 4 jobs have finished. It has not (yet) been shut down gracefully by the VM itself, although:
09/16/22 12:09:14 (pid:15847) The DaemonShutdown expression "(STARTD_StartTime =?= 0)" evaluated to TRUE: starting graceful shutdown
09/16/22 12:09:14 (pid:15847) Got SIGTERM. Performing graceful shutdown.
09/16/22 12:09:14 (pid:15847) About to tell the ProcD to exit
09/16/22 12:09:14 (pid:15847) All daemons are gone. Exiting.
09/16/22 12:09:14 (pid:15847) **** condor_master (condor_MASTER) pid 15847 EXITING WITH STATUS 99
Run time is over 15 hours and CPU time over 14 hours. There is only 1 boinc process active inside the VM: bash. I'll wait another hour or until the hard shutdown by vboxwrapper after 18 hours.
pid 15847 EXITING WITH STATUS 99: What does that mean?
https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3112985 |
54)
Message boards :
CMS Application :
New Version 60.66
(Message 7805)
Posted 16 Sep 2022 by Crystal Pellet Post: The 2nd, 3rd and 4th job of this task did not have these connection issues. |
55)
Message boards :
CMS Application :
New Version 60.66
(Message 7802)
Posted 15 Sep 2022 by Crystal Pellet Post: That certainly looks better. Transient network problems earlier?
I had and have no network problems. I'm not using any proxy. |
56)
Message boards :
CMS Application :
New Version 60.66
(Message 7800)
Posted 15 Sep 2022 by Crystal Pellet Post: Jobs now available again. I believe the little glitch here in -dev has been repaired but I won't be able to confirm it myself until mid-morning tomorrow.
Not looking good so far:
cmsRun -j FrameworkJobReport.xml PSet.py
warn [frontier.c:1014]: Request 507 on chan 1 failed at Thu Sep 15 21:48:52 2022: -6 [fn-socket.c:239]: read from 172.64.206.32 timed out after 10 seconds
warn [frontier.c:1136]: Trying next server cms-frontier.openhtc.io[172.64.207.32]
warn [frontier.c:1014]: Request 701 on chan 1 failed at Thu Sep 15 21:49:07 2022: -6 [fn-socket.c:239]: read from 172.64.207.32 timed out after 10 seconds
warn [frontier.c:1136]: Trying next server cms-frontier.openhtc.io[2606:4700:e6::ac40:cf20]
warn [frontier.c:1014]: Request 702 on chan 1 failed at Thu Sep 15 21:49:07 2022: -9 [fn-socket.c:85]: network error on connect to 2606:4700:e6::ac40:cf20: Network is unreachable
warn [frontier.c:1136]: Trying next server cms-frontier.openhtc.io[2606:4700:e6::ac40:ce20]
warn [frontier.c:1014]: Request 703 on chan 1 failed at Thu Sep 15 21:49:07 2022: -9 [fn-socket.c:85]: network error on connect to 2606:4700:e6::ac40:ce20: Network is unreachable
warn [frontier.c:1136]: Trying next server cms1-frontier.openhtc.io
warn [frontier.c:1014]: Request 704 on chan 1 failed at Thu Sep 15 21:49:27 2022: -6 [fn-urlparse.c:178]: host name cms1-frontier.openhtc.io problem: Name or service not known
warn [frontier.c:1136]: Trying next server cms2-frontier.openhtc.io
... but finally:
Begin processing the 1st record. Run 1, Event 1920001, LumiSection 3841 on stream 0 at 15-Sep-2022 21:51:36.256 CEST |
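The warnings above fall into three kinds of failure: read timeouts, unreachable networks and name resolution errors. A small sketch for tallying them from a pasted log could look like this. The labels in `REASONS` and the helper names are made up for illustration, and the matching relies only on the message texts visible in the post, not on any documented Frontier log format.

```python
from collections import Counter

# Four lines copied from the log in the post.
LOG = """\
warn [frontier.c:1014]: Request 507 on chan 1 failed at Thu Sep 15 21:48:52 2022: -6 [fn-socket.c:239]: read from 172.64.206.32 timed out after 10 seconds
warn [frontier.c:1136]: Trying next server cms-frontier.openhtc.io[172.64.207.32]
warn [frontier.c:1014]: Request 702 on chan 1 failed at Thu Sep 15 21:49:07 2022: -9 [fn-socket.c:85]: network error on connect to 2606:4700:e6::ac40:cf20: Network is unreachable
warn [frontier.c:1014]: Request 704 on chan 1 failed at Thu Sep 15 21:49:27 2022: -6 [fn-urlparse.c:178]: host name cms1-frontier.openhtc.io problem: Name or service not known
"""

# Hypothetical labels mapped to the message fragments seen in the post.
REASONS = {
    "timeout": "timed out",
    "unreachable": "Network is unreachable",
    "dns": "Name or service not known",
}


def classify(line: str):
    """Return a failure label for a 'failed at' line, or None otherwise."""
    if " failed at " not in line:
        return None
    for label, fragment in REASONS.items():
        if fragment in line:
            return label
    return "other"


def count_failures(log_text: str) -> Counter:
    """Count failed requests per failure label."""
    counts = Counter()
    for line in log_text.splitlines():
        label = classify(line)
        if label is not None:
            counts[label] += 1
    return counts


if __name__ == "__main__":
    for label, n in count_failures(LOG).items():
        print(label, n)
```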
57)
Message boards :
CMS Application :
New Version 60.65
(Message 7774)
Posted 1 Sep 2022 by Crystal Pellet Post: After initializing and before processing the events, I see:
cmsRun -j FrameworkJobReport.xml PSet.py
warn [fn-htclient.c:530]: Retrying after system error |
58)
Message boards :
CMS Application :
New Version 60.64
(Message 7724)
Posted 11 Aug 2022 by Crystal Pellet Post: The information that should be shown when using the ALT keys is still not displayed in this version.
ALT-F1 Startup console is shown
ALT-F2 shows only the dummy message (job output - event processing)
ALT-F3 "top" is shown correctly
ALT-F4 shows only the dummy message (output of job wrapper)
ALT-F5 shows only the dummy message (error messages)
ALT-F6 login screen is shown |
59)
Message boards :
ATLAS Application :
ATLAS vbox v.1.15
(Message 7715)
Posted 2 Aug 2022 by Crystal Pellet Post: 100 Atlas-Tasks per day and PC.
Even when there are only a few, it's worth finding the cause, because they run endlessly, occupying a slot. F-keys not being available in BOINC's console is a sign that the VM did not start up completely. In Oracle VM VirtualBox Manager you can click the VM when you have that issue and press the "Show" (Zeigen) button with the green right arrow in the top menu. Maybe you only get a black screen and have to wake up the display with e.g. just the Alt key. You could probably also improve your throughput by reducing the number of cores to 8 per VM and increasing the number of tasks to 7 or even 8. |
60)
Message boards :
ATLAS Application :
ATLAS vbox v.1.15
(Message 7710)
Posted 2 Aug 2022 by Crystal Pellet Post: It's always OK to test things, especially if something not common happens. There are not so many Atlas-Tasks in -dev to see such a problem. You have to provide as much information as possible. What else is running on the machine? Do you use a second (Linux) VM on a (Windows) machine and run BOINC from there? Of special interest: What do you see in the VM console with ALT-F1? I very often see "Checking CVMFS ....", but without a response. Do you start several VMs at once? |