Message boards : Theory Application : New Version (v3.13)
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 334,882
RAC: 4
Message 6403 - Posted: 1 Jul 2019, 12:02:39 UTC

This new version updates the cache to hopefully address the "Could not get X509 credentials" issue from production.
Profile Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,948,723
RAC: 224
Message 6404 - Posted: 1 Jul 2019, 17:50:55 UTC - in response to Message 6403.  
Last modified: 1 Jul 2019, 17:51:44 UTC

It also now runs multiple consecutive Jobs within the Task, as on the Production site, rather than just a single Job, and the Alt-F1 to Alt-F5 screens are now available as well, rather than going straight in to F2.
I'm going to guess at a similar 12hr initial limit, with the Task finishing after any Job in progress at that time completes. None of mine have got that far yet. I'm hoping that there is also a fix to give credit for work done on any Tasks that are terminated at the 18hr cut-off. I'll wait and watch any healthy ones that get that far before extending that limit.
Profile Ray Murray
Joined: 13 Apr 15
Posts: 138
Credit: 2,948,723
RAC: 224
Message 6405 - Posted: 1 Jul 2019, 21:09:50 UTC

Ah, so it's not a "NEW" version updating 4.xx; it's a new image for the "old" 3.xx version (263.90 on the main site), using the Theory_2017_05_29 xml rather than Theory_2019_02_20 and therefore not using cranky.
I get confused easily because the apps here use a different numbering system from the one they get when they are moved over to Production.

Anyway, the ones I have all seem to be running fine, although I wasn't getting the X509 problem that others were having, so it might be better to wait for somebody who WAS having that problem to report whether the issue is fixed.
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 774
Credit: 11,944,427
RAC: 1,567
Message 6406 - Posted: 2 Jul 2019, 2:48:13 UTC
Last modified: 2 Jul 2019, 2:52:17 UTC

I will run some, but as always it will take hours to download this vdi, since for some reason the CERN server doesn't like to send them 5,300 miles at a normal speed. Right now it is slower than a 1995 dialup (1.66 KBps), and it can't be my satellite dish causing this.

(yeah, this has been mentioned many, many times)
Crystal Pellet
Volunteer tester

Joined: 13 Feb 15
Posts: 1188
Credit: 854,677
RAC: 13
Message 6407 - Posted: 2 Jul 2019, 10:54:11 UTC

But why is 2279-779662-75.run downloaded, copied to 'input'-file in the slot-directory and obviously not used?

At least the runspec=boinc pp jets 7000 40,-,460 - pythia6 6.428 z2 100000 75 is not the job that's running within the VM.
Also, cranky-0.0.21 is copied to the slot folder, but it is not running in the VM.

What is the process crypto running in the VM?
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 774
Credit: 11,944,427
RAC: 1,567
Message 6408 - Posted: 2 Jul 2019, 11:01:47 UTC

OK, I now have three of the 2-core tasks running. I will let them run, and the host is able to get new tasks later if needed.

I would run 4 of those, but I have an LHC version at about 93%, so I will let that finish and then run these 4 x 2-core tasks.
(it's 4am here, so it will be a few hours before I get back here to check)

But they all did get beyond the Credentials and HTCondor ping.
Profile Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 774
Credit: 11,944,427
RAC: 1,567
Message 6409 - Posted: 2 Jul 2019, 11:14:42 UTC

(a couple of quick copy and pastes)

00:03:30.036293 VRDP: New connection:
00:03:30.036382 VRDP: Connection opened (IPv6): 4
00:03:30.036556 VRDP: Negotiating security method with the client.
00:03:30.036905 VRDP: failed to access the server certificate file '': VERR_FILE_NOT_FOUND
00:03:30.037031 VRDP: Connection closed: 4
00:03:30.089739 VRDP: New connection:
00:03:30.089819 VRDP: Connection opened (IPv6): 5
00:03:30.089981 VRDP: Negotiating security method with the client.
00:03:30.107497 VRDP: Methods 0x0000001b
00:03:30.107528 VRDP: Channel: [rdpdr] [1004]. Accepted.
00:03:30.107542 VRDP: Channel: [rdpsnd] [1005]. Accepted.
00:03:30.107556 VRDP: Channel: [cliprdr] [1006]. Accepted.
00:03:30.107569 VRDP: Channel: [drdynvc] [1007]. Accepted.
00:03:30.107582 VRDP: Unsupported SEC_TAG: 0xC006/8. Skipping.
00:03:30.107596 VRDP: Unsupported SEC_TAG: 0xC00A/8. Skipping.
00:03:30.299013 VRDP: Client seems to be MSFT.
00:03:30.299044 VRDP: Logon: HAL5000 (::1) build 17763. User: [] Domain: [] Screen: 0
00:03:30.299208 AUTH: User: []. Domain: []. Authentication type: [Null]
00:03:30.299219 AUTH: Access granted.
00:03:30.300278 VRDP: Enabling upstream audio.
00:03:30.300339 VBVA: VRDP acceleration has been requested.
00:03:30.304274 VMMDev: SetVideoModeHint: Got a video mode hint (1364x768x32)@(0x0),(1;0) at 0
00:03:30.304348 VRDP: SunFlsh disabled.
00:03:30.358164 VRDP: SCARD enabled for 6
00:03:50.874008 VMMDev: Guest Log: [INFO] Reading volunteer information
00:03:51.244560 VMMDev: Guest Log: [INFO] Volunteer: Magic Quantum Mechanic (192)
00:03:51.336223 VMMDev: Guest Log: [INFO] VMID: ba004f3b-40b6-4c21-b3d2-35a8a839b4b6
00:03:51.856795 VMMDev: Guest Log: [INFO] Requesting an X509 credential from LHC@home
00:03:54.707105 VMMDev: Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
00:03:58.102764 VMMDev: Guest Log: [INFO] Running the fast benchmark.
00:08:53.497897 VMMDev: Guest Log: [INFO] Machine performance 3.23 HEPSPEC06
00:08:53.716310 VMMDev: Guest Log: [INFO] Theory application starting. Check log files.
00:08:54.714014 VMMDev: Guest Log: [DEBUG] HTCondor ping
00:09:02.403519 VMMDev: Guest Log: [DEBUG] 0
00:09:08.309268 VRDP: Logoff: HAL5000 (::1) build 17763. User: [] Domain: [] Reason 0x0000.
00:09:08.309417 VRDP: Connection closed: 5
00:09:08.309566 VBVA: VRDP acceleration has been disabled.
00:12:25.919018 VMMDev: Guest Log: [INFO] New Job Starting in slot1
00:12:26.163117 VMMDev: Guest Log: [INFO] Condor JobID: 502248.90 in slot1
00:12:26.433353 VMMDev: Guest Log: [INFO] New Job Starting in slot2
00:12:26.718784 VMMDev: Guest Log: [INFO] Condor JobID: 502248.91 in slot2
00:12:31.322869 VMMDev: Guest Log: [INFO] MCPlots JobID: 50561484 in slot1
00:12:31.947434 VMMDev: Guest Log: [INFO] MCPlots JobID: 50561441 in slot2
00:12:36.460952 VMMDev: Guest Log: [INFO] ===> [runRivet] Tue Jul 2 12:48:42 CEST 2019 [boinc pp jets 8000 25 - pythia6 6.428 374 100000 75]
00:12:37.044618 VMMDev: Guest Log: [INFO] ===> [runRivet] Tue Jul 2 12:48:42 CEST 2019 [boinc pp jets 8000 250,-,4160 - pythia6 6.428 z1 100000 75]
(that was the 3rd one, and here is the first one, with the typical mess on a few lines at the start of the job)

00:07:40.595092 VMMDev: Guest Log: [DEBUG] HTCondor ping
00:07:48.180262 VMMDev: Guest Log: [DEBUG] 0
00:07:51.770599 VRDP: Logoff: HAL5000 (::1) build 17763. User: [] Domain: [] Reason 0x0000.
00:07:51.770747 VRDP: Connection closed: 1
00:07:51.770909 VBVA: VRDP acceleration has been disabled.
00:11:33.717606 VMMDev: Guest Log: [INFO] [NeIwN FJOo]b NSetwa rJtoibn gS tianr tsilnogt 1i
00:11:33.717709 VMMDev: Guest Log: lot2
00:11:33.874400 VMMDev: Guest Log: [[INIFNOF]O ]C oCnodnodro rJ oJboIbDI:D : 5 0520224284.81.41 8i ni slno ts1lo

00:11:39.141178 VMMDev: Guest Log: [INFO] MCPlots JobID: 50561454 in slot1
00:11:39.148507 VMMDev: Guest Log: [INFO] MCPlots JobID: 50561408 in slot2
00:11:44.250193 VMMDev: Guest Log: [INFO] ===> [runRivet] Tue Jul 2 12:39:42 CEST 2019 [boinc ppbar jets 1960 64 - pythia6 6.428 z1 100000 75]
00:11:44.274097 VMMDev: Guest Log: [INFO] ===> [runRivet] Tue Jul 2 12:39:43 CEST 2019 [boinc ppbar mb-inelastic 200 - - pythia8 8.235 tune-AU2lox 100000 75]
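
For anyone wanting to compare which Condor and MCPlots jobs ended up in which slot, here is a rough Python sketch that pulls the JobID lines out of a log like the ones pasted above. The regular expression is only an assumption based on the line format visible in this post, and the function name jobs_by_slot is made up for illustration. Interleaved lines like the "mess" above simply fail to match and are skipped; the sketch does not try to repair them.

```python
import re

# Matches guest-log lines shaped like the ones in this post, e.g.
# "00:12:26.163117 VMMDev: Guest Log: [INFO] Condor JobID: 502248.90 in slot1"
# (format inferred from the pasted log only, not from any documentation).
JOB_LINE = re.compile(
    r"^(?P<uptime>\d{2}:\d{2}:\d{2}\.\d+) VMMDev: Guest Log: "
    r"\[INFO\] (?P<kind>Condor|MCPlots) JobID: (?P<job_id>[\d.]+) "
    r"in (?P<slot>slot\d+)$"
)


def jobs_by_slot(lines):
    """Return {slot: {"Condor": id, "MCPlots": id}} for cleanly formatted lines."""
    jobs = {}
    for line in lines:
        match = JOB_LINE.match(line.strip())
        if match is None:
            continue  # unrelated or interleaved lines are skipped, not repaired
        jobs.setdefault(match["slot"], {})[match["kind"]] = match["job_id"]
    return jobs


sample = [
    "00:12:26.163117 VMMDev: Guest Log: [INFO] Condor JobID: 502248.90 in slot1",
    "00:12:26.718784 VMMDev: Guest Log: [INFO] Condor JobID: 502248.91 in slot2",
    "00:12:31.322869 VMMDev: Guest Log: [INFO] MCPlots JobID: 50561484 in slot1",
    "00:12:31.947434 VMMDev: Guest Log: [INFO] MCPlots JobID: 50561441 in slot2",
    # One of the interleaved lines from the first task; it does not match.
    "00:11:33.874400 VMMDev: Guest Log: [[INIFNOF]O ]C oCnodnodro rJ oJboIbDI:D : 5 0520224284.81.41 8i ni slno ts1lo",
]

print(jobs_by_slot(sample))
```

Job IDs are kept as strings because the Condor ones contain a dot (cluster.process) and should not be treated as floats.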
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 334,882
RAC: 4
Message 6411 - Posted: 2 Jul 2019, 13:07:33 UTC - in response to Message 6407.  

But why is 2279-779662-75.run downloaded, copied to 'input'-file in the slot-directory and obviously not used?

At least the runspec=boinc pp jets 7000 40,-,460 - pythia6 6.428 z2 100000 75 is not the job that's running within the VM.
Also, cranky-0.0.21 is copied to the slot folder, but it is not running in the VM.

What is the process crypto running in the VM?


It is a bit of a mess at the moment. I am testing the old-style VM, and this project is now set up for the native Theory method. Hence cranky and the input file are redundant. Once we have sanity-checked this VM so it can be released to production, I will move back. Hopefully I have got something interesting in the pipeline, but I need to get day-to-day maintenance done first.
Profile Ray Murray
Avatar

Send message
Joined: 13 Apr 15
Posts: 138
Credit: 2,948,723
RAC: 224
Message 6412 - Posted: 2 Jul 2019, 17:35:18 UTC

All of my 3.13s happily returned MCPlots but ALL ended with

upload failure: <file_xfer_error>
<file_name>Theory_2279-772328-75_0_r1496801322_result</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>

… on completion, including 2 that I tried to "end gracefully" with the checkpoint edit. 2 others are still running near completion, but I'm not hopeful. A couple of 3.14s are running, but not far in, so we'll see how they fare.
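
If anyone wants to tally these failures across many task results, a small Python sketch along these lines could pull the file name and error code out of the stderr text. It assumes the fragment always looks like the one quoted above; the helper name parse_xfer_error is hypothetical, and splitting the code from the text in brackets is my own guess at the layout, not a documented BOINC format.

```python
import re
import xml.etree.ElementTree as ET

CLOSE_TAG = "</file_xfer_error>"


def parse_xfer_error(text):
    """Return (file_name, code, reason) from a <file_xfer_error> fragment, or None."""
    start = text.find("<file_xfer_error>")
    end = text.find(CLOSE_TAG)
    if start == -1 or end == -1:
        return None
    # Parse only the XML fragment, ignoring the "upload failure:" prefix.
    root = ET.fromstring(text[start:end + len(CLOSE_TAG)])
    file_name = root.findtext("file_name")
    raw = root.findtext("error_code", default="")
    # Assumed layout: a signed integer, then an optional reason in parentheses.
    m = re.match(r"\s*(-?\d+)\s*(?:\((.*)\))?\s*$", raw)
    code = int(m.group(1)) if m else None
    reason = m.group(2) if m else None
    return file_name, code, reason


stderr_text = """upload failure: <file_xfer_error>
<file_name>Theory_2279-772328-75_0_r1496801322_result</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>"""

print(parse_xfer_error(stderr_text))
```

The "stat() failed" wording suggests that the expected result file could not be examined on disk when the upload was attempted, but that is only a reading of the message text.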
Profile Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1067
Credit: 334,882
RAC: 4
Message 6413 - Posted: 2 Jul 2019, 17:51:05 UTC - in response to Message 6412.  

Thanks. The error is due to this hybrid mess I created.



©2024 CERN