Message boards : Theory Application : New version 5.00
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6688 - Posted: 26 Sep 2019, 9:23:24 UTC - in response to Message 6683.  

I have updated the image. Please let me know whether or not they all start fine on Windows. I have reduced the memory as requested. As a precaution, I will leave the cut-off time as it is until the image is working.
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6689 - Posted: 26 Sep 2019, 11:32:12 UTC - in response to Message 6688.  
Last modified: 26 Sep 2019, 11:34:04 UTC

I have updated the image. Please let me know whether or not they all start fine on Windows.
The issue of /shared/* not being found is not solved.

I'm getting the same first image shown in message 6676

The difference is that the VM now gets a shutdown signal, because #bash cranky can't be found either.

Failed task: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2825994

As in version 5.03, the VM starts up successfully when I pause all tasks on the other threads.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6690 - Posted: 26 Sep 2019, 13:12:42 UTC - in response to Message 6689.  
Last modified: 26 Sep 2019, 13:12:59 UTC

The issue of /shared/* not being found is not solved.

I have made another change and added some debugging statements.
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6691 - Posted: 26 Sep 2019, 13:47:13 UTC - in response to Message 6690.  
Last modified: 26 Sep 2019, 13:53:19 UTC

Testing with 7 other threads busy, I see



... and no job is starting.

I'll see whether the job gets killed by itself; otherwise I'll stop it gracefully. Result: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2826358
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6692 - Posted: 26 Sep 2019, 14:27:34 UTC - in response to Message 6691.  

This is strange. From the image it appears that the job is started before CVMFS is mounted. However, the job is configured to start after the multi-user target is reached. Also, I don't see this on my Linux machine. A new version is on its way where I very kindly ask it to start after the HTTP server.
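
One generic way to guard against a job starting before its files are available is to poll for the path before launching. This is only a minimal sketch, not the fix described above (which orders the job after the HTTP server); the function name `wait_for_path` and the default timeout are assumptions made for illustration.

```shell
#!/bin/sh
# wait_for_path PATH [TIMEOUT_SECONDS]   (hypothetical helper)
# Polls once per second until PATH exists.
# Returns 0 as soon as PATH exists, 1 if the timeout expires first.
wait_for_path() {
    path=$1
    timeout=${2:-60}
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $path" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

For example, `wait_for_path /shared 120` would wait up to two minutes for /shared to appear before giving up. Which path is the right one to wait on inside the VM is a guess here.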
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6693 - Posted: 26 Sep 2019, 16:13:05 UTC - in response to Message 6692.  
Last modified: 26 Sep 2019, 16:16:55 UTC

The first task with version 5.07 has started under a 7-thread load and is on the 124th attempt of pp jets 7000 300 - pythia8 8.235 tune-A2m 84000, which will probably take a bit longer to finish.
On the other threads I will surely get shorter tasks too, which will pop up in my list of tasks at some point.
As you can see, it takes some time between Checking CVMFS and Checking runc.

Ray Murray
Joined: 13 Apr 15
Posts: 120
Credit: 2,113,012
RAC: 5
Message 6694 - Posted: 26 Sep 2019, 18:31:58 UTC

I allowed the Linux machine and 1 Windows host to get one of the latest 5.07 tasks each, so that I wouldn't have to abort any if it turned out not to work. From CP's comments on them possibly not being too happy if they weren't getting all the attention on start-up, I suspended most other work on those hosts and both are currently running Jobs. WooHoo
I resumed all other LHC sixtracktest tasks, but I have limited these to single-core, single Job, so we'll see if they start up OK normally when these finish.
Linux box finished 1 job, Task reported and credited, new task booted fine and new Job started.

Strangely, "Show graphics" on both hosts lands on the SAME partially complete Vincia job, which I'm not running, but clicking through to the logs gets to the logs of the actual running jobs. Even the new Tasks land there again.
Ray Murray
Joined: 13 Apr 15
Posts: 120
Credit: 2,113,012
RAC: 5
Message 6695 - Posted: 26 Sep 2019, 20:37:16 UTC - in response to Message 6694.  

Windows host also completed its task, reported, credited and, with all other cores busy, booted up and started a new job.
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6696 - Posted: 26 Sep 2019, 20:41:12 UTC - in response to Message 6694.  

From CP's comments on them possibly not being too happy if they weren't getting all the attention on start-up, I suspended most other work on those hosts ....
The problem was with the previous versions.
With version 5.07, a load on all the threads, except the one where the new VM is starting, should be no problem anymore.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6699 - Posted: 27 Sep 2019, 7:42:03 UTC - in response to Message 6694.  

Strangely, "Show graphics" on both hosts lands on the SAME partially complete Vincia job, which I'm not running, but clicking through to the logs gets to the logs of the actual running jobs. Even the new Tasks land there again.

This may be the default job that is displayed before anything is generated. It might not be picking up the new job images due to permissions. Will investigate later.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6700 - Posted: 27 Sep 2019, 7:43:37 UTC - in response to Message 6696.  

With version 5.07, a load on all the threads, except the one where the new VM is starting, should be no problem anymore.

Great! Thanks for your help with testing.
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6701 - Posted: 27 Sep 2019, 8:45:33 UTC - in response to Message 6700.  

With version 5.07, a load on all the threads, except the one where the new VM is starting, should be no problem anymore.

Great! Thanks for your help with testing.
My pleasure. ... and another wish ;)
In the native version of cranky, you have added a procedure to add a line to stderr like
09:04:00 CEST +02:00 2019-09-16: cranky-0.1.1: [INFO] ===> [runRivet] Mon Sep 16 07:03:58 UTC 2019 [boinc pp jets 7000 400 - pythia6 6.428 psoft 100000 108]

Maybe you can add that in this version too and also display that line on ALT-F1 at the end of the startup just after datetimegroup: cranky: [INFO] Running Container 'runc'.
So during runtime the user has easy access to which job is actually running and how many events it has.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6702 - Posted: 27 Sep 2019, 10:49:58 UTC - in response to Message 6701.  

With version 5.07, a load on all the threads, except the one where the new VM is starting, should be no problem anymore.

Great! Thanks for your help with testing.
My pleasure. ... and another wish ;)
In the native version of cranky, you have added a procedure to add a line to stderr like
09:04:00 CEST +02:00 2019-09-16: cranky-0.1.1: [INFO] ===> [runRivet] Mon Sep 16 07:03:58 UTC 2019 [boinc pp jets 7000 400 - pythia6 6.428 psoft 100000 108]

Maybe you can add that in this version too and also display that line on ALT-F1 at the end of the startup just after datetimegroup: cranky: [INFO] Running Container 'runc'.
So during runtime the user has easy access to which job is actually running and how many events it has.

The code to do that is there, it is just not working. Second thing to investigate.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6703 - Posted: 27 Sep 2019, 13:18:59 UTC - in response to Message 6694.  

Strangely, "Show graphics" on both hosts lands on the SAME partially complete Vincia job, which I'm not running, but clicking through to the logs gets to the logs of the actual running jobs. Even the new Tasks land there again.

You should have your plots now.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6704 - Posted: 27 Sep 2019, 13:20:00 UTC - in response to Message 6702.  

The code to do that is there, it is just not working. Second thing to investigate.

Still not working after an attempt to fix, it is a tricky one.
Laurence
Project administrator
Project developer
Project tester
Joined: 12 Sep 14
Posts: 1021
Credit: 274,753
RAC: 0
Message 6705 - Posted: 27 Sep 2019, 13:57:30 UTC - in response to Message 6704.  

Still not working after an attempt to fix, it is a tricky one.

Fixed now.
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6706 - Posted: 27 Sep 2019, 16:47:18 UTC - in response to Message 6705.  
Last modified: 27 Sep 2019, 17:40:16 UTC

Still not working after an attempt to fix, it is a tricky one.

Fixed now.

On the ALT-F1 Console it's working. Thanks!

It is not written to the stderr output, though.
The current production Theory VBox version now does this directly at the start of a job.
E.g.:
2019-09-26 17:16:01 (5336): Guest Log: [INFO] ===> [runRivet] Thu Sep 26 17:15:43 CEST 2019 [boinc pp jets 8000 600 - pythia8 8.210 default-noCR 100000 124]
2019-09-26 22:28:30 (5336): Guest Log: [INFO] Job finished in slot1 with 0.
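
For anyone who wants to pull the job description out of such a line, a small sketch follows. It assumes the description is always the bracketed text starting with `boinc` after the `[runRivet]` marker, and that the second-to-last field is the number of events; neither is confirmed in this thread. The function name `job_description` is made up for the example.

```shell
#!/bin/sh
# job_description (hypothetical helper): reads log lines on stdin and prints
# the bracketed job description that follows the [runRivet] marker,
# one per matching line. Lines without the marker produce no output.
job_description() {
    sed -n 's/.*\[runRivet\].*\[\(boinc [^]]*\)\].*/\1/p'
}

# Sample line copied from the stderr output quoted above.
line='2019-09-26 17:16:01 (5336): Guest Log: [INFO] ===> [runRivet] Thu Sep 26 17:15:43 CEST 2019 [boinc pp jets 8000 600 - pythia8 8.210 default-noCR 100000 124]'

desc=$(printf '%s\n' "$line" | job_description)
echo "$desc"

# ASSUMPTION: the second-to-last field is the number of events.
events=$(printf '%s\n' "$desc" | awk '{print $(NF-1)}')
echo "events: $events"
```

The same function can be fed a whole stderr.txt on stdin to list every job a task has run.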
Ray Murray
Joined: 13 Apr 15
Posts: 120
Credit: 2,113,012
RAC: 5
Message 6707 - Posted: 27 Sep 2019, 18:33:28 UTC - in response to Message 6703.  
Last modified: 27 Sep 2019, 18:39:16 UTC

Yes, "Show Graphics" now goes to the actual running job.

On the Linux host, I'm getting yellow-triangle ghost images left behind in VBox Media Manager when a task finishes, which have to be manually deleted. I freely admit to being not much good with Linux so it could be something I haven't set up correctly here.
No ghosts on Windows hosts.
Crystal Pellet
Volunteer tester
Joined: 13 Feb 15
Posts: 1010
Credit: 591,548
RAC: 0
Message 6708 - Posted: 27 Sep 2019, 20:51:13 UTC

I extended the maximum job duration of 18 hours so that this task could finish after more than 22 hours of run time: https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=2826366
===> [runRivet] Thu Sep 26 22:11:56 UTC 2019 [boinc pp z1j 7000 150 - sherpa 2.1.1 default 100000 124]
computezrmle
Joined: 28 Jul 16
Posts: 263
Credit: 232,222
RAC: 0
Message 6711 - Posted: 29 Sep 2019, 14:29:06 UTC

Some comments regarding Theory 5.09 (vbox64_theory) on Linux.


1.
Startup is much faster than for the recent production app (vbox).
Very nice.


2.
The app uses a bootstrap that contains some commands that look weird to me.

2.1
printf "\033c" & # Clears the console

Why do you send the command to the background?
This sets up an additional shell process just to clear the screen, and that shell terminates immediately after the screen is cleared.
It should work a few µs faster without "&".
printf "\033c" 


2.2
Setup the consoles

CP already mentioned that ALT-F2 and ALT-F3 are switched compared to other vbox apps.
This should be streamlined.

bash -c "top 2>&1 >/dev/tty2 2>/dev/null </dev/tty2" &

Looks weird for 2 reasons:
- stderr is redirected twice. Why?
- stdin is redirected from tty2

The latter causes top to be stopped if a user accidentally hits a key at the top console.
I suggest redirecting the input from a console that can't be reached by keyboard, e.g. tty13.
The following command works on a VM running openSUSE; the update delay is extended to 5 s:
bash -c "top -d 5 >/dev/tty3 2>/dev/null </dev/tty13" &
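
The double redirection can be checked without a real console. The sketch below is an illustration only: `emit` is a hypothetical stand-in for top, and temporary files stand in for /dev/tty2 and /dev/null. It shows that the leading `2>&1` has no lasting effect, because the later `2>` overrides it.

```shell
#!/bin/sh
# Stand-in for top: writes one line to stdout and one to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

out=$(mktemp)   # plays the role of /dev/tty2
err=$(mktemp)   # plays the role of /dev/null, kept as a file so it can be inspected

# Same redirection order as the bootstrap line. Redirections apply left to right:
#   2>&1    stderr becomes a copy of the current stdout (here: the $(...) pipe)
#   >"$out" stdout moves to the file; stderr is not affected by this step
#   2>"$err" stderr moves to the second file, overriding the first step
captured=$(emit 2>&1 >"$out" 2>"$err")

# Shorter form without the leading 2>&1.
out2=$(mktemp)
err2=$(mktemp)
emit >"$out2" 2>"$err2"

# Both forms leave the same content in the same places.
cmp -s "$out" "$out2" && cmp -s "$err" "$err2" && echo "identical result"
```

Nothing ends up in `captured`, so the first `2>&1` can be dropped without changing where the output of top goes.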



3.
Network
The app's CVMFS is not yet configured to use openhtc.io.
Instead it uses the normal CVMFS-Stratum-Ones.
In addition it bypasses a local proxy.

Both should be solved in future releases.
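
For the proxy part, a client-side setting along these lines is the usual approach. This is a sketch only: `squid.example.org:3128` is a placeholder, not a host from this thread, and the switch to openhtc.io (normally done via `CVMFS_SERVER_URL`) is left out because the server URLs are not given here.

```shell
# /etc/cvmfs/default.local (sketch)
# ASSUMPTION: squid.example.org:3128 is a placeholder for a local proxy.
# Proxy groups are separated by ';' and tried in order, so DIRECT is only
# used as a fallback when the local proxy cannot be reached.
CVMFS_HTTP_PROXY="http://squid.example.org:3128;DIRECT"
```

How the value would get from the host into the VM image is a separate question that this fragment does not address.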


4.
stderr.txt doesn't include a line like this (CP already mentioned that):
2019-09-26 17:16:01 (5336): Guest Log: [INFO] ===> [runRivet] Thu Sep 26 17:15:43 CEST 2019 [boinc pp jets 8000 600 - pythia8 8.210 default-noCR 100000 124]
©2020 CERN