Message boards : Theory Application : CVMFS configuration of LHC@Home VMs
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 475
Credit: 389,411
RAC: 28
Message 6424 - Posted: 4 Jul 2019, 9:02:22 UTC

Some comments regarding the CVMFS configuration of LHC@Home VMs


CVMFS is configured by several files in /etc/cvmfs and its subdirectories.
The files are sourced in a specific order and contain configuration parameters as key-value pairs.
As the configuration files are interpreted by /bin/sh, parameters containing special characters, e.g. semicolons, must be quoted.
See: https://cvmfs.readthedocs.io/en/stable/cpt-configure.html

Example:
/etc/cvmfs/config.d/bar.cern.ch.conf

CVMFS_SERVER_URL=http://foo.cern.ch/cvmfs/@fqrn@    # can be left unquoted

CVMFS_SERVER_URL="http://foo.cern.ch/cvmfs/@fqrn@;http://baz.cern.ch/cvmfs/@fqrn@"    # MUST be quoted (either ' or ")



An unquoted line like this
CVMFS_SERVER_URL=http://foo.cern.ch/cvmfs/@fqrn@;http://baz.cern.ch/cvmfs/@fqrn@

would configure only the first server (foo.cern.ch), because ";" is interpreted as a command separator.
Luckily "http://baz.cern.ch/cvmfs/@fqrn@" is not a valid command, so the shell just reports an error for it and carries on.
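
The effect can be reproduced with any POSIX shell. The following is only a sketch: the temporary files and the variable names demo_dir, quoted and unquoted are made up for the demonstration and stand in for a real file such as /etc/cvmfs/config.d/bar.cern.ch.conf.

```shell
#!/bin/sh
# Demo only: temporary files stand in for a real file under /etc/cvmfs/config.d.
set +e   # keep going after the deliberately broken line below
demo_dir=$(mktemp -d)

# Correctly quoted: both URLs end up in the variable.
cat > "$demo_dir/quoted.conf" <<'EOF'
CVMFS_SERVER_URL="http://foo.cern.ch/cvmfs/@fqrn@;http://baz.cern.ch/cvmfs/@fqrn@"
EOF

# Unquoted: the shell splits the line at ";" and tries to run the
# second URL as a command.
cat > "$demo_dir/unquoted.conf" <<'EOF'
CVMFS_SERVER_URL=http://foo.cern.ch/cvmfs/@fqrn@;http://baz.cern.ch/cvmfs/@fqrn@
EOF

# Source each file in its own subshell so the two results cannot mix.
quoted=$( . "$demo_dir/quoted.conf"; printf '%s' "$CVMFS_SERVER_URL" )
unquoted=$( . "$demo_dir/unquoted.conf" 2>/dev/null; printf '%s' "$CVMFS_SERVER_URL" )

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
rm -rf "$demo_dir"
```

The error message for the second URL is sent to /dev/null here; without that redirection the shell would print a "not found" style message and still continue.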


The sourcing order becomes important if default settings should be replaced.
See comments from /etc/cvmfs/default.conf
# Don't edit here.  Create /etc/cvmfs/default.local.
# As a rule of thumb, overwrite only parameters you find in here.
# If you look for any other parameter, check /etc/cvmfs/domain.d/<your_domain>.(conf|local)
# and /etc/cvmfs/config.d/<your_repository>.(conf|local)
#
# Parameter files are sourced in the following order
#
# /etc/cvmfs/default.conf
# /etc/cvmfs/default.d/*.conf (in alphabetical order)
# $CVMFS_CONFIG_REPOSITORY/etc/cvmfs/default.conf (if config repository is set)
# /etc/cvmfs/default.local
#
# $CVMFS_CONFIG_REPOSITORY/etc/cvmfs/domain.d/<your_domain>.conf (if config repository is set)
# /etc/cvmfs/domain.d/<your_domain>.conf
# /etc/cvmfs/domain.d/<your_domain>.local
#
# $CVMFS_CONFIG_REPOSITORY/etc/cvmfs/config.d/<your_repository>.conf (if config repository is set)
# /etc/cvmfs/config.d/<your_repository>.conf
# /etc/cvmfs/config.d/<your_repository>.local


Example
/etc/cvmfs/default.local
CVMFS_SERVER_URL=http://foo.cern.ch/cvmfs/@fqrn@

/etc/cvmfs/config.d/bar.cern.ch.conf
CVMFS_SERVER_URL=http://baz.cern.ch/cvmfs/@fqrn@

This would configure all repositories to use foo.cern.ch, except bar.cern.ch, which would use baz.cern.ch.
To get bar.cern.ch from a different server, a *.local file must be created in /etc/cvmfs/config.d:
/etc/cvmfs/config.d/bar.cern.ch.local
CVMFS_SERVER_URL=http://foobaz.cern.ch/cvmfs/@fqrn@

In addition, the example violates the rule of thumb to set only those parameters in a *.local file that can be found in the corresponding *.conf file.
As CVMFS_SERVER_URL is usually configured in /etc/cvmfs/domain.d or /etc/cvmfs/config.d, it should not be reconfigured in /etc/cvmfs/default.local.
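
To see why the last sourced file wins, the order can be emulated with plain shell. This is a rough sketch, not how the CVMFS client is implemented: the temporary directory, the helper effective_url and the way the domain is derived from the repository name are assumptions made for the demo, and only a subset of the files from the list above is covered (no default.conf, default.d or config repository).

```shell
#!/bin/sh
# Emulation only: a temp directory stands in for /etc/cvmfs.
etc=$(mktemp -d)
mkdir -p "$etc/domain.d" "$etc/config.d"

echo 'CVMFS_SERVER_URL=http://foo.cern.ch/cvmfs/@fqrn@'    > "$etc/default.local"
echo 'CVMFS_SERVER_URL=http://baz.cern.ch/cvmfs/@fqrn@'    > "$etc/config.d/bar.cern.ch.conf"
echo 'CVMFS_SERVER_URL=http://foobaz.cern.ch/cvmfs/@fqrn@' > "$etc/config.d/bar.cern.ch.local"

# effective_url REPO: source the files in the documented order, skipping
# missing ones, and print the value that survives.
effective_url() {
    (
        unset CVMFS_SERVER_URL
        domain=${1#*.}    # naive: everything after the first dot, e.g. cern.ch
        for f in "$etc/default.local" \
                 "$etc/domain.d/$domain.conf" "$etc/domain.d/$domain.local" \
                 "$etc/config.d/$1.conf" "$etc/config.d/$1.local"; do
            [ -f "$f" ] && . "$f"
        done
        printf '%s\n' "$CVMFS_SERVER_URL"
    )
}

bar_url=$(effective_url bar.cern.ch)
other_url=$(effective_url grid.cern.ch)
echo "bar.cern.ch:  $bar_url"
echo "grid.cern.ch: $other_url"
rm -rf "$etc"
```

In this emulation bar.cern.ch ends up with the foobaz URL from its *.local file, while a repository without its own files (grid.cern.ch here) keeps the value from default.local. On a machine with the CVMFS client installed, "cvmfs_config showconfig bar.cern.ch" should be the more reliable way to check the effective values.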




Regarding the CVMFS configuration of LHC@Home, I found a user-data file that writes an initial CVMFS configuration:
cvmfs:
    local:
        CVMFS_QUOTA_LIMIT: 10000
        CVMFS_REPOSITORIES: grid,sft,alice
        CVMFS_HTTP_PROXY: auto
        CVMFS_PAC_URLS: "http://lhchomeproxy.cern.ch/wpad.dat;http://lhchomeproxy.fnal.gov/wpad.dat"
        CVMFS_SERVER_URL: "http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@"
write_files:
  - owner: root:root
    path:  /etc/cvmfs/config.d/cernvm-prod.cern.ch.local
    permissions: '0644'
    content: |
        CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ihep-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1unl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/@fqrn@

It looks like this file is used to write CVMFS_PAC_URLS and CVMFS_SERVER_URL to /etc/cvmfs/default.local.
This misconfigures CVMFS_SERVER_URL: the parameter ends up unquoted in the target file, and it shouldn't be set there at all.
/etc/cvmfs/default.local
CVMFS_SERVER_URL=http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@



CVMFS_PAC_URLS will be modified later during the VM's boot process, but the pattern looks similar to CVMFS_SERVER_URL.
Hence I suspect it is also set without correct quoting.
As a result, lhchomeproxy.fnal.gov can't be used as a fallback.

In addition, CVMFS_REPOSITORIES lists more repositories than necessary.
Neither "sft" nor "alice" is required during the early boot phase.
Hence they could be removed from user-data and configured later via bootstrap.
This would also make user-data more general (Theory, CMS, ...), which matters because any change to this file requires a new vdi.


The lower part of user-data writes CVMFS_SERVER_URL to cernvm-prod.cern.ch.local.
This might be critical as the closing double quote is missing (!).
Besides that, CVMFS_SERVER_URL should be set in /etc/cvmfs/domain.d/cern.ch.local.
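
A missing closing quote is a shell syntax error, so it can be caught before the file is ever used. The following sketch assumes nothing about the real bootstrap: the temporary file, the shortened server list and the names conf, broken_status and fixed_status are made up for the demonstration.

```shell
#!/bin/sh
# Demo only: a temp file stands in for /etc/cvmfs/config.d/cernvm-prod.cern.ch.local.
conf=$(mktemp)

# Shortened server list with the closing double quote missing, as in user-data.
cat > "$conf" <<'EOF'
CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@
EOF

# "sh -n" parses the file without executing anything in it.
if sh -n "$conf" 2>/dev/null; then
    broken_status=ok
else
    broken_status=syntax-error
fi

# Same content with the closing quote in place.
cat > "$conf" <<'EOF'
CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@"
EOF

if sh -n "$conf" 2>/dev/null; then
    fixed_status=ok
else
    fixed_status=syntax-error
fi

echo "without closing quote: $broken_status"
echo "with closing quote:    $fixed_status"
rm -f "$conf"
```

Note that this check only finds syntax errors. A completely unquoted line with semicolons is valid shell syntax and would pass, so it does not replace looking at the resulting value.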


Here is a suggestion for how the cvmfs part of the user-data file might look:
cvmfs:
    local:
        CVMFS_QUOTA_LIMIT: 10000
        CVMFS_REPOSITORIES: grid
        CVMFS_HTTP_PROXY: auto
        CVMFS_PAC_URLS: |
            "http://lhchomeproxy.cern.ch/wpad.dat;http://lhchomeproxy.fnal.gov/wpad.dat"
write_files:
  - owner: root:root
    path:  /etc/cvmfs/domain.d/cern.ch.local
    permissions: '0644'
    content: |
        CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ihep-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1unl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/@fqrn@"

App-specific repositories should be added via bootstrap.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 475
Credit: 389,411
RAC: 28
Message 6435 - Posted: 12 Jul 2019, 8:45:59 UTC

It looks like the missing quotes in "user-data" still cause some VMs on the production server to fail whenever they pick the last server from the list, which is s1bnl-cvmfs.openhtc.io.

Examples:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=235927721
https://lhcathome.cern.ch/lhcathome/result.php?resultid=235773403
https://lhcathome.cern.ch/lhcathome/result.php?resultid=236228607
https://lhcathome.cern.ch/lhcathome/result.php?resultid=236155355

They typically show a line like this in stderr.txt:
2019-07-11 09:29:08 (32000): Guest Log: 2.4.4.0 3535 1 27852 9707 3 1 7431 10240000 2 65024 0 3 100 0 0 http://s1bnl-cvmfs.openhtc.io/cvmfs/grid.cern.chCVMFS_SERVER_URL=http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 0


Malformed CVMFS server entry:
http://s1bnl-cvmfs.openhtc.io/cvmfs/grid.cern.chCVMFS_SERVER_URL=http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch

should be:
http://s1bnl-cvmfs.openhtc.io/cvmfs/grid.cern.ch

The "0" at the end of the line indicates that the repository is NOT connected.
Hence all data requested from that repository can't be accessed.
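
Such lines can be checked automatically. The sketch below takes the stderr.txt line from above and a healthy line of the form "URL DIRECT 1". It assumes that the connection flag is always the last whitespace-separated field and the server entry the third field from the end; the helper names connection_flag, server_field and url_verdict are made up for the demo.

```shell
#!/bin/sh
# connection_flag LINE: print the last whitespace-separated field.
connection_flag() {
    printf '%s\n' "$1" | awk '{ print $NF }'
}

# server_field LINE: print the server entry (assumed: third field from the end).
server_field() {
    printf '%s\n' "$1" | awk '{ print $(NF-2) }'
}

# url_verdict ENTRY: a well-formed entry holds exactly one URL,
# so "http://" must not show up again after the leading one.
url_verdict() {
    case ${1#http://} in
        *http://*) echo malformed ;;
        *)         echo ok ;;
    esac
}

bad_line='2019-07-11 09:29:08 (32000): Guest Log: 2.4.4.0 3535 1 27852 9707 3 1 7431 10240000 2 65024 0 3 100 0 0 http://s1bnl-cvmfs.openhtc.io/cvmfs/grid.cern.chCVMFS_SERVER_URL=http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 0'
good_line='http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 1'

bad_flag=$(connection_flag "$bad_line")
bad_verdict=$(url_verdict "$(server_field "$bad_line")")
good_flag=$(connection_flag "$good_line")
good_verdict=$(url_verdict "$(server_field "$good_line")")

echo "bad line:  connected=$bad_flag entry=$bad_verdict"
echo "good line: connected=$good_flag entry=$good_verdict"
```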
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 751
Credit: 11,610,376
RAC: 1,406
Message 6436 - Posted: 12 Jul 2019, 9:13:10 UTC

There has been so much trouble with EXIT_INIT_FAILURE and EXIT_TIME_LIMIT_EXCEEDED over there. I checked other members who run even more than I do and saw hundreds of these errors with nothing done about it, so I gave up and switched most of mine to SixTrack, since those tasks don't even need the internet to run.

Funny thing is, my 2-core Theory tasks are valid, but the 4-core ones on my newer PCs failed, and some of them are the type that run over 10 hours before the error (and the X86 has no problem with single-core tasks, along with other single-core X64 tasks). I hope your "CVMFS configuration of LHC" post is read and something is done, since these days we have hundreds of PCs just running non-stop with nobody paying attention to all the invalids/errors.


(Not to mention CMS problems here on the Win 10 OS)
And the usual [ERROR] Condor ended after 1156 seconds with LHC Theory
(in 24 hours I start a new month at high-speed satellite so I will test them again at LHC and the CMS here since they demand even more MB d/l here just to get a task to start running)
Crystal Pellet
Volunteer tester

Send message
Joined: 13 Feb 15
Posts: 1182
Credit: 815,866
RAC: 245
Message 6437 - Posted: 13 Jul 2019, 8:51:18 UTC - in response to Message 6435.  

Hi computezrmle, since you're the network-guru here.

I loaded the new 12-Jul-vdi on the production client and I see:

http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 1

in the remote display. I suppose that failure is corrected.
computezrmle
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Help desk expert
Joined: 28 Jul 16
Posts: 475
Credit: 389,411
RAC: 28
Message 6438 - Posted: 13 Jul 2019, 11:32:13 UTC - in response to Message 6437.  

Hi computezrmle, since you're the network-guru here.

Thanks. Too much honor.

I loaded the new 12-Jul-vdi on the production client and I see:

http://s1cern-cvmfs.openhtc.io/cvmfs/grid.cern.ch DIRECT 1

in the remote display. I suppose that failure is corrected.

It depends on the server that you get back from the geolocation API.
In your case it's s1cern... which never caused that error as far as I remember.

If you get the last server from the half-quoted string (s1bnl...), its name is immediately followed by "CVMFS_SERVER..." echoed by the bootstrap script.
That concatenated string results in an invalid CVMFS setup.

Since the configuration string is correctly quoted in the new vdi that error should not occur any more.
Magic Quantum Mechanic
Joined: 8 Apr 15
Posts: 751
Credit: 11,610,376
RAC: 1,406
Message 6907 - Posted: 17 Dec 2019, 1:12:27 UTC

just messing around waiting for the football game to start

https://cernvm.cern.ch/portal/filesystem/debugmount


©2024 CERN