Document identifier: | LCG-GIS-MI |
Date: | 2 March 2005 |
Author: | Guillermo Diez-Andino, Laurence Field, Oliver Keeble, Antonio Retico, Alessandro Usai |
Version: | v2.3.0-6 |
New versions of this document will be distributed synchronously with the
LCG middleware releases and will contain the current
``state of the art'' of the installation and configuration procedures.
A companion document with the upgrade procedures to manually update the
configuration of the nodes from the previous LCG version to the current one is
also part of the release.
Since release LCG-2_3_0, the manual installation and configuration
of LCG nodes has been supported by a set of scripts.
Nevertheless, the automatic configuration of some particular node types has
been intentionally left uncovered. This mostly happens when a particular
configuration is not recommended or is obsolete within the LCG-2
production environment (e.g. a Computing Element with Open-PBS).
Two lists of ``supported'' and ``not recommended'' node
configurations follow.
The ``supported'' node types are:
The site where the sources and the images (iso) to create the CDs can be
found is
https://plone.fnal.gov/Plone/Scientific%20Linux/sl.download/
ftp://ftp.scientificlinux.org/linux/scientific/301/iso/
The following rpms, needed for the NTP client installation, are available at:
http://grid-deployment.web.cern.ch/grid-deployment/download/RpmDir/release/ntp-4.1.1-1.i386.rpm
http://grid-deployment.web.cern.ch/grid-deployment/download/RpmDir/release/libcap-devel-1.10-8.i386.rpm
http://grid-deployment.web.cern.ch/grid-deployment/download/RpmDir/release/libcap-1.10-8.i386.rpm
For each time server you are using, add a couple of lines similar to the following into the file /etc/ntp.conf (for each server, the hostname and the IP address are required):
restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
server <time_server_name>
Additional time servers can be added for better performance.
Example time server IP addresses:
137.138.16.69 137.138.17.69
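A minimal /etc/ntp.conf fragment using the two example addresses above could then look as follows (the server names are placeholders to be replaced with the corresponding hostnames):
restrict 137.138.16.69 mask 255.255.255.255 nomodify notrap noquery
server <first_time_server_name>
restrict 137.138.17.69 mask 255.255.255.255 nomodify notrap noquery
server <second_time_server_name>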
If you are using ipchains, you can add the following to /etc/sysconfig/ipchains:
-A input -s <NTP-serverIP-1> -d 0/0 123 -p udp -j ACCEPT
-A input -s <NTP-serverIP-2> -d 0/0 123 -p udp -j ACCEPT
If you are using iptables, you can add the following to /etc/sysconfig/iptables:
-A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT
-A INPUT -s <NTP-serverIP-2> -p udp --dport 123 -j ACCEPT
Remember that, in the examples provided, rules are parsed in order, so ensure that there are no matching REJECT lines preceding those that you add. You can then reload the firewall:
> /etc/init.d/ipchains restart
(or > /etc/init.d/iptables restart if you are using iptables).
> ntpdate <your ntp server name>
> service ntpd start
> chkconfig ntpd on
You can then verify that the time synchronisation is working with:
> ntpq -p
> wget http://www.cern.ch/grid-deployment/gis/yaim/lcg-yaim-x.x.x-x.noarch.rpm
> rpm -ivh lcg-yaim-x.x.x-x.noarch.rpm
> apt-get install lcg-yaim
WARNING: The Site Configuration File is sourced by the configuration
scripts. Therefore there must be no spaces around the equal sign.
Example of wrong configuration:
SITE_NAME = my-site
Example of correct configuration:
SITE_NAME=my-site
A good syntax test for your Site Configuration File (e.g. my-site-info.def) is to try and source it manually, running the command
> source my-site-info.def
and checking that no error messages are produced.
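As an illustration, a minimal fragment of a Site Configuration File might look like the following (the site name and hostnames are placeholders; CE_HOST, SE_HOST and WN_LIST are among the variables specified below):
SITE_NAME=my-site
CE_HOST=ce.my-domain.org
SE_HOST=se.my-domain.org
WN_LIST=/opt/lcg/yaim/examples/wn-list.conf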
The complete specification of the configurable variables follows.
We strongly recommend that, if the meaning of a configuration variable is
not clear to you, you report it to us and stick to the values provided
in the examples.
It may instead happen that, although you understand the meaning, you are in
doubt about the values to be configured in some of the variables listed above.
This may happen, for instance, if you are running a very small site and you
are not configuring the whole set of nodes, and therefore you have to refer
to some ``public'' service (e.g. RB, BDII ...).
In this case, if you have a reference site, please ask them for indications.
Otherwise, send a message to the "LCG-ROLLOUT@cclrclsv.RL.AC.UK" mailing list.
For each variable described below, an application context is outlined.
The 'use context' of a variable is the list of those node types that actually
need that particular variable to be set in order to be configured correctly.
For each label-name listed in the CE_CLOSE_SE variable you need to create a set of new variables as follows:
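For instance, assuming a single close SE labelled SE1, the resulting set of variables might look like this (the hostname and access point are placeholders; the variable suffixes follow the yaim conventions of this release):
CE_CLOSE_SE="SE1"
CE_CLOSE_SE1_HOST=se.my-domain.org
CE_CLOSE_SE1_ACCESS_POINT=/storage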
> wget ftp://ftp.scientificlinux.org/linux/scientific/303/i386/SL/RPMS/apt-0.5.15cnc6-4.SL.i386.rpm
> wget -nc http://ftp.freshrpms.net/pub/freshrpms/redhat/7.3/apt/apt-0.5.5cnc5-fr0.rh73.2.i386.rpm
> rpm -ivh apt-0.5.15cnc6-4.SL.i386.rpm
> rpm -ivh apt-0.5.5cnc5-fr0.rh73.2.i386.rpm
Please note that for the dependencies of the middleware to be met, you'll have to make sure that apt can find and download your OS rpms. This typically means you'll have to install an rpm called 'apt-sourceslist', or else create an appropriate file in your /etc/apt/sources.list.d directory.
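Such a file (e.g. os.list) contains an apt-rpm source line; a hypothetical example for an SL3 mirror follows (the URL, path components and component names depend on the mirror you use, so treat this purely as a sketch):
rpm http://ftp.scientificlinux.org/linux scientific/303/i386 os updates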
RH7.3:
LCG_REPOSITORY="rpm http://grid-deployment.web.cern.ch/grid-deployment/gis apt/LCG-2_3_0/en/i386 lcg_rh73"
SL3:
LCG_REPOSITORY="rpm http://grid-deployment.web.cern.ch/grid-deployment/gis apt/LCG-2_3_0/en/i386 lcg_sl3"
In order to install the node with the desired middleware packages run the command
> /opt/lcg/yaim/scripts/install_node <site-configuration-file> <meta-package>
The complete list of the meta-packages available with this release is
provided in 7.2. (RH7.3) and 7.2. (SL3).
For example, in order to install a CE with Torque, after the configuration of the site-info.def file is done, you have to run:
> /opt/lcg/yaim/scripts/install_node /opt/lcg/yaim/examples/site-info.def lcg-CE-torque
WARNING: The ``bare-middleware'' versions of the WN and CE meta-packages are
provided in case you are running an LRMS that is not covered.
Consider that if you have chosen the ``bare-middleware''
installation, for instance, of the CE, then you will need to run
> /opt/lcg/yaim/scripts/install_node /opt/lcg/yaim/examples/site-info.def lcg-torque
on the machine, in order to get the installation completed with Torque.
WARNING: There is a known installation conflict between the 'torque-clients'
rpm and the 'postfix' mail client (Savannah bug #5509).
In order to work around the problem you can either uninstall postfix or remove
the file /usr/share/man/man8/qmgr.8.gz from the target node.
RH7.3:
Node Type | meta-package Name | meta-package Description |
Worker Node (middleware only) | lcg-WN | It does not include any LRMS |
Worker Node (with Torque client) | lcg-WN-torque | It includes the 'Torque' LRMS |
Computing Element (middleware only) | lcg-CE | It does not include any LRMS |
Computing Element (with Torque) | lcg-CE-torque | It includes the 'Torque' LRMS |
User Interface | lcg-UI | User Interface |
LCG-BDII | lcg-LCG-BDII | LCG-BDII |
MON-Box | lcg-MON | RGMA-based monitoring system collector server |
Proxy | lcg-PX | Proxy Server |
Resource Broker | lcg-RB | Resource Broker |
Classic Storage Element | lcg-SECLASSIC | Storage Element on local disk |
Re-locatable distribution | lcg-TAR | It can be used to set up a Worker node or a UI |
Torque LRMS | lcg-torque | Torque client and server to be used in combination with the 'bare middleware' version of CE and WN packages |
SL3:
Node Type | meta-package Name | meta-package Description |
Worker Node (middleware only) | lcg-WN | It does not include any LRMS |
Worker Node (with Torque client) | lcg-WN-torque | It includes the 'Torque' LRMS |
Computing Element (middleware only) | lcg-CE | It does not include any LRMS |
Computing Element (with Torque) | lcg-CE-torque | It includes the 'Torque' LRMS |
User Interface | lcg-UI | User Interface |
LCG-BDII | lcg-LCG-BDII | LCG-BDII |
MON-Box | lcg-MON | RGMA-based monitoring system collector server |
Proxy | lcg-PX | Proxy Server |
Resource Broker | lcg-RB | Resource Broker |
Classic Storage Element | lcg-SECLASSIC | Storage Element on local disk |
Re-locatable distribution | lcg-TAR | It can be used to set up a Worker node or a UI |
Torque LRMS | lcg-torque | Torque client and server to be used in combination with the 'bare middleware' version of CE and WN packages |
> apt-get update && apt-get -y install lcg-CA
In order to keep the CA configuration up-to-date on your node, we strongly recommend that Site Administrators set up a periodic upgrade procedure for the CAs on the installed nodes (e.g. running the above command via a daily cron job).
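A minimal sketch of such a daily cron job follows (the file name /etc/cron.daily/lcg-CA-update is hypothetical; adapt it to your cron setup):
#!/bin/sh
# Refresh the CA rpms once a day
apt-get update && apt-get -y install lcg-CA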
CE, SE, PROXY, RB nodes require the host certificate/key files before you
start their installation.
Contact your national Certification Authority (CA) to find out how to
obtain a host certificate if you do not have one already.
Instructions to obtain a CA list can be found at
http://markusw.home.cern.ch/markusw/lcg2CAlist.html
From the CA list so obtained you should choose a CA close to you.
Once you have obtained a valid certificate, i.e. a certificate file and a
private key file (conventionally named hostcert.pem and hostkey.pem), make
sure to place both of them in the directory
/etc/grid-security
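The key must not be world-readable; typical permissions (a common convention of the grid middleware, shown here as a sketch) are:
> chmod 644 /etc/grid-security/hostcert.pem
> chmod 400 /etc/grid-security/hostkey.pem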
The general procedure to configure the middleware packages that have been installed on the node via the procedure described in 7. is to run the command:
> /opt/lcg/yaim/scripts/<configuration-script> <site-configuration-file>
The complete list of the configuration scripts available with this release is provided in 10.1..
For example, in order to configure the WN with Torque installed earlier, after the configuration of the site-info.def file is done, you have to run:
> /opt/lcg/yaim/scripts/configure_WN_torque /opt/lcg/yaim/examples/site-info.def
A reference to all the available configuration scripts is given below.
There are items in the list marked with an asterisk (*). For these node types
there are some particularities in the configuration procedure or extra
configuration details to be considered, which are described in a following
dedicated section.
For all the unmarked node types, the general configuration procedure is the
one described above.
Node Type | Script Name | Script Description |
Worker Node (middleware only) | configure_WN | It does not configure any LRMS |
Worker Node (with Torque client) | configure_WN_torque | It configures also the 'Torque' LRMS client |
Computing Element (middleware only) | configure_CE | It does not configure any LRMS |
Computing Element (with Torque) * | configure_CE_torque | It configures also the 'Torque' LRMS client and server |
User Interface | configure_UI | User Interface |
LCG-BDII | configure_BDII | LCG-BDII |
MON-Box | configure_MON | RGMA-based monitoring system collector server |
Proxy | configure_PX | Proxy Server |
Resource Broker | configure_RB | Resource Broker |
Classic Storage Element | configure_classic_SE | Storage Element on local disk |
Re-locatable distribution * | configure_TAR | It can be used to set up a Worker Node or a UI (see 11.2. for details) |
You can use yaim to install more than one node type on a single machine. In this case, you should install all the relevant software first, and then run the configure scripts. For example, to install a combined RB and BDII, you should do the following:
> /opt/lcg/yaim/scripts/install_node /opt/lcg/yaim/examples/site-info.def lcg-RB
> /opt/lcg/yaim/scripts/install_node /opt/lcg/yaim/examples/site-info.def lcg-LCG-BDII
> /opt/lcg/yaim/scripts/configure_RB /opt/lcg/yaim/examples/site-info.def
> /opt/lcg/yaim/scripts/configure_BDII /opt/lcg/yaim/examples/site-info.def
Note that one combination known not to work is the CE/RB, due to a conflict between the GridFTP servers.
WARNING: in the CE configuration context (and also in the 'torque' LRMS one),
a file with a list of managed nodes needs to be compiled. An example of this
configuration file is given in /opt/lcg/yaim/examples/wn-list.conf
The path of this file then needs to be set in the variable WN_LIST in the
Site Configuration File (see 5.1.), as illustrated below.
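The file is simply a list of worker node hostnames, one per line; for instance (hypothetical hosts):
wn001.my-domain.org
wn002.my-domain.org
wn003.my-domain.org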
The Maui scheduler configuration provided with the script is currently very
basic.
More advanced configuration examples, to be implemented manually by Site Administrators, can be found in [5].
Once you have the middleware directory available, you must edit the site-info.def file as usual, putting the location of the middleware into the variable INSTALL_ROOT.
If you are sharing the distribution among a number of nodes, commonly WNs, then they should all mount the tree at INSTALL_ROOT (see the sketch after this paragraph). You should configure the middleware on one node (remember you'll need to mount with appropriate privileges), and then it should work for all the others if you set up your batch system and the CA certificates in the usual way. If you'd rather have the CAs on your share, the yaim function install_certs_userland may be of interest. You may want to remount your share read-only (ro) after the configuration has been done.
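A sketch of how a node might mount such a shared tree via NFS in /etc/fstab (server, export path and mount point are hypothetical; the mount point should match your INSTALL_ROOT):
nfs-server.my-domain.org:/export/lcg  /opt/lcg  nfs  ro,hard,intr  0 0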
What happens next depends on whether you are root or not...
Next you must install the dependency software and run the configure_TAR script, adding the type of node as an argument:
> /opt/lcg/yaim/scripts/install_node <site-configuration-file> lcg-TAR
> /opt/lcg/yaim/scripts/configure_TAR /opt/lcg/yaim/examples/site-info.def WN|UI
Note that the script will not configure any LRMS. If you're configuring Torque for the first time, you may find the config_users and config_torque_client yaim functions useful.
NOTE - at the time of writing, this is not supported in the distributed yaim rpm. You'll have to check out the latest version from CERN's deployment CVS - http://lcgdeploy.cvs.cern.ch/cgi-bin/lcgdeploy.cgi/lcg-scripts/yaim/.
The relocatable tarball has some dependencies which would normally be installed as rpms by root. We've made this software available as a second tar file which you must download and untar under $EDG_LOCATION. This means that if you untarred the main distribution under /home/user/UI, you must untar the supplementary files under /home/user/UI/edg.
The middleware also requires Java. If it is not available on your machine, download it from Sun (1.4.2 is recommended - http://java.sun.com/j2se/1.4.2/download.html) and make sure you set the JAVA_LOCATION variable in your site-info.def. You'll probably want to alter the OUTPUT_STORAGE variable there too, as it's set to /tmp/jobOutput by default and it may be better pointing at somewhere in your home directory.
Once the software is all installed, you should run
> /opt/lcg/yaim/scripts/configure_TAR /opt/lcg/yaim/examples/site-info.def UI
to configure it.
Finally, you'll have to set up some way of sourcing the environment necessary to run the grid software. A script will be available under $INSTALL_ROOT/etc/profile.d for this purpose. Source grid_env.sh or grid_env.csh depending upon your choice of shell.
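For example, for a Bourne-like shell:
> source $INSTALL_ROOT/etc/profile.d/grid_env.sh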
Installing a UI this way puts all the CA certificates under $INSTALL_ROOT/etc/grid-security and adds a user cron job to download the CRLs. However, please note that you'll need to keep the CA certificates up to date yourself.
In [3] there is more information on using this form of the distribution, including a description of what the configure script does. You should check this reference if you'd like to customise the relocatable distribution.
This distribution is used at CERN to make its lxplus system available as a UI. You can take a look at the docs for this too [4].
You can download the tar file for each operating system from
http://grid-deployment.web.cern.ch/grid-deployment/download/relocatable/LCG-2_3_0-rh73.tar.gz
http://grid-deployment.web.cern.ch/grid-deployment/download/relocatable/LCG-2_3_0-sl3.tar.gz
You can download supplementary tar files for the userland installation from
version | date | description |
v2.3.0-2 | 10/Jan/05 | 5.1.: CA_WGET variable added in site configuration file. |
v2.3.0-3 | 2/Feb/05 | Bibliography: Link to Generic Configuration Reference changed. |
`` | `` | 11.1., 5.1.: Details added on WN and users lists. |
`` | `` | 10.1.: script ``configure_torque'' no longer available: removed from the list. |
v2.3.0-4 | 16/Feb/05 | Configure apt to find your OS rpms. |
v2.3.0-5 | 22/Feb/05 | Remove apt prefs stuff, mention multiple nodes on one box. |
v2.3.0-6 | 03/Mar/05 | Better lcg-CA update advice. |