Date | 03.05.2010 |
Priority | Normal |
Description
glite-SGE_utils
First SGE_utils release in SL5
This new glite-SGE_utils release integrates the SGE LRMS with
CREAMCE (Version 1.6) on SL5, x86_64. The CREAMCE integration
consists of setting up the BLAH configuration to interoperate
with SGE. For BLAH to work, the following SGE scripts and
binaries should be available under /opt/glite/bin: BUpdaterSGE,
sge_helper, sge_hold.sh, sge_submit.sh, sge_resume.sh,
sge_status.sh and sge_cancel.sh. They are installed by the
glite-ce-blahp rpm, but the same files can also be downloaded from
http://www.egee.cesga.es/cream/releases/0.60/bin_0.60.tar.gz
The CREAMCE must be installed on a separate node from the SGE
QMASTER, and the same SGE software version should be used on both
nodes. After installing the CREAMCE and SGE_utils meta-packages,
reconfigure the services:
/opt/glite/yaim/bin/yaim -c -s /root/site-cfg/siteinfo/site-info-egee.def -n creamCE -n SGE_utils
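Before reconfiguring, it can help to confirm that the BLAH helper files named in this note are in place. The following is only a minimal sketch: the function name missing_blah_files is hypothetical, and treating "available" as "regular file with the executable bit set" is an assumption, not something the release states.

```shell
# File names as listed in this release note.
BLAH_SGE_FILES="BUpdaterSGE sge_helper sge_hold.sh sge_submit.sh sge_resume.sh sge_status.sh sge_cancel.sh"

# Hypothetical helper: print every expected file that is missing or not
# executable in the given directory (default: /opt/glite/bin).
# Returns 0 only when nothing is missing.
missing_blah_files() {
    dir="${1:-/opt/glite/bin}"
    status=0
    for f in $BLAH_SGE_FILES; do
        # Assumption: "available" means a regular, executable file.
        if [ ! -f "$dir/$f" ] || [ ! -x "$dir/$f" ]; then
            printf '%s\n' "$f"
            status=1
        fi
    done
    return "$status"
}
```

Calling missing_blah_files with no argument checks /opt/glite/bin; an empty output and a zero exit status mean all seven files were found.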
The CREAMCE should be declared as an allowed submission host in
the SGE QMASTER using "qconf -as
<CE.MY.DOMAIN>". The SGE QMASTER
configuration should also contain the definition
"execd_params INHERIT_ENV=false", which can be
set in the SGE QMASTER using "qconf
-mconf". This setting allows the environment of the
submission machine (CE) to be propagated to the execution
machine (WN).
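Both settings can be checked after the fact by reading back what the QMASTER reports. The sketch below assumes the usual output of "qconf -ss" (one submit host per line) and "qconf -sconf global" (one parameter per line, name first); the function names are hypothetical, and a value continued over several lines with backslashes is not handled.

```shell
# Succeeds if host $1 appears in the submit host list read from stdin
# (one host per line, as "qconf -ss" is assumed to print it).
is_submit_host() {
    grep -Fqxi -- "$1"
}

# Succeeds if the configuration read from stdin has an execd_params line
# containing INHERIT_ENV=false (compared case-insensitively).
has_inherit_env_false() {
    awk '$1 == "execd_params" && tolower($0) ~ /inherit_env=false/ { found = 1 }
         END { exit !found }'
}
```

On the QMASTER this would be used as "qconf -ss | is_submit_host ce.my.domain" and "qconf -sconf global | has_inherit_env_false", replacing ce.my.domain with the real CE host name.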
File transfer between the WN and the CE is handled by a script called sge_filestaging, which must be available on all WNs under /opt/glite/bin. A copy can be found in your CreamCE installation at /opt/glite/bin/sge_filestaging. If you configure your WNs with glite-yaim-sge-client (http://eticssoft.web.cern.ch/eticssoft/repository/org.glite/org.glite.yaim.sge-client/4.1.1/noarch/glite-yaim-sge-client-4.1.1-3.noarch.rpm), the script is present on your WNs by default; otherwise you will have to install it yourself. By default, this copy mechanism uses passwordless scp WN<->CreamCE, which the site admin has to set up (YAIM will not take care of that task on your behalf). The script must be executed as the prolog and epilog of your jobs. Therefore you should define:
prolog /opt/glite/bin/sge_filestaging --stagein
epilog /opt/glite/bin/sge_filestaging --stageout
either in the SGE global configuration ("qconf -mconf") or in each queue configuration ("qconf -mq"). If you already have prolog and epilog scripts defined, add these calls to your existing scripts instead. If your prolog and epilog scripts run as root, you will have to use su (for example, su -m -c "/opt/glite/bin/sge_filestaging --stageout" $USER).
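For sites that already have their own prolog and epilog scripts, the choice between calling sge_filestaging directly and wrapping it in su can be sketched as below. This is an illustration only: staging_command is a hypothetical helper that prints the command line rather than running it, so it can be inspected first, and the SGE_FILESTAGING override exists purely to make the sketch easy to try out.

```shell
# Path given in this release note; the override is an assumption for testing.
SGE_FILESTAGING="${SGE_FILESTAGING:-/opt/glite/bin/sge_filestaging}"

# Hypothetical helper: print the staging command for a stage
# ("stagein" or "stageout"), given the numeric uid the script runs as
# and the job owner's user name.
staging_command() {
    stage="$1"; uid="$2"; job_user="$3"
    case "$stage" in
        stagein|stageout) ;;
        *) echo "unknown stage: $stage" >&2; return 1 ;;
    esac
    if [ "$uid" -eq 0 ]; then
        # Running as root: hand over to the job owner with su, as in the note.
        printf 'su -m -c "%s --%s" %s\n' "$SGE_FILESTAGING" "$stage" "$job_user"
    else
        printf '%s --%s\n' "$SGE_FILESTAGING" "$stage"
    fi
}
```

Inside an existing epilog, staging_command stageout "$(id -u)" "$USER" shows which form applies on that node.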
Some sites use SGE installations shared via NFS or equivalent
(see bug #59060). To avoid changing such a shared SGE setup, a
new yaim variable, called SGE_SHARED_INSTALL, is
introduced. Its default value is "no", meaning
that the SGE software WILL BE configured by YAIM. If you are using
an SGE installation shared via NFS or equivalent, and you do not
want YAIM to change it, set SGE_SHARED_INSTALL=yes in your
site-info.def file.
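To see which value YAIM would pick up, the variable can be read back from the configuration file. The sketch below assumes site-info.def uses plain shell VARIABLE=value assignments and can be sourced in a subshell; the function name is hypothetical, and the fallback to "no" mirrors the default stated above.

```shell
# Hypothetical helper: print the effective SGE_SHARED_INSTALL value from
# the given site-info.def file, falling back to the documented default "no".
sge_shared_install_setting() {
    file="$1"
    # The "." builtin searches PATH for names without a slash, so anchor
    # bare file names to the current directory.
    case "$file" in
        */*) ;;
        *) file="./$file" ;;
    esac
    (
        # Subshell: nothing read from the file leaks into the caller.
        unset SGE_SHARED_INSTALL
        . "$file" >/dev/null 2>&1 || exit 1
        printf '%s\n' "${SGE_SHARED_INSTALL:-no}"
    )
}
```

For example, sge_shared_install_setting /root/site-cfg/siteinfo/site-info-egee.def prints "yes" only if the file sets SGE_SHARED_INSTALL=yes. Sourcing executes the file, so this should only be pointed at a trusted configuration.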
KNOWN ISSUES:
KNOWN ISSUES FOUND DURING THE CERTIFICATION:
SOLVED BUGS:
RELATED SOLVED BUGS:
Patch #3767: [ yaim-core ] yaim-core 4.0.12 SL5/x86_64
New release of yaim core containing a set of bug fixes and new
features:
- Can now configure the GSI callout to call the ARGUS PEP
client.
- Avoid mistakenly removing all the services from gLiteservices
file.
- Fix GLOBUS_TCP_PORT_RANGE setting on the SL5 tarball UI.
- Correct unset for shell functions in
clean-grid-env-funcs.sh
- Make config_bdii_only return non zero in case of error
- Fixes for installing the UI tarball on CernVM.
- Allow general use of the 'nickname' field in the VOMSES
settings.
- Add yaim core RPM dependency on perl
- Allow use of pool accounts with up to 4 digits
- Fix grid-env.sh manipulation when running a single yaim
function
- Fix gridmap dir group on WMS
- Change the CE_INBOUNDIP and CE_OUTBOUNDIP defaults in
site-info.def to be valid and imply the correct (upper)
case.
- Call setup-openssl for VDT 1.10.
Patch #3977: SL5/x86_64 APEL CPUScalingFactor bug fix
APEL will now read the CPUScalingReferenceSI00 value from the
site GIIS. If this value is not available, APEL will read
GlueHostBenchmarkSI00.
Note that the new version of APEL will read the CPU power
(specint rating) from the GlueCECapability CPUScalingReferenceSI00
attribute if it is published, so please check that the value is
correct (the same as the GlueHostBenchmarkSI00 attribute) and that
your APEL accounting records look OK after the upgrade.
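The check suggested above can be sketched as a comparison of the two published values. The function below is hypothetical and assumes the information system returns LDIF lines of the form "GlueCECapability: CPUScalingReferenceSI00=<value>" and "GlueHostBenchmarkSI00: <value>"; if several entries are present, only the last value of each is compared.

```shell
# Hypothetical helper: read LDIF from stdin and print
#   "unpublished" if no CPUScalingReferenceSI00 capability is found,
#   "match"       if it equals GlueHostBenchmarkSI00 numerically,
#   "mismatch"    otherwise.
compare_si00() {
    awk '
        /^GlueCECapability: CPUScalingReferenceSI00=/ { sub(/^.*=/, ""); ref = $0 }
        /^GlueHostBenchmarkSI00: /                     { bench = $2 }
        END {
            if (ref == "")                                print "unpublished"
            else if (bench != "" && ref + 0 == bench + 0) print "match"
            else                                          print "mismatch"
        }'
}
```

A possible use is to pipe an ldapsearch query against the site information system into compare_si00; the host, port and search base for that query depend on the site and are not given in this note.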
This update fixes various bugs. The full list is given below.
Fixed bugs
Number | Description |
#3767 | [ yaim-core ] yaim-core 4.0.12 SL5/x86_64 |
#3959 | Release 1.6 of CREAM CE for sl5_x86_64 |
#3977 | SL5/x86_64 APEL CPUScalingFactor bug fix |
#59060 | Introduce yaim variable in SGE_utils for site admins to set if they do not want to setup SGE client |
#59626 | Wrong YAIM error message , SGE_utis |
#59627 | Typo in YAIM error message [SGE_utils] |
#59937 | sge_helper blah scirpt coding error |
#63047 | Changes needed in blah.conf to support CreamCE V1.6 |
Updated rpms
The RPMs can be updated using yum via
Service reconfiguration after update
Service must be reconfigured.
Service restart after update
Service must be restarted.
How to apply the fix
- Update the RPMs (see above)
- Update configuration (see above)
- Restart the service if necessary (see above)