Start using this module
Add this module to your Puppetfile:
mod 'HEPPuppet-htcondor', '2.0.1'
#Puppet module for HTCondor batch system
Latest stable version: https://github.com/HEP-Puppet/htcondor/releases/tag/v1.3.1
Development branch (heading for 2.0.0): https://github.com/HEP-Puppet/htcondor/tree/development
Puppetforge: https://forge.puppetlabs.com/HEPPuppet/htcondor
####Table of Contents
- Overview - What is the htcondor module?
- Module Description - What does the module do?
- Setup - The basics of getting started with htcondor
- Limitations - OS compatibility, etc.
- Development - Guide for contributing to the module
##Overview
The htcondor module allows you to set up an HTCondor cluster (https://research.cs.wisc.edu/htcondor/). It depends on several other modules, including puppetlabs/(stdlib|concat|firewall). Please check the metadata.json for detailed dependencies.
##Module Description
An HTCondor cluster consists of at least three types of nodes:
- a worker for executing the jobs
- a scheduler for job submission
- a collector/negotiator to match jobs with workers
This Puppet module allows for the configuration of these three types of nodes.
##Setup
What the htcondor module affects:
- configuration files and directories (/etc/condor/*)
- installation of htcondor software (condor* packages)
- a new fact for facter: condor_version
###Beginning with HTCondor
Since admins might wish to run their own repository or disable repositories after install, the HTCondor repository has not been included in the Puppet module since version 2.0.0. Therefore, the first step is to install the latest HTCondor repository for your OS (https://research.cs.wisc.edu/htcondor/yum/):
yum install -y https://research.cs.wisc.edu/htcondor/yum/repo.d/htcondor-stable-rhel6.repo
If you wish to use a pool password for authentication, you will need to create one first: `condor_store_cred -f <path_to_htcondor_module>/files/pool_password`.
Examples
`hiera` config examples can be found in the `examples` folder. They describe a minimal example of
- settings shared across different node types: `htcondor_common.yaml`
- settings for managers (nodes that run collector & negotiator daemons): `htcondor_manager.yaml`
- settings for schedulers: `htcondor_scheduler.yaml`
- settings for worker nodes: `htcondor_common.yaml`
The examples assume class management in hiera by adding `hiera_include('classes')` to the `site.pp`. Real-life examples can be found in https://github.com/uobdic/UKI-SOUTHGRID-BRIS-HEP.
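As a rough sketch of what this class management could look like: `hiera_include('classes')` reads the `classes` key from hiera, so a data file in your hierarchy might list the classes to apply. The file name and its position in the hierarchy are assumptions here; the class name `htcondor` is inferred from the `htcondor::` parameter prefix used in this README.

```yaml
# Hypothetical hiera data file (for example a node- or role-level YAML file).
# With hiera_include('classes') in site.pp, every class listed under
# the "classes" key is included on nodes that match this file.
classes:
  - htcondor
```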
Custom machine/job attributes
Sometimes it is necessary to create custom attributes for condor. Machine attributes can be used in job requirements (e.g. `HasMatLab = True`) and job attributes for job reporting/monitoring (e.g. `HEPSPEC06 = 14.00`).
To specify the attributes in hiera simply add
htcondor::custom_attributes:
- HasMatLab: True
...
and for job attributes
htcondor::custom_job_attributes:
- HEPSPEC06: 14.00
- CPUScaling: 1.04
...
Although the use is identical, they are put into different places: `custom_attributes` are added to `STARTD_ATTRS`, and `custom_job_attributes` are added to `STARTD_JOB_ATTRS`.
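Putting the two together, a hiera file for a worker might contain both keys side by side. This is only a sketch that reuses the values from the fragments above; which file it belongs in depends on your own hierarchy.

```yaml
# Machine attributes: usable in job requirements.
# These are added to STARTD_ATTRS.
htcondor::custom_attributes:
  - HasMatLab: True

# Job attributes: used for job reporting/monitoring.
# These are added to STARTD_JOB_ATTRS.
htcondor::custom_job_attributes:
  - HEPSPEC06: 14.00
  - CPUScaling: 1.04
```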
##Limitations
###General
##Development
###Contributing
###Running tests
Please run
bundle exec rake validate && bundle exec rake lint && bundle exec rake spec SPEC_OPTS='--format documentation'
and make sure no errors are present when submitting code.
Version 2.0.0
Version 2.0.0 brought big changes to the module. The biggest change is a structural one. `htcondor::params.pp` was added to set defaults for all the parameters. In addition, the module attempts to read parameters via `hiera` first. Full merge support for hashes and arrays is provided.
With these changes the `htcondor::config.pp` was split into six pieces:
- the main config setting up the rest
- a common config part
- the security configuration
- separate configs for manager, scheduler & worker

The full detail of these changes can be seen in PR 53.
New features
- configure connection broker for private workers (i.e. workers that cannot be reached from the manager or scheduler but can reach the manager).
- enabled `ganglia` daemon for schedulers (previously only possible on managers)
- flag to enable condor reporting, disabled by default
- added `use_anonymous_auth`
- added `custom_machine_attributes` and `custom_job_attributes`, which can be used to add classads for `STARTD_ATTRS` and `STARTD_JOB_ATTRS`
Bug fixes
- daemon list would be incorrect for some versions of Ruby. This was due to the use of `and` and `or` operators, which is incorrect for boolean comparisons.
- added missing `cluster_has_multiple_domains` parameter (w.r.t. 2.0.0 beta)
- removed repository dependency if it is disabled
Other
- changed config templates to ensure new line at the end of the file and reduced the use of `-%>`
- workers are no longer able to write to schedulers by default
- new formatting for the security config: one line per entry for manager/scheduler/worker
- removed `use_pkg_config` parameter.
- no longer changing `/etc/condor/condor_config` nor `/etc/condor/condor_config.local` as recommended by the HTCondor team
- content previously in `/etc/condor/condor_config.local` now in `/etc/condor/config.d/00_config_local.config`
Dependencies
- puppetlabs/firewall (>=0.3.1)
- puppetlabs/stdlib (>=4.1.0)