.. _baseline-user-guide:

*******************
Baseline User Guide
*******************

.. _notable-diffs:

Notable Differences
===================

.. list-table:: Please note before using Baseline
   :widths: 20 150
   :header-rows: 1

   * - Topic
     - Description
   * - :ref:`Accounts`
     - All users must apply for Baseline accounts using the myOLCF `Account Request Form `_. Each account must be associated with an allocated project. Your project's PI can provide the project ID which will be used in your account request.
   * - :ref:`Access`
     - Baseline can be accessed by SSHing into ``baseline.ccs.ornl.gov``.
   * - :ref:`Compute Resource`
     - Baseline comprises 180 nodes, each with two AMD 7713 processors (128 cores per node). Compute nodes are uniform in processor configuration but provide a mixture of 256 GB and 512 GB of memory per node. Batch system partitions can be used to target higher-memory nodes as well as nodes purchased by research teams.
   * - :ref:`Batch Submission`
     - Baseline utilizes the Slurm batch scheduler and is similar to other CADES resources with a few notable exceptions. Please note that each submission must specify a node count, walltime, project ID, and partition.
   * - :ref:`Programming Environment`
     - Default modules ``gcc/12.2.0`` and ``openmpi/4.0`` are loaded upon login. As with other CADES resources, the module system can be used to modify the programming environment.
   * - :ref:`File Systems`
     - Each Baseline user has access to an NFS home and project area as well as a GPFS scratch filesystem. **Please note, Baseline does not cross-mount filesystems from other CADES resources.**
   * - :ref:`Transferring Data`
     - Because Baseline does not cross-mount other CADES filesystems, you will need to transfer all needed data onto Baseline. The NCCS OpenDTNs are available to all Baseline users to assist. For larger transfers, we recommend using the Globus endpoint, `NCCS Open DTN`.

.. _system-overview:

System Overview
===============

CADES Baseline resources and services are deployed in the NCCS Open data enclave to serve ORNL researchers and their close collaborators. Baseline consists of both publicly available resources and resources that have been purchased by specific research groups. In total, Baseline contains 180 nodes, each with two AMD 7713 processors (128 cores per node). Baseline shares a 2.3 PB partition on the Wolf2 GPFS filesystem with other Open enclave environments for fast parallel active storage. Each Baseline user is given a 50 GB Baseline-specific home area. The open access cluster uses a fairshare scheduling approach. Research teams who wish to purchase privileged access to specific resources can reach out to the CADES director.

.. image:: /images/Baseline-Node-Description-SMT1.png
   :align: center

.. _baseline-login-nodes:

Node Types
----------

On Baseline, there are two major types of nodes you will encounter: Login and Compute. While these are similar in terms of hardware, they differ considerably in their intended use.

+-----------+--------------------------------------------------------------------------------------+
| Node Type | Description                                                                          |
+===========+======================================================================================+
| Login     | When you connect to Baseline, you're placed on a login node. This                    |
|           | is the place to write/edit/compile your code, manage data, submit jobs, etc. You     |
|           | should never launch parallel jobs from a login node nor should you run threaded      |
|           | jobs on a login node. Login nodes are shared resources that are in use by many       |
|           | users simultaneously.                                                                |
+-----------+--------------------------------------------------------------------------------------+
| Compute   | Most of the nodes on Baseline are compute nodes. These are where                     |
|           | your parallel job executes. They're accessed via the ``srun`` command.               |
+-----------+--------------------------------------------------------------------------------------+

Login nodes
-----------

Baseline contains 4 login nodes that are accessible through a single load balancer at ``baseline.ccs.ornl.gov``.

+---------------------+----------------+-------------+--------+
| Nodes               | Cores-per-node | Processor   | Memory |
+=====================+================+=============+========+
| baseline-login[1-4] | 128            | 2x AMD 7713 | 256 GB |
+---------------------+----------------+-------------+--------+

As detailed in the :ref:`connecting` section, users can SSH into the general load balancer login node or explicitly SSH into one of the four login nodes.

.. _baseline-compute-nodes:

Compute nodes
-------------

Baseline's open access cluster contains 140 compute nodes in two memory configurations:

+------------------+----------------+-------------+--------+
| Nodes            | Cores-per-node | Processor   | Memory |
+==================+================+=============+========+
| baseline[1-72]   | 128            | 2x AMD 7713 | 512 GB |
+------------------+----------------+-------------+--------+
| baseline[73-140] | 128            | 2x AMD 7713 | 256 GB |
+------------------+----------------+-------------+--------+

As detailed in the :ref:`partitions` section, baseline[1-72] represent the ``batch_high_memory`` partition, while baseline[73-140] represent the ``batch_low_memory`` partition.

The following Baseline nodes have been purchased by research groups and are reserved for their exclusive use:

+-------------------+----------------+-------------+---------+---------+-------+
| Nodes             | Cores-per-node | Processor   | Memory  | GPU     | Owner |
+===================+================+=============+=========+=========+=======+
| baseline[141-160] | 128            | 2x AMD 7713 | 1024 GB | N/A     | CCSI  |
+-------------------+----------------+-------------+---------+---------+-------+
| baseline[161-180] | 128            | 2x AMD 7713 | 512 GB  | N/A     | CNMS  |
+-------------------+----------------+-------------+---------+---------+-------+
| baseline-gpu1     | 128            | 2x AMD 7713 | 1024 GB | 8x H100 | ACMHS |
+-------------------+----------------+-------------+---------+---------+-------+

The above nodes represent the ``batch_ccsi``, ``batch_cnms``, and ``gpu_acmhs`` partitions, respectively.

File system
-----------

CADES users share 2.3 PB of the Wolf2 General Parallel File System (GPFS), which sits in the NCCS Open Science enclave. Baseline mounts a stand-alone NFS-based filesystem which provides user home directories with 50 GB of storage, separate from their NCCS Open home areas. See :ref:`data-storage` for more information.

.. note::
   Please note, Baseline does not cross-mount filesystems available to other CADES resources.

Operating System
----------------

Baseline is running Red Hat Enterprise Linux (RHEL).

.. _accounts-projects:

Account and Project Applications
================================

Active Baseline umbrella projects
---------------------------------

Users within the Directorates / Divisions below can apply to join their respective umbrella projects. If you wish to have an umbrella project created for your Directorate or Division, please reach out to `cades-help@ornl.gov`.
+---------+----------------------------------------------------------------------------------+
| Project | Directorate / Division                                                           |
+=========+==================================================================================+
| PHY191  | Physical Sciences Directorate (PSD)                                              |
+---------+----------------------------------------------------------------------------------+
| CSC635  | Computing & Computational Sciences Directorate (CCSD)                            |
+---------+----------------------------------------------------------------------------------+
| NPH166  | Isotope Science and Engineering Directorate (ISED)                               |
+---------+----------------------------------------------------------------------------------+
| MAT269  | Center for Nanophase Material Sciences (CNMS)                                    |
+---------+----------------------------------------------------------------------------------+
| BIE124  | Biosciences Division (BSD)                                                       |
+---------+----------------------------------------------------------------------------------+
| CLI185  | Climate Change Science Institute (CCSI) / Environmental Sciences Division (ESD)  |
+---------+----------------------------------------------------------------------------------+
| GEN193  | Neutron Technologies Division (NTD)                                              |
+---------+----------------------------------------------------------------------------------+

Applying for a user account
---------------------------

- All users must apply for an account using the `Account Request Form `_ and select the `Open` side of the project you wish to join.
- All accounts must be associated with an allocated project. Your project's PI can provide the project identifier that will be used in your account request.
- When our accounts team begins processing your application, you will receive an automated email containing a unique 36-character confirmation code. Make note of it; you can use it to check the status of your application at any time.
- The principal investigator (PI) of the project must approve your account and system access. We will make the project PI aware of your request.

Checking the status of your application
---------------------------------------

You can check the general status of your application at any time using the `myOLCF `_ self-service portal's account status page. For more information, see the `myOLCF self-service portal documentation `_. If you need to make further inquiries about your application, you may email our Accounts Team at `accounts@ccs.ornl.gov`.

When all of the above steps are completed, your user account will be created and you will be notified by email. Now that you have a user account and it has been associated with a project, you're ready to get to work. This website provides extensive documentation for OLCF systems and can help you efficiently use your project's allocation. We recommend reading the System User Guides for the machines you will be using often.

Maintaining your user account
-----------------------------

All OLCF users are subject to a yearly account renewal to validate their account. You will receive an email from our Accounts Team at `accounts@ccs.ornl.gov` with details on how to renew your account on myOLCF. If you are currently a member of multiple Open enclave projects, submitting a renewal application for only one project will validate your user account. There is no need to submit a renewal application for each project.
If a user fails to renew their account in the allotted time, a new user account application will need to be submitted.

Get access to additional projects
---------------------------------

If you already have a user account on Baseline, your existing credentials can be leveraged across multiple projects. You can gain access to another project by logging in to the myOLCF self-service portal and filling out the application under `My Account > Join Another Project`. For more information, see the `myOLCF self-service portal documentation `_. Once the PI of that project has been contacted and has granted permission, your user account will be added to the relevant charge accounts and unix groups, and you will see these additions when you log in.

.. _connecting:

Connecting
==========

Baseline has 4 login nodes that are configured behind a load balancer. These login nodes provide an environment for editing, compiling, and launching codes onto the compute nodes. All users access the system through these same login nodes, and as such, running CPU- or memory-intensive tasks on these nodes could interrupt service to other users. As a courtesy, we ask that you refrain from doing any analysis or visualization tasks on the login nodes.

To connect to Baseline, ssh to the load balancer at ``baseline.ccs.ornl.gov``::

    ssh username@baseline.ccs.ornl.gov

.. note::
   Login node resources are shared by all Baseline users. Please be courteous and limit the use of memory- or CPU-intensive processes on the login nodes. Memory- and CPU-intensive as well as long-running processes should be executed on Baseline's compute resources.

.. _prog-env:

Shell and Programming Environment
=================================

Default shell
-------------

A user's default shell is selected when completing the user account request form. Currently, supported shells include:

+------+------+------+------+------+
| bash | tcsh | csh  | ksh  | zsh  |
+------+------+------+------+------+

If you would like to have your default shell changed, please send an email to `cades-help@ornl.gov`.

Environment Modules (Lmod)
--------------------------

Environment modules are provided through `Lmod `__, a Lua-based module system for dynamically altering shell environments. By managing changes to the shell's environment variables (such as ``PATH``, ``LD_LIBRARY_PATH``, and ``PKG_CONFIG_PATH``), Lmod allows you to alter the software available in your shell environment without the risk of creating package and version combinations that cannot coexist in a single environment.
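As a quick illustration, the commands below inspect what Lmod has loaded in a fresh login shell and preview the changes a modulefile would make before loading it. This is a minimal sketch; ``gcc/12.2.0`` is the documented default, but the exact modules and versions available on Baseline may differ.

.. code::

    $ module -t list            # terse list of the modules loaded at login
    $ module show gcc/12.2.0    # preview the environment changes the gcc modulefile makes
    $ module load gcc/12.2.0    # explicitly (re)load a specific version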
General Usage
^^^^^^^^^^^^^

The interface to Lmod is provided by the ``module`` command:

+-------------------------------------+-------------------------------------------------------------------------+
| Command                             | Description                                                             |
+=====================================+=========================================================================+
| ``module -t list``                  | Shows a terse list of the currently loaded modules                      |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module avail``                    | Shows a table of the currently available modules                        |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module help <modulefile>``        | Shows help information about ``<modulefile>``                           |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module show <modulefile>``        | Shows the environment changes made by the ``<modulefile>`` modulefile   |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module spider <modulefile>``      | Searches all possible modules according to ``<modulefile>``             |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module load <modulefile> [...]``  | Loads the given ``<modulefile>``\(s) into the current environment       |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module use <path>``               | Adds ``<path>`` to the modulefile search cache and ``MODULEPATH``       |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module unuse <path>``             | Removes ``<path>`` from the modulefile search cache and ``MODULEPATH``  |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module purge``                    | Unloads all modules                                                     |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module reset``                    | Resets loaded modules to system defaults                                |
+-------------------------------------+-------------------------------------------------------------------------+
| ``module update``                   | Reloads all currently loaded modules                                    |
+-------------------------------------+-------------------------------------------------------------------------+

Searching for Modules
^^^^^^^^^^^^^^^^^^^^^

Modules with dependencies are only available when the underlying dependencies, such as compiler families, are loaded. Thus, ``module avail`` will only display modules that are compatible with the current state of the environment. To search the entire hierarchy across all possible dependencies, the ``spider`` sub-command can be used as summarized in the following table.

+------------------------------------------+---------------------------------------------------------------------------------------+
| Command                                  | Description                                                                           |
+==========================================+=======================================================================================+
| ``module spider``                        | Shows the entire possible graph of modules                                            |
+------------------------------------------+---------------------------------------------------------------------------------------+
| ``module spider <modulefile>``           | Searches for modules named ``<modulefile>`` in the graph of possible modules          |
+------------------------------------------+---------------------------------------------------------------------------------------+
| ``module spider <modulefile>/<version>`` | Searches for a specific version of ``<modulefile>`` in the graph of possible modules  |
+------------------------------------------+---------------------------------------------------------------------------------------+
| ``module spider <string>``               | Searches for modulefiles containing ``<string>``                                      |
+------------------------------------------+---------------------------------------------------------------------------------------+
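For example, to see which versions of Open MPI exist anywhere in the module hierarchy and what must be loaded to reach them, you might run the commands below. This is a hedged sketch; the module name and version are placeholders, not a guaranteed listing of what is installed on Baseline.

.. code::

    $ module spider openmpi          # list every openmpi version known to the module system
    $ module spider openmpi/4.0.5    # show which modules (e.g. a compiler) must be loaded first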
.. _compiling:

Compiling
=========

Available compilers
-------------------

The following compilers are available on Baseline:

- Intel, Intel Composer XE
- GCC, the GNU Compiler Collection (default)

Upon login, the default versions of the gcc compiler and openmpi are automatically added to each user's environment. Users do not need to make any environment changes to use the default versions of gcc and openmpi.

If a different compiler is required, it is important to use the correct environment for each compiler. To aid users in pairing the correct compiler and environment, the module system on Baseline automatically pulls in libraries compiled with a given compiler when changing compilers. The compiler modules will load the correct pairing of compiler version, message passing libraries, and other items required to build and run code.

For example, to change from the default gcc environment to the intel environment, use::

    $ module load intel

This will unload the current compiler and the system libraries associated with it, then load the new compiler environment along with its associated system libraries.
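As a quick check that the compiler/MPI pairing in your current environment works, a small MPI program can be built as shown below. This is a minimal sketch, assuming the gcc and openmpi modules are loaded and that your source file is named ``hello.c``; adjust the file names and flags for your own code.

.. code::

    $ gcc --version                 # confirm which compiler the loaded module provides
    $ mpicc -O2 -o hello hello.c    # Open MPI's wrapper compiler; it invokes the loaded gcc
    $ mpicc --showme                # print the underlying compile/link command the wrapper uses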
.. _running:

Running Jobs
============

On Baseline, computational work is performed by *jobs*. Timely, efficient execution of these jobs is the primary concern of operation in any HPC system.

A job on a commodity cluster such as Baseline typically comprises a few different components:

- A batch submission script
- A binary executable
- A set of input files for the executable
- A set of output files created by the executable

The process for running a job, in general, is to:

#. Prepare executables and input files.
#. Write a batch script.
#. Submit the batch script to the batch scheduler.
#. Optionally monitor the job before and during execution.

The following sections describe in detail how to create, submit, and manage jobs for execution on Baseline.

Login vs Compute Nodes on Baseline
----------------------------------

When you initially log into Baseline, you are placed on a *login* node. Login node resources are shared among all users of the system. Because of this, you should be mindful when performing tasks on a login node and, in particular, should avoid long-running, memory-intensive, or many-core tasks. Login nodes should be used for basic tasks such as file editing, code compilation, data backup, and job submission.

Login nodes should *not* be used for memory- or compute-intensive tasks. Users should also limit the number of simultaneous tasks performed on the login resources. For example, a user should not run (10) simultaneous ``tar`` processes on a login node.

.. note::
   Special attention should be given to ``make -j``, which will by default launch one task for each core on the node. You should specify a limit, such as ``make -j 4``, to limit the impact on other users of the login node.

The majority of nodes on Baseline are *compute* nodes. Compute nodes are the appropriate place for resource-intensive (long-running, memory-intensive, or many-core) tasks. Compute nodes are accessed via the Slurm Workload Manager. There are several ways to access compute nodes with Slurm: by directly running a parallel task with ``srun``, by starting an interactive batch session with ``salloc``, or by launching a batch script with ``sbatch``. These are described below.

.. _baseline_slurm:

Slurm
-----

Baseline uses the Slurm batch scheduler. This section describes submitting and managing jobs within Slurm.

Batch Scripts
^^^^^^^^^^^^^

Batch scripts, or job submission scripts, are the most common mechanism by which a user configures and submits a job for execution. A batch script is simply a shell script that also includes directives to be interpreted by the batch scheduling software (e.g. Slurm).

Batch scripts are submitted to the batch scheduler, where they are parsed for scheduling configuration options. The batch scheduler then places the script in the appropriate queue, where it is designated as a batch job. Once the batch job makes its way through the queue, the script will be executed on the compute nodes.

Components of a Batch Script
""""""""""""""""""""""""""""

Batch scripts are parsed into the following (3) sections:

**Interpreter Line**

The first line of a script can be used to specify the script's interpreter; this line is optional. If not used, the submitter's default shell will be used. The line uses the *hash-bang* syntax, i.e., ``#!/path/to/shell``.

**Slurm Submission Options**

The Slurm submission options are preceded by the string ``#SBATCH``, making them appear as comments to a shell. Slurm will look for ``#SBATCH`` options in a batch script from the script's first line through the first non-comment line. A comment line begins with ``#``. ``#SBATCH`` options entered after the first non-comment line will not be read by Slurm.

**Shell Commands**

The shell commands follow the last ``#SBATCH`` option and represent the executable content of the batch job. If any ``#SBATCH`` lines follow executable statements, they will be treated as comments only. The execution section of a script will be interpreted by a shell and can contain multiple lines of executables, shell commands, and comments. When the job's queue wait time is finished, commands within this section will be executed on the primary compute node of the job's allocated resources. Under normal circumstances, the batch job will exit the queue after the last line of the script is executed.

Example Batch Script
""""""""""""""""""""

The most common way to interact with the batch system is via batch scripts. A batch script is simply a shell script with added directives to request various resources from, or provide certain information to, the scheduling system. Aside from these directives, the batch script is simply the series of commands needed to set up and run your job.

Consider the following batch script:
.. code-block:: bash
   :linenos:

   #!/bin/bash
   #SBATCH -A ABC123
   #SBATCH -J test
   #SBATCH -o %x-%j.out
   #SBATCH -t 1:00:00
   #SBATCH -p batch
   #SBATCH -N 2
   #SBATCH --mem=500GB

   cd $SLURM_SUBMIT_DIR
   srun ...

In the script, Slurm directives are preceded by ``#SBATCH``, making them appear as comments to the shell. Slurm looks for these directives through the first non-comment, non-whitespace line. Options after that will be ignored by Slurm (and the shell).

+------+--------------------------------------------------------------------------------------------------+
| Line | Description                                                                                      |
+======+==================================================================================================+
| 1    | Shell interpreter line                                                                           |
+------+--------------------------------------------------------------------------------------------------+
| 2    | CADES project to charge                                                                          |
+------+--------------------------------------------------------------------------------------------------+
| 3    | Job name                                                                                         |
+------+--------------------------------------------------------------------------------------------------+
| 4    | Job standard output file (``%x`` will be replaced with the job name and ``%j`` with the Job ID)  |
+------+--------------------------------------------------------------------------------------------------+
| 5    | Walltime requested (in ``HH:MM:SS`` format). See the table below for other formats.              |
+------+--------------------------------------------------------------------------------------------------+
| 6    | Partition (queue) to use                                                                         |
+------+--------------------------------------------------------------------------------------------------+
| 7    | Number of compute nodes requested                                                                |
+------+--------------------------------------------------------------------------------------------------+
| 8    | Job memory                                                                                       |
+------+--------------------------------------------------------------------------------------------------+
| 9    | Blank line                                                                                       |
+------+--------------------------------------------------------------------------------------------------+
| 10   | Change into the run directory                                                                    |
+------+--------------------------------------------------------------------------------------------------+
| 11   | Run the job (replace ``...`` with your executable and any ``srun`` layout options)               |
+------+--------------------------------------------------------------------------------------------------+

Batch scripts can be submitted for execution using the ``sbatch`` command. For example, the following will submit the batch script named ``test.slurm``:

.. code::

    sbatch test.slurm

.. note::
   You must submit your batch job with the ``sbatch`` command. If you simply run it like a normal shell script (e.g. ``./test.slurm``), it will run on the login node and will not properly allocate resources on the compute nodes.

If successfully submitted, a Slurm job ID will be returned. This ID can be used to track the job. It is also helpful in troubleshooting a failed job; make a note of the job ID for each of your jobs in case you must contact `cades-help@ornl.gov` for support.
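Once a job is submitted, the returned job ID can be used to follow it through the queue with standard Slurm tools, as in the sketch below. The job ID ``1375`` is a placeholder; substitute the ID returned by ``sbatch``.

.. code::

    $ squeue -u $USER    # list your queued and running jobs
    $ squeue -j 1375     # check the state of a specific job by ID
    $ scancel 1375       # cancel the job if it is no longer needed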
Interactive Batch Jobs
^^^^^^^^^^^^^^^^^^^^^^

Batch scripts are useful when one has a pre-determined group of commands to execute, the results of which can be viewed at a later time. However, it is often necessary to run tasks on compute resources interactively.

Users are not allowed to access cluster compute nodes directly from a login node. Instead, users must use an *interactive batch job* to allocate and gain access to compute resources. This is done by using the Slurm ``salloc`` command. Other Slurm options are passed to ``salloc`` on the command line as well:

.. code::

    $ salloc -A ABC123 -p batch -N 4 -t 1:00:00

This request will:

+----------------+----------------------------------+
| ``salloc``     | Start an interactive session     |
+----------------+----------------------------------+
| ``-A ABC123``  | Charge to the ``ABC123`` project |
+----------------+----------------------------------+
| ``-p batch``   | Run in the ``batch`` partition   |
+----------------+----------------------------------+
| ``-N 4``       | Request (4) nodes...             |
+----------------+----------------------------------+
| ``-t 1:00:00`` | ...for (1) hour                  |
+----------------+----------------------------------+

After running this command, the job will wait until enough compute nodes are available, just as any other batch job must. However, once the job starts, the user will be given an interactive prompt on the primary compute node within the allocated resource pool. Commands may then be executed directly (instead of through a batch script).

Debugging
"""""""""

A common use of interactive batch jobs is to aid in debugging efforts. Interactive access to compute resources allows you to run a process to the point of failure; unlike a batch job, the process can then be restarted after brief changes are made without losing the compute resource pool, thus speeding up the debugging effort.

Choosing a Job Size
"""""""""""""""""""

Because interactive jobs must sit in the queue until enough resources become available to allocate, it is useful to know when a job can start. Use the ``sbatch --test-only`` command to see when a job of a specific size could be scheduled. For example, the snapshot below shows that a (2) node job would start at 10:54.

.. code::

    $ sbatch --test-only -N2 -t1:00:00 batch-script.slurm

    sbatch: Job 1375 to start at 2023-10-06T10:54:01 using 64 processors on nodes
    baseline[100-101] in partition batch_all

.. note::
   The queue is fluid; the given time is an estimate made from the current queue state and load. Future job submissions and job completions will alter the estimate.

Common Batch Options to Slurm
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following table summarizes frequently used options to Slurm:

+----------+--------------------------+-----------------------------------------------------------+
| Option   | Use                      | Description                                               |
+==========+==========================+===========================================================+
| ``-A``   | ``#SBATCH -A <account>`` | Causes the job time to be charged to ``<account>``.      |
|          |                          | The account string, e.g. ``abc123``, is typically        |
|          |                          | composed of three letters followed by three digits and   |
|          |                          | optionally followed by a subproject identifier. The      |
|          |                          | utility ``showproj`` can be used to list your valid      |
|          |                          | assigned project ID(s).                                  |
|          |                          | ``This option is required by all jobs.``                 |
+----------+--------------------------+-----------------------------------------------------------+
| ``-N``   | ``#SBATCH -N <value>``   | Number of compute nodes to allocate.                     |
|          |                          | Jobs cannot request partial nodes.                       |
|          |                          | ``This option is required by all jobs.``                 |
+----------+--------------------------+-----------------------------------------------------------+
| ``-t``   | ``#SBATCH -t