Using Scratch Space on Tempest
If your job performs a large amount of file I/O (reading and writing files), using the scratch directory can save both space and time. The tmp_scratch directory is created at the start of each job and deleted when the job ends, giving you a temporary location for large numbers of files that you may not need after the job completes.
Using Scratch Space in Batch Script
The scratch directory is created automatically when a job is submitted, but its size can be changed with the --tmp option in your batch script. The following line shows the sbatch directive that sets this size to 1G.
#SBATCH --tmp=1G # Sets size of scratch directory to 1G
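To confirm how much space is actually available once the job starts, something like the following sketch could be placed near the top of the batch script. It assumes the standard df and awk utilities are present on the compute node, and it falls back to /tmp when $tmp_scratch is unset so the snippet can be tried outside a job. The scratch_dir and avail_kb names are illustrative only.

```shell
#!/bin/bash
# Use the job's scratch directory; fall back to /tmp outside a job (assumption).
scratch_dir="${tmp_scratch:-/tmp}"

# df -P prints one line per filesystem; column 4 is the available space in KB.
avail_kb="$(df -Pk "$scratch_dir" | awk 'NR==2 {print $4}')"

echo "Available in $scratch_dir: ${avail_kb} KB"
```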
The path to the directory is stored in the tmp_scratch variable, which you can access as $tmp_scratch in your batch script. You can then copy files into and out of this directory to make use of it. The following code prints the location of the scratch directory, copies a file into it, and then copies a file out of it.
echo "$tmp_scratch" # Print the tmp_scratch path
cp someFile.jl "$tmp_scratch" # Copy a file into tmp scratch
cd "$tmp_scratch" # Move into tmp scratch
cp anotherFile.jl ~/anotherFile.jl # Copy a file from tmp scratch to your home directory
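A slightly more defensive variant of the same steps is sketched below. It checks that the scratch directory exists before copying anything into it. The fallback to mktemp, the scratch_dir and input_file names, and the generated input file are assumptions added only so the snippet can run outside a job; inside a job you would copy your own files instead.

```shell
#!/bin/bash
# Use the job's scratch directory; fall back to a throwaway directory
# when $tmp_scratch is unset (assumption, for trying this outside a job).
scratch_dir="${tmp_scratch:-$(mktemp -d)}"

# Stop early if the scratch directory is missing.
if [ ! -d "$scratch_dir" ]; then
    echo "Scratch directory not found: $scratch_dir" >&2
    exit 1
fi

# Hypothetical input file, created here only to keep the example self-contained.
input_file="$(mktemp)"
echo "example data" > "$input_file"

cp "$input_file" "$scratch_dir/input.txt"   # Copy the file into scratch
ls -l "$scratch_dir"                        # Confirm that it arrived
```

Quoting "$scratch_dir" keeps the commands working even if the path contains spaces.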
The following batch script runs a Julia script that generates 1000 matrices and saves them as individual files. It then loads each one, determines which has the highest mean, and saves that matrix. Without the scratch directory, this would create 1001 new files in your directory that you would have to clean up manually. With the scratch directory, the only file left behind is the single resulting matrix.
#!/bin/bash
##
## Lines starting with #SBATCH are read by Slurm. Lines starting with ## are comments.
## All other lines are read by the shell.
##
## Basic parameters
##
#SBATCH --account=group-rci # specify the account to use if using a priority partition
#SBATCH --partition=nextgen # queue partition to run the job in
#SBATCH --cpus-per-task=64 # number of cores to allocate
#SBATCH --mem-per-cpu=1000 # amount of memory allocated per core, in MB
#SBATCH --time=0-24:00:00 # maximum job run time in days-hours:minutes:seconds
##SBATCH --tmp=16G # size of scratch directory (disabled by the leading ##; remove one # to enable)
#SBATCH --job-name=exampleWithScratch # job name
#SBATCH --output=%x.out # standard output from job
#SBATCH --error=%x.err # standard error from job
echo "$tmp_scratch" # Display the path to the tmp_scratch directory
cp generateFiles.jl "$tmp_scratch" # Copy the scripts into the scratch directory
cp processFiles.jl "$tmp_scratch"
cd "$tmp_scratch" # Move into scratch directory
module load Julia # Load Julia and run scripts
julia generateFiles.jl
julia processFiles.jl
mv results ~/examples/tmp-scratch-example/withScratch/ # Move results back to the original directory
Important Notes
The scratch directory is much faster for reading and writing files because the data does not have to travel over the network. If you have several large files or many small files, using the scratch directory can speed up your programs considerably.
The scratch directory and all files inside it are removed once the job completes. To keep any files created or modified in the scratch directory, move them to another directory before the job ends. You can do this by saving the files to a specific file path in the code you are executing or by moving them at the end of the batch script.
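One way to reduce the risk of losing results is to register the copy-back step with a shell trap, so it also runs if the script exits early because of an error. The sketch below is a minimal illustration, not a Tempest-specific recipe. The save_results function, the dest_dir variable, and the results/answer.txt file are hypothetical names, and the mktemp fallbacks exist only so the snippet can run outside a job. The trap may not run if the job is killed outright, for example when it reaches its time limit.

```shell
#!/bin/bash
# Use the job's scratch directory; fall back to a throwaway one outside a job.
scratch_dir="${tmp_scratch:-$(mktemp -d)}"

# Hypothetical destination; in a real job this would be a directory
# in your home or project space.
dest_dir="$(mktemp -d)"

# Copy the results directory out of scratch if it exists.
save_results() {
    if [ -d "$scratch_dir/results" ]; then
        cp -r "$scratch_dir/results" "$dest_dir/"
    fi
}

# Run save_results whenever the script exits, normally or after an error.
trap save_results EXIT

cd "$scratch_dir" || exit 1
mkdir -p results
echo "42" > results/answer.txt   # Stand-in for the real computation
```

Using cp rather than mv in the trap leaves the scratch copy in place until the job ends, so a failed copy does not destroy the only copy of the results.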