This lesson is being piloted (Beta version)

Storage Spaces (2025)

Overview

Teaching: 30 min
Exercises: 15 min
Questions
  • What are the types and roles of DUNE’s data volumes?

  • What are the commands and tools to handle data?

Objectives
  • Understanding the data volumes and their properties

  • Displaying volume information (total size, available size, mount point, device location)

  • Differentiating the commands to handle data between grid accessible and interactive volumes

This is an updated version of the 2023 training

Workshop Storage Spaces Video from December 2024

Introduction

There are five types of storage volumes that you will encounter at Fermilab (or CERN); they are described in the sections below.

Each has its own advantages and limitations, and knowing which one to use when isn’t always straightforward or obvious. But with some amount of foresight, you can avoid the common pitfalls that have caught out other users.

Vocabulary

What is POSIX? A volume with POSIX access (Portable Operating System Interface, see Wikipedia) allows users to directly read, write and modify files using standard commands and library calls, e.g. bash scripts or fopen(). In general, these are volumes mounted directly into the operating system.
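For example, on an interactive node a POSIX-mounted volume can be used like a local disk with ordinary shell commands. A minimal sketch (posix_test.txt is just an illustrative file name, placed in the /exp/dune/app user area referenced later in this lesson):

ls -l /exp/dune/app/users/${USER}/                          # ordinary directory listing
echo "hello" > /exp/dune/app/users/${USER}/posix_test.txt   # ordinary write
cat /exp/dune/app/users/${USER}/posix_test.txt              # ordinary read
rm /exp/dune/app/users/${USER}/posix_test.txt               # clean up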

What is meant by ‘grid accessible’? Volumes that are grid accessible require specific tool suites to handle data stored there. Grid access to a volume is NOT POSIX access. This will be explained in the following sections.

What is immutable? An immutable file cannot be modified once it has been written to the volume; it can only be read, moved, or deleted. This property is in general a restriction imposed by the storage volume on which the file is stored. Immutable volumes are not a good choice for code or other files you want to change.

Interactive storage volumes (mounted on dunegpvmXX.fnal.gov)

The home area is similar to a user’s local hard drive, but network mounted.

Locally mounted volumes are physical disks, mounted directly on the computer.

A Network Attached Storage (NAS) element behaves similarly to a locally mounted volume.

Grid-accessible storage volumes

At Fermilab, an instance of dCache+CTA is used for large-scale, distributed storage with a capacity of more than 100 PB and O(10,000) simultaneous connections. Whenever possible, these storage elements should be accessed over xrootd (see next section), as the mount points on interactive nodes are slow and unstable and can cause the node to become unusable. Here are the different dCache volumes:

Persistent dCache: data is actively available for reads at any time and will not be removed until manually deleted by the user. The persistent dCache contains three logical areas:

  1. /pnfs/dune/persistent/users, in which every user has a quota of up to 5 TB total.
  2. /pnfs/dune/persistent/physicsgroups, dedicated to DUNE physics groups and managed by the respective physics conveners of those groups. See https://wiki.dunescience.org/wiki/DUNE_Computing/Using_the_Physics_Groups_Persistent_Space_at_Fermilab for details on how to get access. In general, if you need to store more than 5 TB in persistent dCache you should be working with the physics groups areas.
  3. The “staging” area /pnfs/dune/persistent/staging, which is not accessible by regular users but is by far the largest of the three. It is used for official datasets.
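To get oriented, you can browse these areas through the NFS mount on a GPVM (directory listings are fine over NFS; just avoid heavy I/O there). A quick sketch, assuming your user directory already exists:

ls /pnfs/dune/persistent/
ls /pnfs/dune/persistent/users/${USER}/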

Scratch dCache: a large volume shared across all experiments. When a new file is written to scratch space, old files are removed in order to make room for the newer file. Removal is based on a Least Recently Used (LRU) policy and is performed by an automated daemon.

Tape-backed dCache: disk based storage areas that have their contents mirrored to permanent storage on CTA tape.

Files are not necessarily available for immediate read from disk; they may need to be ‘staged’ from tape first (see video of a tape storage robot).

Rucio Storage Elements: Rucio Storage Elements (RSEs) are storage elements provided by collaborating institutions for official DUNE datasets. Data stored in DUNE RSEs must be fully cataloged in the metacat catalog and is managed by the DUNE data management team. This is where you find the official data samples.

CVMFS: the CERN Virtual Machine File System is a centrally managed storage area that is distributed over the network and used to distribute common software and a limited set of reference files. CVMFS is mounted over the network and can be used on grid nodes, interactive nodes, and personal desktops/laptops. It is read-only and is the most common source for centrally maintained versions of experiment software libraries/executables. CVMFS is mounted at /cvmfs/ and access is POSIX-like, but read-only.
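Because CVMFS is mounted read-only at /cvmfs/, you can explore it and use it with ordinary POSIX commands, for example:

ls /cvmfs/dune.opensciencegrid.org/
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh   # DUNE software environment served from CVMFS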


Note - When reading from dCache, always use the root: syntax, not a direct /pnfs path.

The Fermilab dCache areas have NFS mounts. These are for your convenience: they allow you to look at the directory structure and, for example, remove files. However, NFS access is slow and inconsistent, and can hang the machine if I/O-heavy processes use it. Always use the xrootd root://<site>/… syntax when reading or accessing files instead of /pnfs/ directly. Once you have your DUNE environment set up, the pnfs2xrootd command can do the conversion to root: format for you (only for files at FNAL for now).
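As a sketch of the recommended pattern (the file name here is hypothetical), convert the NFS path and then copy or stream over xrootd instead of reading /pnfs directly:

XURL=$(pnfs2xrootd /pnfs/dune/scratch/users/${USER}/my_file.root)   # convert /pnfs path to a root:// URI
xrdcp "${XURL}" /tmp/my_file_copy.root                              # copy over xrootd rather than NFS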

Summary on storage spaces

Full documentation: Understanding Storage Volumes

| Volume | Quota/Space | Retention Policy | Tape Backed? | Retention Lifetime on Disk | Use For | Path | Grid Accessible |
|---|---|---|---|---|---|---|---|
| Persistent dCache | Yes (5 TB)/~400 TB/exp | Managed by User/Exp | No | Until manually deleted | Immutable files w/ long lifetime | /pnfs/dune/persistent/users | Yes |
| Persistent PhysGrp | Yes (50 TB)/~500 TB/exp | Managed by PhysGrp | No | Until manually deleted | Immutable files w/ long lifetime | /pnfs/dune/persistent/physicsgroups | Yes |
| Scratch dCache | No/no limit | LRU eviction (least recently used file deleted) | No | Varies, ~30 days (NOT guaranteed) | Immutable files w/ short lifetime | /pnfs/dune/scratch | Yes |
| Tape-backed dCache | No/O(40) PB | LRU eviction (from disk) | Yes | Approx 30 days | Long-term archive | /pnfs/dune/… | Yes |
| NAS Data | Yes (~1 TB)/62 TB total | Managed by Experiment | No | Until manually deleted | Storing final analysis samples | /exp/dune/data | No |
| NAS App | Yes (~100 GB)/~50 TB total | Managed by Experiment | No | Until manually deleted | Storing and compiling software | /exp/dune/app | No |
| Home Area (NFS mount) | Yes (~10 GB) | Centrally managed by CCD | No | Until manually deleted | Storing global environment scripts (all FNAL exp) | /nashome/<letter>/<uid> | No |
| Rucio | 25 PB | Centrally managed by DUNE | Yes | Each file has a retention policy | Official DUNE data samples | Use rucio/justIN to access | Yes |

Storage Picture

Monitoring and Usage

Remember that these volumes are not infinite; monitoring your own and the experiment’s usage of them is important for smooth access to data and simulation samples. To see your persistent usage, visit here (bottom left):

And to see the total volume usage at Rucio Storage Elements around the world:

Resource DUNE Rucio Storage

Note - do not blindly copy files from personal machines to DUNE systems.

You may have files on your personal machine that contain personal information, licensed software or (god forbid) malware or pornography. Do not transfer any files from your personal machine to DUNE machines unless they are directly related to work on DUNE. You must be fully aware of any file’s contents. We have seen it all and we do not want to.

Commands and tools

This section will teach you the main tools and commands to display storage information and access data.

ifdh

Another useful data handling command you will soon come across is ifdh. This stands for Intensity Frontier Data Handling. It is a tool suite that facilitates selecting the appropriate data transfer method from many possibilities while protecting shared resources from overload. You may see ifdhc, where c refers to client.

Note

ifdh is much more efficient than NFS file access. Please use it and/or xrdcp when accessing remote files.

Here is an example of copying a file. Refer to the Mission Setup for setting up the DUNELAR_VERSION.

Note

For now do this in the Apptainer

/cvmfs/oasis.opensciencegrid.org/mis/apptainer/current/bin/apptainer shell --shell=/bin/bash \
-B /cvmfs,/exp,/nashome,/pnfs/dune,/opt,/run/user,/etc/hostname,/etc/hosts,/etc/krb5.conf --ipc --pid \
/cvmfs/singularity.opensciencegrid.org/fermilab/fnal-dev-sl7:latest

once in the Apptainer

source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup ifdhc
export IFDH_TOKEN_ENABLE=1
ifdh cp root://fndcadoor.fnal.gov:1094/pnfs/fnal.gov/usr/dune/tape_backed/dunepro/physics/full-reconstructed/2023/mc/out1/MC_Winter2023_RITM1592444_reReco/54/05/35/65/NNBarAtm_hA_BR_dune10kt_1x2x6_54053565_607_20220331T192335Z_gen_g4_detsim_reco_65751406_0_20230125T150414Z_reReco.root /dev/null


Note: if the destination for an ifdh cp command is a directory instead of a filename with full path, you have to add the “-D” option to the command line.
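For example, the following two commands are equivalent ways to copy a hypothetical local file my_results.root into the tutorial scratch directory used in the exercise below:

ifdh cp -D my_results.root /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025
ifdh cp my_results.root /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025/my_results.root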

Prior to attempting the first exercise, please take a look at the full list of ifdh commands so that you can complete it; in particular mkdir, cp, rm, and rmdir.

Resource: ifdh commands

Exercise 1

Using the ifdh command, complete the following tasks:

  • create a directory in your dCache scratch area (/pnfs/dune/scratch/users/${USER}/) called “DUNE_tutorial_2025”
  • copy /exp/dune/app/users/${USER}/my_first_login.txt file to that directory
  • copy the my_first_login.txt file from your dCache scratch directory (i.e. DUNE_tutorial_2025) to /dev/null
  • remove the directory DUNE_tutorial_2025
  • create the directory DUNE_tutorial_2025_data_file Note, if the destination for an ifdh cp command is a directory instead of filename with full path, you have to add the “-D” option to the command line. Also, for a directory to be deleted, it must be empty.

Answer

ifdh mkdir /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025
ifdh cp -D /exp/dune/app/users/${USER}/my_first_login.txt /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025
ifdh cp /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025/my_first_login.txt /dev/null
ifdh rm /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025/my_first_login.txt
ifdh rmdir /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025
ifdh mkdir /pnfs/dune/scratch/users/${USER}/DUNE_tutorial_2025_data_file

xrootd

The eXtended ROOT daemon (XRootD) is a software framework designed for accessing data from various architectures in a fully scalable way (in both size and performance).

XRootD is most suitable for read-only data access. XRootD Man pages

Issue the following command. Look at the input and output of the command, and recognize that this is a listing of /pnfs/dune/scratch/users/${USER}/. Try to understand how the translation between an NFS path and an xrootd URI could be done by hand if you needed to do so.

xrdfs root://fndca1.fnal.gov:1094/ ls /pnfs/fnal.gov/usr/dune/scratch/users/${USER}/
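xrdfs has other handy subcommands; for example, stat reports the size and modification time of a single file (the <some_file> placeholder below is hypothetical; point it at any file in your scratch area):

xrdfs root://fndca1.fnal.gov:1094/ stat /pnfs/fnal.gov/usr/dune/scratch/users/${USER}/<some_file>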

Note that you can do

lar -c <input.fcl> <xrootd_uri> 


to stream the file into a larsoft module configured within the fhicl file. The same can be done in standalone C++ as

TFile *thefile = TFile::Open("<xrootd_uri>");

or PyROOT code as

thefile = ROOT.TFile.Open("<xrootd_uri>")

What is the right xrootd path for a file?

If a file is in /pnfs/dune/tape_backed/dunepro/protodune-sp/reco-recalibrated/2021/detector/physics/PDSPProd4/00/00/51/41/np04_raw_run005141_0003_dl9_reco1_18127219_0_20210318T104440Z_reco2_51835174_0_20211231T143346Z.root

the command

pnfs2xrootd /pnfs/dune/tape_backed/dunepro/protodune-sp/reco-recalibrated/2021/detector/physics/PDSPProd4/00/00/51/41/np04_raw_run005141_0003_dl9_reco1_18127219_0_20210318T104440Z_reco2_51835174_0_20211231T143346Z.root

will return the correct xrootd uri:

root://fndca1.fnal.gov:1094//pnfs/fnal.gov/usr/dune/tape_backed/dunepro/protodune-sp/reco-recalibrated/2021/detector/physics/PDSPProd4/00/00/51/41/np04_raw_run005141_0003_dl9_reco1_18127219_0_20210318T104440Z_reco2_51835174_0_20211231T143346Z.root

you can then

root -l <that long root: path>

to open the root file.
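One convenient pattern is to combine the two steps with command substitution, e.g. for the file above:

root -l "$(pnfs2xrootd /pnfs/dune/tape_backed/dunepro/protodune-sp/reco-recalibrated/2021/detector/physics/PDSPProd4/00/00/51/41/np04_raw_run005141_0003_dl9_reco1_18127219_0_20210318T104440Z_reco2_51835174_0_20211231T143346Z.root)"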

This even works if the file is in Europe, which you cannot do with a direct /pnfs path! (Note: not all storage elements accept tokens, so this may stop working.)

# Need to set up the root executable in the environment first...
export DUNELAR_VERSION=v09_90_01d00
export DUNELAR_QUALIFIER=e26:prof
export UPS_OVERRIDE="-H Linux64bit+3.10-2.17"
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup dunesw $DUNELAR_VERSION -q $DUNELAR_QUALIFIER

root -l root://dune.dcache.nikhef.nl:1094/pnfs/nikhef.nl/data/dune/generic/rucio/usertests/b4/03/prod_beam_p1GeV_cosmics_protodunehd_20240405T005104Z_188961_006300_g4_stage1_g4_stage2_sce_E500_detsim_reco_20240426T232530Z_rerun_reco.root

See the next episode on data management for instructions on finding files worldwide.

Note: files in /tape_backed/ may not be immediately accessible; those in /persistent/ and /scratch/ are.

Is my file available or stuck on tape?

/tape_backed/ storage at Fermilab is migrated to tape and may not be on disk. You can check this by doing the following in an AL9 window:

gfal-xattr <xrootpath> user.status

if it is on disk you get

ONLINE

if it is only on tape you get

NEARLINE

(This command doesn’t work on SL7 so use an AL9 window)
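If you want to check several files at once, a simple loop over the same command works (my_file_list.txt here is a hypothetical text file with one xrootd path per line):

while read -r xrootpath; do
  # print each path with its ONLINE/NEARLINE status
  echo "${xrootpath}: $(gfal-xattr "${xrootpath}" user.status)"
done < my_file_list.txt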

The df command

Finding out what types of volumes are available on a node can be achieved with the command df. The -h option gives human-readable sizes. It will list information about each volume (total size, available size, mount point, device location).

df -h
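The full listing can be long; to pick out the DUNE-related mounts you can filter the output (adjust the pattern as needed), for example:

df -h | grep -E 'dune|nashome|cvmfs'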

Exercise 3

From the output of the df -h command, identify:

  1. the home area
  2. the NAS storage spaces
  3. the different dCache volumes

Quiz

Question 01

Which volumes are directly accessible (POSIX) from grid worker nodes?

  1. /exp/dune/data
  2. DUNE CVMFS repository
  3. /pnfs/dune/scratch
  4. /pnfs/dune/persistent
  5. None of the Above

Answer

The correct answer is 2 - the DUNE CVMFS repository.

Question 02

Which data volume is the best location for the output of an analysis-user grid job?

  1. dCache scratch (/pnfs/dune/scratch/users/${USER}/)
  2. dCache persistent (/pnfs/dune/persistent/users/${USER}/)
  3. CTA tape (/pnfs/dune/tape_backed/users/${USER}/)
  4. user’s home area (`~${USER}`)
  5. CEPH data volume (/exp/dune/data or /exp/dune/app)

Answer

The correct answer is 1 - dCache scratch (/pnfs/dune/scratch/users/${USER}/).

Question 03

You have written a shell script that sets up your environment for both DUNE and another FNAL experiment. Where should you put it?

  1. DUNE CVMFS repository
  2. /pnfs/dune/scratch/
  3. /exp/dune/app/
  4. Your GPVM home area
  5. Your laptop home area

Answer

The correct answer is 4 - your GPVM home area.

Question 04

What is the preferred way of reading a file interactively?

  1. Read it across the nfs mount on the GPVM
  2. Download the whole file to /tmp with xrdcp
  3. Open it for streaming via xrootd
  4. None of the above

Answer

The correct answer is 3 - open it for streaming via xrootd. Use pnfs2xrootd to generate the streaming path.



Key Points

  • Home directories are centrally managed by the Computing Division and are meant to store setup scripts; do NOT store certificates here.

  • Network attached storage (NAS) /exp/dune/app is primarily for code development.

  • The NAS /exp/dune/data is for storing ntuples and small datasets.

  • dCache volumes (tape, resilient, scratch, persistent) offer large storage with various retention lifetimes.

  • The tool suites ifdh and XRootD allow access to data with the appropriate transfer method and in a scalable way.