Sylvester HPC QuickStart Guide

The purpose of this document is to provide members of the Sylvester Comprehensive Cancer Center community with the basic tools necessary to access the High-Performance Computing (HPC) resources at the UM Frost Institute for Data Science and Computing (IDSC). This guide is a starting point for users new to Pegasus and is intended as a reference to supplement the onboarding training required for all users.

How do I get started?

After reviewing our Policies, you will need a Sylvester Project Allocation; to obtain one, submit a Sylvester Project Allocation Request.

To join an existing Project, submit an IDSC Account Request with the Project ID.

If you are new to Pegasus, please review our Onboarding Training Material.

If you’re new to Linux, please review our Linux Training Material.

Sylvester Utilization

For access to the Sylvester Utilization Dashboard, please contact hpc@ccs.miami.edu.

Sylvester Dedicated HPC Queues

Sylvester's dedicated HPC resources are allocated through the IDSC Service Unit (SU) pricing model and a tiered subscription approach. This increases the performance and utilization of the dedicated resources while maintaining high availability for the labs that have financially contributed to them.

HPC Storage

  • Each user may utilize up to 250 GB (50,000 files) of GPFS home space.

  • Each project may utilize up to 2 TB (400,000 files) of GPFS scratch space.

  • Each project may utilize up to 2 TB (400,000 files) of GPFS Sylvester project space.

  • Users and projects over quota will not be able to create new files (see the sketch after this list for a quick way to check current usage).

  • Scratch and Project space is intended only for data in active use.

  • There are no IDSC managed backups of GPFS Scratch or Project space.

  • Scratch space is subject to purging when necessary for continued operation.

  • Scratch space is charged only for actual utilization.

  • Projects may lease additional GPFS project space at $500 per 10 TB (2,000,000 files) per year.

  • Dedicated space is charged for total allocation and not by utilization.
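
The quotas above are enforced on both space and file count. As a quick self-check, you can measure a directory's usage directly; this is a minimal sketch in which the paths are placeholders, and the availability of the standard GPFS mmlsquota client on the login nodes is an assumption (confirm with hpc@ccs.miami.edu).

    # Space used by a directory (replace the placeholder path with your
    # actual home, scratch, or project directory).
    du -sh /path/to/your/project

    # File count, to compare against the file-count limits above.
    find /path/to/your/project -type f | wc -l

    # If the standard GPFS client tools are exposed to users (assumption),
    # a formal quota report is available with:
    mmlsquota --block-size auto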

Tier 1 Development (sccc-dev)

Open to all Sylvester Labs, subject to resource availability; requires the Lab PI's approval. A sample interactive session is sketched after the list below.

- For users new to HPC, such as recipients of the "IDSC Early Career Research Grant"
- Job priority: significantly lower than Tier 2
- Jobs become preemptible after 1 hour of runtime
- Max Job Runtime: 8 hours
- Max Cores (per job and concurrent): 16 cores per job, 32 cores per project
- Max Memory: 128GB RAM per job, 512GB RAM per project (1 node)
- Scratch Storage: Local 1TB NVMe (optimal) and Pegasus HPC Scratch (PI responsible)
- Project Storage: Pegasus HPC HOME (250GB) and Pegasus HPC Storage (PI responsible)
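
As referenced above, here is a minimal sketch of requesting an interactive development session on sccc-dev. It assumes the LSF scheduler and its standard bsub options; the project ID is a placeholder, and any site-specific resource or memory flags should be taken from the onboarding material.

    # Interactive shell on the development queue: 4 cores on one node for up
    # to 2 hours, charged to your Sylvester project (placeholder ID).
    bsub -Is \
         -q sccc-dev \
         -P my_sylvester_project \
         -n 4 \
         -R "span[hosts=1]" \
         -W 2:00 \
         bash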

Tier 2 General (sccc)

Open to all Sylvester Labs, subject to resource availability; requires the Lab PI's approval. A sample batch submission is sketched after the list below.

- Job priority: significantly lower than Tier 3
- Jobs become preemptible after 1 hour of runtime
- Max Job Runtime: 48 hours
- Max Cores (per job and concurrent): 64 cores per job (same node), no limit per project
- Max Memory: 512GB RAM per job, no limit per project
- Scratch Storage: Local 1TB NVMe (optimal) and Pegasus HPC Scratch (PI responsible)
- Project Storage: Pegasus HPC HOME (250GB) and Pegasus HPC Storage (PI responsible)
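
For example, a batch submission within the Tier 2 limits above might look like the sketch below. It assumes the LSF scheduler and its standard #BSUB directives; the job name, project ID, and workload script are placeholders. The same pattern applies to the other batch queues (e.g. sccc-bigmem), with the queue name and limits adjusted.

    #!/bin/bash
    #BSUB -J my_analysis              # job name (placeholder)
    #BSUB -q sccc                     # Tier 2 general Sylvester queue
    #BSUB -P my_sylvester_project     # Sylvester project ID (placeholder)
    #BSUB -n 16                       # cores; must stay within the 64-core limit
    #BSUB -R "span[hosts=1]"          # keep all cores on a single node
    #BSUB -W 24:00                    # wall time, under the 48-hour queue limit
    #BSUB -o my_analysis.%J.out       # stdout log (%J expands to the job ID)
    #BSUB -e my_analysis.%J.err       # stderr log

    # Workload goes here; keep working data on local NVMe or Pegasus scratch,
    # per the storage guidance above.
    ./run_my_analysis.sh

Submit with: bsub < my_analysis.lsf (the filename is a placeholder; the input redirection lets LSF read the embedded #BSUB directives).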

Tier 3 Big Memory (sccc-bigmem)

Open to all Sylvester Labs, subject to resource availability; requires the Lab PI's approval.

- Job priority: significantly lower than Tier 4
- Jobs become preemptible after 1 hour of runtime
- Max Job Runtime: 96 hours
- Max Cores (per job and concurrent): 64 cores per job (same node), 1 job per project
- Max Memory: 4TB per job and per project (each of the two nodes has 4TB of RAM)
- Scratch Storage: Local 1TB NVMe (optimal) and Pegasus HPC Scratch (PI responsible)
- Project Storage: Pegasus HPC HOME (250GB) and Pegasus HPC Storage (PI responsible)

Tier 3 GPU Queue (sccc-gpu)

Open to all Sylvester Labs, subject to resource availability; requires the Lab PI's approval. A sample GPU submission is sketched after the list below.

- Job priority: significantly lower than Tier 4
- Jobs become preemptible after 1 hour of runtime
- Max Job Runtime: 96 hours
- Access to Sylvester GPUs (1 Nvidia A100 per node)
- Max Cores (per job and concurrent): 4 cores per job (same node), 1 job per project
- Max Memory: 512GB per job and per project
- Scratch Storage: Local 1TB NVMe (optimal) and Pegasus HPC Scratch (PI responsible)
- Project Storage: Pegasus HPC HOME (250GB) and Pegasus HPC Storage (PI responsible)
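
As referenced above, a GPU submission to sccc-gpu might look like the sketch below. It assumes the LSF scheduler; whether GPUs are requested with LSF's -gpu resource string on this cluster is an assumption, and the job name, project ID, and workload script are placeholders (check the onboarding material for the site's exact GPU request syntax).

    #!/bin/bash
    #BSUB -J gpu_job                  # job name (placeholder)
    #BSUB -q sccc-gpu                 # Sylvester GPU queue
    #BSUB -P my_sylvester_project     # Sylvester project ID (placeholder)
    #BSUB -n 4                        # queue maximum of 4 cores per job
    #BSUB -W 24:00                    # within the 96-hour queue limit
    #BSUB -gpu "num=1"                # request the node's single A100
                                      # (assumes LSF's -gpu syntax is enabled)
    #BSUB -o gpu_job.%J.out
    #BSUB -e gpu_job.%J.err

    # Confirm the GPU is visible before launching the workload.
    nvidia-smi
    ./run_gpu_workload.sh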

Tier 4 Premium (sccc-premium, sccc-bigmem-premium, sccc-gpu-premium)

Reserved for Sylvester Labs that have purchased dedicated resources.

- Job priority: shared only with other Tier 4 jobs
- Job limitations: set by the Lab's PI
- Scratch Storage: Local 1TB NVMe (optimal) and Pegasus HPC Scratch (PI responsible)
- Project Storage: Pegasus HPC HOME (250GB), Sylvester HPC Storage, and Pegasus HPC Storage (PI responsible)

What are the specs of the available HPC nodes?

Triton (96 nodes)

OS:   CentOS 7.9, ppc64le
CPU:  2 x IBM Power9 (40 cores/node, 3840 cores total)
RAM:  16 x 16GiB RDIMM DDR4 2666MHz ECC (256GiB/node, 6.4GB/core)
GPU:  2 x Nvidia V100-SXM2 (16GB GPU RAM)
NET:  100Gbps Infiniband (IB), 1Gbps Ethernet
Disk: 2 x 1.92TB Micron 5100PRO SSD (RAID1, 1080MBps/1040MBps Seq Read/Write, 186K/74K IOPS)

Pegasus Compute (350 nodes)

OS:   CentOS 7.6, x86_64
CPU:  16c/node (for a total of 4800 CPU-cores)
RAM:  64GiB nodes (4GiB/core, for a total of 22400GiB)
RAM:  256GiB nodes (16GiB/core, for a total of 4096GiB)
NET:  56Gbps Infiniband, 1 Gbps Ethernet
Disk: Stateless (ramdisk)

Pegasus Sylvester Dedicated Compute (16 nodes)

OS:   CentOS 7.9, x86_64
CPU:  2 x Intel Xeon Gold 6338 CPU @ 2.00GHz (64 cores/node, 1024 cores total)
RAM:  16 x 32GiB RDIMM DDR4 3200MHz ECC (512GiB/node, 8192GiB RAM total)
NET:  100Gbps Infiniband, 10Gbps Ethernet
Disk: 960GiB Samsung PM9A3 NVMe (6500MBps/1500MBps Seq Read/Write, 580K/70K IOPS)

Pegasus Sylvester Dedicated Big Memory GPU (2 nodes)

OS:   CentOS 7.9, x86_64
CPU:  2 x Intel Xeon Gold 6338 CPU @ 2.00GHz (64 cores/node, 128 cores total)
RAM:  32 x 128GiB RDIMM DDR4 3200MHz ECC (4096GiB/node, 8192GiB RAM total)
GPU:  1 x Nvidia A100 (80GB GPU RAM)
NET:  100Gbps Infiniband, 10Gbps Ethernet
Disk: 960GiB Samsung PM9A3 NVMe (6500MBps/1500MBps Seq Read/Write, 580K/70K IOPS)

How do I reset my IDSC password?

Via the IDSC Password Management tool. You will need to be connected to the University’s Secure Network to access this tool and all Sylvester HPC Resources.

How do I access the Secure Network remotely?

Via the University of Miami’s VPN.

How do I run Nextflow on Sylvester HPC resources?

See Running Nextflow (nf-core/sarek); a minimal submission sketch follows.
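
The sketch below assumes the LSF scheduler, that Nextflow and Singularity are provided as environment modules, and that you have a valid nf-core samplesheet; the module names, project ID, samplesheet path, and output directory are placeholders. Consult the nf-core/sarek documentation for the authoritative pipeline parameters.

    #!/bin/bash
    #BSUB -J sarek_driver             # Nextflow head job (placeholder name)
    #BSUB -q sccc                     # run the driver on the Tier 2 general queue
    #BSUB -P my_sylvester_project     # Sylvester project ID (placeholder)
    #BSUB -n 2
    #BSUB -W 48:00
    #BSUB -o sarek.%J.out
    #BSUB -e sarek.%J.err

    # Load Nextflow and Singularity if provided as modules
    # (module names are assumptions; check `module avail`).
    module load nextflow singularity

    # Launch nf-core/sarek; --input and --outdir are standard sarek parameters,
    # but the samplesheet and results paths here are placeholders.
    nextflow run nf-core/sarek \
        -profile singularity \
        --input samplesheet.csv \
        --outdir results

To have Nextflow submit individual pipeline tasks to the Sylvester queues instead of running them inside the driver job, the LSF executor can be set in a Nextflow configuration file (for example, process.executor = 'lsf' and process.queue = 'sccc'); see the Nextflow documentation for details.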