Linux & SLURM Common Commands

I. HPC Resource Inspection Commands — Linux & SLURM
Overview: These are essential Linux and SLURM (Simple Linux Utility for Resource Management) commands used on HPC clusters to inspect memory, CPU, GPU resources, and job status.
1. lscpu — CPU Hardware Info
lscpu
Displays CPU hardware information: core count, clock frequency, NUMA topology, cache sizes, and more.
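On a large node the full lscpu listing can be long, so it may help to filter it down to the topology fields. The sketch below is one way to do that; the helper names cpu_summary and physical_cores are hypothetical, and the field labels assume the English lscpu output found on typical x86 nodes (other architectures or locales may label fields differently).

```shell
#!/usr/bin/env bash
# Hypothetical helpers: both read lscpu output from stdin.

# Keep only the topology-related lines.
cpu_summary() {
  grep -E '^(Model name|CPU\(s\)|Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core|NUMA node\(s\)):'
}

# Physical core count = sockets x cores per socket.
physical_cores() {
  awk -F: '
    /^Socket\(s\)/          { sockets = $2 + 0 }
    /^Core\(s\) per socket/ { cores = $2 + 0 }
    END { print sockets * cores }
  '
}

# Run against the local machine only if lscpu is available.
# LC_ALL=C forces English field labels so the patterns above match.
if command -v lscpu >/dev/null 2>&1; then
  LC_ALL=C lscpu | cpu_summary
  printf 'Physical cores: %s\n' "$(LC_ALL=C lscpu | physical_cores)"
fi
```

Note that the CPU(s) line counts logical CPUs (hardware threads), which is why the physical core count is computed separately.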
2. scontrol show node ... | grep -i gres — Node GPU Resources
scontrol show node ghpc008 | grep -i gres
Shows the GPU resources (GRES, generic resources) of a node, here ghpc008: the number and type of GPUs configured on it and, where the node reports it, how many are currently allocated.
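If only the GRES value itself is needed (for example in a script), the Gres= field can be pulled out of the scontrol output. This is a minimal sketch: node_gres is a hypothetical helper, ghpc008 is the example node name used above, and the pattern assumes the usual Gres=... line in scontrol show node output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `scontrol show node` output from stdin
# and prints only the value of the Gres= field (e.g. gpu:<type>:<count>).
node_gres() {
  sed -n 's/^[[:space:]]*Gres=\([^[:space:]]*\).*/\1/p'
}

# NODE is an assumption for illustration; replace it with a node on your cluster.
NODE="${NODE:-ghpc008}"

# Only query SLURM when scontrol is actually installed.
if command -v scontrol >/dev/null 2>&1; then
  scontrol show node "$NODE" | node_gres
fi
```

For a cluster-wide view, `sinfo -N -o "%N %G"` lists each node together with its generic resources.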
3. top — Real-time Process Monitor
top
Displays a live, continuously refreshed view of running processes with their CPU and memory usage.
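The interactive view is awkward inside batch scripts or log files. As a sketch, assuming the procps version of top that ships with most Linux distributions, batch mode can capture a single snapshot instead; top_snapshot is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one non-interactive snapshot of the
# current user's processes, truncated to the first N lines (default 20).
#   -b  batch mode (plain text, no screen control)
#   -n  number of refresh iterations before exiting
#   -u  show only processes owned by the given user
top_snapshot() {
  top -b -n 1 -u "$(id -un)" | head -n "${1:-20}"
}
```

Calling `top_snapshot 15` prints the summary header plus the first few process rows, which is usually enough to see whether a job is actually using its CPUs.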
4. scontrol show job $SLURM_JOB_ID — Current Job Details
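scontrol show job $SLURM_JOB_ID
Displays detailed information about the current SLURM job, including assigned CPUs, GPUs, memory allocation, the node(s) it runs on, its state, and its elapsed run time. SLURM sets $SLURM_JOB_ID inside a job's environment, so run this from within the job (a batch script or an interactive session); elsewhere, pass the job ID explicitly.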
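The job record is printed as whitespace-separated Key=Value pairs, so individual fields can be extracted with standard tools. The following is a rough sketch: job_field is a hypothetical helper, and it assumes the value contains no spaces (true for fields such as JobState or NodeList, not necessarily for Command or JobName).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `scontrol show job` output from stdin and
# prints the value of the field named in $1 (first match only).
job_field() {
  # Put each Key=Value token on its own line, then strip the "Key=" prefix.
  tr -s '[:space:]' '\n' | sed -n "s|^$1=||p" | head -n 1
}

# Only meaningful inside a SLURM job, where SLURM_JOB_ID is set.
if [ -n "${SLURM_JOB_ID:-}" ] && command -v scontrol >/dev/null 2>&1; then
  info="$(scontrol show job "$SLURM_JOB_ID")"
  for key in JobState NodeList NumCPUs RunTime; do
    printf '%s=%s\n' "$key" "$(printf '%s\n' "$info" | job_field "$key")"
  done
fi
```

Because the match is anchored at the start of each token, asking for NodeList does not accidentally pick up ReqNodeList or ExcNodeList.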
💡 One-line Takeaway
Use lscpu for CPU specs, scontrol show node ... | grep -i gres for GPU inventory, top for live process monitoring, and scontrol show job $SLURM_JOB_ID to inspect your running SLURM job's resource allocation.
https://lxy-alexander.github.io/blog/posts/tools/linuxslurm-common-command/