
clusterpilot
Stop writing SLURM scripts
by hand.
ClusterPilot is a keyboard-driven TUI that turns a plain-English job description into a correct, code- and cluster-aware SLURM script – then uploads, submits, monitors, and syncs results back. It supports Compute Canada, university clusters, NSF ACCESS, ARCHER2, EuroHPC, and any standard SLURM environment. Built by a PhD student who got tired of doing this by hand.
How it works
Six steps.
Zero SSH copy-paste.
Choose a cluster
Add as many clusters as you have access to – each gets its own profile in the config file and appears in the dropdown. Select your target cluster. Partition options, GPU flags, account names, and scratch paths load automatically.
Pick your job type
Point to a self-contained script (Julia, Python, or another language), or to a full package with a driver script. For Julia packages, ClusterPilot reads your Project.toml and Manifest.toml to wire up the environment.
Describe your job
Type what your job does in plain English. The script is generated from both your description and the actual code – cluster constraints, environment setup, and all.
Review and submit
A complete SLURM script appears. Edit inline if needed, then upload and submit in one keypress over your existing SSH ControlMaster socket.
Monitor passively
A background daemon polls squeue at a configurable interval (default 5 minutes). Get push notifications on your phone when jobs start, finish, or fail.
Results synced back
On completion, output files are rsynced to your local project directory. Source code is skipped. Only results come back.
Features
Everything in the loop.
Nothing you didn’t ask for.
01 –
Code-aware script generation
The SLURM script is generated from both your plain-English description and the actual job code – cluster constraints, GPU flags, and environment setup all included. For Julia packages, it reads your Project.toml and Manifest.toml and writes the correct Pkg.instantiate() calls. For self-contained scripts, it infers dependencies and handles module loading automatically.
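As a rough illustration, a generated script for a Julia package might look like the sketch below. The account, partition, module version, and the driver.jl filename are placeholders, not ClusterPilot's actual output – the real script depends on your cluster profile and code:

```shell
#!/bin/bash
#SBATCH --account=def-someuser       # account name from the cluster profile (placeholder)
#SBATCH --partition=gpu
#SBATCH --gpus-per-node=1
#SBATCH --time=12:00:00
#SBATCH --mem=16G

module load julia/1.10               # module name inferred for the target cluster

# Instantiate the environment recorded in Project.toml / Manifest.toml
julia --project=. -e 'using Pkg; Pkg.instantiate()'
julia --project=. driver.jl          # driver script named in the submission UI
```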
02 –
One-keypress submit
Files are rsynced and sbatch runs over your existing SSH ControlMaster socket. No new SSH sessions, no config changes to ~/.ssh/config.
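ControlMaster is a stock OpenSSH feature; a typical ~/.ssh/config block that ClusterPilot can piggyback on looks like this (the host alias and hostname are made up):

```shell
# ~/.ssh/config — connection multiplexing over one persistent socket
Host mycluster                       # hypothetical alias
    HostName login.cluster.example.edu
    User yourusername
    ControlMaster auto               # first connection opens the master socket
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 8h                # keep the socket alive after you log out
```

Create the socket directory once with `mkdir -p ~/.ssh/sockets`; subsequent ssh, rsync, and scp calls to that host then reuse the open connection instead of re-authenticating.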
03 –
Passive monitoring
Two modes depending on how you work. Keep the TUI open and the job list refreshes automatically every 10 seconds – tail logs live, check status at a glance. Or close the lid entirely: the background daemon polls squeue every 5 minutes over a persistent SSH connection, notifies you on job start, completion, and failure, and syncs results when done. No babysitting required either way.
04 –
Push notifications
Optional ntfy.sh integration sends job start, completion, failure, and walltime warnings straight to your phone. No account required.
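ntfy.sh publishing is a plain HTTP POST, so you can test your topic from a terminal before wiring it up (the topic name below is made up; pick your own hard-to-guess one):

```shell
# Send a test message to a topic you subscribe to in the ntfy mobile app
curl -d "Job 1234567 (production_run) completed" ntfy.sh/my-jobs-a8f3
```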
05 –
Sync results + cluster cleanup
On completion, output files sync back to your local project directory. Source is skipped. Your data is already there when you open the lid. When results are synced, optionally clean up the job directory on the cluster to reclaim scratch space without SSH-ing in manually.
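The sync-back step behaves roughly like the rsync call below. The host alias, paths, and exclude patterns are illustrative, not ClusterPilot's exact flags:

```shell
# Pull outputs from the remote job directory, skipping source files
rsync -avz \
  --exclude='*.jl' --exclude='*.py' --exclude='Project.toml' \
  mycluster:scratch/jobs/1234567/ ./results/1234567/
```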
06 –
Job arrays
Submit parametric sweeps as SLURM job arrays directly from the submission UI. Just tell the AI what you want – “run this over L = 4, 6, 8, 10” – and it generates the correct #SBATCH --array directive, maps the index to your parameter, and handles the rest. No manual array scripting required.
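Under the hood, that sweep maps onto a standard SLURM array. A minimal hand-written equivalent might look like this (the index-to-L mapping and driver name are illustrative, not what ClusterPilot emits verbatim):

```shell
#!/bin/bash
#SBATCH --array=0-3                  # four tasks, one per value of L

# Map array index 0..3 to the swept parameter L = 4, 6, 8, 10
: "${SLURM_ARRAY_TASK_ID:=0}"        # default to 0 when run outside SLURM
L=$((4 + 2 * SLURM_ARRAY_TASK_ID))

echo "array task $SLURM_ARRAY_TASK_ID running with L=$L"
# julia --project=. driver.jl --L "$L"   # then launch the driver (placeholder name)
```

Each array task sees its own `$SLURM_ARRAY_TASK_ID`, so one script serves the whole sweep.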
Who it’s for
Built for researchers,
not sysadmins...
If you spend more time formatting #SBATCH directives than thinking about your science, ClusterPilot is for you. Works with any standard SLURM cluster – Compute Canada, NSF ACCESS, ARCHER2, EuroHPC, and most university systems.
Computational researchers running GPU simulations
Monte Carlo, MD, DFT, ML training – if it runs on a V100, it fits.
Students new to HPC clusters
Stop copying SLURM scripts from Stack Overflow and hoping they work.
Anyone submitting multiple jobs a day
Each submission gets its own isolated directory. No more overwriting results.
Researchers tired of cluster-specific gotchas
GPU syntax, account flags, scratch paths, module names – all handled automatically. You describe the job, ClusterPilot handles the rest.
pricing
Free forever for self-hosters.
Hosted tier coming soon...
Current
Self-hosted
Free
forever
Bring your own AI provider API key. Full functionality, no limitations. MIT licence – use it, fork it, extend it.
Full TUI – submit, monitor, sync
AI script generation (your API key)
SSH ControlMaster integration
Push notifications via ntfy.sh
Background daemon + systemd service
Open source, MIT licence
Coming soon
Hosted
TBD
/ month
Zero setup. Managed API key, web dashboard, and priority support. For researchers who want it to just work.
Everything in Self-hosted
Managed API key
Web dashboard for job history
Priority support
Team access (share clusters)
Hosted tier
Zero setup.
Coming soon.
If you’d rather not manage an API key or run a daemon yourself, the hosted tier is for you. Managed API key, web dashboard, nothing to configure. I’m validating demand before building it. If enough researchers sign up, it gets built.
No spam. Just one email when it launches.
Get started
Up and running
in two commands.
Install from PyPI, run once to generate a starter config, fill in your cluster username and API key, and you’re submitting jobs.
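In concrete terms, the two commands are likely of this shape (the package and command names here are hypothetical; check the project README for the real ones):

```shell
pip install clusterpilot     # 1. install from PyPI (hypothetical package name)
clusterpilot                 # 2. first run writes a starter config to edit
```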
Four years of
manual SLURM.
Never again.
ClusterPilot is free, open source, and installable in 30 seconds. If it saves you time, consider sponsoring development.