notes
SRE
DevOps
SLI, SLO, SLA
Monitoring
Logging
Alerting
Incident Management
Postmortems
Troubleshooting
On-Call
Resilience Engineering
Data Integrity
Risks
Slurm