IT-SDK-SRE

From wiki.samerhijazi.net
Revision as of 00:46, 27 July 2021 by Fiducia (talk | contribs) (SRE Toolchain)

Ref

Notes

  • SRE focuses on improving software system reliability across key categories including availability, performance, latency, efficiency, capacity, and incident response.
  • Reliability is measured with service-level indicators (SLIs) and targeted with service-level objectives (SLOs).
  • Uptime: "five nines" or 99.999%, just over five minutes (about 5.26) of downtime per year.
  • Uptime: "four nines" or 99.99%, nearly an hour (about 52.6 minutes) of downtime per year.
  • Dynatrace is both an application performance monitoring (APM) and application management tool. It can be used as a cloud-based SaaS offering or installed on-premises.
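
The "nines" figures in the notes above follow from simple arithmetic: the yearly downtime budget is the fraction of the year the service is allowed to be unavailable. A minimal sketch of that calculation, assuming a 365-day year; the function name allowed_downtime_minutes is only illustrative and does not come from any tool listed on this page:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The unavailable fraction of the year, converted to minutes.
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes/year")
```

The same function can be used for any other target (for example 99.9%) to see how quickly the budget shrinks with each added nine.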

SRE Toolchain

Containers for Microservices

  • Docker
  • Kubernetes
  • Swarm
  • Apache Mesos
  • Podman

Source Control Tools

  • Git

CI/CD Tools

  • Jenkins
  • CircleCI
  • GitLab
  • GoCD
  • Semaphore
  • Concourse

Data Storage Tools

  • MySQL
  • PostgreSQL
  • MongoDB
  • Apache Hadoop
  • Apache Hive
  • Amazon Aurora (MySQL and PostgreSQL-compatible)
  • MariaDB (fork from MySQL)

Configuration Management Tools

  • Ansible
  • Chef
  • Puppet
  • SaltStack

Monitoring and Observability Tools

Metrics Collection Tools

  • Prometheus
  • Stackdriver (Google Cloud Operations)
  • InfluxDB
  • Sensu Go

Log Aggregation Tools

  • Fluentd
  • Sentry
  • Logstash

Distributed Tracing Tools

  • OpenTelemetry
  • Jaeger

Application Performance Monitoring Tools

  • AppDynamics
  • New Relic
  • Dynatrace

Dashboarding Tools

  • Grafana
  • Stashboard
  • Redash
  • Metabase

Incident Management

  • PagerDuty
  • Opsgenie
  • Squadcast

NEW-Work

  • AWS, Azure, Concourse, Jenkins, Amazon Aurora, Dynatrace, New Relic, Elasticsearch, Kibana

Notes

Dynatrace