xRunBooks for SRE
AWS : Attaching lifecycle policies to S3 buckets automates the management of object lifecycles. A lifecycle policy defines rules that act on objects based on their age or other criteria: transitioning them to different storage classes (for example, moving infrequently accessed data to lower-cost tiers or archiving it to Glacier) and setting expiration dates. This optimizes storage costs by automatically moving data to the most cost-effective tier, and it helps you manage data retention and comply with regulatory requirements or business policies on data expiration. This runbook finds all buckets without a lifecycle policy and attaches one to them.
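As a rough illustration of the kind of rules described above, the sketch below builds a lifecycle configuration in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, the day thresholds, and the chosen storage classes are assumptions made for this example, not the values the runbook itself applies.

```python
def build_lifecycle_configuration(transition_days=30, archive_days=90, expire_days=365):
    """Build a lifecycle configuration dict (thresholds are illustrative)."""
    if not (0 < transition_days < archive_days < expire_days):
        raise ValueError("thresholds must be positive and strictly increasing")
    return {
        "Rules": [
            {
                "ID": "example-tiering-and-expiry",  # hypothetical rule name
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                "Status": "Enabled",
                "Transitions": [
                    # Move infrequently accessed data to a cheaper tier ...
                    {"Days": transition_days, "StorageClass": "STANDARD_IA"},
                    # ... and archive it later.
                    {"Days": archive_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},
            }
        ]
    }


def attach_lifecycle_policy(bucket_name, configuration):
    """Attach the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=configuration
    )


config = build_lifecycle_configuration()
print(config["Rules"][0]["Transitions"])
```

Calling `attach_lifecycle_policy("my-bucket", config)` (bucket name hypothetical) would apply the rule; it is kept separate so the configuration can be reviewed before anything is changed.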
AWS : This runbook can be used to change the type of an EBS volume to gp3 (General Purpose SSD). gp3 volumes have a number of advantages over their predecessors and are ideal for a wide variety of applications that require high performance at low cost.
AWS : For a record in a hosted zone, a lower TTL means that more queries arrive at the name servers because cached values expire sooner. With a higher TTL, intermediate resolvers cache the records for longer, so the name servers receive fewer queries, which reduces the charges for DNS queries answered. However, a higher TTL slows the propagation of record changes because the previous values stay cached for longer. This runbook can be used to configure a higher TTL value.
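A back-of-the-envelope sketch of this trade-off, under the simplifying assumption that each caching resolver re-queries a record at most once per TTL interval. The resolver count and the TTL values are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def max_daily_queries(ttl_seconds, resolver_count):
    """Upper bound on daily queries for one record, assuming every
    resolver refreshes its cached copy once per TTL interval."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return resolver_count * (SECONDS_PER_DAY // ttl_seconds)


# Compare a few candidate TTLs for 1000 (hypothetical) resolvers.
for ttl in (60, 300, 3600):
    print(ttl, max_daily_queries(ttl, resolver_count=1000))
```

Under this model, raising the TTL from 60 to 3600 seconds cuts the worst-case query volume by a factor of 60, at the price of changes taking up to an hour to reach every resolver.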
AWS : Creates a new IAM user with a security policy and sends a confirmation to Slack.
AWS : EBS (Elastic Block Store) volumes are attached to EC2 instances as storage devices. Unused (unattached) EBS volumes can keep accruing costs even when their associated EC2 instances are no longer running. These volumes should be deleted if the instances they were attached to are no longer required. This runbook helps find such volumes and delete them.
AWS : This runbook helps identify low-usage Amazon Elastic Block Store (EBS) volumes and delete them to lower your AWS bill. Usage is calculated using the VolumeUsage metric, which measures the percentage of an EBS volume's total storage space that is currently in use, reported as a value between 0 and 100.
AWS : Amazon ECS is a managed service that lets users run Docker containers on AWS, making it easier to manage and scale containerized applications. However, running ECS clusters with low CPU utilization can result in wasted resources and unnecessary costs, because AWS charges for the resources allocated to a cluster regardless of whether they are fully utilized. Deleting underutilized clusters reduces the resources being allocated and lowers the overall cost of running ECS. It can also free up resources for other applications that require more processing power. This runbook helps identify such clusters and delete them.
AWS : ELBs distribute incoming traffic across multiple targets or instances, but if those targets or instances are no longer in use, the ELBs may be unnecessary and can be deleted to save costs. Identifying and removing these unused ELBs is a simple but effective way to reduce the number of resources you pay for and avoid unnecessary charges. This runbook helps you identify ELBs of all types (Network, Application, and Classic) that don't have any target groups or instances attached to them.
AWS : This runbook is the inverse of Create IAM user with profile: it removes the profile, the login, and then the IAM user itself.
AWS : Amazon Elastic Block Store (EBS) snapshots are created incrementally: the initial snapshot includes all the data on the disk, and subsequent snapshots store only the blocks on the volume that have changed since the prior snapshot. Unchanged data is not stored again but is referenced from the previous snapshot. This runbook helps find old EBS snapshots and thereby lower storage costs.
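A minimal sketch of how old snapshots might be selected, assuming snapshot records shaped like the entries boto3's `describe_snapshots` returns (a `SnapshotId` and a timezone-aware `StartTime`). The age threshold and the sample records are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def find_old_snapshots(snapshots, max_age_days, now=None):
    """Return the IDs of snapshots started more than max_age_days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


# Hypothetical sample data in the shape of describe_snapshots()["Snapshots"].
reference_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
sample = [
    {"SnapshotId": "snap-old", "StartTime": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-new", "StartTime": datetime(2024, 5, 25, tzinfo=timezone.utc)},
]
old_ids = find_old_snapshots(sample, max_age_days=30, now=reference_time)
print(old_ids)
```

Because snapshots are incremental, deleting an old snapshot only reclaims the blocks that no later snapshot still references, so the savings depend on how much data has changed since it was taken.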
AWS : Deleting RDS instances with consistently low CPU utilization is a cost optimization strategy that eliminates the expense of running idle database instances. This runbook helps find and delete such instances.
AWS : This runbook can be used to delete all unattached EBS Volumes within an AWS region. You can delete an Amazon EBS volume that you no longer need. After deletion, its data is gone and the volume can't be attached to any instance. So before deletion, you can store a snapshot of the volume, which you can use to re-create the volume later.
AWS : This runbook can be used to delete unused secrets in AWS.
AWS : CloudWatch retains empty log streams after the data retention period has passed. Those log streams should be deleted in order to save costs. This runbook finds log streams that have been unused for more than a threshold number of days and helps you delete them.
AWS : This runbook searches for unused NAT gateways across all regions and deletes them.
AWS : When you associate health checks with an endpoint, Amazon Route 53 sends health check requests to the endpoint IP address to validate that it is operating as intended. Health checks can end up unused for several reasons, for example: a health check was mistakenly configured against your application by another customer; a health check was configured from your account for testing purposes but wasn't deleted when testing was complete; a health check was based on domain names, so requests were sent due to DNS caching; or the Elastic Load Balancing service updated its public IP addresses due to scaling and the IP addresses were reassigned to your load balancer. This runbook finds such health checks and deletes them to save AWS costs.
AWS : This runbook can be used to detach an instance from an Auto Scaling group. You can remove (detach) an instance that is in the InService state from an Auto Scaling group. After the instance is detached, you can manage it independently from the rest of the Auto Scaling group. Detaching also lets you move an instance out of one Auto Scaling group and attach it to a different group. For more information, see Attach EC2 instances to your Auto Scaling group.
AWS : This runbook locates large files in an EC2 instance and backs them up into a given S3 bucket. Afterwards, it deletes the backed-up files and sends a message to a specified Slack channel. It uses SSH and Linux commands to perform these steps.
AWS : This runbook finds Redshift clusters that don't have pause and resume enabled and schedules pause and resume for those clusters.
AWS : This runbook can be used to list unhealthy EC2 instances from an ELB. Sometimes it is difficult to determine from the Activity History alone why Amazon EC2 Auto Scaling didn't terminate an unhealthy instance. You can find further details about an unhealthy instance's state, and how to terminate that instance, by checking a few extra things.
AWS : This runbook finds all EC2 key pairs that are not used by an EC2 instance and notifies a Slack channel about them. Optionally, it can delete the key pairs based on user configuration.
AWS : A disassociated Elastic IP address remains allocated to your account until you explicitly release it. AWS imposes a small hourly charge for Elastic IP addresses that are not associated with a running instance. This runbook can be used to release those unattached Elastic IP addresses.
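One way the unattached addresses might be picked out, sketched under the assumption that the input looks like the `Addresses` list returned by boto3's `describe_addresses`, where an associated address carries an `AssociationId` key. The sample records are invented.

```python
def find_unassociated_addresses(addresses):
    """Return the addresses that are allocated but not associated."""
    return [a for a in addresses if not a.get("AssociationId")]


def release_addresses(addresses):
    """Release each address (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the filter above runs without boto3

    ec2 = boto3.client("ec2")
    for address in addresses:
        ec2.release_address(AllocationId=address["AllocationId"])


# Hypothetical sample data in the shape of describe_addresses()["Addresses"].
sample = [
    {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-1", "AssociationId": "eipassoc-1"},
    {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-2"},
]
unused = find_unassociated_addresses(sample)
print([a["PublicIp"] for a in unused])
```

Keeping the filter separate from `release_addresses` makes it possible to review the candidate list before anything is released.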
AWS : This runbook restarts unhealthy services in a target group. The restart command is provided via a tag attached to the instance.
AWS : This runbook can be used to copy an AMI from one region to multiple AWS regions using unSkript legos with AWS CLI commands. The list of available regions is also retrieved with AWS CLI commands.
AWS : This runbook can be used to identify and remove any unused NAT Gateways. This allows us to adhere to best practices and avoid unnecessary costs. NAT gateways are used to connect a private instance with outside networks. When a NAT gateway is provisioned, AWS charges you based on the number of hours it was available and the data (GB) it processes.
AWS : This runbook can be used to detach an instance from an Auto Scaling group. You can remove (detach) an instance that is in the InService state from an Auto Scaling group. After the instance is detached, you can manage it independently from the rest of the Auto Scaling group. Detaching also lets you move an instance out of one Auto Scaling group and attach it to a different group. For more information, see Attach EC2 instances to your Auto Scaling group.
AWS : This runbook checks whether there is a failed deployment in progress for a service in an ECS cluster. If it finds one, it sends the list of stopped tasks associated with this deployment, along with their stopped reasons, to Slack.
AWS : This runbook can be used to enforce mandatory tags across all AWS resources. It gets all untagged resources in the given region, discovers the tag keys in that region, and attaches the mandatory tags to every untagged resource.
AWS : To avoid unexpected interruptions, it's good practice to check whether any EC2 instances are scheduled to retire. This runbook can be used to list the EC2 instances that are scheduled to retire. To handle an instance retirement, you can stop the instance and start it again before the retirement date, which moves the instance to a more stable host.
AWS : This runbook can be used to collect CloudWatch data about provisioned capacity for AWS DynamoDB.
AWS : This runbook resizes an EBS volume to a specified size and extends the filesystem to use the new volume size. It can be attached to disk-usage-related CloudWatch alarms to do the appropriate resizing.
AWS : This runbook can be used to resize a list of PVCs in a namespace. By default, it resizes all PVCs in the namespace.
AWS : This runbook resizes a PVC to the given input size.
AWS : This runbook can be used to restart AWS EC2 instances.
AWS : This lego can be used to launch an AWS EC2 instance from an AMI in the given region.
AWS : This runbook can be used to troubleshoot the configuration of an EC2 instance in a private subnet. It captures the VPC ID for a given instance ID, uses the VPC ID to get the Internet Gateway details, and then tries to SSH into the instance and connect to the internet.
Jenkins : This runbook fetches the logs for a given Jenkins job and posts them to a Slack channel.
Jira : Using the Panel Library, visualize the time it takes for issues to close over a specific timeframe.
Kubernetes : This runbook shows and deletes the evicted pods for a given namespace. If the user provides the namespace input, it only collects pods from that namespace; otherwise, it selects pods from all namespaces.
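The selection step might look roughly like the sketch below. It assumes pod objects in the JSON shape produced by `kubectl get pods -o json`, and relies on evicted pods being reported with phase `Failed` and reason `Evicted`. The sample pods are invented, and the deletion itself is left out.

```python
def find_evicted_pods(pods, namespace=None):
    """Return (namespace, name) pairs for evicted pods.

    If namespace is given, only pods in that namespace are considered;
    otherwise pods from all namespaces are included.
    """
    evicted = []
    for pod in pods:
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        if namespace and meta.get("namespace") != namespace:
            continue
        if status.get("phase") == "Failed" and status.get("reason") == "Evicted":
            evicted.append((meta.get("namespace"), meta.get("name")))
    return evicted


# Hypothetical pod list, e.g. the "items" array of `kubectl get pods -A -o json`.
sample = [
    {"metadata": {"namespace": "web", "name": "api-1"},
     "status": {"phase": "Failed", "reason": "Evicted"}},
    {"metadata": {"namespace": "web", "name": "api-2"},
     "status": {"phase": "Running"}},
    {"metadata": {"namespace": "batch", "name": "job-7"},
     "status": {"phase": "Failed", "reason": "Evicted"}},
]
print(find_evicted_pods(sample))
print(find_evicted_pods(sample, namespace="web"))
```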
Kubernetes : This runbook fetches the kube-system config map for a k8s cluster and publishes the information to a Slack channel.
Kubernetes : This runbook gets the nodes that match a given configuration (storage, cpu, memory, pod_limit) from a k8s cluster.
Kubernetes : This runbook checks the logs of every pod in a namespace for warning messages.
Kubernetes : This runbook checks whether any pods are in the CrashLoopBackOff state in a given k8s namespace. If it finds any, it tries to determine why the pods are in that state.
Kubernetes : This runbook checks whether any pods are in the ImagePullBackOff state in a given k8s namespace. If it finds any, it tries to determine why the pods are in that state.
Kubernetes : This runbook checks whether any pods are stuck in the Terminating state in a given k8s namespace. If it finds any, it tries to recover them by resetting the pods' finalizers.
Kubernetes : This runbook resizes a list of Kubernetes PVCs.
Kubernetes : This runbook resizes a Kubernetes PVC.
Kubernetes : This runbook can be used to roll back a Kubernetes Deployment.
Postgresql : This runbook collects the long-running queries from a database and sends a message to the specified Slack channel. Poorly optimized queries and excessive connections can cause problems in PostgreSQL, impacting upstream services.
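A hedged sketch of how long-running queries could be collected, using PostgreSQL's `pg_stat_activity` view. The function name and the threshold are hypothetical, and the `%s` placeholder assumes a DB-API driver with that parameter style, such as psycopg2; the runbook's actual query may differ.

```python
LONG_RUNNING_SQL = """
SELECT pid,
       usename,
       state,
       now() - query_start AS duration,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND query_start IS NOT NULL
  AND now() - query_start > make_interval(secs => %s)
ORDER BY duration DESC
"""


def long_running_query(threshold_seconds):
    """Return (sql, params) for queries running longer than the threshold."""
    if threshold_seconds < 0:
        raise ValueError("threshold_seconds must not be negative")
    return LONG_RUNNING_SQL, (threshold_seconds,)


sql, params = long_running_query(300)  # 5 minutes, an illustrative threshold
print(params)
```

With an open connection, the pair can be passed straight to a cursor, for example `cursor.execute(sql, params)`, and the resulting rows formatted into the Slack message.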