incident management
            Kitchen Soap – The Infinite Hows (or, the Dangers Of The Five Whys)
            
                    
    
                https://www.kitchensoap.com/2014/11/14/the-infinite-hows-or-the-dangers-of-the-five-whys/
            
                            (this is also posted on O’Reilly’s Radar blog. Much thanks to Daniel Schauenberg, Morgan Evans, and Steven Shorrock for feedback on this) Before I begin this post, let me say that this is intended to be a critique of the Five Whys method, not a criticism of the people who are in favor of using…
                Added 3 hours ago 
            
                            
                    
            STPA (System Theoretic Process Analysis) -- Teaching a new way to prevent outages at Google
            
                    
    
                https://sre.google/stpa/teaching/
            
                            Read how Google is using System Theoretic Process Analysis (STPA) to analyze pure software systems and discover risks.
                Added 1 day ago 
            
                            
                    
            Summary of the Amazon DynamoDB Service Disruption in Northern Virginia (US-EAST-1) Region
            
    
                https://aws.amazon.com/message/101925/
            
                    
                Added 6 days ago 
            
                            
                    
            Major AWS Outage Happening
            
                    
    
                https://old.reddit.com/r/aws/comments/1obd3lx/dynamodb_down_useast1/
            
                            Well, looks like we have a dumpster fire on DynamoDB in us-east-1 again.
                Added 1 week ago 
            
                            
                    
            Root cause analysis? You're doing it wrong
            
    
                https://entropicthoughts.com/root-cause-analysis-youre-doing-it-wrong
            
                    
                Added 2 weeks ago 
            
                            
                    
            New AWS Security Incident Response helps organizations respond to and recover from security events
            
                    
    
                https://aws.amazon.com/blogs/aws/new-aws-security-incident-response-helps-organizations-respond-to-and-recover-from-security-events/
            
                            AWS introduces a new service to streamline security event response, providing automated triage, coordinated communication, and expert guidance to recover from cybersecurity threats.
                Added 5 months ago 
            
                            
                    
            CrowdStrike: Channel File 291 Incident Root Cause Analysis (08.06.2024)
            
    
                https://www.crowdstrike.com/wp-content/uploads/2024/08/Channel-File-291-Incident-Root-Cause-Analysis-08.06.2024.pdf
            
                    
                Added 5 months ago 
            
                            
                    
            Dissect documentation: Overview
            
    
                https://docs.dissect.tools/en/latest/overview/index.html?s=09
            
                    
                Added 5 months ago