Don't understand the system? Start fixing it anyway.

Source: DEV Community
My first professional engineering job dropped me into a sizeable microservices platform carrying significant technical debt. Nearly 1,000 user applications were stuck in a catch-all "fault" status. The faults were caused by bugs across different services, but the status didn't distinguish between them. You couldn't tell whether you were looking at a one-off edge case or a symptom of something affecting hundreds of users. From the outside, the system was impenetrable.

The engineering department was split into multiple teams, each owning a handful of services with deep knowledge of their slice. That created silos. Problems that spanned services fell between the cracks, and communication across team boundaries was slow. It puts me in mind of Conway's Law: the observation that a system's architecture tends to reflect the communication structure of the organisation that built it (Melvin Conway, 1967).

The team was overwhelmed. Customer support tickets were flooding in, and developers were h