Starting at approximately 9pm PT, reachability issues with an internal system component left some edge nodes unable to retrieve all data for some requests. The end-to-end system continued to serve all requests with no increase in error rates, but with higher latency than normal.
Internal metrics did not immediately flag the problem because it affected only some combinations of requests and request parameters, and the resulting timeout behavior did not trigger the internal latency SLA cutoff.
An audit has been scheduled to verify that all internal requests comply with the system’s internal latency enforcement, and work has been scheduled to expand external monitoring to cover more combinations of requests and request parameters.
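
As a rough illustration of what monitoring more combinations of requests and request parameters could look like, the sketch below enumerates every pairing of request type and parameter values and reports the ones whose measured latency exceeds a budget. It is not the monitoring system described in this report. The names `enumerate_probes`, `find_slow_probes`, and `measure_latency_s`, and the latency budget, are hypothetical and chosen only for this example.

```python
import itertools
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

# A probe is a request type plus one concrete choice of parameter values.
Probe = Tuple[str, Dict[str, str]]


def enumerate_probes(
    request_types: Sequence[str],
    parameters: Dict[str, Sequence[str]],
) -> Iterator[Probe]:
    """Yield every combination of request type and parameter values."""
    names = sorted(parameters)  # fixed order so runs are reproducible
    value_lists = [parameters[name] for name in names]
    for request_type in request_types:
        for values in itertools.product(*value_lists):
            yield request_type, dict(zip(names, values))


def find_slow_probes(
    probes: Sequence[Probe],
    measure_latency_s: Callable[[str, Dict[str, str]], float],
    latency_budget_s: float,
) -> List[Probe]:
    """Return the probes whose measured latency exceeds the budget.

    measure_latency_s is a hypothetical callable that issues one request
    and returns its end-to-end latency in seconds.
    """
    return [
        probe
        for probe in probes
        if measure_latency_s(probe[0], probe[1]) > latency_budget_s
    ]
```

Measuring each combination separately means that a slowdown limited to a few request and parameter pairings shows up as its own result, rather than being averaged into an aggregate figure.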