Sunday, October 12, 2014

Achieving Rapid Response Times in Large Online Services - Google

When a query comes in, a whole bunch of subsystems have to be used to generate the information that needs to appear on the page. So they break these large systems down into a bunch of sub-services, and they need enough computation power on the back end that a single query can hit thousands of servers, get all the results back, and decide what to show the user in a pretty short amount of time.


At Google, all of these servers run in a shared environment; they do not dedicate a particular machine to a specific task.

Canary Requests

This is one way of keeping your serving system safe in the presence of a large fan-out. Normally you take in a query at the top of the tree and send it down through the parents and eventually to all the leaves. What happens if the code that handles the query on the leaves suddenly crashes for some reason, say because of a weird property of the query that has never been seen before? You send the query down and it kills your whole data center.


To handle this, they take a small latency hit in order to keep the serving system safe. They first send the query to just one leaf, and if that succeeds they have more confidence that the query is not going to cause trouble when it is sent to all thousand servers.
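The canary idea can be sketched in a few lines of Python. This is only an illustration, not Google's implementation: `fan_out_with_canary`, `leaves`, and the `send(leaf, query)` stub are hypothetical names, and a leaf crash is modelled as an exception raised by `send`.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out_with_canary(query, leaves, send):
    """Send `query` to one leaf first; fan out to the rest only if it succeeds.

    `leaves` is a list of leaf identifiers. `send(leaf, query)` is a
    hypothetical RPC stub that returns a result or raises on failure.
    """
    if not leaves:
        return []
    canary, rest = leaves[0], leaves[1:]
    # Canary phase: if this call raises, no other leaf ever sees the query.
    first = send(canary, query)
    if not rest:
        return [first]
    # The canary survived, so send the query to the remaining leaves in parallel.
    with ThreadPoolExecutor(max_workers=min(32, len(rest))) as pool:
        others = list(pool.map(lambda leaf: send(leaf, query), rest))
    return [first] + others
```

The extra round trip to the canary leaf is the latency hit mentioned above; in exchange, a query that crashes a leaf takes down one server instead of all of them.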

Backup Requests

A request (req 9) is sent to a particular server, and if they do not hear back from that server, they send the same request to another server.
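A rough sketch of a backup request in Python, under some assumptions of mine: `send_with_backup`, the `send(server, request)` stub, and the `backup_delay_s` threshold are hypothetical, and "not hearing back" is modelled as the primary not answering within that delay.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def send_with_backup(request, primary, backup, send, backup_delay_s=0.01):
    """Return the first reply, asking `backup` if `primary` is slow.

    `send(server, request)` is a hypothetical blocking RPC stub.
    `backup_delay_s` is an assumed wait before the backup is issued.
    """
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        futures = [pool.submit(send, primary, request)]
        # Give the primary a short head start.
        done, _ = wait(futures, timeout=backup_delay_s)
        if not done:
            # No reply yet: send the same request to another server.
            futures.append(pool.submit(send, backup, request))
            done, _ = wait(futures, return_when=FIRST_COMPLETED)
        # Use whichever reply arrived first (re-raises if that call failed).
        return next(iter(done)).result()
    finally:
        # Do not block on the slower call; this sketch does not cancel it.
        pool.shutdown(wait=False)
```

For example, `send_with_backup("req 9", "server-a", "server-b", send)` would only contact `server-b` if `server-a` has not replied within the delay.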

Source: https://www.youtube.com/watch?v=1-3Ahy7Fxsc
