• An Illustrated Proof of the CAP Theorem


    An Illustrated Proof of the CAP Theorem

    The CAP Theorem is a fundamental theorem in distributed systems that states any distributed system can have at most two of the following three properties.

    • Consistency
    • Availability
    • Partition tolerance

    This guide will summarize Gilbert and Lynch's specification and proof of the CAP Theorem with pictures!

    What is the CAP Theorem?

    The CAP theorem states that a distributed system cannot simultaneously be consistent, available, and partition tolerant. Sounds simple enough, but what does it mean to be consistent? available? partition tolerant? Heck, what exactly do you even mean by a distributed system?

    In this section, we'll introduce a simple distributed system and explain what it means for that system to be available, consistent, and partition tolerant. For a formal description of the system and the three properties, please refer to Gilbert and Lynch's paper.

    A Distributed System

    Let's consider a very simple distributed system. Our system is composed of two servers, G1G1 and G2G2. Both of these servers are keeping track of the same variable, vv, whose value is initially v0v0. G1G1 and G2G2 can communicate with each other and can also communicate with external clients. Here's what our system looks like.

    img

    A client can request to write and read from any server. When a server receives a request, it performs any computations it wants and then responds to the client. For example, here is what a write looks like.

    img img img

    And here is what a read looks like.

    img img

    Now that we've gotten our system established, let's go over what it means for the system to be consistent, available, and partition tolerant.

    Consistency

    Here's how Gilbert and Lynch describe consistency.

    any read operation that begins after a write operation completes must return that value, or the result of a later write operation

    In a consistent system, once a client writes a value to any server and gets a response, it expects to get that value (or a fresher value) back from any server it reads from.

    Here is an example of an inconsistent system.

    img img img imgimg

    Our client writes v1v1 to G1G1 and G1G1 acknowledges, but when it reads from G2G2, it gets stale data: v0v0.

    On the other hand, here is an example of a consistent system.

    img img img imgimg img img img

    In this system, G1G1 replicates its value to G2G2 before sending an acknowledgement to the client. Thus, when the client reads from G2G2, it gets the most up to date value of vv: v1v1.

    Availability

    Here's how Gilbert and Lynch describe availability.

    every request received by a non-failing node in the system must result in a response

    In an available system, if our client sends a request to a server and the server has not crashed, then the server must eventually respond to the client. The server is not allowed to ignore the client's requests.

    Partition Tolerance

    Here's how Gilbert and Lynch describe partitions.

    the network will be allowed to lose arbitrarily many messages sent from one node to another

    This means that any messages G1G1 and G2G2 send to one another can be dropped. If all the messages were being dropped, then our system would look like this.

    img

    Our system has to be able to function correctly despite arbitrary network partitions in order to be partition tolerant.

    The Proof

    Now that we've acquainted ourselves with the notion of consistency, availability, and partition tolerance, we can prove that a system cannot simultaneously have all three.

    Assume for contradiction that there does exist a system that is consistent, available, and partition tolerant. The first thing we do is partition our system. It looks like this.

    img

    Next, we have our client request that v1v1 be written to G1G1. Since our system is available, G1G1 must respond. Since the network is partitioned, however, G1G1 cannot replicate its data to G2G2. Gilbert and Lynch call this phase of execution α1α1.

    img img img

    Next, we have our client issue a read request to G2G2. Again, since our system is available, G2G2 must respond. And since the network is partitioned, G2G2 cannot update its value from G1G1. It returns v0v0. Gilbert and Lynch call this phase of execution α2α2.

    img img

    G2G2 returns v0v0 to our client after the client had already written v1v1 to G1G1. This is inconsistent.

    We assumed a consistent, available, partition tolerant system existed, but we just showed that there exists an execution for any such system in which the system acts inconsistently. Thus, no such system exists.

    Reprinted from here

  • 相关阅读:
    【转载】关于sql的执行计划(推荐详细) 天高地厚
    填充因子 天高地厚
    【转载】基于Windows下的Web性能测试和压力测试 天高地厚
    [转]SQL Server 2008存储结构之GAM、SGAM 天高地厚
    用sp_change_users_login消除Sql Server的孤立用户
    定义类成员
    个人JS脚本验证大全[转]
    网页居中
    转 优先队列 的使用
    java BigInteger
  • 原文地址:https://www.cnblogs.com/simon-1024/p/12060089.html
Copyright © 2020-2023  润新知