Idempotence (see the formal definition on [Wikipedia](https://en.wikipedia.org/wiki/Idempotence)), in the context of messaging, means that a redelivered message can be handled without leaving the system in an unintended state.
## Delivery guarantees[^1]
[^1]: This chapter is based on the [Delivery guarantees](https://github.com/rebus-org/Rebus/wiki/Delivery-guarantees) page of the Rebus wiki, which describes the topic very well.
Before we talk about idempotency, let's talk about how messages are delivered on the consumer side.
Since CAP does not use MS DTC or any other kind of 2PC distributed transaction mechanism, it cannot guarantee that a message is delivered strictly once; a message is delivered at least once. Specifically, a message-based system can offer one of three delivery guarantees:
* Exactly Once(*)
* At Most Once
* At Least Once
Exactly Once has a (*) next to it because, in the general case, it is simply not possible.
### At Most Once
The At Most Once delivery guarantee covers the case where you are guaranteed to receive each message either once or not at all.
This type of delivery guarantee can arise when your messaging system and your code perform their actions in the following order:
```
1. Remove message from queue
2. Start work transaction
3. Handle message (your code)
4. Success?
    Yes:
        1. Commit work transaction
    No:
        1. Roll back work transaction
        2. Put message back into the queue
```
In the sunshine scenario, this is all well and good: your messages will be received, work transactions will be committed, and you will be happy.
However, the sun does not always shine, and stuff tends to fail, especially if you do enough stuff. Consider, for example, what would happen if anything fails after step (1) has been performed, and then, when you try to execute step (4)/(2) (i.e. put the message back into the queue), the network is temporarily unavailable, or the message broker has restarted, or the host machine has decided to reboot because it installed an update.
With this protocol, you run the risk of losing messages. This can be OK if it's what you want, but most things in CAP revolve around the concept of DURABLE messages, i.e. messages whose contents are just as important as the data in your database.
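
The At Most Once steps can be sketched in C# roughly as follows. This is a minimal illustration, not CAP code: `AtMostOnceConsumer` and the `requeueAvailable` flag are hypothetical names, and an in-memory `Queue<string>` stands in for the broker. The flag simulates the broker being unreachable at the moment the message should be put back.

```c#
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a broker queue; not part of CAP.
public sealed class AtMostOnceConsumer
{
    private readonly Queue<string> _queue;

    public AtMostOnceConsumer(IEnumerable<string> messages)
    {
        _queue = new Queue<string>(messages);
    }

    public int Remaining => _queue.Count;

    // Returns true if the handler ran without throwing.
    public bool ReceiveOne(Action<string> handler, bool requeueAvailable = true)
    {
        // Step 1: the message leaves the queue before any work is done.
        string message = _queue.Dequeue();
        try
        {
            // Steps 2-3: the handler represents the whole "work transaction".
            handler(message);
            return true; // Step 4, Yes: nothing else to do.
        }
        catch (Exception)
        {
            // Step 4, No: try to put the message back.
            // If this cannot run (broker down, host rebooting), the message is lost.
            if (requeueAvailable)
            {
                _queue.Enqueue(message);
            }
            return false;
        }
    }
}
```

When the handler throws while `requeueAvailable` is `false`, the queue ends up empty even though no work was committed, which is exactly the message loss described above.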
### At Least Once
This delivery guarantee covers the case where you are guaranteed to receive each message once, or possibly several times if something has failed.
It requires a slight change to the order in which we execute our steps, and it requires that the message queue system supports transactions, either in the form of the traditional begin-commit-rollback protocol (MSMQ does this) or in the form of a receive-ack-nack protocol (RabbitMQ, Azure Service Bus, etc. do this).
Check this out: if we do this:
```
1. Grab lease on message in queue
2. Start work transaction
3. Handle message (your code)
4. Success?
    Yes:
        1. Commit work transaction
        2. Delete message from queue
    No:
        1. Roll back work transaction
        2. Release lease on message
```
and the "lease" we grabbed on the message in step (1) is associated with an appropriate timeout, then we are guaranteed that no matter how wrong things go, we will only actually remove the message from the queue (i.e. execute step (4)/(2)) if we have successfully committed our "work transaction". Whenever something fails or the lease times out, the message is received again.
What the "work transaction" actually is depends on what you're doing 😄 Maybe it's a transaction in a relational database (which traditionally have pretty good support in this regard), maybe it's a transaction in a document database that happens to support transactions (like RavenDB or Postgres), or maybe it's a conceptual transaction in the form of whichever work you happen to carry out as a consequence of handling a message, e.g. updating a bunch of documents in MongoDB, moving some files around in the file system, or mutating some obscure in-memory data structure.
The fact that the "work transaction" is just a conceptual thing is what makes it impossible to support the aforementioned Exactly Once delivery guarantee: it's just not generally possible to commit or roll back a "work transaction" and a "queue transaction" (which is what we could call the protocol carried out with the message queue system) atomically and consistently.
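
The lease-based protocol can be sketched in C# as follows. This is an illustration under stated assumptions, not how CAP or any broker is implemented: `LeasingQueue` and `AtLeastOnceConsumer` are hypothetical names, the queue lives in memory, each message string is assumed to be unique, and the current time is passed in explicitly so that lease expiry is easy to follow.

```c#
using System;
using System.Collections.Generic;

// Hypothetical in-memory queue with leases; real brokers do this via ack/nack or transactions.
public sealed class LeasingQueue
{
    private readonly List<string> _messages = new List<string>();
    private readonly Dictionary<string, DateTime> _leasedUntil = new Dictionary<string, DateTime>();
    private readonly TimeSpan _leaseTimeout;

    public LeasingQueue(TimeSpan leaseTimeout) { _leaseTimeout = leaseTimeout; }

    public int Count => _messages.Count;

    public void Enqueue(string message) => _messages.Add(message);

    // Step 1: grab a lease on the first message whose lease is absent or expired.
    public bool TryLease(DateTime now, out string message)
    {
        foreach (string candidate in _messages)
        {
            bool leased = _leasedUntil.TryGetValue(candidate, out DateTime until) && until > now;
            if (!leased)
            {
                _leasedUntil[candidate] = now + _leaseTimeout;
                message = candidate;
                return true;
            }
        }
        message = "";
        return false;
    }

    public void Delete(string message)
    {
        _messages.Remove(message);
        _leasedUntil.Remove(message);
    }

    public void Release(string message) => _leasedUntil.Remove(message);
}

public static class AtLeastOnceConsumer
{
    // Returns true only if the handler succeeded and the message was deleted.
    public static bool ReceiveOne(LeasingQueue queue, Action<string> handler, DateTime now)
    {
        if (!queue.TryLease(now, out string message)) return false;
        try
        {
            handler(message);       // Steps 2-3 and the commit of the work transaction.
            queue.Delete(message);  // Only reached after the work has been committed.
            return true;
        }
        catch (Exception)
        {
            queue.Release(message); // If even this fails, the lease simply times out.
            return false;
        }
    }
}
```

If the process dies between the handler completing and `Delete`, the lease expires and the same message is handed out again. That redelivery is why the handler has to tolerate duplicates.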
## Idempotence at CAP
In CAP, the delivery guarantee we use is **At Least Once**.
Since we have a temporary storage medium (a database table), we might be able to offer At Most Once, but in order to strictly guarantee that messages are not lost, we do not provide any related feature or configuration.
### Why are we not providing (achieving) idempotency?
1. The message was written successfully, but the execution of the Consumer method failed.
There are many reasons why a Consumer method can fail. Without knowing the specific scenario, CAP cannot tell whether blindly retrying or not retrying is the wrong choice.
For example, if the consumer is a debit service and the debit itself succeeds but writing the debit log fails, CAP will judge that the consumer failed and try again. If the consumer does not guarantee idempotency, the retry will debit the account multiple times, with serious consequences.
2. The Consumer method executed successfully, but the same message was received again.
This scenario is also possible. Suppose the Consumer has already executed successfully, but for some reason, such as the Broker recovering from downtime, the same message is received again. CAP will treat the message received from the Broker as a new one, and the Consumer will execute again. Because it is seen as a new message, CAP cannot make this idempotent.
3. The current data storage mode cannot be idempotent.
Since CAP deletes successfully consumed messages from its message table after 1 hour, it cannot deduplicate against historical messages, for example when the Broker has been under maintenance or some messages have been manually reprocessed for some reason.
For example, you can rely on the `INSERT ON DUPLICATE KEY UPDATE` statement provided by the database, or on an equivalent check in your own code, to make the operation idempotent easily.
### Explicitly handling redeliveries
Another way of making message processing idempotent is to explicitly track the IDs of processed messages, and then make your code handle a redelivery.
Assuming that you are keeping track of message IDs by using an `IMessageTracker` that uses the same transactional data store as the rest of your work, your code might look somewhat like this:
```c#
readonly IMessageTracker _messageTracker;
// ...
public async Task Handle(SomeMessage message)
// ...
```
As for the implementation of `IMessageTracker`, you can use a store such as Redis or a database to record each message ID together with its processing state.
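
A minimal sketch of such a tracker and a handler that uses it might look like the following. Everything here is an assumption made for illustration: the method names `HasProcessed` and `MarkAsProcessed`, the `Id` and `Amount` properties of `SomeMessage`, and the `TotalDebited` counter that stands in for real work. The in-memory tracker does not share a transaction with the work, so in production you would back it with the same transactional store as the rest of your data.

```c#
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Assumed shape of the tracker; the method names are hypothetical.
public interface IMessageTracker
{
    Task<bool> HasProcessed(string messageId);
    Task MarkAsProcessed(string messageId);
}

// In-memory implementation for illustration only; state is lost on restart.
public sealed class InMemoryMessageTracker : IMessageTracker
{
    private readonly ConcurrentDictionary<string, bool> _processed =
        new ConcurrentDictionary<string, bool>();

    public Task<bool> HasProcessed(string messageId) =>
        Task.FromResult(_processed.ContainsKey(messageId));

    public Task MarkAsProcessed(string messageId)
    {
        _processed[messageId] = true;
        return Task.CompletedTask;
    }
}

// Hypothetical message type: Id identifies the message, Amount is the payload.
public sealed class SomeMessage
{
    public string Id { get; set; } = "";
    public decimal Amount { get; set; }
}

public sealed class SomeMessageHandler
{
    private readonly IMessageTracker _messageTracker;

    public SomeMessageHandler(IMessageTracker messageTracker)
    {
        _messageTracker = messageTracker;
    }

    // Stand-in for the real side effect, e.g. debiting an account.
    public decimal TotalDebited { get; private set; }

    public async Task Handle(SomeMessage message)
    {
        // A redelivery of an already processed message is ignored.
        if (await _messageTracker.HasProcessed(message.Id))
        {
            return;
        }

        TotalDebited += message.Amount;

        // Remember the ID so that later redeliveries are skipped.
        await _messageTracker.MarkAsProcessed(message.Id);
    }
}
```

Handling the same message twice then debits only once, provided the tracker update and the work are committed together.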
docs/content/user-guide/en/cap/transactions.md
# Transaction
## Distributed transactions?
CAP does not provide out-of-the-box distributed transactions based on MS DTC or 2PC. Instead, it provides a solution to the problems that distributed transactions are meant to address.
In a distributed environment, distributed transactions based on 2PC or DTC can be very expensive in terms of performance because of the communication overhead involved. In addition, since such transactions are also subject to the **CAP theorem**, they have to give up availability (the A in CAP) when a network partition occurs.
> A distributed transaction is a very complex process with a lot of moving parts that can fail. Also, if these parts run on different machines or even in different data centers, the process of committing a transaction could become very long and unreliable.
> This could seriously affect the user experience and overall system bandwidth. So **one of the best ways to solve the problem of distributed transactions is to avoid them completely**.[^1]
For the processing of distributed transactions, CAP uses the "Eventual Consistency and Compensation" scheme.
By far, one of the most feasible models of handling consistency across microservices is [eventual consistency](https://en.wikipedia.org/wiki/Eventual_consistency).
This model doesn’t enforce distributed ACID transactions across microservices. Instead, it proposes to use some mechanisms of ensuring that the system would be eventually consistent at some point in the future.
#### A Case for Eventual Consistency
For example, suppose we need to solve the following task:
* register a user profile
* do some automated background check that the user can actually access the system
The second task is to ensure, for example, that this user wasn’t banned from our servers for some reason.
But it could take time, and we’d like to extract it to a separate microservice. It wouldn’t be reasonable to keep the user waiting for so long just to know that she was registered successfully.
**One way to solve it would be with a message-driven approach including compensation**. Let’s consider the following architecture:
* the user microservice tasked with registering a user profile
* the validation microservice tasked with doing a background check
* the messaging platform that supports persistent queues
The messaging platform could ensure that the messages sent by the microservices are persisted. They would then be delivered at a later time if the receiver were not currently available.
#### Happy Scenario
In this architecture, a happy scenario would be:
* the user microservice registers a user, saving information about her in its local database
* the user microservice marks this user with a flag. It could signify that this user hasn’t yet been validated and doesn’t have access to full system functionality
* a confirmation of registration is sent to the user with a warning that not all functionality of the system is accessible right away
* the user microservice sends a message to the validation microservice to do the background check of a user
* the validation microservice runs the background check and sends a message to the user microservice with the results of the check
* if the results are positive, the user microservice unblocks the user
* if the results are negative, the user microservice deletes the user account
After we’ve gone through all these steps, the system should be in a consistent state. However, for some period of time, the user entity appeared to be in an incomplete state.
The last step, when the user microservice removes the invalid account, is a compensation phase.
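
The user microservice's side of this flow, including the compensation step, could be sketched as below. This is a simplified illustration with hypothetical names (`UserService`, `UserStatus`, `Outbox`); an in-memory dictionary stands in for the local database and a list stands in for the persistent queue.

```c#
using System.Collections.Generic;

public enum UserStatus { PendingValidation, Active }

// Hypothetical user microservice; storage and messaging are in-memory stand-ins.
public sealed class UserService
{
    private readonly Dictionary<string, UserStatus> _users = new Dictionary<string, UserStatus>();

    // Outgoing messages; a real system would publish them to a persistent queue.
    public List<string> Outbox { get; } = new List<string>();

    public void Register(string userId)
    {
        // Save the user, flagged as not yet validated.
        _users[userId] = UserStatus.PendingValidation;
        // Ask the validation microservice for a background check.
        Outbox.Add("validate:" + userId);
    }

    public void OnValidationResult(string userId, bool passed)
    {
        // Ignore unknown users, e.g. a repeated result for an account already removed.
        if (!_users.ContainsKey(userId)) return;

        if (passed)
        {
            _users[userId] = UserStatus.Active; // unblock the user
        }
        else
        {
            _users.Remove(userId);              // compensation: delete the invalid account
        }
    }

    public bool Exists(string userId) => _users.ContainsKey(userId);

    public bool IsActive(string userId) =>
        _users.TryGetValue(userId, out UserStatus status) && status == UserStatus.Active;
}
```

Between `Register` and `OnValidationResult` the user exists but is not active, which is the temporarily incomplete state mentioned above.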
#### Failure Scenarios
Now let’s consider some failure scenarios:
* if the validation microservice is not accessible, then the messaging platform with its persistent queue functionality ensures that the validation microservice would receive this message at some later time
* suppose the messaging platform fails, then the user microservice tries to send the message again at some later time, for example, by scheduled batch-processing of all users that were not yet validated
* if the validation microservice receives the message, validates the user but can’t send the answer back due to the messaging platform failure, the validation microservice also retries sending the message at some later time
* if one of the messages got lost, or some other failure happened, the user microservice finds all non-validated users by scheduled batch-processing and sends requests for validation again
Even if some of the messages were issued multiple times, this wouldn’t affect the consistency of the data in the microservices’ databases.
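
The scheduled batch-processing mentioned in these failure scenarios could be sketched as follows. `PendingUserSweeper` and the `trySend` delegate are hypothetical: `trySend` stands for publishing a validation request and returns `false` when the messaging platform is unavailable.

```c#
using System;
using System.Collections.Generic;

// Hypothetical scheduled job that re-requests validation for users not yet validated.
public sealed class PendingUserSweeper
{
    private readonly HashSet<string> _pending = new HashSet<string>();

    public int PendingCount => _pending.Count;

    public void MarkPending(string userId) => _pending.Add(userId);

    // Safe to call more than once for the same user.
    public void MarkValidated(string userId) => _pending.Remove(userId);

    // Called on a schedule; returns how many requests were sent successfully.
    public int ResendAll(Func<string, bool> trySend)
    {
        int sent = 0;
        foreach (string userId in _pending)
        {
            // A failed send is not fatal: the user stays pending and is retried next run.
            if (trySend(userId)) sent++;
        }
        return sent;
    }
}
```

A user stays in the pending set until a validation result arrives, so the same request may be sent several times. Because `MarkValidated` is idempotent, those duplicates do not affect consistency.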
**By carefully considering all possible failure scenarios, we can ensure that our system would satisfy the conditions of eventual consistency. At the same time, we wouldn’t need to deal with the costly distributed transactions.**
But we have to be aware that ensuring eventual consistency is a complex task. It doesn’t have a single solution for all cases.