Tree-Graph Structure of Conflux — explained by Conflux CEO Dr. Fan Long

30 October, 2019

img

Teacher ConFi Pointing Out a Visualization of the Tree Graph
Conflux adopts an unusual scalability approach to achieve 3500+ TPS on the current Conflux testnet. The intention is to encourage miners to actively participate in the block generation under PoW mechanism, and to increase the throughput of the whole network by parallel block generations from many nodes. Then, all the blocks are given a unified order through a Tree Graph structure, therefore all the traction sequence is confirmed as well as traction conflict is sorted out. This approach solves the scalability bottleneck by altering the ledger structure, which addresses more on scaling the consensus mechanism itself, and is way different from common practices such as sharing, layer-2 scalability, and state channel scalability.

In addition to exhausting the layer-1 performance, another reason for adopting Tree Graph structure is, from Conflux Founder Dr. Fan Long’s perspective, the technology of sharding is still immature. “Previously, sharding only occurred when there were no attacker or opponent party. While sharding, people always assume one of the databases is down, but never think of the possibility of malicious attackers. So, if ETH 2.0 plans to realize sharding, there will be a lot of traps on the road. Huge technological difficulties. Would even be daunting.”

To understand the essentials of the Tree-Graph structure, a deeper understanding of the consensus mechanism behind Bitcoin and Ethereum are necessary. The following explanations are from a previous interview of Conflux CEO Dr. Fan Long:

1. Bitcoin and Ethereum

Bitcoin adopts the longest chain rule to determine the main chain. All forked blocks are discarded and do not affect the longest chain determination. The transactions from the forked blocks are not recorded, hence the miners are not rewarded. Therefore, the computing power invested in the discarded blocks does not contribute to either the security or the throughput of the system.

Ethereum upgraded the block generation time from Bitcoin’s 10 minutes to 15 seconds, which is a huge improvement. But at the same time, due to the shortened block generation time, the amount of forked blocks is greatly increased. In this case, it is not appropriate to continue the longest chain rule. Therefore, Ethereum adopts the GHOST algorithm to select the heaviest chain as its main chain. GHOST is a main chain selection protocol that selects the branch with the heaviest subtree when encountering forking.

So what is the difference between the heaviest chain and the longest chain?

Simple explanation is, one considers the amount of blocks, the other considers the length of the blocks:

img

Visualization of Main Chains According to Longest-Chain Rule and GHOST Protocol
Ethereum adopts a variant of the GHOST protocol. For the block reward, the forked block referenced by the subsequent block, which here refers as the “uncle block”, will also bring certain rewards to the miners. This approach allows more computing power to contribute to the system security, moreover, will solve the centralization and fairness issue caused by large number of forked blocks to some level. The GHOST protocol integrates the forked blocks into one main chain as much as possible.

2. Paralleled Block Generation Idea of Conflux

Although Ethereum retains a little reward for the uncle block, the transactions within those uncle block are still invalid and are not recorded on the chain. This part of the throughput is essentially being wasted.

Conflux adopts a GHOST-like approach (which is called the GHAST algorithm) to select the main chain, also known as the “pivot chain.” The difference is that the fork block is considered valid in Conflux. Not only will the miners be rewarded, but the transactions within the block are also recorded on the chain. This means that the throughput from the fork block is retrieved. Miners are able to generate new blocks in parallel without waiting to receive all the latest blocks.

In order to retrieve the fork block, fully sequencing all the transactions on both the forked blocks and the normal blocks is the vital problem. If we are able to sort all the transactions in full order, we can solve the problem of forked transaction conflicts, and have a clean ledger at the end.

2.1 Example of Queuing

From a higher level perspective, Conflux’s idea of scalability is actually quite simple.

Bitcoin needs to wait for the current block to end before generating another block. It’s the same as queuing, everyone is crowded in the same line. In order to prevent potential chaos caused by overcrowding, Bitcoin chooses to increase the difficulty of PoW target to limit the speed of people (the speed of block generation). The speed of this queue is destined to be very slow.

In addition, the longest chain rule stipulates that the fork branch is invalid unless they can be longer than the current one. Although the longest chain rule guarantees the security of the queue order, it sacrifices the efficiency of the queue.

Conflux’s approach is to allow everyone to follow different queues and forks randomly. Suppose there are 10 queues at the same time, it’s still ok as long as we can finally find a way to determine the full order of each person in all these 10 queues.

Under this approach, miners are encouraged to generate blocks actively. Without exceedingly limits the amount of concurrent blocks, and fully utilize the bandwidth resources, the throughput of the entire system is greatly improved. Therefore we turn a scalability problem into a sequencing problem: how to determine the full order of all the blocks when the forking is allowed? In the 10 queues, how to determine who comes first?

The answer is hidden in the Tree Graph structure.

2.2 What is a Tree Graph structure?

Conflux’s block structure is not a chain, but rather a graph, and the graph contains a tree. Therefore, Conflux calls it a Tree Graph structure.

The Tree Graph structure introduces two kinds of edges for each block, the parent edge and the reference edge.

Here the “edge” refers to the reference relationship between the blocks. When one block refers to another block, an edge is formed. In the example of queuing, the “edge” is equivalent to quoting the person who queues before you. Through this edge, we can know that the others are in front of you.

Conflux stipulates that each block has only one parent edge, which means when a miner packs a new block, he must choose one previous block as parent block. Only one parent block is allowed. Multiple parent blocks are not possible.

In addition to the parent edge, each block must reference itself to other unreferenced blocks on other forks in sights, thusly formed the reference edge. The role of the reference edge is to indicate that the referenced fork blocks are in front of the newly added block itself, which demonstrates the front and back sequence. Each block is with zero or more reference edges.

When only looking at the parent edge, the entire Conflux structure is a tree:

img

The “Tree” Structure of Conflux
When only looking at both the reference edge and the parent edge, the entire structure is a graph:

img

The Tree-Graph Structure with Parent and Reference Edge
3 How to Sequence

Having figured out the Tree Graph structure, let’s take a look at how the system sequence all the blocks while accepting the forks.

Sequencing is obviously crucial. Once a consistent and irreversible order is applied to all the blocks, then:

  1. It means that we can determine the validity of the transaction one by one according to the sequence. If two transactions are in conflict, the latter one is invalid; 2. Double spend attack is also impossible, unless the attacker is able to create a heavier branch. But the required work load surpasses all the blocks — including the fork blocks — from the original branch, not just the blocks on the longest chain. Therefore, all the computational power used to generate the fork blocks contributes to the security of the system.

Hence the problem becomes a sequencing algorithm. Because of the existence of the reference edge and the parent edge, this algorithm is fairly straightforward. The following figure shows the pivot chain, the forked blocks and the relationship of them through the edge types:

img

Pivot Chain and Forked Blocks in Tree-Graphe
This sequencing algorithm needs to be solved in two parts.

Through the parent edge, the main chain can be identified by the heaviest chain rule. Here in Conflux, we refer it as the Pivot Chain. In the above picture, the red blocks form a pivot chain.

The blocks on the pivot chain naturally form a sequence: A -> C -> G -> J -> M -> N

How do we sequence the other yellow blocks?

This is when the reference edge comes into the picture. For instance, in the above figure, the block F references the block B, which means F saw the existence of B before F is generated. Thus, B should be placed before F.

This is easy to comprehend. You might ask, how to sequence B and D?

Here, a new concept is introduced. Conflux introduces a concept called Epoch for blocks on the pivot chain, similar to “time period” and “phase”.

For example, block A is the first block on the pivot chain, so A is in Epoch 0; block C is the second block of the pivot chain, so C belongs to Epoch 1, and so on.

For a block not on the pivot chain, its Epoch is determined according to the first pivot chain block that directly or indirectly refers to it. The specific algorithm is not complicated, and it can be explained clearly with a practical example. Let’s take Epoch 3 as an example:

First, the pivot chain block J belongs to Epoch 3:

J’s reference block F is also Epoch 3 because F was not sequenced before, meanwhile we know that F is in front of J according to the relationship of the reference edges;

F’s parent block C was included in Epoch 2, hence C does not belong to Epoch 3. But F’s reference block B was not sequenced, so B also belongs to Epoch 3. Also, according to the reference relationship, B ranks before F.

Therefore, the sequence of Epoch 3 internal blocks is: B -> F -> J

That is to say, for a reference block of a pivot chain block, and this reference block’s parent block along with this parent block’s reference block, if they have not been sequenced, then they belong to the same Epoch. There is also an exception to this situation, in which the two reference blocks are unable to determine the order of each other, then the hash ID will be used to sequence.

To sum up the sequencing:

1.Sequence Epochs by the pivot chain

2.Sequence the blocks within the same Epoch

img

By this algorithm, a consistent sequence can be determined for all the blocks in the above figure. You are encouraged to follow the steps above and see if you get the same result.

4 Analysis of the Conflux solution

Regardless of the blockchain system, any block, once generated, needs to be broadcasted to each node for synchronization through the whole network. As long as broadcast is involved, the transmission rate and load capacity of the network cannot be ignored as there is a bottleneck in transmitting data over the network. Assuming that the network bandwidth is a fixed value, the larger the data is, the slower the transmitting speed will be.

All broadcasted data = really useful blocks + invalid fork blocks + system running costs

In Conflux, the fork blocks are also valid blocks Conflux optimistically assumes that transactions on the fork blocks and transactions on the pivot chain are mostly non-conflicting, so theoretically, retaining all the blocks can make full use of bandwidth.

Of course, this solution is not completely problem-free. Let’s see how Conflux solves some potential problems.

4.1 Duplicate Transaction in Block Packaging

The transactions in the fork block may not conflict. Given the computing calculation on the fork block is also honest calculation power, and it is a good miner who is doing the work, but what if the transactions in the fork block are repeated? In other words, how do we solve the problem of multiple honest miners repeatedly packing the same transactions? Obviously, if miners’ packed transactions are duplicates, then even if all the fork blocks are retained, it is still difficult to achieve the goal of increasing throughput.

For this problem, Conflux adopts a hybrid strategy: using the game theory, to encourage miners to randomly select transactions in the trading pool to pack for the purpose of maximizing the profit. When the amount of transactions in the trading pool is large (it is also when to increase throughput the most), the random packing strategy can greatly reduce the probability of repeatedly packing the transactions. If there is only one transaction in the trading pool, then it is certain that each miner would only pack this transaction, hence here is the highest rate of the transaction repeatability — in fact, this is the time when parallel packaging is needed the least.

Conflux’s conclusion of theoretical calculation is that, by eliminating the inefficient throughput of duplicate transactions, the entire solution can still achieve more than 60% scalability upbringing in the ideal scenario of non-repetitive transactions.

4.2 Liveness Attack

As the heaviest chain rule is applied, if an attacker, by selectively hiding blocks, makes two branch chains weight similarly to an honest node, this honest node can never consistently select a pivot chain, moreover, the sequence of the transactions can not be confirmed. Although this is not a double spend attack, this situation still disable the whole system. This type of attack is also known as a Liveness Attack.

Conflux solves this problem by improving the GHOST algorithm to the aforementioned GHAST algorithm. In the case of an abnormality of a suspected attack, the GHAST algorithm switches to another mode, and allocates the block with a different weight according to the topological position of the block in the Tree Graph. In other words, this weighted topology structure is with similar security level to Bitcoin, which increases the security level of the entire system by slowing down the block generation speed and adding more weight on each block. Until the honest calculating power is restored, When the honest computing power starts to merge on pivot chain, GHAST will switch back to the original fast mode to prevent liveness attacks.

4.3 The overheads of the tree structure

The Tree Graph structure restores the reference relationships of multiple blocks. When the graph size expands, the whole structure will become very complicated. Therefore there will be an expensive calculation cost for calculating the topological relationship in the network. How can Conflux be more efficient when dealing with Tree Graph?

Ethereum adopts a modified GHOST algorithm, which only calculates seven-layer of the relationship. If all layer GHOST is to be taken into consideration, then the corresponding cost is O(N), where N is the number of all the blocks in the entire chain. Conflux introduces a new and more efficient algorithm that controls the complexity of adding blocks per step of the GHAST algorithm to O (logN). Please refer to Conflux academic paper and other materials for more information.

4.4 Incentive Mechanism

In Bitcoin, a large amount of computing power is needed to initiate an attack on fork chain, and no reward if the attack fails. The overall cost of attack is extremely high. All the fork blocks in the Conflux are valid. Although double-spend attack cannot be achieved by forking, if the attacker could gain reward from maliciously adding fork blocks, the system will also be affected. For example, the cost of increasing transaction confirmation time is negligible. Another possibility is to increase the complexity of the whole Tree Graph structure, thereupon increase the overall cost of the system. How would Conflux solve this problem?

The answer is to punish malicious attacker by Incentive Mechanism design. Under Conflux’s pivot chain selection protocol, if an attacker intentionally adds blocks to the fork chain, he must pretend that he does not see some of the reference blocks and ignores other existing blocks. However, in Conflux’s design, block rewards are not fixed: the more often a block disregards other blocks, the less reward it will get. Through this economic punishment, attackers must also bear economic losses in Conflux.

5 Summary

Conflux keeps all fork blocks by Tree Graph, thereby improving the overall throughput of the system, and realizing scalability under the premise of security. The overall idea is not complicated. However, in terms of engineering implementation, the overall sequencing of all transactions, the processing of complex topology of Tree Graph, the problem of block generation incentives, and how to reduce duplicate transactions, there are still many difficulties to be tackled by much profound technical and theoretical solutions. At present, Conflux has already launched and open-sourced the testnet, and welcome any interested personnel or party to try out.

Conflux is a State-of-the-Art public blockchain system that can achieve high TPS without sacrificing decentralization or safety. By delicately combining its unique and advanced algorithm with a novel structure — — Tree Graph (TG), Conflux makes consensus no longer a performance bottleneck, thereupon solves a series of problems in the industrialization of public chains. Currently, in its first stage, Conflux adopts PoW (Proof of Work) mechanism as the basis of its consensus.

Join the Revolution here:

Youtube: Conflux Chain
Twitter: https://twitter.com/ConfluxChain
Telegram: http://t.me/Conflux_English

Source: https://medium.com/@Confluxchain/tree-graph-structure-of-conflux-explained-by-conflux-ceo-dr-fan-long-6133db85eafd