Dialogue with Conflux: Much Space to Improve Layer 1; the Trade-off of Public Chain has not Arrived yet.
28 August, 2019
Conflux office is right next to Tsinghua University. This public chain team is with an exceedingly strong academic background. The founder Dr. Fan Long started to learn programming at the age of 5 and won two gold medals in the International Olympics of Information Science (IOI). He was honorably accepted by Tsinghua University and received his Ph.D. from the Massachusetts Institute of Technology (MIT). Now he is an Assistant Professor at the University of Toronto. Professor Andrew Yao, the only Chinese Turing Award winner, is the chief scientist of Conflux. Core developers and researchers are almost all gold medal winners. Plus the label of “Tsinghua Yao class”, this 50 people team seems a bit outstanding everywhere.
Like Algorand, Conflux is also started from an academic paper. The strong academic origin decides that every proposed technical solution must come along with a strict mathematical proof. Research Director Dr. Guang Yang explained, for public chain projects, which heavily involves cryptography and security, the core team usually tends to comprehend theoretical issues thoroughly, and make sure it’s mathematically reliable and practical, then commences the work. The most straightforward output of this proving process is the corresponding academic paper.
While addressing on the technique and the theory, Fan also feels powerless about the current chaotic situation of the industry. He finds that talking about technology and research means too little when this circle is still full of scammers and folk scientists. In the past, people often asked about the difference between Conflux and other DAG projects, and Fan would try to explain the technical differences cautiously. But after too many such occasions, he would directly say the difference is "they are wrong, and we are right." Right or wrong may be an eccentric expression in the eyes of blockchain space people. Here, it refers to whether the technical solution is theoretically feasible.
In this fierce competing space, the challenge is often all-round. Each and any form of a market, public offerings, ICO or exchange listing, is a colosseum for the team to fight across, as well as a place filled with retail investors who know nothing about risk and want to get rich overnight. Such retail investors often hold a strange logic: since your technology is so advanced, you can pump up your coin price and show it to me. Under this logic, the price drop is the original sin of the project. Once the market performance is not favorable, the doubts of retail investors will influx. This is also the cause of the recent third-generation public chain has encountered so many unexpected doubts. In other words, many teams actually underestimate the risks of the market. The “Professor Coin” is one great example. This concept was initially sought after, but now it is a label that warns retail investors not-to-touch. "Academic genes" may also become a label that people doubt about Conflux in the future. However, the Conflux team has publicly announced not to do any public offering in any form, and will not cooperate with any centralized exchange to IXO. Prudence seems to be the best strategy they are currently undertaking.
Fan is extremely confident on the technology path of Conflux. Their thoughts for community building are also different from most of the other projects. Fan told the Orange Book that after the mainet launch, besides DApp developer support etc. common practice, Conflux would endeavor to work with the government to establish an open platform for a wider audience. As the industry is still far from mature, there are still many blank spots in corresponding policy and regulation. Conflux hopes to provide an effective communication channel between developers and the government through this open development platform. "At least allow them ask what can be done and what cannot, which possibly would be a crucial requirement of the developers."
In fact, academic genes not only affect the work style of the team, but also shape the goals that Conflux wants to achieve. “One MIT holds as many IP addresses as that in China. Why? Because the United States is the rule maker of Internet game. In the future, blockchains might become as important as the Internet, thus making present a great opportunity at both the national strategic level and so many other levels.” When he was a Ph.D. student in MIT, Fan could clearly feel that it is not easy for Chinese researchers to gain deserved recognition from the American academic circle and society. Therefore, he possesses strong desires to prove his value. "We can beat them." Fan said.
Of course, Conflux is aware that the public chain is not just about the technology, what’s more important is the whole ecology. Fan took the operating system as an example. "Why can't China design its own operating system? Among people I know, there are at least forty or fifty who are capable. The difficulty lies in building an ecosystem with as much resources and developers as Android."
In contrast, the current blockchain ecosystem is further from complete. In Fan's perspective, the blockchain space is still just a big casino, "The major factor is, due to limitations in throughput, oracle and layer2, etc., the blockchain is still an isolated small ecology and can't interact with the world. Therefore we can only play something out of nothing, such as gambling."
To change this situation, the public chain, as the most important infrastructure, firstly needs to break through the biggest bottleneck, the throughput. Conflux intends to be where valuable bits are exchanged and validated, which means to bring enough information on chain, meanwhile greatly reduce the cost of the on-chain process by scalability.
Conflux adopts an unusual scalability approach. The intention is to encourage miners to actively participate in the block generation under PoW mechanism, and to increase the throughput of the whole network by parallel block generations from many nodes. Then, all the blocks are given a unified order through a Tree Graph structure, therefore all the traction sequence is confirmed as well as traction conflict is sorted out. This approach solves the scalability bottleneck by altering the ledger structure, which addresses more on scaling the consensus mechanism itself, and is way different from common practices such as sharding, layer-2 scalability, and state channel scalability.
"Under the premise of ensuring security, the performance of layer1 is far from fully developed. The efficiency of existing consensus algorithms is far behind the performance limit of current hardware. We aim to enable our consensus algorithm to fully utilize the computational power and the network transmission rate, when we reach the ceiling, we will consider the layer 2 or sharding. Moreover, Conflux is essentially not in conflict with these solutions, and they could all be added on Conflux."
In addition to exhaust the layer1 performance, another reason for adopting Tree Graph structure is, from Fan’s perspective, the technology of sharding is still immature. “Previously, Sharding only occurred when there were no attacker or opponent party. While sharding, people always assume one of the databases is down, but never think of the possibility of malicious attackers. So, if ETH 2.0 plans to realize sharding, there will be a lot of traps on the road. Huge technological difficulties. Would even be daunting."
Conflux hopes to realize scalability without sacrificing security, meanwhile improve the efficiency of consensus algorithms. Dr. Guang Yang, head of research, believes that the current performance of the blockchain is far from compulsory trade-offs. The rush to add sharding and other solutions will only introduce unnecessary complexity.
This view is very interesting. Since the year of 2017, there have been many changes in the entire public chain space. Everyone gave up the up-front fight with “the Impossible Triangle” and chose some roundabouts, including sharding, state channel, and layer2 scalability. But Conflux's perspective is, our minds are so much faster than our feet. We need to take a step back and firstly make the performance of layer1 to the ultimate. The progress of technology is sometimes slower than we think.
The public chain industry combines both technical and non-technical factors. More than often, the early advantage of the community will bring additional benefits and feedback. I don't know if Conflux is the right solution. However, they independently reflected on the existing public chain problems, and concluded their own results. This article hereby hopes to clearly introduce Conflux's Tree Graph based technical solution, thus invite a wider audience to discuss this different approach.
Start with Bitcoin and Ethereum
Bitcoin adopts the longest chain rule to determine the main chain. All forked blocks are abolished, and are not affecting the longest chain determination. The transactions from the forked blocks are not recorded, hence the miners are not rewarded. Therefore, the computing power invested in the discarded blocks does not contribute to either the security or the throughput of the system.
Ethereum upgraded the block generation time from Bitcoin’s 10 minutes to 15 seconds, which is a huge improvement. But at the same time, due to the shortened block generation time, the amount of forked blocks is greatly increased. In this case, it is not appropriate to continue the longest chain rule. Therefore, Ethereum adopts the GHOST algorithm to select the heaviest chain as its main chain. GHOST is a main chain selection protocol that selects the branch with the heaviest subtree when encountering forking.
So what is the difference between the heaviest chain and the longest chain?
Simple explanation is, one considers the amount of blocks, the other considers the length of the blocks:
Ethereum adopts a variant of the GHOST protocol. For the block reward, the forked block referenced by the subsequent block, which here refers as the "uncle block", will also bring certain rewards to the miners. This approach allows more computing power to contribute to the system security, moreover, will solve the centralization and fairness issue caused by large number of forked blocks to some level. The GHOST protocol integrates the forked blocks into one main chain as much as possible.
Paralleled Block Generation Idea of Conflux
Although Ethereum retains a little reward for the uncle block, the transactions within those uncle block are still invalid and are not recorded on the chain. This part of the throughput is essentially being wasted.
Conflux adopts a GHOST-like approach (which is called the GHAST algorithm) to select the main chain, also known as the "pivot chain." The difference is that the fork block is considered valid in Conflux. Not only the miners will be rewarded, but the transactions within the block are also recorded in the chain. This means that the throughput from the fork block is retrieved. Miners are able to generate new blocks in parallel without waiting for receiving all the latest blocks.
In order to retrieve the fork block, fully sequencing all the transactions on both the forked blocks and the normal blocks is the vital problem. If we are able to sort all the transactions in full order, we can solve the problem of forked transaction conflicts, and have a clean ledger at the end.
Example of Queuing
From a higher level perspective, Conflux's idea of scalability is actually quite simple.
Bitcoin needs to wait for the current block to end before generating another block. It's the same as queuing, everyone is crowded in the same line. In order to prevent potential chaos caused by overcrowding, Bitcoin chooses to increase the difficulty of PoW target to limit the speed of people (the speed of block generation). The speed of this queue is destined to be very slow.
In addition, the longest chain rule stipulates that the fork branch is invalid unless they can be longer than the current one. Although guarantees the security of the queue order, it sacrifices the efficiency of the queue.
Conflux's approach is to allow everyone to follow different queues and fork randomly. Suppose there are 10 queues at the same time, it’s still ok as long as we can finally find a way to determine the full order of each person in all these 10 queues.
Under this approach, miners are encouraged to generate blocks actively. Without exceedingly limits the amount of concurrent blocks, and fully utilize the bandwidth resources, the throughput of the entire system is greatly improved. Therefore we turn a scalability problem into a sequencing problem: how to determine the full order of all the blocks when the forking is allowed? In the 10 queues, how to determine who comes first?
The answer is hidden in the Tree Graph structure.
What is a tree structure?
Conflux's block structure is not a chain, but rather a graph, and the graph contains a tree. Therefore, Conflux calls it a Tree Graph structure.
The Tree Graph structure introduces two kinds of edges for each block, the parent edge and the reference edge.
Here the "edge" refers to the reference relationship among the blocks. When one block refers to another block, an edge is formed. In the example of queuing, the "edge" is equivalent to quoting the person who queues before you. Through this edge, we can know that the others are in front of you.
Conflux stipulates that each block has only one parent edge, which means when a miner packs a new block, he must choose one previous block as parent block. Only one parent block is allowed. Multiple parent blocks are not possible.
In addition to the parent edge, each block must reference itself to other unreferenced blocks on other forks in sights, thusly formed the reference edge. The role of the reference edge is to indicate that the referenced fork blocks are in front of the newly added block itself, which demonstrates the front and back sequence. Each block is with zero or more reference edges.
If you look at the parent edge, the entire Conflux structure is a tree.
If look at both the reference edge and the parent edge, the entire structure is a graph.
How to Sequence
Having figured out the Tree Graph structure, let's take a look at how the system sequence all the blocks while accepting the forks.
Sequencing is obviously crucial. Once a consistent and irreversible order is applied to all the blocks, then:
- It means that we can determine the validity of the transaction one by one according to the sequence. If two transactions are in conflict, the latter one is invalid;
- Double spend attack is also impossible, unless the attacker is able to create a heavier branch. But the required work load supasses all the blocks - including the fork blocks - from the original branch, not just the blocks on the longest chain. Therefore, all the computational power used to generate the fork blocks contributes to the security of the system.
Hence the problem becomes a sequencing algorithm. Because of the existence of the reference edge and the parent edge, this algorithm is fairly straightforward.
This sequencing algorithm needs to be solved in two parts.
Through the parent edge, a main chain can be identified by the heaviest chain rule. Here in Conflux we refer it as the Pivot Chain. In the above picture, the red blocks form a pivot chain.
The blocks on the pivot chain naturally form a sequence:: A -> C -> G -> J -> M -> N
How do we sequence the other yellow blocks?
This is when the reference edge comes into the picture. For instance, in the above figure, the block F references the block B, which means F saw the existence of B before F is generated. Thus, B should be placed before F.
This is easy to comprehend. You might ask, how to sequence B and D?
Here a new concept is introduced. Conflux introduces a concept called Epoch for blocks on the pivot chain, similar to "time period" and "phase".
For example, block A is the first block on the pivot chain, so A is in Epoch 0; block C is the second block of the pivot chain, so C belongs to Epoch 1, and so on.
For a block not on the pivot chain, its Epoch is determined according to the first pivot chain block that directly or indirectly refers to it. The specific algorithm is not complicated, and it can be explained clearly with a practical example. Let's take Epoch 3 as an example:
First, the pivot chain block J belongs to Epoch 3:
J's reference block F is also Epoch 3 because F was not sequenced before, meanwhile we know that F is in front of J according to the relationship of the reference edges;
F's parent block C was included in Epoch 2, hence C does not belong to Epoch 3. But F's reference block B was not sequenced, so B also belongs to Epoch 3. Also, according to the reference relationship, B ranks before F.
Therefore, the sequence of Epoch 3 internal blocks is: B -> F -> J
That is to say, for a reference block of a pivot chain block, and this reference block’s parent block along with this parent block’s reference block, if they have not been sequenced, then they belong to the same Epoch. There is also an exception to this situation, in which the two reference blocks are unable to determine the order of each other, then the hash ID will be used to sequence.
To sum up the sequencing:
- Sequence Epochs by the pivot chain
- Sequence the blocks within the same Epoch.
By this algorithm, a consistent sequence can be determined for all the blocks in ablove figure. You are encouraged to follow the steps above and see if you get the same result.
Analysis of the Conflux solution
Regardless of the blockchain system, any block, once generated, needs to be broadcasted to each node for synchronization through the whole network. As long as broadcast is involved, the transmission rate and load capacity of the network cannot be ignored as there is a bottleneck in transmitting data over the network. Assuming that the network bandwidth is a fixed value, the larger the data is, the slower the transmitting speed will be.
All broadcasted data = really useful blocks + invalid fork blocks + system running costs
In Conflux, the fork blocks are also valid blocks Conflux optimistically assumes that transactions on the fork blocks and transactions on the pivot chain are mostly non-conflicting, so theoretically, retaining all the blocks can make full use of bandwidth.
Of course, this solution is not completely problem-free. Let's see how Conflux solves some potential problems.
Duplicate Transaction in Block Packaging
The transactions in the fork block may not conflict. Given the computing calculation on the fork block is also honest calculation power, and it is a good miner who is doing the work, but what if the transactions in the fork block are repeated? In other words, how do we solve the problem of multiple honest miners repeatedly packing the same transactions? Obviously, if miners’ packed transactions are duplicates, then even if all the fork blocks are retained, it is still difficult to achieve the goal of increasing throughput.
For this problem, Conflux adopts a hybrid strategy: using the game theory, to encourage miners to randomly select transactions in the trading pool to pack for the purpose of maximizing the profit. When the amount of transactions in the trading pool is large (it is also when to increase throughput the most), the random packing strategy can greatly reduce the probability of repeatedly packing the transactions. If there is only one transaction in the trading pool, then it is certain that each miner would only pack this transaction, hence here is the highest rate of the transaction repeatability - in fact, this is the time when parallel packaging is needed the least.
Conflux's conclusion of theoretical calculation is that, by eliminating the inefficient throughput of duplicate transactions, the entire solution can still achieve more than 60% scalability upbringing in the ideal scenario of non-repetitive transactions.
As the heaviest chain rule is applied, if an attacker, by selectively hiding blocks, makes two branch chains weight similarly to an honest node, this honest node can never consistently select a pivot chain, moreover, the sequence of the transactions can not be confirmed. Although this is not a double spend attack, this situation still disable the whole system. This type of attack is also known as a Liveness Attack.
Conflux solves this problem by improving the GHOST algorithm to the aforementioned GHAST algorithm. In the case of an abnormality of a suspected attack, the GHAST algorithm switches to another mode, and allocates the block with a different weight according to the topological position of the block in the Tree Graph. In other words, this weighted topology structure is with similar security level to Bitcoin, which increases the security level of the entire system by slowing down the block generation speed and adding more weight on each block. Until the honest calculating power is restored, When the honest computing power starts to merge on pivot chain, GHAST will switch back to the original fast mode to prevent liveness attacks.
The overheads of the tree structure
The Tree Graph structure restores the reference relationships of multiple blocks. When the graph size expands, the whole structure will become very complicated. Therefore there will be an expensive calculation cost for calculating the topological relationship in the network. How can Conflux be more efficient when dealing with Tree Graph?
Ethereum adopts a modified GHOST algorithm, which only calculates seven-layer of the relationship. If all layer GHOST is to be taken into consideration, then the corresponding cost is O(N), where N is the number of all the blocks in the entire chain. Conflux introduces a new and more efficient algorithm that controls the complexity of adding blocks per step of the GHAST algorithm to O (logN). Please refer to Conflux academic paper and other materials for more information.
In Bitcoin, a large amount of computing power is needed to initiate an attack on fork chain, and no reward if the attack fails. The overall cost of attack is extremely high. All the fork blocks in the Conflux are valid. Although double-spend attack cannot be achieved by forking, if the attacker could gain reward from maliciously adding fork blocks, the system will also be affected. For example, the cost of increasing transaction confirmation time is negligible. Another possibility is to increase the complexity of the whole Tree Graph structure, thereupon increase the overall cost of the system. How would Conflux solve this problem?
The answer is to punish malicious attacker by Incentive Mechanism design. Under Conflux's pivot chain selection protocol, if an attacker intentionally adds blocks to the fork chain, he must pretend that he does not see some of the reference blocks and ignores other existing blocks. However, in Conflux's design, block rewards are not fixed: the more often a block disregards other blocks, the less reward it will get. Through this economic punishment, attackers must also bear economic losses in Conflux.
Conflux keeps all fork blocks by Tree Graph, thereby improving the overall throughput of the system, and realizing scalability under the premise of security. The overall idea is not complicated. However, in terms of engineering implementation, the overall sequencing of all transactions, the processing of complex topology of Tree Graph, the problem of block generation incentives, and how to reduce duplicate transactions, there are still many difficulties to be tackled by much profound technical and theoretical solutions. At present, Conflux has already launched and open-sourced the testnet, and welcome any interested personnel or party to try out.