Dialogue with Conflux: Much Room to Improve Layer 1 and the Trade-off of Public Chain not Arrived yet.
28 August, 2019
The office of Conflux is at Wudaokou, very close to Tsinghua University, Beijing. This is a public chain team with a strong academic background. The founder Dr. Fan Long began to learn programming at the age of 5 and won two gold medals in the International Olympics of Information Science (IOI). He was sent to Tsinghua University with merits later and received his Ph.D. from the Massachusetts Institute of Technology (MIT), and then he took the position as Assistant Professor at the University of Toronto. Professor Andrew Yao, the only Chinese Turing Award winner, is the chief scientist of Conflux. The technicians who study the core agreement in the team are almost award-winning straight A students. With the label of “Tsinghua Yao class”, the team of more than 50 people seems a bit special everywhere.
Like Algorand, Conflux is also the kind of project that started from a paper. Probably because of this academic gene, the team is very keen to see a strict mathematical proof when a new technology is proposed. Dr. Guang Yang, Research Director of Conflux and responsible for the theoretical research work, told the Orange Book that for projects concerning cryptography and security like the public chain, the team with academic background usually tend to figure out things in theory, proving that the program is mathematically reliable and practical, then gets to start working on it. The most straightforward output of this proving process is the corresponding academic paper.
While paying attention to the theory of technology, Fan is also helpless about the chaotic situation of the industry. He finds that this circle is full of scammers and folk scientists, and most time it is useless to talk about technology. In the past, people often asked about the difference between Conflux and other DAG projects, and Fan would try to explain the technical differences carefully, but after too many times of asking, he would directly tell them that the difference is "they are wrong, and we are right." Right or wrong may be a strange expression in the eyes of entrepreneurs in the currency circle. Here, it refers to whether the technical solution is theoretically feasible.
In the dangerous chain track, competition is often all-round. Any form of market, be it public offerings, ICOs or listed exchanges, is a colosseum for the team to fight one after another, and inside there are filled with retail investors who know nothing about risk and want to get rich overnight. Such retail investors often have a strange logic: since your technology is so good, then you can pull up the stock and show it to me. Under this logic, the fall of the currency price is the original sin of the project. Once the market performance is not favorable, the doubts of retail investors will influx.
This is also why the recent third-generation public chain has encountered so many unexpected doubts. Simply put, many teams actually underestimate the risks of the market. The “Professor Coin” is the best proof in the sense that it is the concept that was initially sought after extensively, but now it is a label that warns retail investors not-to-touch. "Academic genes" may also become a label that people doubt about Conflux in the future, however, the Conflux team has publicly promised that it will never go public, and will not cooperate with any centralized exchange to do IXO. “Prudence” seems to be the best strategy they are currently taking.
For the technology route it chose, Conflux, led by Fan, seems very confident. Their ideas for community building are also different from those of other teams. Fan told the Orange Book that after online launching, Conflux would perform some regular actions, such as supporting DAPP developers; besides that, it would try to work with government agencies to create an open development platform. Because the industry is still far from mature, there are still many gaps in policy. Conflux hopes to provide an effective communication channel between developers and the government through this development platform. "At least let them ask what can be done, what cannot, which may be a very important requirement for developers."
In fact, academic genes not only affect the style of the team's work, but also shape the goals that Conflux wants to achieve. “One MIT has as many IP addresses as that in China. Why? Because the United States is the maker of Internet game rules. In the future, blockchains may become as important as the Internet, and it is a good opportunity now at the national strategic level. When he was a Ph.D. student in North America, Fan could clearly feel that it is not easy for Chinese people to gain recognition on research from the American academic circle and society. Therefore, he has a strong desire to prove his value. "We can beat them." Fan said.
Of course, Conflux itself knows that the public chain is not just technology, more important is the whole ecology. Fan took the operating system as an example. "Why can't China design its own operating system? Among people I know, there are at least forty or fifty who can do this, because it is not difficult to develop an operating system, the really difficult thing is to build an ecology or an Android with so many resources and developers on it."
In contrast, the blockchain ecology at this stage is even more imperfect. In Fan's view, it is still just a big casino, "The core reason lies in throughput, oracle and layer2, etc., the blockchain is still an isolated small ecology and can't interact with other worlds, so we can only play something out of nothing, such as gambling."
To change this status quo, the public chain, as the most important infrastructure, needs to break through the biggest bottleneck-throughput. Conflux wants to be the place where valuable bits are exchanged and validated, which means that enough information needs to be chained and the cost of chaining needs to be greatly reduced by capacity expansion.
Conflux adopts a not-so-usual expansion idea. They hope that under the mechanism of POW, miners are encouraged to participate in the block-producing as much as possible, and the throughput of the entire network can be improved by block-producing in paralleled nodes, and then a unified order is determined for all the blocks by means of the tree structure. In this way, the order of all transactions is resolved and so are the transaction conflicts. This kind of expansion by changing the structure of the ledger is more focused on the extension of the consensus algorithm itself, and it is different from the more common stratification, fragmentation, state channel and other expansion schemes in the industry.
"Under the premise of ensuring security, the performance of layer1 is far from the ultimate. The efficiency of existing consensus algorithms is far behind from the performance of physical hardware. We want the consensus algorithm fully make use of the computational power and the rate of network transmission of the hardware, and after reaching the ceiling, we will consider the expansion schemes such as layer 2 and sharding. Moreover, Conflux is essentially not in conflict with these schemes, and they can all be superimposed on Conflux."
In addition to squeezing out the performance of layer1, another reason for Conflux's use of the treegraph scheme is that the technology of sharding is still immature in the eyes of Fan. “Sharding has always been in the environment without attackers and opponents in history. When we do sharding, we can only assume that a database is down and cannot respond, and never assume that there is also a malicious attacker inside. So if ETH 2.0 wants to achieve sharding, there will be a lot of traps to walk through, and huge technological setbacks. This will be a very frightening thing."
Conflux wants to scale up without sacrificing security and improve the efficiency of consensus algorithms. Guang believes that the current performance of the blockchain is far from being a trade-off. The rush to add sharding and other solutions will only introduce unnecessary complexity.
This view is very interesting. Since the year of 2017, there have been many changes in the entire public chain track. Everyone gave up the fight with “the Impossible Triangle” and chose to try some roundabout technical routes, including sharding, state channel, and more expansion on layer2. “But Conflux's idea is that we are thinking too much faster than walking on our feet. We need to take a step back and first make the performance of layer1 to the ultimate. The progress of technology is sometimes slower than we think.”
The public chain is an industry in which technical and non-technical factors are superimposed. In many cases, the community power accumulated by the first-mover advantage will bring additional benefits and feedback. I don't know if Conflux is the right solution, but they have their own thoughts on the issue of the public chain and already give their own answers. This article hopes to introduce Conflux's treegraph-based technical solutions and let more people participate in the discussion of this different ideas.
Let’s start with Bitcoin and Ethereum
As we all know, Bitcoin determines the main chain by picking the longest one, and this is the longest chain algorithm. All of the Bitcoin's forked blocks will be treated as abolished, not affecting choosing the longest chain, and the trade based on them does not count in the trading history, so the miners who dig into the scrap will not receive any reward. Therefore, the power invested in the waste block does not add any security or throughput to the system.
Ethereum changed the time of the block-producing from 10 minutes for Bitcoins to 15 seconds, which is a huge improvement, but also because the time of the block is shortened, the number of forked blocks is greatly increased. In this case, it is not appropriate to continue to select the longest chain as the main chain. Therefore, Ethereum uses the GHOST algorithm to select the heaviest chain as its main chain. GHOST is a main chain selection protocol that selects the branch with the heaviest subtree when it encounters a fork.
So what is the difference between the heaviest chain and the longest chain?
To put it simply, one looks at the number of blocks, the other looks at the length of the blocks:
Ethereum used a variant of the GHOST protocol. In the block reward, the forked block referenced by the subsequent block, that is, the "uncle block" will also bring certain rewards to the miners who dig out these uncle blocks. In this way, more computing power can contribute to the security of the system. On the other hand, it can solve some centralization and fairness problems caused by a large number of forked blocks. The GHOST protocol will make the forked blocks as uniform as possible into one main chain.
Paralleled block-producing ideas of Conflux
Although Ethereum retains a little reward for the uncle block, the transactions in the uncle block are still invalid and are not recorded on the chain. This part of the throughput is equivalent to being waste.
Conflux uses a GHOST-like approach (their own algorithm is called the GHAST algorithm) to select the main chain, also known as the "pivot chain." The difference is that the fork block is valid in the eyes of Conflux. Not only the miners who dig these forks can get a block-producing reward, but the transactions within the block are also recorded in the chain. This is equivalent to retrieve the wasted throughput on the fork block. Miners don't have to wait for receiving all the latest blocks to produce paralleled blocks.
The problem that needs to be solved by retrieving the fork block becomes the sorting problem of all transactions on the forked blocks and the normal blocks. Because if we can put all the transactions out of a sequence, we can solve the problem of forked transaction conflicts, and finally get a clean ledger.
From a higher level perspective, Conflux's idea of expansion is actually quite simple.
Bitcoin needs to wait for the current block to end before producing another block. It's like queuing, everyone is crowded in the same line. In order to prevent confusion caused by a lot of people, Bitcoin chooses to increase the difficulty of PoW's target to limit the speed of people (ie, the speed of the block-producing), so the speed of this line is destined to be very slow.
In addition, the longest chain rule stipulates that the forked line is invalid unless they can be longer than the current one. Although this guarantees the security of the queue order, it also loses the efficiency of the queue.
Conflux's approach is to allow everyone to follow different lines and fork randomly. Suppose there are 10 lines at the same time, so long as we can finally find a way to determine the order of each person in the 10 lines.
With this kind of thinking, miners can be encouraged to make more block actively and reduce the limitations on the number of block produced at the same time in the system, so as to make full use of the bandwidth resources and improve the throughput of the entire system. In this way, we turn a expansion problem into a sorting problem: how to determine the order of all blocks when the fork block is allowed to exist? In 10 lines, how do you determine who will come first?
The answer is hidden in the tree structure.
What is a tree structure?
Conflux's block structure is not a chain, but a graph, and the graph contains a tree. Therefore, Conflux calls it a tree diagram structure.
Conflux's treegraph structure introduces two sides for each block, one is called the parent side and the other is called the reference edge.
The so-called "edge" refers to the reference relationship of the block. When one block refers to another block, an edge is created. In the example of queuing, "edge" is equivalent to quoting the person who queues before you. Through this edge, we can know that the others are in front of you.
Conflux stipulates that each block has one and only one parent edge, which means when a miner packs a new block, he must choose a block in front of him as his father, and can only choose one father, can not choose multiple fathers.
In addition to the parent side, each block must also reference the unreferenced block it sees on the other forks as the reference edge. The role of the reference edge is to indicate that the referenced fork blocks are placed in front of themselves when a block is added, showing a sequence. Each block can have zero or more reference edges.
If you look at the parent edge, the entire Conflux structure is a tree.
If you look at both the reference edge and the parent edge, the entire structure is a graph.
How to determine the order
Having figured out the treegraph structure of Conflux, let's discuss how the system determines the ordering of these blocks in the case where the forked blocks are all accepted.
Sorting is very important, if we can determine a "consistent" and "cannot be tampered" order for all blocks, then
- It means that we can determine the validity of the transaction one by one according to the order. If the two transactions conflict, the latter one is invalid; 2. Double spend attack is also impossible. Unless the attacker can create a heavier branch, which requires more work than all the blocks on the original branch including the forked block—not just the block on the longest chain. Therefore, all of the computational power used to generate the fork blocks contributes to the security of the system.
So the problem becomes a sorting algorithm one. Because of the existence of reference and parent edges, this sorting algorithm is actually very simple.
The sorting algorithm needs to be done in two parts.
Through the parent edge, we can first find a main chain according to “heaviest chain principle”. Conflux calls this main chain a pivot chain. In the above picture, the red block forms a pivot chain.
The blocks on the pivot chain can naturally manage to rank in a sequence: A -> C -> G -> J -> M -> N
How do we sort other yellow blocks?
At this time, the order of “reference edge” has its role to play. For example, in the above figure, the F block references the B block, which means that it has already observed the existence of the B block before the F block is generated, so the B block should be placed before F.
This situation is well understood, you might ask: How do you order B and D? Here we need to introduce a new concept. Conflux introduces a concept called Epoch for blocks on the pivot chain, like the meaning of "time period" and "stage".
For example, block A is the first block on the pivot chain, so block A is in "stage 0"; block C is in the second block of the pivot chain, so block C belongs to "stage 1" and so on.
For a block other than the pivot chain, its stage is determined according to the first pivot chain block that directly or indirectly refers to it. The specific algorithm is not complicated, we can explain it clearly with a practical example. Let's take "Phase 3" as an example:
First, the block J on the pivot chain belongs to "Phase 3":
J's reference block F is also "stage 3" because it has not been sorted before, and we know that F should be in front of J according to the relationship of the reference edges; `
F's father block C has been included in "Phase 2", so it does not belong to "Phase 3", but F's reference block B has not been sorted, so B also belongs to "Phase 3". Also, according to the reference relationship, B ranks before F.
Therefore, the order of internal block in the "Phase 3"is：B -> F -> J
That is to say, the reference block of the pivot chain, the parent block and the reference block of this referenced block, if they have not been sorted, then they belong to the same stage. There is also an exception to this situation, in which two reference blocks are unable to determine the order of each other in the same phase, then the hash ID is used to sort.
To sum up, when sorting:
1 First use the pivot chain to rank a sequence for different stages 2 Sort the blocks in the same stage.
Using this algorithm, we can arrange a certain order for all the blocks in the above figure, including red and yellow. You can follow the steps above to see if you get the same result.
Analysis of the Conflux solution
Regardless of the blockchain system, any block that is already produced needs to be broadcasted to each node for synchronization through the network. As long as broadcast is involved, the transmission rate and load capacity of the network cannot be ignored because there is a bottleneck in transmitting data over the network. Assuming that the network bandwidth is a fixed value, the larger the data is, the slower the speed of the transmitting when broadcasting.
All broadcasted data = really useful block + invalid fork block + system's own overheads
In the Conflux system, the fork block is also a valid block. Conflux optimistically assumes that transactions on the forked block and transactions on the main chain are mostly non-conflicting, so theoretically, retaining all blocks can make full use of bandwidth.
Of course, this kind of solution is not completely without problems. Let's see how Conflux solves some potential problems.
Duplicate transaction which is already packed
The transactions in the fork block may not conflict. The power calculation on the fork block is also an honest calculation power and it is a good person who is digging, but what if the transaction in the fork block is repeated? In other words, how do we solve the problem of multiple honest miners repeating the same transaction? Obviously, if everyone's packed transactions are duplicates, then even if all the forked blocks are retained, it is difficult to achieve the goal of increasing throughput.
For this problem, Conflux adopts a hybrid strategy, using the game mechanism to allow miners to randomly select transactions in the trading pool to pack to maximize profit. When the number of transactions in the trading pool is quite large (it is also the time when we need to increase throughput most), the random packing strategy can greatly reduce the probability of repeating transactions. If there is only one transaction in the trading pool, then it is certain that each miner can only package this transaction, so the transaction repeatability is the highest - in fact, this is the time when you need to use parallel packaging least to improve throughput.
Conflux's theoretically tested conclusion is that, by eliminating the inefficient throughput of duplicate transactions, the entire solution can still achieve more than 60% expansion in ideal non-repetitive transactions.
Liveness active attacks
Because the heaviest chain is used as the main chain, if an attacker makes the honest node see the two sub-chains with similar weight by selectively hiding the block, the honest node can never select a pivot chain consistently, even cannot confirm the order of transactions. Although this is not a double spend attack, it can put the entire system into an unusable state where the transaction cannot be confirmed. This type of attack is also known as a liveness active attack.
Conflux solves this problem by improving the GHOST algorithm, which is the GHAST algorithm mentioned earlier. In the case of an abnormality of a suspected attack, the algorithm switches to another mode and gives the block different weights according to the topological position of the block in the tree diagram. In simple terms, this weighted topology has similar security to Bitcoin, increasing the security of the entire system by slowing block producing speed and putting more weight on each block. Until the honest calculating power is restored, GHAST will switch to the original fast mode to prevent active attacks.
The overheads of the tree structure
The tree diagram structure retains the reference relationship of many blocks. When a graph is very large, the whole structure will be very complicated, and it is costly to calculate the topological relationship in the network. How can Conflux be more efficient when dealing with tree structure?
Ethereum uses the modified GHOST algorithm, which only counts the seven-layer relationship. If you want to calculate the GHOST of all layers, you need the overheads of O(N) complexity, where N is the number of all blocks in the entire chain. Conflux uses a new, more efficient algorithm that controls the complexity of adding blocks per step of the GHAST algorithm to O (logN). If you are interested, you can read Conflux-related papers and materials.
In bitcoin, the attacker of the forked chain needs to consume a lot of computing power. Once the failure does not receive any reward, the attack requires a high cost. All the forked blocks in the Conflux network are valid, although the attacker can't achieve the double-spend attack through the fork, but if the attacker can get the block reward from the maliciously added forked block, this will also have a certain impact on the system. For example, an attacker can increase the transaction confirmation time without having to bear the loss or increase the complexity of the entire tree structure a lot to increase the overhead of the entire system. How does Conflux solve this problem?
The answer is to use incentives to punish those who deliberately fork. Under Conflux's main chain selection protocol, if an attacker intentionally adds a block to the forked chain, he must pretend that he does not see some reference blocks and ignores other blocks that already appear. In Conflux's design, block rewards are not fixed: the more the block “disregards other blocks”, the less reward it can get. Through this economic punishment, attackers must also bear economic losses in Conflux.
Conflux maintains the bifurcation block generated by the miner's consumption of POW through the structure of the tree diagram, thereby improving the throughput of the system and achieving capacity expansion under the premise of security. The overall idea is not complicated, but in terms of engineering implementation, the global ordering of all transactions, the processing of complex topology of tree diagrams, the problem of block incentives, and how do we reduce duplicate transactions, there are many pits that need to be solved by more complicated technical theory. At present, Conflux has already launched the test network, if you’re interested, you can have a go.