How to Build a High-Performance Blockchain

Source: Aptos Labs
Since the advent of computing technology, engineers and researchers have been continuously exploring how to push computing resources to the performance limit, aiming to maximize efficiency while minimizing the latency of computing tasks. The two pillars of high performance and low latency have always shaped the development of computer science, influencing a wide range of fields from CPUs, FPGAs, and database systems to more recent artificial intelligence infrastructure and blockchain systems. In the pursuit of high performance, pipeline technology has become an indispensable tool. Since the introduction of pipeline technology in the IBM System/360 in 1964 [1], it has been a core of high-performance system design, driving key discussions and innovations in the field.
Pipelining is applied not only in hardware but also widely in databases. For example, David DeWitt and Jim Gray described pipeline parallelism in "Parallel Database Systems: The Future of High Performance Database Systems" [2]. This method breaks a complex database query into multiple stages and runs them concurrently, improving efficiency and performance. Pipelining is equally vital in artificial intelligence, especially in widely used deep learning frameworks such as TensorFlow, which uses data pipeline parallelism for data preprocessing and loading, ensuring a smooth flow of data for training and inference and making AI workflows faster and more efficient [3].
Blockchain is no exception. Its core function is similar to that of a database: processing transactions and updating state, with the added challenge of Byzantine fault-tolerant consensus. The key to improving blockchain throughput (transactions per second) and reducing latency (time to finality) lies in optimizing how the different stages (ordering, execution, submission, and transaction synchronization) interact under high load. This challenge is particularly acute in high-throughput scenarios, where traditional designs struggle to maintain low latency.
To explore these concepts, let's consider a familiar analogy: the automobile factory. Understanding how the assembly line has revolutionized manufacturing can help us grasp the evolution of the blockchain pipeline—and why next-generation designs like Zaptos [8] are pushing blockchain performance to new heights.
From Automobile Factory to Blockchain
Imagine you are the owner of an automobile factory with two main goals:
· Maximize throughput: Assemble as many cars as possible every day.
· Minimize latency: Reduce the build time of each car.
Now, consider three types of factories:
Simple Factory
In a simple factory, a group of versatile workers systematically assembles a car. One worker assembles the engine, the next worker installs the wheels, and so on—producing only one car at a time.
The issue? Workers often sit idle while waiting for their turn, so overall production efficiency is low: no two workers ever work on different parts of the same car simultaneously.
Ford Factory
Enter the Ford assembly line[4]! Here, each worker focuses on a single task. The car moves along a conveyor belt, and as each car passes through, a dedicated worker adds their part.
The result? Multiple cars are at different assembly stages simultaneously, and all workers are busy. Throughput increases significantly, but each car still has to pass through each worker sequentially, so the latency per car remains the same.
Magic Factory
Imagine a magic factory where all workers can work on a single car simultaneously! No longer needing to move the car from one station to the next, each part of the car is built simultaneously.
The outcome? The car is assembled at a record speed, with every step happening in sync. This is the ideal scenario to address throughput and latency issues.
Alright, enough about car factories—what about blockchain? As it turns out, designing a high-performance blockchain is not so different from optimizing an assembly line.
Blockchain as a Car Factory
In blockchain, processing a block is akin to assembling a car. The analogy goes as follows:
· Worker = Validator Resource
· Car = One Block
· Assembly Task = Consensus, Execution, and Submission stages
Just as in a simple factory where only one car is processed at a time, if a blockchain were to handle only one block at a time, it would result in underutilization of resources. In contrast, modern blockchain designs aim to emulate the Ford assembly line—processing multiple blocks in different stages simultaneously. This is where pipeline technology shines.
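The three factories can be compared with a toy timing model. The sketch below is only an illustration under simplifying assumptions: every block needs the same fixed time per stage, stages never stall, and the function names and stage times are hypothetical rather than taken from any Aptos code.

```python
def simple_factory(num_blocks, stage_times):
    """One block at a time: a block starts only after the previous one finishes."""
    per_block = sum(stage_times)
    return {"latency": per_block, "makespan": num_blocks * per_block}


def ford_factory(num_blocks, stage_times):
    """Pipelined: each stage works on a different block; the slowest stage sets the pace."""
    per_block = sum(stage_times)
    bottleneck = max(stage_times)
    return {"latency": per_block,
            "makespan": per_block + (num_blocks - 1) * bottleneck}


def magic_factory(num_blocks, stage_times):
    """Idealized: all stages of one block overlap, so a block takes as long as its slowest stage."""
    bottleneck = max(stage_times)
    return {"latency": bottleneck, "makespan": num_blocks * bottleneck}


if __name__ == "__main__":
    stages = [3, 2, 1]  # hypothetical time units for three assembly tasks
    for factory in (simple_factory, ford_factory, magic_factory):
        print(factory.__name__, factory(10, stages))
```

In this model the Ford factory shortens the total time for many blocks but leaves per-block latency unchanged, while the magic factory also shortens per-block latency.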
Evolution of Blockchain Pipelines
Traditional Architecture: Sequential Blockchain
Imagine a blockchain that processes blocks sequentially. Validators need to:
1. Receive block proposals.
2. Execute blocks to update the blockchain state.
3. Proceed with achieving consensus on that state.
4. Persist the state to the database.
5. Initiate the consensus for the next block.
Where is the problem?
· Execution and submission are in the critical path of the consensus process.
· Each consensus instance needs to wait for the previous one to complete before starting.
This setup is akin to factories of the pre-Ford era: workers (resources) often idle as they focus on only one block (car) at a time. Unfortunately, many existing blockchains still fall into this category, leading to low throughput and high latency.
Aptos: Parallelizing Performance
Diem introduced a pipeline architecture that decouples execution and submission from the consensus phase, with the consensus phase itself also adopting a pipeline design.
· Asynchronous Execution and Submission [5]: Validators first agree on a block, then execute the block based on the parent block's state. Once validated by a quorum of validators, the state is persisted to storage.
· Pipeline Consensus (Jolteon[6]): New consensus instances can start before the previous one completes, akin to a moving assembly line.
This enhancement allows different blocks to be in different stages simultaneously, increasing throughput and significantly reducing block times to just 2 message delays. However, Jolteon's leader-based design may lead to bottlenecks as the leader can become overloaded during transaction dissemination.
Aptos further optimizes the pipeline through Quorum Store[7], a mechanism that decouples data distribution from consensus. Quorum Store no longer relies on a single leader to broadcast large data blocks in the consensus protocol but separates data distribution from metadata ordering, allowing validators to asynchronously and concurrently distribute data. This design leverages the total bandwidth of all validators, effectively eliminating leader bottlenecks in consensus.

Visualization: How Quorum Store balances resource utilization in leader-based consensus protocols.
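A back-of-the-envelope calculation shows why spreading dissemination across validators relieves the leader. This is a rough sketch with assumed numbers: it ignores metadata, availability proofs, and network topology, and the function names are hypothetical.

```python
def leader_upload_bytes(payload_bytes, num_validators):
    """Leader-based dissemination: the leader sends the full payload to every other validator."""
    return payload_bytes * (num_validators - 1)


def per_validator_upload_bytes(payload_bytes, num_validators):
    """Spread dissemination: each validator sends an equal 1/n share to every other validator."""
    return payload_bytes * (num_validators - 1) / num_validators


if __name__ == "__main__":
    payload = 1_000_000  # assumed 1 MB of transaction data per round
    n = 100              # assumed number of validators
    print("leader uploads:", leader_upload_bytes(payload, n))
    print("each validator uploads:", per_validator_upload_bytes(payload, n))
```

Under these assumptions the busiest node uploads n times less data when every validator shares the work, which is the intuition behind using the total bandwidth of all validators.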
Thus far, the Aptos blockchain has built the "Ford Factory" of blockchains. Just as Ford's assembly line revolutionized car manufacturing—different cars in different stages simultaneously—Aptos processes different blocks in different stages concurrently. Each validator's resources are fully utilized, ensuring no part of the process remains idle. This clever arrangement has led to a high-throughput system, making Aptos a robust platform for efficiently and scalably handling blockchain transactions.

Illustration: Pipelined Processing of Sequential Blocks in the Aptos Blockchain. Validators can pipeline process different stages of sequential blocks to maximize resource utilization and increase throughput.
While throughput is crucial, end-to-end latency (the time from transaction submission to final confirmation) is equally important. For applications such as payments, decentralized finance (DeFi), and gaming, every millisecond counts. Many users have experienced delays during high-traffic events because each transaction must pass sequentially through a series of stages: communication from client to full node to validator, consensus, execution, state validation, submission, and full node synchronization. Under high load, stages like execution and full node synchronization introduce additional latency.

Illustration: Pipeline Architecture of the Aptos Blockchain. The diagram shows client Ci, full node Fi, and validator Vi. Each box represents a stage a transaction block in the blockchain must go through from left to right. The pipeline consists of five stages: consensus (including dissemination and ordering), execution, validation, submission, and full node synchronization.
It's like a Ford factory: while the assembly line maximizes overall throughput, each car still needs to pass through each worker sequentially, resulting in longer completion times. To truly push blockchain performance to the limit, we need to build a "magic factory" where these stages run in parallel.
Zaptos: Towards Optimal Blockchain Latency
Zaptos[8] further reduces latency through three key optimizations without sacrificing throughput.
· Optimistic Execution: Reduces pipeline latency by starting execution as soon as a block proposal is received. Validators add the block to the pipeline immediately and speculatively execute it once the parent block has finished executing. Full nodes, upon receiving the proposal from a validator, also execute optimistically so that they can validate the state proof.
· Optimistic Submission: Writes state to storage immediately after block execution, even before state validation. When validators eventually validate the state, only minimal updates are needed to complete the submission. If a block is ultimately never ordered by consensus, its optimistically submitted state is rolled back to preserve consistency.
· Fast Verification: Validators send validation messages concurrently with the final consensus round, beginning verification of the executed block's state without waiting for consensus to complete. In common scenarios, this optimization reduces pipeline latency by one round.
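The write-early, roll-back-if-needed idea behind the optimistic stages can be sketched in a few lines. This is a minimal illustration, not the Aptos storage implementation: the class and method names are hypothetical, and state is modeled as a plain dictionary.

```python
class SpeculativeStore:
    """Toy key-value store that accepts writes before a block is validated."""

    def __init__(self):
        self.committed = {}  # state confirmed after validation
        self.pending = {}    # block_id -> writes stored optimistically

    def optimistic_commit(self, block_id, writes):
        # Persist the block's writes right after execution, before validation.
        self.pending[block_id] = dict(writes)

    def finalize(self, block_id):
        # Validation succeeded: only a small bookkeeping step remains.
        self.committed.update(self.pending.pop(block_id))

    def rollback(self, block_id):
        # The block was never ordered: discard its speculative writes.
        self.pending.pop(block_id, None)


if __name__ == "__main__":
    store = SpeculativeStore()
    store.optimistic_commit("block-1", {"alice": 90, "bob": 10})
    store.optimistic_commit("block-2", {"carol": 5})
    store.finalize("block-1")   # block-1 was ordered and validated
    store.rollback("block-2")   # block-2 was never ordered
    print(store.committed)
```

Keeping speculative writes separate from confirmed state is one simple way to make rollback cheap; a real system would also have to handle dependencies between blocks.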

Illustration: Parallel Pipeline Architecture of Zaptos. Stages other than consensus are effectively hidden within the consensus stage, reducing end-to-end latency.
Through these optimizations, Zaptos effectively hides the latency of other pipeline stages within the consensus stage. Thus, if a blockchain adopts an optimal latency consensus protocol, the overall blockchain latency can also reach an optimum!
Talk is Cheap, Show Me the Data
We evaluated Zaptos' end-to-end performance through geographically distributed experiments, with Aptos as the high-performance baseline. For more details, refer to the paper [8].
On Google Cloud, we simulated a globally decentralized network of 100 validators and 30 full nodes distributed across 10 regions, using commercial-grade machines similar to those used in the Aptos deployment.
Throughput-Latency

Figure: Performance comparison of the Zaptos and Aptos blockchains.
The figure above compares end-to-end latency against throughput for the two systems. Both show latency rising gradually as load increases, with a sharp spike at maximum capacity. However, Zaptos consistently shows more stable latency before reaching peak throughput, reducing latency by 160 milliseconds under low load and by over 500 milliseconds under high load.
Impressively, Zaptos achieves sub-second latency at 20k TPS in a production-grade, mainnet-like environment, a breakthrough that makes real-world applications requiring speed and scalability possible.
Latency Breakdown

Figure: Latency breakdown of the Aptos blockchain.

Figure: Latency breakdown of Zaptos.
The latency breakdown charts detail the duration of each stage for validators and full nodes in the pipeline. Key insights include:
· Up to 10k TPS: Zaptos' overall latency is nearly equal to its consensus latency, as the optimistic execution, validation, and optimistic submission stages are effectively "hidden" within the consensus stage.
· Above 10k TPS: Due to increased optimistic execution and full node synchronization time, non-consensus stages become more significant. Nevertheless, Zaptos significantly reduces overall latency by overlapping most stages. For example, at 20k TPS, the baseline total latency is 1.32 seconds (consensus 0.68 seconds, other stages 0.64 seconds), while Zaptos is 0.78 seconds (consensus 0.67 seconds, other stages 0.11 seconds).
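The 20k TPS figures quoted above can be checked with a few lines of arithmetic. The numbers come from the text; the variable names and the rounding are choices made for this sketch.

```python
# Latency in seconds at 20k TPS, as quoted in the breakdown above.
baseline = {"consensus": 0.68, "other": 0.64}
zaptos = {"consensus": 0.67, "other": 0.11}


def total(parts):
    """Sum the stage latencies, rounded to 2 decimals to avoid float noise."""
    return round(sum(parts.values()), 2)


saved = round(total(baseline) - total(zaptos), 2)
saved_pct = round(100 * saved / total(baseline), 1)
other_saved = round(baseline["other"] - zaptos["other"], 2)
other_saved_pct = round(100 * other_saved / baseline["other"], 1)

if __name__ == "__main__":
    print("baseline total:", total(baseline))  # 1.32
    print("zaptos total:", total(zaptos))      # 0.78
    print("end-to-end saving:", saved, "s =", saved_pct, "%")
    print("non-consensus saving:", other_saved, "s =", other_saved_pct, "%")
```

Consensus latency is nearly identical in both systems, so almost the entire 0.54-second saving (about 41% of the baseline total) comes from overlapping the non-consensus stages.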
Conclusion
The evolution of blockchain architecture parallels the transformation in manufacturing—from simple sequential workflows to highly parallelized assembly lines. Aptos's assembly line approach has significantly increased throughput, while Zaptos goes further, reducing latency to sub-second levels, all while maintaining high TPS. Just as modern computing architectures leverage parallelism to maximize efficiency, blockchain must continuously optimize its design to eliminate unnecessary delays. By comprehensively optimizing the blockchain pipeline to achieve minimal latency, Zaptos paves the way for real-world blockchain applications that require speed and scalability.
References
[1] Gene M. Amdahl, Gerrit A. Blaauw, and Frederick P. Brooks. 1964. "Architecture of the IBM System/360." IBM Journal of Research and Development. https://doi.org/10.1147/rd.82.0087
[2] David DeWitt and Jim Gray. 1992. "Parallel Database Systems: The Future of High Performance Database Systems." Communications of the ACM. https://doi.org/10.1145/129888.129894
[3] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin et al. 2016. "TensorFlow: a System for Large-Scale Machine Learning." In 12th USENIX symposium on operating systems design and implementation (OSDI). https://arxiv.org/abs/1605.08695
[4] The Moving Assembly Line and the Five-Dollar Workday. https://corporate.ford.com/articles/history/moving-assembly-line.html
[5] Zekun Li and Yu Xia. 2021. DIP-213 - Decoupled Execution. https://github.com/diem/dip/blob/7dc44ee57bb7efe76559f05dcc6851d97e2d3149/dips/dip-213.md
[6] Rati Gelashvili, Lefteris Kokoris-Kogias, Alberto Sonnino, Alexander Spiegelman, and Zhuolun Xiang. 2022. "Jolteon and Ditto: Network-Adaptive Efficient Consensus with Asynchronous Fallback." In International conference on financial cryptography and data security (FC). https://arxiv.org/abs/2106.10362
[7] Quorum Store: How Consensus Horizontally Scales on the Aptos Blockchain. https://medium.com/aptoslabs/quorum-store-how-consensus-horizontally-scales-on-the-aptos-blockchain-988866f6d5b0
[8] Zhuolun Xiang, Zekun Li, Balaji Arun, Teng Zhang, and Alexander Spiegelman. 2025. "Zaptos: Towards Optimal Blockchain Latency." arXiv preprint arXiv:2501.10612. https://arxiv.org/abs/2501.10612