The Holy Grail of Bitcoin’s Layer 2 Network: Introspection and Covenants

July 13, 2024

Compared to Turing-complete blockchains like Ethereum, Bitcoin’s scripting language is deliberately limited: it supports only basic operations and lacks even multiplication and division. More importantly, scripts have almost no access to the blockchain’s own data, which severely limits flexibility and programmability. Consequently, people have long sought ways to enable introspection for Bitcoin scripts.

Introspection refers to the ability of Bitcoin scripts to inspect and constrain transaction data. This allows scripts to control the use of funds based on specific transaction details, enabling more complex functionalities. Currently, most Bitcoin opcodes either push user-provided data onto the stack or manipulate data already on the stack. Introspection opcodes, however, can push current transaction data (such as timestamps, amounts, and transaction IDs) onto the stack, allowing for more granular control over UTXO spending.

There are currently only three main opcodes in Bitcoin scripts that support introspection: CHECKLOCKTIMEVERIFY, CHECKSEQUENCEVERIFY, and CHECKSIG. CHECKSIG has several variants, including CHECKSIGVERIFY, CHECKSIGADD, CHECKMULTISIG, and CHECKMULTISIGVERIFY.

Covenants are restrictions on how tokens can be transferred. Users can specify how UTXOs are distributed through covenants, many of which are implemented using introspection opcodes. Currently, Bitcoin Optech categorizes introspection under covenant entries.

Bitcoin currently has two covenants: CSV (CheckSequenceVerify) and CLTV (CheckLockTimeVerify), both of which are time-based covenants and form the foundation of many scaling solutions, such as the Lightning Network. This indicates that Bitcoin’s scaling solutions heavily rely on introspection and covenants.
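
To make the idea of a time-based covenant concrete, here is a minimal sketch of the comparison CLTV performs against the spending transaction, loosely following BIP-65. The function and parameter names are illustrative assumptions, and number parsing and other consensus details are omitted.

```python
LOCKTIME_THRESHOLD = 500_000_000  # below: block height; at or above: Unix time
SEQUENCE_FINAL = 0xFFFFFFFF


def cltv_passes(script_locktime: int, tx_locktime: int, input_sequence: int) -> bool:
    """Simplified OP_CHECKLOCKTIMEVERIFY check (illustrative, not consensus code)."""
    if script_locktime < 0:
        return False
    # Both values must be of the same kind (block height vs. timestamp).
    same_kind = (script_locktime < LOCKTIME_THRESHOLD) == (tx_locktime < LOCKTIME_THRESHOLD)
    if not same_kind:
        return False
    # Introspection step: the script reads the spending transaction's nLockTime.
    if tx_locktime < script_locktime:
        return False
    # A final sequence number would disable nLockTime for this input.
    return input_sequence != SEQUENCE_FINAL
```

The script never sees the whole transaction; it only constrains a single field, which is why CLTV and CSV count as narrow, single-purpose introspection.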

How to Add Conditions to Token Transfers

In the crypto world, the most common method is through commitments, often implemented using hashes. To prove that the transfer conditions are met, signature mechanisms are used for verification. Therefore, covenants involve numerous adjustments to hashes and signatures.

The following are widely discussed covenant opcode proposals:

CTV (CheckTemplateVerify) BIP-119

CTV (CheckTemplateVerify), specified in BIP-119, is a highly debated Bitcoin upgrade proposal. CTV allows an output script to specify a template for the transaction that spends it, covering fields such as nVersion, nLockTime, scriptSig hash, input count, sequences hash, output count, outputs hash, and input index. The template is enforced through a hash commitment: when the output is spent, the opcode hashes these fields of the spending transaction and checks that the result matches the hash committed in the output script. In effect, the template fixes when, how, and in what amounts the UTXO can later be spent.
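
The following sketch hashes the fields listed above into a single commitment. It is a simplified illustration, not the byte-exact BIP-119 algorithm: the Tx and TxOut classes, the one-byte script length prefix, and the function name template_hash are assumptions made for this example.

```python
import hashlib
import struct
from dataclasses import dataclass, field


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


@dataclass
class TxOut:
    value: int            # amount in satoshis
    script_pubkey: bytes  # assumed shorter than 253 bytes in this sketch


@dataclass
class Tx:
    version: int
    locktime: int
    sequences: list       # one nSequence per input
    outputs: list         # list of TxOut
    script_sigs: list = field(default_factory=list)  # empty for typical SegWit spends


def template_hash(tx: Tx, input_index: int) -> bytes:
    """Simplified template hash; note that no input TXID is included."""
    ser_outputs = b"".join(
        struct.pack("<q", o.value) + bytes([len(o.script_pubkey)]) + o.script_pubkey
        for o in tx.outputs
    )
    parts = [struct.pack("<i", tx.version), struct.pack("<I", tx.locktime)]
    if any(tx.script_sigs):
        parts.append(sha256(b"".join(tx.script_sigs)))
    parts += [
        struct.pack("<I", len(tx.sequences)),
        sha256(b"".join(struct.pack("<I", s) for s in tx.sequences)),
        struct.pack("<I", len(tx.outputs)),
        sha256(ser_outputs),
        struct.pack("<I", input_index),
    ]
    return sha256(b"".join(parts))
```

An output script would embed the 32-byte result, and a spend succeeds only if the spending transaction hashes to the same value. Changing any output amount or script changes the hash and invalidates the spend.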

Notably, the input TXID is excluded from the hash commitment. This exclusion is necessary because, in both Legacy and Segwit transactions, the TXID of the funding transaction covers its outputs’ scriptPubKeys, and the scriptPubKey is where the CTV hash lives. If the hash also committed to the TXID of the input being spent, each value would depend on the other, creating a circular dependency that would make the hash commitment impossible to construct.

CTV achieves introspection by directly obtaining specified transaction information through a new opcode, hashing it, and comparing it with the commitment on the stack. This method consumes less on-chain space but lacks some flexibility.

The foundation of second-layer solutions like the Lightning Network is pre-signed transactions. Pre-signing usually means generating and signing transactions in advance but not broadcasting them until certain conditions are met. Essentially, CTV implements a stricter pre-signing function, publishing the pre-signed commitment directly on-chain, allowing transactions only according to the predefined template.

CTV was initially proposed to alleviate Bitcoin congestion and can also be termed congestion control. During periods of high congestion, CTV can commit multiple future transactions through a single transaction, avoiding the need to broadcast multiple transactions during congestion and completing actual transactions after the congestion eases. This could significantly help during exchange runs. Additionally, templates can be used to implement vaults, preventing hacker attacks since the funds’ destination is predetermined, making it impossible for hackers to send UTXOs using the CTV script to their addresses.

CTV can significantly improve second-layer networks. For example, in the Lightning Network, implementing Timeout Trees and Channel Factories using CTV allows a single UTXO to expand into a CTV tree, opening multiple state channels with only one on-chain transaction and confirmation. Additionally, CTV supports Anchor Time Locked Contracts (ATLCs) in the Ark protocol.

APO (SIGHASH_ANYPREVOUT) BIP-118

BIP-118 proposes a new signature hash flag for tapscript, called SIGHASH_ANYPREVOUT, to facilitate writing more flexible spending logic. APO and CTV are similar in many ways. To address the circular dependency between scriptPubKeys and TXIDs, APO’s solution is to exclude the relevant input information and only sign the output, allowing transactions to dynamically bind to different UTXOs.

Logically, the verification operation OP_CHECKSIG (and its related opcodes) has three functions:

Assemble parts of the spending transaction.
Hash these parts.
Verify if the hash is signed by the given key.

The specific content of the signature can be adjusted significantly, and which transaction fields are signed is determined by the SIGHASH flag. According to the definition of BIP 342 signature opcodes, SIGHASH flags include SIGHASH_ALL, SIGHASH_NONE, SIGHASH_SINGLE, and SIGHASH_ANYONECANPAY, among others. SIGHASH_ANYONECANPAY controls the input, while the others control the output.

SIGHASH_ALL is the default SIGHASH flag and signs all outputs. SIGHASH_NONE signs no outputs, and SIGHASH_SINGLE signs only the output at the same index as the input being signed. SIGHASH_ANYONECANPAY can be combined with any of the other three flags; if set, the signature covers only the current input, and otherwise it covers all inputs.
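
As a rough illustration of how these flags combine, the sketch below decodes a sighash byte into the inputs and outputs it covers. The numeric flag values are the standard ones; the function name describe_sighash and the returned labels are assumptions for this example, and Taproot’s SIGHASH_DEFAULT (0x00) is deliberately left out.

```python
SIGHASH_ALL = 0x01
SIGHASH_NONE = 0x02
SIGHASH_SINGLE = 0x03
SIGHASH_ANYONECANPAY = 0x80

_OUTPUT_MODES = {
    SIGHASH_ALL: "all",
    SIGHASH_NONE: "none",
    SIGHASH_SINGLE: "matching index",
}


def describe_sighash(flag: int) -> dict:
    """Return which inputs and outputs a signature with this flag commits to."""
    base = flag & 0x03  # low bits select the output mode
    if base not in _OUTPUT_MODES:
        raise ValueError(f"unsupported sighash flag in this sketch: {flag:#04x}")
    inputs = "this input only" if flag & SIGHASH_ANYONECANPAY else "all"
    return {"inputs": inputs, "outputs": _OUTPUT_MODES[base]}
```

For example, describe_sighash(0x83) reports a signature over only the current input and only the output at the matching index.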

None of these SIGHASH flags can eliminate the impact of inputs. Even SIGHASH_ANYONECANPAY requires a commitment to one input.

Therefore, BIP 118 introduces SIGHASH_ANYPREVOUT. An APO signature does not commit to the UTXO being spent (the previous output, or PREVOUT) and signs only the outputs, so the same signed transaction can be bound to different UTXOs. This also yields a covenant: pre-construct a transaction, create a one-time-use key pair and the corresponding signature, and any assets sent to that public key’s address must be spent through the pre-constructed transaction.
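
The rebinding property can be shown with a toy digest. This is only a sketch of the idea: the Spend class and the digest function are hypothetical, and the real BIP-118 message commits to more fields than shown here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Spend:
    prev_txid: bytes   # TXID of the output being spent
    prev_index: int    # output index within that transaction
    outputs: bytes     # serialized outputs, treated as opaque bytes here


def digest(spend: Spend, commit_to_prevout: bool) -> bytes:
    """Toy signature message: optionally skips the outpoint, as APO does."""
    h = hashlib.sha256()
    if commit_to_prevout:
        h.update(spend.prev_txid + spend.prev_index.to_bytes(4, "little"))
    h.update(hashlib.sha256(spend.outputs).digest())
    return h.digest()


# Two spends of different UTXOs that pay to the same outputs.
spend_a = Spend(b"\x11" * 32, 0, b"pay 1 BTC to Bob")
spend_b = Spend(b"\x22" * 32, 1, b"pay 1 BTC to Bob")
```

With commit_to_prevout=False the two digests are identical, so one signature would cover both spends; with commit_to_prevout=True they differ, which is the behavior of the existing flags.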

APO’s flexibility can also be used for transaction repair. If a transaction gets stuck on-chain due to low fees, another transaction can be easily created to increase the fee without needing new signatures. Additionally, for multi-signature wallets, signatures not dependent on the spent input make operations simpler.

By eliminating the circular dependency between scriptPubKeys and input TXIDs, APO can achieve introspection by adding output data in the witness, though this still requires additional witness data space.

For off-chain protocols like the Lightning Network and vaults, APO reduces the need for intermediate state storage, significantly lowering storage requirements and complexity. The most direct use case for APO is Eltoo (also known as LN-Symmetry), which simplifies channel factories, enables lightweight and inexpensive watchtowers, and allows unilateral exits without leaving erroneous states, enhancing the Lightning Network’s performance. APO can simulate CTV’s functionality, though it requires users to store signatures and pre-signed transactions themselves, making it costlier and less efficient than CTV.

APO’s main criticism is the need for a new key version, making it incompatible with existing systems. Additionally, the new signature hash type might introduce potential double-spend risks. After extensive community discussions, APO now requires an ordinary signature in addition to the original, alleviating security concerns and earning its BIP-118 number.

OP_VAULT BIP-345

BIP-345 proposes two new opcodes, OP_VAULT and OP_VAULT_RECOVER, to work with CTV and implement a dedicated covenant that enforces a delay period on spending specified tokens, during which the spending can be “revoked” via a recovery path.

Users can create a vault by setting up a specific Taproot address, including at least two scripts in the MAST: an OP_VAULT script to facilitate the expected withdrawal process and an OP_VAULT_RECOVER script to ensure recovery of coins before withdrawal completion.

How does OP_VAULT achieve interruptible time-locked withdrawals?

In simple terms, the OP_VAULT opcode replaces the spent OP_VAULT script with a specified script, updating a single leaf node in the MAST while keeping the rest unchanged. This is similar to TLUV but without supporting internal key updates.
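
The leaf replacement can be pictured with the Taproot Merkle construction from BIP-341. The sketch below recomputes the script-tree root after swapping one leaf while reusing the same Merkle path; it omits the internal key tweak, the two example scripts are arbitrary placeholder bytes, and the helper names are assumptions.

```python
import hashlib


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP-340 style tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    t = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(t + t + msg).digest()


def tapleaf_hash(script: bytes, leaf_version: int = 0xC0) -> bytes:
    if len(script) >= 253:
        raise ValueError("sketch only handles single-byte length prefixes")
    return tagged_hash("TapLeaf", bytes([leaf_version, len(script)]) + script)


def merkle_root(leaf: bytes, path: list) -> bytes:
    """Fold a leaf hash with its sibling hashes, sorting each pair as BIP-341 does."""
    node = leaf
    for sibling in path:
        first, second = sorted((node, sibling))
        node = tagged_hash("TapBranch", first + second)
    return node


# Hypothetical tree: the spent leaf plus two sibling hashes that stay untouched.
path = [tapleaf_hash(b"\x51"), tagged_hash("TapBranch", b"\x00" * 64)]
old_root = merkle_root(tapleaf_hash(b"vault-trigger-script"), path)
new_root = merkle_root(tapleaf_hash(b"replacement-script"), path)
```

Only the leaf changes, so the verifier needs just the new leaf and the existing path to derive the expected new root.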

Attaching a template to the replacement script restricts where the funds can go. The time lock parameter is specified by OP_VAULT, while the template enforced by the CTV opcode restricts the set of outputs that can be created when spending through that script path.

BIP-345 is designed for vaults, allowing users to have a highly secure recovery path (paper wallet, distributed multi-signature) while configuring a spending delay for routine payments. The user’s device continuously monitors the vault’s spending, allowing recovery if an unauthorized transfer occurs.

Implementing vaults with BIP-345 requires considering fee issues, especially for recovery transactions. Possible solutions include CPFP (Child Pays for Parent), temporary anchors, and new signature hash flags like SIGHASH_GROUP.

TLUV (TapleafUpdateVerify)

The TLUV scheme is built around Taproot and aims to solve the problem of efficient exit from shared UTXOs. The guiding principle is that when a Taproot output is spent, we can use the internal structure of the Taproot address and cryptographic transformations to partially update the internal key and MAST according to the update steps described by the TLUV script, thereby achieving contract functionality.

The idea of the TLUV scheme is to create a new Taproot address based on the current spending input through a new opcode TAPLEAF_UPDATE_VERIFY, which can perform one or more of the following operations:

Update the internal public key
Trim the Merkle path
Remove the currently executing leaf node
Add a new leaf node to the end of the Merkle path

Specifically, TLUV takes three inputs:

A specification on how to update the internal public key
A new leaf node for the Merkle path
A specification on whether to remove the current leaf node and/or how many Merkle path leaf nodes to remove

The TLUV opcode calculates the updated scriptPubKey and verifies whether the output corresponding to the current input is spent to that scriptPubKey.

The inspiration for TLUV is coin pools. Today, joint pools can be created using pre-signed transactions, but if you want to achieve permissionless exits, you need to create an exponentially growing number of signatures. TLUV enables permissionless exits without any pre-signed transactions. For example, a group of partners uses Taproot to build a shared UTXO, pooling their funds. They can move funds internally using the Taproot key or jointly sign to make external payments. Individuals can exit the shared pool at any time, removing their payment path, while others can continue to complete payments through the original path without exposing additional information about the remaining members. This method is more efficient and private compared to non-pooled transactions.

The TLUV opcode achieves partial spending restrictions by updating the original MAST. However, it does not provide introspection of output amounts. Therefore, a new opcode, IN_OUT_AMOUNT, is needed, which pushes two pieces of data onto the stack: the amount of the UTXO being spent by this input and the amount of the corresponding output. The script author is then expected to use arithmetic opcodes to verify that the funds are properly preserved in the updated scriptPubKey.

Introspection of output amounts adds another complexity because Bitcoin amounts are represented in satoshis and require up to 51 bits, but scripts only allow 32-bit mathematical operations. This requires upgrading the operators in the script by redefining opcode behavior or using SIGHASH_GROUP to replace IN_OUT_AMOUNT.
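
The 51-bit figure and the 32-bit limit can be checked with a few lines of arithmetic. The constant and function names below are chosen for this example; the script number range assumes the usual 4-byte operands with a sign bit.

```python
SATS_PER_BTC = 100_000_000
MAX_SUPPLY_SATS = 21_000_000 * SATS_PER_BTC

# Script arithmetic accepts operands of at most 4 bytes, one bit of which is the sign.
SCRIPT_NUM_MAX = 2**31 - 1

bits_needed = MAX_SUPPLY_SATS.bit_length()    # 51
largest_btc = SCRIPT_NUM_MAX / SATS_PER_BTC   # about 21.47 BTC


def fits_in_script_num(amount_sats: int) -> bool:
    """True if the amount can be used directly with 32-bit script arithmetic."""
    return -SCRIPT_NUM_MAX <= amount_sats <= SCRIPT_NUM_MAX
```

Any amount above roughly 21.47 BTC therefore cannot be added or compared with today’s arithmetic opcodes, which is the gap the proposals above try to close.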

TLUV is expected to provide solutions for decentralized Layer 2 coin pools, although reliability in terms of tweaking Taproot public keys needs further confirmation.

MATT

MATT (Merkleize All The Things) aims to achieve three goals: Merkleizing state, Merkleizing scripts, and Merkleizing execution, thus achieving general smart contracts.

Merkleizing State: Construct a Merkle tree where each leaf node is a hash of the state, and the Merkle root represents the entire contract state.
Merkleizing Scripts: A MAST constructed of Tapscript where each leaf node is a possible state transition path.
Merkleizing Execution: Achieved through cryptographic commitments and fraud challenge mechanisms. For any computational function f, a participant can compute it off-chain and publish a commitment to the claim f(x)=y. If another participant computes a different result f(x)=z with z≠y, they can challenge the claim, and the dispute is arbitrated through a binary search over the computation, similar in principle to Optimistic Rollups.
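
The binary search in the challenge mechanism can be sketched as follows. This is a simplified model in which both parties reveal a full list of intermediate states, whereas a real protocol would exchange Merkle commitments to trace segments; the hash-chain computation and all function names are hypothetical.

```python
import hashlib


def hash_chain(seed: bytes, steps: int, cheat_at=None) -> list:
    """Toy computation: state[i+1] = SHA256(state[i]), optionally corrupted at one step."""
    trace = [seed]
    for i in range(1, steps + 1):
        nxt = hashlib.sha256(trace[-1]).digest()
        if i == cheat_at:
            nxt = hashlib.sha256(nxt + b"cheat").digest()
        trace.append(nxt)
    return trace


def find_disputed_step(claimed: list, honest: list) -> int:
    """Bisect to a step i where both agree on state i-1 but disagree on state i."""
    if len(claimed) != len(honest):
        raise ValueError("traces must have the same length")
    lo, hi = 0, len(claimed) - 1  # invariant: agree at lo, disagree at hi
    if claimed[lo] != honest[lo] or claimed[hi] == honest[hi]:
        raise ValueError("parties must agree on the input and disagree on the result")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if claimed[mid] == honest[mid]:
            lo = mid
        else:
            hi = mid
    return hi
```

After the search, only the single step from state hi-1 to state hi has to be re-executed on-chain, so the cost of arbitration grows logarithmically with the length of the computation.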

To achieve MATT, Bitcoin programming scripts need to have the following functionalities:

Enforce a specific script on an output (and their amounts)
Attach data to an output
Read data from the current input (or other inputs)

The second point is crucial: data attached to an output acts as dynamic state, so a script can combine it with input data provided by the spender to compute the next state and the data to attach to the next output, simulating a state machine. MATT proposes the OP_CHECKCONTRACTVERIFY (OP_CCV) opcode, a combination of the previously proposed OP_CHECKOUTPUTCONTRACTVERIFY and OP_CHECKINPUTCONTRACTVERIFY opcodes, with an additional flags parameter to specify the target of the operation.

Control of Output Amounts: The most direct way is direct introspection. However, output amounts are 64-bit numbers requiring 64-bit arithmetic, which adds complexity to Bitcoin script implementation. CCV instead adopts a deferred check similar to OP_VAULT: the amounts of all inputs that use CCV to target the same output are summed, and that sum serves as the lower bound for the output’s amount. The check is deferred to transaction-level validation rather than performed during evaluation of each input script.
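
A minimal sketch of such a deferred amount check is shown below. It assumes each CCV-style input simply declares which output index it funds; the function name and data layout are illustrative and not taken from the proposal’s specification.

```python
from collections import defaultdict


def deferred_amount_check(inputs: list, output_amounts: list) -> bool:
    """inputs: (amount_sats, target_output_index) pairs declared during script evaluation.

    Runs once per transaction, after all input scripts have been evaluated.
    """
    required = defaultdict(int)
    for amount, target in inputs:
        required[target] += amount  # accumulate the lower bound per output
    return all(
        0 <= index < len(output_amounts) and output_amounts[index] >= minimum
        for index, minimum in required.items()
    )
```

Because the summation happens in the validation logic rather than in script, no 64-bit script arithmetic is needed.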

Given the generality of fraud proofs, some variant of MATT contracts should be able to express all types of smart contracts or Layer 2 constructions, although the additional requirements (such as capital lock-in and challenge period delays) need careful evaluation. Further research is required to determine for which applications these trade-offs are acceptable. One example is using cryptographic commitments and fraud challenge mechanisms to simulate an OP_ZK_VERIFY function, achieving trustless Rollups on Bitcoin.

In practice, things are already happening. Johan Torås Halseth implemented elftrace using the OP_CHECKCONTRACTVERIFY opcode from the MATT soft fork proposal, allowing all programs supported by RISC-V compilation to be verified on the Bitcoin chain, enabling native Bitcoin verification bridges for contract protocols.

CSFS (OP_CHECKSIGFROMSTACK)

From the discussion of APO above, we know that OP_CHECKSIG (and related operations) is responsible for assembling transaction data, hashing it, and verifying the signature. However, the message it verifies is always derived from the serialization of the transaction that uses the opcode; no other message can be specified. In simple terms, OP_CHECKSIG (and related operations) serves to verify that the UTXO used as a transaction input is authorized to be spent by the signature holder, thus protecting Bitcoin’s security.

CSFS, as its name suggests, checks signatures from the stack. The CSFS opcode takes three parameters from the stack: a signature, a message, and a public key, and verifies the signature’s validity. This means that any message can be passed to the stack through witness data and verified by CSFS, enabling some innovations on Bitcoin.

CSFS’s flexibility allows it to implement various mechanisms such as paying for signatures, authority delegation, oracle contracts, and double-spend protection bonds, and, more importantly, transaction introspection. The principle of transaction introspection using CSFS is simple. The spender pushes the transaction content used by OP_CHECKSIG onto the stack through the witness, and the script verifies the same public key and signature with both CSFS and OP_CHECKSIG. If both checks pass, the message passed to CSFS must be identical to the serialized spending transaction (and other data) implicitly used by OP_CHECKSIG. We then have verified transaction data on the stack, which other opcodes can use to apply restrictions to the spending transaction.
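
The double-verification trick can be modeled in a few lines. Note the strong simplification: HMAC stands in for Schnorr signatures here (so the "public key" is really a shared secret), purely to keep the sketch self-contained with the standard library. All function names are hypothetical.

```python
import hashlib
import hmac


def toy_verify(sig: bytes, msg: bytes, key: bytes) -> bool:
    """Stand-in for signature verification (HMAC, NOT a real signature scheme)."""
    return hmac.compare_digest(hmac.new(key, msg, hashlib.sha256).digest(), sig)


def op_checksig(sig: bytes, key: bytes, spending_tx: bytes) -> bool:
    # The message is implicit: it is derived from the real spending transaction.
    return toy_verify(sig, hashlib.sha256(spending_tx).digest(), key)


def op_checksigfromstack(sig: bytes, msg: bytes, key: bytes) -> bool:
    # The message is explicit: whatever the spender placed on the stack.
    return toy_verify(sig, msg, key)


def introspect(sig: bytes, key: bytes, witness_tx_copy: bytes, spending_tx: bytes):
    """Return the witness copy only if it is proven equal to the spending transaction."""
    msg = hashlib.sha256(witness_tx_copy).digest()
    if op_checksigfromstack(sig, msg, key) and op_checksig(sig, key, spending_tx):
        return witness_tx_copy  # now safe to parse and constrain in script
    return None
```

If the spender supplies a copy that differs from the real transaction, the same signature cannot satisfy both checks, so the script rejects the spend.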

CSFS often appears with OP_CAT because OP_CAT can concatenate different fields of the transaction to complete serialization, allowing for more precise selection of transaction fields for introspection. Without OP_CAT, the script cannot recompute hashes from individually checkable data, so it can only check if a hash matches a specific value, meaning coins can only be spent through a single specific transaction.

CSFS can emulate opcodes like CLTV, CSV, CTV, and APO, making it a general introspection opcode that also aids Layer 2 scaling solutions for Bitcoin. Its drawback is that it requires adding a complete copy of the signed transaction to the stack, which can significantly increase the size of transactions that want to use CSFS for introspection. In contrast, single-purpose introspection opcodes like CLTV and CSV have the least overhead, but adding each new special introspection opcode requires consensus changes.

TXHASH (OP_TXHASH)

OP_TXHASH is a very straightforward introspection opcode that allows the operator to select a field’s hash and push it to the stack. Specifically, OP_TXHASH pops a txhash flag from the stack, calculates a (flagged) txhash based on the flag, and pushes the resulting hash onto the stack.
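
The pop-flag, hash, push behavior might look like the sketch below. The selector bits and the dictionary-based transaction are hypothetical and do not follow the TxFieldSelector encoding of the draft specification.

```python
import hashlib

# Hypothetical selector bits, defined for this sketch only.
F_VERSION, F_LOCKTIME, F_INPUTS, F_OUTPUTS = 0x01, 0x02, 0x04, 0x08
_FIELDS = (
    (F_VERSION, "version"),
    (F_LOCKTIME, "locktime"),
    (F_INPUTS, "inputs"),
    (F_OUTPUTS, "outputs"),
)


def op_txhash(stack: list, tx: dict) -> None:
    """Pop a flag, hash the selected transaction fields, push the 32-byte result."""
    flag = int.from_bytes(stack.pop(), "little")
    h = hashlib.sha256()
    for bit, name in _FIELDS:
        if flag & bit:
            h.update(hashlib.sha256(tx[name]).digest())
    stack.append(h.digest())


tx = {"version": b"\x02", "locktime": b"\x00", "inputs": b"utxo-A", "outputs": b"pay-bob"}
stack = [bytes([F_OUTPUTS])]
op_txhash(stack, tx)  # stack now holds a hash that covers only the outputs
```

A script could then compare the pushed hash against a fixed value, or pass it to a signature check, depending on which parts of the transaction it wants to pin down.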

Due to the similarity between TXHASH and CTV, there has been extensive discussion in the community about the two.

TXHASH can be seen as a general upgrade to CTV, providing a more advanced transaction template that allows users to specify which parts of the spending transaction are fixed, solving many issues related to transaction fees. Unlike other covenant opcodes, TXHASH does not require providing a copy of the necessary data in the witness, further reducing storage needs. Unlike CTV, TXHASH is not NOP-compatible and can only be implemented in tapscript. The combination of TXHASH and CSFS can be considered an alternative to CTV and APO.

In terms of constructing covenants, TXHASH makes it easier to build “additive covenants”: push all the parts of the transaction data you want to fix onto the stack, hash them together, and verify that the result matches a fixed value. CTV makes it easier to build “subtractive covenants”: push all the parts of the transaction data you want to keep free onto the stack, then use a rolling OP_SHA256 that starts from a fixed intermediate state committing to a prefix of the transaction hash data, and hash the free parts into that intermediate state.

The TxFieldSelector field defined in the TXHASH specification is expected to expand to other opcodes, such as OP_TX.

The BIP related to TXHASH is currently in draft status on GitHub and has not yet been assigned a number.

OP_CAT

OP_CAT is a somewhat mysterious opcode that was disabled by Satoshi Nakamoto due to security concerns. However, it has recently sparked extensive discussion among Bitcoin core developers, even becoming a meme in the online community. The proposal to re-enable it was eventually assigned the number BIP-347, and many consider it the BIP most likely to be adopted in the near future.

In reality, the behavior of OP_CAT is very simple: it concatenates two elements on the stack into one. But how does this enable covenant functionality?

The function of concatenating two elements corresponds to a powerful cryptographic data structure known as a Merkle tree. The construction of a Merkle tree only requires concatenation and hashing operations, and hashing is already available in Bitcoin scripts. Therefore, with OP_CAT, we can theoretically verify Merkle proofs in Bitcoin scripts, which is the most common lightweight verification method in blockchains.
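
To illustrate that concatenation and hashing suffice, here is a sketch of Merkle proof verification built from only those two operations. The cat function mimics OP_CAT, including the 520-byte element limit; the proof format (sibling hash plus a left/right marker) is an assumption for this example.

```python
import hashlib

MAX_ELEMENT_SIZE = 520  # stack element size limit that OP_CAT results must respect


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def cat(a: bytes, b: bytes) -> bytes:
    """Model of OP_CAT: concatenate two stack elements."""
    out = a + b
    if len(out) > MAX_ELEMENT_SIZE:
        raise ValueError("concatenation exceeds the stack element size limit")
    return out


def verify_merkle_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    """proof: list of (sibling_hash, sibling_is_left) pairs from leaf level up to the root."""
    node = sha256(leaf)
    for sibling, sibling_is_left in proof:
        node = sha256(cat(sibling, node) if sibling_is_left else cat(node, sibling))
    return node == root


# A four-leaf example tree built with the same two operations.
h = [sha256(x) for x in (b"a", b"b", b"c", b"d")]
n01, n23 = sha256(cat(h[0], h[1])), sha256(cat(h[2], h[3]))
root = sha256(cat(n01, n23))
proof_for_c = [(h[3], False), (n01, True)]
```

Here verify_merkle_proof(b"c", proof_for_c, root) succeeds, while the same proof with any other leaf fails.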

As mentioned earlier, CSFS can use OP_CAT to achieve a general covenant scheme. In fact, even without CSFS, the structure of Schnorr signatures allows OP_CAT itself to achieve transaction introspection.

In a Schnorr signature, the message to be signed is composed of the following fields:

Version
Locktime
Input count
Output count
List of inputs (each including previous transaction hash, index, script length, script, sequence)
List of outputs (each including value, script length, script)

These fields contain the main elements of a transaction. By placing them in scriptPubKey or witness, and using OP_CAT and OP_SHA256, we can construct a Schnorr signature and verify it with OP_CHECKSIG. If the verification passes, the stack retains the verified transaction data, enabling transaction introspection. This allows us to extract and “inspect” various parts of the transaction, such as its inputs, outputs, destination addresses, or involved Bitcoin amounts.

For more detailed cryptographic principles, refer to Andrew Poelstra’s article “CAT and Schnorr Tricks.”

In summary, the flexibility of OP_CAT makes it capable of mimicking almost any covenant opcode, and many covenant opcodes depend on its functionality. This significantly improves its position among candidate proposals. Theoretically, with only OP_CAT and existing Bitcoin opcodes, we could construct a trust-minimized BTC ZK Rollup. Starknet, Chakra, and other ecosystem partners are actively working towards making this happen.

Conclusion

As we explore various strategies to expand Bitcoin and enhance its programmability, it becomes evident that the way forward involves a combination of native improvements, off-chain computation, and complex script functionalities.

Without a flexible base layer, it is impossible to build a more flexible second layer.

Off-chain computation scalability is the future, but Bitcoin’s programmability must break through to better support scaling and become a true world currency.

However, Bitcoin computation fundamentally differs from Ethereum computation. Bitcoin only supports “verification” as a form of computation, whereas Ethereum is fundamentally computational, with verification as a byproduct. This difference is evident in the fact that Ethereum charges gas fees for failed transactions, whereas Bitcoin does not.

Covenants achieve a form of smart contracts based on verification rather than computation. Regarding covenants, aside from a few strict Satoshi fundamentalists, most people believe that they are a good choice for improving Bitcoin. However, the community is still debating which scheme to use to implement them.

APO, OP_VAULT, and TLUV lean towards direct applications. Choosing them can implement specific applications more cheaply and efficiently. Lightning Network enthusiasts prefer APO because it achieves LN-Symmetry; if you want to create a vault, it is best to use OP_VAULT; to build a CoinPool, TLUV is more private and efficient. OP_CAT and TXHASH are more general, with fewer security vulnerabilities, and can achieve more use cases through combinations with other opcodes, albeit potentially at the cost of script complexity. CTV and CSFS have adjusted the blockchain processing approach: CTV implements delayed outputs, and CSFS implements delayed signatures. MATT is relatively unique, using optimistic execution and fraud proofs, and relies on the Merkle tree structure to achieve general smart contracts, though it still requires new opcodes to provide introspection capabilities.

We see that the Bitcoin community is already intensely discussing the possibility of introducing covenants through a soft fork. Starknet has officially announced its entry into the Bitcoin ecosystem, planning to implement settlement on the Bitcoin network within six months of the OP_CAT merge. Chakra will continue to monitor the latest developments in the Bitcoin ecosystem, promote the OP_CAT soft fork merge, and use the programmability brought by introspection and covenants to build a more secure and efficient Bitcoin settlement layer.