Introduction to Bitcoin Transaction Scripts.

·

13 min read

Bitcoin uses a scripting language to implement transactions on the blockchain. A script is a stack-based simple list of instructions that are recorded with each transaction describing how the recipient of bitcoins can access the coins. Bitcoin scripts are theoretically and intentionally turing complete and lack a looping mechanism. They are deterministic and by using scripts we are sure that the program will terminate at some point since infinite loops cannot happen. This feature keeps the system safe from various issues such as DDoS attacks, infinite looping, and code blocking. Scripts also make sure that the computational power needed to process transactions is minimized.

A bitcoin script encumbers future spending of the sent bitcoins using the following;

  • A hashed public key which is the destination wallet address embedded in the script.
  • A digital signature from the sender that proves ownership of the sent bitcoins by proving ownership of the private key of the corresponding public key.

Scripts afford us the flexibility to be able to change spending parameters for the sent bitcoins, in that, we can choose to require two private keys or none.

A transaction is said to be valid if the combined script does not trigger a failure and the top item in the stack is True when the script completes execution. If A sent B some bitcoins, A dictates script operations that need to be performed for the sent amount to be used in another subsequent transaction by B. For B to be able to spend the bitcoins, he/she provides the inputs to the previous scripts resulting in a combined script that terminates with a True value at the top of the stack.

In this article, we learn about the bitcoin scripting language used for transactions on the blockchain and demonstrate how they are used to express conditional spending while upholding the security of the blockchain.

Turing Incompleteness.

In the theory of computation, a system is said to be turing COMPLETE if it can be used to simulate a turing machine - Alan Turing. Most high-level programming languages such as Python, Java, and C++ are turing-complete. On the other hand, a turing INCOMPLETE system cannot simulate a turing machine. An example of a turing incomplete system is the Bitcoin Script programming language.

The Script supports many functionalities of a regular high-level programming language such as constants, the stack, bitwise operations, reserved keywords, arithmetic, splice, and much more. All these except looping mechanisms. This means that scripts can only execute for a specified amount of time. This limitation on scripts saves network bandwidth and prevents crashing/blocking that can be caused by logic bombs and infinite loops which can be embedded in a bitcoin transaction.

Stack-Based.

In computer science, a stack is a data structure that enables LIFO(Last In First Out) processing of data. Stacks and their operations can be compared to a 'stack of trays/plates' whereby the last to be 'pushed' on top of the stack is the first to be 'popped' off from the top of the stack. Bitcoin script uses a stack LIFO mechanism for its operations using OP_Codes which we discuss in the next section.

Unlike conventional high-level programming languages that have variables, the Bitcoin script places everything on a stack. Bitcoin script stack mechanism allows the language to prevent itself from introducing security issues.

Bitcoin Script OP_Codes.

An opcode(operational code) is a kind of programming language that closely resembles machine code. Opcodes are used to specify operations to be performed on data using operands.

In Bitcoin, opcodes allow the programmability of the blockchain transactions. Remember that cryptocurrencies are programmable money. Bitcoin uses about 256 opcodes(0 - 255), of these 116 are the most commonly used. They include operations for placing inputs on top of the stack(OP_TOTALSTACK), removing the top two stack items(OP_ADD), checking for duplicates on the stack(OP_IFDUP), popping-comparison-pushing(OP_EQUAL), signature verification(OP_CHECKSIG) among others.

In summary, opcodes are responsible for programming operations such as flow control, constants and stack management, arithmetic and bitwise operations, cryptographic operations, and reserved keywords among others.

The following tables summarize the differnt OP_CODES used in Bitcoin for differnet operations; Constants

OP_CODE Description
OP_0, OP_FALSE An empty array of bytes is pushed onto the stack. (This is not a no-op: an item is added to the stack.)
OP_PUSHDATA1 The next byte contains the number of bytes to be pushed onto the stack.
OP_PUSHDATA2 The next two bytes contain the number of bytes to be pushed onto the stack in little endian order.
OP_PUSHDATA4 The next four bytes contain the number of bytes to be pushed onto the stack in little endian order.
OP_1NEGATE The number -1 is pushed onto the stack.
OP_1, OP_TRUE The number 1 is pushed onto the stack.
OP_2-OP_16 The number in the word name (2-16) is pushed onto the stack.

Flow Control

OP_CODES Description
OP_NOP Does nothing.
OP_IF If the top stack value is not False, the statements are executed. The top stack value is removed.
OP_NOTIF If the top stack value is False, the statements are executed. The top stack value is removed.
OP_ELSE If the preceding OP_IF or OP_NOTIF or OP_ELSE was not executed then these statements are and if the preceding OP_IF or OP_NOTIF or OP_ELSE was executed then these statements are not.
OP_ENDIF Ends an if/else block. All blocks must end, or the transaction is invalid. An OP_ENDIF without OP_IF earlier is also invalid.
OP_VERIFY Marks transaction as invalid if top stack value is not true. The top stack value is removed.
OP_RETURN Marks transaction as invalid. Since bitcoin 0.9, a standard way of attaching extra data to transactions is to add a zero-value output with a scriptPubKey consisting of OP_RETURN followed by data. Such outputs are provably unspendable and specially discarded from storage in the UTXO set, reducing their cost to the network. Since standard relay rules allow a single output with OP_RETURN, that contains any sequence of push statements after the OP_RETURN provided the total scriptPubKey length is at most 83 bytes.

Stack Operations

OP_CODE Description
OP_TOALTSTACK Puts the input onto the top of the alt stack. Removes it from the main stack.
OP_FROMALTSTACK Puts the input onto the top of the main stack. Removes it from the alt stack.
OP_IFDUP If the top stack value is not 0, duplicate it.
OP_DEPTH Puts the number of stack items onto the stack.
OP_DROP Removes the top stack item.
OP_DUP Duplicates the top stack item.
OP_NIP Removes the second-to-top stack item.
OP_OVER Copies the second-to-top stack item to the top.
OP_PICK The item n back in the stack is copied to the top.
OP_ROLL The item n back in the stack is moved to the top.
OP_ROT The 3rd item down the stack is moved to the top.
OP_SWAP The top two items on the stack are swapped.
OP_TUCK The item at the top of the stack is copied and inserted before the second-to-top item.
OP_2DROP Removes the top two stack items.
OP_2DUP Duplicates the top two stack items.
OP_3DUP Duplicates the top three stack items.
OP_2OVER Copies the pair of items two spaces back in the stack to the front.
OP_2ROT The fifth and sixth items back are moved to the top of the stack.
OP_2SWAP Swaps the top two pairs of items.

Splicing Operations

OP_CODES Description
OP_CAT Concatenates two strings. disabled.
OP_SUBSTR Returns a section of a string. disabled.
OP_LEFT Keeps only characters left of the specified point in a string. disabled.
OP_RIGHT Keeps only characters right of the specified point in a string. disabled.
OP_SIZE Pushes the string length of the top element of the stack (without popping it).

Bitwise Operations

OP_CODES Description
OP_INVERT Flips all of the bits in the input. disabled.
OP_AND Boolean and between each bit in the inputs. disabled.
OP_OR Boolean or between each bit in the inputs. disabled.
OP_XOR Boolean exclusive or between each bit in the inputs. disabled.
OP_EQUAL Returns 1 if the inputs are exactly equal, 0 otherwise.
OP_EQUALVERIFY Same as OP_EQUAL, but runs OP_VERIFY afterward.

Arithmetic Operations

OP_CODES Description
OP_1ADD 1 is added to the input.
OP_1SUB 1 is subtracted from the input.
OP_2MUL The input is multiplied by 2. disabled.
OP_2DIV The input is divided by 2. disabled.
OP_NEGATE The sign of the input is flipped.
OP_ABS The input is made positive.
OP_NOT If the input is 0 or 1, it is flipped. Otherwise, the output will be 0.
OP_0NOTEQUAL Returns 0 if the input is 0. 1 otherwise.
OP_ADD a is added to b.
OP_SUB b is subtracted from a.
OP_MUL a is multiplied by b. disabled.
OP_DIV a is divided by b. disabled.
OP_MOD Returns the remainder after dividing a by b. disabled.
OP_LSHIFT Shifts a left b bits, preserving sign. disabled.
OP_RSHIFT Shifts a right b bits, preserving sign. disabled.
OP_BOOLAND If both a and b are not 0, the output is 1. Otherwise 0.
OP_BOOLOR If a or b is not 0, the output is 1. Otherwise 0.
OP_NUMEQUAL Returns 1 if the numbers are equal, 0 otherwise.
OP_NUMEQUALVERIFY Same as OP_NUMEQUAL, but runs OP_VERIFY afterward.
OP_NUMNOTEQUAL Returns 1 if the numbers are not equal, 0 otherwise.
OP_LESSTHAN Returns 1 if a is less than b, 0 otherwise.
OP_GREATERTHAN Returns 1 if a is greater than b, 0 otherwise.
OP_LESSTHANOREQUAL Returns 1 if a is less than or equal to b, 0 otherwise.
OP_GREATERTHANOREQUAL Returns 1 if a is greater than or equal to b, 0 otherwise.
OP_MIN Returns the smaller of a and b.
OP_MAX Returns the larger of a and b.
OP_WITHIN Returns 1 if x is within the specified range (left-inclusive), 0 otherwise.

Crypto Operations

OP_CODES Description
OP_RIPEMD160 The input is hashed using RIPEMD-160.
OP_SHA1 The input is hashed using SHA-1.
OP_SHA256 The input is hashed using SHA-256.
OP_HASH160 The input is hashed twice: first with SHA-256 and then with RIPEMD-160.
OP_HASH256 The input is hashed two times with SHA-256.
OP_CODESEPARATOR All of the signature checking words will only match signatures to the data after the most recently-executed OP_CODESEPARATOR.
OP_CHECKSIGVERIFY Same as OP_CHECKSIG, but OP_VERIFY is executed afterward.
OP_CHECKMULTISIG Compares the first signature against each public key until it finds an ECDSA match. Starting with the subsequent public key, it compares the second signature against each remaining public key until it finds an ECDSA match. The process is repeated until all signatures have been checked or not enough public keys remain to produce a successful result. All signatures need to match a public key. Because public keys are not checked again if they fail any signature comparison, signatures must be placed in the scriptSig using the same order as their corresponding public keys were placed in the scriptPubKey or redeemScript. If all signatures are valid, 1 is returned, 0 otherwise. Due to a bug, one extra unused value is removed from the stack.
OP_CHECKMULTISIGVERIFY Same as OP_CHECKMULTISIG, but OP_VERIFY is executed afterward.

Locktime Operations

OP_CODES Description
OP_CHECKLOCKTIMEVERIFY (previously OP_NOP2) Marks transaction as invalid if the top stack item is greater than the transaction's nLockTime field, otherwise script evaluation continues as though an OP_NOP was executed. Transaction is also invalid if 1. the stack is empty; or 2. the top stack item is negative; or 3. the top stack item is greater than or equal to 500000000 while the transaction's nLockTime field is less than 500000000, or vice versa; or 4. the input's nSequence field is equal to 0xffffffff. The precise semantics are described in BIP 0065.
OP_CHECKSEQUENCEVERIFY Marks transaction as invalid if the relative lock time of the input (enforced by BIP 0068 with nSequence) is not equal to or longer than the value of the top stack item. The precise semantics are described in BIP 0112.

Transaction Matching

OP_CODES Description
OP_PUBKEYHASH Represents a public key hashed with OP_HASH160.
OP_PUBKEY Represents a public key compatible with OP_CHECKSIG.
OP_INVALIDOPCODE Matches any opcode that is not yet assigned.

Reserved Codes

OP_CODES Used when...
OP_RESERVED Transaction is invalid unless occuring in an unexecuted OP_IF branch
OP_VER Transaction is invalid unless occuring in an unexecuted OP_IF branch
OP_VERIF Transaction is invalid even when occuring in an unexecuted OP_IF branch
OP_VERNOTIF Transaction is invalid even when occuring in an unexecuted OP_IF branch
OP_RESERVED1 Transaction is invalid unless occuring in an unexecuted OP_IF branch
OP_RESERVED2 Transaction is invalid unless occuring in an unexecuted OP_IF branch
OP_NOP1, OP_NOP4-OP_NOP10 The word is ignored. Does not mark transaction as invalid.

A Bitcoin Script.

Consider the following p2pkh Bitcoin script.

bit64.png

In the image above we have two major components; scriptPubKey - these are the operations that are to be performed. scriptSig - this is the digital signature and the public key that verifies the authenticity of a transaction.

We also have used the following OP-CODES; OP_DUP - duplicates the item on top of the stack. OP_HASH160 - encodes the input twice using SHA256 and RIPEMD160 hashing algorithms. OP_EQUALVERIFY - verifies the data entered is correct. OP_CHECKSIG - summarizes inputs, outputs, and the script of the transaction in a hash.

When the script is executed; The public key of the sender is duplicated. This duplicate public key is hashed using the two hashing algorithms mentioned above. The result is compared with the hash of the public key making sure that it is valid. If so, the script runs and CHECKSIG verifies the signature with the public key.

In summary, a script is executed from left to right. It uses the stack data structure. Data gets pushed onto the stack using OP_CODES. Other OP_CODES pop elements from the stack and operate on them after which they can choose to push the result onto the stack again. We say that a script is valid if the top and only element left on the stack is 1 or greater than 1. A script is invalid if the stack ends up empty, the top element is 0, there is more than a single element left on top of the stack after execution completes or the script terminates prematurely.

Summary

Scripts prevent infinite loops and secure the network from attacks such as DDoS attacks. They limit the scope of execution meaning that the script is bound to terminate at some point. Unlike regular high-level programming languages such as Python which have a looping mechanism, bitcoin scripts are made turing complete intentionally as a security feature. This limits the script's capabilities to only being applicable in the validation of programmable currency. Scripts enable Bitcoin's programmability, they are used to facilitate a transaction on the blockchain. It also prevents transaction complexity, this is what Ethereum blockchain solves, it enables developers to program complex transactions using a high-level programming language such as solidity while maintaining similar security features.

References

Bitcoin scripts

Script IDE

Bitcoin Script Debugger

BTC Script 101

Source Code