Crafting a Blockchain in Go and Rust: A Comparative Journey — Blocks and Transactions [Part 2]

And making it all deterministic…

Projects available on Github:

View on Github (Pull Requests are Welcome) — Rust Version

View on Github (Pull Requests are Welcome) — Go Version

<- Part 1- Crafting a Blockchain in Go and Rust: A Comparative Journey — Private keys, Public Keys and Signatures

In this post, we’ll dive into the implementation of the fundamental blockchain components: the block header, the block itself, and the transaction. These elements form the building blocks of the distributed ledger. We’ll explore the initial implementations in Go and Rust, the challenges I ran into with encoding, and how I ultimately achieved deterministic encoding using Protocol Buffers (protobuf).

The Importance of Headers, Blocks, and Transactions

Headers: The header in a blockchain block contains metadata about the block, such as the previous block hash, the timestamp, version, height, nonce and the hash of transactions. It is critical for linking blocks together, ensuring integrity, and enabling quick verification.

Blocks: A block consists of the header, the block signature, a public key to verify the signature, and a list of transactions. Each block encapsulates a set of transactions that have been validated and recorded in the blockchain.

Transactions: Transactions are the basic units of value transfer in the blockchain. They include the sender’s and receiver’s addresses, the amount transferred, some arbitrary piece of data I decided to add to it, and a digital signature to verify authenticity.
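To make the "linking" role of the header concrete, here is a minimal sketch in Go. It is not the project’s code: the cut-down Header, the fixed big-endian layout and the LinksTo helper are assumptions made purely for illustration. It only shows the idea that a child header stores the hash of its parent, so changing the parent breaks the link.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Header is a simplified stand-in for the block header described above.
type Header struct {
	PrevBlockHash [32]byte
	TxHash        [32]byte
	Version       uint32
	Height        uint64
	Timestamp     int64
	Nonce         uint64
}

// Hash writes the header in a fixed layout (big-endian, fixed-width fields)
// and returns the SHA-256 digest of those bytes. The layout is illustrative.
func (h Header) Hash() [32]byte {
	var buf bytes.Buffer
	// binary.Write cannot fail here: Header only has fixed-size fields and
	// bytes.Buffer never returns a write error.
	_ = binary.Write(&buf, binary.BigEndian, h)
	return sha256.Sum256(buf.Bytes())
}

// LinksTo reports whether child directly extends parent.
func LinksTo(child, parent Header) bool {
	return child.PrevBlockHash == parent.Hash() && child.Height == parent.Height+1
}

func main() {
	genesis := Header{Version: 1, Height: 0, Timestamp: 1627483623}
	next := Header{PrevBlockHash: genesis.Hash(), Version: 1, Height: 1, Timestamp: 1627483633}
	fmt.Println("valid link:", LinksTo(next, genesis))

	genesis.Nonce = 42 // tampering with the parent changes its hash
	fmt.Println("after tampering:", LinksTo(next, genesis))
}
```

The important detail is that the hash is computed over an agreed byte layout. That is exactly where the encoding problems described later in this post come from.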

Initial Implementations in Go and Rust

Initially, I implemented the blockchain using the core data structures and native encoding libraries specific to each language. At first, I didn’t give much thought to the implications of using different encoding libraries — after all, bytes are bytes, right? However, I soon realised that to achieve deterministic behavior across both implementations, I needed to rethink that approach. But more on that in a moment.

For now let’s look at the data structures initially implemented.

Go Implementation:

// types/hash.go
// Hash represents the 32-byte hash of a block.
type Hash [HashSize]byte

// core/block.go
// Header represents the header of a block in the blockchain.
type Header struct {
    PrevBlockHash types.Hash // Hash of the previous block

    TxHash types.Hash // Hash of the transactions in the block
    Version uint32 // Version of the block
    Height uint64 // Height of the block in the blockchain
    Timestamp int64 // Timestamp of the block

    Nonce uint64 // Nonce used to mine the block
    Difficulty uint8 // Difficulty used to mine the block
}

// Block represents a block in the blockchain.
type Block struct {
    *Header

    Transactions []*Transaction
    PublicKey crypto.PublicKey
    Signature *crypto.Signature

    // Cached version of the header hash
    hash types.Hash
}

// core/transaction.go
// Transaction represents a transaction in the blockchain.
type Transaction struct {
    From crypto.PublicKey // Public key of the sender
    To crypto.PublicKey // Public key of the receiver
    Value uint64 // Amount to transfer
    Data []byte // Arbitrary data
    Signature *crypto.Signature // Signature of the transaction
    Nonce int64 // Nonce of the transaction

    // hash of the transaction
    hash types.Hash
}

Encoding with Go:

At first, I chose the Gob package (encoding/gob) for encoding the structures. Gob is Go’s native binary encoding library and is often the preferred choice for communication between Go programs, since it is designed for efficient serialisation and deserialisation. Gob’s performance is great, but it is used almost exclusively between Go programs. Since we are developing one blockchain node in Go and another in Rust, it was, in hindsight, not the best choice.

Rust Implementation:

// types/hash.rs
#[derive(PartialEq, Debug, Serialize, Deserialize, Clone, Copy)]
pub struct Hash {
    pub hash: [u8; HASH_SIZE],
}

// core/block.rs
#[derive(Debug, Serialize, Deserialize)]
pub struct Header {
    pub previous_block_hash: [u8; 32],

    pub tx_hash: Option<hash::Hash>,
    pub version: u32,
    pub height: u32,
    pub timestamp: u32,

    pub nonce: u32,
    pub difficulty: u8,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Block {
    header: Header,
    transactions: Vec<Transaction>,
    #[serde(default, skip_serializing_if = "Option::is_none", skip_deserializing)]
    public_key: Option<PublicKey>,
    #[serde(default, skip_serializing_if = "Option::is_none", skip_deserializing)]
    signature: Option<SignatureWrapper>,

    // Cached version of the header hash
    #[serde(default, skip_serializing_if = "Option::is_none", skip_deserializing)]
    hash: Option<hash::Hash>,
}

// core/transaction.rs
#[derive(Debug, Serialize, Deserialize, PartialEq, Clone)]
pub struct Transaction {
    pub from: Option<PublicKey>,
    pub to: Option<PublicKey>,
    pub value: u64,
    pub data: Vec<u8>,
    pub signature: Option<SignatureWrapper>,
    pub nonce: u64,

    // Cached version of the transaction hash
    pub hash: Option<hash::Hash>,
}

Encoding with Rust:

For Rust I decided to go with Bincode, a binary encoder/decoder implementation in Rust. Bincode is a compact encoder/decoder pair that uses a binary zero-fluff encoding scheme. The size of the encoded object will be the same or smaller than the size that the object takes up in memory in a running Rust program.

In addition to exposing two simple functions (one that encodes to Vec<u8>, and one that decodes from &[u8]), Bincode exposes a Reader/Writer API that makes it work perfectly with other stream-based APIs such as Rust files and network streams.

Bincode can be quite fast, but I ran into the same problem I had with Go’s Gob package: although Bincode is quite popular, it does not have a Go implementation, as far as I know. The two nodes would therefore produce different bytes for the same block, so the encoding would not be deterministic across the network, which cannot happen.

Encountering the Problem: Non-Deterministic Encoding

The Issue

Despite the success in serialising the structures, I encountered a significant issue: the encoded results were not deterministic across implementations. The same logical data structure was encoded to different bytes in Go and in Rust.

Implications

For a blockchain where different nodes (potentially written in different languages) need to communicate and validate blocks and transactions consistently, non-deterministic encoding is a critical flaw. It could lead to a situation where nodes disagree on the validity of a block simply because of differences in how data is serialised.

The Need for Deterministic Encoding

Why Determinism Matters

In a blockchain network, all nodes must agree on the state of the ledger. If two nodes encode the same block differently, they might calculate different hashes for that block, leading to a consensus failure.
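Here is a small sketch of how little it takes. In the structs shown earlier, the Go header stores the height as a uint64 while the Rust header uses a u32. If each node simply wrote its fields out at their native width, the bytes and therefore the hashes would differ. The two encoders below are hypothetical stand-ins for "node A" and "node B", not code from either project.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// encodeWide mimics a node that stores the height as a 64-bit integer.
func encodeWide(version uint32, height uint64) []byte {
	buf := make([]byte, 12)
	binary.LittleEndian.PutUint32(buf[0:4], version)
	binary.LittleEndian.PutUint64(buf[4:12], height)
	return buf
}

// encodeNarrow mimics a node that stores the height as a 32-bit integer.
func encodeNarrow(version uint32, height uint32) []byte {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[0:4], version)
	binary.LittleEndian.PutUint32(buf[4:8], height)
	return buf
}

// blockID returns the hex SHA-256 of the encoded header.
func blockID(encoded []byte) string {
	sum := sha256.Sum256(encoded)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Same logical header on both nodes: version 1, height 1.
	a := blockID(encodeWide(1, 1))
	b := blockID(encodeNarrow(1, 1))
	fmt.Println("node A:", a)
	fmt.Println("node B:", b)
	fmt.Println("agree: ", a == b)
}
```

Both nodes hold the same values, yet they would never agree on the block hash.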

Serialisation Format Requirements:

For cross-language compatibility and determinism, the serialisation format needs to:

  • Be consistent across different implementations.

  • Preserve the exact byte order.

  • Be well-documented and widely supported.

Switching to Protocol Buffers (Protobuf)

Why Protobuf?

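I decided to switch to Protocol Buffers (protobuf) for encoding because it is a language-neutral, platform-neutral, extensible mechanism for serialising structured data. It’s widely used across different environments, including many blockchain projects, and it’s quite performant. One caveat: the protobuf specification does not guarantee a canonical, byte-for-byte identical encoding across implementations. For simple messages like ours (no maps, no unknown fields), the Go and Rust libraries write fields in field-number order and produce the same bytes, which is what the test at the end of this post confirms.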

Implementation Details

Defining the Protobuf Schema:

For that, we need to create a .proto file that represents our Go and Rust structs. I usually create a folder named proto/ and save it there. Here’s what the .proto file might look like:

syntax = "proto3";

package proto;

option go_package = "github.com/joaoh82/marvinblockchain/proto";

// Header represents the header of a block in the blockchain.
message Header {
    bytes prev_block_hash = 1;
    bytes tx_hash = 2;
    uint32 version = 3;
    uint64 height = 4;
    int64 timestamp = 5;
    uint64 nonce = 6;
    uint32 difficulty = 7;
}

// Transaction represents a transaction in the blockchain.
message Transaction {
    bytes from = 1;
    bytes to = 2;
    uint64 value = 3;
    bytes data = 4;
    bytes signature = 5;
    int64 nonce = 6;
    bytes hash = 7;
}

// Block represents a block in the blockchain.
message Block {
    Header header = 1;
    repeated Transaction transactions = 2;
    bytes public_key = 3;
    bytes signature = 4;
    bytes hash = 5;
}

Generating Code from the Protobuf Schema:

Go:

First, make sure you have the Protocol Buffers compiler (protoc) and the Go plugin for it installed:

# Install protoc (if not already installed)
brew install protobuf

# Install the Go plugin for protoc
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest

Now, generate the Go code:

# Generating the actual code
rm -rf proto/*.pb.go
protoc --go_out=. --go_opt=paths=source_relative \
proto/*.proto

The command above generates a .pb.go file next to the .proto file, which contains the Go structs and methods for serialisation and deserialisation.

Rust:

For Rust, you also need the Protocol Buffers compiler (protoc). On macOS, it can be installed the same way as above:

# Install protoc (if not already installed)
brew install protobuf

On Linux, you can follow the steps from my GitHub Actions scripts in the Rust repo:

sudo apt-get update
sudo apt-get install -y protobuf-compiler
cargo install protobuf-codegen

Add Dependencies

Update your Cargo.toml to include the necessary dependencies.

[dependencies]
prost = "0.13.1"
prost-types = "0.13.1"
[build-dependencies]
prost-build = "0.13.1"

Write a Build Script

In Rust, you need to create a build.rs file to instruct Cargo to compile the Protobuf definitions using the prost-build crate. Note that build.rs must live in the root of the project, at the same level as your Cargo.toml.

fn main() {
    // Compile the schema at build time; the generated Rust code lands in OUT_DIR.
    prost_build::compile_protos(&["src/proto/types.proto"], &["src/proto"])
        .expect("Failed to compile proto");
}

Where is the protobuf-generated Rust file?

An important note here is that the file generated by prost will not go in the same directory as the *.proto file. Instead, it goes in target/[profile]/build/[package]-[hash]/out, which makes sense, because with prost the file is generated at build time.

Generate Rust Code from Protobuf

When you run cargo build, the build.rs script will compile your .proto file and generate the corresponding Rust code. This generated code will be placed in the OUT_DIR directory.

You can find more on OUT_DIR and build scripts in the Cargo documentation.

Now that you have the generated code, you can use it in your Rust application:

Include the Generated Code in your Rust project

include!(concat!(env!("OUT_DIR"), "/proto.rs"));

I actually created a file named mod.rs inside the proto/ directory and added the following content. This way I can follow the same pattern as my Go project and access the proto types via the proto module.

use prost;
use prost::{Enumeration, Message};

include!(concat!(env!("OUT_DIR"), "/proto.rs"));

Testing and Verifying Determinism

To test it, since there is no networking layer yet, I had to take a more manual route. I wrote a snippet of code in both Go and Rust that generates a Block with a Transaction in it, signs everything, and then prints the serialised output so the two can be compared.

Later on, after we add a networking layer and the nodes can actually communicate, I will create some integration tests so we have a more robust testing framework.

Go Snippet:

func BlockSerialization() {
    fmt.Println("Block Serialization")

    mnemonicTo := "all wild paddle pride wheat menu task funny sign profit blouse hockey"
    // Generate a new private key from the mnemonic
    privateKeyTo, _ := crypto.NewPrivateKeyfromMnemonic(mnemonicTo)
    // fmt.Println("private key TO:", privateKeyTo)
    publicKeyTo := privateKeyTo.PublicKey()
    // fmt.Println("public key TO:", publicKeyTo)
    addressTo := publicKeyTo.Address()
    fmt.Println("address TO:", addressTo)

    mnemonicFrom := "hello wild paddle pride wheat menu task funny sign profit blouse hockey"
    // Generate a new private key from the mnemonic
    privateKeyFrom, _ := crypto.NewPrivateKeyfromMnemonic(mnemonicFrom)
    // fmt.Println("private key FROM:", privateKeyFrom)
    publicKeyFrom := privateKeyFrom.PublicKey()
    // fmt.Println("public key FROM:", publicKeyFrom)
    addressFrom := publicKeyFrom.Address()
    fmt.Println("address FROM:", addressFrom)

    header := &proto.Header{
        PrevBlockHash: make([]byte, 32),
        TxHash: make([]byte, 32),
        Version: 1,
        Height: 1,
        Timestamp: 1627483623,
        Nonce: 12345,
        Difficulty: 10,
    }

    block := &proto.Block{
        Header: header,
        Transactions: []*proto.Transaction{},
        PublicKey: publicKeyFrom.Bytes(),
        Signature: []byte{},
        Hash: []byte{},
    }

    tx := &proto.Transaction{
        From: publicKeyFrom.Bytes(),
        To: publicKeyTo.Bytes(),
        Value: 1000,
        Data: []byte("Transaction data"),
        Signature: make([]byte, 64),
        Nonce: 123,
        Hash: make([]byte, 32),
    }
    core.SignTransaction(&privateKeyFrom, tx)
    core.AddTransaction(block, tx)

    core.SignBlock(&privateKeyFrom, block)

    bBlock, _ := core.SerializeBlock(block)
    fmt.Println("GO: Block WITH TRANSACTIONS hex:", hex.EncodeToString(bBlock))

}

Rust Snippet:

fn block_serialization() -> Result<(), Box<dyn std::error::Error>> {
    println!("Block Serialization");

    let mnemonic_to = "all wild paddle pride wheat menu task funny sign profit blouse hockey";
    let private_key_to = crypto::keys::get_private_key_from_mnemonic(&mnemonic_to).unwrap();
    let public_key_to = private_key_to.public_key();
    let address_to = public_key_to.address();
    println!("address to: {}", address_to.to_string());

    let mnemonic_from = "hello wild paddle pride wheat menu task funny sign profit blouse hockey";
    let mut private_key_from = crypto::keys::get_private_key_from_mnemonic(&mnemonic_from).unwrap();
    let public_key_from = private_key_from.public_key();
    let address_from = public_key_from.address();
    println!("address from: {}", address_from.to_string());

    // Create an instance of Header
    let header = proto::Header {
        prev_block_hash: [0; 32].to_vec(),
        tx_hash: [0; 32].to_vec(),
        version: 1,
        height: 1,
        timestamp: 1627483623,
        nonce: 12345,
        difficulty: 10,
    };

    // Create an instance of Block
    let mut block = proto::Block {
        header: Some(header),
        transactions: vec![],
        public_key: public_key_from.to_bytes().to_vec(),
        signature: vec![],
        hash: vec![],
    };

    let mut tx = proto::Transaction {
        from: public_key_from.to_bytes().to_vec(),
        to: public_key_to.to_bytes().to_vec(),
        value: 1000,
        data: b"Transaction data".to_vec(),
        signature: [0; 64].to_vec(),
        nonce: 123,
        hash: [0; 32].to_vec(),
    };
    let _ = core::transaction::sign_transaction(&mut private_key_from, &mut tx).unwrap();
    core::block::add_transaction(&mut block, tx);

    core::block::sign_block(&mut private_key_from, &mut block).unwrap();

    // {:?} prints the hex string with surrounding quotes
    println!("RUST: Block WITH TRANSACTIONS hex: {:?}", hex::encode(core::block::serialize_block(block.clone()).unwrap()));

    Ok(())
}

Output and Comparison:

## Unsigned block with signed transactions:
### RUST:
```
"0a530a20000000000000000000000000000000000000000000000000000000000000000012201d2f60156a6850a588d6cddd106af296316f5593a0f9f0b67ceac0c552e838f01801200128e7db85880630b960380a12bf010a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e1220e15af3cd7d9c09ebaf20d1f97ea396c218b66037cdf8e30db0ebd7bb373df56d18e80722105472616e73616374696f6e20646174612a400ba42f94db0f3ebe75e2e40fccbc7e8915510724c18769b8ad939ca44bdec82179da5bd7e7ae7079a6985ee58e77b85e74e3d76ff2c87f85723327e588220c02307b3a204c5836ec9395a4c7dee4fa71e0b2fcadfff90a747e75126fe8df184ae42c7fb61a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e"
```
### GO:
```
0a530a20000000000000000000000000000000000000000000000000000000000000000012201d2f60156a6850a588d6cddd106af296316f5593a0f9f0b67ceac0c552e838f01801200128e7db85880630b960380a12bf010a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e1220e15af3cd7d9c09ebaf20d1f97ea396c218b66037cdf8e30db0ebd7bb373df56d18e80722105472616e73616374696f6e20646174612a400ba42f94db0f3ebe75e2e40fccbc7e8915510724c18769b8ad939ca44bdec82179da5bd7e7ae7079a6985ee58e77b85e74e3d76ff2c87f85723327e588220c02307b3a204c5836ec9395a4c7dee4fa71e0b2fcadfff90a747e75126fe8df184ae42c7fb61a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e
```

## Signed block:
### RUST:
```
"0a530a20000000000000000000000000000000000000000000000000000000000000000012201d2f60156a6850a588d6cddd106af296316f5593a0f9f0b67ceac0c552e838f01801200128e7db85880630b960380a12bf010a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e1220e15af3cd7d9c09ebaf20d1f97ea396c218b66037cdf8e30db0ebd7bb373df56d18e80722105472616e73616374696f6e20646174612a400ba42f94db0f3ebe75e2e40fccbc7e8915510724c18769b8ad939ca44bdec82179da5bd7e7ae7079a6985ee58e77b85e74e3d76ff2c87f85723327e588220c02307b3a204c5836ec9395a4c7dee4fa71e0b2fcadfff90a747e75126fe8df184ae42c7fb61a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e2240131940f20739fb191ff1c4c439e0a5f93b8e25b5c1850a500bfac644ae451a14e8c3bd691854463494ba5b2ba831c39faa583ce0b1c62ca972db8b47c2cc32082a20389bef0d3f0712e3fc5660262c61b9a2feee6475a3a01e35d9f52b2af40b932c"
```
### GO:
```
0a530a20000000000000000000000000000000000000000000000000000000000000000012201d2f60156a6850a588d6cddd106af296316f5593a0f9f0b67ceac0c552e838f01801200128e7db85880630b960380a12bf010a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e1220e15af3cd7d9c09ebaf20d1f97ea396c218b66037cdf8e30db0ebd7bb373df56d18e80722105472616e73616374696f6e20646174612a400ba42f94db0f3ebe75e2e40fccbc7e8915510724c18769b8ad939ca44bdec82179da5bd7e7ae7079a6985ee58e77b85e74e3d76ff2c87f85723327e588220c02307b3a204c5836ec9395a4c7dee4fa71e0b2fcadfff90a747e75126fe8df184ae42c7fb61a20ec319b757d96d2516e6ace0932923098e5b18226a45818a279adba351149938e2240131940f20739fb191ff1c4c439e0a5f93b8e25b5c1850a500bfac644ae451a14e8c3bd691854463494ba5b2ba831c39faa583ce0b1c62ca972db8b47c2cc32082a20389bef0d3f0712e3fc5660262c61b9a2feee6475a3a01e35d9f52b2af40b932c
```

You can copy and paste those values into any code editor to check that they are the same. As you will see, the encoded bytes of each block are identical in Go and Rust, both before and after digitally signing it.

As always, the full source code is available on Github:

View on Github (Pull Requests are Welcome) — Rust Version

View on Github (Pull Requests are Welcome) — Go Version

What have I learned from all this?

In this post we set out to implement blocks and transactions for the blockchain, and encountered our first set of problems when building a blockchain across two different programming languages.

Building a blockchain that operates across different programming languages, such as Go and Rust, introduces several challenges related to language interoperability. These challenges arise from differences in how each language handles data structures, serialisation, concurrency, and more. Here are some key issues you may encounter:

Serialization and Encoding Differences:

  • Data Representation: Different languages often represent and serialize data differently. For example, Go’s Gob and Rust’s Bincode use unrelated binary layouts, so they produce different bytes even if the input data structures are identical. This discrepancy becomes a critical issue in a blockchain network, where consistency across nodes is paramount.

  • Byte Order (Endianness): Some languages might use different byte orders (big-endian vs. little-endian) when encoding data. If not handled properly, this can result in different hashes for the same data, leading to consensus failures in a blockchain system.

  • Floating-Point Precision: Some languages might handle floating-point arithmetic differently, which can lead to subtle bugs when data is shared between systems.

Concurrency Models:

  • Concurrency and Parallelism: Go uses goroutines and channels for concurrency, which is quite different from Rust’s ownership-based concurrency model. When building a system where components written in different languages need to interact, ensuring that these concurrency models work seamlessly together can be difficult.

  • Thread Safety: Rust emphasises thread safety through its ownership and borrowing system, whereas Go’s garbage collector and goroutines provide a different approach to managing memory and concurrency. Integrating components across these languages may require additional effort to ensure that memory is safely managed.

Error Handling and Type Systems:

  • Type Systems: Go’s type system is relatively straightforward and less strict compared to Rust’s, which has a more complex and expressive type system. This can make it challenging to translate data and logic between the two languages, especially when dealing with complex types or ensuring type safety.

  • Error Handling: Rust’s approach to error handling using Result and Option types differs significantly from Go’s use of error interfaces. Integrating these systems can require additional layers of abstraction or adaptation to ensure that errors are handled consistently across the entire system.

Tooling and Ecosystem Differences:

  • Build and Deployment: Managing builds and deployments across multiple languages can complicate the development workflow. Each language has its own build tools, package managers, and deployment pipelines, which need to be integrated seamlessly.

  • Debugging and Testing: Debugging a system that spans multiple languages can be cumbersome, as it may require different tools and approaches. Similarly, testing across languages requires ensuring that test cases are consistent and that any language-specific behaviors are accounted for.

Importance of Standards

Given the challenges of language interoperability and determinism, adhering to well-established and cross-compatible standards becomes crucial in distributed systems like blockchains. Here’s why:

1. Ensuring Consistency:

  • Deterministic Behavior: Standards like Protocol Buffers (protobuf) define one wire format for every language. With a common schema, the same bytes decode to the same data structure regardless of the programming language, and in our tests the Go and Rust encoders produced identical bytes for the same block. This determinism is vital in blockchains, where even a small discrepancy can lead to a fork in the network.

  • Cross-Language Compatibility: Standards like protobuf are designed to work seamlessly across many programming languages. This means you can define your data structures once and generate compatible code for both Go and Rust (as well as many other languages), ensuring that all nodes in your blockchain network interpret the data in the same way.

2. Facilitating Integration:

  • Interoperable Protocols: Using standardized protocols like gRPC (which leverages protobuf for message serialization) allows different components of your system, potentially written in different languages, to communicate effectively. This interoperability reduces the risk of errors and simplifies the integration process.

Although I mention gRPC here, when we get to the networking part you will see that I actually decided to go with libp2p instead. But we will get to that.

  • Interoperability with Other Systems: By adhering to widely-adopted standards, your blockchain can easily integrate with other systems, tools, and services that also support these standards. This is especially important if your blockchain needs to interact with external systems or be compatible with existing infrastructure.

3. Simplifying Development and Maintenance:

  • Unified Tooling: Standards like protobuf come with a rich set of tools and libraries that make development easier. For example, you can use protoc to automatically generate serialisation code in multiple languages, reducing the likelihood of human error and speeding up the development process.

  • Easier Debugging and Troubleshooting: When all parts of a system adhere to the same standards, it simplifies debugging and troubleshooting. For instance, if you know that both Go and Rust nodes are using the same protobuf schema, you can be confident that any discrepancies in behavior are not due to serialization issues.

4. Community and Support:

  • Wide Adoption: Standards like protobuf, libp2p and gRPC have large, active communities and are supported by major technology companies. This means you benefit from ongoing improvements, extensive documentation, and a wealth of community knowledge.

  • Future-Proofing: By using widely-adopted standards, you ensure that your project remains compatible with future developments and can leverage new tools and technologies as they become available.

Final thoughts

In a project where we are building a blockchain in two different languages, like Go and Rust, overcoming language interoperability challenges is critical to ensuring that all nodes work harmoniously. Leveraging well-established standards like protobuf not only resolves potential discrepancies but also ensures that our system is robust, scalable, and maintainable in the long term. By doing so, we are laying a solid foundation for our blockchain to succeed across different platforms and environments.

Don’t forget to Like and Subscribe to be notified about the next posts in the series and also STAR the projects on Github to follow what is going on over there.

View on Github (Pull Requests are Welcome) — Rust Version

View on Github (Pull Requests are Welcome) — Go Version

Cheers!