Go Night Reading - IPFS Preview

IPFS itself introduces few brand-new technologies; most of its building blocks are existing techniques and ideas. That does not mean there is no innovation, though: much like Bitcoin, the way the pieces are combined is itself the innovation.


Ideas: in KadDHT, node addressing and content addressing are isomorphic. On top of KAD's logical distance computed by XOR, Coral DHT adds a physical distance metric such as latency. S/Kademlia adds some resistance to Sybil attacks, because generating a public/private key pair requires solving a difficulty puzzle.


The underlying data structures and organization of IPFS are strongly influenced by Git. Git is a content-addressed file system with four object types: blob, tree, commit, and tag. Blob objects store file contents of all kinds. When you add a blob, its hash is computed with the SHA-1 algorithm; the first two hex characters of the hash become the folder name under .git/objects, and the remaining characters become the file name.

A tree can point to other trees, blobs, and so on. The links form a DAG (directed acyclic graph), so by construction there can be no cycles. Objects also cannot be modified in place: changing the content produces a new object, while the original version stays intact.

#Add blobs and verify storage
$echo "hello,world" > 1.txt
$git hash-object -w 1.txt
2d832d9044c698081e59c322d5a2a459da546469
$echo "hello,world." > 1.txt
$git hash-object -w 1.txt
ec03970d236871dd491c61683f3074d3038b4f6e
#View the stored files
$find .git/objects -type f
.git/objects/2d/832d9044c698081e59c322d5a2a459da546469
.git/objects/ec/03970d236871dd491c61683f3074d3038b4f6e

#Add a tree
$git update-index --add --cacheinfo 100644 ec03970d236871dd491c61683f3074d3038b4f6e test
$git write-tree
#Add a new file
$echo "new file">new.txt
$git hash-object -w new.txt
$git update-index --add --cacheinfo 100644 fa49b077972391ad58037050f2a75f74e3671e92 test1
$git write-tree
#Compare the contents of two trees
$git cat-file -p b0de348e9cb3a01b47154a4331678787e29afdff
100644 blob ec03970d236871dd491c61683f3074d3038b4f6e    test
$git cat-file -p 7b452990f69f7751ba805a291549c32dbfeb8716
100644 blob ec03970d236871dd491c61683f3074d3038b4f6e    test
100644 blob fa49b077972391ad58037050f2a75f74e3671e92    test1
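The blob hashes above are easy to reproduce: a Git blob hash is just the SHA-1 of a small header plus the file content. A minimal sketch in Go (no Git library involved):

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// gitBlobHash reproduces git's blob hashing: sha1("blob <size>\x00" + content).
func gitBlobHash(content []byte) string {
	header := fmt.Sprintf("blob %d\x00", len(content))
	h := sha1.New()
	h.Write([]byte(header))
	h.Write(content)
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	// `echo "hello,world" > 1.txt` writes 12 bytes (including the trailing newline).
	fmt.Println(gitBlobHash([]byte("hello,world\n")))
}
```

Running this prints the same `2d832d90...` hash that `git hash-object` produced above, which is exactly the content-addressing idea IPFS inherits.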

IPFS defines corresponding object types, blob, list, tree, and commit, and likewise organizes them as a DAG.

SFS: Self-Certifying File System

To publish dynamic content, a mapping mechanism is needed that maps a stable name to a changing IPFS hash. In the HTTP world, PKI relies on certificates issued by trusted authorities. Because IPFS aims to be decentralized, it designs in no trusted authority; instead it adopts the SFS scheme, which embeds the public key (and a signature made with the matching private key) into the mapping itself. The name contains the public key information, so the name can be trusted on its own, and key distribution comes for free.

IPFS designed IPNS based on this idea.


Block is a very basic abstraction in IPFS.

// ipfs/go-block-format/blocks.go

// Block provides abstraction for blocks implementations.
type Block interface {
    RawData() []byte
    Cid() cid.Cid
    String() string
    Loggable() map[string]interface{}
}

// A BasicBlock is a singular block of data in ipfs. It implements the Block interface.
type BasicBlock struct {
    cid  cid.Cid
    data []byte
}

// ipfs/go-ipfs/core/coreapi/block.go

// The basic Put method reads the data, hashes it into a CID, and stores the block.
func (api *BlockAPI) Put(ctx context.Context, src io.Reader, opts ...caopts.BlockPutOption) (coreiface.BlockStat, error) {
    settings, pref, err := caopts.BlockPutOptions(opts...)
    if err != nil {
        return nil, err
    }

    data, err := ioutil.ReadAll(src)
    if err != nil {
        return nil, err
    }

    bcid, err := pref.Sum(data)
    if err != nil {
        return nil, err
    }

    b, err := blocks.NewBlockWithCid(data, bcid)
    if err != nil {
        return nil, err
    }

    if settings.Pin {
        defer api.blockstore.PinLock().Unlock()
    }

    err = api.blocks.AddBlock(b)
    if err != nil {
        return nil, err
    }

    if settings.Pin {
        api.pinning.PinWithMode(b.Cid(), pin.Recursive)
        if err := api.pinning.Flush(); err != nil {
            return nil, err
        }
    }

    return &BlockStat{path: path.IpldPath(b.Cid()), size: len(data)}, nil
}


type Link struct {
    Name, Hash string
    Size       uint64
}

type Node struct {
    Links []Link
    Data  string
}

# Test that data is a folder with two subfolders
$ipfs add data -r
added QmXkiQUQ5xkC471K29MsL3tKHb9dsFgXernV9hbr4vDP3C data/1/testobject
added QmREfzcCsuHpzUvWUEzgiqHSNpiKnc18s4RfQXqvW669E8 data/2/1.txt
added QmZFYyMVQMXbis5spJkifRtoPoPz9pNo5NyiVSo9UhWTLa data/2/11.pdf
added QmcN88vyVJwfLRreojfT7qdVezShxF1JMCLQBxd3TwL3uz data/1
added QmV14YGVhYS1WyCvmmTnnFzJF4zC587LMCnufjox5r7zHd data/2
added QmdMwuSVU3NKQUy4YgxKWcAm9z6fWXYb6ampbWC442nBgE data
 8.99 MiB / 9.00 MiB [====================================================]  99.96%
 # Looking at the links, you can see that it points to the two subfolders
$ipfs object links QmdMwuSVU3NKQUy4YgxKWcAm9z6fWXYb6ampbWC442nBgE
QmcN88vyVJwfLRreojfT7qdVezShxF1JMCLQBxd3TwL3uz 82      1
QmV14YGVhYS1WyCvmmTnnFzJF4zC587LMCnufjox5r7zHd 9432929 2
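To see why a DAG of such nodes pins the entire content, here is a toy version of the Link/Node structs above with a hand-rolled hash function. Real IPFS uses protobuf encoding and multihash CIDs; this only illustrates how a change anywhere propagates up to the root:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Simplified versions of the whitepaper-style structs shown above.
type Link struct {
	Name, Hash string
	Size       uint64
}

type Node struct {
	Links []Link
	Data  string
}

// hashNode hashes a node's data together with its links, so any change
// in a child changes every ancestor hash up to the root.
func hashNode(n Node) string {
	h := sha256.New()
	h.Write([]byte(n.Data))
	for _, l := range n.Links {
		fmt.Fprintf(h, "%s|%s|%d;", l.Name, l.Hash, l.Size)
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	leaf := Node{Data: "hello"}
	dir := Node{Links: []Link{{Name: "1.txt", Hash: hashNode(leaf), Size: 5}}}
	root1 := hashNode(dir)

	// Changing the leaf changes the directory hash too.
	leaf.Data = "hello!"
	dir.Links[0].Hash = hashNode(leaf)
	fmt.Println(root1 != hashNode(dir)) // true
}
```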


As mentioned above, IPFS has object types such as blob, list, tree, and commit. Take the folder added above as an example:

$ipfs dag get QmdMwuSVU3NKQUy4YgxKWcAm9z6fWXYb6ampbWC442nBgE
$ipfs dag get QmcN88vyVJwfLRreojfT7qdVezShxF1JMCLQBxd3TwL3uz


// Link represents an IPFS Merkle DAG Link between Nodes.
type Link struct {
    // utf string name. should be unique per object
    Name string // utf8

    // cumulative size of target object
    Size uint64

    // multihash of the target object
    Cid cid.Cid
}

The DAG is built recursively: each generated node becomes a link in its parent node, all the way up to the root.

IPFS supports two DAG layouts: a balanced DAG optimized for random access, and a trickle DAG optimized for sequential access.

var nd ipld.Node
if adder.Trickle {
    nd, err = trickle.Layout(db)
} else {
    nd, err = balanced.Layout(db)
}
For example, the balanced layout divides the data recursively into DAG nodes; the fillNodeRec function below is called recursively.

// Each time a DAG of a certain `depth` is filled (because it
// has reached its maximum capacity of `db.Maxlinks()` per node)
// extend it by making it a sub-DAG of a bigger DAG with `depth+1`.
for depth := 1; !db.Done(); depth++ {

    // Add the old `root` as a child of the `newRoot`.
    newRoot := db.NewFSNodeOverDag(ft.TFile)
    newRoot.AddChild(root, fileSize, db)

    // Fill the `newRoot` (that has the old `root` already as child)
    // and make it the current `root` for the next iteration (when
    // it will become "old").
    root, fileSize, err = fillNodeRec(db, newRoot, depth)
    if err != nil {
        return nil, err
    }
}

$ipfs add helloworld
added QmVtZPoeiqpREqkpTTNMzXkUt74SgQA4JYMG8zPjMVULby helloworld
12 B / ? [-----------------------------------------------------] 
$ipfs name publish QmVtZPoeiqpREqkpTTNMzXkUt74SgQA4JYMG8zPjMVULby
Published to QmdcFVhY4m3NyYfiMbosr3ZbW9EHCkcBk4xySt4uQozXb6: /ipfs/QmVtZPoeiqpREqkpTTNMzXkUt74SgQA4JYMG8zPjMVULby
#Resolve IPNS
$ipfs name resolve QmdcFVhY4m3NyYfiMbosr3ZbW9EHCkcBk4xySt4uQozXb6
#View the content of IPNS
$ipfs cat /ipns/QmdcFVhY4m3NyYfiMbosr3ZbW9EHCkcBk4xySt4uQozXb6
#Modify file content to republish
$echo "ipfs" >> helloworld
$ipfs add helloworld
added Qmb8sAnXHQYpuAwCTFmHvgz2P5AGJH4y5uVosSdKzcHSbN helloworld
17 B / 17 B [=====================================================] 100.00%
$ipfs name publish Qmb8sAnXHQYpuAwCTFmHvgz2P5AGJH4y5uVosSdKzcHSbN
Published to QmdcFVhY4m3NyYfiMbosr3ZbW9EHCkcBk4xySt4uQozXb6:
#View the content of IPNS
$ipfs cat /ipns/QmdcFVhY4m3NyYfiMbosr3ZbW9EHCkcBk4xySt4uQozXb6

#The IPNS name is actually the node ID
$ipfs id -f="<id>\n"

Take a look at the code implementation; it is quite readable.

// Create creates a new IPNS entry and signs it with the given private key.
// This function does not embed the public key. If you want to do that, use
// `EmbedPublicKey`.
func Create(sk ic.PrivKey, val []byte, seq uint64, eol time.Time) (*pb.IpnsEntry, error) {
    entry := new(pb.IpnsEntry)

    entry.Value = val
    typ := pb.IpnsEntry_EOL
    entry.ValidityType = &typ
    entry.Sequence = &seq
    entry.Validity = []byte(u.FormatRFC3339(eol))

    sig, err := sk.Sign(ipnsEntryDataForSig(entry))
    if err != nil {
        return nil, err
    }
    entry.Signature = sig

    return entry, nil
}
// Verify an IPNS entry
// Validate validates the given IPNS entry against the given public key.
func Validate(pk ic.PubKey, entry *pb.IpnsEntry) error {
    // Check the ipns record signature with the public key
    if ok, err := pk.Verify(ipnsEntryDataForSig(entry), entry.GetSignature()); err != nil || !ok {
        return ErrSignature
    }

    eol, err := GetEOL(entry)
    if err != nil {
        return err
    }
    if time.Now().After(eol) {
        return ErrExpiredRecord
    }
    return nil
}

// Embed public key information
// EmbedPublicKey embeds the given public key in the given ipns entry. While not
// strictly required, some nodes (e.g., DHT servers) may reject IPNS entries
// that don't embed their public keys as they may not be able to validate them
// efficiently.
func EmbedPublicKey(pk ic.PubKey, entry *pb.IpnsEntry) error {
    // Try extracting the public key from the ID. If we can, *don't* embed
    // it.
    id, err := peer.IDFromPublicKey(pk)
    if err != nil {
        return err
    }
    if _, err := id.ExtractPublicKey(); err != peer.ErrNoPublicKey {
        // Either a *real* error or nil.
        return err
    }

    // We failed to extract the public key from the peer ID, embed it in the
    // record.
    pkBytes, err := pk.Bytes()
    if err != nil {
        return err
    }
    entry.PubKey = pkBytes
    return nil
}
Kad Addressing

Kad's ideas were discussed earlier, so let's look at the core code.

// Function for calculating the logical distance (common prefix length)
func CommonPrefixLen(a, b ID) int {
    return ks.ZeroPrefixLen(u.XOR(a, b))
}
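The two helpers it relies on are tiny; here is a standalone reimplementation of the XOR distance and the leading-zero count (illustrative, not the actual ks/u packages):

```go
package main

import (
	"fmt"
	"math/bits"
)

// xor computes the Kademlia distance between two equal-length IDs.
func xor(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// zeroPrefixLen counts the leading zero bits: the common prefix length
// of the two original IDs.
func zeroPrefixLen(id []byte) int {
	for i, b := range id {
		if b != 0 {
			return i*8 + bits.LeadingZeros8(b)
		}
	}
	return len(id) * 8
}

func main() {
	a := []byte{0b10110000}
	b := []byte{0b10100000}
	// XOR = 0b00010000 -> 3 leading zero bits -> common prefix length 3.
	fmt.Println(zeroPrefixLen(xor(a, b))) // 3
}
```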

// Find the nearest node information in the buckets
// NearestPeers returns a list of the 'count' closest peers to the given ID
func (rt *RoutingTable) NearestPeers(id ID, count int) []peer.ID {
    cpl := CommonPrefixLen(id, rt.local)

    // It's assumed that this also protects the buckets.
    rt.tabLock.RLock()

    // Get bucket at cpl index or last bucket
    var bucket *Bucket
    if cpl >= len(rt.Buckets) {
        cpl = len(rt.Buckets) - 1
    }
    bucket = rt.Buckets[cpl]

    pds := peerDistanceSorter{
        peers:  make([]peerDistance, 0, 3*rt.bucketsize),
        target: id,
    }
    pds.appendPeersFromList(bucket.list)
    if pds.Len() < count {
        // In the case of an unusual split, one bucket may be short or empty.
        // if this happens, search both surrounding buckets for nearby peers
        if cpl > 0 {
            pds.appendPeersFromList(rt.Buckets[cpl-1].list)
        }
        if cpl < len(rt.Buckets)-1 {
            pds.appendPeersFromList(rt.Buckets[cpl+1].list)
        }
    }
    rt.tabLock.RUnlock()

    // Sort by distance to local peer
    pds.sort()

    if count < pds.Len() {
        pds.peers = pds.peers[:count]
    }

    out := make([]peer.ID, 0, pds.Len())
    for _, p := range pds.peers {
        out = append(out, p.p)
    }

    return out
}
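What peerDistanceSorter does can be shown with a few bytes: comparing XOR results bytewise compares the distances as big-endian integers, so the closest peers sort first. A toy version:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// nearest orders candidate IDs by their XOR distance to the target,
// the same ordering peerDistanceSorter produces.
func nearest(target []byte, peers [][]byte) [][]byte {
	dist := func(p []byte) []byte {
		d := make([]byte, len(p))
		for i := range p {
			d[i] = p[i] ^ target[i]
		}
		return d
	}
	sort.Slice(peers, func(i, j int) bool {
		return bytes.Compare(dist(peers[i]), dist(peers[j])) < 0
	})
	return peers
}

func main() {
	target := []byte{0x10}
	peers := [][]byte{{0xF0}, {0x11}, {0x30}}
	// Distances: 0x11 -> 0x01, 0x30 -> 0x20, 0xF0 -> 0xE0.
	for _, p := range nearest(target, peers) {
		fmt.Printf("%x ", p)
	}
}
```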

// Update node information
// Update adds or moves the given peer to the front of its respective bucket
func (rt *RoutingTable) Update(p peer.ID) (evicted peer.ID, err error) {
    peerID := ConvertPeerID(p)
    cpl := CommonPrefixLen(peerID, rt.local)

    rt.tabLock.Lock()
    defer rt.tabLock.Unlock()
    bucketID := cpl
    if bucketID >= len(rt.Buckets) {
        bucketID = len(rt.Buckets) - 1
    }

    bucket := rt.Buckets[bucketID]
    if bucket.Has(p) {
        // If the peer is already in the table, move it to the front.
        // This signifies that it is "more active"; the less active nodes
        // will as a result tend towards the back of the list.
        bucket.MoveToFront(p)
        return "", nil
    }
    // Latency requirement: reject peers whose connection is too slow
    if rt.metrics.LatencyEWMA(p) > rt.maxLatency {
        // Connection doesn't meet requirements, skip!
        return "", ErrPeerRejectedHighLatency
    }

    // We have enough space in the bucket (whether spawned or grouped).
    if bucket.Len() < rt.bucketsize {
        bucket.PushFront(p)
        return "", nil
    }

    if bucketID == len(rt.Buckets)-1 {
        // if the bucket is too large and this is the last bucket (i.e. wildcard), unfold it.
        rt.nextBucket()
        // the structure of the table has changed, so let's recheck if the peer now has a dedicated bucket.
        bucketID = cpl
        if bucketID >= len(rt.Buckets) {
            bucketID = len(rt.Buckets) - 1
        }
        bucket = rt.Buckets[bucketID]
        if bucket.Len() >= rt.bucketsize {
            // if after all the unfolding, we're unable to find room for this peer, scrap it.
            return "", ErrPeerRejectedNoCapacity
        }
        bucket.PushFront(p)
        return "", nil
    }

    return "", ErrPeerRejectedNoCapacity
}


Since IPFS is in effect one global torrent, it naturally needs strategies to discourage free riding. That is the role of bitswap: each node maintains a ledger per neighbor, recording what has been sent and received (want lists and have lists), and the decision engine (the engine package under bitswap) decides whom to serve.

The code is fairly complex with a lot of asynchronous logic, so it is only described at a high level here rather than pasted; we can walk through the code another time.

The logic runs roughly as follows. When a node wants a batch of blocks, it first checks the local store; if the blocks are there, no session is created and bitswap is never entered. Otherwise the request subscribes to those blocks in bitswap's pub/sub mechanism (results arrive on a block channel), the blocks' CIDs are added to the wantlist, and bitswap's workers start. Much of this happens asynchronously: the p2p layer's content routing (FindProvidersAsync) is called periodically to find nodes that own the blocks, and bitswap continuously takes provider IDs and sends requests to them. When a block comes back, its integrity is checked against the hash, and the block is then published through the pub/sub channel, so the upper layer waiting on the other end of the channel receives the result.
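As a rough illustration of the ledger idea (the type and field names here are made up for the sketch, not bitswap's actual API), each peer keeps per-partner accounting that the decision engine can consult to deprioritize free riders:

```go
package main

import "fmt"

// Ledger is a toy version of bitswap's per-peer accounting: bytes sent
// and received, plus the partner's outstanding wantlist.
type Ledger struct {
	Partner       string
	BytesSent     uint64
	BytesReceived uint64
	Wantlist      map[string]bool // CIDs the partner has asked us for
}

// DebtRatio grows when we have sent the partner much more than we received;
// an engine can serve low-debt partners first.
func (l *Ledger) DebtRatio() float64 {
	return float64(l.BytesSent) / float64(l.BytesReceived+1)
}

func main() {
	l := &Ledger{Partner: "QmPeer", Wantlist: map[string]bool{}}
	l.Wantlist["QmSomeBlock"] = true // partner wants this block
	l.BytesSent += 4096              // we served them a block
	fmt.Printf("%.2f\n", l.DebtRatio())
}
```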

Beyond IPFS: Blockchain Storage

There is a problem IPFS by itself cannot solve: a small number of nodes contribute resources out of pure goodwill while most nodes free ride. Without economic incentives, the bitswap mechanism alone is not enough.

P2P storage arose in an age without blockchains, so it could only be used for file sharing, plus the CDNs quietly built into some video companies' players (note that video sites consume your upload bandwidth without paying you a fee, while you may even pay to watch).

Thus Protocol Labs proposed the Filecoin project, which aims to let all participants profit from contributing storage space and bandwidth, and to provide users with safer services at lower prices.

Blockchain storage is better positioned to land in practice than most other blockchain projects, because the content being stored and distributed is digital and can be verified mathematically and cryptographically, whereas other projects struggle as soon as they must interact with the real economy.

IPFS vs. Filecoin

- Category: IPFS is a decentralized file system (with no incentive mechanism); Filecoin is a decentralized storage network.
- Function: IPFS is a content-addressing-based distributed storage infrastructure; Filecoin is an incentive layer over the IPFS network that provides a free market for cloud storage.
- Storage permissions: with IPFS you can store data on nodes you own; with Filecoin you can, in addition, pay to store data on providers' nodes.
- Read permissions: identical in both; anyone who knows the CID can read.

Two Markets

Storage Market

Essentially a decentralized cloud storage: users rent storage space from miners with Filecoin tokens, miners keep submitting storage proofs, and what is bought and sold in this market is storage space.

Retrieval Market

It can be understood as a decentralized CDN: users retrieve data from miners and pay with tokens; what is bought and sold here is bandwidth and traffic.

Storage Proofs

How do you prove that a miner stores the data?

How do you prove that a miner stores redundant copies of the data?

How do you continuously prove that a miner stores the data?

How is consensus reached on the blockchain?

Basic storage proof schemes (see the PPT):

1. The user keeps a hash of each shard locally.

2. The user keeps the Merkle tree root locally and distributes the file shards and the Merkle tree to the storage provider.

3. Use a homomorphic hash function to generate a homomorphic hash of each data shard and send the shards to the storage provider.

Why don't the three schemes above work for Filecoin?

Consider these dimensions: communication complexity, computation complexity, and resistance to the three classic attacks (Sybil attacks, generation attacks, and outsourcing attacks). Also consider whether the user and the storage provider can collude: with homomorphic hashes, for example, the provider could fabricate data at will and generate valid proofs in order to seize block rewards.

In Filecoin, the storage provider encodes data sector by sector (a sector is a fixed-size unit of storage space). The encoding is built on a zigzag depth-robust graph: the encoding of each file block within a sector depends on several other blocks at the same time. A verifiable delay encoding (VDE) is then applied in multiple passes, left to right and then right to left; each pass is one layer, and the z-shaped pattern gives the name zigzag. If a block is deleted, regenerating it requires many other blocks, which makes generation attacks expensive. Finally, zk-SNARK zero-knowledge proofs are used to compress the proofs.

Repeating the proof of replication over time yields the proof of spacetime, so proof of spacetime and proof of replication actually work together.

At this point, however, proof of spacetime only proves storage space; block production rules still have to be designed, and Filecoin uses a VRF-like consensus algorithm (the original white paper proposed PoW).

The consensus uses a VRF-like secret leader election: a miner computes a hash of an input and checks whether it is below a target, but different storage miners have different targets, proportional to their share of the whole network's storage power. The larger the share of storage proofs a miner has generated, the more likely it is to win a block. So in the early days, when Filecoin storage was in short supply, miners could store their own junk data to raise their winning rate.

$$ H(<t||rand(t)>_{M_i})/2^L \leq p_i^t/ \sum_j p_j^t $$

$H$: hash function

$rand(t)$: a random number at time t; a sequence of consecutive random numbers can be generated using a VDF.

$M_i$: miner $i$ signs the input with its key.

$2^L$: normalizes the hash to a value between 0 and 1.

The right-hand side is the ratio of the miner's storage power to the total network power: the larger the share, the greater the chance of winning, somewhat like PoS.
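The inequality can be checked in integer arithmetic by rearranging $H/2^L \leq p_i/\sum_j p_j$ into $H \cdot \sum_j p_j \leq p_i \cdot 2^L$. A sketch using SHA-256 as the hash (so $L = 256$; the real protocol's hashing and signing differ):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

// isWinner interprets the ticket hash as a number in [0, 2^256) and checks
// H * totalPower <= minerPower * 2^256, i.e. H/2^256 <= minerPower/totalPower.
func isWinner(ticket []byte, minerPower, totalPower int64) bool {
	h := sha256.Sum256(ticket)
	lhs := new(big.Int).Mul(new(big.Int).SetBytes(h[:]), big.NewInt(totalPower))
	scale := new(big.Int).Lsh(big.NewInt(1), 256) // 2^L with L = 256
	rhs := new(big.Int).Mul(scale, big.NewInt(minerPower))
	return lhs.Cmp(rhs) <= 0
}

func main() {
	// With 100% of the power a miner always wins; with none it never does.
	fmt.Println(isWinner([]byte("ticket"), 10, 10)) // true
	fmt.Println(isWinner([]byte("ticket"), 0, 10))  // false
}
```

This is the shape of the IsProofAWinner check used in the mining pseudocode below: the larger minerPower is relative to totalPower, the larger the range of hashes that win.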

After abstraction, the mining code looks like this:

func Mine(minerKey PrivateKey) {
    for r := range rounds { // for each round
        bestTipset, tickets := ChainTipsMgr.GetBestTipsetAtRound(r - 1)

        ticket := GenerateTicket(minerKey, bestTipset)

        // Generate an ElectionProof and check if we win
        // Note: even if we don't win, we want to make a ticket
        // in case we need to mine a null round
        win, proof := CheckIfWinnerAtRound(minerKey, r, bestTipset)
        if win {
            GenerateAndBroadcastBlock(bestTipset, tickets, proof)
        } else {
            // Even if we don't win, add our ticket to the tracker in
            // case we need to mine on it later.
            ChainTipsMgr.AddFailedTicket(bestTipset, ticket)
        }
    }
}
Check whether you were elected:

const RandomnessLookback = 1 // Also referred to as "K" in many places

func CheckIfWinnerAtRound(key PrivateKey, n Integer, parentTipset Tipset) (bool, ElectionProof) {
  lbt := ChainTipsMgr.TicketFromRound(parentTipset, n-RandomnessLookback)

  eproof := ComputeElectionProof(lbt, key)

  minerPower := GetPower(miner)
  totalPower := state.GetTotalPower()

  if IsProofAWinner(eproof, minerPower, totalPower) {
    return True, eproof
  }
  return False, None
}

Question: because this is a secret leader election, the consensus is only "expected consensus". In a given round it is possible that no one is elected, or that several miners are elected at once (in which case the multiple blocks are packed into a tipset). The scheme is not yet perfect.


Added by phence on Mon, 16 Sep 2019 06:16:49 +0300