    Author Topic: Using compact indexes instead of hashes as identifiers.  (Read 2886 times)

    jl777 (OP)
    Legendary
    Activity: 1176
    Merit: 1134

    February 26, 2016, 03:50:34 AM
    Merited by ABCbits (2)
    #1

    Quote
    An implementation idea would be to assume that SIGHASH_ALL is used for all of these; it seems that other SIGHASH modes are rarely used, and I am not sure it makes sense to support them for radically new use cases.
    Well, everything I described there is completely compatible with all sighash types, so there is no real need to limit flexibility. Assuming the sighash code is separate and reused, it shouldn't increase implementation complexity.

    Quote
    Also, is there a reason a unique number couldn't be used to identify each txid and even each output script (address)? To my thinking, once a block is past any chance of being reorganized, there is a canonical ordering of all blocks, and therefore of all tx, vins, and vouts.

    Since each spend currently requires the 32-byte txid and vout, mapping this to 4 or 6 bytes creates a lot of space savings. There is a lot of redundancy in the blockchain, with each txid potentially being duplicated once for each vout.
    That could be done-- but there is no need to do so normatively. E.g., peers could agree to transmit data using these compressed indexes while still hashing the original values. This has the advantage that the IDs do not change out from under already-authored transactions, and it makes sure that offline devices don't need access to (or to trust) external index providers.
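
    For illustration, here is a minimal sketch (purely hypothetical names, not anyone's actual implementation) of such a local, non-normative mapping: an append-only table assigns each txid a 32-bit index in canonical block order, so every node that has validated the same chain computes the same numbering, while hashing and signing continue to use the full 32-byte txid.
    Code:
    /* Local txid <-> 32-bit index table; the index is purely internal. */
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct { uint8_t bytes[32]; } txid_t;

    static txid_t *Txids;      /* append-only, in canonical block order */
    static uint32_t Numtxids;

    /* Called for each tx as blocks are processed in canonical order. */
    uint32_t txid_register(const txid_t *txid)
    {
        Txids = realloc(Txids, (Numtxids + 1) * sizeof(*Txids));
        Txids[Numtxids] = *txid;
        return Numtxids++;     /* the compact identifier */
    }

    /* Expand a compact index back to the full 32-byte txid. */
    const txid_t *txid_expand(uint32_t txidind)
    {
        return (txidind < Numtxids) ? &Txids[txidind] : NULL;
    }
    A real node would grow the table geometrically and keep a hash table for the reverse (txid to index) lookup; the point is only that the compact form is derived locally and deterministically.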

    Quote
    The other big redundancy is reused addresses, which are actually rmd160 hashes inside spend scripts. Using the canonical ordering of everything, each rmd160 hash would map to a 32-bit index, and with the vast majority of scripts being standard, a few more bytes can be encoded into a few bits. However, with the coming diaspora of p2sh script proliferation, it could be that address numbering won't be so effective in the future.
    This would undermine the privacy/fungibility properties of Bitcoin as a whole by incentivizing parties to reuse addresses. Bitcoin's privacy strongly depends on non-reuse, and additional pressure against it for the benefit of saving on the order of 12 bytes per txout doesn't sound like a win-- especially as it would be used to justify bad software, bad business practices, and bad public policy that force users to reuse. Access to those indexes would have to be handled quite carefully, since if you paid the wrong one the funds would be stolen.
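
    As a concrete illustration of the packing the quote describes (the exact bit layout here is an assumption), a single 32-bit word could carry an index into the canonical rmd160 table plus a few bits tagging the standard script type:
    Code:
    /* Hypothetical packing: 28 bits of rmd160-table index, 4 bits of
       standard-script type, so an address reference fits in 32 bits. */
    #include <stdint.h>
    #include <assert.h>

    enum script_type { SCRIPT_P2PKH = 0, SCRIPT_P2PK = 1, SCRIPT_P2SH = 2, SCRIPT_OTHER = 15 };

    static inline uint32_t addr_pack(uint32_t rmdind, enum script_type type)
    {
        assert(rmdind < (1u << 28));   /* index must fit in 28 bits */
        return (rmdind << 4) | (uint32_t)type;
    }

    static inline uint32_t addr_rmdind(uint32_t packed) { return packed >> 4; }

    static inline enum script_type addr_type(uint32_t packed)
    {
        return (enum script_type)(packed & 0xf);
    }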

    Thanks for your thoughts!
    I used to think the BTC blockchain's exponential growth was a big problem. Then I realized I could distill it to one quarter the size and also sync in parallel at bandwidth speeds. Now I don't worry about blockchain bloat so much.

    I think we have to admit that a large part of the BTC blockchain has been deanonymized. And until new methods like CT are adopted, this will only get worse.

    So rather than fight a losing battle, why not accept that there are people who don't care much about privacy and for whom convenience is more important? In fact, they might be required to be public. The canonical encoding encodes the existing blockchain (and future public blockchains) more compactly than any other method, as it ends up as high-entropy compressed 32-bit numbers versus a 32-byte txid + vout. The savings is much more than 12 bytes: it takes only 6 bytes to encode an unspent, so it is closer to 30 bytes.

    A lot of bitcoin processing is slow for a variety of reasons. Being able to do computations via standard integers allows an order of magnitude faster operations in many cases. I think the usage of a DB is the cause of most of the slowdown; since the blockchain mostly never changes, there is no need for ACID-compliant DB operations on everything. With my design all nodes are effectively running with txindex=1, and it takes very little time to "rescan" for a new privkey, as everything is already in linked lists and hash tables.
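
    As a sketch of what the 6-byte figure and the linked-list "rescan" could look like (field sizes and names are assumptions, not the actual implementation):
    Code:
    #include <stdint.h>

    /* Hypothetical 6-byte unspent reference: a 32-bit canonical txid index
       plus a 16-bit vout replaces the 32-byte txid + 4-byte vout. */
    #pragma pack(push, 1)
    typedef struct {
        uint32_t txidind;
        uint16_t vout;
    } unspent_ref_t;
    #pragma pack(pop)

    /* Hypothetical per-output record: each output links to the previous
       output paying the same address, so all of an address's outputs are
       found by walking integer indexes instead of querying a DB. */
    typedef struct {
        uint64_t value;           /* satoshis */
        uint32_t addrind;         /* index into the canonical rmd160 table */
        uint32_t prevunspentind;  /* previous output for same address; 0 = end */
    } txout_entry_t;

    /* Total received by an address: a pure integer pointer-chase. */
    uint64_t addr_total(const txout_entry_t *txouts, uint32_t lastunspentind)
    {
        uint64_t total = 0;
        for (uint32_t ind = lastunspentind; ind != 0; ind = txouts[ind].prevunspentind)
            total += txouts[ind].value;
        return total;
    }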

    The assumption is that each node is maintaining the canonical indexes itself; otherwise trust is required. With that assumption, this indexing system can be viewed as a highly efficient lossless codec, so I am not sure why you consider it bad software. I assume you don't consider using gzip a bad practice? I try hard to preserve compatibility with the existing Bitcoin Core. What I describe does not change the network protocol at all; that is a separate post :)

    I am not advocating changing the signed tx to be based on 32-bit numbers! All I recommend is encoding them internally (once beyond any practical chance of being reorged) as 32 bits. All nodes will end up with the same numbering. However, all external interactions continue with the fully expanded form, else it wouldn't be compatible at all. All bundles and subsets of bundles would have verifiable hashes, and this creates an efficient encoding for permanent archival storage, with arguably isomorphic behavior to the current raw blockchain inside a DB. The tx that are signed and verified are of course in the fully expanded form. However, if you just want to explore the blockchain, nobody really cares about the exact txid, just that funds went from A to B.
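
    For example (a sketch; the surrounding index lookup is assumed), the RPC layer would expand the internal 32-bit index back to the full txid and render it as the standard hex string before anything leaves the node:
    Code:
    #include <stdint.h>
    #include <stdio.h>

    /* Render a 32-byte txid as the familiar 64-character hex string for
       RPC output; txids are displayed byte-reversed relative to their
       internal byte order. */
    void txid_to_hex(const uint8_t txid[32], char hex[65])
    {
        for (int i = 0; i < 32; i++)
            sprintf(&hex[i * 2], "%02x", txid[31 - i]);
        hex[64] = '\0';
    }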

    These are not just thoughts, but a description of a working system that syncs the entire blockchain and creates the above data structures in about half an hour if you have enough bandwidth and 8 cores. It is bandwidth-limited, so on a slow home connection of 20 Mbps a full sync takes 6 hours. I am currently debugging the unspents handling, but the rest is basically working and just needs validation. This part is derived from MGW, which has been in service for a couple of years.

    I wanted to use the standard Bitcoin RPC test suite, and I assume what is in the repo is the most complete set? Is there some brute-force RPC tester that exercises large parts of the RPC and could be used as a validation step for a new implementation of Bitcoin Core?

    James

    http://www.digitalcatallaxy.com/report2015.html
    100+ page annual report for SuperNET