-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 ..::[ util.cpp phexdigit array Bug Analysis ]::.. [ Date: 2018.04.29 ] [ Block Height: 520373 ] [ Author: mod6 ] 0x00: [ Introduction ] Yesterday, I was adventuring in the TRB code, considering possible raw transaction changes, when I happened across a bug in the wild. This nasty critter was easy to overlook in the forest of other commas used to define the phexdigit[256] array; it is simply missing a ',' separating two of it's values [R.01]. 0x01: [ Investigation: Stage 1 (From whence it came) ] The first thing that came to mind was, "Where was this introduced?!". I went and looked first at The Bitcoin Foundation's 'genesis.vpatch', and sure enough there it was, just waiting for us. Next, I went up to shithub to see if I could find the responsible party. Now you would think that they, would have all of the previous history easy to browse at shithub.com/bitcoin. But au contraire! They only keep history back to tag v0.7.1. Derp. I started with that tagged version (v0.7.1) and found it is indeed fixed there. I discovered that the fix for this bug was placed in prb [R.02] code in v0.7.0. But that didn't satisfy me yet, I wanted to know who was responsible. I went around chasing down all of the older versions to see where it was introduced. The util.cpp phexdigit array was introduced, with bug included, in version 'v0.1.6test1'. Worth noting, there is only one version I could even find before 'v0.1.6test1'; 'v0.1.5', which contained a util.cpp, but not a ParseHex function or the phexdigit array. Since all I could find were tarballs of the ancient versions, there was obviously no file history, V had obviously not been invented yet, so no one specifically at whom to point the finger. Let's go over the places in the TRB v054 where ParseHex(const char* psz) from util.cpp is called: Line 1673 of bitcoinrpc.cpp in: 'Value getwork(const Array& params, bool fHelp)' Line 1775 of bitcoinrpc.cpp in: 'Value getmemorypool(const Array& params, bool fHelp)' Line 1475 of main.cpp in: 'bool LoadBlockIndex(bool fAllowNew)' The other places are either it's own declaration in util.h and implementation util.cpp. As well, there are, ironically, tests around this in test/util_tests.cpp of this beginning on line 55. I did actually attempt to compile and run these tests; however, even though I was able to compile the source, it seems, the linker failed on the boost unit test framework. This is a discussion for a different time perhaps. Interestingly enough, the unit tests are attempting to parse the same hex value (the genesis block scriptPubKey) that I attempt further into this report. This is pure accident as I didn't even examine the unit tests until after I had created my own tests. 0x02: [ Investigation: Stage 2 (Basic Entomology) ] I started to walk down the mental path of what this actually means in terms of the code, and I become worried that the ParseHex(const char* psz) in util.cpp would be doing whatever arithmetic off-by-one. And to make matters even worse for the man adventuring in the forest, this is a double-headed bug: Not only is the array length of phexdigit[256] actually shortened by one value, now instead of using 0xff to do whatever arithmetic values, now we're using (-2) 0xfe some where! The first thing that I wanted to do was to show that if I used the phexdigit array in a piece of test code, we would get differing values; before the fix and after the fix. However, before I even jumped into that, I also wanted to first show how missing a ',' in an array definition as this would effect the array's values themselves: I'll start by showing that if we initialize an array with a missing comma, how this will push a '-2' onto our stack by adding the values not separated by comma together, followed by the comma separated '-1': mod6@gentoo ~/phexdigit $ cat missing_comma.c #include void main() { const char bytes[] = { -1 -1, -1 }; write(1, bytes, sizeof(bytes)); } mod6@gentoo ~/phexdigit $ gcc -g -o missing_comma missing_comma.c mod6@gentoo ~/phexdigit $ gcc -S missing_comma.c && cat missing_comma.s .file "missing_comma.c" .text .globl main .type main, @function main: .LFB0: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset 6, -16 movq %rsp, %rbp .cfi_def_cfa_register 6 subq $16, %rsp movb $-2, -16(%rbp) movb $-1, -15(%rbp) leaq -16(%rbp), %rax movl $2, %edx movq %rax, %rsi movl $1, %edi call write leave .cfi_def_cfa 7, 8 ret .cfi_endproc .LFE0: .size main, .-main .ident "GCC: (Gentoo 4.8.4 p1.6, pie-0.6.1) 4.8.4" .section .note.GNU-stack,"",@progbits Now that we've seen that, what happens if we run 'missing_comma'? mod6@gentoo ~/phexdigit $ ./missing_comma | xxd 0000000: feff .. mod6@gentoo ~/phexdigit $ As you can see, there we are, first value out is 0xfe (-2), followed by 0xff (-1). Let's show how this looks in its corrected form, where we do have a comma separating all of our array values: mod6@gentoo ~/phexdigit $ cat no_missing_comma.c #include void main() { const char bytes[] = { -1, -1, -1 }; write(1, bytes, sizeof(bytes)); } mod6@gentoo ~/phexdigit $ gcc -g -o no_missing_comma no_missing_comma.c mod6@gentoo ~/phexdigit $ gcc -S no_missing_comma.c && cat no_missing_comma.s .file "no_missing_comma.c" .text .globl main .type main, @function main: .LFB0: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset 6, -16 movq %rsp, %rbp .cfi_def_cfa_register 6 subq $16, %rsp movb $-1, -16(%rbp) movb $-1, -15(%rbp) movb $-1, -14(%rbp) leaq -16(%rbp), %rax movl $3, %edx movq %rax, %rsi movl $1, %edi call write leave .cfi_def_cfa 7, 8 ret .cfi_endproc .LFE0: .size main, .-main .ident "GCC: (Gentoo 4.8.4 p1.6, pie-0.6.1) 4.8.4" .section .note.GNU-stack,"",@progbits Alright, as we can see, as expected, '-1' gets pushed onto our stack three times. What happens if we run no_missing_comma? mod6@gentoo ~/phexdigit $ ./no_missing_comma | xxd 0000000: ffff ff ... mod6@gentoo ~/phexdigit $ Alright, as expected, we get 0xff (-1) output three times. The next thing that I wanted to try was to take the whole existing (not fixed) phexdigit array into a test program and output its hex values, then compare those values with a fixed version to see the differences. mod6@gentoo ~/phexdigit $ cat phexdigit_before.c #include void main() { static char phexdigit[256] = { -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1 -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, }; write(1, phexdigit, sizeof(phexdigit)); } mod6@gentoo ~/phexdigit $ cat phexdigit_after.c #include void main() { static char phexdigit[256] = { -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, }; write(1, phexdigit, sizeof(phexdigit)); } mod6@gentoo ~/phexdigit $ gcc -g -o phexdigit_before phexdigit_before.c mod6@gentoo ~/phexdigit $ gcc -g -o phexdigit_after phexdigit_after.c mod6@gentoo ~/phexdigit $ ./phexdigit_before | xxd > phexdigit_before.hex mod6@gentoo ~/phexdigit $ ./phexdigit_after | xxd > phexdigit_after.hex Let's take a look at the output from both: mod6@gentoo ~/phexdigit $ cat phexdigit_before.hex 0000000: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000010: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000020: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000030: 0001 0203 0405 0607 0809 ffff ffff ffff ................ 0000040: ff0a 0b0c 0d0e 0fff ffff ffff ffff ffff ................ 0000050: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000060: ff0a 0b0c 0d0e 0fff ffff ffff ffff fffe ................ 0000070: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000080: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000090: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000a0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000b0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000c0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000d0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000e0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000f0: ffff ffff ffff ffff ffff ffff ffff ff00 ................ mod6@gentoo ~/phexdigit $ cat phexdigit_after.hex 0000000: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000010: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000020: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000030: 0001 0203 0405 0607 0809 ffff ffff ffff ................ 0000040: ff0a 0b0c 0d0e 0fff ffff ffff ffff ffff ................ 0000050: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000060: ff0a 0b0c 0d0e 0fff ffff ffff ffff ffff ................ 0000070: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000080: ffff ffff ffff ffff ffff ffff ffff ffff ................ 0000090: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000a0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000b0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000c0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000d0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000e0: ffff ffff ffff ffff ffff ffff ffff ffff ................ 00000f0: ffff ffff ffff ffff ffff ffff ffff ffff ................ mod6@gentoo ~/phexdigit $ As we can see in the file phexdigit_before.hex, we not only see the value 0xfe contained in the output, but also a 0x00 at the end, indicating the two-headedness of this bug. In the second, corrected version, phexdigit_after.hex, we see that all the values that are intended to be 0xff are, and the entire array is filled. 0x03: [ Investigation: Stage 3 (Advanced Entomology, Into The Guts) ] ParseHex(const char* psz) is called from a few places, but one in particular I thought was worth investigating, line 10920 of genesis.vpatch where we set the scriptPubKey of the hard-coded genesis block data, as such: txNew.vout[0].scriptPubKey = CScript() << ParseHex("04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f") << OP_CHECKSIG; Seeing as how correcting this bug changes how ParseHex(const char* psz) works, I implemented a small test program to see outcomes from before the fix, and after. This is the entire ParseHex(const char* psz) function brought into my test program, along with my driving code in main(). mod6@gentoo ~/phexdigit $ cat full_before.cpp #include #include #include #define loop for(;;) using namespace std; vector ParseHex(const char* psz) { static char phexdigit[256] = { -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1 -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, }; // convert hex dump to vector vector vch; loop { while (isspace(*psz)) psz++; char c = phexdigit[(unsigned char)*psz++]; if (c == (char)-1) break; unsigned char n = (c << 4); c = phexdigit[(unsigned char)*psz++]; if (c == (char)-1) break; n |= c; vch.push_back(n); } return vch; } int main() { const char *scriptPubKey = "04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f"; vector vch; vch = ParseHex(scriptPubKey); for(vector::iterator it = vch.begin(); it != vch.end(); ++it) { printf("%c", (*it)); } return 0; } mod6@gentoo ~/phexdigit $ cat full_after.cpp #include #include #include #define loop for(;;) using namespace std; vector ParseHex(const char* psz) { static char phexdigit[256] = { -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, }; // convert hex dump to vector vector vch; loop { while (isspace(*psz)) psz++; char c = phexdigit[(unsigned char)*psz++]; if (c == (char)-1) break; unsigned char n = (c << 4); c = phexdigit[(unsigned char)*psz++]; if (c == (char)-1) break; n |= c; vch.push_back(n); } return vch; } int main() { const char *scriptPubKey = "04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f"; vector vch; vch = ParseHex(scriptPubKey); for(vector::iterator it = vch.begin(); it != vch.end(); ++it) { printf("%c", (*it)); } return 0; } Let's diff these two tests just to ensure that the only line that was changed was indeed, just adding the missing comma to the phexdigit array: mod6@gentoo ~/phexdigit $ diff -uNr full_before.cpp full_after.cpp - --- full_before.cpp 2018-04-29 02:45:38.839025733 -0500 +++ full_after.cpp 2018-04-29 02:45:49.654026384 -0500 @@ -15,7 +15,7 @@ 0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, - - -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1 + -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, Ok, so we can see that is correct. Now, let's do the same thing as above in section 2: Compile, Execute, Compare. mod6@gentoo ~/phexdigit $ g++ -g -o full_before full_before.cpp mod6@gentoo ~/phexdigit $ g++ -g -o full_after full_after.cpp mod6@gentoo ~/phexdigit $ ./full_before | xxd > full_before.hex mod6@gentoo ~/phexdigit $ ./full_after | xxd > full_after.hex mod6@gentoo ~/phexdigit $ diff -uNr full_before.hex full_after.hex mod6@gentoo ~/phexdigit $ cat full_before.hex 00000000: 0467 8afd b0fe 5548 2719 67f1 a671 30b7 .g....UH'.g..q0. 00000010: 105c d6a8 28e0 3909 a679 62e0 ea1f 61de .\..(.9..yb...a. 00000020: b649 f6bc 3f4c ef38 c4f3 5504 e51e c112 .I..?L.8..U..... 00000030: de5c 384d f7ba 0b8d 578a 4c70 2b6b f11d .\8M....W.Lp+k.. 00000040: 5f _ mod6@gentoo ~/phexdigit $ cat full_after.hex 00000000: 0467 8afd b0fe 5548 2719 67f1 a671 30b7 .g....UH'.g..q0. 00000010: 105c d6a8 28e0 3909 a679 62e0 ea1f 61de .\..(.9..yb...a. 00000020: b649 f6bc 3f4c ef38 c4f3 5504 e51e c112 .I..?L.8..U..... 00000030: de5c 384d f7ba 0b8d 578a 4c70 2b6b f11d .\8M....W.Lp+k.. 00000040: 5f _ We get the same exact output for both before and after. Perhaps the scriptPubKey from the genesis block is a bad test case. Let's do some debugging to see where we would hit this landmine. I have crafted a new test driver based on the full_before.cpp from above. The difference here is that I know now that to hit the 0xfe (-2), you must attempt to parse a letter outside of what would be normal hexadecimal values [0-9A-F]. The landmine exists at array element one hundred eleven, 111. Which corresponds to the lower case 'o' decimal value. In my debugging, I'm going to show when we pass as string, "ab6o78" to ParseHex, we step on the landmine. Before we debug, let's first show the code that I'm going to use: mod6@gentoo ~/phexdigit $ cat debug_before.cpp #include #include #include #define loop for(;;) using namespace std; vector ParseHex(const char* psz) { static char phexdigit[256] = { -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, 0,1,2,3,4,5,6,7,8,9,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,0xa,0xb,0xc,0xd,0xe,0xf,-1,-1,-1,-1,-1,-1,-1,-1,-1 -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, }; // convert hex dump to vector vector vch; loop { while (isspace(*psz)) psz++; char c = phexdigit[(unsigned char)*psz++]; if (c == (char)-1) break; unsigned char n = (c << 4); c = phexdigit[(unsigned char)*psz++]; if (c == (char)-1) break; n |= c; vch.push_back(n); } return vch; } int main() { //const char *scriptPubKey = "04678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5f"; const char *scriptPubKey = "ab6o78"; // XXX NOTE THE LOWERCASE 'o' vector vch; vch = ParseHex(scriptPubKey); for(vector::iterator it = vch.begin(); it != vch.end(); ++it) { printf("%c", (*it)); } return 0; } mod6@gentoo ~/phexdigit $ g++ -g -o debug_before debug_before.cpp mod6@gentoo ~/phexdigit $ gdb debug_before GNU gdb (GDB) 7.8.90 for GNAT GPL 2015 [rev=gdb-head-ref-102395-gccbe2a7] Copyright (C) 2015 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. See your support agreement for details of warranty and support. If you do not have a current support agreement, then there is absolutely no warranty for this version of GDB. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details.For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from debug_before...done. (gdb) break main Breakpoint 1 at 0x400ae0: file debug_before.cpp, line 57. (gdb) break ParseHex Breakpoint 2 at 0x400a0e: file debug_before.cpp, line 35. (gdb) run Starting program: /home/mod6/phexdigit/debug_before Breakpoint 1, main () at debug_before.cpp:57 57 const char *scriptPubKey = "ab6o78"; (gdb) continue Continuing. Breakpoint 2, ParseHex (psz=0x401f84 "ab6o78") at debug_before.cpp:35 35 vector vch; (gdb) printf "%d", *psz 97(gdb) n 38 while (isspace(*psz)) (gdb) n 40 char c = phexdigit[(unsigned char)*psz++]; (gdb) printf "%d", *psz (gdb) n 41 if (c == (char)-1) (gdb) p/x c $1 = 0xa (gdb) n 43 unsigned char n = (c << 4); (gdb) n 44 c = phexdigit[(unsigned char)*psz++]; (gdb) printf "%d", *psz 98(gdb) n 45 if (c == (char)-1) (gdb) p/x c $2 = 0xb (gdb) n 47 n |= c; (gdb) n 48 vch.push_back(n); (gdb) p/x n $3 = 0xab (gdb) n 49 } (gdb) n 40 char c = phexdigit[(unsigned char)*psz++]; (gdb) printf "%d", *psz 54(gdb) n 41 if (c == (char)-1) (gdb) p/x c $4 = 0x6 (gdb) n 43 unsigned char n = (c << 4); (gdb) n 44 c = phexdigit[(unsigned char)*psz++]; (gdb) p/x n $5 = 0x60 (gdb) printf "%d", *psz 111(gdb) n 45 if (c == (char)-1) (gdb) p/x c $6 = 0xfe (gdb) p phexdigit[111] $7 = -2 '\376' (gdb) n 47 n |= c; (gdb) p/x n $8 = 0x60 (gdb) p/x c $9 = 0xfe (gdb) n 48 vch.push_back(n); (gdb) p/x n $10 = 0xfe (gdb) print "we should have bombed right after this, as after 'n |= c', n would have been FF, causing us to break" $11 = "we should have bombed right after this, as after 'n |= c', n would have been FF, causing us to break" (gdb) n 49 } (gdb) n 40 char c = phexdigit[(unsigned char)*psz++]; (gdb) printf "%d", *psz 55(gdb) n 41 if (c == (char)-1) (gdb) p/x c $12 = 0x7 (gdb) n 43 unsigned char n = (c << 4); (gdb) n 44 c = phexdigit[(unsigned char)*psz++]; (gdb) p/x n $13 = 0x70 (gdb) printf "%d", *psz 56(gdb) n 45 if (c == (char)-1) (gdb) p/x c $14 = 0x8 (gdb) n 47 n |= c; (gdb) p/x n $15 = 0x70 (gdb) p/x c $16 = 0x8 (gdb) n 48 vch.push_back(n); (gdb) p/x n $17 = 0x78 (gdb) n 49 } (gdb) n 40 char c = phexdigit[(unsigned char)*psz++]; (gdb) n 41 if (c == (char)-1) (gdb) n 42 break; (gdb) p/x c $18 = 0xff 0x04: [ Investigation: Conclusion ] Well, it took some time to get to the bottom of this defect, and its possible impact. However seeing as though you need to use a value outside the hexadecimal range to step on the trap, it has a lower chance of being sprung. Regardless, this is still a nasty bug, and it should get fixed. As we saw above in my debugging experience, the program ~did not~ break when it was supposed to break if the lower case 'o' was parsed. I am attaching a vpatch for this problem to this Mailing List post for consideration. Happy Hunting, mod6 [ References ]: [R.01]: http://btcbase.org/log/2018-04-29#1806228 [R.02]: http://btcbase.org/log/2015-12-21#1349600 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iQIcBAEBCgAGBQJa5oFMAAoJEHIXBai3Hq2v+SMP/0/XyhgsVMr1NEQoQZTA+BCq oDBnunNALJzkJcQGKyy9T2Ot1R4xAdcC7g5FvDtJQzNiNWOySmo4jXSPuZVkYZYx 6IKKxDQS27QcV3rc/lxOpiSGQgU0wUxeUgxJXeQaGSgwN8TSRow+W0vIpHgwQB0Z 1SdLQP0gSVYOzYsm4e1aP4hSO3rOXy4xpSixXGVYtmr+7Ec1S5KVRoxXqOo9/ij4 +w3i+9n1HkEFHQ6lzEPjxWR8m9g6jiC5XDTNHj9AGvP+slxDCZ9L2knY2FDMhWyi caLXrftib9mjLdaqZTVNqgXQXvvxrKJLcG3yg59+16QLPBy5TK7qib2+ncEYBBcR guX05Y3CyV4xS7GtkuNEyxvqmCY42MjpcxaboGaiQRjAw1gqxzeztu+aJq6pWVxR bPGflGTv6hU5MFMXHx/DK4lI0zj1MpgiRksjbnFUxzV247G2FuEFdQr7+ZIGoW4i 8p9qDwUbCS7rfdGatAomcx7faT3Sh78qVnl6GO5IQKohie0MuN1LHCegfyWPSSeO ZKtqtsdfik9Kq4aAUUhfTzXk5SR8W3yrWBaKObYuiFUQm1t0gtz7g/fYVr6Nxm2e GwSswxNW/2IgIWJDnWyyCojYnfUeVMm5uUACE8pB3MmGbr1vvxMwu6RcviR/DJF1 FX81FepTnbRtGvvI/mSc =GGig -----END PGP SIGNATURE-----