Restarted test node from wedged state @ +/- few seconds Wed Feb 26 19:15:09 CDT 2020

Catching up to max height... 619140

SetBestChain: new best=0000000000000000001067bb9fb2aac46e14982a4f8594eacce35102bf60a54b height=619140 work=4042714536247139317944455149

Complete.

Now to let the node get 10 connections, currently has 9...

Done, now has 10 connections.

GDB connected to process, currently in 'continue' mode.

mod6@localhost ~ $ gdb -p 31560
GNU gdb (Gentoo 7.12.1 vanilla) 7.12.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: .
Find the GDB manual and other documentation resources online at: .
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 31560
[New LWP 31563]
[New LWP 31565]
[New LWP 31566]
[New LWP 31567]
[New LWP 31568]
0x000000000079c39e in __syscall ()
(gdb) break net.h:317
Breakpoint 1 at 0x4549c0: file net.h, line 317.
(gdb) continue
Continuing.

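Note on what the breakpoint guards: net.h:317 appears to be the debug printf added by the patch listed next (counting down from the hunk that starts at new line 309). The sketch here isolates the arithmetic behind that check, as an illustration only. The helper names IsSizeWedge and HeaderSizeField are made up and are not in net.h, and sizes are passed as plain integers rather than taken from a real vSend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; only the arithmetic mirrors the patched net.h.

// Mirrors the added debug check: true once the pending payload
// (everything after nMessageStart) no longer fits in 32 bits.
bool IsSizeWedge(std::int64_t nSendSize, std::int64_t nMessageStart)
{
    assert(nSendSize >= nMessageStart);
    return (nSendSize - nMessageStart) >= 0x100000000LL;
}

// Mirrors `unsigned int nSize = vSend.size() - nMessageStart;`:
// narrowing to 32 bits keeps only the difference modulo 2^32,
// and that truncated value is what gets copied into the header.
std::uint32_t HeaderSizeField(std::int64_t nSendSize, std::int64_t nMessageStart)
{
    assert(nSendSize >= nMessageStart);
    return static_cast<std::uint32_t>(nSendSize - nMessageStart);
}
```

For example, with nMessageStart = 24 and a payload of 0x100000000 + 5 bytes, IsSizeWedge returns true and HeaderSizeField returns 5, so the header would advertise 5 bytes for a payload of just over 4 GiB. For a payload of 924116 bytes both helpers behave as expected: no wedge, and the size field is 924116.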
--------------------------------------------------------------------------

Here are the code changes that I've compiled in this time:

diff -uNr a/bitcoin/src/net.h b/bitcoin/src/net.h
--- a/bitcoin/src/net.h 492c9cc92a504bb8174d75fafcbee6980986182a459efc9bfa1d64766320d98ba2fa971d78d00a777c6cc50f82a5d424997927378e99738b1b3b550bdaa727f7
+++ b/bitcoin/src/net.h 7803a1afee9bbb0feda42f6bb4e801c2d0d96ebeae057642b48c7962d16cbee3869509e56867b6071e580d6b44ab9e8e8d1302109804a6e32918982477c26701
@@ -106,8 +106,8 @@
     int64 nLastRecv;
     int64 nLastSendEmpty;
     int64 nTimeConnected;
-    unsigned int nHeaderStart;
-    unsigned int nMessageStart;
+    int64 nHeaderStart;
+    int64 nMessageStart;
     CAddress addr;
     int nVersion;
     std::string strSubVer;
@@ -289,7 +289,7 @@
 
     void AbortMessage()
     {
-        if (nHeaderStart == -1)
+        if (nHeaderStart < 0)
             return;
         vSend.resize(nHeaderStart);
         nHeaderStart = -1;
@@ -309,9 +309,14 @@
             return;
         }
 
-        if (nHeaderStart == -1)
+        if (nHeaderStart < 0)
             return;
 
+        // XXX Debug: Check for 'Size wedge' early, before we call Hash()
+        if ((vSend.end() - (vSend.begin() + nMessageStart)) >= 0x100000000) {
+            printf("XXX Debug: Size wedge. Break.\n");
+        }
+
         // Set the size
         unsigned int nSize = vSend.size() - nMessageStart;
         memcpy((char*)&vSend[nHeaderStart] + offsetof(CMessageHeader, nMessageSize), &nSize, sizeof(nSize));
@@ -337,7 +342,7 @@
 
     void EndMessageAbortIfEmpty()
     {
-        if (nHeaderStart == -1)
+        if (nHeaderStart < 0)
             return;
         int nSize = vSend.size() - nMessageStart;
         if (nSize > 0)

--------------------------------------------------------------------------

Test 3: Send 49999 'getdata for block' commands to the node, see how it reacts and if it recovers.

Host, Port, File = 127.0.0.1, 8333, snap_49999.txt

mod6@localhost ~ $ ./wedger.py 127.0.0.1 8333 snap_49999.txt
Alive: V=99999 (/therealbitcoin.org:0.9.99.99/) Jumpers=0x1 (TRB-Compat.) Return Addr=1.2.3.4:8333 Blocks=619140
Sending 1799991-byte message packet...
Now listening for replies (Ctl-C to quit...)
Violated BTC Protocol: Invalid payload length!
....

In the debug.log after all of the request spam:

Ate 49,999 of these:

received getdata for: block 000000000000000000100075ea71350f142def8ff62e390e260313b4fe199eda
02/27/20 01:43:58 sending: block Size large: 924116 (0x7fce411fd59f, 0x7fce412def73) (924116 bytes)
received getdata for: block 0000000000000000000e30c91414a0693a2f375490f3de31d71eeee1d3c10004
02/27/20 01:43:58 sending: block Size large: 644026 (0x7fce412def8b, 0x7fce4137c345) (644026 bytes)
received getdata for: block 0000000000000000002226baeef3891b93962be1973450c6d606cffc69be9c07
02/27/20 01:43:58 sending: block Size large: 341524 (0x7fce4137c35d, 0x7fce413cf971) (341524 bytes)

No wedge.

mod6@localhost ~ $ free
              total        used        free      shared  buff/cache   available
Mem:      131935684    39048148      894868         744    91992668    91436704
Swap:             0           0           0

mod6@localhost ~ $ uptime
 20:46:23 up 91 days, 9:43, 11 users, load average: 0.48, 0.73, 0.78

mod6@localhost ~ $ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd     free   buff    cache   si   so    bi    bo   in   cs us sy id wa st
 0  1      0 42080220 384032 88976368    0    0     2    77    0    0  1  0 99  0  0
 0  1      0 42073160 384472 88982080    0    0    80 30804 7281 5918  1  0 99  0  0
 0  0      0 42076988 384472 88982976    0    0    24 22668 5469 1743  0  0 98  2  0
^C

This time through, the node did not wedge and did not hit the breakpoint that indicates a wedge. TRB continued on after all of the requests.

-----------------------------------------------------------------------

Debugging section:

--------------------------------------------------------------------------
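A quick cross-check of the three 'Size large' entries quoted from debug.log in Test 3. Reading the two hex values on each line as the start and end addresses of the payload is an assumption (the log format is not spelled out here), and the helper names LoggedSpan and GapToNext are made up for this sketch. Under that reading, each address pair should span exactly the logged byte count, and consecutive payloads should be separated by one 24-byte message header.

```cpp
#include <cassert>
#include <cstdint>

// Wire message header: magic(4) + command(12) + length(4) + checksum(4).
const std::uint64_t HEADER_BYTES = 24;

// Bytes covered by one logged (begin, end) address pair.
std::uint64_t LoggedSpan(std::uint64_t nBegin, std::uint64_t nEnd)
{
    assert(nEnd >= nBegin);
    return nEnd - nBegin;
}

// Bytes between the end of one logged payload and the start of the next.
std::uint64_t GapToNext(std::uint64_t nPrevEnd, std::uint64_t nNextBegin)
{
    assert(nNextBegin >= nPrevEnd);
    return nNextBegin - nPrevEnd;
}
```

Fed the values from the log, LoggedSpan(0x7fce411fd59f, 0x7fce412def73) gives 924116, the second and third pairs give 644026 and 341524, and GapToNext between each payload end and the next payload start gives 24. That is consistent with the three block messages sitting back to back in one send buffer, each well under the 0x100000000 threshold tested by the debug check.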