"1195725856" and other mysterious numbers

Last week was the final week for this half's performance review at Facebook, where we write summaries of work and impact we and our peers had over the last half year. Naturally, that can only mean one thing: the entire company trends towards peak levels of procrastination, doing literally anything and everything to avoid the unspeakable horror of having to write a few paragraphs of text.

My personal distraction of choice a few days before the deadline was looking at lines like this, spamming from some hosts serving NFS traffic:

RPC: fragment too large: 1195725856
RPC: fragment too large: 1212498244

Let's take a look at the kernel code responsible for generating this warning. Grepping for "fragment too large" shows it comes from svc_tcp_recv_record in net/sunrpc/svcsock.c:

if (svc_sock_reclen(svsk) + svsk->sk_datalen >
                            serv->sv_max_mesg) {
    net_notice_ratelimited(
        "RPC: fragment too large: %d\n",
        svc_sock_reclen(svsk));
    goto err_delete;
}

So we're erroring out because we got passed some message which is beyond sv_max_mesg, cool. But where does this come from? Looking at svc_sock_reclen shows the following:

static inline u32 svc_sock_reclen(struct svc_sock *svsk)
{
    return ntohl(svsk->sk_reclen) & RPC_FRAGMENT_SIZE_MASK;
}

ntohl converts a uint from network byte ordering to the host's byte ordering. The bitwise AND with RPC_FRAGMENT_SIZE_MASK results in only some of the data being retained, and looking at the definition shows us how many bits that is:

#define RPC_LAST_STREAM_FRAGMENT (1U << 31)
#define RPC_FRAGMENT_SIZE_MASK   (~RPC_LAST_STREAM_FRAGMENT)

Okay, so we will only keep the first 31 bits and zero out the high bit, since ~ is bitwise NOT.

That means that these numbers come from the first four bytes of the fragment, omitting the final highest bit, which is reserved to record whether the fragment is the last one for this record (see svc_sock_final_rec). The fact that the error happens so early in fragment parsing in particular got me thinking that the fragment may not be protocol-confirming in the first place, since it's not like we got very far in processing at all, not even past the first four bytes. So what are these first four bytes, then? Looking at the numbers in hex shows something interesting:

% python
>>> hex(1195725856)
'0x47455420'
>>> hex(1212498244)
'0x48454144'

These are all really tightly clustered, generally from 0x40 to 0x50, which implies there might actually be some semantic meaning per-byte. And since these are char-sized, here's a guess about what might be encoded in them…

>>> '\x47\x45\x54\x20'
'GET '
>>> '\x48\x45\x41\x44'
'HEAD'

Oh dear. Somebody is sending HTTP requests to NFS RPC, but at least we are outright rejecting the fragments instead of actually allocating and dirtying a gigabyte of memory.

Next up was finding out who's actually sending these requests. rpcinfo -p shows NFS is listening on the default port, 2049, so we can set up a trap with tcpdump like so:

tcpdump -i any -w trap.pcap dst port 2049
# ...wait for logs to appear again, then ^C...
tcpdump -qX -r trap.pcap | less +/HEAD

From here, it was pretty easy to catch the origin of these requests by tracing back to the origin host and service using the captured pcap data. After that one can coordinate with the team responsible to work out what's actually going on here, and avoid these errant packets being sent out in the first place. As a bonus, you also get to learn more about parts of infrastructure you might otherwise not interact with, which is always cool. :-)

Funnily enough, if you Google for these numbers you can find tons of threads with people encountering them in the wild. Maybe we should start printing ASCII in future in some of the error paths hit when all character values are between 0x0 and 0x7F. I'm sure it would help a lot of people realise what's going on much more quickly. Maybe I'll send a patch upstream to do that in svc_tcp_recv_record and a few other places in the kernel that directly parse the first few data bytes from packets as an integer, let's see.

Here's a trivial program that can generate a bunch of other integers for HTTP that might be of interest:

#include <assert.h>
#include <byteswap.h>
#include <inttypes.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

static int system_is_little_endian(void) {
    static const int tmp = 1;
    return *(const char *)&tmp == 1;
}

#define print_reinterpreted_inner(type, fmt, bs_func, hdr)      \
    do {                                                        \
        if (strlen(hdr) >= sizeof(type)) {                      \
            type *_hdr_conv = (type *)hdr;                      \
            type _le, _be;                                      \
            if (system_is_little_endian()) {                    \
                _le = *_hdr_conv;                               \
                _be = bs_func(*_hdr_conv);                      \
            } else {                                            \
                _le = bs_func(*_hdr_conv);                      \
                _be = *_hdr_conv;                               \
            }                                                   \
            printf("%.*s,%zu,%" fmt ",%" fmt "\n",              \
                   (int)strlen(hdr) - 2, hdr, sizeof(type),     \
                   _le, _be);                                   \
        }                                                       \
    } while (0)

#define print_reinterpreted(bits, hdr)                          \
    print_reinterpreted_inner(uint##bits##_t, PRIu##bits,       \
                              bswap_##bits, hdr)

int main(void) {
    const char *methods[] = {"GET",   "HEAD",   "POST",
                             "PUT",   "DELETE", "OPTIONS",
                             "TRACE", "PATCH",  "CONNECT"};
    size_t i;

    printf("data,bytes,little-endian,big-endian\n");

    for (i = 0; i < sizeof(methods) / sizeof(methods[0]); i++) {
        int ret;
        char hdr[16];
        unsigned const char *check =
            (unsigned const char *)methods[i];

        /* No high bit, so no need to check signed integers */
        assert(!(check[0] & (1U << (CHAR_BIT - 1))));

        ret = snprintf(hdr, sizeof(hdr), "%s /", methods[i]);
        assert(ret > 0 && ret < (int)sizeof(hdr));

        print_reinterpreted(64, hdr);
        print_reinterpreted(32, hdr);
        print_reinterpreted(16, hdr);
    }
}

And the results:

databyteslittle-endianbig-endian
GET45423936711195725856
GET21773518245
HEAD411451282641212498244
HEAD21773618501
POST414147459361347375956
POST22030420559
PUT45423977761347769376
PUT22184020565
DELETE833957903472795497644919422028622405679
DELETE411626263721145392197
DELETE21773217477
OPTIONS823292915347203236635715160600973038368
OPTIONS412302623511330664521
OPTIONS22055920304
TRACE411283543881414676803
TRACE22107621586
PATCH411295952161346458691
PATCH21672020545
CONNECT823295608722029484194850181421777769504
CONNECT413137549471129270862
CONNECT22029117231

Here's an expandable complete list with all methods known to IANA:

databyteslittle-endianbig-endian
ACL45418688651094929440
ACL21721716707
BASELINE-CONTROL849940096287298849944774188637087157829
BASELINE-CONTROL411630840981111577413
BASELINE-CONTROL21670616961
BIND411459812501112100420
BIND21875416969
CHECKIN823278786449971139234848201154192559648
CHECKIN411286139551128809795
CHECKIN21849917224
CHECKOUT860768504568761078434848201154192954708
CHECKOUT411286139551128809795
CHECKOUT21849917224
CONNECT823295608722029484194850181421777769504
CONNECT413137549471129270862
CONNECT22029117231
COPY414984353951129271385
COPY22029117231
DELETE833957903472795497644919422028622405679
DELETE411626263721145392197
DELETE21773217477
GET45423936711195725856
GET21773518245
HEAD411451282641212498244
HEAD21773618501
LABEL411619699961279345221
LABEL21671619521
LINK412634217721279872587
LINK21876419529
LOCK412627024121280262987
LOCK22030019535
MERGE411965740291296388679
MERGE21774119781
MKACTIVITY852844918390202888455569617121606456905
MKACTIVITY411283525891296777539
MKACTIVITY21927719787
MKCALENDAR849219476365772910855569619311905295940
MKCALENDAR410949292291296778049
MKCALENDAR21927719787
MKCOL413298102531296778063
MKCOL21927719787
MKREDIRECTREF849951354942769262855569635821625627205
MKREDIRECTREF411630211331296781893
MKREDIRECTREF21927719787
MKWORKSPACE857880527629917417735569641362368451408
MKWORKSPACE413311209731296783183
MKWORKSPACE21927719787
MOVE411632843011297045061
MOVE22030119791
OPTIONS823292915347203236635715160600973038368
OPTIONS412302623511330664521
OPTIONS22055920304
ORDERPATCH860712220869517850395715705941611004244
ORDERPATCH411621054231330791493
ORDERPATCH22107120306
PATCH411295952161346458691
PATCH21672020545
POST414147459361347375956
POST22030420559
PRI45416761121347569952
PRI22107220562
PROPFIND849219520091064448805787775677319695940
PROPFIND413473756961347571536
PROPFIND22107220562
PROPPATCH848515745117854316325787775677486945347
PROPPATCH413473756961347571536
PROPPATCH22107220562
PUT45423977761347769376
PUT22184020565
REBIND833957892220645717305928217367116259375
REBIND412290798901380270665
REBIND21774621061
REPORT833958068315320661305928232786117009455
REPORT413306606901380274255
REPORT21774621061
SEARCH833957935730173719876000273900112977967
SEARCH413800092991397047634
SEARCH21774721317
TRACE411283543881414676803
TRACE22107621586
UNBIND833957892220645740376146923424020439087
UNBIND412290821971431192137
UNBIND22005321838
UNCHECKOUT857137345170937810776146924519086050127
UNCHECKOUT412123705171431192392
UNCHECKOUT22005321838
UNLINK833957969186466238296146934419137175599
UNLINK412297375571431194697
UNLINK22005321838
UNLOCK833957968715026468696146934444722429999
UNLOCK413304008531431194703
UNLOCK22005321838
UPDATE833957903472119194456147488538738106415
UPDATE410949960531431323713
UPDATE22056521840
UPDATEREDIRECTREF849951311648818668376147488538738119237
UPDATEREDIRECTREF410949960531431323713
UPDATEREDIRECTREF22056521840
VERSION-CONTROL832646339562392958306216465378320535085
VERSION-CONTROL413979006301447383635
VERSION-CONTROL21775022085

As expected, if you Google for most of these numbers, you can find an endless supply of questions mentioning them in error messages (some previously unidentified examples which I replied to: 1, 2, 3).

Hopefully this post will help people find their real problem – using the wrong protocol – more quickly in future. In particular, if your affected application crashed due to ENOMEM/"Out of memory", please especially consider submitting a patch to clamp the size to some reasonable maximum :-)