In stream mode, why does a stuck receiver cause the sender to hang as well? The sender stays stuck inside the BRPC_HANDLE_EOVERCROWDED macro #3336

@chinaux

Description

Stack trace

Stack:
#0 0x00007f7410dc6a41 clock_nanosleep@@GLIBC_2.17
#1 0x00007f7410dcbf67 __nanosleep
#2 0x00007f7410df7a49 usleep
#3 0x000000000121c897 bthread_usleep
#4 0x00000000013412e1 brpc::Stream::WriteToHostSocket(butil::IOBuf*)
#5 0x0000000001343b3a brpc::Stream::CutMessageIntoFileDescriptor(int, butil::IOBuf**, unsigned long)
#6 0x0000000001331712 brpc::Socket::StartWrite(brpc::Socket::WriteRequest*, brpc::Socket::WriteOptions const&)
#7 0x0000000001331b92 brpc::Socket::Write(butil::IOBuf*, brpc::Socket::WriteOptions const*)
#8 0x0000000001341464 brpc::Stream::AppendIfNotFull(butil::IOBuf const&)
#9 0x0000000001342ce4 brpc::StreamWrite(unsigned long, butil::IOBuf const&)
#10 0x0000000000bc87aa proxima::centaur::kafka_proxy::KafkaProxyService::send_task_job()
#11 0x00007f74110e8b74 execute_native_thread_routine
#12 0x00007f74123b73f9 start_thread
#13 0x00007f7410dffb13 __clone

Macro definition

// Sleep a while when `write_expr' returns negative with errno=EOVERCROWDED
// Implemented as a macro rather than a field of Socket.WriteOptions because
// the macro works for other functions besides Socket.Write as well.
#define BRPC_HANDLE_EOVERCROWDED(write_expr)                       \
    ({                                                                  \
        int64_t __ret_code__;                                           \
        int sleep_time = 250;                                           \
        while (true) {                                                  \
            __ret_code__ = (write_expr);                                \
            if (__ret_code__ >= 0 || errno != ::brpc::EOVERCROWDED) { \
                break;                                                  \
            }                                                           \
            sleep_time *= 2;                                            \
            if (sleep_time > 2000) { sleep_time = 2000; }               \
            ::bthread_usleep(sleep_time);                               \
        }                                                               \
        __ret_code__;                                                   \
    })

I see there is also a BRPC_HANDLE_EOVERCROWDED_N macro. What was the reasoning for not using the _N variant here?
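To make the question concrete, here is a minimal standalone sketch of the bounded-retry behaviour I have in mind. It is not brpc code: `kOvercrowded` is a hypothetical stand-in for `::brpc::EOVERCROWDED`, `write_with_bounded_backoff` is a name made up for illustration, and `std::this_thread::sleep_for` stands in for `bthread_usleep`. I am also assuming that the _N variant does nothing more than cap the number of retries. The backoff schedule is copied from the macro above (doubling from 250 us, capped at 2000 us).

```cpp
#include <algorithm>
#include <cerrno>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for ::brpc::EOVERCROWDED (value chosen arbitrarily).
constexpr int kOvercrowded = 10001;

// Calls write_fn until it succeeds, fails with an errno other than
// kOvercrowded, or has slept max_retries times. Unlike the unbounded
// macro above, a receiver that never drains cannot block the caller forever.
template <typename WriteFn>
int64_t write_with_bounded_backoff(WriteFn write_fn, int max_retries) {
    int64_t ret = -1;
    int sleep_us = 250;
    for (int attempt = 0; attempt <= max_retries; ++attempt) {
        ret = write_fn();
        if (ret >= 0 || errno != kOvercrowded) {
            break;  // success, or an error that retrying will not fix
        }
        if (attempt == max_retries) {
            break;  // retries exhausted: return with errno still set
        }
        // Same schedule as the macro: 500, 1000, 2000, 2000, ... microseconds.
        sleep_us = std::min(sleep_us * 2, 2000);
        std::this_thread::sleep_for(std::chrono::microseconds(sleep_us));
    }
    return ret;
}
```

With a cap like this, the caller would get a negative return with errno still set to the overcrowded code once the retries run out, and could decide for itself whether to drop the message, close the stream, or try again later.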
