
Hit a kernel-level Docker bug that has been open for 3+ years with 350+ replies. How would you ops folks handle this in a production environment?

    fzinfz · 2017-06-25 13:18:25 +08:00 · 16922 views
    unregister_netdevice: waiting for lo to become free. Usage count = #

    https://github.com/moby/moby/issues/5618

    There seem to be many possible causes; kernel 4.11 fixed some of them:
    https://github.com/coreos/bugs/issues/254
    Another source of leakage has been fixed by torvalds/linux@f647821, which is part of 4.11.
    torvalds/linux@b7c8487 is in the same batch of commits, is also in 4.11, and fixes another possible cause
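
To check whether a host already runs a kernel at or above 4.11, something like the following could be used. This is only a minimal sketch: it assumes the release string has the `uname -r` shape (for example `4.10.0-19-generic`), and the helper names `parse_release` and `has_411_fixes` are made up for illustration. Distribution kernels may backport fixes, so a lower version number does not prove the fixes are missing, and 4.11 only addresses some of the possible causes.

```python
import platform
import re

def parse_release(release):
    """Return (major, minor) from a release string such as '4.10.0-19-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))

def has_411_fixes(release):
    """Hypothetical helper: True if the upstream version is at least 4.11."""
    return parse_release(release) >= (4, 11)

if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r` on Linux.
    release = platform.release()
    verdict = ">= 4.11" if has_411_fixes(release) else "older than 4.11"
    print(release, "->", verdict)
```

Tuple comparison keeps the check correct for two-digit minor versions, where a plain string comparison of "4.9" and "4.11" would give the wrong order.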
    23 replies    2022-03-30 17:15:32 +08:00
    choury
        1
    choury  
       2017-06-25 13:50:28 +08:00
    Our approach is to pass the buck to the kernel team and have them fix it.
    ryd994
        2
    ryd994  
       2017-06-25 14:06:35 +08:00   ❤️ 2
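    Almost everyone has hit this one...
    Luckily it hasn't affected my usage, apart from the dmesg spam.
    With so many kernel experts already working on a fix, small fry like us can only wait.
    Now do you see why ops people hire a master to perform a blessing ritual on their servers?
    Oh, another option is to buy an RH subscription and pass the buck to RH.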
    xss
        3
    xss  
       2017-06-25 14:29:14 +08:00
    I ran into it too, but what is the actual impact?
    cnnblike
        4
    cnnblike  
       2017-06-25 15:15:41 +08:00
    I ran into it too. Really annoying.
    mritd
        5
    mritd  
       2017-06-25 16:07:30 +08:00 via iPhone
    Perfectly normal.
    fuxkcsdn
        6
    fuxkcsdn  
       2017-06-25 20:40:57 +08:00 via iPhone
    Ran into it, +1
    freestyle
        7
    freestyle  
       2017-06-25 20:45:14 +08:00 via iPhone
    I've run into it, but I'm not sure what the impact is.
    htfy96
        8
    htfy96  
       2017-06-25 21:34:56 +08:00
    There's also userland-proxy, which has to do NAT in user space with poor performance; one of the reasons for that is also a kernel bug.
    dusheng
        9
    dusheng  
       2017-06-25 23:33:10 +08:00
    Just ran into it.
    SharkIng
        10
    SharkIng  
       2017-06-25 23:59:47 +08:00 via iPhone
    This problem hurt us badly, but supposedly the latest kernel fixes it?
    fzinfz
        11
    fzinfz  
    OP
       2017-06-26 02:51:21 +08:00
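    @SharkIng What exactly was the impact? The latest kernel does include fixes, but I don't know whether the problem is fully fixed.

    I only noticed this error when I couldn't connect over SSH right after a reboot and had to go into the Hyper-V console. After rebooting the VM, SSH worked normally, but two unregister_netdevice messages were still logged; I don't know whether they are related.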

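    [What follows is a bit messy; personal notes only]
    Looking at dmesg, there are also many "docker0: port #(veth......) entered blocking/disabled state" log lines. My GCE/Vultr KVM VMs have the same blocking/disabled logs (but no unregister_netdevice), so my guess is that packetbeat's packet capture causes them. (Machines without packetbeat capturing don't have these logs.)
    PS1: All containers run with --net host
    PS2: The VM reporting unregister_netdevice uses the onboard Realtek NIC of a B150 motherboard...
    PS3: Currently on kernel 4.10.0-19-generic; planning to upgrade to 4.11 and see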
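
To compare machines with and without packetbeat, one option is to tally the docker0 port state messages per veth interface. The sketch below is an illustration, not a diagnostic tool: the regular expression assumes the message wording quoted above, `count_port_states` is a made-up helper name, and the sample lines are copied from this machine's dmesg.

```python
import re
from collections import Counter

# Matches e.g. "docker0: port 3(veth622db14) entered blocking state".
PORT_STATE = re.compile(r"docker0: port \d+\((veth\w+)\) entered (\w+) state")

def count_port_states(lines):
    """Count (veth, state) pairs among docker0 port state messages."""
    counts = Counter()
    for line in lines:
        m = PORT_STATE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts

sample = [
    "[ 910.342263] docker0: port 3(veth622db14) entered blocking state",
    "[ 910.342265] docker0: port 3(veth622db14) entered disabled state",
    "[ 910.342459] device veth622db14 entered promiscuous mode",
    "[ 910.485299] docker0: port 3(veth622db14) entered forwarding state",
]

for (veth, state), n in sorted(count_port_states(sample).items()):
    print(veth, state, n)
```

Running the same function over the full dmesg of each machine would show whether the blocking/disabled messages really appear only where packetbeat is capturing.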

    Posting the dmesg from the machine that reported unregister_netdevice, for the record:

    [ 8.263327] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
    [ 8.265021] Bridge firewalling registered
    [ 8.270259] nf_conntrack version 0.5.0 (65536 buckets, 262144 max)
    [ 8.352304] Initializing XFRM netlink socket
    [ 8.395477] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
    [ 8.934882] aufs au_opts_verify:1585:dockerd[1619]: dirperm1 breaks the protection by the permission bits on the lower branch
    [ 9.152098] docker0: port 1(veth41c7ef0) entered blocking state
    [ 9.152098] docker0: port 1(veth41c7ef0) entered disabled state
    [ 9.152267] device veth41c7ef0 entered promiscuous mode
    [ 9.152695] IPv6: ADDRCONF(NETDEV_UP): veth41c7ef0: link is not ready
    [ 9.152700] docker0: port 1(veth41c7ef0) entered blocking state
    [ 9.152701] docker0: port 1(veth41c7ef0) entered forwarding state
    [ 9.152771] docker0: port 1(veth41c7ef0) entered disabled state
    [ 11.332410] eth0: renamed from vethe136227
    [ 11.348824] IPv6: ADDRCONF(NETDEV_CHANGE): veth41c7ef0: link becomes ready
    [ 11.348896] docker0: port 1(veth41c7ef0) entered blocking state
    [ 11.348897] docker0: port 1(veth41c7ef0) entered forwarding state
    [ 11.348985] IPv6: ADDRCONF(NETDEV_CHANGE): docker0: link becomes ready
    [ 12.766843] docker0: port 2(veth49d7060) entered blocking state
    [ 12.766844] docker0: port 2(veth49d7060) entered disabled state
    [ 12.767074] device veth49d7060 entered promiscuous mode
    [ 12.767203] IPv6: ADDRCONF(NETDEV_UP): veth49d7060: link is not ready
    [ 12.767206] docker0: port 2(veth49d7060) entered blocking state
    [ 12.767207] docker0: port 2(veth49d7060) entered forwarding state
    [ 12.767329] docker0: port 2(veth49d7060) entered disabled state
    [ 13.780687] eth0: renamed from vethc063ec5
    [ 13.792810] IPv6: ADDRCONF(NETDEV_CHANGE): veth49d7060: link becomes ready
    [ 13.792887] docker0: port 2(veth49d7060) entered blocking state
    [ 13.792888] docker0: port 2(veth49d7060) entered forwarding state
    [ 52.227899] hv_balloon: INFO_TYPE_MAX_PAGE_CNT = 3072000
    [ 56.286544] hv_utils: KVP IC version 4.0
    [ 910.342263] docker0: port 3(veth622db14) entered blocking state
    [ 910.342265] docker0: port 3(veth622db14) entered disabled state
    [ 910.342459] device veth622db14 entered promiscuous mode
    [ 910.342573] IPv6: ADDRCONF(NETDEV_UP): veth622db14: link is not ready
    [ 910.468949] eth0: renamed from vethf117385
    [ 910.485201] IPv6: ADDRCONF(NETDEV_CHANGE): veth622db14: link becomes ready
    [ 910.485297] docker0: port 3(veth622db14) entered blocking state
    [ 910.485299] docker0: port 3(veth622db14) entered forwarding state
    [ 924.891826] docker0: port 3(veth622db14) entered disabled state
    [ 924.891832] vethf117385: renamed from eth0
    [ 924.933207] docker0: port 3(veth622db14) entered disabled state
    [ 924.935302] device veth622db14 left promiscuous mode
    [ 924.935306] docker0: port 3(veth622db14) entered disabled state
    [ 1088.348123] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 2182.977980] docker0: port 3(vethe1afdfb) entered blocking state
    [ 2182.977981] docker0: port 3(vethe1afdfb) entered disabled state
    [ 2182.978204] device vethe1afdfb entered promiscuous mode
    [ 2182.978357] IPv6: ADDRCONF(NETDEV_UP): vethe1afdfb: link is not ready
    [ 2183.076738] eth0: renamed from vethfcb6742
    [ 2183.094282] IPv6: ADDRCONF(NETDEV_CHANGE): vethe1afdfb: link becomes ready
    [ 2183.094481] docker0: port 3(vethe1afdfb) entered blocking state
    [ 2183.094482] docker0: port 3(vethe1afdfb) entered forwarding state
    [ 2194.980883] docker0: port 3(vethe1afdfb) entered disabled state
    [ 2194.981195] vethfcb6742: renamed from eth0
    [ 2195.018586] docker0: port 3(vethe1afdfb) entered disabled state
    [ 2195.021027] device vethe1afdfb left promiscuous mode
    [ 2195.021031] docker0: port 3(vethe1afdfb) entered disabled state
    [ 3021.669760] docker0: port 3(vethb0d2857) entered blocking state
    [ 3021.669760] docker0: port 3(vethb0d2857) entered disabled state
    [ 3021.670990] device vethb0d2857 entered promiscuous mode
    [ 3021.671312] IPv6: ADDRCONF(NETDEV_UP): vethb0d2857: link is not ready
    [ 3021.784817] eth0: renamed from veth12e4511
    [ 3021.804650] IPv6: ADDRCONF(NETDEV_CHANGE): vethb0d2857: link becomes ready
    [ 3021.804739] docker0: port 3(vethb0d2857) entered blocking state
    [ 3021.804740] docker0: port 3(vethb0d2857) entered forwarding state
    [ 3022.613543] docker0: port 3(vethb0d2857) entered disabled state
    [ 3022.613652] veth12e4511: renamed from eth0
    [ 3022.651080] docker0: port 3(vethb0d2857) entered disabled state
    [ 3022.656981] device vethb0d2857 left promiscuous mode
    [ 3022.657016] docker0: port 3(vethb0d2857) entered disabled state
    [ 3357.591305] docker0: port 3(vethb900f81) entered blocking state
    [ 3357.591306] docker0: port 3(vethb900f81) entered disabled state
    [ 3357.591425] device vethb900f81 entered promiscuous mode
    [ 3357.591511] IPv6: ADDRCONF(NETDEV_UP): vethb900f81: link is not ready
    [ 3357.696529] eth0: renamed from vethf79a1ab
    [ 3357.716768] IPv6: ADDRCONF(NETDEV_CHANGE): vethb900f81: link becomes ready
    [ 3357.716854] docker0: port 3(vethb900f81) entered blocking state
    [ 3357.716855] docker0: port 3(vethb900f81) entered forwarding state
    [ 3407.848335] vethf79a1ab: renamed from eth0
    [ 3407.881076] docker0: port 3(vethb900f81) entered disabled state
    [ 3407.898598] docker0: port 3(vethb900f81) entered disabled state
    [ 3407.904083] device vethb900f81 left promiscuous mode
    [ 3407.904104] docker0: port 3(vethb900f81) entered disabled state
    [ 3603.292153] unregister_netdevice: waiting for lo to become free. Usage count = 1
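
The unregister_netdevice lines are easy to miss among the bridge messages, so a small filter can help pull them out. The following is a minimal sketch under stated assumptions: it expects the default dmesg timestamp format shown in the log above, and `find_unregister_waits` is a hypothetical helper name. The sample lines are taken from that log.

```python
import re

# Captures the timestamp, the device name, and the usage count.
UNREG = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+unregister_netdevice: waiting for (\S+) "
    r"to become free\. Usage count = (\d+)"
)

def find_unregister_waits(lines):
    """Return (timestamp_seconds, device, usage_count) for each matching line."""
    hits = []
    for line in lines:
        m = UNREG.search(line)
        if m:
            hits.append((float(m.group(1)), m.group(2), int(m.group(3))))
    return hits

sample = [
    "[ 924.935306] docker0: port 3(veth622db14) entered disabled state",
    "[ 1088.348123] unregister_netdevice: waiting for lo to become free. Usage count = 1",
    "[ 3603.292153] unregister_netdevice: waiting for lo to become free. Usage count = 1",
]

for ts, dev, count in find_unregister_waits(sample):
    print(f"{ts:.6f}s device={dev} usage_count={count}")
```

The function accepts any iterable of lines, so it could also be fed `sys.stdin` with real dmesg output piped in.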
    SharkIng
        12
    SharkIng  
       2017-06-26 03:35:07 +08:00   ❤️ 1
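    @fzinfz #11 Kernel panic. It looked like the server had lost its network connection: our application ran inside Docker at the time, so none of the traffic leaving Docker could get out. The application in Docker was then basically dead (mainly because it was unusable), and in the end we made the whole application Dockerless.
    The kernel fix is probably this part, but we haven't tested it: https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22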
    l142857
        13
    l142857  
       2017-06-26 09:48:00 +08:00
    ixiaohei
        14
    ixiaohei  
       2017-07-04 09:24:15 +08:00   ❤️ 1
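    It seems our company ran into this as well. Red Hat's advice to us was to upgrade to the Red Hat-distributed docker and supplementary packages to resolve it.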
    jessynt
        15
    jessynt  
       2017-07-22 19:09:08 +08:00
    Hit it:
    unregister_netdevice: waiting for lo to become free
    0312birdzhang
        16
    0312birdzhang  
       2018-05-14 11:22:12 +08:00
    kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

    I hope this is the pitfall I've hit, so that I can pass the buck.
    caimaoy
        17
    caimaoy  
       2018-12-28 16:16:26 +08:00
    Hit it too, +1
    AngryPanda
        18
    AngryPanda  
       2019-01-04 11:38:48 +08:00
    Hit it too, +1
    dozer47528
        19
    dozer47528  
       2019-02-22 17:52:26 +08:00
    Hit it too, +1
    1073850525
        20
    1073850525  
       2019-04-28 19:11:06 +08:00
    Hit it too, +1
    atywz
        21
    atywz  
       2019-04-29 21:18:16 +08:00
    I can't pass the buck. What do I do? *sob*
    vincent927
        22
    vincent927  
       2019-05-07 10:21:48 +08:00
    Just hit it. All I did was upgrade rancher and then it appeared... Does it have any impact? What's the fix?
    wwek
        23
    wwek  
       2022-03-30 17:15:32 +08:00
    Hit it too, +1