Linux netback crash trying to disable due to malformed packet

When Linux’s netback sees a malformed packet, it tries to disable the interface which serves the misbehaving frontend.

Disabling the interface involves taking a mutex, which might sleep. But in recent versions of Linux the guest transmit path is handled by NAPI in softirq context, where sleeping is not allowed. As a result, the backend domain (often Dom0) crashes with a "scheduling while atomic" error.

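In other words: when netback finds a maliciously crafted packet, it tries to shut down the interface it shares with netfront. That requires acquiring a mutex, which can sleep, but the guest transmit path is now driven by NAPI, where sleeping is forbidden. The backend VM (e.g. Dom0) therefore crashes with a "scheduling while atomic" error.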

improper error handling (scheduling while atomic)



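This patch does the following:

1. Introduce a flag to indicate that the interface is disabled.
2. Check that flag in the TX path and do no work if it is set.
3. Check that flag in the RX path and turn off the interface if it is set.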

The interface is disabled in the RX path because RX runs in a kthread, where sleeping is allowed. After this change the behavior of netback is still consistent: it won't do any TX work for a rogue frontend, and the interface will eventually be turned off.

The patch also changes a "continue" to a "break" after xenvif_fatal_tx_err, as it doesn't make sense to keep processing packets once the frontend is known to be rogue.

Judging from the patch, the interface is no longer disabled while in softirq context; the disable is deferred to xenvif_kthread_guest_rx.
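The deferral described above can be sketched as a small userspace model. This is not the kernel code: the struct, its fields and the model_* function names are hypothetical, and the "sleeping" disable is reduced to clearing a boolean. It only illustrates the shape of the fix: the TX path sets a flag and stops, and the RX thread performs the actual shutdown.

```c
/* Userspace model of the deferred disable. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vif_model {
    bool disabled;   /* set by the TX path when the frontend is rogue */
    bool carrier_on; /* stands in for "the interface is up" */
};

/* TX side (softirq-like): must not sleep, so it only records the error. */
static void model_fatal_tx_err(struct vif_model *vif)
{
    assert(vif != NULL);
    vif->disabled = true;
}

/*
 * TX poll: pkt_ok[i] == false marks a malformed packet.
 * Returns how many packets were processed.
 */
static int model_tx_poll(struct vif_model *vif, const bool *pkt_ok, size_t n)
{
    int done = 0;

    assert(vif != NULL);
    if (vif->disabled)
        return 0; /* no TX work for a rogue frontend */

    for (size_t i = 0; i < n; i++) {
        if (!pkt_ok[i]) {
            model_fatal_tx_err(vif);
            break; /* "break", not "continue": stop processing */
        }
        done++;
    }
    return done;
}

/*
 * One iteration of the RX thread (kthread-like, allowed to sleep).
 * Returns true if this iteration turned the interface off.
 */
static bool model_rx_thread_step(struct vif_model *vif)
{
    assert(vif != NULL);
    if (vif->disabled && vif->carrier_on) {
        /* In the kernel, the mutex-taking disable would happen here. */
        vif->carrier_on = false;
        return true;
    }
    return false;
}
```

In this model, a poll over the packets {ok, ok, malformed, ok} processes two packets and sets the flag while the interface stays up; the next poll does no work, and the next RX thread iteration turns the interface off.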


Malicious guest administrators can cause denial of service. If driver domains are not in use, the impact is a host crash.