WARNING: at net/core/dev.c:1905 skb_warn_bad_offload+0x94/0xb4(): how to approach it

Date: 2022-01-08 00:22:54

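I am developing an in-kernel packet capture module. The design registers two netfilter hook functions (input_hook and output_hook) on the PRE_ROUTING and POST_ROUTING chains to process incoming and outgoing packets respectively. When a packet matches the capture conditions, the module adds a new MAC, IP, and UDP header in front of it and sends the result to a designated address.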
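The encapsulation step described above can be illustrated in user space. The following is only a sketch of the idea, not the module's actual code: the names `encap_frame`, `inet_csum`, `put16`, and `put32` are hypothetical, it operates on plain byte buffers rather than on an skb, and it leaves the outer UDP checksum at zero (which IPv4 permits). An in-kernel version would instead grow the skb headroom with helpers such as `skb_push`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sizes of the outer headers that get added in front of the captured packet. */
enum { OUTER_ETH = 14, OUTER_IP = 20, OUTER_UDP = 8,
       OUTER_LEN = OUTER_ETH + OUTER_IP + OUTER_UDP };

/* Store 16-bit and 32-bit values in network byte order. */
static void put16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)(v >> 8); p[1] = (uint8_t)v; }
static void put32(uint8_t *p, uint32_t v) { put16(p, (uint16_t)(v >> 16)); put16(p + 2, (uint16_t)v); }

/* Internet (one's-complement) checksum over n bytes, returned in host order. */
static uint16_t inet_csum(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;

    for (; n > 1; p += 2, n -= 2)
        sum += ((uint32_t)p[0] << 8) | p[1];
    if (n)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/*
 * Hypothetical helper: write outer Ethernet/IPv4/UDP headers followed by the
 * captured bytes into `out`. Returns the total length, or 0 if it does not fit.
 */
static size_t encap_frame(uint8_t *out, size_t cap,
                          const uint8_t *inner, size_t inner_len,
                          const uint8_t dst_mac[6], const uint8_t src_mac[6],
                          uint32_t src_ip, uint32_t dst_ip,
                          uint16_t sport, uint16_t dport)
{
    size_t ip_total = OUTER_IP + OUTER_UDP + inner_len;
    uint8_t *eth = out, *ip, *udp;

    assert(out != NULL && inner != NULL);
    if (OUTER_LEN + inner_len > cap || ip_total > 0xffff)
        return 0;
    ip = out + OUTER_ETH;
    udp = ip + OUTER_IP;

    /* Ethernet: destination, source, EtherType 0x0800 (IPv4). */
    memcpy(eth, dst_mac, 6);
    memcpy(eth + 6, src_mac, 6);
    put16(eth + 12, 0x0800);

    /* IPv4: version 4, IHL 5, TTL 64, protocol 17 (UDP). */
    memset(ip, 0, OUTER_IP);
    ip[0] = 0x45;
    put16(ip + 2, (uint16_t)ip_total);
    ip[8] = 64;
    ip[9] = 17;
    put32(ip + 12, src_ip);
    put32(ip + 16, dst_ip);
    put16(ip + 10, inet_csum(ip, OUTER_IP));

    /* UDP: ports, length, checksum 0 (not computed). */
    put16(udp, sport);
    put16(udp + 2, dport);
    put16(udp + 4, (uint16_t)(OUTER_UDP + inner_len));
    put16(udp + 6, 0);

    memcpy(udp + OUTER_UDP, inner, inner_len);
    return OUTER_LEN + inner_len;
}
```

With these header sizes the encapsulation adds 42 bytes to every captured packet, and running `inet_csum` over the finished 20-byte outer IP header yields 0 when the stored checksum is correct.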

The following problem showed up during testing:

Kernel version: linux-3.4.39

Warning message:

[ 1211.518995] WARNING: at net/core/dev.c:1905 skb_warn_bad_offload+0x94/0xb4()
[ 1211.519036] gmac0: caps=(0x0000000060004833, 0x0000000000000000) len=3002 data_len=0 gso_size=1460 gso_type=1 ip_summed=0
[ 1211.519044] Modules linked in: hook(O) pppoe ppp_async nf_nat_pptp iptable_nat ipt_REDIRECT ipt_NETMAP ipt_MASQUERADE ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda pptp pppox ppp_mppe ppp_generic nf_nat_tftp nf_nat_proto_udplite nf_nat_proto_sctp nf_nat_proto_gre nf_nat_proto_dccp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_conntrack_pptp nf_conntrack_ipv4 nf_conntrack_amanda xt_u32 xt_time xt_tcpmss xt_string xt_statistic xt_state xt_socket xt_recent xt_quota2 xt_quota xt_policy xt_pkttype xt_physdev xt_owner xt_multiport xt_mac xt_limit xt_length xt_iprange xt_ipp2p(O) xt_helper xt_hashlimit xt_esp xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TRACE xt_TPROXY xt_TCPMSS xt_NOTRACK xt_NFQUEUE xt_NFLOG xt_LOG xt_IPMARK(O) xt_HL xt_DSCP xt_CT xt_CLASSIFY xflow_ddos usb_storage ts_kmp ts_fsm ts_bm spidev slhc nfnetlink_acct nf_tproxy_core nf_nat_snmp_basic nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sane nf_conntrack_proto_udplite nf_conntrack_proto_sctp nf_conntrack_proto_gre nf_conntrack_proto_dccp nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast libcrc32c iptable_raw iptable_mangle iptable_filter ipt_ah ipt_REJECT ipt_ECN ip_tables ip_queue crc_itu_t crc_ccitt compat_xtables(O) arptable_filter arpt_mangle arp_tables evdev sunxi_mci ramreserve configdev i2c_dev mmc_block mmc_core ledtrig_heartbeat ledtrig_gpio leds_sunxi fpga_drv ip6t_REJECT ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 sd_mod scsi_mod ip_gre ss sunxi_gmac sit ipip ip6_tunnel rtl8370_drv extern_wdt snd_page_alloc vfat fat ntfs cifs regmap_spi algif_rng algif_skcipher algif_hash af_alg crypto_user sha256_generic crypto_null md4 ecb cts ctr arc4 ansi_cprng leds_gpio gpio_button_hotplug [last unloaded: hook]
[ 1211.519592] [<c0016824>] (unwind_backtrace+0x0/0xdc) from [<c0037794>] (warn_slowpath_common+0x4c/0x64)
[ 1211.519605] [<c0037794>] (warn_slowpath_common+0x4c/0x64) from [<c00377d8>] (warn_slowpath_fmt+0x2c/0x3c)
[ 1211.519619] [<c00377d8>] (warn_slowpath_fmt+0x2c/0x3c) from [<c037db1c>] (skb_warn_bad_offload+0x94/0xb4)
[ 1211.519634] [<c037db1c>] (skb_warn_bad_offload+0x94/0xb4) from [<c037dd30>] (skb_gso_segment+0xc0/0x244)
[ 1211.519647] [<c037dd30>] (skb_gso_segment+0xc0/0x244) from [<c0381c3c>] (dev_hard_start_xmit+0x344/0x670)
[ 1211.519660] [<c0381c3c>] (dev_hard_start_xmit+0x344/0x670) from [<c0396f84>] (sch_direct_xmit+0x6c/0x1b8)
[ 1211.519672] [<c0396f84>] (sch_direct_xmit+0x6c/0x1b8) from [<c0382304>] (dev_queue_xmit+0x39c/0x604)
[ 1211.519687] [<c0382304>] (dev_queue_xmit+0x39c/0x604) from [<c03b07f8>] (ip_finish_output+0x2b0/0x32c)
[ 1211.519704] [<c03b07f8>] (ip_finish_output+0x2b0/0x32c) from [<bf4332f4>] (output_hook+0x2ec/0x330 [hook])
[ 1211.519718] [<bf4332f4>] (output_hook+0x2ec/0x330 [hook]) from [<c03a2014>] (nf_iterate+0x5c/0x7c)
[ 1211.519730] [<c03a2014>] (nf_iterate+0x5c/0x7c) from [<c03a2090>] (nf_hook_slow+0x5c/0x10c)
[ 1211.519740] [<c03a2090>] (nf_hook_slow+0x5c/0x10c) from [<c03b1d1c>] (ip_output+0xe0/0xfc)
[ 1211.519750] [<c03b1d1c>] (ip_output+0xe0/0xfc) from [<c03b1950>] (ip_queue_xmit+0x308/0x398)
[ 1211.519763] [<c03b1950>] (ip_queue_xmit+0x308/0x398) from [<c03c5584>] (tcp_transmit_skb+0x7d8/0x824)
[ 1211.519777] [<c03c5584>] (tcp_transmit_skb+0x7d8/0x824) from [<c03c5a74>] (tcp_write_xmit+0x4a4/0x970)
[ 1211.519789] [<c03c5a74>] (tcp_write_xmit+0x4a4/0x970) from [<c03c5f64>] (__tcp_push_pending_frames+0x24/0x90)
[ 1211.519800] [<c03c5f64>] (__tcp_push_pending_frames+0x24/0x90) from [<c03b9c2c>] (tcp_sendmsg+0x830/0xb50)
[ 1211.519815] [<c03b9c2c>] (tcp_sendmsg+0x830/0xb50) from [<c036e4b0>] (sock_aio_write+0xf8/0x108)
[ 1211.519829] [<c036e4b0>] (sock_aio_write+0xf8/0x108) from [<c0103808>] (do_sync_write+0xdc/0x118)
[ 1211.519842] [<c0103808>] (do_sync_write+0xdc/0x118) from [<c0103e88>] (vfs_write+0xb4/0x12c)
[ 1211.519853] [<c0103e88>] (vfs_write+0xb4/0x12c) from [<c0104120>] (sys_write+0x38/0x64)
[ 1211.519865] [<c0104120>] (sys_write+0x38/0x64) from [<c000f780>] (ret_fast_syscall+0x0/0x48)
[ 1211.519875] ---[ end trace a874f731006e1847 ]---
[ 1211.519928] output_hook:296 send 780 

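The log shows that the warning is raised when a TCP packet is transmitted after being processed by my hook function. That means we can follow the call chain in the backtrace through the kernel source to find out why the message is printed.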
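Before reading the source, it helps to decode the numeric fields in the second log line. The sketch below is a user-space helper, not kernel code: `parse_offload_line`, `csum_name`, and `struct offload_info` are hypothetical names, and the constant values are copied on the assumption that they match include/linux/skbuff.h in a 3.4-era tree, so they should be checked against the tree actually in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed to mirror include/linux/skbuff.h in 3.4-era kernels. */
enum { CHECKSUM_NONE = 0, CHECKSUM_UNNECESSARY = 1,
       CHECKSUM_COMPLETE = 2, CHECKSUM_PARTIAL = 3 };
enum { SKB_GSO_TCPV4 = 1 << 0, SKB_GSO_UDP = 1 << 1, SKB_GSO_DODGY = 1 << 2,
       SKB_GSO_TCP_ECN = 1 << 3, SKB_GSO_TCPV6 = 1 << 4 };

/* Fields printed by skb_warn_bad_offload() after the caps=(...) part. */
struct offload_info {
    int len, data_len, gso_size, gso_type, ip_summed;
};

/* Parse the " len=... ip_summed=..." tail of the line; returns 1 on success. */
static int parse_offload_line(const char *line, struct offload_info *o)
{
    const char *p;

    assert(line != NULL && o != NULL);
    p = strstr(line, " len=");
    if (!p)
        return 0;
    return sscanf(p, " len=%d data_len=%d gso_size=%d gso_type=%d ip_summed=%d",
                  &o->len, &o->data_len, &o->gso_size,
                  &o->gso_type, &o->ip_summed) == 5;
}

/* Symbolic name for an ip_summed value. */
static const char *csum_name(int ip_summed)
{
    switch (ip_summed) {
    case CHECKSUM_NONE:        return "CHECKSUM_NONE";
    case CHECKSUM_UNNECESSARY: return "CHECKSUM_UNNECESSARY";
    case CHECKSUM_COMPLETE:    return "CHECKSUM_COMPLETE";
    case CHECKSUM_PARTIAL:     return "CHECKSUM_PARTIAL";
    default:                   return "unknown";
    }
}
```

Applied to the line in the log above, this reads as: gso_size=1460 (the skb is still marked as GSO), gso_type=1 (SKB_GSO_TCPV4), ip_summed=0 (CHECKSUM_NONE), and data_len=0 (all data is in the linear area). The value len=3002 would be consistent with two 1460-byte segments plus 40 bytes of TCP/IP header and 42 bytes of added Ethernet/IP/UDP header, although the log alone does not confirm that.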

The relevant part of the chain is dev_hard_start_xmit -> skb_gso_segment -> skb_warn_bad_offload.

Here is the code of skb_warn_bad_offload; the warning is indeed emitted by it.

static void skb_warn_bad_offload(const struct sk_buff *skb)
{
	static const netdev_features_t null_features = 0;
	struct net_device *dev = skb->dev;
	const char *driver = "";

	if (dev && dev->dev.parent)
		driver = dev_driver_string(dev->dev.parent);

	WARN(1, "%s: caps=(%pNF, %pNF) len=%d data_len=%d gso_size=%d "
	     "gso_type=%d ip_summed=%d\n",
	     driver, dev ? &dev->features : &null_features,
	     skb->sk ? &skb->sk->sk_route_caps : &null_features,
	     skb->len, skb->data_len, skb_shinfo(skb)->gso_size,
	     skb_shinfo(skb)->gso_type, skb->ip_summed);
}

Following the call chain, let's see why dev_hard_start_xmit ends up calling skb_gso_segment.

int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
			struct netdev_queue *txq)
{
	const struct net_device_ops *ops = dev->netdev_ops;
	int rc = NETDEV_TX_OK;
	unsigned int skb_len;

	if (likely(!skb->next)) {
		netdev_features_t features;

		/* some code omitted */
		
		if (netif_needs_gso(skb, features)) {
			if (unlikely(dev_gso_segment(skb, features)))
				goto out_kfree_skb;
			if (skb->next)
				goto gso;
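		/* remainder of the function omitted */

If netif_needs_gso returns true, dev_gso_segment is called, which in turn calls skb_gso_segment (the function that appears in the backtrace). So the next step is to look at when netif_needs_gso returns true. Its code is below.
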
static inline bool netif_needs_gso(struct sk_buff *skb,
				   netdev_features_t features)
{
	return skb_is_gso(skb) && (!skb_gso_ok(skb, features) ||
		unlikely((skb->ip_summed != CHECKSUM_PARTIAL) &&
			 (skb->ip_summed != CHECKSUM_UNNECESSARY)));
}
static inline bool skb_is_gso(const struct sk_buff *skb)
{
	return skb_shinfo(skb)->gso_size;
}

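skb_is_gso only checks whether gso_size is non-zero. In the log above, gso_size=1460 and ip_summed=0 (CHECKSUM_NONE), so skb_is_gso is true and the checksum condition alone is enough to make netif_needs_gso return true. To make it return false, a simple change like the following is enough: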

skb_shinfo(skb)->gso_size = 0;

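After this change the warning indeed no longer appears in the log. This post only walks through the analysis; there is much more in the kernel network stack that is worth reading.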
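The effect of clearing gso_size can be checked against a small user-space model of the predicate. This is a sketch under stated assumptions: `struct skb_model`, `model_is_gso`, and `model_needs_gso` are hypothetical names, the `gso_ok` flag stands in for the result of skb_gso_ok(skb, features), and the checksum constants are assumed to match the 3.4-era include/linux/skbuff.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror include/linux/skbuff.h in 3.4-era kernels. */
enum { CHECKSUM_NONE = 0, CHECKSUM_UNNECESSARY = 1,
       CHECKSUM_COMPLETE = 2, CHECKSUM_PARTIAL = 3 };

/* Only the fields that netif_needs_gso() looks at. */
struct skb_model {
    unsigned int gso_size; /* skb_shinfo(skb)->gso_size */
    int ip_summed;         /* skb->ip_summed */
    bool gso_ok;           /* stand-in for skb_gso_ok(skb, features) */
};

/* Same test as skb_is_gso(): a non-zero gso_size marks the skb as GSO. */
static bool model_is_gso(const struct skb_model *s)
{
    assert(s != NULL);
    return s->gso_size != 0;
}

/* Same boolean structure as netif_needs_gso() in the excerpt above. */
static bool model_needs_gso(const struct skb_model *s)
{
    return model_is_gso(s) &&
           (!s->gso_ok ||
            (s->ip_summed != CHECKSUM_PARTIAL &&
             s->ip_summed != CHECKSUM_UNNECESSARY));
}
```

With the values from the log (gso_size 1460, CHECKSUM_NONE) the model returns true whatever `gso_ok` is. Once gso_size is set to 0 it returns false, so the software segmentation path and its warning are skipped.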

As for GSO (Generic Segmentation Offload) itself, it is closely tied to segmentation offload in NIC hardware. Readers who are interested may want to read up on that topic separately.