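I am developing a kernel packet-capture module. The design registers two hook functions (input_hook & output_hook) on netfilter's PRE_ROUTING and POST_ROUTING chains to process incoming and outgoing packets respectively. When a packet matches the capture conditions, a new set of MAC, IP and UDP headers is added and the resulting packet is sent to a specified address. The result looks like this: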
The following problem showed up during testing:
Kernel version: linux-3.4.39
Warning message:
[ 1211.518995] WARNING: at net/core/dev.c:1905 skb_warn_bad_offload+0x94/0xb4()
[ 1211.519036] gmac0: caps=(0x0000000060004833, 0x0000000000000000) len=3002 data_len=0 gso_size=1460 gso_type=1 ip_summed=0
[ 1211.519044] Modules linked in: hook(O) pppoe ppp_async nf_nat_pptp iptable_nat ipt_REDIRECT ipt_NETMAP ipt_MASQUERADE ums_usbat ums_sddr55 ums_sddr09 ums_karma ums_jumpshot ums_isd200 ums_freecom ums_datafab ums_cypress ums_alauda pptp pppox ppp_mppe ppp_generic nf_nat_tftp nf_nat_proto_udplite nf_nat_proto_sctp nf_nat_proto_gre nf_nat_proto_dccp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_conntrack_pptp nf_conntrack_ipv4 nf_conntrack_amanda xt_u32 xt_time xt_tcpmss xt_string xt_statistic xt_state xt_socket xt_recent xt_quota2 xt_quota xt_policy xt_pkttype xt_physdev xt_owner xt_multiport xt_mac xt_limit xt_length xt_iprange xt_ipp2p(O) xt_helper xt_hashlimit xt_esp xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TRACE xt_TPROXY xt_TCPMSS xt_NOTRACK xt_NFQUEUE xt_NFLOG xt_LOG xt_IPMARK(O) xt_HL xt_DSCP xt_CT xt_CLASSIFY xflow_ddos usb_storage ts_kmp ts_fsm ts_bm spidev slhc nfnetlink_acct nf_tproxy_core nf_nat_snmp_basic nf_defrag_ipv4 nf_conntrack_tftp nf_conntrack_snmp nf_conntrack_sane nf_conntrack_proto_udplite nf_conntrack_proto_sctp nf_conntrack_proto_gre nf_conntrack_proto_dccp nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack_broadcast libcrc32c iptable_raw iptable_mangle iptable_filter ipt_ah ipt_REJECT ipt_ECN ip_tables ip_queue crc_itu_t crc_ccitt compat_xtables(O) arptable_filter arpt_mangle arp_tables evdev sunxi_mci ramreserve configdev i2c_dev mmc_block mmc_core ledtrig_heartbeat ledtrig_gpio leds_sunxi fpga_drv ip6t_REJECT ip6table_raw ip6_queue ip6table_mangle ip6table_filter ip6_tables nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 sd_mod scsi_mod ip_gre ss sunxi_gmac sit ipip ip6_tunnel rtl8370_drv extern_wdt snd_page_alloc vfat fat ntfs cifs regmap_spi algif_rng algif_skcipher algif_hash af_alg crypto_user sha256_generic crypto_null md4 ecb cts ctr arc4 ansi_cprng leds_gpio gpio_button_hotplug [last unloaded: hook]
[ 1211.519592] [<c0016824>] (unwind_backtrace+0x0/0xdc) from [<c0037794>] (warn_slowpath_common+0x4c/0x64)
[ 1211.519605] [<c0037794>] (warn_slowpath_common+0x4c/0x64) from [<c00377d8>] (warn_slowpath_fmt+0x2c/0x3c)
[ 1211.519619] [<c00377d8>] (warn_slowpath_fmt+0x2c/0x3c) from [<c037db1c>] (skb_warn_bad_offload+0x94/0xb4)
[ 1211.519634] [<c037db1c>] (skb_warn_bad_offload+0x94/0xb4) from [<c037dd30>] (skb_gso_segment+0xc0/0x244)
[ 1211.519647] [<c037dd30>] (skb_gso_segment+0xc0/0x244) from [<c0381c3c>] (dev_hard_start_xmit+0x344/0x670)
[ 1211.519660] [<c0381c3c>] (dev_hard_start_xmit+0x344/0x670) from [<c0396f84>] (sch_direct_xmit+0x6c/0x1b8)
[ 1211.519672] [<c0396f84>] (sch_direct_xmit+0x6c/0x1b8) from [<c0382304>] (dev_queue_xmit+0x39c/0x604)
[ 1211.519687] [<c0382304>] (dev_queue_xmit+0x39c/0x604) from [<c03b07f8>] (ip_finish_output+0x2b0/0x32c)
[ 1211.519704] [<c03b07f8>] (ip_finish_output+0x2b0/0x32c) from [<bf4332f4>] (output_hook+0x2ec/0x330 [hook])
[ 1211.519718] [<bf4332f4>] (output_hook+0x2ec/0x330 [hook]) from [<c03a2014>] (nf_iterate+0x5c/0x7c)
[ 1211.519730] [<c03a2014>] (nf_iterate+0x5c/0x7c) from [<c03a2090>] (nf_hook_slow+0x5c/0x10c)
[ 1211.519740] [<c03a2090>] (nf_hook_slow+0x5c/0x10c) from [<c03b1d1c>] (ip_output+0xe0/0xfc)
[ 1211.519750] [<c03b1d1c>] (ip_output+0xe0/0xfc) from [<c03b1950>] (ip_queue_xmit+0x308/0x398)
[ 1211.519763] [<c03b1950>] (ip_queue_xmit+0x308/0x398) from [<c03c5584>] (tcp_transmit_skb+0x7d8/0x824)
[ 1211.519777] [<c03c5584>] (tcp_transmit_skb+0x7d8/0x824) from [<c03c5a74>] (tcp_write_xmit+0x4a4/0x970)
[ 1211.519789] [<c03c5a74>] (tcp_write_xmit+0x4a4/0x970) from [<c03c5f64>] (__tcp_push_pending_frames+0x24/0x90)
[ 1211.519800] [<c03c5f64>] (__tcp_push_pending_frames+0x24/0x90) from [<c03b9c2c>] (tcp_sendmsg+0x830/0xb50)
[ 1211.519815] [<c03b9c2c>] (tcp_sendmsg+0x830/0xb50) from [<c036e4b0>] (sock_aio_write+0xf8/0x108)
[ 1211.519829] [<c036e4b0>] (sock_aio_write+0xf8/0x108) from [<c0103808>] (do_sync_write+0xdc/0x118)
[ 1211.519842] [<c0103808>] (do_sync_write+0xdc/0x118) from [<c0103e88>] (vfs_write+0xb4/0x12c)
[ 1211.519853] [<c0103e88>] (vfs_write+0xb4/0x12c) from [<c0104120>] (sys_write+0x38/0x64)
[ 1211.519865] [<c0104120>] (sys_write+0x38/0x64) from [<c000f780>] (ret_fast_syscall+0x0/0x48)
[ 1211.519875] ---[ end trace a874f731006e1847 ]---
[ 1211.519928] output_hook:296 send 780
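The log shows that the warning is emitted when a TCP packet is sent out after passing through my hook function. That means we can follow the call chain in the backtrace through the kernel source to find out why this message is printed.
The relevant part of the chain is dev_hard_start_xmit -> skb_gso_segment -> skb_warn_bad_offload.
Below is the code of skb_warn_bad_offload; the warning is indeed emitted from it.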
static void skb_warn_bad_offload(const struct sk_buff *skb)
{
    static const netdev_features_t null_features = 0;
    struct net_device *dev = skb->dev;
    const char *driver = "";

    if (dev && dev->dev.parent)
        driver = dev_driver_string(dev->dev.parent);

    WARN(1, "%s: caps=(%pNF, %pNF) len=%d data_len=%d gso_size=%d "
         "gso_type=%d ip_summed=%d\n",
         driver, dev ? &dev->features : &null_features,
         skb->sk ? &skb->sk->sk_route_caps : &null_features,
         skb->len, skb->data_len, skb_shinfo(skb)->gso_size,
         skb_shinfo(skb)->gso_type, skb->ip_summed);
}
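To make the numbers printed by that WARN easier to read, here is a small user-space sketch that parses the fields out of the log line and names two of them. The struct offload_info and the helpers parse_offload_line and ip_summed_name are hypothetical names made up for this illustration; the numeric constants are the ones I believe include/linux/skbuff.h uses in 3.x kernels, so check them against your own tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ip_summed values, assumed to match include/linux/skbuff.h in 3.x kernels. */
enum { CHECKSUM_NONE = 0, CHECKSUM_UNNECESSARY = 1,
       CHECKSUM_COMPLETE = 2, CHECKSUM_PARTIAL = 3 };

/* First two gso_type bits, assumed to match the same header. */
enum { SKB_GSO_TCPV4 = 1 << 0, SKB_GSO_UDP = 1 << 1 };

/* Hypothetical container for the fields printed by skb_warn_bad_offload(). */
struct offload_info {
    int len, data_len, gso_size, gso_type, ip_summed;
};

/* Parse "len=... data_len=... gso_size=... gso_type=... ip_summed=...".
 * Returns 0 on success, -1 if the line does not contain all five fields. */
static int parse_offload_line(const char *line, struct offload_info *o)
{
    const char *p;

    assert(line != NULL && o != NULL);
    p = strstr(line, " len=");
    if (p == NULL)
        return -1;
    if (sscanf(p, " len=%d data_len=%d gso_size=%d gso_type=%d ip_summed=%d",
               &o->len, &o->data_len, &o->gso_size,
               &o->gso_type, &o->ip_summed) != 5)
        return -1;
    return 0;
}

/* Map an ip_summed value to its symbolic name. */
static const char *ip_summed_name(int v)
{
    switch (v) {
    case CHECKSUM_NONE:        return "CHECKSUM_NONE";
    case CHECKSUM_UNNECESSARY: return "CHECKSUM_UNNECESSARY";
    case CHECKSUM_COMPLETE:    return "CHECKSUM_COMPLETE";
    case CHECKSUM_PARTIAL:     return "CHECKSUM_PARTIAL";
    default:                   return "unknown";
    }
}
```

Fed the second line of the log above, this gives len 3002, gso_size 1460, gso_type equal to SKB_GSO_TCPV4 and ip_summed equal to CHECKSUM_NONE. In other words, the skb is still marked as a TCP GSO packet, but its checksum state is not CHECKSUM_PARTIAL.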
Following the call chain, let's see why dev_hard_start_xmit ends up calling skb_gso_segment.
int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
                        struct netdev_queue *txq)
{
    const struct net_device_ops *ops = dev->netdev_ops;
    int rc = NETDEV_TX_OK;
    unsigned int skb_len;

    if (likely(!skb->next)) {
        netdev_features_t features;

        /* some code omitted */

        if (netif_needs_gso(skb, features)) {
            if (unlikely(dev_gso_segment(skb, features)))
                goto out_kfree_skb;
            if (skb->next)
                goto gso;
            /* excerpt ends here; the rest of the function is not shown */
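If netif_needs_gso returns true here, dev_gso_segment is called, which in turn calls skb_gso_segment (dev_gso_segment does not show up in the backtrace, presumably because it was inlined). So let's look at the condition under which netif_needs_gso returns true. Its code is below: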
static inline bool netif_needs_gso(struct sk_buff *skb,
                                   netdev_features_t features)
{
    return skb_is_gso(skb) &&
           (!skb_gso_ok(skb, features) ||
            unlikely((skb->ip_summed != CHECKSUM_PARTIAL) &&
                     (skb->ip_summed != CHECKSUM_UNNECESSARY)));
}

static inline bool skb_is_gso(const struct sk_buff *skb)
{
    return skb_shinfo(skb)->gso_size;
}
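As you can see, skb_is_gso returns false when gso_size is 0, and that alone makes netif_needs_gso return false. So a simple change is enough to keep it from returning true, like this: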
skb_shinfo(skb)->gso_size = 0;
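Indeed, after this change the warning no longer appears in the log. This is only a walk-through of the analysis; there is a lot more in the kernel network stack that is worth reading.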
As for GSO, it touches on segmentation offload to the hardware. If you are interested, see the figure below: