linux内核学习之网络篇——套接字缓冲区

时间:2021-02-23 11:08:10

在内核分析网络分组时,底层协议的数据将传输到跟高的层。而发送数据的时候顺序是相反的。每一层都是通过加(首部+净荷)传向跟底层,直至最终发送。

这些操作决定了网络的的性能。

就如下图所示

linux内核学习之网络篇——套接字缓冲区

 

linux因此设计了一个结构体

 

如下代码

/** 
 *	struct sk_buff - socket buffer
 *	@next: Next buffer in list
 *	@prev: Previous buffer in list
 *	@list: List we are on
 *	@sk: Socket we are owned by
 *	@stamp: Time we arrived
 *	@dev: Device we arrived on/are leaving by
 *      @real_dev: The real device we are using
 *	@h: Transport layer header
 *	@nh: Network layer header
 *	@mac: Link layer header
 *	@dst: FIXME: Describe this field
 *	@cb: Control buffer. Free for use by every layer. Put private vars here
 *	@len: Length of actual data
 *	@data_len: Data length
 *	@csum: Checksum
 *	@__unused: Dead field, may be reused
 *	@cloned: Head may be cloned (check refcnt to be sure)
 *	@pkt_type: Packet class
 *	@ip_summed: Driver fed us an IP checksum
 *	@priority: Packet queueing priority
 *	@users: User count - see {datagram,tcp}.c
 *	@protocol: Packet protocol from driver
 *	@security: Security level of packet
 *	@truesize: Buffer size 
 *	@head: Head of buffer
 *	@data: Data head pointer
 *	@tail: Tail pointer
 *	@end: End pointer
 *	@destructor: Destruct function
 *	@nfmark: Can be used for communication between hooks
 *	@nfcache: Cache info
 *	@nfct: Associated connection, if any
 *	@nf_debug: Netfilter debugging
 *	@nf_bridge: Saved data about a bridged frame - see br_netfilter.c
 *      @private: Data which is private to the HIPPI implementation
 *	@tc_index: Traffic control index
 */

struct sk_buff {
	/* These two members must be first. */
	struct sk_buff		*next;
	struct sk_buff		*prev;

	struct sk_buff_head	*list;
	struct sock		*sk;
	struct timeval		stamp;
	struct net_device	*dev;
	struct net_device	*real_dev;

	union {
		struct tcphdr	*th;
		struct udphdr	*uh;
		struct icmphdr	*icmph;
		struct igmphdr	*igmph;
		struct iphdr	*ipiph;
		unsigned char	*raw;
	} h;

	union {
		struct iphdr	*iph;
		struct ipv6hdr	*ipv6h;
		struct arphdr	*arph;
		unsigned char	*raw;
	} nh;

	union {
	  	struct ethhdr	*ethernet;
	  	unsigned char 	*raw;
	} mac;

	struct  dst_entry	*dst;
	struct	sec_path	*sp;

	/*
	 * This is the control buffer. It is free to use for every
	 * layer. Please put your private variables there. If you
	 * want to keep them across layers you have to do a skb_clone()
	 * first. This is owned by whoever has the skb queued ATM.
	 */
	char			cb[48];

	unsigned int		len,
				data_len,
				csum;
	unsigned char		local_df,
				cloned,
				pkt_type,
				ip_summed;
	__u32			priority;
	unsigned short		protocol,
				security;

	void			(*destructor)(struct sk_buff *skb);
#ifdef CONFIG_NETFILTER
        unsigned long		nfmark;
	__u32			nfcache;
	struct nf_ct_info	*nfct;
#ifdef CONFIG_NETFILTER_DEBUG
        unsigned int		nf_debug;
#endif
#ifdef CONFIG_BRIDGE_NETFILTER
	struct nf_bridge_info	*nf_bridge;
#endif
#endif /* CONFIG_NETFILTER */
#if defined(CONFIG_HIPPI)
	union {
		__u32		ifield;
	} private;
#endif
#ifdef CONFIG_NET_SCHED
       __u32			tc_index;               /* traffic control index */
#endif

	/* These elements must be at the end, see alloc_skb() for details.  */
	unsigned int		truesize;
	atomic_t		users;
	unsigned char		*head,
				*data,
				*tail,
				*end;
};

 

套接字换从区在各个层交换数据,就不用复制数据了。

 

从以上字段和注释可以看到,head和end字段指向了buf的起始位置和终止位置。然后使用header指针指像各种协议填值。然后data就是实际数据。tail记录了数据的偏移值。

 

相信大家都能看懂注释,具体的解释就不用介绍了.,

在一个新的分组产生的时候,TCP层首先在用户空间中分配内存来容纳该分组数据。分配的空间大于数据的实际需要长度。因此较低的层可以增加首部,在往下一层走的时候,只需要对字段添值即可。

 

对接收分组的一样,分组数据复制到内核分配的一个内存区中。并在分析的过程中一直处于内存区中。

 

skbuf还提供了一个双向链表对这个数据分组进行了管理。

如下代码

struct sk_buff_head {
	/* These two members must be first. */
	struct sk_buff	*next;
	struct sk_buff	*prev;

	__u32		qlen;
	spinlock_t	lock;
};


__u32  qlen;  缓冲区中等待队列的长度。就是分组的成员数量。

 

lock 表示了cpu的互斥。

今天分析到此,跟多源码阅读去看skbuff.h的文件。

 

更多文章,欢饮访问:http://blog.csdn.net/wallwind