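When the kernel analyzes a received network packet, the data of each lower-layer protocol is handed up to the next higher layer. When data is sent, the order is reversed: each layer puts its own header in front of the payload it received from the layer above and passes the result down to the next lower layer, until the packet is finally transmitted.
These operations are critical to the performance of the network stack.
This is illustrated in the figure below.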
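Linux therefore defines a dedicated structure for this purpose, struct sk_buff (the socket buffer).
The code is as follows: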
/**
 *  struct sk_buff - socket buffer
 *  @next: Next buffer in list
 *  @prev: Previous buffer in list
 *  @list: List we are on
 *  @sk: Socket we are owned by
 *  @stamp: Time we arrived
 *  @dev: Device we arrived on/are leaving by
 *  @real_dev: The real device we are using
 *  @h: Transport layer header
 *  @nh: Network layer header
 *  @mac: Link layer header
 *  @dst: FIXME: Describe this field
 *  @cb: Control buffer. Free for use by every layer. Put private vars here
 *  @len: Length of actual data
 *  @data_len: Data length
 *  @csum: Checksum
 *  @__unused: Dead field, may be reused
 *  @cloned: Head may be cloned (check refcnt to be sure)
 *  @pkt_type: Packet class
 *  @ip_summed: Driver fed us an IP checksum
 *  @priority: Packet queueing priority
 *  @users: User count - see {datagram,tcp}.c
 *  @protocol: Packet protocol from driver
 *  @security: Security level of packet
 *  @truesize: Buffer size
 *  @head: Head of buffer
 *  @data: Data head pointer
 *  @tail: Tail pointer
 *  @end: End pointer
 *  @destructor: Destruct function
 *  @nfmark: Can be used for communication between hooks
 *  @nfcache: Cache info
 *  @nfct: Associated connection, if any
 *  @nf_debug: Netfilter debugging
 *  @nf_bridge: Saved data about a bridged frame - see br_netfilter.c
 *  @private: Data which is private to the HIPPI implementation
 *  @tc_index: Traffic control index
 */

struct sk_buff {
    /* These two members must be first. */
    struct sk_buff          *next;
    struct sk_buff          *prev;

    struct sk_buff_head     *list;
    struct sock             *sk;
    struct timeval          stamp;
    struct net_device       *dev;
    struct net_device       *real_dev;

    union {
        struct tcphdr       *th;
        struct udphdr       *uh;
        struct icmphdr      *icmph;
        struct igmphdr      *igmph;
        struct iphdr        *ipiph;
        unsigned char       *raw;
    } h;

    union {
        struct iphdr        *iph;
        struct ipv6hdr      *ipv6h;
        struct arphdr       *arph;
        unsigned char       *raw;
    } nh;

    union {
        struct ethhdr       *ethernet;
        unsigned char       *raw;
    } mac;

    struct dst_entry        *dst;
    struct sec_path         *sp;

    /*
     * This is the control buffer. It is free to use for every
     * layer. Please put your private variables there. If you
     * want to keep them across layers you have to do a skb_clone()
     * first. This is owned by whoever has the skb queued ATM.
     */
    char                    cb[48];

    unsigned int            len,
                            data_len,
                            csum;
    unsigned char           local_df,
                            cloned,
                            pkt_type,
                            ip_summed;
    __u32                   priority;
    unsigned short          protocol,
                            security;

    void                    (*destructor)(struct sk_buff *skb);
#ifdef CONFIG_NETFILTER
    unsigned long           nfmark;
    __u32                   nfcache;
    struct nf_ct_info       *nfct;
#ifdef CONFIG_NETFILTER_DEBUG
    unsigned int            nf_debug;
#endif
#ifdef CONFIG_BRIDGE_NETFILTER
    struct nf_bridge_info   *nf_bridge;
#endif
#endif /* CONFIG_NETFILTER */
#if defined(CONFIG_HIPPI)
    union {
        __u32               ifield;
    } private;
#endif
#ifdef CONFIG_NET_SCHED
    __u32                   tc_index;   /* traffic control index */
#endif

    /* These elements must be at the end, see alloc_skb() for details. */
    unsigned int            truesize;
    atomic_t                users;
    unsigned char           *head,
                            *data,
                            *tail,
                            *end;
};
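Because the layers exchange socket buffers with one another, the packet data itself does not have to be copied from layer to layer.
As the fields and comments above show, head and end point to the start and the end of the allocated buffer. The header pointers h, nh and mac point to the transport, network and link layer headers inside that buffer. data points to the start of the data that is currently valid, and tail points to its end, so the valid data is the region between data and tail.
The comments are largely self-explanatory, so the individual fields are not explained one by one here.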
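To make the roles of the four pointers concrete, here is a minimal userspace sketch. It is not kernel code: struct toy_skb and the toy_* functions are hypothetical names made up for this illustration, and they only loosely mirror what the kernel helpers skb_reserve() and skb_put() do to data, tail and len.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace stand-in for the four buffer pointers of struct sk_buff. */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the valid data */
    unsigned char *tail;  /* end of the valid data */
    unsigned char *end;   /* end of the allocated buffer */
    unsigned int len;     /* always tail - data */
};

/* Allocate a buffer of `size` bytes; it starts out empty. */
int toy_alloc(struct toy_skb *skb, size_t size)
{
    skb->head = malloc(size);
    if (!skb->head)
        return -1;
    skb->data = skb->tail = skb->head;
    skb->end = skb->head + size;
    skb->len = 0;
    return 0;
}

/* Leave `n` bytes of headroom; only meaningful on an empty buffer. */
void toy_reserve(struct toy_skb *skb, size_t n)
{
    assert(skb->len == 0);
    assert((size_t)(skb->end - skb->tail) >= n);
    skb->data += n;
    skb->tail += n;
}

/* Grow the data area by `n` bytes at the tail.
 * Returns where the caller may write them, or NULL if they do not fit. */
unsigned char *toy_put(struct toy_skb *skb, size_t n)
{
    unsigned char *old_tail = skb->tail;

    if ((size_t)(skb->end - skb->tail) < n)
        return NULL;
    skb->tail += n;
    skb->len += (unsigned int)n;
    return old_tail;
}

/* Free space in front of and behind the valid data. */
size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

size_t toy_tailroom(const struct toy_skb *skb)
{
    return (size_t)(skb->end - skb->tail);
}

void toy_free(struct toy_skb *skb)
{
    free(skb->head);
    skb->head = skb->data = skb->tail = skb->end = NULL;
    skb->len = 0;
}
```

With a 128-byte buffer, reserving 32 bytes and then putting 10 bytes leaves data at head + 32, tail at head + 42, and 86 bytes of tailroom.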
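When a new packet is generated, the TCP layer first allocates kernel memory to hold the packet data. The space allocated is larger than the data actually needs, which leaves room in front of the data for lower layers to add their headers. On the way down, each layer therefore only has to fill in its own header fields; the payload is not copied again.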
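The send path described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct toy_pkt, the toy_pkt_* functions, the buffer size and the 64-byte headroom are all hypothetical, and the "headers" are short strings rather than real protocol headers. In the kernel the corresponding pointer move is done by skb_push().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 256
#define TOY_MAX_HDRS  64   /* hypothetical headroom kept free for lower-layer headers */

struct toy_pkt {
    unsigned char buf[TOY_BUF_SIZE];
    unsigned char *data;   /* start of the valid data */
    unsigned char *tail;   /* end of the valid data */
};

/* Transport layer: leave headroom, then copy the payload in exactly once. */
void toy_pkt_init(struct toy_pkt *p, const void *payload, size_t n)
{
    assert(TOY_MAX_HDRS + n <= TOY_BUF_SIZE);
    p->data = p->buf + TOY_MAX_HDRS;
    memcpy(p->data, payload, n);
    p->tail = p->data + n;
}

/* Lower layer: prepend a header by moving data backwards into the headroom.
 * The bytes already in the buffer are not moved. */
void toy_pkt_push(struct toy_pkt *p, const void *hdr, size_t n)
{
    assert((size_t)(p->data - p->buf) >= n);
    p->data -= n;
    memcpy(p->data, hdr, n);
}

/* Number of valid bytes currently in the packet. */
size_t toy_pkt_len(const struct toy_pkt *p)
{
    return (size_t)(p->tail - p->data);
}
```

Calling toy_pkt_init() with a payload and then toy_pkt_push() once per layer builds the frame from the inside out, while the payload stays at the address where it was first written.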
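Received packets are handled in the same way: the packet data is copied into a memory area allocated by the kernel and stays in that area for the whole time the packet is being analyzed.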
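The receive direction can be sketched the same way: the frame stays where it is, and each layer records where its header starts and then advances data past it. Everything here is hypothetical (struct toy_rx, the toy_rx_* functions and the fixed header lengths passed in by the caller); the kernel performs the equivalent pointer move with skb_pull().

```c
#include <assert.h>
#include <stddef.h>

struct toy_rx {
    const unsigned char *mac;   /* link layer header */
    const unsigned char *nh;    /* network layer header */
    const unsigned char *h;     /* transport layer header */
    const unsigned char *data;  /* start of the data not yet processed */
    size_t len;                 /* bytes from data to the end of the frame */
};

/* Strip `n` bytes from the front of the data.
 * Returns the new data pointer, or NULL if fewer than `n` bytes are left. */
const unsigned char *toy_rx_pull(struct toy_rx *rx, size_t n)
{
    if (n > rx->len)
        return NULL;
    rx->data += n;
    rx->len -= n;
    return rx->data;
}

/* Walk one frame up the stack without copying it.
 * The header lengths are fixed values chosen by the caller for this demo. */
int toy_rx_parse(struct toy_rx *rx, const unsigned char *frame, size_t frame_len,
                 size_t mac_len, size_t nh_len, size_t h_len)
{
    assert(frame != NULL);
    rx->data = frame;
    rx->len = frame_len;

    rx->mac = rx->data;             /* link layer sees the whole frame */
    if (!toy_rx_pull(rx, mac_len))
        return -1;
    rx->nh = rx->data;              /* network layer starts after the MAC header */
    if (!toy_rx_pull(rx, nh_len))
        return -1;
    rx->h = rx->data;               /* transport layer starts after the network header */
    if (!toy_rx_pull(rx, h_len))
        return -1;
    return 0;                       /* data now points at the payload */
}
```

After a successful parse, mac, nh and h all point into the same unmodified frame, and data and len describe the payload that is left for the application.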
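skbuff.h also provides a list head, struct sk_buff_head, which manages socket buffers on a doubly linked list.
The code is as follows: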
struct sk_buff_head {
    /* These two members must be first. */
    struct sk_buff  *next;
    struct sk_buff  *prev;

    __u32           qlen;
    spinlock_t      lock;
};
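qlen is the length of the queue, that is, the number of socket buffers currently on the list.
lock is a spinlock that protects the list against concurrent access from multiple CPUs.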
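The bookkeeping that a list head with next, prev and qlen has to do can be sketched in userspace like this. It is a simplified illustration, not the kernel implementation: struct toy_buf, struct toy_buf_head and the toy_* functions are hypothetical, the list is NULL-terminated for simplicity, and the lock is left out because the sketch is single-threaded. The kernel's own helpers for this job are skb_queue_tail() and skb_dequeue().

```c
#include <assert.h>
#include <stddef.h>

struct toy_buf {
    struct toy_buf *next;   /* links come first, as in struct sk_buff */
    struct toy_buf *prev;
    int id;                 /* hypothetical payload for the demo */
};

struct toy_buf_head {
    struct toy_buf *next;   /* first buffer on the queue, or NULL */
    struct toy_buf *prev;   /* last buffer on the queue, or NULL */
    unsigned int qlen;      /* number of buffers on the queue */
    /* The kernel structure also holds a spinlock_t here; it is omitted
     * because this sketch is single-threaded. */
};

void toy_queue_init(struct toy_buf_head *q)
{
    q->next = q->prev = NULL;
    q->qlen = 0;
}

/* Append a buffer at the end of the queue. */
void toy_queue_tail(struct toy_buf_head *q, struct toy_buf *b)
{
    assert(b != NULL);
    b->next = NULL;
    b->prev = q->prev;
    if (q->prev)
        q->prev->next = b;
    else
        q->next = b;        /* queue was empty */
    q->prev = b;
    q->qlen++;
}

/* Remove and return the first buffer, or NULL if the queue is empty. */
struct toy_buf *toy_dequeue(struct toy_buf_head *q)
{
    struct toy_buf *b = q->next;

    if (!b)
        return NULL;
    q->next = b->next;
    if (q->next)
        q->next->prev = NULL;
    else
        q->prev = NULL;     /* queue is now empty */
    b->next = b->prev = NULL;
    q->qlen--;
    return b;
}
```

Buffers come out in the order they were queued, and qlen always matches the number of buffers still linked in. In the kernel every such update happens with lock held.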
That is all for today. For more, read the source in skbuff.h.
For more articles, visit: http://blog.csdn.net/wallwind