Using LD_PRELOAD to hook and analyze Linux user-mode memory usage

LD_PRELOAD is an environment variable that lists dynamic libraries to be loaded before any others, so they take the highest priority. Generally, the loading order is LD_PRELOAD > LD_LIBRARY_PATH > /etc/ld.so.cache > /lib > /usr/lib. Programs often call functions provided by external libraries.

Take malloc/free as an example. If we write custom malloc/free functions, compile them into a dynamic library, and load it with LD_PRELOAD, then when the program calls malloc/free it actually calls our custom functions. Here is an example.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <math.h>
#include <sys/ioctl.h>
 
#define DBG(fmt, ...)   do { printf("%s line %d, "fmt"\n", __func__, __LINE__, ##__VA_ARGS__); } while (0)

//extern unsigned int abcdef;
int main(void)
{
	void *p = malloc(256);
	//DBG("abcdef = 0x%x", abcdef);
	if(p == NULL)
	{
	    DBG("malloc p is null.");
	}
	else
	{
	    DBG("p = %p.", p);
	}

	free(p);
	//DBG("abcdef = 0x%x", abcdef);
	p = NULL;

	return 0;
}

Compile and run it; the results are as follows:

As expected, main.c calls directly into the C library; the current call logic can be represented graphically as follows:

Adding a hook with LD_PRELOAD:

Create a new wrapper.so library that implements malloc/free internally, and make sure the function prototypes are exactly the same as those in the C library.

#define _GNU_SOURCE

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#include <dlfcn.h>
 
#define DBG(fmt, ...)   do { printf("%s line %d, "fmt"\n", __func__, __LINE__, ##__VA_ARGS__); } while (0)

unsigned int abcdef = 0;
void *malloc(size_t size)
{
	void *ret;

	static void* (*realmalloc)(size_t size) = NULL;

	if(realmalloc == NULL)
	{
		realmalloc = dlsym(RTLD_NEXT, "malloc");
	}

	if(realmalloc == NULL)
	{
		return NULL;
	}

	//DBG("malloc");
	ret = realmalloc(size);
	abcdef = 0xdeadbeef;

	return ret;
}

void free(void *p)
{
	/* free() returns void, so the pointer type must return void as well */
	static void (*realfree)(void *p) = NULL;

	if(realfree == NULL)
	{
		realfree = dlsym(RTLD_NEXT, "free");
	}

	if(realfree == NULL)
	{
		return;
	}

	realfree(p);
	abcdef = 0xbeefdead;

    return;
}
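
One way to keep the hook's function-pointer types identical to the C library prototypes is to let the compiler derive them from the declarations in <stdlib.h>, instead of writing them out by hand. The following is only a sketch under that assumption: it relies on the GNU `__typeof__` extension (available in gcc and clang), and the names `malloc_fn`, `free_fn`, `struct real_funcs` and `real_funcs_roundtrip` are hypothetical and not part of the original wrapper.c.

```c
#include <stddef.h>
#include <stdlib.h>   /* brings in the C library's own declarations */

/* Pointer types derived from the libc declarations (GNU __typeof__
 * extension), so they cannot drift from the real prototypes.
 * These names are hypothetical, chosen for this sketch only. */
typedef __typeof__(malloc) *malloc_fn;
typedef __typeof__(free)   *free_fn;

/* Hypothetical holder for the resolved "real" functions. */
struct real_funcs {
	malloc_fn malloc_ptr;
	free_fn   free_ptr;
};

/* Allocate and release one block through the stored pointers.
 * Returns 1 if the allocation succeeded, 0 otherwise. */
int real_funcs_roundtrip(const struct real_funcs *rf, size_t size)
{
	void *p = rf->malloc_ptr(size);

	if (p == NULL)
	{
		return 0;
	}

	rf->free_ptr(p);   /* free_fn returns void, matching free() */
	return 1;
}
```

In a wrapper like the one above, the two pointers would presumably be filled in from dlsym(RTLD_NEXT, ...) with a cast to `malloc_fn` or `free_fn`; a hand-written type that disagrees with libc then shows up as a compiler diagnostic rather than at run time.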

Modify main.c to print the abcdef variable, which verifies whether execution has reached the hook library as expected. In testing, adding a print statement to wrapper.c caused a segmentation fault at run time, so the global variable is used as a last resort.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <math.h>
#include <sys/ioctl.h>
 
#define DBG(fmt, ...)   do { printf("%s line %d, "fmt"\n", __func__, __LINE__, ##__VA_ARGS__); } while (0)

extern unsigned int abcdef;
int main(void)
{
	void *p = malloc(256);
	DBG("abcdef = 0x%x", abcdef);
	if(p == NULL)
	{
	    DBG("malloc p is null.");
	}
	else
	{
	    DBG("p = %p.", p);
	}

	free(p);
	DBG("abcdef = 0x%x", abcdef);
	p = NULL;

	return 0;
}
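
A likely explanation for the segmentation fault when printing from wrapper.c is that printf may itself call malloc (for example to allocate the stdout buffer), so the hooked malloc re-enters itself until the stack overflows. This is an assumption; the cause was not confirmed in the test above. Under that assumption, a minimal sketch of logging without stdio is shown below: the line is formatted into a stack buffer and emitted with write(2), which does not allocate. The helper names `hook_fmt_hex`, `hook_fmt_malloc` and `hook_log_malloc` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Write v as hexadecimal ("0x...") into buf and return the number of
 * characters written, excluding the terminating NUL.
 * buf must hold at least 2 + 2 * sizeof(uintptr_t) + 1 bytes. */
size_t hook_fmt_hex(char *buf, uintptr_t v)
{
	static const char digits[] = "0123456789abcdef";
	char tmp[2 * sizeof(uintptr_t)];
	size_t n = 0;
	size_t len = 0;

	do {	/* collect digits, least significant first */
		tmp[n++] = digits[v & 0xf];
		v >>= 4;
	} while (v != 0);

	buf[len++] = '0';
	buf[len++] = 'x';
	while (n > 0)
	{
		buf[len++] = tmp[--n];	/* reverse into the output */
	}
	buf[len] = '\0';

	return len;
}

/* Build "malloc(0x<size>) = 0x<ptr>\n" in out and return its length.
 * out must hold at least 64 bytes. */
size_t hook_fmt_malloc(char *out, size_t size, const void *ptr)
{
	size_t len = 0;

	memcpy(out + len, "malloc(", 7);
	len += 7;
	len += hook_fmt_hex(out + len, (uintptr_t)size);
	memcpy(out + len, ") = ", 4);
	len += 4;
	len += hook_fmt_hex(out + len, (uintptr_t)ptr);
	out[len++] = '\n';
	out[len] = '\0';

	return len;
}

/* Emit the line on stderr with write(2); no stdio, no allocation. */
void hook_log_malloc(size_t size, const void *ptr)
{
	char line[64];
	size_t len = hook_fmt_malloc(line, size, ptr);
	ssize_t ignored = write(STDERR_FILENO, line, len);

	(void)ignored;	/* nothing useful can be done if logging fails */
}
```

In wrapper.c, a call such as hook_log_malloc(size, ret) placed right after realmalloc(size) would take the place of the commented-out DBG line.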

Compile wrapper.so. The -ldl option must be added, otherwise a segmentation fault will occur; the reason is explained later. You must also add -fPIC, because wrapper.so contains references to global variables.

gcc --shared wrapper.c -o wrapper.so -ldl -fPIC

Compile the main program and export the environment variables:

$ gcc main.c -L./ wrapper.so
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/caozilong/Workspace/hook
$ LD_PRELOAD=./wrapper.so ./a.out

Judging from the program flow and the output, the main program first enters the malloc/free implementation in wrapper.c, which then calls the real malloc/free implementation in the C library. Because of this extra layer in the middle, we can do work here: add debugging, collect information for analysis, and solve specific problems. The flow can be represented graphically as follows:
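
As one illustration of what the intermediate layer could do, the sketch below keeps simple usage counters that the hooked malloc/free could update. It is only an assumption about how such accounting might look, not something the original wrapper.c contains: the `hook_stats_*` names are hypothetical, and C11 atomics are used so the counters stay consistent if several threads allocate at once.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical counters; not part of the original wrapper.c. */
struct hook_stats {
	atomic_size_t alloc_calls;	/* number of malloc calls seen */
	atomic_size_t free_calls;	/* number of free calls seen */
	atomic_size_t live_bytes;	/* bytes currently allocated */
	atomic_size_t peak_bytes;	/* highest value live_bytes reached */
};

static struct hook_stats g_stats;	/* zero-initialized */

/* Record an allocation of `bytes` bytes and update the peak. */
void hook_stats_on_alloc(size_t bytes)
{
	size_t live;
	size_t peak;

	atomic_fetch_add(&g_stats.alloc_calls, 1);
	live = atomic_fetch_add(&g_stats.live_bytes, bytes) + bytes;

	/* Raise the peak if this allocation exceeded it; on failure the
	 * compare-exchange reloads `peak` and the loop re-checks. */
	peak = atomic_load(&g_stats.peak_bytes);
	while (live > peak &&
	       !atomic_compare_exchange_weak(&g_stats.peak_bytes, &peak, live))
	{
	}
}

/* Record that a block of `bytes` bytes was released. */
void hook_stats_on_free(size_t bytes)
{
	atomic_fetch_add(&g_stats.free_calls, 1);
	atomic_fetch_sub(&g_stats.live_bytes, bytes);
}

size_t hook_stats_alloc_calls(void) { return atomic_load(&g_stats.alloc_calls); }
size_t hook_stats_free_calls(void)  { return atomic_load(&g_stats.free_calls); }
size_t hook_stats_live_bytes(void)  { return atomic_load(&g_stats.live_bytes); }
size_t hook_stats_peak_bytes(void)  { return atomic_load(&g_stats.peak_bytes); }
```

Since free only receives a pointer, the hook needs some way to learn the block size. On glibc, malloc_usable_size() from <malloc.h> is one option: call it on the returned pointer after realmalloc and on p before realfree, and pass the same quantity to both functions.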

Why does omitting -ldl cause a segmentation fault?

If wrapper.so is compiled without the -ldl option, as shown in the figure below, running the example produces a segmentation fault:

Analyzing the cause: most likely the dlsym function is bound to a different address in the two cases, and in the failing case it is bound to the wrong address.

In the normal case, it is bound to the symbol in the glibc library:

In the failing case, dlsym is bound to address 0, so calling it causes a segmentation fault.

end!

Keywords: Linux, Operation & Maintenance, server

Added by nonaguy on Wed, 09 Feb 2022 03:52:57 +0200