linux Basics

1. Signal

Signals are software interrupts that provide a way to handle asynchronous events

The simplest interface of unix signaling mechanism is the signal function

 * sign  Signal integer
 * func  Function pointer
 * return : Function pointer (a function address, function has an integer parameter, no return value)
void (* signal(int sign,void (*func)(int))) (int)
// Other expressions
typedef void signfunc(int);
signfunc *signal(int ,signfunc);

typedef void (*sighandler_t)(int);
sighandler_t signal(int signum,sighandler_t handler);
  • Signals: exceptions in the form of higher-level software

Knowledge development:

Abnormal control flow

Modern control systems respond to situations by making sudden changes in control flow, which we call abnormal control flow ECF

(Exceptional Control Flow).

  • In the hardware layer, the time detected by the hardware will trigger the sudden transfer of control to the exception handler
  • In the operating system layer, the kernel transfers control from one process to another through context switching
  • In the application layer, a process can send signals to another process, and the receiver will suddenly transfer control to one of its signal handlers

An exception is a sudden change in the control flow

The hardware triggers an exception and transfers it to the exception handler through the exception handling table. The rest of the work is to complete the exception handler in the software

Category of exception:

  • Interrupt
  • Trap
  • Fault
  • abort


Interrupts are signals from io devices that occur asynchronously

The I/O device (network adapter, disk controller, timer chip) sends a signal through a pin of the phase processor chip and sends the signal to the system bus to trigger the interrupt. This exception number identifies the device causing the interrupt


Traps are intentional exceptions

The trap handler returns control to the next instruction

A procedure like interface (system call) system call can be provided between the user program and the kernel


The fault may be caused by the error of the program and can be corrected

Page missing exception


The result of an unrecoverable fatal error at termination

Concurrent flow

The execution of a logical flow overlaps with another flow in time, which is called concurrent flow

The general phenomenon that multiple streams execute concurrently is called concurrency

The concept of one process running in turn with other processes is called multitasking

Parallel flow is a true subset of concurrent flow. If two flows run concurrently on different processor cores or computers, we call them parallel flow

2. Str() function

 #include <string.h>

char *
strstr(const char *haystack, const char *needle);
// locate a subtring in string
//Metaphor: looking for a needle in a haystack

3.Redis coding format

redis external data structure

redis object system

  • String string
  • Set set
  • Ordered set zset
  • Hash hash
  • List list

redis internal implementation data structure

// Object coding
#define REDIS_ENCODING_RAW 0     /* Raw representation */
#define REDIS_ENCODING_INT 1     /* Encoded as integer */
#define REDIS_ENCODING_HT 2      /* Encoded as hash table */
#define REDIS_ENCODING_ZIPMAP 3  /* Encoded as zipmap */
#define REDIS_ENCODING_LINKEDLIST 4 /* Encoded as regular linked list */
#define REDIS_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define REDIS_ENCODING_INTSET 6  /* Encoded as intset */
#define REDIS_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
#define REDIS_ENCODING_EMBSTR 8  /* Embedded sds string encoding */


REDIS_STRING encoding type

  • REDIS_ENCODING_INT, string object implemented by integer value
  • REDIS_ENCODING_RAW, a string object implemented by a simple dynamic string
  • REDIS_ENCODING_EMBSTR, a string object implemented by embstr encoded simple dynamic string

embstr encoded string object. All data is stored in a continuous piece of memory

object. Implemented in C

robj *tryObjectEncoding(robj *o) {
    long value;
    sds s = o->ptr;
    size_t len;
    // Encoding is only attempted when the encoding of the string is RAW or EMBSTR
    if (!sdsEncodedObject(o)) return o;
    // Do not encode shared objects
    if (o->refcount > 1) return o;
    // Check the string
    // Encode only strings that are less than or equal to 21 bytes in length and can be interpreted as integers
    len = sdslen(s);
    if (len <= 21 && string2l(s,len,&value)) {
        /* This object is encodable as a long. Try to use a shared object.
         * Note that we avoid using shared integers when maxmemory is used
         * because every object needs to have a private LRU field for the LRU
         * algorithm to work well. */
        if (server.maxmemory == 0 &&
            value >= 0 &&
            value < REDIS_SHARED_INTEGERS)
            return shared.integers[value];
        } else {
            if (o->encoding == REDIS_ENCODING_RAW) sdsfree(o->ptr);
            o->encoding = REDIS_ENCODING_INT;
            o->ptr = (void*) value;
            return o;
    // An attempt was made to encode a RAW encoded string into EMBSTR encoding
        robj *emb;
        if (o->encoding == REDIS_ENCODING_EMBSTR) return o;
        emb = createEmbeddedStringObject(s,sdslen(s));
        return emb;
    // This object cannot be encoded. Try to remove all free space from SDS
    if (o->encoding == REDIS_ENCODING_RAW &&
        sdsavail(s) > len/10)
        o->ptr = sdsRemoveFreeSpace(o->ptr);
    /* Return the original object. */
    return o;


Coding mode

/* Check if the length exceeds the ziplist length threshold. */
// Check whether the code needs to be converted into a double ended linked list after insertion
if (subject->encoding == REDIS_ENCODING_ZIPLIST &&
    ziplistLen(subject->ptr) > server.list_max_ziplist_entries)


Coding mode

void hashTypeTryConversion(robj *o, robj **argv, int start, int end) {
    int i;
    // If the object is not zip list encoded, it is returned directly
    if (o->encoding != REDIS_ENCODING_ZIPLIST) return;

    // Check all input objects to see if their string values exceed the specified length
    for (i = start; i <= end; i++) {
        if (sdsEncodedObject(argv[i]) &&
            sdslen(argv[i]->ptr) > server.hash_max_ziplist_value)
            // Converts the encoding of an object to REDIS_ENCODING_HT
            hashTypeConvert(o, REDIS_ENCODING_HT);


Coding mode

void setTypeConvert(robj *setobj, int enc) {

    setTypeIterator *si;

    // Confirm that the type and code are correct
    redisAssertWithInfo(NULL,setobj,setobj->type == REDIS_SET &&
                             setobj->encoding == REDIS_ENCODING_INTSET);

    if (enc == REDIS_ENCODING_HT) {
        int64_t intele;
        // Create a new dictionary
        dict *d = dictCreate(&setDictType,NULL);
        robj *element;

        /* Presize the dict to avoid rehashing */
        // Pre expansion space

        /* To add the elements we extract integers and create redis objects */
        // Traverse the collection and add elements to the dictionary
        si = setTypeInitIterator(setobj);
        while (setTypeNext(si,NULL,&intele) != -1) {
            element = createStringObjectFromLongLong(intele);
            redisAssertWithInfo(NULL,element,dictAdd(d,element,NULL) == DICT_OK);

        // Update the encoding of the collection
        setobj->encoding = REDIS_ENCODING_HT;
        // Update the value object of the collection
        setobj->ptr = d;
    } else {
        redisPanic("Unsupported set conversion");


Coding mode


4. Euclidean algorithm (rolling and getting along)

Euclidean algorithm:

  • Algorithm for finding the maximum common divisor
  • The core idea of the algorithm: the maximum common divisor of two numbers is equal to the maximum common divisor of the smaller number and the remainder of the division of two numbers


  • Judge whether the score composed of two numbers is the simplest, that is, whether the maximum common divisor is 1
// pseudocode
function gcd(a,b):
	     while b!= 0:
            t = b
            b = a % b
            a = t
        return a

Recursive form

public int gcd(a,b){
     return b != 0? gcd(b,a%b): a;
  • expand

Another method of calculating the greatest common divisor of two numbers

Modified subtraction method

function gcd(a,b):
		 while true:
          if a> b:  a-=b
          else if a < b: b -=a
          else  return a

5. Grammatical problems

The global threshold in c + + can only declare and initialize variables

Cannot be used for assignment, operation, function call, etc

About initialization and assignment

  • Variables in a program can be declared multiple times, but can only be defined once
  • Only when the declared variable is also defined, the declaration can have an initialization type (only the definition can allocate storage space, and the initialization must have storage space to initialize)
  • Declaration + initialization is definition
  • Assignment is an operation. The assignment equal sign is an operator from right to left. For global variables, it cannot be used for operation

Redundant parameter passing compilation warning problem

If the parameters passed in by the function are not used, the compilation will report a warning

Refer to the solution in redis code

How to anti-warning

// define a marco
#define REDIS_NOTUSED(V)  ((void) V)

// function
int function(int arg){

6. Restricted pointer

Restricted pointers, which usually appear in the declaration of pointers

int * restrict p;

If the object pointed to by pointer p needs to be modified later, the object will not be allowed to be accessed in any way other than pointer p

It will be added only when there are high performance requirements for the program


zip( iterables)*

Generate an iterator to recombine iteratable elements

Matrix is a two-dimensional matrix

zip(*matrix), which returns the elements of each column of a matrix matrix

# zip()
# Make an iterator that aggregates elements from each of the iterables.
def zip(*iterables):
    # zip('ABCD', 'xy') --> Ax By
    sentinel = object()
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            elem = next(it, sentinel)
            if elem is sentinel:
        yield tuple(result)
>>> x = [1, 2, 3]
>>> y = [4, 5, 6]
>>> zipped = zip(x, y)
>>> list(zipped)
[(1, 4), (2, 5), (3, 6)]
>>> x2, y2 = zip(*zip(x, y))
>>> x == list(x2) and y == list(y2)


Delegating to a subgenerator

Allowing a generator to delegate part of its operations to another genrator

8. Thread

Thread through pthread_create function to create other threads

#include <pthread.h>

pthread_create(pthread_t *thread, const pthread_attr_t *attr,
         void *(*start_routine)(void *), void *arg);

Upon successful return, the newly created thread id will be set to the memory unit pointed to by the thread;

The new thread starts from start_ The address of the routine function starts running;

Mutex & conditional variable

  • mutex is essentially a lock. mutex is set (locked) when accessing the critical security of shared resources

  • Conditional variable is another synchronization mechanism available to threads. Conditional variable provides a place for multiple threads to meet

Conditional variables + mutexes are used together to allow threads to wait for specific conditions to occur in a non competitive manner

Condition variables are protected by mutexes. Threads must lock mutexes before changing the condition state

#include <pthread.h>
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
  • 1. Pass to pthread_ cond_ mutex of wait protects the condition
  • 2. The caller passes the locked mutex to the function
  • 3. The function places the caller on the thread list of the waiting condition and unlocks the mutex
  • 4. After the function returns, the mutex is locked again

9.fcntl sync

#include <fcntl.h>
// file control
int fcntl(int fildes, int cmd, ...);

Change the properties of an open file

Manipulate file descriptor

     sync -- force completion of pending disk writes (flush cache)

unix system implements buffer cache or page cache in the kernel

Most disk io is done through buffers

Write data to file (delayed write)

  • 1. Kernel copy to buffer (queue)
  • 2. Buffer to disk

When you need to reuse the buffer to store other disk data, you need to perform step 2 (flush)

Sync and fsync are mainly used to ensure the consistency between the actual file system on the disk and the contents of the buffer

10. File sharing

unix system supports sharing open files between different processes

Kernel data structure for all I/O

Three data structures are used to represent open files

  • Process table entry
  • File table entry
  • v node table entry

Note the distinction between file descriptors FD

File status flag FL

Two file descriptors point to the same file table entry, so they share the same file status flag

Folder Redirection

digit1 > &digit2

Indicates that the descriptor digit1 is to be redirected to the same file as the descriptor digits2

./a.out > outfile  2 > &1
# Standard output to outfile
# Execute dup to copy the standard output to descriptor 2 (standard error)
# 1 and 2 point to the same file table entry
./a.out  2 > &1  > outfile
# Descriptor 2 becomes a terminal
# Redirect standard output to outfile
# Descriptor 1 points to the outifle file table entry
# Descriptor 2 points to the file table entry of the terminal

11. IO function

io functions are all around file descriptors

Standard io libraries are all around streams

The standard io function fopen returns a pointer to the FILE object

12. Compilation and linking

// . o relocatable target file
gcc -v   // Version can be displayed

// Standard directive
gcc - Og -o prog main.c sum.c

// Command disassembly
// 1. Preprocessor cpp
cpp [other arguments] main.c /tmp/main.i
main.c   -> mian.i  // Intermediate file of ascii code
// 2.c compiler ccl
ccl /tmp/mian.i  -Og [other arguments] -o /tmp/main.s   // Assembly language file of ascii code
// 3. Assembler as
as [other arguments] -o /tmp/main.o /tmp/mian.s
// main.o relocatable target files
// 4. Linker
ld -o prog [system objects files and args] /tmp/mian.o /tmp/sum.o

// 5. Operation

# The shell calls a function called loader in os to copy the code and data in the executable prog to memory
# And transfer control to the beginning of the program

Process of program execution

When using the exec() kernel function to execute a program

  • 1.c programs are always executed from the main function
  • 2. The kernel calls a special startup routine before calling the main function
  • 3. The executable program file specifies this startup routine as the starting address of the program
  • 4. Start the routine to obtain the command line parameters and environment variable values from the kernel
If the target file is created by c Generated by code compilation, using gcc Just make a link. The entry point of the whole program is ctr1.o Provided in _start;
It first does some initialization (the following is the startup routine) startup routine), Then call c Provided in the code main Function;
Really correct description_start The real entry point of the function, main Function is_start Invoked

13. Prototype design mode

prototype, object creation design pattern

  • abstract factory abstracts factory pattern and generates prototype pattern by dynamic configuration
  • Abstract factory pattern, single instance, generate single instance pattern

Bean factory is widely used in spring framework bean factory

14. Dynamic language vs static language

Dynamic language: dynamic language is also called weakly typed language. The data type language (js, python) is determined at runtime

Static language: static language is also called strongly typed language. The variable data type can be determined at compile time (java,c/c + +)

Keywords: C Linux

Added by egorig on Sun, 27 Feb 2022 15:38:00 +0200