Block device driver
The block device driver is one of the three major types of Linux drivers. Block device drivers are considerably more complex than character device drivers, and different kinds of storage devices are handled by different driver subsystems. The block device driver framework and its use are introduced below.
1, Block device introduction
Block devices are storage devices, such as SD cards, EMMC, NAND Flash, NOR Flash, SPI Flash, mechanical hard disks and solid state disks, so block device drivers are in effect the drivers for these storage devices. Compared with character device drivers, the main differences of block device drivers are as follows:
- Block devices can only be read and written in units of blocks, and the block is the basic data transfer unit of the Linux virtual file system (VFS). Character devices transfer data byte by byte and need no buffering
- Block devices support random access. Because reads and writes are carried out block by block, block devices use buffers to hold data temporarily; when the time is right, the buffered data is written to the device in one go
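To make block-granular access concrete, the following userspace sketch works out which blocks a byte range touches. The 512-byte block size and the helper names `first_block` and `blocks_spanned` are assumptions made for illustration; they are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 512u /* assumed block size, for illustration only */

/* Index of the block that contains the byte at the given offset. */
static size_t first_block(size_t offset)
{
    return offset / BLOCK_SIZE;
}

/* Number of whole blocks that must be transferred to cover len bytes
 * starting at offset. A zero-length range touches no block. */
static size_t blocks_spanned(size_t offset, size_t len)
{
    assert(offset + len >= offset); /* the range must not wrap around */
    if (len == 0)
        return 0;
    size_t last = (offset + len - 1) / BLOCK_SIZE;
    return last - first_block(offset) + 1;
}
```

For instance, a 4-byte access at offset 510 straddles a block boundary, so two full blocks have to be transferred even though only 4 bytes were requested.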
2, Block device driver framework
2.1 block_device structure
The kernel uses block_device to represent a block device. block_device is a structure defined in include/linux/fs.h
struct block_device {
    dev_t bd_dev;                       /* not a kdev_t - it's a search key */
    int bd_openers;
    struct inode *bd_inode;             /* will die */
    struct super_block *bd_super;
    struct mutex bd_mutex;              /* open/close mutex */
    struct list_head bd_inodes;
    void *bd_claiming;
    void *bd_holder;
    int bd_holders;
    bool bd_write_holder;
#ifdef CONFIG_SYSFS
    struct list_head bd_holder_disks;
#endif
    struct block_device *bd_contains;
    unsigned bd_block_size;
    struct hd_struct *bd_part;
    /* number of times partitions within this device have been opened. */
    unsigned bd_part_count;
    int bd_invalidated;
    /* bd_disk is a pointer to a gendisk structure. The kernel uses
     * block_device to represent a specific block device object (a whole
     * disk or a partition); for a disk, bd_disk points to its generic
     * disk structure, gendisk */
    struct gendisk *bd_disk;
    struct request_queue *bd_queue;
    struct list_head bd_list;
    unsigned long bd_private;
    /* The counter of freeze processes */
    int bd_fsfreeze_count;
    /* Mutex for freeze */
    struct mutex bd_fsfreeze_mutex;
};
- Register a block device: register a new block device with the kernel and request a major device number
int register_blkdev(unsigned int major, const char *name)
// major: major device number
// name: block device name
// Return value: if major is between 1 and 255, it is a caller-chosen major device number; 0 means the registration succeeded and a negative value means it failed
// If major is 0, the system assigns the major device number automatically; on success the assigned major device number is returned, and a negative value means the registration failed
- Unregister a block device: remove the block device registration from the kernel
void unregister_blkdev(unsigned int major, const char *name)
// major: major device number of the block device to unregister
// name: name of the block device to unregister
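The major number obtained at registration is combined with a minor number to form a device number of type dev_t, such as the bd_dev member above. The sketch below mirrors the packing done by the kernel's MKDEV, MAJOR and MINOR macros in include/linux/kdev_t.h, which reserve the low 20 bits for the minor number. The `toy_` names are hypothetical and exist only so the example can be compiled in userspace.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MINORBITS 20u                         /* same width as the kernel's MINORBITS */
#define TOY_MINORMASK ((1u << TOY_MINORBITS) - 1u)

typedef uint32_t toy_dev_t; /* stand-in for the kernel's dev_t */

/* Pack a major and a minor number into one device number. */
static toy_dev_t toy_mkdev(uint32_t major, uint32_t minor)
{
    assert(major < (1u << 12));     /* 12 bits remain for the major number */
    assert(minor <= TOY_MINORMASK);
    return (major << TOY_MINORBITS) | minor;
}

/* Extract the major number from a device number. */
static uint32_t toy_major(toy_dev_t dev)
{
    return dev >> TOY_MINORBITS;
}

/* Extract the minor number from a device number. */
static uint32_t toy_minor(toy_dev_t dev)
{
    return dev & TOY_MINORMASK;
}
```

With this layout, the whole disk and each of its partitions share one major number and differ only in the minor number.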
2.2 gendisk structure
The Linux kernel uses the gendisk structure to describe a disk device. It is defined in include/linux/genhd.h
struct gendisk {
    int major;                          /* Major device number of the disk device */
    int first_minor;                    /* First minor device number of the disk */
    int minors;                         /* Number of minor device numbers of the disk,
                                           that is, the number of partitions of the disk */
    char disk_name[DISK_NAME_LEN];      /* name of major driver */
    char *(*devnode)(struct gendisk *gd, umode_t *mode);
    unsigned int events;                /* supported events */
    unsigned int async_events;          /* async events, subset of all */
    /* Partition table of the disk */
    struct disk_part_tbl __rcu *part_tbl;
    struct hd_struct part0;
    const struct block_device_operations *fops; /* Block device operation set */
    struct request_queue *queue;        /* Request queue of the disk */
    void *private_data;
    int flags;
    struct device *driverfs_dev;        // FIXME: remove
    struct kobject *slave_dir;
    struct timer_rand_state *random;
    atomic_t sync_io;                   /* RAID */
    struct disk_events *ev;
#ifdef CONFIG_BLK_DEV_INTEGRITY
    struct blk_integrity *integrity;
#endif
    int node_id;
};
When writing a block device driver, you need to allocate and initialize a gendisk. The kernel provides a set of functions for working with a gendisk
- Allocate a gendisk: a gendisk must be allocated before it is used
struct gendisk *alloc_disk(int minors)
// minors: number of minor device numbers, that is, the number of partitions of the gendisk
// Return value: the allocated gendisk on success, NULL on failure
- Delete a gendisk:
void del_gendisk(struct gendisk *gp)
// gp: the gendisk to delete
- Add a gendisk to the kernel: after a gendisk has been allocated and initialized, add it to the kernel
void add_disk(struct gendisk *disk)
// disk: the gendisk to add to the kernel
- Set the gendisk capacity: the capacity must be set when the gendisk is initialized
void set_capacity(struct gendisk *disk, sector_t size)
// disk: the gendisk whose capacity is set
// size: disk capacity. Note that this is a number of sectors
// The smallest addressable unit of a block device is the sector, which is generally 512 bytes
// The size passed to this function is therefore the actual capacity in bytes divided by 512
- Adjust the gendisk reference count: the kernel adjusts the reference count of a gendisk through the following two functions
// Increase the reference count of the gendisk
struct kobject *get_disk(struct gendisk *disk)
// Decrease the reference count of the gendisk
void put_disk(struct gendisk *disk)
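Because set_capacity expects a sector count rather than a byte count, the conversion is worth spelling out. The following is a minimal userspace sketch, assuming 512-byte sectors; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_SHIFT 9u                        /* 2^9 = 512 bytes per sector */
#define SKETCH_SECTOR_SIZE  (1u << SKETCH_SECTOR_SHIFT)

/* Convert a capacity in bytes to a number of 512-byte sectors. */
static uint64_t bytes_to_sectors(uint64_t bytes)
{
    /* this sketch assumes the capacity is a whole number of sectors */
    assert(bytes % SKETCH_SECTOR_SIZE == 0);
    return bytes >> SKETCH_SECTOR_SHIFT;
}

/* Convert a sector count (or sector address) back to bytes. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors << SKETCH_SECTOR_SHIFT;
}
```

A 2MB disk therefore has a capacity of 4096 sectors, which is the value a driver for such a disk would pass to set_capacity.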
2.3 block_device_operations structure
The block device operation set is the structure block_device_operations, defined in include/linux/blkdev.h
struct block_device_operations {
    int (*open) (struct block_device *, fmode_t);
    void (*release) (struct gendisk *, fmode_t);
    // rw_page reads or writes the specified page
    int (*rw_page)(struct block_device *, sector_t, struct page *, int rw);
    // ioctl performs I/O control of the block device
    int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
    int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
    long (*direct_access)(struct block_device *, sector_t, void **, unsigned long *pfn, long size);
    unsigned int (*check_events) (struct gendisk *disk, unsigned int clearing);
    /* ->media_changed() is DEPRECATED, use ->check_events() instead */
    int (*media_changed) (struct gendisk *);
    void (*unlock_native_capacity) (struct gendisk *);
    int (*revalidate_disk) (struct gendisk *);
    // getgeo returns disk geometry: heads, cylinders, sectors and so on
    int (*getgeo)(struct block_device *, struct hd_geometry *);
    /* this callback is with swap_lock and sometimes page table lock held */
    void (*swap_slot_free_notify) (struct block_device *, unsigned long);
    struct module *owner;
};
2.4 block device I/O request process
The block_device_operations structure contains no read or write functions, so how does a block device driver read data from and write data to the physical device? The answer lies in three very important structures of the block device driver: request_queue, request and bio. The three are related as follows:
The kernel sends all reads and writes for a block device to its request queue, request_queue. The queue holds a large number of requests (request structures), and each request contains bios. A bio carries the information for the transfer: which address on the block device to access, how much data to transfer, where in memory the data that is read should be stored and, for a write, the data to be written
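The containment described above can be pictured with a small userspace model. The `toy_` structures below are hypothetical, heavily simplified stand-ins for request_queue, request and bio; they keep only the links needed to walk from a queue down to each bio.

```c
#include <assert.h>
#include <stddef.h>

struct toy_bio {
    unsigned long sector;      /* start sector on the device */
    unsigned int bytes;        /* amount of data to transfer */
    struct toy_bio *next;      /* next bio in the same request */
};

struct toy_request {
    struct toy_bio *bio;       /* head of this request's bio chain */
    struct toy_request *next;  /* next request in the queue */
};

struct toy_queue {
    struct toy_request *head;  /* first pending request */
};

/* Total bytes described by all bios of all requests in the queue. */
static unsigned long toy_queue_bytes(const struct toy_queue *q)
{
    unsigned long total = 0;

    assert(q != NULL);
    for (const struct toy_request *rq = q->head; rq; rq = rq->next)
        for (const struct toy_bio *b = rq->bio; b; b = b->next)
            total += b->bytes;
    return total;
}
```

The two nested loops correspond to the two levels of traversal a driver performs: one over the requests in the queue and one over the bios inside each request.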
- Request queue (request_queue): each disk (gendisk) must be assigned a request_queue
Initialize a request queue: allocate and initialize a request_queue, then assign its address to the queue member of the gendisk when the gendisk is initialized
struct request_queue *blk_init_queue(request_fn_proc *rfn, spinlock_t *lock)
// rfn: request processing function pointer. Every request_queue must have a request processing function, whose prototype is:
//      void (request_fn_proc) (struct request_queue *q)
// lock: spinlock pointer. The driver writer must define a spinlock and pass it in; the request queue uses this lock
// Return value: NULL on failure, the address of the request_queue on success
Delete a request queue: when the block device driver is unloaded, the previously allocated request_queue must be deleted
void blk_cleanup_queue(struct request_queue *q)
// q: the request queue to delete
Allocate a request queue and bind a "make request" function: blk_init_queue binds a default "make request" function when it allocates the queue and assigns an I/O scheduler to optimize reads and writes, which suits mechanical devices. Non-mechanical devices (EMMC, SD cards, etc.) can be accessed completely randomly and do not need a complex I/O scheduler; for them, allocate a plain request queue and bind your own "make request" function
// The following two functions are used together:
/* Allocate a request_queue */
struct request_queue *blk_alloc_queue(gfp_t gfp_mask)
// gfp_mask: memory allocation mask, generally GFP_KERNEL
// Return value: the allocated request_queue, which has no I/O scheduler
/* Bind the "make request" function */
void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
// q: the request queue to bind to, that is, the one allocated by blk_alloc_queue
// mfn: the "make request" function to bind. Its prototype is:
//      void (make_request_fn) (struct request_queue *q, struct bio *bio)
// The "make request" function must be implemented by the driver writer
- Request (request): the request queue contains a series of requests
Get a request: requests are taken from the request_queue one after another
struct request *blk_peek_request(struct request_queue *q)
// q: the request_queue to take from
// Return value: the next request to be processed in the request_queue, or NULL if there is none
Start a request: after obtaining the next request to be processed, start processing it
void blk_start_request(struct request *req)
// req: the request to start processing
Get and start in one step: the following function gets and starts a request in a single call
struct request *blk_fetch_request(struct request_queue *q)
{
    struct request *rq;

    rq = blk_peek_request(q);
    if (rq)
        blk_start_request(rq);
    return rq;
}
Other request related functions
blk_end_request()       // Completes the specified number of bytes of the request
blk_end_request_all()   // Completes all data of the request
blk_end_request_cur()   // Completes the current chunk of the request
blk_end_request_err()   // Completes the request up to the next failure boundary
__blk_end_request()     // Same as blk_end_request, but the queue lock must be held
__blk_end_request_all() // Same as blk_end_request_all, but the queue lock must be held
__blk_end_request_cur() // Same as blk_end_request_cur, but the queue lock must be held
__blk_end_request_err() // Same as blk_end_request_err, but the queue lock must be held
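A request processing function typically combines these calls in a loop: fetch a request, handle its current chunk, complete that chunk, and fetch the next request only once the current one is finished. The userspace model below sketches that control flow. It relies on one property of blk_end_request_cur and __blk_end_request_cur, namely that they return false when the request is completely finished and true while chunks are still pending. All `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_rq {
    unsigned int chunks_left;  /* chunks of this request not yet completed */
};

struct toy_rq_queue {
    struct toy_rq **slots;     /* pending requests in FIFO order */
    size_t count;              /* number of entries in slots */
    size_t next;               /* index of the next request to hand out */
};

/* Models the fetch step: return the next pending request and dequeue it. */
static struct toy_rq *toy_fetch(struct toy_rq_queue *q)
{
    assert(q != NULL);
    if (q->next >= q->count)
        return NULL;
    return q->slots[q->next++];
}

/* Models completing the current chunk: true means chunks are still
 * pending for this request, false means the request is finished. */
static bool toy_end_cur(struct toy_rq *rq)
{
    assert(rq != NULL && rq->chunks_left > 0);
    rq->chunks_left--;
    return rq->chunks_left > 0;
}

/* Drain the queue and return the number of chunks handled. */
static unsigned int toy_request_fn(struct toy_rq_queue *q)
{
    unsigned int handled = 0;
    struct toy_rq *rq = toy_fetch(q);

    while (rq != NULL) {
        handled++;               /* a real driver transfers the chunk here */
        if (!toy_end_cur(rq))    /* request finished: move to the next one */
            rq = toy_fetch(q);
    }
    return handled;
}
```

A request with three chunks therefore goes round the loop three times before the next request is fetched.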
- bio structure: each request holds multiple bios, which store the data, addresses and other information for the read or write.
bio is a structure defined in include/linux/blk_types.h
struct bio {
    struct bio *bi_next;                /* Next bio in the request queue */
    struct block_device *bi_bdev;       /* Block device this bio targets */
    unsigned long bi_flags;             /* bio status and other information */
    unsigned long bi_rw;                /* I/O operation, read or write */
    struct bvec_iter bi_iter;           /* Current position: sector, remaining size, bio_vec index */
    unsigned int bi_phys_segments;
    unsigned int bi_seg_front_size;
    unsigned int bi_seg_back_size;
    atomic_t bi_remaining;
    bio_end_io_t *bi_end_io;
    void *bi_private;
#ifdef CONFIG_BLK_CGROUP
    struct io_context *bi_ioc;
    struct cgroup_subsys_state *bi_css;
#endif
    union {
#if defined(CONFIG_BLK_DEV_INTEGRITY)
        struct bio_integrity_payload *bi_integrity;
#endif
    };
    unsigned short bi_vcnt;             /* Number of elements in the bio_vec list */
    unsigned short bi_max_vecs;         /* Capacity of the bio_vec list */
    atomic_t bi_cnt;                    /* pin count */
    struct bio_vec *bi_io_vec;          /* bio_vec list */
    struct bio_set *bi_pool;
    struct bio_vec bi_inline_vecs[0];
};
The bvec_iter structure describes the device sector to operate on and related information:
struct bvec_iter {
    sector_t bi_sector;                 /* Start sector of the I/O request on the device (512-byte sectors) */
    unsigned int bi_size;               /* Remaining I/O count, in bytes */
    unsigned int bi_idx;                /* Current index into bi_io_vec */
    unsigned int bi_bvec_done;          /* Number of bytes already processed in the current bvec */
};
The bio_vec structure is as follows:
struct bio_vec {
    struct page *bv_page;               /* Page */
    unsigned int bv_len;                /* Length */
    unsigned int bv_offset;             /* Offset within the page */
};
bio, bvec_iter and bio_vec are related as follows: bi_io_vec points to an array of bio_vec entries, each describing one segment as a page, an offset within that page and a length, and bi_iter records the current position within that array.
Traverse the bios in a request: a request contains many bios, so processing it involves traversing all of them
#define __rq_for_each_bio(_bio, rq) \
    if ((rq->bio)) \
        for (_bio = (rq)->bio; _bio; _bio = _bio->bi_next)
// _bio: each bio traversed, a bio structure pointer
// rq: the request to traverse, a request structure pointer
Traverse all segments in a bio: the bio holds the data that is finally operated on, so all segments in the bio must be traversed as well
#define bio_for_each_segment(bvl, bio, iter) \
    __bio_for_each_segment(bvl, bio, iter, (bio)->bi_iter)
// bvl: each bio_vec traversed
// bio: the bio to traverse, a bio structure pointer
// iter: holds the traversal position, starting from the bi_iter member of the bio being traversed
Notify the kernel that a bio is finished: with the "make request" method, that is, when bios are processed directly without the I/O scheduler, the kernel must be told once a bio has been processed
void bio_endio(struct bio *bio, int error)
// bio: the bio to end
// error: 0 if the bio was processed successfully, a negative value such as -EIO on failure
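The segment traversal can be modelled in userspace to show how the device offset advances from one segment to the next. In this sketch the `toy_` structures are hypothetical simplifications of bio and bio_vec (a plain buffer pointer replaces the page and offset pair), and the bounds check is an addition of the sketch, not something the kernel macros do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_vec {
    unsigned char *buf;          /* memory holding this segment's data */
    size_t len;                  /* segment length in bytes */
};

struct toy_bio {
    size_t start_sector;         /* first device sector of the transfer */
    int is_write;                /* nonzero: memory -> device */
    const struct toy_vec *vecs;  /* segment array */
    size_t nvecs;                /* number of segments */
};

/* Copy every segment to or from the backing store. Returns 0 on success
 * or -1 if the transfer would run past the end of the store. */
static int toy_handle_bio(unsigned char *store, size_t store_size,
                          const struct toy_bio *bio)
{
    assert(store != NULL && bio != NULL);
    size_t offset = bio->start_sector << 9; /* sectors to bytes */

    for (size_t i = 0; i < bio->nvecs; i++) {
        const struct toy_vec *v = &bio->vecs[i];

        if (offset > store_size || v->len > store_size - offset)
            return -1;
        if (bio->is_write)
            memcpy(store + offset, v->buf, v->len);
        else
            memcpy(v->buf, store + offset, v->len);
        offset += v->len; /* the next segment continues where this one ended */
    }
    return 0;
}
```

The segments of one bio are scattered in memory but contiguous on the device, which is why a single running offset is enough.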
3, Experiment with request queue
Use the RAM of the development board to simulate a block device, a ramdisk, and then write a block device driver for it
3.1 driver programming
The traditional request queue approach is intended for drivers of mechanical hard disks. Create a new driver file named ramdisk.c and write the following code
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/fs.h>
#include <linux/genhd.h>
#include <linux/blkdev.h>
#include <linux/hdreg.h>

#define RAMDISK_SIZE    (2 * 1024 * 1024)   /* Capacity: 2MB */
#define RAMDISK_NAME    "ramdisk"           /* Name */
#define RAMDISK_MINOR   3                   /* Three disk partitions, NOT minor device number 3 */

/* ramdisk device structure */
struct ramdisk_dev {
    int major;                      /* Major device number */
    unsigned char *ramdiskbuf;      /* Memory that simulates the block device */
    spinlock_t lock;                /* Spinlock */
    struct gendisk *gendisk;        /* gendisk */
    struct request_queue *queue;    /* Request queue */
};

static struct ramdisk_dev ramdisk;  /* ramdisk device */

/* Open the block device */
static int ramdisk_open(struct block_device *dev, fmode_t mode)
{
    printk("ramdisk open\r\n");
    return 0;
}

/* Release the block device */
static void ramdisk_release(struct gendisk *disk, fmode_t mode)
{
    printk("ramdisk release\r\n");
}

/* Get disk information */
static int ramdisk_getgeo(struct block_device *dev, struct hd_geometry *geo)
{
    /* These are concepts that belong to mechanical hard disks */
    geo->heads = 2;                                     /* Heads */
    geo->cylinders = 32;                                /* Cylinders */
    geo->sectors = RAMDISK_SIZE / (2 * 32 * 512);       /* Sectors per track */
    return 0;
}

/* Block device operation functions */
static struct block_device_operations ramdisk_fops = {
    .owner = THIS_MODULE,
    .open = ramdisk_open,
    .release = ramdisk_release,
    .getgeo = ramdisk_getgeo,
};

/* Handle the data transfer of one request */
static void ramdisk_transfer(struct request *req)
{
    /* blk_rq_pos returns a sector address; shifting it left by 9 bits
     * converts it to a byte address */
    unsigned long start = blk_rq_pos(req) << 9;
    unsigned long len = blk_rq_cur_bytes(req);          /* Size */

    /* Data buffer of the bio
     * Read: data read from the disk is stored in the buffer
     * Write: the buffer holds the data to be written to the disk */
    void *buffer = bio_data(req->bio);

    if (rq_data_dir(req) == READ)                       /* Read data */
        memcpy(buffer, ramdisk.ramdiskbuf + start, len);
    else if (rq_data_dir(req) == WRITE)                 /* Write data */
        memcpy(ramdisk.ramdiskbuf + start, buffer, len);
}

/* Request processing function */
static void ramdisk_request_fn(struct request_queue *q)
{
    int err = 0;
    struct request *req;

    /* Loop over every request in the request queue */
    req = blk_fetch_request(q);
    while (req != NULL) {
        /* Perform the actual transfer for the request */
        ramdisk_transfer(req);

        /* __blk_end_request_cur returns false once the request is
         * completely finished; only then fetch the next request, so
         * that every request in the queue is processed */
        if (!__blk_end_request_cur(req, err))
            req = blk_fetch_request(q);
    }
}

/* Driver entry function */
static int __init ramdisk_init(void)
{
    int ret = 0;

    printk("ramdisk init\r\n");

    /* 1, Allocate memory for the ramdisk */
    ramdisk.ramdiskbuf = kzalloc(RAMDISK_SIZE, GFP_KERNEL);
    if (ramdisk.ramdiskbuf == NULL) {
        ret = -ENOMEM;
        goto ram_fail;
    }

    /* 2, Initialize the spinlock */
    spin_lock_init(&ramdisk.lock);

    /* 3, Register the block device; the major number is assigned by the system */
    ramdisk.major = register_blkdev(0, RAMDISK_NAME);
    if (ramdisk.major < 0) {
        ret = ramdisk.major;    /* propagate the error code */
        goto register_blkdev_fail;
    }
    printk("ramdisk major = %d\r\n", ramdisk.major);

    /* 4, Allocate the gendisk */
    ramdisk.gendisk = alloc_disk(RAMDISK_MINOR);
    if (!ramdisk.gendisk) {
        ret = -ENOMEM;
        goto gendisk_alloc_fail;
    }

    /* 5, Allocate and initialize the request queue */
    ramdisk.queue = blk_init_queue(ramdisk_request_fn, &ramdisk.lock);
    if (!ramdisk.queue) {
        ret = -ENOMEM;          /* error codes must be negative */
        goto blk_init_fail;
    }

    /* 6, Initialize and add (register) the disk */
    ramdisk.gendisk->major = ramdisk.major;             /* Major device number */
    ramdisk.gendisk->first_minor = 0;                   /* First (starting) minor device number */
    ramdisk.gendisk->fops = &ramdisk_fops;              /* Operation functions */
    ramdisk.gendisk->private_data = &ramdisk;           /* Private data */
    ramdisk.gendisk->queue = ramdisk.queue;             /* Request queue */
    sprintf(ramdisk.gendisk->disk_name, RAMDISK_NAME);  /* Name */
    set_capacity(ramdisk.gendisk, RAMDISK_SIZE / 512);  /* Capacity, in sectors */
    add_disk(ramdisk.gendisk);

    return 0;

blk_init_fail:
    put_disk(ramdisk.gendisk);
gendisk_alloc_fail:
    unregister_blkdev(ramdisk.major, RAMDISK_NAME);
register_blkdev_fail:
    kfree(ramdisk.ramdiskbuf);                          /* Free memory */
ram_fail:
    return ret;
}

/* Driver exit function */
static void __exit ramdisk_exit(void)
{
    printk("ramdisk exit\r\n");

    /* Release the gendisk */
    del_gendisk(ramdisk.gendisk);
    put_disk(ramdisk.gendisk);

    /* Clean up the request queue */
    blk_cleanup_queue(ramdisk.queue);

    /* Unregister the block device */
    unregister_blkdev(ramdisk.major, RAMDISK_NAME);

    /* Free memory */
    kfree(ramdisk.ramdiskbuf);
}

module_init(ramdisk_init);
module_exit(ramdisk_exit);
MODULE_LICENSE("GPL");
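The geometry reported by ramdisk_getgeo is chosen so that heads, cylinders and sectors per track multiply back to the disk capacity. The userspace sketch below repeats that arithmetic for the 2MB ramdisk; the `toy_` names are hypothetical and the 512-byte sector size is assumed.

```c
#include <assert.h>

struct toy_geometry {
    unsigned int heads;
    unsigned int cylinders;
    unsigned int sectors;   /* sectors per track */
};

/* Derive sectors per track from a fixed head and cylinder count. */
static struct toy_geometry toy_getgeo(unsigned long capacity_bytes,
                                      unsigned int heads,
                                      unsigned int cylinders)
{
    struct toy_geometry geo;

    assert(heads > 0 && cylinders > 0);
    geo.heads = heads;
    geo.cylinders = cylinders;
    geo.sectors = (unsigned int)(capacity_bytes / (heads * cylinders * 512ul));
    return geo;
}

/* Capacity implied by a geometry, in bytes. */
static unsigned long toy_geo_bytes(const struct toy_geometry *geo)
{
    return (unsigned long)geo->heads * geo->cylinders * geo->sectors * 512ul;
}
```

With 2 heads and 32 cylinders, a 2MB disk has 64 sectors per track, and 2 x 32 x 64 x 512 bytes gives back exactly 2MB.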
3.2 operation test
Use the make -j32 command to compile the driver and obtain the ramdisk.ko driver module. Copy it to the modules/4.1.15 directory, restart the development board and load the driver module
depmod                  // Needed the first time the driver is loaded
modprobe ramdisk.ko     // Load the driver module
- Check the ramdisk: after the module loads successfully, a device named "ramdisk" is created in the /dev directory
fdisk -l                // View disk information
- Format /dev/ramdisk: the fdisk output reports that /dev/ramdisk has no partition table, because the ramdisk has not been formatted yet. Format it with the mkfs.vfat command
mkfs.vfat /dev/ramdisk
- Mount and access: after formatting, the ramdisk can be mounted and accessed. The mount point can be chosen freely. Use the mount command
mount /dev/ramdisk /tmp // Mount on the /tmp directory
4, Experiment without request queue
The request queue approach uses an I/O scheduler, which suits storage devices such as mechanical hard disks. Storage devices without mechanical parts, such as EMMC, SD cards and the ramdisk, can access any sector directly, so they need neither an I/O scheduler nor queued request processing. The following describes how to write the driver using the "make request" method.
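Why ordering matters for a mechanical disk but not for a ramdisk can be shown with a deliberately simplified model: the cost of serving a list of requests is taken to be the total distance the head travels between sectors. This is only a sketch under that assumption; it is not how any actual kernel I/O scheduler is implemented, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Total head travel, in sectors, when requests are served in array order. */
static unsigned long seek_distance(unsigned long start,
                                   const unsigned long *sectors, size_t n)
{
    unsigned long pos = start, total = 0;

    assert(sectors != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        total += sectors[i] > pos ? sectors[i] - pos : pos - sectors[i];
        pos = sectors[i];
    }
    return total;
}

/* qsort comparison callback: ascending sector number. */
static int cmp_sector(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;

    return (x > y) - (x < y);
}

/* Reorder pending requests by ascending sector, as a one-way sweep would. */
static void sort_by_sector(unsigned long *sectors, size_t n)
{
    qsort(sectors, n, sizeof sectors[0], cmp_sector);
}
```

For the arrival order 900, 10, 800, 20, 700 with the head at sector 0, the head travels 4040 sectors in arrival order but only 900 after sorting. On a device with no seek cost both orders cost the same, so the reordering work brings no benefit.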
Create a new driver file named ramdisk.c and write the following code
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/genhd.h>
#include <linux/blkdev.h>
#include <linux/hdreg.h>

#define RAMDISK_SIZE    (2 * 1024 * 1024)   /* Capacity: 2MB */
#define RAMDISK_NAME    "ramdisk"           /* Name */
#define RAMDISK_MINOR   3                   /* Three disk partitions, NOT minor device number 3 */

/* ramdisk device structure */
struct ramdisk_dev {
    int major;                      /* Major device number */
    unsigned char *ramdiskbuf;      /* Memory that simulates the block device */
    spinlock_t lock;                /* Spinlock */
    struct gendisk *gendisk;        /* gendisk */
    struct request_queue *queue;    /* Request queue */
};

static struct ramdisk_dev ramdisk;  /* ramdisk device */

/* Open the block device */
static int ramdisk_open(struct block_device *dev, fmode_t mode)
{
    printk("ramdisk open\r\n");
    return 0;
}

/* Release the block device */
static void ramdisk_release(struct gendisk *disk, fmode_t mode)
{
    printk("ramdisk release\r\n");
}

/* Get disk information */
static int ramdisk_getgeo(struct block_device *dev, struct hd_geometry *geo)
{
    /* These are concepts that belong to mechanical hard disks */
    geo->heads = 2;                                     /* Heads */
    geo->cylinders = 32;                                /* Cylinders */
    geo->sectors = RAMDISK_SIZE / (2 * 32 * 512);       /* Sectors per track */
    return 0;
}

/* Block device operation functions */
static struct block_device_operations ramdisk_fops = {
    .owner = THIS_MODULE,
    .open = ramdisk_open,
    .release = ramdisk_release,
    .getgeo = ramdisk_getgeo,
};

/* "Make request" function */
static void ramdisk_make_request_fn(struct request_queue *q, struct bio *bio)
{
    int offset;
    struct bio_vec bvec;
    struct bvec_iter iter;
    unsigned long len = 0;

    /* Byte offset on the device at which the transfer starts */
    offset = (bio->bi_iter.bi_sector) << 9;

    /* Process every segment of the bio */
    bio_for_each_segment(bvec, bio, iter) {
        char *ptr = page_address(bvec.bv_page) + bvec.bv_offset;

        len = bvec.bv_len;
        if (bio_data_dir(bio) == READ)                  /* Read data */
            memcpy(ptr, ramdisk.ramdiskbuf + offset, len);
        else if (bio_data_dir(bio) == WRITE)            /* Write data */
            memcpy(ramdisk.ramdiskbuf + offset, ptr, len);
        offset += len;
    }

    set_bit(BIO_UPTODATE, &bio->bi_flags);
    bio_endio(bio, 0);
}

/* Driver entry function */
static int __init ramdisk_init(void)
{
    int ret = 0;

    printk("ramdisk init\r\n");

    /* 1, Allocate memory for the ramdisk */
    ramdisk.ramdiskbuf = kzalloc(RAMDISK_SIZE, GFP_KERNEL);
    if (ramdisk.ramdiskbuf == NULL) {
        ret = -ENOMEM;
        goto ram_fail;
    }

    /* 2, Initialize the spinlock */
    spin_lock_init(&ramdisk.lock);

    /* 3, Register the block device; the major number is assigned by the system */
    ramdisk.major = register_blkdev(0, RAMDISK_NAME);
    if (ramdisk.major < 0) {
        ret = ramdisk.major;    /* propagate the error code */
        goto register_blkdev_fail;
    }
    printk("ramdisk major = %d\r\n", ramdisk.major);

    /* 4, Allocate the gendisk */
    ramdisk.gendisk = alloc_disk(RAMDISK_MINOR);
    if (!ramdisk.gendisk) {
        ret = -ENOMEM;
        goto gendisk_alloc_fail;
    }

    /* 5, Allocate the request queue */
    ramdisk.queue = blk_alloc_queue(GFP_KERNEL);
    if (!ramdisk.queue) {
        ret = -ENOMEM;
        goto blk_allo_fail;
    }

    /* 6, Bind the "make request" function */
    blk_queue_make_request(ramdisk.queue, ramdisk_make_request_fn);

    /* 7, Initialize and add (register) the disk */
    ramdisk.gendisk->major = ramdisk.major;             /* Major device number */
    ramdisk.gendisk->first_minor = 0;                   /* First (starting) minor device number */
    ramdisk.gendisk->fops = &ramdisk_fops;              /* Operation functions */
    ramdisk.gendisk->private_data = &ramdisk;           /* Private data */
    ramdisk.gendisk->queue = ramdisk.queue;             /* Request queue */
    sprintf(ramdisk.gendisk->disk_name, RAMDISK_NAME);  /* Name */
    set_capacity(ramdisk.gendisk, RAMDISK_SIZE / 512);  /* Capacity, in sectors */
    add_disk(ramdisk.gendisk);

    return 0;

blk_allo_fail:
    put_disk(ramdisk.gendisk);
gendisk_alloc_fail:
    unregister_blkdev(ramdisk.major, RAMDISK_NAME);
register_blkdev_fail:
    kfree(ramdisk.ramdiskbuf);                          /* Free memory */
ram_fail:
    return ret;
}

/* Driver exit function */
static void __exit ramdisk_exit(void)
{
    printk("ramdisk exit\r\n");

    /* Release the gendisk */
    del_gendisk(ramdisk.gendisk);
    put_disk(ramdisk.gendisk);

    /* Clean up the request queue */
    blk_cleanup_queue(ramdisk.queue);

    /* Unregister the block device */
    unregister_blkdev(ramdisk.major, RAMDISK_NAME);

    /* Free memory */
    kfree(ramdisk.ramdiskbuf);
}

module_init(ramdisk_init);
module_exit(ramdisk_exit);
MODULE_LICENSE("GPL");
The test procedure is exactly the same as in section 3.2.