Linux System Diagnosis Tips (15): How to Fix File System Damage That Causes Startup and Shutdown Problems

Digression

Let's start with what file systems are and what tools are needed.

File system

Storage today comes in many forms and sits on complex stacks, but the interface users work with directly is still the file system. A lot of software in the Linux and open source community also depends on the file system. MySQL and PostgreSQL databases, for example, call the file system interface directly, unlike Oracle databases, which can bypass the file system layer.

Linux supports many file systems, but the most commonly used are the Ext family and XFS. Ext4 is widely used today.

Here we take the Ext4 filesystem as an example.

Three tools

We won't dwell on the importance of tools. To solve startup and shutdown problems, three helpers are essential: snapshots, VNC, and a screen recording tool.

For these tools, please refer to [Linux system diagnostic tips (13): How to fix grub damage for startup and shutdown problems](https://yq.aliyun.com/articles/642902).

Bind mount

When repairing startup and shutdown problems, we often need chroot, that is, we need to work inside a new root file system. However, chroot alone is not enough. We also need to bind mount the /dev, /proc and /sys directories into the new root, so that system data remains available after chroot.

For bind mounts, see also the relevant chapters of [Linux system diagnostic tips (13): How to fix grub damage for startup and shutdown problems](https://yq.aliyun.com/articles/642902).
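
As a minimal sketch of the bind mount preparation described above, the helper below only prints the commands rather than running them, so it can be reviewed first. The function name `chroot_prep_commands`, the mount point `/mnt/sysroot` and the use of `/bin/bash` inside the new root are assumptions for illustration, not part of the original procedure.

```shell
#!/bin/sh
# Print (do not execute) the commands needed before and for a chroot.
# $1 is the directory where the damaged root file system is mounted
# (hypothetical default: /mnt/sysroot).
chroot_prep_commands() {
    newroot=${1:-/mnt/sysroot}
    # Bind mount the three pseudo file systems into the new root.
    for dir in dev proc sys; do
        printf 'mount --bind /%s %s/%s\n' "$dir" "$newroot" "$dir"
    done
    # Finally switch into the new root (assumes /bin/bash exists there).
    printf 'chroot %s /bin/bash\n' "$newroot"
}

chroot_prep_commands /mnt/sysroot
```

The printed commands are meant to be run by hand as root once the damaged root file system is mounted at the chosen directory.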

Back to topic

File system damage is a common cause of system startup failure. Common causes of file system damage are a lost partition and manual repair attempts on the file system.

Reproducing the problem

We deliberately destroy the partition table information.

Preparing the test environment

Let's first prepare a file system for testing. The specific steps are as follows:

#
# We take the second disk as an example and assume its device path is /dev/vdb
#
# 1. Check the new disk
fdisk -l -u /dev/vdb
# 2. Create a new partition (interactive: n to create, w to write)
fdisk /dev/vdb
# 3. Create a file system
mkfs.ext4 /dev/vdb1
# 4. Configure the new file system to be mounted at boot
echo '/dev/vdb1 /opt ext4    defaults 0       2' >> /etc/fstab
# 5. Reboot to verify that the configuration works as expected
reboot
# 6. Verify that the mount succeeded
mount -l | egrep /dev/vdb1

[Screenshot: preparing the test file system]

[Screenshot: verifying the mount status]

So the mount was successful.

Damage test: fstab configuration error

We test it as follows:

# Change defaults -> default on the /dev/vdb1 line.
# You can edit the file by hand,
# or use ed:
ed -s /etc/fstab <<EOF
/\/dev\/vdb1/s/defaults/default/
w
EOF
# Alternatively, sed does the same:
sed -i -e '/\/dev\/vdb1/s/defaults/default/' /etc/fstab

[Screenshot: the startup failure]

Damage test: partition table missing

We first restore the file system mount to normal, and then manually destroy the partition table for testing.

# 1. Unmount the file system
umount /dev/vdb1
# 2. Check the partition
fdisk -l -u /dev/vdb
# 3. Delete the partition with fdisk (interactive: d to delete, w to write)
fdisk -u /dev/vdb
# 4. Verify that the partition is deleted
fdisk -l -u /dev/vdb

[Screenshot: example of the operations]

Now look at the startup failure: the boot process stalls, but there is no obvious error message, which is why a screen recording tool is necessary.

How to fix the problem?

Because the instance fails to start, we have no shell on it to work with. So the first question to settle is where to carry out the repair.

There are many ways to do this; operations staff commonly use a LiveCD. Here we recommend snapshots.

Configuration error

This case can be solved by correcting the faulty configuration, so we do not go into it further.
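
As a minimal sketch of that correction, assuming the broken entry is the `default` typo introduced in the damage test, the helper below prints a corrected copy of an fstab file without touching the original. The function name `fix_fstab_defaults` and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# Print a corrected copy of an fstab file to stdout: on the line that
# starts with the given device, turn the mistyped option "default"
# back into "defaults". The input file itself is not modified.
#   $1 = path to the fstab file, $2 = device (for example /dev/vdb1)
fix_fstab_defaults() {
    fstab=$1
    device=$2
    # \|...| selects lines starting with the device; the substitution only
    # matches "default" as a whole whitespace-delimited field.
    sed -e "\\|^${device}[[:space:]]|s/\\([[:space:]]\\)default\\([[:space:]]\\)/\\1defaults\\2/" "$fstab"
}

# Usage (paths are illustrative):
#   fix_fstab_defaults /mnt/sysroot/etc/fstab /dev/vdb1 > /tmp/fstab.fixed
```

Writing to a separate file first makes it easy to compare the result with the original before copying it into place.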

Partition table loss problem

The basis of the repair approach

If the partition table is lost, it only needs to be recreated. Partition table information lives in the first sector of the disk, so changing it touches only that sector. Losing the partition table therefore does not damage the data on the disk, provided the original partition layout can be established. The starting position and size of the recreated partition must be correct, though, otherwise the problem is not solved.
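
Since only the first sector is rewritten, keeping a copy of that sector before recreating the partition table is a cheap extra precaution. This is our suggestion rather than a step from the original procedure; `save_first_sector` is a hypothetical helper and the paths in the usage comment are illustrative.

```shell
#!/bin/sh
# Copy the first 512-byte sector of a disk (or disk image) to a backup file.
# On a disk with a DOS label this sector holds the partition table.
#   $1 = disk device or image, $2 = backup file to write
save_first_sector() {
    disk=$1
    backup=$2
    dd if="$disk" of="$backup" bs=512 count=1 2>/dev/null
}

# Usage (as root; paths are illustrative):
#   save_first_sector /dev/vdb /root/vdb-sector0.bin
```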

As for partition size: since (nearly) everyone makes a whole disk a single partition, the size (almost) never needs separate consideration. Once the starting position is determined, fdisk proposes the size. Therefore, the key question is how to determine the starting position of the partition.

We take the common Ext file systems as an example; see "How to infer partition location from Ext3 or Ext4 file systems".

Rebuilding the partition

The partition can be rebuilt with the following steps.

Determine the starting position of the partition

First, we need to confirm which file system is on the partition. There are several ways:

  1. View the /etc/fstab and /etc/mtab configuration files.
  2. If the file system is still mounted, look at /proc/mounts.
  3. Check the file system magic number directly.
  4. Trace it from the system log.

Let's look at an example of checking the magic number of the file system directly and verify it with the fdisk tool.

#
# 1. The magic number of Ext* file systems is 53 ef
#
[root@iz2ze122w6gewwurz1e637z ~]# dd if=/dev/vdb bs=512 count=4096 2>/dev/null | \
> od -tx1 | perl -ne '
>     chomp;
>     if (/^([0-7]+)\s # Offset of the data on disk (octal)
>        ([0-9a-f][0-9a-f]\s){8} # Skip 8 irrelevant bytes
>        53\sef\s # Magic number
>        0[124]\s00\s0[123]\s00\s # File system state and on-error behavior
>        /x) {
>              my $s=int((oct $1)/512)-2;
>              print qq[$s $_\n];
>             }'
56 0072060 72 46 cd 5b 00 00 ff ff 53 ef 01 00 01 00 00 00
#
# 2. The magic number confirms an Ext* file system. The partition start deduced from the magic number is sector 56.
#
[root@iz2ze122w6gewwurz1e637z ~]# fdisk -l -u /dev/vdb

Disk /dev/vdb: 1099.5 GB, 1099511627776 bytes, 2147483648 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00088bd9

   Device Boot      Start         End      Blocks   Id  System
/dev/vdb1              56  2147483647  1073741796   83  Linux
[root@iz2ze122w6gewwurz1e637z ~]#
#
# 3. The start sector reported by fdisk is 56. This is consistent with what we inferred from the magic number.
#
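
The arithmetic behind the start sector can be spelled out. In an Ext2/3/4 file system the primary superblock begins 1024 bytes into the partition and the magic number sits 56 bytes into the superblock; in the od output above, the matching line starts 8 bytes before the magic. The sketch below redoes the calculation exactly instead of rounding; the function name is ours, and it assumes 512-byte sectors and an od line laid out as in the example.

```shell
#!/bin/sh
# Convert the octal line offset printed by od into the start sector
# of the partition that holds the superblock.
#   $1 = octal offset from the od line, for example 0072060
start_sector_from_od_offset() {
    # A leading 0 makes printf interpret the value as octal.
    line_offset=$(printf '%d' "0$1")
    # The magic bytes 53 ef are bytes 8 and 9 of the matching od line.
    magic_offset=$((line_offset + 8))
    # Superblock starts 1024 bytes into the partition, magic is 56 bytes in.
    partition_offset=$((magic_offset - 1024 - 56))
    if [ $((partition_offset % 512)) -ne 0 ]; then
        echo "offset $1 does not map to a sector boundary" >&2
        return 1
    fi
    echo $((partition_offset / 512))
}

start_sector_from_od_offset 0072060   # prints 56
```

Here 0072060 octal is 29744 decimal, so the magic is at byte 29752 and the partition starts at byte 28672, which is sector 56. The field offset 56 and the resulting sector 56 coincide only by chance in this example.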

Repair

Example: reproducing the problem

Suppose we simply partition /dev/vdb again with the fdisk tool. If the partition is configured in /etc/fstab, the reboot fails; see above for the failure symptoms. If we mount it directly without rebooting, we hit the following error:

#
# 1. Direct mounting
#
[root@iz2ze122w6gewwurz1e637z ~]# mount /dev/vdb1 /opt
mount: /dev/vdb1 is write-protected, mounting read-only
mount: unknown filesystem type '(null)'
#
# 2. Mount failed
# 
[root@iz2ze122w6gewwurz1e637z ~]# mount -l | egrep /dev/vdb
#
# 3. Why did the mount fail? Because fdisk set the partition starting position to 2048
#
[root@iz2ze122w6gewwurz1e637z ~]# fdisk -u -l /dev/vdb

Disk /dev/vdb: 1099.5 GB, 1099511627776 bytes, 2147483648 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x00088bd9

   Device Boot      Start         End      Blocks   Id  System
/dev/vdb1            2048  2147483647  1073740800   83  Linux
[root@iz2ze122w6gewwurz1e637z ~]# 

Rebuilding the partition

Because fdisk sets the starting sector to 2048 here and would not let us adjust it to 56, we use the parted tool instead. The specific repair process is as follows.

[root@iz2ze122w6gewwurz1e637z ~]# parted /dev/vdb
GNU Parted 3.1
Using /dev/vdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
#
# 1. Use sectors as default units
#
(parted) unit s         
#
# 2. Check the current partition
#
(parted) print                                                            
Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 2147483648s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start  End          Size         Type     File system  Flags
 1      2048s  2147483647s  2147481600s  primary

#
# 3. Delete the problem partition
#
(parted) rm 1
#
# 4. Rebuild the partition
#
(parted) mkpart                                                           
Partition type?  primary/extended? primary                                
File system type?  [ext2]? ext4                                           
Start? 56       # Start sector set to 56
End? 2147483647 # Sector numbers start at 0, so the last sector is the total number of sectors minus 1.
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? Ignore # This is the partition we want, so we ignore the warning.
(parted) q                                                                
Information: You may need to update /etc/fstab.

#
# 5. The partition was rebuilt successfully
[root@iz2ze122w6gewwurz1e637z ~]# tune2fs -l /dev/vdb1  
tune2fs 1.42.9 (28-Dec-2013)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          0d51624e-5f97-46bb-8423-63a6e5e72d1c
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash 
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              67108864
Block count:              268435449
Reserved block count:     13421772
Free blocks:              264170080
Free inodes:              67108853
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      1024
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Mon Oct 22 11:39:29 2018
Last mount time:          Mon Oct 22 12:12:30 2018
Last write time:          Mon Oct 22 12:12:30 2018
Mount count:              1
Maximum mount count:      -1
Last checked:             Mon Oct 22 11:39:29 2018
Check interval:           0 (<none>)
Lifetime writes:          134 MB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:              256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      0f70f076-e2f1-4f0e-bc37-e950a58e0d0c
Journal backup:           inode blocks
[root@iz2ze122w6gewwurz1e637z ~]# mount /dev/vdb1 /opt
[root@iz2ze122w6gewwurz1e637z ~]# 
[root@iz2ze122w6gewwurz1e637z ~]# mount -l | egrep /dev/vdb1
/dev/vdb1 on /opt type ext4 (rw,relatime,data=ordered)
[root@iz2ze122w6gewwurz1e637z ~]#
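
As an extra sanity check, the size of the rebuilt partition can be compared with the size the file system itself reports. The sketch below uses the numbers from the output above (start 56, end 2147483647, block count 268435449, block size 4096); the two helper names are ours and 512-byte sectors are assumed.

```shell
#!/bin/sh
# Number of sectors covered by a partition: $1 = start sector, $2 = end sector.
partition_sectors() {
    echo $(( $2 - $1 + 1 ))
}

# Number of 512-byte sectors the file system occupies:
#   $1 = "Block count", $2 = "Block size" from tune2fs -l.
fs_sectors() {
    echo $(( $1 * $2 / 512 ))
}

echo "partition (start 56):   $(partition_sectors 56 2147483647)"    # 2147483592
echo "file system:            $(fs_sectors 268435449 4096)"          # 2147483592
echo "partition (start 2048): $(partition_sectors 2048 2147483647)"  # 2147481600
```

With the start at sector 56 the partition is exactly as large as the file system, while the partition starting at 2048 is smaller than the file system it is supposed to contain.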

The end

Wish you a pleasant exploration.

Keywords: Operation & Maintenance Linux MySQL PostgreSQL Oracle

Added by mortona on Sun, 19 May 2019 08:56:28 +0300