A collection of Ubuntu deep learning environment configurations
I moved into deep learning from another field. Because I lacked systematic computer knowledge when I started, I often ran into all kinds of problems when configuring environments. The Internet usually offers many solutions to the same problem; some work and some do not. At first, I could only find the right one through repeated trial and error. After falling into more and more pitfalls, I gradually came to understand the causes behind some common problems. I am recording them here so that I do not forget them and can look them up easily. I also hope to help others in the same situation avoid wasting time on sifting through information, collecting solutions, and trial and error.
1. Install Python 3 and pip3 on Ubuntu
1.1 Install prerequisites
sudo apt-get install software-properties-common
1.2 Add the PPA and update
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
1.3 Install a specific version of Python and pip3
sudo apt install -y python3.8
sudo apt install -y python3-pip
1.4 Configure symbolic links
sudo ln -s /usr/bin/python3.8 /usr/bin/python
sudo ln -s /usr/bin/pip3 /usr/bin/pip
If the symbolic link already exists, add the -f option (ln -sf) to overwrite it.
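As a minimal sketch of the difference, the following runs in a throwaway directory rather than /usr/bin. The file names python3.8, python3.9, and python are placeholders, and demo_dir is a hypothetical variable introduced only for this illustration.

```shell
# Work in a temporary directory so nothing under /usr/bin is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/python3.8" "$demo_dir/python3.9"

# First link: succeeds because "python" does not exist yet.
ln -s "$demo_dir/python3.8" "$demo_dir/python"

# A second plain `ln -s` fails because the link name is already taken.
if ! ln -s "$demo_dir/python3.9" "$demo_dir/python" 2>/dev/null; then
    echo "plain ln -s refused to overwrite the existing link"
fi

# With -f the old link is removed and replaced.
ln -sf "$demo_dir/python3.9" "$demo_dir/python"
readlink "$demo_dir/python"
```

The last command prints the path the link now points to, which should be the python3.9 placeholder.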
1.5 Clean the apt cache
apt-get autoclean
apt autoclean
rm -rf /var/lib/apt/lists/*
rm -rf ~/.cache/pip
rm -rf ~/.cache/pip3
2. Package a container as an image and push it to DockerHub
2.1 Log in to DockerHub
docker login
2.2 Commit the container
docker commit [ID] [Name]
2.3 Tag and push
docker tag [ID] username/repository
docker push username/repository
2.4 Save a Docker image
docker save -o docker_images_name.tar REPOSITORY:TAG
2.5 Load a Docker image
docker load -i docker_images_name.tar
3. Add environment variables after CUDA installation
# Add the two export lines to ~/.bashrc, then reload it with source
export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
source ~/.bashrc
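The ${VAR:+:${VAR}} form in the export lines expands to a colon followed by the old value only when the variable is set and non-empty, which avoids leaving a stray trailing colon. A small sketch of that behavior follows; prepend_dir is a hypothetical helper that prints the result instead of modifying the real PATH.

```shell
# prepend_dir DIR OLD_VALUE
# Prints DIR followed by ":OLD_VALUE" only when OLD_VALUE is non-empty,
# mirroring the ${VAR:+:${VAR}} expansion used in the export lines.
prepend_dir() {
    old_value=$2
    printf '%s\n' "$1${old_value:+:${old_value}}"
}

# Empty old value: no trailing colon is produced.
prepend_dir /usr/local/cuda-11.4/lib64 ""

# Non-empty old value: the new directory is placed in front.
prepend_dir /usr/local/cuda-11.4/bin /usr/bin:/bin
```

The first call prints only the CUDA directory; the second prints the CUDA directory, a colon, and then the previous value.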
4. Unzip a zip file on Ubuntu
unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
5. Correspondence between NVIDIA driver version and CUDA version
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
6. Install PyTorch
6.1 v1.8.0+conda
# CUDA 10.2
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
# CUDA 11.1
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
# CPU Only
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cpuonly -c pytorch
6.2 v1.8.0+pip
# CUDA 11.1
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
# CUDA 10.2
pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
# CPU only
pip install torch==1.8.0+cpu torchvision==0.9.0+cpu torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
6.3 v1.7.1+conda
# CUDA 9.2
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=9.2 -c pytorch
# CUDA 10.1
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
# CUDA 10.2
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch
# CUDA 11.0
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
# CPU Only
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cpuonly -c pytorch
6.4 v1.7.1+pip
# CUDA 11.0
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# CUDA 10.2
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
# CUDA 10.1
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# CUDA 9.2
pip install torch==1.7.1+cu92 torchvision==0.8.2+cu92 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
# CPU only
pip install torch==1.7.1+cpu torchvision==0.8.2+cpu torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
7. Build an image from a Dockerfile
docker build [OPTIONS] PATH | URL | -
Common OPTIONS:
- -t: the name and tag of the image being built, usually written as name:tag
- --rm: delete the intermediate containers after a successful build
- --force-rm=true: delete the intermediate containers whether or not the build succeeds
- --no-cache: do not use the cache when building the image
- -f: the path of the Dockerfile
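To see how the options above combine on one line, here is a small sketch that only prints a build command rather than running it. print_build_cmd is a hypothetical helper, and myimage:latest, ./Dockerfile, and the build context "." are placeholder values.

```shell
# print_build_cmd IMAGE_TAG DOCKERFILE CONTEXT
# Echoes a docker build invocation instead of running it, so the flags can
# be reviewed first. All three arguments are placeholders.
print_build_cmd() {
    echo "docker build --no-cache --force-rm -t $1 -f $2 $3"
}

print_build_cmd myimage:latest ./Dockerfile .
```

The printed line can be pasted into a terminal on a machine where Docker is installed.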
8. Install the SSH server and client
apt-get install openssh-server
apt-get install openssh-client
Start the SSH service and check that it is running
# Start SSH service
/etc/init.d/ssh start
# Check that it started
ps -e | grep ssh
Generate a key pair
ssh-keygen
9. Use MobaXterm to access a remote TensorBoard locally
9.1 Configure MobaSSHTunnel
Forwarded port: the local port
SSH Server: the server address and its SSH port
Remote Server: localhost
Remote port: 6006
9.2 Activate the corresponding conda virtual environment
conda activate tensorboard_NN
# Start tensorboard
tensorboard --logdir=/path/to/log_dir --host=localhost
9.3 Connect to the remote TensorBoard
After activating the SSH tunnel, open the following address in the browser:
127.0.0.1:<local port>
or
<server ip>:6006
The second form bypasses the tunnel, so it only works if TensorBoard listens on an external interface (for example --host=0.0.0.0 instead of --host=localhost).
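For readers without MobaXterm, the same kind of tunnel can usually be opened with the OpenSSH client's -L option. The sketch below only prints the command; print_tunnel_cmd is a hypothetical helper, and the local port 16006 and user@server are placeholders.

```shell
# print_tunnel_cmd LOCAL_PORT REMOTE_PORT USER_AT_HOST
# Prints an ssh command that forwards LOCAL_PORT on this machine to
# localhost:REMOTE_PORT as seen from the server. -N opens the tunnel
# without starting a remote shell.
print_tunnel_cmd() {
    echo "ssh -N -L $1:localhost:$2 $3"
}

print_tunnel_cmd 16006 6006 user@server
```

With such a tunnel running, TensorBoard would be reachable at 127.0.0.1:16006 in the local browser.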
10. Docker usage
10.1 List images
docker images
docker image ls
10.2 Delete an image
docker rmi repo:tag
11. Install nvidia-docker
11.1 Add the package repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
11.2 Update the apt package index
sudo apt-get update
Note: if https://nvidia.github.io cannot be reached at this point even though the server has Internet access, you may need to configure a proxy for apt. A temporary proxy can be passed to the update and install commands:
sudo apt-get -o Acquire::http::proxy="http://127.0.0.1:8889/" update
Note: 8889 is the forwarding port of the local proxy
11.3 Install nvidia-docker
# Without a temporary proxy
sudo apt-get install -y nvidia-docker2
# With a temporary proxy
sudo apt-get install -o Acquire::http::proxy="http://127.0.0.1:8889/" nvidia-docker2
11.4 Restart the Docker service
sudo systemctl restart docker
12. apt-get proxy settings
12.1 Set via an environment variable
export http_proxy=http://127.0.0.1:8000
# -E keeps the exported variable, which sudo may otherwise drop
sudo -E apt-get update
12.2 Set via a configuration file (for example /etc/apt/apt.conf)
Acquire::http::proxy "http://127.0.0.1:8000/";
Acquire::ftp::proxy "ftp://127.0.0.1:8000/";
Acquire::https::proxy "https://127.0.0.1:8000/";
12.3 Set temporarily on the command line
sudo apt-get -o Acquire::http::proxy="http://127.0.0.1:8000/" update
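For the configuration-file approach, a drop-in file under /etc/apt/apt.conf.d/ is a common alternative to editing apt.conf directly. The sketch below generates such a file in a temporary location; write_apt_proxy is a hypothetical helper, and it assumes that one HTTP proxy endpoint serves both http and https traffic.

```shell
# write_apt_proxy PROXY_URL OUTPUT_FILE
# Writes Acquire::http and Acquire::https proxy lines to OUTPUT_FILE.
# Assumption: the same proxy endpoint handles both schemes.
write_apt_proxy() {
    printf 'Acquire::http::proxy "%s";\nAcquire::https::proxy "%s";\n' "$1" "$1" > "$2"
}

# Generate the file in a temporary location and show its contents.
conf_file=$(mktemp)
write_apt_proxy "http://127.0.0.1:8000/" "$conf_file"
cat "$conf_file"
```

After checking the output, the file could be copied with sudo into /etc/apt/apt.conf.d/ under a name of your choice.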
13. Embed a Python script in a shell script
#!/bin/bash
# The "-" makes python3 read the script from standard input; "$@" forwards the shell script's arguments to the Python script
# Indent the Python code with spaces, not tabs, otherwise indentation errors will occur (<<- strips leading tabs)
# The "-" in <<- is optional, but it is best to keep it. With it, the end identifier may be preceded by tabs. Without it, the end identifier must be at the very beginning of a line
# END is the end identifier and can be customized, for example EOD or EOF
python3 - "$@" <<-END
import torch
print('Hello, world!')
END
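To illustrate how the forwarded arguments arrive on the Python side, here is a small sketch that assumes python3 is on the PATH. run_embedded is a hypothetical function name, and the delimiter is quoted ('END') so that the shell leaves the Python code untouched.

```shell
# run_embedded ARGS...
# Feeds a Python script to python3 on stdin ("-") and forwards ARGS to it.
# The quoted delimiter 'END' stops the shell from expanding $ or backticks
# inside the Python code.
run_embedded() {
    python3 - "$@" <<'END'
import sys

# sys.argv[0] is "-"; the forwarded arguments start at index 1.
for arg in sys.argv[1:]:
    print("arg:", arg)
END
}

run_embedded alpha "two words"
```

Each forwarded argument is printed on its own line, and the quoted "two words" stays a single argument.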
14. Turn the Ubuntu graphical interface on and off
# Turn off the graphical interface
sudo systemctl set-default multi-user.target
# Turn on the graphical interface
sudo systemctl set-default graphical.target
# Reboot for the change to take effect
sudo shutdown -r now
15. Quickly install the NVIDIA graphics driver on Ubuntu
# List the detected devices and the drivers available for them
ubuntu-drivers devices
# Install one of the driver versions listed by the command above
sudo apt install nvidia-driver-xxx-server
16. Transfer data between Ubuntu machines
# Copy a remote file or directory to the local machine with scp
scp -r user@ip:/path/to/file /path/to/file
17. Passwordless SSH login with a public/private key pair
17.1 Enable key login on the server
sudo vim /etc/ssh/sshd_config
# ----------------------------------------
# Allow root remote login
PermitRootLogin yes
# Whether password login is enabled
PasswordAuthentication yes
# Turn on public key authentication
# (RSAAuthentication may be absent or deprecated in newer OpenSSH versions)
RSAAuthentication yes
PubkeyAuthentication yes
# File where the login user's public keys are stored
# The location is .ssh under the login user's home directory
# For root it is /root/.ssh
# For foo it is /home/foo/.ssh
AuthorizedKeysFile .ssh/authorized_keys
# ----------------------------------------
service sshd restart
17.2 Generate the key pair on the client
ssh-keygen
This command creates the key pair in the .ssh folder under the user's home directory: id_rsa (private key) and id_rsa.pub (public key)
17.3 Upload the public key to the server
ssh-copy-id -i ~/.ssh/id_rsa.pub user@ip
# Or upload it using scp
17.4 Write the public key into the authorized_keys file (needed only if the key was uploaded with scp)
cat id_rsa.pub >> ~/.ssh/authorized_keys
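When the key is appended by hand, sshd's StrictModes check (enabled by default) can reject keys if the .ssh directory or authorized_keys file is too permissive. The following is a minimal sketch of an append that avoids duplicates and sets restrictive permissions. install_pubkey is a hypothetical helper, and the demo uses a temporary directory with a fake key instead of the real ~/.ssh.

```shell
# install_pubkey PUBKEY_FILE SSH_DIR
# Appends the key to SSH_DIR/authorized_keys unless the exact line is already
# present, then restricts permissions on the directory and the file.
install_pubkey() {
    mkdir -p "$2"
    chmod 700 "$2"
    touch "$2/authorized_keys"
    # -q: quiet, -x: whole-line match, -F: fixed string, -f: patterns from file
    if ! grep -qxFf "$1" "$2/authorized_keys"; then
        cat "$1" >> "$2/authorized_keys"
    fi
    chmod 600 "$2/authorized_keys"
}

# Demo in a temporary directory with a fake key (not a real one).
demo=$(mktemp -d)
echo "ssh-rsa AAAAexample user@client" > "$demo/id_rsa.pub"
install_pubkey "$demo/id_rsa.pub" "$demo/.ssh"
install_pubkey "$demo/id_rsa.pub" "$demo/.ssh"   # second call adds nothing
grep -c . "$demo/.ssh/authorized_keys"
```

On the server, the real call would pass the uploaded id_rsa.pub and the login user's ~/.ssh directory.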
18. Eject an external hard disk on Ubuntu
# First unmount all of the disk's partitions with umount, then power it off
sudo udisksctl power-off -b /dev/sdb
19. Access JupyterLab on a remote server locally
19.1 Create a conda environment and install JupyterLab
conda create -n jupyter python=3.8
conda activate jupyter
pip install jupyter jupyterlab
19.2 Generate the Jupyter notebook login password
ipython # If it is not installed, install it first
# Enter the following in the interactive session
from notebook.auth import passwd
passwd()
# Remember the password you type in, and keep the hashed string that is printed
19.3 Generate the Jupyter configuration file
jupyter notebook --generate-config
# Edit the configuration file
vim ~/.jupyter/jupyter_notebook_config.py
# Main modifications
c.NotebookApp.allow_remote_access = True
c.NotebookApp.open_browser = False
c.NotebookApp.ip = '*'
c.NotebookApp.port = 8888
c.NotebookApp.password = 'sha1:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
- Allow remote access
- Do not open a local browser on startup
- Listen on all IP addresses
- Port to listen on
- Password (the hashed string output in the previous step)
19.4 Run Jupyter on the server
nohup jupyter lab &
19.5 Use it locally
Enter <server ip>:<port> in the browser to access it
Summary
Previously, I wanted to use one article as a table of contents that links to my other articles. I did not expect that a table of contents could be added directly at the top of a post. I hope this collection helps you; I will update it from time to time.