A collection of configuration notes for the Ubuntu deep learning environment

I came to deep learning from another major. Because I lacked systematic computer knowledge when I started, I kept running into problems while configuring environments. The Internet usually offers many solutions to the same problem, some of which work and some of which do not, so at first I could only find the right one by repeated trial and error. After falling into enough pits, I gradually came to understand the reasons behind some common problems. I am writing them down so that I do not forget them and can look them up easily. I also hope they help others in the same situation, so that they no longer waste time filtering information, collecting solutions, and experimenting.

1. Install Python 3 & pip3 on Ubuntu

1.1 install prerequisites

sudo apt-get install software-properties-common

1.2 add source and update

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update

1.3 install a specific version of Python & pip3

sudo apt install -y python3.8
sudo apt install -y python3-pip

1.4 configure soft links

sudo ln -s /usr/bin/python3.8 /usr/bin/python
sudo ln -s /usr/bin/pip3 /usr/bin/pip

If the soft link already exists, add the -f option to overwrite it.
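The overwrite behaviour can be tried safely before touching /usr/bin. The following is a minimal sketch that works in a throwaway directory. The directory and the empty python3.8 / python3.9 files are placeholders created only for the demonstration.

```shell
# Work in a throwaway directory so nothing under /usr/bin is touched
demo_dir=$(mktemp -d)
# Empty placeholder files standing in for two interpreters
touch "$demo_dir/python3.8" "$demo_dir/python3.9"

# First link: python -> python3.8
ln -s "$demo_dir/python3.8" "$demo_dir/python"

# A second plain "ln -s" to the same name would fail because the link exists;
# -f removes the existing link and creates the new one
ln -sf "$demo_dir/python3.9" "$demo_dir/python"

# Show where the link points now
link_target=$(readlink "$demo_dir/python")
echo "$link_target"
```

After the second command the link points at the python3.9 placeholder, which is the same effect -f would have on /usr/bin/python.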

1.5 clean the apt and pip caches

apt-get autoclean 
apt autoclean 
rm -rf /var/lib/apt/lists/*

rm -rf ~/.cache/pip
rm -rf ~/.cache/pip3

2. Package a container as an image and push it to DockerHub

2.1 log in to DockerHub

docker login

2.2 commit the container to an image

docker commit [ID] [Name]

2.3 tag and push

docker tag [ID] username/repository
docker push username/repository

2.4 save a Docker image

docker save -o docker_images_name.tar REPOSITORY:TAG

2.5 load a Docker image

docker load -i docker_images_name.tar

3. Add environment variables after CUDA installation

# Add the following two lines to ~/.bashrc
export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Then reload the configuration
source ~/.bashrc
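The `${VAR:+:${VAR}}` part of these lines appends ":" plus the old value only when the variable is already set and non-empty, so no stray colon is left behind when it is empty. A small sketch of that behaviour follows. DEMO_PATH and /opt/tools/bin are made-up names used only for illustration.

```shell
# DEMO_PATH is a hypothetical variable standing in for PATH or LD_LIBRARY_PATH
unset DEMO_PATH

# Variable is unset: the ${...:+...} part expands to nothing, so no trailing ":"
DEMO_PATH=/usr/local/cuda-11.4/bin${DEMO_PATH:+:${DEMO_PATH}}
first="$DEMO_PATH"

# Variable is now set: ":" and the old value are appended after the new entry
DEMO_PATH=/opt/tools/bin${DEMO_PATH:+:${DEMO_PATH}}
second="$DEMO_PATH"

echo "$first"
echo "$second"
```

The first echo prints only the CUDA directory. The second prints the new directory, a colon, and then the previous value.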

4. Unzip a zip file on Ubuntu

unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]

5. Correspondence between NVIDIA driver version and CUDA version

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

6. Install PyTorch

6.1 v1.8.0+conda

# CUDA 10.2
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch

# CUDA 11.1
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

# CPU Only
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cpuonly -c pytorch

6.2 v1.8.0+pip

# CUDA 11.1
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html

# CUDA 10.2
pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0

# CPU only
pip install torch==1.8.0+cpu torchvision==0.9.0+cpu torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html

6.3 v1.7.1+conda

# CUDA 9.2
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=9.2 -c pytorch

# CUDA 10.1
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch

# CUDA 10.2
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch

# CUDA 11.0
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch

# CPU Only
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cpuonly -c pytorch

6.4 v1.7.1+pip

# CUDA 11.0
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

# CUDA 10.2
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2

# CUDA 10.1
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

# CUDA 9.2
pip install torch==1.7.1+cu92 torchvision==0.8.2+cu92 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

# CPU only
pip install torch==1.7.1+cpu torchvision==0.8.2+cpu torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

7. Build an image from a Dockerfile

docker build [OPTIONS] PATH | URL | -

Common options:

  • -t: Name and tag of the image, usually written as name:tag
  • --rm: Delete the intermediate containers after a successful build.
  • --force-rm=true: Delete the intermediate containers whether or not the build succeeds.
  • --no-cache: Do not use the cache when building the image.
  • -f: Specify the path of the Dockerfile

8. Install ssh server and client

apt-get install openssh-server
apt-get install openssh-client

Start the SSH service and check that it is running

# Start SSH service
/etc/init.d/ssh start
# Check start-up
ps -e | grep ssh

Generate key

ssh-keygen

9. Use MobaXterm to access a remote TensorBoard locally

9.1 configure MobaSSHTunnel

Forwarded port: the local port

SSH server: the server address and its SSH port

Remote server: localhost

Remote port: 6006

9.2 activate corresponding conda virtual environment

conda activate tensorboard_NN
# Start tensorboard
tensorboard --logdir=/path/to/log_dir --host=localhost

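9.3 connect to the remote TensorBoard

After starting the SSH tunnel, open the following address in the local browser:

127.0.0.1:<local port>

Alternatively, open

<server ip>:6006

directly. This only works if TensorBoard listens on an external interface (for example --host=0.0.0.0 instead of --host=localhost) and the port is reachable from the local machine.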

10. Docker usage

10.1 list images

docker images
docker image ls

10.2 deleting an image

docker rmi repo:tag

11. Install nvidia-docker

11.1 add the repository

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

11.2 update the apt package index

sudo apt-get update

Note: if apt cannot reach https://nvidia.github.io at this point even though the server has Internet access, you may need to configure a proxy for apt. A temporary proxy can be set on the command line for both the update and the installation.

sudo apt-get -o Acquire::http::proxy="http://127.0.0.1:8889/" update

Note: 8889 is the forwarding port of the local proxy.

11.3 install nvidia-docker

# Without a temporary proxy
sudo apt-get install -y nvidia-docker2
# With a temporary proxy
sudo apt-get install -o Acquire::http::proxy="http://127.0.0.1:8889/" nvidia-docker2

11.4 restart Docker service

sudo systemctl restart docker

12. apt-get proxy settings

12.1 setting by environment variable

export http_proxy=http://127.0.0.1:8000
# -E keeps the current environment, since sudo may otherwise drop http_proxy
sudo -E apt-get update

12.2 setting through configuration file

Add the following lines to /etc/apt/apt.conf (or to a file under /etc/apt/apt.conf.d/):

Acquire::http::proxy "http://127.0.0.1:8000/";
Acquire::ftp::proxy "ftp://127.0.0.1:8000/";
Acquire::https::proxy "https://127.0.0.1:8000/";

12.3 temporary setting via command line

sudo apt-get -o Acquire::http::proxy="http://127.0.0.1:8000/" update

13. Embed a Python script in a shell script

#!/bin/bash

# "-" tells python3 to read the script from standard input (the here-document below); "$@" passes the shell script's arguments on to the Python script
# Indent the Python code with spaces only: with <<-, leading tabs are stripped from every line, which would break tab-based indentation
# The "-" in <<- is optional. With <<-, the end identifier may be indented with tabs; with plain <<, the end identifier must be at the very beginning of a line
# END is the end identifier, which can be customized, such as EOD, EOF, etc.
python3 - "$@" <<-END
import torch

print('Hello, world!')
END
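To show how the arguments actually arrive in the embedded script, here is a minimal sketch that needs only python3 (no torch). The function name sum_args is made up for the example. The end identifier is written in quotes ('END'), which stops the shell from expanding $ or backticks inside the Python code.

```shell
#!/bin/bash

# Hypothetical helper: adds up its arguments with an embedded Python script
sum_args() {
    # "-" makes python3 read the program from the here-document;
    # "$@" forwards the function's arguments to it
    python3 - "$@" <<'END'
import sys

# sys.argv[0] is "-"; the remaining items are the forwarded arguments
numbers = [int(arg) for arg in sys.argv[1:]]
print(sum(numbers))
END
}

result=$(sum_args 1 2 3)
echo "$result"
```

Because the script itself occupies standard input, data for the Python code is best passed as arguments, as done here, rather than piped in.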

14. Turn the graphical interface on and off in Ubuntu

# Disable the graphical interface (boot into text mode)
sudo systemctl set-default multi-user.target
# Enable the graphical interface
sudo systemctl set-default graphical.target
# Reboot for the change to take effect
sudo shutdown -r now

15. Quickly install the NVIDIA graphics driver on Ubuntu

# List the detected devices and the drivers available for them
ubuntu-drivers devices
# Install one of the driver versions listed by the command above (replace xxx with the version)
sudo apt install nvidia-driver-xxx-server

16. Transfer data between Ubuntu machines

# scp
scp -r user@ip:/path/to/file /path/to/file

17. Passwordless SSH login with a public/private key pair

17.1 enable key login mode on the server side

sudo vim /etc/ssh/sshd_config

# ----------------------------------------
# Allow root remote login
PermitRootLogin yes

# Whether password login is enabled
PasswordAuthentication yes

# Turn on public key authentication
# RSAAuthentication only applies to SSH protocol 1 and can be omitted on recent OpenSSH versions
RSAAuthentication yes
PubkeyAuthentication yes

# File where the login user's public keys are stored
# The path is relative to the login user's home directory:
# for root it is /root/.ssh
# for user foo it is /home/foo/.ssh
AuthorizedKeysFile .ssh/authorized_keys
# ----------------------------------------

service sshd restart

17.2 generate the public/private key pair on the client

ssh-keygen

This command creates the key pair in the .ssh folder under the user's home directory: id_rsa (private key) and id_rsa.pub (public key).

17.3 upload public key to server

ssh-copy-id -i ~/.ssh/id_rsa.pub user@ip
# Or upload using scp

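17.4 write the public key into the authorization file

This step is only needed if the key was uploaded with scp; ssh-copy-id already appends it to authorized_keys. Run it on the server:

cat id_rsa.pub >> ~/.ssh/authorized_keys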
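A common reason key login still fails after this step is loose permissions on .ssh or authorized_keys, which sshd rejects when StrictModes is on. The sketch below rehearses the append together with the usual 700/600 permissions. It uses a throwaway directory in place of the real home directory, and the key text is a placeholder, not a real key.

```shell
# Throwaway directory standing in for the user's home directory (demo assumption)
demo_home=$(mktemp -d)
# Placeholder public key text, not a real key
echo "ssh-rsa AAAAexample user@client" > "$demo_home/id_rsa.pub"

# Create .ssh with owner-only access
mkdir -p "$demo_home/.ssh"
chmod 700 "$demo_home/.ssh"

# Append the public key and restrict the file to the owner
cat "$demo_home/id_rsa.pub" >> "$demo_home/.ssh/authorized_keys"
chmod 600 "$demo_home/.ssh/authorized_keys"

# Count how many lines in the file contain the uploaded key's comment
key_count=$(grep -c "user@client" "$demo_home/.ssh/authorized_keys")
echo "$key_count"
```

On a real server the same commands would be run with ~ in place of the demo directory.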

18. Eject an external hard disk on Ubuntu

# First unmount every mounted partition of the disk with umount, then power it off
sudo udisksctl power-off -b /dev/sdb

19. Access JupyterLab on a remote server locally

19.1 create a conda environment and install JupyterLab

conda create -n jupyter python=3.8
conda activate jupyter
pip install jupyter jupyterlab

19.2 generate jupyter notebook login password

ipython
# If ipython is not installed, install it first
# Enter the following at the interactive prompt
from notebook.auth import passwd
passwd()
# Remember the password you type in, and copy the hashed string that is printed

19.3 generate jupyter configuration file

jupyter notebook --generate-config
# Edit the configuration file
vim ~/.jupyter/jupyter_notebook_config.py
# Main modifications
c.NotebookApp.allow_remote_access = True
c.NotebookApp.open_browser = False
c.NotebookApp.ip='*'
c.NotebookApp.port = 8888
c.NotebookApp.password = 'sha1:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
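  • allow_remote_access: allow remote access
  • open_browser: do not open a browser on the server at startup
  • ip: address to listen on ('*' means all addresses)
  • port: port to serve on, which the local browser connects to
  • password: hashed password (the string output in the previous step)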

19.4 running jupyter on the server

nohup jupyter lab &

19.5 local use

Enter <server ip>:<port> in the local browser to access JupyterLab.

Summary

I originally planned to use one article as a table of contents that hyperlinks to separate articles. I did not realize that a table of contents could simply be placed at the top of a single article. I hope this collection helps you, and I will update it from time to time.

Keywords: Docker ssh Ubuntu Deep Learning

Added by protokol on Mon, 07 Mar 2022 17:29:20 +0200