How to compile software

Background

Compilation is the process of using a compiler to generate a target program from a source program written in a source language. In practice, compiling turns a high-level language into binary machine code that the computer can execute. The computer only understands 1s and 0s; the compiler translates the language we are familiar with into that binary form.

1, Interpreted and compiled languages

The authors of bioinformatics software use many programming languages, such as C, C++, Python, Java, R, etc. These are all high-level programming languages, each with its own advantages and disadvantages. Eventually the computer has to convert them into binary in order to execute them. Depending on how that conversion happens, languages are divided into compiled and interpreted types.

1.1 Compiled and interpreted

Compiled languages include C, C++, Objective-C, etc. Usually the source code is compiled to generate executable binary code, and it is the compiled result that gets executed.
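As a minimal sketch of that compile-then-run cycle, the snippet below writes a tiny C program, compiles it, and runs the result. The file names and the temporary directory are made up for illustration, and it assumes a C compiler is available as cc (on most Linux systems this is gcc); if none is found it only prints a hint.

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# A hypothetical one-file C program, used only for this illustration.
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>

int main(void) {
    puts("hello from a compiled program");
    return 0;
}
EOF

if command -v cc >/dev/null 2>&1; then
    # Compile: source code in, executable binary out.
    cc -o "$workdir/hello" "$workdir/hello.c"
    # Run the compiled result; the source file is not needed for this step.
    "$workdir/hello"
else
    echo "no C compiler found (cc); install gcc first"
fi
```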

Representative interpreted languages include JavaScript, Python, Erlang, PHP, Perl, Ruby, etc. Generally the source code does not need to be compiled; the script is loaded and run by an interpreter. Because each statement is translated only when it is executed, an interpreted program has to be translated again every time it runs, so its efficiency is relatively low.
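For contrast, here is a small sketch of the interpreted workflow, using the shell itself as the interpreter because it is present on every Linux system. The script name is hypothetical. There is no compile step: the interpreter reads the source file and executes it statement by statement.

```shell
workdir=$(mktemp -d)

# The "program" is plain text and stays human-readable.
cat > "$workdir/hello.sh" <<'EOF'
echo "hello from an interpreted script"
EOF

# No compilation: hand the source file straight to the interpreter.
sh "$workdir/hello.sh"

# The same pattern applies to other interpreters (python3 script.py, Rscript script.R),
# provided that interpreter is installed on the machine.
```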

Java has characteristics of both compilation and interpretation: its source code can be executed like a script, or compiled into class files (bytecode) that are then loaded and run by the Java virtual machine.

1.2 Differences between compiled and interpreted

A compiled language must be compiled from source code into binary code before it can run, so it has higher execution efficiency, and the compiled program is easy to distribute. For example, a program written in C can often be run on a new machine simply by copying the compiled binary over, provided the machine has a compatible operating system and architecture. The drawback is that the program has to be recompiled after every change before it can run, which makes testing less convenient while writing it. The advantage of an interpreted language is that it runs directly without compilation, and the source code is easy to inspect. It also has good platform compatibility: the same script can run in any environment that has the interpreter, and it can be deployed quickly, without downtime for maintenance. However, moving an interpreted program to a new machine requires an interpreter there. For example, a Python program can only be run if Python is installed on the machine. In addition, because the code has to be interpreted on every run, performance is not as good as with a compiled language.

2, README file

A typical software package includes not only source code files, but also test data, a software description, and readme files. The file may be named README, README.md, readme.txt, INSTALL.txt, INSTALL, etc. You can view it directly with the less command. The readme file introduces the software in detail, including a description, installation method, usage examples, contact information, etc. The part to look at first is the installation method.
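A quick way to locate these files after unpacking is sketched below. The directory and the file names in it are made up for the demonstration; in practice, run the find command in the real source directory.

```shell
# Build a stand-in for an unpacked source directory (hypothetical file names).
src=$(mktemp -d)
touch "$src/README.md" "$src/INSTALL" "$src/main.c" "$src/Makefile"

# List files whose names start with "readme" or "install", ignoring case.
find "$src" -maxdepth 1 -type f \( -iname 'readme*' -o -iname 'install*' \) | sort

# Then read one of them page by page, for example:
#   less "$src/INSTALL"
```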

# install the dependencies needed to build R
yum install -y --skip-broken zlib java gcc-gfortran gcc gcc-c++ readline-devel \
    libXt-devel bzip2-devel.x86_64 bzip2-libs.x86_64 xz-devel.x86_64 \
    pcre-devel.x86_64 libcurl-devel.x86_64
# download and unpack
wget https://cloud.r-project.org/src/base/R-4/R-4.1.1.tar.gz
mkdir -p ~/biosoft
tar -zxvf R-4.1.1.tar.gz -C ~/biosoft
cd ~/biosoft/R-4.1.1

Look at the compilation section of the INSTALL file:

As you are reading this file, you have unpacked the R sources and are
presumably in the top directory. Issue the following commands:

    ./configure
    make

(If your make is not called `make', set the environment variable MAKE to
its name, and use that name throughout these instructions.)
This will take a while, giving you time to read `R-admin.html'.

Then check the built system worked correctly, by

    make check

and make the manuals by either or both of

    make pdf      to create PDF versions
    make info     to create info files

However, please read the notes in `R-admin.html' about paper size and
making the reference manual.

3, The software compilation process

Generally, compiling software takes three steps, and sometimes four. The exact steps depend on the language the software is written in and on the software itself. The following is a typical installation process.

3.1 configure

configure checks the environment before compilation. You can also change the software installation directory through its options. configure is a shell script, so it can be opened and read directly. While it runs, it keeps checking the environment and prints warnings and error messages. Warnings can usually be ignored, but configure stops when an error occurs. You then need to resolve the missing dependency and run configure again. You cannot proceed to make until all checks pass.

# check the environment and configure the build
./configure --enable-R-shlib --with-pcre1
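To make the behaviour described above concrete, here is a toy stand-in for a configure script. Real configure scripts are generated by autoconf and are far larger; this sketch only imitates the two things discussed here, namely checking the environment (stopping at the first error) and accepting an install directory via --prefix. The tools it checks for (sh, ls), the output file name, and the prefix path are all chosen for illustration.

```shell
demo=$(mktemp -d)

cat > "$demo/configure" <<'EOF'
#!/bin/sh
# Toy configure: parse --prefix, check for required tools, record the result.
prefix=/usr/local
for arg in "$@"; do
    case $arg in
        --prefix=*) prefix=${arg#--prefix=} ;;
    esac
done

for tool in sh ls; do
    printf 'checking for %s... ' "$tool"
    if command -v "$tool" >/dev/null 2>&1; then
        echo yes
    else
        echo no
        echo "configure: error: $tool is required" >&2
        exit 1
    fi
done

# A real configure would generate Makefiles here; the toy just records the prefix.
echo "prefix = $prefix" > config.status.txt
echo "configuration done, install prefix is $prefix"
EOF

# Run it the same way as a real one, with a user-writable prefix.
(cd "$demo" && sh ./configure --prefix="$HOME/biosoft/demo")
```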

3.2 make

When configure finishes without problems, you can compile with make. make is the step that compiles the source code into binaries. Some software also provides make test, make check, or similar steps that verify the build after make. Some software does not need configure at all and can be compiled directly with make. After make finishes, you will find executable files in the directory, or in a newly created bin directory. At this point you can already run the software directly.

# compile
make
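If it is not obvious which files make produced, one option is to search the source tree for executable files. The sketch below runs on a stand-in directory with made-up file names; in a real build directory only the find line is needed.

```shell
# Stand-in for a source tree after "make" has run (all names are hypothetical).
build=$(mktemp -d)
mkdir "$build/bin"
touch "$build/README" "$build/tool.c" "$build/tool" "$build/bin/helper"
chmod +x "$build/tool" "$build/bin/helper"

# List regular files, at most two levels deep, that the owner may execute.
find "$build" -maxdepth 2 -type f -perm -u=x | sort
```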

3.3 make install

make completes the compilation. make install then copies the built software into the installation directory, that is, the directory specified to configure in the first step. If none was specified, it installs to the default location, usually /usr/local. Note that if you are not an administrator, you do not have permission to write there, and a "Permission denied" error is reported.

This "Permission denied" problem does not affect running the software. You can manually link the executable program into your own software directory. With that, configure, make and make install complete the software installation. The most important step is configure: if configure passes without problems, make and make install generally succeed.

# install
make install
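One way to avoid the permission problem altogether is to choose an install prefix you can write to. The sketch below checks whether the default location is writable and otherwise falls back to a directory under the home directory. The fallback path is a hypothetical choice, and the commands are only printed, not executed.

```shell
# Default install location used by most configure scripts.
prefix=/usr/local

# Without write permission there, fall back to a directory you own.
# "$HOME/biosoft/R-4.1.1-build" is a made-up name for this example.
if [ ! -w "$prefix" ]; then
    prefix="$HOME/biosoft/R-4.1.1-build"
fi

# Print the commands rather than running them, so this is safe to paste anywhere.
echo "./configure --prefix=$prefix"
echo "make"
echo "make install"
```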

4, Installing precompiled software

In addition to source code, many software projects also provide precompiled versions that can be used directly. If the software offers a precompiled version, it is recommended to choose it, which is very convenient: download it, unpack it, and use it. What is the difference between building from source and using a precompiled version? A source build first checks the hardware and environment of the particular system and then compiles specifically for it. Generally speaking, software built this way is somewhat more efficient than the precompiled version. However, this difference matters mainly for Internet applications that run an enormous number of times, where being one second slower each time adds up to a large effect; for biological software the impact is small.

The following sections explain and practice this through the installation of several programs. Before installing, we create three directories: bin, biosoft and src.

# create the directories in the home directory
cd ~
mkdir -p bin biosoft src

bin: stores the executable programs of each software

biosoft: software installation directory

src: software source code
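Linking programs into ~/bin only helps if the shell searches ~/bin for commands. The article does not show this step, so the following is an assumption about the intended setup: a small sketch that adds ~/bin to PATH for the current session if it is not already there.

```shell
# Add ~/bin to PATH only if it is missing, to avoid duplicate entries.
case ":$PATH:" in
    *":$HOME/bin:"*)
        echo "$HOME/bin is already on PATH"
        ;;
    *)
        PATH="$HOME/bin:$PATH"
        export PATH
        echo "added $HOME/bin to PATH"
        ;;
esac

# To make this permanent, put an equivalent export line in ~/.bashrc.
```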

Here are some examples of installing precompiled software. See the case code for more details.

# 1 blast+
cd ~/biosoft
axel -n 100 https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ncbi-blast-2.12.0+-x64-linux.tar.gz
tar -zxvf ncbi-blast-2.12.0+-x64-linux.tar.gz
cd ~/biosoft/ncbi-blast-2.12.0+/bin
# link every program in this directory into ~/bin
ls -1 | while read -r i; do ln -s "$PWD/$i" ~/bin/; done
blastn -h    # check whether the installation succeeded
# 2 edirect
cd ~/biosoft
wget https://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/edirect.tar.gz
tar -zxvf edirect.tar.gz
cd ~/bin/
ln -s ~/biosoft/edirect/efetch .
ln -s ~/biosoft/edirect/edirect.pl .
ln -s ~/biosoft/edirect/ecommon.sh .
ln -s ~/biosoft/edirect/esearch .
esearch -help    # check whether the installation succeeded
# 3 sratoolkit
# download the current release (the links below assume it unpacks as version 2.11.2)
cd ~/biosoft
axel -n 100 https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-centos_linux64.tar.gz
tar -zxvf sratoolkit.current-centos_linux64.tar.gz
cd ~/bin
ln -s ~/biosoft/sratoolkit.2.11.2-centos_linux64/bin/prefetch ./
ln -s ~/biosoft/sratoolkit.2.11.2-centos_linux64/bin/fasterq-dump ./
ln -s ~/biosoft/sratoolkit.2.11.2-centos_linux64/bin/fastq-dump ./
prefetch -h    # check whether the installation succeeded
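The linking step repeats for every package, so it can be wrapped in a small function. The helper below is hypothetical (it is not part of any of the tools above) and is demonstrated on throwaway directories with made-up tool names. Pass absolute paths, otherwise the symlinks will not resolve.

```shell
# link_bins SRC_DIR BIN_DIR: symlink every executable file in SRC_DIR into BIN_DIR.
link_bins() {
    for f in "$1"/*; do
        # Skip directories and non-executable files such as documentation.
        if [ -f "$f" ] && [ -x "$f" ]; then
            ln -sf "$f" "$2/"
        fi
    done
}

# Demonstration on a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/pkg" "$demo/bin"
touch "$demo/pkg/toolA" "$demo/pkg/toolB" "$demo/pkg/NOTES.txt"
chmod +x "$demo/pkg/toolA" "$demo/pkg/toolB"

link_bins "$demo/pkg" "$demo/bin"
ls "$demo/bin"
```

In real use this would look like link_bins ~/biosoft/ncbi-blast-2.12.0+/bin ~/bin.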

5, Compiling software yourself

Some software has no precompiled version and has to be compiled by the user. The advantage of compiling it yourself is that the result is better adapted to the hardware and runs more efficiently, but for ordinary users like us the difference is small; it only becomes noticeable in tools such as Internet applications that process hundreds of millions of requests per second. You don't need to worry too much about compiling software yourself. Some tools are easy to compile: you just run make. For slightly more complex ones, consult the help documentation. The process is also a way to learn and to understand computer principles in more depth. Because system configurations differ, some of the software compiled below may fail to build on your system.

If compilation fails, the software can be installed with bioconda later. Unpack the source code into the installation directory biosoft. After compiling, link the executable into the bin directory with ln -s.
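Since a build may fail, it helps to keep the build output and get a clear message when that happens. The helper below is a hypothetical sketch: it runs a command inside a directory, saves the output to build.log, and prints a reminder about the bioconda fallback on failure. It is demonstrated with the stand-in commands true and false rather than a real make.

```shell
# try_build DIR CMD...: run CMD inside DIR and keep the output in DIR/build.log.
try_build() {
    build_dir=$1
    shift
    if (cd "$build_dir" && "$@") > "$build_dir/build.log" 2>&1; then
        echo "build succeeded in $build_dir"
    else
        echo "build failed in $build_dir; see build.log, or try bioconda instead"
        return 1
    fi
}

# "true" always succeeds and "false" always fails, standing in for make.
scratch=$(mktemp -d)
try_build "$scratch" true
try_build "$scratch" false || echo "falling back to another installation method"
```

In real use the call would be something like try_build ~/biosoft/bwa make.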

Here are some examples of compiling software. See the script code for more details.

# 1 bwa
cd ~/biosoft
git clone https://github.com/lh3/bwa.git
cd bwa; make
ln -s ~/biosoft/bwa/bwa ~/bin/    # link the executable into ~/bin
bwa    # check that the installation is complete
# 2 minimap2
cd ~/biosoft
git clone https://github.com/lh3/minimap2
cd minimap2 && make
vi ~/.bashrc    # edit the environment variables and add the following line
export PATH=$PATH:$HOME/biosoft/minimap2    # set the PATH environment variable
source ~/.bashrc    # reload the environment variables
minimap2    # check that the installation is complete
# 3 prodigal
cd ~/biosoft
git clone https://github.com/hyattpd/Prodigal.git
cd Prodigal
sudo make install
prodigal -h    # check that the installation is complete
# 4 canu
cd ~/biosoft
git clone https://github.com/marbl/canu.git
cd canu/src
make
# add the next line to ~/.bashrc; recent canu versions put binaries in build/bin, adjust if yours differs
export PATH=$PATH:$HOME/biosoft/canu/build/bin
source ~/.bashrc    # reload the environment variables
canu    # check that the installation is complete
# 5 flye
cd ~/biosoft
git clone https://github.com/fenderglass/Flye
cd Flye
make
# add the next line to ~/.bashrc; the flye launcher is in the bin directory
export PATH=$PATH:$HOME/biosoft/Flye/bin
source ~/.bashrc    # reload the environment variables
flye    # check that the installation is complete

Added by MHardeman25 on Wed, 15 Dec 2021 07:58:43 +0200