Sequencing: Install mapping tools
I tried to map the genome data I used before.
To do that, I tried to install a genome mapping tool. I tried 2 tools: TopHat 2 [1] and HISAT2 [2].
The results were:
- TopHat 2: I could not install it.
- HISAT 2: I could install it.
This article is a log of my attempts with these 2 tools.
By the way, "alignment" and "mapping" have been used as synonyms. [3] But nowadays the two are distinguished: "alignment" is one possible way, not the only way, to obtain the result "mapping".
TopHat2
TopHat appears to be a popular tool for mapping; EBI has an article about it [4]. There is TopHat and its successor, TopHat2.
TopHat2 has 2 dependencies: "boost" [5] and "bowtie" [6].
I was able to install both by building them from source.
But on a Mac, installing them with brew install is easier.
$ brew install boost
$ brew install bowtie2
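Before moving on to the build, it can help to confirm that the dependency is actually on the PATH. The following is a minimal sketch, not part of the original steps: the function name `missing_tools` is hypothetical, and it only checks for executables, so Boost (a library, not a command) is not covered by it.

```shell
#!/bin/sh
# Print the name of every command given as an argument that is NOT on PATH.
# Prints nothing when all of them are found.
# `missing_tools` is a hypothetical helper name used for illustration.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# bowtie2 is the executable installed by the bowtie2 formula.
# Boost is a library, not a command, so it is not checked here.
missing_tools bowtie2
```

If the call prints nothing, `bowtie2` was found; otherwise the missing name is listed.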
I used GCC 5 because I saw that the installation had been tried with GCC 5.
$ git clone git@github.com:infphilo/tophat.git
$ cd tophat
$ brew install gcc@5
$ /usr/local/bin/gcc-5 --version
gcc-5 (Homebrew GCC 5.5.0_2) 5.5.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ ./autogen.sh
$ ./configure --prefix=/usr/local/tophat-2-dev CC=gcc-5
...
-- tophat 2.1.1 Configuration Results --
  C++ compiler: g++ -Wall -Wno-strict-aliasing -g -gdwarf-2 -Wuninitialized -O3 -DNDEBUG -I./samtools-0.1.18 -pthread -I/usr/local/include -I./SeqAn-1.4.2
  Linker flags: -L./samtools-0.1.18 -L/usr/local/lib
  BOOST libraries: -lboost_thread-mt -lboost_system
  GCC version: gcc-5 (Homebrew GCC 5.5.0_2) 5.5.0
  Host System type: x86_64-apple-darwin17.4.0
  Install prefix: /usr/local/tophat-2-dev
  Install eprefix: ${prefix}
See config.h for further configuration information.
Email bug reports to <tophat.cufflinks@gmail.com>.
$ make 2>&1 | tee -a make.log
...
gcc -c -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_USE_KNETFILE -D_CURSES_LIB=0 -I. bam2depth.c -o bam2depth.o
gcc -g -Wall -O2 -o samtools_0.1.18 bam_tview.o bam_plcmd.o sam_view.o bam_rmdup.o bam_rmdupse.o bam_mate.o bam_stat.o bam_color.o bamtk.o kaln.o bam2bcf.o bam2bcf_indel.o errmod.o sample.o cut_target.o phase.o bam2depth.o -Lbcftools libbam.a -lbcf -lm -lz #-lcurses
Undefined symbols for architecture x86_64:
  "___ks_insertsort_heap", referenced from:
      _ks_combsort_heap in libbam.a(bam_sort.o)
      _ks_introsort_heap in libbam.a(bam_sort.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[3]: *** [samtools_0.1.18] Error 1
make[2]: *** [libbam.a] Error 2
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
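Because the build output was saved to make.log with tee, the failing lines can be pulled out of a long log afterwards. Below is a minimal sketch of that idea; the function name `show_build_errors` is hypothetical, and the search patterns are simply the messages that appear in the log above, not an exhaustive list. One thing the log itself shows is that the failing link step was reported by clang even though CC=gcc-5 was passed to configure. That may suggest the bundled samtools build did not use gcc-5, but this is only an observation from the log, not a confirmed cause.

```shell
#!/bin/sh
# Print, with line numbers, the lines of a build log that look like failures.
# The patterns are taken from the messages seen in the TopHat2 build log.
# `show_build_errors` is a hypothetical helper name used for illustration.
show_build_errors() {
    log_file=$1
    grep -n -E 'Undefined symbols|error:|\*\*\* \[' "$log_file"
}

# Only scan the log if it exists in the current directory.
if [ -f make.log ]; then
    show_build_errors make.log || echo "no error lines found"
fi
```

The first line printed is usually the most useful one, since the later `make` errors are consequences of it.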
Then, according to this article [7], the author of TopHat 2 recommends using HISAT2 rather than TopHat2.
So I moved on to HISAT2, leaving the error unresolved.
HISAT2
The installation succeeded.
$ git clone git@github.com:infphilo/hisat2.git
$ cd hisat2
$ make
$ ./hisat2 --version
/Users/jun.aruga/git/hisat2/hisat2-align-s version 2.1.0
64-bit
Built on users-MacBook-Air.local
Fri May 4 00:29:47 CEST 2018
Compiler: InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Options: -O3 -m64 -msse2 -funroll-loops -g3 -DPOPCNT_CAPABILITY
Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
$ ls -l hisat2
-rwxr-xr-x 1 jun.aruga staff 18181 May 4 00:25 hisat2*
$ ./hisat2 --help
...
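To record which version was built, the version number can be extracted from the `--version` output shown above. This is a minimal sketch that assumes the output keeps the form "... version 2.1.0" on one of its lines, as it does in the log above; the function name `extract_version` is hypothetical.

```shell
#!/bin/sh
# Read `hisat2 --version` output on stdin and print only the version number
# from the first line of the form "... version <digits and dots>".
# `extract_version` is a hypothetical helper name used for illustration.
extract_version() {
    sed -n 's/.* version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Run the check only when a hisat2 executable is present in the current directory.
if [ -x ./hisat2 ]; then
    ./hisat2 --version | extract_version
fi
```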
Consideration
I see bioinformatics tools as being like the programming tools of the IT industry, such as Python and Ruby, in their early days. The CI environment of bioinformatics tools could be improved. I think the reason is that the number of bioinformatics users is much smaller than the number of users of IT tools such as Python and Ruby. Bioinformatics is at the dawn of its age. This could be improved by people working for tomorrow.
References
- [1] TopHat 2: http://ccb.jhu.edu/software/tophat, GitHub: https://github.com/infphilo/tophat
- [2] HISAT 2: http://ccb.jhu.edu/software/hisat2, GitHub: https://github.com/infphilo/hisat2
- [3] Alignment and mapping
- [4] https://www.ebi.ac.uk/training/online/course/functional-genomics-ii-common-technologies-and-data-analysis-methods/read-mapping-or
- [5] Boost: Boost C++ Libraries
- [6] Bowtie2: Bowtie 2: fast and sensitive read alignment, Bowtie2 GitHub: https://github.com/BenLangmead/bowtie2
- [7] https://www.biostars.org/p/305409/