Jun's Blog

Output, activities, memo and etc.

Victoria and Albert Museum - The Future Starts Here

The Future Starts Here

I found interesting event " The Future Starts Here" in Victoria and Albert Museum, in London. Maybe I am going to visit there soon.

You can feel what's interesting from below URLs.

Victoria and Albert Museum (now until Sunday 4th November, 2018)

https://www.vam.ac.uk/exhibitions/the-future-starts-here

#TheFutureStartsHere hashtag on Twitter

I want to see especially Oxford Nanopore Sequencer MinION, Bento Lab [1] and "Who wants to live forever?".

MinION (UK base) is to do sequencing anywhere by anyone needless to say. Bento Lab (UK base) is to do genetic experience everywhere. When the the mobility is promote, as a result the most popular CPU company for the mobile device, ARM (UK base) is expanded more. The age of personal biology is starting now.

UK government campaign

By the way, Victoria and Albert Museum is collaborating UK base companies including UK company of multi regional company in variety of way, seasonal events and regular exhibitions.

In 2018, UK government is promoting a campaign "The Year of Engineering" [2]. It seems that the government is fighting against the separation of the some global companies by Brexit, and trying to change the country such as Switzerland and Singapore. I saw an article that UK government people visiting Switzerland to observe the strategy on BBC news.

The Year of Engineering is a government campaign which celebrates the world and wonder of engineering.

Sequencing: Run mapping

This is the next article from Sequencing: Install mapping tools - Another Japan in the World.

Try to do mapping by Hisat2 with previous fastq files trimmed by FastQC.

Go to Hisat2 official web [1], click H. sapiens GRCh38 "genome" link, and download index data file of reference genome data on the right side of the page.

$ wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/data/grch38.tar.gz

$ tar xzf grch38.tar.gz

$ cd grch38

$ ls -l
total 8844024
-rw-r--r--  1 jun.aruga  staff   986172031 Mar 12  2016 genome.1.ht2
-rw-r--r--  1 jun.aruga  staff   736462272 Mar 12  2016 genome.2.ht2
-rw-r--r--  1 jun.aruga  staff       11294 Mar 12  2016 genome.3.ht2
-rw-r--r--  1 jun.aruga  staff   736462267 Mar 12  2016 genome.4.ht2
-rw-r--r--  1 jun.aruga  staff  1295322177 Mar 12  2016 genome.5.ht2
-rw-r--r--  1 jun.aruga  staff   749943562 Mar 12  2016 genome.6.ht2
-rw-r--r--  1 jun.aruga  staff           8 Mar 12  2016 genome.7.ht2
-rw-r--r--  1 jun.aruga  staff           8 Mar 12  2016 genome.8.ht2
-rwxr-xr-x  1 jun.aruga  staff        1504 Mar 17  2016 make_grch38.sh

Then a directory that I created trimmed fastq files.

$ cd /path/to/SRA067162/trimmed_by_quality_28

$ ls *.fq
SRR747784_1_val_1.fq  SRR747784_2_val_2.fq

$ mkdir hisat2_out

Run mapping. sam (Sequence Alignment Map) [2] file is created as a output. I needed to wait a few minutes.

-x option is to set a prefix of the index files. If you extracted the index files in /Users/jun.aruga/doc/dev/bio/reference_genome/grch38/genome.1.ht2, set /Users/jun.aruga/doc/dev/bio/reference_genome/grch38/genome without the string .1.ht2.

$ hisat2 -x /Users/jun.aruga/doc/dev/bio/reference_genome/grch38/genome -1 SRR747784_1_val_1.fq -2 SRR747784_2_val_2.fq -S hisat2_out/SRR747784.sam
847491 reads; of these:
  847491 (100.00%) were paired; of these:
    60665 (7.16%) aligned concordantly 0 times
    765182 (90.29%) aligned concordantly exactly 1 time
    21644 (2.55%) aligned concordantly >1 times
    ----
    60665 pairs aligned concordantly 0 times; of these:
      469 (0.77%) aligned discordantly 1 time
    ----
    60196 pairs aligned 0 times concordantly or discordantly; of these:
      120392 mates make up the pairs; of these:
        111327 (92.47%) aligned 0 times
        6086 (5.06%) aligned exactly 1 time
        2979 (2.47%) aligned >1 times
93.43% overall alignment rate

$ ls -hl hisat2_out/SRR747784.bam
-rw-r--r--  1 jun.aruga  staff   546M May 20 13:54 hisat2_out/SRR747784.bam

Install samtools [3] to convert sam file to bam (Binary Alignment Map) file [4]. sam format is good to read by human. bam format is good to read by machine. Later I am going to use the bam file by a program.

$ brew install samtools

$ samtools --version
samtools 1.8
Using htslib 1.8
Copyright (C) 2018 Genome Research Ltd.

To convert the sam file to the bam file.

$ samtools view -b SRR747784.sam -o SRR747784.bam

As a reference, I'd like to show you the command to convert from bam file to sam file, though I have not use this command in this case.

$ samtools view -h SRR747784.bam -o SRR747784_new.sam