Next, let's look at the suite of tools used to perform variant calling. The first one is samtools, with its mpileup command. We've already looked at a number of commands and command line options for samtools, but this one is different. The samtools mpileup command creates, at every base in the genome at which we have aligned reads, a tally of the information that is present in the reads at that position. It optionally also calculates genotype likelihoods, which can be used by other tools further down the pipeline in order to produce the actual variant calls. The format of the output file is usually the specific mpileup output. However, it can also produce VCF and BCF format.

So let's take a quick look. I'm going to say samtools, and I'll say samtools1 because I have multiple versions on my system and this is the most recent one, then mpileup. Just like before, I would like to look at the options, so I'm going to save them in a log file for easy viewing. You can see that the command is samtools mpileup, a number of options, then the BAM file, and additional BAM files if that's the case.

There are a number of options for the input: for instance, the system for representing qualities, whether anomalous read pairs should be discarded or not, and so on, and whether we should be looking at the entire genome or maybe only within a region or a set of regions, this file here. Then there are output options. And here's one important set of options: -g will produce BCF format, and it invokes the procedure for calculating the genotype likelihoods. The same goes for -v, but the output will be in VCF format. Each of these can be further compressed with gzip. The -O option allows us to output the base positions on the reads; this corresponds to the last column that we have on the format slide. And lastly, there are a number of parameters that can be used to control the calculation of the genotype likelihoods.
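To make the idea of a per-position tally a bit more concrete, here is a minimal Python sketch. It is only an illustration of the concept, not how samtools implements mpileup. The reference string, the reads, and the function name toy_pileup are all made up for this example, and it assumes ungapped alignments given as a 0-based start plus a sequence, with no base qualities and no strand information.

```python
from collections import defaultdict

def toy_pileup(reference, reads):
    """Tally the read bases that cover each reference position.

    reference: the reference sequence as a string (hypothetical toy data)
    reads: list of (0-based start, read sequence), assumed ungapped
    Returns rows of (1-based position, reference base, depth, bases),
    where a base that matches the reference is written as a dot.
    """
    columns = defaultdict(list)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            columns[start + offset].append(base)

    rows = []
    for pos in sorted(columns):  # only positions covered by at least one read
        ref_base = reference[pos]
        bases = "".join("." if b == ref_base else b for b in columns[pos])
        rows.append((pos + 1, ref_base, len(columns[pos]), bases))
    return rows

# Toy data, invented for illustration only.
reference = "ACGTACGT"
reads = [(0, "ACGT"), (2, "GTAC"), (3, "TTCG")]

for row in toy_pileup(reference, reads):
    print(*row, sep="\t")
```

In this toy input, position 5 is covered by two reads, one agreeing with the reference A and one carrying a T, so its row shows a depth of 2 and the bases ".T". Position 8 is not covered by any read, so it produces no row at all.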
And one option that I will be using here is -f, for specifying the FASTA reference file for the genome. I have a file here, sample.bam. We can visualize it with samtools view. And we can get how many alignments it contains with samtools flagstat sample.bam, to apply some of the commands that we've learned in the previous lecture. So it's a fairly small number.

Now let's use samtools to produce the pileup format to start with: samtools mpileup -f, and we're specifying the reference genome, which in this case is hg19.fa, and then lastly the BAM file. And I should make one observation: the file needs to be sorted and indexed. So let's see what we get. So it couldn't open the FASTA index. Typically we would need to sort the file first; however, this file is already sorted, so all we need to do is samtools index sample.bam, and this will create a sample.bam.bai index file. So now let's repeat that operation. And it couldn't find the .fa file, because the extension was .fasta, sorry.

So now, as you see, that went by quite fast, so I'm going to capture that output into sample.mpileup. And we can run more on this file, sample.mpileup, and you'll recognize the output that I showed you earlier. These are all matches to chromosome 17. This is the position in the genome; every position at which reads are aligned is listed there. Then comes the reference genome letter. And then we have information about the number of reads, followed by information about the letters in those reads, where dots represent a match, and then the qualities. So that's the amount of information that we have in this mpileup format.

However, oftentimes, and actually usually, in order to produce variant calls with tools such as vcftools or maybe GATK, we need to use the VCF format or maybe the BCF format. For that we give the option, let's try -v -u for VCF: -v says VCF, -u says uncompressed, and we're going to save that into a VCF file.
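The columns of the pileup output described above (chromosome, position, reference letter, number of reads, read letters, qualities) can be picked apart with a few lines of Python. This is a rough sketch under some assumptions: the example line is invented rather than taken from sample.mpileup, the names PileupRow, parse_pileup_line and phred_scores are hypothetical, and the qualities are assumed to use the common Phred+33 encoding. Besides the dot, a comma is also counted as a match, since that is how samtools marks a match on the reverse strand.

```python
from typing import List, NamedTuple

class PileupRow(NamedTuple):
    chrom: str   # reference sequence name
    pos: int     # 1-based position in the genome
    ref: str     # reference genome letter
    depth: int   # number of reads covering the position
    bases: str   # letters in the reads; "." and "," mean a match
    quals: str   # one encoded quality character per read base

def parse_pileup_line(line: str) -> PileupRow:
    """Split the first six tab-separated columns of one pileup line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, depth, bases, quals = fields[:6]
    return PileupRow(chrom, int(pos), ref, int(depth), bases, quals)

def phred_scores(quals: str, offset: int = 33) -> List[int]:
    """Decode quality characters, assuming a Phred+33 encoding."""
    return [ord(ch) - offset for ch in quals]

# Invented example line, for illustration only.
example = "chr17\t100\tA\t3\t.,G\tII5"
row = parse_pileup_line(example)

# Naive match count: fine for this simple line, but real pileup strings can
# also contain indel and read-start markers that would need extra handling.
matches = sum(row.bases.count(symbol) for symbol in ".,")

print(row.chrom, row.pos, row.ref, row.depth, matches, phred_scores(row.quals))
```

Keeping the row as a named tuple means each column can be referred to by name, which is easier to read than numeric indexes when you filter positions by depth or by the number of mismatching reads.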
Run more on sample.vcf, and you can see that there is a sequence line for every contig, then you'll recognize the INFO lines, the FORMAT lines, and then every base here is shown as one line. And if I wanted to produce BCF format, then we would really just have to put a g here. So that is a compressed sample.vcf format. So let me show you sample.vcf. Okay, so that's it. This concludes the section on samtools and mpileup. Next we'll be talking about how we can make variant calls with bcftools.
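As a small recap of the VCF layout just described (contig lines, INFO lines, FORMAT lines, then one line per base), here is a short Python sketch that counts each kind of line. The lines in it are invented for illustration and are not the actual contents of sample.vcf, and summarize_vcf is a hypothetical helper name. It relies only on the standard VCF convention that header lines start with "##", followed by a single "#CHROM" column line.

```python
from collections import Counter

def summarize_vcf(lines):
    """Count the kinds of lines found in uncompressed VCF text."""
    counts = Counter()
    for line in lines:
        if line.startswith("##contig"):
            counts["contig"] += 1
        elif line.startswith("##INFO"):
            counts["info"] += 1
        elif line.startswith("##FORMAT"):
            counts["format"] += 1
        elif line.startswith("##"):
            counts["other_meta"] += 1
        elif line.startswith("#CHROM"):
            counts["column_header"] += 1
        elif line.strip():
            counts["record"] += 1  # one record per reported position
    return counts

# Invented lines, for illustration only.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr17,length=1000>",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">',
    '##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype likelihoods">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample.bam",
    "chr17\t100\t.\tA\t<*>\t0\t.\tDP=3\tPL\t0,9,60",
    "chr17\t101\t.\tC\t<*>\t0\t.\tDP=3\tPL\t0,9,60",
]

for kind, n in summarize_vcf(toy_vcf).items():
    print(kind, n)
```

To try this on a real file you would pass an open file handle instead of the list, which works for the uncompressed output written with -v -u; a compressed VCF or a BCF file would need to be decoded first.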