So we're saying, create another directory for sample Test1. It's going to be under the Tophat directory, and we're giving the full path so we can run this script from anywhere on our file system. And then we're running the command tophat2. We're saying: put the output in this directory, Test1. Run with ten threads. Report at most ten alignments for each read. Use the gene and transcript annotations in this data file, together with the transcriptome index, to inform your alignment process. Use this information, if available, about the distance between mates. Use this Bowtie index. And by the way, if a Bowtie index is not already on the system, then you can create one using bowtie2-build, just like I showed you last time. And lastly, these are the two files containing the reads, mate one and mate two.

So this is fairly long, and we can make it a little bit easier. First of all, let's observe that we have a base directory, which is data1, igm, and so on down to my name, okay? And from that we can name a data directory and a working directory. The data directory, for instance, is the name we're giving to the directory that contains our data. So this is data1 and so on. And I just found a mistake now: it should end in data. Okay, so now I can simply go and replace the string here with $DATADIR. And the same thing for the second file. So that was simple.

The other name that is being typed multiple times here is the working directory. So let's see that again. The working directory is data1 and so on, all the way to Tophat. Now we can simply replace the string here with the work directory variable. So where we had Tophat, Test1, we can now write $WORKDIR, Test1. Finally, we can also give the information about the annotation. So we can replace all of this with just $ANNOT. And the same for the transcriptome index. And lastly, we can do so for the Bowtie index as well. That one is the Bowtie2 index.
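
To make the substitution concrete, here is a minimal sketch of what such a script could look like. Only $DATADIR and the sample name Test1 come from the lecture. Every path, every read file name, the other variable names, and the mate inner distance of 100 are placeholders assumed for illustration. The sketch prints the tophat2 command by default and only runs it when RUN=1 is set, so it can be checked on a machine without TopHat installed.

```shell
#!/bin/sh
# Sketch of a com.tophat-style script. All paths and file names are
# hypothetical placeholders, not the ones used in the lecture.
BASEDIR=/path/to/base                    # hypothetical base directory
DATADIR=$BASEDIR/data                    # directory holding the read files
WORKDIR=$BASEDIR/Tophat                  # directory receiving TopHat output
ANNOT=$BASEDIR/annotation/genes.gtf      # gene/transcript annotation (GTF)
ANNOTIDX=$BASEDIR/annotation/genes_index # transcriptome index prefix
BWT2IDX=$BASEDIR/bowtie2index/genome     # Bowtie2 index prefix
INNERDIST=100                            # assumed mate inner distance

# -o output directory, -p threads, --max-multihits alignments per read,
# -G annotation, -r expected inner distance between mates.
CMD="tophat2 -o $WORKDIR/Test1 -p 10 --max-multihits 10 -G $ANNOT --transcriptome-index $ANNOTIDX -r $INNERDIST $BWT2IDX $DATADIR/Test1_R1.fastq.gz $DATADIR/Test1_R2.fastq.gz"

if [ "${RUN:-0}" = "1" ]; then
    mkdir -p "$WORKDIR/Test1"   # create the per-sample output directory
    $CMD                        # run TopHat for real
else
    echo "$CMD"                 # dry run: just show the expanded command
fi
```

Because every location is defined once at the top, moving the data or the indexes only means editing those few lines, and the tophat2 line itself stays the same.
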
So this is a much easier way of representing the same amount of information. And it also ensures that any changes are going to be encapsulated at the top. Any change we need to make, we can make in the definitions at the top. And once we have this file, all we need to do is say sh com.tophat. We put nohup at the beginning because we don't want the process to be interrupted once we log out, when this is being done remotely. And we can save whatever errors there are to com.tophat.log. And we can put that in the background.

So we wrote a very simple shell script that shows how we can run TopHat. I'm not going to run it. As I told you, I'm going to cheat a little bit, because all of the data files that I've selected are fairly large. They have about 120 million reads each. So instead I'm going to show you what the result looks like.

The output for Test1 will look as follows. In the directory Test1, we have a number of files. The primary alignment file is accepted_hits.bam, and you might recall that we can use samtools view to look at the content. So the sequences are fairly short in this one. And then we have BED files showing the deletions, insertions, and junctions. These are the splice junctions. You might recall the BED format. Then there is a file of the unmapped reads, should one be interested in performing any further operations, filtering, screening, and so on. And then a summary file that gives us basic information about the mapping rates and so on.

So for instance, this tells us that of all the left reads, of the 16,586,968 input reads, the left reads, or mate one, 96% could be mapped. And of those, 11.7% have multiple alignments, of which 359,000 had more than ten alignments and would not have been reported. Similar values are reported for the right reads. So for mate number two, a 94% mapping rate, for an overall mapping rate of 95%. So we have this many aligned pairs, of which roughly 6.5 million, or 11.6%, have multiple alignments. And some other 5% are discordant alignments. So the concordant alignment rate, in which the mates are mapped in the correct orientation and at a proper distance, is 87.6%, which is a fairly high rate.

We can very easily grep for this information, for the overall read mapping rate, in the other files as well, to see if we have similar mapping levels. So we have six directories containing this type of information, in the file align_summary.txt. And you can see that all of them have very high rates, between roughly 89% and 97%. So these are the BAM files containing read alignments that will be used in the final stage, to perform transcript assembly.

So this concludes our illustration of how to use command line tools for transcriptomics. And we're also at the end of our course on command line tools for genomic data science. Thanks for joining me, and happy computing.
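
The grep step described above can be sketched as a small shell function. The function name mapping_rates is made up for illustration, and the sketch assumes each sample directory sits directly under one working directory and contains an align_summary.txt with a line ending in "overall read mapping rate."

```shell
# Print the overall mapping rate line from every sample's summary file.
# Usage: mapping_rates WORKDIR
# Assumes a layout of WORKDIR/<sample>/align_summary.txt (hypothetical).
mapping_rates() {
    # -H prefixes each match with its file name, so the sample is visible
    # even when only one summary file exists.
    grep -H "overall read mapping rate" "$1"/*/align_summary.txt
}
```

Calling it with the directory that holds the six sample directories prints one line per sample. Swapping the search string for "concordant pair alignment rate" would list the concordant rates in the same way.
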