[MUSIC] In this module I'm going to talk a little bit about how we approach finding disease genes in families that have unusual diseases, like the familial forms that I've talked about before. So here is a family. I think I showed this particular family tree before, and this family has a pattern of inheritance that goes across generations. Each affected individual is shown in red, and you can see that the pattern is that women transmit to men and men transmit to women. There are approximately equal numbers of men and women affected, and in each generation an affected person transmits to approximately 50% of their offspring.
So this is a family tree that I did show before, and I wanted to highlight the idea that in generation three there's nobody marked as having the phenotype that we're interested in. The reason for that is that many of these diseases have what's called variable penetrance. So there has to be somebody in that generation, and in fact it has to be this person, who's shown in red now, who is an obligate mutation carrier. That doesn't mean they have the disease. But we know that if the grandfather has it, and it is genetic, and it's transmitted in an autosomal dominant fashion, which is what this pattern looks like, then that person in generation three, the mother of the two affected children in generation four, must be a carrier.
So an important step in starting to resolve these particular patterns is to make sure that we understand the phenotype and understand the family relationships, because otherwise it's a garbage in, garbage out problem. If we analyze on the wrong assumptions, we'll get to the wrong place. So the question that an investigator has when faced with a large kindred like that, where they're reasonably certain that it's a genetic disease, is: where in the genome is the variant, or perhaps the set of variants, but hopefully the single variant, that causes the disease? How do you find that?
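
Before turning to that question, the obligate-carrier reasoning above can be written down as a small rule. The following is only a sketch in Python under strong assumptions: the disease is rare and autosomal dominant, the mutation entered the family once, and an unaffected person counts as an obligate carrier when they have both an affected ancestor and an affected descendant. The pedigree, the person labels, and the function names are hypothetical and chosen only to mirror the grandfather, mother, and two children described in the lecture.

```python
def ancestors(person, parents):
    """Return every ancestor of `person` recorded in the `parents` mapping."""
    seen = set()
    stack = list(parents.get(person, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def obligate_carriers(parents, affected):
    """Unaffected people with an affected ancestor AND an affected descendant.

    Assumes a rare autosomal dominant mutation that entered the family once,
    so married-in partners are treated as non-carriers.
    """
    affected = set(affected)
    everyone = set(parents) | {p for pair in parents.values() for p in pair}

    # Everyone who sits above at least one affected person in the tree.
    above_affected = set()
    for person in affected:
        above_affected |= ancestors(person, parents)

    carriers = set()
    for person in everyone - affected:
        has_affected_descendant = person in above_affected
        has_affected_ancestor = bool(ancestors(person, parents) & affected)
        if has_affected_descendant and has_affected_ancestor:
            carriers.add(person)
    return carriers


# Hypothetical pedigree: child -> (father, mother).
example_parents = {
    "mother_III": ("grandfather", "grandmother"),
    "child_IV_1": ("father_III", "mother_III"),
    "child_IV_2": ("father_III", "mother_III"),
}
example_affected = {"grandfather", "child_IV_1", "child_IV_2"}

print(obligate_carriers(example_parents, example_affected))  # {'mother_III'}
```

The grandmother and the married-in father are not flagged because neither has an affected ancestor, which is the same logic the lecture applies by eye. If the family relationships or the phenotypes fed in are wrong, the output is wrong too.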
Well, it turns out there are places in the genome that are variable, and they're variable in a relatively random way across individuals, and those places in the genome allow us to do what's called mapping. One kind of variant that's very, very common is a single nucleotide polymorphism, or SNP. Another kind of variant that I haven't talked about much before is what's called a repeat variant. So here are two separate genetic sequences: one person has four TA repeats and the other one has two TA repeats. That's called a dinucleotide repeat, because it's a pattern of two. And here is a person who has four trinucleotide repeats, TAC, TAC, TAC, TAC, and the other one has only TAC, TAC. So that's two copies or four copies. And there are ways that are pretty simple to distinguish between people with two copies or four copies of these repeats, or ten copies, or 17 copies. Most of these repeats will not be in the coding region, and in fact, if a dinucleotide or a trinucleotide repeat were in the coding region, it would almost certainly disrupt the amino acid sequence and have profound functional changes.
So here is a small family. These are representations of a chromosomal sequence in the mother and the father in this family. The chromosomal sequence in the mother is shown in pink and green, and in the father in pink. And then there's a little block at the top that's yellow and a little block in the middle that is blue. Those blocks represent individual haplotype blocks that have arisen over time and make the sequences different in mom and dad. Each one of those is mappable using dinucleotide or trinucleotide repeat patterns across the genome that allow people to distinguish those alleles. Now, what happens here is that one daughter has a pattern that is actually interesting: it's not like mom or dad. The top half is like mom, and the bottom half is like dad.
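
To make the repeat markers concrete, here is a minimal Python sketch that counts how many back-to-back copies of a repeat unit a sequence contains. It is an illustration only: the flanking bases in the example sequences are made up, the function name is hypothetical, and real genotyping relies on laboratory assays rather than string matching.

```python
def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    # Try every start position and count how far the tandem run extends.
    for start in range(len(sequence) - len(unit) + 1):
        run = 0
        position = start
        while sequence.startswith(unit, position):
            run += 1
            position += len(unit)
        best = max(best, run)
    return best


# Hypothetical alleles: the same flanking bases around different repeat counts.
print(longest_repeat_run("GGCTATATATACC", "TA"))       # 4 copies of TA
print(longest_repeat_run("GGCTATACC", "TA"))           # 2 copies of TA
print(longest_repeat_run("AATACTACTACTACGG", "TAC"))   # 4 copies of TAC
print(longest_repeat_run("AATACTACGG", "TAC"))         # 2 copies of TAC
```

Because each person carries two copies of a chromosome, a genotype at one of these markers is a pair of repeat counts, such as (2, 4), and it is those counts that let you tell the parental chromosomes apart.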
One son inherits that particular region of this particular chromosome from dad. The affected son inherits it from dad, and the unaffected son inherits the maternal chromosome. So when you look at this pattern and you start to think about which patterns track across generations, you think, well, maybe it's something in that blue region that confers disease. Everybody who has the blue region has the disease. Everybody who doesn't have the blue region does not have the disease. That's why the phenotyping is so important, deciding who is a carrier and who is not, and why you want large, multigeneration families, because that allows more recombination events to take place and more genetic mapping to occur.
And how do you find that blue region? You use the dinucleotide or trinucleotide repeats; currently it's done with SNPs, and as I'll talk about in a couple of modules, there are much newer ways that have actually superseded these kinds of approaches, which many people would regard as relatively primitive but of huge historical importance.
So, the way in which the statistics are done in these kinds of families is with a logarithm of the odds score, the LOD score. A LOD score greater than three, and there's a way of defining that statistic, suggests that you have landed in the right region for this particular genetic variant. Now, it doesn't tell you what the genetic variant is. The region might be huge, it might encompass dozens and dozens of genes, and the challenge to the investigator is either to get the region smaller by acquiring more recombination events, in other words, more generations, or to tediously track through the particular region of linkage to find the gene that actually causes the disease. A LOD score less than -2 suggests that the region you've been looking at does not contain the disease locus. And in between, you just need to do more.
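
As a rough illustration of how that statistic is defined, here is a minimal Python sketch of a two-point LOD score. It assumes the simplest textbook case, in which every meiosis is fully informative and phase-known, so recombinants between the marker and the disease can simply be counted; real pedigree analyses use likelihood software that handles unknown phase and incomplete penetrance. The function names are hypothetical, and the cutoffs are the conventional 3 and -2. The score compares the likelihood of the data at a recombination fraction `theta` with the likelihood under no linkage (`theta` = 0.5).

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    where n is the number of meioses and k the number of recombinants.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Complete linkage is impossible once a recombinant has been seen.
        if recombinants > 0:
            return float("-inf")
        log_linked = 0.0
    else:
        log_linked = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def interpret(lod: float) -> str:
    """Apply the conventional cutoffs of 3 and -2."""
    if lod > 3.0:
        return "linkage supported"
    if lod < -2.0:
        return "linkage excluded"
    return "inconclusive: collect more meioses"


score = lod_score(recombinants=0, meioses=10, theta=0.0)
print(round(score, 2), interpret(score))  # 3.01 linkage supported
```

Under these assumptions, ten meioses with no recombinants just clear the threshold of three, which gives a feel for why small families are rarely informative on their own.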
So I'm going to show you one example of how this technology was used to crack a relatively common disease that's inherited in an autosomal fashion, a so-called Mendelian disease. This is a very, very large kindred. You can see affected individuals shown in dark. Many of the individuals are dead, because it's a large family; the disease might kill them early, but in fact age will kill many of them. So you have to have phenotypes, you have to have DNA on large numbers of patients, and you have to do mapping. This particular kindred happens to be a French Canadian family, in fact, with a disease called hypertrophic cardiomyopathy, which results in a thickened ventricle.
This exercise in mapping initially would cover perhaps several hundred dinucleotide repeat regions across the genome. Mapping those would identify a region of the genome in which the disease gene would occur. So you can imagine that with mapping 400 markers, you would arrive at a very, very large chunk of the genome in which the disease gene occurs. So that was what was done. This is a small region of chromosome 14 in that particular family, and you can see there's a LOD score that is very, very high in this particular region of linkage. So, with that particular marker, it's likely that the disease gene, the actual mutation that causes the disease, is located in that region. But that region spans hundreds of thousands of base pairs.
So with advancing technologies, and the ability to define those regions in more detail and to define the genes in those regions, came the ability to actually isolate the specific gene and the specific mutation that causes the disease in that family. The way to do that is to have a large family with a high LOD score and some physiologic rationale. This is a map of the disease gene, called the beta-myosin heavy chain gene. It's a gene that's expressed in the heart as an important part of the contractile apparatus.
This gene was identified as a disease gene in hypertrophic cardiomyopathy in 1989. To this day it's not very clear how a mutation in beta-myosin heavy chain results in the phenotype of hypertrophic cardiomyopathy. Nevertheless, the discovery has been a huge boon to clinicians, because we can now use genetic testing to identify mutation carriers and non-carriers. This is a cartoon from the very first publication that identified the mutation in this particular family, but we now know that there are literally hundreds of mutations in this particular gene that cause the disease in other families. And there are literally dozens of genes in which mutations can cause the disease, not just beta-myosin heavy chain. The interesting thing about those genes is that they all encode proteins that interact in some way with the beta-myosin heavy chain to affect the way the heart contracts.
So it's a great story of going from a family, doing traditional linkage, finding the region, finding the gene, and then using that information to uncover the cause of the disease in many, many other families, and to start to understand the underlying pathophysiology and start to develop genetic testing, which we talk about more in the case studies. And I emphasize again, the reason that was successful is that it starts with this very, very large family. Very small families, or individuals with hypertrophic cardiomyopathy, are simply not informative using this linkage approach. But the good news is that with newer sequencing technologies, which I'll talk about in a couple of modules, we can actually crack cases of patients who have unusual phenotypes and don't have much of a family history, and we'll talk about that later. [MUSIC]