Welcome back to Section B. Here I would like to explain what an integrated testing strategy is. But to do so, we should first discuss what a test is. Tests can actually be rather simple or pretty complex, and I would start with a very simple definition. For me, personally, a test is a thing which gives a result: it makes some type of assessment, a value, a classification, in this case of being hazardous or having a certain degree of hazardous properties. These can be very simple test systems, which are characterized by their SOP, their standard operating procedure, the way they are described, and which have, for example, one endpoint. In the case of a cell culture this could be cytotoxicity, it could be the release of a marker, it could be gene expression. And this is then, in some way, interpreted. We have a threshold: if there is more than 30% cytotoxicity, we call it positive. Or if a certain amount of a mediator is formed, we would call this a positive result. This is the simplest idea of one system: one cell culture, for example, one measured endpoint, and then an interpretation procedure for the data leading to the result. But we should be aware that things can be more complicated. Sometimes we need different cell types. So we are analyzing, for example, cells which represent the adult body and embryonic cells in order to distinguish embryonic toxicity in vitro. And only the difference in sensitivity between the embryonic tissue and the adult tissue gives us an indication that this is not just toxicity, but embryotoxicity. So this would be an example of two test systems which only together give a result, in this case on embryonic toxicity. But there are also many examples where we measure multiple endpoints, where we address not only, for example, cytotoxicity, but combine it with other endpoints. A prominent example for this would be skin irritation.
We have quite a few models of human skin which can be grown in the laboratory to give us skin equivalents. Information on a chemical's corrosivity can be measured solely by the chemical's aggressiveness towards this tissue. But if you want the more subtle effects of skin irritation, it has been shown that I improve my prediction by also measuring, in this case, interleukin-1 alpha release, a stress signal, which then helps me in a joint interpretation of these two measured endpoints to get a result. Okay, so this means, first of all, that a single test can be composed of different test systems. It can combine different endpoints we are measuring. But in all cases there will be some type of interpretation by which we derive a result. In 2013 we organized a workshop on Integrated Testing Strategies for Safety Assessments, and its report is another recommended reading, because here we discussed with experts our thoughts on integrated testing strategies from the first paper which I mentioned. One of the figures from this report is shown at the lower end of this slide. It shows three different things which we want to distinguish. One is weight of evidence. What is weight of evidence? Weight of evidence describes a process where different test results are combined in a not necessarily structured way, not in an algorithm, but by taking a decision, weighing the different pieces of evidence, and coming to a conclusion. This is often necessary because test results can be contradictory and test results can be incomplete. We need to see what is available. How do we weigh the different results? And in a non-formalized process we decide that this is what all of these tests tell us. Then there are other cases, and mutagenicity testing is a prominent example here, where we have a battery of tests. In mutagenicity testing we typically carry out three tests, and we could call the substance positive if any of these three is positive.
We start with the Ames test, for example, and add other mutagenicity tests depending on the area of application. So this is a very simple scheme where the result depends on all of the test results, but they are not combined in any type of integrated or algorithmic way. And this is the difference from a tiered testing strategy or integrated testing strategy. Here our process depends on the results within our testing. In the tiered testing strategy we test first in Test 1, and the result of Test 1 informs Test 2. And depending on a positive or negative result, we would proceed to Test 3 or to Test 4, or, depending on those, then also to Test 5. This is a completely theoretical example, but it shows the basic idea. Depending on whether we have a positive or a negative result, different consequences follow, and we move ahead with a different second or third test in order to confirm what we have been seeing, or to challenge what we have been seeing. And this is the idea of a testing strategy. Not all tests are necessarily used, and one result informs how best to approach the subsequent testing. What would be the reasons for moving from stand-alone alternative methods, so simple in vitro tests, which are quite successful for acute and topical toxicity, as we have learned earlier, and which have been formally validated, to an integrated testing strategy instead? There are a couple of reasons, and they are listed here. They had their impact at different time points over the last 10 to 15 years, leading to the proposal of integrating different types of tests. Actually, one of the first examples comes from the area of what is called quantitative in vitro to in vivo extrapolation, QIVIVE. This is the idea that I use results from a cell culture in vitro, and that I use computational tools and our understanding of the kinetics of a substance in an organism, to extrapolate to the in vivo exposure which is necessary to reach the concentration that is active in vitro.
This was very much pioneered by a working group on integrated testing strategies at ECVAM, the European Centre for the Validation of Alternative Methods. You can see here one of the early reports, from 1999, which suggested this combination, one of the first examples of an integrated testing strategy: kinetics plus in vitro toxicity in order to optimize predictions. Another element which very much impacted the development of testing strategies is the European REACH legislation. You have heard before that REACH is the largest program for systematically assessing chemicals as to their toxicity. Europe passed this legislation in 2006, and until May 2018 some 30,000 to 50,000 chemicals are going to be assessed and registered in this process. That is a large part of the chemicals which are on the market, many of them for many decades, but never properly tested. This legislation is a forerunner in many respects. Among other things, it states that animal tests should only be used as the last resort. This has many reasons: costs, ethical considerations around animal use, but also simply the laboratory testing capacities which are available to run that many animal tests. So very early on, the European Commission already published reports like this one, on REACH and the need for intelligent testing strategies. It stressed that if we rely on test systems which are simpler, which are reductionistic, which do not give us the full body's information, we are likely to need to combine several of these assays into what they called at the time intelligent testing strategies. Similar developments in the US were favoring the Toxicology for the 21st Century paradigm, which is the red thread of this lecture series. So you have heard about toxicity testing in the 21st century already. And it is quite obvious that when relying on in vitro test systems which are representative only of certain pathways of toxicity, of certain modes of action, you typically have to combine them.
So already in a paper here from 2009, A Toxicology for the 21st Century: Mapping the Road Ahead, I made a very strong point that the US approach of toxicity testing in the 21st century necessitates, as a consequence, the development of testing strategies in order to make optimal use of the various pathway-based assay systems. And indeed the high-throughput screening, the HTS, programs which we have seen from EPA, and from the collaborations with FDA and NIH in the Tox21 alliance, actually require on a large scale the combination of different test systems to come to interpretations. You have heard about the example of endocrine disruptors from David Dix, where at the moment 18 assays for estrogenic endocrine disruption are used to predict possible estrogenic effects. The last element: toxicity testing has some similarities with clinical diagnostics. Intellectually it is the same whether I diagnose with a number of tests that a patient has cancer, or whether I diagnose with a couple of tests that a chemical is producing cancer. It is about testing hypotheses, it is about confirming hypotheses, and it is about differentiating my diagnostic interpretation by the combined use of this information. And there are a lot of things we can learn from the clinic. We stressed this similarity in an article from 2005 which was entitled Diagnosis: Toxic! I recommend this for your reading, because it tries to make the case that clinical diagnostics is actually a field which is very much advanced in the way different sources of information are combined to come to an optimal interpretation of a case with a patient. So you see, these are some developments which took place over the course of the last decade. And in parallel to this we have been pursuing ideas of evidence-based toxicology, which will be the topic of an entire lecture series in the fourth term. But I will come back to this at the end of this lecture.
Evidence-based toxicology, which is derived from evidence-based medicine, is one of the tools which at the moment is under discussion for validating integrated testing strategies. So we will come back to this at the very end. So the idea of an integrated testing strategy is that a test substance enters a first test, often in comparison to reference substances, the positive and negative controls we are applying. We measure endpoints and we interpret this first result. This is the result of our first test. And in an integrated testing strategy we continue by choosing further test systems depending on these results, and we exit this loop as soon as we have accumulated enough results to fit a prediction model, which allows us to make a decision on the hazard of a given substance. This would then be subjected to some type of validation. We would assess the validity by using substances for which we know what the exact results should be, which means we are defining a point of reference. Typically, this is because we see that our results correlate with the results which we have obtained for substances for which we have good data. But we will also come back to the opportunity to actually use mechanistic information: asking not whether we reproduce, whether we correlate with, the results of an animal study, for example, but whether we reflect a mechanism which is relevant. That is another level of validity, which we will address in a second. So this is an attempt to define what an integrated testing strategy is. An integrated testing strategy is an algorithm to combine different tests and their results and, possibly, non-testing information. Non-testing information refers to things which we can do with a computer: existing data, and computer-based, in silico extrapolations from existing data or models. And we combine these two types of information to give a combined test result.
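To make the difference between a battery and a tiered or integrated strategy a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the assay names, the endpoint readings, and the branching scheme are all invented, the 30% cutoff simply reuses the cytotoxicity threshold mentioned earlier as an example, and the "prediction model" is reduced to taking the result of the last test that was run.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Assay:
    """One test: a measured endpoint plus an interpretation procedure."""
    name: str
    measure: Callable[[str], float]  # hypothetical: substance -> endpoint value
    threshold: float                 # above this value the test is called positive

    def run(self, substance: str) -> bool:
        return self.measure(substance) > self.threshold


@dataclass
class Step:
    """A decision point: which step follows a positive or a negative result."""
    assay: Assay
    if_positive: Optional[str] = None  # None means: exit the loop
    if_negative: Optional[str] = None


def run_battery(assays: List[Assay], substance: str) -> bool:
    """Battery: every test is run; positive if any single test is positive."""
    return any([a.run(substance) for a in assays])


def run_tiered(steps: Dict[str, Step], start: str, substance: str) -> Tuple[bool, List[str]]:
    """Tiered strategy: each result decides which test, if any, comes next.

    Simplification (an assumption of this sketch): the final call is the
    result of the last test run, standing in for a real prediction model.
    """
    path: List[str] = []
    current: Optional[str] = start
    result = False
    while current is not None:
        step = steps[current]
        result = step.assay.run(substance)
        path.append(step.assay.name)
        current = step.if_positive if result else step.if_negative
    return result, path


# Invented endpoint readings (think: % cytotoxicity) for two made-up substances.
READINGS = {
    "substance_A": {"test_1": 45.0, "test_2": 10.0, "test_3": 80.0},
    "substance_B": {"test_1": 5.0, "test_2": 50.0, "test_3": 0.0},
}


def lookup(assay_name: str) -> Callable[[str], float]:
    """Build a measurement function that reads from the invented table."""
    return lambda substance: READINGS[substance][assay_name]


assays = {n: Assay(n, lookup(n), threshold=30.0) for n in ("test_1", "test_2", "test_3")}

# Hypothetical branching: a negative in test 1 ends testing, a positive is
# followed up in test 2, and a negative in test 2 is challenged in test 3.
steps = {
    "test_1": Step(assays["test_1"], if_positive="test_2"),
    "test_2": Step(assays["test_2"], if_negative="test_3"),
    "test_3": Step(assays["test_3"]),
}

for name in READINGS:
    battery_call = run_battery(list(assays.values()), name)
    tiered_call, path = run_tiered(steps, "test_1", name)
    print(name, "battery:", battery_call, "tiered:", tiered_call, "tests run:", path)
```

In this toy example the tiered scheme runs all three tests for substance_A but stops after the first test for substance_B, whereas the battery always runs everything. That the two schemes give different calls for substance_B says nothing about which one is right; it only shows that they are different procedures.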
Integrated testing strategies will often have interim decision points, which first of all distinguishes them from a battery of tests, where all components are done and only then do we come to an analysis. These decision points determine whether a certain building block is considered or not in the testing strategy. So these are some of the basic concepts. As I said in the very beginning, the idea is that when we combine, let's say, two tests, 1+1, the predictive relevance should be considerably larger than two, so larger than what these tests give us on their own. However, we have one problem, and this is the applicability domain. What does this term mean? The applicability domain describes the part of the chemical universe I can apply my test to. So let's say some assays cannot handle volatile substances. Some assays cannot handle very acidic substances. Some cannot handle colored substances because they are measuring color, for example. The applicability domain of an assay describes this. The problem is that when I have two assays, I can only test those chemicals which are applicable to both assays. So it is only the agreement, the overlap between the two applicability domains, for which I can apply my testing strategy. This is a considerable problem, because it reduces the number of substances I can actually test. Another problem, which we will see towards the end of this lecture, is validation. It is already an enormous effort to validate one test. We are talking, typically, about ten years of work and more than $1 million spent to validate, ultimately, that one assay is valid and can replace an animal test. When we are validating a testing strategy, we have to do so for all of the different branches, all the combinations of test results. And this means that our validation effort strongly increases with the number of components in the testing strategy.
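Coming back to the applicability domain problem described above, the overlap can be pictured as a simple set intersection. The following is a small sketch in Python, assuming four made-up chemicals and two hypothetical assays, one that cannot handle volatile substances and one that cannot handle colored substances. None of these names or properties come from real data.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class Chemical:
    """A made-up chemical with the properties mentioned in the lecture."""
    name: str
    volatile: bool = False
    very_acidic: bool = False
    colored: bool = False


def domain(chemicals: List[Chemical], can_handle: Callable[[Chemical], bool]) -> Set[str]:
    """Applicability domain: names of the chemicals an assay can be applied to."""
    return {c.name for c in chemicals if can_handle(c)}


chemicals = [
    Chemical("plain"),
    Chemical("volatile_one", volatile=True),
    Chemical("acidic_one", very_acidic=True),
    Chemical("colored_one", colored=True),
]

# Hypothetical assay 1 cannot handle volatile substances.
domain_1 = domain(chemicals, lambda c: not c.volatile)
# Hypothetical assay 2 measures color, so it cannot handle colored substances.
domain_2 = domain(chemicals, lambda c: not c.colored)

# A strategy that needs both assays can only be applied to the overlap.
strategy_domain = domain_1 & domain_2

print("assay 1:", sorted(domain_1))
print("assay 2:", sorted(domain_2))
print("strategy:", sorted(strategy_domain))
```

Each assay on its own covers three of the four chemicals here, but the strategy that needs both covers only two, which is the shrinking of the testable chemical space mentioned above.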