Hello and welcome to our conversation about securing Big Data systems. When we think about Big Data, we have to be thinking about application vulnerabilities. How is data going to be collected, and are the applications that are providing the data going to be secured? If they are, great, and if not, what are we going to do to deal with that? And what are some of the architectural design elements around these systems that we have to be concerned with?
The concept of Big Data is one that's becoming very popular. It's been bandied about for a while. It's a term and a thought process that we talk about, but maybe not everybody understands exactly what we mean by it. When we think about Big Data, we don't just think about a lot of data, although we certainly mean that. What we really think about are the systems that are going to gather the data, manage the data, and interpret the data, and help us use the data to do a variety of things inside of our network or inside of our organization.
There's so much data being generated by systems today that, as individuals, it's really impossible for us to keep up with it. As a security practitioner, you may go and look at certain log files. But the reality is you're not going to be able to look at a lot of them, or look at them consistently, or look at them across multiple systems. This is going to prove to be a challenge for you. As a result, what Big Data systems help us to do is aggregate, collect, and centralize that data, and then use computers to chug away at analyzing it for us: looking for trends, looking for information, pulling in the things that we tell it are important, and then presenting them to us with reports, dashboards, and the things that will make it easier for us to consume. This is what Big Data really means.
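
To make the idea of aggregating and centralizing log data a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the host names, the log line format, and the `LOGIN_FAILED` event name are all hypothetical, and a real Big Data platform would do this at far larger scale with dedicated tooling.

```python
from collections import Counter

# Hypothetical log lines from three systems (illustrative data only).
# Assumed line format: "<date> <event> <user>"
LOGS = {
    "web01": ["2024-01-05 LOGIN_FAILED alice", "2024-01-05 LOGIN_OK bob"],
    "db01": ["2024-01-05 LOGIN_FAILED alice", "2024-01-05 LOGIN_FAILED alice"],
    "vpn01": ["2024-01-05 LOGIN_OK alice", "2024-01-05 LOGIN_FAILED carol"],
}


def aggregate(logs):
    """Centralize events from every system into one list of (host, event, user)."""
    events = []
    for host, lines in logs.items():
        for line in lines:
            _date, event, user = line.split()
            events.append((host, event, user))
    return events


def failed_logins_by_user(events):
    """Count failed logins per user across all systems."""
    return Counter(user for _host, event, user in events if event == "LOGIN_FAILED")


events = aggregate(LOGS)
report = failed_logins_by_user(events)

# A very small "dashboard": users ordered by number of failed logins.
for user, count in report.most_common():
    print(f"{user}: {count} failed logins")
```

Looking at any one of these hypothetical systems on its own, the failed logins for a single user might not stand out. Only after the events are centralized does the pattern across systems become visible, which is the point being made above.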
Turning Big Data towards the concepts of security can really change the nature and the approach of what we're able to do with regard to implementation of policy, with regard to structure and procedures, and with regard to confidentiality, integrity, and availability. We can do trend analysis on systems and access to them in ways that just escaped us before. We just couldn't see the trend, couldn't see the data and what it meant. But by using Big Data systems and the business intelligence capabilities that they can provide, we begin to see a new picture emerge of our networks, of our users, and of the system interaction that they generate as they are interacting with and using and manipulating and wondering about and examining data. And then how that data's going to cross our networks and what that means. So as we talk about these items in this area, and we start to define this terminology, we want to be thinking about these things.
The term Big Data, as you see on the screen in front of you here, refers to the massive amounts of digital information that companies and governments collect about all of us, all of our surroundings, everything we do. When you press a key, when you use a mouse, when you hit Send, there's data associated with everything we do. Keeping track of that data, understanding it, and using it to our advantage is really what Big Data's all about.
Interpreting Big Data, as I said, can be a bit of a challenge for us, right? What does it mean? There's so much of it, how do we really know? So for instance, one of the things that we want to look at, and we'll go to our trusty thought process here about demoing and looking at some stuff on the screen. I've created files before and I've shown you files. Let's just go in and take a look at this particular directory right here called Scripts. Let's just put this over here.
When we go in and take a look, we see that there is information about whether it may be shared or not. We see security information about who has access to it. We see whether there are previous versions, and indeed there are. We can go back and find those previous versions, perhaps restore them. We may be able to go in and customize information about it, change the icon, etc. There's a lot of information available to us in here.
When I go in and drill into Scripts, and I take a look here, and perhaps take a look at this all events document, I may see that there's additional information related to this. And the idea here is that there's so much data: when was this created, what kind of file is this? This is a PowerShell file. You'll look at the properties of it. There's additional information associated with it. When we start to drill down and take a look, we see so much data just around us everywhere.
Not just this kind of data, but also other kinds of data. So for instance, we can go to our desktop and take a look at this file right here, which is the CCM file, the Cloud Controls Matrix. And we can see that it's a different kind of file. It's a spreadsheet. So let's just open it up real quick and take a look. When we do that and we go to File, we may see that there's additional information in here about the data, the properties here, that may be of interest to us, right? Just show all properties. We can see how big the file is. There may be things like a title, tags, related dates, when the file was created. We could specify a manager, specify an author, last modified by. So I go in here and I add an author. And I add another author. And then I go ahead and add some tags in here: CCM, cloud, what else, CSA. I put those tags in there, and I add a category, and this will be cloud security.
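
The file properties explored in this part of the demo (name, type, size, dates) can also be read by a program rather than through a properties dialog. The following is a minimal Python sketch using only the standard library. The file name `all_events.ps1` and its contents are hypothetical stand-ins for the script shown on screen, and the sketch reports only the last-modified time because creation time is exposed differently across operating systems.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path):
    """Return a few pieces of file system metadata for one file."""
    p = Path(path)
    info = p.stat()
    return {
        "name": p.name,
        "type": p.suffix,  # file extension, e.g. ".ps1"
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


# Demo on a throwaway file so the sketch runs anywhere.
# The name and contents are placeholders, not the real script from the demo.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "all_events.ps1"
    sample.write_bytes(b"# placeholder\n")
    meta = describe_file(sample)
    print(meta)
```

Even for a tiny placeholder file, the operating system already keeps several facts about it. Multiply that by every file on every system and the volume of data being described here becomes clear.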
And you can see, as I go through and do all this, that when I save it, I then have all that information added in here, and I have that information available to me. That becomes what we call metadata. It's part of the data associated with this particular workbook. We may be able to see certain amounts of that data here. Other data, as we see, we have to go into the program to view, but the idea is that I've added to the amount of data.
By doing this, we are creating more and more data, and this is where Big Data comes from. It's just the aggregation of all this information, some of it really important and meaningful, some of it, eh, not so much. Nobody may care about who wrote the spreadsheet as an author or who is managing it, but they may care about the fact that the spreadsheet contains certain kinds of information. And so the tagging may become important. So the idea of interpreting Big Data is really about understanding the options we have and then looking at the data to see whether or not what we want to know is there. And if it is, pulling that information out and creating some sort of relevancy map that helps us to understand what it is.
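
One simple way to picture a relevancy map built from tags is an index from each tag to the documents that carry it. The Python sketch below is an illustration only. The document names, tags, and categories are hypothetical examples loosely modeled on the demo, and the `build_tag_index` and `find` helpers are made up for this sketch rather than taken from any product.

```python
from collections import defaultdict

# Hypothetical document metadata (names, tags, and categories are examples).
DOCS = [
    {"name": "CCM.xlsx", "tags": ["CCM", "cloud", "CSA"], "category": "cloud security"},
    {"name": "AllEvents.ps1", "tags": ["logs", "powershell"], "category": "scripts"},
    {"name": "CloudPolicy.docx", "tags": ["cloud", "policy"], "category": "cloud security"},
]


def build_tag_index(docs):
    """Map each tag (lowercased) to the set of document names that carry it."""
    index = defaultdict(set)
    for doc in docs:
        for tag in doc["tags"]:
            index[tag.lower()].add(doc["name"])
    return index


def find(index, *tags):
    """Return the sorted names of documents that carry every requested tag."""
    sets = [index.get(tag.lower(), set()) for tag in tags]
    return sorted(set.intersection(*sets)) if sets else []


index = build_tag_index(DOCS)
print(find(index, "cloud"))         # documents tagged "cloud"
print(find(index, "cloud", "CSA"))  # documents tagged both "cloud" and "CSA"
```

The author and manager fields are ignored here on purpose. As noted above, not all metadata is equally useful, and the tags are what let us pull the relevant documents out of the pile.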