Hello and welcome to this course, where we're talking about Python for collection. In this video, we're going to talk about e-mail collection: how an attacker can gain access to valuable data from e-mail, and how we can use Python code to accomplish that.

If you think about it, the data that you put in e-mails every day could be a very valuable source of data for an attacker. There's probably a variety of personally identifiable information, or PII, contained within an e-mail. You might be sending credit card data, account information, medical data, etc. At the very minimum, there's a good chance that you've e-mailed a picture to someone before, like one taken at a backyard barbecue, pool party, or birthday party at the house. Pictures often have geographic metadata embedded in them, so the picture alone can reveal the exact location where it was taken. It also includes date stamps. If you take a picture of a kid with a happy birthday sign over their head, someone who looks at the picture can learn where you live and that kid's birthday from the metadata. There's lots of PII in those photographs if you haven't disabled that metadata in your camera.

Additionally, e-mails are a source of intellectual property. Internal e-mails within an organization probably provide a wealth of data regarding internal projects, research and development, marketing efforts, etc. The ability to read through those e-mails can give an attacker a lot of insight into how an organization works, and potentially access to some trade secrets and other intellectual property.

Thirdly, e-mail is a database of information about the relationships between employees within the workforce. If you're reading an e-mail, it's pretty obvious who's in charge of whom, what the relative hierarchy is, who commonly corresponds with this employee, etc.
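To make the relationship point concrete, here is a minimal sketch of how message headers alone show who corresponds with whom. It is not the course's own code. It uses only Python's standard `email` package, and the sample messages, the addresses, and the `correspondence_counts` helper are hypothetical names made up for illustration.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses

# Hypothetical sample messages, invented for this illustration.
SAMPLE_MESSAGES = [
    "From: Tom <tom@example.com>\n"
    "To: Nancy <nancy@example.com>\n"
    "Subject: Project X status\n\nDraft attached.\n",
    "From: Tom <tom@example.com>\n"
    "To: Nancy <nancy@example.com>\n"
    "Cc: Raj <raj@example.com>\n"
    "Subject: Project X budget\n\nNumbers inside.\n",
    "From: Nancy <nancy@example.com>\n"
    "To: Tom <tom@example.com>\n"
    "Subject: Re: Project X status\n\nLooks good.\n",
]


def correspondence_counts(raw_messages):
    """Count how often each (sender, recipient) address pair appears."""
    pairs = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # getaddresses() turns header values into (display name, address) tuples.
        senders = [addr.lower() for _, addr
                   in getaddresses(msg.get_all("From", [])) if addr]
        recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
        recipients = [addr.lower() for _, addr
                      in getaddresses(recipient_headers) if addr]
        for sender in senders:
            for recipient in recipients:
                pairs[(sender, recipient)] += 1
    return pairs


if __name__ == "__main__":
    counts = correspondence_counts(SAMPLE_MESSAGES)
    for (sender, recipient), n in counts.most_common():
        print(f"{sender} -> {recipient}: {n}")
```

Even this simple tally shows which pairs of people write to each other most often, without reading a single message body.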
You can also learn tone of voice, common topics of conversation, etc. Beyond the potential for leaks of intellectual property, you're looking at potential information for spear phishing attacks. If you know that Tom and Nancy often discuss project X, then an e-mail that looks like it's coming from Tom to Nancy, references project X, and includes an attachment or a link has a higher probability of being trusted and opened, which gives an attacker the access they need to Nancy's system.

E-mail is obviously a valuable target for an attacker. How do we get at it? Well, e-mail data can be stored in a few different locations, and some of these may be more or less accessible to an attacker. Many organizations use Software as a Service webmail, so things like Gmail, Microsoft 365, etc. This means that one copy of a user's e-mail is stored online with the provider. That makes it both more and less accessible. Since it's on the Internet, it can be accessed from anywhere, which in theory makes it easy to reach. In practice, however, the e-mail there is protected by the user's password, so the security of the e-mail depends on the strength of that password.

In contrast, a local e-mail cache, like the OST or PST files stored by Outlook, sits on a slightly more protected system. To gain access to files like that, you probably need access to an employee's computer. However, those e-mails are probably stored unencrypted, meaning that if you can gain access to the computer, you have access to all of the cached e-mail data stored locally on the machine.

This is where we're going to explore the use of Python for e-mail collection: we'll look at Python's ability to access those local caches of e-mail and extract potentially useful information from them, which can be used for data breaches or to further an attack by using the collected data for spear phishing. Thank you.