- [Morgan] This is a walkthrough for Exercise 1, "Creating an Amazon OpenSearch Service Cluster." The context for this exercise is that you're working at a company that measures the temperature of water in lakes, ponds, and streams around your area. You have sensors taking these measurements and sending data every 15 minutes, and we want to collect this data, put it in a data lake, and then be able to search, index, and catalog it. We're going to be using services like API Gateway to ingest the data, a Lambda function, Amazon S3, Amazon OpenSearch Service, and a couple more. Before we dive into the exercise, let's first create the IAM role that we're going to need. To do that, navigate to the IAM service and select "IAM." From here, click "Roles" in the left-hand navigation, then click "Create role." Under "Use case," select "Lambda" and then "Next." This is going to be the execution role that the AWS Lambda function uses. Now we need to attach some permissions. First, I want to attach a permission that allows full access to Amazon S3, and then a permission called "AmazonESFullAccess," which is for Amazon OpenSearch Service. Then I'll click "Next," and we need to give the role a name. For this one, let's call it "data-lake-week-2." We can see the "trusted entities," which is going to be a Lambda function, and the policies that we've attached, and I'll go ahead and click "Create role." Now we can see that the role has been created. So next, what we want to do is create an Amazon OpenSearch Service cluster.
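The console wires all of this up for us, but if you wanted to script the same role with boto3, it would look roughly like this sketch. The trust policy and the two managed policy ARNs match what we selected above; the `create_execution_role` helper is only illustrative and would need valid AWS credentials to actually run.

```python
import json

# Trust policy that lets the Lambda service assume this role
# (this is what choosing "Use case: Lambda" sets up for you).
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# The two AWS managed policies we attached in the console.
managed_policy_arns = [
    "arn:aws:iam::aws:policy/AmazonS3FullAccess",
    "arn:aws:iam::aws:policy/AmazonESFullAccess",
]

def create_execution_role(iam_client, role_name="data-lake-week-2"):
    """Create the role and attach both managed policies (needs AWS credentials)."""
    iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    for arn in managed_policy_arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```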
So you're going to use OpenSearch Service for cataloging and indexing the water temperature data. Before you can start ingesting documents into your OpenSearch Service domain, you have to create it. You could also use AWS Glue for cataloging your data, and we'll explore that more in a future lesson, but for now, let's navigate to OpenSearch Service, select "Amazon OpenSearch Service," and click "Create domain." Let's give the domain a name, "water-temp-domain." Scrolling down, under "Deployment type" we'll click "Development and testing," and then we want to make sure we have the latest version selected, which we do. If we keep scrolling down, for "Data nodes" I want to change the "Instance type" to "t3.small.search." You can see it gives you a little warning that this is only suitable for testing and development purposes, which is what we're doing. We're just exploring and learning, so that's fine for us. Then we can scroll down some more, and under "Network" I want to change this to "Public access." Scrolling down further, I'm going to disable "fine-grained access control," and under "Access policy" I want to choose "Configure domain level access policy." This access policy is essentially a resource policy for the domain that controls whether a request is accepted or rejected once it reaches the OpenSearch Service domain. You could allow an IAM ARN here: for example, if you have an IAM role that's trying to send a request to OpenSearch Service, you would need to include that role here. But what we want to do instead is select an IPv4 address. For this, you would navigate to "whatismyip.com," and that will give you your IP address.
Then you will replace the asterisk (*) placeholder in the IP address field with your IP address and change the action from "Deny" to "Allow." This will allow your IP to post to this domain. If you click on "JSON" here, you can also see the JSON policy. We want to "Allow" all OpenSearch "Actions" against this "water-temp-domain," as long as, under the "Condition," the "IpAddress" matches my IP address. Now we can scroll down and click "Create." Please note that it can take up to 15 minutes for the OpenSearch Service domain to be created, so we'll move on to the next task for now, which is creating an S3 bucket, and come back to this in a little bit to see how it's doing. Let's navigate to S3 and click "Create bucket." Our bucket name needs to be globally unique and DNS-compliant, so I'm going to type "datalakes-week2," and you would need to make sure that you're using something unique here. I'm going to add my initials and the year, and hopefully that's unique enough; we will see. Then make sure the "Region" is the region you're trying to use, which for me is "us-east-1," and that is also the "Region" in the instructions. Then we can scroll down and click "Create bucket." We want to remember this bucket name, because we will be using it in future steps. All right, so now we have an S3 bucket that will be the storage layer for our data lake. In a later task, we're going to modify the bucket access policy so that the Lambda function can actually write files to this bucket. We'll learn about bucket access policies in future lessons, but essentially, a bucket policy is another resource policy that controls whether requests are accepted or denied based on the principal.
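Expressed as a Python structure, the IP-based access policy described above would look roughly like this sketch. The account ID and IP address below are placeholders (an example account and a documentation-range IP), not real values; you would substitute your own from the console.

```python
# A sketch of the IP-based domain access policy: allow all OpenSearch
# actions on "water-temp-domain", but only from one source IP.
# The account ID (111122223333) and IP (203.0.113.25) are placeholders.
ip_access_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": "es:*",
        "Resource": "arn:aws:es:us-east-1:111122223333:domain/water-temp-domain/*",
        "Condition": {"IpAddress": {"aws:SourceIp": ["203.0.113.25"]}},
    }],
}
```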
So again, a principal can be something like the ARN of an IAM role, an IP address, an account, or an IAM user. We'll go ahead and edit that later. What we want to do right now is create the Lambda function, so let's navigate to the Lambda service and select "Lambda." Lambda is a serverless compute service: we upload our function code, and the function runs in response to incoming requests. This Lambda function will capture the data on the payload of the incoming request and upload a JSON document to S3. The Lambda function will then upload the document to OpenSearch Service as well. We're going to choose "Create function" and select "Author from scratch." Under "Function name," I'll put "upload-data," and then I'm going to select the latest version of Python, which is "Python 3.9" right now. Under "Permissions," we can expand "Change default execution role," choose "Use an existing role," and select the role that we created earlier, "data-lake-week-2." Then we can click "Create function." Now the Lambda function code needs to be uploaded, so we'll scroll down to the "Code source" tab, click "Upload from," and select ".zip file." Now we can upload the zip file that we downloaded in the first step. I've selected the "upload-data" zip file and I'll go ahead and click "Save." Now that that has been saved, we need to scroll down to "Runtime settings," which is right here, click "Edit," and change the "Handler."
So the "Handler" is where function execution begins, and we need to make sure that the "Handler" we define here matches the handler name in the code, so that Lambda knows where to invoke the function. I'm going to change it so that it matches, which is going to be "lambda.handler," and then I'll click "Save." Then we want to choose the "Configuration" tab and select "Environment variables." We need to click "Edit" and add a new environment variable: the S3 bucket that our code is going to upload to. The S3 bucket we just created is brand new, and the code is pre-written, so the bucket name isn't hard-coded. Instead, we set it as an environment variable, and the code retrieves it from there. The key will be "S3_BUCKET," in all capital letters, and the "Value" should be the name of your bucket, which for me is "datalakes-week2-mw2022." Then we'll click "Save." Now we need to go to the "Permissions" tab, scroll down, and for the "Role," choose this link, which will open in a new tab. I'm going to copy the "ARN" for this role. The reason is that the code we just uploaded writes to S3 and to Amazon OpenSearch Service, so I need to make sure we can actually make those calls. Giving the permissions to the Lambda function isn't enough by itself; we also need to make sure that the S3 bucket policy will allow that call to be made. So I'm going to copy this "ARN" and paste it into a notepad that I have over on the side here. Then we'll go back to the Lambda console, scroll up, and type in "S3" so that we can go modify that bucket access policy.
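We can't see inside the provided zip file in this walkthrough, but a minimal handler consistent with the "lambda.handler" setting (a file named lambda.py containing a function named handler) that reads the S3_BUCKET environment variable might look like this sketch. The object key format and return shape here are assumptions for illustration, not the actual exercise code.

```python
# Hypothetical sketch of the "upload-data" function.
# File: lambda.py  ->  matches the Handler setting "lambda.handler".
import json
import os

def handler(event, context):
    # The bucket name comes from the environment variable we just
    # configured, not from anything hard-coded in the source.
    bucket = os.environ["S3_BUCKET"]
    # A proxy-style event carries the payload as a JSON string in "body".
    document = json.loads(event["body"]) if "body" in event else event
    # Hypothetical key format built from fields on the sensor payload.
    key = f"{document['sensorID']}-{document['timestamp']}.json"
    # In the real function, boto3 would put the JSON document into S3 here,
    # and an HTTP request would index it into the OpenSearch Service domain.
    return {"statusCode": 200, "body": json.dumps({"bucket": bucket, "key": key})}
```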
Okay, so now we will select the bucket we created, click on the "Permissions" tab, scroll down to the "Bucket policy," and click "Edit." Here, you would paste in the provided JSON code from the instructions, which is going to allow a principal to perform any S3 action against a resource. For the "Resource," you want to copy the "Bucket ARN" from the top there and paste it in, just like this. And the "ARN" for the "Principal" that needs to be filled in is the Lambda execution role, so I'm going to delete the brackets and the "fill me in" placeholder and paste in the role ARN. This is what the policy looks like: we are saying "Allow" all "s3" actions. Against what S3 resource? This data lakes bucket specifically. And for what principal? The execution role of the Lambda function. I'll click "Save changes" here. Now we can return to our OpenSearch Service domain and see if it's done. Let's go ahead and click on our domain; we can see that the status is still loading, so we'll come back when it's done. All right, we're back, and we can see that we have an active domain. I want to click on the domain, and note how we now have a domain endpoint; that's going to become important later. For now, click "Actions." We need to edit the security configuration so that our Lambda function can actually write to this domain to record those water temperatures. So I'll click "Edit security configuration," scroll down to the "Access policy," and click on the "JSON" tab. I'm going to highlight this JSON policy, delete it, and paste in the JSON from the lab instructions. Then I want to paste back in some important pieces. First, I want to make sure that the "ARN" for this domain is pointing to my account.
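As a quick aside, here's a sketch of what that finished bucket policy looks like as a Python structure. The account ID, bucket name, and role name are placeholders for your own values, and the lab's exact JSON isn't reproduced here; note that object-level actions like PutObject apply to the "/*" object ARN form, which is why both forms appear below.

```python
# Sketch of the bucket policy saved above: any S3 action on this bucket,
# but only for the Lambda execution role as principal. Account ID, bucket
# name, and role name are placeholders.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:role/data-lake-week-2"},
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::datalakes-week2-mw2022",    # the bucket itself
            "arn:aws:s3:::datalakes-week2-mw2022/*",  # objects in the bucket
        ],
    }],
}
```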
Then I want to put back in my IP address, so I'll go ahead and do that. The last thing I need to do is paste in the ARN of that Lambda function's execution role, which I also have in my text editor on a different screen, so I'll copy that and paste it over here. So now we have a JSON policy that will allow actions against this domain as long as the request is coming from this IP address, and we have another statement that will allow actions against this domain as long as the principal is that Lambda function's execution role. Now we can scroll down and click "Save changes," and the permissions on the domain have been updated. The next thing we need to do is modify the Lambda function to add an environment variable that points the code to this domain. I'm going to copy this domain endpoint and make a note of it in my tab over here. Then I'll navigate to the Lambda service: click "Lambda," click on my "upload-data" function, click the "Configuration" tab, click "Environment variables" on the left-hand side, and click "Edit." I want to click "Add environment variable" and add one with the key "ES_DOMAIN_URL," and then paste in the domain URL that I noted in the last step. Oops, pasted it in the wrong spot; let me go ahead and fix that. Okay, so we paste that there and click "Save." All right. Now what we need to do is create an API Gateway API that will forward requests to this Lambda function. So we have API Gateway sending to Lambda, and Lambda sending to S3 and OpenSearch Service; we need that API Gateway piece next. We'll navigate to API Gateway, and I want to create a new API, so I'll select "REST API" and click "Build." If you need to, you might also have to close some dialog boxes about creating your first API.
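Putting the two statements together, the final domain access policy described above can be sketched like this. As before, the account ID and IP address are placeholders (an example account and a documentation-range IP) standing in for your own values.

```python
# Sketch of the final domain access policy: one statement allowing the
# operator's IP address, one allowing the Lambda execution role.
# Account ID, role name, and IP address below are placeholders.
domain_arn = "arn:aws:es:us-east-1:111122223333:domain/water-temp-domain/*"
access_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Statement 1: allow requests from one source IP.
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "es:*",
            "Resource": domain_arn,
            "Condition": {"IpAddress": {"aws:SourceIp": ["203.0.113.25"]}},
        },
        {
            # Statement 2: allow the Lambda function's execution role.
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/data-lake-week-2"},
            "Action": "es:*",
            "Resource": domain_arn,
        },
    ],
}
```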
I didn't have those come up, so I don't need to close them, but if you get any dialogs, just go ahead and close out of them. Now I want to give this API a name; I'll name it "sensor-data" and click "Create API." Next, I need to create my methods and resources, so I'll click "Actions" and then "Create Method." From the dropdown, I'll select "POST," because we're going to be accepting POSTs from those sensors: they're going to use an HTTP POST to send their data. At this point, we need to set up some details for this POST. The "Integration type" is going to be "Lambda Function," because API Gateway will be sending information to the Lambda function. Under "Lambda Function," we need to select the correct one, so if you start typing "upload-data," we can select it there, then click "Save." A dialog asks to "Add Permission to Lambda Function": we're going to give API Gateway permission to invoke the Lambda function, so we'll go ahead and say "OK" to that. Now that this POST method has been created, we're in the Method Execution pane. I'll click the "TEST" button, scroll down to the "Request Body," and paste in some data. This mocks up the test data, so you can imagine it coming from an outside sensor, but we're pasting it right in here so that we can test directly from the API Gateway console. I'll go ahead and click "Test." Now, if we look at the logs on the right-hand side, we can scroll down and see "Sending request" to the Lambda function, and then we can see that we got a "200" status back from that Lambda function and successfully completed the execution. So it looks like this worked. Next, we need to check the OpenSearch Service domain for this data.
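Outside the console, a sensor's POST and the search URL used to check the domain could be sketched like this. The invoke URL and the reading's values are made-up placeholders (the exercise tests from the console rather than a deployed API URL), but the field names match what we see come back from the domain, and the index path matches the query template from the instructions.

```python
import json
import urllib.request

# Hypothetical sensor reading; the field names (sensorID, timestamp,
# temperature) match what the domain returns when we query it.
reading = {
    "sensorID": "sensor-17",
    "timestamp": "2022-03-01T09:15:00Z",
    "temperature": 54.3,
}

def post_reading(invoke_url, reading):
    """POST one reading, the way a real sensor would (needs a deployed API URL)."""
    req = urllib.request.Request(
        invoke_url,
        data=json.dumps(reading).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 200 means the request reached the Lambda function

def build_search_url(domain_endpoint):
    """Build the browser query URL used to check the indexed documents."""
    return f"{domain_endpoint}/lambda-s3-index/lambda-type/_search?pretty"
```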
So, what we need to do here is query the domain. In the instructions, it gives you a template for the URL pattern that you enter into the browser to query your domain. I want to show you what it looks like after I did my query, so I'll bring this over to this screen. You can see I have my domain, this "search-water-temp-domain" endpoint, followed by "/lambda-s3-index/lambda-type/_search?pretty," and this is what came back once I made that request. We can see the "sensorID," the "timestamp," and the "temperature," which, if we come back here, matches what we entered into our "Request Body" when testing through API Gateway. In the real world, your data probably wouldn't originate from API Gateway; it would come from a fleet of sensors and devices. But again, we've generated things through API Gateway as if they were coming from outside. And by creating the OpenSearch Service domain, you also now have access to Kibana, where you can then go and further explore, view, and analyze the data that was loaded. From here, we need to go ahead and clean everything up. To do that, navigate to IAM, click "IAM," and delete the role first: click "data-lake-week-2," click "Delete," and confirm with "Delete." Next, we need to delete the OpenSearch Service domain, so navigate to OpenSearch, select "OpenSearch," go to "All domains," select the domain, choose to delete it, type in the name "water-temp-domain" to confirm, and click "Delete." Then we want to navigate to S3.
And for S3, we want to delete our bucket here, so I'll select it and click "Delete." Oh, we need to empty the bucket first; let's go ahead and do that. We can see the JSON object that the Lambda function wrote, so we'll select everything, type "permanently delete," and click "Delete objects." Now we can go back to "Buckets," select the bucket, and click "Delete." To confirm, I'm just going to copy and paste the bucket name into the delete field. Then we need to delete the Lambda function: navigate to Lambda, select "upload-data," click "Actions," click "Delete," type "delete" to confirm, click "Delete," and exit the dialog. And now finally, API Gateway; we can delete that one as well. Click "API Gateway," select the "sensor-data" API, select "Actions," select "Delete," and click "Delete." And that is it: we are totally cleaned up from this lab. All right, thank you. See you next time.
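The teardown steps above can be summarized, and the one snag we hit (a bucket must be emptied before it can be deleted) can be sketched with boto3 like this. The `empty_and_delete_bucket` helper is illustrative and assumes you pass it a boto3 S3 resource with valid credentials.

```python
# The cleanup order used in this walkthrough.
cleanup_order = [
    "IAM role: data-lake-week-2",
    "OpenSearch Service domain: water-temp-domain",
    "S3 bucket: empty it first, then delete it",
    "Lambda function: upload-data",
    "API Gateway API: sensor-data",
]

def empty_and_delete_bucket(s3_resource, bucket_name):
    """Delete every object first; a non-empty bucket cannot be deleted."""
    bucket = s3_resource.Bucket(bucket_name)
    bucket.objects.all().delete()  # batch-delete all objects in the bucket
    bucket.delete()
```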