Almost all IT systems need to store some data. Sometimes, it's a lot of data, sometimes, it's only bits and pieces of information. Cloud providers give us a lot of storage options. Picking the right solution for data storage will depend on what service you're building. You'll need to consider a bunch of factors, like; how much data you want to store, what kind of data that is, what geographical locations you'll be using it in, whether you're mostly writing or reading the data, how often the data changes, or what your budget is. This might sound like a lot of things to consider, but don't worry, it's not that bad. We'll check out some of the most common solutions offered by Cloud providers to give you a better idea of when to choose what. When choosing a storage solution in the Cloud, you might opt to go with the traditional storage technologies, like block storage, or you can choose newer technologies, like object or blob storage. Let's check out what each of these mean. As we saw in an earlier video, when we create a VM running in the Cloud, it has a local disk attached to it. These local disks are an example of block storage. This type of storage closely resembles the physical storage that you have on physical machines using physical hard drives. Block storage in the Cloud acts almost exactly like a hard drive. The operating system of the virtual machine will create and manage a file system on top of the block storage just as if it were a physical drive. There's a pretty cool difference though. These are virtual disks, so we can easily move the data around. For example, we can migrate the information on the disk to a different location, attach the same disk image to other machines, or create snapshots of the current state. All of this without having to ship a physical device from place to place. Our block storage can be either persistent or ephemeral. Persistent storage is used for instances that are long lived, and need to keep data across reboots and upgrades. On the flip side, ephemeral storage is used for instances that are only temporary, and only need to keep local data while they're running. Ephemeral storage is great for temporary files that your service needs to create while it's running, but you don't need to keep. This type of storage is especially common when using containers, but it can also be useful when dealing with virtual machines that only need to store data while they're running. In typical Cloud setups, each VM has one or more disks attached to the machine. The data on these disks is managed by the OS and can't be easily shared with other VMs. If you're looking to share data across instances, you might want to look into some shared file system solutions, that Cloud providers offer using the platform as a service model. When using these solutions, the data can be accessed through network file system protocols like NFS or CIFS. This lets you connect many different instances or containers to the same file system with no programming required. Block storage and shared file systems work fine when you're managing servers that need to access files. But if you're trying to deploy a Cloud app that needs to store application data, you'll probably need to look into other solutions like objects storage, which is also known as blob storage. Object storage lets you place in retrieve objects in a storage bucket. These objects are just generic files like photos or cat videos, encoded and stored on disk as binary data. These files are commonly called blobs, which comes from binary large object, and as we called out, these blobs are stored in locations known as buckets. Everything that you put into a storage bucket has a unique name. There's no file system. You place an object into storage with a name, and if you want that object back, you simply ask for it by name. To interact with an object store, you need to use an API or special utilities that can interact with the specific object store that you're using. On top of this, we've called out in earlier videos that most Cloud providers offer databases as a service. These come in two basic flavors, SQL and NoSQL. SQL databases, also known as relational, use the traditional database format and query language. Data is stored in tables with columns and rows that can be indexed, and we retrieve the data by writing SQL queries. A lot of existing applications already use this model, so it's typically chosen when migrating an existing application to the Cloud. NoSQL databases offer a lot of advantages related to scale. They're designed to be distributed across tons of machines and are super fast when retrieving results. But instead of a unified query language, we need to use a specific API provided by the database. This means that we might need to rewrite the portion of the application that accesses the DB. When deciding how to store your data, you'll also have to choose a storage class. Cloud providers typically offer different classes of storage at different prices. Variables like performance, availability, or how often the data is accessed will affect the monthly price. The performance of a storage solution is influenced by a number of factors, including throughput, IOPS, and latency. Let's check out what these mean. Throughput is the amount of data that you can read and write in a given amount of time. The throughput for reading and writing can be pretty different. For example, you could have a throughput of one gigabyte per second for reading and 100 megabytes per second for writing. IOPS or input/output operations per second measures how many reads or writes you can do in one second, no matter how much data you're accessing. Each read or write operation has some overhead. So there's a limit on how many you can do in a given second, and latency is the amount of time it takes to complete a read or write operation. This will take into account the impact of IOPS, throughput and the particulars of the specific service. Read latency is sometimes reported as the time it takes a storage system to start delivering data after a read request has been made, also known as time to first byte. While write latency is typically measured as the amount of time it takes for a write operation to complete. When choosing the storage class to use, you might come across terms like hot and cold. Hot data is accessed frequently and stored in hot storage while cold data is accessed infrequently, and stored in cold storage. These two storage types have different performance characteristics. For example, hot storage back ends are usually built using solid state disks, which are generally faster than the traditional spinning hard disks. So how do you choose between one and the other? Say you want to keep all the data you're service produces for five years, but you don't expect to regularly access data older than one year. You might choose to keep the last one year of data in hot storage so you have fast access to it, and after a year, you can move your data to cold storage where you can still get to it, but it will be slower and possibly costs more to access. There's a lot more to say about storage in the Cloud. It's really quite a hot topic, but we won't go into more detail here. We'll provide links to more info in the next reading in case you want to learn more. Up next, we're going to look at a different Cloud scale characteristic that we've already touched on using multiple machines for the same service.