- [Seph] Hey, welcome back. Throughout this class, you've heard us refer to indexes several times. We've mentioned how indexes can help with your queries and with your base table's throughput capacity. What we're going to do now is focus specifically on what secondary indexes are. Hopefully, with that information, you'll be able to think of some ways you can start using them to improve performance. In general, a secondary index is a data structure that contains a subset of attributes from a base table, along with an alternate key to support your query operations.
One of the secondary index types that you have is the local secondary index. You'll commonly see this listed in the documentation as LSI. An LSI is a reorganization of data where the partition key stays the same and the sort key is chosen from the other attributes. Early on in this class, we discussed how using the partition key to query your data is the best and most straightforward way to work with queries. We also mentioned how sort keys give you more organizational power and let you add more specificity to your queries. So even though many requests may only need to query data using the base table's partition and sort keys, there may be situations where an alternate sort key would be helpful. Situations like this occur when you need to pull items based on attributes other than the current partition or sort key.
You could use a scan to perform these operations, but a scan is a more resource-intensive operation. Because a scan reads every item in the table, there is additional latency to consider, especially for larger tables. Also, the size of the returned result set can be much larger than intended, and this starts to eat into your provisioned throughput capacity.
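To make the scan cost concrete, here is a minimal in-memory sketch in Python. It is a toy model and not DynamoDB itself: the `orders` items, the attribute names, and the read counters are all made up for illustration. It compares a scan-style lookup, which touches every item, with a lookup against a structure organized by the same partition key and an alternate sort key.

```python
from collections import defaultdict

# Hypothetical base table items: partition key "customer", sort key "order_id".
orders = [
    {"customer": "alice", "order_id": 1, "status": "shipped"},
    {"customer": "alice", "order_id": 2, "status": "pending"},
    {"customer": "bob", "order_id": 3, "status": "shipped"},
    {"customer": "alice", "order_id": 4, "status": "shipped"},
    {"customer": "carol", "order_id": 5, "status": "pending"},
]


def scan_for(items, customer, status):
    """Scan-style lookup: touch every item, then filter."""
    items_read = 0
    matches = []
    for item in items:
        items_read += 1
        if item["customer"] == customer and item["status"] == status:
            matches.append(item)
    return matches, items_read


def build_index(items):
    """Toy index: same partition key, alternate sort key ("status")."""
    index = defaultdict(lambda: defaultdict(list))
    for item in items:
        index[item["customer"]][item["status"]].append(item)
    return index


def query_index(index, customer, status):
    """Index-style lookup: touch only the items stored under the requested keys."""
    matches = index.get(customer, {}).get(status, [])
    return list(matches), len(matches)


scan_matches, scan_reads = scan_for(orders, "alice", "shipped")
index_matches, index_reads = query_index(build_index(orders), "alice", "shipped")
print(scan_reads, index_reads)  # prints "5 2"
```

Both lookups return the same two items, but the scan-style version had to read all five items to find them. Note also that two items share the same partition key and the same alternate sort key value here, which the toy index allows.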
If, instead of a scan, you were able to use an LSI, you could make the request to the index to pull the items you were looking for. LSIs also don't have one of the limitations that exists on the base table. In the base table, every item has to have a unique combination of partition and sort key values. In the LSI, the sort key value does not need to be unique for a given partition key. If there are multiple items in the local secondary index that have the same sort key value, a query operation returns all of those items for the partition key you specify. Furthermore, you have the option to query the LSI with either eventually or strongly consistent reads. You specify this using the ConsistentRead parameter of the Query operation.
To create one or more local secondary indexes on a table, use the LocalSecondaryIndexes parameter of the CreateTable operation. Local secondary indexes are created when the table is created and cannot be added later. This is important to note so that you can plan ahead with your table creation and data modeling. When you delete a table, any local secondary indexes on that table are also deleted. Essentially, LSIs are fully tied to the table from creation to deletion.
The other type of secondary index is a little different. These are called global secondary indexes, or GSIs, and they are a bit more independent than LSIs. A GSI is used when you need to reorganize your data by a different partition key, not just a different sort key. A GSI can be created at table creation or later. This time, you use the GlobalSecondaryIndexes parameter of the CreateTable operation when creating a new table, or the GlobalSecondaryIndexUpdates parameter of the UpdateTable operation when adding one to an existing table. You then specify the partition key, and optionally a sort key, for the GSI, and DynamoDB will begin to allocate resources and then backfill the index with the reorganized data from the base table.
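The two creation paths described above can be sketched as plain request parameters. This is a minimal sketch, not a definitive setup: the table name `Orders`, the attribute names, the index names, and the capacity numbers are hypothetical. The dictionaries follow the request shapes of the DynamoDB CreateTable and UpdateTable operations, and nothing here calls AWS.

```python
def create_table_params():
    """CreateTable parameters for a hypothetical Orders table with one LSI."""
    return {
        "TableName": "Orders",  # hypothetical table name
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},  # partition key
            {"AttributeName": "OrderId", "KeyType": "RANGE"},    # sort key
        ],
        # LSIs can only be declared here, at table creation.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "OrderDateIndex",  # hypothetical index name
                "KeySchema": [
                    # Same partition key as the table, alternate sort key.
                    {"AttributeName": "CustomerId", "KeyType": "HASH"},
                    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }


def add_gsi_params():
    """UpdateTable parameters that add a GSI to the existing Orders table."""
    return {
        "TableName": "Orders",
        # Only the attributes used in the new index's key schema are declared.
        "AttributeDefinitions": [
            {"AttributeName": "OrderStatus", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": "OrderStatusIndex",  # hypothetical index name
                    "KeySchema": [
                        # A different partition key from the base table.
                        {"AttributeName": "OrderStatus", "KeyType": "HASH"},
                        {"AttributeName": "OrderDate", "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # Capacity for the index on a provisioned mode table.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 5,
                        "WriteCapacityUnits": 5,
                    },
                }
            }
        ],
    }
```

Assuming boto3 is installed and credentials are configured, these dictionaries could be passed as `boto3.client("dynamodb").create_table(**create_table_params())` and `boto3.client("dynamodb").update_table(**add_gsi_params())`.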
In terms of data replication, DynamoDB automatically handles synchronization of each global secondary index with its base table. When the application writes or deletes items in a table, any GSIs associated with that table are updated asynchronously using an eventually consistent model. Applications never write directly to the indexes. They continue to write to the base table, and the index is updated for them. When you create a global secondary index on a provisioned mode table, you must specify read and write capacity units for the expected workload on that index. The provisioned throughput settings of a global secondary index are separate from those of its base table. A query operation on a global secondary index consumes read capacity units from the index and not from the base table. And when you put, update, or delete items in a table, the GSIs on that table are also updated, and these index updates consume write capacity units from the index, again not from the base table. Reads that target an LSI or GSI need to reference both the table and the desired index, and both index types support querying and scanning.
A few other things about secondary indexes before we close this video out. One, use your indexes efficiently and keep the number to a minimum. Don't create secondary indexes on attributes that you don't often query. Indexes that are seldom used contribute to increased storage and throughput usage without improving application performance. Two, because secondary indexes consume storage and provisioned throughput, you should keep the size of the indexes as small as possible. Also, the smaller the index, the greater the performance advantage compared to querying the full table. If your query usually returns only a small subset of attributes, and the total size of those attributes is much smaller than the whole item, project only the attributes that you regularly request.
And three, to optimize frequent queries with the lowest possible latency, project all the attributes that you expect those queries to return. In particular, if you query a local secondary index for attributes that are not projected, DynamoDB automatically fetches those attributes from the table, which requires reading the entire item from the table. That adds latency and extra I/O that you can avoid.
Well, that's all for our review of LSIs and GSIs. Be sure to look at the AWS documentation for more information on how these can be used and for other best practices. I'll see you later.
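The index reads and the projection advice covered in this video can be sketched as request parameters. This is a hedged illustration only: the helper names, the `Orders` table, the index names, and the attribute names are hypothetical. The dictionaries follow the shape of the DynamoDB Query operation and of an index's Projection setting, and nothing here calls AWS.

```python
def query_index_params(index_name, key_name, key_value,
                       consistent=False, is_global=False):
    """Build Query parameters that target a secondary index (hypothetical helper)."""
    # GSIs are updated asynchronously, so they only offer eventually
    # consistent reads. Strongly consistent reads are an LSI option.
    if consistent and is_global:
        raise ValueError("GSIs support eventually consistent reads only")
    return {
        "TableName": "Orders",    # reads reference the base table...
        "IndexName": index_name,  # ...and the index to read from
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": key_name},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
        "ConsistentRead": consistent,
    }


def include_projection(attribute_names):
    """Projection that copies only the named non-key attributes into an index."""
    return {
        "ProjectionType": "INCLUDE",
        "NonKeyAttributes": list(attribute_names),
    }


# Strongly consistent read against the hypothetical LSI.
lsi_read = query_index_params("OrderDateIndex", "CustomerId", "alice",
                              consistent=True)

# Eventually consistent read against the hypothetical GSI.
gsi_read = query_index_params("OrderStatusIndex", "OrderStatus", "shipped",
                              is_global=True)

# Keep the index small: project only what the frequent query returns.
small_projection = include_projection(["Total", "ShipDate"])
```

Assuming boto3 is installed and credentials are configured, a dictionary such as `lsi_read` could be passed as `boto3.client("dynamodb").query(**lsi_read)`. The `small_projection` dictionary would go in the `Projection` field of an index definition when the index is created.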