We’ve recently encountered an issue with a few of our Alfresco clients that are using Microsoft SQL Server as their backing database.
Over time, it appears that indexes on Alfresco tables can become heavily fragmented. Generally speaking, fragmentation of database indexes is completely normal and expected. However, the level of fragmentation we’ve been noticing is unusually high and tends to have a noticeable impact on performance.
Unfortunately, we have not identified a cause for this fragmentation (through our own investigation as well as through working with Alfresco Support). However, we are able to execute a SQL query that can help determine the level of fragmentation, allowing us to take action and help improve overall performance of the repository.
The following query can be executed against a SQL Server database:
SELECT object_name(IPS.object_id) AS [TableName], SI.name AS [IndexName], IPS.Index_type_desc, IPS.avg_fragmentation_in_percent, IPS.avg_fragment_size_in_pages, IPS.avg_page_space_used_in_percent, IPS.record_count, IPS.ghost_record_count, IPS.fragment_count, IPS.avg_fragment_size_in_pages FROM sys.dm_db_index_physical_stats(db_id(N'<YOUR_DATABASE_NAME;>'), NULL, NULL, NULL , 'DETAILED') IPS JOIN sys.tables ST WITH (nolock) ON IPS.object_id = ST.object_id JOIN sys.indexes SI WITH (nolock) ON IPS.object_id = SI.object_id AND IPS.index_id = SI.index_id WHERE ST.is_ms_shipped = 0 ORDER BY 4 DESC GO
The above query will utilize a built-in SQL Server function that provides detailed index statistics from the database (“sys.dm_db_index_physical_stats”). In order to use this query, replace “
<YOUR_DATABASE_NAME>” with the name of the database in question. The output of this query is a list of all indexes from all tables in the database, sorted by the level of fragmentation. An example is below:
SQL Server Index Fragmentation – Sample Query Output
Notice the “avg_fragmentation_in_percent” column. Determining which indexes to clean up will be largely based on the value of this column for each index. In general terms, you should look to rebuild any indexes that are fragmented more than 30%. However, if the index in question is only on a table of a few rows, this may be overkill. Use the “record_count” column to help determine whether or not the number of rows in the table is worthy of requiring an index rebuild.
Microsoft provides some general guidelines on when you should consider rebuilding vs. reorganizing indexes. You can find those guidelines here:
Rebuilding an index is generally going to be a more resource intensive operation than reorganizing and may impact performance during the actual rebuild process. So, take into consideration peak usage times when planning to rebuild indexes and try to do this when usage of the system is low. Note: it is not necessary to take Alfresco down when rebuilding an index. SQL Server is smart enough to keep the “live” index in place until the rebuild has completed after which point it will swap in the new index and remove the existing, fragmented index.
Once you’ve identified which indexes should be rebuilt, you can use the following command to issue the rebuild request, substituting “
<INDEX_NAME>” and “
<TABLE_NAME>” with actual values from the output of the query above:
<TABLE_NAME> REBUILD PARTITION = ALL;
It can be a little tedious to write the above query by hand if you have a number of heavily fragmented indexes. To make this process easier, we use a spreadsheet that, given the index and table names, will generate the queries for you. That spreadsheet is located here:
Simply take the TableName and IndexName columns for those indexes you wish to rebuild and paste them into columns A and B in the spreadsheet. Column C of the spreadsheet will then contain the proper query to rebuild each index. You can highlight all cells in column C, copy, then paste into a new SQL Server query window in order to execute the index rebuild in batch.
The amount of time needed to rebuild all of your fragmented indexes will really depend on the number of rows in each table affected, as well as resource utilization on the database server when the queries are running. To play it safe, you should consider running these queries at a time when it’s least impactful to users.
Finally, you should periodically run the fragmentation report query in order to understand whether or not indexes need to be rebuilt. Depending one repository usage, this could be every few weeks or months.
We’ll continue to keep an eye on this issue. Should Alfresco release a fix for this, we’ll update this blog post.