Troubleshooting TFS DB Growth
Spoiler: This was caused by ReleaseManager 2013 retaining builds. Read on for the work around that was put in place.
As part of a project I was working on recently; I put in a lot of work around TFS Builds, Unit/Integration Tests & continous delivery using ReleaseManager.
It was a fairly standard pipeline:
→ Check in → Unit & Integration Tests Run → Build → Release To Test
Up until this project TFS Build hadn’t been used, TFS was just used for source control and some workitem tracking, it was hosted on a VM and space was never an issue.
After a few months of the project I started getting reports that the TFS DB size was growing constantly. The IT team had to keep allocating more space to the disks used for storing the DB backups.
It turns out that the Tfs_DefaultCollection
had grown from about 6GB to over 50GB and the growth was constant.
After doing a little research online I found the following query that gives some insight into which tables are taking up the most space in the db.
USE Tfs_DefaultCollection
SELECT
t.NAME AS TableName,
s.Name AS SchemaName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN
sys.schemas s ON t.schema_id = s.schema_id
GROUP BY
t.Name, s.Name, p.Rows
ORDER BY
TotalSpaceKB desc
Table | Row Count | Size (GB) |
---|---|---|
tbl_TestResult |
367386 |
19.660805 |
tbl_TestMessageLogEntry |
22354468 |
12.085457 |
tbl_Content |
62475 |
4.4598694 |
tbl_BuildInformation2 |
22736801 |
7.6754837 |
Normally TFS retains builds as defined by a build retention policy, ours tells it to only keep the last 10 builds.
However, there is a known issue with the Microsoft Release Management 2013 that causes every build that is involved in a release to be retained indefinitely, this doesn’t really fit with the continous delivery model and has been changed in newer versions.
Every check in we do triggers a build which triggers a release so all our builds are retained.
This has issues for the storage capacity on the drops folder where builds are dropped and also storage capacity in the database.
The issue was exacerbated for this project because of the logging setup in the integration tests. Each test class library gets a DB deployed and dropped and seed scripts run, all the logging from these activities is echoed to the console. The application logging is also set to console output.
All this log info for each test run is stored in the TFS database which grew unmanageably.
The logging was turned off on the automated builds so this should help keep the size down for newer builds
ConsoleOut
in [SetUpFixture]
as the log information is available for each Unit Test. When I realised that all the logging info was retained in the DB I replaced the ConsoleOut
logger with a NoOp
logger on the build server. public static class TestLogSetup
{
public static void Logger()
{
Console.WriteLine(Environment.MachineName);
if (Environment.MachineName.Contains("<ServerName>"))
{
Console.WriteLine("No Op Logger");
LogManager.Adapter = new Common.Logging.Simple.NoOpLoggerFactoryAdapter();
}
else
{
Console.WriteLine("Console Logger");
LogManager.Adapter = new Common.Logging.Simple.ConsoleOutLoggerFactoryAdapter();
}
}
}
To manage the retained builds I wrote a powershell script that does the following steps for every build older than 2 weeks:
-
Update the Retain Indefinitely flag to false
-
Delete the build
-
Destroy the build
Delete the build
This is what is available through the UI, it is only a logical delete and while it does delete artifacts from the network drop folders, it does not delete rows from the db
Destroy the build
This is the delete from the db. This is not available though the UI and is only available through the API
param ($serverName = 'http://<snip>:8080/tfs/DefaultCollection')
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.TeamFoundation.Client")
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.TeamFoundation.Build.Client")
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.TeamFoundation.VersionControl.Client")
$tfs = [Microsoft.TeamFoundation.Client.TeamFoundationServerFactory]::GetServer($serverName)
$buildServer = $tfs.GetService([Microsoft.TeamFoundation.Build.Client.IBuildServer])
$vcs = $tfs.GetService([Microsoft.TeamFoundation.VersionControl.Client.VersionControlServer])
$projects = $vcs.GetAllTeamProjects($true) # We will run this for all team projects
foreach ($project in $projects){
Write-Host $project.Name
# Get all the build definitions for a given project
$buildDefs = $buildServer.QueryBuildDefinitions($project.Name)
foreach ($buildDef in $buildDefs){
Write-Host "-" + $buildDef.Name
# Delete all builds older than 14 days. This is a bit aggressive and can be pushed to a month once the logs are smaller on automated builds.
$endDate = (Get-Date).AddDays(-14)
# Keep going back until we hit dec 2014
while($endDate.Year -ge 2014){
$year = $endDate.Year
$month = $endDate.Month
write-host $year $month
# Create a search object for the project/build def
$buildDetailSpec = $buildServer.CreateBuildDetailSpec($project.Name, $buildDef.Name)
$startDate = $endDate.addMonths(-1)
$buildDetailSpec.MaxFinishTime = $endDate # Search criteria
$buildDetailSpec.MinFinishTime = $startDate # Search criteria
# only build info, not workitems, labels, etc
$buildDetailSpec.InformationTypes = $null
## This is important for the query, builds that are deletd by retention
## or the ui are only logically deleted, not destroyed in the db
$buildDetailSpec.QueryDeletedOption = [Microsoft.TeamFoundation.Build.Client.QueryDeletedOption]::IncludeDeleted
# Search all the things
$builds = $buildServer.QueryBuilds($buildDetailSpec)
if($builds.Builds.Length -ge 1){
Write-Host "before update" $builds.Builds
foreach($build in $builds.Builds){
# Need to turn off the Keep Forever (Retain) flag set by Release Manager.
if ($build.KeepForever -eq $true){
Write-Host "Updating build " $build.Uri
# Gets an updatable ref to the build
$buildToEdit = $buildServer.GetBuild($build.Uri)
# Edit the build flag
$buildToEdit.KeepForever = $false;
# save the build back to the tfs
$buildServer.SaveBuilds(@($buildToEdit))
}
}
# refresh the query given that some of the builds have been updated
$builds = $buildServer.QueryBuilds($buildDetailSpec)
Write-Host "after refresh" $builds.Builds
# Delete the build, test results, symbols, drop, etc.
$buildServer.DeleteBuilds($builds.Builds,
[Microsoft.TeamFoundation.Build.Client.DeleteOptions]::All)
# Destroy the db records
$buildServer.DestroyBuilds($builds.Builds)
}
# rolling back the years.
$endDate = $endDate.addMonths(-1)
}
}
}
The script could have been set to run as a scheduled task somewhere but I liked the idea of having it run within TFS as it made it easy for the dev team to monitor.
The impact of destroying the old builds was significant.
Table |
Row Count |
Size (GB) |
tbl_Content |
69733 |
1.889136 |
tbl_TestResult |
174868 |
0.323192 |
tbl_LocalVersion |
413449 |
0.122072 |
tbl_BuildCodeChange |
148950 |
0.078016 |
tbl_Version |
101484 |
0.068872 |
tbl_BuildInformation2 |
88673 |
0.064232 |
tbl_Command |
92974 |
0.039192 |
Tracking down the cause of the size increase was an interesting exercise, hopefully some of this detail may help someone else in future.