six demon bag

Wind, fire, all that kind of thing!

2015-03-12

Repairing a Suspect msdb Database

Sometimes a situation arises where an SQL Server instance comes back up with a database tagged as "suspect". Apparently there is a number of possible causes for this, like transaction log corruption, insufficient memory or disk space, or unexpected shutdowns due to hardware or power failure. In our case the reason was probably a hardware failure, since the database resides on an iSCSI volume, and we were making changes to our iSCSI network. And to make things a little bit more interesting, our msdb database just had to be among the affected databases.


First and foremost, should you have any other suspect databases along with a suspect msdb: ignore them for the time being. Chances are they'll return to normal once you got the msdb fixed.

There's just one tiny problem with that: how do you actually repair an msdb database? Normally, when one of your databases becomes suspect, you'll do something like this:

EXEC sp_resetstatus [dbname];
ALTER DATABASE [dbname] SET EMERGENCY;
DBCC checkdb([dbname]);
ALTER DATABASE [dbname] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CheckDB ([dbname], REPAIR_ALLOW_DATA_LOSS);
ALTER DATABASE [dbname] SET MULTI_USER;

But since msdb is a system database the above procedure already fails at the second step:

Option 'EMERGENCY' cannot be set in database 'MSDB'.

To get around this issue you need to restart your SQL Server instance in single-user mode, rebuild the database and restore the latest backup:

  1. Stop the SQL Server Agent service (net stop SQLSERVERAGENT).
  2. Stop the SQL Server service (net stop MSSQLSERVER).

    On an SQL Server cluster you need to take the SQL Server role offline to stop the service on the owner node. Bring the storage, server name, and IP address resources back online, but make sure the SQL Server role stays offline. Then log in on the owner node.

    SQL Server role offline; Name, IP Address, and Storage resources online

  3. Start the SQL Server service in single-user mode (net start MSSQLSERVER -m -T3608).

  4. Run sqlcmd -E -S<INSTANCE>.
  5. Detach the database (EXEC sp_detach_db msdb). Contrary to this answer on Database Administrators you must detach an msdb database in order to be able to rebuild/restore it, otherwise you'll get an error like this:

    Restore of database 'msdb' failed. (Microsoft.SqlServer.Management.RelationalEngineTasks)
    
    
    ------------------------------
    Program Location:
    
    
       at Microsoft.SqlServer.Management.RelationalEngineTasks.RestoreDatabaseTaskFormCom...
       at Microsoft.SqlServer.Management.RelationalEngineTasks.RestoreDatabaseTaskFormCom...
       at Microsoft.SqlServer.Management.TaskForms.TaskExecutionManager.ExecuteTaskSequen...
    
    
    ===================================
    
    
    System.Data.SqlClient.SqlError: File initialization failed. RESTORE cannot continue.
    (Microsoft.SqlServer.SmoExtended)
    
  6. Move or rename the files MSDBDATA.mdf and MSDBLOG.ldf in the database's data directory.

  7. Run "%ProgramFiles%\Microsoft SQL Server\MSSQL*\MSSQL\Install\instmsdb.sql" to rebuild the database.
  8. Restore the latest backup (RESTORE DATABASE [msdb] FROM DISK = 'C:\path\to\msdb.bak' WITH REPLACE,RECOVERY).
  9. Stop the SQL Server (net stop MSSQLSERVER) and restart it in normal mode (net start MSSQLSERVER for standalone servers). On a cluster bring the SQL Server role back online.

Posted 19:07 [permalink]