Tuesday 23 October 2018

In Search of a Smarter Maintenance Plan

I wrote a blog post about Implementing Maintenance Plans a while back.  In it I suggested that if you have time to rebuild the indexes once a week, you should go ahead and do it. 

But what if you DON'T have time?

Indexes

I suggested rebuilding the indexes once a week.  And the bog standard maintenance plan rebuilds every index, whether it needs it or not.  But if things are heavily fragmented, or there are a lot of indexes with millions of rows each, that might take an awfully long time.  Heavily fragmented indexes can seriously hurt performance, so I set up a Smart Rebuild / Reorganise job to run as part of the weekly maintenance plan.



Instead of the usual Rebuild Index task, grab the Execute T-SQL Statement task and add it to the maintenance plan.  You can restrict the run time with the Execution Time Out parameter - specifying 3600 seconds means the task is stopped after an hour, so it cannot over-run the maintenance window.  Note that hitting that time out makes the job fail, so I have put this step last.  It will also send an error message email if you have notifications set up.




Here's the script - I swiped it from Artemakis Artemiou - poli efcharisto!  
The job identifies all the indexes with fragmentation levels >15%
It starts with the worst and works down
Above 30% it rebuilds them
Between 15% and 30% it re-organises them
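The script itself is on his site, but a minimal sketch of the same approach (emphatically not Artemakis's actual script) looks something like this:

-- A rough sketch: reorganise between 15% and 30% fragmentation,
-- rebuild above 30%, worst first.  Run it in the database you care about.
DECLARE @SchemaName sysname, @TableName sysname, @IndexName sysname, @Frag float;
DECLARE @sql nvarchar(max);

DECLARE FragCursor CURSOR FAST_FORWARD FOR
    SELECT s.name, t.name, i.name, ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id AND i.index_id = ips.index_id
    INNER JOIN sys.tables AS t ON t.object_id = i.object_id
    INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE ips.avg_fragmentation_in_percent > 15
      AND ips.page_count > 100        -- tiny indexes aren't worth the bother
      AND i.name IS NOT NULL          -- skip heaps
    ORDER BY ips.avg_fragmentation_in_percent DESC;

OPEN FragCursor;
FETCH NEXT FROM FragCursor INTO @SchemaName, @TableName, @IndexName, @Frag;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'ALTER INDEX ' + QUOTENAME(@IndexName)
             + N' ON ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName)
             + CASE WHEN @Frag > 30 THEN N' REBUILD;' ELSE N' REORGANIZE;' END;
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM FragCursor INTO @SchemaName, @TableName, @IndexName, @Frag;
END
CLOSE FragCursor;
DEALLOCATE FragCursor;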

If the database is badly fragmented, you might want to run this daily, at least until things are under control.


Statistics
I also set up a job to update statistics.  Statistics can get out of date, and queries may be less efficient because of this. 

Updating statistics is a bit of a funny one.  If you rebuild an index, you get a nice new set of statistics for free.  But if you are not rebuilding the indexes, you have to rely on SQL Server's auto-update, which kicks in when it thinks fit, according to its own thresholds.  Sod's Law says that it won't update the ones you want it to update.  The query optimiser can ignore a perfectly good index because its statistics are out of date, and a query may take far longer to run because of this.
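If you want to see how stale your statistics actually are, this works from SQL Server 2008 R2 SP2 / 2012 SP1 onwards:

-- Rows changed since each statistic was last updated, worst first
-- (run it in the database you care about)
SELECT OBJECT_NAME(s.object_id) AS TableName,
       s.name AS StatName,
       sp.last_updated,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY sp.modification_counter DESC;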



As before, I added a T-SQL task to the Daily tasks, last in the line.
I've given it a time out of 5400 seconds (an hour and a half, to save you getting your calculator out)
And it runs a stored procedure which I found here...
Thanks to Bill Graziano!



Note that Bill's script sets a time limit too, so I don't really need to set an execution timeout.
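If you are curious what such a thing looks like, here is a minimal sketch (emphatically not Bill's procedure): update the most-modified statistics first, and stop when the time budget runs out.

DECLARE @StopAt datetime = DATEADD(SECOND, 5400, GETDATE());
DECLARE @sql nvarchar(max);

DECLARE StatsCursor CURSOR FAST_FORWARD FOR
    SELECT N'UPDATE STATISTICS '
         + QUOTENAME(OBJECT_SCHEMA_NAME(s.object_id)) + N'.'
         + QUOTENAME(OBJECT_NAME(s.object_id)) + N' ' + QUOTENAME(s.name) + N';'
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE sp.modification_counter > 0
      AND OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
    ORDER BY sp.modification_counter DESC;

OPEN StatsCursor;
FETCH NEXT FROM StatsCursor INTO @sql;
-- Stop picking up new work once the 5400 second budget is spent
WHILE @@FETCH_STATUS = 0 AND GETDATE() < @StopAt
BEGIN
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM StatsCursor INTO @sql;
END
CLOSE StatsCursor;
DEALLOCATE StatsCursor;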

There's an alternative to the above - the well regarded maintenance scripts from Ola Hallengren.
A lot of people swear by them.







Wednesday 17 October 2018

The Madness Continues


I have a script which identifies MDF files which do not appear in my list of databases.

The purpose of this is to find databases, usually on development boxes, which have been detached and forgotten, but are taking up space on the disk.  A detached database is of course invisible, unless you go poking around looking for it.
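The script is essentially this sort of thing - a sketch which assumes your data files live somewhere like F:\User Databases, and that xp_cmdshell is enabled (it often isn't, quite rightly, so use with care):

-- Every .mdf on the drive that sys.master_files knows nothing about
IF OBJECT_ID('tempdb..#Files') IS NOT NULL DROP TABLE #Files;
CREATE TABLE #Files (FileName nvarchar(260));

INSERT INTO #Files (FileName)
EXEC master.sys.xp_cmdshell 'dir /b /s "F:\User Databases\*.mdf"';

SELECT f.FileName AS OrphanedMDF
FROM #Files AS f
WHERE f.FileName LIKE '%.mdf'
  AND NOT EXISTS (SELECT 1 FROM sys.master_files AS mf
                  WHERE mf.physical_name = f.FileName);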

What I usually do is:
i)  attach them again
ii) take a final backup just in case they actually are needed
     (not very likely, but sod's law says if you don't do this, they will suddenly become critical)
iii)  drop the thing

But today, I failed at the first hurdle:



Create file encountered operating system error 32

The process cannot access the file because it is being used by another process

SQL Server error 5123 tells me it's a problem attaching the database

Huh?  It's a detached database file, isn't it?  
What could possibly be using the detached file?

Some utter maniac renamed the .ldf file as .mdf

name          fileid  filename                                  filegroup  size      maxsize        growth   usage
GenDraft      1       F:\User Databases\GenDraft_20180925.mdf   PRIMARY    40960 KB  Unlimited      8192 KB  data only
GenDraft_log  2       G:\Logs\GenDraft_20180925.mdf             NULL       12352 KB  2147483648 KB  4096 KB  log only


Thursday 16 August 2018

Database Sizes and Growth Rates

I've been asked to write a few words about this thrillingly sexy topic.  Why?  Because I've been looking at some existing servers and finding quite a few issues which are Bill's Fault.

Microsoft thinks that a good initial size for a database is 1MB, and it should grow by 1MB every time it grows.   Logs should grow by 10%, which isn't utterly mad, but isn't a good idea either.   Since Microsoft sets these defaults, it's easy to set up a database.  Which is good.  But since they are lousy defaults, it's bad.

I suppose these figures are there because there is no right answer to the question  "How big should these settings be then, Mr Smartypants DBA?". 

So what's the problem?

Well, I recently found a database which had grown from 500MB to 11071MB in 1MB increments.  That's over ten thousand growth events.  Each time it grows, it has to stop and think about it.  And if it has to grow ten thousand times, that's a lot of hanging around, twiddling your thumbs, waiting for it to think.
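You can count the recent ones yourself from the default trace, if it's enabled - something along these lines:

-- Auto-grow events from the default trace
-- (EventClass 92 = data file auto grow, 93 = log file auto grow)
SELECT te.name AS EventName,
       t.DatabaseName,
       t.FileName,
       t.StartTime,
       t.Duration / 1000 AS DurationMs
FROM sys.traces AS st
CROSS APPLY sys.fn_trace_gettable(st.path, DEFAULT) AS t
INNER JOIN sys.trace_events AS te ON t.EventClass = te.trace_event_id
WHERE st.is_default = 1
  AND t.EventClass IN (92, 93)
ORDER BY t.StartTime DESC;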

Hindsight is a wonderful thing.  I set that database to grow in 512MB increments.  Because I now know how big it is, and how big it was when it started. 

In a perfect world, you would carefully calculate how big the database is likely to be after a year or two of adding data, and size it to that size, plus maybe 10% for luck.  In the real world, you rely on SQL Server's capacity to grow when it needs to. 

So what's wrong with growing by a percentage? 

First answer:  Small database, growing by 10% each time, not a big issue.  Big database, growing by 10% each time, maybe 10GB every time it grows, is going to take a long time!
Second answer: Log files are composed of Virtual Log Files (VLFs) and ideally you don't want huge numbers of them, all different sizes - which they will be if they keep growing by 10% at a time. 

And if you forced me to set a default growth rate without knowing anything about the database, I would choose 16MB for the data, 8MB for the logs.  Better than Microsoft's default values, anyway.  That's what I put in the MODEL database. 
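Setting it on MODEL is a one-liner per file - the logical file names below are the defaults, check yours with sp_helpfile:

USE [master];
GO
-- New databases inherit their growth increments from model
ALTER DATABASE [model] MODIFY FILE (NAME = N'modeldev', FILEGROWTH = 16MB);
ALTER DATABASE [model] MODIFY FILE (NAME = N'modellog', FILEGROWTH = 8MB);
GO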

Friday 10 August 2018

Login Failed on Named Instance



I have a SQL Server 2016 server – call it Gandalf

There are three instances – Default, Frodo, and Sam

Every day I check the error logs.  Every day I see that someone or something is trying to log in at 0200. 

Error: 18456, Severity: 14, State: 5.
Login failed for user 'agent_default'. Reason: Could not find a login matching the name provided. [CLIENT: <local machine>]

And on Sam, Login failed for user 'agent_default' and Login failed for user 'agent_Frodo'

Microsoft's syspolicy_purge_history job runs every night at 0200 on every instance.  It always succeeds.  But it is the only SQL Server agent job that runs at 0200.  So it must be that, surely?  I change it to run at 0215.  Next day, the error log says that our mysterious friend attempted to log in at 0200. 

Hmm, not that then.  Is it an automated scan from our white hat penetration testing server?  I wasted a lot of time trying to pin this down before finding confirmation that no - it is not set up to look at Gandalf.

And then suddenly I noticed that the login attempt had happened at 0215, not 0200.  It IS syspolicy_purge_history !  But it is the job on the OTHER instance which is causing the problem. 



1) When the job runs on the Frodo instance, there are no issues
2) When the job runs on the default instance, the error log on Frodo says the default agent account failed to log in:
Error: 18456, Severity: 14, State: 5.
Login failed for user 'agent_default'. Reason: Could not find a login matching the name provided. [CLIENT: <local machine>]
3) When the job runs on the Sam instance, the error logs on Default and Frodo say
Login failed for user 'agent_default' and
Login failed for user 'agent_Frodo'


If you check out that job, Step 3 is running a chunk of Powershell.  It gets the Instance name and runs a task to erase records (older than the specified number of days).

if ('$(ESCAPE_SQUOTE(INST))' -eq 'MSSQLSERVER') {$a = '\DEFAULT'} ELSE {$a = ''};
(Get-Item SQLSERVER:\SQLPolicy\$(ESCAPE_NONE(SRVR))$a).EraseSystemHealthPhantomRecords()

Except that it also looks on all the other instances.  And because I have a different service account for each instance, it fails to find a login, which is why the job succeeds on each instance, but the other instance records a failed login attempt!
It's a known bug with Powershell, apparently.  And that link gives me the Powershell code to fix it!

# Build a connection string for this instance only
$applicationName = "SQLPS ($env:USERNAME@$env:COMPUTERNAME)"
$SQLServerConnection = New-Object System.Data.SqlClient.SqlConnection
$SQLServerConnection.ConnectionString = "Data Source=$(ESCAPE_NONE(SRVR));Initial Catalog=master;Integrated Security=SSPI;Application Name=$applicationName"
# Open the policy store over that one connection - no enumerating
# of the other instances - and purge the phantom records
$PolicyStoreConnection = New-Object Microsoft.SqlServer.Management.Sdk.Sfc.SqlStoreConnection($SQLServerConnection)
$PolicyStore = New-Object Microsoft.SqlServer.Management.Dmf.PolicyStore ($PolicyStoreConnection)
$PolicyStore.EraseSystemHealthPhantomRecords()

Just cut and paste that into Step 3 of the syspolicy_purge_history job, and the mysterious errors stop.

How long does it take to create a database?


How long does it take to create a database? 
Well, it depends on the piece of string in your pocket, of course – how big the database is, how zippy your server is, and so on.
But ideally, you want to create it ASAP. 

Does the database ever need to grow? 
Well, if you have sized it perfectly, it will never need to grow.  <Howls of derisive laughter>
OK, seriously, of course it does, and you want it to grow as quickly as possible too. 


By default, before it creates or grows a file, SQL Server pauses for a moment - maybe quite a few moments - to wipe out any existing data by writing zeroes all over the space it needs.  Which is a sensible precaution if you want to make absolutely sure that nobody can hack into your server and read stuff that you had previously deleted.  But if you are writing data over it anyway, it might be a bit excessively paranoid.  Or maybe you ARE excessively paranoid?  Perhaps you have enough security in place that you would be happy to trade that precaution for speed and performance.  Only you can decide that.



Find a development server, and check the file location of an existing database, like this:

use dbadmin
go
sp_helpfile


Now create a database.  See how long it takes.  In my Dev environment, this ran in 3 minutes, 17 seconds:

USE [master]
GO


CREATE DATABASE [Fred] ON  PRIMARY
( NAME = N'Fred_Data', FILENAME = N'G:\data\Fred_Data.mdf' , SIZE = 102596608KB , MAXSIZE = UNLIMITED, FILEGROWTH = 0)
LOG ON
( NAME = N'Fred_Log', FILENAME = N'H:\logs\Fred_Log.ldf' , SIZE = 2621440KB , MAXSIZE = 2048GB , FILEGROWTH = 0)
GO


Okay,  three minutes for 100GB, not totally shabby. 

Let’s try it without writing the zeroes.  It's called Instant File Initialization, if you want to do any further reading - I like Brent Ozar's write-up


It isn't a box to tick or a script to run, I'm afraid.  You need to grant the service account a permission and then restart the SQL Server service, so it will have to be done out of hours on Production.


First thing you need is the name of your service account:


-- 2008r2 onwards

SELECT service_account, * FROM sys.dm_server_services;


For example: GANDALF\sqldev_frodo


Open a Remote Desktop Connection to the server
Run SECPOL.MSC (Local Security Policy)
Open up the Local Policies
Click on User Rights Assignment
Scroll right down to Perform Volume Maintenance Tasks
Open it up and add the service account
Click on OK


Now you need to restart the SQL Server Service
Run SERVICES.MSC
Scroll down to the SQL Server services – there are probably a few of them
Stop the SQL Server Agent (MSSQLSERVER) service first
Then stop SQL Server (MSSQLSERVER)

Then start SQL Server (MSSQLSERVER)
And finally start SQL Server Agent (MSSQLSERVER)


Why in that order?  If you stop SQL Server (MSSQLSERVER), the Agent service should stop automatically
If you restart SQL Server (MSSQLSERVER), it should stop, stop the agent, and restart (though you have to remember to restart the agent).


So why bother with the extra steps?
Because that is what SHOULD happen.  Sometimes the restart thing gets stuck.  This may be me remembering something that used to happen and has now been fixed, senile old fool, stuck in his ways.  So feel free to try restart.  Just don’t come whingeing to me if it doesn’t do what it should do.  Because I shall say I Told You So
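Once the service is back up, you can check that the permission took effect - from SQL Server 2016 SP1 onwards, anyway:

-- Y means instant file initialization is on for that service
SELECT servicename, service_account, instant_file_initialization_enabled
FROM sys.dm_server_services;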



Back to that script.  Remember the script to create the database?


USE [master]
GO


DROP DATABASE [Fred]
GO



CREATE DATABASE [Fred] ON  PRIMARY
( NAME = N'Fred_Data', FILENAME = N'G:\data\Fred_Data.mdf' , SIZE = 102596608KB , MAXSIZE = UNLIMITED, FILEGROWTH = 0)
LOG ON
( NAME = N'Fred_Log', FILENAME = N'H:\logs\Fred_Log.ldf' , SIZE = 2621440KB , MAXSIZE = 2048GB , FILEGROWTH = 0)
GO


Before, this ran in 3 minutes, 17 seconds.

Now?  18 seconds. 

That’s a pretty good improvement


Tuesday 7 August 2018

Tables with Spaces in the Name

Someone has created a table called Complaint Log
Not ComplaintLog or even Complaint_Log

I want to drop an index, so enclosing it in square brackets does the trick

DROP INDEX [IX_Complaint_Log_SNP_Number] ON [dbo].[Complaint Log];
And of course I could do it with the GUI, just by pointing and clicking, all that stuff

But, I want to do it with a script and run it at 0300, and I want to check whether the index actually exists before my script tries to drop it. 

Normal thing would be to do something like this:

IF EXISTS(SELECT * FROM sys.indexes
       WHERE object_id = object_id('dbo.Complaint Log')
       AND name = 'IX_Complaint_Log_SNP_Number')
DROP INDEX [IX_Complaint_Log_SNP_Number] ON [dbo].[Complaint Log];

But because of that pesky space, it doesn't work.  OBJECT_ID('dbo.Complaint Log') can't parse a name with a space in it and quietly returns NULL, so the EXISTS test finds nothing and the DROP never runs.  Oh, it says it works - it says
Command(s) completed successfully.
But it is lying - the index is not found, and it is not deleted

It would be very satisfying to hunt down the guy who created Complaint Log instead of Complaint_Log, but in these namby pamby days of political correctness gone mad, torturing and killing Developers is frowned upon.

Stefan Spraakman gave me the answer - thanks Stefan!  I need to join sys.indexes with sys.objects and test on the name, like this:


IF EXISTS(SELECT * FROM sys.indexes AS i
          INNER JOIN sys.objects AS o ON o.object_id = i.object_id
          WHERE o.name = 'Complaint Log'
          AND i.name = 'IX_Complaint_Log_SNP_Number')
DROP INDEX [IX_Complaint_Log_SNP_Number] ON [dbo].[Complaint Log];


Friday 13 July 2018

Backup Compression

Who wants big backups?  Not me, and probably not you

You can compress your backups, starting from SQL Server 2008. 

In a maintenance plan, just select the option Compress

In a script, add the magic words WITH COMPRESSION
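Like this - the path is made up, obviously:

-- INIT overwrites any existing backup in that file - adjust to taste
BACKUP DATABASE [Fred]
TO DISK = N'K:\Backups\Fred.bak'
WITH COMPRESSION, INIT;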

But you can set the default so that if you forget, it compresses for you. 

I'm indebted to Microsoft for this unusually clear description

-- before - 0 means OFF
SELECT value  
FROM sys.configurations  
WHERE name = 'backup compression default' ; 
GO 

EXEC sp_configure 'backup compression default', 1 ; 
RECONFIGURE WITH OVERRIDE ; 
GO
 

--after - 1 means ON
SELECT value  
FROM sys.configurations  
WHERE name = 'backup compression default' ; 
GO


You don't even have to restart anything!


Is it worth doing?  What do you mean, don't you trust me?
I think so.  Here's a way to calculate the actual compression achieved - on my server, out of 1,628 backups, I got a compression ratio of 3.02


SELECT
       sum(backup_size) as All_Backups
       ,sum(compressed_backup_size) as All_Compressed
       ,sum(backup_size)/sum(compressed_backup_size) AS Compression_Ratio
FROM msdb..backupset; 


Madness

What maniac thought that the Approved field would be a good candidate for an index?

Approved   HowMany
       0    876528
       1    446593
      -1        24

And what on earth can -1 signify?

And why did they use an INT as the data type?

Argh!

Wednesday 20 June 2018

Database Mail – What could possibly go wrong?


Fans of SQL Server will know that the answer is LOTS!  So I shall try to make some kind of sense of it.  I hope you find it helpful.
If you are in a rush, skip to the end where I have listed three things to make sure
Start by sending a test message to yourself
Easiest way, I think, is to open Management in SSMS, right click on Database Mail, and select Send Test E-Mail.  Then have a look in your inbox.
But if you want to do it with a bit of T-SQL, why not?

--send test message
USE msdb
GO


EXEC sp_send_dbmail
@profile_name='H2SO4',     -- whatever the mail profile is called
@recipients='jack.whittaker@mycompany.co.uk',  -- YOUR email, duh
@subject='Test Message',
@body='If you can read this email, your database mail is working';
go


If you find that message in your inbox, well done, go and have a nice cup of tea. 


No sign of your message?
Check out your Junk / Deleted Items folders – I did a lot of troubleshooting today, as emails were being  caught by a Junk rule and being automatically moved to my Junk folder.  If you have set up rules to send automated messages to a folder, check there too. 


Is it a permissions issue? 
Add this little bit of code before the above script and try again. 
EXECUTE AS LOGIN = 'SA'
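In full, something like this - REVERT switches you back to yourself afterwards:

USE msdb
GO

-- If this one arrives and the first attempt didn't, it's permissions
EXECUTE AS LOGIN = 'sa';

EXEC sp_send_dbmail
@profile_name='H2SO4',     -- whatever the mail profile is called
@recipients='jack.whittaker@mycompany.co.uk',
@subject='Test Message as sa',
@body='If you can read this email, the problem is permissions';

REVERT;
GO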




Check the Logs
USE MSDB
GO

SELECT
       sai.sent_status
       ,sai.subject
       ,sai.body
       ,sai.last_mod_date
       ,sai.recipients
       ,sel.description as Log_Message 

FROM dbo.sysmail_allitems as sai
INNER JOIN dbo.sysmail_event_log AS sel
    ON sai.mailitem_id = sel.mailitem_id;


If all are UNSENT, think about restarting the SQL Server Agent service





The Event Log may or may not be helpful.  It usually tells you either

1) DatabaseMail process is started
2) DatabaseMail process is shutting down

You can ignore these by adding a where clause, but the started and shutting down messages indicate that it is working normally.

use msdb
go

select *
from dbo.sysmail_event_log
--where event_type != 'information'
order by log_id desc;


There are a couple of other things you can look at, but probably not helpful if you have already looked at sysmail_allitems

-- list all emails
SELECT sent_status, *
FROM dbo.sysmail_mailitems;

--List all SENT emails
SELECT sent_status, *
FROM dbo.sysmail_sentitems;


And one of these things might give you a clue

--Should be 1 ie Yes
 SELECT is_broker_enabled FROM sys.databases WHERE name = 'msdb';

--should be started
 EXECUTE msdb.dbo.sysmail_help_status_sp;

--state should be active
 EXECUTE sysmail_help_queue_sp @queue_type = 'Mail' ;

 --ensure that sysmail has been started
 EXECUTE dbo.sysmail_start_sp;


Things to make sure

A)     Database Mail insists that you have .NET Framework 3.5 on your server.  You might notice that you have .NET Framework 4.5, so you don't need the old version, right?  WRONG.  Check out how to do it here

B)     If you can send an email from the Database Mail, but not from SQL Agent jobs, you may have forgotten to enable a mail profile for alerts.
  1. Right click on "SQL Server Agent"
  2. Then "Properties".
  3. Then go to "Alert System" section.
  4. Tick the "Enable mail profile" box.
  5. Then "OK".         
  6. MUST do - Restart the SQL Server Agent service. 

C)     And if Database Mail seems to be sending, but you are not receiving, check again that your message is not being caught by a filter.  Sysadmins will be able to look for the message in Exchange.  You can check it by sending a test message to a colleague.