Wednesday, 5 December 2012

Implementing Maintenance Plans

OK, so I've been reading stuff, a dangerous hobby I know.  Lenin read Das Kapital, and before you knew it he had overthrown the Tsar and his ministers, and declared a Bolshevik Republic in Russia.  (Speaking of Lenin, have you ever noticed that all his statues show him hailing a taxi?)


In my case I've been reading Brad McGehee's guide to maintenance plans, and I've been looking at ways to put it into practice.  Of course there is no single right way to create a maintenance plan.  Just looking at backups, you can do it in any number of ways.  Using a maintenance plan is one way, you could also use Red Gate Backup or another third party tool and set up a job for backup.  Or I know a colleague who set up a cursor to work through each database in turn, backing it up.  The important thing is that you take a backup, not how you do it.

But the virtue of maintenance plans, which came in with SQL Server 2005, is that they provide a clean straightforward way of performing routine maintenance tasks, without forcing you to do much in the way of hard work.  That's the good thing; the bad thing is that you can't script them out and install them on another server - you have to create a new plan from scratch. Fortunately it's an easy drag and drop process.  Maintenance plans use a subset of SSIS - SQL Server Integration Services - to carry out the necessary maintenance tasks.  So you start by selecting a task from a "toolbox" menu, and dragging a box onto the screen.  Each box describes a particular task: backup, update statistics, indexing, whatever - you configure it, and then go on to the next task.  I find this an elegant way of doing stuff, particularly when you want to do things in order, and you don't want to start the second task until you are sure that the first thing has worked.

Here's one I did earlier - backing up over a network.   This example shows another nice feature - see the red line?  This tells the system what to do if there is a failure - if the first backup fails, take another backup to a different backup server.



Let me show you what works for me - and I'll try to talk about what you might like to change.

First of all, I like to set up two separate maintenance plans - one for the system databases (master, model, msdb) and another for all the user databases.  The system DBs are usually relatively small, and straightforward,  Having separate maintenance plans for System and User databases allow the System DBs to be backed up in seconds; a failure in the User databases does not stop the system databases from being backed up.  If there is a failure, chances are it will be with one of the big user DBs, and by having a separate maintenance plan, I know that one important thing at least is working, and system DBs are fine.

The description is optional, but it is good practice to enter one, with initials and date.  Each plan has appropriate sub-plans - for Daily, Weekly and Hourly Tasks as required.  So for example:


DB Admin System DB Maintenance
Planned Preventative Maintenance for System DBs - jw 13/11/2012
    • Daily Tasks
    • Weekly Tasks
DB Admin User DB Maintenance
Planned Preventative Maintenance for User DBs - jw 13/11/2012

    • Hourly tasks
    • Daily Tasks
    • Weekly Tasks
Here's what it looks like on screen:
SQL Server 2012 has an extra column:- Run As e.g. run as SQL Server Agent service account.
Notice I've given each sub-plan a schedule - it's probably better not to do this immediately, but wait until you have tested each element.  At the very least, don't schedule the transaction log backups until you have taken a full backup!

I don't anticipate a need to take transaction log backups of the system databases, so the hourly Tasks sub-plan can be skipped for the System DBs,  

Did I mention that SQL Server will create a log report of all your maintenance plan activities?  A quick flashback to the menu - see the button that vaguely resembles two pieces of paper?  If you hover over it, a tooltip pops up to say Reporting and Logging. 




      Click on that Reporting and Logging button, and you get this dialog box.  There are just a couple of things to do here.  First of all, tick the box to say Log Extended Information.  This fills the log with extra data.  You will never need this extra data unless you haven't got it, in which case it will be crucial to troubleshooting your problem.  Second, copy and paste the folder location into your clipboard - you'll need it in a few minutes.  

  


Weekly Tasks

I've put down three tasks to be done weekly - Rebuild the Indexes, clean up the History, and clear out the log reports.  


Rebuild Index 

Once a week is probably enough, but of course that depends on your particular circumstances.  The Rebuild might be needed more frequently if you have a high volume of data edits, inserts or deletes, in which case you can do this as one of the Daily Tasks.  If you have a narrow maintenance window, you might like to consider the need for a SQL task to run custom code to rebuild, reorganise or leave indexes alone, depending on the level of fragmentation it detects.  If you want to do this with a separate job, right click on the box above and select Disable.  It's also a good idea to leave a reminder note to yourself - right click on the background next to the box and select Annotation.  This lets you write something like "Replaced by Special Indexing Job jw 01/12/2012" - so you don't come across the disabled job in three months' time and enable it.  

But if you have time to Rebuild, why not?  Here's the dialog box.  I won't go into the details because usually all you have to do is Pick the Rebuild Index task off the toolbox menu, drag it onto the screen and tell it what databases you want. If you want to know all the whys and wherefores, have a look at Brad McGehee's guide.   



I like to pick either all the system databases or all the user databases, depending.  You can do different things to different databases, by picking individual ones, but you then have the problem of  keeping on top of changes - if someone adds a new database you have to remember to go through all your maintenance plans deciding what to do with it.  Keep it standard if you can.  One box I do recommend you tick is the one to ignore databases where the state is not online - it saves a lot of hassle.  Obviously you don't want to re-index an offline database, but if you forget to tick this box, SQL server will try, and fail, and then the whole job fails.  



Clean out history 
Once a week, clear out stuff older than four weeks.  That's the default, you can change it if you can think of a good reason for doing so.  




Clean out old log reports 

 Remember the reporting and logging folder which you added to your clipboard a few minutes ago?  This is where you need it.  The Maintenance Cleanup Task is usually used for clearing out old backups, but you can also use it here for clearing out old log reports.  If you think about it, you have a couple of daily tasks and maybe a dozen hourly transaction log backup tasks.  Over a few weeks, this will mount up.  So the default is to sling those that are older than four weeks old.  You don't have to - you can keep them for ever if you want, it's no skin off my nose - but four weeks seems reasonable.



 Warning - SQL Server can be a bit flaky at deleting old files, especially things that aren't old BAK files.  In theory, this was resolved years ago when they brought out Service Pack 2 for SQL Server 2005. In practice, it may work as designed, or it may not.  If it doesn't, check out this blog post - Maintenance Plan Log Files Won't Die

Daily Tasks

I've put down four tasks to be done each day- check database integrity, update statistics, take the backup and remove old backups.




Check Database Integrity

This is going to check for corruption in your databases.  It actually runs DBCC CheckDB(database) for each of the databases you select.


Don't believe me?  Just press the View T-SQL button:



USE [ReportServer]
GO
DBCC CHECKDB(N'ReportServer')  WITH NO_INFOMSGS
GO
USE [adventureworks2012]
GO
DBCC CHECKDB(N'adventureworks2012')  WITH NO_INFOMSGS
GO



You can type out these commands yourself if you want, but why would you ever go to all that trouble?  I think this is a good illustration of maintenance plans making life simple.

Here's the other good thing.  In 99 times out of a hundred, probably even more often than that, DBCC CheckDB will find no problems.  But that one time, there's a corruption in the database, and that's potential bad news.  And if it happens, the maintenance plan stops, sends me a warning message (I'll get on to that) and doesn't back up the corrupt database or delete the old, good, backup.  Turn it the other way round - by doing the check before taking the backup, you know that you are not backing up a corrupt database.

Update Statistics


Update the statistics, and it could potentially make the queries run quicker.  If statistics are out of date, the query execution plan is relying on out of date information and could make a bad decision.  I've never seen proof of this in practice, but that might be because I like to update the statistics every day!


Back Up Database

This is the big one, the one that really matters.  If you don't back up your databases, perhaps DBA work is not the right niche for you?  Here's how:


I'll stick with my policy of backing up all the user databases, and taking a full backup of all of them every night.  But this might not always be possible.  If a database is very large, it might be best to take a full backup at the weekend, and take differential backups on all other days.  Check how long the backups are taking.

I like to create a sub-directory for every database - I find this particularly useful if I'm taking full and transaction log backups - this keeps them neatly together, and makes it easy to find the backups you need in a hurry.  Not everyone likes this - try it both ways and decide.

I like to use the file extension usr.bak for user databases, sys.bak for system databases.  This ensures that the maintenance cleanup task only clears out the correct backups.  (It probably isn't critical, unless you have very different deletion schedules for user and system backups).


Here's the code to take the backup: 


BACKUP DATABASE [adventureworks2012]
       TO  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\adventureworks2012\adventureworks2012_backup_2012_12_05_203624_3408922.usr.bak'
       WITH NOFORMAT,
              NOINIT, 
              NAME = N'adventureworks2012_backup_2012_12_05_203624_3408922',
              SKIP,
              REWIND,
              NOUNLOAD,
              COMPRESSION, 
              STATS = 10
GO


Then, if you tick the Verify box, it does an extra chunk of code, and runs restore with verify only - in other words, it simulates running a restore without actually doing so.  This is a good test of whether you have a good backup or not.  (The ultimate test, of course, is actually restoring your database - some would argue that this is the only test that matters).


declare @backupSetId as int
select @backupSetId = position from msdb..backupset
       where database_name=N'adventureworks2012'
       and backup_set_id=(select max(backup_set_id) from msdb..backupset where database_name=N'adventureworks2012' )

if @backupSetId is null
begin
       raiserror(N'Verify failed. Backup information for database ''adventureworks2012'' not found.', 16, 1)
       end

RESTORE VERIFYONLY FROM  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\adventureworks2012\adventureworks2012_backup_2012_12_05_203827_4199319.usr.bak'
       WITH  FILE = @backupSetId, 
       NOUNLOAD, 
       NOREWIND
GO

The downside of running Verify is that the maintenance plan will take longer to run.  Perhaps 10% longer, perhaps twice as long.  The default is OFF, but I would always turn it on unless I have good reason to be concerned about running time.  Try it and see.  

Finally, if you are running SQL Server 2008 onwards, you can choose to compress your database.  Third party products like the excellent Red Gate Backup offer different levels of compression; the native Microsoft product offers only on or off.  On is fine - I don't really see the value in fine-tuning the amount of compression - just do it!  One word of warning - you can set the default for compression to be "Use default server setting", which sounds well and good until you realise that you don't know what the default server setting is.  Best to explicitly say whether you want it or not, if you ask me.  


Clean up old backups

This is the Maintenance Cleanup task again - we saw this earlier in the weekly tasks when clearing out the log files.  It's the same, except that we are removing usr.bak files, and probably the retention period is something like three days.


Hourly Tasks

A quick one - Back Up the Transaction Log, and remove old backups




Backup Transaction Log

Pretty much as before - the essential difference is that the Backup Type is Transaction Log, and the file extension is trn.  





Clean up old backups

And again - same as before, remembering to clear out the trn files.  


Scheduling

It rather depends on your business - let's assume that your business works office hours (Monday to Friday, 9am to 5pm).  If it doesn't, you are going to have to think about when you can do things to be minimally disruptive.  

Weekly jobs - well, duh - once a week, perhaps on a Saturday or Sunday evening.  Let's say 1800 Sunday

Daily jobs - every evening, let's say at 2000.  This should mean that the weekly job is finished on Sunday before the daily job starts.  But you'll need to check the typical duration.  

Hourly jobs - start an hour or two before normal start time, finish an hour or two after knocking off time.  So every hour from 0600 to 1900 should cover it.  If nothing is happening overnight, there isn't much point in backing up that transaction logs.  

Whatever you decide, keep tabs on it for the first week at least to make sure that things aren't overlapping and generally screwing up.   


Problems

I'm going to make this the subject of another blog post, as this one is getting rather long-winded.

Monday, 5 November 2012

Troubleshooting Database Mail - Part II

I started out with a blog about how to set database mail up correctly.  How about when it goes wrong, or fails to work?

I finished Part I with how to send a test message.  Right click on Database Mail and select Send a Test Email.


There's an alternative if you want - use the sp_send_dbmail stored procedure.   There's no great benefit in using this at the moment, but just for completeness:



EXEC sp_send_dbmail
       @profile_name='ServerName',
       @recipients='Jack.Whittaker@companyname.com',
       @subject='Test message',
       @body='This is the body of the test message.  Mail is now working.'


Whichever way you do it, an email should arrive in your inbox after a few moments.  Press Send and Receive if nothing happens for a while.  Okay, twiddle your thumbs and go and make a cup of tea.

Still nothing?  You may have a problem.



Check the Logs


First things first - view the database mail logs.  Right click on Database Mail, and select View Database Mail Log.  You are looking for clues here.  See any?  Nope, OK, lets look at some more logs.  

-- list log events
SELECT * FROM msdb.dbo.sysmail_log
GO

-- list error messages
SELECT * FROM msdb.dbo.sysmail_event_log
WHERE mailitem_id >= 1;

Anything?  Hmm - OK, let's look at the mail.  There are several tables which contain mail items.  Just think about the difference between Failed items and Unsent items for a minute.  
--To list all items:
USE msdb
SELECT sent_status, *
FROM sysmail_allitems
GO


--To list the unsent items only:
USE msdb
SELECT sent_status, *
FROM sysmail_unsentitems
GO


--To list the failed items only:
USE msdb
SELECT sent_status, *
FROM sysmail_faileditems
GO

Is your test mail failing?  Then there is probably some sort of error message somewhere.  

Is it unsent?  This may mean that something is wrongly configured.



Can you pass the Buck?

Before looking at the multitude of things it might be, there may be a legitimate reason for your server not being able to send email, even if it is correctly configured.

  • Check the exclusion list of your anti-virus software - if AV is active on your server, make sure that databasemail.exe is allowed to send messages.
  • Is the server behind a firewall?  If it is, you can send mail until you are blue in the face, they won't get through.  

Next thing to check: can you TELNET to your smtp server on port 25 to make sure nothing is blocked?
Open a DOS Prompt on your server and type telnet mail.domain.ext 25 
It should return something like: 220 smtp esmtp Relay 
Which means that Telnet is working - 220 is the greeting code.

In my case I was able to get a colleague from the network services team to send an email from that box using his email diagnostics tool - that worked fine, so narrowing it down to a SQL Server issue,




Further Reading

Here's a reference to Microsoft's article Troubleshooting Database Mail: General steps - the version for 2008r2 - Database Mail doesn't look much different in the other versions, but if you go to that link it'll take you to the other versions.



Things to Check

Make sure that the login SA is the owner of all the system databases, and especially msdb.
It's probably a good idea if a named individual is not the owner of the user databases - things might go awry when that person leaves.  But for the moment, we are just checking the system DBs.



select name, suser_sname(owner_sid) as Owner 
from sys.databases

Or as so often in SQL Server, there's another way to do it:

-- check database owners
EXEC sp_helpdb
     
  
And if you find any stray databases, bring them back into the fold         
-- change db owner if needed          
USE eWorkDev
GO
EXEC sp_changedbowner [SA]
GO


Are you a member of the sysadmin fixed server role or msdb db_owner role?  This is a fairly safe bet if you are a DBA.  
Otherwise, you should be a member of the DatabaseMailUserRole.



Is Database Mail enabled?

sp_CONFIGURE 'show advanced', 1
GO
RECONFIGURE
GO
sp_CONFIGURE;
GO
RECONFIGURE
GO
sp_configure 'show advanced options', 0
GO
RECONFIGURE
GO
In the results pane, the run_value for Database Mail XPs should be set to 1.


Check DB MAIL Status - Sysmail should be STARTED
EXECUTE msdb.dbo.sysmail_help_status_sp


You can try stopping and then restarting 
EXECUTE dbo.sysmail_stop_sp

EXECUTE dbo.sysmail_start_sp



Check that the Service broker is enabled
SELECT is_broker_enabled FROM sys.databases WHERE name = 'msdb';

sysmail_help_queue_sp     This stored procedure looks at the status of the queue, but unfortunately it isn't as helpful as you might think.  It shows the queue as INACTIVE most of  the time, unless it is actually sending or receiving mail right at this moment.  

EXEC msdb.dbo.sysmail_help_queue_sp @queue_type = 'Mail' ;

The other possible messages are NOTIFIED (queue has been notified receipt to occur), and RECEIVES_OCCURRING (queue is receiving).


Getting Desperate?

You can always try the old standby - switch it off and switch it on again...

Go to the services on the server - run down to MSSQLSERVERAGENT and stop it,  Then start it again,  
Getting more desperate - locate the MSSQLSERVER service - stop it, start it.  
(Beware of Restart - it occasionally doesn't do anything if the service has crashed).  


Tidying Up

If by now your email has filled up with Unsent or Failed messages, you might like to clear them out with this handy stored procedure:

--clear out unsent messages
EXEC sysmail_delete_mailitems_sp @sent_status = 'unsent'

A good reason for doing this is that eventually you are going to fix this issue, and when you do, a hundred and one test messages are going to suddenly appear in your inbox.


Check the Database Mail program

Verify that the Database Mail External Program is located in the correct directory, 
In my case it's in F:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn



Check the Service Account

Verify that the service account for SQL Server has permission to run the executable DatabaseMail.exe.  Was there a change in the Service account that SQL Server runs under? 

If someone changed the account using the Services MMC rather than SQL Server Configuration Manager, then it may not have the rights required - these are usually set by the SSCM.  Using SSCM, reset the service account to something like NetworkService and restart SQL Server and SQL Agent. Try sending a test mail now.  Then using SSCM, set it back. This should confirm that the service accounts have the necessary rights. 

In my case, it was the Service Account which was the problem.  If you want, you can read an account of the hunt here.  Apart from thanking all involved again, I would like to make the point that you can search Google all you want - it's not as good as setting the problem-solving skills of a team of DBAs to work!

Sunday, 4 November 2012

Troubleshooting Database Mail - Part I

I've been wrestling with Database Mail lately - working through a list of servers and instances, making sure that they all send emails in the event of a problem.  Most did, some didn't.  Some had never been set up; some had been set up to use an old mail server which was about to be decommissioned; and one was apparently set up perfectly - except that it didn't actually work  <sigh>.


My purpose in setting up database mail on a server is to send warning messages to myself and my team in case anything goes wrong.  If a backup job fails, for example, I want to know about it.  Some people set up a corresponding message for when a job succeeds - this is up to you, but my attitude is that things normally work, and I don't plan to worry unless they don't.  I usually run a job under supervision the first time, so I can see whether it runs or not.  


Let's start with setting Database Mail up from scratch.  This came in with SQL Server 2005, replacing the cleft stick and smoke signals used by earlier versions.  (What?  you still use SQL Server 2000?  I don't have personal experience, but this may help).  There's a perfectly good wizard for doing this, and in general I suggest that you follow the wizard.

Starting in SSMS then, expand Management by clicking on the + symbol.
Right click on Database Mail and select Configure Database Mail - up pops the wizard.


The Welcome screen is optional - if you don't want it, click on the Skip this page in the future box.  Press Next.


The default is best - Set up Database Mail by performing the following tasks...  The other options are really there for making changes and additions subsequently, although you probably could set things up from scratch this way.  But you are using the wizard because you want to do things the easy way, aren't you?  I certainly am.

Assuming this is your first time setting up Database mail, up pops this box:


Well, fairly obviously, Yes I do!  If not, you might as well give up now...

Here's the new profile dialog box:

Profile name - I usually use the server name - it's as good as any other.  And ditto for the description.  But if you are going to have several mail profiles per server, you'll need to give this a bit more thought.  

Then click on Add to add a new SMTP (outgoing mail) account,  


Again, use the server name for the account name and description, and also for the Display name.  The email address and company name need to be valid addresses, but you don't really want any emails being sent to them, hence noreply@...  My purpose in setting up database mail on a server is to send warning messages to myself and my colleagues in case anything goes wrong.  I don't then send a message back to the server - this only encourages the server gremlins.  

The bit that really matters here is the name of the outgoing mail server (SMTP).  Your network team will be able to tell you the name of the mail server, the port it uses (default is 25) and whether it requires a secure connection.  Notice the name: mail.companyname.com; not mail@... - it's a server name, not an email address.  

Click on OK and it takes you back to the New Profile screen, this time with the Account name added.  Not very exciting, but for completeness, here's what it looks like:


Now, click on Next.  This is the Manage Profile Security page:


Make the Profile public by ticking in the box.
Make it the default profile by selecting Yes from the pull down menu.  The pull-down doesn't show until you click in the box, but it's there, I promise.  
You can set up private profiles that can only be accessed by specific users.  But we don't need that here.

Click on Next and you get a chance to configure the system parameters.  Most of the time, the defaults are fine, so don't worry about it.



Click on Next and it gives you a chance to review the choices you have made:


Click on Finish and it does what you told it to do.  Look, you only have yourself to blame...  


Close, and test.  

The easiest way to test is by right clicking on Database Mail and selecting Send a Test Email.


It should have the profile name you entered before, although if you need several profiles you will need to use the pull down menu.  Put your own email address in the To: box, the subject and Body are fine (although you can change them if you want) and press Send.  Sit and twiddle your thumbs for a few moments and a test email should appear in your inbox.  

What if it doesn't?  What you want is Troubleshooting Database Mail - Part II
Stay tuned to this channel, database fans...




Tuesday, 16 October 2012

Why is Replication not Working?

First of all, what is Replication about?  You take a copy of a database - all of it, part of it, whatever you need.

  • Maybe you want to copy data to a redundant server so that it's ready to take over if the original one fails.  Today you might use mirroring or AlwaysOn, but before these technologies came along you might use Replication
  • Maybe the master database contains a price list - you can send an up to date price list to every branch every morning at 0700 ready for the working day.  


Here's a one page intro that I like.  


So why is it not working?  No doubt there are hundreds of reasons why replication isn't working.  Here's one more to add to the list.  I set up what I thought was a straightforward Publication, using the wizard that Microsoft kindly provide.



I won't go into detail - just work through the wizard, tell it what table you want to publish, etc, etc and schedule it to run at an appropriate time - every hour, let's say - that's the default, fine by me.


So the wizard creates a publication - it takes data from a database and makes it available for other databases, probably on other servers, to read.  Sorry, to subscribe to.  




OK, so much for Publication, now for the Subscription.  Again, follow the wizard:



Start by telling it the name of the Publication you just created


 Connect to the subscription server i,e, the destination:



Schedule the job, and set it off.  Easy peasy.  Actually, there's a bit more work than I have shown here, but essentially you follow the wizard and it sets up two SQL Agent jobs - one to publish the data from Location A, and one to subscribe to that data and copy it to Location B.

The problem that cropped up was that I scheduled the jobs using the wizard to run every hour - as it assumes by default.  Then I changed one item of data in the original, and sat back to watch it replicate to the destination.

It didn't.

There's a Replication Monitor (just open up Replication and right click on the Publication or Subscription) and it said it ran fine - no errors, nothing.  But the changed data did not appear in the destination.

It took me a little while to work it out, and then a bit longer to stop kicking myself. The wizard assumes that the jobs will run every hour on the hour.  So by default the Publication and the Subscription both start at the same time i.e. the subscription starts before the publication has finished.

Therefore a change to the original Article gets published okay, but it won't get across to the Subscriber for an hour.

Solution - edit the schedule of the subscriber job to run 2 minutes after the hour. (Publication might only take a few seconds to run, depending of course on how much you are publishing)

Thus:

18:17 Change is made to the original data

19:00 Article is published

19:02 Subscriber receives article

Had I been a bit more patient, it would have worked out fine, an hour later.  But I wanted to go home!


Thursday, 4 October 2012

Book Review - Brad's Guide to Maintenance Plans

Or to give it its full title:
Brad's Sure Guide to SQL Server Maintenance Plans (Dba Handbooks)
by Brad M McGehee

An exhaustive haul through the various options that Microsoft provide for maintenance.  Brad spends a lot of time talking about the Wizard, much of that time warning against it!  He is wary of the wizard, in part because it could lead the unwary to ticking every box and thus doing unnecessary and even damaging stuff. Ticking Rebuild and Reorganise, for example - it just makes your plan take twice as long. If you Rebuild, then do an unnecessary Update Statistics, the update can screw up the statistics that the rebuild created and hit performance as well as wasting time.

I've never used the wizard myself - the SSIS-style drag and drop interface in the plan designer is so easy to use that I don't see the need. Brad also recommends using T-SQL or Powershell scripts for more complex maintenance. In fact he recommends this on average once per chapter, so it tends to get a little tedious.

To my mind, this leaves a lot of middle ground where the simplicity of the graphic maintenance plan designer does the job neatly and effectively. Want to take a backup? Just pick the backup task, spend a few moments telling it what to backup and where to, and then schedule it to run at a convenient time. Job done! Why write the code when the plan can do it for you?

Here's why - you can rebuild indexes, or reorganise indexes - but best practice is probably to read the fragmentation level of each one and then decide whether to rebuild, reorganise or leave alone. Especially on large databases, especially if you have a narrow maintenance window, you don't want to do an unnecessary rebuild. For this your best bet is a custom script.

But for most stuff the maintenance plan is fine. Brad reminds me of all the things I should be doing - like the history clean up task. Have I scheduled this? Probably not everywhere - note to self - go and check. And he explains very well the point of multiple sub-plans which I hadn't grasped before.

Saturday, 22 September 2012

UK Phone Numbers

I've been wrestling with a list of predominantly UK phone numbers for the last couple of days.  The data has come from a list of all the calls made to and from an organisation in the past year.  In particular, I want to identify the dialling code so that I can identify what geographical areas people are calling from.

First port of call is usually Google - I certainly don't want to write some code if I can just copy something that someone else has already done.  But sadly every entry I could find was about US phone numbers, which are easy peasy - area dialling codes are 3 digit across North America.  How rational!  Here in the UK dialling codes may be three, four, five or six digits long.  So if you have North American phone numbers to sort out, most of this article will be irrelevant.

Let's start with a bit of research - here and here. "In the United Kingdom, area codes are two, three, four, or, rarely, five digits long (after the initial zero). Regions with shorter area codes, typically large cities, permit the allocation of more telephone numbers as the local number portion has more digits. Local customer numbers are four to eight figures long. The total number of digits is ten, but in a very few areas the total may be nine digits (after the initial zero). The "area code" is also referred to as an "STD (code)" (subscriber trunk dialling) or a "dialling code" in the UK."  So, basically, it's a tangle of different standards.  OK, so I need some code which checks the first few digits and identifies which one is which. 


Data Cleansing
But before we get on to that, let's do some data cleansing.  I have a lot of what look like perfectly good phone numbers with a 92 prefix.  Like most firms, it's 9 to dial out, so I don't know where 92 comes from.  It's easy to deal with though- just a straightforward REPLACE() function.

replace(LEFT(strddi,2), '92', '')
means when the leftmost two characters of the DDI string are 92, replace them with a blank.

-- identify Direct Dialled numbers with 92 prefix - apparent error
update  [CUSTOMER].[PhoneCalls]
Set    strddi = replace(LEFT(strddi,2), '92', '')
WHERE LEFT(strddi,2) = '92'
AND         strDDIAreaCode IS NULL
AND         LEN(strddi) > 5;

There's another issue with incoming mobile phone numbers - in the UK, these start with 07, but somehow they have been recorded as starting with 7.  Replace does the job.

-- identify mobile numbers omitting 0 prefix - apparent error
-- change to UK standard 0
update  [CUSTOMER].[PhoneCalls]
Set    strcli = replace(LEFT(strCLI,1), '7', '07')
WHERE LEFT(strCLI,1) = '7'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) = 10;


Then we have the incoming calls from UK numbers, but specifying the full international dialling code 0044.  Fine if you are based somewhere else, but really not necessary if you are in the UK and your caller is in the UK.

-- identify UK numbers with an international code 0044
-- change to UK standard 0
update  [CUSTOMER].[PhoneCalls]
Set    strcli = replace(LEFT(strcli,4), '0044', '0')
WHERE LEFT(strcli,4) = '0044'
AND         strCLIAreaCode IS NULL
AND         LEN(strcli) > 5;



Having dealt with UK codes that think they are worldwide, I can now go on and identify the legitimate international codes.  I could pick out the French calls (0033), German calls (0049) and so on, but I only have a very small percentage of non-UK calls so I'm just going to treat all these Johnny Foreigners the same and lump them together as "International" calls.  If you do want to be more accurate with your international calls, click here for a list of the codes.

-- identify  international codes 00
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'International'
WHERE LEFT(strcli,2) = '00'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;



And use the same technique to identify Mobile numbers:

-- identify mobile phone numbers 07
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Mobile'
WHERE LEFT(strcli,2) = '07'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


Freefone numbers are free only to people calling from Landlines, although I understand that there are plans to make them free to people using mobile  phones too.

-- identify Freefone  phone numbers 0800 etc
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Freefone'
WHERE LEFT(strcli,4) in ('0800', '0500', '0808')
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;
 

There is a string of special rate numbers beginning with 08 - once upon a time you could identify 0845 as local rate and 0870 as national rate, but the list has proliferated and now you can't really tell how much it is going to cost.   

-- identify special rate phone numbers 0845 0870 etc
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Special rate'
WHERE LEFT(strcli,2) = '08'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


One thing you can be sure of - a call to an 09 number is going to be outrageously expensive...


-- identify premium rate phone numbers 09
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Premium rate'
WHERE LEFT(strcli,2) = '09'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


Identifying UK Dialling Codes
So - on to the main point of this article.  UK dialling codes - area codes - whatever you want to call them, allow you to identify where in the country a caller is based.  Aberdeen is 01224, York is 01904.  But very often the dialling code and the subscriber number are held in a single field e.g. 01169158424.
Taking this example, it could in theory be divided thus:
011        6915 8424
0116      915 8424
01169    158424
011691  58424

A dialling code can be anything from 3 digits to 6 digits - a subscriber number can be anything from 4 to 8 digits.

So here's how to tackle the problem.  There are only five cities with three digit dialling codes, ten with four digits, and 12 with six - the rest have five.  So thanks to a bit of research in Wikipedia (see the links I posted earlier) I was able to construct the following CASE statement:


-- UK Dialling codes may have 3, 4 , 5 or 6 digits
-- pick appropriate code
--CLI
 update  [CUSTOMER].[PhoneCalls]
set strCLIAreaCode =
      (
            CASE
            -- 3 digit dialling codes e.g. London, Belfast
            WHEN LEFT(STRcLI,3) IN 
                 ('020', '023', '024', '028', '029' )
                  THEN LEFT(strcli,3)
                 
            -- 4 digit dialling codes e.g. Bristol, Leicester
            WHEN LEFT(STRcLI,4) IN 
                 ('0118', '0117', '0116', '0115', '0114', '0113',
                  '0121', '0131', '0141', '0151', '0161', '0191')
                  THEN LEFT(strcli,4)
                 
            -- 6 digit dialling codes e.g. Langholm, Keswick
            WHEN LEFT(STRcLI,6) IN 
                 ('013873', '015242', '015394', '015395',
                  '015396', '016973', '016974', '016977',
                  '017683', '017684', '017687', '019467' )
                  THEN LEFT(strcli,6)
                                   
            -- The remaining majority of codes are 5 digit
            ELSE  LEFT(strcli,5)
      END
      )

WHERE
       strcli <> 'WITHHELD'
       AND LEN(strcli) > 5
      AND strCLIAreaCode IS null;
       


Looking at 01169158424, it's easy to see that the first four digits match the four digit option, so the area code part of this number is 0116 - which represents Leicester

Finally, a bit of tidying up:
      
-- Remove any remaining nulls in area code fields
-- nulls remaining represent internal codes or unidentifiable
-- CLI
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode =  'N/A'
WHERE strCLIAreaCode IS NULL;


Incidentally, the code shown here is half the code I wrote - I had to do the job for both inbound (CLI - Caller Line Identification) numbers and outbound (DDI - Direct Dial In).  the code is essentially the same so I haven't troubled you with it. 

I hope you find this useful!