Saturday 22 September 2012

UK Phone Numbers

I've been wrestling with a list of predominantly UK phone numbers for the last couple of days.  The data has come from a list of all the calls made to and from an organisation in the past year.  In particular, I want to identify the dialling code so that I can identify what geographical areas people are calling from.

First port of call is usually Google - I certainly don't want to write some code if I can just copy something that someone else has already done.  But sadly every entry I could find was about US phone numbers, which are easy peasy - area dialling codes are 3 digit across North America.  How rational!  Here in the UK dialling codes may be three, four, five or six digits long.  So if you have North American phone numbers to sort out, most of this article will be irrelevant.

Let's start with a bit of research - here and here. "In the United Kingdom, area codes are two, three, four, or, rarely, five digits long (after the initial zero). Regions with shorter area codes, typically large cities, permit the allocation of more telephone numbers as the local number portion has more digits. Local customer numbers are four to eight figures long. The total number of digits is ten, but in a very few areas the total may be nine digits (after the initial zero). The "area code" is also referred to as an "STD (code)" (subscriber trunk dialling) or a "dialling code" in the UK."  So, basically, it's a tangle of different standards.  OK, so I need some code which checks the first few digits and identifies which one is which. 


Data Cleansing
But before we get on to that, let's do some data cleansing.  I have a lot of what look like perfectly good phone numbers with a 92 prefix.  Like most firms, it's 9 to dial out, so I don't know where 92 comes from.  It's easy to deal with though- just a straightforward REPLACE() function.

replace(LEFT(strddi,2), '92', '')
means when the leftmost two characters of the DDI string are 92, replace them with a blank.

-- identify Direct Dialled numbers with 92 prefix - apparent error
update  [CUSTOMER].[PhoneCalls]
Set    strddi = replace(LEFT(strddi,2), '92', '')
WHERE LEFT(strddi,2) = '92'
AND         strDDIAreaCode IS NULL
AND         LEN(strddi) > 5;

There's another issue with incoming mobile phone numbers - in the UK, these start with 07, but somehow they have been recorded as starting with 7.  Replace does the job.

-- identify mobile numbers omitting 0 prefix - apparent error
-- change to UK standard 0
update  [CUSTOMER].[PhoneCalls]
Set    strcli = replace(LEFT(strCLI,1), '7', '07')
WHERE LEFT(strCLI,1) = '7'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) = 10;


Then we have the incoming calls from UK numbers, but specifying the full international dialling code 0044.  Fine if you are based somewhere else, but really not necessary if you are in the UK and your caller is in the UK.

-- identify UK numbers with an international code 0044
-- change to UK standard 0
update  [CUSTOMER].[PhoneCalls]
Set    strcli = replace(LEFT(strcli,4), '0044', '0')
WHERE LEFT(strcli,4) = '0044'
AND         strCLIAreaCode IS NULL
AND         LEN(strcli) > 5;



Having dealt with UK codes that think they are worldwide, I can now go on and identify the legitimate international codes.  I could pick out the French calls (0033), German calls (0049) and so on, but I only have a very small percentage of non-UK calls so I'm just going to treat all these Johnny Foreigners the same and lump them together as "International" calls.  If you do want to be more accurate with your international calls, click here for a list of the codes.

-- identify  international codes 00
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'International'
WHERE LEFT(strcli,2) = '00'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;



And use the same technique to identify Mobile numbers:

-- identify mobile phone numbers 07
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Mobile'
WHERE LEFT(strcli,2) = '07'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


Freefone numbers are free only to people calling from Landlines, although I understand that there are plans to make them free to people using mobile  phones too.

-- identify Freefone  phone numbers 0800 etc
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Freefone'
WHERE LEFT(strcli,4) in ('0800', '0500', '0808')
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;
 

There is a string of special rate numbers beginning with 08 - once upon a time you could identify 0845 as local rate and 0870 as national rate, but the list has proliferated and now you can't really tell how much it is going to cost.   

-- identify special rate phone numbers 0845 0870 etc
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Special rate'
WHERE LEFT(strcli,2) = '08'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


One thing you can be sure of - a call to an 09 number is going to be outrageously expensive...


-- identify premium rate phone numbers 09
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode = 'Premium rate'
WHERE LEFT(strcli,2) = '09'
AND         strCLIAreaCode IS NULL
AND         LEN(strCLI) > 5;


Identifying UK Dialling Codes
So - on to the main point of this article.  UK dialling codes - area codes - whatever you want to call them, allow you to identify where in the country a caller is based.  Aberdeen is 01224, York is 01904.  But very often the dialling code and the subscriber number are held in a single field e.g. 01169158424.
Taking this example, it could in theory be divided thus:
011        6915 8424
0116      915 8424
01169    158424
011691  58424

A dialling code can be anything from 3 digits to 6 digits - a subscriber number can be anything from 4 to 8 digits.

So here's how to tackle the problem.  There are only five cities with three digit dialling codes, ten with four digits, and 12 with six - the rest have five.  So thanks to a bit of research in Wikipedia (see the links I posted earlier) I was able to construct the following CASE statement:


-- UK Dialling codes may have 3, 4 , 5 or 6 digits
-- pick appropriate code
--CLI
 update  [CUSTOMER].[PhoneCalls]
set strCLIAreaCode =
      (
            CASE
            -- 3 digit dialling codes e.g. London, Belfast
            WHEN LEFT(STRcLI,3) IN 
                 ('020', '023', '024', '028', '029' )
                  THEN LEFT(strcli,3)
                 
            -- 4 digit dialling codes e.g. Bristol, Leicester
            WHEN LEFT(STRcLI,4) IN 
                 ('0118', '0117', '0116', '0115', '0114', '0113',
                  '0121', '0131', '0141', '0151', '0161', '0191')
                  THEN LEFT(strcli,4)
                 
            -- 6 digit dialling codes e.g. Langholm, Keswick
            WHEN LEFT(STRcLI,6) IN 
                 ('013873', '015242', '015394', '015395',
                  '015396', '016973', '016974', '016977',
                  '017683', '017684', '017687', '019467' )
                  THEN LEFT(strcli,6)
                                   
            -- The remaining majority of codes are 5 digit
            ELSE  LEFT(strcli,5)
      END
      )

WHERE
       strcli <> 'WITHHELD'
       AND LEN(strcli) > 5
      AND strCLIAreaCode IS null;
       


Looking at 01169158424, it's easy to see that the first four digits match the four digit option, so the area code part of this number is 0116 - which represents Leicester

Finally, a bit of tidying up:
      
-- Remove any remaining nulls in area code fields
-- nulls remaining represent internal codes or unidentifiable
-- CLI
update  [CUSTOMER].[PhoneCalls]
set   strCLIAreaCode =  'N/A'
WHERE strCLIAreaCode IS NULL;


Incidentally, the code shown here is half the code I wrote - I had to do the job for both inbound (CLI - Caller Line Identification) numbers and outbound (DDI - Direct Dial In).  the code is essentially the same so I haven't troubled you with it. 

I hope you find this useful!


Sunday 2 September 2012

Book Review - Confessions of an IT Manager


You've read Dilbert?  Then you may have thought Scott Adams was joking.  He wasn’t, and Phil fleshes out the details.  From contract programmer to IT Director, Phil spills the dirt on them all,  not to mention recruitment agencies, for whom a special place is reserved in hell.  



This book was originally published as a series of articles on the Simple-Talk website. Take IT projects now, a subject close to my heart at the moment - they have a reputation for failing, especially the big ones.  Phil thinks it’s a well-deserved reputation.  Nonetheless he suggests a few things you can do to actually succeed in getting an IT project come in on time, under budget, and actually do something vaguely resembling whatever it’s supposed to do.  

Start by doing nothing until everyone agrees what is to be done.  Then any improvement, any change, has to go through a strict change control process.  (You may see already why there are so many failures). 

Don’t be a pioneer - use only boring proven technology.  If the programmers want to play with the latest TLAs (three letter acronyms) and put them on their CVs then fine, set up a sandbox to muck about, but do the real work using last year’s release. 

Code then recode.  Sling the first version and write it again properly.  It will take half the time and effort, and run twice as fast as the first tentative fumblings. (I can see his point, but I'm not sure that I would be brave enough to sling a working program and then rewrite my code.  Nor am I sure I could get it past my manager - "Yes, I'm finished, but now I'm going to rewrite it").

Allow no virtuoso programming - if you can’t understand it, it’s no good.  He mentions elsewhere the joy of documented code.  Oh, so true.  I like to put comments in my code because I haven't a clue what I've done otherwise when I look back at it three months later.  I know some people believe that their code is so clear that they don't need to bother - to them I say "Phooey" and Phil would probably agree.  

And encourage results by rewarding deadlines met. 

Fortunately it doesn’t matter if your project is abandoned or dies a painful unpleasant death.  Here I refer you to Dilbert’s The Joy of Work.  Suppose you design jet engines.  You may only ever have designed one of them, and on its first flight it crashed in flames and destroyed a remote logging town.  What matters is that you can put “Jet Engine Designer” on your CV.