
Bug Me Not

Published Friday, July 08, 2011 10:10 AM

Bug metrics are a notoriously erratic way to judge the performance of a development team and project, but despite this almost all software projects use them. There is a lot of data you can get from an electronic bug-tracking system, from bugs per lines of code, bugs per component, to defect trend graphs and bug fix rates. It is tempting to try to find meaning in the data, but how useful is this data, ultimately, in driving up software quality in the long term?

If you judge software testers on the number of bugs that they find, then more bugs will be found. If you judge developers on the number of bugs for which their code is responsible, then you'll get much less buggy code, but you'll probably struggle to ship a product on any reasonable timescale. Over the course of a project, it's easy for the team and even individual developers to feel oppressed by the bugs, and under intense pressure to produce 'better' code. Bugs continue to be logged and reported assiduously, but many of them simply disappear into the backlog, to be fixed "at some later time". As the pressure of the ship date mounts, developers are simply forced to cut corners, to change their perception of what "done" really means, in order to increase their velocity and meet the deadline. Software quality and team morale suffer as a result, and despite being rigorously tracked and reported, bugs fester from release to release, since there is never time to fix them. Before long the team finds itself mired in the oubliette.

So how can we use bug metrics to drive up software quality, over the long term, while enabling the team to ship on a predictable and reasonable timescale? In all likelihood, the surprising answer is "we can't". In fact, the ultimate goal of an agile development team might be to dispense with the use of an electronic bug tracking system altogether!

Certainly at Red Gate, some teams are using JIRA for incoming customer defects, but they also maintain a more holistic "technical debt wall", consisting of post-it notes describing the most important issues causing "drag" on a team. They then collectively seek to resolve these issues, whilst striving to remain close to zero new internal defects.

The team works to cultivate an atmosphere of zero tolerance to bugs. If you cause a bug, fix it immediately; if you find a bug in the area in which you're working, tidy it up, or find someone on the team who can. If you can't fix a bug as part of the current sprint, decide, very quickly, how and even if it will be fixed. This is not easy to achieve; it requires, among other things, an environment where it is "safe" for the team to stop and fix bugs, where developers and testers work very closely together, and both are strongly aligned with the customer, so they understand what they need from the software and which bugs are significant and which are not.

However, when you get there, what becomes important is not the number of bugs, and how long they stay open in your bug-tracking system, but a deeper understanding of the types of bugs that are causing the most pain, and their root cause. The team are then judged on criteria such as how quickly they learn from their mistakes, by, for example, tightening up automated test suites so that the same type of bug doesn't crop up time and again, or by improving the acceptance criteria associated with user stories, so that the team can focus on fixing what's important, as soon as the problem arises.

These are criteria that really will drive up software quality over the long term, and allow teams to produce software on predictable timescales, and with the freedom to feel they can "do the right thing".

What do you think? Is this a truly achievable goal for most teams, or just pie-in-the-sky thinking?

Cheers,
Tony.

by Tony Davis



Of course, if you want to know more about what PowerShell Eventing is, then I suggest you read the links at the end of the article.

PowerShell Eventing and SQL Server Restores

05 July 2011
by Laerte Junior

When you're managing a large number of servers, it makes no sense to run maintenance tasks one at a time, serially. PowerShell is able to use events, so it is ideal for, say, restoring fifty databases on different servers at once and being notified when each is finished. Laerte shows you how, with a little help from his friends.

It all began one bright morning, when my good friend and PowerShell Jedi Ravikanth Chaganti (blog | twitter) asked me if I had a PowerShell script to restore databases. This sounded like a pretty simple process, and so I told him that what he needed was available on CodePlex in the form of SQLPSX. However, it turned out the challenge he faced was not so simple, and he elaborated on his real problem:

He actually needed to restore 50 databases in asynchronous mode and, having discovered that the Restore class had events, wanted to use those to trigger a message when the restore process finished.

Now this sounded interesting! But how to do it? Helloooo PowerShell Eventing…

PowerShell Eventing

Eventing is a feature built into PowerShell V2 which lets you respond to the asynchronous notifications that many objects support (as seen on the Windows PowerShell Blog). However, my goal is not to explain what the PowerShell Eventing feature is; I'm here to demonstrate how to implement an effective real-world solution using that feature.
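If you have never seen the feature before, here is a minimal, self-contained illustration of the idea using a timer object; it has nothing to do with SQL Server yet, and the source identifier name is just an arbitrary label I picked for this sketch:

# Create an object that raises events, and subscribe to one of them
$timer = New-Object System.Timers.Timer
$timer.Interval = 2000

Register-ObjectEvent -InputObject $timer -EventName Elapsed -SourceIdentifier Timer.Tick -Action { Write-Host "Tick at $(Get-Date)" } | Out-Null

$timer.Start()

# ...later, stop the timer and remove the subscription
$timer.Stop()
Unregister-Event Timer.Tick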

Before we get started, I'll explain that I modified Chad Miller's (blog | twitter) original Invoke-Sqlrestore function to use the complete Restore event for our purposes (with Chad's kind permission, naturally). In the course of the article, I'll show you step by step how I got the final solution, and you can download the finished script from the speech bubble at the top of the article. The altered function is called Invoke-SqlrestoreEventing, inside the PoshTest.Psm1 module, and comes with additional SMO assemblies to import it directly into your PowerShell user profile.

The Problem

I needed an automated and reasonably scalable way to restore 50 databases asynchronously, and be notified when each one was finished.

Step 1 – Just Show a Message

My first step towards Eldorado was to just show a "Restore Completed" message when a restore operation was finished. If we take a look at the MSDN information for the Restore Class, we find the available Events, including Complete:

Figure 1 – The available events on the Restore Class (click to enlarge)

So I wrote some PowerShell to use that:

$restore = new-object ("Microsoft.SqlServer.Management.Smo.Restore")
Register-ObjectEvent -InputObject $restore -EventName "Complete" -SourceIdentifier CompleteRestore -Action { Write-Host "Restore Completed" } | Out-Null

And tested it to make sure it works:

Figure 2 – Our initial script, working fine. (click to enlarge)


That all looked OK. So imagine my surprise when I tried to restore again, and saw this:

Figure 3 – The same simple message script, but something’s gone wrong. (click to enlarge)

Cannot subscribe to event. A subscriber with source identifier 'CompleteRestore' already exists.

I realized I had created a CompleteRestore subscriber in the SourceIdentifier parameter of the Register-ObjectEvent cmdlet, so I needed to unregister it before I could run the cmdlet again:

try { $restore.SqlRestore($server) }
catch { <# handle the error here #> }
finally { Unregister-Event CompleteRestore }

Second Step – Running in an Asynchronous Powershell Job

With my message script running smoothly, my second thought was "Neat, but it's not much good without being asynchronous". If I have to restore 50 databases, it cannot be in a serialized way! So I tried:

$server = "Vader"
$dbname = "TestPoshEventing_6"
$filepath = "c:\temp\backup\TestPoshEventing.bak"
$Realocatefiles = @{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_6.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_6.ldf'}

Start-Job -Name "Restore1" -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Invoke-SqlRestoreEventing -sqlserver $args[0] -dbname $args[1] -filepath $args[2] -relocatefiles $args[3] -force } -ArgumentList $server, $Dbname, $filepath, $Realocatefiles

Aaand… it didn't work. Why not? Because background jobs run in a different runspace, and so anything we send to output in the console won't show up. To work around that, I needed to use Receive-Job:

$server = "Vader"
$dbname = "TestPoshEventing_6"
$filepath = "c:\temp\backup\TestPoshEventing.bak"
$Realocatefiles = @{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_6.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_6.ldf'}

$job = Start-Job -Name "Restore1" -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Invoke-SqlRestoreEventing -sqlserver $args[0] -dbname $args[1] -filepath $args[2] -relocatefiles $args[3] -force } -ArgumentList $server, $Dbname, $filepath, $Realocatefiles

Wait-Job $job | Receive-Job

And the Oscar goes to… Powershell! Everything now works just fine.

Third Step – Showing a Message and the Database Name

The "Restore Completed" message I put together earlier is handy, but not actually that useful without knowing which database has been restored. To improve that, I added the $dbname element:

Invoke-SqlRestoreEventing -sqlserver Vader -dbname "TestPoshEventing_6" -filepath "c:\temp\backup\TestPoshEventing.bak" -relocatefiles @{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_6.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_6.ldf'} -force

TestPoshEventing_6 Restore Completed

Figure 4 – The “Restore Complete” message, complete with the database name. (click to enlarge)

Now you should hopefully be thinking, as I was, that because background jobs run in a different runspace, $dbname will not be displayed when we put these two scripts together. How do we solve this?

Never fear! In this case, I used the -messagedata parameter on Register-ObjectEvent, and got the value we need using $event.MessageData:

Register-ObjectEvent -InputObject $restore -EventName "Complete" -SourceIdentifier CompleteRestore -Action { Write-Host "$($event.MessageData) restore Completed" } -MessageData $dbname | Out-Null

Now let’s run the function:

$server = "Vader"
$dbname = "TestPoshEventing_6"
$filepath = "c:\temp\backup\TestPoshEventing.bak"
$Realocatefiles = @{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_6.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_6.ldf'}

$job = Start-Job -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Invoke-SqlRestoreEventing -sqlserver $args[0] -dbname $args[1] -filepath $args[2] -relocatefiles $args[3] -force } -ArgumentList $server, $Dbname, $filepath, $Realocatefiles

Wait-Job $job | Receive-Job

… and watch the magic happening:

Figure 5 – Asynchronous database restores, complete with “Restore Complete” messages for each database. (click to enlarge)

Scaling Out the Code

One of the main reasons why I use PowerShell is because of its inherent capacity to manage multiple servers with just a few lines of script. That is, scaling out my code is relatively easy. Which is just as well, because while the solution as it stands is fine for a test case, it's not quite ready to deal with 50 databases efficiently.

The first thing I needed to do was to add the server name into the message so that I knew exactly which database was being managed at each stage. For this I used -messagedata again, but with a twist: I passed the parameters as properties of a PSObject and used $Event.MessageData.<PropertyName>:

$pso = new-object psobject -property @{Server = $server; DbName = $dbname}

Register-ObjectEvent -InputObject $restore -EventName "Complete" -SourceIdentifier CompleteRestore -Action { Write-Host "Server $($event.MessageData.Server), database $($event.MessageData.dbname) restore Completed" } -MessageData $pso | Out-Null

And with that in place, let’s see how this code deals with restoring 2 databases:


$server = "Vader"
$dbname = "TestPoshEventing_20"
$filepath = "c:\temp\backup\TestPoshEventing.bak"
$Realocatefiles = @{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_20.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_20.ldf'}

$job = Start-Job -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Invoke-SqlRestoreEventing -sqlserver $args[0] -dbname $args[1] -filepath $args[2] -relocatefiles $args[3] -force } -ArgumentList $server, $Dbname, $filepath, $Realocatefiles

$server = "Vader"
$dbname = "TestPoshEventing_21"
$filepath = "c:\temp\backup\TestPoshEventing.bak"
$Realocatefiles = @{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_21.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_21.ldf'}

$job1 = Start-Job -Name "Restore2" -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Invoke-SqlRestoreEventing -sqlserver $args[0] -dbname $args[1] -filepath $args[2] -relocatefiles $args[3] -force } -ArgumentList $server, $Dbname, $filepath, $Realocatefiles

Wait-Job $job | Receive-Job
Wait-Job $job1 | Receive-Job

Figure 6 – Managing 2 Restore jobs with no trouble. (click to enlarge)

How Cool is that?!

But now you're thinking, "OK Laerte, that's neat, but will I have to hard code all the servers and databases that I want to restore?" And my answer is, "No... let's use an XML file for that". First I created an XML file (restore.xml) with the following structure, and populated it with all the servers and databases I wanted to restore:

<?xml version="1.0" standalone="yes" ?>
<config>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_8</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>TestPoshEventing = 'c:\temp\restore\TestPoshEventing_8.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_8.ldf'</Realocatefiles>
  </Values>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_9</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>TestPoshEventing = 'c:\temp\restore\TestPoshEventing_9.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_9.ldf'</Realocatefiles>
  </Values>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_10</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>TestPoshEventing = 'c:\temp\restore\TestPoshEventing_10.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_10.ldf'</Realocatefiles>
  </Values>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_11</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>TestPoshEventing = 'c:\temp\restore\TestPoshEventing_11.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_11.ldf'</Realocatefiles>
  </Values>
</config>

And now, with the addition of the code below, I was able to scale out the final script to run Invoke-SqlRestoreEventing:

$xmldata = [xml] (Get-Content c:\temp\testes\Restore.xml)
$cmdblock = ""
$cmdBlock_1 = ""
[int] $counter = 1

foreach ($Value in $xmldata.config.values)
{
    $cmdBlock += " `$server = ""$($value.server)""; `n `$dbname = ""$($value.dbname)""; `n `$filepath = ""$($value.filepath)""; `n `$Realocatefiles = @{$($value.realocatefiles)};`n `$job_$($Counter) = Start-Job -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Invoke-SqlRestoreEventing -sqlserver `$args[0] -dbname `$args[1] -filepath `$args[2] -relocatefiles `$args[3] -force } -ArgumentList `$server, `$dbname, `$filepath, `$Realocatefiles ; `n"
    $cmdBlock_1 += "`n wait-job `$job_$($Counter) | receive-job ;"
    $counter++
}

$cmdblockTotal = $cmdblock + $cmdBlock_1
$scriptBlock = $ExecutionContext.InvokeCommand.NewScriptBlock($cmdBlockTotal)
invoke-command -ScriptBlock $ScriptBlock

And just to prove that it works...

Figure 7 – PowerShell eventing, working its multi-server management magic. (click to enlarge)

That's all well and good, but then I discovered a much more elegant way to control the queue. The first step was to change the Realocatefiles tag in the XML to a hashtable syntax:

<?xml version="1.0" standalone="yes" ?>
<config>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_8</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>@{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_8.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_8.ldf'}</Realocatefiles>
  </Values>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_9</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>@{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_9.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_9.ldf'}</Realocatefiles>
  </Values>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_10</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>@{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_10.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_10.ldf'}</Realocatefiles>
  </Values>
  <Values>
    <Server>Vader</Server>
    <Dbname>TestPoshEventing_11</Dbname>
    <Filepath>c:\temp\backup\TestPoshEventing.bak</Filepath>
    <Realocatefiles>@{TestPoshEventing = 'c:\temp\restore\TestPoshEventing_11.mdf'; TestPoshEventing_log = 'c:\temp\restore\TestPoshEventing_11.ldf'}</Realocatefiles>
  </Values>
</config>

… And then add this code to my function parameters:

$queue.Enqueue($item)

In this simple case, the code only uses one parameter, but I needed to use more for my function. Thankfully, this was easily solved by using PSObject (custom objects) again:

[hashtable] $x = Invoke-Expression ($Value.Realocatefiles)
$pso = new-object psobject -property @{Server = $value.server; DbName = $value.dbname; Filepath = $Value.filepath; Realocatefiles = $x}
$queue.Enqueue($pso)

As you can see, I used Invoke-Expression to convert the string in the Realocatefiles tag into a PowerShell hashtable. Next, I added the -messagedata parameter into Register-ObjectEvent. (This tip for how to show the "Restore Completed" message is thanks to Ravikanth.)

So finally, using the handy MSDN code with a few changes, the script now looks like this:

$maxConcurrentJobs = 5
$xmldata = [xml] (Get-Content c:\temp\testes\Restore.xml)
$queue = [System.Collections.Queue]::Synchronized( (New-Object System.Collections.Queue) )

foreach ($Value in $xmldata.config.values)
{
    [hashtable] $Realocatefiles = Invoke-Expression ($Value.Realocatefiles)
    $pso = new-object psobject -property @{Server = $value.server; DbName = $value.dbname; Filepath = $Value.filepath; Realocatefiles = $Realocatefiles}
    $queue.Enqueue($pso)
}

function RunJobFromQueue
{
    if ($queue.Count -gt 0)
    {
        $job = Start-Job -InitializationScript {Import-Module c:\temp\testes\PoshTest.psm1 -Force} -scriptblock { Param ($pso); Invoke-SqlRestoreEventing -sqlserver $pso.Server -dbname $pso.dbname -filepath $pso.filepath -relocatefiles $pso.realocatefiles -force } -ArgumentList $queue.Dequeue()
        Register-ObjectEvent -InputObject $job -EventName StateChanged -Action { RunJobFromQueue; Receive-Job $event.MessageData.Id; Remove-Job $event.MessageData.Id } -MessageData $job | Out-Null
    }
}

for ($i = 0; $i -lt $maxConcurrentJobs; $i++)
{
    RunJobFromQueue
}

We now have complete control of job queues, and the scaling of the code is more elegant:


Figure 8 – More elegant PowerShell eventing, still working its multi-server management magic. (click to enlarge)

Even better, you still can work in the same session!

Now I'm Just Showing Off

So now the solution works, but what if we want a Windows task bar balloon notification too, rather than just the console messages? Well, first of all, we have to download the appropriate module from Robertro Belo's Blog (Sly PowerShell - Balloon tip notifications) and load that module into your PowerShell user profile too. Once you've done that, a little change is required in Invoke-SqlRestoreEventing:

$pso = new-object psobject -property @{Server = $server; DbName = $dbname}

Register-ObjectEvent -InputObject $restore -EventName "Complete" -SourceIdentifier CompleteRestore -Action { Import-Module ShowBalloonTip -Force; Write-Host "Server $($event.MessageData.Server), database $($event.MessageData.dbname) restore Completed"; Show-BalloonTip "Server $($event.MessageData.Server), database $($event.MessageData.dbname) restore Completed" } -MessageData $pso | Out-Null

… And we need to adjust our script to run with the PoshTestBallon PowerShell module (which you can download from the top of this article):

$maxConcurrentJobs = 5
$xmldata = [xml] (Get-Content c:\temp\testes\Restore.xml)
$queue = [System.Collections.Queue]::Synchronized( (New-Object System.Collections.Queue) )

foreach ($Value in $xmldata.config.values)
{
    [hashtable] $x = Invoke-Expression ($Value.Realocatefiles)
    $pso = new-object psobject -property @{Server = $value.server; DbName = $value.dbname; Filepath = $Value.filepath; Realocatefiles = $x}
    $queue.Enqueue($pso)
}

function RunJobFromQueue
{
    if ($queue.Count -gt 0)
    {
        $j = Start-Job -InitializationScript {Import-Module PoshTestBallon.psm1 -Force} -scriptblock { Param ($pso); Invoke-SqlRestoreEventingBallon -sqlserver $pso.Server -dbname $pso.dbname -filepath $pso.filepath -relocatefiles $pso.realocatefiles -force } -ArgumentList $queue.Dequeue()
        Register-ObjectEvent -InputObject $j -EventName StateChanged -Action { RunJobFromQueue; Receive-Job $sender.Id; Remove-Job $Sender.Id } | Out-Null
    }
}

for ($i = 0; $i -lt $maxConcurrentJobs; $i++)
{
    RunJobFromQueue
}


Once you’ve done that, revel in your Power(Shell)…

Figure 9. Balloon notifications of completed database restorations. (click to enlarge)

Can you see this? Can you feel the power? If you want to see the full solution, please feel free to download it and take a look.

In Closing

Well my friends, that's the first time I've done anything with PowerShell Eventing, and I'm now bursting with ideas for what to do next, and I hope you are too. I will finish my article by paraphrasing PowerShell MVP and friend Max Trinidad (blog|twitter), who is also working on a really cool CodePlex project: SQLDevTools – PowerShell SQL Server Developer Tools.

Happy PowerShelling!

Credits

Thanks to all the PowerShell Jedis who helped me, Jeffery Hicks (blog|twitter), Marco Shaw (blog), Sean Kearney (blog|twitter) and especially Ravikanth (instrumental in finishing this script), who all pointed me in the right directions to create this procedure.

A special thanks goes to my friend and mentor Chad Miller (blog|twitter), who kindly gave permission to use his Invoke-Sqlrestore function in my tests.

Links for Powershell

Ravikanth Chaganti Blog
Windows PowerShell Blog
Hey, Scripting Guy! Blog
Jeffery Hicks - The Lonely Administrator
Sean Kearney Blog
Don Jones on PowerShell
Jonathan Medd's Blog
Shay Levy - If you repeat it, PowerShell it!

Links for Powershell with SQL Server

Chad Miller – Sev17
Aaron Nelson (SQLVariant) Blog
Jorge Segarra – SQL University Blog
Max Trinidad – The PowerShell Front

© Simple-Talk.com



ASP.NET MVC Action Results and PDF Content

06 July 2011
by Dino Esposito

The Action Result in ASP.NET MVC provides a simple and versatile means of returning different types of response to the browser. Want to serve a PDF file with dynamically-generated content? Do an SEO-friendly permanent redirect? Dino shows you how simple this can be using a tailor-made ActionResult class.

In ASP.NET Web Forms, the vast majority of HTTP requests are for pages, upon which an HTML stream is returned. You can force an ASP.NET Web page to return a different type of response such as an image, but that is a rather unnatural action. In fact, if you served an image from an ASPX endpoint, you would set up a much more costly operation than that from a plain custom HTTP handler. This is because ASP.NET Web Forms is a framework designed around the page and HTML content, whereas ASP.NET MVC can serve any type of content at the same cost.

In this article, I'll delve deep into the hidden code that takes the return value of a controller's action method down to the browser. Along the way, I'll discuss the implementation of custom action result types.

The Result of an Action

In ASP.NET MVC, each HTTP request is mapped to an action method defined on a controller class. The action method is merely a public method with no special constraints on the input parameters and is forced to return a type that inherits from a system type—the ActionResult type. More precisely, you can design an action method to return any .NET type, including primitive and complex types. You can also return void. If the action method is void, the actual type being processed is EmptyResult. If the type is any .NET type that doesn't inherit ActionResult, the actual response is encapsulated in a ContentResult type.
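For instance, a minimal (hypothetical) controller shows how each kind of return value is treated:

public class DemoController : Controller
{
    // A void action is processed as an EmptyResult
    public void Ping()
    {
    }

    // A return type that doesn't inherit from ActionResult is wrapped in a ContentResult
    public string Hello()
    {
        return "Hello, world!";
    }

    // The usual case: return an ActionResult-derived object directly
    public ActionResult Index()
    {
        return View();
    }
}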

The actual return value of any controller action is an object that inherits from ActionResult. As the name suggests, this object represents the result of the action: it embeds data and knows how to process it in order to generate the response for the browser. It is important to note that the ActionResult object is not what the client browser is going to receive. Getting an ActionResult object is only the first step to finalize the request.

Here's the code of the ActionResult class as returned by .NET Reflector. As you can see, ActionResult is an abstract class with just one overridable method—ExecuteResult.

public abstract class ActionResult
{
    // Methods
    protected ActionResult()
    {
    }

    public abstract void ExecuteResult(ControllerContext context);
}

The response for the browser is generated and written to the output stream by invoking ExecuteResult on a concrete type that derives from ActionResult. The class that does this is the action invoker, a system component that governs the whole process of executing a request and creating the response for the browser.

You can envisage the ActionResult class as being a way to encapsulate the particular type of response that you want to send to the browser. The response certainly comprises the actual data, but it also includes the content type, the status code, and any cookies and headers that you intend to send. All of these things are aspects of the response you might want to control through a tailor-made ActionResult class.

Inside Real-World Action Result Classes

The mechanics of action result classes are best understood by taking a tour of a couple of system-provided action result classes. Let's start with a very simple class such as HttpStatusCodeResult.

public class HttpStatusCodeResult : ActionResult
{
    // Methods
    public HttpStatusCodeResult(int statusCode) : this(statusCode, null)
    {
    }

    public HttpStatusCodeResult(int statusCode, string statusDescription)
    {
        this.StatusCode = statusCode;
        this.StatusDescription = statusDescription;
    }

    public override void ExecuteResult(ControllerContext context)
    {
        if (context == null)
        {
            throw new ArgumentNullException("context");
        }
        context.HttpContext.Response.StatusCode = this.StatusCode;
        if (this.StatusDescription != null)
        {
            context.HttpContext.Response.StatusDescription = this.StatusDescription;
        }
    }

    // Properties
    public int StatusCode { get; private set; }

    public string StatusDescription { get; private set; }
}

As you can see, all it does is set the status code and description of the HTTP response object. In ASP.NET MVC 3, a specific HTTP response class is built on top of HttpStatusCodeResult: the HttpUnauthorizedResult class. As the code listing shows, this class is just a wrapper that does nothing more than set the status code:

public class HttpUnauthorizedResult : HttpStatusCodeResult
{
    // Fields
    private const int UnauthorizedCode = 401;

    // Methods
    public HttpUnauthorizedResult() : this(null)
    {
    }

    public HttpUnauthorizedResult(String statusDescription) : base(UnauthorizedCode, statusDescription)
    {
    }
}

To create, for example, an HttpNotFoundResult custom type, all you have to do is duplicate the previous code and just set the proper status code: 404 (0x194) in this case.
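A sketch of what that might look like, simply mirroring the wrapper above (ASP.NET MVC 3 also ships its own HttpNotFoundResult):

public class HttpNotFoundResult : HttpStatusCodeResult
{
    // Fields
    private const int NotFoundCode = 404;   // 0x194

    // Methods
    public HttpNotFoundResult() : this(null)
    {
    }

    public HttpNotFoundResult(String statusDescription) : base(NotFoundCode, statusDescription)
    {
    }
}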

A slightly more sophisticated example is the FileResult class. This class supplies a public property, ContentType, that contains the type of the data being written to the output stream.

public abstract class FileResult : ActionResult
{
    // Fields
    private string _fileDownloadName;

    // Methods
    protected FileResult(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            throw new ArgumentException();
        }
        this.ContentType = contentType;
    }

    public override void ExecuteResult(ControllerContext context)
    {
        if (context == null)
        {
            throw new ArgumentNullException("context");
        }
        var response = context.HttpContext.Response;
        response.ContentType = this.ContentType;
        if (!String.IsNullOrEmpty(FileDownloadName))
        {
            var headerValue = ContentDispositionUtil.GetHeaderValue(FileDownloadName);
            context.HttpContext.Response.AddHeader("Content-Disposition", headerValue);
        }
        WriteFile(response);
    }

    protected abstract void WriteFile(HttpResponseBase response);

    // Properties
    public string ContentType { get; private set; }

    public string FileDownloadName
    {
        get { return (_fileDownloadName ?? string.Empty); }
        set { _fileDownloadName = value; }
    }
}

In the implementation of ExecuteResult, the class simply downloads the content of a given file. FileResult is a base class, and in fact it exposes an abstract method—the WriteFile method—through which derived classes can specify how to download and where and how to get the bits. It is interesting to note the role of the FileDownloadName property. The property doesn't refer to the name of the server file to return to the browser. Instead, it gets and sets the content-disposition header. The header gives a name for the file that can be used to save the file locally on the client, and that the browser would display as the default name within a file-download dialog box.
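In practice you rarely set FileDownloadName by hand; the Controller.File helper has an overload that fills it in for you. A minimal sketch (the path and file names here are hypothetical):

public ActionResult Invoice()
{
    // The third argument ends up in FileDownloadName and, from there,
    // in the Content-Disposition header sent to the browser.
    return File(@"C:\Data\Invoice-0042.pdf", "application/pdf", "Invoice.pdf");
}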

The class FilePathResult builds on top of FileResult and just adds the ability to download any type of file. Here’s the source code:

public class FilePathResult : FileResult
{
    // Methods
    public FilePathResult(string fileName, string contentType) : base(contentType)
    {
        if (string.IsNullOrEmpty(fileName))
        {
            throw new ArgumentException();
        }
        FileName = fileName;
    }

    protected override void WriteFile(HttpResponseBase response)
    {
        response.TransmitFile(FileName);
    }

    // Properties
    public string FileName { get; private set; }
}

This class defines a FileName property and uses the native TransmitFile method of the Response object to download the file. With this information to hand, we can conclude that the action result object is simply a way to encapsulate all the tasks you need to accomplish in particular situations such as:

When a requested resource is missing
When a requested resource is redirected
When some special response must be served to the browser

Let’s examine a few interesting cases where you might want to have custom action result objects.

Permanent Redirect

Suppose that at some point you decide to expose a given feature of your application through another URL but still need to support the old URL. To increase your Search-Engine Optimization (SEO) ranking, you might want to implement a permanent redirect instead of a classic (temporary) HTTP 302 redirect. ASP.NET MVC 3 supplies a RedirectResult class with a Boolean property to make the redirect permanent via an HTTP 301 status code. This feature, however, is lacking in ASP.NET MVC 2. Anyway, it would be a good exercise to have a look at a possible implementation that follows closely that of RedirectResult in ASP.NET MVC.

public class PermanentRedirectResult : ActionResult
{
    public string Url { get; set; }
    public bool ShouldEndResponse { get; set; }

    public PermanentRedirectResult(string url)
    {
        if (String.IsNullOrEmpty(url))
            throw new ArgumentException("url");
        Url = url;
        ShouldEndResponse = false;
    }

    public override void ExecuteResult(ControllerContext context)
    {
        // Preconditions
        if (context == null)
            throw new ArgumentNullException("context");

        // Mark all keys in the TempData dictionary for retention
        context.Controller.TempData.Keep();

        // Prepare the response
        var url = UrlHelper.GenerateContentUrl(Url, context.HttpContext);
        var response = context.HttpContext.Response;
        response.Clear();
        response.StatusCode = 301;
        response.AddHeader("Location", url);

        // Optionally end the request
        if (ShouldEndResponse)
            response.End();
    }
}

Having this class available, you can easily move your features around without impacting the SEO level of your application. Here's how to use this class in a controller method.

public ActionResult Old()
{
    string newUrl = "/Home/Index";
    return new PermanentRedirectResult(newUrl);
}

Figure 1 shows the result of the call as it shows up in FireBug.

Figure 1. The original URL is reported as permanently moved.

Returning PDF Data

A common developer requirement is to return binary data from a request. Within the category of 'binary data' fall many different types, such as the pixels of an image, the content of a PDF file, or even a Silverlight package. Frankly, you don't really need an ad hoc action result object to deal with binary data. Among the built-in action result objects, you can certainly find one that helps you to work with your particular type of binary data.

If the content you want to transfer is stored within a disk file, you can use the FilePathResult object. If your content is available through a stream, you use FileStreamResult, and you opt for FileContentResult if you have it available as a byte array. All these objects derive from FileResult and differ from one another only in the way that they write out data to the response stream. Let's see what it takes to return some PDF data.
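For example, assuming the PDF bytes are already in memory or available as an open stream (the helper methods below are hypothetical), the corresponding File overloads look like this:

public ActionResult AsBytes()
{
    byte[] pdfBytes = GetPdfBytes();              // hypothetical helper returning the PDF content
    return File(pdfBytes, "application/pdf");     // wrapped in a FileContentResult
}

public ActionResult AsStream()
{
    Stream pdfStream = OpenPdfStream();           // hypothetical helper returning an open stream
    return File(pdfStream, "application/pdf");    // wrapped in a FileStreamResult
}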

To be honest, for the task "return a PDF file from an action method", the big problem to solve is how you get hold of the PDF content. If your PDF content is a static resource such as a server file, then all you need is FilePathResult. Better yet, you can use the ad hoc File method as below:

public ActionResult About()
{
    :
    return File(documentName, "application/pdf");
}

To create PDF content on the fly, you can use a bunch of libraries such as Report.net or iTextSharp. Some commercial products and various open-source projects also let you create PDF content from HTML content. This would be particularly interesting for an ASP.NET MVC application in that you can grab a partial view and turn it into downloadable PDF content. A simple and effective tool that turns a web page or HTML document into PDF is wkhtmltopdf—a command-line tool that you use as below:

wkhtmltopdf [url] [pdf-file]

The link where you can get the tool is http://code.google.com/p/wkhtmltopdf. In ASP.NET MVC, you deploy the executable to the server, configure permissions properly so that a new file can be created in a given folder, and then invoke the executable programmatically:

var exe = new Process();
exe.StartInfo.FileName = "wkhtmltopdf.exe";
exe.StartInfo.Arguments = "/home/getpdf yourfile.pdf";
exe.Start();

In this case, getpdf is assumed to be the name of an action method in your particular controller class. There are a couple of details that you'll need to settle in this code snippet. First, how do you locate the executable (the PATH variable, or perhaps the application's working directory)? Second, where do you save the PDF file? How do you deal with the related write permissions? Once you have the PDF file, you just pass its path to the TransmitFile method of the Response object for actual transmission.

Creating PDF Content from Word Documents

You might want to return PDF content that is created from a fixed template, such as a Word document template (*.dotx). If you don't mind having Word installed on the Web server, you can quickly develop a solution by using interop assemblies. Here's the code you would need to create a new Word document from a fixed template padded with bookmarks. Next, you fill in the bookmarks with fresh data and save it as PDF.

public static void CreatePdfDocument(String fileName, String templateName, DateTime current, String author)
{
    // Run Word; keep it invisible (set Visible = true to watch it work for demo purposes)
    var wordApp = new Application { Visible = false };

    // Create a new document from the template
    var templatedDocument = wordApp.Documents.Add(templateName);
    templatedDocument.Activate();

    // Fill the bookmarks in the document
    templatedDocument.Bookmarks[CurrentDateBookmark].Range.Select();
    wordApp.Selection.TypeText(current.ToString());
    templatedDocument.Bookmarks[SignatureBookmark].Range.Select();
    wordApp.Selection.TypeText(author);

    // Save the document as PDF
    templatedDocument.SaveAs(fileName, WdSaveFormat.wdFormatPDF);

    // Clean up
    templatedDocument.Close(WdSaveOptions.wdDoNotSaveChanges);
    wordApp.Quit();
}

The first parameter indicates the name of the resulting PDF file. The second argument is the name of the Word DOTX template. Finally, the remaining parameters are for the data to store in the document via bookmarks. In particular, the template must have a couple of bookmarks with fixed names (the ones in the code are merely examples). Once the document is finalized and saved, you may close the Word application, but you need to choose the close option wdDoNotSaveChanges, as otherwise it would pop up a dialog box, which would be very, very bad for an unattended server environment.

A moment ago, I raised a little doubt about the effectiveness of using an Office application from within a server environment such as ASP.NET. Actually, there's a KB article from Microsoft that marks it as definitely not being recommended practice. The URL is http://support.microsoft.com/?id=257757. Overall, the reason for not using Office applications server-side is that these applications may exhibit unpleasant behavior in terms of scalability, permissions, and interactivity. Having said that, as long as this is not a critical production system, and you can find the right balance of permissions and parameters and fine-tune the ASP.NET application the way you want, the only serious concern is scalability. For this reason, you might want to move Office functions to a service that does the work in isolation and returns a PDF stream.

The companion code for this article provides a sample ASP.NET MVC project. The code works just fine as long as you use the embedded Visual Studio Web server. If you move it to a real IIS Web server, saving the file locally fails because of default security permissions: the ASP.NET default account doesn't, by default, have write permission on the folder where the file is created. Figure 2 shows the sample PDF document created by the previous code. Note the typical ASP.NET MVC about URL and then the content in the browser.


Figure 2. Downloading a PDF document

Final Considerations

When it comes to PDF in ASP.NET MVC, the biggest problem is not that of returning the content but just of creating it. What's the template of the document? Most libraries offer a markup language or an API through which you "draw" content on a surface. For example, this is what you need for a trivial hello-world using Report.net. The source code is adapted from one of the examples coming with the library. (Note that the library is under the AGPL license, meaning that either your project is open-source or you need to buy a commercial license.)

var report = new Report(new PdfFormatter());
var fd = new FontDef(report, "Helvetica");
var fp = new FontPropMM(fd, 25);
var page = new Page(report);
page.AddCenteredMM(80, new RepString(fp, "Hi there!"));
RT.ViewPDF(report, "Hello.pdf");

For a fixed template, Word offers a great tool at the cost of running Office on the server. In some cases, an interesting workaround would be to create a PDF template with input forms (this requires Adobe tools) and then use a PDF library to just fill in those values and return the saved template.
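As a rough illustration of that workaround, filling the form fields of a PDF template with a library such as iTextSharp might look something like this (the file names and field names are hypothetical):

// Requires the iTextSharp library (iTextSharp.text.pdf namespace)
var reader = new PdfReader(@"C:\Templates\InvoiceTemplate.pdf");
using (var output = new FileStream(@"C:\Data\Invoice-0042.pdf", FileMode.Create))
using (var stamper = new PdfStamper(reader, output))
{
    AcroFields fields = stamper.AcroFields;
    fields.SetField("CustomerName", "Jane Doe");   // hypothetical form field
    fields.SetField("Total", "123.45");            // hypothetical form field
    stamper.FormFlattening = true;                 // make the filled-in values read-only
}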

As often happens in software, the major difficulty is not technical but analytical. PDF content from ASP.NET MVC is no exception.

© Simple-Talk.com


The Cookie Crumbles

Published Monday, July 04, 2011 10:34 PM

Cookies were never intended to invade your privacy. Transient session cookies were invented out of necessity, by Lou Montulli at Netscape in June 1994, purely to make the use of a shopping cart possible on the stateless web. Permanent, or tracking, cookies soon followed, in order to identify users between sessions and so save users the tedium of having to identify themselves for every session. Such cookies should be innocuous because they can only be read by the site contained in the cookie. Nevertheless they were, even then, viewed with suspicion, as a security risk, since if the cookie can somehow be copied it can be used to impersonate the user.

The biggest problem with persistent cookies, however, is that they can contain 'third-party' domains, rather than just the domain of the site that writes it. A 'third-party' domain is easily written to from any site that chooses to do so, allowing an unscrupulous marketing agency to collect information about the browsing and buying habits of internet shoppers, across all the sites where it has its advertisements or web bugs placed, and so target advertising to the individual. Although the US government has strict rules against the use of persistent cookies in this manner, the same isn't true of commercial sites, which have resisted voluntary regulation of the use of third-party cookies.

Due to widespread concerns about this invasion of privacy, the "European Commission Privacy and Electronic Communications Directive" was issued, and has to be implemented by every member state by May 25th. (So far, only the northern European countries have complied.) It changes the requirement that the user has a right to refuse the storing of cookies, either third-party or not, on a local machine, to an obligation on the part of the website to obtain explicit "informed consent" for all cookies being used, even the session cookies.

This is a botched reaction. The legislators seem to have confused simple cookies and third-party cookies. The Directive allows cookies only for activities that are 'strictly necessary' for the operation of a website and its delivery of those services that a user has explicitly requested. This is open to varying interpretations, and the UK government has concluded that 'we consider that it will, for instance, cover (allow) the use of cookies that underpin the use of shopping baskets on websites.' However, the law was passed together with the ambiguous wording.

The IT industry foolishly concluded that, if the user's browser settings allowed third-party cookies, then they had given consent to them. Not so, said the EU, and since then the IT industry has been negotiating with the EU and member states on best practices that would be considered compliant. The UK came up with a 'best practice' consisting of "…an easily recognisable internet icon, a privacy policy notice, a single consumer control page, with a self-regulatory compliance and enforcement mechanism…", via which a consumer could access details about each specific internet advert, the advertiser, the server, and so on, and refuse the cookie, if desired.

This is all very well-meaning, but also very silly. Cookies are necessary, and very few users will tolerate having to click to opt in on every site they visit. It's also unlikely that users will check the details of every advert before opting to allow cookies for a site, and the site containing the advert could be completely unaware of the data being collected. Therefore, it will hardly prevent an unscrupulous marketing organisation from harvesting the users' internet activities.

As if to prove that the 'e-privacy' directive is fatuous and unworkable, tracked traffic to the website of the Information Commissioner's Office (ICO) fell by 90% when it recently adopted measures to gain cookie consent. A freedom of information (FOI) request by Vicky Brock, a Web Analyst, forced them to release the information.

This is a graph that is likely to strike fear into any well-meaning site that tries to comply with this regulation.

Surely, a much more sensible solution is this: browsers shouldn't allow third-party cookies by default, as they serve no honourable purpose; though, for some reason, Microsoft's Hotmail, MSN, and Windows Live Mail webmail require them! Currently, users have to explicitly opt out, by turning off third-party cookies. This simply needs to change.


Instead, all browser publishers seem hell-bent on making it more difficult to opt out of third-party cookies; it took me ten minutes to discover how to do it in Firefox. It would seem that the advertising and marketing interests on the Internet are too powerful to ignore, and instead we are likely to see, on every site, tedious opt-in forms and 'easily recognisable internet icons', privacy policy notices, consumer control pages, and other mandatory gubbins. How much more sensible a voluntary code of practice would have been.

by Andrew Clarke



Arrays in SQL that Avoid Repeated Groups

05 July 2011
by Joe Celko

It is certainly possible to fake an Array in SQL, but there are only a few occasions when it would be the best design. Most often, the wish for an array in SQL is a sign of a forlorn struggle against poorly-normalised data. One of the worst sins against Codd is the repeating group, as Joe explains.

For a long time, procedural languages have used arrays to hold and manipulate data, so programmers are used to designing data with them. The relational model states that all data is shown in tables, and we have no other data structures. To be in First Normal Form (1NF) a table can have no repeating groups, which means that it has only columns of scalar values. You can make a good case that you can only declare a table in 1NF in SQL. But that does not stop programmers from finding ways to "fake it" and make their SQL look like their familiar application languages.

In procedural languages, arrays are stored in "row major" or "column major" order. Go to Wikipedia (Row-major order) for details, but the important concept is how to take an n-dimensional array and "flatten" it into sequential physical storage. This is important for procedural languages because it determines the algorithms used to access the array elements. In SQL, we do not care about any physical storage.

It is quite common to see repeated groups used in SQL to attempt to simulate an Array. An example of a repeated group is an employee table with a set of columns for dependents, as follows:

CREATE TABLE Personnel
(emp_id INTEGER NOT NULL PRIMARY KEY,
 emp_name VARCHAR(25) NOT NULL,
 kid_name_1 VARCHAR(25),
 kid_name_2 VARCHAR(25),
 kid_name_3 VARCHAR(25),
 kid_name_4 VARCHAR(25),
 kid_name_5 VARCHAR(25),
 ..);

This table has many problems. All the columns in the repeating group must be NULL-able, otherwise you cannot represent employees with zero, one, two, three, or four dependent children. The alternative of requiring all employees to have exactly five kids does not work. Likewise, the first employee that has six children messes up the table. And you cannot require him to ditch one of his dependents.

If "number one son" dies, you have to decide whether to leave his slot NULL or to move the others up one notch. Where would you put the name of a new daughter in the group? Position in a repeating group often has meaning in itself, so you cannot use a single simple UPDATE, INSERT, or DELETE procedure for such tables.

Queries are much more difficult. As an exercise, try to write queries against the Personnel table to find:

1. All kids named "George" by the same employee. George Foreman should be the only answer to this one.
2. All employees with three or more offspring
3. All employees with a kid who has the same name as they do (find the juniors)
4. Pairs of employees whose children have the same names

You are unlikely to enjoy this task. Using children makes the silliness easy to see.

Representing Arrays in SQL

SQL cannot represent arrays directly, but vendors often provide array language extensions. Two methods for supporting arrays are to have columns with "array" data types (as a whole) or to allow the referencing of groups of columns by subscript (element by element). Subscripts are also called array indexes, but that term can be confused with table indexes in SQL, so I use the term "subscript" instead.

An array in other programming languages has a name and subscripts by which you reference the array elements. Typically, the array elements all have the same data type and the subscripts are all integers. Some languages start numbering at zero, some at one, and others let the user set the upper and lower bounds. A Pascal array declaration, for example, would look like this:

MyArray : ARRAY [1..5] OF INTEGER;

This would have elements MyArray[1], MyArray[2], MyArray[3], MyArray[4], and MyArray[5]. The same structure is often mapped into a SQL declaration as:

CREATE TABLE MyArray1
(element1 INTEGER NOT NULL,
 element2 INTEGER NOT NULL,
 element3 INTEGER NOT NULL,
 element4 INTEGER NOT NULL,
 element5 INTEGER NOT NULL);

You have to go to all this trouble because there is no subscript that you can iterate in a loop. In fact, there is no loop control structure at all! You must use column names.
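For instance, even something as simple as totalling the five elements has to spell out every column by name (a hypothetical query against MyArray1):

-- Every operation against MyArray1 must name each column explicitly
SELECT element1 + element2 + element3 + element4 + element5 AS total_value
  FROM MyArray1;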

A better approach to faking an array in the relational model is to map arrays into a table with an integer column for each subscript, as follows:

CREATE TABLE MyArray2
(i INTEGER NOT NULL CHECK (i BETWEEN 1 AND 5),
 element INTEGER NOT NULL,
 PRIMARY KEY (i));

This looks more complex than the first approach, but it is closer to what the original Pascal, or any other procedural language, declaration does behind the scenes. Subscripts resolve to unique physical addresses, so it is not possible to have two values for MyArray[i]; hence i is a key. The compiler will check to see that the subscripts are within the declared range using the CHECK() clause.

The first advantage of this approach is that you can easily handle multidimensional arrays by adding another column for each subscript. The Pascal declaration:

ThreeD : ARRAY [1..3, 1..4, 1..5] OF REAL;

becomes:

CREATE TABLE ThreeD
(i INTEGER NOT NULL CHECK (i BETWEEN 1 AND 3),
 j INTEGER NOT NULL CHECK (j BETWEEN 1 AND 4),
 k INTEGER NOT NULL CHECK (k BETWEEN 1 AND 5),
 element REAL NOT NULL,
 PRIMARY KEY (i, j, k));

Obviously, GROUP BY clauses on the subscript columns will produce row and column totals. If you used the original one-element/one-column approach, the table declaration would have 60 columns, named "element111" to "element345". This would be too many names to handle in any reasonable way.

This idiom can support matrix math, but I am not going to go into that in this article. If anyone is interested, try to write the following operations for a classic two-dimensional matrix (a sketch of matrix addition follows the list):

1. Matrix equality
2. Matrix addition
3. Matrix multiplication
4. Matrix sorting
5. Compute a determinant
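As a starting hint for the second item, here is a rough sketch of matrix addition over two matrices stored in the subscripted style; the tables MatrixA and MatrixB are hypothetical, each declared with columns (i, j, element) and PRIMARY KEY (i, j):

-- Element-wise addition of two equally-sized matrices
SELECT A.i, A.j, (A.element + B.element) AS element
  FROM MatrixA AS A, MatrixB AS B
 WHERE A.i = B.i
   AND A.j = B.j;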

Let's go back to the Personnel table and declare it using this approach:

CREATE TABLE Personnel
(emp_id INTEGER NOT NULL PRIMARY KEY,
 emp_name VARCHAR(25) NOT NULL,
 kid_name VARCHAR(25) NOT NULL,
 birth_seq INTEGER NOT NULL, -- subscript with a good name
 ..);

Actually, you should normalize this table further by splitting it into Personnel and Dependents tables. The Dependents table needs its own constraints and references back to the Personnel table. But the way to think about it is that you are doing explicitly what has been done implicitly.


CREATE TABLE Personnel
(emp_id INTEGER NOT NULL PRIMARY KEY,
 emp_name VARCHAR(25) NOT NULL,
 ..);

CREATE TABLE Dependents
(emp_id INTEGER NOT NULL
   REFERENCES Personnel(emp_id)
   ON DELETE CASCADE
   ON UPDATE CASCADE,
 birth_seq INTEGER NOT NULL CHECK (birth_seq > 0),
 kid_name VARCHAR(25) NOT NULL,
 PRIMARY KEY (emp_id, birth_seq));

The four proposed queries are now simple and will work for families of any size.

1. All kids named "George"

SELECT kid_name FROM Dependents WHERE kid_name = 'George';

2. All Personnel with three or more offspring

SELECT P.emp_name, COUNT(*) AS kid_cnt
  FROM Dependents AS D, Personnel AS P
 WHERE D.emp_id = P.emp_id
 GROUP BY P.emp_name
HAVING COUNT(*) >= 3;

3. All Personnel with a kid who has the same name as they do (find the "juniors"). George Foreman will get several lines in the output…

SELECT P.emp_name AS junior, D.birth_seq
  FROM Dependents AS D, Personnel AS P
 WHERE D.emp_id = P.emp_id
   AND D.kid_name = P.emp_name;

4. The query "find pairs of employees whose children all have the same names" is very restrictive. Both Mr. X and Mr. Y must have exactly the same number of dependents, and both sets of names must match and be in the same birth_seq. (We can assume that nobody has two children with the same name -- except George Foreman.) Begin by constructing a table of sample data:

INSERT INTO Dependents (emp_id, kid_name, birth_seq)
VALUES (1, 'Dick', 2), (1, 'Harry', 3), (1, 'Tom', 1),
       (2, 'Dick', 3), (2, 'Harry', 1), (2, 'Tom', 2),
       (3, 'Dick', 2), (3, 'Harry', 3), (3, 'Tom', 1),
       (4, 'Harry', 1), (4, 'Tom', 2),
       (5, 'Curly', 2), (5, 'Harry', 3), (5, 'Moe', 1);

In this test data, employees 1, 2, and 3 all have kids named Tom, Dick, and Harry. The birth order is the same for the children of employees 1 and 3. For testing purposes, you might consider adding an extra child to the family of employee 3, and so on.

While there are many ways to do this query, this approach gives you some flexibility that others do not. Construct a VIEW that gives you the number of each employee's dependents:

CREATE VIEW Familysize (emp_id, dependent_cnt)
AS SELECT emp_id, COUNT(*)
     FROM Dependents
    GROUP BY emp_id;

Create a second VIEW that holds pairs of Personnel who have families of the same size:

CREATE VIEW Samesize_Families (emp_id1, emp_id2, dependent_cnt)
AS SELECT F1.emp_id, F2.emp_id, F1.dependent_cnt
     FROM Familysize AS F1, Familysize AS F2
    WHERE F1.dependent_cnt = F2.dependent_cnt;

You can test for set equality by doing a self-join on the dependents of Personnel with the same size families. If one set can be mapped onto another with no children left over, then the two sets are equal:

SELECT D1.emp_id AS first_emp_id, D2.emp_id AS second_emp_id, S1.dependent_cnt
  FROM Dependents AS D1, Dependents AS D2, Samesize_Families AS S1
 WHERE S1.emp_id1 = D1.emp_id
   AND S1.emp_id2 = D2.emp_id
   AND D1.kid_name = D2.kid_name
   AND D1.birth_seq = D2.birth_seq
 GROUP BY D1.emp_id, D2.emp_id, S1.dependent_cnt
HAVING COUNT(*) = S1.dependent_cnt;

If birth order is not important, then drop the predicate "D1.birth_seq = D2.birth_seq" from the query. Obviously, all of the VIEWs could be done with CTEs.
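As a sketch of that last point, the same set-equality test rewritten with CTEs instead of VIEWs, assuming a dialect that supports the WITH clause, could look like this:

WITH Familysize (emp_id, dependent_cnt)
     AS (SELECT emp_id, COUNT(*)
           FROM Dependents
          GROUP BY emp_id),
     Samesize_Families (emp_id1, emp_id2, dependent_cnt)
     AS (SELECT F1.emp_id, F2.emp_id, F1.dependent_cnt
           FROM Familysize AS F1, Familysize AS F2
          WHERE F1.dependent_cnt = F2.dependent_cnt)
SELECT D1.emp_id AS first_emp_id, D2.emp_id AS second_emp_id, S1.dependent_cnt
  FROM Dependents AS D1, Dependents AS D2, Samesize_Families AS S1
 WHERE S1.emp_id1 = D1.emp_id
   AND S1.emp_id2 = D2.emp_id
   AND D1.kid_name = D2.kid_name
   AND D1.birth_seq = D2.birth_seq
 GROUP BY D1.emp_id, D2.emp_id, S1.dependent_cnt
HAVING COUNT(*) = S1.dependent_cnt;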

Flattening a Table into an Array

In a report, you often want to see an array distributed horizontally on a line. The one-element-per-column approach to mapping arrays into SQL was based on seeing such reports and duplicating that structure in a table. Yes, you can use non-relational proprietary extensions like PIVOT today. But you need to know the “subscripts and value” approach so that you can read old code.

Imagine, for example, a company that collects monthly sales reports with each sales representative's name, the month_name, and each rep's total dollar figure. You want to produce a report with one line for each person and his or her year's work across the page. The sales reports table looks like this:

CREATE TABLE Sales
(salerep_name CHAR(25) NOT NULL,
 month_name CHAR(3) NOT NULL,        -- the subscript
 sales_amt DECIMAL (10,2) NOT NULL,  -- the element
 PRIMARY KEY (salerep_name, month_name));

You need to flatten out this table to get the desired rows for the report. First create a working storage table from which the report can be built.

CREATE TABLE ReportWork -- working storage
(salerep_name CHAR(25) PRIMARY KEY,
 jan DECIMAL(8,2) DEFAULT (0.00) NOT NULL,
 feb DECIMAL(8,2) DEFAULT (0.00) NOT NULL,
 mar DECIMAL(8,2) DEFAULT (0.00) NOT NULL,
 ..
 "dec" DECIMAL(8,2) DEFAULT (0.00) NOT NULL);

NOTE: DEC is a reserved shorthand word for DECIMAL in Standard SQL, so it has to be double quoted.

Notice that the primary key is the sales rep's name and that the monthly data columns default to zero dollars. The first step is to get a row for every sales rep in the working table:


INSERT INTO ReportWork (salerep_name)
SELECT emp_name
  FROM Personnel
 WHERE job_title = 'Salesrep';

Because of the DEFAULT() clause, the other columns will fill with zero amounts. If your SQL does not have a DEFAULT() clause, simply add a dozen constant zeros to the SELECT list.

The data from the Sales table is then added into the working table with a series of UPDATEs of the form:

UPDATE ReportWork AS RW
   SET jan = (SELECT sales_amt
                FROM Sales AS S1
               WHERE S1.month_name = 'Jan'
                 AND S1.salerep_name = RW.salerep_name);

Or, using newer SQL features:

MERGE INTO ReportWork AS RW
USING Sales AS S1
   ON S1.salerep_name = RW.salerep_name
 WHEN MATCHED THEN
   UPDATE SET jan = CASE WHEN S1.month_name = 'Jan' THEN S1.sales_amt ELSE 0.00 END,
              feb = CASE WHEN S1.month_name = 'Feb' THEN S1.sales_amt ELSE 0.00 END,
              ..
              "dec" = CASE WHEN S1.month_name = 'Dec' THEN S1.sales_amt ELSE 0.00 END;

This basic technique can be modified to handle NULLs, collect totals, and so forth. The trick is in having a column in the flattened table that matches a value from the 1NF table.
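For comparison, on platforms that allow it, the same flattening can be done in a single conditional-aggregation query over the Sales table, without the working storage table. This is only a sketch of the idea, not the article's original method, and the elided months follow the same pattern:

SELECT salerep_name,
       SUM(CASE WHEN month_name = 'Jan' THEN sales_amt ELSE 0.00 END) AS jan,
       SUM(CASE WHEN month_name = 'Feb' THEN sales_amt ELSE 0.00 END) AS feb,
       -- ... one branch per month ...
       SUM(CASE WHEN month_name = 'Dec' THEN sales_amt ELSE 0.00 END) AS "dec"
  FROM Sales
 GROUP BY salerep_name;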

© Simple-Talk.com


The One Way That High Availability Will Help You
29 June 2011
by Wesley David

High Availability (HA) is a term that is beloved of the marketing people, with its connotations of an unspecific sense of reassurance. However, service reliability cannot be bought like bath salts. As a seasoned and cynical expert in the field explains, HA can be more than the start of 'HAHA!': there is a role for HA in keeping your services running, if it is used as part of a broader solution.

After eviscerating high availability over not one, not two, but three articles, I've probably left you wondering if there's anything positive that high availability can do for you. Well, you've waited so patiently, so it gives me great pleasure to confirm that, fortunately, there is something good about high availability. While you might be expecting the article to keep with the theme I've established thus far, and be titled something like “Seven Things that High Availability Will Help You With”, in truth there really aren't seven distinct things that HA is good for. Instead, you can read about the one thing that it's good for, seven times.

I should give fair warning that the startling conclusion to this epic saga may be obvious to many of you. Indeed, I hope it is, because that way I can sleep soundly, knowing that the networks of the world are patrolled by savvy administrators. However, that doesn't mean that it doesn't bear pointing out, or that there mightn't be a twist you hadn't considered. Read on, intrepid SysAdmins, and have your hunches confirmed.

Definition and Exposition

First, let’s make sure that we all understand what high availability actually is. HA is any implementation of a set of hardware or software featuresthat protects a specific service, allowing for some part of the system that the service is running on to fail, while still allowing access to thatservice. Perhaps that description, in spite of being a run-on sentence, is an oversimplification. There are plenty of nuances, caveats andexceptions that will muddy the waters in some HA implementations. However, generally speaking, that’s a fair summation of high availability.

With that in mind, the one thing that HA systems are good for should be self-evident: keeping a very specific service protected from unavailability as a result of a very specific set of system failures. Yes, there is a lot of emphasis on specificity. If it sounds to you like HA is only good for a virtual knife edge of scenarios, then you're thinking correctly. Indeed there are many different methods of achieving high availability on the market, including application HA, hardware HA and site HA among the more popular kinds. Regardless, each kind of HA has a minuscule list of disasters that it can protect against.

The concept of “what HA will protect you from” is probably not much different from an emergency plan that you may have in case of a natural (orunnatural) disaster. Where I grew up, earthquakes and volcanic eruptions were significant concerns, and my family had an emergency plan incase of a sudden disaster. We did our best to practice it and keep it ready, but we obviously all hoped we’d never have to use it. Of course, theplans we had would have proved virtually useless in the event of a different disaster, like a flood or house fire. In the same way, HA is merely afirst line of defense against a specific set of worst-case scenarios, and will prove wholly inadequate for a separate set of disasters.

And Now For An Example

Just recently I had an up-close-and-personal encounter with the narrow scope of systems protection. I'm designing a small datacenter deployment that needs a highly available firewall solution, and one of my better design proposals involves a pair of Active/Passive ASA 5510s. I began to evaluate exactly what disasters that design protected me from, and was saddened at the shockingly small list - I'm only protected from a hardware fault in the primary ASA. No upstream network redundancy. No site redundancy. Just plain, ol' lemon protection in case an ASA bites the dust.

It might be obvious to many (I certainly hope it is), and indeed I wasn't surprised, but it is a good reality check to explicitly list out all of the potential disasters that lurk in the shadows, followed by a list of what (if anything) you're protected from with your current HA systems. The disparity between the number of items on the first list and the number of items on the second will determine your resiliency (as well as your ability to sleep soundly at night).

Grudging Admittance

"Is that it?!" you might be thinking. "Surely there must be more that high availability is good for!" Okay, maybe there’s a few other things that I’mwilling to admit HA is good for. First, it’s a good time-buyer. Remember that if one component dies and the failover steps in, you haven’t exactlyavoided a disaster; you are currently living in one. You will not be out of that disaster until that failed node is repaired and brought back into thecluster. However, the failover component has bought you some valuable time to bring the situation back to normal. Many a grievous mistake hasbeen made when a service is completely unavailable and a replacement needs to be ordered and implemented in mere hours. HA allows yousome calmer moments to make more perspicacious decisions.

High availability could also be a useful marketing tool. HA looks good to customers and management. Before I go any further, know that I'm most certainly not advocating false bravado. However, if you've implemented an impressive high availability system, then you should tell people about it. If your HA system faces inwards relative to your company, then make sure that upper management knows about it. Executives love to brag to other executives about the gadgetry that makes their company run, so you can play a small part in developing the positive image of your company (just make sure that the image is warranted).


Equally, if the HA system in question directly faces your customers, then tell them about it! It can increase customer confidence and potentially increase sales. Every little bit, as long as it's truthful, can count in the mystical art of customer conversion rates.

In Conclusion

If your reading speed is in the average range of an adult (250 to 300 WPM) then it's only taken you four or five minutes to power through this article. Twenty minutes if you're multitasking with salesmen on the phone and a flapping Nagios monitor victimizing your cell phone. Hopefully nothing I said was terribly groundbreaking for you. If it was, not to worry! We all learn new things every day. If it wasn't new, then consider yourself refreshed and ready to re-assess your HA efforts.

At the end of the matter, HA is just the tip of a very large technological iceberg. More like a single ice crystal in a snowflake in a drift on the south end of B-15. HA in its most proper form is built on top of excellent change management, stellar lifecycle management and a magnificent supporting infrastructure (meaning not having an old DSL line as your web connection for a monster Hadoop cluster). However, if kept in proper perspective, high availability can be a very effective tool in keeping your services running and your business profitable.

See? It’s not all blood and gore! I managed to (somehow) end this series on a positive note, which, given the amount of marketing hyperbole I’vebeen bombarded with in my day job, is something I’m rather proud of. High availability does have a valuable role. A very tightly scoped role, but avaluable one nonetheless. Just make sure that you’re intimately acquainted with what few scenarios are covered by that scope. After all, thedaemon is in the details.

© Simple-Talk.com


Inside Red Gate - Testers
Published Thursday, July 07, 2011 2:50 PM

Developers might write good code, but no matter how good they are the result will always have bugs in it. It's up to the testers in the team to make sure the final product is as bug-free as it can be.

Deciding what to test
Within a project there's normally no official documentation produced, no official record of what the project should accomplish. The closest thing we get to documentation is the greenlight presentation slides (I'll be covering the greenlight process in a later post). This means there's no specification to validate against, no way of confirming that the project is 'complete'. So how does a tester know what to test in the first place?

Within Red Gate, the same team normally stays on the same product for several minor and major releases (as an example, the same team has worked on SQL Source Control since its inception back in 2009). Everyone within the team has an intimate knowledge of what the tool does, what problems it solves, how customers use it, and what the main bugs and issues are. Testers are as much a part of the team as the developers or project manager.

This means testers simply don't need a specification. They know how customers use the tool, what bugs they're running into, how new features should interact with existing ones, and (roughly) how they actually work. The testers are fully involved in the project from day 1. This gives the testers deep knowledge of the application and application domain, so they know where and what they should test to ensure the final product works.

How to test a product
As with other things at Red Gate, there's no officially-mandated way of testing a product, no documentation that needs to be produced. It's entirely up to the testers to decide how they want to test the product. Most of the time this will involve writing automated unit and integration tests to run on the build server with each new build. When testing the user interface, this usually means manual testing of all the different parts of the UI to make sure it all behaves sensibly whatever actions the user performs (although some testers are experimenting with automated UI testing).

As a product matures, it gains more and more tests. Not just tests for new features, but tests covering specific bugs encountered by customers. SQL Compare, our oldest existing product, now has over 30,000 unit and integration tests, whereas SmartAssembly (the product I'm working on) was acquired by Red Gate a couple of years ago, and has about 1000 unit tests and counting; Jason's writing more all the time.

At the end of the project, it's up to the testers to give the final go-ahead to release the product to customers. They need to satisfy themselves that the product is as bug-free as it can be and the new features work as they should, on every configuration the product will be run on, using whatever tools and methods they see fit. Only then does the installer go up on the website, and only then are the online documentation and product web pages updated with the new features and new documentation.

Testers are an integral part of the project team, and only they decide when the product can be released. They act as Red Gate's gatekeepers and quality control.

by Simon Cooper
Filed Under: Inside Red Gate


Building Your DBA Skillset
06 July 2011
by Chris Shaw

As a DBA and hiring manager, Chris Shaw has been on both sides of the recruitment process. As an MVP and active member of the SQL Server community, he knows what resources are available to help DBAs hone their abilities. Who better to guide you through the many paths to developing your DBA skillset?

Like all professional technologists, we must constantly learn new skills and concepts in order to keep pace. What skills should we develop, and where should we go to get educated and trained?

We’ll try to answer these questions in this article.

Three of the more important traits for database professionals are:

the ability to adapt;
to be aware of what can be done with the technology they are working with;
to know how to learn more about the technology in a given timeframe.

While it’s possible to teach people about how to extend their skills, the ability to adapt is not something that a traditional classroom education can easily teach, as it ultimately boils down to attitude more than aptitude. Fortunately, this attitude can be nurtured and developed alongside technical skills; indeed, someone willing to continue improving their skills and knowledge probably possesses strong elements of the right perspective already. As a result, most of the educational options we’re going to look at are not just available to those wishing to embark on a career as a DBA, but to working DBAs looking for career advancement as well.

Choosing your Training Options

In many professions, a form of higher education leading to a degree or similar certification is almost a necessity. However, in the database profession, higher education and extensive experience are on a more-or-less equal footing. A DBA's wages, and the decision on who to extend an offer to, will often depend on a mix of the two. Indeed, in the case of veteran DBAs with years of experience under their belts, higher education is desirable but not essential.

If you are seeking your first DBA role, or looking for promotion from a junior role, then the practical training provided by online classes, conferences, workshops, and User Group attendance can be invaluable. They are also excellent places to build your network of contacts in the profession.

Figure 1. There are a huge number of ways to get the information you need.

If, however, you are seeking a career as a DBA but lack direct experience of working with databases, then a relevant qualification or certification is a good place to start. However, the sort of real-world experience that can be gained from the more informal education paths may still be equally important to developing your skill set.

The informal routes to improving your skills are always worth pursuing, but the balance you strike between those and “traditional” methods will depend on your circumstances and needs.


"If you are consideringa technical school,check their accreditationcarefully"

Traditional Education Options

The options available for becoming a database professional have dramatically improved over the last 10 years. The quality and quantity of traditional education sources have grown, thanks in part to the rise of online resources. Traditional schools and institutions have come to realize that the expert management of data in the computing environment is essential in today’s business environment, so they have become far more willing to provide appropriate courses.

Colleges

Colleges and Universities are starting to develop specialized Computer Science degrees, but only a select few offer programs that are specifically database-related. While the range and quality of classes is increasing in the colleges that do offer database programs, I would still highly recommend that anyone considering formal education as a route into the database profession performs some thorough research. Find the colleges offering the best courses and the best instructors.

Because the technology changes so quickly, the better instructors are professionals that not only teach, but also use the technology on a regular basis. These instructors can pass along not only the technical “how-to”, but also the best practices and understanding of why things should be done a certain way. Many of the top professionals that work with SQL Server also teach classes about SQL Server, and their wisdom is invaluable.

Just to give you a sense of what’s on offer, a recent Internet search provides a good example of the classes that a Computer Science degree (with a focus in database technology) now includes:

Required

Quantitative Methods for Information Systems
Business Data Communication Networks
Database Design and Implementation for Business
Information Systems Analysis and Design
IT Strategy and Management

Concentration

Data Mining and Business Intelligence
Advanced Database Management

Elective

Database Security
Designing and Implementing a Data Warehouse
Database Administration

Database professionals may not agree on all aspects of such courses, but it's certainly a big step up from a single class in introductory SQL.

Technical Schools

One immediately noticeable difference between technical schools and traditional colleges or universities is the overall cost of attending. The traditional school, such as a University, may require you to take classes that do not directly impact your career in order to achieve a diploma. These classes come with their own costs, in terms of both time and money. A traditional University often also requires a full-time commitment, whereas technical schools generally cater to individuals who maintain a full-time job, and are more inclined to offer associate degrees.

It's true that attending a traditional college as opposed to a technical school often makes a bigger impression on your resume. However, from a hiring manager’s point of view, in terms of database skills, there is not much difference between candidates who attended a prestigious four-year school and those who went to a technical school.

That said, if you are considering a technical school, check their accreditation carefully. Accreditation requires adhering to a strict set of guidelines set by accreditation boards to maintain a certain standard of education.

Most schools have accreditation and work hard to keep it, and validating that a school is accredited helps protect you as the student. Many of the classes taken at an accredited technical school or community college will roll over to a traditional school if, as a student, you would like to obtain a more traditional degree.

Equivalent Experience


"Companies understandthese classes will notcreate experts, but willgenerate the coreknowledge needed tostart the advancedlearning process."

Many job postings specify that a degree or "equivalent experience" is required for the applicant to be eligible for a role. Somewhat frustratingly for applicants, the definition of equivalent experience is entirely unregulated. Essentially, each employer is left to define for themselves who is qualified to become a database professional. A company that has a strong need for a database administrator who understands database mirroring may consider a candidate’s single year of experience to be equivalent to another candidate’s success in a database mirroring class. To be fair, working with a database feature in production does generate many more “real experiences” than working with the same feature in a lab environment, but drawing an equivalence between those experiences and a certified qualification is too often a subjective judgment.

Figure 2. Learning from an educational environment and learning from experience are both valuable.

Unfortunately, there is no neat solution to this problem, and it falls to the applicant to research the company’s needs and expectations (in terms of their personal characteristics as well as their technical skills), and then consider not only what information should be included in the education portion of their resume, but how that information should be presented.

Microsoft Certified Classes

Microsoft offers classes for most of the software packages they sell, which can last between one day and one week. The classes are taught by certified Microsoft Trainers and tend to follow a standard pattern of lectures, followed by practical labs, followed by Q&A sessions.

The classes tend to be thorough and focused, generally starting right from the basic installation of the relevant software, along with installation options, and including a chance to try out the tools and features in well-equipped, high-spec labs. However, these classes frequently don’t dive deep into the advanced features of SQL Server.

Upon completion of the class, students are given a certificate of completion, which is not to be confused with a certification. Often companies will use these certified classes as a method to give employees a foundation in a new technology. Companies understand these classes will not create experts, but will generate the core knowledge needed to start the advanced learning process. These classes are all listed at: http://learning.microsoft.com/.

Certification

Microsoft also offers a range of Certifications designed to test a person's knowledge and skill in a specific technology, such as "SQL Server 2008", or in a broader discipline, such as "Database Administration". Some of the certifications that a Microsoft-centric DBA may want to consider include:

Microsoft Certified Technology Specialist (MCTS):
There are a range of certifications based around the SQL Server technology, such as MCTS: SQL Server 2008, Database Development. A number of the server software offerings, such as Exchange, SQL Server and Access, begin with this certification.

Microsoft Certified IT Professional (MCITP):
This certification combines multiple exams to test an individual's knowledge of a broad discipline, such as Database Development, Database Administration, or Business Intelligence. Generally, attaining MCITP will require you to pass one of the relevant MCTS exams.

Microsoft Certified Master (MCM):
An advanced certification that goes beyond taking the exams and includes experience as a condition for passing. Yet even completing the exams and meeting the experience requirements does not guarantee the MCM certification, as advanced knowledge in the specialist area is also a requirement. The cost of this certification can be high, but it is very well respected in the industry because of the difficulty in obtaining it.

Microsoft Certified Architect (MCA):
An advanced certification for individuals that work closely with Microsoft on their products, either at Microsoft or with a Microsoft partner. An MCA has completed more than the tests and classes, as this certification requires an advanced understanding of the specific product in question.

Figure 3. The many paths to Microsoft SQL Server Certification. Image © Microsoft

While most managers will not consider certification to be an adequate replacement for real experience with the relevant tools and technology, many will take them into account as a 'differentiating factor' between two candidates of similar experience. Furthermore, they are also a good indication of a candidate’s desire to learn and keep up-to-date with the latest technology in their field.

Boot Camps

A boot camp is a class with one goal: to assist an individual in passing certification exams. They can be as short as a single day or can last several weeks, depending on the number of exams the individual wants to take. Their schedules can be compared to finals week in traditional education arenas, with a lot of studying and a pass-or-fail exam at the finish line.

On the plus side, boot camps offer a quick route to becoming certified. People with experience in a specific product can focus on upgrade tests, or on areas where they may not have traditionally worked.

On the negative side, boot camps offer little in the way of practical experience, and are an expensive alternative to other education routes. Many are focused solely on getting you through the certification exam and, while this may sound like an advantage over other education routes, in my experience, when people are "cramming" purely to pass a test, they rarely retain the knowledge for long afterwards.

Ongoing Education

As a database novice, or an established professional looking to take their career to the next level, there appear to be endless classes and resources to aid your education. And yet, once the classes are completed, and the next conference is months away, the need for education and assistance in solving real-life problems will continue. This is where you need to rely on the wide variety of day-to-day resources available to the SQL community: things like workshops, SQL Saturdays, User Groups and your own network. From an employer’s point of view, your regular participation in knowledge-sharing resources also demonstrates your desire to excel in your chosen field.

Meeting and Mentoring

As a DBA, you will be on a journey of continual learning, partly to refine and enhance your skills, and partly to keep up-to-date with current technologies. The informal education options available to you will help you on that journey, as will your willingness to participate in the database technology community.

And just as good chess players become great chess players by playing with people better than themselves, database professionals who work with each other will continually be encouraged to meet and exceed their peers’ skills. As a young or novice database professional, mistakes can be avoided when senior professionals are around to help you to navigate through the more difficult aspects of SQL Server.

Likewise, senior database professionals keep their skills sharp by assisting novices. Look for a mentor in your social networks, user groups or even at work, and if you’re a more senior professional, be willing to mentor others. There is no better way to become an expert on a topic than to teach that topic to someone else, so you should give as well as take within these communities; being a mentor yourself sharpens your skills.

As a hiring manager myself, I appreciate that this kind of practice is absolutely invaluable, because although training budgets can quickly be reduced, the need for advanced skills does not diminish with them.

Workshops

Workshops are short but focused events that tend to dig deep into a specific topic – most last 6-8 hours, but others can last a couple of days. Most major conferences include pre-conference seminars which are essentially workshops by another name, and tend to cover specific technical topics. Occasionally they’ll cover more general topics such as "what do I need to learn in the first 30 days of my DBA job".

In addition to pre- and post-conference workshops, there are a number of single or multiple day workshops available in any given month, many of which are sponsored by companies that may be trying to sell a service or a software package. A swift internet search and a keen eye on newsletters and online communities should quickly reveal your options.

Conferences

Conferences are larger events, offering many different speakers talking about a range of topics. A few of the more popular conferences for SQL Server include the PASS Summit, Tech Ed, and SQL Connections. As mentioned earlier, many conferences have the days before and/or the days after designated as seminar days, where speakers come in and teach workshops.

Figure 4. SQL Bits 8, the largest SQL Server event in the UK.


It's good to be a little bit scared
Published Friday, July 08, 2011 8:50 AM

How many times have you read or heard that phrase from someone about to embark on some endeavour? A stand-up comedian, a stunt-man, a sports person, each preparing for a big event where they are putting themselves at some risk, challenging themselves physically, mentally or psychologically in some way.

Well, I can't say that I'm seeing the good part about it yet. I am just about a week away from doing a presentation at SQL in the City, an event being staged by Red Gate Software in London* on 17th July.

Being asked to speak at this event was a catalyst for a lot of things in my professional life. I had never spoken in front of any number of people, certainly not a group of professionals about their own subject. Since being asked I have done a lightning talk at SQLBits 8, spoken at the SQL Soton user group, started a local user group (sqlsouthwest) and spoken there, and I have also just yesterday submitted for a full session at SQLBits 9 (Liverpool, 29th Sep). The two talks I have done so far were to no more than 20 people combined, and SQLBits 9 is several months away.

SQL in the City is going to be my first talk to a big group of people. A BIG group of people. The event is, I understand, for approximately 300 people and there are two tracks so I guess, unless the other speaker has a much bigger room than me, I will be talking to around 150 people. I think it's fair to say that Red Gate are putting a lot of trust in me. I am very pleased that they are; there must be something in what I have said and done in my past dealings with them that has given them this confidence. I am going to do everything I can to repay this to the best of my ability.

I am going to be talking about two of Red Gate's applications - SQL Backup and SQL Monitor. Now I use these products every day in the office and that will hopefully be to my advantage: I am on home ground, familiar territory, somewhere that I can put my confidence and experience to use. This doesn't however stop me from starting to feel somewhat apprehensive about it all. It's possibly not helped by the fact that my wife has started the whole "In a week's time we'll be ..." comments!

So, the presentation is written, I have sorted and shuffled the notes around, it's been passed to Red Gate for editorial comments and I have one last weekend to practice. Then, next week I am off to Cambridge on Tuesday and then down to London for the event on Friday.

If you are coming to SQL in the City then I hope you have a great day, I certainly hope to see some of the other sessions as there are some great speakers there, if you see me please come along and say hello.

* - There is also a mirror event being staged in Los Angeles in August. As yet I have not been, and to be honest never expected to be, invited to speak there also.

by fatherjack
Filed Under: SQL in the City


Regular Rapid Releases: An Agile Tale
07 July 2011
by Mark and Stephanie

While developing their SQL Source Control tool, a team at Red Gate learned a lot about Agile development, as well as the benefits (and challenges) of rapid, regular Early Access releases. Stephanie Herr (project manager for SQL Source Control) and Mark Wightman (head of development at Red Gate Software) tell us more.

SQL Source Control is an add-in for SQL Server Management Studio (SSMS) that allows database professionals to source control their databases in their existing source control systems, which gives them the ability to track who changed what, when, and why. In this article, we'll talk about how we created Version 1.0 of this tool using Scrum and Agile techniques here at Red Gate. We'll describe how incremental development allowed us to get extensive user feedback, by deploying quick releases to our users. We achieved this by focusing on our product backlog, and by automating our testing - all of which we'll discuss. Finally, we’ll take a look at our story estimations, which helped us make better decisions across the project.

Hopefully you’re already familiar with what source control is. What SQL Source Control does is bring all the benefits of source control into the database development environment.

Let’s begin by looking at the development process.

Guiding Principles

Right from the start, we had three main principles that guided us throughout the entire development process.

First and foremost, it was important to have an ingeniously simple design, in line with the rest of our products: we knew that people weren't going to adopt this tool if it was hard to use, added any additional overhead, or made them change the way they worked. Currently, there are very few tools out there for database source control, and people don’t (as a matter of course) use source control in this context. So, if our customers were going to adopt our new tool, we had to create something that completely integrated into their normal working environment.

Secondly, we wanted to get the minimal set of valuable functionality out to users as soon as possible, partly to garner fine-grained feedback early on in the process, and partly because we needed to get the product out into the market place quickly (we’ll discuss this later).

Finally, our sprints were two weeks long and, in order to get each new iteration of the tool out to users quickly, we wanted to be able to release right after each sprint ended. In order to do that, we needed to minimize the amount of technical debt we were carrying forward (such as bugs introduced in a previous sprint), and automate testing as much as possible, bearing in mind that our confidence in the stability of the releases was as important as maintaining our momentum.

So, to summarize our guiding principles:

Simple, integrated design (without which we were dead in the water).
Rapid releases, providing the minimum functionality needed for users to effectively use features of the system (all the better for early, realistic feedback).
Regular releases after every sprint, requiring automated testing and high confidence in the stability of each new build.

We’re going to take a look at the hurdles we faced in doing it this way, and what the benefits were, as well as the repercussions.

User Feedback and Early Access Releases

From previous experience, we’ve learnt that we can’t always trust our understanding of customer requirements. We might think they need one piece of functionality and spend a lot of time implementing it, but until the users actually try the tool, we don’t know whether what we've built fits their purpose. An intensive Early-Access Program, getting minimum viable releases of the product (as it’s being built) out to customers as quickly as possible and encouraging early and regular feedback, dramatically reduced the risk of building the wrong product.

Another recurring software development problem is that recreating realistic test environments is pretty hard. While we have labs set up with different configurations, we can’t test every possible combination, and we're aware that the test databases we have are probably more simplistic than real customer databases. Early and regular releases of the tool, getting it installed across a variety of users’ environments, would also go a long way to increasing confidence in the tool’s stability.

That’s all very well on paper, but getting releases out fast, and collecting user feedback as effectively as possible, took a significant amount of forward planning.

The Early Access Timeline

The project began back in December 2008, when we spent a month doing an initial prototype of what this tool might look like. It was knocked together in a hurry, but it was a way to get something set up in our own environment so that we could run usability tests. We ran a handful of these tests, and the feedback was already telling us that the tool needed to be integrated into SSMS, because that’s where most database developers like to work when they make changes to their databases. So, right away, a kind of early access release was shaping the tool.

Figure 1: SQL Source Control development timeline

In March 2009, we were able to start fleshing out the user interface (UI) designs and preparing the backlog (we’ll talk about the backlog in more detail later). Our first sprint kicked off at the end of August 2009, and in about three months we had the minimum functionality necessary to do some more detailed usability sessions: the ability to add a database to source control, commit database changes, and retrieve the changes for one supported source control system, Subversion. (We chose Subversion based on polls that we ran over the last two years, which showed Subversion as being the most popular tool that was continuing to grow in popularity.) We were able to get useful feedback on those features and, about 2 months later, we had our first ‘real’ Early-Access Release, which we could give to end users to install in their environments, and hope to get more detailed feedback.

Between January 2010 and the final release in June 2010, we had six separate Early-Access Releases, as well as two beta releases, and a release candidate, so we were, on average, releasing a new version with additional functionality every three weeks for our users to install and evaluate.

Collecting the Feedback

Of course, lots of early access is only as good as the feedback it generates, so let’s look at the different mechanisms we used to collect the feedback.

We ran usability tests; essentially remote sessions in which we watched users try the product for the first time, and saw where their pain points were.

Within the product, we incorporated links to UserVoice, a website that allows people to suggest new features or enhancements, and then vote on those suggestions. This really helped us to prioritize what we needed to do next, and see which features were really important to our users.

We also took the time to implement SmartAssembly. If you’re not aware of it, SmartAssembly is a tool that handles source code obfuscation and, more importantly from our point of view, automated error reporting. Whenever there was a bug or error in the tool, a dialog box with the error details would pop up, and users were able to send that information back to us. The error report also contains helpful information for the developers to debug the issue so that we could incorporate a fix into the next early release.


Figure 2: A SmartAssembly error dialog box

As you might imagine, that really helped with the stability of the tool. Without SmartAssembly, it’s likely that users would have tried the tool, hit a bug, and then given up without getting back to us.

We also had our usual support channels, including support forums, e-mail chats, and a phone number that people could call if they had any issues.

So how did we decide what features went into each release so we could collect feedback from our users? Well, for starters, in order to get these releases out to our users quickly, we really needed to take a look at the product backlog and ask ourselves “what’s the smallest, most basic feature we can work on?”

The Product Backlog - Keeping it Small

Something that helped us with this minimalist approach was to really focus on our product backlog, and have small stories (albeit as components of something ultimately more akin to a large epic), which would let us implement the smallest unit of valuable work possible.

We needed to build up the product feature by feature, enabling a coherent workflow, so that we would have a working product at the end of each sprint. We therefore needed a backlog that comprised small, well-written stories. To achieve this, we put more up-front effort into creating that backlog than we have done in the past, and we tried to have a product backlog which was fairly comprehensive from the start of the project.

Basically, we were looking for opportunities to split stories, by saying “this part of the story is critical”, “this part we can maybe split out and maybe prioritize lower”, or even “let’s wait until we’ve got positive evidence that we need this bit of functionality before we implement it”.

Vertical vs. Horizontal

Instead of developing horizontal slices of the product, we took vertical slices. This meant that a user could go from the beginning all the way through to the end of a specific scenario. All the parts may not have been fully polished, which would have been the horizontal aspect of the product, but they could still complete valuable tasks and workflows.

The first few features we addressed were linking the database to Source Control, committing changes, and getting changes out. To keep it small, this didn’t take into account anything complex, like conflict handling or merging, and we also focused our attention on only one source control system at a time (although we now support multiple source control systems).

It’s worth bearing in mind that making a vertically consistent tool from day 1 makes early access users easier to come by, as you’re immediately offering them something they can use effectively, rather than something that causes them more pain. When users can see that they’re going to get something tangible from working with you, they’re much more willing to help.

Benefits and Challenges of Keeping It Small

Creating many small stories had its advantages and disadvantages…

On the plus side, our particular focus on the backlog meant that the stories were well written and we had a good understanding of the overall shape and size of the project. As a benefit of having so many small stories, we were able to rapidly change priorities based on what we heard from users. Moreover, because the stories were at such a low level, it allowed us to define them better, and also provide very clear acceptance criteria for them, which made it easier to estimate. This was important for keeping our velocity high, as it meant that we knew when something was done, and we weren't polishing features when we weren’t sure they were exactly what our users needed / wanted.

Of course, there were some drawbacks to having small stories. It took a lot of time and effort to define them, and it generally didn’t feel very “Agile”, particularly at the beginning of the project, since we were spending a lot of time detailing later stories that weren’t at the very top of the backlog. And there were, by necessity, so many small stories to manage - we had over a hundred on our product backlog. Also, ideally stories should be independent of each other so that they can individually be pulled in or out at any time; however, because our stories were so small, some dependencies were unavoidable and this was difficult to manage.

That’s enough theory; now we’ll look at an example of what happened in practice.

Applying the Theory

The feature we'll look at is “adding a database to source control”; Figure 3 shows the original UI design that was created at the start of the project.

Figure 3: Our original UI design for SQL Source Control

You can see that we had lots of options planned for the UI. We were able to break those down into really small stories, to be implemented individually, as shown in Figure 4.

Figure 4: Fine-grained story point decisions

At the end of sprint two we only completed the very basic “add database to Subversion” functionality. So at about four weeks into the product, users had to copy and paste their repository URL into a text box (see Figure 5). The URL had to be entered correctly - it would only work for Subversion, there was no validation, no ability to browse, and no ability to create folders. The user couldn't even enter credentials, because the credentials were initially hard coded; what was important was that, at that early stage, we had implemented enough to allow us to do usability sessions.


Figure 5: The minimum viable functionality - entering the repository URL

Once we added the ability for users to enter their own credentials in sprint nine, we were ready for our first Early Access Release.

It was sprint 13 before we decided to start supporting another source control system (Team Foundation Server). The UI was still in the form of very basic text box inputs. In sprint 14, we polished the feature up by remembering which source control system the user last used and defaulting to that to make it easier for them. We also remembered the URLs to make linking easier.

Figure 6: Sprint 13, adding TFS support

Version 1.0 was finally released on June 30th, 2010. It didn’t look anything like the original UI designs, but it was all that was needed for getting customers to use our tool. These decisions had to be prioritized with other work and other considerations; for example, linking a database to source control is a one-off task, so we wanted to focus more on the other features that could have a larger impact on our customers.

Automating Testing

As we’ve already mentioned, there was quite a lot of work needed to make our fairly ambitious release plans even remotely feasible.

We had a real challenge testing this product. There was a complexity in the number of environments we had to support: five different operating systems, seven different versions of SQL Server (factoring in different service packs), different versions of the SSMS that the product is embedded in, as well as five different versions of Subversion and Team Foundation Server… 175 distinct environments, all told.

That's not to say we were ever going to test the full set of environments, but there was clearly complexity there that we had to handle.

Without automation, we would definitely have been looking at several weeks of manual regression testing per release. In that situation, we would not have been able to achieve a release program of the intensity we were hoping for, despite knowing all the benefits that it would bring, and we definitely would have suffered a high cost for supporting additional platforms (new versions of SQL Server, new versions of Subversion, Team Foundation Server, new source control systems, etc).

So we put a lot of effort into our automated testing, and we had a variety of different automated tests.

We had unit tests (running off CruiseControl via Continuous Integration) which took five minutes to run, and failed the build if they didn’t pass.

At a slightly higher level, our integration tests talked to SQL Server and source control systems. They took more like an hour to run, and again, were triggered by each build of the product.

Finally, and this was really a new thing for us, we ran a fairly sophisticated set of automated graphical user interface (GUI) tests nightly, which simulated real product usage to Acceptance Test levels. We’ll go into more detail about that in a moment.

Of course, we also continued with manual testing, as there’s clearly a lot of value to be added by human testers. For starters, it was the test engineers who created all of the automated tests in the first place! The test engineers also manually tested each story, and there was a subset of tests that had to be tested manually (such as manual regression tests and exploratory tests for each release).

GUI Test Automation


The GUI automation part of the test suite was a system that allowed us to simulate real customer usage of the product within a virtualized environment, and was run from our continuous integration server on a nightly basis. The suite of tests installed and tested the product in six different environments, and took roughly six hours to run.

In the automated GUI tests, we spawned an automation application, which launched a set of virtual machines containing different environments. It then installed various bits of infrastructure, installed the product, and ran tests that simulated real customer usage. That means things like manipulating the user interface, talking to the product, setting up states, manipulating the UI, and then checking the state of the UI after those manipulations (such as whether the correct set of information is displayed, and whether the icons in the Object Explorer are the color that we would expect them to be).

The results were then returned back to CruiseControl, where they could be browsed like any other set of automated test results. A diagram of this entire process is shown in Figure 7.

Figure 7. Our newly-built automation framework

What Were the Benefits?

The nature of the product is such that there was a lot of complexity in the UI - it’s very tightly coupled with SSMS; it manipulates parts of the Object Explorer; it changes icons; it retrieves states; and so on. It would be very difficult to test that with any confidence in a non-UI sort of environment.

Moreover, given the complexity of the different environments we were supporting, we felt that GUI test automation was really something that would reap benefits. And it did catch bugs; it caught UI bugs where we were displaying the wrong sort of information, it caught regressions, it caught product crashes.

Ultimately, it meant that we had much less manual testing for each release. In fact, for each of those early access releases, we did no more than a couple of days of manual testing to complement the automated suite, and we calculated that one run of the automated GUI tests is equivalent to about 4 weeks of a test engineer’s time.

It also meant that we had much more confidence in those early builds than we might otherwise have had - we knew that the product basically worked. That made it an easy decision for us to say, whenever we had a bit more functionality ready, that we’d like to give it to users and find out if it was right.

What Were the Challenges?

It proved hard to make the suite reliable, and the higher you go up the testing stack, the more fragile tests tend to be. Having said that, in practice we knew which tests we could trust and which ones we needed to take with a pinch of salt. We knew what was a real product failure and what was not, so there was still a human element to this process.

We’d set out with this vision that we would like to keep testing in sync with development, but in practice we found that was really difficult. In fact, because we put such a lot of effort into the test automation at the start of the project, we’d built up a backlog of tests quite early on in the project. That took a lot of effort to reduce, and there was some testing roll-over between each sprint. Initially, we attempted to close every single bug because we didn’t want to carry any technical debt forward, but about half-way through the project we realized that this wasn’t sustainable and it wasn't the best use of our time. At that point, we started focusing on only closing the high priority bugs, and pushing some of the work out into the future if we didn’t think it was going to have a negative impact on the Version 1.0 release.

At the same time, if we encountered a really large bug or feature request, we would convert that into a story, estimate it, and put it on the board to be worked in (or not, as appropriate). Nevertheless, all the little bugs added up, and ended up taking a lot of time.

In short, it can be more time consuming to build automated tests than to write manual test cases. It took the whole team a lot of time to build the testing infrastructure, and it wasn't just the test engineers' time; it was developers' time too. But obviously there is also a huge payback as a result, and now that we have that infrastructure in place, it will be a lot quicker to build future tests.


The Big Challenge – Time to Market

When we started the project there was really only one competitor product of any substance, and that was the database edition of Visual Studio2008, which was perceived to have a very high learning curve, so we weren’t too worried about that.

However, we knew that Visual Studio 2010 was on the horizon, which would be a more serious competitor and could prompt a lot of users to start to evaluate and consider source controlling their databases. Therefore, it was important for us to be in the market at the same time.

So we put a lot of effort into sizing our backlog, and knowing the amount of work that we had to do at any given stage. Our attention to estimating, and understanding how quickly we were getting through that work (our velocity), also helped us to understand how much work was left on the project and how long it would take us to complete.

Figure 8 shows the amount of work that we'd actually completed at any given stage, along with our estimates if we continued working at our best / mean / worst velocity; the red line at the top shows the size of the product backlog. Where the lines intersected were our best, mean, and worst estimates for when we could release the final version.

Figure 8: Our projected project timeline

We'd had some flirtations with adding support for Visual SourceSafe and then decided against it, hence the random spike early on.

What we were also able to tell from this chart was that our best case release date was the end of June 2010, and Visual Studio 2010 was actually set for release around the same time. But that was our best case estimate - our worst case estimate was the end of the year, maybe even the beginning of 2011.
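
For anyone who wants to see the arithmetic behind a chart like Figure 8, here is a rough sketch. Only the mean velocity of around 22 points per sprint comes from later in this article; the backlog size, sprint length, start date and best/worst velocities are purely illustrative. The same function also answers the "what if" questions mentioned later: dropping a 40-point feature simply means projecting with 40 fewer points.

```javascript
// Projecting a release date range from the remaining backlog and observed velocity.
const remainingPoints = 300;                          // illustrative backlog size, in points
const sprintLengthDays = 14;                          // illustrative two-week sprints
const velocities = { best: 28, mean: 22, worst: 16 }; // points per sprint; only the mean is from the text

function projectedDate(startDate, points, pointsPerSprint) {
  const sprintsNeeded = Math.ceil(points / pointsPerSprint);
  const finish = new Date(startDate);
  finish.setDate(finish.getDate() + sprintsNeeded * sprintLengthDays);
  return finish;
}

const start = new Date("2010-01-04");                 // illustrative start date
for (const [label, v] of Object.entries(velocities)) {
  console.log(label, projectedDate(start, remainingPoints, v).toDateString());
}

// A "what if" scenario: drop an (illustrative) 40-point feature and re-project.
console.log("what-if", projectedDate(start, remainingPoints - 40, velocities.mean).toDateString());
```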

So the effort we put into estimating paid off immediately; we knew that we had a problem, and we had to do something about it.

Getting to Market, Quickly

We had two options: we could add more people to the team, or we could reduce the scale of the project. We did add some people to the team (with great results, since they were very good, the code was well written, and we had good automated test coverage to catch regressions), but we weren't confident that this alone would solve the problem. We would have had to add a lot of people to the team to guarantee an appropriate release date, and we felt that was a very, very risky thing to do and probably wouldn't even work.

Instead, we looked at the backlog and started asking which features we could drop, and where the big wins were. However, there were really only one or two places where we could drop features. Even those were quite limited in scope, and in fact only got us about a third closer to our target release date.

The real lesson we learnt from this was that we could solve the problem by looking at all of the stories again. We'd already sliced the stories small, but when we renewed our focus on them we asked how we could slice them even smaller. The entire team did this, and we challenged ourselves to find any extras that shouldn't be in Version 1.0. We were looking for opportunities to do, let's say, 20% of the work and get 80% of the benefit. Even more than when we initially set out, we were asking ourselves "Where are the really difficult things that don't actually add that much value? What is the minimum set of stuff we can do here?"

Confrontational Splitting

So we split the stories down further, applied some very strict prioritization, challenged the team's set of stories, and found that a very valuable approach was to ask "Is this story X more important than story Y?"

It turned out that, when you look at any story in isolation, it's actually very easy to say "Of course we've got to have that, we can't possibly do without it". However, when you compare it against another story that you've already decided you absolutely have to have, it's a very different situation.


For example, we had decided that support for Team Foundation Server was clearly something we had to have, as that was half our target market. It became a very illuminating test for us to say of a story that we thought it was important, but not as important as Team Foundation Server support. Bringing in new team members also had the added benefit that they were able to challenge assumptions that the team had held for a long time about what was necessary or possible, and what was not.

Those were all very valuable processes for us to go through, and we managed to identify a fairly sizeable amount of work we could remove from the Version 1.0 backlog, which turned out to have no material effect on the success of the product.

By spending a couple of dozen hours in total looking at the backlog in really fine-grained detail, we managed to identify a lot of savings, and release at the same time as Visual Studio 2010.

Figure 9: Our actual project timeline

Estimates and Planning

Everything we've talked about so far in terms of how we scoped and controlled the project hinges on our estimation process, so let's talk a little bit more about how that helped us to make better decisions throughout the project.

Even before the first sprint, the team had estimated most of the backlog. Not all of the backlog was estimated to the same fine degree; it was very large, it took a long time, and we got rather bored doing it, to be honest.

However, we estimated most of the backlog, and went through a variety of exercises to really try and get a handle on our expected velocity. From our best and worst estimates of velocity we then derived the earliest and latest dates for the release (with quite a wide margin of error), but at that stage these estimates were largely backed up by gut feel.

Crucially, we didn't only trust the numbers, nor did we rely on plucked-from-the-air guesstimates. We used both together, and they tended to agree with each other, so we had some faith in them. Indeed, after a few sprints we updated our predictions based on our observed velocities, and they generally seemed to hold true.

Spikes

It's worth mentioning, as a quick aside, that we found the use of spikes extremely useful in this project. A spike, in this context, is a time-boxed technical investigation to answer a particular question, such as "How easy is it to achieve this particular thing with this API? Can Subversion do it? Can Team Foundation Server support this sort of operation?"

The team found this approach very useful, especially for de-risking, but also when they couldn't make an estimate. So we'd look at a story and we'd say "We don't know what to estimate for this because it depends on the answer to this question, and we don't know how easy it is to do this thing".

Spikes were a very good solution to that problem; when we couldn't anticipate or estimate something, we'd do a spike in the relevant sprint and then estimate it once we'd done that investigation. As it turned out, the team started to insist on doing this at any point where they couldn't estimate due to a lack of information, and that helped to drive out a lot of uncertainty in the backlog. And it worked.

With this process for clearing up unknowns, and a solid estimation procedure, the team estimated their most likely velocity to be around 23 points per sprint, and their actual mean velocity over the course of the project turned out to be 22 points.

What Did We Learn?

So what were our lessons from this? Well, we think that good estimation really made a difference to the project. The team tried very hard to estimate their stories carefully, but that didn't necessarily mean spending a huge amount of time on it. It just meant being thoughtful, saying what you can and can't estimate, and doing something to clear up those things that couldn't be estimated at the time.


Our product owner and project manager used those numbers frequently throughout the project to run "what if" scenarios. "What if we dropped this entire feature? What if we built just this part of this feature, and this other thing? If we prioritized that story, could we do the early access release sooner?" …those sorts of scenarios.

As a result, it felt like we had a good understanding of our status throughout the project, consistently backed up by gut feel, and there were very few, if any, radical moments of "oh my goodness, we're not where we thought we wanted to be!"

Final Lessons

The most important things that we learnt from this project were:

- An approach of incremental development and early access releases definitely helped us to build the right product. It helped us to build the product that people wanted and were prepared to pay money for. Additionally, in order to achieve that rapid, incremental development, test automation was vital - we just couldn't have done without it.
- Focusing on the backlog and putting effort into making that backlog work for us really helped to build the product quicker. We could easily have spent twice as long building this product if we hadn't continuously groomed the backlog, and it's not clear that having spent that extra time would have made a more successful product.
- Finally, attention to the estimating process, and the planning that we could do on top of that, really helped us to make good decisions that we trusted throughout the project.

Of course, we didn't get everything right. We spent a lot of time on the backlog upfront, which didn't feel very Agile and may have been a bit of wasted time since we cut features later. We had amazing UI designs that were extremely exciting, but they couldn't all be implemented because we were trying to get a release out quickly. We didn't always complete the testing of all the stories that we pulled into the sprint.

There were also lots of stories that we remained undecided about for some time, and that had a cost. I think if we'd really been on top of the numbers right at the start of the project, we would have saved a lot of time by saying "No" earlier, letting us focus on the genuinely crucial stories. Regardless, in the end, we delivered a tool in record time, delivered it on-target, and were happy to put the Red Gate name on it.

This describes the development of SQL Source Control 1. If you are interested in database source control, you can download, install and try the latest version of SQL Source Control in under 5 minutes.

© Simple-Talk.com


Windows 8 inspired website
Published Tuesday, July 05, 2011 1:42 AM

Download the zip here

See the example site here

Working in New Biz

As I work in Red Gate's new business division, much of what we do isn't very visible. So, along with Marine Barbaroux, I decided to try to create an engaging website to capture what we're doing in an easy to digest way.

The idea is to create a website that can be used by everybody in Red Gate to get a picture of what we're doing on a day-to-day basis. It isn't quite there yet, but I thought I'd share the work so far so that anybody can play with it.

How it works

The website runs entirely in JavaScript and makes use of jQuery, so it should be hostable on whatever web server you have lying around. The main index.htm page reads a data HTML page and uses the <div> data inside it to construct the pages and segments on the fly. The reader can then browse around the various pages, reading all the updates.
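
As an illustration of that approach (not the actual source - the data file name, class names, data-* attributes and container id here are all assumptions), index.htm might fetch the data page with jQuery and build a segment element for each <div> it finds:

```javascript
// Minimal sketch: fetch a separate data page and build the site's segments from
// the <div> elements inside it. All names here are hypothetical.
$(function () {
  $.get("data.htm", function (markup) {
    // Parse the fetched markup into a detached element so we can query it.
    var $data = $("<div>").html(markup);

    $data.find("div.segment").each(function () {
      var $src = $(this);
      $("<div>")
        .addClass("segment")
        .attr("data-page", $src.data("page")) // which page the segment belongs to
        .html($src.html())                    // the update text / image markup
        .appendTo("#pages");                  // assumed container in index.htm
    });
  });
});
```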

A sample "empty" page

The segments are arranged in 3 columns and 3 rows, each being a known size. Each segment can span from half a column to 3 columns, and from 1 to 3 rows. The smallest segment is a perfect square of 175px, which looks really nice with just a simple image on it.
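
As a small worked example of that sizing rule (the 175px unit comes from the post; the 10px gutter is an assumption), a segment's pixel size is just a multiple of the base square plus the gaps between the cells it spans:

```javascript
// Compute a segment's size from its column/row span. 175px is the base unit from
// the post; the 10px gutter between cells is an assumed value for illustration.
var UNIT = 175;
var GUTTER = 10;

function segmentSize(colSpan, rowSpan) {
  // colSpan may be 0.5 for a half-column segment; gutters only appear between whole cells.
  var gaps = function (span) { return Math.max(Math.ceil(span) - 1, 0); };
  return {
    width:  Math.round(colSpan * UNIT + gaps(colSpan) * GUTTER),
    height: Math.round(rowSpan * UNIT + gaps(rowSpan) * GUTTER)
  };
}

console.log(segmentSize(1, 1)); // { width: 175, height: 175 } - the smallest square
console.log(segmentSize(3, 2)); // a wide segment spanning all three columns
```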

Have a play

Please feel free to download the source code for the update pages and have a play with it. I've tried to go through and comment the code, so hopefully it should make sense.

Things to do

There are still quite a few refinements I'd like to make, which I will probably do over time if it takes off. These are (in no particular order):


1. Back from sub-page goes to referring page
2. Clean up carousel - probably re-implement.
3. Allow swipe to work on iPad etc (did try it but couldn't get it to work)
4. Similarly allow mouse drag to shift pages
5. Also allow keyboard to work left + right
6. Back button via pushing history on stack or clever stuff? (one possible approach is sketched below)
7. Neaten sources - probably provide some sort of data source server
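
For item 6, one possible approach (an assumption on my part, not necessarily the plan) is the HTML5 History API: push an entry for each page change so the browser's Back button steps back through previously viewed pages instead of leaving the site.

```javascript
// Sketch of Back-button support via the History API. showPage() stands in for
// whatever the site already does to switch the visible page; the .page class,
// element ids and hash URLs are assumptions.
function showPage(pageId) {
  $(".page").hide();
  $("#" + pageId).show();
}

function navigateTo(pageId) {
  showPage(pageId);
  history.pushState({ page: pageId }, "", "#" + pageId); // record the page change
}

window.addEventListener("popstate", function (e) {
  if (e.state && e.state.page) {
    showPage(e.state.page); // Back/Forward: re-show the recorded page without a reload
  }
});
```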

Download the zip here

See the example site here

by Richard Mitchell