Winners of She Hacks 2014!

I was really excited to attend the inaugural SheHacks 2014 hackathon in Sydney, organised by the lovely women from Girl Geeks Sydney – Georgi Knox, Denise Fernandez, Kris Howard, Sera Prince McGill and Peggy Kuo. It was held at Google’s offices in Pyrmont and was a fantastic event! (SheHacks ran in parallel in Melbourne too, so you can check out a rundown of the Melbourne event by Tammy Butow.)

Everyone hard at work

It was the first hackathon for quite a lot of people, and it was great to see people getting involved in an event they might not otherwise attend.  Tickets for the event were sorted into several types:

  • Developers (the majority of the tickets)
  • UX/Designer
  • Non-technical

People were encouraged to form teams of about five people – three developers, one UX person and one non-technical person – with the goal that the devs build, the UX person makes it look amazing, and the non-technical person coordinates and concentrates on the presentation (following the excellent advice Kris laid out just a month ago on presenting your hackathon project).

Team Disasterama (minus me)

I was also amazed at the generous catering – pizza, caffeine, snacks, lots of cookies made by teammate Denise, and a decidedly un-male breakfast spread of yoghurt, muesli and fresh-cut fruit!

snacks aplenty

The result? 50 women in 11 teams competed for some great prizes donated by Google, Atlassian, Microsoft and Razorfish. There were some fantastic team hacks presented, and I personally enjoyed:

  • Mini Jobs – finding odd jobs for younger people to do to boost their confidence and skills and earn some pocket money
  • Share the Paw Paw – crowdsourcing locations around your neighbourhood where fruit and vegetables are freely available, or where you can list your own surplus to give away
  • Coffee Run – formalising coffee rounds in the office, including keeping tabs on who owes whom

HOWEVER… our team of Denise Fernandez, Luciana Carrolo, Kim Chatterjee, Anna Zaitsev and me won first prize with our “Mission Possible” app!! The site is designed to connect volunteers with coordinators to assist with disaster relief. The amazing Prezi designed by Anna and Kim describes the idea in detail.

The source code is available on GitHub. The app was designed to be realtime so that volunteers can see up-to-the-minute information about where their help is needed, and in our demo we used two screens to great effect (realtime updates are always a crowd pleaser!). It was written using Node, Socket.IO, Handlebars, Google Maps, Twitter Bootstrap and a lovely set of custom icons designed by Kim.

A screenshot from our app shows a shaded area where the “disaster” has occurred (an oil spill), and a muster point where volunteers should gather to help (save the penguins!). Everything updated in real time from a master coordinator, who would add extra muster points and specify the number of volunteers needed at each point.

Mission Possible

Our prize was a Nexus 7 tablet and a 3D-printed trophy in the shape of a pink computer.

Hello, computer!

I was pretty happy with the outcome of that! It’s the third hackathon in a year that I’ve participated in and won prizes for.  I really love the energy and creativity that comes out of such an intense situation, and it’s a lot of fun to see what everyone else does in such a short time as well.

Thanks very much to Girl Geek Sydney for a great event!

Save the penguins!

Three days of Haskell

I spent three days up in Brisbane, from March 17 to 19, on a course called “Introduction to Functional Programming using Haskell”. It was intense!

The course was run by Tony Morris & Mark Hibberd from NICTA, and Katie Miller from Red Hat. It was originally billed as Lambda Ladies, but it turns out there weren’t quite enough ladies to fill the course, so anyone else interested was invited along too.

The course is a series of practical exercises. The standard Haskell library is excluded from the project, and we spent our time reimplementing it from first principles, starting with functions on lists. It’s a very hands-on way of learning how Haskell works. The first day covers pattern matching, folding and functional composition; the next couple deal with abstractions around binding and functors, working towards monads. You also spend some time implementing a couple of concrete problems – a string parser, and a problem involving file IO – to see Haskell in practice.
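To give a flavour of the first-principles approach, a hand-rolled right fold and a couple of list functions built on top of it look roughly like the sketch below. (The course actually defines its own List type rather than using the built-in one, and the names here are mine, not the course’s.)

-- A right fold written with pattern matching over the two list cases.
foldRight :: (a -> b -> b) -> b -> [a] -> b
foldRight _ z []       = z
foldRight f z (x : xs) = f x (foldRight f z xs)

-- Once the fold exists, other list functions fall out of it.
myLength :: [a] -> Int
myLength = foldRight (\_ n -> n + 1) 0

mySum :: [Int] -> Int
mySum = foldRight (+) 0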

If you’re familiar with functional programming, you’d understand that’s a LOT of material to cover in three days. I would say that the average learning curve went a bit like this:

[learning curve graph]

However, having a solid understanding of programming concepts (e.g. lambdas) meant that the more complex concepts were a lot easier to pick up (to a degree).  When I was learning functional programming at university, it took me days to reimplement map properly in Haskell!  Earlier this week, it took five minutes.
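For reference, the five-minute version of map looks something like this – a sketch against the built-in list type (the course uses its own List type, and myMap is my name for it, not the course’s):

-- map, reimplemented with explicit recursion and pattern matching.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs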

Getting to your solution for each problem felt a lot like algebraic substitution and refactoring. First, you make it work, and then you refactor constantly to get the most elegant (read: shortest) solution by taking advantage of functional composition.
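As a toy illustration of that process (my own example, not one from the course): a first attempt with explicit recursion, then the same function refactored down to a composition of library functions.

-- First attempt: explicit recursion. It works, but it is long-winded.
countEvens :: [Int] -> Int
countEvens []       = 0
countEvens (x : xs)
  | even x    = 1 + countEvens xs
  | otherwise = countEvens xs

-- After a few rounds of substitution: keep the evens, count them.
countEvens' :: [Int] -> Int
countEvens' = length . filter even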

I was surprised at how much it ended up looking like a normal chained method call once you introduce the point notation (the composition operator, .) – an approach C# looks to have borrowed from heavily when introducing LINQ.

To take the example from the link above,

ghci> map (\xs -> negate (sum (tail xs))) [[1..5],[3..6],[1..7]]  
[-14,-15,-27]

turns into…

ghci> map (negate . sum . tail) [[1..5],[3..6],[1..7]]  
[-14,-15,-27]

I was also surprised by just how much of a rush it was to a) have a solution that type-checked properly, and b) see it actually work.  Haskell felt like an all-or-nothing proposition: it either compiled and worked, or it was hopelessly broken and gave you a type error that was difficult to decipher.  By contrast, most other programming languages have a more granular feedback loop and are much easier to debug – you can put logging statements in, for example.

The best takeaway of all was these amazing lambda earrings!

Lambda Earrings

Learn You a Haskell is an excellent (and cute, and free) resource for learning Haskell.

Angry Birds in CSS

I recreated an Angry Bird in CSS as an experiment to learn more about front-end styling.  It has been tested on recent versions of Chrome and Firefox, but cross-browser compatibility wasn’t really the goal – I wanted to try drawing shapes and learn more about CSS transformations.

The code is on GitHub, and you can preview the output here.

Learnings:

  • Any kind of non-standard shape is difficult! Particularly curves and the border-radius property, which has a slightly confusing syntax.
  • Triangles can’t have borders easily 😦
  • This.

[screenshot: the CSS Angry Bird]

Finding a Memory Leak

This post originally appeared on the 7digital developer blog on 15th February 2011. It has been moved here for preservation. 

A few weeks ago, we launched the shiny, redesigned 7digital.com to a beta audience. Unfortunately, we had a memory leak.

The new site was hosted on the same set of hardware as a few other applications, and it was gradually bringing the other sites down. We put a limit on the amount of virtual memory the process could use to shield the other sites from the leak, but performance kept deteriorating. Thankfully, the memory leak was eventually found – here are the steps I followed to find it.

Step 1: Take a memory dump from the live site

Graham, a fellow dev, helpfully pointed out userdump and also gave me a crash course in windbg. Userdump is a command line tool which will take a snapshot of the memory space used by a process. It’s important to note that it freezes your process while it takes the dump, so if you’re doing this in live, your site might stop for a minute or more. You can use the inbuilt iisapp.vbs script on the command line to find out exactly which w3wp process belongs to which Application Pool, and therefore which process to dump. Once you have the process id, take the memory dump and examine it with windbg.  Two useful articles were Getting Started with windbg by JohanS, and Tess Ferrandez’s excellent lab/tutorial on how to navigate through a memory dump.

Step 2: Add some performance counters

Since the live dump didn’t highlight any obvious problems (it only had information for a minute or less of runtime before the app pool recycled), we added some performance counters to see if we could find any trends. You can access perfmon under Start > Administrative Tools > Performance.  MSDN has a good explanation of the different counters and what they mean. Since we were concentrating on memory, I added the following counters and waited for any trends to appear.

.NET CLR Exceptions\#Exceps thrown
.NET CLR Memory\#Bytes in all Heaps
.NET CLR Memory\Gen 2 Heap Size
.NET CLR Memory\Large Object Heap Size

Edit: It’s possible to show counters for a single process, but if you have multiple w3wp processes running on the same box (as we do), it’s difficult to get the counters for the right one.  I was looking at counters for the whole box, which didn’t give me a lot of detail.

Step 3: Do some local profiling 

A live memory dump is all well and good, but it just looks like a screen full of hex 🙂 Local profiling gives you some lovely graphs, stack traces, statistics on running time, etc which you can use to drill down into specific methods or lines of code. If you know what user action is causing the leak (e.g. clicking the “Purchase” button), you can profile that on your local machine and easily identify which method or line of code is causing the problem.

I downloaded ANTS Memory Profiler, dotTrace and AQTime to try some local profiling. The learning curve on ANTS seemed to be the gentlest, although being familiar with any of these tools would help greatly. The ANTS inline help files were an excellent refresher course on how .NET garbage collection works.

Step 4: Local profiling with load testing

I spent about a day learning how ANTS works and profiling some common page loads on my local machine. I didn’t see anything unusual. But… my mistake was to profile without load. It’s very difficult to spot trends unless the changes being made by an action are exaggerated.

ApacheBench was recommended – a command line tool for benchmarking performance, which is also handy for making lots of concurrent requests. So I lined up multiple requests (and executed them multiple times, all while running ANTS) for common pages in our site, like the search page, artist page and album page. Nothing really turned up until I tried adding products to a basket – and got my breakthrough. Here are the two graphs of memory usage from ANTS. The first shows code behaving itself and being cleaned up by the garbage collector while some normal actions were load tested. The second illustrates our memory leak: the green line is the total memory (managed + unmanaged) being used by our process, and the red line is the amount of managed memory allocated by .NET. Unfortunately, this meant that our leak was in unmanaged memory, which ANTS couldn’t help me track down.

Good memory profile:

[ANTS graph: memory allocated and then reclaimed normally]

Bad memory profile:

[ANTS graph: total (managed + unmanaged) memory growing steadily]

Step 5: Finding unmanaged memory leaks

So, back to the dump taken from the live site with userdump.  James Kovacs has written a helpful article which, among other things, lists reasons why you might be leaking unmanaged memory.  I took another memory dump with more user activity to examine, and had a look at the assemblies in the app domain. Along with the usual suspects:

Assembly: 034a3fd8 [C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\b\970be4ca\1a5ec57f\assembly\dl3\139d25740cf5f9d_99b8cb01\Lucene.Net.dll]
ClassLoader: 034a4048
SecurityDescriptor: 034a3d18
Module Name
04ac1d74 C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\b\970be4ca\1a5ec57f\assembly\dl3\139d25740cf5f9d_99b8cb01\Lucene.Net.dll

....

There were an enormous number of dynamic assemblies being loaded into our app domain:

Assembly: 286ff688 (Dynamic) []
ClassLoader: 286ff6f8
SecurityDescriptor: 286ff600
Module Name
0062429c Dynamic Module
0062461c Dynamic Module

This was the reason the memory kept increasing. Some piece of code was dynamically loading assemblies, and once loaded, they are never unloaded. However, it’s very difficult to get any more information about them in windbg for framework version 2.0.  Windbg for v2.0 has fewer commands than windbg for v1.1 (strange!), and the internet seems to be full of demos using windbg 1.1 that show more information than you can get now.  They are a good starting point, but be aware you won’t be able to follow them 100%. Tess Ferrandez again has a great tutorial on chasing down unmanaged memory leaks if dynamic assemblies aren’t your problem.

Step 6: Local debugging

The Modules window in Visual Studio shows you which assemblies have been loaded, and it gives you more information than windbg (the name of the assembly, at least), so it was just a matter of repeating the step that caused the error with the debugger attached, and watching when the number of assemblies changed. The culprit was finally found – it was the Application_Error event handler.  We were misusing a piece of 3rd party code which created dynamic assemblies every time an error occurred. And unfortunately for us, it became a vicious circle: our beta users were finding errors we’d missed in testing, making the leak worse.

Step 7: Verification Profiling

We fixed the offending code, and then re-profiled under ApacheBench load to verify that the memory was no longer leaking. The leak took almost three days to track down and fix, mostly because I hadn’t managed to isolate what action was causing it. Once I started load testing, it was much easier to identify. I was amazed at the number of tools and apps we used while trying to find the leak, mostly to rule things out in a process of elimination. Quite satisfying once found, though 🙂

Managing Dependencies With TeamCity

This post originally appeared on the 7digital developer blog on 8th June 2011. It has been moved here for preservation. You can also use a newer TeamCity feature called Snapshot Dependencies for a cleaner way of managing dependent builds. 

We have recently switched to using TeamCity to manage the building and updating of our shared code at 7digital, which is great.  The process is fast, completely automated and configurable – a vast improvement over our old build process, which was very manual, error-prone and could take up to three hours of a developer’s time.

Background

We have a large set of “domain” dlls which contain a lot of legacy code shared between several applications. When someone updates this code, we need to ensure that:

  • All domain dlls are compiled against each other for build integrity
  • The newest version of the domain set is available to all projects
  • All consumers update their references as soon as possible, to catch any bugs early.

Here is how we do it using TeamCity and a set of project conventions.

Solution & Folder Structure

Each solution has multiple projects, and a lib folder which contains the third-party and in-house domain dlls used by the projects in that solution. By convention, the lib folder sits in the top-level folder.  Projects reference the dlls straight from this location.

[screenshot: solution folder structure, with the lib folder at the top level]

Updating a dll in the lib folder means that all projects will use the new version immediately. This is really handy for upgrading all projects to a new version of a third-party tool like NUnit, RhinoMocks or StructureMap, but it also works for our own in-house dlls. All we need is an automated way of updating the dlls in the lib folders to the latest version whenever someone commits a change. Enter TeamCity!

Using TeamCity

We’ve placed the set of in-house domain projects in a linear build order.  Each project in the list is configured to trigger the next in line when it successfully builds, using the TeamCity “Dependencies” tab. If someone makes a change to a domain project, TeamCity will pick up the commit, build the project and run its tests, and this will kick off the rest of the chain underneath. In the screenshot examples below, I’ve used a portion of our domain chain where SevenDigital.Domain.Catalogue is dependent on SevenDigital.Domain.Catalogue.MetaData.

[screenshot: the domain build chain in TeamCity]

We split each project into two (or more) builds in TeamCity, which run in order as long as the previous build succeeds:

  • (0) Dependency
  • (1) Build and Unit Test
  • (2+) Integration tests (if they exist), code metrics, etc.

[screenshot: the domain builds on the TeamCity project overview page]

Build and Unit Test

The (1) Build and Unit Test build is a normal build triggered by developer check-in, which builds the solution and runs the unit tests.  On each successful run, it exports the assemblies from its lib folder and \bin\debug folder as artifacts, using TeamCity artifact paths.   The assemblies are then accessible to other TeamCity builds, and are used by the (0) Dependency build.

[screenshot: TeamCity general settings for the Build and Unit Test configuration]

[screenshot: TeamCity artifact paths configuration]

Dependency Build

The (0) Dependency Build is always triggered by a previous build in the chain.  It is responsible for updating the project lib folder with the latest versions of the previous build’s dlls, which sounds a bit complicated, but is easily broken into steps. We use the build agent like an automated developer – it checks out the project source code to a local folder, pulls down the artifact dlls from the previous build to the local lib folder, and then does a command line commit to either git or svn depending on where that project is hosted.

    1. On the “Version Control Settings” page, we always set the VCS checkout mode to “Automatically on agent”.  This means the source code is checked out to the build agent machine rather than to the central server. [screenshot: VCS checkout mode setting]
    2. On the “Dependencies” tab, we add an Artifact Dependency on the previous Build & Unit Test in the chain, taking all of the published dlls.  The destination path is set to “lib”, meaning the agent takes care of downloading the dlls into the local lib folder, effectively overwriting them (or adding new ones to the folder if they don’t already exist).  From a version control point of view, the lib folder now looks like it has updated files that are ready for check-in. [screenshot: artifact dependency settings]
    3. We use an msbuild or rake task that executes a command line commit from the root folder.  The agent already has a link back to the main repository, because we checked the code out directly on the agent.
(svn|git) add .
(svn|git) commit -m "Auto Commit from $(agent_name) for build no $(build_number)"
    4. The commit from the agent is just like a regular check-in.  It triggers (1) Build & Unit Test, and the cycle continues down the chain.

Setting up the entire chain took a large amount of configuration, but it’s been worth it. The biggest gain has been removing the manual component of the build, which means we get faster feedback if something is broken, and people are able to make changes more confidently.