Lessons learned from Code::Dive

I live and work in Wroclaw, Poland. There has been an exceptional number of technical conferences held in my city this year. In addition to a regular SQL Day, we had .NET Developer Days; Infrastructure Days is about to happen (with a possibility of meeting the legendary Uncle Bob). Next year looks just as promising as my company Objectivity is sponsoring a new event for .NET developers called Wroc#. Maybe it isn’t much of an exaggeration to call Wroclaw Poland’s Silicon Valley.


Centennial Hall with Spire (from Wikipedia)

Anyway, I recently had the pleasure of participating in yet another event, called code::dive. Hosted in the Congress Centre at Centennial Hall and sponsored by Nokia, with free entry, it was definitely something not to miss. Over 1000 software developers took part in it. The announcer was brilliant at times, as when he commented, after a session on cache, that the word ‘cash’ would now evoke different feelings in him.

Bartosz Ciepluch, in welcoming the audience, said that although a lot of people have heard that Nokia was sold to some company starting with M in Redmond, it did not mean that the Nokia company itself was dead! The company is still here and it is rising like a phoenix from the ashes. It has two R&D centres in Poland, with Wrocław being the biggest technology centre in Europe. He revealed one of the reasons for the conference by mentioning the history of Silicon Valley, and emphasising that it would not have happened had people there not been able to share knowledge.

The event was live streamed, which is nice – I haven’t seen the broadcast, so unfortunately I can’t comment on its quality, but the video materials were published and you can watch selected sessions.

As I’m a .NET-oriented specialist, I attended sessions more appealing and applicable to my daily work, somehow avoiding any C++ related stuff. That was possible since there were two tracks of presentations. Here are some notes from sessions I participated in.

  • Andrzej Krzemieński, in his “Seeing the bigger picture” presentation, talked about bugs and ways to prevent them. He started on a funny note by saying that death is the second scariest thing for people, right after giving a public talk.

After showing a relatively simple and common example of SQL injection, he set out to explain how to identify the root cause of bugs. One cause he pointed to was the abstraction gap between the developer’s viewpoint and the hacker’s viewpoint: to better detect holes in our software, we need to change our perspective. He much preferred assertions and preconditions to throwing exceptions as a way to describe the contract of code. In his opinion, even predicates expressed in comments, which the compiler does not validate, are better. He stated that the caller should be the one checking preconditions.

Although I do not agree with all his observations, I admit that we should make better use of type systems. For example, by defining sub-types of base types (like string), we can add additional constraints such as length or valid characters. This can prevent functions being called with invalid arguments by mistake (accidental signature match).
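
To make the sub-type idea concrete, here is a minimal C++ sketch. The UserName type, its 16-character limit and the home_directory_for function are my own hypothetical illustrations, not examples from the talk. The constructor states its precondition with an assertion, in the spirit of what Andrzej advocated, and the caller is expected to check it first.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical constrained string: 1 to 16 letters or digits
// (as classified by std::isalnum in the current C locale).
class UserName {
public:
    // Predicate that callers can use to check the precondition up front.
    static bool is_valid(const std::string& text) {
        const bool length_ok = !text.empty() && text.size() <= 16;
        const bool chars_ok = std::all_of(
            text.begin(), text.end(),
            [](unsigned char c) { return std::isalnum(c) != 0; });
        return length_ok && chars_ok;
    }

    // Precondition: is_valid(text). A violation is a bug in the caller,
    // so it is asserted rather than reported with an exception.
    explicit UserName(std::string text) : value_(std::move(text)) {
        assert(is_valid(value_));
    }

    const std::string& value() const { return value_; }

private:
    std::string value_;
};

// Takes a UserName, not a std::string, so an arbitrary string cannot be
// passed here by accident (the constructor is explicit).
std::string home_directory_for(const UserName& name) {
    return "/home/" + name.value();
}
```

A call such as home_directory_for("bob") does not compile; the caller has to write UserName("bob") and thereby takes responsibility for the constraint.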

  • Scott Meyers, one of the invited stars, elaborated on “CPU caches and why you care“. He began with the interesting theme of “A Tale of Two Traversals”, where he explained how CPU memory access works and why traversing a matrix by columns is not as scalable as traversing it by rows. Next he showed Herb Sutter’s scalability issue in counting odd matrix elements. This time the introduction of a local variable makes the code perfectly scalable, and the reason is the CPU cache. (NB: I wonder how those algorithms perform in C#; does anybody know?)

Next Scott divided CPU caches into three groups: data, instruction and translation lookaside buffer (TLB), and advised that it’s better for us to focus on the first two in our optimizations. You might think that with the contemporary state of processors, languages, IDEs, etc., you don’t have to think about low-level design. To counter that, he quoted some voices of experience:

Sergey Solyanik (from Microsoft):

Linux was routing packets at ~30Mbps [wired], and wireless at ~20. Windows CE was crawling at barely 12Mbps wired and 6Mbps wireless. …

We found out Windows CE had a LOT more instruction cache misses than Linux. …

After we changed the routing algorithm to be more cache-local, we started doing 35MBps [wired], and 25MBps wireless – 20% better than Linux.

Dmitriy Vyukov (developer of Relacy Race Detector):

Cache-lines are the key! Undoubtedly! If you will make even single error in data layout, you will get 100x slower solution! No jokes!

Then he talked about cache hierarchies, which are very common in contemporary processors. He showed sample cache sizes, which are still small (for the Intel Core i7-9xx processor): 32KB L1, 256KB L2 and 8MB L3, amusingly drawing out the units (e.g. ‘kiiiilllllo‘) to stress how small the caches are. After that, with the help of an amusing animation, the audience could see how slow access to main memory is in comparison to the caches, while Scott was sitting on the couch. Nicely designed!

Next he revealed the “cache lines” concept, clarifying that main memory is read and written in blocks of multiple adjacent words, not single bytes (a 64-byte cache line is common). At that point Scott referred back to the traversal example, explaining that cache lines are the main reason why column traversal is slow. Then he moved on to speculative prefetching of cache lines by the hardware. One implication of this behaviour is that locality and predictable access patterns count a lot.
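
The two traversals can be sketched as follows. This is my own simplified C++ version, assuming a row-major matrix stored in one contiguous buffer; it is not Scott’s code, and the Matrix struct and function names are made up for illustration. Both functions return the same sum and differ only in the order in which they touch memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix: element (r, c) lives at data[r * cols + c].
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;
};

// Visits memory in address order, so consecutive reads mostly hit the
// cache line that was just loaded (and the prefetcher can keep up).
long long sum_by_rows(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    long long total = 0;
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}

// Jumps `cols` elements between consecutive reads, so for a large matrix
// each read tends to land on a different cache line.
long long sum_by_columns(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    long long total = 0;
    for (std::size_t c = 0; c < m.cols; ++c) {
        for (std::size_t r = 0; r < m.rows; ++r) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}
```

How large the difference is depends on the matrix size and the hardware, so it is worth measuring rather than assuming.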

Cache coherency was presented as yet another problem of data access. Luckily it is taken care of by the hardware and is often invisible to developers. An exception to this transparency is the “false sharing” problem: memory blocks have to be synchronized between cores because variables happen to be colocated, even though no real data are shared between those cores. This is the issue in Herb Sutter’s code, mentioned at the beginning, where introducing a local variable removes the ‘false sharing’ and improves performance.
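
Here is a minimal sketch of that fix in C++. It is my own simplification, not Herb Sutter’s original code: it counts odd values in a flat vector rather than a matrix, and the count_odds name and chunking scheme are assumptions. Each thread accumulates into a private local variable and writes to the shared array of partial results only once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Counts odd values using `workers` threads.
std::size_t count_odds(const std::vector<int>& values, std::size_t workers) {
    assert(workers > 0);
    // Neighbouring slots of `partial` are likely to share a cache line.
    std::vector<std::size_t> partial(workers, 0);
    const std::size_t chunk = (values.size() + workers - 1) / workers;

    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&values, &partial, w, chunk] {
            const std::size_t begin = std::min(w * chunk, values.size());
            const std::size_t end = std::min(begin + chunk, values.size());
            std::size_t local = 0;  // thread-private accumulator
            for (std::size_t i = begin; i < end; ++i) {
                if (values[i] % 2 != 0) ++local;
            }
            // One write per thread. Incrementing partial[w] inside the
            // loop instead would make the cores fight over the line.
            partial[w] = local;
        });
    }
    for (std::thread& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), std::size_t{0});
}
```

The result is the same either way; only the amount of cache-line traffic between cores changes.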

There are several conditions that all have to be true for ‘false sharing’ to arise: independent variables have to fall on one cache line, different cores need to access those variables frequently, and at least one of them needs to be a writer. Among all kinds of data, a few are more susceptible: globals, statics, heap-allocated data and handed-out thread references. Here Scott quoted another fascinating voice of experience:

Joe Duffy at Microsoft:

During our Beta1 performance milestone in Parallel Extensions, most of our performance problems came down to stamping out false sharing in numerous places.

So it looks like the .NET platform is not free from the risk.

In the summary, the audience was left with three observations: small means fast, locality counts, and predictable access patterns count as well. Scott divided his guidance into two areas: data and code. Regarding data, he recommended using linear traversals of arrays (“Hardware loves arrays“) and extracting the subsets of attributes over which we often iterate into separate arrays of objects (after Bruce Dawson). He also encouraged everybody to watch the “Data-Oriented Design” presentation by Mike Acton.
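
One way to read the advice on extracting frequently iterated attributes is sketched below in C++. The Employees container and its fields are hypothetical, chosen only for illustration. The salary, which a hot loop iterates over, lives in its own dense array, while the rarely used details sit in a parallel array under the same index.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Rarely used attributes, kept out of the hot loop's way.
struct EmployeeDetails {
    std::string name;
    std::string address;
};

// details_[i] describes the employee whose salary is salaries_[i].
class Employees {
public:
    void add(double salary, EmployeeDetails info) {
        salaries_.push_back(salary);
        details_.push_back(std::move(info));
    }

    // Linear traversal over a dense array of doubles; no strings are
    // pulled into the cache along the way.
    double total_salary() const {
        assert(salaries_.size() == details_.size());
        return std::accumulate(salaries_.begin(), salaries_.end(), 0.0);
    }

    const EmployeeDetails& details(std::size_t index) const {
        assert(index < details_.size());
        return details_[index];
    }

private:
    std::vector<double> salaries_;
    std::vector<EmployeeDetails> details_;
};
```

The price is that the two arrays must be kept in sync, which is why the sketch hides them behind a small class.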

As for code optimizations, Scott generally advised programming so that the working set (the amount of memory that a process requires in a given time interval) fits in cache. One important way to make that possible is to avoid iterating over heterogeneous types, which regularly causes the instructions implementing the different types to be swapped in and out of the cache. As a solution to this particular issue, he suggested sorting lists by object type. This made me think about the OWIN and Katana initiatives and wonder how much performance we could gain by minimizing program size. Scott warned about inlining, which is both good (it reduces branching and facilitates deeper optimization) and bad (it duplicates code and reduces the available cache). Finally, he advocated taking advantage of any built-in optimizations in compilers, such as whole-program optimization (WPO) and profile-guided optimization (PGO).
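
The suggestion to sort lists by object type could look roughly like this in C++. This is a sketch under my own assumptions: the Shape hierarchy is invented, and grouping by std::type_index is just one possible way to do it. It only makes sense when the processing order does not matter.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <vector>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};

struct Rectangle : Shape {
    Rectangle(double w, double h) : width(w), height(h) {}
    double area() const override { return width * height; }
    double width;
    double height;
};

using Shapes = std::vector<std::unique_ptr<Shape>>;

// Reorders the list so that objects of the same dynamic type are adjacent.
// Precondition: no element is null.
void group_by_dynamic_type(Shapes& shapes) {
    std::stable_sort(
        shapes.begin(), shapes.end(),
        [](const std::unique_ptr<Shape>& a, const std::unique_ptr<Shape>& b) {
            assert(a && b);
            return std::type_index(typeid(*a)) < std::type_index(typeid(*b));
        });
}

// After grouping, this loop runs the same area() implementation many
// times in a row before switching to the next one.
double total_area(const Shapes& shapes) {
    double total = 0.0;
    for (const auto& shape : shapes) total += shape->area();
    return total;
}
```

Sorting has its own cost, so this should pay off only for lists that are traversed many times between modifications.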

  • In the second part of his “Seeing the bigger picture” lecture, Andrzej Krzemieński discussed an aircraft weight calculator. In his sample, a bogus weight (a magic number) was returned when there was a problem with lazy initialization. Andrzej advised reflecting such irregularities in the interface (for example, by returning a nullable type) and falling back on exceptions only as a last resort, when changes to the interface cannot be afforded. Next he moved on to the issue of the calculator’s unit-less result (when a function returns a bare number, the developer has to guess in what units it is expressed) and encouraged the creation of dedicated types for results (a Newton type in his case). As a final point, he argued against the practice of catching exceptions that we don’t know what to do with, and emphasised that “the terminate function is our friend“. It is better to crash an app with an unrecoverable condition than an aircraft (using our weight calculator software).
  • Then came Venkat Subramaniam with a talk on “Core Principles and Practices for Creating Lightweight Design”. His presentation mostly focussed on SOLID principles, although he did not discuss DI and didn’t mention the SOLID acronym directly. Generally, he encouraged building systems that make changes affordable, and waiting for an external need before refactoring them.

One very enlightening thing that Venkat said was that “Every knowledge in a system should have a single authoritative unambiguous representation“. He explained as well that DRY is not only about code duplication, but also about duplication of effort. Sometimes we are afraid to refactor our code to remove duplication because we may break the system, and that’s why we need good automated tests. He also emphasized that we must not confuse our inability to express logic in a clean way with impossibility.

He suggested suffixing the YAGNI principle with the adverb “yet”, indicating that sometimes the functionality we develop is not something we will never need; rather, the last responsible moment hasn’t come yet. Sometimes we are too attached to our code and do not control our emotions well enough to make good decisions.

  • Damian Czernous reminded the audience of the history of presentation patterns in his “Model – View – Whatever (MVW)” talk. He kicked off with an interesting observation on AngularJS, saying that several years ago it was closer to MVC and that it’s now more like MVVM. He clarified that MVC in the sense of its 1979 definition is dead; it was originally used to organize interactions between controls in rich client apps.
  • The second of Venkat’s presentations was on functional programming (FP), which is not a new topic. It took object-oriented programming 22 years to get popular, but the time for FP is yet to come. He reminded us why mutability is bad (error-prone, hard to reason about, hard to make concurrent), which is why assignment-less programming is our future.

Venkat explained that functions are first-class citizens in the FP world and that the functional style consists of state transformations. He called the FOR loop a monster, which we treat as a hammer, and that’s why we frequently see nails. Luckily, functional, declarative concepts (like lambdas) are leaking into other languages (like C# and Java) and becoming more common. Thanks to functional ideas, our code starts to read like a story, not a puzzle.

Then he discussed lexical scope and closures, and finished with other benefits of FP, such as function composition and lazy evaluation.
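
A few of these ideas can be shown even in C++. The snippet below is my own small sketch, not code from the talk, and the function names are made up. It shows a closure capturing a value from its lexical scope, a compose helper for function composition, and a loop-free, assignment-free way to sum the doubled even numbers. Lazy evaluation is not covered here.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Closure: the returned lambda captures `factor` from the enclosing scope.
std::function<int(int)> multiply_by(int factor) {
    return [factor](int value) { return value * factor; };
}

// Function composition: compose(f, g)(x) == g(f(x)).
template <typename F, typename G>
auto compose(F f, G g) {
    return [f, g](auto x) { return g(f(x)); };
}

// Declarative style: no index variable and no mutable accumulator of our
// own; the fold describes what to compute for each element.
int sum_of_doubled_evens(const std::vector<int>& numbers) {
    const auto doubled = multiply_by(2);
    return std::accumulate(
        numbers.begin(), numbers.end(), 0,
        [&doubled](int acc, int n) {
            return n % 2 == 0 ? acc + doubled(n) : acc;
        });
}
```

For example, compose(multiply_by(2), multiply_by(3)) yields a function that multiplies its argument by six.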


Wrocław Multimedia Fountain located near Centennial Hall (from Wikipedia)

Overall the event was good and I hope it will be repeated. Unfortunately, the quality of the photos I took with my phone is low, but here you can find some made by others. It is always possible to find fault, especially when things are done for the first time. There were, naturally, a few organizational hiccoughs which I hope will be addressed next time:

  • Until the third exit was opened in the main room, it took a really long time to get out for the break.
  • Even if you managed to get outside somehow, there were not that many points serving coffee and food. That was a bit of a pain, especially in the morning. I got to the venue without breakfast, hoping to get some during the breaks. Although food trucks were present, the queues were too long, so ultimately we had to choose between real hunger and hunger for knowledge.
  • Most of the sessions were a bit too long: 75 minutes can be wearying even if you like the speaker and the topic. If the subject was boring or the delivery was mediocre, it was frustrating as well.
  • The whole event was supposed to end at 18:50, not counting the evening party. There was a small delay, and I felt really exhausted by that time.
  • One of the conference rooms had an unfortunate layout: it was very wide but not very deep, consisting of a couple of really long rows of chairs. It was strange to sit there, somehow feeling remote and detached from the presenter.
  • In the auditorium there were two screens to minimize the distance to the code examples and make text easily readable. This initially seemed like a good idea, but the problem was that the right one sometimes showed live video instead of slides. It annoyed participants, especially those in the first rows. Even one of the speakers was misled when he suddenly noticed he was pointing the laser at his own image.