Learning Regular Expressions

In the recent Q&A on developer career strategies, Cory House said he tries to follow the principle of learning “just in time, not just in case”, adding that he’s better off focusing on the fundamentals than learning some speculative new technology. I think this needs further explanation because it could easily be taken the wrong way.

I have found that a “learn just in time” policy can sometimes become a “just out of time” policy. As developers, we tend to get thrown into a variety of technical problems requiring a variety of expertise.

If we don’t have that expertise, we generally end up making flawed decisions. The problem is that there are always numerous ways to write an inferior solution to any given problem. If we don’t understand the details needed to conceive of the best solution, we can fool ourselves into thinking our inferior solution is the best one.

However, nobody can be an expert in everything, so we need to prioritise what we spend our time learning. Focusing on the fundamentals tends to give the biggest bang per buck because that’s the type of work we do most.

Throughout my career I have managed to get by with a minimal knowledge of regular expressions. I have tried to avoid them as much as possible to be honest. For me, a couple of hundred lines of clean C# is much more readable than one long regular expression.

I knew the difference between a character class and a Kleene star (*), but not much more than that.
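For anyone who is hazy on even that much, here is a minimal sketch of the two ideas. It uses Python’s `re` module rather than .NET purely for illustration, and the pattern names are made up for this example.

```python
import re

# A character class matches exactly one character from a set.
vowel = re.compile(r"[aeiou]")

# The Kleene star repeats the preceding item zero or more times.
ab_star = re.compile(r"ab*")

# Combining the two: zero or more digits followed by one lowercase letter.
digits_then_letter = re.compile(r"[0-9]*[a-z]")

print(vowel.findall("regex"))                              # ['e', 'e']
print(ab_star.fullmatch("a") is not None)                  # True: zero b's is allowed
print(ab_star.fullmatch("abbb") is not None)               # True
print(digits_then_letter.fullmatch("2024x") is not None)   # True
print(digits_then_letter.fullmatch("x") is not None)       # True: zero digits is allowed
```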

This week one of my tasks is to review another developer’s 500-character regular expression. Any developer in this situation has three options:

1. Try to find a regular expressions expert to delegate the work to (out of luck on that one)
2. Do a bit of Googling, review the unit tests, shrug shoulders and say “yeah, I guess that looks okay” (not recommended)
3. Insist on some training time to upskill before going any further.

Employers generally don’t like employees taking significant amounts of time out for training. However, this is a case where it is entirely necessary, and it is the only professional option to take.

Free Software

Microsoft have released a free Regex Fuzzer to auto-detect performance problems and denial-of-service risks. Unfortunately, it doesn’t support a lot of regex features.
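To give a feel for the kind of performance problem such a tool looks for, here is a rough sketch of catastrophic backtracking. It is not how the Fuzzer itself works. It assumes Python’s backtracking `re` engine rather than .NET, and the pattern names and the `time_match` helper are made up for this example.

```python
import re
import time

# Nested quantifiers: there are exponentially many ways to split a run of
# "a" characters between the inner and outer "+".
RISKY = re.compile(r"(a+)+$")

# Matches exactly the same strings, with only one way to split the run.
SAFE = re.compile(r"a+$")


def time_match(pattern, text):
    """Return (match result, seconds taken) for one match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


for n in (10, 14, 18):
    # The trailing "b" forces the match to fail, so a backtracking engine
    # tries every possible split before giving up.
    text = "a" * n + "b"
    _, risky_seconds = time_match(RISKY, text)
    _, safe_seconds = time_match(SAFE, text)
    print(f"n={n}: risky {risky_seconds:.5f}s, safe {safe_seconds:.5f}s")
```

In a backtracking engine the risky timings should climb steeply as n increases, while the safe ones stay roughly flat. Keep n small when experimenting, because a few more characters can make the risky pattern run for a very long time, which is exactly the denial-of-service risk.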

A website with better regular expression features than most text editors is RegExr. It is a good site for getting started with building regular expressions.

If you’re a Pluralsight subscriber

I have been watching the .NET Regular Expressions course by Dan Sullivan.

The Regex engine in .NET is one of the more powerful ones around. It covers most of what you will find in JavaScript or Perl and adds features of its own, such as balancing groups and right-to-left matching.

This is one of the toughest courses I have seen on Pluralsight. Although it’s less than 4 hours long, it’s so dense that you’ll need to keep pausing the video to catch up. At least, that’s what I found I needed to do. Allow at least a full day for it. I spent 8 hours, but by the end I was able to pass the assessment with full marks.

A much gentler introduction to regular expressions in .NET is given by Jon Skeet in one of the modules of his Mastering C# 4.0 course. It is less than half an hour long and teaches the most basic aspects. He begins with the joke:

“A programmer had a problem. He decided to solve it using regular expressions. Now he has two problems!”

The Dan Sullivan course is far more comprehensive, so although it’s hard work, I definitely recommend it if you need to learn regular expressions in depth.

If you like reading books

The best-rated book I have found is Mastering Regular Expressions, 3rd Edition, by Jeffrey Friedl.

You can find details on this and other Regular Expression books here

Premium Software

If you’re doing serious work with regular expressions, I can recommend buying RegexBuddy for 30 euros. I am not getting paid anything for saying this. The tool will make your work significantly easier, which is why we have bought several copies at my company.

Conclusion

I consider this an example of just in time learning. I held out for many years without knowing the subject properly, but now a situation has arisen where I can no longer wing it.

Some people would consider regular expressions to be one of the fundamentals. They are certainly a reasonably important part of the .NET Framework, and an important part of JavaScript and many other languages. Regular expressions are nothing new, but they are very powerful and here to stay.

Further Reading
Regular Expressions Tutorial for beginners
Mastering Regular Expressions book
