Working Effectively with Legacy Code

Michael C. Feathers


The average book on Agile software development describes a fairyland of greenfield projects, with wall-to-wall tests that run after every few edits, and clean & simple source code.


The average software project in our industry was written under some form of code-and-fix, and without automated unit tests. And we can't just throw this code away; it represents a significant investment of debugging and maintenance effort, and it contains many latent requirements decisions. Just as Agile processes are incremental, Agile adoption must be incremental too. No more throwing away code just because it looked at us funny.


Mike begins his book with a very diplomatic definition of "Legacy". I'll skip ahead to the undiplomatic version: Legacy code is code without unit tests.


Before cleaning that code up, and before adding new features and removing bugs, such code must be de-legacified. It needs unit tests.


To add unit tests, you must change the code. To change the code, you need unit tests to show that your change was safe.


The core of the book is a cookbook of recipes to conduct various careful attacks. Each presents a particular problem, and a relatively safe way to migrate the code towards tests.


Code undergoing this migration will begin to experience the benefits of unit tests, and these benefits will incrementally make new tests easier to write. These efforts will make aspects of a legacy codebase easy to change.


It's an unfortunate commentary on the state of our programming industry how much we need this book.

More on Amazon.com

Mentioned in questions and answers.

If you could go back in time and tell yourself to read a specific book at the beginning of your career as a developer, which book would it be?

I expect this list to be varied and to cover a wide range of things.


Applying UML and Patterns by Craig Larman.

The title of the book is slightly misleading; it does deal with UML and patterns, but it covers so much more. The subtitle of the book tells you a bit more: An Introduction to Object-Oriented Analysis and Design and Iterative Development.

Masters of Doom. As far as motivation and love for your profession go, it won't get any better than what's described in this book - a truly inspiring story!

Beginning C# 3.0: An Introduction to Object Oriented Programming

This is the book for those who want to understand the whys and hows of OOP using C# 3.0. You don't want to miss it.


Mastery: The Keys to Success and Long-Term Fulfillment, by George Leonard

It's about what mindsets are required to reach mastery in any skill, and why. It's just awesome, and an easy read too.

Pro Spring is a superb introduction to the world of Inversion of Control and Dependency Injection. If you're not aware of these practices and their implications, the balance of topics and technical detail in Pro Spring is excellent. It builds a great case and, consequently, a solid personal foundation.

Another book I'd suggest would be Robert Martin's Agile Software Development (ASD). Code smells, agile techniques, test driven dev, principles ... a well-written balance of many different programming facets.

More traditional classics would include the infamous GoF Design Patterns, Bertrand Meyer's Object-Oriented Software Construction, Booch's Object-Oriented Analysis and Design, Scott Meyers' "Effective C++" series, and a lesser-known book I enjoyed by Gunderloy, Coder to Developer.

And while books are nice ... don't forget radio!

... let me add one more thing. If you haven't already discovered Safari, take a look. It is more addictive than Stack Overflow :-) I've found that with my Google-type habits I need the more expensive subscription so I can look at any book at any time, but I'd recommend the trial to anyone even remotely interested.

(Ah yes, a little Obj-C today, Cocoa tomorrow, patterns? SOA? What was that example in that cookbook? What did Steve say in the second edition? Should I buy this book? ... a subscription like this is great if you'd like some continuity and context to what you're googling ...)

Database System Concepts is one of the best books you can read on understanding good database design principles.


Algorithms in C++ was invaluable to me in learning Big O notation and the ins and outs of the various sort algorithms. This was published before Sedgewick decided he could make more money by dividing it into 5 different books.

C++ FAQs is an amazing book that really shows you what you should and shouldn't be doing in C++. The backward compatibility of C++ leaves a lot of landmines about and this book helps one carefully avoid them while at the same time being a good introduction into OO design and intent.

Here are two I haven't seen mentioned:
I wish I had read "Ruminations on C++" by Koenig and Moo much sooner. That was the book that made OO concepts really click for me.
And I recommend Michael Abrash's "Zen of Code Optimization" for anyone else planning on starting a programming career in the mid 90s.

Perfect Software: And Other Illusions about Testing


Perfect Software: And Other Illusions about Testing by Gerald M. Weinberg

ISBN-10: 0932633692

ISBN-13: 978-0932633699

Rapid Development by McConnell

The most influential programming book for me was Enough Rope to Shoot Yourself in the Foot by Allen Holub.


Oh well, how long ago that was.

I have a few good books that strongly influenced me that I've not seen on this list so far:

The Psychology of Everyday Things by Donald Norman. The general principles of design for other people. This may seem to be mostly good for UI, but if you think about it, it has applications almost anywhere there is an interface that someone besides the original developer has to work with; e.g., an API, where you design the interface in such a way that other developers form the correct mental model and get appropriate feedback from the API itself.

The Art of Software Testing by Glen Myers. A good, general introduction to testing software; good for programmers to read to help them think like a tester, i.e., think of what may go wrong and prepare for it.

By the way, I realize the question was the "Single Most Influential Book" but the discussion seems to have changed to listing good books for developers to read so I hope I can be forgiven for listing two good books rather than just one.


C++ How to Program is good for beginners. It is an excellent, complete book of about 1500 pages.

Effective C++ and More Effective C++ by Scott Meyers.

Inside the C++ Object Model by Stanley Lippman

I bought this when I was a complete newbie, and it took me from only knowing that Java existed to being a reliable team member in a short time.

Not a programming book, but still a very important book every programmer should read:

Orbiting the Giant Hairball by Gordon MacKenzie

The Pragmatic programmer was pretty good. However one that really made an impact when I was starting out was :

Windows 95 System Programming Secrets

I know - it sounds and looks a bit cheesy on the outside and has probably dated a bit - but this was an awesome explanation of the internals of Win95, based on the author's (Matt Pietrek) investigations using his own tools - the code for which came with the book. Bear in mind this was before the whole open-source thing, and Microsoft was still pretty cagey about releasing documentation of internals - let alone source. There was a quote in there along the lines of "If you are working through some problem and hit some sticking point, then you need to stop and really look deeply into that piece and really understand how it works". I've found this to be pretty good advice - particularly these days when you often have the source for a library and can go take a look. It's also inspired me to enjoy diving into the internals of how systems work, something that has proven invaluable over the course of my career.

Oh, and I'd also throw in Essential .NET - a great explanation of .NET internals from Don Box.

I recently read Dreaming in Code and found it to be an interesting read. Perhaps more so since Chandler 1.0 was released the day I started reading it. Reading about the growing pains and mistakes of a project team of talented people trying to "change the world" gives you a lot to learn from. Also, Scott brings up a lot of programmer lore and wisdom along the way that makes it an entertaining read.

Beautiful Code had one or two things that made me think differently, particularly the chapter on top down operator precedence.

K&R

@Juan: I know Juan, I know - but there are some things that can only be learned by actually getting down to the task at hand. Speaking in abstract ideals all day simply makes you into an academic. It's in the application of the abstract that we truly grok the reason for their existence. :P

@Keith: Great mention of "The Inmates are Running the Asylum" by Alan Cooper - an eye opener for certain, any developer that has worked with me since I read that book has heard me mention the ideas it espouses. +1

I found The Algorithm Design Manual to be a very beneficial read. I also highly recommend Programming Pearls.

This one isn't really a book for the beginning programmer, but if you're looking for SOA design books, then SOA in Practice: The Art of Distributed System Design is for you.

For me it was Design Patterns Explained. It provided an "Oh, that's how it works" moment for me in regard to design patterns and has been very useful when teaching design patterns to others.

Code Craft by Pete Goodliffe is a good read!


The first book that made a real impact on me was Mastering Turbo Assembler by Tom Swan.

Other books that have had an impact was Just For Fun by Linus Torvalds and David Diamond and of course The Pragmatic Programmer by Andrew Hunt and David Thomas.

In addition to other people's suggestions, I'd recommend either acquiring a copy of SICP, or reading it online. It's one of the few books that I've read that I feel greatly increased my skill in designing software, particularly in creating good abstraction layers.

A book that is not directly related to programming, but is also a good read for programmers (IMO) is Concrete Mathematics. Most, if not all of the topics in it are useful for programmers to know about, and it does a better job of explaining things than any other math book I've read to date.

For me "Memory as a programming concept in C and C++" really opened my eyes to how memory management really works. If you're a C or C++ developer I consider it a must read. You will defiantly learn something or remember things you might have forgotten along the way.

http://www.amazon.com/Memory-Programming-Concept-C/dp/0521520436

Agile Software Development with Scrum by Ken Schwaber and Mike Beedle.

I used this book as the starting point to understanding Agile development.

Systemantics: How Systems Work and Especially How They Fail. Get it used cheap. But you might not get the humor until you've worked on a few failed projects.

The beauty of the book is the copyright year.

Probably the most profound takeaway "law" presented in the book:

The Fundamental Failure-Mode Theorem (F.F.T.): Complex systems usually operate in failure mode.

The idea being that there are failing parts in any given piece of software that are masked by failures in other parts or by validations in other parts. For a real-world example, see the Therac-25 radiation therapy machine, whose software flaws were masked by hardware failsafes. When the hardware failsafes were removed, a software race condition that had gone undetected all those years resulted in the machine killing three people.

It seems most people have already touched on some very good books. One which really helped me out was Effective C#: 50 Ways to Improve your C#. I'd be remiss if I didn't mention The Tao of Pooh. Philosophy books can be good for the soul, and the code.

Discrete Mathematics For Computer Scientists

Discrete Mathematics For Computer Scientists by J.K. Truss.

While this doesn't teach you programming, it teaches you the fundamental mathematics that every programmer should know. You may remember this stuff from university, but really, doing predicate logic will improve your programming skills, and you need to learn set theory if you want to program using collections.

There really is a lot of interesting information in here that can get you thinking about problems in different ways. It's handy to have, just to pick up once in a while to learn something new.

I saw a review of Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools on a blog talking also about XI-Factory; I read it and I must say this book is a must read. Although not specifically targeted at programmers, it explains very clearly what is happening in the programming world right now with Model-Driven Architecture and so on.

Solid Code: Optimizing the Software Development Life Cycle

Although the book is only 300 pages and favors Microsoft technologies it still offers some good language agnostic tidbits.

My vote is "How to Think Like a Computer Scientist: Learning With Python" It's available both as a book and as a free e-book.

It really helped me to understand the basics of not just Python but programming in general. Although it uses Python to demonstrate concepts, they apply to most, if not all, programming languages. Also: IT'S FREE!

Managing Gigabytes is an instant classic for thinking about the heavy lifting of information.

Object-Oriented Programming in Turbo C++. Not super popular, but it was the one that got me started, and was the first book that really helped me grok what an object was. Read this one waaaay back in high school. It sort of brings a tear to my eye...

My high school math teacher lent me a copy of Are Your Lights On? How to Figure Out What the Problem Really Is, which I have re-read many times. It has been invaluable, as a developer, and in life generally.

I'm now reading Agile Software Development: Principles, Patterns, and Practices. For those interested in XP and object-oriented design, this is a classic read.


Kernighan & Plauger's Elements of Programming Style. It illustrates the difference between gimmicky-clever and elegant-clever.


The Back of the Napkin, by Dan Roam.


A great book about visual thinking techniques. There is also an expanded edition now. I can't speak to that version, as I do not own it yet.

To get advanced in Prolog, I like these two books:

The Art of Prolog

The Craft of Prolog

They really open the mind to logic programming and recursion schemes.

Here's an excellent book that is not as widely applauded, but is full of deep insight: Agile Software Development: The Cooperative Game, by Alistair Cockburn.

What's so special about it? Well, clearly everyone has heard the term "Agile", and it seems most are believers these days. Whether you believe or not, though, there are some deep principles behind why the Agile movement exists. This book uncovers and articulates these principles in a precise, scientific way. Some of the principles are (btw, these are my words, not Alistair's):

  1. The hardest thing about team software development is getting everyone's brains to have the same understanding. We are building huge, elaborate, complex systems which are invisible in the tangible world. The better you are at getting more peoples' brains to share deeper understanding, the more effective your team will be at software development. This is the underlying reason that pair programming makes sense. Most people dismiss it (and I did too initially), but with this principle in mind I highly recommend that you give it another shot. You wind up with TWO people who deeply understand the subsystem you just built ... there aren't many other ways to get such a deep information transfer so quickly. It is like a Vulcan mind meld.
  2. You don't always need words to communicate deep understanding quickly. And a corollary: too many words, and you exceed the listener/reader's capacity, meaning the understanding transfer you're attempting does not happen. Consider that children learn how to speak language by being "immersed" and "absorbing". Not just language either ... he gives the example of some kids playing with trains on the floor. Along comes another kid who has never even SEEN a train before ... but by watching the other kids, he picks up the gist of the game and plays right along. This happens all the time between humans. This along with the corollary about too many words helps you see how misguided it was in the old "waterfall" days to try to write 700 page detailed requirements specifications.

There is so much more in there too. I'll shut up now, but I HIGHLY recommend this book!

Agile Software Development by Alistair Cockburn

Do users ever touch your code? If you're not doing solely back-end work, I recommend About Face: The Essentials of User Interface Design — now in its third edition (linked). I used to think my users were stupid because they didn't "get" my interfaces. I was, of course, wrong. About Face turned me around.

"Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs (Microsoft Programming Series)" by Steve MacGuire.

Interesting what a large proportion the books mentioned here are C/C++ books.

While not strictly a software development book, I would highly recommend that Don't Make me Think! be considered in this list.

As so many people have listed Head First Design Patterns, which I agree is a very good book, I would like to see whether as many people are aware of a title called Design Patterns Explained: A New Perspective on Object-Oriented Design.

This title deals with design patterns excellently. The first half of the book is very accessible and the remaining chapters require only a firm grasp of the content already covered. The reason I feel the second half of the book is less accessible is that it covers patterns that I, as a young developer admittedly lacking in experience, have not used much.

This title also introduces the concepts behind design patterns, covering everything from Christopher Alexander's initial work in architecture to the GoF first documenting patterns in Smalltalk.

I think that anyone who enjoyed Head First Design Patterns but still finds the GoF very dry, should look into Design Patterns Explained as a much more readable (although not quite as comprehensive) alternative.

Even though I've never programmed a game, this book helped me understand a lot of things in a fun way.

How influential a book is often depends on the reader and where they were in their career when they read the book. I have to give a shout-out to Head First Design Patterns. Great book and the very creative way it's written should be used as an example for other tech book writers. I.e. it's written in order to facilitate learning and internalizing the concepts.


97 Things Every Programmer Should Know


This book pools together the collective experiences of some of the world's best programmers. It is a must read.

Extreme Programming Explained: Embrace Change by Kent Beck. While I don't advocate a hardcore XP-or-the-highway take on software development, I wish I had been introduced to the principles in this book much earlier in my career. Unit testing, refactoring, simplicity, continuous integration, cost/time/quality/scope - these changed the way I looked at development. Before Agile, it was all about the debugger and fear of change requests. After Agile, those demons did not loom as large.

One of my personal favorites is Hacker's Delight, because it was as much fun to read as it was educational.

I hope the second edition will be released soon!

You.Next(): Move Your Software Development Career to the Leadership Track by Michael C. Finley and Honza Fedák.

I've been around a while, so most books that I have found influential don't necessarily apply today. I do believe it is universally important to understand the platform that you are developing for (both hardware and OS). I also think it's important to learn from other people's mistakes. So two books I would recommend are:

Computing Calamities and In Search of Stupidity: Over Twenty Years of High Tech Marketing Disasters

Working Effectively with Legacy Code is a really amazing book that goes into great detail about how to properly unit test your code and what the true benefit of it is. It really opened my eyes.

I worked on an embedded system this summer written in straight C. It was an existing project that the company I work for had taken over. I have become quite accustomed to writing unit tests in Java using JUnit but was at a loss as to the best way to write unit tests for existing code (which needed refactoring) as well as new code added to the system.

Are there any projects out there that make unit testing plain C code as easy as unit testing Java code with JUnit? Any insight that would apply specifically to embedded development (cross-compiling to arm-linux platform) would be greatly appreciated.

Personally I like the Google Test framework.

The real difficulty in testing C code is breaking the dependencies on external modules so you can isolate code in units. This can be especially problematic when you are trying to get tests around legacy code. In this case I often find myself using the linker to substitute stub functions in tests.

This is what people are referring to when they talk about "seams". In C, your only options really are to use the preprocessor or the linker to mock out your dependencies.

A typical test suite in one of my C projects might look like this:

#include "myimplementationfile.c"
#include <gtest/gtest.h>

// Mock out external dependency on mylogger.o
void Logger_log(...){}

TEST(FactorialTest, Zero) {
    EXPECT_EQ(1, Factorial(0));
}

Note that you are actually including the C file and not the header file. This gives the advantage of access to all the static data and functions. Here I mock out my logger (which might be in mylogger.o) and give an empty implementation. This means that the test file compiles and links independently from the rest of the code base and executes in isolation.
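The preprocessor seam works the same way but at compile time. Here is a minimal sketch of that approach; the alarm.c module and its get_time dependency are invented for the example rather than taken from a real project:

// Contents of the hypothetical alarm.c:
//   long get_time(void);                       /* real dependency, defined elsewhere */
//   int Alarm_check(long deadline) { return get_time() > deadline; }

#define get_time mock_get_time   // preprocessor seam: rename the dependency...
#include "alarm.c"               // ...before pulling in the code under test
#undef get_time
#include <gtest/gtest.h>

static long fake_now = 0;
long mock_get_time(void) { return fake_now; }   // controllable stand-in

TEST(AlarmTest, FiresOnceDeadlineHasPassed) {
    fake_now = 1000;                  // simulate the clock
    EXPECT_TRUE(Alarm_check(999));    // deadline already in the past
}

Whether you reach for the preprocessor or the linker usually comes down to whether you can maintain a separate test build of the module or would rather swap object files at link time.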

As for cross-compiling the code, for this to work you need good facilities on the target. I have done this with googletest cross-compiled to Linux on a PowerPC architecture. This makes sense because there you have a full shell and OS to gather your results. For less rich environments (which I classify as anything without a full OS) you should just build and run on the host. You should do this anyway so you can run the tests automatically as part of the build.

I find testing C++ code is generally much easier due to the fact that OO code is in general much less coupled than procedural (of course this depends a lot on coding style). Also in C++ you can use tricks like dependency injection and method overriding to get seams into code that is otherwise encapsulated.
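As a small sketch of that kind of C++ seam (the PriceFeed class and its methods are invented purely for illustration):

#include <gtest/gtest.h>

class PriceFeed {                    // production class, normally tied to a live data source
public:
    virtual ~PriceFeed() = default;
    double spread() { return ask() - bid(); }   // the logic we actually want to test
protected:
    virtual double bid() { return 0.0; }        // would talk to the network in production
    virtual double ask() { return 0.0; }
};

// Test-only subclass: overriding the virtual methods is the seam that
// lets us exercise spread() with no network or hardware in sight.
class FakePriceFeed : public PriceFeed {
protected:
    double bid() override { return 99.5; }
    double ask() override { return 100.0; }
};

TEST(PriceFeedTest, SpreadIsAskMinusBid) {
    FakePriceFeed feed;
    EXPECT_DOUBLE_EQ(0.5, feed.spread());
}

The production class talks to the real data source; the test subclass overrides just enough to exercise the interesting logic in isolation.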

Michael Feathers has an excellent book about testing legacy code. In one chapter he covers techniques for dealing with non-OO code which I highly recommend.

Edit: I've written a blog post about unit testing procedural code, with source available on GitHub.

Edit: There is a new book coming out from the Pragmatic Programmers that specifically addresses unit testing C code which I highly recommend.

It wasn't that long ago that I was a beginning coder, trying to find good books/tutorials on languages I wanted to learn. Even still, there are times I need to pick up a language relatively quickly for a new project I am working on. The point of this post is to document some of the best tutorials and books for these languages. I will start the list with the best I can find, but hope you guys out there can help with better suggestions/new languages. Here is what I found:

Since this is now wiki editable, I am giving control up to the community. If you have a suggestion, please put it in this section. I decided to also add a section for general be a better programmer books and online references as well. Once again, all recommendations are welcome.

General Programming

Online Tutorials
Foundations of Programming by Karl Seguin - From CodeBetter; it's C# based but the ideas ring true across the board. Can't believe no one's posted this yet, actually.
How to Write Unmaintainable Code - An anti-manual that teaches you how to write code in the most unmaintainable way possible. It would be funny if a lot of these suggestions didn't ring so true.
The Programming Section of Wiki Books - Suggested by Jim Robert as having a large number of books/tutorials on multiple languages in various stages of completion.
Just the Basics - To get a feel for a language.

Books
Code Complete - This book goes without saying; it is truly brilliant in too many ways to mention.
The Pragmatic Programmer - The next best thing to working with a master coder, teaching you everything they know.
Mastering Regular Expressions - Regular Expressions are an essential tool in every programmer's toolbox. This book, recommended by Patrick Lozzi is a great way to learn what they are capable of.
Algorithms in C, C++, and Java - A great way to learn all the classic algorithms if you find Knuth's books a bit too in depth.

C

Online Tutorials
This tutorial seems pretty concise and thorough; I looked over the material and it seems to be pretty good. Not sure how friendly it would be to new programmers though.
Books
K&R C - a classic for sure. It might be argued that all programmers should read it.
C Primer Plus - Suggested by Imran as being the ultimate C book for beginning programmers.
C: A Reference Manual - A great reference recommended by Patrick Lozzi.

C++

Online Tutorials
The tutorial on cplusplus.com seems to be the most complete. I found another tutorial here but it doesn't include topics like polymorphism, which I believe is essential. If you are coming from C, this tutorial might be the best for you.

Another useful tutorial is C++ Annotations. On the Ubuntu family you can get the ebook in multiple formats (PDF, txt, PostScript, and LaTeX) by installing the c++-annotation package from Synaptic (the installed files can be found in /usr/share/doc/c++-annotation/).

Books
The C++ Programming Language - crucial for any C++ programmer.
C++ Primer Plus - Originally added as a typo, but the Amazon reviews are so good, I am going to keep it here until someone says it is a dud.
Effective C++ - Ways to improve your C++ programs.
More Effective C++ - Continuation of Effective C++.
Effective STL - Ways to improve your use of the STL.
Thinking in C++ - Great book, both volumes. Written by Bruce Eckel and Chuck Ellison.
Programming: Principles and Practice Using C++ - Stroustrup's introduction to C++.
Accelerated C++ - Andy Koenig and Barbara Moo - An excellent introduction to C++ that doesn't treat C++ as "C with extra bits bolted on", in fact you dive straight in and start using STL early on.

Forth

Books
FORTH, a text and reference. Mahlon G. Kelly and Nicholas Spies. ISBN 0-13-326349-5 / ISBN 0-13-326331-2. 1986 Prentice-Hall. Leo Brodie's books are good but this book is even better. For instance it covers defining words and the interpreter in depth.

Java

Online Tutorials
Sun's Java Tutorials - An official tutorial that seems thorough, but I am not a Java expert. You guys know of any better ones?
Books
Head First Java - Recommended as a great introductory text by Patrick Lozzi.
Effective Java - Recommended by pek as a great intermediate text.
Core Java Volume 1 and Core Java Volume 2 - Suggested by FreeMemory as some of the best java references available.
Java Concurrency in Practice - Recommended by MDC as great resource for concurrent programming in Java.

The Java Programming Language

Python

Online Tutorials
Python.org - The online documentation for this language is pretty good. If you know of any better let me know.
Dive Into Python - Suggested by Nickola. Seems to be a python book online.

Perl

Online Tutorials
perldoc perl - This is how I personally got started with the language, and I don't think you will be able to beat it.
Books
Learning Perl - a great way to introduce yourself to the language.
Programming Perl - widely referred to as the Perl Bible. An essential reference for any serious Perl programmer.
Perl Cookbook - A great book that has solutions to many common problems.
Modern Perl Programming - newly released, contains the latest wisdom on modern techniques and tools, including Moose and DBIx::Class.

Ruby

Online Tutorials
Adam Mika suggested Why's (Poignant) Guide to Ruby but after taking a look at it, I don't know if it is for everyone. Found this site which seems to offer several tutorials for Ruby on Rails.
Books
Programming Ruby - suggested as a great reference for all things ruby.

Visual Basic

Online Tutorials
Found this site which seems to devote itself to visual basic tutorials. Not sure how good they are though.

PHP

Online Tutorials
The main PHP site - A simple tutorial that allows user comments for each page, which I really like.
PHPFreaks Tutorials - Various tutorials of varying difficulty and length.
Quakenet/PHP tutorials - A PHP tutorial that will guide you from the ground up.

JavaScript

Online Tutorials
Found a decent tutorial here geared toward non-programmers. Found another more advanced one here. Nickolay suggested A re-introduction to JavaScript as a good read here.

Books
Head first JavaScript
JavaScript: The Good Parts (with a Google Tech Talk video by the author)

C#

Online Tutorials
C# Station Tutorial - Seems to be a decent tutorial that I dug up, but I am not a C# guy.
C# Language Specification - Suggested by tamberg. Not really a tutorial, but a great reference on all the elements of C#.
Books
C# to the Point - Suggested by tamberg as a short text that explains the language in amazing depth.

OCaml

Books
nlucaroni suggested the following:
OCaml for Scientists
Introduction to OCaml
Using, Understanding, and Unraveling the OCaml Language: From Practice to Theory and Vice Versa
Developing Applications with Objective Caml - O'Reilly
The Objective Caml System - Official Manual

Haskell

Online Tutorials
nlucaroni suggested the following:
Explore functional programming with Haskell
Books
Real World Haskell
Total Functional Programming

LISP/Scheme

Books
wfarr suggested the following:
The Little Schemer - Introduction to Scheme and functional programming in general
The Seasoned Schemer - Followup to Little Schemer.
Structure and Interpretation of Computer Programs - The definitive book on Lisp (also available online).
Practical Common Lisp - A good introduction to Lisp with several examples of practical use.
On Lisp - Advanced Topics in Lisp
How to Design Programs - An Introduction to Computing and Programming
Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp - an approach to high quality Lisp programming

What about you guys? Am I totally off on some of these? Did I leave out your favorite language? I will take the best comments and modify the question with the suggestions.

Java: SCJP for Java 6. I still use it as a reference.

Haskell:

O'Reilly Book:

  1. Real World Haskell, a great tutorial-oriented book on Haskell, available online and in print.

My favorite general, less academic online tutorials:

  1. The Haskell wikibook which contains all of the excellent Yet Another Haskell Tutorial. (This tutorial helps with specifics of setting up a Haskell distro and running example programs, for example.)
  2. Learn You a Haskell for Great Good, in the spirit of Why's Poignant Guide to Ruby but more to the point.
  3. Write yourself a Scheme in 48 hours. Get your hands dirty learning Haskell with a real project.

Books on Functional Programming with Haskell:

  1. Lambda calculus, combinators, more theoretical, but in a very down to earth manner: Davie's Introduction to Functional Programming Systems Using Haskell
  2. Laziness and program correctness, thinking functionally: Bird's Introduction to Functional Programming Using Haskell

Some books on Java I'd recommend:

For Beginners: Head First Java is an excellent introduction to the language. And I must also mention Head First Design Patterns, which is a great resource for learners to grasp what can be quite challenging concepts. The easy-going, fun style of these books is ideal for people new to programming.

A really thorough, comprehensive book on Java SE is Bruce Eckel's Thinking In Java v4. (At just under 1500 pages it's good for weight-training as well!) For those of us not on fat bank-bonuses there are older versions available for free download.

Of course, as many people have already mentioned, Josh Bloch's Effective Java v2 is an essential part of any Java developer's library.

Let's not forget Head First Java, which could be considered the essential first step in this language or maybe the step after the online tutorials by Sun. It's great for the purpose of grasping the language concisely, while adding a bit of fun, serving as a stepping stone for the more in-depth books already mentioned.

Sedgewick offers a great series on Algorithms which is a must-have if you find Knuth's books to be too in-depth. Knuth aside, Sedgewick brings a solid approach to the field and he offers his books in C, C++ and Java. The C++ books can be used for C as well, since he doesn't make a very large distinction between the two languages in his presentation.

Whenever I'm working on C, C: A Reference Manual, by Harbison and Steele, goes with me everywhere. It's concise and efficient while being extremely thorough, making it priceless (to me anyway).

Languages aside, and if this thread is to become a go-to for references, which I think it is heading toward given the number of solid contributions, please include Mastering Regular Expressions, for reasons I think most of us are aware of... some would also say that regex can be considered a language in its own right. Further, its usefulness in a wide array of languages makes it invaluable.

C: “Programming in C”, Stephen G. Kochan, Developer's Library.

Organized, clear, elaborate, beautiful.

C++

The first one is good for beginners, and the second one requires a more advanced level of C++.

I know this is a cross post from here... but, I think one of the best Java books is Java Concurrency in Practice by Brian Goetz. A rather advanced book - but, it will wear well on your concurrent code and Java development in general.

C#

C# to the Point by Hanspeter Mössenböck. On a mere 200 pages he explains C# in astonishing depth, focusing on underlying concepts and concise examples rather than hand waving and Visual Studio screenshots.

For additional information on specific language features, check the C# language specification ECMA-334.

Framework Design Guidelines, a book by Krzysztof Cwalina and Brad Abrams from Microsoft, provides further insight into the main design decisions behind the .NET library.

For Lisp and Scheme (hell, functional programming in general), there are few things that provide a more solid foundation than The Little Schemer and The Seasoned Schemer. Both provide a very simple and intuitive introduction to both Scheme and functional programming that proves far simpler for new students or hobbyists than any of the typical volumes that read like a nonfiction rendition of War & Peace.

Once they've moved beyond the Schemer series, SICP and On Lisp are both fantastic choices.

For C++ I am a big fan of C++ Common Knowledge: Essential Intermediate Programming. I like that it is organized into small sections (usually less than 5 pages per topic), so it is easy for me to grab it and read up on concepts that I need to review.

It is a must read for me the night before and on the plane to a job interview.

C Primer Plus, 5th Edition - The C book to get if you're learning C without any prior programming experience. It's a personal favorite of mine as I learned to program from this book. It has all the qualities a beginner friendly book should have:

  • Doesn't assume any prior exposure to programming
  • Enjoyable to read (without becoming annoying like For Dummies /
  • Doesn't oversimplify

For Javascript:

For PHP:

For OO design & programming, patterns:

For Refactoring:

For SQL/MySQL:

  • C - The C Programming Language - Obviously I had to reference K&R, one of the best programming books out there full stop.
  • C++ - Accelerated C++ - This clear, well written introduction to C++ goes straight to using the STL and gives nice, clear, practical examples. Lives up to its name.
  • C# - Pro C# 2008 and the .NET 3.5 Platform - Bit of a mouthful but wonderfully written and huge depth.
  • F# - Expert F# - Designed to take experienced programmers from zero to expert in F#. Very well written; one of the authors invented F#, so you can't go far wrong!
  • Scheme - The Little Schemer - Really unique approach to teaching a programming language done really well.
  • Ruby - Programming Ruby - Affectionately known as the 'pickaxe' book, this is THE de facto introduction to Ruby. Very well written, clear and detailed.

So we have this huge (is 11000 lines huge?) mainmodule.cpp source file in our project and every time I have to touch it I cringe.

As this file is so central and large, it keeps accumulating more and more code and I can't think of a good way to make it actually start to shrink.

The file is used and actively changed in several (> 10) maintenance versions of our product and so it is really hard to refactor it. If I were to "simply" split it up, say for a start, into 3 files, then merging back changes from maintenance versions will become a nightmare. And also if you split up a file with such a long and rich history, tracking and checking old changes in the SCC history suddenly becomes a lot harder.

The file basically contains the "main class" (main internal work dispatching and coordination) of our program, so every time a feature is added, it also affects this file and every time it grows. :-(

What would you do in this situation? Any ideas on how to move new features to a separate source file without messing up the SCC workflow?

(Note on the tools: We use C++ with Visual Studio; We use AccuRev as SCC but I think the type of SCC doesn't really matter here; We use Araxis Merge to do actual comparison and merging of files)

Another book you may find interesting/helpful is Refactoring.

My sympathies - in my previous job I encountered a similar situation with a file that was several times larger than the one you have to deal with. Solution was:

  1. Write code to exhaustively test the functionality of the program in question. Sounds like you won't already have this in hand...
  2. Identify some code that can be abstracted out into a helper/utilities class. Need not be big, just something that is not truly part of your 'main' class.
  3. Refactor the code identified in 2. into a separate class.
  4. Rerun your tests to ensure nothing got broken.
  5. When you have time, goto 2. and repeat as required to make the code manageable.

The classes you build in the step 3 iterations will likely grow to absorb more code that is appropriate to their newly clear function.
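As a tiny sketch of what steps 2 and 3 might look like, with invented names (the real helper would be whatever cohesive chunk you actually find in mainmodule.cpp):

// report_formatting.h -- new, small, dependency-free helper
#pragma once
#include <string>

namespace report_formatting {
    // Pure function: no ties back to the main class, so it is trivially unit-testable.
    std::string buildHeader(const std::string& jobName, int jobId);
}

// report_formatting.cpp
#include "report_formatting.h"
#include <sstream>

namespace report_formatting {
    std::string buildHeader(const std::string& jobName, int jobId) {
        std::ostringstream out;
        out << "Job #" << jobId << ": " << jobName;
        return out.str();
    }
}

// mainmodule.cpp keeps only a one-line delegation, so merges from the
// maintenance branches still apply to the same spot they always did:
//   std::string MainClass::buildReportHeader(const Job& job) {
//       return report_formatting::buildHeader(job.name(), job.id());
//   }

Keeping the old member function as a thin forwarding wrapper is what keeps the SCC history readable and lets merges from the maintenance versions land where they always did.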

I could also add:

0: buy Michael Feathers' book on working with legacy code

Unfortunately this type of work is all too common, but my experience is that there is great value in being able to make working but horrid code incrementally less horrid while keeping it working.

Exactly this problem is handled in one of the chapters of the book "Working Effectively with Legacy Code" (http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052).

I'm working on a large C++ system that has been in development for a few years now. As part of an effort to improve the quality of the existing code, we have embarked on a large, long-term refactoring project.

Do you know a good tool that can help me write unit tests in C++? Maybe something similar to JUnit or NUnit?

Can anyone give some good advice on the methodology of writing unit tests for modules that were written without unit testing in mind?

Applying unit tests to legacy code was the very reason Working Effectively with Legacy Code was written. Michael Feathers is the author - as mentioned in other answers, he was involved in the creation of both CppUnit and CppUnitLite.


I'm strongly considering adding unit testing to an existing project that is in production. It was started 18 months ago, before I could really see any benefit of TDD (face palm), so now it's a rather large solution with a number of projects and I haven't the foggiest idea where to start in adding unit tests. What's making me consider this is that occasionally an old bug seems to resurface, or a bug is checked in as fixed without really being fixed. Unit testing would reduce or prevent these issues from occurring.

By reading similar questions on SO, I've seen recommendations such as starting at the bug tracker and writing a test case for each bug to prevent regression. However, I'm concerned that I'll end up missing the big picture and end up missing fundamental tests that would have been included if I'd used TDD from the get go.

Are there any processes/steps that should be adhered to in order to ensure that an existing solution is properly unit tested and not just bodged in? How can I ensure that the tests are of good quality and aren't just a case of "any test is better than no tests"?

So I guess what I'm also asking is;

  • Is it worth the effort for an existing solution that's in production?
  • Would it be better to ignore the testing for this project and add it in a possible future re-write?
  • What would be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?

(Obviously the answer to the third point is entirely dependent on whether you're speaking to management or a developer)


Reason for Bounty

Adding a bounty to try and attract a broader range of answers that not only confirm my existing suspicion that it is a good thing to do, but also give some good reasons against.

I'm aiming to write this question up later with pros and cons to try and show management that it's worth spending the man hours on moving the future development of the product to TDD. I want to approach this challenge and develop my reasoning without my own biased point of view.

It's unlikely you'll ever have significant test coverage, so you must be tactical about where you add tests:

  • As you mentioned, when you find a bug, it's a good time to write a test (to reproduce it), and then fix the bug. If you see the test reproduce the bug, you can be sure it's a good, valid test. Given that such a large portion of bugs are regressions (50%?), it's almost always worth writing regression tests (see the sketch after this list).
  • When you dive into an area of code to modify it, it's a good time to write tests around it. Depending on the nature of the code, different tests are appropriate. One good set of advice is found here.
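As a sketch of that first bullet, with a made-up bug number and a made-up function, written here with Google Test only because it keeps the example short; the same shape works in NUnit or MSTest:

#include <gtest/gtest.h>

// Stand-in for the code under test after the fix for (hypothetical) bug #1432,
// where the discount was silently dropped from the invoice total.
double invoiceTotal(double subtotal, double discountPercent) {
    return subtotal * (1.0 - discountPercent / 100.0);
}

// Written so that it failed before the fix and passes after it; kept around
// forever so the bug cannot quietly come back in a later release.
TEST(InvoiceRegression, Bug1432_DiscountIsAppliedToSubtotal) {
    EXPECT_DOUBLE_EQ(180.0, invoiceTotal(200.0, 10.0));
}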

OTOH, it's not worth just sitting around writing tests around code that people are happy with - especially if nobody is going to modify it. It just doesn't add value (except maybe understanding the behavior of the system).

Good luck!

I've introduced unit tests to code bases that did not have them previously. The last big project I was involved with where I did this, the product was already in production with zero unit tests when I arrived on the team. When I left - 2 years later - we had 4,500 or so tests yielding about 33% code coverage in a code base with 230,000+ production LOC (a real-time financial WinForms application). That may sound low, but the result was a significant improvement in code quality and defect rate - plus improved morale and profitability.

It can be done when you have both an accurate understanding and commitment from the parties involved.

First of all, it is important to understand that unit testing is a skill in itself. You can be a very productive programmer by "conventional" standards and still struggle to write unit tests in a way that scales in a larger project.

Also, and specifically for your situation, adding unit tests to an existing code base that has no tests is also a specialized skill in itself. Unless you or somebody on your team has successful experience with introducing unit tests to an existing code base, I would say reading Feathers' book is a requirement (not optional or strongly recommended).

Making the transition to unit testing your code is an investment in people and skills just as much as in the quality of the code base. Understanding this is very important in terms of mindset and managing expectations.

Now, for your comments and questions:

However, I'm concerned that I'll end up missing the big picture and end up missing fundamental tests that would have been included if I'd used TDD from the get go.

Short answer: Yes, you will miss tests and yes they might not initially look like what they would have in a green field situation.

Deeper level answer is this: It does not matter. You start with no tests. Start adding tests, and refactor as you go. As skill levels get better, start raising the bar for all newly written code added to your project. Keep improving etc...

Now, reading between the lines here I get the impression that this is coming from the mindset of "perfection as an excuse for not taking action". A better mindset is to focus on self-trust. So while you may not know how to do it yet, you will figure it out as you go and fill in the blanks. Therefore, there is no reason to worry.

Again, it's a skill. You cannot go from zero tests to TDD perfection in one "process" or "step by step" cookbook approach in a linear fashion. It will be a process. Your expectations must be to make gradual and incremental progress and improvement. There is no magic pill.

The good news is that as the months (and even years) pass, your code will gradually start to become "proper" well factored and well tested code.

As a side note, you will find that the primary obstacle to introducing unit tests into an old code base is lack of cohesion and excessive dependencies. You will therefore probably find that the most important skill will become how to break existing dependencies and decouple code, rather than writing the actual unit tests themselves.

Are there any processes/steps that should be adhered to in order to ensure that an existing solution is properly unit tested and not just bodged in?

Unless you already have it, set up a build server and set up a continuous integration build that runs on every checkin including all unit tests with code coverage.

Train your people.

Start somewhere and start adding tests while you make progress from the customer's perspective (see below).

Use code coverage as a guiding reference of how much of your production code base is under test.

Build time should always be FAST. If your build time is slow, your unit testing skills are lagging. Find the slow tests and improve them (decouple production code and test in isolation). Well written, you should easily be able to have several thousand unit tests and still complete a build in under 10 minutes (roughly 1 ms to a few ms per test is a good but very rough guideline; a few exceptions may apply, like code using reflection, etc.).

Inspect and adapt.

How can I ensure that the tests are of good quality and aren't just a case of "any test is better than no tests"?

Your own judgement must be your primary source of reality. There is no metric that can replace skill.

If you don't have that experience or judgement, consider contracting someone who does.

Two rough secondary indicators are total code coverage and build speed.

Is it worth the effort for an existing solution that's in production?

Yes. The vast majority of the money spent on a custom built system or solution is spent after it is put in production. And investing in quality, people and skills should never be out of style.

Would it be better to ignore the testing for this project and add it in a possible future re-write?

You would have to take into consideration, not only the investment in people and skills, but most importantly the total cost of ownership and the expected life time of the system.

My personal answer would be "yes, of course" in the majority of cases because I know it's just so much better, but I recognize that there might be exceptions.

What would be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?

Neither. Your approach should be to add tests to your code base WHILE you are making progress in terms of functionality.

Again, it is an investment in people, skills AND the quality of the code base, and as such it will require time. Team members need to learn how to break dependencies, write unit tests, learn new habits, improve discipline and quality awareness, learn how to better design software, etc. It is important to understand that when you start adding tests your team members likely don't have these skills yet at the level they need to be for that approach to be successful, so stopping progress to spend all your time adding a lot of tests simply won't work.

Also, adding unit tests to an existing code base of any sizeable project is a LARGE undertaking which requires commitment and persistence. You can't change something fundamental, expect a lot of learning along the way, and at the same time ask your sponsor not to expect any ROI while you halt the flow of business value. That won't fly, and frankly it shouldn't.

Thirdly, you want to instill sound, business-focused values in your team. Quality never comes at the expense of the customer, and you can't go fast without quality. Also, the customer is living in a changing world, and your job is to make it easier for him to adapt. Customer alignment requires both quality and the flow of business value.

What you are doing is paying off technical debt. And you are doing so while still serving your customers' ever-changing needs. Gradually, as debt is paid off, the situation improves, and it is easier to serve the customer better and deliver more value. Etc. This positive momentum is what you should aim for because it underlines the principles of sustainable pace and will maintain and improve morale - both for your development team, your customer and your stakeholders.

Hope that helps

I was watching Rob Conery's webcasts on the MVC Storefront app, and I noticed he was unit testing even the most mundane things, things like:

public Decimal DiscountPrice
{
   get
   {
       return this.Price - this.Discount;
   }
}

Would have a test like:

[TestMethod]
public void Test_DiscountPrice()
{
    Product p = new Product();
    p.Price = 100;
    p.Discount = 20;
    Assert.AreEqual(80, p.DiscountPrice);
}

While I am all for unit testing, I sometimes wonder if this form of test-first development is really beneficial. For example, in a real process, you have 3-4 layers above your code (Business Request, Requirements Document, Architecture Document), where the actual defined business rule (Discount Price is Price - Discount) could be misdefined.

If that's the situation, your unit test means nothing to you.

Additionally, your unit test is another point of failure:

[TestMethod]
public void Test_DiscountPrice()
{
    Product p = new Product();
    p.Price = 100;
    p.Discount = 20;
    Assert.AreEqual(90, p.DiscountPrice);
}

Now the test is flawed. Obviously in a simple test, it's no big deal, but say we were testing a complicated business rule. What do we gain here?

Fast forward two years into the application's life, when maintenance developers are maintaining it. Now the business changes its rule, and the test breaks again; some rookie developer then fixes the test incorrectly... we now have another point of failure.

All I see is more possible points of failure, with no real beneficial return. If the discount price is wrong, the test team will still find the issue, so how did unit testing save any work?

What am I missing here? Please teach me to love TDD, as I'm having a hard time accepting it as useful so far. I want to, because I want to stay progressive, but it just doesn't make sense to me.

EDIT: A couple of people keep mentioning that testing helps enforce the spec. It has been my experience that the spec has been wrong as well, more often than not, but maybe I'm doomed to work in an organization where the specs are written by people who shouldn't be writing specs.

While I am all for unit testing, I sometimes wonder if this form of test-first development is really beneficial...

Small, trivial tests like this can be the "canary in the coalmine" for your codebase, alerting of danger before it's too late. The trivial tests are useful to keep around because they help you get the interactions right.

For example, think about a trivial test put in place to probe how to use an API you're unfamiliar with. If that test has any relevance to what you're doing in the code that uses the API "for real", it's useful to keep that test around for when the API releases a new version and you need to upgrade. You now have your assumptions about how you expect the API to behave recorded in an executable format that you can use to catch regressions.
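A concrete example of such a "learning test", written here in C++ against the standard library purely so it is self-contained; the point is the habit, not the particular API:

#include <gtest/gtest.h>
#include <stdexcept>
#include <string>

// Records our assumptions about std::string::substr so that a compiler or
// library upgrade (or a plain misunderstanding) shows up as a red test
// instead of a surprise in production code.
TEST(StdStringLearningTest, SubstrBehavesAsWeAssume) {
    std::string s = "legacy";
    EXPECT_EQ("acy", s.substr(3, 100));            // length past the end is clamped
    EXPECT_EQ("", s.substr(6));                    // pos == size() gives an empty string
    EXPECT_THROW(s.substr(7), std::out_of_range);  // pos > size() throws
}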

...[I]n a real process, you have 3-4 layers above your code (Business Request, Requirements Document, Architecture Document), where the actual defined business rule (Discount Price is Price - Discount) could be misdefined. If that's the situation, your unit test means nothing to you.

If you've been coding for years without writing tests it may not be immediately obvious to you that there is any value. But if you are of the mindset that the best way to work is "release early, release often" or "agile" in that you want the ability to deploy rapidly/continuously, then your test definitely means something. The only way to do this is by legitimizing every change you make to the code with a test. No matter how small the test, once you have a green test suite you're theoretically OK to deploy. See also "continuous production" and "perpetual beta."

You don't have to be "test first" to be of this mindset, either, but that generally is the most efficient way to get there. When you do TDD, you lock yourself into a small two-to-three-minute Red-Green-Refactor cycle. At no point are you unable to stop and walk away, and you never have a complete mess on your hands that would take an hour to debug and put back together.

Additionally, your unit test is another point of failure...

A successful test is one that demonstrates a failure in the system. A failing test will alert you to an error in the logic of the test or in the logic of your system. The goal of your tests is to break your code or prove one scenario works.

If you're writing tests after the code, you run the risk of writing a test that is "bad" because in order to see that your test truly works, you need to see it both broken and working. When you're writing tests after the code, this means you have to "spring the trap" and introduce a bug into the code to see the test fail. Most developers are not only uneasy about this, but would argue it is a waste of time.

What do we gain here?

There is definitely a benefit to doing things this way. Michael Feathers defines "legacy code" as "untested code." When you take this approach, you legitimize every change you make to your codebase. It's more rigorous than not using tests, but when it comes to maintaining a large codebase, it pays for itself.

Speaking of Feathers, there are two great resources you should check out in regard to this:

Both of these explain how to work these types of practices and disciplines into projects that aren't "greenfield." They provide techniques for writing tests around tightly coupled components, hard-wired dependencies, and things that you don't necessarily have control over. It's all about finding "seams" and testing around those.

[I]f the discount price is wrong, the test team will still find the issue, how did unit testing save any work?

Habits like these are like an investment. Returns aren't immediate; they build up over time. The alternative, not testing, is essentially taking on the debt of not being able to catch regressions, introduce code without fear of integration errors, or drive design decisions. The beauty is you legitimize every change introduced into your codebase.

What am I missing here? Please teach me to love TDD, as I'm having a hard time accepting it as useful so far. I want to, because I want to stay progressive, but it just doesn't make sense to me.

I look at it as a professional responsibility. It's an ideal to strive toward. But it is very hard to follow and tedious. If you care about it, and feel you shouldn't produce code that is not tested, you'll be able to find the will power to learn good testing habits. One thing that I do a lot now (as do others) is timebox myself an hour to write code without any tests at all, then have the discipline to throw it away. This may seem wasteful, but it's not really. It's not like that exercise cost a company physical materials. It helped me to understand the problem and how to write code in such a way that it is both of higher quality and testable.

My advice would ultimately be that if you really don't have a desire to be good at it, then don't do it at all. Poor tests that aren't maintained, don't perform well, etc. can be worse than not having any tests. It's hard to learn on your own, and you probably won't love it, but it is going to be next to impossible to learn if you don't have a desire to do it, or can't see enough value in it to warrant the time investment.

A couple of people keep mentioning that testing helps enforce the spec. It has been my experience that the spec has been wrong as well, more often than not...

A developer's keyboard is where the rubber meets the road. If the spec is wrong and you don't raise the flag on it, then it's highly probable you'll get blamed for it. Or at least your code will. The discipline and rigor involved in testing is difficult to adhere to. It's not at all easy. It takes practice, a lot of learning and a lot of mistakes. But eventually it does pay off. On a fast-paced, quickly changing project, it's the only way you can sleep at night, no matter if it slows you down.

Another thing to think about here is that techniques that are fundamentally the same as testing have been proven to work in the past: "clean room" and "design by contract" both tend to produce the same types of "meta"-code constructs that tests do, and enforce those at different points. None of these techniques is a silver bullet, and rigor will ultimately cost you in the scope of features you can deliver and in time to market. But that's not what it's about. It's about being able to maintain what you do deliver. And that's very important for most projects.

I'm having trouble figuring out how to get the testing framework set up and usable in Visual Studio 2008 for C++, presumably with the built-in unit testing suite.

Any links or tutorials would be appreciated.

This page may help, it reviews quite a few C++ unit test frameworks:

  • CppUnit
  • Boost.Test
  • CppUnitLite
  • NanoCppUnit
  • Unit++
  • CxxTest

Check out CPPUnitLite or CPPUnitLite2.

CPPUnitLite was created by Michael Feathers, who originally ported Java's JUnit to C++ as CPPUnit (CPPUnit tries to mimic the development model of JUnit - but C++ lacks Java's features [e.g. reflection] that make it easy to use).

CPPUnitLite attempts to make a true C++-style testing framework, not a Java one ported to C++. (I'm paraphrasing from Feathers' Working Effectively with Legacy Code book.) CPPUnitLite2 seems to be another rewrite, with more features and bug fixes.

I also just stumbled across UnitTest++ which includes stuff from CPPUnitLite2 and some other framework.

Microsoft has released WinUnit.

Also check out Catch or Doctest.



For me to read code and learn, not to play...

...of course ;-)

Jagged Alliance 2

Its source code was released in 2004 (I think) and since then it has been improved very much by the mod community. The mod goes under the name JA2 v1.13 and the community resides at Bear's Pit.

P.S. For reading and learning from the code, this might not be the best project. It's old C code with many functions spanning hundreds of lines. Unless you want to learn how to work with legacy code, playing it is more fun. ;)

I have been reading about Agile, XP methodologies and TDDs.

I have been in projects which state they need to do TDD, but most of the tests are really integration tests, or during the course of the project TDD is forgotten in the effort to finish the code faster.

So, as far as my case goes, I have written unit tests, but I find myself starting to write code first instead of writing a test. I feel there's a thought / design / paradigm change involved which is actually huge. So, even though one really believes in TDD, you end up going back to the old style because of time pressure / project deliverables.

I have a few classes with purely unit-tested code, but I can't seem to continue with the process when mocks come into the picture. Also, at times I see the "isn't it too trivial to write a test for this?" syndrome.

How do you guys think I should handle this?

When you are in a big mess of legacy code, I found Working Effectively with Legacy Code extremely useful. I think it improves your motivation for TDD a lot, even though it is about writing unit tests before you make any changes to your old legacy code. And from the undertone of your question, it seems like this is the position you are in.

And of course, as many others pointed out: discipline. After a while of forcing yourself, you will forget why you ever did it any other way.

Buy "Test Driven Development: By Example" by Kent Beck, and read it.

Then, write a failing unit test.

We have a large, multi-platform application written in C (with a small but growing amount of C++). It has evolved over the years with many features you would expect in a large C/C++ application:

  • #ifdef hell
  • Large files that make it hard to isolate testable code
  • Functions that are too complex to be easily testable

Since this code is targeted for embedded devices, it's a lot of overhead to run it on the actual target. So we would like to do more of our development and testing in quick cycles, on a local system. But we would like to avoid the classic strategy of "copy/paste into a .c file on your system, fix bugs, copy/paste back". If developers are going to go to the trouble to do that, we'd like to be able to recreate the same tests later, and run them in an automated fashion.

Here's our problem: in order to refactor the code to be more modular, we need it to be more testable. But in order to introduce automated unit tests, we need it to be more modular.

One problem is that since our files are so large, we might have a function inside a file that calls a function in the same file that we need to stub out to make a good unit test. It seems like this would be less of a problem as our code gets more modular, but that is a long way off.

One thing we thought about doing was tagging "known to be testable" source code with comments. Then we could write a script to scan source files for testable code, compile it in a separate file, and link it with the unit tests. We could slowly introduce the unit tests as we fix defects and add more functionality.

However, there is concern that maintaining this scheme (along with all the required stub functions) will become too much of a hassle, and developers will stop maintaining the unit tests. So another approach is to use a tool that automatically generates stubs for all the code, and link the file with that. (the only tool we have found that will do this is an expensive commercial product) But this approach seems to require that all our code be more modular before we can even begin, since only the external calls can be stubbed out.

Personally, I would rather have developers think about their external dependencies and intelligently write their own stubs. But this could be overwhelming to stub out all the dependencies for a horribly overgrown, 10,000 line file. It might be difficult to convince developers that they need to maintain stubs for all their external dependencies, but is that the right way to do it? (One other argument I've heard is that the maintainer of a subsystem should maintain the stubs for their subsystem. But I wonder if "forcing" developers to write their own stubs would lead to better unit testing?)
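
To make this concrete, a hand-written stub behind a "link seam" might be as small as the sketch below. Everything here is hypothetical, and it is shown as one file only for brevity; in reality the production function lives in its own .c file and the test build simply links the stub in place of the real hardware code.

#include <cassert>

// ---- declaration shared by production code and tests (normally in hardware.h) ----
int read_sensor(int channel);

// ---- production code under test (normally in its own source file) ----
int average_of_two_readings(int channel) {
    return (read_sensor(channel) + read_sensor(channel)) / 2;
}

// ---- hand-written stub, linked into the test build instead of hardware.c ----
static int fake_reading = 0;
void set_fake_reading(int value) { fake_reading = value; }
int read_sensor(int /*channel*/) { return fake_reading; }

// ---- a simple test ----
int main() {
    set_fake_reading(10);
    assert(average_of_two_readings(3) == 10);
    return 0;
}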

The #ifdefs, of course, add another whole dimension to the problem.

We have looked at several C/C++ based unit test frameworks, and there are a lot of options that look fine. But we have not found anything to ease the transition from "hairball of code with no unit tests" to "unit-testable code".

So here are my questions to anyone else who has been through this:

  • What is a good starting point? Are we going in the right direction, or are we missing something obvious?
  • What tools might be useful to help with the transition? (preferably free/open source, since our budget right now is roughly "zero")

Note, our build environment is Linux/UNIX based, so we can't use any Windows-only tools.

G'day,

I'd start by having a look at any obvious points, e.g. using declarations in header files, for one.

Then start looking at how the code has been laid out. Is it logical? Maybe start breaking large files down into smaller ones.

Maybe grab a copy of John Lakos's excellent book "Large-Scale C++ Software Design" (sanitised Amazon link) to get some ideas on how it should be laid out.

Once you start getting a bit more faith in the code base itself, i.e. code layout as in file layout, and have cleared up some of the bad smells, e.g. using declarations in header files, then you can start picking out some functionality that you can use to start writing your unit tests.

Pick a good platform, I like CUnit and CPPUnit, and go from there.

It's going to be a long, slow journey though.

HTH

cheers,

Michael Feathers wrote the bible on this, Working Effectively with Legacy Code

Have you ever added unit tests, after the fact, to legacy code? How complicated was the code, and how difficult was it to stub and mock everything? Was the end result worthwhile?

Yes, and it's generally painful. I've often ended up having to write integration tests instead.

The book The Art of Unit Testing has some good advice on this. It also recommends the book Working Effectively with Legacy Code; I haven't read the latter yet, but it's on my stack.

EDIT: But yes, even minimal code coverage was worthwhile. It gave me confidence and a safety net for refactoring the code.

EDIT: I did read Working Effectively with Legacy Code, and it's excellent.

What book would you recommend to learn test driven development? Preferrably language agnostic.

Growing Object-Oriented Software, Guided by Tests (Addison-Wesley) - it is about mocking frameworks, JMock and Hamcrest in particular.

From description of the book:

Steve Freeman and Nat Pryce describe the processes they use, the design principles they strive to achieve, and some of the tools that help them get the job done. Through an extended worked example, you’ll learn how TDD works at multiple levels, using tests to drive the features and the object-oriented structure of the code, and using Mock Objects to discover and then describe relationships between objects. Along the way, the book systematically addresses challenges that development teams encounter with TDD--from integrating TDD into your processes to testing your most difficult features.

EDIT: I'm now reading Working Effectively with Legacy Code by Michael Feathers which is pretty good. From the description of the book:

  • Understanding the mechanics of software change: adding features,
    fixing bugs, improving design, optimizing performance
  • Getting legacy code into a test harness
  • Writing tests that protect you against introducing new problems
  • This book also includes a catalog of twenty-four dependency-breaking techniques that help you work with program elements in isolation and make safer changes.

I read it already; it is one of the best programming books I've ever read (I personally think it should be called Refactoring to Testability - it describes the process of making your code testable). Because testable code is good code.

The Astels book is a solid introduction, Beck's book is good on the underlying concepts, Lasse Koskela has a newish one (Test Driven: TDD and Acceptance TDD for Java Developers). Osherove's book, as he says, is about Unit Testing, rather than TDD. I'm not sure that the Pragmatics' TDD book has aged as well as their original book.

Most everything is Java or C#, but you should be able to figure it out yourself.

For me, this is the best one:

I am looking for podcasts or videos on how to do unit testing.

Ideally they should cover both the basics and more advanced topics.

Other hanselminutes episodes on testing:

Other podcasts:

Other questions like this:

Blog posts:

I know you didn't ask for books but... Can I also mention that Beck's TDD book is a must-read, even though it may seem like a dated beginner book on first flick through (and Working Effectively with Legacy Code by Michael C. Feathers of course is the bible). Also, I'd append Martin (& Martin)'s Agile Principles, Patterns & Techniques as really helping in this regard. In this space (concise/distilled info on testing) there is also the excellent Foundations of Programming ebook. Good books on testing I've read are The Art of Unit Testing and xUnit Test Patterns. The latter is an important antidote to the first: it is much more measured, whereas Roy's book is very opinionated and offers a lot of unqualified 'facts' without properly going through the various options. Definitely recommend reading both books though. AOUT is very readable and gets you thinking, though it chooses specific [debatable] technologies; xUTP is in-depth and neutral and really helps solidify your understanding. I read Pragmatic Unit Testing in C# with NUnit afterwards. It's good and balanced though slightly dated (it mentions RhinoMocks as a sidebar and doesn't mention Moq) - even if nothing is actually incorrect. An updated version of it would be a hands-down recommendation.

More recently I've re-read the Feathers book, which is timeless to a degree and covers important ground. However, it's more of a 'how, for 50 different wheres' book in nature. It's definitely a must-read though.

Most recently, I'm reading the excellent Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce. I can't recommend it highly enough - it really ties everything together from big to small in terms of where TDD fits, and various levels of testing within a software architecture. While I'm throwing the kitchen sink in, Evans's DDD book is important too in terms of seeing the value of building things incrementally with maniacal refactoring in order to end up in a better place.

Many times I find myself torn between making a method private to prevent someone from calling it in a context that doesn't make sense (or would screw up the internal state of the object involved), or making the method public (or typically internal) in order to expose it to the unit test assembly. I was just wondering what the Stack Overflow community thought of this dilemma?

So I guess the question truly is, is it better to focus on testability or on maintaining proper encapsulation?

Lately I've been leaning towards testability, as most of the code is only going to be leveraged by a small group of developers, but I thought I would see what everyone else thought?

In general I agree with @jrista. But, as usual, it depends.

When trying to work with legacy code, the key is to get it under test. After that, you can add tests for new features and existing bugs, refactor to improve design, etc. This is risky without tests. Legacy code tends to be rife with dependencies, and is often extremely difficult to get under test.

In Working Effectively with Legacy Code, Michael Feathers suggests multiple techniques for getting code under test. Many of these techniques involve breaking encapsulation or complicating the design, and the author is up front about this. Once tests are in place, the code can be improved safely.

So for legacy code, do what you have to do.

I have been working on some 10 year old C code at my job this week, and after implementing a few changes, I went to the boss and asked if he needed anything else done. That's when he dropped the bomb. My next task was to go through the 7000 or so lines and understand more of the code, and to modularize the code somewhat. I asked him how he would like the source code modularized, and he said to start putting the old C code into C++ classes.

Being a good worker, I nodded my head yes, and went back to my desk, where I sit now, wondering how in the world to take this code and "modularize" it. It's already in 20 source files, each with its own purpose and function. In addition, there are three "main" structs. Each of these structures has 30-plus fields, many of them being other, smaller structs. It's a complete mess to try to understand, but almost every single function in the program is passed a pointer to one of the structs and uses the struct heavily.

Is there any clean way for me to shoehorn this into classes? I am resolved to do it if it can be done, I just have no idea how to begin.

First, tell your boss you're not continuing until you have:

http://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672

and to a lesser extent:

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

Secondly, there is no way of modularising code by shoe-horning it into C++ classes. This is a huge task, and you need to communicate the complexity of refactoring highly procedural code to your boss.

It boils down to making a small change (extract method, move method to class, etc...) and then testing - there are no shortcuts with this.

I do feel your pain though...

I recently read that making a class singleton makes it impossible to mock the objects of the class, which makes it difficult to test its clients. I could not immediately understand the underlying reason. Can someone please explain what makes it impossible to mock a singleton class? Also, are there any more problems associated with making a class singleton?

A Singleton, by definition, has exactly one instance. Hence its creation is strictly controlled by the class itself. Typically it is a concrete class, not an interface, and due to its private constructor it is not subclassable. Moreover, it is found actively by its clients (by calling Singleton.getInstance() or an equivalent), so you can't easily use e.g. Dependency Injection to replace its "real" instance with a mock instance:

class Singleton {
    private static final Singleton myInstance = new Singleton();
    public static Singleton getInstance () { return myInstance; }
    private Singleton() { ... }
    // public methods
}

class Client {
    public void doSomething() {
        Singleton singleton = Singleton.getInstance();
        // use the singleton
    }
}

For mocks, you would ideally need an interface which can be freely subclassed, and whose concrete implementation is provided to its client(s) by dependency injection.

You can relax the Singleton implementation to make it testable by

  • providing an interface which can be implemented by a mock subclass as well as the "real" one
  • adding a setInstance method to allow replacing the instance in unit tests

Example:

interface Singleton {
    // public method declarations

    // An interface cannot hold a mutable field, so the replaceable instance
    // lives in a package-private holder class (static interface methods need Java 8+).
    static Singleton getInstance() { return SingletonHolder.instance; }
    static void setInstance(Singleton newInstance) { SingletonHolder.instance = newInstance; }
}

class SingletonHolder {
    static Singleton instance = new RealSingleton();
}

// Used in production
class RealSingleton implements Singleton {
    // public methods
}

// Used in unit tests
class FakeSingleton implements Singleton {
    // public methods
}

class ClientTest {
    private Singleton testSingleton = new FakeSingleton();
    private Client client = new Client();
    @Test
    public void test() {
        Singleton.setInstance(testSingleton);
        client.doSomething();
        // ...
    }
}

As you see, you can only make your Singleton-using code unit testable by compromising the "cleanness" of the Singleton. In the end, it is best not to use it at all if you can avoid it.

Update: And here is the obligatory reference to Working Effectively With Legacy Code by Michael Feathers.

I have a big mess of code. Admittedly, I wrote it myself - a year ago. It's not well commented but it's not very complicated either, so I can understand it -- just not well enough to know where to start as far as refactoring it.

I violated every rule that I have read about over the past year. There are classes with multiple responsibilities, there are indirect accesses (I forget the term - something like foo.bar.doSomething()), and like I said it is not well commented. On top of that, it's the beginnings of a game, so the graphics are coupled with the data, or in the places where I tried to decouple graphics and data, I made the data public in order for the graphics to be able to access the data it needs...

It's a huge mess! Where do I start? How would you start on something like this?

My current approach is to take variables and switch them to private and then refactor the pieces that break, but that doesn't seem to be enough. Please suggest other strategies for wading through this mess and turning it into something clean so that I can continue where I left off!


Update two days later: I have been drawing out UML-like diagrams of my classes, and catching some of the "Low Hanging Fruit" along the way. I've even found some bits of code that were the beginnings of new features, but as I'm trying to slim everything down, I've been able to delete those bits and make the project feel cleaner. I'm probably going to refactor as much as possible before rigging my test cases (but only the things that are 100% certain not to impact the functionality, of course!), so that I won't have to refactor test cases as I change functionality. (do you think I'm doing it right or would it, in your opinion, be easier for me to suck it up and write the tests first?)

Please vote for the best answer so that I can mark it fairly! Feel free to add your own answer to the bunch as well, there's still room for you! I'll give it another day or so and then probably mark the highest-voted answer as accepted.

Thanks to everyone who has responded so far!


June 25, 2010: I discovered a blog post which directly answers this question from someone who seems to have a pretty good grasp of programming: (or maybe not, if you read his article :) )

To that end, I do four things when I need to refactor code:

  1. Determine what the purpose of the code was
  2. Draw UML and action diagrams of the classes involved
  3. Shop around for the right design patterns
  4. Determine clearer names for the current classes and methods

Pick yourself up a copy of Martin Fowler's Refactoring. It has some good advice on ways to break down your refactoring problem. About 75% of the book is little cookbook-style refactoring steps you can do. It also advocates automated unit tests that you can run after each step to prove your code still works.

As for a place to start, I would sit down and draw out a high-level architecture of your program. You don't have to get fancy with detailed UML models, but some basic UML is not a bad idea. You need a big picture idea of how the major pieces fit together so you can visually see where your decoupling is going to happen. Just a page or two of some basic block diagrams will help with the overwhelming feeling you have right now.

Without some sort of high level spec or design, you just risk getting lost again and ending up with another unmaintainable mess.

If you need to start from scratch, remember that you never truly start from scratch. You have some code and the knowledge you gained from your first time. But sometimes it does help to start with a blank project and pull things in as you go, rather than put out fires in a messy code base. Just remember not to completely throw out the old, use it for its good parts and pull them in as you go.

I'll second everyone's recommendations for Fowler's Refactoring, but in your specific case you may want to look at Michael Feathers' Working Effectively with Legacy Code, which is really perfect for your situation.

Feathers talks about Characterization Tests, which are unit tests written not to assert known behaviour of the system but to explore and define the existing (unclear) behaviour -- in the case where you've written your own legacy code and are fixing it yourself, this may not be so important, but if your design is sloppy then it's quite possible there are parts of the code that work by 'magic' and whose behaviour isn't clear, even to you -- in that case, characterization tests will help.
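
In practice a characterization test can be tiny. The sketch below (C++, with a made-up legacy function) shows the pattern: call the code with a representative input, run it once to see what actually comes out, and then pin that observed value down in an assertion.

#include <cassert>
#include <string>

// Imagine this is existing legacy code whose exact behaviour is unclear.
std::string formatPlayerName(const std::string& first, const std::string& last) {
    if (first.empty()) return last;
    return last + ", " + first.substr(0, 1) + ".";
}

int main() {
    // The expected strings were not taken from a spec; they are what the
    // code actually produced the first time these calls were run.
    assert(formatPlayerName("Lara", "Croft") == "Croft, L.");
    assert(formatPlayerName("", "Croft") == "Croft");
    return 0;
}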

One great part of the book is the discussion about finding (or creating) seams in your codebase -- seams are natural 'fault lines', if you like, where you can break into the existing system to start testing it, and pulling it towards a better design. Hard to explain but well worth a read.

There's a brief paper where Feathers fleshes out some of the concepts from the book, but it really is well worth hunting down the whole thing. It's one of my favourites.

I manage a rather large application (50k+ lines of code) by myself, and it handles some rather critical business actions. To describe the program simply, I would say it's a fancy UI with the ability to display and change data from the database, and it manages around 1,000 rental units, about 3k tenants, and all the finances.

When I make changes, because the code base is so large, I sometimes break something somewhere else. I typically test by going through the stuff I changed at the functional level (i.e. I run the program and work through the UI), but I can't test for every situation. That is why I want to get started with unit testing.

However, this isn't a true, three tier program with a database tier, a business tier, and a UI tier. A lot of the business logic is performed in the UI classes, and many things are done on events. To complicate things, everything is database driven, and I've not seen (so far) good suggestions on how to unit test database interactions.

What would be a good way to get started with unit testing for this application? Keep in mind, I've never done unit testing or TDD before. Should I rewrite it to remove the business logic from the UI classes (a lot of work)? Or is there a better way?

I'd recommend picking up the book Working Effectively with Legacy Code by Michael Feathers. This will show you many techniques for gradually increasing the test coverage in your codebase (and improving the design along the way).

  • Say we have realized the value of TDD too late. The project is already mature, and a good number of customers have started using it.
  • Say the automated testing used is mostly functional/system testing, and there is a good deal of automated GUI testing.
  • Say we have new feature requests, and new bug reports (!). So a good deal of development still goes on.
  • Note there would already be plenty of business objects with little or no unit testing.
  • There is too much collaboration/there are too many relationships between them, which again are tested only through higher-level functional/system testing. No integration testing per se.
  • Big databases are in place with plenty of tables, views, etc. Just to instantiate a single business object already takes a good deal of database round trips.

How can we introduce TDD at this stage?

Mocking seems to be the way to go. But the amount of mocking we would need to do here seems like too much. It sounds like an elaborate mocking infrastructure would need to be developed to work with the existing stuff (BOs, databases, etc.).

Does that mean TDD is a suitable methodology only when starting from scratch? I am interested to hear about the feasible strategies to introduce TDD in an already mature product.

Creating a complex mocking infrastructure will probably just hide the problems in your code. I would recommend that you start with integration tests, with a test database, around the areas of the code base that you plan to change. Once you have enough tests to ensure that you won't break anything if you make a change, you can start to refactor the code to make it more testable.

See also Michael Feathers' excellent book Working Effectively with Legacy Code; it's a must-read for anyone thinking of introducing TDD into a legacy code base.

As you work in a legacy codebase what will have the greatest impact over time that will improve the quality of the codebase?

  • Remove unused code
  • Remove duplicated code
  • Add unit tests to improve test coverage where coverage is low
  • Create consistent formatting across files
  • Update 3rd party software
  • Reduce warnings generated by static analysis tools (e.g. FindBugs)

The codebase has been written by many developers with varying levels of expertise over many years, with a lot of areas untested and some untestable without spending a significant time on writing tests.

Add unit tests to improve test coverage. Having good test coverage will allow you to refactor and improve functionality without fear.

There is a good book on this written by the author of CPPUnit, Working Effectively with Legacy Code.

Adding tests to legacy code is certainly more challenging than creating them from scratch. The most useful concept I've taken away from the book is the notion of "seams", which Feathers defines as

"a place where you can alter behavior in your program without editing in that place."

Sometimes it's worth refactoring to create seams that will make future testing easier (or possible in the first place). The Google Testing Blog has several interesting posts on the subject, mostly revolving around the process of Dependency Injection.
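
Another of the dependency-breaking techniques, "Subclass and Override Method", can be sketched in a few lines (C++ here purely for illustration; all names are invented). The awkward dependency is isolated behind a virtual method that a testing subclass replaces:

#include <cassert>

class DiscountCalculator {
public:
    virtual ~DiscountCalculator() = default;
    double discountedPrice(double price) {
        return price - price * loadDiscountRate();   // the logic we want to test
    }
protected:
    // Seam: the production version would read this rate from the database.
    virtual double loadDiscountRate() { return 0.0; }
};

// Testing subclass overrides the seam with a canned value.
class TestingDiscountCalculator : public DiscountCalculator {
protected:
    double loadDiscountRate() override { return 0.25; }
};

int main() {
    TestingDiscountCalculator calc;
    assert(calc.discountedPrice(100.0) == 75.0);
    return 0;
}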

This is a GREAT book.

If you don't like that answer, then the best advice I can give would be:

  • First, stop making new legacy code[1]

[1]: Legacy code = code without unit tests and therefore an unknown

Changing legacy code without an automated test suite in place is dangerous and irresponsible. Without good unit test coverage, you can't possibly know what effect those changes will have. Feathers recommends a "stranglehold" approach where you isolate areas of code you need to change, write some basic tests to verify basic assumptions, make small changes backed by unit tests, and work out from there.

NOTE: I'm not saying you need to stop everything and spend weeks writing tests for everything. Quite the contrary, just test around the areas you need to test and work out from there.

Jimmy Bogard and Ray Houston did an interesting screen cast on a subject very similar to this: http://www.lostechies.com/blogs/jimmy_bogard/archive/2008/05/06/pablotv-eliminating-static-dependencies-screencast.aspx

Reading this question has helped me solidify some of the problems I've always had with unit-testing, TDD, et al.

Since coming across the TDD approach to development I knew that it was the right path to follow. Reading various tutorials helped me understand how to make a start, but they have always been very simplistic - not really something that one can apply to an active project. The best I've managed is writing tests around small parts of my code - things like libraries, that are used by the main app but aren't integrated in any way. While this has been useful it equates to about 5% of the code-base. There's very little out there on how to go to the next step, to help me get some tests into the main app.

Comments such as "Most code without unit tests is built with hard dependencies (i.e. 'new's all over the place) or static methods." and "...it's not rare to have a high level of coupling between classes, hard-to-configure objects inside your class [...] and so on." have made me realise that the next step is understanding how to de-couple code to make it testable.

What should I be looking at to help me do this? Is there a specific set of design patterns that I need to understand and start to implement which will allow easier testing?

Michael Feathers' book Working Effectively With Legacy Code is exactly what you're looking for. He defines legacy code as 'code without tests' and talks about how to get it under test.

As with most things it's one step at a time. When you make a change or a fix try to increase the test coverage. As time goes by you'll have a more complete set of tests. It talks about techniques for reducing coupling and how to fit test pieces between application logic.

As noted in other answers dependency injection is one good way to write testable (and loosely coupled in general) code.

I have heard many developers refer to code as "legacy". Most of the time it is code that has been written by someone who no longer works on the project. What is it that makes code, legacy code?

Update in response to: "Something handed down from an ancestor or a predecessor or from the past" http://www.thefreedictionary.com/legacy. Clearly you wanted to know something else. Could you clarify or expand your question? S.Lott

I am looking for the symptoms of legacy code that make it unusable or a nightmare to work with. When is it better to throw it away? It is my opinion that code should be thrown away more often and that reinventing the wheel is a valuable part of development. The academic ideal of not reinventing the wheel is a nice one, but it is not very practical.

On the other hand there is obviously legacy code worth keeping.

According to Michael Feathers, the author of the excellent Working Effectively with Legacy Code, legacy code is code that has no tests: code where there is no way to know what breaks when it changes.

The main thing that distinguishes legacy code from non-legacy code is tests, or rather a lack of tests. We can get a sense of this with a little thought experiment: how easy would it be to modify your code base if it could bite back, if it could tell you when you made a mistake? It would be pretty easy, wouldn't it? Most of the fear involved in making changes to large code bases is fear of introducing subtle bugs; fear of changing things inadvertently. With tests, you can make things better with impunity. To me, the difference is so critical, it overwhelms any other distinction. With tests, you can make things better. Without them, you just don’t know whether things are getting better or worse.

Having recently discovered this method of development, I'm finding it a rather nice methodology. So, for my first project, I have a small DLL's worth of code (in C#.NET, for what it's worth), and I want to make a set of tests for this code, but I am a bit lost as to how and where to start.

I'm using NUnit, and VS 2008, any tips on what sort of classes to start with, what to write tests for, and any tips on generally how to go about moving code across to test based development would be greatly appreciated.

Working Effectively with Legacy Code is my bible when it comes to migrating code without tests into a unit-tested environment, and it also provides a lot of insight into what makes code easy to test and how to test it.

I also found Test Driven Development by Example and Pragmatic Unit Testing: in C# with NUnit to be a decent introduction to unit testing in that environment.

One simple approach to starting TDD is to start writing tests first from this day forward and make sure that whenever you need to touch your existing (un-unit-tested) code, you write passing tests that verify existing behavior of the system before you change it so that you can re-run those tests after to increase your confidence that you haven't broken anything.

I have prepared some automatic tests with the Visual Studio Team Edition testing framework. I want one of the tests to connect to the database following the normal way it is done in the program:

string r_providerName = ConfigurationManager.ConnectionStrings["main_db"].ProviderName;

But I am receiving an exception in this line. I suppose this is happening because the ConfigurationManager is a singleton. How can you work around the singleton problem with unit tests?


Thanks for the replies. All of them have been very instructive.

Example from Book: Working Effectively with Legacy Code

Also given same answer here: http://stackoverflow.com/a/28613595/929902

To run code containing singletons in a test harness, we have to relax the singleton property. Here’s how we do it. The first step is to add a new static method to the singleton class. The method allows us to replace the static instance in the singleton. We’ll call it setTestingInstance.

public class PermitRepository
{
    private static PermitRepository instance = null;
    private PermitRepository() {}
    public static void setTestingInstance(PermitRepository newInstance)
    {
        instance = newInstance;
    }
    public static PermitRepository getInstance()
    {
        if (instance == null) {
            instance = new PermitRepository();
        }
        return instance;
    }
    public Permit findAssociatedPermit(PermitNotice notice) {
    ...
    }
    ...
}

Now that we have that setter, we can create a testing instance of a PermitRepository and set it. We’d like to write code like this in our test setup:

public void setUp() {
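    // Note: for this to compile from a test class, the private constructor
    // above has to be relaxed (e.g. made protected), or a testing subclass used.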
    PermitRepository repository = new PermitRepository();
    ...
    // add permits to the repository here
    ...
    PermitRepository.setTestingInstance(repository);
}

While unit-testing seems effective for larger projects where the APIs need to be industrial strength (for example development of the .Net framework APIs, etc.), it seems possibly like overkill on smaller projects.

When is the automated TDD approach the best way, and when might it be better to just use manual testing techniques, log the bugs, triage, fix them, etc.

Another issue--when I was a tester at Microsoft, it was emphasized to us that there was a value in having the developers and testers be different people, and that the tension between these two groups could help create a great product in the end. Can TDD break this idea and create a situation where a developer might not be the right person to rigorously find their own mistakes? It may be automated, but it would seem that there are many ways to write the tests, and that it is questionable whether a given set of tests will "prove" that quality is acceptable.

Every application gets tested.

Some applications get tested in the form of does my code compile and does the code appear to function.

Some applications get tested with Unit tests. Some developers are religious about Unit tests, TDD and code coverage to a fault. Like everything, too much is more often than not bad.

Some applications are luckily enough to get tested via a QA team. Some QA teams automate their testing, others write test cases and manually test.

Michael Feathers, who wrote Working Effectively with Legacy Code, says that code not wrapped in tests is legacy code. Until you have experienced The Big Ball of Mud, I don't think any developer truly understands the benefit of good application architecture and a suite of well-written unit tests.

Having different people test is a great idea. The more people that can look at an application the more likely all the scenarios will get covered, including the ones you didn't intend to happen.

TDD has gotten a bad rap lately. When I think of TDD I think of dogmatic developers meticulously writing tests before they write the implementation. While this is true, what gets overlooked is that by writing the tests (first, or shortly after), the developer experiences the method/class in the shoes of the consumer. Design flaws and shortcomings are immediately apparent.

I argue that the size of the project is irrelevant. What is important is the lifespan of the project. The longer a project lives, the more likely it is that a developer other than the one who wrote it will work on it. Unit tests are documentation of the expectations of the application -- a manual of sorts.

Imagine that 90% of your job is merely to triage issues on a very massive, very broken website. Imagine that this website is written in the most tightly coupled, least cohesive PHP code you've ever seen, the type of code that would add the original developers to your "slap on sight" list. Imagine that this web application is made up of 4 very disparate parts (1 commercial, 2 "repurposed", and 1 custom) and a crap-ton of virtual duct tape and shims. Imagine that it contains the type of programming practices in which major components of the website actually rely on things NOT working properly, and fixing these broken things usually breaks other things. Imagine that you know from too many bad experiences that changing one seemingly innocuous part of the website, such as splitting a "name" field into two separate "first" and "last" fields, will bring the site to its knees and require hours of rollbacks, merges and patches. Imagine pleading with the customer for years to just ditch the code and start all over but being met with Enterprise-Grade despair and hand wringing. Then imagine getting ASAP/EMERGENCY tickets to implement new features that in any other web site would take 4 hours but you know better with this site so you quote 40 hours, then blow right by that and bill 80 hours, but it's OK because the client is used to that with their website.

Here are some other things that you should also imagine:

  • there are no tests at all right now
  • there are googleteen different layers of logins. Some customers actually have 3 different accounts for different sections of the website
  • when I say "tightly coupled", I mean the loops of include/require statements would probably map out like a celtic knot
  • when I say "least cohesive" I mean some stuff is organized sort of like MVC, but it's not really MVC. In some cases it may take you several hours just to find out how URI A is mapped to file B
  • the UI was written like "obtrusive" and "inaccessible" were the buzzwords of the day

Imagining all that, is it even worth trying to achieve even a moderate level of test coverage? Or should you, in this imaginary scenario, just keep doing the best you can with what you've been given and hoping, praying, maybe even sacrificing, that the client will agree to a rewrite one of these days and THEN you can start writing tests?

ADDENDUM

Since many of you brought it up: I have approached the possibility of a re-write at every chance I've had to date. The marketing people I work with know that their code is crap, and they know it's the fault of the "lowest bid" firm they went with originally. I've probably overstepped my bounds as a contractor by pointing out that they spend a crap ton of money on me to provide hospice care for this site, and that by redeveloping it from scratch they would see an ROI very quickly. I've also said that I refuse to rewrite the site as-is, because it doesn't really do what they want it to do anyway. The plan is to rewrite it BDD style, but getting all the key players in one place is tough, and I'm still not sure they know what they need. In any case, I fully expect that to be A Very Big Project.

Thanks for all the feedback so far!

The most important thing (after buying Working Effectively with Legacy Code) is to start small. I work on several projects, each several thousand PHP lines long and often without a single function (and don't even think of objects), and whenever I have to change code I try to refactor the part into a function and write a test for it. This is combined with extensive manual testing of that part so I can be sure it works as before. When I have multiple functions for similar things I move them as static methods into a class and then, step by step, replace them with proper object-oriented code.

Every step, from moving code into a function to changing it into a real class, is surrounded by unit testing (not very good testing, as 90% of the code is SQL queries and it's nearly impossible to set up a reliable testing database, but I can still test the behaviour).

Since a lot of code repeats (I found a single SQL query repeated 13 times in a single file and many more times in the other 50 files of that project), I could change all the other places, but I don't, since those are neither tested nor can I be sure the surrounding code doesn't depend on that code in some weird way (think global). That code can be changed as soon as I have to touch it anyway.

It's long and tedious work, and every time I see the code I feel a step (or rather a leap) closer to mental breakdown, but the code quality will improve (slowly but mostly reliably).

Your situation seems to be quite similar, so maybe my ideas might help you with your code.

In short

Start small, change only what you work on and begin to write only limited unit tests and expand them the more you learn about the system.

I do write unit tests while writing APIs and core functionalities. But I want to be the cool fanboy who eats, sleeps and breathes TDD and BDD. What's the best way to get started with TDD/BDD the right way? Any books, resources, frameworks, best practices?

My environment is Java backend with Grails frontend, integrated with several external web services and databases.

A good place to start is reading blogs. Then buy the books of the people who are blogging. Some I would highly recommend:

"Uncle Bob" Martin and the guys at Object Mentor: http://blog.objectmentor.com/

P.S. Get Bob's book Clean Code:

http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882

My friend Tim Ottinger (former Object Mentor dude) http://agileinaflash.blogspot.com/ http://agileotter.blogspot.com/

J.B. Rainsberger (jbrains): http://www.jbrains.ca/permalink/285

I felt the need to expand on this, as everyone else seems to just want to give you their opinion of TDD and not help you on your quest to become a Jedi-Ninja. The Michael Jordan of TDD is Kent Beck. He really did write the book on it:

http://www.amazon.com/Test-Driven-Development-Kent-Beck/dp/0321146530

he also blogs at:

http://www.threeriversinstitute.org/blog/?p=29

other "famous" supporters of TDD include:

All are great people to follow. You should also consider attending some conferences like Agile 2010, or Software Craftsmanship (this year they were held at the same time in Chicago)

I've been doing TDD for a couple of years, but lately I've started looking more into the BDD way of driving my design and development. The resource that helped me get started on BDD was first and foremost Dan North's blog (North being the 'founder' of BDD). Take a look at Introducing BDD. There's also an 'official' BDD wiki over at behaviour-driven.org with some good posts well worth reading.

The one thing that I found really hard when starting out with BDD (and still find a bit hard) is how to formulate scenarios so that they are suitable for BDD. Scott Bellware is a man well skilled in BDD (or Context-Specification, as he likes to call it), and his article Behavior-Driven Development in Code Magazine helped me a lot in understanding the BDD way of thinking and of formulating user stories.

I would also recommend the TekPub screencast Behavior-driven Design with SpecFlow by Rob Conery. A great intro to BDD and to a tool (SpecFlow) very well suited for doing BDD in C#.

As for TDD resources, there are already a lot of good recommendations here. But I just want to point out a couple of books that I can really recommend:

I have recently heard of Functional Testing over Unit Testing.

I understand that Unit Testing tests each of the possibilities of a given piece of code from its most atomic form. But what about Functional Testing?

This sounds to me like only testing if the code works, but is it as reliable as Unit Testing?

I've been told there are two schools of thought on the matter. Some prefer Unit Testing, others Functional Testing.

Are there any good resources, links, or books, or can any of you explain and enlighten me on the subject?

Thanks!

Jason's answer is correct. Different types of tests have different purposes, and can be layered for best results (good design, meeting specifications, reduced defects).

  • Unit testing = drives design (with Test-Driven Development, or TDD)
  • Integration testing = do all the pieces work together
  • Customer acceptance testing = does it meet the customer's requirements
  • Manual testing = often covers the UI; dedicated testers can find what automation misses
  • Load testing = how well does the system perform with realistic amounts of data

There is some overlap between these categories; unit tests can specify behavior, for instance.

And there are others; for more than most people care to know, see Software Testing.

One point people missed is that unit testing is testing pieces of code in isolation. Good unit tests don't hit the database, for instance. This has two advantages: it makes the tests run fast so you'll run them more often, and it forces you to write loosely coupled classes (better design).

You asked for resources; I recommend Roy Osherove's book The Art of Unit Testing with Examples in .NET. While no book is perfect, this one gives many excellent pointers on writing good tests.

EDIT: And for writing tests against existing software, nothing beats Michael Feathers' book Working Effectively with Legacy Code.

Our code sucks. Actually, let me clarify that. Our old code sucks. It's difficult to debug and is full of abstractions that few people understand or even remember. Just yesterday I spent an hour debugging in an area that I've worked for over a year and found myself thinking, "Wow, this is really painful." It's not anyone's fault - I'm sure it all made perfect sense initially. The worst part is usually It Just Works...provided you don't ask it to do anything outside of its comfort zone.

Our new code is pretty good. I think we're doing a lot of good things there. It's clear, consistent, and (hopefully) maintainable. We've got a Hudson server running for continuous integration and we have the beginnings of a unit test suite in place. The problem is our management is laser-focused on writing New Code. There's no time to give Old Code (or even old New Code) the TLC it so desperately needs. At any given moment our scrum backlog (for six developers) has about 140 items and around a dozen defects. And those numbers aren't changing much. We're adding things as fast as we can burn them down.

So what can I do to avoid the headaches of marathon debugging sessions mired in the depths of Old Code? Every sprint is filled to the brim with new development and showstopper defects. Specifically...

  • What can I do to help maintenance and refactoring tasks get high enough priority to be worked?
  • Are there any C++-specific strategies you employ to help prevent New Code from rotting so quickly?

Working Effectively with Legacy Code

We unit test most of our business logic, but are stuck on how best to test some of our large service tasks and import/export routines. For example, consider the export of payroll data from one system to a 3rd party system. To export the data in the format the company needs, we need to hit ~40 tables, which creates a nightmare situation for creating test data and mocking out dependencies.

For example, consider the following (a subset of ~3500 lines of export code):

public void ExportPaychecks()
{
   var pays = _pays.GetPaysForCurrentDate();
   foreach (PayObject pay in pays)
   {
      WriteHeaderRow(pay);
      if (pay.IsFirstCheck)
      {
         WriteDetailRowType1(pay);
      }
   }
}

private void WriteHeaderRow(PayObject pay)
{
   //do lots more stuff
}

private void WriteDetailRowType1(PayObject pay)
{
   //do lots more stuff
}

We only have the one public method in this particular export class - ExportPaychecks(). That's really the only action that makes any sense to someone calling this class ... everything else is private (~80 private functions). We could make them public for testing, but then we'd need to mock them to test each one separately (i.e. you can't test ExportPaychecks in a vacuum without mocking the WriteHeaderRow function). This is a huge pain too.

Since this is a single export, for a single vendor, moving logic into the Domain doesn't make sense. The logic has no domain significance outside of this particular class. As a test, we built out unit tests which had close to 100% code coverage ... but this required an insane amount of test data typed into stub/mock objects, plus over 7000 lines of code due to stubbing/mocking our many dependencies.

As a maker of HRIS software, we have hundreds of exports and imports. Do other companies REALLY unit test this type of thing? If so, are there any shortcuts to make it less painful? I'm half tempted to say "no unit testing the import/export routines" and just implement integration testing later.

Update - thanks for the answers all. One thing I'd love to see is an example, as I'm still not seeing how someone can turn something like a large file export into an easily testable block of code without turning the code into a mess.

What you should have initially are integration tests. These will test that the functions perform as expected and you could hit the actual database for this.

Once you have that safety net you can start refactoring the code to be more maintainable and introducing unit tests.

As mentioned by serbrech, Working Effectively with Legacy Code will help you no end; I would strongly advise reading it even for greenfield projects.

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

The main question I would ask is: how often does the code change? If it changes infrequently, is it really worth the effort of trying to introduce unit tests? If it changes frequently, then I would definitely consider cleaning it up a bit.

I think Tomasz Zielinski has a piece of the answer. But if you say you have 3500 lines of procedural code, then the problem is bigger than that. Cutting it into more functions will not help you test it. However, it's a first step in identifying responsibilities that could be extracted further into other classes (if you have good names for the methods, that can be obvious in some cases).

I guess with such a class you have an incredible list of dependencies to tackle just to be able to instantiate the class in a test. It becomes really hard to create an instance of that class in a test... Michael Feathers' book "Working Effectively with Legacy Code" answers such questions very well. The first goal in getting that code under test should be to identify the roles of the class and to break it into smaller classes. Of course that's easy to say, and the irony is that it's risky to do without tests to secure your modifications...

You say you have only one public method in that class. That should ease the refactoring, as you don't need to worry about users of all the private methods. Encapsulation is nice, but if you have so much stuff private in that class, it probably means it doesn't belong there, and you should extract different classes from that monster that you will eventually be able to test. Piece by piece, the design should look cleaner, and you will be able to test more of that big piece of code. Your best friend if you start this will be a refactoring tool; it will help you avoid breaking logic while extracting classes and methods.

Again the book from Michael Feathers seems to be a must read for you :) http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

ADDED EXAMPLE :

This example comes from Michael Feathers' book and illustrates your problem well, I think:

RuleParser  
public evaluate(string)  
private branchingExpression  
private causalExpression  
private variableExpression  
private valueExpression  
private nextTerm()  
private hasMoreTerms()   
public addVariables()  

Obviously here it doesn't make sense to make the methods nextTerm and hasMoreTerms public. Nobody should see these methods; the way we move to the next term is definitely internal to the class. So how do we test this logic?

Well, if you see that this is a separate responsibility and extract a class (a Tokenizer, for example), these methods suddenly become public within the new class, because that's its purpose. It then becomes easy to test that behaviour...
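
To make that concrete, here is what the extraction might look like, sketched in C++ just to show the shape (the same idea applies in C#; everything except the RuleParser/Tokenizer names is invented for illustration):

#include <cassert>
#include <string>

// Extracted class: term iteration is now its public purpose, so it is directly testable.
class Tokenizer {
public:
    explicit Tokenizer(const std::string& text) : text_(text) {}
    bool hasMoreTerms() const { return pos_ < text_.size(); }
    std::string nextTerm() {
        std::size_t end = text_.find(' ', pos_);
        if (end == std::string::npos) end = text_.size();
        std::string term = text_.substr(pos_, end - pos_);
        pos_ = end + 1;
        return term;
    }
private:
    std::string text_;
    std::size_t pos_ = 0;
};

// RuleParser::evaluate would now delegate term handling to a Tokenizer
// instead of hiding it behind private methods.

int main() {
    Tokenizer tokenizer("price - discount");
    assert(tokenizer.nextTerm() == "price");
    assert(tokenizer.nextTerm() == "-");
    assert(tokenizer.nextTerm() == "discount");
    assert(!tokenizer.hasMoreTerms());
    return 0;
}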

So if you apply that to your huge piece of code, extracting pieces of it into other classes with fewer responsibilities, where it would feel more natural to make these methods public, you will also be able to test them easily. You said you are accessing about 40 different tables to map the data. Why not break that into classes for each part of the mapping?

It's a bit hard to reason about code I can't read. You may have other issues that prevent you from doing this, but that's my best try at it.

Hope this helps. Good luck :)

As the title suggests... How can I apply a Scrum process to anything that doesn't work on new code and can be estimated to some degree?

How can I apply a Scrum process to a maintenance-and-emergency-fixes environment (where a fix can take from 5 minutes to 2 weeks) when I would still like to plan things?

Basically, how do I handle unplanned tasks and tasks that are very difficult to estimate with the Scrum process? Or am I simply applying the wrong process for this environment?

As the title suggests... How can I apply a Scrum process to anything that doesn't work on new code and can be estimated to some degree?

On the contrary, I've heard that teams find adopting Scrum easier in the maintenance phase, because the changes are smaller (no grand design changes) and hence easier to estimate. Any new change request is added to the product backlog, estimated by the devs, and then prioritized by the product owner.

How can I apply a Scrum process to a maintenance-and-emergency-fixes environment (where a fix can take from 5 minutes to 2 weeks) when I would still like to plan things?

If you're hinting at fire-fighting type activity, keep a portion of the iteration's work quota for such activities. Based on historical trends/activity, you should be able to say, e.g.: we have a velocity of 10 story points per iteration (4-person team, 5-day iteration). Each of us spends about a day a week responding to emergencies, which is roughly 20% of our capacity, so to be realistic we should only pick about 8 points' worth of backlog items for the next iteration. If we don't have emergency issues, we'll pick up the next top item from the prioritized backlog.
CoryFoy mentions a more dynamic/real-time approach with kanban post-its in his response.

Basically, how do I handle unplanned tasks and tasks that are very difficult to estimate with the Scrum process? Or am I simply applying the wrong process for this environment?

AFAIR Scrum doesn't mandate an estimation technique; use one that the team is most comfortable with: man-days, story points, etc. The only way to get better at estimation is practice and experience, I believe. The more the same set of people sit together to estimate new tasks, the better their estimates get. In a maintenance kind of environment, I would assume that it's easier to estimate because the system is more or less well known to the group. If not, schedule/use spikes to get more clarity.

I sense that you're attempting to eat an elephant here.. I'd suggest the following bites

I have a class which calls getaddrinfo for DNS look ups. During testing I want to simulate various error conditions involving this system call. What's the recommended method for mocking system calls like this? I'm using Boost.Test for my unit testing.

Look up patterns for "Dependency Injection".

Dependency Injection works like this: instead of calling getaddrinfo directly in your code, the code uses an interface that has a virtual method "getaddrinfo".

In real-life code, the caller passes an implementation of the interface that maps the virtual method "getaddrinfo" of the interface to the real ::getaddrinfo function.

In unit tests, the caller passes an implementation that can simulate failures, test error conditions, ... to be short: mock anything you want to mock.

EDIT: Read "Working effectively with legacy code" of Michael Feathers for more tips.

In this case you don't need to mock getaddrinfo, rather, you need to test without relying on its functionality. Both Patrick and Noah have good points but you have at least two other options:

Option 1: Subclass to Test

Since you already have your object in a class, you can subclass to test. For example, assume the following is your actual class:

class DnsClass {
public:
    // virtual so a test subclass can override it
    virtual int lookup(...);
};

int DnsClass::lookup(...) {
    return getaddrinfo(...);
}

Then, for testing, you would subclass like this:

class FailingDnsClass : public DnsClass {
public:
    int lookup(...) { return 42; }
};

You can now use the FailingDnsClass subclass to generate errors but still verify that everything behaves correctly when an error condition occurs. Dependency Injection is often your friend in this case.

NOTE: This is quite similar to Patrick's answer but hopefully doesn't involve changing the production code if you aren't already set up for dependency injection.

Option 2: Use a link seam

In C++, you also have link-time seams which Michael Feathers describes in Working Effectively with Legacy Code.

The basic idea is to leverage the linker and your build system. When compiling the unit tests, link in your own version of getaddrinfo which will take precedence over the system version. For example:

test.cpp:

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <iostream>

int main(void)
{
        int retval = getaddrinfo(NULL, NULL, NULL, NULL);
        std::cout << "RV:" << retval << std::endl;
        return retval;
}

lib.cpp:

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

int getaddrinfo(const char *node, const char *service,
        const struct addrinfo *hints, struct addrinfo **res
        )
{
        return 42;
}

And then for testing:

$ g++ test.cpp lib.cpp -o test
$ ./test 
RV:42
catch(Exception ex)
{

}

What's the best way to proceed?

Rip them all out and let it crash? Add logging code? Message boxes? This is in C#.

The first intermediate goal is to get a good picture of what exceptions are being ignored and where; for that purpose, you can simply add logging code to each of those horrid catch-everything blocks, showing exactly which block it is and what it is catching and hiding. Run the test suite over the code thus instrumented, and you'll have a starting "blueprint" for the fixing job.
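For instance, each empty block could be turned into something like the following C# sketch (the class, method, and message are made-up placeholders; Trace is just one convenient built-in sink, so use whatever logging facility the project already has):

using System;
using System.Diagnostics;

static class OrderImporter   // hypothetical class standing in for wherever the blocks live
{
    public static void Import(string order)
    {
        try
        {
            ProcessOrder(order);   // the call the old empty catch was "protecting"
        }
        catch (Exception ex)
        {
            // Was: catch (Exception ex) { } -- silently swallowing everything.
            // Keep the old behaviour for now (don't rethrow), but record what is being
            // hidden and where, so a test run produces the "blueprint" described above.
            Trace.TraceWarning("Swallowed exception in OrderImporter.Import: {0}", ex);
        }
    }

    static void ProcessOrder(string order)
    {
        // placeholder for the real work being guarded
    }
}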

If you don't have a test suite, then, first, make one -- unit tests can wait (check out Feathers' great book on working with legacy code -- legacy code is most definitely your problem here;-), but you need a suite of integration tests that can be run automatically and tickle all the bugs you're supposed to fix.

As you go and fix bug after bug (many won't be caused by the excessively broad catch blocks, just hidden/"postponed" by them;-), be sure to work in a mostly "test-driven" manner: add a unit test that tickles the bug, confirm it breaks, fix the bug, rerun the unit test to confirm the bug's gone. Your growing unit test suite (with everything possible mocked out or faked) will run fast, and you can keep rerunning it cheaply as you work, to catch possible regressions ASAP, when they're still easy to fix.

The kind of task you've been assigned is actually harder (and often more important) than "high prestige" SW development tasks such as prototypes and new architectures, but often misunderstood and under-appreciated (and therefore under-rewarded!) by management/clients; make sure to keep a very clear and open communication channel with the stakeholder, pointing out all the enormous amount of successful work you're doing, how challenging it is, and (for their sake more than your own), how much they would have saved by doing it right in the first place (maybe next time they will... I know, I'm a wild-eyed optimist by nature;-). Maybe they'll even assign you a partner on the task, and you can then do code reviews and pair programming, boosting productivity enormously.

And, last but not least, good luck!!! -- unfortunately, you'll need it... fortunately, as Jefferson said, "the harder I work, the more luck I seem to have";-)

Ever since I started using .NET, I've just been creating Helper classes or Partial classes to keep code located and contained in their own little containers, etc.

What I'm looking to know is the best practices for making ones code as clean and polished as it possibly could be.

Obviously clean code is subjective, but I'm talking about when to use things (not how to use them) such as polymorphism, inheritance, interfaces, and classes, and how to design classes more appropriately (to make them more useful, not just say 'DatabaseHelper', as some consider this bad practice on the code smells wiki).

Are there any resources out there that could possibly help with this kind of decision making?

Bear in mind that I haven't even started a CS or software engineering course, and that teaching resources are fairly limited in real life.

Working Effectively with Legacy Code is one of the best books I have seen on this subject.

Don't be put off by the title of the book - rather than treating refactoring as a formal concept (which has its place), this book has lots and lots of simple "why didn't I think of that" tips. Things like "go through a class and remove any methods not directly related to that class and put them in a different one".

e.g. you have a grid and some code to persist the layout of that grid to a file. You can probably safely move the layout-persisting code out into a different class.
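A small C# sketch of that kind of move (all names here are made up for illustration):

using System;
using System.IO;

// Before: the grid/form class also knew how to save and load its own layout.
// After: that responsibility lives in its own small, easily testable class.
public class GridLayout
{
    public int[] ColumnWidths { get; set; } = Array.Empty<int>();
}

public class GridLayoutPersister
{
    public void Save(GridLayout layout, string path)
    {
        File.WriteAllText(path, string.Join(";", layout.ColumnWidths));
    }

    public GridLayout Load(string path)
    {
        var widths = Array.ConvertAll(
            File.ReadAllText(path).Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries),
            int.Parse);
        return new GridLayout { ColumnWidths = widths };
    }
}

The persister can now be tested against a temporary file (or hidden behind an interface and faked) without ever constructing the grid itself.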

A real eye-opener to me was Refactoring: Improving the Design of Existing Code:

With proper training a skilled system designer can take a bad design and rework it into well-designed, robust code. In this book, Martin Fowler shows you where opportunities for refactoring typically can be found, and how to go about reworking a bad design into a good one.

Refactoring

It helped me to efficiently and systematically refactor code. Also it helped me a lot in discussions with other developers, when their holy code has to be changed ...

I'd recommend Domain Driven Design. I think both the YAGNI and AlwaysRefactor principles are too simplistic. The age-old question on the issue is: do I refactor "if(someArgument == someValue)" into a function or leave it inline?

There is no yes or no answer. DDD advises refactoring it if the test represents a business rule. The refactoring is not (only) about reuse but about making the intentions clear.

Check out Martin Fowler's comments and book on Refactoring

I'm familiar with refactoring fairly large code bases in C# and Java but Clojure is something of a different beast, especially since it:

  • Has a mix of macros and functions in typical code (i.e. you might want to refactor from a macro to a function or vice-versa?)
  • Uses dynamic typing in most circumstances (so you don't get compile time checks on the correctness of your refactored code)
  • Is functional rather than object-oriented in style
  • Has less support for refactoring in current IDEs
  • Is less tolerant of cyclic dependencies in code bases (making it harder to move blocks of code / definitions around!)

Given the above, what is the best way to approach code refactoring in Clojure?

In "Working effectively with legacy code" Michael Feathers suggests adding unit tests to create artificial "inflection points" in the code that you can re-factor around.

A super brief and wholly incomplete overview of his approach to adding order to unstructured code:

  • Divide the code into "legacy" (without tests) and the rest.
  • Create a test.
  • Recur on both halves.

The recursive approach seemed to fit well with the mental processes I use in thinking about Clojure, so I have come to associate them. Even new languages can have legacy code, right?

This is what I got from my reading of that one book while thinking about Clojure, so I hope it is useful as a general guideline. Perhaps your codebase already has good tests, in which case you're already beyond this phase.

After years of coding Delphi programs, my code is untestable: it sits in forms and datamodules, uses global variables, and the only classes are the forms themselves, containing all the code I need for the form UI itself.

How would I convert the code to a set of classes that do the actual work? Would I need to stop using the datasources/datasets and do everything in classes? Do I need an ORM?

There's usually zero need for reuse of the code in the forms, so does it make sense to convert the logic to classes?

Another book I can highly, highly recommend - in my personal opinion even better suited than the "generic" refactoring book by Fowler - is "Working Effectively with Legacy Code" by Michael Feathers. It truly showcases the major bumps you will hit while doing that kind of work. Oh, and: Refactoring legacy code can be quite hard on your psyche. I hope you can handle frustration... I like this quote (don't remember where I got it from): "God was able to create the world in 6 days, just because there wasn't any legacy code". Good luck. ;)

To start with I can highly recommend reading the book Refactoring by Martin Fowler.

This will give you a real understanding about how best to sensibly approach introducing changes to the existing (non OO) code to improve maintainability.

I would not look at an ORM until you have a clear understanding about what benefits (if any) one would bring to your application.

I develop and maintain a large (500k+ LOC) WinForms app written in C# 2.0. It's multi-user and is currently deployed on about 15 machines. The development of the system is ongoing (can be thought of as a perpetual beta), and there's very little done to shield users from potential new bugs that might be introduced in a weekly build.

For this reason, among others, i've found myself becoming very reliant on edit-and-continue in the debugger. It helps not only with bug-hunting and bug-fixing, but in some cases with ongoing development as well. I find it extremely valuable to be able to execute newly-written code from within the context of a running application - there's no need to recompile and add a specific entry point to the new code (having to add dummy menu options, buttons, etc to the app and remembering to remove them before the next production build) - everything can be tried and tested in real-time without stopping the process.

I hold edit-and-continue in such high regard that I actively write code to be fully-compatible with it. For example, I avoid:

  • Anonymous methods and inline delegates (unless completely impossible to rewrite)
  • Generic methods (except in stable, unchanging utility code)
  • Targeting projects at 'Any CPU' (i.e. never executing in 64-bit)
  • Initializing fields at the point of declaration (initialisation is moved to the constructor)
  • Writing enumerator blocks that use yield (except in utility code)

Now, i'm fully aware that the new language features in C# 3 and 4 are largely incompatible with edit-and-continue (lambda expressions, LINQ, etc). This is one of the reasons why i've resisted moving the project up to a newer version of the Framework.

My question is whether it is good practice to avoid using these more advanced constructs in favor of code that is very, very easy to debug? Is there legitimacy in this sort of development, or is it wasteful? Also, importantly, do any of these constructs (lambda expressions, anonymous methods, etc) incur performance/memory overheads that well-written, edit-and-continue-compatible code could avoid? ...or do the inner workings of the C# compiler make such advanced constructs run faster than manually-written, 'expanded' code?

Without wanting to sound trite - it is good practice to write unit/integration tests rather than rely on Edit-Continue.

That way, you expend the effort once, and every other time is 'free'...

Now I'm not suggesting you retrospectively write unit tests for all your code; rather, each time you have to fix a bug, start by writing a test (or more commonly multiple tests) that proves the fix.

As @Dave Swersky mentions in the comments, Michael Feathers' book, Working Effectively with Legacy Code, is a good resource (it's legacy 5 minutes after you wrote it, right?)

So yes, I think it's a mistake to avoid new C# constructs in favor of allowing for edit-and-continue; BUT I also think it's a mistake to embrace new constructs just for the sake of it, especially if they lead to harder-to-understand code.

I'm a senior engineer working in a team of four others on a home-grown content management application that drives a large US pro sports web site. We have embarked upon this project some two years ago and chose Java as our platform, though my question is not Java-specific. Since we started, there has been some churn in our ranks. Each one of us has a significant degree of latitude in deciding on implementation details, although important decisions are made by consensus.

Ours is a relatively young project, however we are already at a point when no single developer knows everything about the app. The primary reasons for that are our quick pace of development, most of which occurs in a crunch leading up to our sport's season opener; and the fact that our test coverage is essentially 0.

We all understand the theoretical benefits of TDD and agree in principle that the methodology would have improved our lives and code quality if we had started out and stuck with it through the years. This never took hold, and now we're in charge of an untested codebase that still requires a lot of expansion and is actively used in production and relied upon by the corporate structure.

Faced with this situation, I see only two possible solutions: (1) retroactively write tests for existing code, or (2) re-write as much of the app as is practical while fanatically adhering to TDD principles. I perceive (1) as by and large not practical because we have a hellish dependency graph within the project. Almost none of our components can be tested in isolation; we don't know all the use cases; and the use cases will likely change during the testing push due to business requirements or as a reaction to unforeseen issues. For these reasons, we can't really be sure that our tests will turn out to be high quality once we're done. There's a risk of leading the team into a false sense of security whereby subtle bugs will creep in without anyone noticing. Given the bleak prospects with regards to ROI, it would be hard for myself or our team lead to justify this endeavor to management.

Method (2) is more attractive as we'll be following the test-first principle, thus producing code that's almost 100% covered right off the bat. Even if the initial effort results in islands of covered code at first, this will provide us with a significant beachhead on the way to project-wide coverage and help decouple and isolate the various components.

The downside in both cases is that our team's business-wise productivity could either slow down significantly or evaporate entirely during any testing push. We can not afford to do this during the business-driven crunch, although it's followed by a relative lull which we could exploit for our purposes.

In addition to choosing the right approach (either (1), (2), or another as of yet unknown solution), I need help answering the following question: How can my team ensure that our effort isn't wasted in the long run by unmaintained tests and/or failure to write new ones as business requirements roll on? I'm open to a wide range of suggestions here, whether they involve carrots or sticks.

In any event, thanks for reading about this self-inflicted plight.

I think that you need to flip around part of the question and the related arguments. How can you ensure that adhering to TDD will not result in wasted effort?

You can't, but the same is true for any development process.

I've always found it slightly paradoxical that we are always challenged to prove the cash benefit of TDD and related agile disciplines, when at the same time traditional waterfall processes more often than not result in

  • missed deadlines
  • blown budgets
  • death marches
  • developer burn-out
  • buggy software
  • unsatisfied customers

TDD and other agile methodologies attempt to address these issues, but obviously introduce some new concerns.

In any case I'd like to recommend the following books that may answer some of your questions in greater detail:

I've seen a trend to move business logic out of the data access layer (stored procedures, LINQ, etc.) and into a business logic component layer (like C# objects).

Is this considered the "right" way to do things these days? If so, does this mean that some database developer positions may be eliminated in favor of more middle-tier coding positions? (i.e. more c# code rather than more long stored procedures.)

If the application is small with a short lifetime, then it's not worth putting time into abstracting the concerns into layers. In larger, long-lived applications your logic/business rules should not be coupled to the data access. It creates a maintenance nightmare as the application grows.

Moving concerns to a common layer, also known as separation of concerns, has been around for a while:

Wikipedia

The term separation of concerns was probably coined by Edsger W. Dijkstra in his 1974 paper "On the role of scientific thought".

For application architecture, a great book to start with is Domain Driven Design. Eric Evans breaks down the different layers of the application in detail. He also discusses the database impedance mismatch and what he calls a "Bounded Context".

Bounded Context

A blog is a system that displays posts from newest to oldest so that people can comment on them. Some would view this as one system, or one "Bounded Context." If you subscribe to DDD, one would say there are two systems or two "Bounded Contexts" in a blog: a commenting system and a publication system. DDD argues that each system is independent (of course there will be interaction between the two) and should be modeled as such. DDD gives concrete guidance on how to separate the concerns into the appropriate layers.

Other resources that might interest you:

Until I had a chance to experience The Big Ball of Mud or Spaghetti Code I had a hard time understanding why Application Architecture was so important...

The right way to do things will always depend on the size, availability requirements, and lifespan of your application. To use stored procs or not to use stored procs... Tools such as NHibernate and LINQ to SQL are great for small to mid-size projects. To make myself clear, I've never used NHibernate or LINQ to SQL on a large application, but my gut feeling is that an application will reach a size where optimizations will need to be done on the database server via views, stored procedures, etc. to keep the application performant. To do this work, developers with both development and database skills will be needed.

I would like to start doing more unit testing in my applications, but it seems to me that most of the stuff I do is just not suitable to be unit tested. I know how unit tests are supposed to work in textbook examples, but in real-world applications they do not seem of much use.

Some applications I write have very simple logic and complex interactions with things that are outside my control. For instance I would like to write a daemon which reacts to signals sent by some applications, and changes some user settings in the OS. I can see three difficulties:

  • first I have to be able to talk with the applications and be notified of their events;
  • then I need to interact with OS whenever I receive a signal, in order to change the appropriate user settings;
  • finally all of this should work as a daemon.

All these things are potentially delicate: I will have to browse possibly complex APIs and I may introduce bugs, say by misinterpreting some parameters. What can unit testing do for me? I can mock both the external application and the OS, and check that given a signal from the application, I will call the appropriate API method on the OS. This is... well, the trivial part of the application.

Actually most of the things I do involve interaction with databases, the filesystem or other applications, and these are the most delicate parts.

For another example, look at my build tool PHPmake. I would like to refactor it, as it is not very well written, but I am afraid to do so as I have no tests. So I would like to add some. The point is that the things which may be broken by refactoring may not be caught by unit tests:

  • One of the things to do is deciding which things are to be built and which ones are already up to date, and this depends on the last modification time of the files. This time is actually changed by external processes, when some build command is fired.
  • I want to be sure that the output of external processes is displayed correctly. Sometimes the build commands require some input, and that should also be managed correctly. But I do not know a priori which processes will be run - it may be anything.
  • Some logic is involved in pattern matching, and this may seem to be a testable part. But the functions which do the pattern matching use (in addition to their own logic) the PHP function glob, which works with the filesystem. If I just mock a tree in place of the actual filesystem, glob will not work.

I could go on with more examples, but the point is the following. Unless I have some delicate algorithms, most of what I do involves interaction with external resources, and this is not suitable for unit testing. More than this, often this interaction is actually the non-trivial part. Still, many people see unit testing as a basic tool. What am I missing? How can I learn to be a better tester?

As regards your issue about existing code bases that aren't currently covered by tests in which you would like to start refactoring, I would suggest reading:

Working Effectively with Legacy Code by Michael Feathers.

That book gives you techniques on how to deal with the issues you might be facing with PHPMake. It provides ways to introduce seams for testing, where there previously weren't any.


Additionally, with code that touches, say, the file system, you can abstract the file system calls behind a thin wrapper, using the Adapter pattern. The unit tests would run against a fake implementation of the abstract interface that the wrapping class implements.
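As a rough illustration, such a wrapper might look like the following C# sketch (the interface and class names are invented for this example; only the production implementation touches the real file APIs):

using System;
using System.Collections.Generic;
using System.IO;

// Thin abstraction over the file system (Adapter pattern).
public interface IFileSystem
{
    IEnumerable<string> Glob(string directory, string pattern);
    DateTime LastModified(string path);
}

// Production implementation: a trivial pass-through to the real file system.
public class RealFileSystem : IFileSystem
{
    public IEnumerable<string> Glob(string directory, string pattern)
        => Directory.EnumerateFiles(directory, pattern);

    public DateTime LastModified(string path)
        => File.GetLastWriteTimeUtc(path);
}

// Fake used by unit tests: no disk access at all.
public class FakeFileSystem : IFileSystem
{
    public Dictionary<string, DateTime> Files { get; } = new Dictionary<string, DateTime>();

    public IEnumerable<string> Glob(string directory, string pattern)
        => Files.Keys;   // simplified: real pattern matching is elided in this sketch

    public DateTime LastModified(string path)
        => Files[path];
}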

At some point you get to a low enough level where a unit of code can't be isolated for unit testing, as it depends on library or API calls (such as in the production implementation of the wrapper). Once this happens, integration tests are really the only automated developer tests you can write.

Background: I work in a team of 7 developers and 2 testers on a logistics system. We use Delphi 2007 and model-driven development with Bold for Delphi as the framework. The system has been in production for about 7 years now and has about 1.7 million lines of code. We release to production every 4-5 weeks, and after almost every release we have to patch bugs we didn't find. This is of course irritating both for us and the customers.

Current testing: The solution is of course more automated testing. Currently we have manual testing, plus a Testdbgenerator that starts with an empty database and adds data from the modelled methods. We also have TestComplete, which runs some very basic scripts for testing the GUI. Lack of time stops us from adding more tests, and the scripts are also sensitive to changes in the application. Some years ago I really tried unit testing with DUnit, but I gave up after a few days. The units are too strongly connected.

Unit testing preconditions: I think I know some preconditions for unit testing:

  • Write small methods that do one thing, but do it well.
  • Don’t repeat yourself.
  • First write the test that fails, then write the code so the test pass.
  • The connections between units should be loose. They should not know much about each other.
  • Use dependency injection.

Framework to use: We may upgrade to Delphi XE2, mainly because of the 64-bit compiler. I have looked at Spring a bit, but this requires an upgrade from D2007 and that will not happen now. Maybe next year.

The question: Most code is still not tested automatically. So what is the best path to take to increase the testability of old code? Or is it best to start writing tests for new methods only? I'm not sure what the best way to increase automated testing is, and comments about it are welcome. Can we use D2007 + DUnit now and then easily change to Delphi XE2 + Spring later?

EDIT: Our current methodology for manual testing is just "pound on it and try to break it", as Chris calls it.

You want the book by Michael Feathers, Working Effectively with Legacy Code. It shows how to introduce (unit) tests to code that wasn't written with testability in mind.

Some of the chapters are named for excuses a developer might give for why testing old code is hard, and they contain case studies and suggested ways to address each problem:

  • I don't have much time and I have to change it
  • I can't run this method in a test harness
  • This class is too big and I don't want it to get any bigger
  • I need to change a monster method and I can't write tests for it.

It also covers many techniques for breaking dependencies; some might be new to you, and some you might already know but just haven't thought to use yet.

I ask myself where reverse engineering is used. I'm interested at learning it. But I don't know if I can/should put it on my CV.

I don't want my new chief to think I am an evil Hacker or something. :)

  • So is it worth it?
  • Should I learn it or put my effort somewhere else?
  • Is there a good Book or tutorial out there? :)

It is very common (in my experience) to encounter older code which has defects, has become outdated due to changing requirements, or both. It's often the case that there's inadequate documentation, and the original developer(s) are no longer available. Reverse engineering that code to understand how it works (and sometimes to make a repair-or-replace decision) is an important skill.

If you have the source, it's often reasonable to do a small, carefully-planned, strictly-scoped amount of cleanup. (I'm hinting out loud that this can't be allowed to become a sinkhole for valuable developer time!)

It's also very helpful to be able to exercise the code in a testbed, either to verify that it does what was expected or to identify, document, isolate, and repair defects.

Doing so safely requires careful work. I highly recommend Michael Feathers' book Working Effectively with Legacy Code for its practical guidance in getting such code under test.

I've got a big project written in PHP and Javascript. The problem is that it's become so big and unmaintainable that changing some little portion of the code will upset and probably break a whole lot of other portions.

I'm really bad at testing my own code (as a matter of fact, others point this out daily), which makes it even more difficult to maintain the project.

The project itself isn't that complicated or complex, it's more the way it's built that makes it complex: we don't have predefined rules or lists to follow when doing our testing. This often results in lots of bugs and unhappy customers.

We started discussing this at the office and came up with the idea of starting to use test driven development instead of the develop like hell and maybe test later (which almost always ends up being fix bugs all the time).

After that background, the things I need help with are the following:

  1. How to implement a test framework into an already existing project? (3 years in the making and counting)

  2. What kind of frameworks are there for testing? I figure I'll need one framework for Javascript and one for PHP.

  3. What's the best approach for testing the graphical user interface?

I've never used Unit Testing before so this is really uncharted territory for me.

G'day,

Edit: I've just had a quick look through the first chapter of "The Art of Unit Testing" which is also available as a free PDF at the book's website. It'll give you a good overview of what you are trying to do with a unit test.

I'm assuming you're going to use an xUnit type framework. Some initial high-level thoughts are:

  1. Edit: make sure that everyone is in agreement as to what constitutes a good unit test. I'd suggest using the above overview chapter as a good starting point and, if needed, take it from there. Imagine having people run off enthusiastically to create lots of unit tests while having different understandings of what a "good" unit test is. It'd be terrible for you to find out in the future that 25% of your unit tests aren't useful, repeatable, reliable, etc., etc.
  2. add tests to cover small chunks of code at a time. That is, don't create a single, monolithic task to add tests for the existing code base.
  3. modify any existing processes to make sure new tests are added for any new code written. Make it a part of the review process of the code that unit tests must be provided for the new functionality.
  4. extend any existing bugfix processes to make sure that new tests are created to show the presence, and then prove the absence, of the bug. N.B. Don't forget to roll back your candidate fix to reintroduce the bug, to verify that it is only that single patch that has corrected the problem and that it is not being fixed by a combination of factors.
  5. Edit: as you start to build up the number of your tests, start running them as nightly regression tests to check nothing has been broken by new functionality.
  6. make a successful run of all existing tests an entry criterion for the review process of a candidate bugfix.
  7. Edit: start keeping a catalogue of test types, i.e. test code fragments, to make the creation of new tests easier. No sense in reinventing the wheel all the time. The unit test(s) written to test opening a file in one part of the code base is/are going to be similar to the unit test(s) written to test code that opens a different file in a different part of the code base. Catalogue these to make them easy to find.
  8. Edit: where you are only modifying a couple of methods in an existing class, create a test suite to hold the complete set of tests for the class. Then add to this test suite only the individual tests for the methods you are modifying. This uses xUnit terminology, as I'm now assuming you'll be using an xUnit framework like PHPUnit.
  9. use a standard convention for the naming of your test suites and tests, e.g. testSuite_classA, which will then contain individual tests like test__test_function. For example, test_fopen_bad_name and test_fopen_bad_perms, etc. This helps minimise the noise when moving around the code base and looking at other people's tests. It also has the benefit of helping people when they come to name their tests in the first place, by freeing up their mind to work on the more interesting stuff like the tests themselves.
  10. Edit: I wouldn't use TDD at this stage. By definition, TDD will need all tests present before the changes are in place, so you will have failing tests all over the place as you add new testSuites to cover classes that you are working on. Instead add the new testSuite and then add the individual tests as required, so you don't get a lot of noise in your test results from failing tests. And, as Yishai points out, adding the task of learning TDD at this point in time will really slow you down. Put learning TDD as a task to be done when you have some spare time. It's not that difficult.
  11. as a corollary of this, you'll need a tool to keep track of those existing classes where the testSuite exists but where tests have not yet been written to cover the other member functions in the class. This way you can keep track of where your test coverage has holes. I'm talking at a high level here, where you can generate a list of classes and specific member functions where no tests currently exist. A standard naming convention for the tests and testSuites will greatly help you here.

I'll add more points as I think of them.

HTH

You should get yourself a copy of Working Effectively with Legacy Code. This will give you good guidance in how to introduce tests into code that is not written to be tested.

TDD is great, but you do need to start with just putting existing code under test to make sure that changes you make don't change existing required behavior while introducing changes.

However, introducing TDD now will slow you down a lot before you get back going, because retrofitting tests, even only in the area you are changing, is going to get complicated before it gets simple.

I am a novice programmer, and as part of my project I have to modify an open source tool (written in Java) which has hundreds of classes. I have to modify a significant part of it to suit the needs of the project. I have been struggling with it for the last month, trying to read the code, find out the functionality of each class, and figure out the pipeline from start to end.

80% of the classes have incomplete or missing documentation. The remaining 20% are those that form the general-purpose API for the tool. One month of code reading has just helped me understand the basic architecture, but I have not been able to figure out the exact changes I need to make for my project. At one point I started modifying a part of the code and soon made so many changes that I could no longer keep track of them.

A friend suggested that I try to write down the class hierarchy. Is there a better (standard?) way to do this?

There's a great book called Working Effectively with Legacy Code, by Michael Feathers. There's a shorter article version here.

One of his points is that the best thing you can do is write unit tests for the existing code. This helps you understand where the entry points are and how the code should work. Then it lets you refactor it without worrying that you're going to break it.

From the article linked, the summary of his strategy:

1. Identify change points
2. Find an inflection point
3. Cover the inflection point
   a. Break external dependencies
   b. Break internal dependencies
   c. Write tests
4. Make changes
5. Refactor the covered code.

I have been trying to use Ninject today and have a couple of questions. First of all, do I need to use the Inject attribute on all constructors that I want to use injection for? This seems like a really lame design. Do I need to create a Kernel and then use that everywhere I pass in an injected class?

The best way to get started with Ninject is to start small. Look for a new.

Somewhere in the middle of your application, you're creating a class inside another class. That means you're creating a dependency. Dependency Injection is about passing in those dependencies, usually through the constructor, instead of embedding them.

Say you have a class like this, used to automatically create a specific type of note in Word. (This is similar to a project I've done at work recently.)

class NoteCreator
{
    public NoteHost Create()
    {
        var docCreator = new WordDocumentCreator();
        docCreator.CreateNewDocument();
        [etc.]

WordDocumentCreator is a class that handles the specifics of creating a new document in Microsoft Word (create an instance of Word, etc.). My class, NoteCreator, depends on WordDocumentCreator to perform its work.

The trouble is, if someday we decide to move to a superior word processor, I have to go find all the places where WordDocumentCreator is instantiated and change them to instantiate WordPerfectDocumentCreator instead.

Now imagine that I change my class to look like this:

class NoteCreator
{
    WordDocumentCreator docCreator;

    public NoteCreator(WordDocumentCreator docCreator)  // constructor injection
    {
        this.docCreator = docCreator;
    }

    public NoteHost Create()
    {
        docCreator.CreateNewDocument();
        [etc.]

My code hasn't changed that much; all I've done within the Create method is remove the line with the new. But now I'm injecting my dependency. Let's make one more small change:

class NoteCreator
{
    IDocumentCreator docCreator;

    public NoteCreator(IDocumentCreator docCreator)  // change to interface
    {
        this.docCreator = docCreator;
    }

    public NoteHost Create()
    {
        docCreator.CreateNewDocument();
        [etc.]

Instead of passing in a concrete WordDocumentCreator, I've extracted an IDocumentCreator interface with a CreateNewDocument method. Now I can pass in any class that implements that interface, and all NoteCreator has to do is call the method it knows about.

Now the tricky part. I should now have a compile error in my app, because somewhere I was creating NoteCreator with a parameterless constructor that no longer exists. Now I need to pull out that dependency as well. In other words, I go through the same process as above, but now I'm applying it to the class that creates a new NoteCreator. When you start extracting dependencies, you'll find that they tend to "bubble up" to the root of your application, which is the only place where you should have a reference to your DI container (e.g. Ninject).

The other thing I need to do is configure Ninject. The essential piece is a class that looks like this:

class MyAppModule : NinjectModule
{
    public override void Load()
    {
        Bind<IDocumentCreator>()
            .To<WordDocumentCreator>();
    }
}

This tells Ninject that when I attempt to create a class that, somewhere down the line, requires an IDocumentCreator, it should create a WordDocumentCreator and use that. The process Ninject goes through looks something like this:

  • Create the application's MainWindow. Its constructor requires a NoteCreator.
  • OK, so create a NoteCreator. But its constructor requires an IDocumentCreator.
  • My configuration says that for an IDocumentCreator, I should use WordDocumentCreator. So create a WordDocumentCreator.
  • Now I can pass the WordDocumentCreator to the NoteCreator.
  • And now I can pass that NoteCreator to the MainWindow.
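At the root of the application, that chain is typically kicked off with something like the following sketch (StandardKernel and Get&lt;T&gt;() are standard Ninject calls; the MainWindow usage is assumed from the example above):

using Ninject;

class Program
{
    static void Main()
    {
        // The composition root: the only place that references the container.
        var kernel = new StandardKernel(new MyAppModule());

        // Ninject walks the constructor chain described above:
        // MainWindow -> NoteCreator -> IDocumentCreator (bound to WordDocumentCreator).
        var mainWindow = kernel.Get<MainWindow>();
        mainWindow.Show();   // assuming MainWindow exposes a Show() method
    }
}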

The beauty of this system is threefold.

First, if you fail to configure something, you'll know right away, because your objects are created as soon as your application is run. Ninject will give you a helpful error message saying that your IDocumentCreator (for instance) can't be resolved.

Second, if management later mandates the use of a superior word processor, all you have to do is

  • Write a WordPerfectDocumentCreator that implements IDocumentCreator.
  • Change MyAppModule above, binding IDocumentCreator to WordPerfectDocumentCreator instead.

Third, if I want to test my NoteCreator, I don't have to pass in a real WordDocumentCreator (or whatever I'm using). I can pass in a fake one. That way I can write a test that assumes my IDocumentCreator works correctly, and only tests the moving parts in NoteCreator itself. My fake IDocumentCreator will do nothing but return the correct response, and my test will make sure that NoteCreator does the right thing.

For more information about how to structure your applications this way, have a look at Mark Seemann's recent book, Dependency Injection in .NET. Unfortunately, it doesn't cover Ninject, but it does cover a number of other DI frameworks, and it talks about how to structure your application in the way I've described above.

Also have a look at Working Effectively With Legacy Code, by Michael Feathers. He talks about the testing side of the above: how to break out interfaces and pass in fakes for the purpose of isolating behavior and getting it under test.

Can anyone recommend some best-practices on how to tackle the problem of starting to UnitTest a large existing CodeBase? The problems that I'm currently facing include:

  • HUGE code base
  • ZERO existing UnitTests
  • High coupling between classes
  • Complex OM (not much I can do here - it's a complex business domain)
  • Lack of experience in writing UnitTests/TDD
  • Database dependency
  • External sources dependencies (Web services, WCF services, NetBIOS, etc)

Obviously, I understand that I should start with refactoring the code, to make it less coupled, and more testable. However, doing such refactoring is risky without UnitTests (Chicken and Egg, anyone?).

On a side note, would you recommend starting the refactoring and writing tests on the domain classes, or on the tier classes (logging, utilities, etc.)?

Lack of experience in writing UnitTests/TDD

I think this is the most significant.

Standard advice is to start writing unit tests for all your new code first so that you learn how to unit test before you attempt to unit test your existing code. I think this is good advice in theory but hard to follow in practice, since most new work is modifications of existing systems.

I think you'd be well served by having access to a "player coach", someone who could work on the project with your team and teach the skills in the process of applying them.

And I think I'm legally obligated to tell you to get Michael Feathers' Working Effectively with Legacy Code.

How do you perform unit-test-like tests in C? Which framework do you use, or do you do other tests as unit tests at the code level?

Personally I like the Google Test framework.

The real difficulty in testing C code is breaking the dependencies on external modules so you can isolate code in units. This can be especially problematic when you are trying to get tests around legacy code. In this case I often find myself using the linker to substitute stub functions in tests.

This is what people are referring to when they talk about "seams". In C your only option really is to use the pre-processor or the linker to mock out your dependencies.

A typical test suite in one of my C projects might look like this:

#include "myimplementationfile.c"
#include <gtest/gtest.h>

// Mock out external dependency on mylogger.o
void Logger_log(...){}

TEST(FactorialTest, Zero) {
    EXPECT_EQ(1, Factorial(0));
}

Note that you are actually including the C file and not the header file. This gives the advantage of access to all the static data members. Here I mock out my logger (which might be in logger.o) and give it an empty implementation. This means that the test file compiles and links independently from the rest of the code base and executes in isolation.

As for cross-compiling the code, for this to work you need good facilities on the target. I have done this with googletest cross-compiled to Linux on a PowerPC architecture. This makes sense because there you have a full shell and OS to gather your results. For less rich environments (which I classify as anything without a full OS) you should just build and run on the host. You should do this anyway so you can run the tests automatically as part of the build.

I find testing C++ code is generally much easier due to the fact that OO code is in general much less coupled than procedural (of course this depends a lot on coding style). Also in C++ you can use tricks like dependency injection and method overriding to get seams into code that is otherwise encapsulated.

Michael Feathers has an excellent book about testing legacy code. In one chapter he covers techniques for dealing with non-OO code which I highly recommend.

Edit: I've written a blog post about unit testing procedural code, with source available on GitHub.

I admit that I have almost no experience of unit testing. I made an attempt with DUnit a while ago but gave up because there were so many dependencies between classes in my application. It is a rather big (about 1.5 million source lines) Delphi application, and we are a team that maintains it.

The testing for now is done by one person who uses the application before release and reports bugs. I have also set up some GUI tests in TestComplete 6, but they often fail because of changes in the application.

Bold for Delphi is used as the persistence framework against the database. We all agree that unit testing is the way to go, and we plan to write a new application in .NET with ECO as the persistence framework.

I just don't know where to start with unit testing... Any good books, URLs, best practices, etc.?

Writing unit tests for legacy code usually requires a lot of refactoring. An excellent book that covers this is Michael Feathers' "Working Effectively with Legacy Code".

One additional suggestion: use a unit test coverage tool to indicate your progress in this work. I'm not sure about what the good coverage tools for Delphi code are though. I guess this would be a different question/topic.

Working Effectively with Legacy Code

I am now doing unit testing on an application which was written over the past year, before I started doing unit testing diligently. I realized that the classes I wrote are hard to unit test, for the following reasons:

  1. Relies on loading data from the database, which means I have to set up a row in the table just to run the unit test (and I am not testing database capabilities).
  2. Requires a lot of other external classes just to get the class I am testing to its initial state.

On the whole, there doesn't seem to be anything wrong with the design except that it is too tightly coupled (which by itself is a bad thing). I figure that if I had written automated test cases for each class, thereby ensuring that I don't heap extra dependencies or coupling onto a class for it to work, the classes might be better designed.

Does this reason holds water? What are your experiences?

Yes, you are right. A class which is hard to unit test is (almost always) not well designed (there are exceptions, as always, but these are rare - IMHO one had better not try to explain the problem away this way). Lack of unit tests means that it is harder to maintain - you have no way of knowing whether you have broken existing functionality whenever you modify anything in it.

Moreover, if it is (co)dependent with the rest of the program, any changes in it may break things even in seemingly unrelated, far away parts of the code.

TDD is not simply a way to test your code - it is also a different way of design. Effectively using - and thinking about using - your own classes and interfaces from the very first moment may result in a very different design than the traditional way of "code and pray". One concrete result is that typically most of your critical code is insulated from the boundaries of your system, i.e. there are wrappers/adapters in place to hide e.g. the concrete DB from the rest of the system, and the "interesting" (i.e. testable) code is not within these wrappers - these are as simple as possible - but in the rest of the system.

Now, if you have a bunch of code without unit tests and want to cover it, you have a challenge. Mocking frameworks may help a lot, but it is still a pain in the ass to write unit tests for such code. A good source of techniques for dealing with such issues (commonly known as legacy code) is Working Effectively with Legacy Code, by Michael Feathers.

I tried looking through all the pages about unit tests and could not find this question. If this is a duplicate, please let me know and I will delete it.

I was recently tasked to help implement unit testing at my company. I realized that I could unit test all the Oracle PL/SQL code, Java code, HTML, JavaScript, XML, XSLT, and more.

Is there such a thing as too much unit testing? Should I write unit tests for everything above or is that overkill?

The purpose of unit tests is generally to make it possible to refactor or change code with greater assurance that you did not break anything. If a change is scary because you do not know whether you will break anything, you probably need to add a test. If a change is tedious because it will break a lot of tests, you probably have too many tests (or too fragile tests).

The most obvious case is the UI. What makes a UI look good is hard to test, and comparing against a master example tends to be fragile. So the layer of the UI concerned with the look of something tends not to be tested.

The other times it might not be worth it is if the test is very hard to write and the safety it gives is minimal.

For HTML I tended to check that the data I wanted was there (using XPath queries), but did not test the entire HTML. Similarly for XSLT and XML. In JavaScript, when I could I tested libraries but left the main page alone (except that I moved most code into libraries). If the JavaScript is particularly complicated I would test more. For databases I would look into testing stored procedures and possibly views; the rest is more declarative.
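In C#, such a targeted check might look like this sketch (the markup, XPath, and names are invented for illustration; it assumes a well-formed fragment with no default namespace):

using System.Xml;

static class HtmlChecks
{
    // Verifies one specific piece of data instead of comparing the whole page.
    public static bool TotalIsDisplayed(string xhtmlFragment, string expectedTotal)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtmlFragment);   // assumes well-formed markup without a default namespace
        var node = doc.SelectSingleNode("//span[@id='order-total']");
        return node != null && node.InnerText == expectedTotal;
    }
}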

However, in your case first start with the stuff that worries you the most or is about to change, especially if it is not too difficult to test. Check the book Working Effectively with Legacy Code for more help.

I've just started writing unit tests for a legacy code module with large physical dependencies introduced via the #include directive. I've been dealing with them in a few ways that have felt overly tedious (providing empty headers to break long #include dependency lists, and using #define to prevent classes from being compiled), and I was looking for some better strategies for handling these problems.

I've frequently been running into the problem of duplicating almost every header file with a blank version in order to separate the class I'm testing in its entirety, and then writing substantial stub/mock/fake code for objects that will need to be replaced since they're now undefined.

Anyone know some better practices?

The depression in the responses is overwhelming... But don't fear, we've got the holy book to exorcise the demons of legacy C++ code. Seriously just buy the book if you are in line for more than a week of jousting with legacy C++ code.

Turn to page 127: the case of the horrible include dependencies. (Now I am not even within miles of Michael Feathers, but here is an as-short-as-I-could-manage answer...)

Problem: In C++, if ClassA needs to know about ClassB, ClassB's declaration is straight-lifted / textually included in ClassA's source file. And since we programmers love to take it to the wrong extreme, a file can recursively include a zillion others transitively. Builds take years... but hey, at least it builds... we can wait.

Now to say 'instantiating ClassA under a test harness is difficult' is an understatement. (Quoting MF's example - Scheduler is our poster problem child with deps galore.)

#include "TestHarness.h"
#include "Scheduler.h"
TEST(create, Scheduler)     // your fave C++ test framework macro
{
  Scheduler scheduler("fred");
}

This will bring out the includes dragon with a flurry of build errors.
Blow#1 Patience-n-Persistence: Take on each include one at a time and decide if we really need that dependency. Let's assume SchedulerDisplay is one of them, whose displayEntry method is called in Scheduler's ctor.
Blow#2 Fake-it-till-you-make-it (Thanks RonJ):

#include "TestHarness.h"
#include "Scheduler.h"
void SchedulerDisplay::displayEntry(const string& entryDescription) {}
TEST(create, Scheduler)
{
  Scheduler scheduler("fred");
}

And pop goes the dependency and all its transitive includes. You can also reuse the fake methods by encapsulating them in a Fakes.h file to be included in your test files.
Blow#3 Practice: It may not always be that simple... but you get the idea. After the first few duels, the process of breaking deps will get easy-n-mechanical.

Caveats (Did I mention there are caveats? :)

  • We need a separate build for the test cases in this file; we can have only one definition of the SchedulerDisplay::displayEntry method in a program. So create a separate program for the scheduler tests.
  • We aren't breaking any dependencies in the program, so we are not making the code cleaner.
  • You need to maintain those fakes as long as we need the tests.
  • Your sense of aesthetics may be offended for a while.. just bite your lip and 'bear with us for a better tomorrow'

Use this technique for a very huge class with severe dependency issues. Don't use often or lightly.. Use this as a starting point for deeper refactorings. Over time this testing program can be taken behind the barn as you extract more classes (WITH their own tests).

For more.. please do read the book. Invaluable. Fight on bro!

I work on a large scale platform project supporting around 10 products that use our code.

So far, all of the products have been using the full functionality of our platform:
- Retrieval of configuration data from a database
- Remote file system access
- Security authorization
- Base logic (the thing we are paid to offer)

For a new product we've been asked to support a smaller subset of functionality without the infrastructure the platforms bring along. Our architecture is old (start of coding from 2005 or so) but reasonably solid.

We're confident we can do that using DI on our existing classes, but the estimated times to do so range from 5 to 70 weeks depending on who you talk to.

There are a lot of articles out there that tell you how to do DI, but I couldn't find any that tell you how to refactor for DI in the most efficient way. Are there tools that do this, rather than having to go through 30,000 lines of code and hit CTRL+R to extract interfaces and add them to constructors too many times? (We have ReSharper, if that helps.) If not, what do you find is the ideal workflow to achieve this quickly?

This book would probably be very helpful:

Working Effectively with Legacy Code - Michael C. Feathers - http://www.amazon.com/gp/product/0131177052

I would suggest starting with small changes. Gradually move dependencies to be injected through the constructor. Always keep the system working. Extract interfaces from the constructor injected dependencies and start wrapping with unit tests. Bring in tools when it makes sense. You don't have to start using dependency injection and mocking frameworks right away. You can make a lot of improvements by manually injecting dependencies through the constructor.
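
As a rough illustration of what one of those small steps tends to look like (the class and member names here are hypothetical, not from your codebase), the recurring move is: extract an interface from a dependency, then pass it in through the constructor so tests can substitute it:

public class ReportServiceBefore
{
    // Before: the class news up its own data-access dependency, so tests cannot substitute it.
    private readonly ConfigAdapter _config = new ConfigAdapter();
    public string Title() { return _config.GetValue("report.title"); }
}

// Step 1: extract an interface from the dependency (e.g. ReSharper's Extract Interface).
public interface IConfigSource
{
    string GetValue(string key);
}

public class ConfigAdapter : IConfigSource
{
    public string GetValue(string key) { return "(value loaded from the database)"; }
}

// Step 2: inject the dependency through the constructor. Production code passes the real
// adapter; unit tests pass a hand-written fake or a mock.
public class ReportService
{
    private readonly IConfigSource _config;

    public ReportService(IConfigSource config) { _config = config; }

    public string Title() { return _config.GetValue("report.title"); }
}

Doing this one dependency at a time, behind a compiling build, is slower per class than a big-bang rewrite attempt, but it keeps the system shippable the whole way through.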

Just wanted to hear some words of advice (and comfort..) that will help me take control of some complicated spaghetti code -- code that was developed by multiple programmers (who usually never met each other) over a long time. The solution's features are just patched on top of each other.

Usually I tend to see 2 kinds of programmers:

  1. "Scared-to-death programmers" - those guys will not touch anything that they don't have to. they will probably will complete the maintenance task using a quick and dirty fixes that will make the next programmer to start looking for their home address ;-)

    pros:
    it works

    cons:
    you hope you will never see this code again..

  2. "Teachers" - those will probably rewrite the whole code while completely refurbishing its logic.

    pros:
    well, someone has to do the dirty work...

    cons:
    takes longer, and probably one of the most critical features will magically disappear from the product

It would be nice to hear your personal experiences from this darker side of a programmer's life.

I am specifically curious to hear any theoretical/hands-on advice that will help me to dive into the spaghetti maintenance task without feeling so miserable.

I try to wrap really bad legacy code in tests before I start. I pretty much smother it in tests, to be honest! Then I have some confidence in diving in and refactoring it where I need to.

You might like to read


Working Effectively with Legacy Code

How do you unit test a large MFC UI application?

We have a few large MFC applications that have been in development for many years, we use some standard automated QA tools to run basic scripts to check fundamentals, file open etc. These are run by the QA group post the daily build.

But we would like to introduce procedures such that individual developers can build and run tests against dialogs, menus, and other visual elements of the application before submitting code to the daily build.

I have heard of such techniques as hidden test buttons on dialogs that only appear in debug builds; are there any standard toolkits for this?

Environment is C++/C/FORTRAN, MSVC 2005, Intel FORTRAN 9.1, Windows XP/Vista x86 & x64.

Since you mentioned MFC, I assumed you have an application that would be hard to get under an automated test harness. You'll see the most benefit from unit testing frameworks when you build tests as you write the code.. But trying to add a new feature in a test-driven manner to an application that was not designed to be testable.. can be hard work and, well, frustrating.

Now what I am going to propose is definitely hard work.. but with some discipline and perseverance you'll see the benefit soon enough.

  • First you'll need some management backing for new fixes to take a bit longer. Make sure everyone understands why.
  • Next buy a copy of the WELC book. Read it cover to cover if you have the time, OR, if you're hard pressed, scan the index to find the symptom your app is exhibiting. This book contains a lot of good advice and is just what you need when trying to get existing code testable.
  • Then for every new fix/change, spend some time and understand the area you're going to work on. Write some tests in an xUnit variant of your choice (freely available) to exercise current behavior.
  • Make sure all tests pass. Write a new test which exercises needed behavior or the bug.
  • Write code to make this last test pass.
  • Refactor mercilessly within the area under tests to improve design.
  • Repeat for every new change that you have to make to the system from here on. No exceptions to this rule.
  • Now the promised land: Soon, ever-growing islands of well-tested code will begin to surface. More and more code will fall under the automated test suite, and changes will become progressively easier to make, because slowly but surely the underlying design becomes more testable.

The easy way out was my previous answer. This is the difficult but right way out.

I am about to start planning a major refactoring of our codebase, and I would like to get some opinions and answers to some questions (I have seen quite a few discussions on similar topics, such as http://stackoverflow.com/questions/108141/how-do-i-work-effectively-with-very-messy-legacy-code and Strategy for large scale refactoring), but I have some specific questions (at the bottom):

We develop a complex application. There are some 25 developers working the codebase. Total man years put into the product to date are roughly 150. The current codebase is a single project, built with ant. The high level goal of the project I'm embarking on is to modularize the codebase into its various infrastructures and applicative components. There is currently no good separation between the various logical components, so it's clear that any modularization effort will need to include some API definitions and serious untangling to enable the separation. Quality standards are low - there are almost no tests, and definitely no tests running as part of the build process.

Another very important point is that this project needs to take place in parallel to active product development and versions being shipped to customers.

Goals of project:

  • allow reuse of components across different projects
  • separate application from infrastructure, and allow them to evolve independently
  • improve testability (by creating APIs)
  • simplify developers' dev env (less code checked out and compiled)

My thoughts and questions:

  1. What are your thoughts regarding the project's goals? Anything you would change?
  2. Do you have experience with such projects? What would be some recommendations?
  3. I'm very concerned with the lack of tests - hence the lack of control for me to know that the refactoring process is not breaking anything as I go. This is a catch 22, because one of the goals of this project is to make our code more testable...
  4. I was very influenced by Michael Feathers' Working Effectively With Legacy Code. According to it, a bottom-up approach is the way to solve my problem - don't jump head first into the codebase and try to fix it, but rather start small by adding unit tests around new code for several months, and see how the code (and team) become much better, to an extent where abstractions will emerge, APIs will surface, etc., and essentially - the modularization will start happening by itself. Does anyone have experience with such a direction? As seen in many other questions on this topic - the main problem here is managerial disbelief. "How is testing class by class (and spending a lot of time doing so) gonna bring us to a stable system? It's a nice theory which doesn't work in real life." Any tips on selling this?

I inherited a fairly large, homemade, php4+MySQL, ecommerce project from developers that literally taught themselves programming and html as they wrote it. (I would shudder except that it's really impressive that they were able to do so much by starting from scratch.) My job is to maintain it and take it forward with new functionality.

Functionality of the code depends on $_SESSION data and other global state structures, which then affect the flow of the code and which parts of the site are displayed through require statements. When I took it over last year, my first task was abstracting all of the repetition into separate files which get included via require statements and also removing most of the 'logic' code from the 'display' or output code, but I couldn't remove it all. I have moved code into functions where I can, but that's still quite limited. Classes and methods are definitely out of the question right now.

All testing is being done manually/visually. I would like to start automating some testing, but I simply don't know where to start. Unit testing of functions is pretty straightforward, but very little of the code is in functions and most of that is pretty simple. I've looked at phpUnit and DbUnit, but all of the examples and discussion about them focus on classes and methods.

So, what options do I have to begin implementing unit testing on anything more than the most trivial parts of my project?

First off, PHPUnit can be used to test procedural code just fine. Don't let the fact that the PHPUnit examples only show classes deter you. It's just how PHPUnit tests are organized.

You can just write test classes and test your functions from them without any problems, and that should be the least of your problems :)

If the code doesn't run on PHP 5.2+ then you can't use a current PHPUnit version, which is definitely more of a concern, so my first general recommendation is to find any issues a PHP 5 upgrade might bring.


To start off one book recommendation to save you some troubles:

Working Effectively with Legacy Code

The book will help you avoid a lot of small mistakes you'd otherwise have to make yourself, and it will get you into the right mindset. It's Java based, but that is not really an issue as most of the material is easily adaptable.


Testing is hard because you don't even know what the application is supposed to do in the first place

Getting unit tests up and running takes quite some time and doesn't give you an "is it still working?" status, so my first point would be to get some integration and front-end tests set up.

Tools like Selenium and the web testing parts of Behat can help you A LOT with this.

The advantage of using Behat would be that you can write nice documentation for what the product is actually supposed to do. No matter how the project goes along, these docs will always have value for you.

The tests read something like: "When I go to this URL and enter that data, a user should be created, and when I click there I get an email containing my data export". Check it out. It might be useful.


The most important thing is to get a quick green/red indicator if the thing is still working!

If you then find out that it was broken despite your "light" being green you can expand the tests from there on out.


When you don't know when it is broken you will never be confident enough to change enough stuff around so that you can incrementally improve what needs fixing or changing.


After you have a general sense of how everything works and you trust your small tests to show you when you break "the whole thing", I'd say it's time to set up a small continuous integration server like Jenkins for PHP that allows you to track the status of your project over time. You don't need all the QA stuff at the start (maybe to get an overview of the project), but just seeing that all the "does it still work" stuff returns "yes" is very important and saves you lots of time making sure of that manually.

2% Code Coverage is boring

When you are at the point where unit testing and code coverage come into play, you will be faced with a mean 0% reading. It can be quite annoying to never see that number rise much.

But you want to make sure you test NEW code, so I'd suggest using PHP_Change_Coverage, as described in this blog posting, to make sure everything you touch has tests afterwards.

PHP Black Magic

function stuff() {
    if(SOME_OLD_UGLY_CONST == "SOME SETTING") {
         die("For whatever reasons");
    }
    return "useful stuff";
}

When testing, it is really annoying when your scripts die(), but what can you do?

Reworking all the scripts without tests can be more harmful than not doing anything at all, so maybe you want to hack a little to get tests in place first.

For this and other solutions to scary things, there is the php test helpers extension.

<?php
set_exit_overload(function() { return FALSE; });
exit;
print 'We did not exit.';
unset_exit_overload();
exit;
print 'We exited and this will not be printed.';
?>

I've read about unit testing and heard a lot of hullabaloo by others touting its usefulness, and would like to see it in action. As such, I've selected this basic class from a simple application that I created. I have no idea how testing would help me, and am hoping one of you will be able to help me see the benefit of it by pointing out what parts of this code can be tested, and what those tests might look like. So, how would I write unit tests for the following code?

public class Hole : INotifyPropertyChanged
{
    #region Field Definitions
    private double _AbsX;
    private double _AbsY;
    private double _CanvasX { get; set; }
    private double _CanvasY { get; set; }
    private bool _Visible;
    private double _HoleDia = 20;
    private HoleTypes _HoleType;
    private int _HoleNumber;
    private double _StrokeThickness = 1;
    private Brush _StrokeColor = new SolidColorBrush(Colors.Black);
    private HolePattern _ParentPattern;
    #endregion

    public enum HoleTypes { Drilled, Tapped, CounterBored, CounterSunk };
    public Ellipse HoleEntity = new Ellipse();
    public Ellipse HoleDecorator = new Ellipse();
    public TextBlock HoleLabel = new TextBlock();

    private static DoubleCollection HiddenLinePattern = 
               new DoubleCollection(new double[] { 5, 5 });

    public int HoleNumber
    {
        get
         {
            return _HoleNumber;
         }
        set
        {
            _HoleNumber = value;
            HoleLabel.Text = value.ToString();
            NotifyPropertyChanged("HoleNumber");
        }
    }
    public double HoleLabelX { get; set; }
    public double HoleLabelY { get; set; }
    public string AbsXDisplay { get; set; }
    public string AbsYDisplay { get; set; }

    public event PropertyChangedEventHandler PropertyChanged;
    //public event MouseEventHandler MouseActivity;

    // Constructor
    public Hole()
    {
        //_HoleDia = 20.0;
        _Visible = true;
        //this.ParentPattern = WhoIsTheParent;
        HoleEntity.Tag = this;
        HoleEntity.Width = _HoleDia;
        HoleEntity.Height = _HoleDia;

        HoleDecorator.Tag = this;
        HoleDecorator.Width = 0;
        HoleDecorator.Height = 0;


        //HoleLabel.Text = x.ToString();
        HoleLabel.TextAlignment = TextAlignment.Center;
        HoleLabel.Foreground = new SolidColorBrush(Colors.White);
        HoleLabel.FontSize = 12;

        this.StrokeThickness = _StrokeThickness;
        this.StrokeColor = _StrokeColor;
        //HoleEntity.Stroke = Brushes.Black;
        //HoleDecorator.Stroke = HoleEntity.Stroke;
        //HoleDecorator.StrokeThickness = HoleEntity.StrokeThickness;
        //HiddenLinePattern=DoubleCollection(new double[]{5, 5});
    }

    public void NotifyPropertyChanged(String info)
    {
        if (PropertyChanged != null)
        {
            PropertyChanged(this, 
                       new PropertyChangedEventArgs(info));
        }
    }

    #region Properties
    public HolePattern ParentPattern
    {
        get
        {
            return _ParentPattern;
        }
        set
        {
            _ParentPattern = value;
        }
    }

    public bool Visible
    {
        get { return _Visible; }
        set
        {
            _Visible = value;
            HoleEntity.Visibility = value ? 
             Visibility.Visible : 
             Visibility.Collapsed;
            HoleDecorator.Visibility = HoleEntity.Visibility;
            SetCoordDisplayValues();
            NotifyPropertyChanged("Visible");
        }
    }

    public double AbsX
    {
        get { return _AbsX; }
        set
        {
            _AbsX = value;
            SetCoordDisplayValues();
            NotifyPropertyChanged("AbsX");
        }
    }

    public double AbsY
    {
        get { return _AbsY; }
        set
        {
            _AbsY = value;
            SetCoordDisplayValues();
            NotifyPropertyChanged("AbsY");
        }
    }

    private void SetCoordDisplayValues()
    {
        AbsXDisplay = HoleEntity.Visibility == 
        Visibility.Visible ? String.Format("{0:f4}", _AbsX) : "";
        AbsYDisplay = HoleEntity.Visibility == 
        Visibility.Visible ? String.Format("{0:f4}", _AbsY) : "";
        NotifyPropertyChanged("AbsXDisplay");
        NotifyPropertyChanged("AbsYDisplay");
    }

    public double CanvasX
    {
        get { return _CanvasX; }
        set
        {
            if (value == _CanvasX) { return; }
            _CanvasX = value;
            UpdateEntities();
            NotifyPropertyChanged("CanvasX");
        }
    }

    public double CanvasY
    {
        get { return _CanvasY; }
        set
        {
            if (value == _CanvasY) { return; }
            _CanvasY = value;
            UpdateEntities();
            NotifyPropertyChanged("CanvasY");
        }
    }

    public HoleTypes HoleType
    {
        get { return _HoleType; }
        set
        {
            if (value != _HoleType)
            {
                _HoleType = value;
                UpdateHoleType();
                NotifyPropertyChanged("HoleType");
            }
        }
    }

    public double HoleDia
    {
        get { return _HoleDia; }
        set
        {
            if (value != _HoleDia)
            {
                _HoleDia = value;
                HoleEntity.Width = value;
                HoleEntity.Height = value;
                UpdateHoleType(); 
                NotifyPropertyChanged("HoleDia");
            }
        }
    }

    public double StrokeThickness
    {
        get { return _StrokeThickness; }
        //Setting this StrokeThickness will also set Decorator
        set
        {
            _StrokeThickness = value;
            this.HoleEntity.StrokeThickness = value;
            this.HoleDecorator.StrokeThickness = value;
            NotifyPropertyChanged("StrokeThickness");
        }
    }

    public Brush StrokeColor
    {
        get { return _StrokeColor; }
        //Setting this StrokeThickness will also set Decorator
        set
        {
            _StrokeColor = value;
            this.HoleEntity.Stroke = value;
            this.HoleDecorator.Stroke = value;
            NotifyPropertyChanged("StrokeColor");
        }
    }

    #endregion

    #region Methods

    private void UpdateEntities()
    {
        //-- Update Margins for graph positioning
        HoleEntity.Margin = new Thickness
        (CanvasX - HoleDia / 2, CanvasY - HoleDia / 2, 0, 0);
        HoleDecorator.Margin = new Thickness
        (CanvasX - HoleDecorator.Width / 2, 
         CanvasY - HoleDecorator.Width / 2, 0, 0);
        HoleLabel.Margin = new Thickness
        ((CanvasX * 1.0) - HoleLabel.FontSize * .3, 
         (CanvasY * 1.0) - HoleLabel.FontSize * .6, 0, 0);
    }

    private void UpdateHoleType()
    {
        switch (this.HoleType)
        {
            case HoleTypes.Drilled: //Drilled only
                HoleDecorator.Visibility = Visibility.Collapsed;
                break;
            case HoleTypes.Tapped: // Drilled & Tapped
                HoleDecorator.Visibility = (this.Visible == true) ? 
                Visibility.Visible : Visibility.Collapsed;
                HoleDecorator.Width = HoleEntity.Width * 1.2;
                HoleDecorator.Height = HoleDecorator.Width;
                HoleDecorator.StrokeDashArray = 
                LinePatterns.HiddenLinePattern(1);
                break;
            case HoleTypes.CounterBored: // Drilled & CounterBored
                HoleDecorator.Visibility = (this.Visible == true) ? 
                Visibility.Visible : Visibility.Collapsed;
                HoleDecorator.Width = HoleEntity.Width * 1.5;
                HoleDecorator.Height = HoleDecorator.Width;
                HoleDecorator.StrokeDashArray = null;
                break;
            case HoleTypes.CounterSunk: // Drilled & CounterSunk
                HoleDecorator.Visibility = (this.Visible == true) ? 
                Visibility.Visible : Visibility.Collapsed;
                HoleDecorator.Width = HoleEntity.Width * 1.8;
                HoleDecorator.Height = HoleDecorator.Width;
                HoleDecorator.StrokeDashArray = null;
                break;
        }
        UpdateEntities();
    }

    #endregion

}

We must test it, right?

Tests are validation that the code works as you expect it to work. Writing tests for this class right now will not yield you any real benefit (unless you uncover a bug while writing the tests). The real benefit is when you will have to go back and modify this class. You may be using this class in several different places in your application. Without tests, changes to the class may have unforeseen repercussions. With tests, you can change the class and be confident that you aren't breaking something else if all of your tests pass. Of course, the tests need to be well written and cover all of the class's functionality.

So, how to test it?

At the class level, you will need to write unit tests. There are several unit testing frameworks. I prefer NUnit.

What am I testing for?

You are testing that everything behaves as you expect it to behave. If you give a method X, then you expect Y to be returned. In Gord's answer, he suggested testing that your event actually fires off. This would be a good test.
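
To make that concrete, here is a minimal sketch of what a couple of such tests could look like for your class, assuming NUnit 3 (and noting that, because Hole creates WPF elements, the fixture needs to run on an STA thread):

using System.Threading;
using NUnit.Framework;

[TestFixture]
[Apartment(ApartmentState.STA)] // Hole news up WPF elements, which require an STA thread
public class HoleTests
{
    [Test]
    public void SettingHoleNumber_UpdatesLabelAndRaisesPropertyChanged()
    {
        var hole = new Hole();
        string raised = null;
        hole.PropertyChanged += (s, e) => raised = e.PropertyName;

        hole.HoleNumber = 7;

        Assert.AreEqual("7", hole.HoleLabel.Text); // the label text follows the number
        Assert.AreEqual("HoleNumber", raised);     // the change notification was fired
    }

    [Test]
    public void HidingTheHole_ClearsTheCoordinateDisplay()
    {
        var hole = new Hole { AbsX = 1.5, AbsY = 2.5 };

        hole.Visible = false;

        Assert.AreEqual("", hole.AbsXDisplay);
        Assert.AreEqual("", hole.AbsYDisplay);
    }
}

The fact that the tests need WPF elements and an STA thread at all is itself useful feedback: it hints that the presentation objects (Ellipse, TextBlock) could eventually be pulled out of this class.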

The book, Agile Principles, Patterns, and Practices in C# by Uncle Bob has really helped me understand what and how to test.

Tests will help, if you need to make changes.

According to Feathers (Feathers, Working Effectively with Legacy Code, p. 3) there are four reasons for changes:

  • Adding a feature
  • Fixing a bug
  • Improving design
  • Optimizing resource usage

When there is the need for change, you want to be confident that you don't break anything. To be more precise: You don't want to break any behavior (Hunt, Thomas, Pragmatic Unit Testing in C# with NUnit, p. 31).

With unit testing in place you can do changes with much more confidence, because they would (provided they are programmed properly) capture changes in behavior. That's the benefit of unit tests.

It would be difficult to write unit tests for the class you gave as an example, because unit testing also requires a certain structure in the code under test. One reason I see is that the class is doing too much. Any unit tests you apply to that class will be quite brittle. A minor change may make your unit tests blow up, and you will end up wasting a lot of time fixing problems in your test code instead of your production code.

Reaping the benefits of unit tests requires changing the production code. Just applying unit testing principles without considering this will not give you a positive unit testing experience.

How do you get a positive unit testing experience? Be open-minded about it and learn.

I would recommend you Working Effectively with Legacy Code for an existing code basis (as that piece of code you gave above). For an easy kick start into unit testing try Pragmatic Unit Testing in C# with NUnit. The real eye opener for me was xUnit Test Patterns: Refactoring Test Code.

Good luck on your journey!

What do you do when you're assigned to work on code that's atrocious and antiquated to the point where it's almost incomprehensible?

For example: hardware interface code, mixed with logic, AND user interface code, ALL in the same functions?

We see bad code all the time, but what do you actually do about it?

  • Do you try to refactor it?
  • Try to make it OO if it's not?
  • Or do you try to make some sense of it, make the necessary changes and move on?

The book Working Effectively with Legacy Code discusses the options you can do. In general the rule is not to change code until you have to (to fix a bug or add a feature). The book describes how to make changes when you can't add testing and how to add testing to complex code (which allows more substantial changes).

I am new to mocking so I might have it totally wrong here, but I believe that most mocking frameworks are interface dependent. Unfortunately, most of our code does not use interfaces. Now, the other day I saw a mocking framework in Java that reproduced the byte code of a class/object so as not to call its internal methods, but you could still test that it WAS calling those methods.

My question is: does .Net have any mocking frameworks that can do a similar thing? I am looking for something free and I don't want something that requires methods to be virtual or abstract.

You can use classes instead of interfaces with both Moq and Rhino.Mocks, but the mocked methods must be virtual. Mark Rushakoff's answer on TypeMock is correct (+1).
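
A minimal sketch (with a made-up class, just to show the virtual requirement and the Moq calls involved):

using Moq;

public class PriceCalculator
{
    // Moq generates a dynamic subclass, so only virtual (or abstract) members can be mocked.
    public virtual decimal TaxFor(decimal amount) { return amount * 0.2m; }
}

public class MoqClassMockDemo
{
    public void Demo()
    {
        var mock = new Mock<PriceCalculator>();
        mock.Setup(c => c.TaxFor(100m)).Returns(5m);    // replace the virtual method's behaviour

        decimal tax = mock.Object.TaxFor(100m);         // returns 5, not 20

        mock.Verify(c => c.TaxFor(100m), Times.Once()); // and you can verify it was called
    }
}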

The best option is to refactor your existing code for testability (which may take time). I'd recommend reading Working Effectively with Legacy Code by Michael Feathers.

I'm working with legacy java code, without any Unit-Tests. Many classes need to be refactored in order to work with the project.

Many refactorings can be done with Eclipse, and I am doing some by hand. After some refactoring I review the diff against CVS HEAD, but I can't really feel certain that everything is 100% correct.

The Question: How can I validate that a refactoring is mathematically identical to the previous version? I wish there were a tool, but I also accept "basic human algorithms" as solutions.

I know, "run your JUnit-Tests" is the best answer, but sadly, there aren't any in my project.

Thank you!

I am a bit surprised that so far no one has mentioned the book Working Effectively with Legacy Code, by Michael Feathers. It deals with this exact situation, with lots of practical advice, in various languages. I recommend it to anyone dealing with legacy projects.

I've recently been studying TDD, attended a conference and have dabbled in a few tests, and already I'm 100% sold; I absolutely love TDD.

As a result I've raised this with my seniors and they are prepared to give it a chance, so they have tasked me with coming up with a way to implement TDD in the development of our enterprise product.

The problem is our system has evolved since the days of VB6 to .NET and implements a lot of legacy technology and some far from best practice development techniques i.e. a lot of business logic in the ASP.NET code behind and client script. The largest problem however is how our classes are tightly coupled with database access; properties, methods, constructors - usually has some database access in some form or another.

We use an in-house data access code generator tool that creates sqlDataAdapters that give us all the database access we could ever want, which helps us develop extremely quickly. However, classes in our business layer are very tightly coupled to this data layer - we aren't even close to implementing some form of repository design. This and the issues above have created all sorts of problems for me.

I have tried to develop some unit tests for some existing classes I've already written, but the tests take A LOT longer to run since db access is required; not to mention that, since we use the MS Enterprise Caching framework, I am forced to fake an HttpContext for my tests to run successfully, which isn't practical. Also, I can't see how to use TDD to drive the design of any new classes I write since they have to be so tightly coupled to the database ... help!

Because of the architecture of the system it appears I can't implement TDD without some real hacks, which in my eyes just defeats the aim of TDD and the huge benefits that come with it.

Does anyone have any suggestions how I could implement TDD with the constraints I'm bound to? Or do I need to push the repository design pattern down my seniors throats and tell them we either change our architecture/development methodology or forget about TDD altogether? :)

Thanks

For adding tests to your existing code, you might check out Working Effectively With Legacy Code. "Legacy code" is defined as code that does not have tests in it.

Nitpick: you can't do Test-Driven Design on existing code, but I do realize that you wish to use TDD against new functionality you implement on the existing code base.

The best thing you can do is to gradually introduce seams into the code base, shielding you from the tightly coupled code already in existence.
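
One cheap seam to start with is Feathers' "subclass and override method": pull the cache and database calls into protected virtual methods, then override them in a test-only subclass so that no HttpContext and no database are needed. A rough sketch, with made-up names standing in for your generated data layer and the caching framework calls:

// Made-up names; only the shape of the seam matters.
public class Customer { public int Id; public string Name; }

public class CustomerLookup
{
    public Customer Find(int id)
    {
        var cached = GetFromCache("customer:" + id);
        if (cached != null) return cached;
        return LoadFromDatabase(id);
    }

    // The seams: production code goes through the caching framework and the generated
    // sqlDataAdapter layer here; a test subclass overrides these and touches neither.
    protected virtual Customer GetFromCache(string key) { /* caching framework call */ return null; }
    protected virtual Customer LoadFromDatabase(int id) { /* sqlDataAdapter call */ return null; }
}

// In the test project:
public class TestableCustomerLookup : CustomerLookup
{
    public Customer Canned;
    protected override Customer GetFromCache(string key) { return null; } // simulate a cache miss
    protected override Customer LoadFromDatabase(int id) { return Canned; }
}

It isn't pretty, but it gets tests around the behaviour first; once they exist, you can refactor towards properly injected repositories without flying blind.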

The book Working Effectively with Legacy Code contains much good advice on how to turn a legacy application into a testable application.

At work we've found our test suite has gotten to the point where it's too slow to run repeatedly, which I really don't like. It's at least 5 minutes over the entire suite, and over 3 minutes for just the back-end data object tests.

At the moment, we have a single database server with a live schema and a _test schema. When a test runs, it first runs an SQL script which says how to populate the test database (and clear any old data that might get in the way). This happens for almost all tests. From what I can see, this is the biggest bottleneck in our tests - I have just profiled one test and it takes about 800ms to setup the database, and then each subsequent test runs in about 10ms.

I've been trying to find out some solutions, and here's what I've found so far:

  • Have the test schema populated once, and rollback changes at the end of each test.

    This seems to be the easiest solution, but it does mean we're going to have to add some special case stuff to test things that are dependent on a rollback (ie, error handling tests).

  • Mock the database where possible

    We would set up the database for the data object being tested, but mock anything it depends on. To me, this doesn't seem brilliant for 2 reasons. Firstly, when we set the database up, we still (usually) end up with many more rows due to foreign-key dependencies. And secondly, most data object models don't really interact with other ones, they just do JOINs.

  • Run the same system, but use dumps and RAMFS

    Instead of running a big SQL query we would instead load a database dump. The test server would run on an RAMFS partition, and hopefully bring some speed benefits.

    I can't test this though because I'm on OSX and from what I can see, there isn't ramfs support.

There are some other options around like using SQLite, but this isn't an option for us as we depend on some PostgreSQL specific extensions.

Halp! :)

In Working Effectively with Legacy Code, Michael Feathers writes (pg. 10)

Unit tests run fast. If they don't run fast, they aren't unit tests.

Other kinds of tests often masquerade as unit tests. A test is not a unit test if:

  1. It talks to a database.
  2. It communicates across a network.
  3. It touches the file system.
  4. You have to do special things to your environment (such as editing configuration files) to run it.

Tests that do these things aren't bad. Often they are worth writing, and you generally will write them in unit test harnesses. However, it is important to be able to separate them from true unit tests so that you can keep a set of tests that you can run fast whenever you make changes.

If you don't keep your unit tests fast, they lose value because developers won't run them all the time. To be specific, Feathers defines a slow unit test as one that takes a tenth of a second or longer to execute.

Keep your integration tests that actually talk to the database, touch the filesystem, and so on in their own test suites separate from your unit tests. These still need to run as frequently as practicable to keep the feedback loop short, but you can get away with running them, say, only a few times a day.

Don't shelve your integration tests and forget about them! Automate their execution and reporting of results. If you're using a continuous integration server, add another project that does nothing but periodically run the tests.

In your unit tests, use mocks or fakes for the database layer. Having to repeat the same work will get tedious, and the desire to avoid it will tend to improve your design by concentrating database access in a few classes and push your logic into a domain model, which is where the meat of what you want to test is.
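
Sketched in C# purely for concreteness (the idea is language-neutral, and all the names here are invented): put the database access behind a narrow interface, unit test the domain logic against an in-memory fake, and exercise the real implementation only in the slower integration suite.

using System.Collections.Generic;

// A narrow interface that concentrates database access in one place.
public interface IMessageStore
{
    int CountForUser(int userId);
}

// The real implementation (talking to PostgreSQL) is covered by the integration suite only.

// In-memory fake: no schema setup, so tests that use it stay in the millisecond range.
public class FakeMessageStore : IMessageStore
{
    private readonly Dictionary<int, int> _counts = new Dictionary<int, int>();
    public void SetCount(int userId, int count) { _counts[userId] = count; }
    public int CountForUser(int userId) { return _counts.ContainsKey(userId) ? _counts[userId] : 0; }
}

// The domain logic you actually want to test depends only on the interface.
public class QuotaPolicy
{
    private readonly IMessageStore _store;
    public QuotaPolicy(IMessageStore store) { _store = store; }
    public bool IsOverQuota(int userId, int limit) { return _store.CountForUser(userId) > limit; }
}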

How would you begin improving on a really bad system?

Let me explain what I mean before you recommend creating unit tests and refactoring. I could use those techniques but that would be pointless in this case.

Actually the system is so broken it doesn't do what it needs to do.

For example the system should count how many messages it sends. It mostly works but in some cases it "forgets" to increase the value of the message counter. The problem is that so many other modules with their own workarounds build upon this counter that if I correct the counter the system as a whole would become worse than it is currently. The solution could be to modify all the modules and remove their own corrections, but with 150+ modules that would require so much coordination that I can not afford it.

Even worse, there are some problems that has workarounds not in the system itself, but in people's head. For example the system can not represent more than four related messages in one message group. Some services would require five messages grouped together. The accounting department knows about this limitation and every time they count the messages for these services, they count the message groups and multiply it by 5/4 to get the correct number of the messages. There is absolutely no documentation about these deviations and nobody knows how many such things are present in the system now.

So how would you begin working on improving this system? What strategy would you follow?

A few additional things: I'm a one-man army working on this, so it is not an acceptable answer to hire enough people and redesign/refactor the system. And in a few weeks or months I really should show some visible progress, so it is not an option either to do the refactoring myself over a couple of years.

Some technical details: the system is written in Java and PHP but I don't think that really matters. There are two databases behind it, an Oracle and a PostgreSQL one. Besides the flaws mentioned before, the code itself smells too; it is really badly written and documented.

Additional info:

The counter issue is not a synchronization problem. The counter++ statements are added to some modules, and are not added to some other modules. A quick and dirty fix is to add them where they are missing. The long solution is to make it kind of an aspect for the modules that need it, making it impossible to forget later. I have no problem with fixing things like this, but if I made this change I would break over 10 other modules.

Update:

I accepted Greg D's answer. Even though I like Adam Bellaire's more, knowing what would be ideal to know wouldn't really help me. Thanks all for the answers.

This is a whole book that will basically say "unit test and refactor", but with more practical advice on how to do it:

http://www.amazon.com/Working-Effectively-Legacy-Robert-Martin/dp/0131177052

I've got to do some significant development in a large, old, spaghetti-ridden ASP system. I've been away from ASP for a long time, focusing my energies on Rails development.

One basic step I've taken is to refactor pages into subs and functions with meaningful names, so that at least it's easy to understand @ the top of the file what's generally going on.

Is there a worthwhile MVC framework for ASP? Or a best practice at how to at least get business logic out of the views? (I remember doing a lot of includes back in the day -- is that still the way to do it?)

I'd love to get some unit testing going for business logic too, but maybe I'm asking too much?

Update:

There are over 200 ASP scripts in the project, some thousands of lines long ;) UGH!

We may opt for the "big rewrite" but until then, when I'm in changing a page, I want to spend a little extra time cleaning up the spaghetti.

I use ASPUnit for unit testing some of our classic ASP and find it to be helpful. It may be old, but so is ASP. It's simple, but it does work and you can customize or extend it if necessary.

I've also found Working Effectively with Legacy Code by Michael Feathers to be a helpful guide for finding ways to get some of that old code under test.

Include files can help as long as you keep it simple. At one point I tried creating an include for each class and that didn't work out too well. I like having a couple main includes with common business logic, and for complicated pages sometimes an include with logic for each of those pages. I suppose you could do MVC with a similar setup.

Assume I have a function in a code library, and I've found a bug in it:

class Physics
{
    public static Float CalculateDistance(float initialDistance, float initialSpeed, float acceleration, float time)
    {
        //d = d0 + v0t + 1/2*at^2
        return initialDistance + (initialSpeed*time)+ (acceleration*Power(time, 2));
    }
}

Note: The example, and the language, are hypothetical

I cannot guarantee that fixing this code will not break someone.

It is conceivable that there are people who depend on the bug in this code, and that fixing it might cause them to experience an error (I cannot think of a practical way that could happen; perhaps it has something to do with them building a lookup table of distances, or maybe they simply throw an exception if the distance is not the value they expect).

Should i create a 2nd function:

class Physics
{
    public static Float CalculateDistance2(float initialDistance, float initialSpeed, float acceleration, float time) { ... }

    //Deprecated - do not use. Use CalculateDistance2
    public static Float CalculateDistance(float initialDistance, float initialSpeed, float acceleration, float time) { ... }
}

In a language without a way to formally deprecate code, do I just trust everyone to switch over to CalculateDistance2?

It's also sucky, because now the ideally named function (CalculateDistance) is forever lost to a legacy function that probably nobody needs or wants to be using.

Should I fix bugs, or abandon them?

See also

I'm curious whether it would be valuable. I'd like to start using QUnit, but I really don't know where to get started. Actually, I'm not going to lie, I'm new to testing in general, not just with JS.

I'm hoping to get some tips on how I would start using unit testing with an app that already has a large amount of JavaScript (OK, so about 500 lines, not huge, but enough to make me wonder if I have regressions that go unnoticed). How would you recommend getting started, and where would I write my tests?

(For example, it's a Rails app; where is a logical place to keep my JS tests? It would be cool if they could go in the /test directory, but it's outside the public directory and thus not possible... err, is it?)

One of the best guides you can find on incorporating testing into old code is Working Effectively with Legacy Code. In your case, you don't have a large amount of code to worry about. Just start putting tests in where you can, and think about how you could structure your code to allow for tests more easily in general.

I've inherited a legacy web application that has no unit tests in it. I'd like to add some, but am at a loss of where to start. Should I add them to old code? Or just new code going forward? What if that code interacts with legacy code? What would you suggest?

I would suggest getting a copy of Working Effectively with Legacy Code.

We went through this book in a study group. It was painful, but useful.

Topics include:

  • Understanding the mechanics of software change: adding features, fixing bugs, improving design, optimizing performance
  • Getting legacy code into a test harness
  • Writing tests that protect you against introducing new problems
  • Techniques that can be used with any language or platform-with examples in Java, C++, C, and C#
  • Accurately identifying where code changes need to be made
  • Coping with legacy systems that aren't object-oriented
  • Handling applications that don't seem to have any structure

You can see a short intro to this at http://www.objectmentor.com/resources/articles/WorkingEffectivelyWithLegacyCode.pdf

This is probably a broad question, not quite SO style, but I'd still like to get some hints or guidelines if possible.

I've been looking through some legacy code and found a part of it that has methods with exceptions nested 3 or 4 levels down.
Is this considered to be normal practice, or should one avoid such a code style where possible? If it should be avoided, what are the negative effects besides the increased cost of exception handling and decreased readability? Are there common ways of refactoring the code to avoid this?

About handling legacy code, I would recommend you have a look at the book covering the topic: http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052 You don't even have to go through the whole book, just look at the things that concern you at the moment.

Also a good book regarding good practices is: http://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882/ref=sr_1_1?s=books&ie=UTF8&qid=1356340809&sr=1-1&keywords=clean+code

The best approach when handling nested exceptions is refactoring the code and using runtime instead of checked exceptions, and handling those where needed. This way the code is more readable and easier to maintain.

I don't know, I've been told that the previous developers did fine picking it up and heading straight into coding with no major problems. I wonder if I am doing it wrong by asking my manager for some brief meetings with some senior programmers here. Is it better to be cautious and finish this time-sensitive tracker the long way, or rush it to meet the deadline?

Just as a side note, the previous programmers who maintained this app are all gone after less than a year in the company. Don't know if there's any relation there.

There was a thread about this on Slashdot a year ago. Amidst the usual Slashdot cruft, there are some good answers; maybe someone can extract them here.

Some good ones are stepping through the program with a debugger, Doxygen (of course) (and related tools like ctags/etags/GNU Global), giving up, and a couple of books about exactly this topic: Working Effectively with Legacy Code by Michael Feathers and Code Reading: The Open Source Perspective by Diomidis Spinellis.

And I personally recommend reading The P.G. Wodehouse Method Of Refactoring; if nothing it's at least a fun read!

I have to audit a large Java/J2EE web application that has evolved over several years. It's been written by some other company, not the one I'm working for. In its current state it has become hard to evolve and maintain; new functionalities are hard to add and often lead to bugs that sometimes show up in production. There seems to be some copy/pasted code, which has resulted in code duplication. The current app is some kind of online shopping with some CMS-like content here and there. It's mostly Struts and some Spring in newer parts of the code, maybe some EJBs thrown in for good measure. There are some unit tests available, but not a lot of them. These are things I've been told; I haven't seen the actual code yet.

My company will make a proposition to rewrite parts of this app in order to reduce complexity, improve quality and modularity, and make it possible to add new functionalities more easily without regressions. Before making any commitment, they would like to have some kind of appreciation of the quality of the existing code and to assess how much of it can be reused, in order to have more than a guess at what will have to be done - full rewrite or partial rewrite.

The catch is that I'll have to do this in a very short period (a couple of days), so I'm trying to work out a plan for what can be done in such a short time. What I'm thinking is:

  • check out "basic" things - exceptions treatment, logging
  • check out the level of layering ( views, controllers, dao layer )
  • measure the actual coverage of the unit tests
  • maybe run some Checkstyle, Findbugs and PMD over the projects
  • ...

So the actual question is what other things should I take into account/check/measure/etc ?

I'm not sure what kind of numbers I could get out of this, or if they would really mean something. I have the feeling that what management is asking for is kind of the wrong approach, so the second question would be: does anyone have a better idea?

I'll appreciate any idea, suggestion, comment on this.

Edit: I'll be adding two dead code detectors to the mix : UCD and DCD

I like your list quite a lot. I think you have an excellent plan of attack to start.

I'd look with an eye to standardizing on either Spring or EJB 3.0 but not both.

I haven't read it myself, but I wonder if Michael Feathers' book "Working Effectively With Legacy Code" has any good ideas?

UPDATE:

Maybe you can help things by putting them on a automated build and continuous integration - Cruise Control, Hudson, or Team City. If you have to do any refactoring it'll help.

I'm a C++ developer and when it comes to testing, it's easy to test a class by injecting dependencies, overriding member functions, and so on, so that you can test edge cases easily. However, in C, you can't use those wonderful features. I'm finding it hard to add unit tests to code because of some of the 'standard' ways that C code is written. What are the best ways to tackle the following:

Passing around a large 'context' struct pointer:

void some_func( global_context_t *ctx, .... )
{
  /* lots of code, depending on the state of context */
}

No easy way to test failure on dependent functions:

void some_func( .... )
{
  if (!get_network_state() && !some_other_func()) {
    do_something_func();
    ....
  }
  ...
}

Functions with lots of parameters:

void some_func( global_context_t *, int i, int j, other_struct_t *t, out_param_t **out, ...)
{
  /* hundreds and hundreds of lines of code */
}

Static or hidden functions:

static void foo( ... )
{
  /* some code */
} 

void some_public_func( ... }
{
  /* call static functions */
  foo( ... );
}

In general, I agree with Wes's answer - it is going to be much harder to add tests to code that isn't written with tests in mind. There's nothing inherent in C that makes it impossible to test - but, because C doesn't force you to write in a particular style, it's also very easy to write C code that is difficult to test.

In my opinion, writing code with tests in mind will encourage shorter functions, with few arguments, which helps alleviate some of the pain in your examples.

First, you'll need to pick a unit testing framework. There are a lot of examples in this question (though sadly a lot of the answers are C++ frameworks - I would advise against using C++ to test C).

I personally use TestDept, because it is simple to use, lightweight, and allows stubbing. However, I don't think it is very widely used yet. If you're looking for a more popular framework, many people recommend Check - which is great if you use automake.

Here are some specific answers for your use cases:

Passing around a large 'context' struct pointer

For this case, you can build an instance of the struct with the preconditions manually set, then check the status of the struct after the function has run. With short functions, each test will be fairly straightforward.

No easy way to test failure on dependent functions

I think this is one of the biggest hurdles with unit testing C. I've had success using TestDept, which allows run time stubbing of dependent functions. This is great for breaking up tightly coupled code. Here's an example from their documentation:

void test_stringify_cannot_malloc_returns_sane_result() {
  replace_function(&malloc, &always_failing_malloc);
  char *h = stringify('h');
  assert_string_equals("cannot_stringify", h);
}

Depending on your target environment, this may or may not work for you. See their documentation for more details.

Functions with lots of parameters

This probably isn't the answer you're looking for, but I would just break these up into smaller functions with fewer parameters. Much much easier to test.

Static or hidden functions

It's not super clean, but I have tested static functions by including the source file directly, enabling calls of static functions. Combined with TestDept for stubbing out anything not under test, this works fairly well.

 #include "implementation.c"

 /* Now I can call foo(), defined static in implementation.c */

A lot of C code is legacy code with few tests - and in those cases, it is generally easier to add integration tests that test large parts of the code first, rather than finely grained unit tests. This allows you to start refactoring the code underneath the integration test to a unit-testable state - though it may or may not be worth the investment, depending on your situation. Of course, you'll want to be able to add unit tests to any new code written during this period, so having a solid framework up and running early is a good idea.

If you are working with legacy code, this book (Working effectively with legacy code by Michael Feathers) is great further reading.

I'm going to refactor certain parts of a huge code base (18000+ Java classes). The goal is to be able to extract lower layers as independent libraries to be reused in other projects that currently use a duplicate of this code base. One part in particular is of interest to be refactored into a framework independent of the business logic. Ultimately I would like the code to have a clean architectural layering.

I've looked at the code with a tool called Structure 101 for java and found lots (!) of architectural layering issues where lower layers are referencing upper layers.

I don't want to simply start messing with the code but try to come up with a reasonable strategy to go about this problem. What things should I keep in mind?

I'm thinking about at least taking small steps. I'm also thinking about having unit tests in place, but that requires creating them, since there are none.

Any thoughts on this?

You should also take a look at Working Effectively with Legacy Code by Michael Feathers:

http://www.amazon.com/Working-Effectively-Legacy-Robert-Martin/dp/0131177052/ref=sr_1_1?ie=UTF8&s=books&qid=1242430219&sr=8-1

I think one of the most important things you can put in place to facilitate this are tests to ensure that everything still works post refactoring/pulling out into separate modules. Add to this by introducing a continuous integration system that runs your tests when you check something in.

I'm really new to mocks and am trying to replace a private field with a mock object. Currently the instance of the private field is created in the constructor. My code looks like...

public class Cache {
    private ISnapshot _latest_snapshot;

    public ISnapshot LatestSnapshot {
        get { return this._latest_snapshot; }
        private set { this._latest_snapshot = value; }
    }

    public Cache() {
        this.LatestSnapshot = new Snapshot();
    }

    public void Freeze(IUpdates Updates) {
        ISnapshot _next = this.LatestSnapshot.CreateNext();
        _next.FreezeFrom(Updates);
        this.LatestSnapshot = _next;
    }

}

What I'm trying to do is create a unit test that asserts ISnapshot.FreezeFrom(IUpdates) is called from within Cache.Freeze(IUpdates). I'm guessing I should replace the private field _latest_snapshot with a mock object (maybe wrong assumption?). How would I go about that while still retaining a parameterless constructor and not resorting to making LatestSnapshot's set public?

If I'm totally going about writing the test the wrong way then please do point out as well.

The actual implementation of ISnapshot.FreezeFrom itself calls a hierarchy of other methods over a deep object graph, so I'm not too keen on asserting the object graph.

Thanks in advance.

I'm almost citing techniques from "Working Effectively with Legacy Code":

  1. Sub-class your class in a unit test and supersede your private variable with a mock object (by adding a public setter or doing it in the constructor). You probably have to make the variable (or its setter) protected; see the sketch after this list.
  2. Make a protected getter for this private variable, and override it in testing subclass to return a mock object instead of the actual private variable.
  3. Create a protected factory method for creating ISnapshot object, and override it in testing subclass to return an instance of a mock object instead of the real one. This way the constructor will get the right value from the start.
  4. Parameterize the constructor to take an instance of ISnapshot.
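
Here is a rough sketch of option 1 with a hand-rolled spy, assuming NUnit and assuming ISnapshot exposes only the two members your Freeze method uses (adjust the spy if it has more):

using NUnit.Framework;

// Production change assumed here: make the LatestSnapshot setter protected instead of
// private, so a test-only subclass can supersede the snapshot after construction.
public class TestableCache : Cache
{
    public void InstallSnapshot(ISnapshot snapshot) { LatestSnapshot = snapshot; }
}

// Hand-rolled spy that records whether FreezeFrom was called.
public class SnapshotSpy : ISnapshot
{
    public bool FreezeFromCalled { get; private set; }
    public ISnapshot CreateNext() { return this; }
    public void FreezeFrom(IUpdates updates) { FreezeFromCalled = true; }
}

[TestFixture]
public class CacheTests
{
    [Test]
    public void Freeze_DelegatesToFreezeFromOnTheNextSnapshot()
    {
        var spy = new SnapshotSpy();
        var cache = new TestableCache();
        cache.InstallSnapshot(spy);

        cache.Freeze(null); // the spy ignores the updates argument, so null is fine here

        Assert.IsTrue(spy.FreezeFromCalled);
    }
}

If you go for option 3 instead (a protected factory method), beware of the virtual-call-in-constructor pitfall in C#: the override runs before the test subclass's own fields are initialized, so have it return a statically prepared test double rather than an instance field.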

I really like Tornado and I would like to use it with Python 3, though it is written for Python versions 2.5 and 2.6.

Unfortunately it seems the project's source doesn't come with a test suite. If I understand correctly, the WSGI part of it wouldn't be that easy to port as its spec is not ready for Python 3 yet (?), but I am rather interested in Tornado's async features, so WSGI compatibility is not my main concern even if it would be nice.

Basically I would like to know what to look into/pay attention for when trying to port or whether there are already ports/forks already (I could not find any using google or browsing github, though I might have missed something).

Software without a decent test suite is legacy software -- even if it was released yesterday!-) -- so the first important step is to start building a test suite; I recommend Feathers' book in the URL, but you can start with this PDF, which is an essay, also by Feathers, preceding the book and summarizing its core ideas and practices.

Once you do have the start of a test suite, run it with Python 2.6 and a -3 flag to warn you of things 2to3 may stumble on; once those are fixed, it's time to try 2to3 and try the test suite with Python 3. You'll no doubt have to keep beefing up the test suite as you go, and I recommend regularly submitting all the improvements to the upstream Tornado open source project -- those tests will be useful to anybody who needs to maintain or port Tornado, after all, not just to people interested in Python 3, so, with luck, you might gain followers and more and more contributors to the test suite.

I can't believe that people are releasing major open source projects, in 2009!!!, without decent test suites, but I'm trusting you that this is indeed what the Tornadoers have done...

I work in a medium sized team and I run into these painfully large class files on a regular basis. My first tendency is to go at them with a knife, but that usually just makes matters worse and puts me into a bad state of mind.

For example, imagine you were just given a windows service to work on. Now there is a bug in this service and you need to figure out what the service does before you can have any hope of fixing it. You open the service up and see that someone decided to just use one file for everything. Start method is in there, Stop method, Timers, all the handling and functionality. I am talking thousands of lines of code. Methods under a hundred lines of code are rare.

Now, assuming you cannot rewrite the entire class and these god classes are just going to keep popping up, what is the best way to deal with them? Where do you start? What do you try to accomplish first? How do you deal with this kind of thing and not just want to get all stabby?

If you have some strategy just to keep your temper in check, that is welcome as well.

Tips Thus Far:

  1. Establish test coverage
  2. Code folding
  3. Reorganize existing methods
  4. Document behavior as discovered
  5. Aim for incremental improvement

Edit:

Charles Conway recommend a podcast which turned out to be very helpful. link

Michael Feathers (the guy in the podcast) begins with the premise that we are too afraid to simply take a project out of source control, just play with it directly, and then throw away the changes. I can say that I am guilty of this.

He essentially said to take the item you want to learn more about and just start pulling it apart. Discover its dependencies and then break them. Follow it through everywhere it goes.

Great tip: Take the large class that is used elsewhere and have it implement an empty interface. Then take the code using the class and have it reference the interface instead. This will give you a complete list of all the dependencies on that large class in your code.

Ouch! Sounds like the place I used to work.

Take a look at Working Effectively with Legacy Code. It has some gems on how to deal with atrocious code.

DotNetRocks recently did a show on working with legacy code. There is no magic pill that is going to make it work.

The best advice I've heard is start incrementally wrapping the code in tests.

So - management is looking to do a push to move towards unit-testing in all the applications going forward - and eventually get into full TDD/continuous integration/automated build mode (I hope). At this point, however, we are just concerned with getting everyone who develops apps to use unit testing going forward. I'd like to just start with the basics.

I won't lie - I'm by no means an expert in unit-testing, but I do have a good enough understanding to start the initiative with the basics and allow us to grow together as a team. I'd really love to get some comments & criticism from all you experts on my plan of attack on this thing. It's a team of about 10 developers in a small shop, which makes for a great opportunity to move forward with agile development methodologies and best practices.

First off - the team consists of mainly mid-level developers with a couple of junior devs and one senior, all with minimal to no exposure to unit testing. The training will be a semi-monthly meeting for about 30-60 minutes each time (it will probably wind up running an hour, I'd guess, and maybe we'll have them more often). We will continue these meetings until it makes sense to stop them to allow others to catch up with their own 'homework' and experience - but the push will always be on.

Anyway - here is the lesson plan I have come up with. Well, the first two lessons at least. Any advice from you experts out there on the actual content or structure of the lessons, etc., would be great. Comments & criticism greatly appreciated. Thanks very much.

I apologize if this is 'too much' to post in here or read through. I think this would be a great thread for SO users looking to get into unit testing in the first place as well. Perhaps you could just skip to the 'lesson plans' section - thanks again everyone.

CLIFF NOTES - I realize this post is incredibly long and ugly, so here are the cliff notes - Lesson 1 will be 'hello world unit tests' - Lesson 2 will be opening the solution to my most recent application and showing how to apply each 'hello world' example in real life... Thanks so much everyone for the feedback you've given me so far... just wanted to highlight the fact that lesson 2 is going to have real-life production unit tests in it, since many suggested I do that when it was my plan from the beginning =)

Unit Testing Lesson Plan

Overview

Why unit test? Seems like a bunch of extra work - so why do it?

• Become the master of your own destiny. Most of our users do not do true UATs, and unfortunately tend to do their testing once in production. With unit-tests, we greatly decrease risk associated with this, especially when we create enough test data and take into account as many top level inputs as we possibly can. While not being a ‘silver bullet’ that prevents all bugs – it is your first line of defense – a huge front line, comparable to that of the SB championship Giants.

• Unit-Testing enforces good design and architecture practices. It is ‘the violent psychopath who maintains your code and knows where you live’, so to speak. You simply can’t write poor-quality code that is well unit-tested.

• How many times have you not refactored smelly code because you were too scared of breaking something? Automated testing removes this fear and makes refactoring much easier, in turn making code more readable and easier to maintain.

• Bottom line – maintenance becomes much easier and cheaper. The time spent writing unit tests might be costly now – but the time it saves you down the road has been proven time and time again to be far more valuable. This is the #1 reason to automate testing your code. It gives us confidence that allows us to take on more ambitious changes to systems that we might have otherwise had to reduce requirements on, or maybe even not take on at all.

Terminology Review

• Unit testing - testing the lowest level, single unit of work. E.G. – test all possible code paths that a single function can flow through.

• Integration testing - testing how your units work together. E.g. – run a ‘job’ (or series of function calls) that does a bunch of work, with known inputs - and then query the database at the end and assert the values are what you expect from those known inputs (instead of having to eye-ball a grid on a web page somewhere, e.g. doing a functional test).

• Fakes – a fake is an object whose only purpose is to be used in your tests. It lets you avoid exercising code that you do not want to test. Instead of having to call code that you do not want – like a database call – you use a fake object to ‘fake’ that DB call and perhaps read the data from an XML/Excel file or a mocking framework.

o Mock – a type of fake that you make assert statements against.

o Stub – a type of fake used as placeholder code, so you can skip the database call, but that you do not assert against.
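A tiny hand-rolled illustration of the difference (hypothetical types, no mocking framework involved):

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface ICustomerRepository
{
    Customer GetById(int id);
    void Save(Customer customer);
}

// Stub: a placeholder so the code under test can run without a real database.
public class StubCustomerRepository : ICustomerRepository
{
    public Customer GetById(int id) { return new Customer { Id = id, Name = "Test" }; }
    public void Save(Customer customer) { /* deliberately does nothing */ }
}

// Mock: records what happened so the test can assert against it afterwards.
public class MockCustomerRepository : ICustomerRepository
{
    public int SaveCallCount;
    public Customer GetById(int id) { return new Customer { Id = id }; }
    public void Save(Customer customer) { SaveCallCount++; }
}

The stub only exists so the code under test can execute; the mock is the one you make assertions against (for example, that Save was called exactly once).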

Lessons

Lesson one – Hello Worlds

• Hello World unit test - I will create a ‘hello world’ console application that is unit tested. Will create this application on the fly during the meeting, showing the tools in Visual Studio 2008 (test-project, test tools toolbar, etc.) that we are going to use along the way while explaining what they do. This will have only a single unit-test. (OK, maybe I won’t create it ‘on the fly’ =), have to think about it more). Will also explain the Assert class and its purpose and the general flow of a unit-test.

• Hello World, a bit more complicated. Now our function has different paths/logical branches the code can flow through. We will have ~3 unit tests for this function. This will be a pre-written solution I make before the meeting.

• Hello World, dependency injection. (Not using any DI frameworks.) A pre-written solution that builds off the previous one, this time using dependency injection. Will explain what DI is and show a sample of how it works (see the sketch after this list).

• Hello World, Mock Objects. A pre-written solution that builds off the previous one, this time using our newly added DI code to inject a mock object into our class to show how mocking works. Will use NMock2.0 as this is the only one I have exposure to. Very simple example to just display the use of mock objects. (Perhaps put this one in a separate lesson?).

• Hello World, (non-automated) Integration Test. Building off the previous solution, we create an integration test to show how to test two functions together, or an entire class together (perhaps put this one in a separate lesson?).
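A rough sketch of the dependency-injection example above (MSTest-style, with hypothetical IGreetingSource/Greeter names):

using Microsoft.VisualStudio.TestTools.UnitTesting;

public interface IGreetingSource
{
    string GetGreeting();
}

public class EnglishGreetingSource : IGreetingSource
{
    public string GetGreeting() { return "Hello World"; }
}

public class Greeter
{
    private readonly IGreetingSource _source;

    // The dependency is injected through the constructor...
    public Greeter(IGreetingSource source)
    {
        _source = source;
    }

    public string Greet() { return _source.GetGreeting(); }
}

[TestClass]
public class GreeterTests
{
    // ...so a test can plug in whatever implementation it wants,
    // including a mock object in the next lesson.
    [TestMethod]
    public void Greet_ReturnsGreetingFromInjectedSource()
    {
        var greeter = new Greeter(new EnglishGreetingSource());
        Assert.AreEqual("Hello World", greeter.Greet());
    }
}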

Lesson two – now we know the basics – how to apply this in real life?

• General overview of best practices

o Rule #1 – Single Responsibility Principle. Single Responsibility Principle. Single Responsibility Principle. Facetiously stating that this is the single most important thing to keep in mind while writing your code. A class should have only one purpose; a function should do only one thing. The key word here is ‘unit’ test – and the SRP will keep your classes & functions encapsulated into units.

o Dependency Injection is your second best friend. DI allows you to ‘plug’ behavior into classes you have, at run time. Among other things, this is how we use mocking frameworks to make our bigger classes more easily testable.

o Always think ‘how will I test this’ as you are writing code. If it seems ‘too hard to test’, it is likely that your code is too complicated. Refactor it into more logical units of classes/functions - take that one class that does 5 things and turn it into 5 classes, one of which calls the other 4. Now your code will be much easier to test – and also easier to read and refactor as well.

o Test behavior, not implementation. In a nutshell, this means that for the most part we test only the public functions on our classes. We don’t care about testing the private ones (implementation), because the public ones (behavior) are what our calling code uses. For example... I’m a millionaire software developer and go to the Aston Martin dealership to buy myself a brand new DB9. The sales guy tells me that it can do 0-60 in 3 seconds. How would you test this? Would you lift the engine out and perform diagnostic tests? No... you would take it onto the parkway and do 160 MPH =). This is testing behavior vs. implementation.

• Reviewing a real life unit-tested application. Here we will go over each of the above ‘hello world’ examples – but the real life versions, using my most recent project as an example. I'll open a simple unit test, a more complex one, one that uses DI, and one that uses Mocks (probably coupled to the DI one). This project is fairly small & simple so it is really a perfect fit. This will also include testing the DAL and how to setup a test database to run these tests against.

I think it will be harder to start with test-after type of testing. A major difference compared to test-first is that test-after does not guide you in writing code which is easy to test. Unless you have first done TDD, it will be very hard to write code for which it's easy to write tests.

I recommend starting with TDD/BDD and after getting the hang of that, continue learning how to do test-after on legacy code (also PDF). IMO, doing test-after well is much harder than test-first.

In order to learn TDD, I recommend finishing this tutorial. That should get you started on what kind of tests to write.

The situation: millions of lines of code, more than one hundred developers and frequent defects. We want to avoid repeating defects and we want to improve code design (who doesn't?).

Test Driven Development (first unit test, then code) sounds ideal: write a test case for each function.

But, with so much code written, how can TDD be implemented? Where do you start - with low level functions?

Or are we too late to start TDD?

Start with Working Effectively with Legacy Code.

It's not really TDD if you're starting with legacy code - but all your coding can be TDD. As you tackle a new problem, write a test for it. If you can't, because the legacy classes are too difficult to test, then start writing tests for them, slicing off bits, and covering the bits with tests.

Refactor the Low-Hanging Fruit.

To avoid repeating defects: given an example defect, write a test that demonstrates it. It could be a relatively broad test that just simulates user activity; not yet a unit test. Make sure the test fails. Do your research; figure out why the test is failing. Now - this is important - before fixing the bug, write a unit test that demonstrates the bug. Fix the bug, and now you've got two tests, at least one of them fast, that protect you from regressions.
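As a sketch of that second, unit-level test (NUnit, with an invented InvoiceCalculator and bug number purely for illustration):

using NUnit.Framework;

// Hypothetical class under test.
public class InvoiceCalculator
{
    public decimal Total(decimal amount, decimal discountPercent)
    {
        return amount - (amount * discountPercent / 100m);
    }
}

[TestFixture]
public class InvoiceCalculatorRegressionTests
{
    // Reproduces bug #1234: a 0% discount used to blow up.
    // This test was written, and seen to fail, before the fix went in.
    [Test]
    public void Total_WithZeroDiscount_ReturnsFullAmount()
    {
        var calculator = new InvoiceCalculator();

        decimal total = calculator.Total(100m, 0m);

        Assert.AreEqual(100m, total);
    }
}

The broader test that simulates user activity stays in the suite too; this one just pins the exact unit that was at fault.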

Having just read the first four chapters of Refactoring: Improving the Design of Existing Code, I embarked on my first refactoring and almost immediately came to a roadblock. It stems from the requirement that before you begin refactoring, you should put unit tests around the legacy code. That allows you to be sure your refactoring didn't change what the original code did (only how it did it).

So my first question is this: how do I unit-test a method in legacy code? How can I put a unit test around a 500 line (if I'm lucky) method that doesn't do just one task? It seems to me that I would have to refactor my legacy code just to make it unit-testable.

Does anyone have any experience refactoring using unit tests? And, if so, do you have any practical examples you can share with me?

My second question is somewhat hard to explain. Here's an example: I want to refactor a legacy method that populates an object from a database record. Wouldn't I have to write a unit test that compares an object retrieved using the old method, with an object retrieved using my refactored method? Otherwise, how would I know that my refactored method produces the same results as the old method? If that is true, then how long do I leave the old deprecated method in the source code? Do I just whack it after I test a few different records? Or, do I need to keep it around for a while in case I encounter a bug in my refactored code?

Lastly, since a couple people have asked...the legacy code was originally written in VB6 and then ported to VB.NET with minimal architecture changes.

Good example of theory meeting reality. Unit tests are meant to test a single operation, and many pattern purists insist on Single Responsibility, so we have lovely clean code and tests to go with it. However, in the real (messy) world, code (especially legacy code) does lots of things and has no tests. What this needs is a dose of refactoring to clean up the mess.

My approach is to build tests, using the unit test tools, that test lots of things in a single test. In one test, I may be checking the DB connection is open, changing lots of data, and doing a before/after check on the DB. I inevitably find myself writing helper classes to do the checking, and more often than not those helpers can then be added into the code base, as they have encapsulated emergent behaviour/logic/requirements. I don't mean I have a single huge test; what I do mean is that many tests are doing work which a purist would call an integration test - does such a thing still exist? Also I've found it useful to create a test template and then create many tests from that, to check boundary conditions, complex processing, etc.

BTW which language environment are we talking about? Some languages lend themselves to refactoring better than others.

For instructions on how to refactor legacy code, you might want to read the book Working Effectively with Legacy Code. There's also a short PDF version available here.

I am used to doing some refactorings by introducing compilation errors. For example, if I want to remove a field from my class and make it a parameter to some methods, I usually remove the field first, which causes a compilation error for the class. Then I introduce the parameter to my methods, which breaks callers. And so on. This usually gives me a sense of security. I haven't actually read any books (yet) about refactoring, but I used to think this is a relatively safe way of doing it. But I wonder, is it really safe? Or is it a bad way of doing things?

When you are ready to read books on the subject, I recommend Michael Feathers' "Working Effectively with Legacy Code". (Added by non-author: also Fowler's classic book "Refactoring" - and the Refactoring web site may be useful.)

He talks about identifying the characteristics of the code you are working with before you make a change, and doing what he calls scratch refactoring - that is, refactoring to discover the characteristics of the code and then throwing the results away.

What you are doing is using the compiler as an auto-test. It will test that your code compiles, but not whether the behaviour has changed due to your refactoring or whether there were any side effects.

Consider this

class myClass {
     void megaMethod() 
     {
         int a, b, c, x, y, z;
         //lots of lines of code
         z = mySideEffect(x) + y;
         //lots more lines of code 
         a = b + c;
     }
}

you could refactor out the addition:

class myClass {
     void megaMethod() 
     {
         int a, b, c, x, y, z;
         //lots of lines of code
         z = addition(x, y);
         //lots more lines of code
         a = addition(b, c);  
     }

     int addition(int a, int b)
     {
          return mySideEffect(a) + b;
     }
}

This would compile, but the second addition would now be wrong, because it invokes the side-effecting method. Further tests would be needed beyond compilation alone.

This sounds similar to an absolutely standard method used in Test-Driven Development: write the test referring to a nonexistent class, so that the first step to make the test pass is to add the class, then the methods, and so on. See Beck's book for exhaustive Java examples.

Your method for refactoring sounds dangerous because you don't have any tests for safety (or at least you don't mention that you have any). You might create compiling code that doesn't actually do what you want, or breaks other parts of your application.

I'd suggest you add a simple rule to your practice: make noncompiling changes only in unit test code. That way you are sure to have at least a local test for each modification, and you are recording the intent of the modification in your test before making it.
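In C# terms (hypothetical names, NUnit just for illustration), that rule plays out like this; the only code that ever fails to compile is the new test:

using NUnit.Framework;

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void TwoItemsAtTenEach_TotalsTwenty()
    {
        // At the moment this is first written, PriceCalculator does not exist yet,
        // so this test is the only thing that fails to compile.
        var calculator = new PriceCalculator();
        Assert.AreEqual(20m, calculator.Total(2, 10m));
    }
}

// Step two: add the smallest class that makes the test compile, then make it pass.
public class PriceCalculator
{
    public decimal Total(int quantity, decimal unitPrice)
    {
        return quantity * unitPrice;
    }
}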

By the way, Eclipse makes this "fail, stub, write" method absurdly easy in Java: each nonexistent object is marked for you, and Ctrl-1 plus a menu choice tells Eclipse to write a (compilable) stub for you! I'd be interested to know if other languages and IDEs provide similar support.

I have a number of classes I've been asked to add some unit tests to with Rhino Mocks, and I'm having some issues.

First off, I know RhinoMocks doesn't allow for the mocking of Static members. I'm looking for what options I have (besides using TypeMock).

An example of the class I have is similar to the below:

class Example1 : ISomeInterface
{
    private static ISomeInterface _instance;

    private Example1()
    {
        // set properties via private static methods
    }

    static Example1()
    {
        _instance = new Example1();
    }

    public static ISomeInterface Instance
    {
        get { return _instance; }
    }

    // Instance properties 

    // Other Instance Properties that represent objects that follow a similar pattern.
}

So when I call the above class, it looks something like this...

Example1.Instance.SomeObject.GoDownARabbitHole();

Is there a way for me to mock out the SomeObject.GoDownARabbitHole() in this situation or mock out the Instance?

Example from Book: Working Effectively with Legacy Code

To run code containing singletons in a test harness, we have to relax the singleton property. Here’s how we do it. The first step is to add a new static method to the singleton class. The method allows us to replace the static instance in the singleton. We’ll call it setTestingInstance.

public class PermitRepository
{
    private static PermitRepository instance = null;

    private PermitRepository() {}

    public static void setTestingInstance(PermitRepository newInstance)
    {
        instance = newInstance;
    }

    public static PermitRepository getInstance()
    {
        if (instance == null) {
            instance = new PermitRepository();
        }
        return instance;
    }

    public Permit findAssociatedPermit(PermitNotice notice) {
        ...
    }
    ...
}

Now that we have that setter, we can create a testing instance of a PermitRepository and set it. We’d like to write code like this in our test setup:

public void setUp() {
    PermitRepository repository = new PermitRepository();
    ...
    // add permits to the repository here
    ...
    PermitRepository.setTestingInstance(repository);
}
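Translated to the Example1 class from the question, the same technique might look roughly like this in C# (a sketch only; it keeps the static Instance property but adds a testing seam, and the Rhino Mocks line at the end assumes the 3.5 AAA syntax):

class Example1 : ISomeInterface
{
    private static ISomeInterface _instance = new Example1();

    private Example1() { }

    public static ISomeInterface Instance
    {
        get { return _instance; }
    }

    // Test-only seam: lets a test replace the singleton with a stub or mock.
    public static void SetTestingInstance(ISomeInterface testInstance)
    {
        _instance = testInstance;
    }

    public static void ResetInstance()
    {
        _instance = new Example1();
    }
}

// In a test:
// Example1.SetTestingInstance(MockRepository.GenerateStub<ISomeInterface>());

Then Example1.Instance returns whatever your test sets up, without touching the production call sites.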

In a few places in our code we use #if DEBUG blocks to simplify development. Things like:

#if DEBUG
   serverIP = localhost;
#else
   serverIP = GetSetting();
#endif

or

private bool isLicensed()
{
#if DEBUG
   return true;
#endif

   return CheckSetting();
}

There are also a few places where we make cosmetic changes like this:

#if DEBUG
   background = humorousImage.jpg
#else
   background = standardColor
#endif

Is it dangerous to depend on #if debug to make development easier? If it is, what is a valid use of #if debug?

It is a really bad idea. If you're trying to catch a production bug, your debug clauses will certainly trip you up at some stage or another. You want to be as close as possible to the code that runs in production. In your example you'll never be able to find a bug in CheckSetting().

By the looks of things your code is too tightly coupled. What you want to be doing is to make modules/classes less dependent on each other and practice Test Driven Development. Also have a look at Inversion of Control (aka Dependency Injection).

Working Effectively with Legacy Code has some useful insights on how to introduce TDD. It also has some really good pointers on how to do TDD in various scenarios where it might be hard to test things.
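A rough sketch of where that leads (invented names): push the environment decision behind an interface instead of a compile-time #if, so both paths stay compiled and testable.

public interface IEnvironmentSettings
{
    string ServerIp { get; }
    bool IsLicensed { get; }
}

// Production implementation; GetSetting/CheckSetting stand in for the
// real lookups from the question.
public class ProductionSettings : IEnvironmentSettings
{
    public string ServerIp { get { return GetSetting("ServerIP"); } }
    public bool IsLicensed { get { return CheckSetting(); } }

    private string GetSetting(string key) { /* read from config */ return "10.0.0.1"; }
    private bool CheckSetting() { return true; }
}

// Development implementation replaces the #if DEBUG branches.
public class DevelopmentSettings : IEnvironmentSettings
{
    public string ServerIp { get { return "localhost"; } }
    public bool IsLicensed { get { return true; } }
}

Whatever wires the application together decides which implementation to use, and tests can pass in a fake IEnvironmentSettings, so the production branch never disappears from the build.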

I would like to know how to implement unit testing in an existing (quite large) application using visual studio 2008 (.net 2.0).

I understand that developing unit tests for the existing/legacy code is not realistic but I would like to have tests for code moving forward.

I have found plenty of examples on how to write tests for code but nothing on how to set it up from scratch on an existing project, and how to integrate it into the development cycle.

I highly recommend reading this book: Working Effectively with Legacy Code if you want to make unit test for the existing code. It's also a good book on best-practices for unit tests in general.

It is possible to do unit testing on existing projects, but you will have to make some adjustments here and there to make the code testable. Too many dependencies are often the problem.

EDIT (after your comment) If you really want to embed unit testing into your development cycle, then you should go for TDD (Test Driven Development). The aim here is to write your unit tests first, so you have a good understanding of what your classes will do. Of course these tests will fail at first, but the target is to get them working one by one. Do a Google search on TDD; there's plenty of information out there.

Say you have a program that currently functions the way it is supposed to. The application has very poor code behind it, eats up a lot of memory, is unscalable and would take major rewriting to implement any changes in functionality.

At what point does refactoring become less logical than a total rebuild?

Joel wrote a nice essay about this very topic:

Things You Should Never Do, Part 1

The key lesson I got from this is that although the old code is horrible, hurts your eyes and your aesthetic sense, there's a pretty good chance that a lot of that code is patching undocumented errors and problems. That is, it has a lot of domain knowledge embedded in it, and it will be difficult or impossible for you to replicate it. You'll constantly be hitting against bugs of omission.

A book I found immensely useful is Working Effectively With Legacy Code by Michael C. Feathers. It offers strategies and methods for approaching even truly ugly legacy code.

We are in the initial phase of trying to implement TDD. I demo'd the Visual Studio Team System code coverage / TDD tools and the team is excited at the possibilities. Currently we use DevPartner for code coverage, but we want to eliminate it because it's expensive. We have very limited experience in TDD and want to make sure we don't go in a wrong direction. Currently we are using SourceSafe for source control but will be migrating to Team System in about a year.

I can tell you our applications are very data centric. We have about 900 tables, 6000 stored procedures, and about 45GB of data. We have lots of calculations that are based upon user data and different rates in the system. Also a lot of our code is based upon time (calculate interest to the current date). Some of these calculations are very complex and very intensive (only a few people know the details for some of them).

We want to implement TDD to solve QA issues. A lot of developers are forced to fix bugs in areas they are not familiar with and end up breaking something. There are also areas that developers are almost afraid to touch because the code is used by everything in the system. We want to mitigate this problem.

I'm afraid since our code is so data centric that implementing TDD might be a little bit more complex than most systems. I'm trying to come up with a gameplan that I can present to management but I want to hopefully not get caught up in some of the TDD beginner mistakes. Also if tools / facilities in Team System make TDD more complete then that would be nice but we don't want to wait for Team System to get started.

The first question we are asking is: should we just start with the tools in Visual Studio? I have read posts where people complain about the built-in tools in Visual Studio (you need to create a separate project to hold your tests), but the one thing about the tools in Visual Studio is that they are free and the integration is good. If we decide to go the other route and use something like xUnit, MbUnit, or NUnit, then we are most likely going to have some, maybe significant, costs:

1) If we want IDE Integration (failed to mention most of our code is VB.Net)
---TestDriven.Net or Resharper or ?????

2) If we want code coverage
---NCover (Seems pretty pricey for its functionality)

Also I've seen some pretty cool functionality demoed in visual studio 2010. Like the ability to do input testing (data entered on a form) or the ability to record what the user has done and then feed that into your unit test to reproduce a problem.

Also, although I don't quite grasp the mocking object concept yet, I know a lot of people feel it's a must. The question is: can all the mocking frameworks plug into Visual Studio's version of TDD (MSTest)?

I have advised management that we should probably just add regression testing going forward (new development or found bugs) but not try to go through all our code and put in unit tests. It would be WAY too big of a project.

Anyway, I would appreciate anyone's help.

With regard to getting started, I would also recommend reading Fowler's Refactoring. The first chapter gives a good feel for what it means to introduce tests then safely introduce change (although the emphasis here is on behaviour preserving change). Furthermore, this talk describes some practices which can help improve the testability of your code. Misko Hevery also has this guide to writing testable code, which summarises the talk.

From your description, it sounds as though you want to test the core parts of your system - the parts with a lot of dependencies where changes are scary. Depending on the degree to which data access is decoupled from business logic, you will probably need to refactor towards a state where the code is more testable - where it is easy and fast to instantiate sets of test data in order to verify the logic in isolation. This may be a big job, and may not be worth the effort if changes here are infrequent and the code base is well proven.

My advice would be to be pragmatic, and use the experience of the team to find areas where it is easiest to add tests that add value. I think having many focussed unit tests is the best way to drive quality, but it is probably easier to test the code at a higher level using integration or scenario tests, certainly in the beginning. This way you can detect big failures in your core systems early. Be clear on what your tests cover. Scenario tests will cover a lot of code, but probably won't surface subtle bugs.

Moving from SourceSafe to Team System is a big step, how big depends on how much you want to do in Team System. I think you can get a lot of value from using Visual Studio's built in test framework. For example, as a first step you could implement some basic test suites for the core system/core use cases. Developers can run these themselves in Visual Studio as they work and prior to check in. These suites can be expanded gradually over time. Later when you get TFS, you can look at running these suites on check in and as part of an automated build process. You can follow a similar path regardless of the specific tools.

Be clear from the outset that there is overhead in maintaining test code, and having well designed tests can pay dividends. I have seen situations where tests are copy pasted then edited slightly etc. Test code duplication like this can lead to an explosion in the number of lines of test code you need to maintain when effecting a small product code change. This kind of hassle can erode the perceived benefit of having the tests.

Visual Studio 2008 will only show you block coverage, although the code analysis will also give other metrics like cyclomatic complexity per assembly/class/method. Getting high block coverage with your tests is certainly important, and allows you to easily identify areas of the system that are totally untested.

However, I think it is important to remember that high block coverage is only a simple measurement of the effectiveness of your tests. For example, say you write a class to purge a file archive and keep the 5 newest files. Then you write a test case which checks that if you start with 10 files and then run the purger, you are left with 5. An implementation which passes that test could delete the 5 newest files instead, yet it could easily give 100% coverage. This test only verifies one of the requirements.
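For instance, a sketch of a stronger test (NUnit; ArchiveFile and ArchivePurger are invented stand-ins for the class described above) would assert which files survive, not just how many:

using System;
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

public class ArchiveFile
{
    public string Name;
    public DateTime Modified;
    public ArchiveFile(string name, DateTime modified) { Name = name; Modified = modified; }
}

public class ArchivePurger
{
    private readonly int _keep;
    public ArchivePurger(int keep) { _keep = keep; }

    // Keeps the newest _keep files; everything else would be deleted.
    public List<ArchiveFile> Purge(List<ArchiveFile> files)
    {
        return files.OrderByDescending(f => f.Modified).Take(_keep).ToList();
    }
}

[TestFixture]
public class ArchivePurgerTests
{
    [Test]
    public void Purge_KeepsTheFiveNewestFiles()
    {
        var files = new List<ArchiveFile>();
        for (int i = 0; i < 10; i++)
        {
            files.Add(new ArchiveFile("file" + i, new DateTime(2009, 1, i + 1)));
        }

        var survivors = new ArchivePurger(5).Purge(files);

        Assert.AreEqual(5, survivors.Count);
        for (int i = 5; i < 10; i++)
        {
            string expected = "file" + i;   // the five newest are file5..file9
            Assert.IsTrue(survivors.Exists(f => f.Name == expected));
        }
    }
}

The coverage number is the same either way; the extra assertions are what tie the test back to the requirement.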

First thing to do is get this book:

Working Effectively with Legacy Code

For such a large project, read it and internalize it. TDD on a data driven application is hard enough. On a legacy one you need some serious planning and effort. Worth it in my view, but it is still a big curve.

100 thumbs up for Working Effectively with Legacy Code recommended by Yishai. I also recommend Pragmatic Unit Testing in C# with NUnit as you're using .NET (although I'm assuming C#). It's been very useful for teaching the basics of unit testing, providing a solid foundation to work from.

When working with legacy code, and trying to create tests, I often break out dependencies from classes or methods so I can write unit tests using mocks for these dependencies. Dependencies most often come in the form of calls to static classes and objects created using the new keyword in the constructor or other locations in that class.

In most cases, static calls are handled either by wrapping the static dependency or, if it's a singleton pattern (or similar) in the form of StaticClass.Current.MethodCall(), by passing that dependency by its interface to the constructor instead.

In most cases, uses of the new keyword in the constructor are simply replaced by passing that interface into the constructor instead.

In most cases, uses of the new keyword in other parts of the class are handled either by the same method as above or, if needed, by creating a factory and passing the factory's interface in the constructor.
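For reference, the first break-out usually ends up looking something like this sketch (IClock, SystemClock and ReportService are invented names, not from any real codebase):

using System;

public interface IClock
{
    DateTime Now();
}

public class SystemClock : IClock          // production implementation
{
    public DateTime Now() { return DateTime.Now; }
}

public class ReportService
{
    private readonly IClock _clock;

    public ReportService(IClock clock)     // tests inject a fake IClock here
    {
        _clock = clock;
    }

    public string BuildHeader()
    {
        // was: a static call such as SystemTime.Current.Now() buried in the method
        return "Report generated " + _clock.Now().ToShortDateString();
    }
}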

I always use ReSharper's refactoring tools to help me with all of these break-outs; however, most things are still manual labour (which could be automated), and for some legacy classes and methods that can be a very, very tedious process. Are there any other refactoring plugins and/or tools which would help me in this process? Is there a "break out all dependencies from this class in a single click" refactoring tool? =)

It sounds to me like all these steps are common for many developers and a common problem, and before I attempt writing a plugin for ReSharper or CodeRush, I have to ask, because someone has probably already attempted this...

ADDED:

In response to the answers below: even if you might not want to break out everything at once (a one-click total break-out might cause more problems than it helps), still being able to easily break out one method's dependencies, or 1-2 dependencies at a time, would make a big difference.

Also, refactoring code has a measure of "try it and see what happens, just to learn how everything fits together", and a one-click total break-out would help that process tons, even if you don't check that code in...

I don't think there is any tool that can automate this for you. Working with legacy code means - as you know - changing code in little steps. The steps are often deliberately small to prevent errors from being made. Usually the first change you should make is one that makes that code testable. After you've written the test, you change that part of the code in such a way that you fix the bug or implement the RFC.

Because you should take small steps I believe it is hard to use a refactoring tool to magically make all your dependencies disappear. With legacy systems you would hardly ever want to make big changes at once, because the risk of breaking (and not finding out because of the lack of tests) is too big. This however, doesn’t mean refactoring tools aren’t useful in this scenario. On the contrary; they help a lot.

If you haven't already, I'd advise you to read Michael Feathers' book Working Effectively with Legacy Code. It describes in great details a series of patterns that help you refactor legacy code to a more testable system.

Good luck.

Martin Fowler says that we should do refactoring before adding new features (given that the original program is not well-structured).

So we all want to refactor this dirty codebase, that's for sure. We also know that without unit-testing code it's very easy to introduce subtle bugs.

But it's a large codebase. Adding a complete suite of tests to it seems impracticable.

What would you do in this case?

Let me recommend the book Working effectively with legacy code by Michael Feathers. It contains a lot of realistic examples and shows good techniques for tackling the legacy code beast.

Fowler also suggests that you never refactor without the safety of tests. But, how do you get those tests in place? And, how far do you go?

The previously recommended book (Working Effectively with Legacy Code by Michael Feathers) is the definitive work on the subject.

Need a quicker read? Take a look at Michael's earlier article (PDF) of the same name.

Do you have any strategies for retrofitting unit tests onto a code base that currently has no unit tests ?

If you are ever trying to add unit tests to old Perl code, I strongly recommend

Perl Testing: A Developer's Notebook by Ian Langworth and chromatic.

It has some very nice tricks on testing legacy and "untestable" code.

How can I write unit tests for existing and already-implemented code which has a procedural implementation as opposed to an OOP implementation? We are using Java/Spring; however, there are not a lot of different beans for the different concerns, which are all mixed into one large class per piece of major functionality. (E.g. we have classes/beans for each batch job, for our DAOs, and a few util-type beans, and that's it.)

Just to give some more details, these major classes which need to be tested are about 1k-2k lines of code, and the only dependency injection/OOP they use is DAOs and some odd utilities. They each have about one public method, which they implement for an interface they all share.

Some pointers in this link.

If you want to go beyond that, I consider "Working effectively with legacy code" a must read.

I am working on a web application with an existing code base that has probably been around for 10 years; there are ~1000 class files and ~100,000 lines of code. The good news is that the code is organized well, business logic is separate from the controller domain, and there is a high level of reusability. The bad news is there is only the very beginning of a test suite (JUnit); there are maybe 12 dozen tests at most.

The code is organized fairly typically for an enterprise Java project. There is a Struts-esque controller package, the model consists of almost purely data objects, there is a Hibernate-like database layer that is largely encapsulated within data access objects, and a handful of service packages that are simple, self-contained, and logical. The end goal of building this test suite is to move towards a continuous integration development process.

  • How would you go about building a test suite for such an application?
  • What tools would you use to make the process simpler?

Any suggestions welcome. thanks!

Start by reading Working Effectively with Legacy Code (short version here). Next I would write a couple of end-to-end smoke tests to cover the most common use cases. Here are some ideas on how to approach it: http://simpleprogrammer.com/getting-up-to-bat-series/

Then when I need to change some part of the system, I would cover it with focused unit tests (refer to the aforementioned book) and then do the change. Little by little the system - or at least the parts which change the most often - would be better covered and working with it would become easier.

I have been trying to learn how to add testing to existing code -- currently reading Working Effectively With Legacy Code. I have been trying to apply some of the principles in JavaScript, and now I'm trying to extract an interface.

In searching for how to create interfaces in JavaScript, I can't find a lot -- and what I find about inheritance suggests there are several different ways. (Some people create their own base classes to provide helpful methods to make it easier to do inheritance, some use functions, some use prototypes.)

What's the right way? Got a simple example for extracting an interface in JavaScript?

There's no definitive right way, because so many people are doing so many different things.. There are many useful patterns.

Crockford suggests that you "go with the grain", or write javascript in a way that corresponds to javascript's prototypal nature.

Of course, he goes on to show that the original model that Netscape suggested is actually broken. He labels it "pseudoclassical", and points out a lot of the misdirection and unnecessary complexity that is involved in following that model.

He wrote the "object" function as a remedy (now known as Object.create() ). It allows for some very powerful prototypal patterns.

It's not always easy to develop a clean interface when you have to work with legacy JavaScript, especially not when you're dealing with large systems, usually including multiple libraries, each implementing a unique style and a different inheritance pattern. In general, I'd say that the "right way" to do inheritance is the one which allows you to write a clean interface which behaves well in the context of your legacy code, but also allows you to refactor and eliminate old dependencies over time.

Considering the differences between the major library patterns, I've found that the most successful route to take in my own work is to keep my interfaces independent of the library interfaces entirely. I'll use a library or module if it's helpful, but won't be bound to it. This has allowed me to refactor a lot of code, phase out some libraries, and use libraries as scaffolding which can be optimized later.

Along these lines, I've written interfaces that were inspired by Crockford's parasitic inheritance pattern. It's really a win for simplicity.

On the other side of the coin, I'm sure you could argue for picking a library, enforcing it across your team, and conforming to both its inheritance patterns and its interface conventions.

I'm wondering how I should be testing this sort of functionality via NUnit.

public void HighlyComplexCalculationOnAListOfHairyObjects()
{
    // calls 19 private methods totalling ~1000 lines of code + comments + whitespace
}

From reading, I see that NUnit isn't designed to test private methods, for philosophical reasons about what unit testing should be; but trying to create a set of test data that fully exercises all the functionality involved in the computation would be nearly impossible. Meanwhile, the calculation is broken down into a number of smaller methods that are reasonably discrete. They are not, however, things that make logical sense to run independently of each other, so they're all private.

Get the book Working Effectively with Legacy Code by Michael Feathers. I'm about a third of the way through it, and it has multiple techniques for dealing with these types of problems.

Right now my JUnit tests look like a long story:

  • I create 4 users
  • I delete 1 user
  • I try to login with the deleted user and make sure it fails
  • I login with one of the 3 remaining user and verify I can login
  • I send a message from one user to the other and verify that it appears in the outbox of the sender and in the inbox of the receiver.
  • I delete the message
  • ...
  • ...

Advantages: The tests are quite effective (very good at detecting bugs) and are very stable, because they only use the API; if I refactor the code, then the tests are refactored too. As I don't use "dirty tricks" such as saving and reloading the db in a given state, my tests are oblivious to schema changes and implementation changes.

Disadvantages: The tests are getting difficult to maintain; any change in a test affects other tests. The tests run 8-9 minutes, which is great for continuous integration but is a bit frustrating for developers. Tests cannot be run in isolation; the best you can do is stop after the test you are interested in has run - but you absolutely must run all the tests that come before it.

How would you go about improving my tests?

By testing stories like you describe, you have very brittle tests. If only one tiny bit of functionality changes, your whole test might be messed up. Then you will likely have to change all the tests which are affected by that change.

In fact, the tests you are describing are more like functional tests or component tests than unit tests. So you are using a unit testing framework (JUnit) for non-unit tests. In my view there is nothing wrong with using a unit testing framework to do non-unit tests, if (and only if) you are aware of it.

So there are following options:

  • Choose another testing framework which supports a "story telling" style of testing much better, as other users have already suggested. You will have to evaluate and find a suitable testing framework.

  • Make your tests more "unit test"-like. To do this you will need to break up your tests and maybe change your current production code. Why? Because unit testing aims at testing small units of code (unit-testing purists suggest only one class at a time). By doing this your unit tests become more independent. If you change the behavior of one class, you just need to change a relatively small amount of unit test code. This makes your unit tests more robust. During that process you might see that your current code does not support unit testing very well -- mostly because of dependencies between classes. This is the reason you will also need to modify your production code.

If you are in a project and running out of time, both options might not help you any further. Then you will have to live with those tests, but you can try to ease your pain:

  • Remove code duplication in your tests: As in production code, eliminate code duplication and put the code into helper methods or helper classes. If something changes, you might only need to change the helper method or class. This way you will converge towards the next suggestion.

  • Add another layer of indirection to your tests: Produce helper methods and helper classes which operate on a higher level of abstraction. They should act as an API for your tests. These helpers call your production code. Your story tests should only call those helpers. If something changes, you need to change only one place in your API and don't need to touch all your tests.

Example signatures for your API:

createUserAndDelete(string[] usersForCreation, string[] userForDeletion);
logonWithUser(string user);
sendAndCheckMessageBoxes(string fromUser, string toUser);
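As a sketch of what a story test looks like once it only talks to such an API layer (C# with NUnit purely for illustration; the helper bodies would delegate to your production code, and the method names follow the signatures above):

using System;
using NUnit.Framework;

public class StoryTestApi
{
    public void createUserAndDelete(string[] usersForCreation, string[] userForDeletion)
    {
        // create the first set of users, then delete the second set, via the real application
        throw new NotImplementedException("delegate to the production code here");
    }

    public bool logonWithUser(string user)
    {
        throw new NotImplementedException("delegate to the production code here");
    }
}

[TestFixture]
public class DeletedUserStoryTests
{
    [Test]
    public void DeletedUserCannotLogOn()
    {
        var api = new StoryTestApi();
        api.createUserAndDelete(
            new[] { "alice", "bob", "carol", "dave" },
            new[] { "dave" });

        Assert.IsFalse(api.logonWithUser("dave"));
        Assert.IsTrue(api.logonWithUser("alice"));
    }
}

When the login flow changes, only logonWithUser has to change; the story tests keep reading the same.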

For general unit testing I suggest having a look at xUnit Test Patterns by Gerard Meszaros.

For breaking dependencies in your production code, have a look at Working Effectively with Legacy Code by Michael Feathers.

Background:

An art teacher once gave me a design problem: draw a tiger using only 3 lines. The idea was that I would study a tiger and learn which 3 lines to draw so that people could still tell it is a tiger.

The solution for this problem is to start with a full drawing of a tiger and remove elements until you get to the three parts that are most recognizable as a tiger.

I love this problem as it can be applied in multiple disciplines like software development, especially in removing complexity.

At work I deal with maintaining a large software system that has been hacked to death and is to the point of becoming unmaintainable. It is my job to remove the burdensome complexity that was caused by past developers.

Question:

Is there a set process for removing complexity in software systems - a kind of reduction process template to be applied to the problem?

This is a loaded question :-)

First, how do we measure "complexity"? Without a metric decided a priori, it may be hard to justify any "reduction" project.

Second, is the choice entirely yours? To take an example, assume that, in some code base, the hammer of "inheritance" is used to solve every other problem. While using inheritance is perfectly right for some cases, it may not be right for all cases. What do you do in such cases?

Third, can it be proved that behavior/functionality of the program did not change due to refactoring? (This gets more complex when the code is part of a shipping product.)

Fourth, you can start with simpler things like: (a) avoid global variables, (b) avoid macros, (c) use const pointers and const references as much as possible, (d) use const-qualified methods wherever it is the logical thing to do. I know these are not refactoring techniques, but I think they might help you proceed towards your goal.

Finally, in my humble opinion, I think any such refactoring project is more of a people issue than a technology issue. All programmers want to write good code, but the perception of good vs. bad is very subjective and varies across members of the same team. I would suggest establishing a "design convention" for the project (something like C++ Coding Standards). If you can achieve that, you are mostly done. The remaining part is to modify the parts of the code which do not follow the design convention. (I know, this is very easy to say but much more difficult to do. Good wishes to you.)

Check out the book Working Effectively with Legacy Code.

The topics covered include

  • Understanding the mechanics of software change: adding features, fixing bugs, improving design, optimizing performance
  • Getting legacy code into a test harness
  • Writing tests that protect you against introducing new problems
  • Techniques that can be used with any language or platform—with examples in Java, C++, C, and C#
  • Accurately identifying where code changes need to be made
  • Coping with legacy systems that aren't object-oriented
  • Handling applications that don't seem to have any structure

This book also includes a catalog of twenty-four dependency-breaking techniques that help you work with program elements in isolation and make safer changes.

Check out the book Anti-Patterns for a well-written book on the whole subject of moving from bad (or maladaptive) design to better. It provides ways to recover from a whole host of problems typically found in software systems. I would then add support to Kristopher's recommendation of Refactoring as an important second step.

All,

I'm trying to do some unit testing in some archaic java code (no interfaces, no abstraction, etc.)

This is a servlet that uses a ServletContext (which I'm assuming is set up by Tomcat), and its database information is set up in the web.xml/context.xml file. Now, I've figured out how to make a fake ServletContext, but the code has

 InitialContext _ic = new InitialContext();

all over the place (so it isn't feasible to replace it). I need to find a way to make a default InitialContext() able to do the _ic.lookup(val) without throwing an exception.

I'm assuming there is some way that the context.xml is getting loaded, but how that magic works, I'm drawing a blank. Anyone have any ideas?

Take advantage of the fact that InitialContext uses an SPI to handle its creation. You can hook into its lifecycle by creating an implementation of javax.naming.spi.InitialContextFactory and passing that to your tests via the system property java.naming.factory.initial (Context.INITIAL_CONTEXT_FACTORY). It's simpler than it sounds.

Given this class:

public class UseInitialContext {

    public UseInitialContext() {
        try {
            InitialContext ic = new InitialContext();
            Object myObject = ic.lookup("myObject");
            System.out.println(myObject);
        } catch (NamingException e) {
            e.printStackTrace();
        }
    }


} 

And this impl of InitialContextFactory:

public class MyInitialContextFactory implements InitialContextFactory {

    public Context getInitialContext(Hashtable<?, ?> arg0)
            throws NamingException {

        Context context = Mockito.mock(Context.class);
        Mockito.when(context.lookup("myObject")).thenReturn("This is my object!!");
        return context;
    }
}

Creating an instance of UseInitialContext in a junit test with

-Djava.naming.factory.initial=initial.context.test.MyInitialContextFactory

on the command line outputs This is my object!! (easy to set up in Eclipse). I like Mockito for mocking and stubbing. I'd also recommend Michael Feathers' Working Effectively with Legacy Code if you deal with lots of legacy code. It's all about how to find seams in programs in order to isolate specific pieces for testing.


I have to do enhancements to an existing C++ project with over 100k lines of code.
My question is: how and where do I start with such projects?
The problem gets worse if the code is not well documented. Are there any automated tools for studying code flow in large projects?

Thanks,

There's a book for you: Working Effectively with Legacy Code

It's not about tools, but about various approaches, processes and techniques you can use to better understand and make changes to the code. It is even written from a mostly C++ perspective.

I'm convinced from this presentation and other commentary here on the site that I need to learn to unit test. I also realize that there have been many questions about what unit testing is here. Each time I go to consider how it should be done in the application I am currently working on, I walk away confused. It is a XULRunner application, and a lot of the logic is event-based - when a user clicks here, this action takes place.

Often the examples I see for testing are testing classes - they instantiate an object, give it mock data, then check the properties of the object afterward. That makes sense to me - but what about the non-object-oriented pieces?

This guy mentioned that GUI-based unit testing is difficult in most any testing framework, maybe that's the problem. The presentation linked above mentions that each test should only touch one class, one method at a time. That seems to rule out what I'm trying to do.

So the question - how does one unit test procedural or event-based code? Provide a link to good documentation, or explain it yourself.

On a side note, I also have the challenge of not having found a testing framework that is set up to test XULRunner apps - it seems that the tools just aren't developed yet. I imagine this is more peripheral than understanding the concepts, writing testable code, and applying unit testing.

See the oft-linked Working Effectively with Legacy Code. See the sections titled "My Application Is All API Calls" and "My Project is Not Object-Oriented. How Do I Make Safe Changes?".

In the C/C++ world (in my experience), the best solution in practice is to use the linker "seam" and link against test doubles for all the functions called by the function under test. That way you don't change any of the legacy code, but you can still test it in isolation.

I'm trying to make a very large, very legacy project testable.

We have a number of statically available services that most of our code uses. The problem is that these are hard to mock. They used to be singletons. Now they are pseudo-singletons -- same static interface but the functions delegate to an instance object that can be switched out. Like this:

class ServiceEveryoneNeeds
{
    public static IImplementation _implementation = new RealImplementation();

    public IEnumerable<FooBar> GetAllTheThings() { return _implementation.GetAllTheThings(); }
}

Now in my unit test:

void MyTest()
{
    ServiceEveryoneNeeds._implementation = new MockImplementation();
}

So far, so good. In prod, we only need the one implementation. But tests run in parallel and might need different mocks, so I did this:

class Dependencies
{
     //set this in prod to the real impl
     public static IImplementation _realImplementation;

     //unit tests set these
     [ThreadStatic]
     public static IImplementation _mock;

     public static IImplementation TheImplementation
     { get {return _realImplementation ?? _mock; } }

     public static void Cleanup() { _mock = null; }
}

And then:

class ServiceEveryoneNeeds
{
     static IImplementation GetImpl() { return Dependencies.TheImplementation; }

     public static IEnumerable<FooBar> GetAllTheThings() {return GetImpl().GetAllTheThings(); }

}

//and
void MyTest()
{
    Dependencies._mock = new BestMockEver();
    //test
    Dependencies.Cleanup();
}

We took this route because it's a massive project to constructor inject these services into every class that needs them. At the same time, these are universal services within our codebase that most functions depend on.

I understand that this pattern is bad in the sense that it hides dependencies, as opposed to constructor injection which makes dependencies explicit.

However the benefits are:
- we can start unit testing immediately, vs doing a 3 month refactor and then unit testing.
- we still have globals, but this appears to be strictly better than where we were.

While our dependencies are still implicit, I would argue that this approach is strictly better than what we had. Aside from the hidden dependencies, is this worse in some way than using a proper DI container? What problems will I run into?

I think what you're doing is not bad. You are trying to make your code base testable, and the trick is to do that in little steps. You'll get this same advice when reading Working Effectively With Legacy Code. The downside of what you're doing, however, is that once you start using dependency injection, you will have to refactor your code base again. But more importantly, you will have to change a lot of test code.

I agree with Alex. Prefer using constructor injection instead of an ambient context. You don't have to refactor your whole code base for this right away, but constructor injection will 'bubble' up the call stack, and you will have to make a 'cut' somewhere to stop it bubbling further, because otherwise it forces you to make a lot of changes throughout the code base.

I'm currently working on a legacy code base and can't use a DI container (the pain). Still, I use constructor injection where I can, which sometimes means I have to fall back to poor man's dependency injection on some types. This is the trick I use to stop the 'constructor injection bubble'. Still, this is much better than using an ambient context. Poor man's DI is suboptimal, but it still allows you to write proper unit tests and makes it much easier to break out that default constructor later on.
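For anyone unfamiliar with the term, poor man's DI looks roughly like this (a sketch reusing the IImplementation/RealImplementation/FooBar types from the question; InvoiceService is an invented consumer):

using System.Collections.Generic;

public class InvoiceService
{
    private readonly IImplementation _implementation;

    // Production code keeps calling new InvoiceService() exactly as before;
    // this default constructor is the 'cut' that stops the injection bubble.
    public InvoiceService()
        : this(new RealImplementation())
    {
    }

    // Tests use this overload to pass in a mock or stub.
    public InvoiceService(IImplementation implementation)
    {
        _implementation = implementation;
    }

    public IEnumerable<FooBar> GetAllTheThings()
    {
        return _implementation.GetAllTheThings();
    }
}

Once a DI container is introduced later, the default constructor can simply be deleted and the wiring moved to the composition root.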

Recently, I took ownership of some C++ code. I am going to maintain this code and add new features later on. I know many people say that it is usually not worth adding unit tests to existing code, but I would still like to add some tests which will at least partially cover the code. In particular, I would like to add tests which reproduce bugs which I have fixed.

Some of the classes are constructed with some pretty complex state, which can make it more difficult to unit-test.

I am also willing to refactor the code to make it easier to test.

Is there any good article you recommend on guidelines which help to identify classes which are easier to unit-test? Do you have any advice of your own?

While Martin Fowler's book on refactoring is a treasure trove of information, why not take a look at "Working Effectively with Legacy Code."

Also, if you're going to be dealing with classes where there are a ton of global variables or huge amounts of state transitions, I'd put in a lot of integration checks. Separate out as much as possible of the code which interacts with the code you're refactoring, to make sure that all expected inputs, in the order they are received, continue to produce the same outputs. This is critical, as it's very easy to "fix" a subtle bug that might have been addressed somewhere else.

Take notes too. If you do find that there is a bug which another function/class expects and handles properly you'll want to change both at the same time. That's difficult unless you keep thorough records.

Refactoring is the process of improving the existing system design without changing its behavior.

Besides Martin Fowler's seminal book "Refactoring - Improving the design of existing code" and Joshua Kerievsky's book "Refactoring to Patterns", are there any good resources on refactoring?

Working Effectively with Legacy Code focuses on dealing with existing code bases that need to evolve to be testable. Many techniques are used in the book to accomplish this, and it is an excellent resource for refactoring.

I would recommend reading Working Effectively with Legacy Code, then Refactoring - Improving the Design of Existing Code. Martin Fowler's book is more like a recipe book for me; it explains the how. Working Effectively with Legacy Code explains the why, in my opinion.

Below are some other books relating to refactoring:

AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis

Refactoring in Large Software Projects: Performing Complex Restructurings

Refactoring SQL Applications

Prefactoring

Yes, the dreaded 'M' word.

You've got a workstation, source control and half a million lines of source code that you didn't write. The documentation was out of date the moment that it was approved and published. The original developers are LTAO, at the next project/startup/loony bin and not answering email.

What are you going to do?

{favourite editor} and grep will get you started on your spelunking through the gnarly guts of the code base, but what other tools should be in the maintenance engineer's toolbox?

To start the ball rolling: I don't think I could live without Source Insight for C/C++ spelunking. (DISCLAIMER: I don't work for 'em.)

Just like eating the elephant - one bite at a time :)

Sometimes the big picture can be a real demotivator, and you need to pick a spot and tackle it piece by piece.

Of course, you still need to choose the bit to start on... Typically this is driven mostly by the users/business, with top-priority specific changes required (yesterday...), but if you have a little flexibility or familiarization time, metrics are often useful. Tools here vary with the technology and language, but tools like NDepend and JDepend, any built-in code metrics (like those in Visual Studio Team System, or the various available Eclipse plugins), or a tool like Simian can give you a feel for the size of the copy-and-paste problem.

Hopefully the number of unit tests and coverage is greater than zero, and so a good first step is always to get whatever tests you can running in a Continuous Integration environment, as a foundation for adding more tests as you learn.

And as others have said - assuming options are available for the language - a good IDE with code navigation and automated refactoring is a must (Eclipse, or Visual Studio with or without ReSharper).

A couple of morale-boosting books:

Good luck :)

I have just read Michael C. Feathers great book Working Effectively with Legacy Code, the bible of introducing tests to legacy code. In this book he describes something called Edit-triggered testing:

If it isn't out by the time this book is released, I suspect that someone will soon develop an IDE that allows you to specify a set of tests that will run at every keystroke. It would be an incredible way of closing the feedback loop.

It has to happen. It just seems inevitable. There are already IDEs that check syntax on each keystroke and change the color of code when there are errors. Edit-triggered testing is the next step.

When I read this I hadn't heard about any IDEs or tools that support this. However, I just found a project called Infinitest that supports this for Java.

My questions are:

  1. Are there any other tools/framework that support this (hopefully also for Visual Studio)?
  2. What are your experiences with this kind of testing (efficient, slows down the IDE, etc)?
  3. Is this the next step of TDD?

Updates:

In the type of embedded programming I'm getting into, determinism and transparency of the running code are highly valued. What I mean by transparency is, for instance, being able to look at arbitrary sections of memory and know what variable is stored there. So, as I'm sure embedded programmers expect, new is to be avoided if at all possible, and if it can't be avoided, then limited to initialization.

I understand the need for this, but don't agree with the way my coworkers have gone about doing this, nor do I know a better alternative.

What we have are several global arrays of structures, and some global classes. There is one array of structs for mutexes, one for semaphores, and one for message queues (these are initialized in main). For each thread that runs, the class that owns it is a global variable.

The biggest problem I have with this is in unit testing. How can I insert a mock object when the class I want to test #includes global variables that I don't?

Here's the situation in pseudo-code:

foo.h

#include "Task.h"
class Foo : public Task {
public:
  Foo(int n);
  ~Foo();
  void doStuff();
private:
  // copy and assignment operators here
};

bar.h

#include <pthread.h>
#include "Task.h"

enum threadIndex { THREAD1, THREAD2, NUM_THREADS };
struct tThreadConfig {
  char      *name;
  Task      *taskptr;
  pthread_t  threadId;
  // ...
};
void startTasks();

bar.cpp

#include "Foo.h"

Foo foo1(42);
Foo foo2(1337);
Task task(7331);

tThreadConfig threadConfig[NUM_THREADS] = {
  { "Foo 1", &foo1, 0, ... },
  { "Foo 2", &foo2, 0, ... },
  { "Task",  &task, 0, ... }
};

void FSW_taskStart() {
    for (int i = 0; i < NUM_THREADS; i++) {
        threadConfig[i].taskptr->createThread(  );
    }
}

What if I want more or less tasks? A different set of arguments in the constructor of foo1? I think I would have to have a separate bar.h and bar.cpp, which seems like a lot more work than necessary.

If you want to unit test such code, first I would recommend reading Working Effectively With Legacy Code. Also see this.

Basically using the linker to insert mock/fake objects and functions should be a last resort but is still perfectly valid.

However, you can also use inversion of control; without a framework this can push some responsibility onto the client code, but it really helps testing. For instance, to test FSW_taskStart():

tThreadConfig threadConfig[NUM_THREADS] = {
  { "Foo 1", %foo1, 0, ... },
  { "Foo 2", %foo2, 0, ... },
  { "Task",  %task, 0, ... }
};

void FSW_taskStart(tThreadConfig configs[], size_t len) {
    for (int i = 0; i < len; i++) {
        configs[i].taskptr->createThread(  );
    }
}

void FSW_taskStart() {
    FSW_taskStart(threadConfig, NUM_THREADS);
}

void testFSW_taskStart() {
    MockTask foo1, foo2, foo3;
    tThreadConfig mocks[3] = {
          { "Foo 1", &foo1, 0, ... },
          { "Foo 2", &foo2, 0, ... },
          { "Task",  &foo3, 0, ... }
        };
    FSW_taskStart(mocks, 3);
    assert(foo1.started);
    assert(foo2.started);
    assert(foo3.started);
}

Now you can pass mock versions of your threads to 'FSW_taskStart' to ensure that the function does in fact start the threads as required. Unfortunately you have to rely on the fact that the original FSW_taskStart passes the correct arguments, but you are now testing a lot more of your code.

I've inherited some code that has a class AuthenticationManager with all static methods.

I'm introducing DI and wanted to add a constructor that took a UserController dependency:

UserController _userController;

public AuthenticationManager(UserController userCont)
{
    _userController = userCont;
}

Now I'm getting a compile-time error because a non-static variable is referenced from a static method. What would your best-practice recommendation be to get this to work with minimal changes to this class and the calling code?

We're using the SimpleServiceLocator as the IOC container.

Well it depends on how often the class is used throughout the code. You'll likely want to create an IAuthenticationManager interface that includes methods that match the static methods you want to replace with instance methods. Then you could create an AuthenticationManager class that implements the interface, and accepts the UserController dependency via its constructor.

You would then need to replace all the static method call sites with calls to the instance methods. You would probably want to inject an IAuthenticationManager into the classes via a constructor or a property. If need be, you could also pass an IAuthenticationManager to the methods (at the call sites) as a parameter.

Unfortunately replacing static methods takes quite a bit of refactoring. It is worth the effort though. It opens up the door for unit-testing.

Keep in mind that you can always refactor one method at a time by extracting an interface for one of the static methods. Do each method one at a time to take a step-wise approach to your refactoring (in other words, each method gets its own interface).
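As a rough sketch of what one extracted interface and its instance-based implementation might look like (the Authenticate method and the IsValid call are hypothetical, since the original static members aren't shown in the question):

public interface IAuthenticationManager
{
    bool Authenticate(string userName, string password);
}

public class AuthenticationManager : IAuthenticationManager
{
    private readonly UserController _userController;

    public AuthenticationManager(UserController userController)
    {
        _userController = userController;
    }

    public bool Authenticate(string userName, string password)
    {
        // Delegates to the injected UserController instead of static state.
        // IsValid is a placeholder for whatever the real UserController exposes.
        return _userController.IsValid(userName, password);
    }
}

Callers then take an IAuthenticationManager through their own constructors (or as a method parameter where that's too intrusive), and your IoC container maps the interface to this class.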

I would recommend taking a look at this book if you can: Working Effectively With Legacy Code. It's a great book that covers all sorts of situations like this one.

Let's say you've got a project that is really poorly written, contains lots of code smells, WTFs, etc. Moreover, its code structure is so complicated that it is extremely hard to add any new functionality to it. On the other hand, the project works as it is supposed to.

You want to refactor the project, perhaps move it to a new framework. How would you approach this problem? Would you try to build a new one from scratch, or use some techniques (specify which) to convert the working project into a new one?


I would like to clarify the question a little bit, as there is some confusion about what I mean by "refactoring".

I'll give an example about a car; think of it as a software project. Let's say you have built your own car. Its construction is pretty weird: the engine is upside down, thus all the pipes are laid out differently, the electrical wires are tangled together and nobody has any idea where they start or end, etc.

However, everything works fine: you can easily drive it to the shop, to work, etc. Still, its fuel consumption is a bit too high. Also, if you ever wanted to install new headlights, it would be a disaster with all the mess in the wires.

You can't afford to buy a new one, so you have to refactor the car somehow: change the engine's position to normal, make the wires tidy, etc. You need to do this because sooner or later you will need to change the engine or the headlights, install a new stereo, etc. On the other hand, you still need something to drive you to work every morning, so you must make sure that you don't screw everything up.

Now let's go back to the project. How would you refactor a project as complicated as the car above, while not disturbing its primary function and purpose?


I would also like to make this a community wiki. Please edit.

So far the main tendencies are:

Links:

My question is quite similar to something asked before, but I need some practical advice.

I have "Working effectively with legacy code" in my hands and I 'm using advice from the book as I read it in the project I 'm working on. The project is a C++ application that consists of a few libraries but the major portion of the code is compiled to a single executable. I 'm using googletest for adding unit tests to existing code when I have to touch something.

My problem is how I can set up my build process so I can build my unit tests, since there are two different executables that need to share code, while I am not able to extract the code from my "under test" application into a library. Right now I have the build process for the application that holds the unit tests link against the object files generated from the build of the main application, but I really dislike it. Are there any suggestions?

Working Effectively With Legacy Code is the best resource for how to start testing old code. There are really no short term solutions that won't result in things getting worse.

Sometimes for testing/developing purposes we make some changes in the code that must be removed in a production build. I wonder if there is an easy way of marking such blocks so that production build would fail as long as they are present or at least it will warn you during the build somehow.

Simple "//TODO:" doesn't really work because it is ofter forgotten and mixed with tons of other todos. Is there anything stronger?

Or maybe I could even create some external text file with instructions on what to do before production, and have Ant check whether that file is present and cancel the build if it is.

We are using Eclipse/Ant (and java + Spring).

Update: I don't mean that there are big chunks of code that are different in local and production. In fact all the code is the same and should be the same. Let's just say I comment out some line of code to save a lot of time during development and forget to uncomment it, or something along those lines. I just want to be able to flag the project somehow that something needs attention, so that a production build would fail or show a warning.

Avoid the necessity. If you're placing code into a class that shouldn't be there in production, figure out how to do it differently. Provide a hook, say, so that the testing code can do what it needs to, but leave the testing code outside the class. Or subclass for testing, or use Dependency Injection, or any other technique that leaves your code valid and safe for production, while still testable. Many such techniques are well-documented in Michael Feathers' fantastic book, Working Effectively with Legacy Code.

We're trying to separate a big code base into logical modules. I would like some recommendations for tools, as well as whatever experience you might have had with this sort of thing.

The application consists of a server WAR and several rich clients distributed in JARs. The trouble is that it's all in one big, hairy code base, one source tree of more than 2,000 files. Each JAR has a dedicated class with a main method, but the tangle of dependencies ensnares quickly. It's not all that bad, good practices were followed consistently and there are components with specific tasks. It just needs some improvement to help our team scale as it grows.

The modules will each be in a Maven project, built by a parent POM. The process has already started on moving each JAR/WAR into its own project, but it's obvious that this will only scratch the surface: a few classes in each app JAR and a mammoth "legacy" project with everything else. Also, there are already some unit and integration tests.

Anyway, I'm interested in tools, techniques, and general advice on breaking up an overly large and entangled code base into something more manageable. Free/open source is preferred.

To continue Itay's answer, I suggest reading Michael Feathers' "Working Effectively With Legacy Code" (PDF). He also recommends that every step be backed by tests. There is also a book-length version.

When a new C++ project is passed along to you, what is the standard way of stepping through it and becoming acquainted with the entire codebase? Do you just start at the top file and read through all x-hundred files? Do you use a tool to generate information for you? If so, which tool?

You could try running it through doxygen to at least give a browsable set of documentation - but basically the only way is a debugger, some trace/std::cerr messages and a lot of coffee.

The suggestion to write test cases is the basis of Working Effectively with Legacy Code and the point of the CppUnit test library. Whether you can take this approach depends on your team and your setup - if you are the new junior you can't really rewrite the app to support testing.

How would you maintain legacy applications that:

  1. have no unit tests
  2. have big methods with a lot of duplicated logic
  3. have no separation of concerns
  4. have a lot of quick hacks and hard-coded strings
  5. have outdated and wrong documentation
  6. have requirements that are not properly documented! This has actually resulted in disputes between the testers, developers and the clients in the past. Of course there are some non-functional requirements, such as it shouldn't be slow or clash, and other business logic that is known to the application users. But beyond the most common-sense scenario and the most common-sense business workflow, there is little guidance on what should (or should not) be done.

???

You need this book: Working Effectively with Legacy Code.

I basically agree with everything Paul C said. I'm not a TDD priest, but anytime you're touching a legacy codebase -- especially one with which you're not intimately familiar -- you need to have a solid way to retest and make sure you've followed Hippocrates: First, do no harm. Testing, good unit and regression tests in particular, are about the only way to make that play.

I highly recommend picking up a copy of Reversing: Secrets of Reverse Engineering Software if it's a codebase with which you're unfamiliar. Although this book goes to great depths that are outside your current needs (and mine, for that matter), it taught me a great deal about how to safely and sanely work with someone else's code.

I am on a team where I am trying to convince my teammates to adopt TDD (as I have seen it work in my previous team and the setup is similar). Also, my personal belief is that, at least in the beginning, it really helps if both TDD and Pair Programming are done in conjunction. That way, two inexperienced (in TDD) developers can help each other, discuss what kind of tests to write, and make good headway.

My manager, on the other hand, feels that if we introduce two new development practices in the team at once, there is a good chance that both might fail. So, he wants to be a little more conservative and introduce any one.

How do I convince him that both of these are complementary and not orthogonal? Or am I wrong?

I am working on a Java project and I have to extend it (add more functionality). But I don't know how I should learn the existing code before extending it. Is there any specific path I should follow? Can I run it in a way that lets me see, statement by statement, the execution of the program?

I am kind of stuck in understanding it. Thanks.

This is a recurrent question on Stack Overflow. There are already very good answers all around:

Also, this book might help: Working Effectively with Legacy Code

"Patience and fortitude conquer all things." - Ralph Waldo Emerson

I have a following method, which retrieves top visited pages from Google Analytics:

public function getData($limit = 10)
{
    $ids = '12345';
    $dateFrom = '2011-01-01';
    $dateTo = date('Y-m-d');

    // Google Analytics credentials
    $mail = 'my_mail';
    $pass = 'my_pass';

    $clientLogin = Zend_Gdata_ClientLogin::getHttpClient($mail, $pass, "analytics");
    $client = new Zend_Gdata($clientLogin);

    $reportURL = 'https://www.google.com/analytics/feeds/data?';

    $params = array(
        'ids' => 'ga:' . $ids,
        'dimensions' => 'ga:pagePath,ga:pageTitle',
        'metrics' => 'ga:visitors',
        'sort' => '-ga:visitors',
        'start-date' => $dateFrom,
        'end-date' => $dateTo,
        'max-results' => $limit
    );

    $query = http_build_query($params, '');
    $reportURL .= $query;

    $results = $client->getFeed($reportURL);

    $xml = $results->getXML();
    Zend_Feed::lookupNamespace('default');
    $feed = new Zend_Feed_Atom(null, $xml);

    $top = array();
    foreach ($feed as $entry) {
        $page['visitors'] = (int) $entry->metric->getDOM()->getAttribute('value');
        $page['url'] = $entry->dimension[0]->getDOM()->getAttribute('value');
        $page['title'] = $entry->dimension[1]->getDOM()->getAttribute('value');
        $top[] = $page;
    }

    return $top;
}

It needs some refactoring for sure, but the question is:

  • How would you write PHPUnit tests for this method?

My first inclination is to tell you that this one function, getData, is one of the nastiest and ugliest pieces of code. You are asking how to unit test this. Well, guess what my recommendation is going to be? Refactor.

In order to refactor this code, you will need a coverage test.

The reasons for refactoring are many:

  1. Dependency on third-party framework.
  2. Dependency on external service.
  3. getData has too many responsibilities.

    a. Login in to external service using third-party framework.

    b. Create query for external service.

    c. Parse query response from external service.

How have you isolated your code from changes to either the third-party framework or the external service?

You really should take a look at Michael Feathers' book, Working Effectively with Legacy Code.

[EDIT]

My point to you (spoiler coming) is that with this code you can never get a true unit test, because of the dependency on an external service. The unit test has no control over the service or the data it returns. A unit test should be able to execute such that every time it executes, its outcome is consistent. With an external service this may not be the case. YOU HAVE NO CONTROL OVER WHAT THE EXTERNAL SERVICE RETURNS.

What do you do if the service is down? Unit test FAIL.

What if the results that are returned change? Unit test FAIL.

Unit test results must remain consistent from execution to execution; otherwise it is not a unit test.
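The isolation being asked about above usually means hiding the external call behind a small gateway interface that the rest of the code depends on, so a test can substitute a fake that returns canned data. Here is a minimal sketch of the idea in C# with invented names (the original is PHP/Zend, so treat this purely as an illustration of the shape):

using System.Collections.Generic;

public class PageStats
{
    public string Url { get; set; }
    public string Title { get; set; }
    public int Visitors { get; set; }
}

// The rest of the application depends on this interface, not on the
// analytics service or the framework used to reach it.
public interface ITopPagesGateway
{
    IList<PageStats> GetTopPages(int limit);
}

// Fake used only by tests: it returns fixed data, so the test outcome
// no longer depends on the real service being reachable or on what it returns.
public class FakeTopPagesGateway : ITopPagesGateway
{
    public IList<PageStats> GetTopPages(int limit)
    {
        return new List<PageStats>
        {
            new PageStats { Url = "/home", Title = "Home", Visitors = 42 }
        };
    }
}

The production implementation of the gateway then contains all the service login, query building, and response parsing, and is covered by a few slower integration tests instead of unit tests.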

I have been working on a comparatively large system on my own, and it's my first time working on a large system (dealing with 200+ channels of information simultaneously). I know how to use JUnit to test every method and how to test boundary conditions. But for system testing, I still need to test all the interfacing and probably do some stress testing as well (maybe there are other things to do, but I don't know what they are). I am totally new to the world of testing, so please give me some suggestions or point me to some info on how a good code tester would do system testing.

PS: Two specific questions I have are: how do I test private functions? How do I test interfaces and avoid side effects?

Here are two web sites that might help:

The first is a list of open source Java tools. Many of the tools are addons to JUnit that allow either easier testing or testing at a higher integration level.

Depending on your system, sometimes JUnit will work for system tests, but the structure of the test can be different.

As for private methods, check this question (and the question it references).

You cannot test interfaces (as there is no behavior), but you can create an abstract base test class for testing that implementations of an interface follow its contract.

EDIT: Also, if you don't already have unit tests, check out Working Effectively with Legacy Code; it is a must for testing code that is not set up well for testing.

I just recently finished Michael Feathers' book Working Effectively with Legacy Code. It was a great book on how to effectively create test seams and exploit them to get existing code under test.

One of the techniques he talks about is using "link seams". Basically the idea is that if you have code that depends on another library, you can use the linker to insert a different library for testing than for production. This would allow you to sense test conditions through a mock library, or avoid calling into libraries that have real-world effects (databases, emails, etc.).

The example he gave was in C++. I'm curious if this technique (or something similar) is possible in .NET / C#?

Yes it is possible in .Net. In the simplest case, you can just replace an assembly with another one of the same name.

With a strongly named assembly, you should change the version number and then configure assembly bindings to override the compile time "linked" version. This can be done on an enterprise, machine, user or directory level.

There are some caveats, related to security. If the assembly you wish to substitute has been strongly named, then you will need to recreate the same public key in signing the assembly.

In other words, if you as the application developer do not want your libraries "mocked" (or perhaps replaced with malicious code) then you must ensure that the assembly is signed and the private key is not publicly available.

That is why you cannot mock DateTime -- because Microsoft has strongly named the core libraries of .Net.
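As an aside, the usual way around the DateTime limitation is not a link seam at all but an ordinary object seam in your own code: wrap the clock behind an interface you own. A minimal sketch, assuming nothing about any particular code base (all names here are invented):

using System;

public interface IClock
{
    DateTime UtcNow { get; }
}

// Production implementation: just forwards to the real clock.
public class SystemClock : IClock
{
    public DateTime UtcNow
    {
        get { return DateTime.UtcNow; }
    }
}

// Test implementation: always returns the same instant, so time-dependent
// logic can be tested deterministically.
public class FixedClock : IClock
{
    private readonly DateTime _fixedTime;

    public FixedClock(DateTime fixedTime)
    {
        _fixedTime = fixedTime;
    }

    public DateTime UtcNow
    {
        get { return _fixedTime; }
    }
}

Code that takes an IClock through its constructor no longer needs any assembly substitution to be tested against a known time.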

I'm currently on a co-op term working on a project nearing completion with one other co-op student. Since this project has been passed down from co-op to co-op, poor practices have been taken along the way and testing has been left until the end. I've decided I'd like to write unit-tests to learn something new while testing.

However, I'm working on a 3-tier, tightly coupled app that seems impossible to unit test in its current form. I don't want to throw off the other co-op student with no knowledge of any of these concepts by refactoring the code beyond recognition overnight. So what steps should I take to slowly pull the code towards unit-testability? Should I first implement a factory pattern and let the other student familiarize themselves with that before moving forward?

My apologies if my knowledge is flawed and there should be no issue whatsoever. I'm new to this :)

Working Effectively with Legacy Code by Michael Feathers (also available in Safari if you have a subscription) is an excellent resource for your task. The author defines legacy code as code without unit tests, and he gives practical walkthroughs of lots of conservative techniques—necessary because you're working without a safety net—for bringing code under test. Table of contents:

  • Part: I The Mechanics of Change
    • Chapter 1. Changing Software
      • Four Reasons to Change Software
      • Risky Change
    • Chapter 2. Working with Feedback
      • What Is Unit Testing?
      • Higher-Level Testing
      • Test Coverings
      • The Legacy Code Change Algorithm
    • Chapter 3. Sensing and Separation
      • Faking Collaborators
    • Chapter 4. The Seam Model
      • A Huge Sheet of Text
      • Seams
      • Seam Types
    • Chapter 5. Tools
      • Automated Refactoring Tools
      • Mock Objects
      • Unit-Testing Harnesses
      • General Test Harnesses
  • Part: II Changing Software
    • Chapter 6. I Don't Have Much Time and I Have to Change It
      • Sprout Method
      • Sprout Class
      • Wrap Method
      • Wrap Class
      • Summary
    • Chapter 7. It Takes Forever to Make a Change
      • Understanding
      • Lag Time
      • Breaking Dependencies
      • Summary
    • Chapter 8. How Do I Add a Feature?
      • Test-Driven Development (TDD)
      • Programming by Difference
      • Summary
    • Chapter 9. I Can't Get This Class into a Test Harness
      • The Case of the Irritating Parameter
      • The Case of the Hidden Dependency
      • The Case of the Construction Blob
      • The Case of the Irritating Global Dependency
      • The Case of the Horrible Include Dependencies
      • The Case of the Onion Parameter
      • The Case of the Aliased Parameter
    • Chapter 10. I Can't Run This Method in a Test Harness
      • The Case of the Hidden Method
      • The Case of the "Helpful" Language Feature
      • The Case of the Undetectable Side Effect
    • Chapter 11. I Need to Make a Change. What Methods Should I Test?
      • Reasoning About Effects
      • Reasoning Forward
      • Effect Propagation
      • Tools for Effect Reasoning
      • Learning from Effect Analysis
      • Simplifying Effect Sketches
    • Chapter 12. I Need to Make Many Changes in One Area. Do I Have to Break Dependencies for All the Classes Involved?
      • Interception Points
      • Judging Design with Pinch Points
      • Pinch Point Traps
    • Chapter 13. I Need to Make a Change, but I Don't Know What Tests to Write
      • Characterization Tests
      • Characterizing Classes
      • Targeted Testing
      • A Heuristic for Writing Characterization Tests
    • Chapter 14. Dependencies on Libraries Are Killing Me
    • Chapter 15. My Application Is All API Calls
    • Chapter 16. I Don't Understand the Code Well Enough to Change It
      • Notes/Sketching
      • Listing Markup
      • Scratch Refactoring
      • Delete Unused Code
    • Chapter 17. My Application Has No Structure
      • Telling the Story of the System
      • Naked CRC
      • Conversation Scrutiny
    • Chapter 18. My Test Code Is in the Way
      • Class Naming Conventions
      • Test Location
    • Chapter 19. My Project Is Not Object Oriented. How Do I Make Safe Changes?
      • An Easy Case
      • A Hard Case
      • Adding New Behavior
      • Taking Advantage of Object Orientation
      • It's All Object Oriented
    • Chapter 20. This Class Is Too Big and I Don't Want It to Get Any Bigger
      • Seeing Responsibilities
      • Other Techniques
      • Moving Forward
      • After Extract Class
    • Chapter 21. I'm Changing the Same Code All Over the Place
      • First Steps
    • Chapter 22. I Need to Change a Monster Method and I Can't Write Tests for It
      • Varieties of Monsters
      • Tackling Monsters with Automated Refactoring Support
      • The Manual Refactoring Challenge
      • Strategy
    • Chapter 23. How Do I Know That I'm Not Breaking Anything?
      • Hyperaware Editing
      • Single-Goal Editing
      • Preserve Signatures
      • Lean on the Compiler
    • Chapter 24. We Feel Overwhelmed. It Isn't Going to Get Any Better
  • Part: III Dependency-Breaking Techniques
    • Chapter 25. Dependency-Breaking Techniques
      • Adapt Parameter
      • Break Out Method Object
      • Definition Completion
      • Encapsulate Global References
      • Expose Static Method
      • Extract and Override Call
      • Extract and Override Factory Method
      • Extract and Override Getter
      • Extract Implementer
      • Extract Interface
      • Introduce Instance Delegator
      • Introduce Static Setter
      • Link Substitution
      • Parameterize Constructor
      • Parameterize Method
      • Primitivize Parameter
      • Pull Up Feature
      • Push Down Dependency
      • Replace Function with Function Pointer
      • Replace Global Reference with Getter
      • Subclass and Override Method
      • Supersede Instance Variable
      • Template Redefinition
      • Text Redefinition
  • Appendix: Refactoring
    • Extract Method

I'm still learning the dark arts of TDD, and recently I've been trying to learn how to do TDD in VB6. I basically narrowed the list down to the free SimplyVBUnit and the more costly VBUnit3.

My application is a rich-text editor with plenty of third-party DLLs, and I have been scouring Google high and low to find out how to unit test this EXE file.

So my question is: how do you unit test an EXE file, especially in the context of VB6? And if you have any good examples with VBUnit3 or SimplyVBUnit, you're simply a lifesaver, as I'm drowning in the material right now and still can't write one unit test yet :(

Edit

Actually the app consists of many forms, modules and class modules, and when I compile it, it of course becomes a nice, neatly packaged .EXE file. And to make things more complicated there are quite a number of global variables flying around.

But my main intention is to unit test all or most of the breakable parts of the code. And I want to ensure that I can keep the tests and the code separate. So I thought that the best way to do this is to somehow directly test the EXE via a reference, etc.

Is there a better way to do this?

About the only useful piece of advice I might be able to give is to pick up Michael Feathers' Working Effectively with Legacy Code. I think your biggest challenge is going to be your toolset, as I know the tools for VB6 unit testing just aren't as strong.

You could also try asking on the TDD Yahoo! List as I know at least a couple of people are using vbunit3 on there.

I want to introduce Unit Testing to some colleagues that have no or few experience with Unit Testing. I'll start with a presentation of about an hour to explain the concept and give lots of examples. I'll follow up with pair programming sessions and code reviews.

What are the key points that should be focused on in the introduction?

Unit tests test small things

Another thing to remember is that unit tests test small things, "units". So if your test runs against a resource like a live server or a database, most people call that a system or integration test. To unit test just the code that talks to a resource like that, people often use mock objects (often called mocks).
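For example, here is a minimal sketch (in C#, with invented names) of a small-unit test that swaps a real resource for a hand-rolled mock; the same shape applies in any xUnit-style framework:

using NUnit.Framework;

// The class under test depends on an abstraction rather than a real mail server.
public interface IMailSender
{
    void Send(string to, string body);
}

public class WelcomeService
{
    private readonly IMailSender _mailSender;

    public WelcomeService(IMailSender mailSender)
    {
        _mailSender = mailSender;
    }

    public void Welcome(string emailAddress)
    {
        _mailSender.Send(emailAddress, "Welcome!");
    }
}

// Hand-rolled mock used by the test; no network involved, so the test stays fast.
public class MockMailSender : IMailSender
{
    public string LastRecipient;

    public void Send(string to, string body)
    {
        LastRecipient = to;
    }
}

[TestFixture]
public class WelcomeServiceTests
{
    [Test]
    public void Welcome_SendsMailToGivenAddress()
    {
        var mock = new MockMailSender();
        new WelcomeService(mock).Welcome("a@example.com");
        Assert.AreEqual("a@example.com", mock.LastRecipient);
    }
}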

Unit tests should run fast and be run often

When unit tests test small things, the tests run fast. That's a good thing. Frequently running unit tests helps you catch problems soon after they occur. The ultimate in frequently running unit tests is having them automated as part of continuous integration.

Unit tests work best when coverage is high

People have different views as to whether 100% unit test coverage is desirable. I'm of the belief that high coverage is good, but that there's a point of diminishing return. As a very rough rule of thumb, I would be happy with a code base that had 85% coverage with good unit tests.

Unit tests aren't a substitute for other types of tests

As important as unit tests are, other types of testing, like integration tests, acceptance tests, and others can also be considered parts of a well-tested system.

Unit testing existing code poses special challenges

If you're looking to add unit tests to existing code, you may want to look at Working Effectively with Legacy Code by Michael Feathers. Code that wasn't designed with testing in mind may have characteristics that make testing difficult and Feathers writes about ways of carefully refactoring code to make it easier to test. And when you're familiar with certain patterns that make testing code difficult, you and your team can write code that tries to avoid/minimize those patterns.

I would like to refactor a large legacy application originally written in Visual Basic 6.0 and subsequently ported to .NET. In order to do this with confidence, I want to have unit tests around the existing code so I can compare before and after. What's the easiest and most effective way of doing this?

There's a book called "Working Effectively with Legacy Code" that looks like it might help me. However, it looks like it only deals with object-oriented languages and Visual Basic 6.0 is not necessarily OO. Can this book still help me? I'm hoping someone who's read it can vouch for it.

Specifically, this application uses no classes other than the forms themselves. It accesses the database directly from the forms, and not consistently. There were several people working on this project all using their own styles with no standards whatsoever.

As I said, this project has been ported to VB.NET. However, it's only ported in the sense that it compiles under Visual Studio 2008. All of the coding concepts are Visual Basic 6.0.

I would suggest taking a look at Martin Fowler's Refactoring: Improving the design of existing code, which is an excellent must-read.

You might be looking for something along the lines of Professional Refactoring in Visual Basic. I haven't read it, but it looks applicable.

It does not just deal with object-oriented (OO) languages. Large sections are about how to deal with legacy code in C.

So yes, buy it!


There is a whole chapter (Chapter 19) called:

My project is not object oriented. How do I make safe changes?

Also there is vbUnit, an xUnit implementation that may help you use TDD with Visual Basic 6.0.

Actually, I misread the question and thought you were going to port, not that you had already ported. In this case, you have a ton of 'legacy' VB.NET code, so this is totally the book for you. You can take advantage of VB.NET's OO capabilities and use the rest of the book.

I really cannot recommend this book more.

Thank you all for your help. A number of you posted (as I should have expected) answers indicating my whole approach was wrong, or that low-level code should never have to know whether or not it is running in a container. I would tend to agree. However, I'm dealing with a complex legacy application and do not have the option of doing a major refactoring for the current problem.

Let me step back and ask the question the motivated my original question.

I have a legacy application running under JBoss, and have made some modifications to lower-level code. I have created a unit test for my modification. In order to run the test, I need to connect to a database.

The legacy code gets the data source this way:

(jndiName is a defined string)

Context ctx = new InitialContext();
DataSource dataSource = (DataSource) ctx.lookup(jndiName);

My problem is that when I run this code under unit test, the Context has no data sources defined. My solution to this was to try to see if I'm running under the application server and, if not, create the test DataSource and return it. If I am running under the app server, then I use the code above.

So, my real question is: What is the correct way to do this? Is there some approved way the unit test can set up the context to return the appropriate data source so that the code under test doesn't need to be aware of where it's running?


For Context: MY ORIGINAL QUESTION:

I have some Java code that needs to know whether or not it is running under JBoss. Is there a canonical way for code to tell whether it is running in a container?

My first approach was developed through experimentation and consists of getting the initial context and testing that it can look up certain values.

private boolean isRunningUnderJBoss(Context ctx) {
        boolean runningUnderJBoss = false;
        try {
            // The following invokes a naming exception when not running under
            // JBoss.
            ctx.getNameInNamespace();

            // The URL packages must contain the string "jboss".
            String urlPackages = (String) ctx.lookup("java.naming.factory.url.pkgs");
            if ((urlPackages != null) && (urlPackages.toUpperCase().contains("JBOSS"))) {
                runningUnderJBoss = true;
            }
        } catch (Exception e) {
            // If we get here, we are not running under JBoss.
            runningUnderJBoss = false;
        }
        return runningUnderJBoss;
    }

Context ctx = new InitialContext();
if (isRunningUnderJBoss(ctx)) {
    // ...
}

Now, this seems to work, but it feels like a hack. What is the "correct" way to do this? Ideally, I'd like a way that would work with a variety of application servers, not just JBoss.

There are a couple of ways to tackle this problem. One is to pass a Context object to the class when it is under unit test. If you can't change the method signature, refactor the creation of the initial context into a protected method and test a subclass that returns the mocked context object by overriding that method. That can at least put the class under test so you can refactor towards better alternatives from there.

The next option is to obtain database connections from a factory that can tell whether it is in a container or not, and do the appropriate thing in each case.

One thing to think about is - once you have this database connection out of the container, what are you going to do with it? It is easier, but it isn't quite a unit test if you have to carry the whole data access layer.

For further help in this direction of moving legacy code under unit test, I suggest you look at Michael Feathers' Working Effectively with Legacy Code.

The IT department I work in as a programmer revolves around a 30+ year old code base (Fortran and C). The code is in poor condition, partially as a result of 30+ years of ad hoc, poorly thought out changes, but I also suspect a lot of it has to do with the capabilities of the programmers who made the changes (and who, incidentally, are still around).

The business that depends on the software operates 363 days a year and 20 hours a day. Unfortunately there are numerous outages. This is the first place I have worked where there are developers on call to apply operational code fixes to production systems. When I first started, there was actually a copy of the source code and development tools on the production servers so that on-the-fly changes could be applied; thankfully that practice has now been stopped.

I have hinted a couple of times to management that the costs of the downtime, having developers on call, extra operational staff, unsatisfied customers, etc. are costing the business a lot more in the medium, and possibly even short, term than it would to launch a wholehearted effort to rewrite/refactor/replace the whole thing (the code base is about 300k lines).

Ideally there'd be some external consultancy that could come in and run the rule over the quality of the code and the costs involved to keep it running vs. rewriting/refactoring/replacing it. The question I have is: how should a business go about doing that kind of cost analysis on software AND be able to have confidence in that analysis? The first IT consultants down the street may claim to be able to do the analysis, but how could management be made to feel comfortable with it over what they are being told by internal staff?

  1. Read the book Working Effectively with Legacy Code (also see the short PDF version) and surround the code with automated tests, as instructed in that book.

  2. Refactor the system little by little. If you rewrite some parts of the code, do it a small subsystem at a time. Don't try to make a Grand Redesign.

After months of frustration, and of time spent inserting needles in voodoo dolls of previous developers, I decided that it is better to try to refactor the legacy code.

I have already ordered Michael Feathers' book, I am into Fowler's Refactoring, and I have made some sample projects with DUnit.

So even if I don't master the subject I feel it is time to act and put some ideas into practice.

Almost 100% of the code I work on has the business logic trapped in the UI; moreover, it is all procedural programming (with a few exceptions). The application started as quick & dirty and continued as such.

Now, writing tests for the whole application is a meaningless task in my case, but I would like to try to unit test something that I need to refactor.

One of the complex tasks one big "TForm business logic class" does is to read DB data, make some computations and populate a scheduler component. I would like to remove the reading of DB data and the computation part and assign that task to a new class. Of course this is just a way to improve the current design, not the best way if starting from scratch, but I'd like to do this because the data returned by this new class is also useful in other ways; for example, now I've been asked to send e-mail notifications of scheduler data.

So to avoid a massive copy and paste operation I need the new class.

Now the scheduler is populated from a huge dataset (huge in size and in number of fields), so probably a first refactoring step could be obtaining the dataset from the new class. But then, in the future, I'd better use a new class (like TSchedulerData or some other name less bound to the scheduler) to manage the data, and instead of having a dataset as the result I can have a TSchedulerData object.

Since refactoring occurs in small steps, and tests are needed to refactor better, I am a little confused about how to proceed.

The following points are not clear to me:

1) How do I test a complex dataset? Should I run the working application, save one result set to XML, and write a test where I use a TClientDataSet containing that XML data?

2) How much do I have to care about TSchedulerData? I mean, I am not 100% sure I will use TSchedulerData; maybe I will stick with the dataset. Anyway, thinking of creating complex tests that will be discarded in two weeks is not appealing for a DUnit newbie. But probably this is how it works. I can't imagine the number of bugs that I would face without a test.

Final note: I know someone will think rewriting from scratch is a better option, but it is not an option. "The application is huge and it is sold today, and new features are required today, not to go out of business." This is what I have been told; anyway, refactoring can save my life and extend the application's life.

Your eventual goal is to separate the UI, data storage and business logic into distinct layers.

It's very difficult to test a UI with automatic testing frameworks. You'll want to eventually separate as much of the business logic from the UI as possible. This can be accomplished using one of the various Model/View/* patterns. I prefer MVP Passive View, which attempts to make the UI nothing more than an interface. If you're using a dataset, MVP Supervising Controller may be a better fit.

Data storage needs to have its own suite of tests, but these are different from unit tests (though you can use the same unit testing framework) and there are usually fewer of them. You can get away with this because most of the heavy lifting is being done by third-party data components and a DBMS (in your case T*Dataset). These are integration tests, basically making sure your code plays nice with the vendor's code. They are also needed if you have any stored procedures defined in the DB. They are much slower than unit tests and don't need to be run as often.

The business logic is what you want to test the most. Every calculation, loop or branch should have at least one test(more is preferable). In legacy code this logic often touches the UI and db directly and does multiple things in a single function. Here Extract Method is your friend. Good places to extract methods are:

for I:=0 to List.Count - 1 do
begin
  //HERE
end;

if /*HERE if its a complex condition*/ then
begin
  //HERE
end
else
begin
  //HERE
end

Answer := Var1 / Var2 + Var1 * Var3; //HERE

When you come across one of these extraction points

  1. Decide what you want the method signature to look like for your new method: Method name, parameters, return value.
  2. Write a test that calls it and checks the expected outcome.
  3. Extract the method.

If all goes well you will have a newly extracted method with at least one passing unit test.

Delphi's built-in Extract Method doesn't give you any way to adjust the signature, so if that's your only option you'll have to make do and fix it after extraction. You'll also want to make the new method public so your test can access it. Some people balk at making a private utility method public, but at this early stage you have little choice. Once you've made sufficient progress you'll start to see that some utility methods you've extracted belong in their own class (in which case they'd have to be public anyway), while others can be made private/protected and tested indirectly by testing methods that depend on them.

As your test suite grows you'll want to run them after each change to ensure your latest change hasn't broken something elsewhere.

This topic is much too large to cover completely in an answer. You'll find the vast majority of your questions are covered when that book arrives.

Any useful metrics will be fine

Legacy code has been defined in many places as "code without tests". I don't think they are specific in the types of tests, but in general, if you can't make a change to your code without the fear of something unknown happening, well, it quickly devolves.

See "Working Effectively with Legacy Code"

At work, we have a three-tier product. There is a client application which the users use and it queries data from a server which forwards those requests to a SQL database. We don't allow the client to have direct access to the SQL server.

The client product is what I'm wanting to unit test, but it has over 1.2 million lines of C# code and is a very old product. It was not designed with unit testing in mind and the lead developers for this product are generally opposed to unit testing mostly because of risk vs reward concerns, as well as how redesign would be required to reduce the amount of mocking that would need to be done. The redesign of these core, low-level client libraries and objects also has them concerned.

My philosophy is to certainly never neglect unit testing (because we'll always be too busy for it, and it'll always seem risky, and thus will never ever get done) and take an iterative approach to implementing unit tests.

I'm interested in hearing solutions to this situation. I'm sure many of you have encountered the situation of having to add unit testing into existing infrastructure. How could unit tests be added iteratively into the code base without hindering productivity and release cycles?

I found Michael Feathers' Working Effectively with Legacy Code useful when researching this topic.

I am trying to refactor some code by breaking a class into several other classes. To do so I want to move some methods that already exist in my old class to a new class. But these methods are referred to in a lot of places, and manually updating the references seems tiresome. So is there any way to move methods as well as update their references in Eclipse?

I would do it this way:

  1. Ensure that your tests work and the code to be refactored is covered. If you don't have tests, write tests. They are your safety rope.
  2. Use the refactoring pattern extract superclass to create the new class that you want to move some methods to.
  3. Use the refactoring pattern pull up methods to move the methods, along with the variables that they need, to the superclass. Now you will see whether the methods you want to move and the instance variables have dependencies on the other methods that you don't want to move. If so, you must first break these dependencies.
  4. Find all client code that should use the new extracted class instead of the "old" class and rewrite it to use the new extracted class.
  5. Remove the "extends" relationship between the two classes. Now the client code should work, or you missed something.

Also, a good book for learning how to apply refactoring patterns is Working Effectively with Legacy Code.

I am trying to retrospectively unit test an application that is fairly complex but utilises MVC. I know retrospectively applying unit tests isn't ideal, but I still believe it's possible by refactoring the existing code. A majority of the time it is not possible to unit test one unit without relying on other units, e.g. a view relies on a model.

What is the best way to unit test in this case? Is it better to utilise the real model or create a mock model?

The problem with utilising a real model in my situation is that the model relies on other response classes that get data from XML so there's a chain of reliance. This model has a lot of data so it would be much easier to use this but maybe I'm missing the point.

I have provided a UML of the application for brevity.

(UML diagram of the application)

Edit

OK, so if I am correct, is it best practice to create mock data inside a mock class? For example, I have the mock class "MockPlaylistPanelModel" that creates the data required for the view class "PlaylistPanel" to run without errors:

class MockPlaylistPanelModel extends Mock implements IPlaylistPanelModel
{
  /** 
   * Return all playlist items
   * @public 
   */
  public function get mainPlaylistItems():Vector.<PlaylistData> 
  {
    var playData:Vector.<PlaylistData> = new Vector.<PlaylistData>;
    var playlistResp:PlaylistData = new PlaylistData(0, "", "", 0, 0, 0, 0);
    playData.push(playlistResp);
    return playData;
   }

}

To retrospectively fit unit testing into an existing application, you often need to change the application code to support unit testing (as you rightly mention you may need to perform some refactoring). However of course the risk here is changes to the application introduce bugs, which cannot be protected against without having some tests in place.

Therefore, a sensible approach is to get some system level tests in place covering some of your key use cases. This acts as a kind of 'test scaffolding' around your application, meaning that you can more safely begin to introduce lower level tests with a decreased risk of introducing bugs as you modify the application to make it more testable. Once this is in place, you can then introduce a policy that developers must write tests around any code they change before they change it - this allows you to organically grow a set of automated tests around the application.

I would highly recommend getting hold of Working Effectively with Legacy Code - this excellent book covers all sorts of useful techniques for introducing testing into an existing application which has little automated testing.

Regarding your question on whether you should create mock data inside a mock class for testing, this is one approach you can take when injecting test versions of objects, but probably not the best. By using a mocking framework like Mockito you can easily create mock objects with clearly defined behaviour on the fly. In your case, you can use Mockito to create a mock model implementation, and then inject your mock model into whatever object depends on it.

I have a Visual Studio 2005 C++ project, it is a console application.

I want to start getting bits of the code under a test harness but I've run into some issues that I don't know how to best handle.

I don't want most of my testing code to end up in the normal .exe in production so I thought would be best to create a separate project for my tests. First issue, how is this new project going to call into the rest of the code? Should I make my legacy code a .lib or .dll with a single entry point and create a separate project that calls the main of my legacy code?

Should I go for the ugly hack of putting all my tests in files that are entirely #ifdef TESTING so that the code doesn't end up in my production .exe? If so how should I conditionally load my testing framework? Use a separate Properties Configuration for testing?

I'm basically looking for any suggestions on how to go about getting a test harness on a legacy .exe project in Visual C++

First, I'd strongly recommend Michael Feathers' book "Working Effectively with Legacy Code". It's all about how to add automated unit tests to a legacy app that has no tests. If you're wondering "how do I even start testing this pile of code?", then this book is for you.

Michael is also the author of CppUnit, an Open Source NUnit-like testing framework for C++ code. You can find it here: http://sourceforge.net/projects/cppunit/.

One quick-and-dirty way to add tests is to add a UnitTest configuration to your solution. This configuration would compile your code, but instead of linking it to your main.CPP, you exclude your main.cpp from the build and include UnitTestMain.cpp, where you would place the calls to execute the unit tests. We started out this way a long time ago, when we didn't know any better. You end up spending a lot of time including and excluding all the various testMyCode.cpp modules to the various configurations, though, and it gets tiring after a while. We found that developers didn't like that approach too much.

A much better approach is to add a unit test project to your solution, with a build dependency upon your real project. If the project is named Foo.vcproj, call it Foo_test.vcproj. This project contains just your test code, it #includes your Foo headers, and it links to your compiled fooCode.obj modules. Add a call to execute Foo_test.exe as a post-build step of the Foo_test build, and it automatically runs the unit tests during the build. If any unit tests fail, the build fails. If you have gated-check-ins configured on your build server, nobody can check in changes that break existing tests.

I have been coding for a few years and still feel that my knowledge is not broad enough to become a professional. I have studied some books related to design patterns, but I know there are many others.

So could somebody list the patterns and principles which you think are good to learn to become a better programmer and more professional?

Programming Languages I work on: C#, Ruby, Javascript.

Encyclopedic knowledge of design patterns will get you nowhere. Plenty of experience applying them will. This will teach you when to use them and when not to.

That said, the original Design Patterns book is still one of my favorites. Pick up other patterns as you go along.

Martin Fowler's Patterns of Enterprise Application Architecture to build up a shared vocabulary with other developers (e.g. Repository, Active Record, Domain Model, Unit of Work).

Douglas Crockford's Javascript: The Good Parts to actually understand how Javascript works.

And I'd really recommend getting into TDD (Test-Driven Development). There are a bunch of good TDD books, but if you are doing brownfield development (which most of us are) then I'd really recommend Michael Feathers' Working Effectively with Legacy Code.

And finally the book that shows you just how refactored and clean code can be: Uncle Bob's Clean Code.

I have the following method and am looking to write effective unit tests that also give me good coverage of the code paths:

public TheResponse DoSomething(TheRequest request)
{
    if (request == null)
        throw new ArgumentNullException("request");

    BeginRequest(request);

    try
    {
        var result = Service.DoTheWork(request.Data);

        var response = Mapper.Map<TheResult, TheResponse>(result);

        return response;
    }
    catch (Exception ex)
    {
        Logger.LogError("This method failed.", ex);

        throw;
    }
    finally
    {
        EndRequest();
    }
}

The Service and Logger objects used by the method are injected into the class constructor (not shown). BeginRequest and EndRequest are implemented in a base class (not shown). And Mapper is the AutoMapper class used for object-to-object mapping.

My question is what are good, effective ways to write unit tests for a method such as this that also provides complete (or what makes sense) code coverage?

I am a believer in the one-test-one-assertion principle and use Moq as a mocking framework with VS Test (although I'm not hung up on that part for this discussion). While some of the tests (like making sure passing null results in an exception) are obvious, I find myself wondering whether others that come to mind make sense or not; especially when they are exercising the same code in different ways.

Judging from your post/comments you seem to know what tests you should write already, and it pretty much matches what I'd test after a first glance at your code. A few obvious things to begin with:

  • Null argument exception check
  • Mock Service and Logger to check whether they're called with proper data (see the sketch after this list)
  • Stub Mapper (and potentially Service) to check whether correct results are actually returned
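For the first two bullets, tests might look roughly like this with Moq and MSTest (a sketch only: the class name TheClass, its constructor signature, and the IService/ILogger/TheData types are assumptions on my part, since the constructor and dependency types are not shown in the question):

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;

[TestClass]
public class DoSomethingTests
{
    [TestMethod]
    public void DoSomething_NullRequest_ThrowsArgumentNullException()
    {
        var sut = new TheClass(new Mock<IService>().Object, new Mock<ILogger>().Object);

        Assert.ThrowsException<ArgumentNullException>(() => sut.DoSomething(null));
    }

    [TestMethod]
    public void DoSomething_ServiceThrows_LogsAndRethrows()
    {
        var service = new Mock<IService>();
        var logger = new Mock<ILogger>();
        service.Setup(s => s.DoTheWork(It.IsAny<TheData>()))
               .Throws(new InvalidOperationException());
        var sut = new TheClass(service.Object, logger.Object);

        Assert.ThrowsException<InvalidOperationException>(
            () => sut.DoSomething(new TheRequest()));

        // the catch block should have logged before rethrowing
        logger.Verify(l => l.LogError("This method failed.", It.IsAny<Exception>()), Times.Once);
    }
}

The second test also exercises the catch/rethrow path without ever reaching the AutoMapper call, which keeps it independent of the mapping configuration.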

Now, the difficult part. Depending on whether your base class is something you have access to (e.g. you can change it without too much trouble), you can try an approach called Extract & Override:

  1. Mark BeginRequest/EndRequest as virtual in base class
  2. Do nothing with them in derived class
  3. Introduce a new, testable class deriving from the class you want to test; override the methods from the base class (BeginRequest/EndRequest), making them, e.g., set some internal value which you can easily verify later

Code could look more or less like this:

class Base
{
    protected virtual void BeginRequest(TheRequest request) { /* ... */ }
    protected virtual void EndRequest() { /* ... */ }
}

class Derived : Base // class you want to test
{
    // your regular implementation goes here;
    // the virtual methods remain the same
}

class TestableDerived : Derived // class you'll actually test
{
    // here you could, for example, expose some properties
    // recording whether Begin/EndRequest were actually called,
    // whether the calls were made in the correct order, and so on -
    // whatever makes verification easier later

    protected override void BeginRequest(TheRequest request) { /* ... */ }
    protected override void EndRequest() { /* ... */ }
}

You can find more about this technique in the Art of Unit Testing book, as well as in Working Effectively with Legacy Code. Even though I believe there are probably more elegant solutions, this one should enable you to test flows and verify the general correctness of interactions within the DoSomething method.
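To make the sensing idea concrete, the testable subclass and a test using it might look roughly like this (a sketch only: the flag names, the constructor arguments and the MSTest attribute are assumptions, not part of the original answer):

public class TestableDerived : Derived
{
    public bool BeginRequestCalled { get; private set; }
    public bool EndRequestCalled { get; private set; }

    // forward whatever stubbed dependencies Derived's constructor needs

    protected override void BeginRequest(TheRequest request) { BeginRequestCalled = true; }
    protected override void EndRequest() { EndRequestCalled = true; }
}

[TestMethod]
public void DoSomething_AlwaysBeginsAndEndsTheRequest()
{
    var sut = new TestableDerived(/* stubbed dependencies */);

    try { sut.DoSomething(new TheRequest()); }
    catch { /* whether it throws is not the concern of this test */ }

    Assert.IsTrue(sut.BeginRequestCalled);
    Assert.IsTrue(sut.EndRequestCalled);
}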

Problems of course appear when you don't have access to the base class and can't modify its code. Unfortunately I don't have an out-of-the-book solution for that situation, but perhaps you could create virtual wrappers around Begin/EndRequest in Derived, and still use Extract & Override via a TestableDerived.

I joined a new company about a month ago. The company is rather small in size and has a pretty strong "start-up" feel to it. I'm working as a Java developer on a team of 3 others. The company primarily sells a service for businesses/business-type people to use in communicating with each other.

One of the main things I have been, and will be, working on is the main website for the company - from which the service is sold, existing users log in to check their service and pay their bills, new users can sign up for a trial, etc. Currently this is a JSP application deployed on Tomcat, with access to a database done through a persistence layer written by the company itself.

A repeated and growing frustration I am having here (and I'm pretty happy with the job overall, so this isn't an "oh no I don't like my job"-type post) is the lack of any larger design or architecture for this web application. The app is made up of several dozen JSP pages, with almost no logic existing in Servlets or Beans or any other sort of framework. Many of the JSP pages are thousands of lines of code, they jsp:include other JSP pages, business logic is mixed in with the HTML, frequently used snippets of code (such as obtaining a web service connection) are cut and pasted rather than reused, etc. In other words, the application is a mess.

There have been some rumblings within the company of trying to re-architect this site so that it fits MVC better; I think that the developers and higher-ups are beginning to realize that this current pattern of spaghetti code isn't sustainable or very easily scalable to add more features for the users. The higher-ups and developers are wary of completely re-writing the thing (with good reason, since this would mean several weeks or months of work re-writing existing functionality), but we've had some discussions of (slowly) re-writing certain areas of the site into a new framework.

What are some of the best strategies to enable moving the application and codebase into this direction? How can I as a developer really help move this along, and quickly, without seeming like the jerk-y new guy who comes into a job and tells everyone that what they've written is crap? Are there any proven strategies or experiences that you've used in your own job experience when you've encountered this sort of thing?

First pick up a copy of Michael Feathers' Working Effectively with Legacy Code. Then identify how best to test the existing code. The worst case is that you are stuck with just some high-level regression tests (or nothing at all); if you are lucky, there will be unit tests. Then it is a case of slow, steady refactoring, ideally while adding new business functionality at the same time.

I'm reading a lot about good and bad practices in OOP design. It's nice to know your design is bad, or good. But how do you get from bad to good design? I've split the interface (XAML) and code-behind from the main business logic class. That last class is growing big. I've tried splitting it up into smaller classes, but I'm stuck now. Any ideas on how to split large classes? The main class has 1 list of data of different types. I'm doing calculations on the total, but also on the individual types. I've got methods to perform these calculations which are called from events handled in the code-behind. Any ideas where to go from here?

Additional Info:

We are already about 6 months into this project. I've worked with object-oriented languages for years (first C++, then Java and now C#), but never on a large project like this one. I believe we've made some wrong turns in the beginning and I think we need to correct these. I can't specify any details on this project at the moment. I'm going to order one or two books about design. If I separate all the classes, how do I stick them back together? Maybe it's even better to continue this way to the first release and rebuild parts after that, for a second release?

I highly recommend picking up Code Complete. It's a great book that offers tons of good advice on questions like yours.

To give you a quick answer to your question about how to split large classes, here's a good rule of thumb: make your class responsible for one thing, and one thing only. When you start thinking like that, you quickly can identify code that doesn't belong. If something doesn't belong, factor it out into a new class, and use it from your original class.

Edit: Take that thinking down to the "method" level, too - make your methods responsible for one thing, and one thing only. Helps break large (>50 line) methods down very quickly into reusable chunks of code.
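As a rough illustration of that advice (the names here are hypothetical, since the original class isn't shown), the calculations over the list could move into a class that is responsible for exactly that and nothing else:

using System.Collections.Generic;
using System.Linq;

public class DataItem
{
    public string Type { get; set; }
    public decimal Value { get; set; }
}

public class TotalsCalculator
{
    // responsible for one thing: computing totals over the data
    public decimal Total(IEnumerable<DataItem> items)
    {
        return items.Sum(i => i.Value);
    }

    public decimal TotalForType(IEnumerable<DataItem> items, string type)
    {
        return items.Where(i => i.Type == type).Sum(i => i.Value);
    }
}

The big class then delegates to TotalsCalculator from its event handlers instead of doing the arithmetic itself, which also makes the calculations trivially unit-testable.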

Refactoring by Martin Fowler is an excellent book about how to change the design of your software without breaking it.

Design Patterns works much like a book of algorithms, but it tells you how to combine objects to perform various useful tasks.

Finally, Martin Fowler has a variety of useful design patterns for applications. For example, Passive View.

Michael Feathers's "Working Effectively with Legacy Code" is supposed to be very good, but I'll confess that I haven't read it myself.

Same goes for "Refactoring to Patterns."

I've inherited a functional but messy WPF MVVM application. To my annoyance, there were virtually no unit tests, which makes the adoption of MVVM somewhat pointless. So I decided to add some.

After grabbing the low-hanging fruit, I'm starting to run into trouble. There's a lot of highly interdependent code, especially inside the properties and methods that are used in the Views. It's common, for instance, for one call to set off a chain of "property change" events which in turn set off other calls and so forth.

This makes it very hard to test, because you have to write enormous mocks and set up a large number of properties to test single functions. Of course you only have to do it once per class and then re-use your mock and ViewModel on each test. But it's still a pain, and it strikes me this is the wrong way to go about things. It would be better, surely, to try and break the code down and make it more modular.

I'm not sure how realistic that is in MVVM. And I'm in a vicious circle because without good tests, I'm worried about breaking the build by refactoring to write good tests. The fact it's WPF MVVM is a further concern because nothing tracks the interdependence between View and ViewModel - a careless name change could completely break something.

I'm working in C# VS2013 and grabbed the trial version of ReSharper to see if it would help. It's fun to use, but it hasn't so far. My experience of unit testing is not vast.

So - how can I approach this sensibly, methodically and safely? And how can I use my existing tools (and look at any other tools) to help?

  1. Start at the heart -- business logic

Your application solves some real world problems and this is what business logic represents. Your view models wrap around those business logic components even if they (components) don't exist just yet (as separate classes).

As a result, you should assume that the view model is a lightweight data preparation/rearrangement object. All the heavy lifting should be done within the business logic objects. The view model should be served data that is display-ready.

  2. Modularize

With this important assumption in mind (VM = no BL), move the business logic components to separate, possibly modular projects. Organizing the BL in a modular way will often result in a project structure similar to:

  • Company.Project.Feature.Contract - interfaces, entities, DTOs relating to feature
  • Company.Project.Feature - implementation of contract
  • Company.Project.Feature.Tests - tests for feature implementation

Your goal with #1 and #2 is to reach a state where abandoning WPF and replacing it with a console interface would only require writing the console interface and wiring it to the BL layer. Everything WPF-related (Views, ViewModels) could be shift-deleted, and such an action should not break a thing.

  3. IoC for testability

Get familiar with dependency injection and abstract where needed. Introduce an IoC container. Don't make your code rely on implementations; make it rely on contracts. Whenever a view model takes a dependency on a BL implementation, swap it for a BL abstraction. This is one of the necessary steps to make view models testable.
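A minimal sketch of what "rely on contract, not implementation" can look like for a view model (all the names here - IOrderService, OrderDto, OrderViewModel - are made up for illustration):

using System.Collections.Generic;

public class OrderDto
{
    public string Number { get; set; }
    public string DisplayTotal { get; set; }   // already formatted: display-ready
}

public interface IOrderService                 // would live in a ...Feature.Contract project
{
    IReadOnlyList<OrderDto> GetOpenOrders();
}

public class OrderViewModel                    // depends only on the abstraction
{
    private readonly IOrderService orders;

    public OrderViewModel(IOrderService orders)    // supplied by the IoC container
    {
        this.orders = orders;
    }

    public IReadOnlyList<OrderDto> OpenOrders { get; private set; }

    public void Load()
    {
        OpenOrders = orders.GetOpenOrders();       // no business logic in the VM
    }
}

In a unit test IOrderService is replaced by a stub or a Moq mock; in the running application the container supplies the real implementation from the feature project.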

This is how I would start. By no means is this an exhaustive list, and I'm convinced you'll run into situations where sticking to it won't be enough. But that's what it is -- working with legacy code is not an easy task.

I've joined a team that works on a product. This product has been around for ~5 years or so, and uses ASP.NET WebForms. Its original architecture has faded over time, and things have become relatively disorganized throughout the solution. It's by no means terrible, but definitely can use some work; you all know what I mean.

I've been performing some refactorings since coming on to the project team about 6 months ago. Some of those refactorings are simple, Extract Method, Pull Method Up, etc. Some of the refactorings are more structural. The latter changes make me nervous as there isn't a comprehensive suite of unit tests to accompany every component.

The whole team is on board for the need to make structural changes through refactoring, but our Project Manager has expressed some concerns that we don't have adequate tests to make refactorings with the confidence that we aren't introducing regression bugs into the system. He would like us to write more tests first (against the existing architecture), then perform the refactorings. My argument is that the system's class structure is too tightly coupled to write adequate tests, and that using a more Test Driven approach while we perform our refactorings may be better. What I mean by this is not writing tests against the existing components, but writing tests for specific functional requirements, then refactoring existing code to meet those requirements. This will allow us to write tests that will probably have more longevity in the system, rather than writing a bunch of 'throw away' tests.

Does anyone have any experience as to what the best course of action is? I have my own thoughts, but would like to hear some input from the community.

Your PM's concerns are valid - make sure you get your system under test before making any major refactorings.

I would strongly recommend getting a copy of Michael Feathers' book Working Effectively With Legacy Code (by "Legacy Code" Feathers means any system that isn't adequately covered by unit tests). This is chock-full of good ideas for how to break down those couplings and dependencies you speak of, in a safe manner that won't risk introducing regression bugs.

Good luck with the refactoring programme; in my experience it's an enjoyable and cathartic process from which you can learn a lot.

Recently I inherited a business critical project at work to "enhance". The code has been worked on and passed through many hands over the past five years. Consultants and full-time employees who are no longer with the company have butchered this very delicate and overly sensitive application. Most of us have to deal with legacy code or this type of project... its part of being a developer... but...

There are zero unit tests and zero system tests. Logic is inter-mingled (and sometimes duplicated for no reason) between stored procedures, views (yes, I said views) and code. Documentation? Yeah, right. I am scared. Yes, very scared to make even the most minimal tweak or refactor. One little mishap, and there would be major income loss and potential legal issues for my employer.

So, any advice? My first thought would be to begin writing assertions/unit tests against the existing code. However, that can only go so far because there is a lot of logic embedded in stored procedures. (I know it's possible to test stored procedures, but historically it's much more difficult than unit testing source code logic.) Another or additional approach would be to compare the database state before and after the application has performed a function, make some code changes, then do the database state comparison again.

I've had this problem before and I've asked around (before the days of stack overflow) and this book has always been recommended to me. http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

What books should you read to improve your code and get used to good programming practices after getting a taste of the language?

C++ Coding-Standards: 101 Rules, Guidelines, and Best Practices

by Herb Sutter and Andrei Alexandrescu


Meyers' Effective C++, "More Effective C++" and "Effective STL".

Design Patterns by the 4 guys affectionately nicknamed "the gang of four".

Lakos' Large Scale C++ Software Design.

There are of course many others, including many truly good ones, but if I had to pick 3 on C++ (counting Meyers' three thin, information-packed volumes as one;-) these would be it...

What are the different purposes of these two? I mean, under which conditions should I do each of them?

As an example scenario: if you have a backend server and several front-end web apps, which would you do first: unit testing of the backend server, or UI testing of the web UI? Given that the server and the front-end webs already exist, it's not an iterative design to build along with (TDD)...

Unit testing aims to test small portions of your code (individual classes / methods) in isolation from the rest of the world.

UI testing may be a different name for system / functional / acceptance testing, where you test the whole system together to ensure it does what it is supposed to do under real life circumstances. (Unless by UI testing you mean usability / look & feel etc. testing, which is typically constrained to details on the UI.)

You need both of these in most of projects, but at different times: unit testing during development (ideally from the very beginning, TDD style), and UI testing somewhat later, once you actually have some complete end-to-end functionality to test.

If you already have the system running, but no tests, practically you have legacy code. Strive to get the best test coverage achievable with the least effort first, which means high level functional tests. Adding unit tests is needed too, but it takes much more effort and starts to pay back later.

Recommended reading: Working Effectively with Legacy Code.

When it comes to the use of design patterns, I am guessing there are three types of shops. Those who wouldn't know a pattern if it hit them in the face - these usually prefer the Ctrl-C / Ctrl-V approach to code-reuse. Those who spend hours a day searching their legacy code in hopes of implementing one more great pattern - these usually spend more time refactoring the code of simple programs than would ever be spent in a hundred years of maintenance. And finally, those who walk the middle path using patterns when they make sense, and coding whatever comes first for minimally exposed code.

I'm wondering if anyone has locked down a good approach to a balanced incorporation of pattern use in the software development life cycle. Also, what is the best resource on the Web for patterns, their motivating factors, and their proper use?

Thanks.

There are lots of different 'pattern' families out there, but taking your question in its broadest terms...

I'd recommend:

Offline (my favourites):

Offline (popular):

  • GoF Design Patterns
  • Fowler's Refactoring: Improving the Design of Existing Code

I agree with these references, but they are not newbie-friendly. To quick-start learning design patterns the easy way:

Head First Design Patterns

Then you can check the more advanced books once you've got the big picture (it has a lot of practical examples ;-))

The definitive and original answer is the origin of the concept, Design Patterns. This book deals with the concept at a fairly theoretical level, rather than the management speak that has infiltrated the area. Their view is that design patterns are just names for common idioms; they enumerate some and justify their positions.

They avoid the "What design pattern should I use?" type of question, instead dealing with the problem as "Am I naturally stepping into a well-known area? If so, can others' experience help me?". Design patterns, to me, are not like prefab components you glue together to make a solution. They are just guidance repositories for when you come across a situation similar to one others have encountered, and they give names that allow people to refer to the general situations in conversation.

Possible Duplicates:
What should I keep in mind in order to refactor huge code base?
When is it good (if ever) to scrap production code and start over?

I am currently working with some legacy source code files. They have quite a few problems because they were written by a database expert who does not know much about Java. For instance,

  1. Fields in classes are public. No getters and setters.
  2. Use raw types, not parameterized types.
  3. Use static unnecessarily.
  4. Super long method names.
  5. Methods need too many parameters.
  6. Repeat Yourself frequently.

I want to modify them so that they are more object-oriented. What are some best practices and effective/efficient approaches?

Read "Working Effectively with Legacy Code" by Michael Feathers. Great book - and obviously it'll be a lot more detailed than answers here. It's got lots of techniques for handling things sensibly.

It looks like you've identified a number of issues, which is a large part of the problem. Many of those sound like they can be fixed relatively easily - it's overall design and architecture which is harder to do, of course.

Are there already unit tests, or will you be adding those too?

Possible Duplicate:
When is it good (if ever) to scrap production code and start over?

Say you have been given a project, and you look at the code. Although it works, and is functional, you realize that to make future changes it would be easier to rewrite a large portion or all of the code. Is it better to do the rewrite? If it costs you a delay now, but saves you delays later (and possible bug fixes later), is it worth it?

Or do you simply fix what you see as you go along and are touching that part of the code?

Or do you fix it only if a bug is reported that would require touching that code?

This is a complicated issue many of us face, me included. Consider reading a good book on the subject, because this is a huge topic. How much time/money do you have, and do you really understand the scope? If (1) you have enough time/money, and (2) you're sure the solution will be active long enough to make back the investment, and (3) you are certain you understand the scope of a total revamp, then the total revamp may well be better. But that is a lot of "ifs".

Because of time/money constraints and uncertainty concerning how long the solution will be relevant, piecemeal refactoring is more often the better choice (meaning, lower risk). How agile are you? If your flexibility and designing-on-the-fly skills are your strong point, and your client/boss supports you in this, then gradual refactoring will almost always be the better choice.

  • Use the same balancing skills you use when evaluating how exactly to write new code.
  • Compare time spent now against maintainability/testability/simplicity later.
  • Refactor the code most likely to need to be debugged/fixed/reconfigured in the future.
  • Don't bite off more than you can chew; usually there isn't a lot of time available for refactoring.
  • Unit tests are your best friend when refactoring.

And again, don't try to do too much! When your client/boss asks why features aren't being delivered, it doesn't sound good to say "well, I started refactoring a bunch of stuff, and it caused a lot of bugs, and I'm still trying to fix those." Better to get just one key part of your code in good shape.

This is what refactoring is all about. Changing the design of existing code without changing its functionality. You say you "realize" that it would be easier to rewrite, but that entails a certain presumption: that you know what will be required of the code in the future, that you know how much work rewriting it would be, and that you know how much work reworking it would be. I haven't known any programmers for whom that is true (myself very much included).

Refactoring - versus rewriting - offers some protection against this presumption. Done well, you can work only on what you need today, design the code you have so it can serve the immediate need you have, then modify it to serve that need. Bit by bit, your code gets cleaner and more malleable. And the bits that change are the ones that need changing - without making impossible predictions about the future. If you haven't read Martin Fowler's Refactoring, have a look at it; also Michael Feathers' Working Effectively with Legacy Code.

90% of our code is linear in nature. We have functions spread around in some places, but our code generally looks like this:

<?php

// gather some data
// do something with that data
// instantiate a bunch of globals

// output a bunch of stuff to the browser

include_once "file.php";

// output some more stuff
include_once "file2.php";

// ad nauseum

then in file.php

<?php

// do something with the globals from above
// gather some more data
// do something with this newfound data
// output more stuff to the browser

As part of moving to a cleaner code base, I want to begin testing this, but I'm unsure as to the proper way to do that. Any suggestions? I'm at a loss as to the proper method.

As others have commented, you've essentially written procedural code. That type of code is typically not very conducive to Unit Testing or Test Driven Development. To begin with, you should probably become familiar with Object Oriented Programming and start grouping related pieces of functionality into appropriate abstractions.

You could try looking for Link Seams and other such tricks, but you're likely in for a world of hurt unless you start changing the paradigm, or unless you can break your procedural PHP into enough tiny files with fixed input and output points that you can test each in isolation. Either way, that will definitely require creating methods and eliminating as many of the GLOBALS as possible.

The first thing you need to do is probably read Chapter 19 of Michael Feathers' Working Effectively with Legacy Code. In fact, read the whole book. Because the "seemingly simple bit" of adding tests is going to require a paradigm shift.

At work we have a legacy process written in Visual C++ that basically consists of a single 5000-line function. In essence the program is just one big case statement, with similar cut-and-pasted code handling a fair amount of the logic for each case. Obviously we would like to refactor this code to extract these cases out into separate functions (or objects) and to eliminate any cut-and-pasted code.

My question is - are there any suggestions for going about a refactoring effort of this size? Are there any automated tools that could streamline the process?

Several suggestions:

You can find copy-and-pasted code with a tool like Duplo. A script (in the scripting language of your choice) or an editor with multi-line search-and-replace can help you replace that code with function calls. (This can also be done by hand, but using a script or search-and-replace helps eliminate the possibility of error.)

Refactoring tools can automatically do operations like extracting a portion of a function into a new function. Eclipse CDT, for example, can do this, and it's free, and it can edit code that's maintained in other IDEs. This doesn't always work, but when it does work, it's truly amazing to watch as the IDE splits apart a thousands-line method, extracting just what you want, and correctly identifying every variable that needs to be passed in as a new parameter to your new method... Other refactoring tools are available, such as Refactor! Pro (free version available), but I've not used them.

More generally, Michael Feathers' book Working Effectively with Legacy Code is the standard work on doing this kind of thing. Basically, you'll want to set up characterization tests - similar to unit tests, but the goal is to cover (characterize) as much of the current function's behavior as possible, rather than testing the smallest possible units for correctness - then apply refactorings one at a time. (Feathers includes a catalog of refactorings and other techniques that are particularly useful for legacy code.)

Before I ask my "general" question, I wanted to present some quotes to my high-level general understanding of the business delegate pattern:

"You want to hide clients from the complexity of remote communication with business service components."

"You want to access the business-tier components from your presentation-tier components and clients, such as devices, web services, and rich clients."

Let's say you have a large J2EE web application. By large, I mean that the application has been developed for several years and by many different people and departments. The system wasn't designed well for growth. Right now, let's say the application is configured as one web application.

It would have been great if the application were more modular. But it isn't. One piece of code or library can bring down the entire system. Testing and other means are used to prevent this, but ideally this is just one monolithic web application.

Here is my question: how do you normally avoid this type of design with J2EE apps, where the application grows and grows and one separate piece can bring down everything?

I am not familiar with EJBs and don't plan on using them for anything too serious. But the concepts of the Business Delegate and Service Locator patterns seem like a good fit.

Let's say you have a shopping cart screen (same web app) and a set of screens for managing a user account (same web app). In your web tier (some kind of MVC action method), it seems you could have a Locator that will get the business-specific interface, invoke those interfaces per module, and if something goes wrong with one module then it doesn't kill other code. Let's say the shopping cart module fails (for whatever reason); the user account screen still works.


The best way to improve a system like this is to slowly make it better. If your system is anything like the systems I've worked on, changing the code often leads to bugs. Automated tests are a way to reduce the chance of introducing new bugs, but often the code wasn't written with testing in mind, and changing the code to make it easier to write tests can lead to bugs.

The way around this problem is to introduce some automated integration tests, and use those tests as a safety net as you carefully refactor the code and introduce tests at a lower level. Making code more testable often results in introducing interfaces and abstractions that make the code easier to work with. It also often requires separating business logic from presentation logic (since testing both together is painful) and breaking out our code into modules (since classes with many dependencies can be hard to test).

When you write new code for your system, try to write unit tests while you write the code. This is much easier (and therefore less frustrating) than writing tests for legacy code, and gives you a chance to see where you might go when you refactor your legacy code.

I know of two excellent books on this subject: Working Effectively with Legacy Code by Michael Feathers and Refactoring to Patterns by Joshua Kerievsky.

Finally, consider using a dependency injection framework, like Spring or Guice. Dependency injection makes it easier to make your code testable with unit tests. It also makes it easier to follow good design practices like the Dependency Inversion Principle and the Interface Segregation Principle.

Chapter about TDD from Martin's "Clean Code" caught my imagination.

However.
These days I am mostly expanding or fixing large existing apps.
TDD, on the other hand, seems to work only for writing from scratch.

Talking about these large existing apps:
1. they were not written with TDD (of course).
2. I cannot rewrite them.
3. writing comprehensive TDD-style tests for them is out of the question in the timeframe.

I have not seen any mention of a TDD "bootstrap" into a large, monolithic existing app.

The problem is that most classes of these apps, in principle, work only inside the app.
They are not separable. They are not generic. Just to fire them up, you need half of the whole app, at least. Everything is connected to everything.
So, where is the bootstrap ?

Or is there an alternative technique with the benefits of TDD that would work for expanding existing apps that were not developed with TDD?

The bootstrap is to isolate the area you're working on and add tests for behavior you want to preserve and behavior you want to add. The hard part of course is making it possible, as untested code tends to be tangled together in ways that make it difficult to isolate an area of code to make it testable.

Buy Working Effectively with Legacy Code, which gives plenty of guidance on how to do exactly what you're aiming for.

You might also want to look at the answers to this related question, Adding unit tests to legacy code.

I have recently started on a very large legacy project and am in a situation I always feared: reading and understanding other people's code. I have known that this is an essential skill, but I haven't developed it, since until now it was not required; now developing this skill is a necessity rather than a hobby. So I would like to know from SO readers:

  1. How have you overcome the hurdle of reading other people's code?
  2. What techniques or skills have you developed to polish your art of reading and understanding other people's code?
  3. Are there any books or articles which you have referred to, or in general how did you develop the skill of reading and understanding other people's code?

I would highly appreciate useful answers to this question, as now I can understand how one would feel while trying to understand my code.

Patience: Understand that reading code is more difficult than writing new code. Then you need to respect the code, even if it is not very readable, for it does its job and in many cases pretty efficiently. You need to give the code time and effort to understand it.

Understand the Architecture: It is best if there is any documentation on this. Try talking to people who know more about it if they are available.

Test it: You need to spend some time testing and debugging the code so you know what it does. For those parts you understand, write some unit tests if possible so you can use them later.

Be Unassuming: Many times the names of the patterns are misused. The classes have names which do not indicate their purpose. So don't assume anything about them.

Learn Refactoring: The best book I found on this topic is Refactoring: Improving the Design of Existing Code - By Martin Fowler. Working Effectively with Legacy Code is another awesome one.

Michael Feathers' Working Effectively with Legacy Code is a great resource that contains a large number of techniques for working with older code.

Do you refactor your SQL first? Your architecture? or your code base? Do you change languages? Do you throw everything away and start from scratch? [Not refactoring]

This really depends on the state of the codebase... are there massive classes? one class with mega-methods? Are the classes tightly coupled? is configuration a burden?

Considering this, I suggest reading Working Effectively with Legacy Code, picking out your problems, and applying the recommendations.

I am adding tests to some gnarly legacy code in order to have confidence enough to seriously refactor it. One of the issues is that whoever wrote the code obviously made no attempt to make the code testable (given that they never wrote a single unit test!)

A common issue is that there are currently no interfaces, just an 11-level-deep inheritance chain. I am using Rhino Mocks to isolate the class under test from its dependencies, but as I am mocking a class, not an interface, I can only stub a read-only property if it has the virtual keyword.

My current thinking is that I will just add the virtual keyword to the property. There is no plan to add any further objects into the existing dependency chain and it will allow the tests to be written.

Are there any arguments against adding the virtual keyword, or is this an acceptable compromise in order to get tests in?

Example code...

In the test class:

var someClassStub = MockRepository.GenerateStub<SomeClass>();
someClassStub.Stub(s => s.SomeProperty).Return("Test");

In SomeClass:

public virtual string SomeProperty {
    get {
        return someDependency.SomeMethod();
    }
}

Instead of adding virtual everywhere, there are a few safer methods of making your code initially testable. Personally, I'd highly recommend using the "Extract Interface" tool provided with Visual Studio and replacing concrete class references with the interface wherever it is safe to do so. Then, mock the interface instead of the concrete class.

If you're using a version of Visual Studio(or some other IDE) that doesn't support Extract Interface, all you have to do is track down all the public members of the class and add them to an interface and make your concrete class implement it.
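For example, extracting an interface for the SomeClass shown above might look like this (a sketch; the ISomeClass name and the NUnit attributes are assumptions):

using NUnit.Framework;
using Rhino.Mocks;

public interface ISomeClass
{
    string SomeProperty { get; }
}

// SomeClass simply adds ": ISomeClass" to its declaration; its body stays as it is,
// and SomeProperty no longer needs the virtual keyword.

[TestFixture]
public class ConsumerTests
{
    [Test]
    public void UsesTheStubbedProperty()
    {
        var someClassStub = MockRepository.GenerateStub<ISomeClass>();
        someClassStub.Stub(s => s.SomeProperty).Return("Test");

        // pass someClassStub to the class under test wherever it used to take a SomeClass
        Assert.AreEqual("Test", someClassStub.SomeProperty);
    }
}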

Your first priority should be getting that initial set of tests. This way you can later make more dangerous changes with reasonable certainty that your code isn't broken.

For anyone working on making old legacy code unit testable, I'd highly recommend reading the book Working Effectively With Legacy Code. It is well worth the money. So much so that my manager ended up buying my office a copy to consult.

I am looking to run a load of automated functionality tests on a user interface application of mine and was wondering what the best software is to carry out these tests. Preferably the software will be able to integrate with Visual C++ 2005. I have googled various software, but there is so much out there that I'm not sure what is best for what I need. Any help would be awesome, thanks.

For automated software unit tests I would recommend Google Test. There is a very good Q&A on this platform, which you can find here.

Additionally, there is CppUnitLite, which was developed by the author of "Working Effectively with Legacy Code", Michael Feathers.

I used AutoIt scripts for testing an MFC application just a little bit, but it was not that easy to maintain them properly and build an effective logging system for failed tests.

However, the unit tests depend heavily on the architecture of your program and the structure of your class - especially the dependencies to other components / classes. So if you already have an existing MFC application, which was not built with unit tests in mind, you probably have to refactor a lot of things. Therefore, I would recommend the mentioned book. You can also use the classic "Refactoring" by Martin Fowler.

If I have a type with a big-old (lots of params) constructor, is it a valid approach to implement a protected parameterless constructor simply for the purposes of creating a derived "Fake" type to use for stubbing in unit tests?

The alternative is to extract an interface, but this is not always desirable in a codebase you do not have full control over...

It's not ideal, but it is valid.

To quote a couple of people who know more about this than me, in The Art of Unit Testing, Roy Osherove talks about unit tests being like a user of the code, and as such providing access specifically for them is not necessarily a bad thing.

And in Working Effectively with Legacy Code, Michael Feathers discusses several such techniques, pointing out that making things protected for testing can be better than making them public, although it's not necessarily the best way to do things. (In fact I'd recommend you read that book if you are working with legacy code, as it sounds like you are).

I think it depends what sort of code it is - if it's a public API where people are likely to take protected access to mean it's designed to be overridden, it's probably a bad idea, but on a typical business app where it's pretty obviously not meant to be overridden, I don't think it's a problem. I certainly do it sometimes in this situation.
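A minimal sketch of the compromise being discussed (all type names here are hypothetical):

public interface IDataStore { /* ... */ }
public interface IClock { /* ... */ }

public class Widget
{
    private readonly IDataStore store;
    private readonly IClock clock;

    // the "big-old" constructor with many parameters
    public Widget(IDataStore store, IClock clock /*, ... */)
    {
        this.store = store;
        this.clock = clock;
    }

    // exists only so a derived fake can be created without supplying every dependency
    protected Widget() { }

    public virtual string Describe() { return "real description"; }
}

// in the test project
public class FakeWidget : Widget
{
    public override string Describe() { return "canned description for the test"; }
}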

I have an existing asp.net webforms application that I would like to add some unit testing to but am unsure of exactly how to go about it.

The application is database driven, with functionality I guess you could compare to an advanced forum. Logic, data access and presentation are separated for the most part.

What methods should I be testing?

How do I handle the database and test data?

Are there any tools recommended to assist in this?

The first thing you need to decide is: What is your motivation for adding unit tests?

There are many excellent reasons for having unit tests (I rigorously practice TDD myself), but knowing which one is the main driving force in your case should help you decide which tests to first write.

In most cases you should concentrate on writing unit tests for the areas of your application that have been causing you the most pain in the past.

Much experience has shown that when software was originally written without unit tests, it can be hard to retrofit them later. Working Effectively with Legacy Code provides valuable guidance on how to make untested software projects testable.

I have an object which has a public void method on it that just modifies internal state. Part of me feels as though it should have tests as it is a public method but how do you go about testing this? Do you bother or do you rely on the other tests which make use of the fact that the internal state has changed?

To add an example I am implementing a gap buffer which has an insert method on it and a moveCursorForward method which just modifies the internal state.

Thanks, Aly

I think injection is your answer. Create a Cursor object and inject it into your Buffer. Unit test the cursor object so that you know it works perfectly ;). Since cursor is stateful and you've passed it into your buffer, you can assert on the cursor object when you test moveCursorForward.
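A minimal sketch of that idea, shown here in C# with NUnit (the Cursor and GapBuffer members are hypothetical, not from the original question):

using NUnit.Framework;

public class Cursor
{
    public int Position { get; private set; }
    public void MoveForward() { Position++; }
}

public class GapBuffer
{
    private readonly Cursor cursor;

    public GapBuffer(Cursor cursor) { this.cursor = cursor; }

    public void MoveCursorForward()
    {
        // the internal state change is now observable through the injected cursor
        cursor.MoveForward();
    }
}

[TestFixture]
public class GapBufferTests
{
    [Test]
    public void MoveCursorForward_AdvancesTheCursor()
    {
        var cursor = new Cursor();
        var buffer = new GapBuffer(cursor);

        buffer.MoveCursorForward();

        Assert.AreEqual(1, cursor.Position);
    }
}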

Check out Michael Feathers' Working Effectively with Legacy Code.

This is also where mock objects might be beneficial.

I have a project for a web-app.

  1. Most classes of this project must be compiled into a .jar file and put on the server in the codebase/WEB-INF/lib dir. These classes are used by the server only.

  2. But I have some classes which need to be used on both the server and the client. These classes must be put directly into codebase/[package.class].

  3. All of these classes depend on each other, in both directions.

End of story. Now I am trying to move my project from an IDE build to Gradle. My project contains two modules which depend on each other. IntelliJ IDEA gives me a warning but compiles this fine; with Gradle, it gets stuck. So, can I somehow split these classes into two logical groups and have an easy way to build the structure described above?

I tried a multi-project build and several sourceSets, but everywhere I got the same circular dependency errors.

You need to refactor your code. One way of doing this is to create a separate project for the server specific code, another project for the client project and a common project for code that needs to be shared. Next, you can let the server project and the client project depend on the common project. However, there should not be any dependencies between the server project and client project themselves.

Typically, you would like to keep the common project as small as possible. Ask yourself if a certain class is necessary in both projects? If not, it should not be part of the common package. Make sure to keep your dependencies between classes, packages and projects unidirectional. This may include quite a lot of refactoring since the code is tangled.

Recommended reading: Working Effectively with Legacy Code.

I have a solution that is missing a lot of code coverage. I need to refactor this code to decouple it so I can begin to create unit tests. What is the best strategy? My first thought is that I should push to decouple business logic from data access from business objects, to get some organization first and then drill down from there. Since many of the classes don't follow the single responsibility principle, it's hard to begin testing them.

Are there other suggestions or best practices for taking a legacy solution and getting it into shape to be ready for code coverage and unit testing?

I have inherited a project that has no interfaces or abstract classes i.e. concrete classes only and I want to introduce unit testing. The classes contain lots of functions, which contain business logic and data logic; breaking every rule of SOLID (http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29).

I had a thought. I was thinking about creating interfaces for each of the poorly designed classes, exposing all functions. Then at least I can Mock the classes.

I am relatively new to Unit Testing (I have experience with a project, which was very well developed using interfaces in the right places). Is it a good idea to do this i.e. create interfaces for all the concrete classes (exposing all the functions and sub routines), just for unit testing?

I have spent some time researching this but I have not found an answer.

This is a tough thing to tackle. I think you are on the right track. You'll end up with some ugly code (such as creating header interfaces for each monolithic class), but that should just be an intermediate step.

I'd suggest investing in a copy of Working Effectively with Legacy Code. First you could start by reading this distillation.

In addition to Karl's options (which let you mock via interception), you could also use Microsoft Fakes & Stubs. But these tools will not encourage you to refactor the code to adhere to SOLID principles.

I prefer to create interfaces and classes as you need them to test things, not all upfront.

Besides interfaces, you can use some techniques to test legacy code. The one I often use is "Extract and Override", where you extract some piece of "untestable" code into another method and make it overridable. Then derive from the class that you want to test and override the "untestable" method with some sensing code.

Using a mock framework, it will be as easy as adding the Overridable keyword (virtual in C#) to the method and setting the return value through the mock framework.

You can find many techniques in the book "Working Effectively with Legacy Code".

One thing about existing code is that sometimes it is better to write integration tests than unit tests. Then, after you have the behavior under test, you create unit tests.

Another tip is to start with the modules/classes that have fewer dependencies; that way, you become familiar with the code with less pain.

Let me know if you need an example about "extract and override" ;)

I have read in so many places that if your code is not testable, it means the code is not well written. That has made me start writing code that is testable and start using a unit testing framework.

With this in mind, I started looking for an example with a piece of code that is not testable and is gradually converted into testable code. I find tons of examples on unit testing, but if someone could provide an example like the above, it would probably jump-start things for me.

TIA

Here are two great books that will help you get started:

  1. The Art of Unit Testing
  2. Working Effectively with Legacy Code

Good luck.
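As a rough before/after of the kind the question asks for (the names and the clock example are hypothetical, not taken from either book): the first class is hard to test because it depends directly on the system clock; the second turns that dependency into an injectable seam.

using System;

// hard to test: the result depends on whatever time the test happens to run at
public class GreetingService
{
    public string Greet()
    {
        return DateTime.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}

// testable: the clock is a seam the test can control
public interface ISystemClock
{
    DateTime Now { get; }
}

public class TestableGreetingService
{
    private readonly ISystemClock clock;

    public TestableGreetingService(ISystemClock clock) { this.clock = clock; }

    public string Greet()
    {
        return clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}

A test can now pass in a fake ISystemClock that returns a fixed time and assert on the exact greeting.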

I'm in a project where we don't do TDD, because our bosses and the client are very "old school" people. Because I can't do design through TDD but I am afraid of changes, I would like to write unit tests for my own safety. But what should those unit tests look like? Do I have to write a test for each method's specification, to check that it does what it is supposed to do? Do I have to write a test for each new piece of functionality, like TDD but without the design? I have a mess in my mind.

Thanks in advance.

You will find a lot of good advice for your situation in the following book: http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

In my daily job I come across this dilemma:

"Stable System vs. Better Design"

In my routine job, when I am fixing some module and I see bad design:

-> badly written code

-> badly written algorithms

-> possible optimizations

I would prefer to fix these as well, along with the issue I am fixing.

But many people oppose my changes and a few support them. Those who oppose will say:

"You should be business-oriented: if the system is stable and you change something, it may cause a regression, and hence does not favor the business."

and sometimes:

"You will look at your own code after 6 months and will always see some opportunity for improvement in it."

While those who support the changes will say:

"This is continual improvement and the system will be more stable."

So I would like to know what you people think.

On the one hand, changing code that's already more-or-less working can run the risk of breaking things, and it can easily become an all-consuming task.

On the other hand, leaving bad code alone for fear of breaking things can stifle new development, due to the burden of maintaining bad code.

Sometimes code looks bad because it has to deal with complicated corner cases, as Joel Spolsky points out, and rewriting it will create bugs by failing to cover those corner cases. Sometimes code looks bad because it really is bad, and rewriting it can fix bugs that you didn't even know were there. Experience with your code base should help you determine which code is which.

In Boy Scout Check-Ins, Jeff Moser discusses the idea of "always leaving the campground cleaner than you found it." Always leave the codebase cleaner than you found it, even if you can't fix everything; those little improvements add up over time.

As was said in this answer, unit tests are a good thing. Working Effectively with Legacy Code, by Michael Feathers, is a great resource on this topic.

It has come to my attention that you can unit-test an abstract class by instantiating it as a mock object. Thereby you mock the abstract properties and methods, while being able to test the implemented ones.

However, I am accustomed to distinguishing between the class to test and the injected dependencies to mock/stub. So, at this early stage of my new enlightenment, I wonder if there are any pitfalls to this way of testing? Any thoughts on this?

Regards, Morten

An abstract class, to be of any value, should have concrete subclasses which can be instantiated. So unit test those, and via them implicitly the base class.

If it has no concrete subclasses, I can't think of any reason why it should exist (as an abstract class, that is).

In general, I prefer to use mocking only to set up the environment of the class to be tested, not to instantiate the class itself. This distinction - to me - keeps the test cases clearer, and ensures that I always test the real functionality of the class.

Of course, I can think of cases (with legacy code) when the main issue is to be able to write unit tests somehow, anyhow, to enable refactoring (as discussed in Working Effectively with Legacy Code). As a temporary solution in such cases, (almost) anything goes. But once the unit tests are working, the class should be refactored properly asap to make it clean and testable (and its unit tests too, to make them maintainable).
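For reference, the technique the question describes looks roughly like this with Moq (a sketch; the class and member names are made up):

using Moq;
using NUnit.Framework;

public abstract class PriceCalculator
{
    public abstract decimal GetDiscount();          // abstract member to be mocked

    public decimal FinalPrice(decimal basePrice)    // implemented member under test
    {
        return basePrice - GetDiscount();
    }
}

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void FinalPrice_SubtractsTheDiscount()
    {
        // CallBase keeps the real implementations of members that are not set up
        var mock = new Mock<PriceCalculator> { CallBase = true };
        mock.Setup(c => c.GetDiscount()).Returns(5m);

        Assert.AreEqual(95m, mock.Object.FinalPrice(100m));
    }
}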

I understand the short answer to it is by doing testing but still how do you do this testing? Do you modify your test cases to include the bugs as additional test cases to run or do you just verify all the bugs in your bug tracking system starting from the oldest to the newest releases.

Thanks for the answers. Looks like my question was not clear. I understand we need to write bug report, fix the bug & do the testing for fix verification. However, under which test phase should this test go so that during the next version release, we are sure to re-run the test again to make sure none of the new changes have re-introduced the bug. Should it go under regression testing or should it go under integration testing for that specific project or should we just test all the bugs in the bug tracking system since version 1.0?

What do you mean by saying "You're protected from regressing it, because that test will suddenly fail."?

The point of "unit tests" is that they will be run automatically as part of the compile/check cycle: you'll run the tests after compiling and before checking code in. If a unit test fails, then you know that some code that you just wrote will recreate the old bug.

edit:

How can you be so sure that, just because your unit tests passed, all your bugs have been fixed?

Generally, if you can reproduce a bug, you can write a test to duplicate it in code. The most important part is that once you've written a test to check for the existence of a bug, then whenever you run the tests and that test fails, you know you have reintroduced the bug. One great book on the subject is "Working Effectively with Legacy Code".
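A hedged sketch of what such a bug-pinning test can look like (the parser and its behavior are hypothetical, purely to show the shape):

using System.Collections.Generic;
using NUnit.Framework;

public static class CsvParser
{
    // after the fix: empty input yields an empty result instead of throwing
    public static IList<string[]> Parse(string text)
    {
        if (string.IsNullOrEmpty(text)) return new List<string[]>();
        // ... real parsing elided ...
        return new List<string[]>();
    }
}

[TestFixture]
public class CsvParserRegressionTests
{
    [Test]
    public void Parse_EmptyInput_ReturnsEmptyList()
    {
        // before the fix this threw a NullReferenceException
        var result = CsvParser.Parse(string.Empty);

        Assert.IsNotNull(result);
        Assert.IsEmpty(result);
    }
}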

Is it not possible for a single bug to span multiple unit tests?

Of course it can span multiple tests.

Let's presume the bug is "can't save files."

And let's presume that there are several different scenarios that can cause this:

  • out of disk space,

  • the file you're writing on top of is opened and locked by another process,

  • there are permissions problems (like you can't access the directory, or don't have write permissions),

  • or there is a network error while you're saving the file.

Ideally, your test harness (a collection of tests) would have a separate test for each one. For some of the tests, you may need to use things called "mock objects" (they can also be used when you haven't written the code for the component yet).

I've got a contract where I have to continue the development of and old application suite that was programmed in VB5 back in the days.

I've bug to fix and new feature to develop.

So I have a few choices:

  1. Keep programming in VB5 (NOOOOOOOOOOOOOOO!!!!)
  2. Convert VB5 to C# (How??? Is it possible without going insane?)
  3. Rewrite the whole application suite (Very time consuming)

Is there any other choices? What should I do?

EDIT: Ah, and also, it relies on an Access database which I'd like to move to SQL Express, because it's a crazy database made illogically by a stupid programmer from the '90s, lol.

Thanks

One approach would be to tactically rewrite sections of the code in C# - you could start with the areas you are likely to be bugfixing the most, create C# assemblies to mirror the functionality, and then expose these to the VB5 code through COM interop.

A good suite of Unit Tests would be highly recommended for this approach!

I've heard that Working Effectively With Legacy Code by Michael Feathers is the best book around for understanding how best to chop up a problem like this.

I've got a problem and I'm asking you for help

I've started working on a web application that has no tests, is based on Spring 2.5 and Hibernate 3.2, and is not very well modularized, with classes having up to 5k lines. As the view technology, JSP is used all over the place, with quite a lot of things duplicated (like many similar search forms with very few differences but not many shared parts).

The application works and everything is running just fine, but when there is a need to add or change some functionality, it is really slow and not very convenient.

Is there any possibility of employing TDD at this point? Or what would you recommend? I don't think I can develop it forever the way it is now; it is just getting messier all the time.

Thank you for your answers.

I would start by picking up a copy of Michael Feathers' book Working Effectively with Legacy Code - this is pure gold.

Once you learn techniques for refactoring and breaking apart your application at its logical seams, you can work on integrating TDD in newer modules/sprout classes and methods, etc.

Case in point, we recently switched to a TDD approach for a ten year old application written in almost every version of our framework, and while we're still struggling with some pieces, we've made sure that all of our new work is abstracted out, and all of the new code is under test.

So absolutely doable - just a bit more challenging, and the book above can be a tremendous help in getting started.

I've recently begun coding in Java in the past few months. I have a Matrix class that's becoming much too bloated with a lot of methods. I also have a SquareMatrix class that extends Matrix, and reduces some of the bloat.

What I've found is that a lot of the methods in the Matrix class are relevant to matrices in general. It has all the basics, like addMatrix(Matrix), multiplyMatrix(Matrix), multiplyMatrix(float), then more complex methods like getGaussian(Matrix) or getLUDecomposition(Matrix).

What are my options for reducing the number of lines in my Matrix class? Is it normal for a class to become extremely large? I don't think so... Is this something I should have considered early on, so that refactoring is now difficult? Or are there simple solutions?

Thank you!


Edit: After reading a couple of responses, I'm considering doing the following:

Utility class Matrix contains all common/basic methods for a matrix

Delegation classes (Helper/sub):

Helper class: getGaussian(..), getLUFactorization(..), ...

Subclasses (extends Matrix): SquareMatrix, LowerTriMatrix, UpperTriMatrix, ...

Delegation seems similar to defining multiple .cpp files with headers in C++. More tips are still welcome.


Edit2: Also, I'd prefer to refactor by properly designing it, not quickfixes. I hope it helps down the road for future projects, not just this one.

Interfaces should be both complete and minimal, meaning that a type interface should contain all methods necessary to achieve the needed and meaningful tasks on that type, and only these.

So analyse the API methods of Matrix and determine which of them belong to the core API (which must have access to the internals of the class to accomplish their task), and which are providing "extended" functionality. Then reduce the class API to the core functionality, migrating the rest of the methods to separate helper / utility / subclasses which can accomplish their goals using the public core API.
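
As a rough illustration of that split (sketched in C# with made-up member names, but the same shape applies to the Java class above): the core class keeps only what needs its internal representation, and the extended operations live in a helper that uses the public API alone.

public class Matrix
{
    private readonly double[,] values;

    public Matrix(int rows, int cols) { values = new double[rows, cols]; }

    public int Rows { get { return values.GetLength(0); } }
    public int Columns { get { return values.GetLength(1); } }

    public double Get(int row, int col) { return values[row, col]; }
    public void Set(int row, int col, double value) { values[row, col] = value; }

    // Core arithmetic (add, multiply, ...) could stay here; everything else moves out.
}

// "Extended" functionality built purely on the public core API.
public static class MatrixOperations
{
    public static Matrix Transpose(Matrix m)
    {
        var result = new Matrix(m.Columns, m.Rows);
        for (int r = 0; r < m.Rows; r++)
            for (int c = 0; c < m.Columns; c++)
                result.Set(c, r, m.Get(r, c));
        return result;
    }
}

Helpers like getGaussian or getLUDecomposition would follow the same pattern as Transpose here: they need nothing from the matrix beyond its public surface.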

For refactoring / unit testing help, consider getting Working Effectively with Legacy Code by Michael Feathers.

Let me set the scene.

The Scene

We're a web development company, and so we work within a highly distributed environment. DB Servers, Services, Web Services, Front-end, Back-end, etc etc. I'm not sure if that's relevant, but there it is.

Essentially though, at its most basic, we develop custom web solutions, and maintain one large CMS application, which we leverage for 80% of our projects.

The Problem

The CMS application works, kinda, but it has to be customized and changed almost all the time for each project. Every time I go in it, or fix the same problem over and over again, I want to cry.

It just could be so much better, and in its current form, it's very inflexible and rigid. For example:

  • There's repeated code all over the place
  • Global business logic is spread out all over different layers, multiple times, making it a dog to change, anytime anything has to be modified
  • There are loads of god classes, and the structure doesn't make any sense
  • etc etc.

Not only is this ugly, and annoying, but it makes it prone to bugs, and hard to extend. For example, it would be great if the CMS worked at an API level, but because there's business code all over the UI, it cannot.

The Question

I would love to discuss with my peers solutions for these problems, and fix/rewrite a lot of the issues in the application in a collaborative effort, because I believe it would have a lot of long-term benefits.

For example, it would make each individual project take less time, because there would be fewer bugs to fix. It would make customizing easier, and it would make extending and building a better application simpler.

However, I realize this would be a relatively massive undertaking, and thus would cost a lot of money.

The alternative, which my seniors dictate, is to keep hacking the application each time with quick, non-reusable "fixes", which I hate, but it does make it "work", doesn't take THAT long, and so perhaps makes more economic sense?

Who's right? Am I just an idealist fool?

The question is, should I approach my seniors and make a case for it, or not?

UPDATE:

Hey guys, I know my question gets a bit specific towards the end, but I'm really enjoying all your general feedback, and the experiences you've had, and how you dealt with it.

Thanks again, keep 'em coming!

First learn to write code which does not rot when the project is maintained, but instead gets better as time goes on (TDD/BDD, Clean Code, Boy Scout Rule, DRY, YAGNI, SOLID etc.). Learning is easiest done in small greenfield projects.

Then when you have the skills, apply them to the legacy projects and improve them gradually (Working Effectively with Legacy Code, short version also in PDF). Gradual improvement is easier to sell to management than a big-bang rewrite (not that a rewrite would even help in the long run, unless you know how to write code which does not rot, in which case you would anyway be able to improve the legacy code without a rewrite).

My question is the same as the title. I have not done test-driven development yet. I have performed unit testing and I am aware of testing in general, but I hear it is much more advantageous to follow TDD. So should I opt for it every time, or are there some preconditions? I just need to know some basic or important requirements for when I should opt for TDD. Sorry if this question is too trivial.

IMHO if you are not comfortable with TDD, trying to apply it in projects where you will need to interact with/use legacy code is much more complex than applying it in a project from scratch. In fact, there is a whole book about this.

HTH

People at my company see unit testing as a lot of extra work that offers fewer benefits than existing functional tests. Are unit and integration tests worth it? Note that this is a large existing codebase that wasn't designed with testing in mind.

Retroactively writing unit tests for legacy code can very often NOT be worth it. Stick with functional tests, and automate them.

Then what we've done is have the guideline that any bug fixes (or new features) must be accompanied by unit tests at least testing the fix. That way you get the project at least going in the right direction.

And I have to agree with Jon Skeet (how could I not?) in recommending "Working Effectively With Legacy Code", it really was a helpful skim/read.

(I'm assuming that you're using "functional test" to mean a test involving the whole system or application being up and running.)

I would unit test new functionality as I wrote it, for three reasons:

  • It helps me get to working code quicker. The turnaround time for "unit test failed, fix code, unit test passed" is generally a lot shorter than "functional test failed, fix code, functional test passed".
  • It helps me to design my code in a cleaner way
  • It helps me understand my code and what it's meant to be doing when I come to maintain it. If I make a change, it will give me more confidence that I haven't broken anything.

(This includes bug fixes, as suggested by Epaga.)

I would strongly recommend Michael Feathers' "Working Effectively with Legacy Code" to give you tips on how to start unit testing a codebase which wasn't designed for it.

Scenario: you have tightly coupled legacy code that you cannot unit test, and you have to fix bugs. What do you do to not get very depressed? I read Working Effectively... and Refactoring by Fowler; however, in both books unit testing is indispensable. Refactoring is out of the question, and a code rewrite is on the table, but it will take a long time. Suggestions?

I'm working with some code these days that sounds like it's in a similar situation. We operate under the assumption that the existing code is correct, however confusing or just plain bad it might look. There must be something you can do to perform some kind of regression test. We have some carefully prepared inputs that we run through both the original system and our modified system to ensure that there are no unexplained changes in behaviour. I also recommend the book Working Effectively with Legacy Code.

An alternative to unit tests when addressing legacy code is introduced in the book Working Effectively with Legacy Code:

Characterization tests are tests that pin down (characterize) the current behavior of the code and sense when changes invalidate it. Quoting Wikipedia:

The goal of characterization tests is to help developers verify that the modifications made to a reference version of a software system did not modify its behaviour in unwanted or undesirable ways. They enable, and provide a safety net for, extending and refactoring code that does not have adequate unit tests.
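
In practice a characterization test looks like any other automated test, except that the expected value is simply whatever the current code produces: run the code once, record the output, and pin it down. A small sketch (LegacyPricing is an invented stand-in for some existing class; NUnit assumed):

using NUnit.Framework;

// Stand-in for an existing legacy class with convoluted, untested logic.
public class LegacyPricing
{
    public decimal Quote(string customerType, int itemCount)
    {
        return customerType == "standard" ? 12.5m + 10m * itemCount : 99m;
    }
}

[TestFixture]
public class LegacyPricingCharacterizationTests
{
    [Test]
    public void Quote_ForStandardCustomerAndThreeItems_MatchesCurrentBehaviour()
    {
        var pricing = new LegacyPricing();

        decimal quote = pricing.Quote("standard", 3);

        // The expected value was observed by running the existing code,
        // not designed; it pins down current behaviour, right or wrong.
        Assert.AreEqual(42.50m, quote);
    }
}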

We have a project that is starting to get large, and we need to start applying Unit Tests as we start to refactor. What is the best way to apply unit tests to a project that already exists? I'm (somewhat) used to doing it from the ground up, where I write the tests in conjunction with the first lines of code onward. I'm not sure how to start when the functionality is already in place. Should I just start writing test for each method from the repository up? Or should I start from the Controllers down?

update: to clarify the size of the project... I'm not really sure how to describe it other than to say there are 8 controllers and about 167 files with a .cs extension, all done over about 7 developer-months.

As you seem to be aware, retrofitting testing into an existing project is not easy. Your method of writing tests as you go is the better way. Your problem is one of both process and technology: tests must be required by everyone or no one will use them.

The recommendation I've heard and agree with is that you should not attempt to wrap tests around an existing codebase all at once. You'll never finish. Start by working testing into your bugfix process: every fixed bug gets a test. This will start to work testing into your existing code over time. New code must always have tests, of course. Eventually, you'll get the coverage up to a reasonable percentage, but it will take time.

One good book I've had recommended to me is Working Effectively With Legacy Code by Michael C. Feathers. The title doesn't really demonstrate it, but working testing into an existing codebase is a major subject of the book.

Forgive me if this is a dupe but I couldn't find anything that hit this exact question.

I'm working with a legacy application that is very tightly coupled. We are pulling out some of the major functionality because we will be getting that functionality from an external service.

What is the best way to begin removing the now unused code? Should I just start at the extreme base, remove, and refactor my way up the stack? During lunch I'm going to go take a look at Working Effectively with Legacy Code.

If you can, and it makes sense in your problem domain, I would try, during development, to keep the legacy code functioning in parallel with the new API, and use the results from the legacy API to cross-check that the new API is working as expected.
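
A rough sketch of that cross-check idea (all names invented; the two lookup classes stand in for the legacy path and the new service-backed path): feed both implementations the same prepared inputs and assert that their results agree.

using NUnit.Framework;

// Stand-ins: in reality these would be the existing legacy implementation
// and the new implementation backed by the external service.
public class LegacyOrderLookup { public string GetStatus(int orderId) { return "OPEN"; } }
public class ServiceBackedOrderLookup { public string GetStatus(int orderId) { return "OPEN"; } }

[TestFixture]
public class ParityTests
{
    [TestCase(1001)]
    [TestCase(1002)]
    [TestCase(1003)]
    public void NewApi_ReturnsSameResultAsLegacyCode(int orderId)
    {
        var legacy = new LegacyOrderLookup();
        var replacement = new ServiceBackedOrderLookup();

        Assert.AreEqual(legacy.GetStatus(orderId), replacement.GetStatus(orderId));
    }
}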

Hi, in my project we have hundreds of test cases. These test cases are part of the build process, which gets triggered on every checkin and sends mail to our developer group. This project is fairly big and has been around for more than five years.
Now we have so many test cases that the build takes more than an hour. Some of the test cases are not structured properly, and after refactoring them I was able to reduce the running time substantially, but we have hundreds of test cases and refactoring them one by one seems a bit too much.
For now I run some of the test cases (which take really long to execute) only as part of the nightly build and not as part of every checkin.
I am curious how others manage this.

I believe it was in "Working Effectively with Legacy Code" that Michael Feathers said that if your test suite takes longer than a couple of minutes, it will slow developers down too much and the tests will start getting neglected. It sounds like you are falling into that trap.

Are your test cases running against a database? Then that's most likely your biggest source of performance problems. As a general rule, test cases shouldn't be doing I/O if possible. Dependency injection allows you to replace a database object with mock objects that simulate the database portion of your code. That allows you to test the code without worrying whether the database is set up correctly.
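
For instance (the interface and class names here are invented), a thin repository seam lets unit tests substitute an in-memory fake for the real database, so they run without any I/O, while a handful of slower integration tests can still exercise the real implementation:

using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public string Region { get; set; }
}

// Thin seam around data access; the production implementation talks to the database.
public interface ICustomerRepository
{
    IEnumerable<Customer> FindByRegion(string region);
}

// Code under test depends on the interface, not on the database.
public class CustomerReport
{
    private readonly ICustomerRepository repository;
    public CustomerReport(ICustomerRepository repository) { this.repository = repository; }

    public int CountInRegion(string region)
    {
        return repository.FindByRegion(region).Count();
    }
}

// In-memory fake keeps unit tests free of I/O, so they run in milliseconds.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> customers;
    public InMemoryCustomerRepository(params Customer[] customers)
    {
        this.customers = customers.ToList();
    }

    public IEnumerable<Customer> FindByRegion(string region)
    {
        return customers.Where(c => c.Region == region);
    }
}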

I highly recommend Working Effectively with Legacy Code by Michael Feathers. He discusses how to handle a lot of the headaches that you seem to be running into without having to refactor the code all at once.

UPDATE:

Another possible help would be something like NDbUnit. I haven't used it extensively yet, but it looks promising: http://code.google.com/p/ndbunit/

I am completely new to writing unit test cases. I am using MVVM Light with WPF. Is it necessary to use some third-party test framework, or is the .NET unit test framework enough? Also, how do I handle a static class in a unit test case? In this case, the AppMessages class.

Can some one please guide me how to write unit cases for following piece of code:

public MyViewModel(Participant participant)
{    
    if (participant != null)
    {
        this.ParentModel = parentModel;
        OkCommand = new RelayCommand(() => OkCommandExecute());
        CalculateAmountCommand = new RelayCommand(() => CalculateAmount());        
    }
    else
    {
        ExceptionLogger.Instance.LogException(Constants.ErrorMessages.FinancialLineCanNotBeNull, "FinancialLineViewModel");
        AppMessages.DisplayDialogMessage.Send(Constants.ErrorMessages.FinancialLineCanNotBeNull, MessageBoxButton.OK, Constants.DefaultCaption, null);
    }
}

public static class AppMessages
{
    enum AppMessageTypes
    {
        FinancialLineViewDisplay,
        FinancialLineViewClose,
        DisplayDialogMessage
    }

    public static class DisplayDialogMessage
    {
        public static void Send(string message, MessageBoxButton button, string caption, System.Action<MessageBoxResult> action)
        {
            DialogMessage dialogMessage = new DialogMessage(message, action)
            {
                Button = button,
                Caption = caption
            };

            Messenger.Default.Send(dialogMessage, AppMessageTypes.DisplayDialogMessage);
        }

        public static void Register(object recipient, System.Action<DialogMessage> action)
        {
            Messenger.Default.Register<DialogMessage>(recipient, AppMessageTypes.DisplayDialogMessage, action);
        }
    }
}

public class ExceptionLogger
{
    private static ExceptionLogger _logger;
    private static object _syncRoot = new object();

    public static ExceptionLogger Instance
    {
        get
        {
            if (_logger == null)
            {
                lock (_syncRoot)
                {
                    if (_logger == null)
                    {
                        _logger = new ExceptionLogger();
                    }
                }
            }

            return _logger;
        }
    }

    public void LogException(Exception exception, string additionalDetails)
    {
        LogException(exception.Message, additionalDetails);
    }

    public void LogException(string exceptionMessage, string additionalDetails)
    {
        MessageBox.Show(exceptionMessage);
    }
}

About testability

Due to the use of singletons and static classes, MyViewModel isn't testable. Unit testing is about isolation. If you want to unit test some class (for example, MyViewModel) you need to be able to substitute its dependencies with test doubles (usually stubs or mocks). This ability comes only when you provide seams in your code. One of the best techniques for providing seams is Dependency Injection. The best resource for learning DI is the book by Mark Seemann (Dependency Injection in .NET).

You can't easily substitute calls to static members. But if you use a lot of static members, then your design isn't perfect.

Of course, you can use an unconstrained isolation framework such as Typemock Isolator, JustMock or Microsoft Fakes to fake static method calls, but it costs money and it doesn't push you towards better design. These frameworks are great for creating a test harness for legacy code.

About design

  1. The constructor of MyViewModel is doing too much. Constructors should be simple.
  2. If a dependency is null then the constructor must throw ArgumentNullException, not silently log the error. Throwing an exception is a clear indication that your object isn't usable.

About testing framework

You can use any unit testing framework you like. Even MSTest, but personally I don't recommend it. NUnit and xUnit.net are MUCH better.

Further reading

  1. Mark Seemann - Dependency Injection in .NET
  2. Roy Osherove - The Art of Unit Testing (2nd Edition)
  3. Michael Feathers - Working Effectively with Legacy Code
  4. Gerard Meszaros - xUnit Test Patterns

Sample (using MvvmLight, NUnit and NSubstitute)

public class ViewModel : ViewModelBase
{
    public ViewModel(IMessenger messenger)
    {
        if (messenger == null)
            throw new ArgumentNullException("messenger");

        MessengerInstance = messenger;
    }

    public void SendMessage()
    {
        MessengerInstance.Send(Messages.SomeMessage);
    }
}

public static class Messages
{
    public static readonly string SomeMessage = "SomeMessage";
}

public class ViewModelTests
{
    private static ViewModel CreateViewModel(IMessenger messenger = null)
    {
        return new ViewModel(messenger ?? Substitute.For<IMessenger>());
    }

    [Test]
    public void Constructor_WithNullMessenger_ExpectedThrowsArgumentNullException()
    {
        var exception = Assert.Throws<ArgumentNullException>(() => new ViewModel(null));
        Assert.AreEqual("messenger", exception.ParamName);
    }

    [Test]
    public void SendMessage_ExpectedSendSomeMessageThroughMessenger()
    {
        // Arrange
        var messengerMock = Substitute.For<IMessenger>();
        var viewModel = CreateViewModel(messengerMock);

        // Act
        viewModel.SendMessage();

        // Assert
        messengerMock.Received().Send(Messages.SomeMessage);
    }
}

I am currently writing a Windows Service in C# that will be responsible for hosting a WCF Service. When Service Operations are invoked, my service will execute a command line application, do some data processing on the StandardOutput and return the results. This command line application changes state of other third party services on the server.

This is a rare instance where I'm truly able to start from scratch, so I'd like to set it up correctly and in a way that can be easily unit tested. There is no legacy code, so I have a clean slate. What I'm struggling with is how to make my service unit-testable, because it is almost entirely dependent on an external application.

My service operations have a roughly 1:1 ratio with operations the command line application does. If you have not guessed, I'm building a tool to allow remote administration of a service which only provides CLI administration.

Is this just a case where unit-tests are more trouble than they are worth? To provide some context, here's a quick sample from my proof of concept app.

    private string RunAdmin(String arguments)
    {
        try
        {
            var exe = String.Empty;
            if (System.IO.File.Exists(@"C:\Program Files (x86)\App\admin.exe"))
            {
                exe = @"C:\Program Files (x86)\App\admin.exe";
            }
            else
            {
                exe= @"C:\Program Files\App\admin.exe";
            }

            var psi = new System.Diagnostics.ProcessStartInfo
            {
                FileName = exe,
                UseShellExecute = false,
                RedirectStandardInput = true,
                RedirectStandardOutput = true,
            };

            psi.Arguments = String.Format("{0} -u {1} -p {2} -y", arguments, USR, PWD);

            var adm = System.Diagnostics.Process.Start(psi);

            var output = adm.StandardOutput.ReadToEnd();

            return output;
        }
        catch (Exception ex)
        {
            var pth = 
                System.IO.Path.Combine(
                    Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
                    "FMSRunError.txt");

            System.IO.File.WriteAllText(pth, String.Format("Message:{0}\r\n\r\nStackTrace:{1}", ex.Message, ex.StackTrace));

            return String.Empty;
        }
    }

    public String Restart()
    {
        var output = this.RunAdmin("/restart");
        return output; // has cli tool return code
    }

Without adding a LOT of "test" code to my RunAdmin method, is there any way to unit-test this? Obviously I could add a lot of code to the RunAdmin method to "fake" output, but I'd like to avoid that if possible. If that is the recommended way, I can probably create a script for myself to create all possible output. Is there another way?

Just to offer you a second opinion. Based on your description and code sample, I'm having a hard time finding code that does not depend on system calls. The only logic in this code is 'call the .NET Process class'. It is literally a one-to-one web wrapper over System.Diagnostics.Process. The process output gets returned to the web client as is; there is no translation.

Two main reasons to write unit tests:

  • Improve design of your classes and interactions between classes

There is hardly anything that can be improved, because you pretty much need only one or two classes. This code simply doesn't have enough responsibility. As you describe it, everything fits into the WCF service instance. There is no point in creating a wrapper over the .NET Process class because it would simply be an artificial proxy.

  • Make you confident in your code, avoid bugs etc

The only bugs that can happen in this code are from using the .NET Process API incorrectly. Since you don't control this API, your test code would just reiterate your assumptions about the API. Let me give you an example. Suppose you or a colleague accidentally set RedirectStandardOutput = false. This would be a bug in your code because you would not be able to get the process output. Would your test catch it? I don't think so, unless you would literally write something like:

public class ProcessStartInfoCreator {
    public static ProcessStartInfo Create() {
        return new ProcessStartInfo {
            FileName = "some.exe",
            UseShellExecute = false,
            RedirectStandardInput = true,
            RedirectStandardOutput = false  // <--- BUG
        };
    }
}

[Test]
public void TestThatFindsBug() {
    var psi = ProcessStartInfoCreator.Create();
    Assert.That(psi.RedirectStandardOutput, Is.True);
}

Your test just repeats your code, your assumptions about an API you don't own. Even after you have wasted time introducing an artificial, unneeded class like ProcessStartInfoCreator, Microsoft could introduce another flag that would potentially break your code. Would this test catch it? No. Would your test catch the bug when the EXE you are running gets renamed or changes its command-line parameters? No. I highly recommend reading this article. The idea is that code that is just a thin wrapper over an API you don't control does not need unit tests; it needs integration tests. But that is a different story.

Again I might have misunderstood requirements and there is more to it. Maybe there is some translation for process output. Maybe there is something else that deserves unit testing. It also might be worth writing unit tests for this code just for educational purposes.

P.S. You also mention that you have the chance to start from scratch with this project. If you decide that this project is not a good candidate, you can always start introducing unit tests into a legacy code base. This is a very good book on that subject; despite the title, it is about unit tests.

I am interested in exploring the idea that the relationship between methods and the member variables they use can give a hint as to how the class could be broken up into smaller pieces. The idea is that a group of variables will be closely related to one responsibility and should be contained in one class according to SRP.

For a trivial example, a class such as:

public class Rectangle {
  private int width; 
  private int height;
  private int red,green,blue;
  public int area() { return width * height; }
//just an example, didn't check the api.
  public Color color () { return new Color (red, green, blue); } 
}

Should be refactored into:

public class Rectangle {
  private Dimension size;
  private Color color;
  ...
}

Because the break down would be:

Area: width, height
Color: red, green, blue

Since these variables are used in the same method they are clearly related and could be made into a class of its own. I know this example might be too trivial but bear with me and try and think bigger here. If other methods also use these variables they are most likely also related and could also be moved into the new class.

Just for fun I created a little plugin for Eclipse that tries to do this. It gives a breakdown of methods->variables and variables->methods, and also tries to group methods together according to which variables they use, either directly or indirectly.

Is this something that you do when coding, and is it actually helpful?

This is certainly a good refactoring approach. In Working Effectively With Legacy Code, Michael Feathers describes a technique for this in a section called Seeing Responsibilities:

Heuristic #4 Look for Internal Relationships

Look for relationships between instance variables and methods. Are certain instance variables used by some methods and not others?

He then goes on to describe a method where you draw a diagram called a feature sketch, which is a graph showing the methods of a class and internal method calls. From the diagram, you can easily visually identify clusters of interconnected method calls which are candidates for refactoring into a new, more cohesive class.

It would be very cool if your Eclipse plugin could quickly generate a graph as described in the book - I could see that as being a useful refactoring tool.

I am really new to unit testing, and I can't figure out the proper way to do it for my case, though I spent crazy amount of time on research.

My code base is huge (~3 years of work), very coupled unfortunately, hard to test and no unit testing has ever been done on it.

So for instance, when trying to test a collection class ProductCollection, more specifically, the bool MoveElementAtIndex(Product productToMove, int newIndex) of it, I am encountering the following problems:

  • first I have to initialize this new ProductCollection()
  • the constructor initializes another hand made class: new KeyedList<ID, Product>. I suppose this should not be called within this constructor, since I'm not testing KeyedList.
  • next, I am trying to add 3 products to this ProductCollection.
  • I am then first creating these 3 new Product().
  • But the constructor of Product class does several things
  • It computes a unique ID for the newly created product: this.ID = IDUtils.ComputeNewIDBasedOnTheMoonPhase(). I suppose I should not test this either, since it's not in my scope. How should I avoid such calls at this level of depth?
  • The same Product constructor assigns some default properties for this product: this.Properties = new ProductProperties(folderPathToDefaultProperties). This should not be called from my simple FieldCollection.MoveElementAtIndex test, right?
  • Let's say I finally have my product objects now, and I'm trying to add them to my collection.
  • But the ProductCollection.Add(MyProduct) checks if the underlying KeyedList already contains the product. This is also business logic that I should avoid, not being related to my test. The question is how?
  • Also, in this Add method, some events are raised, notifying the system about several things (e.g., that a new product was added to the collection). These should also not be fired at all I suppose.
  • And then finally, when I have my products added, I'm calling the desired SUT: the move elements method.
  • But this method also has logic that can be out of scope for my test: it verifies that the underlying KeyedList actually contains those fields, it calls KeyedList.Remove(), KeyedList.Insert() for its moving logic, and it fires events like CollectionModified.

If you could just explain me how to do this unit test properly, how to avoid underlying objects from being called, I would really appreciate it.

I am thinking of Microsoft's Moles framework (VS2010), since I have the impression that it does not require me to refactor everything, as this is absolutely not an option. But having tried it already, still can't find a proper way to use it.

Also, I have the impression that this concrete example will help many in my situation, because this is how code in real world usually is.

Any thoughts?

I would recommend using ApprovalTests. It is a great tool for starting to test legacy systems that do not have the best design.

Don't bother now with the difference between unit and integration tests, and don't bother with isolating your class completely. Your code is probably not best suited for testability, and when you start to isolate everything from one another you will end up with huge arrange sections and very fragile tests.

On the other hand, there are external resources (web services, databases, file systems etc.) that you have to isolate. Also, all non-deterministic behavior should be isolated (Random, the current time, user input and so on).
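
A typical seam for the "current time" case, as a sketch (IClock and the other names are not from any framework, just an illustration): production code gets the real clock, tests get a fixed one, and the behaviour becomes deterministic.

using System;

public interface IClock
{
    DateTime Now { get; }
}

public class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.Now; } }
}

public class FixedClock : IClock
{
    private readonly DateTime now;
    public FixedClock(DateTime now) { this.now = now; }
    public DateTime Now { get { return now; } }
}

// Example consumer: time-of-day logic becomes testable at any hour.
public class DiscountPolicy
{
    private readonly IClock clock;
    public DiscountPolicy(IClock clock) { this.clock = clock; }

    public bool HappyHourActive()
    {
        var hour = clock.Now.Hour;
        return hour >= 17 && hour < 19;
    }
}

A test can then construct new DiscountPolicy(new FixedClock(new DateTime(2013, 1, 1, 18, 0, 0))) and assert the result regardless of when the suite actually runs; Random and user input can be wrapped the same way.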

I just recommend creating a safety net of verification tests that will help you change the software in the direction of testability and will tell you if you have made a breaking change in your code.

Read Michael Feathers' Working Effectively with Legacy Code.

I have a "recipe" method which i am trying to write using TDD. It basically calls out to different methods and occasionally makes decisions based on results of these methods:

  public void HandleNewData(Data data)
  {
     if (data == null)
        return;

     var existingDataStore = dataProvider.Find(data.ID);

     UpdateDataStore(existingDataStore, data, CurrentDateTime);

     NotifyReceivedData(data);

     if (!dataValidator.Validate(data))
        return;

     //... more operations similar to above
  }

My knee-jerk reaction would be to start writing test cases where I verify that HandleNewData calls the methods seen above, passing the expected arguments, and that it returns in those cases where a method call fails. But this feels like a huge investment of time to code up such a test, with little to no actual benefit.

So what is the real benefit of writing such a test? Or is it really not worth the bother?

It seems like it is just an over-specification of the code itself, and will lead to maintenance problems whenever that code has to call another method or decides not to call one of the current methods anymore.

TDD does not mean writing unit tests for code that already exists (although sometimes it may be necessary when improving legacy code).

You've probably heard the term "Red, Green, Refactor". This is the approach we take when doing TDD. Here are the three laws of Test-Driven Development, which take that a little further...

  1. You may not write production code until you have written a failing unit test.
  2. You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
  3. You may not write more production code than is sufficient to pass the currently failing test.

The benefit of taking this approach is that you end up with very close to 100% unit-test coverage and you know that your code works exactly as specified.

It will reduce maintenance problems because as soon as somebody makes a change to your code and runs the tests, they will know if they have broken anything.

In this case, I would incrementally add unit tests for the methods being called from HandleNewData() before adding any for HandleNewData().
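
For reference, once the collaborators are injected, an interaction test on the guard clauses looks roughly like this. The DataHandler class and the three interfaces below are invented stand-ins that mirror the question's code (in the real class those seams would have to be introduced first); NUnit and NSubstitute are assumed:

using NSubstitute;
using NUnit.Framework;

public class Data { public int ID { get; set; } }
public class DataStore { }

// Invented seams mirroring the collaborators used by HandleNewData.
public interface IDataProvider { DataStore Find(int id); }
public interface IDataValidator { bool Validate(Data data); }
public interface INotifier { void NotifyReceivedData(Data data); }

public class DataHandler
{
    private readonly IDataProvider dataProvider;
    private readonly IDataValidator dataValidator;
    private readonly INotifier notifier;

    public DataHandler(IDataProvider dataProvider, IDataValidator dataValidator, INotifier notifier)
    {
        this.dataProvider = dataProvider;
        this.dataValidator = dataValidator;
        this.notifier = notifier;
    }

    public void HandleNewData(Data data)
    {
        if (data == null) return;
        var existing = dataProvider.Find(data.ID);
        notifier.NotifyReceivedData(data);
        if (!dataValidator.Validate(data)) return;
        // ...further operations...
    }
}

[TestFixture]
public class DataHandlerTests
{
    [Test]
    public void HandleNewData_WithNullData_TouchesNoCollaborators()
    {
        var provider = Substitute.For<IDataProvider>();
        var validator = Substitute.For<IDataValidator>();
        var notifier = Substitute.For<INotifier>();
        var handler = new DataHandler(provider, validator, notifier);

        handler.HandleNewData(null);

        provider.DidNotReceive().Find(Arg.Any<int>());
        notifier.DidNotReceive().NotifyReceivedData(Arg.Any<Data>());
    }
}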

Adding unit tests to legacy code is tough, but doable and very much worth the effort. If you haven't yet, I really recommend reading Working Effectively with Legacy Code by Michael Feathers. I've found it invaluable when adding unit tests to a 25-year-old code-base.

I have just been through the following paper and I found it extremely useful: http://www.objectmentor.com/resources/articles/Clean_Code_Args.pdf

I am looking for similar papers/books/tutorials/etc. that provide step-by-step practice on refactoring and/or correct class design. I have read Fowler's “Refactoring”, but I was looking for more substantial examples.

You probably won't find many resources about refactoring large examples step by step, because you can never cover all types of examples.

The reason Martin Fowler uses small and easy examples in "Refactoring" is that almost every large chunk of bad code is a combination of different bad smells. By learning how to recognize particular bad smells, you can fix the code gradually.

I recommend you check Working Effectively with Legacy Code. It's a book that focuses on strategies for improving large legacy codebases. For class design, you probably want to read some books about design patterns.

Most important, try to apply things you learn in the book to your code.

I have seen many articles on why test-driven development is good and how it reduces development time and so on. But after searching through a lot of forums, I have yet to see a concrete advantage of TDD. I am not saying testing is a bad thing, but my point is: what is the harm if I write my unit tests after I write my source code, rather than vice versa as TDD proposes? Both sets of test cases act as regression tests once they are complete. I have also experienced a lot of problems while trying to follow TDD in legacy code. I guess nowadays most code is legacy code, where we have to modify code without pre-existing tests. Also, is TDD limited to unit tests only, or does it extend to system-level and integration tests? I am just not able to imagine how we can do integration tests without writing the source code.

I won't say that TDD shortens development time. It could even be longer. But TDD leads to "clean code that works". The software grows at the same time as the unit tests, not one after the other, and thus is tested as soon as it is written. This gives confidence to the developer as well as a good idea of "where he stands", because he knows that what he has done so far is "done done".

Also, writing the unit tests after the fact can be hard. The author of "Working Effectively with Legacy Code" (a very good resource, BTW) even says that code written without unit tests is, indeed, legacy code.

Also is TDD limited to unit tests only or even system level and integration tests. I am just not able to imagine how we can do integration tests without writing source code.

TDD is a development technique, it's not intended to replace other kind of testing.

One can however write integration tests before the code to be tested exists. This allows asking oneself how the code that will be produced can be tested.

I have taken on a project that has been underway for months. I've yet to ever do unit tests and figured it would be a decent time to start. However, normally unit tests are written as you go and you plan for them when beginning the project. Is it reasonable for me to start now? Are there any decent resources for setting up unit tests without starting a brand new solution (the project is already underway). Using vb.net with VS2005

Thanks in advance :)

The basic issue you face is something Michael Feathers refers to in Working Effectively with Legacy Code as the Legacy Code Dilemma, a.k.a. the Refactoring Dilemma:

  • When we refactor, we should have tests.
  • To put tests in place, we often have to refactor.

As a way out of the dilemma, the book describes a set of dependency-breaking techniques. These are relatively safe ways to pry apart dependencies in order to isolate code and get it under unit test. The book's examples are in Java, C++, C, and C#. I don't know enough about vb.net to say how many of the techniques would apply.

There is also some general discussion/encouragement here on the c2 wiki.

I have the C++ code of an exe which contains a UI and some processing. My goal is to separate the UI from the processing and to convert the exe into a dll.

In order to do that, I am thinking of generating unit tests before touching any code, then making my modifications and making sure the tests don't fail.

The problem is that I am not sure if this is the best approach, and if it is, is there a way to automatically generate unit tests?

BTW, I am using VS 2012. Do you have any guidance for me?

As far as I know, there are no tools for automatically bringing existing code under unit tests - if it were that easy, there should be no new bugs at all, right? As arne says in his answer, if code was not designed to be tested in the first place, it usually has to be changed to be testable.

The best you can do in my opinion is to learn some techniques of how to introduce unit tests with relatively few changes (so that you can introduce the unit tests before you start the "real" modifications); one book on this subject I've read recently is Michael Feathers' "Working Effectively with Legacy Code" (Amazon Link: http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052). Although it has some shortcomings, it has pretty detailed descriptions of techniques how you can easily introduce unit tests.

I have three legacy applications that share a lot of source code and data. Multiple instances of each of these applications can be executed by a user at any time, e.g. a dozen mixed application executions can be active at a time. These applications currently communicate through shared memory and messaging techniques so that they can maintain common cursor positioning, etc. The applications are written primarily in C++, use Qt and total about 5 million lines of code. Only some of the existing code is threadsafe.

I want to consolidate these three executables into a single executable and use multi-threading functionality to allow multiple instance of each of the three functionality branches to execute at the same time. It has been suggested that I look into some of the features provided by Boost, e.g. shared pointers, and use OpenMP to orchestrate overall execution of the multiple threads.

Any comments on how to proceed will be appreciated, particularly references on the best way to tackle this kind of a refactoring problem.

My suggestion to you would be design the desired solution first (firstly assuming that the requirements are the same) and then build a phased migration path from the existing code base basing it on the requirements that third party functionality may introduce.

Refactoring should be one small step at a time - but knowing where you are going.

Working Effectively with Legacy Code (Robert C. Martin Series) would be a good read, I'd suggest.

Trust me (I've got the t-shirts): don't attempt to refactor unless you know how to prove functionality - automated verification tests will be your saviour.

I'm currently consulting on an existing system, and I suspect the right next step is to add unit tests because of the types of exceptions that are occurring (null pointers, null lists, invalid return data). However, an employee who has a "personal investment" in the application insists on integration tests, even though the problems being reported are not related to specific use cases failing. In this case is it better to start with unit or integration tests?

Typically, it is very difficult to retrofit an untested codebase to have unit tests. There will be a high degree of coupling and getting unit tests to run will be a bigger time sink than the returns you'll get. I recommend the following:

  • Get at least one copy of Working Effectively With Legacy Code by Michael Feathers and go through it together with people on the team. It deals with this exact issue.
  • Enforce a rigorous unit testing (preferably TDD) policy on all new code that gets written. This will ensure new code doesn't become legacy code and getting new code to be tested will drive refactoring of the old code for testability.
  • If you have the time (which you probably won't), write a few key focused integration tests over critical paths of your system. This is a good sanity check that the refactoring you're doing in step #2 isn't breaking core functionality.

I'm working on an existing Java EE project with various maven modules that are developed in Eclipse, bundled together and deployed on JBoss using Java 1.6. I have the opportunity to prepare any framework and document how unit testing should be brought to the project.

Can you offer any advice on...

  • JUnit is where I expect to start; is this still the de facto choice for the Java dev?
  • Any mocking frameworks worth setting as standard? JMock?
  • Any rules that should be set - code coverage, or making sure it's unit rather than integration tests.
  • Any tools to generate fancy looking outputs for Project Managers to fawn over?

Anything else? Thanks in advance.

If you haven't done so already, read Working Effectively with Legacy Code by Michael Feathers.

In some cases unit testing can be really difficult. Normally people say to only test your public API. But in some cases this is just not possible. If your public API depends on files or databases you can't unit test properly. So what do you do?

Because it's my first time TDD-ing, I'm trying to find "my style" of unit testing, since there seems to be no single way to do it. I found two approaches to this problem, neither of which is flawless. On the one hand, you could friend your assemblies and test the features that are internal. On the other hand, you could implement interfaces (only for the purpose of unit testing) and create fake objects within your unit tests. This approach looks quite nice at first but becomes uglier as you try to transport data using these fakes.

Is there any "good" solution to this problem? Which of those is less flawed? Or is there even a third approach?

If your public API depends on files or databases you can't unit test properly. So what do you do?

There is an abstraction level that can be used.

  • IFileSystem/ IFileStorage (for files)
  • IRepository/ IDataStorage (for databases)

Since this level is very thin, its integration tests will be easy to write and maintain. All other code will be unit-test friendly, because it is easy to mock interaction with the filesystem and the database.

On the one hand, you could try to friend your assemblies and test the features that are internal. 

People face this problem when their classes violate the single responsibility principle (SRP) and dependency injection (DI) is not used.

There is a good rule that classes should be tested via their public methods/properties only. If internal methods are used by others then it is acceptable to test them. Private or protected methods should not be made internal because of testing.

On the other hand, you could implement interfaces (only for the purpose of unit testing) and create fake objects within your unit tests.  

Yes, interfaces are easy to mock because of the limitations of mocking frameworks. If you can create an instance (a fake/stub) of a type, then your dependency should not implement an interface.

Sometimes people use interfaces for their domain entities but I do not support them.

To simplify working with fakes there are two patterns used:

  1. Object Mother
  2. Test Data Builder

When I started writing unit tests, I started with 'Object Mother'. Now I am using 'Test Data Builders'.
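
A Test Data Builder in miniature (Invoice and InvoiceBuilder are illustrative names): every field gets a sensible default, and each test overrides only the detail it actually cares about, which keeps arrange sections short and readable.

using System;

public class Invoice
{
    public string Customer { get; set; }
    public decimal Total { get; set; }
    public DateTime Issued { get; set; }
}

// Test Data Builder: defaults everywhere, fluent overrides where a test needs them.
public class InvoiceBuilder
{
    private string customer = "any customer";
    private decimal total = 10m;
    private DateTime issued = new DateTime(2013, 1, 1);

    public InvoiceBuilder WithCustomer(string value) { customer = value; return this; }
    public InvoiceBuilder WithTotal(decimal value) { total = value; return this; }
    public InvoiceBuilder IssuedOn(DateTime value) { issued = value; return this; }

    public Invoice Build()
    {
        return new Invoice { Customer = customer, Total = total, Issued = issued };
    }
}

A test then reads new InvoiceBuilder().WithTotal(0m).Build(), stating only the value that matters to that particular test.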

There are a lot of good ideas that can help you in the book Working Effectively with Legacy Code by Michael Feathers.

Is it generally accepted that you cannot test code unless the code is setup to be tested?

A hypothetical bit of code:

public void QueueOrder(SalesOrder order)
{
   if (order.Date < DateTime.Now-20)
      throw new Exception("Order is too old to be processed");
   ...  
}

Some would consider refactoring it into:

protected DateTime MinOrderAge
{
   get { return DateTime.Now-20; }
}

public void QueueOrder(SalesOrder order)
{
   if (order.Date < MinOrderAge)
      throw new Exception("Order is too old to be processed");
   ...
}

Note: You can come up with even more complicated solutions; involving an IClock interface and factory. It doesn't affect my question.

The issue with changing the above code is that the code has changed. The code has changed without the customer asking for it to be changed. And any change requires meetings and conference calls. And so I'm at the point where it's easier not to test anything.

If I'm not willing/able to make changes, does that make me unable to perform testing?

Note: The above pseudo-code might look like C#, but that's only so it's readable. The question is language agnostic.

Note: The hypothetical code snippet, problem, need for refactoring, and refactoring are hypothetical. You can insert your own hypothetical code sample if you take umbrage with mine.

Note: The above hypothetical code is hypothetical. Any relation to any code, either living or dead, is purely coincidental.

Note: The code is hypothetical, but any answers are not. The question is not subjective, as I believe there is an answer.


Update: The problem here, of course, is that I cannot guarantee that the change in the above example didn't break anything. Sure, I refactored one piece of code out to a separate method, and the code is logically identical.

But I cannot guarantee that adding a new protected method didn't offset the virtual method table of the object, and if this class is in a DLL then I've just introduced an access violation.

You can unit test your original example using a mock object framework. In this case I would mock the SalesOrder object several times, configuring a different Date value each time, and test. This avoids changing any code that ships, and allows you to validate the algorithm in question: that the order date is not too far in the past.
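
Roughly like this, shown with NSubstitute as one possible mocking framework; the OrderProcessor and SalesOrder stand-ins below mirror the hypothetical snippet, and Date is assumed to be virtual (or otherwise overridable), which is what lets the framework configure it:

using System;
using NSubstitute;
using NUnit.Framework;

// Stand-ins mirroring the question's pseudo-code; adjust to whatever your real class allows.
public class SalesOrder
{
    public virtual DateTime Date { get; set; }
}

public class OrderProcessor
{
    public void QueueOrder(SalesOrder order)
    {
        if (order.Date < DateTime.Now.AddDays(-20))
            throw new Exception("Order is too old to be processed");
        // ...
    }
}

[TestFixture]
public class QueueOrderTests
{
    [Test]
    public void QueueOrder_WithOrderOlderThanLimit_Throws()
    {
        var oldOrder = Substitute.For<SalesOrder>();
        oldOrder.Date.Returns(DateTime.Now.AddDays(-30));

        Assert.Throws<Exception>(() => new OrderProcessor().QueueOrder(oldOrder));
    }

    [Test]
    public void QueueOrder_WithRecentOrder_DoesNotThrow()
    {
        var recentOrder = Substitute.For<SalesOrder>();
        recentOrder.Date.Returns(DateTime.Now);

        Assert.DoesNotThrow(() => new OrderProcessor().QueueOrder(recentOrder));
    }
}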

For a better overall view of what's possible given the dependencies you're dealing with, and the language features you have at your disposal, I recommend Working Effectively with Legacy Code.

I've been asked to work on changing a number of classes that are core to the system we work on. The classes in question each require 5-10 different related objects, which themselves need a similar number of objects.

Data is also pulled in from several data sources, and the project uses EJB2 so when testing, I'm running without a container to pull in the dependencies I need!

I'm beginning to get overwhelmed with this task. I have tried unit testing with JUnit and Easymock, but as soon as I mock or stub one thing, I find it needs lots more. Everything seems to be quite tightly coupled such that I'm reaching about 3 or 4 levels out with my stubs in order to prevent NullPointerExceptions.

Usually with this type of task, I would simply make changes and test as I went along. But the shortest build cycle is about 10 minutes, and I like to code with very short iterations between executions (probably because I'm not very confident with my ability to write flawless code).

Anyone know a good strategy / workflow to get out of this quagmire?

As you suggest, it sounds like your main problem is that the API you are working with is too tightly coupled. If you have the ability to modify the API, it can be very helpful to hide immediate dependencies behind interfaces so that you can cut off your dependency graph at the immediate dependency.

If this is not possible, an Auto-Mocking Container may be of help. This is basically a container that automatically figures out how to return a mock with good default behavior for nested abstractions. As I work on the .NET framework, I can't recommend any for Java.

If you would like to read up on unit testing patterns and best practices, I can only recommend xUnit Test Patterns.

For strategies for decoupling tightly coupled code I recommend Working Effectively with Legacy Code.

I've been working on Windows Forms applications and ASP.NET applications for the past 10 months. I've always wondered how to perform proper unit testing on a complete application in a robust manner, covering all the scenarios. I have the following questions regarding this:

  • What are the standard mechanisms in performing unit testing and writing test cases?
  • Does the methodologies change based on the application nature such as Windows Forms, Web applications etc?
  • What is the best approach to make sure we cover all the scenarios? Any popular books on this?
  • Popular tools for performing unit testing?

I recommend a very good book on this subject: Working Effectively with Legacy Code by Michael Feathers. I found it immensely useful for similar legacy projects.

The main problem with legacy code is that there is no "standard" way to unit test it :-( "Standard" test-driven development was invented for new projects, where you start writing code - and unit tests - from scratch, so you can grow your unit test suite together with your code from day 1, and keep all (or most) of your code covered all the time.

However, reality is that most of the real life projects involve legacy code, without a single unit test (in fact, Feathers' definition of legacy code is "code without unit tests"). This book is full of useful advice on what to do when you need to touch code you barely understand, or modify a monster method of 1000 lines and make sure that at least your modification gets unit tested properly. In such cases, typically it is very difficult to write unit tests, because the code was not designed to be testable. So often you have to refactor it to make it testable, but of course without unit tests in place, this is risky... still, there are ways out of such traps, and this book shows them.

In general, you shouldn't start with the aim of covering the whole codebase (unless you have a project manager willing to accept that you are not going to produce any new feature in the next couple of months - or years ;-). You have a limited amount of time to achieve the most possible benefit from your unit tests. Thus you have to focus on the most critical parts of the code first. These are typically the ones most often modified and/or the ones where the most bugs are found (there is a correlation of course). You may also know in advance that an upcoming feature requires extending a specific part of the code, so you may prepare the way by creating unit tests for it. This way, over time you start to grow little "islands of safety" within the code, which are ever better covered with unit tests. Maintenance and refactoring is easier in these spots, and as you add more unit tests, the islands slowly grow...

Note that in the beginning these "safe islands" don't tend to show a very "systematic" pattern of occurrence. The most critical, most often modified parts in the code usually are distributed fairly randomly. Only at a much later stage, when the unit tested islands start to grow and merge, is it worth covering a specific module more systematically. E.g. if you see that in this particular module the code coverage of unit tests grew over 60%, you may decide to go through it and add tests for the remaining code parts too.

I've got a legacy system that processes extremely complex data that's changing every second. The modularity of the system is quite poor so I can't split the business logic into smaller modules to ease functional testing.

The actual test system is "close your eyes, click and pray", which is not acceptable at all. I want to be confident in the changes we commit to the code.

What are the test good practices, the bibles to read, the changes to operate, to increase confidence in such a system.

The question is not about unit testing: the system wasn't designed for that, and it takes too much time to decouple, mock and stub all the dependencies; most of all, we sadly don't have the time and budget for that. I don't want a philosophical debate about functional testing: I want facts that work in real life.

I've started working on a product that uses license files. These files need to be read (and verified) in order for the application to work. This causes some problems in the unit tests: without the proper license, the code throws exceptions.

We are using NUnit and what I need to do is either:

  • Copy the license file into the shadow copied directory before the test is run.
  • Set the working directory to the original build output folder so that file names are still valid in the temporary test folder.

I know that file access should generally be avoided in unit tests but before the refactoring can begin, we need the unit tests in place so I need this to work.

I suggest you start by reading this book:

Working Effectively with Legacy Code

It will give you lots of insight into how to break open this type of problem. There will be some level of change to make such code testable that will have to happen without tests, but keep it as small as possible and do it very, very carefully.

In your case, since injecting classes that fake reading licenses is too much of a jump, what you can do is change the class that validates the license file so that the actual validation logic is launched from a single method that tells the rest of the class that the license is fine, make that method virtual, and then test with a subclass that overrides the method to pretend that it validated the file (sketched below).
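
A bare-bones sketch of that "subclass and override method" shape, with invented names:

using System;

public class LicenseChecker
{
    public bool EnsureLicensed(string licensePath)
    {
        // The rest of the class's behaviour stays untouched...
        return ValidateLicenseFile(licensePath);
    }

    // All of the real (complex) validation is funnelled through this one
    // virtual method so that a testing subclass can bypass it.
    protected virtual bool ValidateLicenseFile(string licensePath)
    {
        // ...reads and cryptographically verifies the file in the real class...
        throw new NotImplementedException("real validation lives here");
    }
}

// Testing subclass: pretends the license checked out, so the surrounding
// behaviour can be exercised without a real license file on disk.
public class AlwaysLicensedChecker : LicenseChecker
{
    protected override bool ValidateLicenseFile(string licensePath)
    {
        return true;
    }
}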

Then, once you have some tests around this class you can dump the method and subclass in favor of a properly injected class.

(Edited to respond to the fact that the validation is complex).

Our customer has a massive legacy system that is in the process of being upgraded from .NET 1.1 to .NET 4.0. Along with that will be the porting of underlying frameworks to WCSF 2010 for web applications, and Enterprise Library 5.0 (Optional Update 1) as the base enterprise framework. Along with those frameworks come DI facilities.

WCSF effectively forces the MVP and module designs to rely on ObjectBuilder and its attributes, so the web application upgrades are following that pattern.

The web applications consume additional application services on an app tier (via WCF). Since the underlying framework has become Enterprise Library, the Unity app block has "slid in for free". But the existing coding style remains legacy, and thus has not made the fundamental switch to DI, unlike the web layer.

The typical layers at the app tier are

  • Business facade (BF) - exposes public coarse-grain business functions, controls transactions.
  • Business components (BC) - granular logic components coordinated by facade methods to do the actual work in the transaction.
  • Data access (DA) - read/write data for business components

Why I mentioned above that the style remains legacy is because the service communication between the tiers happens by passing DataSets (and they are huge) across the network. These DataSets are passed into each object's constructor. E.g. (pseudo-code):

BF.SomeExposedMethod(StrongTypeDS ds)
{
    this.BeginTransaction();
    BizComp bc = new BizComp(ds);
    bc.ShareTransaction();
    bc.DoWork();
    this.CommitTransaction();
}

BC.DoWork()
{
    DataAccess da = new DataAccess(this.transaction, this.ds);
    // perform work
    da.SaveWork(this.ds);
}

What's more, these classes' constructors are based on the style of their parent abstract classes; they expect to receive DataSets upon instantiation.

I have been reading Mark Seemann's DI in .NET book and various other public Internet sources about DI issues. What I seem to understand from all these discussions is that DI looks good for preparing a graph of clean objects ready to do work. By clean I mean empty of context data. These legacy code patterns have their constructors expecting the working data and context right from the start.

If I understand the principle of the Composition Root correctly, the constructors and methods have to be redesigned so that the context DataSets are passed in via properties. I am unclear how a DI container (Unity in this case) can figure out on its own what DataSet to inject. For sure it cannot be an empty brand-new DataSet. At this stage of development (I am not directly involved in the project but supporting from the sidelines) I will be unable to recommend a fundamental change to the WCF implementation to make it the Composition Root before instantiating a facade object.

Additionally, I have not gathered significant advice on how DI applies to instantiating objects based on runtime conditions. Based on the execution state of the data, additional objects may be instantiated, and the type may differ based on the data category. It would appear that the easiest way to introduce some level of DIP and DI practice into this app would be to employ the Unity container as a Service Locator: grab an instance only at the point of execution where it is needed.

UPDATE TO ANSWER

Based on Mark Seemann's advice, I have put together the following POC:

Each BF hosts a Unity container. While I have successfully experimented with the Unity.Wcf package, I have also built the next closest thing: each BF acts as a Composition Root, where each facade method calls the container to resolve the graph of objects needed to carry out the work of the method. (This is more realistically a pattern this environment might pick up given the situation.)

static ExampleBF()
{
    container = new UnityContainer().LoadConfiguration();
}

BF gets a resolved IBCFactory, which will instantiate a concrete BC with live context data passed in.

DataSet IExampleService.GetProducts(DataSet ds)
{
    IExampleBC bc = container.Resolve<IBCFactory>().CreateBC(ds);
    // GetProducts() takes no parameter because DataSet already passed in on construction.
    return bc.GetProducts();
}

The BCFactory takes in an IDACFactory via constructor injection, which it hands to each instantiated BC.

public BCFactory(IDACFactory DACFactory)
{
    this.DACFactory = DACFactory;
}

IExampleBC IBCFactory.CreateBC(DataSet ds)
{
    return new ExampleBC(this.DACFactory, ds);
}

The BC would rely on the IDACFactory to supply it with a DAC.

DataSet IExampleBC.GetProducts()
{
    IExampleDAC dac = this.dacFactory.CreateDAC(this.ds);
    return dac.GetProducts();
}

The DACFactory similarly instantiates a DAC based on the context data.

IExampleDAC IDACFactory.CreateDAC(DataSet ds)
{
    return new OrderDAC(ds);
}

Of course, the real code base is far bigger in magnitude and complexity, but this simple example should hopefully suffice to demonstrate the DI concept to them.

As far as I understand the description of the classes in question, they sound more like data carriers (Entities, if you will) than Services. If this is true, it's not the responsibility of the Composition Root to compose data objects.

However, if you must resolve services based on run-time conditions, an Abstract Factory is the universal solution.
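
As a hedged sketch only (the data-category check and the ProductDAC type below are hypothetical, not taken from the question), an Abstract Factory lets run-time data decide which concrete service is composed:

using System.Data;

public interface IExampleDAC
{
    DataSet GetProducts();
}

public interface IDACFactory
{
    IExampleDAC CreateDAC(DataSet ds);
}

public class DACFactory : IDACFactory
{
    public IExampleDAC CreateDAC(DataSet ds)
    {
        // The run-time condition (here, which table the DataSet carries)
        // decides which concrete implementation is returned.
        if (ds.Tables.Contains("Order"))
            return new OrderDAC(ds);

        return new ProductDAC(ds);
    }
}

public class OrderDAC : IExampleDAC
{
    private readonly DataSet _ds;
    public OrderDAC(DataSet ds) { _ds = ds; }
    public DataSet GetProducts() { /* query using _ds */ return _ds; }
}

public class ProductDAC : IExampleDAC
{
    private readonly DataSet _ds;
    public ProductDAC(DataSet ds) { _ds = ds; }
    public DataSet GetProducts() { /* query using _ds */ return _ds; }
}

The container only ever resolves the factory; the DataSet itself never has to be registered or injected.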

Since you are working with a legacy code base, I'd like to recommend the book Working Effectively with Legacy Code, which provides much valuable guidance in how to decouple tightly coupled legacy code bases.

Our team is planning to refactor some of the modules in a legacy code base. It's a web application written in Java. It has no unit tests at all.

I asked the developers to write JUnit tests for the existing functionality before refactoring, but I am sure those will not be very extensive.

What other measures (black-box / white-box / process) can I take to make sure the refactoring doesn't disturb any existing functionality?

The current system is pretty stable and has been running for more than 8 years.

Thanks Gray

Read Michael Feathers' Working Effectively with Legacy Code before you begin.

It is very likely that the code in its current state can't be effectively unit-tested (because it's probably not in units). What I've seen work well is integration-level tests that simply run with some reasonable inputs and record the outputs; web applications make this particularly appropriate. Write those, then sprout little methods and classes - unit testing everything new - while keeping those high-level tests working. It's more work than doing TDD proper from the beginning, but it's definitely doable.

The development team I'm a part of wrote and still maintains a codebase of pure Java spaghetti code. Most of this was implemented before I joined this team.

My background is in Python/Django development and the Django community really emphasizes "pluggability" -- a feature of Django "apps" (modules) which implies mostly "one thing and one thing well", re-useability, loose coupling, and a clean, conscious API. Once the Django team I once worked in started to "get it", we pretty much had zero problems with messy, tightly-coupled masses of monolithic code.

Our practice involved developing Django apps outside of the Django project in which we intended to use them (a common practice among Django developers, from what I've gathered). Each even lived in a Git repository separate from that of the overall Django project.

Now, we're presented with an opportunity to refactor much of our spaghetti code, and it seems to me that some of what I learned from the Django community should be applied here. In short, I'd like to see our new codebase developed as a series of "pluggable" modules each written under the assumption that it won't have access to other modules (except those on which it should rationally depend). I believe this should do a good job of driving home principles of proper software design to the whole team.

So, what I'm advocating for is to have one separate repository per "feature" we want in our new (Spring) project. Each would have its own, independent build process and the result would be a .jar. We'd also have a repository for project-level things (JSPs, static files, etc.) and its build process would produce a .war. The .jars wouldn't be placed inside the .war, but rather be treated as Gradle dependencies (the same way third-party dependencies would be).

Now I have to sell it to the boss. He's asked for some example of precedent for this plan. Obvious places to look would be open-source projects, but if a project is split across multiple repositories, it's likely to be multiple projects. So, perhaps I'm looking for some sort of suite. Spring itself looks promising as an example, but I haven't been able to find many others.

My questions are (and sorry for the long back-story):

  • Is there any such precedent?
  • What examples are there?
  • Is there any documentation (even just a blog post would be of help) out there advocating anything like this?
  • Any suggestions for implementing this?
  • Is this even a good idea?

Thanks in advance!

Edit: Whether or not to refactor is not in question. We have already decided to make some drastic changes to most of our code -- not primarily for the purpose of "making it cleaner" in fact. My question is about whether our planned project structure is sound and how to justify it to the decision-makers.

I think it's possible to find examples of good and bad implementation in both Java and Python.

Spring is indeed a good place to start. It emphasizes good layering, designing to interfaces, etc. I'd recommend it highly.

Spring's layering is "horizontal". If "module" means "business functionality" to you, then multiple modules will be present in a given layer.

Pluggable suggests a more vertical approach.

Maybe another way to attack it would be to decompose the system into a set of independent REST web services. Decouple the functionality from the interface completely. That way the services can be shared by web, mobile, or any other client that comes along.

This will require strict isolation of services and ownership of data sources. A single web service will own/manage a swath of data.

I'd recommend looking at Michael Feathers' book "Working Effectively With Legacy Code". It's ten years old, but still highly regarded on Amazon.

One thing I like about the REST web services approach is you can isolate features and do them as time and budget permit. Create and test a service, let clients exercise it, move on to the rest. If things aren't too coupled you can march through the app that way.

This is something I know I should embrace in my coding projects but, due to lack of knowledge and my legendary laziness, I do not do it.

In fact, when I do write them, I feel overloaded and I finally give up.

What I am looking for is a good book/tutorial on how to really write good tests - i.e. tests that are useful and cover a large spectrum of the code.

Regards

An excellent book that covers unit tests for legacy code in particular is Michael Feathers' "Working Effectively with Legacy Code"

Working Effectively with Legacy Code

If you are looking for a good book going over how to unit test, I would recommend Kent Beck's Test Driven Development: By Example. He is the person who really started the idea of unit testing, so regardless of language this would be a great book to read to get a good foundation.

Also, don't let the title discourage you. It does talk about TDD, but it's really just a good, easy overview of how to write effective unit tests and how they should affect your design, which is a key component of writing unit tests.

What are the best approaches to unit test methods that include I/O operations in iOS?

For example, I'm considering using Objective-C categories to redefine the behavior of I/O methods such as imageNamed. However, this prevents black-box unit testing because it's necessary to know which I/O methods are used in the implementation.

Either pass in the results of the I/O operation so that test data can be supplied in place of the actual I/O data or use OCMock. I have used OCMock for exactly this purpose. If you are considering OCMock, do read at least the header file for all the methods available.

If you are working with legacy code, consider reading/studying the book Working Effectively with Legacy Code by Michael Feathers

When I walk to work (in London) I see loads of redevelopment on commercial buildings. One of the first things I notice is that they knock down the old building and build from scratch. They end up building some of the most advanced and stunning buildings.

Bearing this in mind, when I look at old legacy code in existing applications, I think: shouldn't we build from scratch?

I have seen and worked on applications that have been around for years, and all you end up doing is propping them up.

I have never been involved in a project where the old application has been scrapped and we start from the ground up. Maybe the business is worried about the impact; I wonder why people in the building and construction world don't have the same sentiment.

So do you rebuild or enhance enterprise applications ?

Cheeky favour - can someone reopen this as I think there is more on the subject.

There is always a tradeoff here to be made by the business. From a technical point of view, we see new tools, frameworks etc. and think to ourselves: "hey, if we use FrameBling 2.0, we would be much faster, so let's just rewrite this crap, and do it right this time". Over and over again, it has been proven that complete rewrites in most situations are just a techie dream come true. You spend the first 60, 70% of design and development rebuilding already existing features in your new shiny OneTrueBusinessApp, not generating a penny of business value during this time. It can only work if the current situation is generating no more revenue or other income streams whatsoever. Nothing wrong with starting from scratch there, since there is nothing to lose.

But this is not the common scenario: more likely there is an enterprise app, and developers are complaining all the time about the crap code that former devs left behind. What then? The best advice I can give is to head over to "Working Effectively with Legacy Code" by Michael Feathers, and get yourselves out of this mess.

Prepare for a tough ride though, as refactoring existing apps can be a real slog. Good luck!

Currently, I'm working on an application with a long launch time - about 1.5 minutes to start two of its main modules. How should I approach testing new functionality in such an application, given that I need those modules to initialize properly (caches, connection pool etc.)? It seems like a waste of time to test every single change by waiting that long.

Should I try to make my functionality less dependent on the whole system design? I'm sure that's not always possible. Lots of TDD examples on the Internet concentrate on small three-class examples.

What is your experience? How to deal with it?

Yes, you should try to break dependencies so that functionality can be tested in very small units. This is the essence of TDD and it is hard to do it successfully if you don't.

Here's an interesting little commentary on TDD:

http://www.industriallogic.com/blog/history-microtests/

If you have legacy code with lots of dependencies, Michael Feathers writes about how to deal with that:

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

How do you restrict users from calling methods that are only intended for unit tests? I was thinking of throwing an exception if the call wasn't made in the context of a unit test, but I don't know if test runs define symbols like the "DEBUG" symbol for projects built under debug.

One instance where I could use this is with DTOs and their identities. To properly simulate a persisted DTO I would need to assign an Id to the entity I will be using. I could do this by adding a method or a constructor for that specific purpose, but I want it to be clear that assigning an Id to an entity should only be available to unit tests and should never be done in the actual code itself.

In general, it is not recommended to have methods only meant to be called from unit tests. Ideally, the code under test should not behave differently in unit tests than in the real world. That said, sometimes such tricks are inevitable, e.g. when trying to write tests for legacy code which has never been designed to be unit testable.

Even in these cases, a "test only" method should be a temporary shortcut rather than a long term solution. So I wouldn't worry so much about detecting whether the method is called from outside unit tests. Rather, once I have the code well covered with tests, I would strive to refactor it to get rid of "test only" methods and other temporary scaffolding, working towards a cleaner and safer design.

If you tell more about what your method is actually meant for, we may be able to give more concrete advice.

The recommended reading on this topic is Working Effectively with Legacy Code.

Update

Thanks for your explanation. We also have such checks in a couple of places within our legacy project, so I know sometimes it's inevitable. Luckily, we didn't need to write unit tests for those places as yet (well, I would be most happy if those were the only places in our code without unit tests :-).

I thought up an alternative solution: extract the entity ID check into a separate (virtual) method (possibly in a separate interface) and override/mock that in unit tests. This way you can eliminate the need for test-only methods with only a few lines of extra code. Hope this helps :-)
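
A rough sketch of that alternative, with entirely hypothetical names (OrderDto, OrderProcessor, HasIdentity) standing in for the real types:

using System;

public class OrderDto
{
    public int Id { get; set; }
}

public class OrderProcessor
{
    public void Process(OrderDto order)
    {
        if (!HasIdentity(order))
            throw new InvalidOperationException("Order has not been persisted yet.");
        // ... real work ...
    }

    // The identity check is isolated behind a virtual method...
    protected virtual bool HasIdentity(OrderDto order)
    {
        return order.Id != 0;
    }
}

// ...so a test can override it instead of needing a test-only Id setter.
public class TestableOrderProcessor : OrderProcessor
{
    protected override bool HasIdentity(OrderDto order)
    {
        return true;
    }
}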

I am developing on a system that is a bit difficult to work with. It is 99% undocumented, does not follow best practices and is fairly difficult to understand (globals galore, methods spanning 50 lines, eval abuse, etc.). Unfortunately, I am fairly new to the code base and I need to add functionality.

I am sure there is code in there that I could reuse, but I have to meet a deadline and am afraid that the time spent salvaging will end up with me rushing at the end. What is better in the long run? Part of me wants to reuse as much as possible, but another part says I should focus on writing the new functionality from scratch, at the risk of duplication (with a plan to refactor when I have more time to spend with the existing code). I'm leaning towards the latter but wanted to hear some opinions.

Thanks!

Everything you are experiencing can be overcome by reading Working Effectively With Legacy Code. I know you said you were on a tight deadline, but rushing and not fully understanding the core code base could (and probably will) have some negative side effects.

Also, you mention planning to refactor once time permits. I've said that many times, and let me tell you, that time almost never comes. Do it right the first time and do yourself a favor for the next developer or for when you add new features later on.

I am new to unit testing but am beginning to think I need it. I have an ASP.NET web forms application that is being extended in unforeseen directions meaning that certain bits of code are being adapted for multiple uses. I need to try and ensure that when changing these units I don't break their original intended use. So how do I best go about applying unit tests retrospectively to existing code? Practical advice appreciated. Thanks

Very slowly!

I'm currently doing the same thing on my project and it is a lot of effort. In order to unit test existing classes they often need to be completely redesigned...and because of high coupling changing a single class can result in changes that need to be made in several dozen other classes.

The end result is good clean code that works, can be extended, and can be verified....but it does take a lot of time and effort.

I've bought myself several books on unit testing that are helping me through the process.

You might want to consider getting yourself xUnit Test Patterns and Working Effectively with Legacy Code.

I am trying to auto-generate Unit Tests for my C code using API sanity autotest.

But, the problem is that it is somewhat complex to use, and some tutorials / howto / other resources on how to use it would be really helpful.

Have you had any luck with API sanity autotest? Do you think there's a better tool that can be used to auto-generate unit tests for C code?

It's a recipe for disaster in the first place. If you auto-generate unit tests, you're going to get a bunch of tests that don't mean a lot. If you have a library that is not covered in automated tests then, by definition, that library is legacy code. Consider following the conventional wisdom for legacy code...

For each change:

  1. Pin behavior with tests
  2. Refactor to the open-closed principle (harder to do with C but not impossible)
  3. Drive changes for new code with tests

Also consider picking up a copy of Working Effectively with Legacy Code.

EDIT:

As a result of our discussion, it has become clear that you only want to enforce some basic standards, such as how null pointer values are handled, with your generated tests. I would argue that you don't need generated tests. Instead you need a tool that inspects a library and exercises its functions dynamically, ensuring that it meets some coding standards you have defined. I'd recommend that you write this tool yourself, so that it can take advantage of your knowledge of the rules you want enforced and the libraries that are being tested.

I am trying to implement an extreme programming environment for my team's code. Whilst there are many aspects of extreme programming that I think are great (pair programming, collective code ownership, continuous integration are a few favourites), the one I'm most interested in incorporating is a test-driven environment.

While I am actively encouraging new developments to be written with unit/integration tests, there is a lot of legacy code that does not have tests written for it. At the moment, we're sort of working on the basis of if you find a bug, write a test case for the right behaviour and fix the bug. However, I'd like to be a bit more systematic about this.

There are two approaches that I'd be interested in:

  1. A means to programmatically identify all untested (public) methods throughout a project. I'm not entirely sure what this would look like in practice - perhaps as a maven plugin?
  2. A means to require all implementations of interfaces or extensions of abstract classes to have test classes that will test specific methods (from the interface or abstract class).

Also, I'd be interested to know of any other approaches people have found useful in moving to a test-driven environment.

(I appreciate that some people may say that doing this kind of migration is a waste of time and that the conversion should be undertaken over time, but we would place a high value on the confidence that comes from having well-tested code. Also, if I can implement (1), adding tests can be a little extra job that can be worked on around other things.)

You need a code coverage tool; last I remember Clover was a good one in Java. http://www.atlassian.com/software/clover/

You should also read: http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

I am working on a large codebase with basically no unit test coverage. We are about to start moving toward a more test-driven approach, so I thought I would try to write a unit test for a very simple function I added, basically

class ClassUnderTest  {
    public void SetNoMatchingImage() {
        currentState.State = FSMState.NoMatchingImage;
        ... // Some more logic
        NotifyViews();
    }

    public ViewStatus GetStatus() {
        ...
        if (currentState.State == FSMState.NoMatchingImage)
           return ViewStatus.EmptyScreen;
        ...
    }

    ...
}

OK, so to test this, I would just like to do:

[Test]
public void TestSetNoMatchingImage() {
    ClassUnderTest view = new ClassUnderTest(...);
    view.SetNoMatchingImage();  
    Assert.AreEqual(ViewStatus.EmptyScreen, view.Status); 
}

But my problem here is that the ClassUnderTest constructor takes 3 arguments of non-interface types that cannot be null, so I cannot easily create a ClassUnderTest. I can try to either create instances of these classes or stub them, but the problem is the same for them: each of the constructors takes arguments that have to be created. And the problem is the same for... and so on. The result is of course a very large overhead and a lot of code needed even for very simple tests.

Is there a good way of dealing with cases like this to make the test cases easier to write?

I'd second the recommendation for Typemock and the solutions suggested in the other answers. In addition to what's already been said, Michael Feathers has written a book dealing with the patterns that you're bumping up against, called 'Working Effectively With Legacy Code' -

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052/ref=sr_1_1?ie=UTF8&qid=1313835584&sr=8-1

There's a PDF extract here - http://www.objectmentor.com/resources/articles/WorkingEffectivelyWithLegacyCode.pdf
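
One of the book's techniques that applies directly here is "Extract Interface": wrap each hard-to-build constructor argument behind a narrow interface so a cheap fake can be passed instead. A hedged sketch, with hypothetical dependency names since the real constructor arguments aren't shown:

// Hypothetical dependency: whatever the view really needs from one of its
// three constructor arguments, expressed as a narrow interface.
public interface IImageRepository
{
    bool HasMatchingImage();
}

public class ClassUnderTest
{
    private readonly IImageRepository _images;

    // Depending on the interface instead of the concrete class lets tests
    // construct the view without building the whole object graph.
    public ClassUnderTest(IImageRepository images)
    {
        _images = images;
    }

    // ... SetNoMatchingImage(), GetStatus(), etc. ...
}

// In the test project, a do-nothing fake is enough to construct the view.
public class FakeImageRepository : IImageRepository
{
    public bool HasMatchingImage()
    {
        return false;
    }
}

Applied one argument at a time this keeps each step small; the production call sites keep passing the existing concrete classes, which simply implement the new interfaces.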

I believe I understand the basic concepts of DI / IoC containers, having written a couple of applications using them and read a lot of Stack Overflow answers as well as Mark Seemann's book. There are still some cases that I have trouble with, especially when it comes to integrating a DI container into a large existing architecture where the DI principle hasn't really been used (think big ball of mud).

I know the ideal scenario is to have a single composition root / object graph per operation but in a legacy system this might not be possible without major refactoring (only the new and some select refactored old parts of the code could have dependencies injected through constructor and the rest of the system using the container as a service locator to interact with the new parts). This effectively means that a stack trace deep within an operation might include several object graphs with calls being made back and forth between new subsystems (single object graph until exiting into an old segment) and traditional subsystems (service locator call at some point to code under DI container).

With the (potentially faulty, I might be overthinking this or be completely wrong in assuming this kind of hybrid architecture is a good idea) assumptions out of the way, here's the actual problem:

Let's say we have a thread pool executing scheduled jobs of various types defined in database (or any external place). Each separate type of scheduled job is implemented as a class inheriting a common base class. When the job is started, it gets fed the information about which targets it should write its log messages to and the configuration it should use. The configuration could probably be handled by just passing the values as method parameters to whatever class needs them but if the job implementation gets larger than say 10-20 classes, it doesn't seem very handy.

Logging is the larger problem. Subsystems the job calls probably also need to write things to the log, and usually in examples this is done by just requesting an instance of ILog in the constructor. But how does that work in this case, when we don't know the details / implementation until runtime? Since:

  • Due to (non DI container controlled) legacy system segments in the call chain (-> there potentially being multiple separate object graphs), child container cannot be used to inject the custom logger for specific sub-scope
  • Manual property injection would basically require the complete call chain (including all legacy subsystems) to be updated

A simplified example to help better perceive the problem:

class JobXImplementation : JobBase {
    // through constructor injection
    ILoggerFactory _loggerFactory;
    JobXExtraLogic _jobXExtras;

    public void Run(JobConfig configurationFromDatabase)
    {
        ILog log = _loggerFactory.Create(configurationFromDatabase.targets);
        // if there were no legacy parts in the call chain, I would register log as instance to a child container and Resolve next part of the call chain and everyone requesting ILog would get the correct logging targets
        // do stuff
        _jobXExtras.DoStuff(configurationFromDatabase, log);
    }
}

class JobXExtraLogic {
    public void DoStuff(JobConfig configurationFromDatabase, ILog log) {
        // call to legacy sub-system
        var old = new OldClass(log, configurationFromDatabase.SomeRandomSetting);
        old.DoOldStuff();
    }
}

class OldClass {
    public void DoOldStuff() {
        // moar stuff 
        var old = new AnotherOldClass();
        old.DoMoreOldStuff();
    }
}

class AnotherOldClass {
    public void DoMoreOldStuff() {
        // call to a new subsystem 
        var newSystemEntryPoint = DIContainerAsServiceLocator.Resolve<INewSubsystemEntryPoint>();
        newSystemEntryPoint.DoNewStuff();
    }
}

class NewSubsystemEntryPoint : INewSubsystemEntryPoint {
    public void DoNewStuff() {
        // want to log something...
    }
}

I'm sure you get the picture by this point.

Instantiating old classes through DI is a non-starter since many of them use (often multiple) constructors to inject values instead of dependencies and would have to be refactored one by one. The caller basically implicitly controls the lifetime of the object and this is assumed in the implementations (the way they handle internal object state).

What are my options? What other kinds of problems could you possibly see in a situation like this? Is trying to only use constructor injection in this kind of environment even feasible?

Great question. In general, I would say that an IoC container loses a lot of its effectiveness when only a portion of the code is DI-friendly.

Books like Working Effectively with Legacy Code and Dependency Injection in .NET both talk about ways to tease apart objects and classes to make DI viable in code bases like the one you described.

Getting the system under test would be my first priority. I'd pick a functional area to start with, one with few dependencies on other functional areas.

I don't see a problem with moving beyond constructor injection to setter injection where it makes sense, and it might offer you a stepping stone to constructor injection. Adding a property is usually less invasive than changing an object's constructor.
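
A small sketch of that stepping stone, assuming a hypothetical ReportGenerator that currently news up its own logger internally:

public interface ILog
{
    void Write(string message);
}

public class FileLog : ILog
{
    public void Write(string message) { /* append to a log file */ }
}

public class ReportGenerator
{
    // Default to the legacy behaviour so existing callers are untouched...
    private ILog _log = new FileLog();

    // ...but expose a property so a test (or, later, a container) can swap
    // in a different implementation without touching the constructor.
    public ILog Log
    {
        get { return _log; }
        set { _log = value; }
    }

    public void Generate()
    {
        _log.Write("Generating report...");
        // ... real work ...
    }
}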

I have a class which has a well-defined responsibility - to "Enrich" an object with the information it needs. This information is gathered from a variety of sources (Services). Eg:

public class Enricher
{
      private CounterpartyService counterPartyService;
      private BookingEntityService bookingEntityService;
      private ExchangeRateService exchangeRateService;
      private BrokerService brokerService;
      ... 6 more services

    public EnrichedTradeRequest enrichTrade(TradeRequest request)
    {
        EnrichedTradeRequest enrichedRequest = new EnrichedRequest(request);

        // Enrich with counterparty info
        enrichedRequest = enrichCounterParty(enrichedRequest)

        // Enrich with booking entity info
        enrichedRequest = enrichBookingEntity(enrichedRequest)

        // Enrich with exchange rate info
        ...

        // Enrich with broker info
        ...

        // ....etc
        return enrichedRequest;
    }

    private EnrichedTradeRequest enrichCounterparty(EnrichedRequest enrichedRequest)
    {
        // Get info from CounterpartyService
        // ...

        return enrichedRequest;
    }

The logic for "how" to enrich a request is contained here. The class could be extended for different types of trade for example.

We enrich the trade in one step because we don't want any partially-enriched objects floating around, which wouldn't make a lot of sense.

The class is really difficult to unit test, because it has so many collaborators (up to 12 other services it calls upon). I would need to mock up 12 services, each with 3 or 4 different methods.

How do I reduce the number of collaborators in this case, and make this code testable?

I think the best way to write testable code is by practicing TDD. It helps you write testable code because you first need to write your test before you can write any production code. I recommend you read Uncle Bob's three laws of TDD, but below I will give you a summary of the first part:

Over the years I have come to describe Test Driven Development in terms of three simple rules. They are: You are not allowed to write any production code unless it is to make a failing unit test pass. You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures. You are not allowed to write any more production code than is sufficient to pass the one failing unit test.

You must begin by writing a unit test for the functionality that you intend to write. But by rule 2, you can't write very much of that unit test. As soon as the unit test code fails to compile, or fails an assertion, you must stop and write production code. But by rule 3 you can only write the production code that makes the test compile or pass, and no more.

If you think about this you will realize that you simply cannot write very much code at all without compiling and executing something. Indeed, this is really the point. In everything we do, whether writing tests, writing production code, or refactoring, we keep the system executing at all times. The time between running tests is on the order of seconds, or minutes. Even 10 minutes is too long.

This Guide: Writing Testable Code is also very interesting read and gives you a lot of tips to write testable code.

UPDATE

Refactoring

When you have tests in place you just need to do the refactoring, but keep in mind that you are not allowed to write any production code before you have a failing test. I think your class might have too many responsibilities when it has up to 12 collaborators.

Extract a class where you are altering existing behavior. As you work on existing functionality (i.e. adding another conditional), extract a class, pulling along that responsibility. This will start to take chunks out of the legacy class, and you will be able to test each chunk in isolation (using Dependency Injection).

Working Effectively with Legacy Code

I would like to point out Working Effectively with Legacy Code

The Legacy Code Change Algorithm

When you have to make a change in a legacy code base, here is an algorithm you can use.

1. Identify change points.
2. Find test points.
3. Break dependencies.
4. Write tests.
5. Make changes and refactor.

Mocking frameworks

Also I would like to point out that you can eliminate much of the pain of mocking objects by using mocking frameworks, for example Mockito. I have played with this one and I like it the most.

I have a method that is called on an object to perform some business logic and add it to the database.

The object is a Transaction, and part of the business logic requires searching the databases for related accounts and history items on the account.

There are then a series of comparisons and operations that need to bring back information from the account and apply it to the transaction before the transaction is then passed on to other people and written to the database.

The only way I can currently think of for testing this is, within the test, to create an account and the relevant history information, then construct a transaction for each different scenario and capture the information written to the DB for the transaction and the information being passed on. However, this feels like it's testing way too much in one test. Each scenario would be performed in a separate unit test, with the test construction refactored out into separate methods, but the actual piece of code targeted by the test is over 500 lines long.

I guess this question is more about refactoring than unit testing, but in this case they go hand in hand.

If anyone has any advice (good or bad) then I'd be glad to hear it.

EDIT:

Pseudo code:

Find account for transaction
Do validation on transaction codes and values
Update transaction with info from account
Get related history from account
Handle different transaction codes and values (6 different combinations, each with different logic)
Update the transaction again with new account info (resulting from business logic)
Send transaction to clients

Separating out units from existing legacy code can be extremely tricky and time consuming. Check out Working Effectively With Legacy Code for a variety of tried and tested techniques to make things more manageable.

I have a .NET C# library that I have created, and I am currently writing some unit tests for it. At present I am writing unit tests for a cache provider class that I have created. Being new to writing unit tests, I have 2 questions, these being:

  1. My cache provider class is the abstraction layer to my distributed cache - AppFabric. So testing aspects of my cache provider class, such as adding to the AppFabric cache, removing from the cache etc., involves communicating with AppFabric. Are the tests for such behaviour still categorised as unit tests, or are they integration tests?

  2. Because the above methods interact with AppFabric, I would like to time them. If they take longer than a specified benchmark, the tests fail. Again I ask: can this performance benchmark test be classified as a unit test?

The way I have my tests set up, I want to group all unit tests together, integration tests together, etc., hence these questions, on which I would appreciate input.

  1. These would likely be considered integration tests since you aren't testing code in isolation, but rather the integration of at least two different classes of production code. If you wrapped the AppFabric class(es) behind interface(s) and then tested your cache provider class using stubs and/or mocks that implement the interface(s), then those would be considered unit tests (see the sketch after this list).

  2. Michael Feathers defines 1/10th a second or longer as too slow for a unit test in his book Working Effectively With Legacy Code, page 13. While it may technically be implemented as a unit test, you'll likely want to run these tests with integration tests, which also tend to take longer to execute.

    The reason is that if you have thousands of tests (say 10,000) and they take 1/10th a second each, you're looking at about 17 minutes to run them. If you have your tests run every time you commit your latest changes (so as to get quick feedback if you break something), that would likely be too long. This doesn't mean you shouldn't write slower unit tests, you may need them, but it just means you won't want to run them as often as the faster unit tests as the project grows.

    I also wouldn't make them fail if they take too long, as their timings will likely vary from run to run to some degree. If they are generally slower, group them with the integration tests.

    Further, Feathers summarizes on page 14:

    Unit tests run fast. If they don't run fast, they aren't unit tests.

    Other kinds of tests often masquerade as unit tests. A test is not a unit test if:

    1. It talks to a database.
    2. It communicates across a network.
    3. It touches the file system.
    4. You have to do special things to your environment (such as editing configuration files) to run it.
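
As a hedged illustration of point 1 (the interface, class and method names below are hypothetical; the real adapter would simply delegate to the AppFabric client, which is not shown):

using System.Collections.Generic;

// A narrow interface owned by your code; the production implementation
// would wrap the AppFabric client.
public interface ICacheClient
{
    void Put(string key, object value);
    object Get(string key);
}

public class CacheProvider
{
    private readonly ICacheClient _cache;

    public CacheProvider(ICacheClient cache)
    {
        _cache = cache;
    }

    public void Add(string key, object value)
    {
        _cache.Put(key, value);
    }
}

// Test double: no network and no AppFabric, so tests of CacheProvider
// stay fast and count as unit tests.
public class InMemoryCacheClient : ICacheClient
{
    private readonly Dictionary<string, object> _store = new Dictionary<string, object>();

    public void Put(string key, object value) { _store[key] = value; }

    public object Get(string key)
    {
        object value;
        return _store.TryGetValue(key, out value) ? value : null;
    }
}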

I have a new assignment coming up shortly where I have to re-architect some legacy COM applications in .NET WPF. If possible I need to re-use functionality or existing code; however, I suspect the scope for this is limited.

I need to replicate existing functionality but need to achieve it using a modern and extensible architecture.

Does anyone have any general advice for approaching a project of this type? Are there any good resources on this subject?

Are there any tried and tested techniques or common pitfalls?

Michael Feathers: Working effectively with legacy code presents a number of techniques to work with, and replace, legacy code. I found it quite readable; some of the methods (and many of the hacks) were new to me.

I have completed almost 50% of a project, but haven't written any test code. I want to write RSpec and Capybara tests. This is the reverse of how testing is usually done. What should my strategy be here, where should I start (model, controller, feature), and what should my approach be? Also, are there any tutorials for this?

Usually it's better to start with feature tests. They are easier to write and they provide the most coverage because they cover a lot of functionality at once. You will also need fewer feature tests than unit tests, for example, because they are at the top of the Testing Pyramid.

When you have achieved decent coverage you can start throwing in unit tests and refactoring your codebase. Having feature tests eliminates the fear of refactoring. Since you didn't write tests before the code, your methods will probably be hard to test without additional refactoring. That is an additional advantage of having feature tests before unit tests.

You can find a bunch of articles describing how people usually test their projects.

As an example here is an article from Thoughtbot https://robots.thoughtbot.com/how-we-test-rails-applications

I would also recommend the Working Effectively with Legacy Code book. It describes how to start covering projects with tests and then refactor the parts.

I have data model classes that contain private fields which are meant to be read-only (via a getter function). These fields are set by my JPA persistence provider (eclipselink) during normal operation, using the contents of the database. For unit tests, I want to set them to fake values from a mockup of the persistence layer. How can I do that? How does eclipselink set these values, anyway?

Simplified example:

@Entity
class MyEntity
{
    @Id
    private Integer _ix;

    public Integer ixGet()
    {
        return this._ix;
    }
}

Another option, if you really hate to make things public, is to create a subclass for testing, and provide public access there.

You have a few options:

  • Create stubs to replace your entity (extract an interface first)
  • Use Reflection
  • Add a public setter for testing
  • Keep your tests within the package and use a default scope

For a bunch of useful techniques, have a look at Michael Feathers' book, Working Effectively With Legacy Code

I have the following logic in more than decade-old code for which I have to write unit tests. It is a concrete class and the following logic lies in the constructor. Is there a good way to write unit tests/mocks for such legacy code? I am using the MSTest / RhinoMocks frameworks and the VS 2010 IDE with .NET Framework 4.0.

public class SomeClass
    {
        /// ctor
        public SomeClass(XmlNode node)
        {
            //Step 1: Initialise some private variable based on attributes values from the node

            //Step 2: Lot of If , else -if statements ---> something like - 

            if (/*attributeValue is something*/)
            {
                // Connect to Db, fetch  some value based on the attribute value. 
                // Again the logic of connecting and fetching is in another concrete class
            }
            else if (/*attributeValue is somthing else*/)
            {
                // fetch a value by loading a config file (this loading and reading of config file 
                // is again a singleton class where config file path is hardcoded)
            }
            else
            {
                // set some private member variable 
            }
        }
    }

Unit testing legacy code is tricky. In general you will have to refactor first to be able to write unit tests. Your best bet is very small refactoring steps that, one by one, improve testability while leaving the class under test in a "working" condition. I would recommend:

1.) Introduce "sensing" variables that allow you to verify the internal state of the class under test at key positions (i.e. before and after the DB call). This will allow you to write tests that verify the current behavior of the class (based on the public sensing variables) without having to refactor very much. Verify the behavior of the class based on these tests and start to refactor. The sensing variables are temporary and should be removed once you have finished your refactorings. They are only there so you can write tests in the meantime and refactor somewhat safely.

2.) One by one, replace references to concrete classes with interface references that you pass via the constructor. For the Singleton you have two options: one is to have the Singleton return a special instance for unit testing - this requires modifying the Singleton implementation but leaves your class under test unchanged. You can do this until you can refactor to use an interface dependency to replace the Singleton.

Also I would recommend picking up a copy of Working Effectively with Legacy Code which describes this step by step refactoring and especially dependency-breaking techniques in detail.
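
For the Singleton option, a hedged sketch (ConfigReader and its members are hypothetical; the real singleton's private constructor would typically need to become protected so a testing subclass can exist):

public class ConfigReader
{
    private static ConfigReader _instance;

    // In the real class this constructor is probably private; making it
    // protected is the small change that enables a testing subclass.
    protected ConfigReader() { }

    public static ConfigReader Instance
    {
        get { return _instance ?? (_instance = new ConfigReader()); }
    }

    // Test-only seam: lets a test substitute a canned instance.
    public static void SetTestInstance(ConfigReader testInstance)
    {
        _instance = testInstance;
    }

    public virtual string GetValue(string key)
    {
        // Production behaviour: read from the hard-coded config file path.
        return "...";
    }
}

// In a test fixture:
public class FakeConfigReader : ConfigReader
{
    public override string GetValue(string key)
    {
        return "test-value";
    }
}
// ConfigReader.SetTestInstance(new FakeConfigReader());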

I have to maintain a batch script of about 3500 lines littered with GOTO. Seems that the original "developer" hasn't heard of this famous paper and modular programming.

What the script does?

The script deals with the (silent) installation/uninstallation/reinstallation of several programs using different options. It could be split into several files, each dealing with one program. The problem is that if you try to move a part into another file, that part will still GOTO another section that needs to stay in the original script.

Refactoring?

Normally you wouldn't do a refactoring without having automated tests (so you can be sure you didn't break anything), but I don't know how to do it. There's no testing framework for that.

Partial Solution

I have come up with a partial "solution" that is some kind of adaptation of characterization tests (from Working Effectively with Legacy Code by Michael Feathers) and approval tests:
  • create another script, test.py, that replaces all commands (like copy or msiexec) with echo,
  • redirect the output to a text file (good.txt),
  • change the original batch script,
  • run the test.py script again and save the output to another text file (current.txt),
  • diff good.txt and current.txt -> if there are no differences then I didn't break anything, but if they are different I need to check whether I broke something.

Problem with partial solution

How can I capture and replace all the commands? I could make a list of commands to replace, but there are also a lot of string concatenations to get the name and path of the program to be installed.

CMD level capture/hook?

Is there any way I can hook into the command line interpreter (CMD.exe) so I can replace on the fly all the calls to installers with echo?

Other suggestions?

Do I approach the problem in the wrong way? Can I do it better somehow? Do you have some advice I could use?

I just joined a Heroic shop. The code appears to be clean and quality, but there is no documentation to speak of. (the code was written and maintained by a small group). Thus, new engineers need to reengineer from source to determine design. As the project is transitioning from a small (3) team to a larger team, I would like to document my efforts to learn the applications so the next hire can learn them more quickly.

A quick search of "document existing" + application | code doesn't yield much. I am sure this is a common question, so reference to another discussion might be the best.

Any suggestions?

Consider doing some of your documentation in the form of automated tests. Then you'll (a) verify that the code really does work the way you understand it to, and (b) build up a suite of regression tests to let you know if you break something later (e.g. due to not realizing that changing X will cause a side effect in Y -- very possible even in code you are familiar with).

If you don't know how to get started with adding tests to existing code, pick up Michael Feathers' excellent book "Working Effectively with Legacy Code".

As for (easily) human-readable and -skimmable documentation, Fit and FitNesse can produce human-readable, but executable, tests, especially when the requirements can be represented in a grid (these inputs -> these outputs).

I'm potentially facing a refactoring project at work and am having a little bit of trouble grasping the single responsibility principle example that shows up on most websites. It is the one about separating the connection and send/receive methods of a modem into two different objects. The project is in Python, by the way, but I don't think it is a language-specific issue.

Currently I'm working to break up a 1300 line web service driver class that somebody created (arbitrarily split into two classes but they are essentially one). On the level of responsibility I understand I need to break the connectivity, configuration, and XML manipulation responsibilities into separate classes. Right now all is handled by the class using string manipulations and the httplib.HTTPConnection object to handle the request.

So according to this example I would have a class to handle only the http connection, and a class to transfer that data across that connection, but how would these communicate? If I require a connection to be passed in when constructing the data transfer class, does that re-couple the classes? I'm just having trouble grasping how this transfer class actually accesses the connection that has been made.

With a class that huge (> 1000 lines of code) you have more to worry about than only the SRP or the DIP. I have (or rather "fight") classes of similar size, and from my experience you have to write unit tests where possible. Carefully refactor (very carefully!). Automated testing is your friend - be it unit testing as mentioned, or regression testing, integration testing, acceptance testing, or whatever you are able to automatically execute. Then refactor. And then run the tests. Refactor again. Test. Refactor. Test.

There is a very good book that describes this process: Michael Feathers' "Working Effectively With Legacy Code". Read it.

For example, draw a picture that shows the dependencies of all methods and members of this class. That might help you to identify different "areas" of responsibility.

I'm toying with the idea of phasing an ORM into an application I support. The app is not very structured, with no unit tests. So any change will be risky. I'm obviously concerned about having a good enough reason to change. The idea is that there will be less boilerplate code for data access and therefore greater productivity.

Does this ring true with your experiences?
Is it possible or even a good idea to phase it in?
What are the downsides of an ORM?

The "Robert C Martin" book, which was actually written by Michael Feathers ("Uncle Bob" is, it seems, a brand name these days!) is a must.

It's near-impossible - not to mention insanely time-consuming - to put unit tests into an application not developed with them. The code just won't be amenable.

But that's not a problem. Refactoring is about changing design without changing function (I hope I haven't corrupted the meaning too badly there) so you can work in a much broader fashion.

Start out with big chunks. Set up a repeatable execution, and capture what happens as the expected result for subsequent executions. Now you have your app, or part of it at least, under test. Not a very good or comprehensive test, sure, but it's a start and things can only get better from there.
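
One way to set that up, sketched here with NUnit purely as an assumption (any test runner works) and a placeholder RunTheBigChunk() method:

using System.IO;
using NUnit.Framework;

[TestFixture]
public class CharacterizationTests
{
    [Test]
    public void BigChunk_StillProducesTheCapturedOutput()
    {
        string actual = RunTheBigChunk();
        const string approvedFile = "big-chunk.approved.txt";

        if (!File.Exists(approvedFile))
        {
            // First run: capture whatever the app currently does as the
            // expected result for all subsequent runs.
            File.WriteAllText(approvedFile, actual);
            Assert.Inconclusive("Captured current output as the approved result.");
        }

        Assert.AreEqual(File.ReadAllText(approvedFile), actual);
    }

    private static string RunTheBigChunk()
    {
        // Placeholder: execute the chunk of the application under test and
        // return its observable output (report text, generated SQL, etc.).
        return "...";
    }
}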

Now you can start to refactor. You want to start extracting your data access code so that it can be replaced with ORM functionality without disturbing too much. Test often: with legacy apps you'll be surprised what breaks; cohesion and coupling are seldom what they might be.

I'd also consider looking at Martin Fowler's Refactoring, which is, obviously enough, the definitive work on the process.

I would strongly recommend getting a copy of Michael Feathers' book Working Effectively With Legacy Code (by "Legacy Code" Feathers means any system that isn't adequately covered by unit tests). It is full of good ideas which should help you with your refactoring and phasing in of best practices.

Sure, you could phase in the introduction of an ORM, initially using it for accessing some subset of your domain model. And yes, I have found that use of an ORM speeds up development time - this is one of the key benefits and I certainly don't miss the days when I used to laboriously hand-craft data access layers.

Downsides of ORM - from experience, there is inevitably a bit of a learning curve in getting to grips with the concepts, configuration and idiosyncrasies of the chosen ORM solution.

Edit: corrected author's name

I'm working on an ASP.NET WebForms project.

The project isn't covered by any tests at all, besides the few unit tests I wrote myself to cover a small static method of mine. I would like to write unit tests for the entire project so that I can refactor and add code to the project more efficiently (I have the CTO's support for this), but I've run into a problem.

The web application mostly consists of LINQ queries to the database, with no abstraction between the database and the code. In other words, we don't use something like a repository; instead we just write the LINQ queries wherever we need them.

From my understanding, unit tests should not call a database directly as that would be too slow, so I would like to decouple them from the database. I first tried that with a mocking framework called Moq. It seemed to work, but then I read that LINQ queries against a database behave differently from LINQ queries against objects, so the tests passing doesn't mean that my methods work.

This made me consider the repository pattern, hiding the linq queries in repository classes and then mock these classes in my tests. There are (as far as I can tell) two problems with this.

  • Changing the project to use the repository pattern is a massive job, and I'm not sure that the repository pattern is the right tool for the job. So I would like to know beforehand if it is suitable or not.
  • The database schema isn't very good. It was made for us by another company, and they simply took a generic schema and added a few rows to a few tables to "customize it" for us. Many of the tables and columns are poorly named, it can't contain all the data we need it to contain, and the logical structure doesn't suit us very well, which requires complicated queries. I'm pretty sure that the correct solution is to rewrite the schema to suit us, but the CTO wants it to remain the same so the project can be more easily modified by the company who made the database and project if we want them to, so that's not an option.

Assuming that the repository pattern is what I should use, my question is: how do I deal with the poor database schema? I assume that I need to hide it behind an abstraction that handles the complicated querying for me, and maybe draw an ER diagram to figure out how the schema should look, but I could be wrong, and I'm not sure about the details.

I would love it if you could give me links to examples and tutorials, and tell me if my assumptions are wrong.

Thank you in advance for your patience and help.

If this were me I would not focus on unit tests for this. I would first try to get a suite of end-to-end tests which characterize the behaviour of the system as it stands. Then, as you refactor parts of the system, you have some confidence that things are no more broken than they were before.

As you point out, different LINQ providers have different behaviour, so having the end-to-end tests will ensure that you are actually testing that the system works.

I can recommend SpecFlow as a great tool for building your behaviour based tests, and I can recommend watching this video on pluralsight for a great overview of SpecFlow and a good explanation of why you might be better with end to end tests than having unit tests.

You'll also get a lot out of reading 'Working effectively with legacy code' and reading some of the links and comments here might be useful as well.

You'll notice that some of the comments linked above point out that you need to write unit tests, but that often you need to refactor before you can write the tests (as the code isn't currently testable), and that this isn't safe without unit tests. Catch-22. Writing end-to-end tests can often get you out of this catch-22, at the expense of having a slow running test suite.

I need to patch three methods (_send_reply, _reset_watchdog and _handle_set_watchdog) with mock methods before testing a call to a fourth method (_handle_command) in a unit test of mine.

From looking at the documentation for the mock package, there's a few ways I could go about it:

With patch.multiple as decorator

@patch.multiple(MBG120Simulator,
                _send_reply=DEFAULT,
                _reset_watchdog=DEFAULT,
                _handle_set_watchdog=DEFAULT,
                autospec=True)
def test_handle_command_too_short_v1(self,
                                     _send_reply,
                                     _reset_watchdog,
                                     _handle_set_watchdog):
    simulator = MBG120Simulator()
    simulator._handle_command('XA99')
    _send_reply.assert_called_once_with(simulator, 'X?')
    self.assertFalse(_reset_watchdog.called)
    self.assertFalse(_handle_set_watchdog.called)
    simulator.stop()

With patch.multiple as context manager

def test_handle_command_too_short_v2(self):
    simulator = MBG120Simulator()

    with patch.multiple(simulator,
                        _send_reply=DEFAULT,
                        _reset_watchdog=DEFAULT,
                        _handle_set_watchdog=DEFAULT,
                        autospec=True) as mocks:
        simulator._handle_command('XA99')
        mocks['_send_reply'].assert_called_once_with('X?')
        self.assertFalse(mocks['_reset_watchdog'].called)
        self.assertFalse(mocks['_handle_set_watchdog'].called)
        simulator.stop()

With multiple patch.object decoratorations

@patch.object(MBG120Simulator, '_send_reply', autospec=True)
@patch.object(MBG120Simulator, '_reset_watchdog', autospec=True)
@patch.object(MBG120Simulator, '_handle_set_watchdog', autospec=True)
def test_handle_command_too_short_v3(self,
                                     _handle_set_watchdog_mock,
                                     _reset_watchdog_mock,
                                     _send_reply_mock):
    simulator = MBG120Simulator()
    simulator._handle_command('XA99')
    _send_reply_mock.assert_called_once_with(simulator, 'X?')
    self.assertFalse(_reset_watchdog_mock.called)
    self.assertFalse(_handle_set_watchdog_mock.called)
    simulator.stop()

Manually replacing methods using create_autospec

def test_handle_command_too_short_v4(self):
    simulator = MBG120Simulator()

    # Mock some methods.
    simulator._send_reply = create_autospec(simulator._send_reply)
    simulator._reset_watchdog = create_autospec(simulator._reset_watchdog)
    simulator._handle_set_watchdog = create_autospec(simulator._handle_set_watchdog)

    # Exercise.
    simulator._handle_command('XA99')

    # Check.
    simulator._send_reply.assert_called_once_with('X?')
    self.assertFalse(simulator._reset_watchdog.called)
    self.assertFalse(simulator._handle_set_watchdog.called)

Personally I think the last one is the clearest to read, and will not result in horribly long lines if the number of mocked methods grows. It also avoids having to pass in simulator as the first (self) argument to assert_called_once_with.

But I don't find any of them particularly nice. Especially the multiple patch.object approach, which requires careful matching of the parameter order to the nested decorations.

Is there some approach I've missed, or a way to make this more readable? What do you do when you need to patch multiple methods on the instance/class under test?

No, you haven't missed anything really different from what you proposed.

As for readability, my taste is for the decorator approach, because it removes the mocking setup from the test body... but that is just taste.

You are right: if you patch the method on the class with autospec=True, you must pass self (the instance) to the assert_called_* family of check methods. But your case is a simple one, because you know exactly what object you need to patch and you don't need any more context for your patch than the test method.

You only need to patch your object and use it for all your tests. Often in tests you cannot get hold of the instance to patch before making your call, and in those cases create_autospec cannot be used: you can only patch the method on the class instead.

If you are bothered by passing the instance to assert_called_* methods, consider using ANY to break the dependency. Finally, I have written hundreds of tests like this and I have never had a problem with the argument order.

My standard approach to your test is:

@patch('mbgmodule.MBG120Simulator._send_reply', autospec=True)
@patch('mbgmodule.MBG120Simulator._reset_watchdog', autospec=True)
@patch('mbgmodule.MBG120Simulator._handle_set_watchdog', autospec=True)
def test_handle_command_too_short(self,mock_handle_set_watchdog,
                                          mock_reset_watchdog,
                                          mock_send_reply):
    simulator = MBG120Simulator()
    simulator._handle_command('XA99')
    # You can use ANY instead of simulator if you don't have the instance
    mock_send_reply.assert_called_once_with(simulator, 'X?')
    self.assertFalse(mock_reset_watchdog.called)
    self.assertFalse(mock_handle_set_watchdog.called)
    simulator.stop()
  • Patching is outside the test method body
  • Every mock name starts with a mock_ prefix
  • I prefer a plain patch call with the absolute path: it is clear and neat what you are doing

Finally: maybe creating the simulator and stopping it are setUp() and tearDown() responsibilities, and the tests should only take care of patching some methods and doing the checks.

I hope this answer is useful, but the question doesn't have a single valid answer, because readability is not an absolute concept and depends on the reader. Moreover, even though the title speaks about the general case, the question's examples are about the specific class of problem where you need to patch methods of the object under test.

[EDIT]

I thought about this question for a while, and I found what bothers me: you are trying to test and sense on private methods. When this happens, the first thing you should ask is why? There is a good chance that the answer is that these methods should be public methods of private collaborators (not my words).

In that new scenario you should sense on the private collaborators, and you cannot just change your object. What you need to do is patch the methods of those other classes at the class level.

I need to write tests (using the Google testing framework) for a small study program that was not written by me. (It's just a small console game which can get its modes from the command line or at runtime.) There is a problem: I can't change the source code, but almost all methods use cout and cin. My question is: how do I answer the requests (cin) of the program while testing (something like getting the data for cin from a string)?

I know you said you can't modify the code, but I'll answer this as if you can. The real world typically allows (small) modifications to accommodate testing.

One way is to wrap your calls that require external inputs (DB, user input, sockets, etc...) in function calls that are virtual so you can mock them out. (Example below). But first, a book recommendation on testing. Working Effectively with Legacy Code is a great book for testing techniques that aren't just limited to legacy code.

class Foo {
public:
   bool DoesSomething() 
   {
      string usersInput;
      cin >> usersInput;
      if (usersInput == "foo") { return true; }
      else { return false; }
   }
};

Would turn into:

class Foo
{
public:
   bool DoesSomething() {
      string usersInput = getUserInput();
      if (usersInput == "foo") { return true; }
      else { return false; }
   }

protected:
   virtual std::string getUserInput() {
      string usersInput;
      cin >> usersInput;
      return usersInput;
   }

};

class MockFoo : public Foo {
public:
   void setUserInput(std::string input) { m_input = input; }

protected:
   virtual std::string getUserInput() { return m_input; }

private:
   std::string m_input;
};

TEST(FooTest, UsersInput)
{
   MockFoo foo;
   foo.setUserInput("SomeInput");
   EXPECT_FALSE(foo.DoesSomething());

   foo.setUserInput("foo");
   EXPECT_TRUE(foo.DoesSomething());
}

I saw this question asked about C#; I would like an answer for PHP. I have some old code with four pages of foreach loops and conditions, which just makes it hard to read and follow. How would I make this more OO? I was thinking of using the SPL functions but don't fully understand what's involved yet.

Some good advice from Johnathan and gnud. Don't worry so much about the SPL, and learn more about refactoring, and look into what refactoring tools are available for PHP.

Also, I can't recommend enough reading Working Effectively with Legacy Code.

And of course the canonical Refactoring book.

PHP is a difficult language to keep clean of code smells, but with a little perseverance it can be done! And you will thank yourself every day you have to look at your codebase.

I've been always thinking "Testing? I don't need no freakin' testing! I'm doing without it just fine!"

And then my code became long.

Now I realize why it is important to write tests. I'm always afraid a little change will result in breaking something. I really want to start writing tests. But the code base has become so large that I'm really overwhelmed. I don't know where to start. And much of it I don't even remember why I coded it that way so if I begin to go back and write tests, it will take forever.

Can someone provide advice on how to start writing tests if I already have a large code base?

First, I would recommend reading the WELC book, which should really come in handy in your situation.

Start the next time you touch the code: the very next time you need to change something, write a test for it first, and then continue to write tests surrounding everything that you have to change, update, fix, or add. That way, over time you will have added tests to all the areas that are changing, and you will get to a point where writing tests for the rest of the code doesn't seem so overwhelming.

However, I would reiterate that the book I linked to on legacy code will be very helpful, as it goes into lots of detail about ways to approach this issue.

I've done most of my coding on smaller projects for myself and classes, and this summer ended up working on a fairly large project.

For reference, I'm working on a web app using Zend Framework, and I'm working with one other person.

I started working on it in May - at the time, there was a lot of code already written (pretty sloppily) that I didn't have much control over. When I started, not knowing any better, I was just either editing files directly on the server with vim or using eclipse and ftp-ing the files up. Over the course of working on this, I've started using eclipse for editing, and SVN for source control and to deploy the files to the server.

My question is what are other good ways to both help my productivity with this, and also with managing the deployment better, as it gets closer to real people using it?

Edit: Realized two related things I left out.

One is that I've been eradicating bad/dangerous code as I end up touching on something that uses it. Is it generally better to devote time specifically to that, or is the way I'm doing it generally efficient enough?

Also, given that I have relatively limited time, is unit testing worth the effort or even appropriate?

Edit #2: I was reminded I left out the project management part. I've been having problems deciding what to hit first. I feel like some days I spend as much time keeping tabs on what still needs to be done as actually doing it. Is this common? How do others deal with it?

Changes I'm going to make: The big one I decided was to actually keep track of everything that needs to be done. I have a well-written spec of what's going to be there in the end, but a lot of times I'd lose little things and say over and over they need to be done. Now I'll keep track of them. Also, going to look more into a testing and deploying system.

Unit testing is crucial: it's the only way to get some assurance that your "eradicating bad code as you meet it" isn't going to break things badly. When you smell or spot bad code, build tests around it, THEN proceed to refactor -- really the only sane way to deal with legacy code!

You're already doing other key aspects, such as putting the code under version control; if you were managing/leading a large or distributed team I'd strongly recommend DVCS (hg is my choice, but git and bazaar are very popular too), but as you appear to be working on your own, svn's just fine.

Continuous build is the next recommended practice -- but, again, if you're basically on your own and just make sure to run your test suite obsessively, that comes to much the same thing. Next: good releng practices -- if and when a bug's reported on your code, you MUST be able to reconstruct EXACTLY the set of sources, configs, etc, that caused that bug.

What IDE or editor you like -- eclipse, vim, emacs, zend stuff, w/ever -- is really secondary. You seem to be developing the right instincts -- good for you!!!-)

I've got lots of problems with the project I am currently working on. The project is more than 10 years old, and it was based on one of those commercial C++ frameworks which were very popular in the 90's. The problem is with statecharts. The framework provides a quite common implementation of the state pattern. Each state is a separate class, with an action on entry, an action in state, etc. There is a switch which sets the current state according to received events.

The devil is hidden in the details. The project is enormous: about 2000 KLOC. There are definitely too many statecharts (I've seen "for" loops implemented using statecharts). What's more... the framework allows embedding a statechart in another statechart, so there are many statecharts with seven or even more levels of nesting. Because statecharts run in different threads, and it's possible to send events between statecharts, we have lots of synchronization problems (and a big mess in the interfaces).

I must admit that the scale of this problem is overwhelming and I don't know how to touch it. My first idea was to remove as much code as I can from the statecharts and put it into separate classes, then delegate from the statechart to those classes to do the job. But as a result we would have many separate functions which logically don't have any specific functionality, and any change in the statechart architecture would also require a change to those classes and functions.

So I'm asking for help: Do you know any books/articles/magic artefacts which can help me to fix this? I would like to at least separate as much code as I can from the statecharts without introducing any hidden dependencies, and keep the separated code maintainable, testable and reusable.

If you have any suggestion how to handle this, please let me know.

Sounds to me like your best bet is (gulp!) likely to start from scratch if it's as horrifically broken as you make out. Is there any documentation? Could you begin to build some saner software based on the docs?

If a complete re-write isn't an option (and they never are in my experience) I'd try some of the following:

  1. If you don't already have it, draw an architectural picture of the whole system. Sketch out how all the bits are supposed to work together and that will help you break the system down into potentially manageable / testable parts.
  2. Do you have any kind of requirements or testing plan in place? If not, can you write one and start to put unit tests in place for the various chunks of code / functionality which exist already? If you can do that, you can start to refactor things without breaking as much of whatever does currently work.
  3. Once you've broken things down a bit, start building your unit tests into integration tests which pull together more of the functionality.

I've not read them myself, but I've heard good things about these books which may have some advice you can use:

  1. Refactoring: Improving the Design of Existing Code (Object Technology Series).
  2. Working Effectively with Legacy Code (Michael Feathers)

Good luck! :-)

I work on a project that was just told we had to incorporate Parasoft C++ unit testing tool into any code changes going forward. The problem I face is that we have methods with very small changes and now it seems we are forced to unit test the whole method. Many of these methods are hundreds or thousands of lines of code. I know for certain that if I have to test the methods entirely then we will run into fixing old issues such as null pointer checks and our budget and manpower can't handle these fixes.

Does anyone know if Parasoft allows you to test small portions of a method, or if another unit testing framework would work better?

No unit testing framework allows you to just test portions of a method.

One ugly suggestion is to use #include to include small chunks of code directly into methods, with the same #include used to include that code into a testing method that sets up variables used by that code.

I recommend Michael Feather's book Working Effectively with Legacy Code for advice on how to add testing to a large code base. It's also available online at Safari.

Let's say you're the lucky programmer who inherited code that is close to software rot. Software rot, as defined in The Pragmatic Programmer, is code that is too ugly (in this case, unrefactored code); it is compared to a broken window that no one wants to fix, which in turn can damage the whole house and let crime thrive in that city.

But it is the same code that Joel Spolsky, on Joel on Software, values because it contains valuable patches which have been debugged throughout its lifetime (even though it can look unstructured and ugly).

How would you maintain this?

Have a look at Working Effectively with Legacy Code by Michael Feathers. Lots of good advice there.

WELC is a great book. You should certainly check it out. If you don't want to wait for the book to arrive, I can summarise the bits I think are important:

  1. You need to understand your system. Do some throwaway coding to understand the part you need to work on. E.g. be prepared to do some work to get the system under test, in the knowledge that you will probably break it (and then understand what went wrong).
  2. Look for areas where you can break dependencies. Michael Feathers calls these seams. They are points where you can take a bit of the legacy system and refactor it so it becomes testable.
  3. As you work on the system add tests as you go.

My new project is to renew software that a customer has used for years. As we all know... things grew over the years.

So I looked at the old app, gathered all the information about it, and wrote a lot of user stories down. After cleaning it up I recognized that, in the end, I made a mistake that leads to the same problem the customer has now.

It is a different mistake I made, but a really annoying one.

How do you guys prevent such mistakes? Do you refuse to look at the old app?

This is a really hard problem. I'll describe what I would do (and have done) if the old code is substantially large.

In general, the old code is full of decisions, bug fixes and undocumented behaviors. If you throw that away, you're bound to make many of the same mistakes they did and then some more.

For what it's worth, you should evolve the system around the old code. Try to abstract away from the old code, e.g., by creating interfaces, and then implement them by calling the old code at first. Write lots of unit tests for the interfaces, and gain knowledge about how the old code works. New features should gain new implementations, so old code and new code will live side-by-side, for as long as needed, and maybe even forever.
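
As a rough sketch of that idea (all names here are hypothetical, just to show the shape):

// The new code and the tests depend only on this interface.
public interface ICustomerArchive
{
    Customer FindByNumber(string customerNumber);
}

// Thin adapter over the old code: it keeps all the hard-won bug fixes and quirks,
// while giving the rest of the system a narrow seam to work against.
public class LegacyCustomerArchive : ICustomerArchive
{
    public Customer FindByNumber(string customerNumber)
    {
        // Delegate to the existing legacy routine unchanged (OldCustomerModule is a placeholder).
        return OldCustomerModule.LookupCustomer(customerNumber);
    }
}

A new implementation of ICustomerArchive can later live side by side with the legacy adapter and be swapped in feature by feature, while the unit tests written against the interface keep passing.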

Now, slowly and carefully make incursions into the old code, refactoring it, replacing it, and making sure that your tests still pass. Write new tests as well. If you have regression tests for the system (apart from the unit tests), they still need to pass.

It's OK not to touch the old code: typically if it's working OK, there are no bugs reported for it, it passes your tests, and it doesn't need to be extended.

Some good resources:

http://martinfowler.com/bliki/StranglerApplication.html

Working Effectively with Legacy Code

I've also found it in StackOverflow already:

Strategy for large scale refactoring

My software company never did BDD or even TDD before. Testing before meant to simply try out the new software some days before deployment.

Our recent project is about 70% done. We also use it as a playground for new technologies, tools and ways of development. My boss wanted me to switch to it to "test testing".

I tried out Selenium 2 and RSpec. Both are promising, but how do I catch up on months of development? Further problems are:

  • new language
  • I never wrote a line of the code myself
  • huge parts are written by freelancer
  • lots of fancy hacking
  • no documentation at all besides some source comments and flow charts

All I was able to do was cover a whole process with Selenium. This turned out to be quite painful (but still possible), since the software was obviously never meant to be tested this way. We have lots of dynamically generated ids, fancy jQuery and much more. I don't even know how to get started with RSpec.

So, is it still possible to apply BDD to this project? Or should I run far away and never come back?

Reading the book Working Effectively with Legacy Code might be helpful. There is also a short version in PDF form.

Before you start - have you asked your boss what he values from the testing? I'd clarify this with your boss first. The main benefits of system-level BDD are in the conversations with business stakeholders before the code is written. You won't be able to get this if all you're doing is wrapping existing code. At a unit level, the main benefits are in questioning the responsibilities of classes and their behavior, which, again, you won't be able to get. It might be useful for helping you understand the value of the application and each level of code, though.

If your boss just wants to try out BDD or TDD it may be simpler to start a new project, or even grab an existing project from someone else to wrap some tests around. If he genuinely wants to experiment with BDD over legacy code, then it may be worth persisting with what you have - @Esko's book suggestion rocks. Putting higher-level system tests around the existing functionality can help you to avoid breaking things when you refactor the lower-level code (and you will need to do so, if you want to put tests in place). I additionally recommend Martin Fowler's "Refactoring".

RSpec is best suited to applying BDD at a unit level, as a variant of TDD. If you're looking to wrap automated tests around your whole environment, take a look at Cucumber. It makes reusing steps a lot simpler. You can then call Selenium directly from your steps.

I put together a page of links on BDD here, which I hope will help newcomers to understand it better. Best of luck.

Sometimes when you deal with legacy code you find it hard to break the dependencies. One of the solutions that some unit-testing gurus (e.g. Roy Osherove in The Art of Unit Testing or Michael Feathers in Working Effectively with Legacy Code) propose is to use the "Subclass and override" technique. The idea of this technique is to encapsulate the usage of dependency in protected method and then test the extended class with that method overridden with method that uses fake dependency.

Also there are two more techniques that, imo, are doing almost the same. Those are:

  • mock the protected method that contains dependency
  • use reflection (if it is supported in your language) to make the dependency-containing protected method to return what is needed

However the latter two techniques are considered to be anti-patterns (you don't mock what you are testing). But, to me, all three techniques are using the same idea. Is the "Subclass and override" technique also considered an anti-pattern and if not, why?

Update

The phrase "you don't mock what you are testing" seems to be confusing so I should probably explain myself. Though only one method of the class is mocked, AFAIK, the fact of mocking the part of the class under test is considered to be bad practice because it breaks encapsulation and exposes implementation details to the test.

I don't understand your comment about those two other techniques.

You mock a method because it depends on something you don't want in your test. You are not testing that method; you are testing the rest of the code, and this method (before it is mocked) prevents you from doing so.

This is my first encounter with unit testing, and I am trying to understand how this concept can be used for a simple date validation.

The user can select a ToDate that represents the date until which a payment can be made. If our date is not valid, the payment can't be made.

    private void CheckToDate(DateTime ToDate)
    {
        if (Manager.MaxToDate < ToDate.Year)
            //show a custom message
    }

How can unit tests be used in this case?

Regards,

Alex

Thanks for your answers:

As suggested by many of you I will split the function and separate the validation from the message display and use unit tests just for this.

public bool IsDateValid(DateTime toDate)
{
    return (Manager.MaxToDate < toDate.Year);
}

Sure thing :-) Detecting that the custom message is shown can require a little trick (I assume you mean a messagebox displayed on a GUI, but the idea is the same even if the message is displayed differently).

You can't detect message boxes from unit tests, nor do you want to launch the whole GUI environment from your unit tests. The easiest way to work around this is to hide the actual code displaying a message box in a separate method, ideally behind a distinct interface. Then you can inject a mock implementation of this interface for your unit tests. This mock does not display anything; it just records the message passed to it, so you can check it in your unit test.

The other issue is that your method is private. Check first where it is called from, and whether it can be called via a public method without too much complication. If not, you may need to make it (temporarily) public to enable unit testing. Note that the need to unit test private methods is usually a design smell: your class may be trying to do too much, taking on too many distinct responsibilities. You may be able to extract some of its functionality into a distinct class, where it becomes public, thus directly unit testable. But first you need to have those unit tests, to ensure you are not breaking anything when refactoring.

You then need to set up the Manager.MaxToDate prior to the test with a suitable date, and call CheckToDate with various parameters, check that the result is as expected.
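
As a rough sketch of that idea (the interface and class names below are made up for illustration; Manager.MaxToDate comes from your code and is assumed to be a year well below 9999):

// The seam for displaying messages, so tests never pop up a real message box.
public interface IMessageDisplay
{
    void ShowMessage(string text);
}

// Test double that only records what it was asked to show.
public class RecordingMessageDisplay : IMessageDisplay
{
    public string LastMessage { get; private set; }

    public void ShowMessage(string text)
    {
        LastMessage = text;
    }
}

// The class under test receives the display through its constructor.
public class PaymentDateChecker
{
    private readonly IMessageDisplay _display;

    public PaymentDateChecker(IMessageDisplay display)
    {
        _display = display;
    }

    public void CheckToDate(DateTime toDate)
    {
        if (Manager.MaxToDate < toDate.Year)
            _display.ShowMessage("The payment date is outside the allowed range.");
    }
}

[TestMethod]
public void CheckToDate_ShowsMessage_WhenYearExceedsMaximum()
{
    var display = new RecordingMessageDisplay();
    var checker = new PaymentDateChecker(display);

    checker.CheckToDate(new DateTime(9999, 1, 1)); // assumed to be past Manager.MaxToDate

    Assert.IsNotNull(display.LastMessage);
}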

The recommended reading for similar tricks and more is Working Effectively with Legacy Code.

This might be a duplicate, but as I haven't seen anything hopefully it isn't. Basically, I've never seen a good rule-of-thumb for when to split a chunk of code into separate methods and objects. Does anyone have any good rules they found work well?

If a method doesn't fit on a page, it's too big. :-)

That's not an excuse to use a really small font or use a really big monitor

Go and read:

Although they talk about how to deal with what you already have, if you read them and learn from them you can aim not to create legacy code to start with!

We have recently adopted the specification patterns for validating domain objects and now want to introduce unit testing of our domain objects to improve code quality.

One problem I have found is how best to unit test the validate functionality shown in the example below. The specification hits the database so I want to be able to mock it but as it is instantiated in-line I can't do this. I could work off interfaces but this increases the complexity of the code and as we may have a lot of specifications we will ultimately have a lot of interfaces (remember we are introducing unit testing and don't want to give anyone an excuse to shoot it down).

Given this scenario how would we best solve the problem of unit testing the specification pattern in our domain objects?

...
public void Validate()
{
    if(DuplicateUsername())
    { throw new ValidationException(); }
}

public bool DuplicateUsername()
{
    var spec = new DuplicateUsernameSpecification();
    return spec.IsSatisfiedBy(this);
}

A more gentle introduction of Seams into the application could be achieved by making core methods virtual. This means that you would be able to use the Extract and Override technique for unit testing.

In greenfield development I find this technique suboptimal because there are better alternatives available, but it's a good way to retrofit testability to already existing code.

As an example, you write that your Specification hits the database. Within that implementation, you could extract the part that creates the Specification into a Factory Method that you can then override in your unit tests.
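
For illustration, here is a minimal sketch of the Extract and Override idea under a couple of assumptions: the domain class is called User here (a made-up name), and the specification check itself is made virtual; the shape of the test subclass is the same if you instead extract only the creation of the specification into a virtual factory method.

public class User
{
    public void Validate()
    {
        if (DuplicateUsername())
        { throw new ValidationException(); }
    }

    // Made virtual so that a test-only subclass can replace the
    // database-hitting specification check ("Extract and Override").
    public virtual bool DuplicateUsername()
    {
        var spec = new DuplicateUsernameSpecification();
        return spec.IsSatisfiedBy(this);
    }
}

// Lives only in the test project.
public class TestableUser : User
{
    public bool HasDuplicateUsername { get; set; }

    public override bool DuplicateUsername()
    {
        return HasDuplicateUsername;   // no database involved
    }
}

A test can then create a TestableUser, set HasDuplicateUsername, call Validate() and assert on the outcome, without ever touching the database.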

In general, the book Working Effectively with Legacy Code provides much valuable guidance on how to make code testable.

We're getting some errors, if we try to test something with the asp.net membership framework. It seems that it can't instantiate the asp.net membership environment and so, it can't access all the profiles, users and so on.

Has anybody seen a similar problem? Or is it working for someone?

Cheers, Steve

If you are depending on external resources such as your database and the configuration files (used when using ASP.NET membership), you aren't writing very effective unit tests. You will have to keep everything in sync, including the data in the database. This becomes a maintenance nightmare.

If you want to test this way, I recommend setting up your configuration to have membership (you can grab this from your application). You then will also want to set up a test database to connect to. It needs to be populated with fake data, so you'll need scripts to make sure the data is consistent.

I would however recommend that you take the approach of doing more proper unit tests. When we refer to a "unit test" we mean testing a very small piece of code at a time. This code should not depend on anything else, so what you need to do is use interfaces and use fakes, stubs, or mocks so that your test's scope is confined to a single unit of code.
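
As a rough sketch of what such a seam could look like (the interface and class names here are made up, not part of the ASP.NET API):

public interface IMembershipService
{
    bool ValidateUser(string userName, string password);
}

// Production implementation: a thin wrapper that delegates to the real ASP.NET membership API.
public class AspNetMembershipService : IMembershipService
{
    public bool ValidateUser(string userName, string password)
    {
        return System.Web.Security.Membership.ValidateUser(userName, password);
    }
}

// Test double: no database, no configuration files needed.
public class FakeMembershipService : IMembershipService
{
    public bool UserIsValid { get; set; }

    public bool ValidateUser(string userName, string password)
    {
        return UserIsValid;
    }
}

Code that previously called Membership directly takes an IMembershipService instead, so the unit tests can hand it the fake while the running application still uses the real membership provider.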

If you want to go this route I highly recommend reading Working Effectively with Legacy Code. There are also plenty of other books and resources which talk about how to find seams and decouple your code so you're able to test it. Once you get the hang of unit testing you'll be glad you looked into this.

While starting a new project, we kick start it based on what is "latest" and what is "known".

This includes selection of programming languages, frameworks in those languages etc. Quite a lot of time is spent on architectural design and detailed level design in terms of using specific frameworks and design patterns etc.

Things go on fine till we complete the development and push things to production.

Then comes maintenance (Defect fixes and Enhancements). People change, architects and designers moved out.

A new set of folks who may not have any historical knowledge of the project are maintaining it now. They start compromising on the architecture, design principles, etc. to provide quick fixes and add enhancements.

This trend I'm seeing in many projects I've worked.

How to maintain the "conceptual integrity" of the system while doing maintenance?

Maintaining conceptual integrity is difficult. It's an issue that needs to be addressed constantly during architecture, design, and construction, and it only gets worse when a project changes hands.

One thing that can help is for people from the original development team to be involved in the maintenance. Someone who already has an idea of the project's conceptual framework will be better able to keep to that framework than someone who is learning it from scratch.

Other than that, though, this comes down to the gigantic topic of best practices. Almost all "good" programming practices are aimed at ease of maintenance. Good design and construction practices lead to projects that are more easily grasped by later developers. Steve McConnell talks about managing complexity as the central imperative of software work. If the complexity is managed well up front, it will be easier for those who come later to keep the conceptual integrity of the project intact.

At the other end, good maintenance practices involve working against entropy. Keep the system under test. Don't decrease cohesion or increase coupling for the sake of a quick fix or a new feature. In fact, aim to make the project more coherent with each change that is made.

If the system was designed with extensibility in mind, then it shouldn't be difficult for maintenance programmers to keep conceptual integrity intact while doing their jobs. And even if it wasn't, it should still be possible for them to improve the project during maintenance rather than bring it down further.

If maintenance developers simply hack things together and do things the "easy" way, it will always degrade the conceptual integrity and increase the complexity of a project. Developers have to be aware of that, and they have to consciously choose the practices that will best allow them to avoid it.

The main idea is that maintenance should be a process of constantly improving a project, not constantly degrading it. An excellent book that deals with this topic is Michael Feathers' Working Effectively with Legacy Code. You might want to check it out.

Will it be easy for a C++ developer to read Refactoring: Improving the Design of Existing Code?

Is there any other book that I should read about refactoring? Feel free to add any articles on refactoring.

Read a book called Refactoring by Martin Fowler.

I'm the author of Refactoring to Patterns. I have recently completed work on a multimedia album about Refactoring (in C++, Java, and soon, C#).

You can look at samples of this album here:

In addition, if you want to get good at recognizing what kind of code needs refactoring, you can consider studying my album on Code Smells as well. See

If you work with legacy code then it may be worth getting Working Effectively with Legacy Code by Michael Feathers.

Refactoring to Patterns by Joshua Kerievsky

Language used in the book shouldn't matter, the concept is what is important. This book is a practical approach to refactoring.

Here is my problem:

I have an n-tier application for which I have to write unit tests. The unit tests are for the business layer.

I have a method to test called Insert(), which uses two protected methods from inheritance and directly calls a method from the data access layer.

So I have made a mock object for the DAL. But here is the point: in a (edit:) protected method from inheritance, it will use another object from the DAL! It seems it is not possible to mock this one!

Here is the method for test code:

public int Insert(MYOBJECT aMyObject)
    {
            //first inherited method use the FIRSTDALOBJECT so the mock object --> No problem
            aMyObject.SomeField= FirstInherited();

            //Second inherited method (see after) --> my problem
            aMyObject.SomeOtherField = SecondInherited();

            // Direct access to DALMethod, use FIRSTDALOBJECT so the mock -->No Problem
            return this.FIRSTDALOBJECT.Insert(aMyObject);             
     }

Here is the SecondInherited method:

 protected string SecondInherited ()
    { 
        // Here is my problem, the mock here seems not be possible for seconddalobject                                          
        return (new SECONDDALOBJECT()).Stuff();
    }

And here is the unit test method code :

    [TestMethod()]
    public void InsertTest()
    {
        BLLCLASS_Accessor target = new BLLCLASS_Accessor();
        MYOBJECT aMyObject = new MYOBJECT { SomeField = null, SomeOtherField = 1 };
        int expected = 1;
        int actual;

        //mock
        var Mock = new Mock<DAL.INTERFACES.IFIRSTDALOBJECT>();
        //Rec for calls
        List<SOMECLASS> retour = new List<SOMECLASS>();
        retour.Add(new SOMECLASS());

        //Here is the second call (last from method to test)
        Mock
            .Setup(p => p.Insert(aMyObject))
            .Returns(1);

        // Here is the first call (from the FirstInherited())
        Mock
            .Setup(p => p.GetLast())
            .Returns(50);
        // Replace the real by the mock
        target.demande_provider = Mock.Object;

        actual = target.Insert(aMyObject);
        Assert.AreEqual(/*Some assertion stuff*/);
    }

Thank you for reading all the question :-) Hope it is clear enough.

Your text seems to say that SecondInherited is private, while in the code example it is protected. Anyway, if it is not protected, I would suggest changing its access qualifier as the first step.

You can create a subclass solely for testing purposes, and override SecondInherited there to avoid creating SECONDDALOBJECT and just return some value suitable for your tests.
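
For example, something along these lines (assuming the business class can be subclassed and SecondInherited is made protected virtual; BllClass stands in for your real class name):

// Lives only in the test project.
public class TestableBllClass : BllClass
{
    protected override string SecondInherited()
    {
        // No SECONDDALOBJECT, no data access -- just a canned value for the test.
        return "value suitable for this test";
    }
}

Your test then exercises Insert() on a TestableBllClass instance, mocking FIRSTDALOBJECT exactly as you already do, while the SECONDDALOBJECT dependency never comes into play.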

This way you can write your first unit test(s) with minimal changes to the class being tested, thus minimizing the chances of breaking something. Once you have the unit tests in place, these allow you to do more refactoring safely, eventually achieving a better (more testable / mockable) design, such as using Dependency Injection, or a Factory. (I would probably prefer an Abstract Factory over Factory Method here, as the latter would practically force you to keep subclassing the tested class).

The fundamental, highly recommended book for learning this technique (and many more) is Working Effectively With Legacy Code.

hope you can help!

I've started a project in MVC 3 and set the business domain model in another assembly, the interfaces define the contract between this assembly and all projects that will use it. I'm using Ninject to inject the dependencies into the project. I've hit a brick wall at the moment with a specific LINQ query.

    public IEnumerable<IMyInterface> DoMyQuery()
    {
        using (ISession session = _sessionFactory.OpenSession()) {
            var query = (
                from c in session.Query<IMyInterface>()
                where something == anotherthing
                group c by new { c.TheGrouper } into grp
                select new IMyInterface() {
                    Property = grp.Key
                }
            );

            return query.ToList();
        }
    }

Now obviously I can't instantiate an interface, but this is my problem! The only way around it is to instantiate the concrete class, but that breaks my rule of being loosely coupled. Has anyone else run into this before?

I suppose my question is, how do I use "select new Object" in a LINQ query by using the interface and NOT the concrete class?

Note: just for the record, even if I do use my concrete class to just get it to work, I get an NHibernate error of "Could not resolve property: Key of: "... but that's another issue.

Any help appreciated!!

Just using interfaces and a DI container does not mean that you are writing loosely coupled code. Interfaces should be used at application Seams, not for Entities:

A seam is a place where you can alter behaviour in your program without editing in that place

From Mark Needham:

... we want to alter the way that code works in a specific context but we don’t want to change it in that place since it needs to remain the way it is when used in other contexts.

Entities (domain objects) are the core of your app. When you change them, you change them in place. Building a seam around your data access code, however, is a very good idea. It is implemented using the Repository pattern. Linq, ICriteria and HQL are just implementation details that are hidden from consumers behind a domain-driven repository interface. Once you expose one of these data access technologies, your project will be coupled to them and will be harder to test. Please take a look at these two articles and these answers.
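
To make the repository idea concrete, here is a rough sketch (entity, DTO and repository names are invented; the session usage mirrors your snippet). Note that the query projects into a plain result class rather than an interface, which also sidesteps the "cannot instantiate an interface" problem:

// Consumers (controllers, services, tests) depend on this interface only.
public interface IWidgetReportRepository
{
    IEnumerable<WidgetSummary> GetSummariesGroupedByCategory();
}

// Plain DTO used for the projection -- a concrete class, so it can be newed up in the query.
public class WidgetSummary
{
    public string Category { get; set; }
}

// NHibernate-specific implementation; the Linq/HQL details stay hidden in here.
public class NHibernateWidgetReportRepository : IWidgetReportRepository
{
    private readonly ISessionFactory _sessionFactory;

    public NHibernateWidgetReportRepository(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    public IEnumerable<WidgetSummary> GetSummariesGroupedByCategory()
    {
        using (var session = _sessionFactory.OpenSession())
        {
            return (from w in session.Query<Widget>()   // query the concrete mapped entity
                    group w by w.Category into grp
                    select new WidgetSummary { Category = grp.Key })
                   .ToList();
        }
    }
}

The MVC project is registered against IWidgetReportRepository in Ninject, and unit tests can supply an in-memory fake of the same interface without touching NHibernate at all.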

I am about to break a monolithic WPF/C# application into manageable modules. Please let me know what points I need to remember before breaking up the software, and any tools that would be handy, etc.

Thanks in advance.

Regards,

JOhn.

Your question is very broad, as there are many techniques (e.g. heavily unit testing your code) that are of help. It is probably best to read a book on that topic. I can highly recommend Michael C. Feathers'

Working Effectively with Legacy Code


Although this book is mostly Java-centric the described techniques are generally applicable. It will definitely change the way you write and think about code and will be of help when working with existing applications.

Feathers' book is also one of the books that are most recommended in this SO post.

I am a TDD noob and I don't know how to solve the following problem. I have a pretty large class which generates a text file in a specific format, for import into an external system. I am going to refactor this class and I want to write unit tests before I do.

What should these tests look like? The main goal is to not break the structure of the file. But does that mean I should compare the contents of the file before and after?

If you haven't yet, pick up a copy of Michael Feathers' book "Working Effectively with Legacy Code". It's all about how to add tests to existing code, which is exactly what you're looking for.

But until you finish reading the book, I'd suggest starting with a regression test: create the class, have it write the file to disk, and then compare that file to a "known good" file that you've stashed in your source repository somewhere. If they don't match, fail the test.

Then start looking at the interesting decisions that your class makes. See how you can get them under test. Maybe you extract some complicated if-conditions into public functions that return bool, and you write a battery of tests to prove that, given the right inputs, that function returns the right value. Maybe generation of a particular string has some interesting logic; start testing it.

Along the way, you may find objects that want to get out. For example, you may find that the code (or the tests!) would be simpler if there was a separate class that generates a single line of output. Go with it. You've got your regression test to catch you if you screw anything up.

Work relentlessly to remove dependencies (but make sure you've got a higher-level test, like a regression test, to catch you if you make mistakes). If your class creates its own FileStream and writes to the filesystem, change it to take a TextWriter in its constructor instead, so you can write tests that pass in a StringWriter and never touch the file system. Once that's done, you can get rid of the old test that writes a file to disk (but only if you didn't break it while trying to write the new test!) If your class needs a database connection, refactor until you can write a test that passes in fake data. Etc.
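
As a small sketch of that TextWriter seam (class and method names are made up for illustration):

public class ExportFileGenerator
{
    private readonly TextWriter _writer;

    // The writer is injected, so tests can pass a StringWriter instead of a file.
    public ExportFileGenerator(TextWriter writer)
    {
        _writer = writer;
    }

    public void WriteHeader(string batchId)
    {
        _writer.WriteLine("HDR;" + batchId);
    }
}

[TestMethod]
public void WriteHeader_WritesExpectedRecord()
{
    var output = new StringWriter();
    var generator = new ExportFileGenerator(output);

    generator.WriteHeader("B001");

    Assert.AreEqual("HDR;B001", output.ToString().TrimEnd());
}

In production the same class is constructed with a StreamWriter pointed at the real export file, so the file format logic is exercised by fast tests while the file system stays out of the picture.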

I am developing a Genetic Algorithm framework and initially decided on the following IIndividual definition:

public interface IIndividual : ICloneable
{
    int NumberOfGenes { get; }
    double Fitness { get; }
    IGene GetGeneAt(int index);
    void AddGene(IGene gene);
    void SetGeneAt(int index, IGene gene);
    void Mutate();
    IIndividual CrossOverWith(IIndividual individual);
    void CalculateFitness();
    string ToString();
}

It looked alright, but as soon as I developed other classes that used IIndividual, I came to the conclusion that making Unit-Tests for those classes would be kind of painful. To understand why, I'll show you the dependency graph of IIndividual:

(dependency graph image)

So, when using IIndividual, I end up also having to create/manage instances of IGene and IInterval. I can easily solve the issue by redefining my IIndividual interface to the following:

public interface IIndividual : ICloneable
{
    int NumberOfGenes { get; }
    void AddGene(double value, double minValue, double maxValue);
    void SetGeneAt(int index, double value);
    double GetGeneMinimumValue(int index);
    double GetGeneMaximumValue(int index);
    double Fitness { get; }
    double GetGeneAt(int index);
    void Mutate();
    IIndividual CrossOverWith(IIndividual individual);
    void CalculateFitness();
    string ToString();
}

with the following dependency graph:

(dependency graph image)

This will be pretty easy to test, at the expense of some performance degradation (I'm not that worried about that at the moment) and of making IIndividual heavier (more methods). There is also a big problem for the clients of IIndividual: if they want to add a Gene, they'll have to add all the little parameters of Gene "manually" in AddGene(value, minimumValue, maximumValue), instead of AddGene(gene)!

My question is:

What design do you prefer and why? Also, what are the criteria for knowing where to stop?

I could do just the same thing I did to IIndividual to IGene, so anyone that uses IGene doesn't have to know about Interval. I have a class Population that will serve as a collection of IIndividuals. What stops me from doing to Population what I did to IIndividual? There must be some kind of boundary, some kind of criterion for understanding in which cases it is best to just let it be (have some dependencies) and in which cases it is best to hide them (like in the second IIndividual implementation).

Also, when implementing a framework that's supposed to be used by other people, I feel like the second design is less pretty (and is maybe harder for others to understand).

Thanks!

(I apologize for not directly answering your question, but I can't help but point you this way...) I can't suggest the book Working Effectively with Legacy Code (author Michael Feathers) highly enough. It's an outstanding treatment of the challenges of getting code under (unit, functional) test.

I am porting a legacy C++ system from VC6 to VC9.

The application (<APP A>) statically links to an internal application <APP B> (developed in house, but by a separate team). A local copy of header files from <APP B> is included in the CPP files and compiled in <APP A>.

Currently we are not planning to migrate <APP B> to VC9. Both <APP A> and <APP B> will use separate CRTs, but no conflict is expected.

The issue we are facing is that the include files from the local copy are not getting compiled with VC9.

fatal error C1083: Cannot open include file: 'iostream.h': No such file or directory

Possible solution: if I make the changes in the local copy of <APP A> and compile with VC9, I am not sure whether it could cause problems at runtime.

Is there any other way in which I can ask VC9 to compile the <APP A> files with <iostream.h> instead of <iostream> ?

There is a great book by Michael Feathers http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052, about this type of project.

The short answer is if you have tests, make the required changes and refactorings and rerun your tests. For your example, I would use a preprocessor directive to pick the right include based on compiler version, and then fix any broken tests.

Without tests you're in a bit more trouble: you'll either have to write them, or pray you don't break anything.

I am trying to write unit tests for a huge project where testability was never thought of when coding. I've started mocking objects and writing tests, but I realize I have to refactor a lot of our code in order to be able to mock it.

This is one of the methods I want to create a test for:

public List<DctmViewDefinition> GetDctmViewDefinitions()
{
    List<DctmViewDefinition> dctmViewDefinitions = new List<DctmViewDefinition>();
    DataPackage dataPackage = MyDfsUtil.GetObjectsWithContent();
    foreach (DataObject dataObject in dataPackage.DataObjects)
    {
        DctmViewDefinition view = GetDctmViewDefinitionFromXmlFile(dataObject);
        dctmViewDefinitions.Add(view);
    }
    return dctmViewDefinitions;
}

The MyDfsUtil-class handles webservice-calls and I want to mock it. MyDfsUtil is divided into 14 partial classes each consisting of 300-500 lines of code. So there is a lot of code!

This is an extract of the class to give you the idea:

public partial class MyDfsUtil
{
    public string Locale { get; set; }
    public string DfsServiceUrl { get; set; }
    public string UserName { get; set; }

    public DataPackage GetObjectsWithContent()
    {
        //Some code here
    }

}

I am using Moq and therefore I can't mock this class directly (as far as I know). I have to either create an interface, an abstract class, or make the methods virtual. So, what I have been trying to find out is: what is the best approach in order to be able to mock MyDfsUtil?

First, I was thinking of creating an interface, but what about the variables (Locale, UserName, etc.) used all over the code?

Secondly, I tried to create an abstract base class MyDfsUtilBase with all the variables, and made the methods in the base class throw NotImplementedException, like this:

public abstract class MyDfsUtilBase
{
    public string Locale { get; set; }
    public string DfsServiceUrl { get; set; }
    public string UserName { get; set; }

    public void GetObjectsWithContent()
    {
        throw new NotImplementedException();
    }
}

Then Resharper tells me to add the 'new' keyword to my GetObjectsWithContent() implementation in the MyDfsUtil class. Or I can declare my methods in the base class as virtual and then use the 'override' keyword on the implementation. But if I have to declare my methods virtual anyway, I can just do that in MyDfsUtil, and then I do not need to create an abstract base class at all. I have been reading about virtual methods, and it seems like people don't agree on whether to use them or not. Using virtual methods in MyDfsUtil would make my refactoring assignment easier and would let me mock them. Is there any best practice for cases like mine?

I'm trying to do this the best, simplest way. I have no experience unit-testing or mocking and I really want to do this without introducing too much complexity.

I was exactly where you are three or more years ago.

My advice to you would be to leave MyDfsUtil alone, don't touch it.
(I assume it's a static class with static methods?)

Instead create an interface and matching class (say ISaneMyDfsUtil & SaneMyDfsUtil)

Starting off with the one method you give as an example, GetDctmViewDefinitions: add to the new class and interface the MyDfsUtil method that it uses, GetObjectsWithContent. This "new" method on the new class simply delegates directly to the existing - and untestable - MyDfsUtil class. You inject an instance of this class into the class under test.
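
A rough sketch of that wrapper (the interface name comes from this answer; the method shape is taken from your GetDctmViewDefinitions example):

public interface ISaneMyDfsUtil
{
    DataPackage GetObjectsWithContent();
}

public class SaneMyDfsUtil : ISaneMyDfsUtil
{
    private readonly MyDfsUtil _inner;

    public SaneMyDfsUtil(MyDfsUtil inner)
    {
        _inner = inner;
    }

    // Pure delegation -- no logic of its own, so there is nothing new to test here.
    // (If the existing call is actually static, delegate to the static call instead.)
    public DataPackage GetObjectsWithContent()
    {
        return _inner.GetObjectsWithContent();
    }
}

The class containing GetDctmViewDefinitions then takes an ISaneMyDfsUtil in its constructor, and Moq can fake that interface directly: new Mock<ISaneMyDfsUtil>() with a Setup on GetObjectsWithContent() returning a prepared DataPackage.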

There are multiple reasons to do it this way.

Making MyDfsUtil mockable probably isn't ideal.

  1. The class is probably used at various levels of code throughout the project. Testing a single method will soon require you to mock - in detail - several of its methods.
  2. The class is way too big and needs to be re-factored into different classes with single responsibilities. You can do that by rolling different interfaces and classes that sit over MyDfsUtil. In time - when you have time - functionality can come out of MyDfsUtil and into the new classes where it actually belongs.
  3. The methods in MyDfsUtil probably return too much for your use cases. E.g. say the method you're testing needs a list of customer ids from MyDfsUtil. You call MyDfsUtil.QueryCustomers(myOrderId), which returns a list of customers. Your code does stuff and only ever uses the Id properties of the customers. When mocking that call you have to create customer objects, set the ids, and pass back the list of customers. In the SaneMyDfsUtil you can have a QueryCustomerIds method that only returns the customer ids. It makes the code under test more explicit, and makes mocking for the tests simpler.

I had some legacy software here that used a static Dal object with hundreds (if not thousands) of methods. I wrote some code that automatically generated Sane_Object classes and interfaces for it. As efforts to introduce seams for testing go, it wasn't awful, but I learned in time that it was far from ideal, and following the pattern I've laid out here would have saved time and effort and would have helped me push unit testing to the team in an easier manner.

I could now answer my own question and say, no it's not a good idea.

A final word: read The Art of Unit Testing before you do too much else (honestly, buy it and read it from cover to cover). Then keep Working Effectively with Legacy Code on your desk; dip in and out of it and keep it as a reference for when things get tough.

Any questions just shout

I am working on a DNA/protein alignment project, "readseq". Its "flybase" package contains Java code with a "CharToByteConverter" class which does not compile and gives a "type deprecated" message. (The readseq source can be found at http://iubio.bio.indiana.edu/soft/molbio/readseq/java/.) I need to add some more functionality to this application, and I don't know how to fix this in order to proceed towards my goal. I am kind of a newbie in Java. Please help if possible. Readseq, with its GUI, is easily available on the net. It just converts an array of given characters to bytes. Here is some info about it: (docjar.com/docs/api/sun/io/CharToByteConverter.html). I don't know what to do about it being deprecated. It is an abstract class used as follows:

protected byte[] getBytes(CharToByteConverter ctb) {
        ctb.reset();
        int estLength = ctb.getMaxBytesPerChar() * count;
        byte[] result = new byte[estLength];
        int length;

        try {
            length = ctb.convert(value, offset, offset + count,
                     result, 0, estLength);
            length += ctb.flush(result, ctb.nextByteIndex(), estLength);
        } catch (CharConversionException e) {
            length = ctb.nextByteIndex();
        }

        if (length < estLength) {
            // A short format was used:  Trim the byte array.
            byte[] trimResult = new byte[length];
            System.arraycopy(result, 0, trimResult, 0, length);
            return trimResult;
        }
        else {
            return result;
        }
}

This is a perfect case for Adapt Parameter, from Michael Feathers' book Working Effectively With Legacy Code.

Shameless self-plug: Here's a short prezi I did on it. It has a step-by-step breakdown of what you need to do.

Essentially, you're going to have to modify the code you have and apply the Adapter Pattern to the parameter. You'll want to define your own interface (let's call it ByteSource), make getBytes() take your interface instead (getBytes(ByteSource ctb)), then make an Adapter that internally holds a CharToByteConverter for testing. To fix the broken library, you should make one that uses java.nio.charset (e.g. a CharsetEncoder) instead.

If I use a mockito mock of an object being injected into the SUT as an argument, what happens if during refactoring the code is re-organized to call another non-mocked method of that same mock? My tests would fail and I'd have to go back and change my tests and set them up for this new call (the opposite of what I'd want to be doing when refactoring code)

If this is a common occurrence during refactoring, how can using mocks be of any use except for when mocking external, resource-intensive entities (network, db, etc.)?

I'm using mocks to mock out objects that would take hours to set up given my team seems to love monstrously deep aggregate objects.

Thanks!

I would suggest that you only mock the bare minimum of what's needed. Mockito has many facilities for doing this (spies, ability to return specific data/mocks when a method is invoked a certain way, etc.) but it ultimately comes down to having testable "seams" in your code. If you haven't already, I would recommend reading Michael Feather's book Working Effectively with Legacy Code for many suggestions on how to do this.

I have a class ClassToTest which has a dependency on ClassToMock.

public class ClassToMock {

   private static final String MEMBER_1 = FileReader.readMemeber1();

   protected void someMethod() {
       ...
   }

}

The unit test case for ClassToTest.


public class ClassToTestTest {
   private ClassToMock _mock;

   @Before
   public void setUp() throws Exception {
      _mock = mock(ClassToMock.class)
   }

}

When mock is called in the setUp() method, FileReader.readMemeber1(); is executed. Is there a way to avoid this? I think one way is to initialize the MEMBER_1 inside a method. Any other alternatives?

Thanks!

Your ClassToMock is tightly coupled with FileReader; that's why you are not able to test/mock it. Instead of using a tool to hack the byte code so you can mock it, I would suggest you do some simple refactorings to break the dependency.

Step 1. Encapsulate Global References

This technique is also introduced in Michael Feathers's wonderful book : Working Effectively with Legacy Code.

The title is pretty much self-explanatory. Instead of directly referencing a global variable, you encapsulate it inside a method.

In your case, ClassToMock can be refactored into this :

public class ClassToMock {
  private static final String MEMBER_1 = FileReader.readMemeber1();

  public String getMemberOne() {
    return MEMBER_1;      
  }
}

Then you can easily use Mockito to mock getMemberOne().

UPDATED: The old Step 1 cannot guarantee that Mockito mocks safely: if FileReader.readMemeber1() throws an exception, the test will fail miserably. So I suggest adding another step to work around it.

Step 1.5. Add a Setter and a Lazy Getter

Since the problem is that FileReader.readMemeber1() will be invoked as soon as ClassToMock is loaded, we have to delay it. So we make the getter call FileReader.readMemeber1() lazily, and add a setter.

public class ClassToMock {
  private static String MEMBER_1 = null;

  protected String getMemberOne() {
    if (MEMBER_1 == null) {
      MEMBER_1 = FileReader.readMemeber1();
    }
    return MEMBER_1;      
  }

  public void setMemberOne(String memberOne) {
    MEMBER_1 = memberOne;
  }
}

Now you should be able to make a fake ClassToMock even without Mockito. However, this should not be the final state of your code; once you have your test ready, you should continue to Step 2.

Step 2. Dependency Injection

Once you have your tests ready, you should refactor further. Instead of reading MEMBER_1 by itself, this class should receive MEMBER_1 from the outside world. You can use either a setter or a constructor to receive it. Below is the code that uses a setter.

public class ClassToMock {
  private String memberOne;
  public void setMemberOne(String memberOne) {
    this.memberOne = memberOne;
  }

  public String getMemberOne() {
    return memberOne;
  }
}

These two refactoring steps are really easy to do, and you can do them even without tests at hand. If the code is not that complex, you can just do Step 2. Then you can easily test ClassToTest.


UPDATE 12/8: answering the comment

See my other answer to this question.

UPDATE 12/8: answering the comment

Question: What if FileReader is something very basic, like logging, that needs to be there in every class? Would you suggest I follow the same approach there?

It depends.

There are some things you might want to think about before you do a massive refactoring like that.

  1. If I move FileReader outside, do I have a suitable class which can read from the file and provide the result to every single class that needs it?

  2. Besides making classes easier to test, do I gain any other benefit?

  3. Do I have time?

If any of the answers is "NO", then you had better not do it.

However, we can still break the dependency between all the classes and FileReader with minimal changes.

From your question and comment, I assume your system uses FileReader as a global reference for reading values from a properties file and providing them to the rest of the system.

This technique, again, is also introduced in Michael Feathers' wonderful book Working Effectively with Legacy Code.

Step 1. Delegate FileReader's static methods to an instance

Change

public class FileReader {
  public static String readMember1() {
    // code that reads the file.
  }
}

To

public class FileReader {
  private static FileReader singleton = new FileReader();

  // The static entry point keeps the name existing callers already use
  // and simply delegates to the instance method.
  public static String readMember1() {
    return singleton.getMemberOne();
  }

  public String getMemberOne() {
    // code that reads the file.
  }
}

By doing this, the static method in FileReader no longer knows how the value is actually read; it just delegates to the instance's getMemberOne().

Step 2. Extract Interface from FileReader

public interface AppProperties {
  String getMemberOne();
}

public class FileReader implements AppProperties {
  private static AppProperties singleton = new FileReader();

  public static String readMember1() {
    return singleton.getMemberOne();
  }

  @Override
  public String getMemberOne() {
    // code that reads the file.
  }
}

We extract the methods into AppProperties, and the static singleton inside FileReader is now typed as AppProperties.

Step 3. Static setter

public class FileReader implements AppProperties {
  private static AppProperties singleton = new FileReader();

  public static void setAppProperties(AppProperties prop) {
    singleton = prop;
  }

  ...
  ...
}

We have opened a seam in FileReader. By doing this, we can swap the underlying instance in FileReader and the rest of the system will never notice.

Step 4. Clean up

Now FileReader has two responsibilities: one is reading files and providing the result; the other is acting as a global reference for the system.

We can separate them and give each a proper name. Here is the result:

// This is the original FileReader,
// now an AppProperties implementation that reads properties from a file.
public class FileAppProperties implements AppProperties {
  // implementation.
}

// This is the class that provide static methods.
public class GlobalAppProperties {

  private static AppProperties singleton = new FileAppProperties();

  public static void setAppProperties(AppProperties prop) {
    singleton = prop;
  }

  public static String getMemberOne() {
    return singleton.getMemberOne();
  }
  ...
  ...
}

END.

After this refactoring, whenever you want to test, you can set a mock AppProperties on GlobalAppProperties.
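For instance, a minimal sketch of that swap in a test (assuming Mockito is available):

AppProperties mockProps = org.mockito.Mockito.mock(AppProperties.class);
org.mockito.Mockito.when(mockProps.getMemberOne()).thenReturn("canned-value");

// Every class that reads through GlobalAppProperties now sees "canned-value"
// instead of whatever the properties file contains.
GlobalAppProperties.setAppProperties(mockProps);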

I think this refactoring would be better if all you want to do is break the same global dependency in many classes.

I have a file - in a large legacy codebase - containing methods that access databases. No classes are used, just a header file with the method declarations, and the source file with the implementation.

I want to override these methods to eliminate the DB access during unit testing.

I thought of the following options:

  1. Make file into class and override methods.
    The main minus here is that it results in a lot of changes throughout the codebase.
    Not ideal, though it does improve the code...
  2. Wrap the whole source file with an #ifdef PRODUCTION_CODE and create a new source file containing the stub and wrap it with the opposite, i.e. make the whole thing compilation dependent. Problem here is that in a build system that performs regression tests I would have to compile twice, once in order to create the apps and do regression tests, and an additional time to create the unit test executables.

Any recommended ways of doing this?

You may also want to look at Michael Feathers' book Working Effectively with Legacy Code. Not only does he discuss exactly these types of problems, but the book includes numerous examples in C++ (in addition to Java, C, and C#). Feathers is also the original creator of CppUnit.

I have a large Shape class, instances of which can (should) be able to do lots of things. I have many "domain" shape classes which inherit from this class, but do not provide any different functionality other than drawing themselves.

I have tried subclassing the Shape class, but then all of the "domain" objects will still inherit this subclass.

How do I break up the class? (it is 300 text lines, C#)

A couple of ideas (more like heuristics):

1) Examine the fields of the class. If a group of fields is only used in a few methods, that might be a sign that that group of fields and the methods that use it might belong in another class.

2) Assuming a well-named class, compare the name of the class to what the class actually does. If you find methods that do things above and beyond what you'd expect from the class's name, that might be a sign that those methods belong in a different class. For example, if your class represents a Customer but also opens, closes, and writes to a log file, break out the log file code into a Logger class. See also: Single Responsibility Principle (PDF) for some interesting ideas.

3) If some of the methods primarily call methods on one other class, that could be a sign that those methods should be moved to the class they're frequently using (e.g. Feature Envy).

CAUTION: Like they say, breaking up is hard to do. If there is risk in breaking up the class, you may want to put some tests in place so that you know you're not breaking anything as you refactor. Consider reading "Working Effectively with Legacy Code" and the "Refactoring" book.

I have an ASP.NET web app which is growing.

It is done in traditional web forms. While I would like to convert it WCSF to improve testability, this is time-prohibitive. I have taken the action to only make method calls in ASPX code behind which call the appropriate methods in the classes in App_Code so the spaghetti code is gone, which is good.

What else could I do to improve testability without any fundamental rewrite?

Thanks

Is this a Web Site project? I find Web Application projects are more structured and easier to maintain; I'm not sure if they are more testable. They do use namespaces, whereas a Web Site project does not.

Have you considered using a UI pattern such as MVP? You might also get partial coverage by creating interfaces for your code-behinds and testing against the interface. Watch out for hidden side effects (e.g. changing the state of a dropdown within a method; it's hidden behavior).

A book I found helpful was 'Working Effectively with Legacy Code' by Michael Feathers.

I have been given the task of suggesting a design for a QA system of a large existing software project. I've been reading about Unit Testing and Test Driven Development these past few days, and the concepts they profess seem very well and good to me. They also seem like they are "industry standard good coding practices", which is what I'm going for.

Anyway, I've dipped my toe in the water with all of this stuff, but I don't quite have an idea of how to begin the design of these tests. My manager's main objective is to test that new development code produces the same result as existing production code. This level of granularity seems perfectly fine to me, but it seems like the articles I've been reading about Unit Testing and TDD all encourage a much finer level of granularity (Like multiple tests per method).

There are many developers working on this project at once, and implementing that high a level of unit testing granularity would be a nightmare. Also, the practical benefits of unit testing would diminish at that point.

So I guess what I'm wondering is how can I determine what tests need to be done. I think the input/output tests of each big component of the project is a good base, but I don't have the foggiest idea of how to develop more in-depth, but not too in-depth unit tests. If there are some general rules for this, I would be interested in hearing them.

Also, any general feedback about implementing a TDD philosophy in a huge existing software project would be appreciated too. Thanks!

Unit testing and TDD are both valuable practices, but they really need to be adopted by your developers. As you say, they are coding practices, not necessarily testing practices. TDD is a design process, and you cannot have tests drive the design of your code after it has already been written.

Unit tests should indeed have multiple tests per method. As many tests as are required to express the requirements of the feature they are testing.

If you must approach this as only a QA role then look into testing the behavior of your application using functional or integration tests and automating those. Unfortunately this won't be Behavior Driven Design since you'll be describing these behaviors much too late for them to drive the design of your application but many of the tools used in BDD may still be useful for you.

If it is an option I think you really want to get your developers involved in writing test first code. Let them take responsibility for producing and verifying that their code is functionally correct and allow your QA department to focus on verifying that the resulting functionality actually results in a good user experience. In addition writing tests requires writing testable code. Without tests it is far too easy to write code which is incredibly difficult to test in isolation.

Whatever you end up doing take a look at Feathers' "working with legacy code" for strategies for finding seams in your application where you can introduce tests and practices to make sure you don't make unexpected changes to the application's behavior. Unfortunately I think you're about to find just how much more expensive it is to test code after it is written rather than before or while it is developed.


Edit: That should be "working effectively with legacy code"

I have often seen tests where canned inputs are fed into a program, one checks the outputs generated against canned (expected) outputs usually via diff. If the diff is accepted, the code is deemed to pass the test.

Questions:

1) Is this an acceptable unit test?

2) Usually the unit test inputs are read in from the file system and are big xml files (maybe they represent a very large system). Are unit tests supposed to touch the file system? Or would a unit test create a small input on the fly and feed that to the code to be tested?

3) How can one refactor existing code to be unit testable?

Output differences

If your requirement is to produce output with certain degree of accuracy, then such tests are absolutely fine. It's you who makes the final decision - "Is this output good enough, or not?".

Talking to file system

You don't want your tests to talk to the file system in the sense of relying on files existing somewhere in order for your tests to work (for example, reading values from configuration files). It's a bit different with test input resources - you can usually embed them in your tests (or at least the test project), treat them as part of the codebase, and they are typically loaded before the test executes. For example, when testing rather large XMLs it's reasonable to have them stored as separate files, rather than as strings in code files (which sometimes can be done instead).

Point is - you want to keep your tests isolated and repeatable. If you can achieve that with a file being loaded at runtime, it's probably fine. However, it's still better to have such files as part of the codebase/resources than as system files lying somewhere.
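As a minimal illustration of the "embed it as a test resource" idea in Java (the resource name is hypothetical, assumed to be on the test classpath, and InputStream.readAllBytes requires Java 9+):

import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class LargeInputTest {

    @org.junit.Test
    public void parsesLargeInput() throws Exception {
        // large-input.xml ships with the test code, so the test does not
        // depend on a machine-specific file lying around somewhere.
        try (InputStream in = getClass().getResourceAsStream("/large-input.xml")) {
            String xml = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            // feed xml to the code under test and assert on the result
        }
    }
}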

Refactoring

This question is fairly broad, but to put you in the right direction - you want to introduce more solid design, decouple objects and separate responsibilities. Better design will make testing easier and, what's most important - possible. Like I said, it's broad and complex topic, with entire books dedicated to it.

I have had the 'luck' of developing and enhancing a legacy Python web application for almost 2 years. The major contribution I consider I have made is the introduction of unit tests, nosetests, pychecker and a CI server. Yes, that's right, there are still projects out there that have not a single unit test (to be fair, it has a few doctests, but they are broken).

Nonetheless, progress is slow, because literally the coverage is limited by how many unit tests you can afford to write.

From time to time embarrassing mistakes still occur, and it does not look good on management reports. (e.g. even pychecker cannot catch certain "missing attribute" situation, and the program just blows up in run time)

I just want to know if anyone has any suggestions about what additional things I can do to improve the QA. The application uses WebWare 0.8.1, but I have experimentally ported it to CherryPy, so I can potentially take advantage of WSGI to conduct integration tests.

Mixed language development and/or hiring an additional tester are also options I am thinking.

Nothing is too wild, as long as it works.

Feathers' great book is the first resource I always recommend to anybody in your situation (wish I had it in hand before I faced it my first four times or so!-) -- not Python specific but a lot of VERY useful general-purpose pieces of advice.

Another technique I've been happy with is fuzz testing -- low-effort, great returns in terms of catching sundry bugs and vulnerabilities; check it out!

Last but not least, if you do have the headcount & budget to hire one more engineer, please do, but make sure he or she is a "software engineer in testing", NOT a warm body banging at the keyboard or mouse for manual "testing" -- somebody who's rarin' to write and integrate all sorts of automated testing approaches as opposed to spending their days endlessly repeating (if they're lucky) the same manual testing sequences!!!

I'm not sure what you think mixed language dev't will buy you in terms of QA. WSGI OTOH will give you nice bottlenecks/hooks to exploit in your forthcoming integration-test infrastructure -- it's good for that (AND for sundry other things too;-).

I just finished working on a project for the last couple of months. It's online and ready to go. The client is now back with what is more or less a complete rewrite of most parts of the application. A new contract has been drafted and payment made for the additional work involved.

I'm wondering what would be the best way to start reworking this whole thing. What are the first few things you would do? How would you rework the design in a way that you stay confident that the stuff you're changing does not break other stuff?

In short, how would you tackle drastic application design changes efficiently (both DB and code)?

This is NOT a new thing in software; people have done this before and written a lot about it.

Try reading Working Effectively with Legacy Code.

The techniques explained there are invaluable for sustaining any kind of long-running IT project.

I might inherit a somewhat complex multithreaded application, which currently has several files with 2+k loc, lots of global variables accessed from everywhere and other practices that I would consider quite smelly.

Before I start adding new features with the current patterns, I'd like to try and see if I can make the basic architecture of the application better. Here's a short description :

  • App has in memory lists of data, listA, listB
  • App has local copy of the data (for offline functionality) dataFileA, dataFileB
  • App has threads tA1, tB1 which update dirty data from client to server
  • Threads tA2, tB2 update dirty data from server to client
  • Threads tA3, tB3 update dirty data from in memory lists to local files

I'm kind of at a loss as to what different patterns, strategies, programming practices, etc. I should look into in order to have the knowledge to make the best decisions on this.

Here's some goals I've invented for myself:

  1. Keep the app as stable as possible
  2. Make it easy for Generic Intern to add new features (big no-no to 50 lines of boilerplate code in each new EditRecordX.cs )
  3. Reduce complexity

Thanks for any keywords or other tips which could help me on this project.

For making these kind of changes in general you will want to look at Martin Fowler's Refactoring: Improving the Design of Existing Code (much of which is on the refactoring website) and also Refactoring to Patterns. You also might find Working Effectively with Legacy Code useful in supporting safe changes. None of this is as much help specifically with multithreading, except in that simpler code is easier to handle in a multithreading environment.

I'm preparing to create my first Unit Test, or at least that's how I was thinking of it. After reading up on unit testing this weekend I suspect I'm actually wanting to do Integration Testing. I have a black box component from a 3rd party vendor (e.g. a digital scale API) and I want to create tests to test its usage in my application. My goal is to determine if a newly released version of said component is working correctly when integrated into my application.

The use of this component is buried deep in my application's code and the methods that utilize it would be very difficult to unit test without extensive refactoring which I can't do at this time. I plan to, eventually. Considering this fact I was planning to write custom Unit Tests (i.e. not derived from one of my classes' methods or properties) to put this 3rd party component through the same operations that my application will require from it. I do suspect that I'm circumventing a significant benefit of Unit Testing by doing it this way, but as I said earlier I can't stop and refactor this particular part of my application at this time.

I'm left wondering if I can still write Unit Tests (using Visual Studio) to test this component or is that going against best practices? From my reading it seems that the Unit Testing tools in Visual Studio are very much designed to do just that - unit test methods and properties of a component.

I'm going in circles in my head, I can't determine if what I want is a Unit Test (of the 3rd party component) or an Integration Test? I'm drawn to Unit Tests because it's a managed system to execute tests, but I don't know if they are appropriate for what I'm trying to do.

  1. Your plan of putting tests around the 3rd party component, to prove that it does what you think it does (what the rest of your system needs it to do) is a good idea. This way when you upgrade the component you can tell quickly if it has changed in ways that mean your system will need to change. This would be an Integration Contract Test between that component and the rest of your system.

  2. Going forward it would behoove you to put that 3rd party component behind an interface upon which the other components of your system depend. Then those other parts can be tested in isolation from the 3rd party component.

I'd refer to Michael Feathers' Working Effectively with Legacy Code for information on ways to go about adding unit tests to code which is not factored well for unit tests.

I'm relatively new to the world of white-box testing and need help designing a test plan for one of the projects I'm currently working on. At the moment I'm just scouting around looking for testable pieces of code and then writing some unit tests for them. I somehow feel that is by far not the way it should be done. Could you please give me advice on how best to prepare myself for testing this project? Any tools or test plan templates that I could use? The language being used is C++, if it makes a difference.

Try "Working Effectively with Legacy Code": http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

It's relevant since by 'legacy' he means code that has no tests. It's also a rather good book.

Relevant tools are: http://code.google.com/p/googletest/ and http://code.google.com/p/gmock/ There may be other unit test and mock frameworks, but I have familiarity with these and I recommend them highly.

I'm still trying to add some unit tests for an ASP.NET site (not MVC). One of the methods I need to test makes use of the HttpRequest Request object, Request.Path to be exact. I'm trying to write the tests using Visual Studio 2008's built-in testing framework. Whenever the test executes the method in question, I receive a System.Web.HttpException: Request is not available in this context. I understand why it is not available (there is no running web server and no path supplied), but how can I proceed with testing the method?

Since everyone likes to see code, here's the code in question:

protected string PageName
{
    get 
    {
        return Path.GetFileName(Request.Path).Substring(0, Path.GetFileName(Request.Path).Length - 5); 
    }
}

protected Change SetupApproval(string changeDescription)
{
    Change change = Change.GetInstance();
    change.Description = changeDescription;
    change.DateOfChange = DateTime.Now;
    change.Page = PageName;
    return change;
}

Here's the test:

[TestMethod]
public void SetupApproval_SubmitChange_ValidateDescription()
{
    var page = new DerivedFromInternalAppsBasePage_ForTestingOnly();
    var messageToTest = "This is a test description";
    var change = page.SetupApproval(messageToTest);
    Assert.IsTrue(messageToTest == change.Description);
}

In addition, I've read through Microsoft's documentation here: http://msdn.microsoft.com/en-us/library/ms182526(v=vs.90).aspx and tried using the [HostType("ASP.NET")], [UrlToTest("http://localhost:port/pathToAspxPage.aspx")], and [AspNetDevelopmentServer("C:\PathToDllAssembly", "NotSureParameter")] attributes they suggest, but no luck. (As you can see, I'm not sure what I should be using for a few of the parameters.)

Lastly, I tried Phil Haack's TestWebServer http://haacked.com/archive/2006/12/12/Using_WebServer.WebDev_For_Unit_Tests.aspx and read through Scott Hanselman's post http://www.hanselman.com/blog/NUnitUnitTestingOfASPNETPagesBaseClassesControlsAndOtherWidgetryUsingCassiniASPNETWebMatrixVisualStudioWebDeveloper.aspx For Phil's server, I'm not sure what I would use for parameters in the ExtractResource method.

There is a very similar problem to the one you have encountered which is described in the book "Working Effectively with Legacy Code" by Michael Feathers. In particular, the refactoring is called "Adapt Parameter".

The "problem" with your code is that it is directly coupled to the HttpRequest, specifically Request.Path. So the overall approach is you want to decouple your code from HttpRequest.

Similarly to what is suggested above, here is another way to do the decoupling following the idea in Michael Feathers' book. I haven't tried compiling this, so please excuse any typos or syntax errors.

public interface ParameterSource
{
    string Path { get; }
}

public class FakeParameterSource : ParameterSource
{
   public string Value;
   public string Path { get { return Value; } }
}

public class RealParameterSource : ParameterSource
{
   private HttpRequest request;
   public RealParameterSource(HttpRequest aRequest)
   {
     request = aRequest;
   }
   public string Path { get { return request.Path; } }
}

Now here's the important part of what you need to change (and here's one way of doing it):

// small rename
protected string GetPageName(ParameterSource source) 
{ 
   return Path.GetFileName(source.Path).Substring(0, Path.GetFileName(source.Path).Length - 5);
}

Above the injection happens at the method level. You could do it via a constructor or a property too.

Your test code could now be something like:

protected Change SetupApproval(string changeDescription) 
{
    ParameterSource p = new FakeParameterSource { Value = "mypath" }; 
    Change change = Change.GetInstance(); 
    change.Description = changeDescription; 
    change.DateOfChange = DateTime.Now; 
    change.Page = GetPageName(p); //  now uses the value in the parameter source
    return change; 
} 

I hope you get the idea and find this useful

I'm trying to write a unit test to test a protected method in an abstract class. I've tried writing a test class that inherits from the abstract class, but when I instantiate the test class the base abstract class attempts to connect to an Oracle database and fails which doesn't allow me to test the protected method I'm interested in. The abstract class cannot be modified.

How can I directly unit test a protected method in this abstract class?

Here is snippet of what I tried with reflection.

Type type = typeof(AbstractClass);
BindingFlags eFlags = BindingFlags.Instance | BindingFlags.NonPublic;
MethodInfo myMethod = type.GetMethod("ProtectedMethod", eFlags);
object[] arguments = new object[] { _myDs };
myMethod.Invoke(type, arguments);
_myDs = (DataSet)arguments[0];

Compromise.

Say you have:

public abstract class CanNeverChange
{
    public CanNeverChange()
    {
        //ACK!  connect to a DB!!  Oh No!!
    }

    protected abstract void ThisVaries();

    //other stuff
}

public class WantToTestThis : CanNeverChange
{
    protected override void ThisVaries()
    {
         //do something you want to test
    }
}

Change it to this:

public class WantToTestThis : CanNeverChange
{
    protected override void ThisVaries()
    {
         new TestableClass().DoSomethingTestable();
    }
}

public class TestableClass
{
    public void DoSomethingTestable()
    {
        //do something you want to test here instead, where you can test it
    }
}

Now you can test the behavior you want to test in TestableClass. For the WantToTestThis class, compromise to the pressure of terrible legacy code, and don't test it. Plugging a testable thing in with a minimal amount of untested code is a time-honored strategy; I first heard of it from Michael Feathers' book Working Effectively with Legacy Code.

Is it a good practice to introduce a TestSettings class in order to provide flexible testing possibilities of a method that has many processes inside?

Maybe not a good example but can be simple: Suppose that I have this method and I want to test its sub-processes:

public void TheBigMethod(myMethodParameters parameter)
{

  if(parameter.Condition1)
   {
     MethodForCondition1("BigMac"); 
   }

  if(parameter.Condition2)
   {
     MethodForCondition2("MilkShake"); 
   }

  if(parameter.Condition3)
   {
     MethodForCondition3("Coke"); 
   }

  SomeCommonMethod1('A');
  SomeCommonMethod2('B');
  SomeCommonMethod3('C');
}

And imagine that I have unit tests for all

  • void MethodForCondition1 (string s)
  • void MethodForCondition2 (string s)
  • void MethodForCondition3 (string s)
  • void SomeCommonMethod1 (char c)
  • void SomeCommonMethod2 (char c)
  • void SomeCommonMethod3 (char c)

And now I want to test TheBigMethod itself by introducing such test methods with the required Asserts in them:

  • TheBigMethod_MethodForCondition1_TestCaseX_DoesGood
  • TheBigMethod_MethodForCondition2_TestCaseY_DoesGood
  • TheBigMethod_MethodForCondition3_TestCaseZ_DoesGood
  • TheBigMethod_SomeCommonMethod1_TestCaseU_DoesGood
  • TheBigMethod_SomeCommonMethod2_TestCaseP_DoesGood
  • TheBigMethod_SomeCommonMethod3_TestCaseQ_DoesGood

So, I want TheBigMethod to be exit-able at some points if it is called by one of my integration tests above.

public void TheBigMethod(myMethodParameters parameter, TestSettings setting)
{

  if(parameter.Condition1)
   {
     MethodForCondition1("BigMac"); 

     if(setting.ExitAfter_MethodForCondition1)
        return;

   }

  if(parameter.Condition2)
   {
     MethodForCondition2("MilkShake"); 

     if(setting.ExitAfter_MethodForCondition2)
        return;

   }

  if(parameter.Condition3)
   {
     MethodForCondition3("Coke"); 

     if(setting.ExitAfter_MethodForCondition3)
        return;

   }

  SomeCommonMethod1('A');
  if(setting.ExitAfter_SomeCommonMethod1)
       return;

  SomeCommonMethod2('B');
  if(setting.ExitAfter_SomeCommonMethod2)
       return;

  SomeCommonMethod3('C');
  if(setting.ExitAfter_SomeCommonMethod3)
       return;
}

Even though it looks like it does what I need, introducing a TestSettings parameter makes the code less readable, and it does not seem nice to me to have testing logic and the main functionality combined.

Can you advise a better design for such cases, so that it can replace the TestSettings parameter idea?

thanks

A bit late to the game here, but I would concur that mixing test and production code is a big code smell to be avoided. Big methods in legacy code provide all sorts of issues. I would highly recommend reading Michael Feather's Working Effectively with Legacy Code. It's all about dealing with the myriad of problems encountered in legacy code and how to deal with them.

It seems to me that some code is easier to unit test than others. I love writing unit tests for highly functional code (by this, I'm referring to functions that primarily operate on their arguments and return computed results).

But, when the code is much more about its side effects, testing it becomes much harder. For example, a socket class I use at work has a method declared like this:

void Socket::Create( void );

It takes no arguments, and returns no results. On error it throws, but the direct result of the underlying call (socket()) is hidden by the class itself.

Can anyone recommend techniques or perhaps a book, or a website to learn more advanced techniques for unit testing code that is mostly about its side effects?

I am not sure if you want to test the method Socket::Create( void ) or code that calls this method. In the first case you want to write tests for existing functionality. So you might want to read...

http://www.objectmentor.com/resources/articles/WorkingEffectivelyWithLegacyCode.pdf

or better the book...

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

In the second case you need a Test Double (we were once allowed to call them mocks...) for the underlying call. See...

http://xunitpatterns.com/

both books deal exactly with the kind of problem you mention. In short, the solution is to divide the problem in such a way that most functionality is easily testable and only as little as possible is not.

Our project contains 2600 class files and we have decided to start using automated tests.

We know we should have started this 2599 class files ago, but how and where should large projects start to write tests?

Pick a random class and just go?

What's important to know? Are there any good tools to use?

You may find Michael Feathers' book Working Effectively with Legacy Code useful.

Recently I inherited a major piece of software. The original model of the software was lost over generations of programmers working on it (though even the "original model" looks quite broken). There are no unit tests in the code. Fortunately, I do not have a requirement to be backward-compatible (oh, that would be worse than hell!). I am taking a selectively forceful, selectively restricted approach to refactoring the code. In the past, I've often burnt my hands trying to change too much at the same time, and I still consider myself prone to doing that.

At one occasion, I also solved a bug after a long time of remaining stuck. From that experience I understood that it is important to spend time to find the missing "right hypothesis", rather than keeping trying new things.

I know my next goal should be to create a test suite ASAP, but I am currently unsure how to do that. I'd love to use the principles of refactoring dealt with in the book "Working Effectively with Legacy Code", but I lack the patience and time (or discipline?) to follow the book.

If you would like to provide some debugging tips based on your experiences, please do.

In my experience, getting your end users' support is fundamental.

I had a similar experience several years ago, when I received a Clipper Summer '87 application to maintain and to make Y2K compatible. It started as a nightmare: I would fix some bugs in one place and another piece would stop working. After spending some time on this, I scheduled a meeting with some key users and proposed to rewrite the application.

I rewrote the most important features first and talked very closely with those users every day; they frequently pointed out important missing spots. Every week I'd migrate some sample data, so they could get a feeling for how the application was running. I got the first version running in about three weeks, with just a small feature set migrated.

There was a specific report which took 45 minutes to process, and it ended up taking just 5 seconds in the newer version. So those users saw it was a good choice to spend more time and money throwing the old code away and starting a fresh version.

I'm aware you can't always rewrite an application. But understanding what's important to your customers, and making them understand that some major changes are in place, was decisive to that project's success.

Updated version of question

Hi.

My company has a few legacy code bases which I hope to get under test as soon as they migrate to .NET 3.5. I have selected Moq as my Mocking framework (I fell in love with the crisp syntax immediately).

One common scenario, which I expect to see a lot of in the future, is where I see an object which interacts with some other objects.

I know the works of Michael Feathers and I am getting good at identifying inflection points and isolating decent sized components. Extract and Override is king.

However, there is one feature which would make my life a whole lot easier.

Imagine Component1 interacting with Component2. Component2 is some weird serial-line interface to a fire central or some such, with a lot of byte inspection, casting and pointer manipulation. I do not wish to understand Component2, and its legacy interface consumed by Component1 carries a lot of baggage with it.

What I would like to do, is to extract the interface of Component2 consumed by Component1 and then do something like this:

 component1.FireCentral = new Mock<IComponent2> (component2);

I am creating a normal mock, but I am passing an instance of the real Component2 in as a constructor argument to the Mock object. It may seem like I'm making my test depend on Component2, but I am not planning on keeping this code. This is part of the "place object under test" ritual.

Now, I would fire up the real system (with a physical fire central connected) and then interact with my object.

What I then would wish for, is to inspect the mock to see a log of how component1 interacted with component2 (using the debugger to inspect some collection of strings on the mock). And, even better, the mock could provide a list of expectations (in C#) that would create this behavior in a mock that did not depend on Component2, which I would then use in my test code.

In short. Using the mocking framework to record the interaction so that I can play it back in my test code.

Old version of question

Hi.

Working with legacy code and with a lot of utility classes, I sometimes find myself wondering how a particular class is acted upon by its surroundings in a number of scenarios. One case that I was working on this morning involved subclassing a regular MemoryStream so that it would dump its contents to file when reaching a certain size.

// TODO: Remove
private class MyLimitedMemoryStream : MemoryStream
{
    public override void Write(byte[] buffer, int offset, int count)
    {
        if (GetBuffer().Length > 10000000)
        {
            System.IO.FileStream file = new FileStream("c:\\foobar.html",FileMode.Create);
            var internalbuffer = GetBuffer();
            file.Write(internalbuffer,0,internalbuffer.Length);
        }
        base.Write(buffer, offset, count);
    }        
}

(and I used a breakpoint in here to exit the program after the file was written). This worked, and I found which webform (web part -> web part -> web part) control rendered incorrectly. However, MemoryStream has a bunch of Write and WriteLine overloads.

Can I use a mocking framework to quickly get an overview on how a particular instance is acted upon? Any clever tricks there? We use Rhino Mocks.

I see this as a great asset in working with legacy code. Especially if recorded actions in a scenario can easily be set up as new expectations/acceptance criteria for that same scenario replicated in a unit test.

Every input is appreciated. Thank you for reading.

I don't think you can use a mocking framework to easily get an overview of how a particular instance is acted upon.

You can however use the mocking framework to define how it should be acted upon in order to verify that it is acted upon in this way. In legacy code this often requires making the code testable, e.g. introducing interfaces etc.

One technique that can be used with legacy code without needing too much restructuring of the code is using logging seams. You can read more about this in this InfoQ article: Using Logging Seams for Legacy Code Unit Testing.
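To give a rough feel for the idea, here is a minimal sketch in Java using java.util.logging (the class, logger name and message are all hypothetical): the only change to the legacy code is a log call at the interaction point, and the test records what was logged.

import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LoggingSeamSketch {
    private static final Logger LOG = Logger.getLogger("legacy.fireCentral");

    // Hypothetical legacy method; the log call is the "seam".
    void sendFrame() {
        LOG.info("frame sent to fire central");
    }

    public static void main(String[] args) {
        List<String> captured = new ArrayList<>();
        LOG.addHandler(new Handler() { // test-side recorder
            @Override public void publish(LogRecord r) { captured.add(r.getMessage()); }
            @Override public void flush() {}
            @Override public void close() {}
        });

        new LoggingSeamSketch().sendFrame();

        // A real test would assert on 'captured' rather than print it.
        System.out.println(captured.contains("frame sent to fire central"));
    }
}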

If you want more tips on how to test legacy code, I recommend the book Working Effectively with Legacy Code by Michael Feathers.

Hope this helps!

The answer seems obvious, but I'm asking because I have been doing it subconsciously under the cover of the "compile early, compile often" advice.

I guess my question really is how to best apply this advice, as I'm wondering if compiling early and often actually has taught me the bad habit of relying on the compiler for finding errors.

Note, I came about this thought while preparing for technical interview questions where one is asked to write algorithms on a white board or a piece of paper.

If you're working with a statically typed language, "leaning on the compiler" - as Michael Feathers calls it in Working Effectively with Legacy Code - works well as a first line of defence to quickly find and fix syntax errors. This does of course not prove that your code is correct, but it sure can boost your efficiency. Especially if you work with a powerful IDE like Eclipse or Visual Studio (with ReSharper), which often offer quick fixes for syntax errors.

So I don't consider it a bad habit at all.

We have a huge web application that has been written in .NET 1.1 and 2.0. Although it is currently running on .NET 3.5, the whole architecture is "old" and messy. We may find business logic anywhere from the Ajax and JavaScript all the way down to the data layer.

Now the managers have decided that they want to start unit testing the app.

Does anyone have an idea, or know where I can find material to get started, or even to decide whether it is worth starting?

This is a repeat answer from a similar question.

Typically, it is very difficult to retrofit an untested codebase to have unit tests. There will be a high degree of coupling and getting unit tests to run will be a bigger time sink than the returns you'll get. I recommend the following:

  • Get at least one copy of Working Effectively With Legacy Code by Michael Feathers and go through it together with people on the team. It deals with this exact issue.
  • Enforce a rigorous unit testing (preferably TDD) policy on all new code that gets written. This will ensure new code doesn't become legacy code and getting new code to be tested will drive refactoring of the old code for testability.
  • If you have the time (which you probably won't), write a few key focused integration tests over critical paths of your system. This is a good sanity check that the refactoring you're doing in step #2 isn't breaking core functionality.

We use an n-layer architecture in our application. Suppose we have a 3-layer application and use the MVC pattern for the presentation layer. Which layers should we test? How can I find testability points in my application?

  1. Presentation Layer?
  2. Business Layer
  3. Data Layer
  4. All of them?

Only test the layers you want to be sure work. All 3 of the above seem to be things it would be important to have working. You wouldn't want to remove any of them.

Trying to find testability points in existing software, where testability hasn't been designed in, can be a challenge. There's a good book, Working Effectively with Legacy Code, where legacy is defined as code without tests, that talks about this issue. Basically, if you don't design for testability, it can be hard to shoehorn it in; you'll probably need to refactor.

The trick is going to be adding test infrastructure to the code: mocks, stubs, and other test components that allow you to isolate just the bits under test. This is helpful when you test against a DB: you really don't want to run a real query, it'll just take too long, and you want the tests to be FAST. Dependency injection can be helpful for the more static languages, like C++/C and Java.
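For instance, a minimal constructor-injection sketch in Java (all names are hypothetical), where a test can substitute the real database access with an in-memory stand-in:

// Hypothetical abstraction over the data layer.
interface CustomerRepository {
    String findName(int customerId);
}

// The business-layer class receives the dependency through its constructor...
class CustomerService {
    private final CustomerRepository repository;

    CustomerService(CustomerRepository repository) {
        this.repository = repository;
    }

    String greeting(int customerId) {
        return "Hello, " + repository.findName(customerId);
    }
}

// ...so a test can pass a stub instead of a database-backed implementation.
public class CustomerServiceExample {
    public static void main(String[] args) {
        CustomerRepository stub = id -> "Alice"; // no database involved
        CustomerService service = new CustomerService(stub);
        System.out.println(service.greeting(42)); // prints "Hello, Alice"
    }
}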

We write a lot of integration tests that work quite well for us; we did attempt to introduce unit testing, but found it more difficult to introduce, as it took longer and the benefits were not that apparent.

We still have testers manually going through test scripts. We still have a long way to go to automate our testing process and we wanted to know how others approach it and what tools you use?

Thanks,

B

It's very difficult to retrofit unit tests into legacy code (that is, code without unit tests). There is a book about this: Working Effectively with Legacy Code. Unit testing should be applied when developing new code.

It's not easy to recommend tools without information on the type of the product that is being testing. You might have to add internal interfaces so that test tools can poke the product: to test a GUI-based application, you can add an undocumented option to supply a command file that will simulate user actions.

Look at the more tedious and error-prone tasks of the manual testing - the ones testers complain about the most.

Automate your tests step by step: for instance, test execution first, and results verification (PASS/FAIL criteria) in a second phase.

Keep hard tasks manual (e.g. installation, configuration) in the beginning.

In short, apply a low hanging fruit strategy.

Also strive to reproduce every problem fixed in the codebase with an automated test; this will help verify the fix and build up an automated regression suite.

Scripting (bash, Perl, Powershell, whatever) is certainly helpful when dealing with automation.

I have a solution containing 25 projects comprised of both C# and managed C++.

I need to test one of the C# projects, but this project is of type "Windows application" (not a DLL).

Even though it is a Windows application, my requirement is to call only a few internal functions (not related to the Windows Forms UI).

I need to create a separate C# test project to call this functionality. Is it possible to do it like this?

Can anyone suggest a way or examples? And one more thing: I cannot modify the existing code.

Is it possible to do it like this?

Yes. Referencing the project you wish to test in a test project is typically how you unit test your code.

Can anyone suggest a way or examples?

Create a unit test project, reference the project that contains the code you wish to test, write tests to test the code you wish to test. If you need to refactor the code to make it testable, do so, or see point below.

And one more thing, I don't have any freedom to modify the existing source code.

In this case, you are going to have to wrap the code in some cleaner interfaces to allow you to test the code.

The book, Working Effectively with Legacy Code by Michael Feathers has some excellent advice on how to get legacy code under test.

I am a newbie to the design phase and new to the existing working application.

Scenario: functionally working code in production, MVC, an in-house framework, in place for many years.

Requirement:

Functionally: the look and feel of a few pages has to be changed using a new rich JavaScript library. There are slight functionality changes here and there too, but the major change is the look and feel. For example, instead of one simple page, there are multiple collapsible, rearrangeable small sections on one page (thinking of using Tiles).

Design: it should be reusable code that can be extended for the next phase of changes - which will definitely mean more changes to other pages, and may or may not mean changes to the framework (possibly Struts, though not sure).

Questions:

Ques 1: Java classes (helper, POJO, retrieval class): is it better to modify the existing class or to have a new class extend the old one? My preference: better to have a new class - the existing class has large methods, and at some stage those methods can be rewritten, which makes debugging easier.

Ques 2: If the suggested way is new classes, should they reside in a different package hierarchy so they can easily be told apart if one has to move to a new framework?

Ques 3: Are there any recommended books/articles which might help me better understand design when working with a mix of new and old code?

Working Effectively with Legacy Code

However, this book is more about the problems you face when code is poorly structured; if the code you are working on is MVC-based and well designed, then the minimum for you is to understand MVC.

Your question is a little hard to understand, so try to rephrase it to be clearer.

I am handling a project which needs restructuring of a set of projects belonging to a website. It also has tightly coupled dependencies between the web application and the referenced projects. Kindly help me with some ideas and tools on how the calls could be refactored into more maintainable code.

The main aspect is that there are calls to apply promotions (the promotion class has more than 4 different methods available) consumed from various functions, which could not be streamlined easily.

Kindly help me here with best practices.

Sorry guys, I could not share much code due to restrictions, but I hope the below helps.

My project uses NHibernate for data access.

  • Project A: web project - aspx and ascx with code-behind
  • Project B: contains class definitions consumed by Project C (data operation classes)
  • Project C: business logic with save-to-database methods (customer, order, promotion etc.)

The problem is with Project C - I am not sure whether it does too many things and needs to be broken down. But there are already many other sub-projects.

Project C supports things like saving details to the DB based on parameters; some of its class methods call the promotion logic based on some condition. I would like to make things more robust - sample code below.

Project C, class OrderLogic:

public void UpdateOrderItem(....)
{
    ....
    ....
    ...
    Order order = itm.Order;
    promoOrderSyncher.AddOrderItemToRawOrderAndRunPromosOnOrder(itm, ref order);
    orderItemRepository.SaveOrUpdate(itm);
}

So, just like in the above class, the promotion logic is called from many other places; I would like to streamline these calls to the promotion class. I am looking for some concepts.

I strongly suggest not starting to restructure your application without a strong knowledge of SOLID principles and dependency injection. I made this mistake, and now I have an application full of service locator (anti)pattern implementations that are not making my life any simpler than before.

I suggest you read at least the following books before starting:

http://www.amazon.com/Agile-Principles-Patterns-Practices-C/dp/0131857258 (for SOLID principles)

http://www.manning.com/seemann/ (for .NET dependency injection)

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

A possible strategy is not to refactor just for the sake of it, but to consider refactoring only the parts that are touched more than others. If something is working and nobody is going to change it, there's no need to refactor it; it can be a waste of time.

Good luck!

I am using JUnit 4.12 with PowerMock 1.6 with Mockito. I have also used the PowerMockRule library as described here. I am trying to execute initialization code for all of my test cases exactly once, as described in this SO thread. It executes the initialization code exactly once; however, if I access ServiceInitializer.INSTANCE inside a test method it returns a new object. I am not able to understand this behavior. Does anyone have any idea why this is happening? If I execute my code without the PowerMockRule library and run my test with PowerMockRunner then it works fine, but in that case my ClassRule is not executed.

    public class ServiceInitializer extends ExternalResource {
      public static final TestRule INSTANCE = new ServiceInitializer();
      private final AtomicBoolean started = new AtomicBoolean();

      @Override protected void before() throws Throwable {
        if (!started.compareAndSet(false, true)) {
          return;
        }
        // Initialization code goes here
        System.out.println("ServiceInitializationHelper:"+this); //Print Address @3702c2f1
      }

      @Override protected void after() {
      }
    }




    class BaseTest{
            @Rule
            public PowerMockRule powerMockRule = new PowerMockRule();

              @ClassRule
              public static final TestRule serviceInitializer = ServiceInitializer.INSTANCE;

              @Before
              public final void preTest() {
                // some code
              }

              @After
              public final void postTest() {
                //some code
              }
    }


@PrepareForTest({MyClass.class})
public class MyTest extends BaseTest {
      @Test
      public void testMethodA_1(){
            System.out.println(ServiceInitializer.INSTANCE);//Print Address @54d41c2b
      }
}

Update

I printed the classloader for the classes, and it turns out that for the first print statement the classloader was sun.misc.Launcher$AppClassLoader and for the second print statement the classloader was org.powermock.core.classloader.MockClassLoader. How can I solve this?

Edwin is correct; this is an issue with PowerMock creating a new ClassLoader for every test. I strongly recommend refactoring your code so it can be tested without PowerMock, and switching to Mockito.

These books may be helpful

In the meantime, you can reference the serviceInitializer field inherited from your base class:

    public class ServiceInitializer extends ExternalResource {
      public static final ServiceInitializer INSTANCE = new ServiceInitializer();
      private final AtomicBoolean started = new AtomicBoolean();

      @Override protected void before() throws Throwable {
        if (!started.compareAndSet(false, true)) {
          return;
        }
        // Initialization code goes here
        System.out.println("ServiceInitializationHelper:"+this);
      }

      @Override protected void after() {
      }
    }




    class BaseTest{
            @Rule
            public PowerMockRule powerMockRule = new PowerMockRule();

              @ClassRule
              public static final ServiceInitializer serviceInitializer = ServiceInitializer.INSTANCE;

              @Before
              public final void preTest() {
                // some code
              }

              @After
              public final void postTest() {
                //some code
              }
    }


@PrepareForTest({MyClass.class})
public class MyTest extends BaseTest {
      @Test
      public void testMethodA_1(){
            System.out.println(serviceInitializer);
      }
}

I got a project from my company without documentation. Now I have to add some things to this project, like new features, etc. But I have no clue how this project is organized and where it starts... Does anyone know how to work with a project like this? Can I analyse it with a schema?

How can I find the start point of this application?

And is there a way to debug from A to Z?

Look in the manifest file for the main activity.

While others may chime in with Android-specific answers, you should think of handling this project as "legacy code" -- created by someone else, with poor or no documentation, and little idea (initially) of how things were implemented.

Check out Working Effectively with Legacy Code by Feathers if you'd like a decent book on your problem. Also relevant, but not Android-specific, is advice in the ebook "Rails Rescue" and the "Legacy Code" chapter in Rails Test Prescriptions. Though Rails-related, the advice the authors give is directly applicable to most projects.

In a nutshell:

  • get the project under version control if it isn't already (and if it is, you're in luck, as you can review the commit history to get a feel for things)
  • get the test suite running cleanly. No test suite? Get a simple test stub started
  • commit yourself to creating tests for every new feature you implement

Lastly, read the code. You're going to have to bite the bullet and familiarize yourself with the implementation -- there's no way around that.

I got a task related to an ANCIENT C++ project which has no documentation or comments at all, and all code/variables are written in a foreign language. Do I have a chance to analyze this code in one working day and produce a design/UML to create new features? I have been sitting here for 3 hours already and I feel so frustrated... Maybe somebody else has had the same problem? Any advice?

BR,

Short answer, no - you probably don't have a chance to understand the code in one day. Reading/maintaining code is one of the hardest things to do, especially when it's lacking documentation. The fact that the code is in a foreign language (!) makes it even harder.

Sounds like you are on a very restricted (unrealistic) time budget, but Working Effectively with Legacy Code is a good book if you're working with legacy systems. If you are planning to keep adding new features to the legacy system, it's your responsibility to make your management aware of the scope of the operation. Or at least try.

I have a class like this:

public class Customer : CustomerBase
{
    // internals are visible to test
    internal string GenString()
    {
        // this actually composes a number of different properties 
        // from the parent, child and system properties
        return this.InfoProperty.Name + DateTime.Now.ToString() + "something else"; 
    }
}

// this class is in a 3rd party library, but from metadata it looks like this
public class CustomerBase
{
    public Info InfoProperty { get; }
}

My test looks something like this:

public class Tests
{
    public void MyTest()
    {
        using (ShimsContext.Create())
        {
            // Arrange
            /* I shim datetime etc. static calls */

            Fakes.StubCustomer c = new Fakes.StubCustomer()
            {
                InfoProperty = new Info("Name") // <- Error here because it's readonly
            };

            // Act
            string result = c.GenString();

            // Assert
            Assert.AreEqual(result, "whatnot");
        }
    }
}

So my question is, how can I stub/shim the readonly property so that I can test this function?

What about wrapping this getter in an ad hoc virtual method that could be overridden by a mock?

Eg:

public class Customer : CustomerBase
{
  // internals are visible to test
  internal string GenString()
  {
    // this actually composes a number of different properties 
    // from the parent, child and system properties
    return InfoPropertyNameGetter() + DateTime.Now.ToString() + "something else"; 
  }

  public virtual string InfoPropertyNameGetter(){
    return this.InfoProperty.Name;
  }
}

Mock<Customer> mock = new Mock<Customer>();
mock.Setup(m => m.InfoPropertyNameGetter()).Returns("My custom value");

It would look a bit like the Introduce Instance Delegator pattern described in Working Effectively with Legacy Code.

I have a class which contains several methods I'd like to test, as well as a private nested class that is used in a few of these methods. Inside this private nested class it creates an object that attempts to form a connection to either a website or a database. I would like to separate the tests for connecting to this outside resource and for the processing that happens to the retrieved information. This way we can choose to not have the test environment connected to a functional 'outside' source, greatly simplifying the setup we need for these tests.

To that end I am writing a test which mocks the constructor for the object that attempts to form these connections. I don't want it to do anything when the nested private class attempts to form the connection, and when it tries to retrieve information I want it to just return a predefined string of data. At the moment I have something that looks similar to this:

public class MyClass {

    public int mainMethod() {
        //Some instructions...

        NestedClass nestedClass = new NestedClass();
        int finalResult = nestedClass.collectAndRefineData();
        return finalResult;
    }

    private class NestedClass {

        public NestedClass() {
            Connector connect = new Connector();
        }

        public int collectAndRefineData() {
            //Connects to the outside resource, like a website or database

            //Processes and refines data into a state I want

            //Returns data
        }
    }
}

The test class looks something like this:

@RunWith(PowerMockRunner.class)
@PrepareForTest({Connector.class})
public class MyClassTests {

    @Test
    public void testOne() {
        Connector mockConnector = mock(Connector.class);
        PowerMockito.whenNew(Connector.class).withNoArguments().thenReturn(mockConnector);

        MyClass testClass = new MyClass();
        int result = testClass.mainMethod();

        Assert.assertEquals(result, 1);
    }
}

Now, I do know that inside the PrepareForTest annotation I need to include the class that instantiates the object that I'm mocking the constructor for. The problem is that I can't put MyClass, because that's not the object that creates it, and I can't put NestedClass, because it can't be seen by the test. I have tried putting MyClass.class.getDeclaredClasses()[1] to retrieve the correct class, but unfortunately PowerMockito requires a constant in the annotation and this simply will not work.

Can anyone think of a way to get this mock constructor to work?


Note: I am unable to make any alterations to the code I am testing. This is because the code does work at the moment and has been manually tested; I am writing these tests so that future projects will have this automated testing framework to use.

I suspect you will have to modify the code under test. You have two options:

  1. Write a large integration test that covers this code so you have a safety net just in case your changes might break something. This test could possibly create an in-memory database/backend and have the connector connect to that.
  2. Make a small, safe change that creates a seam for testing

It's preferable to do both. See the book Working Effectively with Legacy Code for more details.

I'll show an example of option 2.

First, you can create a "seam" that allows the test to be able to change the way the Connector is created:

public class MyClass {

    public int mainMethod() {
        // Some instructions...

        NestedClass nestedClass = new NestedClass();
        return nestedClass.collectAndRefineData();
    }

    // visible for testing
    Connector createConnector() {
        return new Connector();
    }

    private class NestedClass {
        private final Connector connector;

        public NestedClass() {
            connector = createConnector();
        }

        ...
    }
}

You can then use a partial mock of MyClass to test your code.

@RunWith(JUnit4.class)
public class MyClassTests {

    @Test
    public void testOne() {
        MyClass testClass = spy(MyClass.class);
        Connector mockConnector = mock(Connector.class);
        doReturn(mockConnector).when(testClass).createConnector();

        int result = testClass.mainMethod();

        Assert.assertEquals(1, result);
    }
}

Note that assertEquals expects the first parameter to be the expected value.

Also note that this test uses Mockito, not PowerMock. This is good because tests that use PowerMock may be brittle and subject to breakage with small changes to the code under test. Be warned that using partial mocks can be brittle too. We will fix that soon.

After you get the tests passing, you can refactor the code so the caller passes in a Connector factory:

public class MyClass {
    private final ConnectorFactory connectorFactory;

    @Inject
    MyClass(ConnectorFactory factory) {
        this.connectorFactory = factory;
    }

    // visible for testing
    Connector createConnector() {
        return connectorFactory.create();
    }

    private class NestedClass {
        private final Connector connector;

        public NestedClass() {
            connector = createConnector();
        }

        ...
    }
}

The code using MyClass would preferably use a dependency injection framework like Guice or Spring. If that isn't an option, you can make a second no-arg constructor that passes in a real ConnectorFactory.

Assuming the tests still pass, you can make the test less brittle by changing your test to mock ConnectorFactory instead of doing a partial mock of MyClass. When those tests pass, you can inline createConnector().

In the future, try to write tests as you write your code (or at least before spending a lot of time on manual testing).

I have a method that I am unit testing. Inside that method, a call is made to another method that is going to throw an error, because it is trying to talk to a device that will not be available when the unit test is running. Is there a way to prevent that internal method from being called?

Environment: C# Visual Studio 2010 unit testing within IDE

Check out the Working Effectively with Legacy Code book by Michael Feathers - it has a lot of suggestions on dealing with code that does not have unit tests yet.

Possible approaches covered in the book:

  • extract the dependency into an interface - the ideal approach (see jamespconnor's answer)
  • use a flag to bypass the call (see Colm Prunty's answer)
  • extract that call into a virtual method and override it in a derived class used in the unit test
  • pass a delegate (may have less impact than a full interface/derivation; see the sketch after the derivation sample below)

Sample for deriving from the class:

public class WithComplexDependency
{
   public void DoSomething()
   {
     // Extract original code into a virtual protected method
     // dependency.MethodThatWillBreak();
     CallMethodThatWillBreak();
   }

   virtual protected void CallMethodThatWillBreak()
   {
      dependency.MethodThatWillBreak();
   }
}

in test code derive from the class and provide own implementation:

public class WithMockComplexDependency : WithComplexDependency
{
   // may also need to add constructor to call original one.

   override protected void CallMethodThatWillBreak()
   {
      // do whatever is needed for your test
   }
}

...
WithComplexDependency testObject = new WithMockComplexDependency();
testObject.DoSomething(); // now does not call dependency.MethodThatWillBreak()
...
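
For the "pass a delegate" option from the list above, a minimal C# sketch could look like the following. The Action field and constructor are illustrative and not part of the original code:

using System;

public class WithComplexDependency
{
    // The breaking call is passed in as a delegate: production code passes
    // dependency.MethodThatWillBreak, tests pass a harmless stand-in.
    private readonly Action _methodThatWillBreak;

    public WithComplexDependency(Action methodThatWillBreak)
    {
        _methodThatWillBreak = methodThatWillBreak;
    }

    public void DoSomething()
    {
        _methodThatWillBreak();
    }
}

// In a test:
// var testObject = new WithComplexDependency(() => { /* do nothing */ });
// testObject.DoSomething(); // never touches the real device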

I have an existing Rails app that I built using Rails 3, Mongoid/Mongodb and Devise. The app is running fine. I'd now like to add some tests to it (sure, shoulda done this in the beginning but the learning curve for just Rails was enough...).

I've used several pages to get it going, especially the Rails guide and this blog post about Mongo and Cucumber/RSpec. My concern is that between all of the "add this to this and such file" changes that I've made to try to get this working (and it's not working), I've made such a mess of things that it might be better to start over from scratch with the testing portion of the app.

I thought I would just delete the spec and test directories and re-gen the tests but I can't find a command to do that (the regen).

I've built a very simple test (assert true) but I'm getting:

D:/Dev/TheApp/test/test_helper.rb:10:in `<class:TestCase>': 
undefined method `fixtures' for ActiveSupport::TestCase:Class (NoMethodError)

I think the real issue here is that I'm using MongoDb and the test architecture in Rails seems to really really want to do ActiveRecord. Not sure if those two are compatible.

Is there a quick way to build a barebones test directory? My short term solution is to just roll back those directories. Hoping for a better solution.

The blank tests are really worthless. If you didn't have tests/specs of value, then just start from scratch. And if you want to start over, you should just delete them and start fresh.

You could treat your code as "legacy code" as defined by Michael Feathers in Working Effectively with Legacy Code -- that is, code without tests.

I have inherited an ASP.NET application that is a disaster architecturally, however it works and it is in live production use. In order to enhance the application in the future I need to totally re-architect it, but I need to preserve the existing front-end functionality - the changes that are being asked of me are more backend integration than front-end functional or user interaction.

The basic issue is this - the bulk of the business logic is in stored procs (and the rest is in UI codebehind files) - there is no business domain model of which to speak. I need to refactor logic out of the database (and down from the UI) into code, but the behaviour of the application is fundamentally bound up in these stored procs. So the question is this - how do you go about a complete re-architecting of this kind?

I understand refactoring in small steps, I've read Fowler etc. I am really looking for practical advice from people who have been through a similar process.

Thanks!

Edit I should have said, ideally I want to re-engineer iteratively, during a multiple release cycle.

IMHO the most important thing to do is to develop a lot of good tests that cover the entire functionality you want to preserve.

I know that writing tests is extremely hard in a messy codebase, but it is still possible through some dependency-breaking techniques. I would recommend Working Effectively with Legacy Code since it is all about the practical side of refactoring.

I have several pages of code. It's pretty ugly because it's doing a lot of "calculation" etc. But it consists of several phases, like many algorithms, like this:

  1. calculate orders I want to leave
  2. kill orders I want to leave but I can't leave because of volume restrictions
  3. calculate orders I want to add
  4. kill other orders I want to leave but I can't because of new orders
  5. adjust the new orders' amount to fit the desired volume

In total I have 5 pages of ugly code which I want to separate at least by stage. But I don't want to introduce a separate method for each stage, because these stages only make sense together; a stage by itself is useless, so I think it would be wrong to create a separate method for each stage.

I think I should use the C# #region directive for separation. What do you think, would you suggest something better?

Avoid #region directives for this purpose, they only sweep dirt under the carpet.

I second @RasmusFranke's advice, divide et impera: while separating functionality into methods you may notice that a bunch of methods happen to represent a concept which is class-worthy; then you can move those methods into a new class. Reusability is not the only reason to create methods.
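
As a rough illustration of that advice (all names below are invented, since the original code isn't shown), the five stages could become named private methods on one small class, with a single public method that tells the story:

using System.Collections.Generic;

// Hypothetical placeholder types; the real ones would come from the existing code.
public class Order { }
public class OrderBook { }
public class VolumeLimits { }

public class OrderRebalancer
{
    // The public method reads like the five stages in the question;
    // each stage becomes a named private method instead of a #region.
    public void Rebalance(OrderBook book, VolumeLimits limits)
    {
        List<Order> ordersToLeave = CalculateOrdersToLeave(book);
        KillOrdersExceedingVolume(ordersToLeave, limits);
        List<Order> newOrders = CalculateOrdersToAdd(book, limits);
        KillOrdersDisplacedBy(newOrders, ordersToLeave);
        AdjustAmountsToDesiredVolume(newOrders, limits);
    }

    private List<Order> CalculateOrdersToLeave(OrderBook book)
    {
        return new List<Order>(); // stage 1: calculate orders to leave
    }

    private void KillOrdersExceedingVolume(List<Order> orders, VolumeLimits limits)
    {
        // stage 2: kill orders that break the volume restrictions
    }

    private List<Order> CalculateOrdersToAdd(OrderBook book, VolumeLimits limits)
    {
        return new List<Order>(); // stage 3: calculate orders to add
    }

    private void KillOrdersDisplacedBy(List<Order> newOrders, List<Order> keptOrders)
    {
        // stage 4: kill other orders displaced by the new ones
    }

    private void AdjustAmountsToDesiredVolume(List<Order> newOrders, VolumeLimits limits)
    {
        // stage 5: adjust the new orders' amount to fit the desired volume
    }
}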

Refactor, refactor, refactor. Keep in mind principles like SOLID while using techniques from Refactoring and Working Effectively with Legacy Code.

Take it slow and, if you can, use tools like ReSharper or Refactor! Pro, which help minimize the mistakes that can occur while refactoring.

Use your tests to check if you broke anything, especially if you do not have access to the previously mentioned tools or if you are doing some major refactoring. If you don't have tests try to write some, even if it may be daunting to write tests for legacy code.

Last but not least, do not touch it if you don't need to. If it works but it is "ugly" and it is not a part of your code needing changes, let it be.

I have two classes, A & B. B inherits from A, and I want to invert the dependency.

class A { }

class B : A { }

Class B inherits from A. It means B has some dependency on A.

What would be the correct way to invert the dependency?

Inheritance is a concept implying tight coupling between classes.

In order to use Dependency Injection you need to create some "Seams", as Michael Feathers calls them in Working Effectively with Legacy Code. Here you can find a definition of Seam:

A seam is a place where you can alter behavior in your program without editing in that place.

When you have a seam, you have a place where behavior can change.

By this definition, there is no Seam in your example, which is not necessarily a bad thing. The question is now, why do you feel the need to do Dependency Injection in this place?

If it's for the sake of example, don't do Dependency Injection here. There are places where it does not really make sense to apply it: if you have no volatility, why would you do it?

If you do really feel the need to do something similar in your project though, you probably want to decouple the volatile concepts out of your inheritance hierarchy and create a Seam for these parts: you could have an interface to abstract these concepts, which at this point can be effectively injected into your client class.
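
A minimal sketch of that last idea, with invented names (the original two-line example has no volatile behaviour, so treat this purely as an illustration of the shape):

// The volatile behaviour is pulled out of the hierarchy into an interface - the Seam.
public interface IVolatileBehavior
{
    void Execute();
}

public class A
{
    private readonly IVolatileBehavior _behavior;

    public A(IVolatileBehavior behavior)
    {
        _behavior = behavior;
    }

    public void DoWork()
    {
        // The behaviour can now be swapped (e.g. with a test double) without editing A or B.
        _behavior.Execute();
    }
}

public class B : A
{
    public B(IVolatileBehavior behavior) : base(behavior) { }
}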

I'm an intern working on a project that has the potential to introduce a lot of bugs at a company with an extremely large code base. Currently the company has no automated testing implemented for any of their projects, so I want to begin writing tests for the code as I go so that I can tell when I break something, but I have a hard time developing an intuition for what is worth testing and how to test it. Some things are more obvious than others: testing string manipulation functions isn't too tough, but what to write for a multithreaded custom memory manager is trickier.

How do you go about designing tests for an existing codebase and what do you test for? How do you figure out what underlying assumptions the code is making?

I'm currently reading "Working Effectively with Legacy Code" by Michael Feathers

and I think I understand LSP violations, but it mentions some rules of thumb that help avoid LSP violations, which are:

  1. Whenever possible, avoid overriding concrete methods.
  2. If you do, see if you can call the method you are overriding in the overriding method.

I don't quite understand number 2. Could you help me clarify this, please?
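
For what it's worth, rule 2 is usually read as: if you must override a concrete method, have the override delegate to the base implementation (adding behaviour around it) rather than replace it wholesale, so the subclass still honours the base class's contract. A tiny illustrative sketch with invented names:

public class Account
{
    public virtual void Deposit(decimal amount)
    {
        // base behaviour: record the deposit
    }
}

public class AuditedAccount : Account
{
    public override void Deposit(decimal amount)
    {
        LogDeposit(amount);   // add behaviour around the base implementation...
        base.Deposit(amount); // ...but still call the method you are overriding (rule 2)
    }

    private void LogDeposit(decimal amount)
    {
        // write an audit entry
    }
}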

I'm looking for a tool that can solve following problem:

Our complete unit test suite takes hours to complete. So when a programmer commits code, he gets the test results after a few hours. What we would like to achieve is to shorten the time it takes to find simple bugs. This could be done by a smart selection of a few unit tests which would be run just before/right after the commit. Of course we don't want to pick these unit tests randomly - we want unit tests that are more likely to find a bug.

Idea to solve this problem without additional software: measure code coverage for each single unit test. Knowing which files are "touched" by which unit test, we can pick a unit test if the user changed any of those files. This solution has an obvious disadvantage - we have to manually store and update the list of covered files for each unit test.

I wonder, if there is any tool that helps selecting tests to run? Project uses C++ and works under Linux.

In Working Effectively with Legacy Code, Michael Feathers writes that a unit test that takes 1/10th of a second is a slow unit test. You must root out the slow tests. Do not hack up subset runners based on coverage guesses that will eventually be wrong and bite you.

Keep in mind the distinction between unit tests and integration tests. Unit tests do not touch the filesystem, talk over the network, or communicate with a database: those are integration tests. Yes, integration tests are often easier to write, but that is a strong indication that your software could be factored better—and as a happy coincidence, easier to test.

My suspicion is your integration tests are the ones taking so long. Move those to a separate suite that runs less frequently than on every checkin, say nightly.

I'm testing a function that takes several parameters and, based on their values, calls different private methods.

I want to check that the function always calls the right private method.

Since I know what the private methods will do, I can check the final result, but it would be more convenient to be able to check directly that the right function was called, because I have already tested the private methods.

Is there a way to replace a private method with a stub?

Yes, there are mocking libraries that let you do this. One is PowerMock. From their private method tutorial, you need something like this:

@RunWith(PowerMockRunner.class)
@PrepareForTest(MyUnit.class)
public class TestMyUnit {

    @Test
    public void testSomething() {
        MyUnit unit = PowerMock.createPartialMock(MyUnit.class, "methodNameToStub");
        PowerMock.expectPrivate(unit, "methodNameToStub", param1).andReturn(retVal);

        EasyMock.replay(unit);

        unit.publicMethod(param1);

        EasyMock.verify(unit);

    }

}

However, I really disagree with this practice myself. Your unit test should test inputs, outputs, and side effects, and that's it. By ensuring that a private method is called correctly, all you're doing is preventing your code from being easily refactored.

In other words, what if down the road you want to change how your unit does its job? The safe way to do this is to make sure the code is under (passing) tests, then refactor the code (potentially including changing which internal methods are called), and then run the tests again to make sure you didn't break anything. With your approach, this is impossible because your tests test the exact implementation, not the behaviour of the unit itself. Refactoring will almost always break the test, so how much benefit is the test really giving you?

Most often you would want to do this because you're actually considering those privates a unit unto themselves (this sounds like you, since you say you are testing those private methods directly already!). If that's the case, it's best to extract that logic into its own class, test it, and then have the remaining code interact with a mock/stub version of that new unit. If you do that, your code has a better structure and you don't need to fall back on the voodoo magic that is PowerMock. A fantastic reference for doing these kinds of refactorings is Michael Feathers' Working Effectively with Legacy Code.

I'm working in a PHP project where testing was neglected for a long time. The business logic is full of hard-coded dependencies and immediate database access through some hand-crafted (Oracle) SQL.

I've given up trying to build automated integration tests because of the complex database setup, the tight coupling to the (complex) database fixture and the lack of in-memory solutions.

To me it looks like the best place to start is to test the business logic. Therefore I need to refactor the code to get the data access code separated from the business logic, I guess. Still, I'm struggling with some basic design questions:

  1. What is the preferred way to encapsulate/get rid of this complex SQL? Is there any design pattern which has some good hints on how to get data from the datasource in a configurable way? Injecting Propel Active Query objects seems to help in some cases, but in complex cases they will be very hard to mock, I guess.
  2. Is there a good book about software architecture + unit testing for applications that make heavy use of their database?

To answer your 2nd question: Working Effectively with Legacy Code is what you need: it explains several patterns for breaking dependencies to make code testable.

Regarding your first question: it depends on your current case. Here are a few examples described in depth in the book:

Example 1 - Extract and Override Call

If you have a class like this (the example isn't in PHP, but you'll get the idea)

class MyClass {
    int getNbEligibleItems(){
      List<Item> rawItems = readInDb();
      //Now count eligible ones
    }

    List<Item> readInDb(){
      //Directly call DB and return a raw list
    }
}

Then you could make readInDb virtual, and override it in your tests:

class TestableMyClass : MyClass {
    override List<Item> readInDb(){
       //Return a list of hard-coded test items
    }
}

Example 2 - Parametrized constructor

If you have a class like this

class MyClass {
    private IDbReader _reader;

    MyClass(){
       _reader = new DbReader();
    }

    int work(){
       List<Item> items = _reader.read();
       //Work with items
    }
}

Then it would be possible to change the constructors to

    MyClass() : this(new DbReader()){ }

    MyClass(IDbReader reader){
       _reader = reader;
    }

So it would be possible to mock the db in tests.

So, to put it in a nutshell: there are a lot of techniques that could help in your case. We'd need some code to be more specific. And I recommend reading the book, as it provides a lot of answers for those cases.

In an object-oriented language with inheritance and virtual functions, removing dependencies (e.g. database, API calls, etc) from your unit testing code can be as simple as encapsulating those dependencies in their own methods and then overriding those methods in a test class inheriting from the class to be tested.

However, I've run into a problem when trying to do something similar for procedural code (C specifically). Without inheritance I can't override those calls, so how does one provide similar removals of dependencies when unit testing procedural code?

One option would be to provide alternatives to the calls to these dependencies and surround them with #ifdefs, but the ideal approach would be to have the unit test apply to the same code as is going into the final build. Is this possible?

Get Working Effectively with Legacy Code and read the chapter titled "My Application Is All API Calls".

Basically, Feathers describes two options:

The "linker seam": You can compile in a different set of implementations for the API calls you're trying to stub without having to change the code - basically change the makefile/.sln to compile in different implementations of the functions.

If that doesn't work, he talks about "skin and wrap", where you basically move all the API functions into an abstract base class, create two derived classes - a production and a unit-testing implementation with forwarding calls to the appropriate methods - then use dependency injection to pass the appropriate set of functions in.

I need to finish another developer's work, but the problem is that he started in a different way... So now I find myself in the situation of either using the existing code, where he chose to inherit from a non-abstract class (a very big class, without any virtual functions) that already implements a bunch of interfaces, or dismissing that code (which shouldn't be too much work) and writing another class that implements the interfaces I need.

What are the pros and cons that would help me to choose the better approach.

p.s. please note that I don't have too much experience

Many Thanks

What are the pros and cons that would help me to choose the better approach.

It's legal to derive from a class with no virtual functions, but that doesn't make it a good idea. When you derive from a class with virtual functions, you often use that class through pointers (e.g., a class Derived that inherits from Base is often manipulated through Base*s). That doesn't work when you don't use virtual functions. Also, delete-ing such an object through a pointer to the base class is undefined behaviour (often showing up as a leak), because there is no virtual destructor.

However, it sounds more like these classes aren't being used through pointers-to-the-base. Instead the base class is simply used to get a lot of built in functionality, although the classes aren't related in the normal sense. Inversion of control (and has-a relationships) is a more common way to do that nowadays (split the functionality of the base class into a number of interfaces -- pure virtual base classes -- and then have the objects that currently derive from the base class instead have member variables of those interfaces).

At the very least, you'll want to split the big base class into well-defined smaller classes and use those (like mixins), which sounds like your second option.

However, that doesn't mean rewrite all the other code that uses the blob base class all in one go. That's a big undertaking and you're likely to make small typos and similar mistakes. Instead, buy yourself copies of Working Effectively With Legacy Code and Large-Scale C++ Software Design, and do the work piecemeal.

I've got a fairly large Spring MVC application (80K loc) that I manage. Our team is going to be developing a new feature/sub-application.

The question is, should we build it/deploy it as its own application (a whole new WAR) or build it/deploy it as part of the current application (part of the existing WAR)? Are there pros and cons to each?

This is a restatement of the old question: should I modify this application in place, or scrap it and rewrite? The answer is almost always to modify/upgrade in place. Please read Joel's position on this and take it to heart. I know the temptation is always to start from scratch, but it is almost always a mistake.

There are lots of ways to use the legacy code but not strongly couple it to the new application. I've found Working Effectively With Legacy Code to be a great resource on this.

Less of a question, more a request for advice and suggestions on "where TF do I begin?"

Start environment: VS 2010 Pro, C#, VSS 2005, single developer, multiple VSS repositories, limited code re-use

Currently I have (mostly inherited and not had time to change until now) multiple VSS repositories, often with C&P'd utility/tool classes/projects. Documentation is on a network share (I include BRD, UAT test scripts, sql scripts for db builds etc. etc. all as "documentation" for this purpose). There is little or no Unit testing and no CI/ build automation.

I am about to get two new devs and therefore I want to change the environment

Possible end environment: VS 2010 Pro, C#, Java (not sure of IDE yet), 3 developers, documentation under source control, re-used code, (automated) unit testing, CI builds, incremental revision/build numbers

My initial thought is to move to SVN or similar, single repository, single "master" solution, multiple "working" solutions. MSTest (as it's built into VS2010?), CI through CC.Net or TeamCity.

I've read a lot on SO about doing each of these tasks independently e.g. arguments for/against each source control system, arguments for/against adding documentation into source control, but nothing on doing everything at once!

Note: we will still be a small team, so some manual work is allowable, but preferably minimal, and we have no budget, so needs to be free or "free to a small team"!

Does anyone have advice on where to start? What tools to use? How to go about migrating from VSS? Should I be using a "master" solution, or is there a CI tool that can "give the impression" of this for me? e.g. one place to see if a "utility code" change has broken anything else?

(footnote: I know we still need to write the unit tests!)

A good option for small teams (free up to 5 team members) is Team Foundation Service: https://tfs.visualstudio.com/

Team Foundation Service is a cloud hosted version of Team Foundation Server. It will give you all the stuff you need to run your project such as Source Control, Work Items and Automated builds that you can use for Continuous Integration.

Microsoft released a tool that will migrate your repositories to TFS: Visual Source Safe Upgrade Tool for Team Foundation Server.

I would also advice moving to Visual Studio 2012. It's a really big improvement on VS2010. Since you are a small company you probably can apply for Microsoft BizSpark. That will give you three years of free software, Azure and support. It can really help you getting started.

Regarding your Documentation question, have a look at the SSW Rules for better .NET Projects. Rule 45 mentions how you could store your documentation with your project under source control.

All the other things you mention (Sql scripts, test scripts) should also be under source control. You can add them to a specific folder beneath your solution.

Adding unit tests to an existing project will be a hassle. Working Effectively with Legacy Code is a book that offers some great advice on how to start. If you have no experience with writing unit tests you should first study the subject or hire an external consultant. In the projects I consult on, I see that people start enthusiastically but stop using unit tests because they run into problems.

My code takes input from a file. If I have to write an NUnit test case for it, do I have to change my code to take input from the test case instead of the file, or is it possible that I don't have to change my code?

Help please.

Given below is one of the functions which I have to test.

    public void CLEAN_Discipline()
    {
        //////////////////////////////////////////////////////SQL CONNECTION////////////////////////////////////////////////

        string sqlconnectionString = "server=localhost;" + "database=fastdata;" + "uid=dwh;" + "password=dwh;";
        MySqlConnection cnMySQL = new MySqlConnection(sqlconnectionString);
        MySqlCommand cmdMySQL = cnMySQL.CreateCommand();
        cmdMySQL.CommandText = "Select * from single";
        MySqlDataReader reader;
        String query;
        MySqlCommand cmd2;
        cnMySQL.Open();
        reader = cmdMySQL.ExecuteReader(); 


        while (reader.Read())
        {
            string disp = reader["Discipline"].ToString();

            disp = disp.ToUpper();
            disp = disp.Replace("MS", "");
            disp = disp.Replace("-", "");
            disp = disp.Replace(" ", "");

            string roll = reader["RollNumber"].ToString();


            MySqlConnection cnMySQL2 = new MySqlConnection(sqlconnectionString);
            cnMySQL2.Open();
            query = "UPDATE single SET Discipline = ('" + disp + "')  where RollNumber = '" + roll + "' ";
            cmd2 = new MySqlCommand(query, cnMySQL2);
            cmd2.ExecuteNonQuery();
            cnMySQL2.Close();
        }
        cnMySQL.Close();

        cnMySQL.Open();
        reader = cmdMySQL.ExecuteReader();
        while (reader.Read())
        {
            string roll = reader["RollNumber"].ToString();
            string disp = reader["Discipline"].ToString();
            string degree = reader["Degree"].ToString();
            if (degree == "MS")
            {
                if (roll[0] == 'I')
                {
                    if (disp != "CS" && disp != "SPM" && disp != "")
                    {
                        disp = "UNKNOWN";
                    }
                }
                else if (roll[0] == 'K')
                {
                    if (disp != "CS" && disp != "SPM" && disp != "TC" && disp != "NW")
                    {
                        disp = "UNKNOWN";
                    }
                }
                else if (roll[0] == 'P')
                {
                    if (disp != "CS" && disp != "TC")
                    {
                        disp = "UNKNOWN";
                    }
                }
                else if (roll[0] == 'L')
                {
                    if (disp != "CS" && disp != "SPM" && disp != "TC" && disp != "NW")
                    {
                        disp = "UNKNOWN";
                    }
                }
            }
            if (disp == "UNKNOWN")
            {
                MySqlConnection cnMySQL2 = new MySqlConnection(sqlconnectionString);
                cnMySQL2.Open();
                query = "UPDATE single SET Discipline = ('" + disp + "')  where RollNumber = '" + roll + "' ";
                cmd2 = new MySqlCommand(query, cnMySQL2);
                cmd2.ExecuteNonQuery();
                cnMySQL2.Close();
            }

        }
    }

I'd advise you to break up your function into smaller functions so you can test just the parts containing logic.

In your case, perhaps you should extract the logic that manipulates the string you want to check into a function that takes a string. You can then test that function without all the other dependencies (reading from and writing to the database).

Working Effectively with Legacy Code by Michael Feathers is a great resource for explaining how to create seams in your code that allow for testing.
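
As a sketch of that advice, the string cleanup from the question could be pulled into a small pure function and covered with NUnit directly (the class and method names below are invented):

using NUnit.Framework;

public static class DisciplineCleaner
{
    // Same string transformations as in CLEAN_Discipline, but with no database involved.
    public static string Clean(string discipline)
    {
        return discipline
            .ToUpper()
            .Replace("MS", "")
            .Replace("-", "")
            .Replace(" ", "");
    }
}

[TestFixture]
public class DisciplineCleanerTests
{
    [Test]
    public void Clean_RemovesMsPrefixDashesAndSpaces()
    {
        Assert.AreEqual("CS", DisciplineCleaner.Clean("ms - cs"));
    }
}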

Right now I’m working on a very big banking solution developed in VB6. The application is massively form-based and lacks a layered architecture (all the code for data access, business logic and form manipulation is in the single form class). My job is now to refactor this code. I'm writing a proper business logic layer and data access layer in C# and the form will remain in VB.

Here are code snippets:

public class DistrictDAO
{
    public string Id{get;set;}
    public string DistrictName { get; set; }
    public string CountryId { get; set; }
    public DateTime SetDate { get; set; }
    public string UserName { get; set; }
    public char StatusFlag { get; set; }
}

This is the District entity class; why its suffix is DAO, I'm not clear.

 public class DistrictGateway
{
    #region private variable
    private DatabaseManager _databaseManager;
    #endregion

    #region Constructor
    public DistrictGateway(DatabaseManager databaseManager) {
        _databaseManager = databaseManager;
    }
    #endregion

    #region private methods
    private void SetDistrictToList(List<DistrictDAO> dataTable, int index, DistrictDAO district){
        // here is some code for inserting 
    }    
    #endregion

    #region public methods
    public List<DistrictDAO> GetDistrict()
    {
        try
        {
        /*
         query and rest of the codes
         */    

        }
        catch (SqlException sqlException)
        {
            Console.WriteLine(sqlException.Message);
            throw;
        }
        catch (FormatException formateException)
        {
            Console.WriteLine(formateException.Message);
            throw;
        }
        finally {
            _databaseManager.ConnectToDatabase();
        }
    }

    public void InsertDistrict() { 
        // all query to insert object
    }

    public void UpdateDistrict() { 

    }
    #endregion
}

The DistrictGateway class is responsible for database query handling. Now the business layer:

  public class District
{
    public string Id { get; set; }
    public string DistrictName { get; set; }
    public string CountryId { get; set; }
}


public class DistrictManager
{
    #region private variable
    private DatabaseManager _databaseManager;
    private DistrictGateway _districtGateway;
    #endregion

    #region Constructor
    public DistrictManager() { 
        // Instantiate the private variable using utitlity classes
    }
    #endregion

    #region private method
    private District TransformDistrictBLLToDL(DistrictDAO districtDAO) { 

        // return converted district with lots of coding here
    }

    private DistrictDAO TransformDistrictDLToBLL(District district)
    {

        // return converted DistrictDAO with lots of coding here
    }

    private List<District> TransformDistrictBLLToDL(List<DistrictDAO> districtDAOList)
    {

        // return converted district with lots of coding here
    }

    private List<DistrictDAO> TransformDistrictDLToBLL(List<District> district)
    {

        // return converted DistrictDAO with lots of coding here
    }


    #endregion

    #region public methods
    public List<District> GetDistrict() {
        try
        {
            _databaseManager.ConnectToDatabase();
          return TransformDistrictBLLToDL(  _districtGateway.GetDistrict());

        }
        catch (SqlException sqlException)
        {
            Console.WriteLine(sqlException.Message);
            throw;
        }
        catch (FormatException formateException)
        {
            Console.WriteLine(formateException.Message);
            throw;
        }
        finally {
            _databaseManager.ConnectToDatabase();
        }
    }

    #endregion
}

This is the code for the business layer.

My questions are:

  1. Is it a perfect design?
  2. If not, what are the flaws here?
  3. I think this code has duplicated try-catch blocks
  4. What would be a good design for this implementation?

If your job is to refactor the code, then first of all ask your boss whether you really should just refactor it or add functionality to it. In both cases you need an automated test harness around that code. If you are lucky and you should add functionality to it, then you at least have a starting point and a goal. Otherwise you will have to pick the starting point by yourself and will not have a goal. You can refactor code endlessly. That can be quite frustrating without a goal.

Refactoring code without tests is a recipe for disaster. Refactoring code means improving its structure without changing its behavior. If you do not have any tests, you cannot be sure that you did not break something. Since you need to test regularly and a lot, these tests must be automated. Otherwise you spend too much time on manual testing.

Legacy code is hard to press into some test harness. You will need to modify it in order to get it testable. Your effort to wrap tests around the code will implicitly lead to some layered structure of code.

Now there is the chicken-and-egg problem: you need to refactor the code in order to test it, but you have no tests right now. The answer is to start with "defensive" refactoring techniques and do manual testing. You can find more details about these techniques in Michael Feathers' book Working Effectively with Legacy Code. If you need to refactor a lot of legacy code, you should really read it. It is a real eye opener.

To your questions:

  1. There is no perfect design. There are only potentially better ones.
  2. If the application does not have any unit tests, then this is the biggest flaw. Introduce tests first. On the other hand: those code snippets are not that bad at all. It seems that DistrictDAO is something like the technical version of District. Maybe there was some attempt to introduce a domain model. And: at least DistrictGateway gets the DatabaseManager injected as a constructor parameter. I have seen worse.
  3. Yes, the try-catch blocks can be seen as code duplicates, but that is nothing unusual. You can try to reduce the catch clauses with a sensible choice of Exception classes. Or you can use delegates or use some AOP techniques, but that will make the code less readable. For more see this other question.
  4. Fit the legacy code into some test harness. A better design will implicitly emerge.

Anyway: first of all, clarify what your boss means by refactoring the code. Just refactoring code without a goal is not productive and will not make the boss happy.

Where do we start using unit testing?
I have some doubts about where to start using unit testing.
I am doing unit testing with JUnit in RAD, and I am doing it after all the code is ready to deploy, or maybe after deployment. I am confused about why we are doing unit testing after the code is almost ready to deploy.
My question is: when should we start unit testing?

I have some more questions...
What I am doing in my unit testing is: I take one method from a class and create a test class for that method.
In that test class I give some input values to the method and expect the respective output values from the database.
So the single test does everything: taking input values -> passing them to the method -> calling the method from the original class -> database connection -> fetching the value from the DB -> returning it to the test class.

If the test runs successfully then the JUnit console shows a green bar, else a red bar with the cause of the error. But it doesn't generate any unit test report.

Now here is my question...
Am I doing correct unit testing? Since a single unit test method covers the whole code flow and gives a result...

The best time to start unit testing, if you haven't already, is now. The most effective use of unit tests is Test-Driven Development (TDD), in which you write the tests for each method as you implement it (write a test that fails and then implement the method to make it pass). However, it's not too late to add tests later on. JUnit tests solidify certain assumptions about the code that could change later on. If the assumption changes, the test breaks and you might save yourself from some bugs that would have been really hard to detect.

I don't know of a report facility, but you can add a JUnit ANT task which will output the test results to your build console (or log, if ant output is captured).

Your database tests sound like small integration tests rather than unit tests. That's good, but if the tests are too slow you might want to consider using mock objects (via a framework like JMock or EasyMock) to replace real DB connections with fake ones. This isolates the logic of the method from the state and behavior of the database and lets you run tests without worrying about whether the DB is running and stocked with test data.

Useful links :

http://en.wikipedia.org/wiki/Test-driven_development

http://www.jmock.org/

http://easymock.org/

http://ideoplex.com/id/25/ant-and-junit

http://ant.apache.org/manual/Tasks/junit.html

http://misko.hevery.com/code-reviewers-guide/ (Guide to writing unit-testable code)

[Edit - in response to comment]: About whether what you've done is correct unit testing, technically the answer is 'no'. As I said, it's more of an integration test, which is good, but it does too much and has too many dependencies to be considered a real unit test. Ideally, a unit test will test the logic in a method and not the dependencies. The guide to writing testable code mentioned above (as well as the associated blog of Misko Hevery) gives some good advice about how to write code with seams to insert mocks of dependencies. Michael Feathers goes into depth on the subject in his excellent book Working Effectively with Legacy Code.

The key factor is dependency injection: Testable methods don't look for their dependencies - they receive them in the constructor or as arguments. If this requires too much refactoring, there are some tricks you can try, such as moving the code that looks for dependencies into protected or package-visible methods you can override. So then you run the tests on a subclass which returns mock objects from those methods.

You should really write your tests as early as possible, ideally before you write your implementation.

This is a book that I've found useful on the subject and may be worth a read... http://www.amazon.com/Test-Driven-Development-Addison-Wesley-Signature/dp/0321146530

To try and address your second point: what you are describing is an "integration test", i.e. it is not just testing your unit of code, but also your database connectivity, configuration, data and the like.

Unit tests should only test the specific "part" of code that you are concerned with, i.e. calling your method should just test that method. It should not be concerned with database connectivity and data access. You can achieve this using "mock" objects to act as a temporary replacement for your dependencies.

See: http://msdn.microsoft.com/en-us/library/aa730844.aspx

Whilst this document is from Microsoft and you're using java, the same principles apply.

Hope this helps.

The system I am working on has (fortunately) JUnit unit tests already, which cover some percentage of the code's functionality (better than nothing, I suppose).

The methods in the system are highly dependent on each other, and the combination of parameters that can be sent to a method as input is huge. Let's see an example:

public void showMessage (final Language language, final Applications apps, final UserID userId)

The above method can have more than 300,000 different message boxes popping up, based on different apps and userIds. Apart from the question of whether putting such huge pressure on a single method (or a couple of them, at least) sounds reasonable or not (which IMO cannot be justified design-wise), we fear a bug that could cause huge problems.

That being said, what we found is simply as follows:

  1. Refactoring JUnit Unit Tests
  2. Extract a lot of objects in the method as parameters
  3. Create an Object Factory
  4. Feed the unit tests with different inputs created by the factory (brute-force test, maybe?)
  5. Check the outcome of the newly refactored tests

So basically, the question is as follows:

Creating integration tests on top of unit tests: possibilities and challenges? Is this a customized approach, or is it a rational standard approach? Where should such tests go, in both the project files and the project build lifecycle? Should they be run all the time as part of the build process?

Any type of feedback/comment is welcome, both on JUnit implementation limitations and on design/idea limitations.

Amir,

What do you mean by building "Integration Tests on top of Unit Tests"? I'd see integration tests as rather coarse-grained black-box tests, testing your application deployed on a server, talking to a real db, JMS queues, 3rd party webservices etc. Normally you might want to mock pieces of infrastructure you don't own (like 3rd party webservices etc).

Integration tests will be much slower than unit tests, therefore you'd tend not to execute them with every build. If you use a CI server like Jenkins you might create a separate integration build that would execute all the tests you have (unit + integration).

You mentioned dynamic data in your unit tests. I think it's good to have unit tests with static input data to make sure they cover the same execution path every time they are executed, verifying if any changes of behavior have happened when new code has been introduced or to test some corner cases...

BUT..

it's also a good idea to have an additional suite of tests that would stress your application with random input data. AFAIK the Lucene/Solr guys took that approach and they even open-sourced their own framework for randomized testing. Have a look at those links:

http://labs.carrotsearch.com/randomizedtesting.html

http://vimeo.com/32087114

As for the general approach on how to work with legacy software, I recommend M. Feathers' "Working Effectively with Legacy Code". Excellent book; it might give you some ideas on how to cover your existing code with unit tests and improve code quality.

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

We have a legacy third-party telephony system built on something called "CT ADE" that periodically hangs for a few seconds (5 to 30) then resumes. During these hangs, users experience frustrating pauses in the phone menu. This has been going on for several weeks at least.

This code was not written by me, so my knowledge of it is very limited. Internally there are multiple "tasks" (threads?), one per phone line, that handle calls. When the application hangs, all "tasks" are hung.

This issue does not seem to be load related. It occurs even during times of low usage. It does not appear to be network related (it occurs on systems where the DB is located on the same physical box as this app), nor disk related, although creating sample tasks that do lots of DB I/O and file I/O can cause shorter pauses within this application.

The process does not show any memory or cpu spikes when the problem occurs.

At this point I'm just grasping for anything to try...

Working with legacy code is painful - in my experience you just need to dive in and try and understand what the code is doing through whatever means works for you - be it by reading the code and trying to figure out what it does, or debugging various scenarios and stepping through each line of code executed.

It will take a while, and there will be parts of the code you will never understand, but given enough time staring at the code and experimenting with what it does you should eventually be able to understand enough to figure out what the problem is.

There is a book Working Effectively with Legacy Code which I have never read but is meant to be very good.

We have a project that has been developed for 2 years with a poorly designed architecture. Nowadays there are no unit tests at all.

The current version of the system works satisfactorily, but we vitally need to refactor the core modules.

The budget is also limited, so we cannot hire a sufficient number of developers to write unit tests.

Is it a possible strategy to automatically generate code for unit tests which cover, for example, interaction with data, on the assumption that the system currently works fine and the current system's output can be converted into XML fixtures for unit testing?

This approach would give us the possibility to quickly start refactoring the existing code and receive immediate feedback if some core functionality is broken because of the changes.

I would be wary of any tools that claim to be able to automatically determine and encode an arbitrary application's requirements into nice unit tests.

Instead, I would spend a little time setting up at least some high-level functional tests. These might be in code, using the full stack to load a predefined set of inputs and checking against known results, for instance. Or perhaps even higher-level with an automation tool like Selenium or FitNesse (depending on what type of app you're building). Focus on testing the most important pieces of your system first, since time is always limited.

Moving forward, I'd recommend getting a copy of Michael Feathers' Working Effectively with Legacy Code, which deals with exactly the problem you face: needing to make updates to a large, untested codebase while making sure you don't break existing functionality in the process.

We have several places in our code-base where we do something similar to the following:

DataTable dt = new DataTable();
using (DatabaseContext context = DatabaseContext.GetContext(false)) {
    IDbCommand cmd = context.CreateCommand("SELECT * FROM X");
    SqlDataAdapter dataAdapter = new SqlDataAdapter((SqlCommand)cmd);
    dataAdapter.Fill(dt);
}
return dt;

How can we use a Mock Testing framework like Moq to remove our testing dependency on the database? I'd like to mock up the DataTable that gets returned.

Clarification: we have plans to change this code but currently can't. Is it possible to mock it as is?

By "can't change the code" I think you mean you can't do a big refactor. Here's what I suggest.

  1. Extract the code you provided into a method.
  2. Make it virtual.
  3. If it is not already, make it protected or public.
  4. Inherit from the class containing this method and name the subclass OriginalClassNameTesting, for example.
  5. Override the method and return whatever DataTable you want.
  6. In tests, use your OriginalClassNameTesting class instead of the original one.

This pattern is called 'Extract and Override' and it's one of many presented in a great book - http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052.
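
Applied to the snippet in the question, the steps might look roughly like this (DatabaseContext comes from the question; ReportRepository and CountRows are invented purely for the sketch):

using System.Data;
using System.Data.SqlClient;

public class ReportRepository
{
    // Steps 1-3: the data access from the question is extracted into a virtual method (the seam).
    protected virtual DataTable LoadData()
    {
        DataTable dt = new DataTable();
        using (DatabaseContext context = DatabaseContext.GetContext(false))
        {
            IDbCommand cmd = context.CreateCommand("SELECT * FROM X");
            SqlDataAdapter dataAdapter = new SqlDataAdapter((SqlCommand)cmd);
            dataAdapter.Fill(dt);
        }
        return dt;
    }

    public int CountRows()
    {
        // stands in for whatever logic currently consumes the DataTable
        return LoadData().Rows.Count;
    }
}

// Steps 4-6: the testing subclass overrides the seam and returns a canned table.
public class ReportRepositoryTesting : ReportRepository
{
    protected override DataTable LoadData()
    {
        DataTable dt = new DataTable();
        dt.Columns.Add("Id", typeof(int));
        dt.Rows.Add(1);
        return dt;
    }
}

// In a test, new ReportRepositoryTesting() is used in place of the original class,
// so CountRows() runs against the canned DataTable instead of the real database.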

Some may not like that you are adding a virtual method just for testing. So what, this is just a first step. You said you have plans for refactoring; then it will be important to have tests in place so that you are sure you didn't break anything. And in Java every method is virtual by default (or am I wrong?)

We've just inherited a new codebase which has a bunch of automated tests. We've been 'going with the flow' and running them before each merge and deployment. Some of our staff have even been adding to the collection of tests when they write new functionality.

I've been trying my best to convince the team that we need to drop these things like a hot potato, but there is a lot of internal bickering about it.

As I see it, the problem is as follows - these tests take a very long time to run (up to 10 minutes on one developer's older machine!). They keep breaking - I don't know about you but I hate it when my software breaks. If we just got rid of all this buggy code, we would reduce the number of bugs by a factor of 10 at the least. The developers who have been writing new ones are taking at least twice as long to do anything as the others. This is ridiculous, we are already working with tight deadlines and now progress is slowing down even further, for this project, at least. I was looking at one of the commits made by one of these devs and the feature was just over 100 LOC, the tests were close to 250 LOC!

My question is: how can we strip out all of this buggy, unmaintainable code which breaks every five minutes (I don't just want to do a search and replace in case actual features start breaking as a result of it) and convince our developers to stop wasting time?

Why do the tests keep breaking? Why are they buggy and unmaintainable? That is the root of your problems.

The tests should express the requirements of your software. If they keep breaking, that may be because they are too tightly coupled to the internals of the implementation, rather than only invoking the code through public interfaces. If you can't drive the code through public interfaces, then that is usually a sign that the code could be better designed to be more testable - which tends to also make it clearer, more flexible, and robust.

If you remove the tests, then you will have no idea whether your codebase still works. Stripping out tests because they are "buggy" leaves you with the uncomfortable question: if the tests are buggy, why would the actual code be any less buggy?

On the other hand, if the (legacy) tests are cripplingly slow and awkward, then some of them at least may be more trouble that they are worth - but you'd have to consider them on a case-by-case basis, and work out how you will replace them. You could time the tests and see if there are any particularly slow ones to target - if they are JUnit tests then you get the timings automatically. You could also measure code coverage and see whether tests are overlapping, or covering very small amounts of the codebase given their size and run-time. There are some books specifically on the topic of dealing with legacy code.

When done properly, automated tests speed up development, not slow it down (assuming you want the software to actually work) because developers can make changes and refactor the code without fear that they will break something else every time they touch anything, and because they can clearly tell when a feature is working, as opposed to "it seems to work, I reckon".

Speeding up the tests would be desirable. Not all tests can be fast if they are exercising the system end-to end, but I normally keep such "feature tests" separate from the fine-grained unit tests, which should run very fast so they can be run all the time. See this posting by Michael Feathers on his definition of a unit test, and how to deal with slow tests.

If you care whether your code works or not, then the test code is just as important as the actual code - it needs to be correct, clear and factored to avoid duplication. Having at least as much test code as real code is perfectly normal, though the ratio will vary a lot with the type of code. Whether your specific example was excessive is impossible to say without seeing it.

Update: In one of your other questions you say "I've introduced the latest spiffy framework, unit & functional testing, CI, a bug tracker and agile workflow to the environment". Have you changed your mind about testing, or is this question a straw-man, perhaps? Or is your question really "should I drop all the old tests and start again"? (It doesn't read that way, given your comments about developers writing new tests...)

I truly hope this question doesn't get deleted since I really do need help from the programming pros out there...

I've been programming for quite a while (my main platform being .Net), and am about 85% self-taught - Usually from the school of hard-knocks. For example, I had to learn the hard way why it was better to put your configuration parameters in an .ini file (that was easier for my users to change) rather than hard-code them into my main app and now to use the app.config rather than even a config file sometimes. Or what a unit-test was and why it proved to be so important. Or why smaller routines / bits of code are better 90% of the time than one, long worker process...

Basically, because I was thrown into the programming arena without even being shown .Net, I taught myself the things I needed to know, but believe I missed MANY of the fundamentals that any beginner would probably be taught in an intro class.

I still have A LOT more to learn and was hoping to be pointed in the right direction towards a book / web-resource / course / set of videos to really teach the best programming practices / how to set things up in the best way possible (in my case, specifically, for .Net). I know this may seem subjective and every programmer will do things differently, but if there are any good books / resources out there that have helped you learn the important fundamentals / tricks (for example, self-documenting code / design patterns / ) that most good programmers should know, please share!!!

My main platform is .Net.

Thanks!!!!
