March | 2010 | Journeyman

Shields Up! (continued)

March 12, 2010 § 5 Comments

(Continued from the previous article: “Shields Up!”)

Here is where I think the argument to ‘check for null in all cases’ breaks down under the defensive programming paradigm. The best opportunity to check the state of m_pObjAPtr and s_pInstance is the moment SomeClass is created or requested. We want to identify the problem early to provide opportunities to correct or avoid issues when they happen. So you could write the following code:

1:

   2: SomeClass.h

   3: -----------

4:

   5: class SomeClass {

   6:     public:

   7:         static boost::shared_ptr<SomeClass> getInstance();

   8:         bool ReadyToBeUsed();

9:

  10:     private:

  11:         static boost::shared_ptr<SomeClass> s_pInstance;

  12:         ObjectAPtr m_pObjAPtr;

  13: };

14:

  15: SomeClass.cpp

  16: -------------

17:

  18: SomeClassPtr SomeClass::getInstance()

  19: {

  20:     if (!s_pInstance)

  21:     {

  22:         s_pInstance.reset( new SomeClass() );

  23:     }

24:

  25:     assert(s_pInstance->ReadyToBeUsed());

26:

  27:     return s_pInstance;

  28: }

29:

  30: SomeClass::SomeClass() {

  31:     m_pObjAPtr = GetFromFactory();

  32:     assert(m_pObjAPtr);

  33: }

34:

  35: bool SomeClass::ReadyToBeUsed() {

  36:     return (pObjAPtr) ? true : false;

  37: }

The new code checks the internal state of the SomeClass instance and asserts if there is a problem. Now SomeClass has a publically accessible method to validate its internal state. Given that you added the null check in the caller because the response from getInstance() cannot be trusted, by extension the same rule is applied to checking ReadyToBeUsed(). You could argue that the null check is enough, but lets be honest, just because it is not null, does not mean that it is usable. Especially considering that the object is leveraged by other systems since it is a singleton in this example. Checking for null is a very limited validation. This is an opportunity to further ensure that the object at the end of the pointer is usable. If you are truly concerned about dodging erroneous behaviours, you have to run ReadyToBeUsed() next to the null check in the callers code.

So now you have the following:

   1: MyCode.cpp

   2: ----------

   3: SomeClassPtr pLocal = SomeClass::getInstance();

4:

   5: if (pLocal && pLocal->ReadyToBeUsed()) {

   6:     pLocal->SomeMethod();

   7: }

An easy win and the code is safe right? Yes, but at what cost? The logic for what constitutes a valid instance of SomeClass is duplicated across two entities in the software. This would not be be much of a problem if SomeClass is only used once. In our code SomeClass is used in several places. Which means the validations for a valid SomeClass instance is duplicated across several systems. Should the requirements change for ReadyToBeUsed (perhaps passing flag for deep or shallow check of its member vars), or if a new entity needs checking before SomeClass (a user security check), you may need to edit each system and update each check – everywhere there is a response from getInstance().

Let’s not forget that SomeClass is not the only object usable in the system. Several classes use multiple instances of other classes. If you truly want your code to test for a usable instance of the objects it uses you could see this in your code:

   1: SomeClassPtr pLocal1 = SomeClass1::getInstance();

   2: SomeClassPtr pLocal2 = SomeClass2::getInstance();

   3: SomeClassPtr pLocal3 = SomeClass3::getInstance();

4:

   5: if (pLocal1 && pLocal1->ReadyToBeUsed()) {

   6:     pLocal1->SomeMethod();

   7: }

8:

   9: if (pLocal2 && pLocal2->ReadyToBeUsed()) {

  10:     pLocal2->SomeMethod();

  11: }

12:

  13: if (pLocal3 && pLocal3->ReadyToBeUsed()) {

  14:     pLocal3->SomeMethod();

  15: }

An additional problem with this mindset is that several entities in the software ecosystem now have knowledge of what a valid SomeClass is. This not only breaks the DRY principle, but also erodes the encapsulation of SomeClass. After all, getInstance() was supposed to guarantee a valid instance of SomeClass.

I don’t like that direction. Forget the calls to ReadyToBeUsed(). Even if all you check for is null in this code, it adds zero value if a null pointer is not possible, duplicates responsibilities, and clutters the code.

Here is what I would prefer to see:

   1: SomeClass.cpp

   2: -------------

3:

   4: SomeClassPtr SomeClass::getInstance()

   5: {

   6:     if (!s_pInstance)

   7:     {

   8:         s_pInstance.reset( new SomeClass() );

   9:     }

10:

  11:     // invariants

  12:     if (    !s_pInstance                         // can never be null

  13:             || !s_pInstance->ReadyToBeUsed()) {  // must be in a usable state

14:

  15:         // log or assert that there was a problem

16:

  17:         s_pInstance.reset ( new NullObject() );

  18:     }

19:

  20:     return s_pInstance;

  21: }

22:

  23: MyCode.cpp

  24: ----------

25:

  26: SomeClassPtr pLocal = SomeClass::getInstance();

27:

  28: pLocal->SomeMethod();

The above code uses invariants or validations to enforce the contract with the callers. This code solves every issue the repeated ‘Null checks by the caller’ solves with less overhead and greater versatility. The summaries below states the reasons why.

Null-Checks by the Caller

Pros

Easy to implement
Guarantees that the code does not attempt to use a null pointer

Cons

Duplicate the null check (if that’s all you check for) every time you acquire the pointer
Null check does not guarantee a useable object
Clutters the code (reduced readability and maintainability)
Increased effort to unit test (multiple systems to test if the system fails to deal with a null pointer vs. a single system)
It does not enforce the contract (unless the null checks are wrapped in asserts and executed in debug mode), otherwise the code silently moves on
A null check could be missed by the author and reviewers

Invariants to Enforce the Contract

Pros

Easy to implement
It guarantees a usable pointer every time (read the Null Object Pattern)
No duplication of code / encapsulates the validations
Less code clutter (increased readability and maintainability)
Less effort to unit test (less execution paths, only one system to test for a null pointer)
Frees engineers from performing monkey work (i.e. adding checks for nulls everywhere)
Makes the contract explicit for future engineers so that they must make the conscious decision to maintain or reject the contract
Invariants/Validations are always executed

Cons

Just one…

Cons (common to both approaches)

Someone could always change the code and break the contract (remove the invariants or the null checks). This cannot be enforced in the code.

I think I’ll invest in the solution that has more pro’s than cons, providing me with the greatest return. That’s where I’ll hang my shield.

———-8<—————————————————–

Update: I wrote the article above a couple days ago when the ‘you are missing a null check’ defect was originally logged. The team met today to discuss it. Everyone outlined their points of view, concerns, and goals. In the end, we agreed that the null check was not needed, and decided to use the solution I proposed. It was a great conversation, not because we chose my strategy, but because we got a chance to talk software design. It is something I wish we would do more often.

Shields Up!

March 12, 2010 § Leave a comment

Where do you hang your shield?

I spent the last week designing, writing, and testing a new error reporting system for our product. I spent a lot of time reviewing the design ensuring that I did not introduce any unnecessary dependencies or wrote code that was not required. While coding I intentionally made one omission to the design while knowingly ignoring one of our team rules. I then submitted it for review knowing that someone would reject the change, request that the rule be upheld, and log a defect so that it would be fixed. There will be debate. Eventually other members of the team will be involved because it involves one of our team rules. Sides will be picked and arguments for and against will be made. It will bring to light individual perspectives and debate about what constitutes good design in software.

So what was the single change I made that violated a team rule and opened Pandora’s Box of emotional debate? I did not check for a null pointer.

Is this a bad thing? If you are a programmer, I am sure that you have reached your conclusion already. Of course it is a terrible thing not to check your pointers before you use them. I absolutely agree. So would you be surprised that despite the fact I did not check for a null pointer, I did not violate the rule?

Here is some context.

The rule to check for a null was put in to place to ensure that code does not attempt to utilize a null pointer. If a null pointer was used, the repercussions could not be predicted. If you did not encapsulate it inside a try/catch, the code would likely crash. The ‘check for null’ rule shields you from this undesired behaviour in your code. Check it before you use it. That makes sense.

The question I pose is where to put the check? Where should you hang the shield that protects you for the highest return on investment?

Here is an example of the c++ code I was working with.

1:

   2: SomeClass.h

   3: -----------

4:

   5: class SomeClass {

   6:     public:

   7:         static boost::shared_ptr<SomeClass> getInstance();

   8:         void SomeMethod();

9:

  10:     private:

  11:         static boost::shared_ptr<SomeClass> s_pInstance;

  12: };

13:

  14: typedef boost::shared_ptr<SomeClass> SomeClassPtr;

15:

16:

  17: SomeClass.cpp

  18: -------------

19:

  20: SomeClassPtr SomeClass::s_pInstance;

21:

  22: SomeClassPtr SomeClass::getInstance()

  23: {

  24:     if (!s_pInstance)

  25:     {

  26:         s_pInstance.reset( new SomeClass() );

  27:     }

28:

  29:     return s_pInstance;

  30: }

31:

32:

  33: MyCode.cpp

  34: ----------

35:

  36: SomeClassPtr pLocal = SomeClass::getInstance();

37:

  38: pLocal->SomeMethod();

SomeClass is a singleton. This means that only one copy of it can can exist within the system. When an entity wants to use it, it calls “SomeClass::getInstance()”. getInstance() checks if it has a pointer to an instance of SomeObject. If not, it creates one. Then it returns the pointer to the caller. It is important to note that getInstance() will never return a null pointer. The check “if (!s_pInstance)” enforces that s_pInstance can never be null when returned to the caller. So I do not need to check if pLocal is null before calling SomeMethod().

It is important to note that all of this code is under the control of the team. SomeClass is not some external dependency on a third party SDK. We wrote it and maintain it.

Yet the argument was made that the ‘check for null’ was still needed even though getInstance() would never return a null pointer. This is what was being asked for:

   1: SomeClassPtr pLocal = SomeClass::getInstance();

2:

   3: if (pLocal) {

   4:     pLocal->SomeMethod();

   5: }

The argument for adding the null check is two fold:

It is easy to add so why not just add it.
It makes your code more defensive because you cannot trust what getInstance() returns.

Something that is easy does not mean it is the right thing to do. I agree it is better than not checking the pointer, but there is the perception that adding it is a low cost, easy win. This is a perception I believe to be false.

Let’s explore the cost around the rule to have the caller always check for null with a couple scenarios.

Scenario 1: “getInstance() never changes to return a null”

Let’s assume that getInstance() does not change for the life of the product. This means:

We spent time adding null checks each time getInstance() was called, even though they were not necessary
We slowed code reviews when someone forgot to add a null check because we had to enter a defect for it.
- The author then spends time making the change, retesting it, and resubmitting it for review.
- This delays software being sent to QA to validate the engineers modifications.
Most importantly, the null-check adds no value because getInstance() already guarantees never to return a null pointer.
The code is not more readable because of it. Extra null checks adds to the code you need to review when troubleshooting and making changes.
It has become more complex because the null check adds a new execution path through the code.
Unit tests are more complex because to ensure coverage you need to include the test if getInstance() returns a null. Even though we know that getInstance() does not.

Scenario 2: “getInstance() is changed and potentially returns a null”

Now let’s assume that someone does modify getInstance() several years from now to potentially return a null pointer. Will the code crash if the caller does not check the response from getInstance() for a null. It is absolutely possible. You know what? That someone could have done many things to SomeClass. Anything is possible. SomeClass could have been refactored in to a bridge pattern for a remote service for all we know. Perhaps someone did modify getInstance() and break the contract it had with those entities that depended on it. But someone could also leave getInstance() alone and remove the null checks too. The callers to getInstance() are not immune from code refactorings.

The problem with someday/maybe arguments is that they are speculative & presumptuous. You can’t make the argument that someone will do something to only one part of the code and ignore everything else. Anyone with an opposing viewpoint could make the same argument and have equal weight. Someday, someone might change ‘this’ code, and not that code. It may not be caught in code reviews, QA regression tests, or beta tests. It could make it in to production where a user might receive a null pointer and crash the product. Sure it is possible and a null check would catch it potentially averting a crash. Sure the code that uses the pointer does not execute and crash, but what about the objects and systems that depend on SomeClass performing its responsibilities?

Someone, someday willingly alters the code and breaks the contract in a way that might result in a null pointer, and the only way to defend against that person is check for a null pointer. That argument doesn’t hold water for me.

Someone altering the code is not something you can defend against. Why? Maybe that someone is the same person modifying the caller. Maybe someone goes through the code and strips code they feel is unnecessary and that they do not want or need to maintain. Don’t laugh. I’ve met several individuals and teams that have done just that. The argument “someone/someday” is far to vague and hypothetical. Picking one practice over another because ‘someone/someday’ might do something bad is not a good foundation to base a practice on. Remember, in this case we are not writing an interface or an SDK where the inputs are completely foreign and unknown. This is a cohesive, stand-alone product that we control. We know how it works.

Ok, lets run with the argument that ‘someone/someday’ will change SomeClass so that getInstance could return a null pointer. Lets also assume that SomeClass now has a newly introduced internal state that needs to be checked before it can be used. For example, SomeClass now calls a factory to retrieve a pointer to a helper class. Without the helper class, SomeClass cannot fulfill its responsibilities. Lets’ assume that the caller also checks for a null pointer:

1:

   2: SomeClass.h

   3: -----------

4:

   5: class SomeClass {

   6:     public:

   7:         static boost::shared_ptr<SomeClass> getInstance();

   8:         void SomeMethod();

9:

  10:     private:

  11:         static boost::shared_ptr<SomeClass> s_pInstance;

  12:         ObjectAPtr m_pObjAPtr;

  13: };

14:

  15: typedef boost::shared_ptr<SomeClass> SomeClassPtr;

16:

17:

  18: SomeClass.cpp

  19: -------------

20:

  21: SomeClassPtr SomeClass::getInstance()

  22: {

  23:     if (!s_pInstance)

  24:     {

  25:         s_pInstance.reset( new SomeClass() );

  26:     }

27:

  28:     return s_pInstance;

  29: }

30:

  31: SomeClass::SomeClass() {

  32:     m_pObjAPtr = GetFromFactory();

  33:     assert(m_pObjAPtr);

  34: }

35:

36:

  37: MyCode.cpp

  38: ----------

39:

  40: SomeClassPtr pLocal = SomeClass::getInstance();

41:

  42: if (pLocal) {

  43:     pLocal->SomeMethod();

  44: }

In the SomeClass constructor, it acquires a pointer, stores it in m_pObjAPtr. Following the ‘check for null’ rule we assert if it is null. If it is not a release build of the product, the assert will trigger and the user will be notified of the problem. In a release build, asserts are typically turned off. In either case, the code could continue to execute.

Here’s the problem. There are now two conditions to check for a valid instance of SomeClass, the null checks for s_pInstance & m_pObjAPtr. s_pInstance is already checked by the caller and getInstance(). Here’s my question. Where would you add the check for m_pObjAPtr? Remember that the purpose of getInstance is to return a usable instance of SomeClass and the caller cares that SomeClass is usable because it is also validating what getInstance() returns.

(Continued in the next article: “Shields Up! (continued)”)