Shields Up!

March 12, 2010 § Leave a comment

Where do you hang your shield?

I spent the last week designing, writing, and testing a new error reporting system for our product. I spent a lot of time reviewing the design ensuring that I did not introduce any unnecessary dependencies or wrote code that was not required. While coding I intentionally made one omission to the design while knowingly ignoring one of our team rules. I then submitted it for review knowing that someone would reject the change, request that the rule be upheld, and log a defect so that it would be fixed. There will be debate. Eventually other members of the team will be involved because it involves one of our team rules. Sides will be picked and arguments for and against will be made. It will bring to light individual perspectives and debate about what constitutes good design in software.

So what was the single change I made that violated a team rule and opened Pandora’s Box of emotional debate? I did not check for a null pointer.

Is this a bad thing? If you are a programmer, I am sure that you have reached your conclusion already. Of course it is a terrible thing not to check your pointers before you use them. I absolutely agree. So would you be surprised that despite the fact I did not check for a null pointer, I did not violate the rule?

Here is some context.

The rule to check for a null was put in to place to ensure that code does not attempt to utilize a null pointer. If a null pointer was used, the repercussions could not be predicted. If you did not encapsulate it inside a try/catch, the code would likely crash. The ‘check for null’ rule shields you from this undesired behaviour in your code. Check it before you use it. That makes sense.

The question I pose is where to put the check? Where should you hang the shield that protects you for the highest return on investment?

Here is an example of the c++ code I was working with.

1:

   2: SomeClass.h

   3: -----------

4:

   5: class SomeClass {

   6:     public:

   7:         static boost::shared_ptr<SomeClass> getInstance();

   8:         void SomeMethod();

9:

  10:     private:

  11:         static boost::shared_ptr<SomeClass> s_pInstance;

  12: };

13:

  14: typedef boost::shared_ptr<SomeClass> SomeClassPtr;

15:

16:

  17: SomeClass.cpp

  18: -------------

19:

  20: SomeClassPtr SomeClass::s_pInstance;

21:

  22: SomeClassPtr SomeClass::getInstance()

  23: {

  24:     if (!s_pInstance)

  25:     {

  26:         s_pInstance.reset( new SomeClass() );

  27:     }

28:

  29:     return s_pInstance;

  30: }

31:

32:

  33: MyCode.cpp

  34: ----------

35:

  36: SomeClassPtr pLocal = SomeClass::getInstance();

37:

  38: pLocal->SomeMethod();

SomeClass is a singleton. This means that only one copy of it can can exist within the system. When an entity wants to use it, it calls “SomeClass::getInstance()”. getInstance() checks if it has a pointer to an instance of SomeObject. If not, it creates one. Then it returns the pointer to the caller. It is important to note that getInstance() will never return a null pointer. The check “if (!s_pInstance)” enforces that s_pInstance can never be null when returned to the caller. So I do not need to check if pLocal is null before calling SomeMethod().

It is important to note that all of this code is under the control of the team. SomeClass is not some external dependency on a third party SDK. We wrote it and maintain it.

Yet the argument was made that the ‘check for null’ was still needed even though getInstance() would never return a null pointer. This is what was being asked for:

   1: SomeClassPtr pLocal = SomeClass::getInstance();

2:

   3: if (pLocal) {

   4:     pLocal->SomeMethod();

   5: }

The argument for adding the null check is two fold:

It is easy to add so why not just add it.
It makes your code more defensive because you cannot trust what getInstance() returns.

Something that is easy does not mean it is the right thing to do. I agree it is better than not checking the pointer, but there is the perception that adding it is a low cost, easy win. This is a perception I believe to be false.

Let’s explore the cost around the rule to have the caller always check for null with a couple scenarios.

Scenario 1: “getInstance() never changes to return a null”

Let’s assume that getInstance() does not change for the life of the product. This means:

We spent time adding null checks each time getInstance() was called, even though they were not necessary
We slowed code reviews when someone forgot to add a null check because we had to enter a defect for it.
- The author then spends time making the change, retesting it, and resubmitting it for review.
- This delays software being sent to QA to validate the engineers modifications.
Most importantly, the null-check adds no value because getInstance() already guarantees never to return a null pointer.
The code is not more readable because of it. Extra null checks adds to the code you need to review when troubleshooting and making changes.
It has become more complex because the null check adds a new execution path through the code.
Unit tests are more complex because to ensure coverage you need to include the test if getInstance() returns a null. Even though we know that getInstance() does not.

Scenario 2: “getInstance() is changed and potentially returns a null”

Now let’s assume that someone does modify getInstance() several years from now to potentially return a null pointer. Will the code crash if the caller does not check the response from getInstance() for a null. It is absolutely possible. You know what? That someone could have done many things to SomeClass. Anything is possible. SomeClass could have been refactored in to a bridge pattern for a remote service for all we know. Perhaps someone did modify getInstance() and break the contract it had with those entities that depended on it. But someone could also leave getInstance() alone and remove the null checks too. The callers to getInstance() are not immune from code refactorings.

The problem with someday/maybe arguments is that they are speculative & presumptuous. You can’t make the argument that someone will do something to only one part of the code and ignore everything else. Anyone with an opposing viewpoint could make the same argument and have equal weight. Someday, someone might change ‘this’ code, and not that code. It may not be caught in code reviews, QA regression tests, or beta tests. It could make it in to production where a user might receive a null pointer and crash the product. Sure it is possible and a null check would catch it potentially averting a crash. Sure the code that uses the pointer does not execute and crash, but what about the objects and systems that depend on SomeClass performing its responsibilities?

Someone, someday willingly alters the code and breaks the contract in a way that might result in a null pointer, and the only way to defend against that person is check for a null pointer. That argument doesn’t hold water for me.

Someone altering the code is not something you can defend against. Why? Maybe that someone is the same person modifying the caller. Maybe someone goes through the code and strips code they feel is unnecessary and that they do not want or need to maintain. Don’t laugh. I’ve met several individuals and teams that have done just that. The argument “someone/someday” is far to vague and hypothetical. Picking one practice over another because ‘someone/someday’ might do something bad is not a good foundation to base a practice on. Remember, in this case we are not writing an interface or an SDK where the inputs are completely foreign and unknown. This is a cohesive, stand-alone product that we control. We know how it works.

Ok, lets run with the argument that ‘someone/someday’ will change SomeClass so that getInstance could return a null pointer. Lets also assume that SomeClass now has a newly introduced internal state that needs to be checked before it can be used. For example, SomeClass now calls a factory to retrieve a pointer to a helper class. Without the helper class, SomeClass cannot fulfill its responsibilities. Lets’ assume that the caller also checks for a null pointer:

1:

   2: SomeClass.h

   3: -----------

4:

   5: class SomeClass {

   6:     public:

   7:         static boost::shared_ptr<SomeClass> getInstance();

   8:         void SomeMethod();

9:

  10:     private:

  11:         static boost::shared_ptr<SomeClass> s_pInstance;

  12:         ObjectAPtr m_pObjAPtr;

  13: };

14:

  15: typedef boost::shared_ptr<SomeClass> SomeClassPtr;

16:

17:

  18: SomeClass.cpp

  19: -------------

20:

  21: SomeClassPtr SomeClass::getInstance()

  22: {

  23:     if (!s_pInstance)

  24:     {

  25:         s_pInstance.reset( new SomeClass() );

  26:     }

27:

  28:     return s_pInstance;

  29: }

30:

  31: SomeClass::SomeClass() {

  32:     m_pObjAPtr = GetFromFactory();

  33:     assert(m_pObjAPtr);

  34: }

35:

36:

  37: MyCode.cpp

  38: ----------

39:

  40: SomeClassPtr pLocal = SomeClass::getInstance();

41:

  42: if (pLocal) {

  43:     pLocal->SomeMethod();

  44: }

In the SomeClass constructor, it acquires a pointer, stores it in m_pObjAPtr. Following the ‘check for null’ rule we assert if it is null. If it is not a release build of the product, the assert will trigger and the user will be notified of the problem. In a release build, asserts are typically turned off. In either case, the code could continue to execute.

Here’s the problem. There are now two conditions to check for a valid instance of SomeClass, the null checks for s_pInstance & m_pObjAPtr. s_pInstance is already checked by the caller and getInstance(). Here’s my question. Where would you add the check for m_pObjAPtr? Remember that the purpose of getInstance is to return a usable instance of SomeClass and the caller cares that SomeClass is usable because it is also validating what getInstance() returns.

(Continued in the next article: “Shields Up! (continued)”)

Journeyman

Shields Up!

Scenario 1: “getInstance() never changes to return a null”

Scenario 2: “getInstance() is changed and potentially returns a null”

Leave a comment Cancel reply

Journeyman

Shields Up!

Scenario 1: “getInstance() never changes to return a null”

Scenario 2: “getInstance() is changed and potentially returns a null”

Share this:

Related

Leave a comment Cancel reply

What’s this?

meta