Xing \ Creative \ Coding

Web Application Software Development

The examples below are in PHP, but a knowledge of any OOP language should suffice in understanding the examples.

Agile programmers like to throw around their acronyms. Sometimes it’s just to make them look smart, but the better purpose is so that we can communicate a plethora of information without taking up a lot of time. SOLID programming is one of those acronyms that contains quite a few principles about code design that are foundational to good OOP programming.

Single Responsibility Principle

To start us off, the first letter “S” stands for Single Responsibility Principle. Perhaps the easiest way to understand it is by knowing its opposite. The opposite of this principle would be the god-object, which does everything. A god-object might format your strings, open the connection to the database, query for results, and handle business logic. The god-object is a jack of all trades and master of none. The Single Responsibility Principle in contrast states that your objects should only do one thing (and, if I might add, do it well).

Why is this important?

Let’s say that you have an e-mailing class MailMessage in your open source project. The class is, initially, just an OOP wrapper for the mail() method in PHP with methods like addRecipient(name,email), which streamline the composition of the e-mail. However, you start to realize that various platforms and requirements are going to require very different ways of sending e-mail:

  • Unix sendmail requires headers to be separated with \n.
  • Windows MTAs can’t handle the name in the to parameter for the mail() method.
  • Some people will likely need to send e-mail via SMTP.
  • It would be useful for unit tests to mock the sending process but not actually send e-mail.

So, in thinking about this problem, we have discovered two responsibilities. The first responsibility is the interface by which a user builds the e-mail, and the second responsibility is how that message is sent. Thinking of the Single Responsibility Principle we decide that we are going to separate these responsibilities into two separate classes: MailMessage and MailSender.

The MailMessage class is how the user builds the e-mail. This is unlikely to ever need to change. So, we’ll build this as a concrete class. However, we’ve already identified that our other responsibility–how to send the e-mail–is likely to change a lot. Just by separating these responsibilities, and knowing where things are likely to change, we’ve somewhat stumbled into the Open/Closed Principle (the “O” in SOLID).

Open/Closed Principle

If you write the MailMessage class with \r\n between headers, and someone using sendmail needs \n, they are going to have to directly modify your MailMessage class. The Open/Closed Principle reminds us that we want to avoid that scenario, because once they modify your class they can’t update your library to the latest without breaking their implementation. After we complete our MailMessage and MailSender classes they should be “closed to modification,” but “open to extension”. Anything that may need to change for different implementations should have an avenue of extension that does not require the original classes to be modified.

The obvious avenue of “open to extension” is inheritance. If you can inherit the class and modify what you need to modify, you can do whatever you want. However, that has problems of its own when the parent class’s behavior changes unexpectedly, so we don’t want to force users’ to extend our class in order to modify it’s behavior. In this case, we already know where the extension points need to be, which is that we need multiple instances of MailSender. To give an easy method of extension without modification, we are going to use another SOLID principle–out of order this time–Dependency Inversion Principle

Dependency Inversion Principle

The Dependency Inversion Principle states that our high-level, library class MailMessage should not depend on the low-level implementation details of our MailSender class. Instead, both the library and the client code should depend on abstractions. So, to avoid implementation details in our high-level class, we create an interface: IMailSender. Then, both our MailMessage class and our implementations will depend on this abstraction but not have any knowledge about each other. Here’s a rough concept of what that would look like:


interface IMailSender {
    /**
      * Returns true on success, or false on failure
      * @return bool
      **/
    public function send( $to, $subject, $message, array $headers );
}

Then, in our MailMessage class, we implement dependency-injection:


class MailMessage {
    private $_sender;
    public function __construct( IMailSender $sender ) {
        $this->_sender = $sender;
    }
}

So, that takes care of the SO**D, but unless you’re British, SOD isn’t a useful acronym. So, what about the “L”, and the “I”?

Now that we have an interface IMailSender, and we have dependencies in our MailMessage, we can discuss these two items.

Liskov Substitution Principle

I put a comment in the interface that it returns true on success and false on failure. Any implementation and any subtype to those implementations should be able to be “substituted” into the MailMessage class without altering the correctness of the MailMessage class. In this simplistic example, if an implementation started returning “success”/”failure” instead of true/false, it would alter the behavior of the program, violate this principle, and break the MailMessage implementation.

Now, this is where I stick in my shameless plug about strongly-typed languages being better, because in a strongly typed language like C#, trying to return a string from a bool method would cause a compile-time failure. However, PhpStorm and possibly some other IDEs do try to give intellisense that communicates the same problems when you utilize DocBlocks.

Interface Segregation Principle

In a nutshell, interfaces should be small and not have a lot of methods that an implementation has to define (or throw NotImplementedException for). In our case, we’ve done this. An implementation will only need to implement the send() method. If we had added a method setHeaderDelimiter() to our interface, this might be good for the default mail() implementations, but would not be needed or used by the SMTP implementations which would always use \r\n. Those implementations would have no need for such a method, so it’s better to leave it out and handle such concerns in the objects they pertain to.

Conclusion

It’s not the end of the world if you don’t follow all of these principles all of the time. These principles can also add to the complexity of your application, so if You-Aren’t-Going-to-Need-It (YAGNI), it may not be worth the time investment. However, it is a good idea to fully understand the principles so that you know which ones you are breaking, why you are breaking them, and possible code smells that can result as the application changes.

March 24th, 2015

Posted In: Software Design

Tags: ,

Leave a Comment

I was reading up on some interview questions on Scott Hanselman’s blog today, and in the comments I saw someone list Singleton and Service Locator as Anti-Patterns, thinking, “Why would someone go against the almighty Martin Fowler?” These are very common patterns of development. I can kind of see how the Singleton Pattern might be an Anti-Pattern–as it forces a dependency on a static method to instantiate the class. However, one of the very reasons that I don’t like this aspect about the Singleton Pattern is because it makes it unusable with some other patterns, like the Service Locator…the Service Locator can’t instantiate it. In any case, I was mildly surprised to see that the Service Locator Pattern was viewed by some as an Anti-Pattern, and wondering how I’d never come across the anti-SL ravings before.

What it seemed to come down to was that–with the Service Locator pattern–you get runtime errors instead of compile-time errors. Even more problematically, if you are using an API designed by someone else, you may not know that you need to configure a JsonSerializer for their library to work (or something to that effect). In such a case, being required to pass a JsonSerializer into the constructor of the API’s object would be more clear. Okay, that makes sense. I can see how being more explicit in what your object requires, especially when developing an API, is useful…and how using the Service Locator might be viewed as a code-smell. I do know that I have a grudge against Zend Framework 2. They seem to haphazardly create services on the fly. It seems that almost every ZF2 object is accessed via the service locator, and it just hasn’t seemed very developer-friendly. So, I can totally get on board with the idea that the Service Locator Pattern can have a code-smell. However, I’m still not sure I’m willing to call it an anti-pattern.

If you’ve ever setup database connection details in a config file (e.g. web.config), then–in essence–you were utilizing the Service Locator Pattern to specify the driver and connection details. To say that connecting to a database is an anti-pattern seems a bit silly. However, it has all the same problems. If you don’t know that the module requires a database connection, so you don’t configure it or you configure it incorrectly, then you get runtime errors instead of compile time errors. So, it has all the same potential for “code smell” that the Service Locator pattern has in terms of external modules, but to my recollection we don’t call connecting to the database an anti-pattern.

Personally, ZF2 aside, I still like using the Service Locator. Call it a personal weakness. When using something like Zend Framework 2, I don’t have control over the instantiation of the Controller classes. So, I can’t inject my services into the constructor of the controller without rewriting Zend. The Service Locator is also extremely useful in handling Convention over Configuration dependencies that can still be overridden with configuration. For instance, given the URL /user/register, we can dynamically get the dependency for the string ‘UserController‘. If none is found, we can look for the namespace ‘Ui.UserController‘, but the developer can easily define ‘UserController‘ to be assigned to ‘Modules.Authentication.Controllers.UserController‘ by explicitly giving the Service Locator a definition of ‘UserController’ so that the router doesn’t have to fallback to the default convention.

In the end, I think ANY pattern can create a code-smell if used carelessly. Imagine the mess you could make by doing almost everything with the Decorator Pattern. Heck, we could even write about all the problems that can arise from inheritance, but the problem isn’t inheritance, which is a useful and needed pattern…the problem is the implementation. So, I’m not ready to write-off the Service Locator quite yet. However, I will try to keep in mind this potential problem, especially if I ever develop an API that is going to require the consumer to provide the implementation.

February 10th, 2015

Posted In: Software Design

Tags: , , ,

Leave a Comment

Let’s say that your server runs in Central Time but it logs data for facilities all across the US. 99% of the time you won’t have any problem. You’ll grab the Date/Time out of the database, and if displaying to a facility on the East Coast, you convert the Date/Time to Eastern for the user to see. It’s all good. However, you set yourself up for a very big headache in the Fall.

In the Fall, the clocks will move back, which means that your server will experience 1am – 2am twice. If MySQL stored the offset in the DateTime field (e.g. -05:00 before the change and -06:00 after the change), then this wouldn’t be a problem. However, since it doesn’t, when you pull a log out of the table and get the time 1:30am, you have no way of knowing which 1:30am it is, the one before or the one after the change.

If only you had stored the times in UTC you would know which was which.

However, we have a bigger problem. Your Easter timezone facilities’ clocks don’t change back at the same time…they change back at their own 2am. This means that when you translate back and forth between Central and Eastern time, you are going to run into some odd glitches illustrated below. The “Time In” is the valid time sent from the Eastern timezone facility, the “Time Stored” is the time that it gets converted to on the local machine before storage. The “Time Out” is the translation back to local machine time when it pulls it out of the table, and the “Display Time” is the end-result that you display in your logs for the Eastern timezone facility.

Time In Time Stored Time Out Display Time
01:00 EDT 00:00 CDT 00:00 CDT 01:00 EDT
01:30 EDT 00:30 CDT 00:30 CDT 01:30 EDT
01:00 EST 01:00 CDT 01:00 CDT 01:00 EST
01:30 EST 01:30 CDT 01:30 CDT 01:30 EST
02:00 EST 01:00 CST 01:00 CDT 01:00 EST
02:30 EST 01:30 CST 01:30 CDT 01:30 EST
03:00 EST 02:00 CST 02:00 CST 03:00 EST

You’ll notice that you’re all good up until you get to 2:00am EST. Your local machine converts it to local time as 01:00 CST, which is correct. However, since MySQL doesn’t store the offset, when your local machine pulls it out of the database table, it translates it to 1:00am CDT since it doesn’t know which. This results in your Eastern timezone logs having 3 instances of 1am – 2am instead of 2, and a complete loss of the logs from 2am – 3am.

Therefore, if you are working with multiple timezones, or your business might grow to include multiple timezones…start things off the right way. You can avoid this problem by storing your DateTimes in any format that is immune to DST changes. So, the following will all work:

  • Store your Date/Time in UTC. This is the “best-practice” standard.
  • Store the offset as part of the DateTime (e.g. 2014-11-02T02:00:00-05:00). SQL Server has a DateTimeOffset column type, but for MySQL you would have to use a VARCHAR column. Using VARCHAR increases your storage space requirements while also reducing the inherent functionality provided with the DateTime column, so I don’t recommend it.
  • Store in a Unix Timestamp format. This is functionally equivalent to storing in UTC, and is great if you’re storing the time for “right now”, but it is not good for something that needs to store past or future Date/Times because it’s only good for 1970 – 2038. Also, when doing manual queries into the database just to look at the data, timestamps are not human-readable, so I never use them for anything myself.

It can be a bit of a pain getting used to storing data in UTC and then translating it into the display timezone later. However, it avoids many pitfalls, and the work you do up-front to get used to it will pay off in the long run.

December 18th, 2014

Posted In: Software Design

Tags: , , ,

Leave a Comment