A couple weeks ago I gave a presentation to my team, based on Malcolm Gladwell’s Outliers: The Story of Success. I focused on his analysis of mitigated speech (when we downplay or sugarcoat the meaning of what we say, usually out of deference to authority). Mitigated speech has been a key factor in numerous airplane crashes, when the crew did not voice safety concerns clearly enough to their captain.
Gladwell’s examples from black box recordings are astonishing. In many cases everyone in the cockpit except the captain knew the plane was in serious trouble, but even then wouldn’t speak up forcefully (typically the captains were good pilots, but were exhausted and weren’t picking up the hints from their crews). They’re powerful examples of the importance of being assertive, and just how hesitant most of us are even when it’s obviously vital to speak up. And my presentation was, I hope, more entertaining than some stereotypical HR assertiveness training class.
The last couple slides show similar examples from The Clean Coder: A Code of Conduct for Professional Programmers, about the importance of saying no and not being passive-aggressive.
Here are my slides (posted on SlideShare.net)
A standard practice of Agile programming, and good programming in general, is to separate your display code from your application logic. Bob Martin:
Today’s modern programming environments make it possible to put many different languages into a single source file. For example, a Java source file might contain snippets of XML, HTML, YAML, JavaDoc, English, JavaScript, and so on… This is confusing at best and carelessly sloppy at worst.
The ideal is for a source file to contain one, and only one, language. Realistically, we will probably have to use more than one. But we should take pains to minimize both the number and extent of extra languages in our source files.
Separating display code is central to the MVC pattern, which underlies almost all PHP frameworks (and frameworks in many other languages as well). WordPress is a blogging platform and arguably a CMS, but for plugin authors, it does not provide the typical tools of a development framework. If you want to separate out your display layer, you’re on your own.
As a result, very few plugin developers do. A common style for WordPress plugins is a single file, sometimes running thousands of lines, that consists of cryptically named functions (often in no discernible order), which have the display code scattered around within the core application logic. This style might be acceptable for something small and simple. But once the functionality becomes complex, this style makes the code very difficult to understand for anyone other than the author (and the author is likely to find it hard to understand six months after finishing work on it). It also makes extending the functionality difficult. For example, adding support for a mobile UI would entail a major rewrite.
For a WordPress plugin, you can use output buffering to solve this problem. You can see an example in my ShashinMenuDisplayerAlbum class. The run() method triggers the display of the menu, and its HTML is in a separate file:
ob_start(); // start the output buffer
require_once($this->relativePathToTemplate); // get the HTML template
$toolsMenu = ob_get_contents(); // store it in a variable
ob_end_clean(); // empty the buffer and turn it off
Normally a require_once call would include the file, evaluate any PHP code, and immediately output the HTML. Including the file with output buffering turned on allows you to instead store all the output in a variable. You can then control when it is displayed. With this approach you do not have to write all your HTML inline with your application logic.
The template file being included in my example does have PHP as well as HTML in it, so I am not meeting the ideal of complete separation. But as Bob Martin points out, such total separation isn’t always possible. In this case, the PHP code used in the template is the minimum necessary for the purpose of rendering the page (for example, it contains a foreach loop to display rows of data in a table).
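As a rough, self-contained sketch of the same idea (this is not the Shashin code): the renderTemplate() helper, the $rows variable, and the temporary template file below are all hypothetical names made up for illustration. It uses require rather than require_once so that the same template could be rendered more than once in a request, and ob_get_clean() as a shorthand for ob_get_contents() followed by ob_end_clean().

```php
<?php
// Hypothetical helper: render a PHP/HTML template file into a string
// instead of sending it straight to the browser.
function renderTemplate(string $templatePath, array $rows): string
{
    ob_start();               // start capturing output
    require $templatePath;    // the template can see $rows
    return ob_get_clean();    // return the captured HTML and end the buffer
}

// For the demo only: write a small template to a temporary file.
// In a real plugin this would be a separate file shipped with the plugin.
$templatePath = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents(
    $templatePath,
    "<table>\n"
    . "<?php foreach (\$rows as \$row): ?>\n"
    . "<tr><td><?php echo htmlspecialchars(\$row); ?></td></tr>\n"
    . "<?php endforeach; ?>\n"
    . "</table>\n"
);

$html = renderTemplate($templatePath, ['Album A', 'Album B']);
unlink($templatePath);

// Nothing was sent while the template ran; the caller decides when to output it.
echo $html;
```

The only PHP left in the template is the foreach loop and the echo, which is about the minimum needed to render rows of data.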
Ron Jeffries wrote a post a couple days ago with a title indicating it’s about databases, but then he proceeds into a test-driven development (TDD) coding exercise. Halfway through, an ostensible reader interrupts:
Hey! I thought this was supposed to be about databases???
Hey, yourself. Try to pay closer attention. This is totally about databases.
Why is this even remotely about databases???
Well, here’s why:
We have just defined a core element of our member database record, namely the purchase count, and made sure that it works. Now when we read a member record in from our database, we can instantiate it into our class MemberRecord and send it messages to decide what to do. This is OO here, my young padawan, and that’s how we do it.
Yeah, sure, old man, but what about the membership number? That’s the key, according to the story, and you don’t even have it. And isn’t there just a small matter of actually storing and retrieving these babies?
Patience, youngling. Each thing comes in its time. Let’s see what the next tests bring us.
His post is a good illustration of the revelation I had when first learning Agile coding practices: they let you start in the middle, with the business logic. A common traditional approach is to instead define a database schema first, and then deal with the UI and business logic. There are two problems with this traditional approach: (1) the database is the hardest layer to change, and your design is likely most volatile in its early stages, and (2) it makes it that much easier to end up tightly coupling the components of your application together.
By starting with the business logic and no supporting infrastructure other than your tests, you have no choice but to fully decouple the application logic from the UI and the database. Then when your boss says, “we need to make a mobile version of this app by Friday!” you can work on the new mobile UI design without having to cut through a tangled thicket of blended UI and business logic.
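To make "starting in the middle" concrete, here is a minimal sketch in PHP, loosely modeled on the purchase count in Ron Jeffries’ exercise. The method names (recordPurchase, getPurchaseCount) are my own assumptions, not his code, and the check at the bottom is a plain-PHP stand-in for a real unit test rather than any particular test framework.

```php
<?php
// Hypothetical business-logic class: no database or UI code anywhere in it.
class MemberRecord
{
    private $purchaseCount;

    public function __construct(int $purchaseCount = 0)
    {
        $this->purchaseCount = $purchaseCount;
    }

    public function recordPurchase(): void
    {
        $this->purchaseCount++;
    }

    public function getPurchaseCount(): int
    {
        return $this->purchaseCount;
    }
}

// Stand-in for a unit test: exercise the logic with no supporting infrastructure.
$member = new MemberRecord();
$member->recordPurchase();
$member->recordPurchase();

if ($member->getPurchaseCount() !== 2) {
    throw new RuntimeException('Expected a purchase count of 2');
}
echo "purchase count test passed\n";
```

Nothing here knows how a member is stored or displayed, so the storage and UI decisions can be made later without touching this class.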
Having said that, there are challenges with mimicking complex interactions with the database and UI layers through unit tests. I consider myself a novice with Agile techniques, but I’ll share what I’ve learned so far:
Toppa.com has become a quiet place in recent months. I’ve been spending what time I have on coding Shashin 3, so I haven’t been blogging. I have been applying Agile coding practices to WordPress plugin development, and it’s more than I bargained for. I could write a book about it, but I’ll probably never have time, so I’ll see how far I can get with a series of blog posts instead. Here are some key points I hope to expand on in upcoming posts:
I hope to get SimpleTest for WordPress up at wordpress.org soon, so I’ll have more to say about testing in WordPress next.
Over the past several months I’ve been following Agile coding principles in my work, as described by Bob Martin and others (see, for example, Clean Code, Agile Software Development, and Growing Object-Oriented Software, Guided by Tests). Applying these principles in PHP presents some challenges. Applying them to WordPress plugin development presents even more challenges. This is the first in a series of posts on how I’m dealing with those challenges. My Google searches about PHP and Agile coding don’t turn up much, so I figure I can break some new ground 😉 . But I don’t consider myself an expert – feedback is welcome.
If you follow the Agile principle of small methods and small classes, your projects will consist of a large number of small files. In Java, this doesn’t matter much – the project is compiled before it’s deployed. But PHP is a scripting language, so it’s compiled on the fly. include and require statements can have a moderately expensive performance cost (see here and here). So what’s the best way to include your files for code readability, and for efficiency?
require_once vs require
I always use require_once to include a required file. I don’t use include, which allows the compilation to proceed even if the file is not found (a class file isn’t going to be optional). require_once has a reputation for being slower than require, but this benchmarking indicates otherwise (I wouldn’t consider that an exhaustive study, but it’s enough that I’m not going to lose sleep over it). This way I don’t have to worry if a file has been included somewhere else already. If you have lots of small classes, used in various places, require_once can save you from that headache.
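A small sketch of the headache being avoided, under stated assumptions: HypotheticalGreeter and the temporary file are invented for the demo. Calling require_once a second time on the same file is a harmless no-op, whereas a second plain require of a class file would stop with a fatal "cannot declare class" error.

```php
<?php
// Write a tiny class file to a temporary location so the demo is self-contained.
// HypotheticalGreeter is an illustrative name, not part of any real project.
$classFile = tempnam(sys_get_temp_dir(), 'cls');
file_put_contents($classFile, <<<'CODE'
<?php
class HypotheticalGreeter
{
    public function greet(): string
    {
        return 'hello';
    }
}
CODE
);

$first = require_once $classFile;   // loads the file and declares the class
$second = require_once $classFile;  // already loaded: PHP skips it and returns true

$greeter = new HypotheticalGreeter();
echo $greeter->greet(), "\n";

unlink($classFile);
```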
require_once calls
There are three options:
1. Include all your project’s class files in the initializing script.
2. Include all the class files a given class depends on at the top of that class file (before the class line). This is common practice in languages I’m familiar with. From a readability perspective, the dependencies are clearly stated at the top of the class file. From a performance perspective, you’re only loading what that class needs. And from a flexibility perspective, that class can now be used outside the context of any particular calling script (as long as you package it with what it depends on). Note that if you are doing dependency injection, you’ll be passing in already instantiated objects to the dependent class, so your file includes will be one step removed from the class where the objects are actually used. The simplest way to think about this is to include the class file dependencies wherever there are new calls on those classes.
3. What I used to do: put the require_once statement for a class file on the line before calling new on that class, regardless of where I was in the code. There is no technical problem with doing this – PHP will read the class into memory – it doesn’t matter if you’re inside a method. I did it in pursuit of improved performance: dependent classes are loaded only exactly when they are needed. The problem, however, is that the dependencies are now buried in the code of the class, making the reader of your code do more work to find them (and risking missing them, which leads to bugs…). A tenet of Agile programming is that you optimize for readability first, and that you sacrifice that readability for performance only when there is a demonstrable performance problem to address. What I was doing is an example of premature optimization (and the “gut feeling” optimizations many of us do, like I did, often turn out to not be optimizations at all when you profile the actual performance).
If you do have performance problems, file includes are probably not the first place to look. Database queries are the more common culprit. But if you do need to reduce your hits on the filesystem, you should look at opcode cachers, such as APC (or, for WordPress, the W3 Total Cache plugin), before you contemplate writing hard-to-read, hard-to-maintain God objects.
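Here is a rough sketch of the "include wherever you call new" rule of thumb combined with dependency injection. Every class and file name is hypothetical, and the temporary directory exists only so the example can run on its own. In a real project the two require_once lines would sit at the very top of the calling file.

```php
<?php
// Build two small class files in a temporary directory so this sketch runs
// by itself. All class and file names here are hypothetical.
$dir = sys_get_temp_dir() . '/require_demo_' . uniqid();
mkdir($dir);

file_put_contents($dir . '/HypotheticalValidator.php', <<<'CODE'
<?php
class HypotheticalValidator
{
    public function isValid(string $value): bool
    {
        return $value !== '';
    }
}
CODE
);

file_put_contents($dir . '/HypotheticalShortcode.php', <<<'CODE'
<?php
// No require_once here: this class never calls "new" on the validator,
// it receives an already instantiated one.
class HypotheticalShortcode
{
    private $validator;

    public function __construct(HypotheticalValidator $validator)
    {
        $this->validator = $validator;
    }

    public function accepts(string $value): bool
    {
        return $this->validator->isValid($value);
    }
}
CODE
);

// The calling code is where "new" happens, so its includes are stated together.
require_once $dir . '/HypotheticalValidator.php';
require_once $dir . '/HypotheticalShortcode.php';

$shortcode = new HypotheticalShortcode(new HypotheticalValidator());
var_dump($shortcode->accepts('shashin'));

// Clean up the temporary files.
unlink($dir . '/HypotheticalValidator.php');
unlink($dir . '/HypotheticalShortcode.php');
rmdir($dir);
```

A reader of the calling code sees both dependencies in one place, and the injected class stays free of include statements for objects it does not create.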
Note: this is a revised version of the original post.
Over the past several months I’ve been following Agile coding principles in my work, as described by Bob Martin and others (see, for example, Clean Code, Agile Software Development, and Growing Object-Oriented Software, Guided by Tests). Applying these principles in PHP presents some challenges. This is the first in a series of posts on how I’m dealing with them. My Google searches about PHP and Agile coding don’t turn up much, so I figure I can break some new ground 😉 . But I don’t consider myself an expert – feedback is welcome.
If you follow the Agile principle of small methods and small classes, your projects will consist of a large number of small files. In Java, this doesn’t matter much – the project is compiled before it’s deployed. But PHP is a scripting language, so it’s compiled on the fly. include and require statements can have a moderately expensive performance cost (see here and here). So what’s the best way to include your files efficiently?
require_once vs require
I always use require_once to include a required file. I don’t use include, which allows the compilation to proceed even if the file is not found (a class file isn’t going to be optional). require_once has a reputation for being slower than require, but this benchmarking indicates otherwise (I wouldn’t consider that an exhaustive study, but it’s enough that I’m not going to lose sleep over it). This way I don’t have to worry if a file has been included somewhere else already. If you have lots of small classes, used in various places, require_once can save you from that headache.
require_once calls
The simplest option, and worst from a performance perspective, is to include all your project’s class files in the initializing script. Any given execution path through your project likely only requires a subset of your classes. Loading up all of them for every single http request is unnecessary and may impact performance. It’s also problematic in that dependencies between classes are not evident to someone reading your code. That is, if you tried to use one of the classes in a different project, and that class has dependencies on other classes, the class would fail to compile.
A second option is to include all the class files a given class depends on at the top of that class file (before the class line). This is better, and in PHP code I’ve seen by others, it seems to be a fairly standard practice. It’s also good to see all the dependencies clearly stated at the top of the class file.
A third option is what I’ve been doing, which is to put my require_once statement for a class file on the line before I call new on that class. For example:
class Shashin {
    ....
    public function validateShortcode($shortcode) {
        require_once('Shortcode/ShashinShortcodeValidator.php');
        $validator = new ShashinShortcodeValidator($shortcode);
        $validatorResult = $validator->run();
        ....
    }
    ....
}
I used to think this approach would cause PHP to think the included class was a child of the function where it was included, but that is not the case. The class is simply read and put into memory.
The potential downside of this approach is that you have to read through the file to see the dependencies, but so far I haven’t had a problem with that, because my classes are small 😉 . I like it because I’m making PHP hit the filesystem only when necessary – you don’t always need every possible dependent class for every execution path through a given class. Sometimes there is a trade-off between code readability and performance, but in this case, the impact on readability feels trivial.
In the case of Shashin (my WordPress plugin I’m currently updating), I have a main class where the various WordPress hooks are invoked (which in my experience is the only rational way to deal with WordPress). Therefore, that main class acts as a controller for the application, and ends up including almost all the other classes at some point, from within its various methods. So I do not want to just include them all at the top of the file, as only a handful are needed for any given hook.
The one case where I do include a dependent class at the top of a class file, is when it is a concrete class implementing an abstract class (or interface). I’ll include the abstract class at the top of the concrete class. There’s no reason to make the caller of the concrete class keep track of the dependency on the abstract class.
Lastly, if you do have performance problems, file includes are probably not the first place to look. Database queries are the more common culprit. But if you do need to reduce your hits on the filesystem, you should look at opcode cachers, such as APC, before you contemplate writing hard-to-read, hard-to-maintain God objects.
Something that set today apart from yesterday is that I had opportunities to talk with several of the speakers. When I arrived, Chet Hendrickson and Ron Jeffries were sitting by themselves at a breakfast table, so I joined them. They were having a conversation about human nature, and whether it doomed us all to failure (heavy stuff for 8AM). By the end of the conversation we were discussing doing Agile in University settings… and unusual experiences with plumbing.
I had lunch with Eduardo Jezierski, who just before lunch gave a riveting presentation: Architecture and Agility with Lives at Stake. He had many stories to tell. One was about developing, in real time, a system to save lives after the earthquake in Haiti. He had a product owner in a tent at the airport in Haiti, with a low-bandwidth connection, tweeting user stories as he received calls and people came to him with information. He was right near the runway for days, with C-130s constantly landing with supplies and taking off again. They were able to broadcast SMS messages to people with cell phones, telling them to text a specific number if they were trapped or injured and needed help, or knew someone who did. His team was developing a system to get those incoming messages into an RSS feed that could be accessed by local responders. It was also fed back to the Haitian diaspora in the US, where volunteers who knew the neighborhoods used Google Maps to help pinpoint locations where people were trapped, based on descriptive information. He said in situations like this, they’ve had to get help for their programmers who develop post-traumatic stress disorder. When you receive a message like “I’m pregnant, I’m trapped in a collapsed building, and I’m hurt” and you’re trying to develop code in real time to get that message where it needs to go… well, you can get just a bit stressed. I was fascinated by the work his organization did in Haiti and elsewhere. They have often turned into the facilitators between governments, NGOs, and local communities, as they try to leverage IT to solve local health problems and address emergencies.
The other talks I attended were also good. But it’s past my bedtime, so I’ll skip summarizing them for now, and close by mentioning I attended my first Philly WordPress Meetup after the conference. It was a great group. I met the organizer, Brad Williams, and we talked about the next Wordcamp Philly this Fall. It’s still a ways off, but I may give a talk about my current experiences with applying Agile coding practices to WordPress plugin development.
The first day of the Philadelphia Emerging Technologies for the Enterprise conference was great. Molly Holzschlag‘s Keynote gave us some peace, love, and understanding, at least as far as web standards are concerned (she even made references to the Age of Aquarius…). After that, it was a challenge to choose from among the simultaneous sessions throughout the day. I’m there not just for me, but also for my team at SOMIS. They have a variety of interests, so I made sure to attend at least one session on each major theme of the conference (Agile, mobile design, management, languages, and infrastructure).
I’ll focus on my favorite talk, by Chet Hendrickson and Ron Jeffries, who years ago helped define many Agile practices. They presented A Retrospective Rant: 15 Years in the “Agile” Business. They’re both pleased that Agile has become a success and has gone mainstream, but also unnerved that it’s so mainstream that the stodgy Project Management Institute is going to offer scrum certifications (imagine Emperor Palpatine realizing the rebels actually aren’t so bad, and joining forces). Their presentation was very lively, and at times reminded me of Abbott and Costello, as they constantly played off each other. They focused on the most common failings they’ve seen in teams that are trying, but not fully succeeding, with Agile:
The other talks I attended were all very good:
I’m looking forward to Day 2!
It’s been over a month since I blogged or tweeted. Aside from this post, it’ll probably be another month before I do so again. I’d especially like to apologize to the people looking for help with Shashin and my other plugins, as I have not been responding to support requests (for my plugin users, please see this post).
As I mentioned back in the Spring, I’ve been leading our web team’s transition to scrum. Since then we’ve been working with Agile/scrum training coaches Bob Hartman and Darian Rashid, and they’ve done an amazing job helping us make the transition a successful one.
Before starting with scrum we had poor visibility into our future work – planning was extremely difficult. Now we’re getting better visibility, and it’s something of a “be careful what you wish for” situation. I’ve been working nights and weekends for the past month, getting a handle on all our projects and our schedule, so I can manage expectations for both my team and for our stakeholders. Work is the first thing I think about when I wake up, and the last thing I think about before I go to sleep at night. It’s going to stay that way for at least a few more weeks (possibly months), as we get through this transition.
We have several goals: improving quality, teamwork, etc. But our first is to improve our planning: to align our workload with our actual capacity, establish a sustainable pace, and create reliable expectations for our stakeholders. With scrum’s velocity measures and other metrics, my ultimate goal is to clearly demonstrate to our stakeholders what our team already knows: that we do an incredible amount of quality work with a very small staff, and that if we’re expected to do even more, we need more people.
UPDATE: A disclaimer to calm some readers who were worried: this post is by no means an endorsement of Rumsfeld. I’m just taking a peculiar statement of his and using it as a way to think about scrum (and I just thought it was funny).
“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don’t know. But there are also unknown unknowns. These are things we do not know we don’t know.”
– United States Secretary of Defense Donald Rumsfeld, February 12, 2002
It’s not an accident the name of my department at U Penn’s School of Medicine is Information Services, not Information Technology. We develop software, and for that we are adopting scrum, as I mentioned in an earlier post. But we also provide services, such as meeting various ad hoc reporting needs for our clients. Until now we have tended to be more reactive than proactive, in that we typically wait for service requests to come to us (and they often come on short notice) and then we scramble to fulfill them. Our current workflow is all about multi-tasking and being interrupt-driven, both of which are productivity killers. In a certain sense we are extremely far out on the agile spectrum, in that the idea of locking down requirements and schedules for even as short a duration as a 2 week sprint will be introducing a lot more workflow structure than we’re accustomed to.
To make the change, we need to become more proactive than reactive. Do all those ad hoc requests really need to be quite so ad hoc? Can we do more to anticipate service needs and plan better? This is where Rumsfeld’s seemingly mystifying statement has relevance:
A notion we’ll be testing as we move to scrum is whether we can move a lot of our current known unknowns into the known knowns column. To continue with the report example, if we know when it will be run, we could put some slack around it in the schedule, since it’s an important report. We also could do a test run of the report ahead of schedule and review it with the client to see if the numbers add up as expected, to minimize the risk of a last minute surprise. This will require us to be more proactive with our clients, in that we’ll need to seek out and identify upcoming service needs and incorporate them into our sprint planning. We also need to involve our clients more in the overall quality equation, such as asking (and expecting) them to test and review important reports ahead of their due dates. This calls for a culture change for some of our clients, which I expect will be the most significant challenge for us in adopting scrum.
With unknown unknowns such as the Dean asking for unexpected work, we may never be able to control them entirely, but with the tools scrum gives us (such as burndown charts) we can at least more clearly illustrate for our clients how ad hoc requests are impacting their project schedules. It’s also my belief that work requests that are truly impossible to anticipate don’t actually happen very often. The harder cases are bugs that take hours to unwind, or code changes in one place that have unexpected consequences somewhere else. We can reduce these over time by introducing technical practices such as test driven development and regression testing. Both of these will require training and the development of a testing infrastructure, so it will take a while to get there, but we can get there. In the meantime, I’m planning to not have our scrum teams completely fill their sprint time in their sprint planning. We’ll leave some time in the schedule for handling unexpected work.
You might be wondering about unknown knowns – things we don’t know we know. That gets us into the realm of repressed childhood memories and past lives, and I’m hoping that such things won’t impact our sprint planning…