If you want to hear about a real example of pure meritocracy, you should listen to Tarus Balog on Linux Link Tech Show episode 343. This guy has truly understood how to lead an open source project, and a number of really interesting topics, such as Open Source Marketing, are discussed. (VCs who want to invest $2,000,000 in a company are also welcome to listen to the podcast episode.)
If you have open sourced a (possibly good) product but have not been able to gather a community around it, then I urge you to listen to the episode.
Tarus Balog, keep up the good work, you have clearly understood what a number of supposedly open-source companies totally miss ! Good work pays off in the long run !
Martin Fowler published an informal survey of version control tools among ThoughtWorkers. Of course, the big winner is git, and anyone who has made the effort to learn it properly would confirm that.
BTW, if you want some help convincing people that git is better, do not hesitate to take a look at why git is better than X.
On a related note, you might want to listen to FLOSS Weekly episode 111 (direct link here), where the CMake lead developer talks about his wonder build tool (I’ll never ever use autoconf again if I ever need to write some C), which is now developed using git.
As I explained in a previous post, Domain-driven Design (DDD) is a design principle I strongly believe in.
With more and more evidence of systems/companies switching to NoSQL for scalability reasons, creating a rich domain model becomes less and less optional if you don’t want to shoot yourself in the foot. Indeed, while traditional applications often rely on the database to enforce integrity and referential constraints, this is no longer an option with NoSQL because of the trade-offs described by the CAP theorem.
So, this means enforcing constraints becomes solely the application’s responsibility, which is, IMHO, a good thing. Validation naturally belongs to the domain layer, and once you go through the trouble of transforming your POJOs/anaemic domain model into a rich domain model, you will certainly start adopting more and more DDD principles.
Responsibilities thus become clear : the storage layer handles the (possibly distributed) persistence, and the domain layer handles the domain-specific business rules and validation.
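Here is a minimal sketch of that split of responsibilities, written in current Python 3. Every name in it (`OrderLine`, `Order`, `InMemoryOrderStore`, the `MAX_LINES` rule) is hypothetical and only there to illustrate the idea; the in-memory store stands in for any key/value or document database that enforces no constraints of its own.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class OrderLine:
    """Immutable value object: validated once, at construction time."""
    sku: str
    quantity: int
    unit_price: Decimal

    def __post_init__(self):
        if not self.sku:
            raise ValueError("sku must not be empty")
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.unit_price < 0:
            raise ValueError("unit_price must not be negative")


class Order:
    """Rich domain object: it owns its invariants instead of the database."""

    MAX_LINES = 50  # hypothetical business rule

    def __init__(self, order_id):
        if not order_id:
            raise ValueError("order_id must not be empty")
        self.order_id = order_id
        self._lines = []

    def add_line(self, line):
        # What a UNIQUE constraint would do in a relational schema
        if any(existing.sku == line.sku for existing in self._lines):
            raise ValueError("duplicate sku: %s" % line.sku)
        if len(self._lines) >= self.MAX_LINES:
            raise ValueError("an order cannot have more than %d lines" % self.MAX_LINES)
        self._lines.append(line)

    @property
    def lines(self):
        # Expose a read-only snapshot so callers cannot bypass add_line()
        return tuple(self._lines)

    def total(self):
        return sum((l.quantity * l.unit_price for l in self._lines), Decimal("0"))


class InMemoryOrderStore:
    """Storage layer: persistence only, no business rules."""

    def __init__(self):
        self._data = {}

    def save(self, order):
        self._data[order.order_id] = order

    def get(self, order_id):
        return self._data[order_id]
```

An invalid `OrderLine` or a duplicate SKU is rejected with a `ValueError` before anything reaches the store, so the store can stay as dumb (and as distributed) as it needs to be.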
GeoTools developers have released the 2.6.2 version. GeoTools contains an incredible amount of utilities related to GIS and I am totally impressed by the feature set.
To give an example of its use, here is some sample code from gisgraphy-java-client (a simple Java client I am writing for the open source GISgraphy project). It calculates the orthodromic distance between two coordinates :
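public double distance(GisFeatureGeography o, Unit unit) {
    // Fall back to meters when the caller does not specify a unit
    Unit targetUnit = (unit != null) ? unit : SI.METER;
    com.vividsolutions.jts.geom.Geometry me = location;
    com.vividsolutions.jts.geom.Geometry other = o.getLocation();
    try {
        // GeoTools returns the orthodromic distance in meters;
        // JScience then converts it to the requested unit
        return SI.METER
                .getConverterTo(targetUnit)
                .convert(JTS.orthodromicDistance(
                        me.getCoordinate(),
                        other.getCoordinate(),
                        DefaultGeographicCRS.WGS84));
    } catch (TransformException e) {
        // The coordinates could not be transformed; surface it as unchecked
        throw new RuntimeException(e);
    }
}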
Please note that the code makes use of two excellent libraries : JTS for geometry types, and JScience for units. And for your information, WGS84 is the geodetic reference system used by GPS; coordinates in it are expressed as longitude and latitude (plus, optionally, a height).
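For readers who want to see roughly what an orthodromic (great-circle) distance involves, here is a small Python sketch based on the haversine formula. It is not what GeoTools does: GeoTools computes on the WGS84 ellipsoid, whereas this sketch assumes a perfect sphere, so the two can differ by up to roughly half a percent. The function name and the mean radius constant are assumptions made for the example.

```python
import math

# Mean Earth radius in meters; treating the Earth as a sphere is an assumption
EARTH_MEAN_RADIUS_M = 6371008.8


def haversine_distance(lon1, lat1, lon2, lat2, radius=EARTH_MEAN_RADIUS_M):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors near antipodal points
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))
```

For instance, `haversine_distance(2.3522, 48.8566, -0.1276, 51.5072)` gives the approximate Paris to London distance in meters.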
There is a constant about software developers : they love debating and arguing about every single aspect of the development process. Moreover, they will most likely debate forever, because there is usually nothing that can serve as a reference to tell good and bad practices apart. Want to know why ? Well.. everyone forms their own opinion based on their own vision of the truth. There is no axiom that is taken for granted and that serves as the basis for further discussion.
“In traditional logic, an axiom or postulate is a proposition that is not proved or demonstrated but considered to be either self-evident, or subject to necessary decision. Therefore, its truth is taken for granted, and serves as a starting point for deducing and inferring other (theory dependent) truths.”
The rest of mathematical logic is built on these axioms, and given these axioms, everything else can be reasoned about consistently. While it is certainly impossible to create universal axioms and reasoning principles that cover the foundation of software development, I believe we should at least mimic the approach : decide which design, architectural and coding principles we believe in, and then use these axiomatic principles as the foundation for decision making. Of course, the outcome of your next project will depend on the quality of these axioms, but at least you will be able to move forward and make consistent decisions throughout the lifecycle of the application.
As far as I am concerned, I tend to base my reasoning on a set of axiomatic principles that come from the opinions of respected and talented people in the software industry. Even though nothing is perfect, I believe that listening to these experienced people is more likely to lead to success than listening to any random developer’s opinion. This is my bet, and the rest of this post is a first draft of the main design and architectural principles that I consider my axioms.
- Domain Driven Design, Specifications pattern, Layered design.
- Principles of OOD, and in particular, the S.O.L.I.D. principles
- Clean Code: A Handbook of Agile Software Craftsmanship.
- Implementation Patterns
- Implementation of the right Enterprise Integration Patterns that best describe a given integration issue
- Defensive Programming (also see Defensive programming in Java) and its derived principles : Design by contract, Assertions, Immutability, Copy-On-Write, Immutable collections.
- Inversion of Control
- Declarative Programming, and in particular Functional Programming-style collection manipulation and Command/Query separation, side-effect free methods.
- No premature optimization.
- Lessons learnt from UNIX programming.
- Tell don’t ask.
- Robustness principle : Be conservative in what you do, be liberal in what you accept from others.
- Do not reinvent the wheel.
The people behind these principles are smart, experienced, and potentially more intelligent than you and me together. So let’s just follow these principles for now, and once we master every single aspect of them, we will be able to help create the next generation of design and coding principles. In the meantime, I take these principles for granted.
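To make a few of these axioms concrete, here is a rough Python sketch that combines the Specification pattern with immutability and side-effect-free, functional-style collection handling. All the names (`City`, `Specification`, `InCountry`, `LargerThan`, `select`) are made up for the example, and this is only one of many ways to write it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class City:
    """Immutable value: it cannot be modified after construction."""
    name: str
    country: str
    population: int


class Specification:
    """A named, composable business rule. It is a query, never a command."""

    def is_satisfied_by(self, candidate):
        raise NotImplementedError

    def __and__(self, other):
        return _And(self, other)

    def __invert__(self):
        return _Not(self)


class _And(Specification):
    def __init__(self, left, right):
        self._left, self._right = left, right

    def is_satisfied_by(self, candidate):
        return (self._left.is_satisfied_by(candidate)
                and self._right.is_satisfied_by(candidate))


class _Not(Specification):
    def __init__(self, inner):
        self._inner = inner

    def is_satisfied_by(self, candidate):
        return not self._inner.is_satisfied_by(candidate)


class InCountry(Specification):
    def __init__(self, country):
        self._country = country

    def is_satisfied_by(self, city):
        return city.country == self._country


class LargerThan(Specification):
    def __init__(self, population):
        self._population = population

    def is_satisfied_by(self, city):
        return city.population > self._population


def select(cities, spec):
    """Return a new tuple of matching cities; the input is left untouched."""
    return tuple(city for city in cities if spec.is_satisfied_by(city))
```

A rule such as `InCountry("FR") & LargerThan(1000000)` then reads like the business sentence it encodes, and `select(cities, rule)` never mutates anything.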
It is a pleasure to hear that some open source projects are conducting usability reviews :
Usability reviews are of the utmost importance if you want to learn how your end users use your product. I initially thought that conducting usability reviews was complex and involved lots of steps, but Jean-Francois Proulx definitely convinced me of the approach when I attended one of his usability talks a few months ago.
To me, this is clearly a better investment than having your marketing team spend long hours in front of screenshots, trying to shape the next version of your website.
So... you’ve been developing serious Java applications for quite a few years now, and while it was fun and enjoyable to discover the best practices, the misc. tools, and how the messy, fragmented ecosystem of frameworks and libraries wonderfully (well, hardly) integrates thanks to amazing JEE-whatever integration stacks (Spring, no pun intended), you now feel that the platform has become pretty much boring, and you want to try something else..
Of course, you still love Java (it pays more) and think it is still the best way to write serious applications (currently waiting for Scala to get decent IDE support ?). You do not want to hear about this over-hyped language that is supposedly perfect because you simply don’t like languages that use half of the keyboard’s non-letter keys as metacharacters. But you definitely want to hack a little bit using some dynamic language (maybe because you feel like an idiot when these dynamic-language lovers tell you they get a 10x productivity boost by using XXX instead of Java – replace XXX by whatever trendy, over-hyped popular language of the moment).
Anyways.. for some reason, your choice is Python (if not, then the rest of this post is of no interest to you). You have read a book or two online, and since you’re not an idiot, you already know how to code basic stuff (you still need to look up some stuff here and there and are not sure of what is idiomatic yet, but you have definitely grasped the basic concepts). However, you feel a little bit alone in this new world, wondering what the best practices are, which tools are generally used.. etc. And nobody on the internet really helps you because when it comes to giving technological advice, people are either of the “mine is bigger than yours” type, or the “everything depends on your needs/preference/[...]” BS.
So, here are a few things I learned while developing pymager, a RESTful image conversion/rescaling service (hopefully, it will help you find your way):
- The equivalent of jars is eggs.
- The equivalent of Maven’s dependency resolution/download system is easy_install, or the newer pip, which is even better.
- The equivalent of the ibiblio main repository is PyPI.
- The equivalent of Maven is either distutils or setuptools. distutils is the default tool shipped with Python, and setuptools is an alternative that is simply superior. This is what runs your unit tests and creates source/egg packages for you.
- Installing dependencies can be done in several ways : using easy_install / pip system-wide (pollutes your system), using your system package manager (e.g. Debian/Ubuntu apt-get : super-clean, but does not install the most up-to-date packages), or using easy_install / pip inside a virtualenv sandbox. See Tools for the modern Python hacker for some help regarding virtualenv.
- There are mainly 2 decent stacks for creating web applications : Django and TurboGears 2. Django is for people who like monolithic frameworks that reinvent the wheel, and TurboGears 2 is for people who favor integrating best-of-breed components. (TG2 gives you this integration for free, so you can see it as an equivalent of Spring Roo.)
- The equivalent of Hibernate is SQLAlchemy. In addition to what Hibernate gives you, SQLAlchemy provides you with some lower-level utilities (such as SQL manipulation, a DB-agnostic way to create a connection, ..). However, the transaction management is clearly inferior to what you get with Spring/Hibernate, and their ThreadLocal implementation is just a hack that is clearly not suited to anything other than use from a web framework. If you need to do anything more serious, you will need to reinvent the wheel (see pymager’s reimplementation of ThreadLocal’s transaction management).
- There are a dozen ways to expose a Python webapp, including nasty CGI-related techniques. All of them are either hacks or legacy stuff, except the newer WSGI approach. Most websites use fancy names for describing what WSGI does, but it is mostly an equivalent of the servlet API.
- CherryPy is a wonderful embedded web server that supports WSGI (and that can be used as a WSGI application itself behind Apache, pretty much like Tomcat can serve applications behind mod_jk). It used to be the one shipped with TurboGears, but they switched to Paste for some political reasons (that was necessary for the merge with Pylons).
- Naming conventions are a joke in Python, as nobody seems to follow the same rules. Even some modules in the Python standard library (e.g. the unittest module) seem to adopt different conventions than what looks like the Python coding standard. I guess that too many ex-Java developers program in Python without being able to let go of their Java naming conventions..
- As a Java developer, there is some OO purity that you will need to forget about. It seems to be the “python way” to use module-wide variables, and you feel like you are fighting the platform / frameworks if you insist on applying your IoC best practices…
- nose is the way to go for running tests. (it integrates well with setuptools)
- There is no classpath that you can really manipulate. The closest equivalent for discovering and loading the files/data shipped with your code is pkg_resources (an additional reason to use setuptools).
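To illustrate the servlet analogy from the list above, here is roughly what a bare WSGI application looks like in current Python 3: a callable that receives the request environment and a `start_response` callback, much like a servlet receives a request and a response. The `hello_app` and `serve` names, the `/hello` path and the port are invented for the example; `wsgiref` is the reference server from the standard library and merely stands in for CherryPy or Apache here.

```python
from wsgiref.simple_server import make_server


def hello_app(environ, start_response):
    """Minimal WSGI application: read the request, return an iterable of bytes."""
    path = environ.get("PATH_INFO", "/")
    if path != "/hello":
        start_response("404 Not Found",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [b"not found\n"]
    body = "Hello from WSGI\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Run hello_app on the development server from the standard library."""
    with make_server(host, port, hello_app) as httpd:
        httpd.serve_forever()
```

Since the application is just a callable, any WSGI container can host `hello_app` unchanged, which is the same portability argument the servlet API makes.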
What is the maximum number of developers you can ever imagine working _efficiently_ on a project ? 5 ? 7 ? 10 ? 20 ?
Who wrote 2.6.33 reminds us how much the open source world (and in particular the linux kernel community) excels in this area. For the single 2.6.33 release that was developed in about 3 months :
“As of this writing, 10,500 non-merge commits have found their way into 2.6.33 – fairly normal by recent standards. These changes added almost 900,000 lines while deleting almost 520,000 others; as a result, the kernel grew by a mere 380,000 lines this time around.”
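A quick back-of-the-envelope check of these figures, as a few lines of Python. The 90-day window is my own rounding of "about 3 months", and the line counts are the rounded numbers from the quote, so treat the results as orders of magnitude only.

```python
commits = 10_500          # non-merge commits, from the quote
lines_added = 900_000     # "almost 900,000"
lines_deleted = 520_000   # "almost 520,000"
release_days = 90         # assumption: "about 3 months"

net_growth = lines_added - lines_deleted
commits_per_day = commits / release_days
lines_changed_per_day = (lines_added + lines_deleted) / release_days

print(f"net growth: {net_growth:,} lines")
print(f"commits per day: {commits_per_day:.0f}")
print(f"lines added or deleted per day: {lines_changed_per_day:,.0f}")
```

In other words, under these assumptions more than a hundred commits and well over ten thousand changed lines were integrated every single day.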
So, if you happen to struggle to scale your team past the 10-people mark using your usual development habits, then maybe there are a few things you could learn from the open source world.
My personal understanding of why it works so well :
- good elite developers
- top-notch, distributed, super-fast and merge-friendly version control tools (e.g. git)
- Fault-proof and compromise-free (though sometimes not politically-correct) ways of enforcing software quality and architecture. Examples showing the disagreement-proof nature of the kernel development process include last summer’s Alan Cox vs Linus Torvalds dispute regarding the tty subsystem, or Linus Torvalds vs Hans Reiser argument regarding Reiser4’s plugin system that does not fit well into linux architecture
- result-oriented and meritocracy-driven way of managing the project
- decentralized development (made possible thanks to distributed SCM tools). To quote Linus Torvalds : “Centralized _works_. It’s just *inferior*.“
- modular architecture supporting the collaboration of many developers. “The large number of developers and the fact that they are volunteers has an impact on how the system should be architected. With such a large number of geographically dispersed developers, a tightly coupled system would be quite difficult to develop — developers would be constantly treading on each others code.“
You might disagree on the reasons why it works so well (after all, that’s just my analysis based on my understanding of the situation), but the success is a reality, a fact.
Also, if you are tempted to think that it could not work in the corporate world, please think twice and take another look at who wrote 2.6.33, where contributing companies are listed.
Looks like Go is attracting some attention.
“‘Open source does not mean anarchy. Somebody has to have a vision and the perseverance to see that through. The open source community can then create their own versions if they wish, but it is best if there is a main line, stable version with a consistent architecture with a guiding force behind it,’ Gordon said.”
The Scala 2.8 beta 1 announcement gives hope regarding the availability of a decent IDE for editing Scala code. We will see what Scala 2.8 final looks like, but if the Eclipse IDE support features basic class and method renaming, I will most likely make Scala my main programming language for writing open source code that targets the JVM. Two projects that I would most likely convert to Scala are :
- Gisgraphy Java client : a Java library that gives access to gisgraphy City and GIS features search engine.
- Pymager Java client : a simple Java wrapper on top of the RESTful interface provided by pymager, an image service that provides simple conversion and thumbnailing / resizing features.