MongoDB offers Java programmers a driver that connects to collections through a comfortable ORM (Object Relational Mapping) layer. Many coders, me included, took it for granted that the driver would cover all of their integration needs with this excellent database system. But, as I found out at a certain point, the offering has limits that need to be dealt with.
A Bit Of History
The MongoDB Java driver has gone through a lot of changes over the past years. It looks as if the team that developed it was searching for the best way to let Java programmers express their queries in Java, without having to know each and every mongo shell command. The first result let Java coders work with a Document object, which could be converted into a JSON object similar to the ones used in mongo shell scripts:
Document myDoc = collection.find(eq("i", 71)).first();
System.out.println(myDoc.toJson());
Although that made sense at the time (sticking to the idea of a document within a Java program), what it actually did was force Java coders to fill their code with manipulations of the retrieved Document object. So instead of advancing the Java ecosystem, it sent coders back to the 90’s. And that wasn’t a smart thing to do.
MongoDB. The Javafied Version.
The MongoDB driver development team understood the problem and decided to implement a methodology that every J2EE coder is acquainted with: the ORM methodology. If you haven’t had a chance to use it, it means that every record (in a relational database) or document (in a NoSQL database such as MongoDB) is treated as a Java object. This made things better, since the code for pulling a person’s document into a Person object now looked like this:
aPerson = collection.find(eq("address.city", "Wimborne")).first();
Everyone can understand this. Can’t they?
This kind of mapping by itself saved a few dozen lines of source code, and creating a project based on MongoDB became simpler. But the misery of Java coders didn’t end just yet, since there was one more extensive issue to deal with: aggregating and manipulating data from multiple collections.
MongoDB Aggregation, AKA Spaghetti Code in Java
Aggregation is at the very heart of every large information system. No one can avoid dealing with it, even when displaying simple cross-table information. Hence, it only made sense that the MongoDB Java driver would enable such a feature with minimal friction. Of course that didn’t happen. What happened instead is a total mess that forces the Java code to include all of the complexities that a MongoDB aggregation script has. Thus, for instance, querying all of the stores that are categorized as bakeries, grouping them by the number of stars they have and printing them may look like this:
collection.aggregate(
    Arrays.asList(
        Aggregates.match(Filters.eq("categories", "Bakery")),
        Aggregates.group("$stars", Accumulators.sum("count", 1))
    )
).forEach(printBlock);
Besides being ugly code in my view, this code has 3 more problems:
- It doesn’t implement the principle of separation of concerns. Fragments of code which are better handled at the level of the DB system are scattered throughout a Java call. For example, Accumulators.sum, which instructs MongoDB to count the number of bakeries, suddenly becomes part of the Java code.
- Creating a more dynamic query in Java is very limited, since the more aggregate commands are added, the more the coder needs to create new and maybe separate Java calls. The result is larger runtime code which may demand more system resources (due to increased memory paging and increased memory allocations).
- The learning curve of this code is longer for a newbie, since both the mongo script (which is the analogous code) and the Java code have to be learnt.
The Solution: Separation of Concerns
Being naive when using software APIs has never been helpful, but good principles do help. Using separation of concerns is the right thing to do here, and that means the following:
- Deciding that Java will not handle any filtering or grouping of data, unless performance requires it.
- Changing the learning curve so that a newbie to MongoDB becomes fairly well acquainted with mongo shell script, the same way we would have demanded acquaintance with SQL scripts.
- Creating simple calls in Java that pass a query string (either a simple or a compound one) to MongoDB.
This way the changes to the Java code will be minimal (and so will the changes to the mongo script code, in case the Java code is changed).
How can that be done? It’s quite simple. After defining a DAO for MongoDB commands in Java (if needed), a call to runCommand is made from Java. This call is described here:
Beyond being a proper solution to the above problems, this command is very powerful, since it gives coders access to any kind of mongo script command, including those which are not reflected in the Java driver. Furthermore, it can reduce the number of calls a Java application makes to the MongoDB service, making it more efficient in both DB operations and communication.
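To make the idea more concrete, here is a minimal sketch of such a DAO, under a few assumptions of mine: the class name MongoCommandDao, the method names and the collection name "restaurants" are all hypothetical, and the driver call is replaced by a plain Function so that the sketch runs without a MongoDB server. The command text is the bakery aggregation from above, written the way the mongo shell's aggregate command expects it.

```java
import java.util.function.Function;

/** Hypothetical DAO sketch: Java only forwards a command string to MongoDB. */
public class MongoCommandDao {

    // Stand-in for the driver call. With the real driver this could be, for example:
    //   json -> database.runCommand(Document.parse(json)).toJson()
    private final Function<String, String> runner;

    public MongoCommandDao(Function<String, String> runner) {
        this.runner = runner;
    }

    /**
     * Builds the bakery aggregation as a database command in JSON form.
     * The collection name is inserted as-is (no escaping), so it should
     * come from trusted code, not from user input.
     */
    public static String bakeriesByStars(String collectionName) {
        return "{ \"aggregate\": \"" + collectionName + "\", "
             + "\"pipeline\": [ "
             + "{ \"$match\": { \"categories\": \"Bakery\" } }, "
             + "{ \"$group\": { \"_id\": \"$stars\", \"count\": { \"$sum\": 1 } } } ], "
             + "\"cursor\": {} }";
    }

    /** Passes the command text to whatever executes it and returns the reply as JSON. */
    public String run(String commandJson) {
        return runner.apply(commandJson);
    }

    public static void main(String[] args) {
        // Fake runner that only echoes the command, so the sketch is self-contained.
        MongoCommandDao dao = new MongoCommandDao(json -> "{ \"echo\": " + json + " }");
        System.out.println(dao.run(bakeriesByStars("restaurants")));
    }
}
```

With the real driver, MongoDatabase.runCommand accepts a Bson command (Document.parse can build one from the string) and returns a Document. One thing to keep in mind: for an aggregate command the results arrive inside the reply under cursor.firstBatch, so large result sets would need follow-up getMore commands.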
One more caveat for those who are willing to take the step and use it. Like in the ancient times of the MongoDB Java driver, it still returns a Document. But that’s no big deal: one of Document’s most important methods is toJson.
Since toJson returns a String, it can easily be converted to any Java object you desire using Gson, Jackson or any other JSON-to-object mapper.
The MongoDB Java driver’s ORM can supply the very basic needs of a Java coder as long as a single collection is to be updated. Yet, when the code is large enough and handles the reading (and updating) of several collections, things get messy and underperformant. Hence, a more generic approach such as using runCommand comes in handy: it makes the code easier to read and better separated from the actual implementation of the MongoDB data access requests.