When working with online games we almost always need to persist data that the games produce. This can be anything from hand history to game state to audit trails for remote calls to other systems. What these usually have in common is high-volume writes, hardly any updates, some reads, and data that varies a lot in what we need to store, even within the same context.
That last part, the variation in the data, is what makes this interesting. Take hand history, for instance: we want to save the events that are executed on a poker table. Many things happen on a poker table: players join and leave, bets are made, cards are dealt, players get more chips, and so on. These individual events can have very different sets of attributes. A raise has a type of action (i.e. raise) and an amount, but a player making a buy-in has a completely different set of attributes. So how can we best solve this with the lowest complexity?
Step 1 – design your database schema… or?
If we were to use a relational database we might end up with a single table with an ungodly number of columns, so that each event has all its specific columns available. But of course no single event will ever use all the columns. We could try to re-use columns and give them names like column1, column2 and so on. Hmm… sounds like fun to maintain and develop against.
The other pattern would be to create a normalized schema with multiple tables, probably one per game, one per event type and so on. We then end up with a complex schema that needs to be maintained and versioned. Inserts and selects will be spread across tables, and we will certainly need to change the schema whenever new games or events are introduced.
There is also a third pattern out there, which is to store a binary blob in the database… let's not even talk about that one.
But what about this NoSQL stuff I have heard so much about? What if we use a document store instead? We then have a schema-less design, so adding new games, event types or attributes will not require any special attention to the database (i.e. there is no schema that needs to be updated and propagated throughout all deployments). The usage model also fits the bill for most of the cases mentioned above: there is one unique identifier, and the attributes or data have no intrinsic relation to other data sets. This sounds intriguing!
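To make the schema-less idea a bit more concrete, here is a minimal sketch in plain Java with no database involved. It models two poker events as free-form documents that live in the same collection even though their attributes differ. The class name, the helper methods and the attribute keys (amount, chips, currency) are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no MongoDB driver is involved.
class EventDocuments {

    // Hypothetical helper: a raise carries an action type and an amount.
    static Map<String, Object> raise(int handId, int playerId, int amount) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("handId", handId);
        doc.put("playerId", playerId);
        doc.put("type", "RAISE");
        doc.put("amount", amount);
        return doc;
    }

    // Hypothetical helper: a buy-in has a different set of attributes.
    static Map<String, Object> buyIn(int handId, int playerId, int chips, String currency) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("handId", handId);
        doc.put("playerId", playerId);
        doc.put("type", "BUY_IN");
        doc.put("chips", chips);
        doc.put("currency", currency);
        return doc;
    }

    // Both documents go into the same "collection" without any schema change.
    static List<Map<String, Object>> sampleCollection() {
        List<Map<String, Object>> events = new ArrayList<>();
        events.add(raise(1, 100, 50));
        events.add(buyIn(1, 101, 1000, "EUR"));
        return events;
    }

    public static void main(String[] args) {
        for (Map<String, Object> doc : sampleCollection()) {
            System.out.println(doc);
        }
    }
}
```

Adding a third event type here would only mean writing another helper; nothing about the collection itself has to change, which is the property we are after.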
Document Store
A document store, or document-oriented database, is a persistence tier that encapsulates data in some standard encoding (e.g. XML or JSON). The document stores I looked at for this example were primarily CouchDB and MongoDB. I am a big fan of CouchDB and have used it before, but MongoDB seems to have better support for ad hoc queries, whereas CouchDB generally requires an index in advance. The fact that I was able to run MongoDB embedded in Java also helped tilt the scale. (And to be honest, the whole CouchDB/Couchbase separation thing turned me off both products a bit.)
So I chose to go with MongoDB.
Technology Stack
Poking around with different technologies for MongoDB I finally settled on these libraries:
- MongoDB (well obviously)
- Morphia – a Java-based ORM for MongoDB. Gives me type safety and less glue code.
- EmbedMongo – a library for running MongoDB embedded in Java. Awesome for testing!
EmbedMongo will only be used for automated unit tests; I don't think it is intended for production use.
Implementation
I will not walk through how to use the different APIs, since their respective web sites do a much better job of that. But I will show you some code. First I create my document classes. These are regular Java objects that I annotate with the Morphia ORM annotations. For this example I chose Hand and Event; the idea is that a Hand will have many Events written as they occur in the game.
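@Entity
public class Hand {
    @Id
    Integer handId;
    Integer playerId;
    String gameName;
}

@Entity
public class Event {
    // Morphia expects every @Entity to declare an @Id field. Leaving it null
    // means an ObjectId (org.bson.types.ObjectId) is generated on save.
    @Id
    ObjectId id;
    Integer handId;
    Integer playerId;
    String name;

    public Event(Integer handId, Integer playerId, String name) {
        this.handId = handId;
        this.playerId = playerId;
        this.name = name;
    }

    // No-argument constructor, used by Morphia when reading documents back.
    public Event() {}
}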
The @Entity annotation marks a class as a data document. @Id means that MongoDB will use that field as the key. It would be possible to keep the Events inside the Hand by declaring a List<Event> field and embedding it in the Hand document. However, that seems like unnecessary overhead for our use case. With separate documents we can just write new events to the data store without having to load the hand and update it. Faster and simpler too.
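The trade-off between embedded and standalone events can be sketched with a small in-memory stand-in that simply counts operations. This is not the MongoDB or Morphia API. It assumes the load, modify, save path described above for the embedded case, and every name in it (WritePathSketch, appendEmbedded, appendStandalone) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a document store that counts operations.
// Illustration only; this is not the MongoDB or Morphia API.
class WritePathSketch {
    int reads = 0;
    int writes = 0;

    // Option A: hands keyed by handId, each embedding its list of event names.
    final Map<Integer, List<String>> handsWithEmbeddedEvents = new HashMap<>();

    // Option B: events kept as standalone entries that carry their own handId.
    final List<String[]> standaloneEvents = new ArrayList<>();

    void createHand(int handId) {
        handsWithEmbeddedEvents.put(handId, new ArrayList<>());
        writes++;
    }

    // Embedded: load the hand, modify it, then save the whole hand back.
    void appendEmbedded(int handId, String eventName) {
        List<String> loaded = new ArrayList<>(handsWithEmbeddedEvents.get(handId));
        reads++;
        loaded.add(eventName);
        handsWithEmbeddedEvents.put(handId, loaded);
        writes++;
    }

    // Standalone: a single insert, with no need to touch the hand.
    void appendStandalone(int handId, String eventName) {
        standaloneEvents.add(new String[] { String.valueOf(handId), eventName });
        writes++;
    }

    public static void main(String[] args) {
        WritePathSketch embedded = new WritePathSketch();
        embedded.createHand(1);
        embedded.appendEmbedded(1, "HIT");
        embedded.appendEmbedded(1, "STAND");
        System.out.println("embedded: reads=" + embedded.reads + ", writes=" + embedded.writes);

        WritePathSketch standalone = new WritePathSketch();
        standalone.appendStandalone(1, "HIT");
        standalone.appendStandalone(1, "STAND");
        System.out.println("standalone: reads=" + standalone.reads + ", writes=" + standalone.writes);
    }
}
```

In this sketch the embedded path pays one extra read per event and rewrites a hand document that keeps growing, while the standalone path is a plain insert. A store-side update operator could avoid the read, so treat the counts as an illustration of the reasoning rather than a measurement.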
Setting up a Morphia datastore takes two lines; once it exists, writing an object is a single datastore.save(...) call:
Mongo mongo = new Mongo("localhost");
Datastore datastore = new Morphia().createDatastore(mongo, "test");
So let's connect the dots and make a fully automated unit test for writing and reading some simple hand data to/from MongoDB!
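// Package names below are the ones I believe match the Morphia and EmbedMongo
// releases this API belongs to; adjust them if your versions differ.
import java.util.List;

import org.hamcrest.CoreMatchers;
import org.junit.AfterClass;
import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Test;

import com.google.code.morphia.Datastore;
import com.google.code.morphia.Morphia;
import com.google.code.morphia.query.Query;
import com.mongodb.Mongo;

import de.flapdoodle.embedmongo.MongoDBRuntime;
import de.flapdoodle.embedmongo.MongodExecutable;
import de.flapdoodle.embedmongo.MongodProcess;
import de.flapdoodle.embedmongo.config.MongodConfig;
import de.flapdoodle.embedmongo.distribution.Version;
import de.flapdoodle.embedmongo.runtime.Network;

public class PersistTest {
    static int PORT = 12345;
    static MongodProcess mongod;
    static Datastore datastore;

    @BeforeClass
    public static void initDb() throws Exception {
        // Start a MongoDB process for the tests and point Morphia at it
        MongodConfig config = new MongodConfig(Version.V2_0_1, PORT, Network.localhostIsIPv6());
        MongodExecutable prepared = MongoDBRuntime.getDefaultInstance().prepare(config);
        mongod = prepared.start();
        Mongo mongo = new Mongo("localhost", PORT);
        datastore = new Morphia().createDatastore(mongo, "morphia");
    }

    @AfterClass
    public static void shutdownDb() {
        if (mongod != null) mongod.stop();
    }

    @Test
    public void testSimplePersist() throws Exception {
        // Store a Hand object
        Hand hand = new Hand();
        hand.handId = 1;
        hand.playerId = 100;
        hand.gameName = "Black Jack";
        datastore.save(hand);

        // Make sure we can read it back
        Query<Hand> query = datastore.createQuery(Hand.class).field("handId").equal(1);
        Hand result = query.get();
        Assert.assertThat(result.handId, CoreMatchers.is(1));
        Assert.assertThat(result.gameName, CoreMatchers.is("Black Jack"));
    }

    @Test
    public void testPersistHandEvents() throws Exception {
        // Write two events for hand 1, then read back everything stored for that hand
        datastore.save(new Event(1, 100, "HIT"));
        datastore.save(new Event(1, 100, "STAND"));
        List<Event> events = datastore.find(Event.class, "handId =", 1).asList();
        Assert.assertThat(events.size(), CoreMatchers.is(2));
    }
}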
Let's walk through the test case.
@BeforeClass – Here we set up an embedded MongoDB thanks to EmbedMongo. You do not need to pre-install MongoDB, as everything will be downloaded and run locally. We also create a new datastore for our test, named morphia.
@AfterClass – We have to clean up after ourselves, eh?
@Test – These are the tests. They are normal JUnit 4 tests where we write to the data store and then read the data back.
And that’s it! We now have an automated test that sets up a complete database every time we run it, so there is no need to break out integration tests that require installations and schema updates. It is ready to go and we can insert any type of data into it. Oh, and it's fast too! Executing mvn test for the test above takes about 2.5 seconds on my slow three-year-old computer at home.
This means that we can now develop against a database that accepts any model changes, and get a fully automated integration test harness in place with a test cycle of less than 3 seconds… that is awesome!
Of course, if we were using a relational database we could use Hibernate with an embedded H2, which is OK, but then the tests run against a different database than the real one. And if you rely on a predefined schema, views or stored procedures, forget about it.
The tests here are extremely basic, of course. The goal of the test case is to show an integrated, working stack between MongoDB and Java.
Summary
By using this technology stack I can combine the strengths of a document store with the development speed and quality assurance of test-driven development. Note that I am not saying you should always use a NoSQL solution; you should always choose the right tool for the job. In this case I value having a very flexible data store for a high variation of different but grouped types of objects. By using a document store and an ORM we can take full advantage of Java's object-oriented model for our persisted data design, without spending a lot of effort on mapping out schemas and glue code, which tend to become a nightmare to maintain for this type of data.