A functional reactive alternative to Spring

Modern-day Spring allows you to be pretty concise. You can get an elaborate web service up and running using very little code. But when you write idiomatic Spring, you find yourself strewing your code with magic annotations, whose function and behavior are hidden within complex framework code and documentation. When you want to stray even slightly from what the magic annotations allow, you suddenly hit a wall: you start debugging through hundreds of lines of framework code to figure out what it’s doing, and how you can convince the framework to do what you want instead.

datamill is a Java web framework that is a reaction to that approach. Unlike other modern Java frameworks, it makes the flow and manipulation of data through your application highly visible. How does it do that? It uses a functional reactive style built on RxJava. This allows you to be explicit about how data flows through your application, and how to modify that data as it does. At the same time, if you use Java 8 lambdas (datamill and RxJava are intended to be used with lambdas), you can still keep your code concise and simple.

Let’s take a look at some datamill code to illustrate the difference:

public static void main(String[] args) {
    OutlineBuilder outlineBuilder = new OutlineBuilder();

    Server server = new Server(
        rb -> rb.ifMethodAndUriMatch(Method.GET, "/status", r -> r.respond(b -> b.ok()))
            .elseIfMatchesBeanMethod(outlineBuilder.wrap(new TokenController()))
            .elseIfMatchesBeanMethod(outlineBuilder.wrap(new UserController()))
            .orElse(r -> r.respond(b -> b.notFound())),
        (request, throwable) -> handleException(throwable));

    server.listen(8081);
}

A few important things to note:

  • datamill applications are primarily intended to be started as standalone Java applications – you explicitly create the HTTP server, specify how requests are handled, and have the server start listening on a port. Unlike traditional JEE deployments, where you have to worry about configuring a servlet container or an application server, you control when the server itself is started. This also makes creating a Docker container for your server dead simple: package up an executable JAR using Maven and stick it in a standard Java container.
  • When an HTTP request arrives at your server, it is obvious how it flows through your application. The line

    rb.ifMethodAndUriMatch(Method.GET, "/status", r -> r.respond(b -> b.ok()))

    says that the server should first check whether the request is an HTTP GET request for the URI /status, and if it is, return an HTTP OK response.

  • The next two lines show how you can organize your request handlers while still maintaining an understanding of what happens to the request. For example, the line

    .elseIfMatchesBeanMethod(outlineBuilder.wrap(new UserController()))

    says that we will see if the request matches a handler method on the UserController instance we passed in. To understand how this matching works, take a look at the UserController class, and one of the request handling methods:

    @Path("/users")
    public class UserController {
     ...
     @GET
     @Path("/{userName}")
     public Observable < Response > getUser(ServerRequest request) {
       return userRepository.getByUserName(request.uriParameter("userName").asString())
        .map(u -> new JsonObject()
         .put(userOutlineCamelCased.member(m -> m.getId()), u.getId())
         .put(userOutlineCamelCased.member(m -> m.getEmail()), u.getEmail())
         .put(userOutlineCamelCased.member(m -> m.getUserName()), u.getUserName()))
        .flatMap(json -> request.respond(b -> b.ok(json.asString())))
        .switchIfEmpty(request.respond(b -> b.notFound()));
      }
      ...
    }

    You can see that we use @Path and @GET annotations to mark request handlers. The difference is that you can pinpoint where the attempt to match the HTTP request to an annotated method was made. It was within your application code – you did not have to go digging through hundreds of lines of framework code to figure out how the framework routes requests to your code.

  • Finally, in the code from the UserController, notice how the response is created – and how explicit the composition of the JSON is within datamill:

    .map(u -> new JsonObject()
        .put(userOutlineCamelCased.member(m -> m.getId()), u.getId())
        .put(userOutlineCamelCased.member(m -> m.getEmail()), u.getEmail())
        .put(userOutlineCamelCased.member(m -> m.getUserName()), u.getUserName()))
    .flatMap(json -> request.respond(b -> b.ok(json.asString())))

    You have full control over what goes into the JSON. Those who have ever tried to get Jackson to omit properties from its JSON output, or the poor souls who have tried to customize responses when using Spring Data REST, will appreciate the clarity and simplicity.

Just one more example from an application using datamill – consider the way we perform a basic select query:

public class UserRepository extends Repository<User> {
    ...
    public Observable<User> getByUserName(String userName) {
        return executeQuery(
            (client, outline) ->
                client.selectAllIn(outline)
                    .from(outline)
                    .where().eq(outline.member(m -> m.getUserName()), userName)
                    .execute()
                    .map(r -> outline.wrap(new User())
                        .set(m -> m.getId(), r.column(outline.member(m -> m.getId())))
                        .set(m -> m.getUserName(), r.column(outline.member(m -> m.getUserName())))
                        .set(m -> m.getEmail(), r.column(outline.member(m -> m.getEmail())))
                        .set(m -> m.getPassword(), r.column(outline.member(m -> m.getPassword())))
                        .unwrap()));
    }
    ...
}

A few things to note in this example:

  • Notice the visibility into the exact SQL query being composed. Those of you who have ever tried to customize the queries generated by annotations will again appreciate the clarity. In any single application, only a small percentage of queries need to be customized beyond what a JPA implementation allows, but almost every application has at least one such query – and that is usually when you get the sinking feeling that precedes delving into framework code.
  • Take note of the visibility into how data is extracted from the result and placed into entity beans.
  • Finally, take note of how concise the code remains, with the use of lambdas and RxJava Observable operators.

Hopefully that gives you a taste of what datamill offers. What we wanted to highlight is the clarity you get into how requests and data flow through your application, and into how that data is transformed.

datamill is still at an early stage of development, but we’ve used it to build several large web applications. We find it a joy to work with.

We hope you’ll give it a try – we are looking for feedback. Go check it out.

Weave social into the web

Disclaimer: This is the second post in a series where we are exploring a decentralized Facebook (here’s the first). It’s written by software engineers, and is mostly about imagining a contrived (for now) technical architecture.

How do you weave elements of Facebook into the web? Start by allowing users to identify themselves and all their content:

  • Establishing a user’s identity can be done rather straightforwardly by creating a unique public-private key pair for the user and allowing them to digitally sign things using their private key
  • Users can then digitally sign content they create anywhere on the internet – they can sign articles they publish, blog posts, comments, photos, likes and +1’s, anything really (see the sketch after this list)
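To make this concrete, here is a minimal sketch of the identity and signing steps using the standard java.security APIs – the choice of RSA keys and SHA256withRSA signatures is just an assumption for the example:

import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SigningExample {
    public static void main(String[] args) throws Exception {
        // The user's identity: a unique public-private key pair
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair identity = generator.generateKeyPair();

        // Sign a piece of content (here, a comment) with the private key
        byte[] content = "Great post!".getBytes(StandardCharsets.UTF_8);
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(identity.getPrivate());
        signer.update(content);
        byte[] signature = signer.sign();

        // Published alongside the content and the user's public key, the
        // signature ties the content to the user's identity
    }
}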

Now that they’ve started to identify their content, it’s time to make everyone aware of it:

  • Notifications about content users generate need to be broadcast in real-time to a stream of events about the user
  • Notifications can be published to the stream by the browser, or a browser plug-in, or by the third-party application on which the content was generated
  • Before being accepted into a user’s stream, notifications need to be verified as being about the user and their content by the presence of a digital signature (a verification sketch follows this list)
  • Other parties interested in following a user can subscribe to a user’s feed
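That verification step might look something like this, again assuming SHA256withRSA signatures as in the earlier sketch:

import java.security.PublicKey;
import java.security.Signature;

public class StreamVerifier {
    // Accept a notification into a user's stream only if the content was
    // signed by the private key matching the user's published public key
    public static boolean isSignedByUser(byte[] content, byte[] signature,
            PublicKey userPublicKey) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(userPublicKey);
        verifier.update(content);
        return verifier.verify(signature);
    }
}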

But that’s all in the public eye. To have a social network, you really need to allow for some privacy:

  • Encrypt data, and allow it to be decrypted selectively – this may include partial content – for example, it would be useful to encrypt a comment on an otherwise unencrypted site so that it is accessible only to a select few consumers
  • Allow encrypted content to be sent over plain HTTP over TCP (not TLS) – this way the encrypted payload can be mirrored, allowing consumer privacy (if a consumer can access encrypted data from a mirror, it can do so privately, without the publisher’s knowledge)
  • Encryption is performed with a unique key for every piece of content
  • Decryption is selective in that the decryption key is given out selectively by the publisher (based on authorization checks they perform) – a sketch of the encryption step follows this list
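Here is a minimal sketch of encrypting one piece of content with its own symmetric key, using the standard javax.crypto APIs (AES-GCM is an assumption; any authenticated symmetric cipher would do):

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class ContentEncryption {
    public static void main(String[] args) throws Exception {
        // A fresh symmetric key, generated for this one piece of content
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey contentKey = generator.generateKey();

        // Encrypt the content; the ciphertext can travel over plain HTTP
        // and be mirrored freely
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, contentKey, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(
            "A comment for select readers".getBytes(StandardCharsets.UTF_8));

        // The publisher keeps contentKey and hands it out only to consumers
        // who pass its authorization checks
    }
}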

Deface, a decentralized Facebook

A disclaimer: we are a bunch of software engineers, so what follows is a wild technical thought experiment. Bring your imagination and your architectural chops.

What would a decentralized Facebook look like? Well, users should be able to:

  • Create a basic profile
  • Maintain one or more lists of friends
  • Share content with everyone on one or more of these lists
  • Have shared content only accessible by people on the list it was shared with
  • View content from all of their connections in one chronological “timeline”
  • View content from another user without the other user knowing how many times they’ve viewed it (consider how important it is that you can see someone’s photo on Facebook without them knowing, surreptitious as it sounds)

How would it work? Let’s start with user profiles and content:

  • Users can host their own profiles and content, or sign up with a service provider that hosts several users
  • Users can create a basic profile, which includes their name, date of birth, and other basic biographical data
  • When they publish content, it is added to their personal timeline, and an event is shared with their connections notifying them of the new content (a sketch of such an event follows this list)
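What might such an event carry? Here is a hypothetical sketch – every name in it is illustrative, not part of any defined protocol:

import java.time.Instant;

// A hypothetical notification announcing new content; the signature covers
// the other fields so recipients can verify who published the content
record ContentEvent(
        String contentId,    // unique ID of the published content
        String authorId,     // e.g., a fingerprint of the author's public key
        Instant publishedAt, // when the content was published
        byte[] signature) {
}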

How do user connections and sharing work?

  • Each user maintains one or more lists of connections – for example, they may have a “friends” list, and a separate “colleagues” list
  • When they share content to a particular list, an event notification is shared with all the members on that list
  • Sharing of events can use a polling model, where users poll for new events from their connections
  • Alternatively, sharing can use a publish/subscribe model – in this case, users subscribe to one of their connection’s events so that events get published to them (both models are sketched after this list)
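A rough sketch of the two sharing models, reusing the hypothetical ContentEvent from earlier (the interfaces and their names are assumptions for illustration):

import java.util.List;

// Polling model: consumers periodically ask a connection's host for any
// events newer than the last one they saw
interface PollingEventSource {
    List<ContentEvent> eventsSince(String lastSeenEventId);
}

// Publish/subscribe model: consumers register once, and new events are
// pushed to them as the connection publishes content
interface PublishSubscribeEventSource {
    void subscribe(String subscriberUrl);
    void publish(ContentEvent event);
}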

How do users protect their content?

  • When a user publishes content, it is given a unique ID, and is encrypted with a unique key for that piece of content
  • The event notifications sent out for that content have a reference to the content’s unique ID
  • The consuming application uses the content ID to ask the publisher for the symmetric key it can use to decrypt the content
  • Once it has the symmetric key, the consuming user can access the content
  • The publishing user may subsequently refuse to give out the key for a particular piece of content, revoking access (a sketch of this key service follows the list)
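Here is a hypothetical sketch of the publisher-side key service this flow implies – the class and its methods are assumptions, not an existing API:

import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.SecretKey;

// Holds the per-content symmetric keys and decides who may receive them
public class ContentKeyService {
    private final Map<String, SecretKey> keysByContentId = new ConcurrentHashMap<>();
    private final Map<String, Set<String>> authorizedByContentId = new ConcurrentHashMap<>();

    // Called when content is published and encrypted
    public void register(String contentId, SecretKey key, Set<String> authorizedUsers) {
        Set<String> authorized = ConcurrentHashMap.newKeySet();
        authorized.addAll(authorizedUsers);
        keysByContentId.put(contentId, key);
        authorizedByContentId.put(contentId, authorized);
    }

    // Called when a consumer presents a content ID and asks for its key
    public Optional<SecretKey> requestKey(String contentId, String requestingUserId) {
        Set<String> authorized = authorizedByContentId.get(contentId);
        if (authorized != null && authorized.contains(requestingUserId)) {
            return Optional.ofNullable(keysByContentId.get(contentId));
        }
        return Optional.empty();
    }

    // Revoking access: the publisher simply stops handing out the key
    public void revoke(String contentId, String userId) {
        Set<String> authorized = authorizedByContentId.get(contentId);
        if (authorized != null) {
            authorized.remove(userId);
        }
    }
}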

What all gets protected?

  • We protect the user’s profile information (portions of this are given unique IDs), as well as any content the users generate – this may include status updates, longform text, links, photos, location updates, etc.
  • Users may opt to make any of their content accessible publicly – in this case, it does not get encrypted

Content mirroring, not racking up a view count

  • The encrypted pieces of content, identified by unique IDs, can be mirrored by public or private mirrors – since the data is encrypted, only those who obtain the proper symmetric key can decrypt the content
  • Consumers can choose to access content directly from a publisher, or through a public mirror
  • Public mirrors would be expected to not make view counts available on pieces of content

What are the potential weaknesses and exploits? Leave your thoughts as a comment.

Cloud pricing is unfair

Is it fair to round the CPU usage of a virtual machine to the nearest hour when charging customers for cloud computing? We were curious about this so we thought we would ask the Internet. Of course, we wanted to get people’s opinions on cloud pricing overall so we asked about more than just the rounding of CPU use. We are not statisticians so the approach we took was rather simple, and took the form of an online survey. Our audience was a broad group of people involved in software, and included many independent developers, as well as those working as part of an organization.

[Chart: survey results on cloud providers]

When making the decision to go with a particular platform, by far the most important factors were the cost and quality of service. Surprisingly, brand name and trust were only somewhat important for many developers, especially those who were independent. Brand name and trust mattered more to those making the decision for teams and organizations.

[Chart: survey results on decision factors]

The question we were most interested in was which pricing model was most appealing to users. The results showed that customers preferred to be charged a flat fee per month for a virtual machine – the Digital Ocean model. A similar model of paying a flat fee per month for a cloud application was also deemed fair. The most prevalent model – charging per unit of resource used, as AWS, Azure, and many other providers do – was not particularly appealing compared to the flat-fee approaches. Interestingly, those surveyed said that when their cloud applications exceeded a certain cost (while being charged per unit of resource used), they would actually prefer to be switched automatically to a flat-fee model for the remainder of the billing period instead of having their applications suspended. This seems to indicate that users find being charged per unit of resource consumed complex and unpredictable; they strongly favor a pricing model that gives them a predictable monthly cost.

Finally, to answer the original question: is it fair to round to the nearest hour when charging users for CPU use? A most definite no.

While the results seem to indicate some solid opinions, I do want to point out that the survey is still open. If you have experience with cloud platforms and want to opine, follow the link below to our survey:

Opinions on cloud pricing