Demoing your new software solution, batteries included

Scenario: You’ve created a new tool, language, or framework and want to share it – either online, with as many people as possible, or in a live demo or workshop. You need people to start working with it and learning it as fast as possible, while spending as little time as possible getting it set up.

One of the biggest challenges when creating a new development tool is precisely getting it into the hands of users. Chances are your target audience will have to set up the necessary environment and dependencies.

Languages like Golang, NodeJS, and Ruby offer simplified installers and, in the case of Golang and Ruby, in-browser code sandboxes that give you a taste of the language and let you follow a tutorial online.

But to get a real sense of the tool, people need it on their machines. So you either sacrifice the flexibility of working locally, or the immediacy of starting to code right away. That is, unless you take the hassle out of setting up an environment altogether – that’s where Sandbox comes in.

With Sandbox, batteries are included

Let’s try a Ruby application. No, you don’t need to have Ruby installed – it will run inside a container. No, you don’t need Docker installed either – it comes with Sandbox, which includes everything you need to get going, right there in the Git repo. Automatically, no hassle.

Go ahead and clone this Ruby sample repo. Simply run:

git clone https://github.com/stackfoundation/sbox-ruby
cd sbox-ruby
./sbox run server

And your app is running! Run ./sbox status to find out the IP Sandbox is running at, and your app will be at [Sandbox IP]:31001! The app will update as you change the code in /src, so feel free to experiment with it.

What just happened?

The magic comes from the Sandbox binaries included in the Git repo. They are tiny – less than 200k – but they install everything needed to run Docker and Kubernetes. A workflow file, written in easy-to-read YAML, determines what should run where, including caching installation steps and everything needed to live-reload files as they change:

steps:
  - run:
      name: Install dependencies
      image: 'ruby:2.4.2-alpine'
      cache: true
      source:
        include:
          - Gemfile
      script: |-
        gem install foreman
        gem install bundler
        cd app
        bundle install
  - service:
      name: Run Application
      step: Install dependencies
      script: |-
        foreman start -d /app
      source:
        exclude:
          - Gemfile
          - Gemfile.lock
          - src
      volumes:
        - mountPath: /app/src
          hostPath: ./src
      ports:
        - container: 5000
          external: 31001

You can read more about how Sandbox does this here.

As a user, you don’t need to go through the hassle of installing a tool to know if it’s right for you – it just works. And because the workflow files are clear and simple to read, you can get a sense of what needs to happen to make the application run just by glancing at them.

As a developer, your tool can be that easy to share, and that easy to get running on someone else’s machine, with no issues and very little time spent. That means more time and user patience left for trying out your creation, and a lower barrier to entry overall!

Running Linux stuff on Windows in 2017

Back in the day, as a developer who worked on Windows, I used to dread building projects that used Makefiles and had Unix shell scripts in their build process. It meant installing Cygwin or MinGW to get GNU tools that worked on Windows, then running the build and hoping it all magically worked.

Today, MinGW and Cygwin remain options for building Linux projects, running Bash scripts, and just using various Linux tools on Windows.

Windows Subsystem for Linux

But there are better options. If you are on Windows 10, since the Fall Creators Update, the Windows Subsystem for Linux gives you a pretty full-fledged Linux environment on Windows. Follow the guide here to get it set up on your Windows 10 machine. From within the Linux environment, you have access to all the files on your machine, and you can get most Linux tools in the environment so that you can build projects, run Bash scripts, etc. And you’re not running a VM with this option.

VMs and Docker

Of course, a VM running Linux has always been an option, but things have gotten a lot better on this front in 2017. Hardware-assisted virtualization is fairly common, which means VMs run nearly as fast as native. Docker has improved things quite a bit in managing a VM on Windows, to the point where you may not even realize you’re running one. On some versions of Windows 10, Docker for Windows uses the Hyper-V technology built into Windows to run the VM, so you don’t need an external hypervisor like VirtualBox. And if you’re on Windows Home (which doesn’t have Hyper-V support), you can use Docker Toolbox, which takes care of installing VirtualBox for you.

With Docker, your access to the VM – and your whole experience of running Linux on Windows – turns into running a sandboxed container. You can run the Linux scripts, build the projects with Makefiles, and do it all within a container that you can simply delete when you’re done.
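
For example, here’s a minimal sketch (the image, tag and project layout are assumptions) of building a Makefile-based project inside a throwaway container:

# Mount the current directory into an official gcc image (which ships with make),
# run the build, and let --rm delete the container afterwards.
docker run --rm -v "$PWD":/src -w /src gcc:7 make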

Sandbox

Finally, there’s Sandbox. It’s Docker, but more convenient. It’s a single command-line binary that sets up Docker for you before it runs Linux containers. It’s small enough to be checked in to your source control repo. That means if you’re on a team with other developers who like writing Linux scripts, you’re not going to have to translate them to Windows batch scripts. You simply put Sandbox into the repo, and when you check out the repo, you can run those Linux scripts directly.

Here’s a quick example of how it looks when Windows developers have to deal with Linux developers. Imagine you’re a Windows developer on a team working on the following C project, which uses a Makefile:

https://github.com/stackfoundation/sbox-makefiles

Go ahead and clone the project repo, and just run the following to build the application:

git clone https://github.com/stackfoundation/sbox-makefiles
cd sbox-makefiles
./sbox run build-app

Or run the following to build and run the application:

./sbox run run-app

Or the following to run a bash script:

./sbox run hello-script

That’s how simple it can be to run Linux stuff on Windows in 2017.

 

Making development life easier isn’t easy

With Sandbox finally out, after a number of iterations and pivots (more on that in a later post, perhaps?), here is a short, personal account of what got me into this project, and what I see in Sandbox.

When I decided to join the StackFoundation project as a co-founder, I did so mainly out of admiration for both the vision of the project and the ability to achieve it, knowing from previous experience the capabilities of this team. The goal was clear, and a path to that goal appeared evident at the time – to make development life easier.

Writing it now, it sounds awfully generic and bland – which tooling, testing, or workflow company doesn’t claim this as its goal? Yet while the path to achieving it turned out to be a lot messier and blurrier than we thought it would be, the goal itself was as clear then as it is today. So what does this generic-sounding goal mean to me?

In a world with hundreds of solutions for building, testing, deploying, and creating and managing environments, everything seems to be pulling in its own direction. If a tool pleases a developer, it might not be robust or complete enough for a systems manager. If it pleases a systems manager, it will be too complex for a QA to use or adjust. No consistency exists between roles, machines, or targets, and no easy communication between them either.

Streamlining processes should start at the personal level

From the beginning we were attracted to the ideas brought by containerisation, as they seemed to be a good answer to the issues we wanted to address – guaranteed interoperability and consistency across systems, reproducibility of testing/production environments, the safety of sandboxing testing and production operations. The problem was that Docker is not an accessible tool (not to mention Kubernetes). If we were looking for a solution to facilitate development, that meant facilitating the work of all roles in a project, from QA to DevOps, without requiring deep knowledge of new technologies from everyone involved.

This was particularly important to me, as my background as a programmer was definitely less backend- and systems-heavy than the rest of the team’s. To me, in particular, the project needed to tick a few boxes:

Make environments simple to setup

As I said, Docker and Kubernetes aren’t easy. They’re complex tools with a learning curve to overcome – a relatively small one if things have been set up for you, and a deeper understanding of the processes involved if you are the one in charge of setting them up.

One of our first mistakes, in an earlier iteration of Sandbox, was trying to remove this complexity altogether. This led to an exclusively UI-based system, which brought a slew of issues of its own.

Make environments easy to use for everyone

To me, this was the core of our project. As I said before, there are plenty of build/CI/containerisation tools out there, including Docker and Kubernetes, on top of which Sandbox is built. The missing piece is making those tools available and useful to the DevOps, the developer, the QA.

This meant sharing a workflow should “just work”: workflows are committed to Git, and running them installs and runs all the necessary components on the user’s machine. There is no need to manually install machine dependencies – running a single command is all it takes.

Scalable up and scalable down

This actually stems from the previous two objectives, and sums them up nicely – a lot of the tools we use for deploying and managing environments, mainly due to their complexity and weight, tend to be used only in the context of big shared servers – the testing machine(s), production, a cloud CI service, etc. Usually, “containerising localhost”, while appealing for many reasons, tends to be very hard to implement in reality – port issues, setup issues, and simply adding another point of breakage to the pipeline can make it more of a nuisance than a solution.

Yet for a lot of use cases (like the aforementioned QAs) it would be the ideal solution, were it simple and hassle-free. Our tool aims to do exactly that, which means using the same toolset for everything from running the app on the developer’s machine to producing a production-ready build, with all cases in between covered.

Fully automating the daily developer humdrum

In the beginning, and every day, the toil

Do you remember those first few days when you joined a new software project? Perhaps it was at your current job? Or maybe when you started contributing to an open source project? Or maybe it was when you decided to get your hands dirty with the latest fad framework?

Even in 2017, you often start by going through a list of step-by-step instructions in a wiki, a README, or handed to you by a fellow developer. Usually, you need to install this, then install that, put this file here, and run this script there. It’s pretty tedious, error-prone, and often requires improvisation to account for things changing.

And of course, it’s not just at the start of a project. There are tedious tasks that you have to perform daily as a developer. Building your application, running unit tests, running functional tests, setting up seed data, creating certificates, generating versioned packages, setting up infrastructure, deploying your application, the list goes on.

Announcing Sandbox


We’re very proud to announce that today, we’re ready to show off what we’ve been working on for the past few months at StackFoundation: a developer tool called Sandbox. Sandbox is a tool for running Docker-based workflows which reliably automate your day-to-day development chores. Whether it’s building your application, running your automated tests, deploying your application, or any other mundane task you perform regularly, Sandbox can help you create new scripted workflows, or make your existing scripts for these tasks more reliable. If you’re a developer, we built Sandbox to be useful for you every day.

A Pivot

We built Sandbox to address our own needs as developers. We built a tool for ourselves, which is why we think it will be very useful for other developers. But we’ve been wrong before about what others might find useful. For anyone that has seen StackFoundation’s earlier days, and our earlier work, this is a shift in product for us. We spent a bit over a year on our first product, and we were wrong about its appeal. That’s why, this time, we are releasing early.

It’s in Alpha

We’re releasing early, and we’re releasing while Sandbox is still a bit rough around the edges. We are doing this because we want to hear from developers, and we need help from developers.

If you are a developer, we would love it if you could try running a simple workflow with Sandbox, and tell us how it goes. There are a lot of platform-specific nuances that we need help getting right – if Sandbox fails to run on your machine, we especially want to hear from you. But we want to hear from you even if it all works as designed. Leave a comment on the blog, tweet at us, file an issue on our GitHub issue tracker, find us on Gitter, or drop us an email. We want to hear what you think.

And just for anyone wondering: Sandbox is free, not just now in Alpha, but also forever in the future.

Using Google Optimize with Angular

Recently, we’ve started doing A/B testing on the StackFoundation website, to better understand our visitors and to test out changes in design and messaging. It allows us to experiment with different messaging strategies (even ones we would normally dismiss outright), and get real data on the effectiveness of those strategies, as opposed to relying on our biased views (our biases are a good topic for an entirely different discussion).

For that we are using Google Optimize – an A/B testing tool that ties into Google Analytics, and is both free and easy to set up.

How Optimize works

Google Optimize works by injecting or replacing content on the page for a certain percentage of users. The exact changes are usually set up through a Chrome plugin, and triggered on page load. In the case of single-page style websites like ours, changes are triggered by an event.

Using Google Optimize with Angular was, as it turns out, a very simple process. Angular has a pretty robust system of lifecycle hooks we could use, and using the right one was all we needed.

Setting a catch-all hook to trigger experiment changes

For nearly all use cases, all we need is a catch-all. The worry we had here was that it might mean a lot of traffic going back and forth between the user’s machine and Google if a lot of changes occur. As it turns out, reactivating Optimize doesn’t trigger new requests, so this is quite simple.

On the root component of your app, simply:

import { Component, AfterViewChecked } from '@angular/core';

@Component({
    selector: 'my-app',
    template: 
    `<header></header>
    <div class="content" >
        <router-outlet></router-outlet>
    </div>
    <footer></footer>`
})
export class Root implements AfterViewChecked {
    ngAfterViewChecked () {
        if (window['dataLayer']) {
            window['dataLayer'].push({'event': 'optimize.activate'});
        }
    }
}

This ensures any change to the HTML is followed by a check and/or replacement pass by the Optimize snippet, and because it happens after the view has been checked, there is no risk of it being overwritten or colliding with Angular’s own processing.

This has the added benefit of not having any flicker due to content change, except for when the page first loads.

Avoiding content flicker on initial page load

Google provides a page-hiding snippet precisely because of this flicker. Ours being a single-page app, we preferred to avoid any page content or style changes that fall beyond our control. In this particular case, that was a mistake – the snippet does its work very well, and unless very specific, fine-grained control is necessary, it is definitely the way to go.

That being said, we explored how to avoid this flicker using only Angular, and here is the result. I cannot stress this enough – it ended up not being used, as it was deemed better to use Google’s snippet.

Analysing the snippet provided by Google, we see that an object containing data on the specific Optimize account and a callback should be added under the hide property of the dataLayer object Google uses. We also noted that if dataLayer happens to have already been created, the callback won’t run – but if that happens, there is no risk of flicker. Using APP_INITIALIZER:

{
    provide: APP_INITIALIZER,
    useFactory: (...) => {
        return () => {
            return new Promise<boolean>(resolve => {
                if (!window['dataLayer']){
                    let _valObj: any = { 'GTM-XXXXXXX': true };
                    _valObj.start = Date.now();
                    _valObj.end = () => resolve(true);
                    window['dataLayer'] = [];
                    window['dataLayer'].hide = _valObj;
                    setTimeout(function () {
                        resolve(true);
                        _valObj.end = null
                    }, 4000);
                    _valObj.timeout = 4000;
                }
                else {
                    resolve(true);
                }
            });
        };
    },
    [...]
}

 

10 tips for migrating from Maven to Gradle

Here’s a quick list of 10 lessons we learned when making the switch from Maven to Gradle for StackFoundation. Coming from a deeply Maven place, these are the things that gave us an “Aha! That’s how you do it in Gradle!” moment. As with any other tool, these are not the only ways (nor the best) to do these things with Gradle – this is not meant to be a prescriptive list of best practices. Rather, it’s just a few things to help those Mavenistas out there who are thinking of switching to Gradle, or actively switching, and figuring out how to get their minds to think Gradle.

1) Forget the GAV, use the directory layout!

Maven folks are used to thinking about a module’s GAV – its group, artifact, and version. When you switch to Gradle, you don’t have to think about this so much. Gradle will name projects based on the names of their directories by default. So if you have the following multi-project directory structure:

  • server
    • core
      • src/main/java
    • logging
      • src/main/java

These projects are named server, core, and logging. In Gradle, projects are identified with a fully-qualified path – in this case the paths are going to be :server, :server:core and :server:logging.

Note: You can give projects a group and version, if you want.
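
For reference, a minimal sketch of the settings.gradle at the root of that layout might look like this (not taken from a real project):

// settings.gradle – declares the sub-projects by their directory paths
include 'server'
include 'server:core'
include 'server:logging'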

2) Build everything from the root!

More than a GAV, when you start using Gradle regularly, you’ll start thinking of projects by their path.

In Maven, you’re probably used to switching to a particular sub-module directory and then invoking mvn clean install, etc. from there. In Gradle, you kick off all builds from the root of your multi-project setup, and you can simply use a sub-project’s path to kick off a task for that project. For example, you can invoke gradlew :server:logging:build to build the logging sub-project within the top-level server project.

3) Use custom tasks!

In Maven, if you need to perform some custom logic for a build or deploy, you go hunting for a particular plug-in, and then see if invoking one of its goals at some spot within the fixed Maven build lifecycle accomplishes what you want. If not, you try to look for another plug-in. And another, and then you might try writing one yourself.

Gradle is fundamentally built around tasks. You’re going to end up writing a custom task for a lot of what you want to do. Build a package by combining things in a specific way? Write a task. Deploy a service? Write a task. Setup infrastructure? Write a task. And remember all Gradle scripts are Groovy scripts so you are writing Groovy code when writing your tasks. Most of the time, you won’t write a task from scratch – you’ll start with a plug-in (yes, like in Maven, you’ll start by searching for plug-ins), and one of the tasks it defines, and then customize it!

4) Name your tasks, give them a group and description!

If you have a complex Maven project, you are very likely using a number of profiles, and you will probably have a specific order to build things, and maybe even a specific order to run things with different profiles activated. You’ll end up documenting this on a Wiki or a README file in your Git repo. And then you’ll forget to update that document so that eventually, how exactly something is built is only tribal knowledge.

In Gradle, you create custom tasks. That was already point 3 – but once you create them, you can give these tasks a group and a description. We give our most important custom tasks the group name ‘StackFoundation’. That way, when we run gradlew tasks, we see the tasks specific to our project in the list of available tasks to run. A great way to document our tasks.
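
As an illustration (the task name and body here are invented, not one of our real tasks), giving a custom task a group and description looks roughly like this:

task deployService {
    group = 'StackFoundation'
    description = 'Deploys the service to the development environment'
    doLast {
        // the real deployment logic would go here
        println 'Deploying service...'
    }
}

Running gradlew tasks then lists deployService under the StackFoundation group, with its description next to it.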

5) Alias tasks, name them something you will remember!

Picking up from 3 and 4: you can create a task just to alias another task defined by another plugin. For example, the Shadow plugin is the Gradle version of the Maven Shade plugin. You might be happy with the default shadowJar task it provides, but if, in your project, a more meaningful name for creating that shadow JAR package is createServicePackage, you can create an alias:

task createServicePackage(dependsOn: shadowJar)

Note: It’s not exactly an alias, but close enough.

6) The Shadow plugin is the Gradle version of the Maven Shade plugin

This one is used by enough Maven folks that it’s worth repeating.

7) Use the Gradle wrapper

With Maven, you have to get everyone to set up Maven, or use an IDE which comes with Maven built in, in order to run builds for your project. With Gradle, there’s the Gradle wrapper – and you’re meant to check it in to your team’s repo. Set up your project to use the wrapper, and put it in your source control repo! Your team won’t have to think about getting Gradle.
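
For reference, a minimal sketch of configuring the wrapper in the root build.gradle (the version number is just an example; newer Gradle versions let you run gradle wrapper --gradle-version <version> instead):

task wrapper(type: Wrapper) {
    // pin the Gradle version the wrapper should download (example version)
    gradleVersion = '4.2.1'
}

Running gradle wrapper once then generates gradlew, gradlew.bat and the gradle/wrapper directory – commit all of them.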

8) Forget the inheritance parent, use external build scripts to define common tasks

In Maven, you use an inheritance parent to manage dependencies, and plugins.

With Gradle, you can reference other Gradle files from a build.gradle file – you do that using something that looks like: apply from: '../../gradle/services.gradle'. These are called external build scripts, and there are some caveats to using them, but they’re a great way to define common tasks. For example, you can create some common tasks for deploying any of the services you use in your projects inside gradle/services.gradle and reference them from your other Gradle files.

Note: You can also put common task code inside buildSrc.

9) Forget the inheritance parent, create custom libraries

In Maven, you use a parent POM to define common dependencies. With Gradle, you can define common dependencies by putting them in an external build script (described in point 8). Here’s an example of a file in gradle/dependencies.gradle which defines some common libraries we use in all of our projects:

repositories {
    mavenLocal()
    mavenCentral()
}

ext {
    libraries = [
            aws            : {
                it.compile('com.amazonaws:aws-java-sdk-s3:1.11.28') {
                    exclude group: 'org.apache.httpcomponents', module: 'httpclient'
                    exclude group: 'com.fasterxml.jackson.core', module: 'jackson-annotations'
                    exclude group: 'com.fasterxml.jackson.core', module: 'jackson-core'
                    exclude group: 'com.fasterxml.jackson.core', module: 'jackson-databind'
                    exclude group: 'com.fasterxml.jackson.dataformat', module: 'jackson-dataformat-cbor'
                }
            },
            awsEcr         : 'com.amazonaws:aws-java-sdk-ecr:1.11.28',
            datamill       : {
                it.compile('foundation.stack.datamill:core:0.1.1-SNAPSHOT') {
                    exclude group: 'org.apache.httpcomponents', module: 'httpclient'
                }
            },
            datamillLambda : 'foundation.stack.datamill:lambda-api:0.1.1-SNAPSHOT',
            junit5 : [
                'org.junit.jupiter:junit-jupiter-api:5.0.0-M4',
                'org.junit.jupiter:junit-jupiter-migration-support:5.0.0-M4'
            ],
    ]
}

Note the use of GAVs to refer to Maven dependencies, and how you can setup exclusions using this approach. With this approach, we get to give our own names to these libraries instead of referring to everything with GAVs. This is especially great for us because colloquially, we refer to our dependencies using these names and this makes looking at project dependency information clear and concise. In addition, we can group multiple Maven dependencies into one custom user library, as with the junit5 example.

Here’s how a particular project defines the libraries as dependencies:

dependencies {
    compile libraries.datamill(it)
    testCompile libraries.junit5
}

10) Doing resource filtering

In Maven, you probably use resource filtering to replace property placeholders in resource files. There are two equivalents in Gradle – the first is to use ReplaceTokens:

processResources {
    def props = [imageVersion: 'unspecified']
    filesMatching('*.properties') {
        filter(org.apache.tools.ant.filters.ReplaceTokens, tokens: props)
    }
}

This looks for placeholders of the form @imageVersion@, i.e., they’re delimited by @’s. It tolerates missing property names. A second form looks like this:

processResources {
    def props = [imageVersion: 'unspecified']
    filesMatching("**/*.yaml") {
        expand props
    }
}

This looks for property placeholders of the form $imageVersion – well, sort of. It’s actually using a templating mechanism in Groovy, which makes it very powerful, but if you use it for simple cases you’ll probably encounter the following: if a placeholder references a missing property, your build will fail with an error!

That’s all for now! More lessons from our experience migrating to Gradle at StackFoundation will come in a future post. Hope that helps those of you making the switch from Maven!

 

Going serverless with Lambdas

This is the first post of a series where we’ll be writing about the motivation and decision making process behind our cloud strategy for Sandbox.

Setting the stage

Our first assumption was that, given that Sandbox follows a client-server architecture and that most of its core functionality is on the client side, our cloud requirements wouldn’t be as high as if we were a pure SaaS product.

When we started architecting our cloud components for Sandbox, we came up with the following list of requirements:

  1. It had to allow us to have a low time-to-market. Rewriting certain parts later would be something we would accept as a side effect of this.
  2. It had to be cost efficient. We didn’t anticipate much traffic during our early days after our initial go live, so it felt natural to look for a ‘Pay as you go’ model, rather than having resources sitting idle until we ramp up our user base.
  3. It needed to allow us to have a simple go live strategy. This needed to apply both in terms of process and tooling used. It obviously also needed to leverage our previous knowledge of cloud providers and products/services.

So, what did we decide to go with, and most importantly, why?

First, choosing the cloud provider: we had a certain degree of experience using AWS, from our professional life as consultants, but especially from our days working at StackFoundation, so this was the easy choice!

The AWS offering is huge, and constantly evolving, so deciding which services to use wasn’t that straightforward. From a computing perspective, there seemed to be two clear choices: using ECS to handle our EC2 instances, or going serverless with Lambdas. Our experience leaned more towards EC2, but when we revisited our cloud architecture requirements, we concluded Lambdas were a better fit, as they ticked all the boxes:

  1. In terms of actual application development, choosing EC2 or Lambdas didn’t seem to make much of a difference. Once you decide whether to go with plain Lambdas or use a framework that makes Lambda development closer to standard API development, feature development is pretty straightforward, with a low learning curve. We initially went with Jrestless, although we ended up crafting our own solution, the lambda-api module of the Datamill project – more on this in future posts. In the EC2 world, this would have meant using micro-services running on containers, so pretty much the same picture.
  2. The term ‘Pay as you go’ feels like it was coined for Lambdas, at least in terms of compute resources. You get billed based on your memory requirements and actual usage period (measured to the nearest 100ms). With EC2 instances, on the other hand, you have to keep your instances live 24×7. So we either had to go with low-powered T2 family instances to keep the cost low, and struggle if we got unpredicted usage spikes, or go with M3 family instances and potentially waste money on them during periods where they were under-used – which would probably be the case during our early preview period.
  3. In terms of overall architecture, it is more complicated to configure ECS clusters and all the different components around them than to simply worry about actual application code development, which is one of the selling points of Lambdas (and the serverless movement in general). This may be an overstatement, as even with Lambdas there is additional configuration – for example, you also need to take care of permissions (namely Lambda access to other AWS resources like SES, S3, DDB, etc.) – but it is at least an order of magnitude simpler than doing this on the ECS front. In addition, even though the AWS console is the natural path to start with until you get a grasp of how the service works and the different configuration options you have, you will eventually want to migrate to some kind of tooling to take care of deployment, both for your lower environments and for production. We came across Serverless, a NodeJS-based framework that helps you manage your Lambda infrastructure (or other function-based solutions by other providers). The alternative would probably have been to go with Terraform, which is an infrastructure-as-code framework. Our call here was motivated both by project needs and by the learning-curve overhead of the tooling. If we went with Lambdas, they would be pretty much the only AWS resource we’d need to worry about, so Serverless offered all we needed in this sense. In terms of the learning curve, using a general-purpose framework like Terraform would mean dealing with concepts and features we wouldn’t need for our use case.

So, now that we’re live, how do we feel about our assumptions/decisions after the fact?

In terms of application development, the only caveat we found concerned E2E testing. We obviously have a suite of unit and integration tests for our cloud components, but during development we realised some bugs arose from the integration with our cloud frontend that couldn’t be covered by our integration tests. This was especially true when services like API Gateway were involved, as we haven’t found a way of simulating them in our local environment. We came across localstack late in the day and gave it a quick chance, but it didn’t seem stable enough. We couldn’t spend longer on it at the time, so we decided to cover these corner scenarios with manual testing. We might revisit this decision going forward, though.

In terms of our experience working with Lambdas, the only downside we came across was the problem of cold Lambdas. AWS will keep your Lambdas running for an undocumented period, after which any usage means the Lambda infrastructure needs to be recreated before it can serve requests. We knew about this before going down the Lambda path, so we can’t say it came as a surprise. We had agreed that, in the worst-case scenario, we would go with a solution we found proposed on several sites: having a scheduled Lambda that keeps our API Lambdas “awake” permanently. To this end, we added a health check endpoint to every function, which is called by our scheduled Lambda. In future releases we will also use the health check response to take further actions, for example to notify us of downtime.
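
As a hedged sketch, here is roughly what that keep-warm schedule might look like in a Serverless framework serverless.yml (the function name, handler and rate are made-up examples, not our actual configuration):

functions:
  keepWarm:
    handler: warmer.ping           # hypothetical handler that calls each health check endpoint
    events:
      - schedule: rate(5 minutes)  # assumed rate; tune it to the observed idle timeout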

Overall, we’re quite happy with the results. We may decide to change certain aspects of our general architecture, or even move to an ECS-based solution if we go SaaS, but for the time being this one seems like a perfect fit for our requirements. Further down the line, we’re planning to write a follow-up post on the concepts discussed here, with some stats and further insights into how this works both in production and as part of our SDLC.

Of course, there is no rule of thumb here, and what works for us may not work for other teams or projects, even if they share our list of requirements. In any case, if you’re going down the serverless journey, we’d love to hear about your experiences!

Understanding NgZone

Angular 2 does a lot of things differently from Angular 1, and one of its greatest changes is in change detection. Understanding how it works has been essential, particularly when using Protractor for E2E testing. This post explores how to work with zones for testing and performance.

A live example of the mentioned code is here.

The biggest change in how Angular 2 handles change detection, as far as a user is concerned, is that it now happens transparently through zone.js.

This is very different from Angular 1, where you have to specifically tell it to synchronise – even though both the built-in services and the template bindings do this internally. What this means is that while $http or $timeout do trigger change detection, if you use a third-party script, your Angular 1 app won’t know anything happened until you call $apply().

Angular 2, on the other hand, does this entirely implicitly – all code run within the app’s Components, Services or Pipes exists inside that app’s zone, and just works.

So, what is a zone?

zone.js’s zones are actually a pretty complicated concept to get your head around. At the risk of over-simplifying, they can be described as managed code-calling contexts – closed environments that let you monitor, control, and react to all events, from asynchronous tasks to errors thrown.

The reason this works is that, inside these zones, zone.js overrides and augments the native methods – Promises, timeouts, and so on – meaning your code doesn’t need to know about zones to be monitored by them. Every time you call setTimeout, for instance, you unknowingly call an augmented version, which zone.js uses to keep tabs on things.

What Angular does is use zone.js to create its own zone and listen to it, and what this means for us as Angular users is this – all code run inside an Angular app is automatically listened to, with no work on our part.

Most of the time, this works just fine – all change detection “just works” and Protractor will wait for any asynchronous code you might have. But what if you don’t want it to? There are a few cases where you might want to tell Angular not to wait for, or listen to, some tasks:

  • An interval to loop an animation
  • A long-polling http request / socket to receive regular updates from a backend
  • A header component that listens to changes in the Router and updates accordingly

These are cases where you don’t want Angular to wait on the asynchronous tasks, or change detection to run every time they fire.

Control where your code runs with NgZone

NgZone gives you back control of your code’s execution. There are two relevant methods in NgZone – run and runOutsideAngular:

  • runOutsideAngular runs a given function outside the Angular zone, meaning its code won’t trigger change detection.
  • run runs a given function inside the Angular zone. It is meant to be called inside a block created by runOutsideAngular, to jump back in and tell Angular to start listening again.

So, this code will have problems being tested, as the app will be constantly unstable:

this._sub = Observable.timer(1000, 1000)
    .subscribe(i => {
        this.content = "Loaded! " + i;
    });

Wrapping the timer in runOutsideAngular, and jumping back into the zone only when updating bound state, fixes that:

this.ngZone.runOutsideAngular(() => {
    this._sub = Observable.timer(1000, 1000)
        .subscribe(i => this.ngZone.run(() => {
            this.content = "Loaded! " + i;
        }));
});

Simplifying usage

After understanding how NgZone works, we can simplify its usage, so that we don’t need to sprinkle NgZone.runOutsideAngular and NgZone.run all over the place.

We can create a SafeNgZone service to do exactly that, since the most common use case is:

  • Subscribe to an Observable outside the angular zone
  • Return to the angular zone when reacting to that Observable.

import { Injectable, NgZone } from '@angular/core';
import { Observable } from 'rxjs/Observable';
import { Subscription } from 'rxjs/Subscription';
import { PartialObserver } from 'rxjs/Observer';

@Injectable()
export class SafeNgZone {

    constructor(private ngZone: NgZone) {}

    safeSubscribe<T>(
        observable: Observable<T>,
        observerOrNext?: PartialObserver<T> | ((value: T) => void),
        error?: (error: any) => void,
        complete?: () => void): Subscription {
        // Subscribe outside the Angular zone so the source itself
        // (timers, polling, etc.) does not keep the app unstable...
        return this.ngZone.runOutsideAngular(() =>
            observable.subscribe(
                this.callbackSubscriber(observerOrNext),
                error,
                complete));
    }

    // ...and wrap the callbacks so they re-enter the zone when they fire.
    private callbackSubscriber<T>(obs?: PartialObserver<T> |
        ((value: T) => void)) {
        if (typeof obs === "object") {
            let observer: PartialObserver<T> = {
                next: (value: T) => {
                    obs['next'] &&
                        this.ngZone.run(() => obs['next'](value));
                },
                error: (err: any) => {
                    obs['error'] &&
                        this.ngZone.run(() => obs['error'](err));
                },
                complete: () => {
                    obs['complete'] &&
                        this.ngZone.run(() => obs['complete']());
                }
            };

            return observer;
        } else if (typeof obs === "function") {
            return (value: T) => {
                this.ngZone.run(() => obs(value));
            };
        }
    }
}

With this, the previous code gets simplified quite a bit:

// The following:
this.ngZone.runOutsideAngular(() => {
    this._sub = Observable.timer(1000, 1000)
        .subscribe(i => this.ngZone.run(() => {
            this.content = "Loaded! " + i;
        }));
});

// Becomes:
this._sub = this.safeNgZone.safeSubscribe(
    Observable.timer(1000, 1000),
    i => this.content = "Loaded! " + i);

 

Backup and replication of your DynamoDB tables

We are using Amazon’s DynamoDB (DDB) as part of our platform. As stated in the DynamoDB FAQ, AWS itself replicates the data across three facilities (Availability Zones, AZs) within a given region, to automatically cope with an eventual outage of any of them. This is a relief, and useful as an out-of-the-box solution, but you’ll probably want to go beyond this setup, depending on what your high availability and disaster recovery requirements are.

I have recently done some research and POCs on how best to achieve a solution in line with our current setup. We needed it to:

  • be as cost effective as possible, while covering our needs
  • introduce the least possible complexity in terms of deployment and management
  • satisfy our current data backup needs, and be in line with allowing us to handle high availability in the near future.

There’s definitely some good literature on the topic online (1), besides the related AWS resources, but I have decided to write a series of posts which will hopefully provide a more practical view of the problem and the range of possible solutions.

In terms of high availability, probably your safest bet is to go with cross-region replication of your DDB tables. In a nutshell, this allows you to create replica tables of your master ones in a different AWS region. Luckily, AWS Labs provides an implementation of how to do this, open-sourced and hosted on GitHub. If you take a close look at the project’s README, you’ll notice it is implemented using the Kinesis Client Library (KCL). It works by consuming DDB Streams, so for this to work, streaming needs to be enabled on the DDB tables you want to replicate, at least on the master ones (replicas don’t need it).
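
For example, streaming can be enabled on an existing table from the AWS CLI (the table name is a placeholder; as far as I recall the replication library expects the NEW_AND_OLD_IMAGES view type – check its README):

# Enable a stream on the master table so the replication worker can consume its changes
aws dynamodb update-table \
    --table-name MyMasterTable \
    --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES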

From what I’ve seen, there are several ways of accomplishing our data replication needs:

Using a CloudFormation template

Using a CloudFormation (CF) template to take care of setting up all the infrastructure you need to run the cross-region replication implementation mentioned above. If you’re not very familiar with CF, AWS describes it as:

AWS CloudFormation gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.

Creating a stack with it is quite straightforward, and the wizard will let you configure the options shown in the screenshot below, besides some more advanced ones on the following screen, for which you can use the defaults in a basic setup.

[Screenshot: CloudFormation stack configuration options]

Using this template takes care of creating everything, from your IAM roles and Security Groups to launching the EC2 instances that perform the job. One of those instances takes care of coordinating replication, and the other(s) take care of the actual replication process (i.e. running the KCL worker processes). The worker instances are implicitly defined as part of an autoscaling group, so as to guarantee that they are always running and to prevent events from the DDB stream from going unprocessed, which would lead to data loss.

I couldn’t fully test this method: after CF finished setting everything up, I couldn’t use the ReplicationConsoleURL to configure master/replica tables due to the AWS error below. In any case, I wanted more fine-grained control of the process, so I looked into the next option.

[Screenshot: AWS error shown when opening the ReplicationConsoleURL]

Manually creating your AWS resources and running the replication process

This basically means performing yourself most of what CF does on your behalf. So it means quite a bit more work in terms of infrastructure configuration, be it through the AWS console or as part of your standard environment deployment process.

I believe this is a valid scenario if you want to use your existing AWS resources to run the worker processes. You’ll need to weigh your cost restrictions and computing resource needs before considering this a valid approach. In our case, it would help us with both, so I decided to explore it further.

Given that we already have EC2 resources set up as part of our deployment process, I decided to create a simple bash script that kicks off the replication process as part of our deployment. It basically takes care of installing the required OS dependencies, cloning the Git repo and building it, and then executing the process. It requires 4 arguments to be provided (source region/table and target region/table). Obviously, it doesn’t perform any setup on your behalf, so the tables passed as arguments need to exist in the specified regions, and the source table must have streaming enabled.
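
As a rough sketch of what such a script might look like (the package names, build command and especially the replication jar’s CLI flags are assumptions – check the library’s README for the exact invocation):

#!/bin/bash
# Sketch only: flag names, jar path and package names below are illustrative placeholders.
set -e

SOURCE_REGION=$1
SOURCE_TABLE=$2
TARGET_REGION=$3
TARGET_TABLE=$4

# Install OS dependencies (package names assume a yum-based instance with Maven available)
sudo yum install -y git java-1.8.0-openjdk-devel maven

# Clone and build the AWS Labs replication library (assuming a Maven build)
git clone https://github.com/awslabs/dynamodb-cross-region-library
cd dynamodb-cross-region-library
mvn clean package -DskipTests

# Kick off the replication worker with the source/target arguments
java -jar target/dynamodb-cross-region-replication-*.jar \
    --sourceRegion "$SOURCE_REGION" --sourceTable "$SOURCE_TABLE" \
    --destinationRegion "$TARGET_REGION" --destinationTable "$TARGET_TABLE"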

This proved to be a simple enough approach, and it worked as expected. The only downside is that, even though it runs within our existing EC2 fleet, we still needed to figure out a mechanism for monitoring the worker process, in order to restart it in case it dies for any reason and avoid the data loss mentioned above. Definitely an approach we might end up using in the near future.

Using lambda to process the DDB stream events

This method uses the same approach as the above, in that it relies on events from your DDB table streams, but it removes the need to manage the AWS compute resources required to do so. You will still need to handle some infrastructure and write the Lambda function that performs the actual replication, but it definitely helps with the cost and simplicity requirements mentioned in the introduction.

I will leave the details of this approach for the last post of this series, though, as it is quite a broad topic that I will cover there in detail.

In upcoming posts I will discuss the overall solution we end up going with, but before getting to that, in my next post I will discuss how to back up your DDB tables to S3.

Stay tuned!