From “Hello World” to a Production-Grade Microservice

Jingjie Zhan
Published in Level Up Coding
6 min read · Feb 3, 2021

With the resources available online, it’s quite easy today to build a “hello world” service.

Image from Towards Data Science

For example, this is how to write a Hello World service in Golang.
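
Here is a minimal sketch of such a service using only the standard library’s net/http package. The route, the port, and the greeting helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// greeting returns the response body; kept separate so it is easy to test.
func greeting() string {
	return "Hello, World!"
}

// helloHandler answers every request with the greeting.
func helloHandler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, greeting())
}

func main() {
	http.HandleFunc("/", helloHandler)
	// Port 8080 is an arbitrary choice for this sketch.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Assuming the file is saved as main.go, `go run main.go` starts the server and http://localhost:8080 returns the greeting.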

It’s similarly easy to do it in Node.js.

However, figuring out how to go from this to a production-grade microservice is not obvious. The resources online for building a production service are rather fragmented and sometimes scarce.

The goal of this post is NOT to provide all the answers for building a production microservice, but to provide pointers on key things to consider. This is not an exhaustive list, so please suggest more in the comments!

Basics:

Below are the basics that almost all kinds of services will need.

Image from hackernoon.com

Programming:

  • Language. Which language to choose? For example, Golang is a performant language that’s especially good at concurrency; Python might be a good choice if the service is related to machine learning or data analytics; Node.js can be a good choice if you want to keep JavaScript as the language across the full stack; Java is widely used, and it’s a plus if the team is already well versed in the Spring framework.
  • Dependency management. Each language has its own dependency management tools. It can be tricky to get right when 1. there are both external and internal dependencies, 2. dependencies need to be upgraded, and 3. more than one version of the same library is depended on.
  • Configuration. Most services need to load configurations into the process, like PORT, database connection string, or authentication token. How to load the configs? How to differentiate the configs between local/staging/prod environments? How to pass configs around in the application code efficiently? How to dynamically reload configs when they change?
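
One common answer to the configuration questions above is to read settings from environment variables at startup and fail fast when something required is missing. The sketch below assumes that approach; the variable names (APP_ENV, PORT, DATABASE_URL), the defaults, and the Config fields are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// Config holds the settings the service needs at startup.
// The field names are illustrative assumptions.
type Config struct {
	Env         string // "local", "staging", or "prod"
	Port        int
	DatabaseURL string
}

// loadConfig reads settings through getenv so tests can supply fake values.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{Env: "local", Port: 8080} // defaults for local development

	if v := getenv("APP_ENV"); v != "" {
		cfg.Env = v
	}
	if v := getenv("PORT"); v != "" {
		p, err := strconv.Atoi(v)
		if err != nil {
			return Config{}, fmt.Errorf("invalid PORT %q: %w", v, err)
		}
		cfg.Port = p
	}
	cfg.DatabaseURL = getenv("DATABASE_URL")
	// Fail fast outside local development when a required value is missing.
	if cfg.DatabaseURL == "" && cfg.Env != "local" {
		return Config{}, errors.New("DATABASE_URL is required outside local")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	// The database URL is deliberately not printed, since it may hold credentials.
	fmt.Printf("env=%s port=%d\n", cfg.Env, cfg.Port)
}
```

Passing the resulting Config struct explicitly to the components that need it is one way to avoid global state. Dynamic reloading is not covered by this sketch.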

Web Server:

  • Web server library. Which server library to choose? Most popular programming languages provide a default HTTP server, and there is a myriad of third-party servers to choose from. What are their pros and cons? Which one best fits your needs?
  • Middleware. What middleware do you need? Some common ones are logging, recovery from exceptions, and telemetry. Here’s a list of ExpressJS middlewares.
  • Protocol. HTTP REST or GraphQL? gRPC, Thrift, or WebSocket? They each have their advantages and use cases, so the right choice depends on your needs.

Quality:

Code quality can mean many things. Two important ones are code understandability and code coverage.

Photo by Scott Graham on Unsplash

Understandability:

  • Cognitive complexity. Is the code easy to understand, even for someone new? Is it easy to build on top of? Does the code have the right level of abstraction without complicating things? Does the folder/file structure make sense?
  • Linting. Consistent and standard code style is critical for code readability, and linters do more than just style checks. Which linter to use? How to make it part of the local development flow? And how to make it run as part of a CI build?
  • Documentation. For HTTP APIs, do they have Swagger/OpenAPI documentation, and is it up to date when the API changes? Does the code base have a sufficient amount of comments to explain the code? Does it have an informative and concise README?

Testing:

  • Unit tests. How to write better unit tests, and achieve better code coverage? Which test frameworks to use? How to efficiently mock objects? How to do test-driven development?
  • Component/integration tests. How to create them? And how to run them locally and on CI, in a fast and automated way?
  • Code coverage. How to report code coverage, and how to detect when coverage slips?
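
On the mocking question, one approach that needs no framework is to have the code depend on a small interface and substitute a hand-written fake in tests. The sketch below assumes that approach; UserStore, greet, and fakeStore are hypothetical names. In a real project the cases in main would live in a _test.go file and use the standard testing package.

```go
package main

import (
	"errors"
	"fmt"
)

// UserStore is the dependency the code under test relies on.
// Depending on an interface lets tests swap in a fake.
type UserStore interface {
	Name(id int) (string, error)
}

// greet builds a greeting for a user, falling back to "guest" when the
// lookup fails.
func greet(store UserStore, id int) string {
	name, err := store.Name(id)
	if err != nil {
		return "Hello, guest!"
	}
	return "Hello, " + name + "!"
}

// fakeStore is a hand-written test double backed by a map.
type fakeStore map[int]string

func (f fakeStore) Name(id int) (string, error) {
	if name, ok := f[id]; ok {
		return name, nil
	}
	return "", errors.New("user not found")
}

func main() {
	store := fakeStore{1: "Ada"}
	// Table-driven cases: each row is an input and the expected output.
	cases := []struct {
		id   int
		want string
	}{
		{1, "Hello, Ada!"},
		{2, "Hello, guest!"},
	}
	for _, c := range cases {
		got := greet(store, c.id)
		fmt.Printf("id=%d got=%q ok=%t\n", c.id, got, got == c.want)
	}
}
```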

Observability:

Without strong visibility into the system, you are flying blind. The three pillars of visibility are logging, telemetry, and distributed tracing.

Image from dynatrace.com

  • Logging. How to collect structured and leveled logs? Which logging library to use? How to link all the logs from a single session together with a unique ID?
  • Telemetry. Which framework to use for collecting metrics: StatsD, Prometheus, or something else? How to set it up? And how to build alerting on top of it, with basic threshold alerts, anomaly detection, and others? Integration with PagerDuty, Slack, and email is often necessary.
  • Distributed Tracing. Distributed tracing is useful for pinpointing performance bottlenecks and issues in a distributed system. How to set it up? It’s less useful when not all microservices participate, so how to make it consistent across services and teams?

Security:

With microservices, security responsibility is often spread across more services and teams. There are many things we can do to make a system more secure.

Image from xkcd

  • Secure by design. For example, validate and sanitize user input.
  • HTTPS. Use HTTPS for both external and internal APIs/websites. Attackers can be outside and inside your network.
  • Authentication/Authorization. Authentication verifies that a caller is who it claims to be. Authorization makes sure authenticated users have the right level of permissions. It’s not a trivial topic and is worth a separate post.
  • Secrets management. First, don’t check secrets into source control. Then, how to securely store and transmit secrets? There are solutions like HashiCorp Vault or solutions from cloud providers.
  • Rate-limiting. Proper rate limiting can blunt denial-of-service attacks and protect internal services’ integrity and performance. This can be implemented at the API Gateway level for public APIs, or in a service mesh or middleware. And there are many techniques for enforcing rate limits.

Static security analysis:

  • Scan the code. Use proper tools to scan for issues like hard-coded credentials, unsafe SQL query construction, insecure encryption algorithms, insecure random number generation, etc.
  • Scan dependencies. Third-party dependencies often make up more than 80% of the code in an application. Use static analysis tools to find vulnerabilities in them.
  • Scan Docker images. Use a minimal base image, run with the least privilege, scan for open source vulnerabilities, etc.
  • Scan the Docker and Kubernetes configuration files for vulnerabilities.

Development flow:

Having an efficient, robust, yet understandable development environment is crucial to developer productivity (and happiness).

Image from Docker

  • Local environment setup. How to build and run a microservice locally with one command and, equally important, make it fast? How to ensure the environment is the same on every developer’s machine, and the same as on CI, staging, and production?
  • Docker. How to write a production-quality Dockerfile that produces a small image (faster deployment and a smaller attack surface), builds fast, and stays understandable (hopefully under 15 lines)?
  • CI/CD. Which system to choose? For example, GitHub Actions, TravisCI, CircleCI, or DroneCI? How to create a production-quality configuration file that makes the CI fast and comprehensive? Which artifact registry to use for storing the built artifacts?

Topics for a different day:

To keep the post short, other important topics related to microservices won’t be covered today:

  • Container orchestration
  • Data storage
  • API Gateway
  • Service to service communication
  • And many more…

Thank you all for reading this. I hope one takeaway from this post is that building a production-grade microservice is neither easy nor obvious. It takes experience and time to get things right. And it’s tougher to do it consistently across engineering teams.

Handling all of the things above takes time away from building the business logic. I’m working on a mission to help engineering teams move faster with production quality. If your team is interested in a technical consulting session, feel free to reach out to me through LinkedIn.
