How to Upload a File from an Angular App using a JSON POST rather than form-data

Documenting my two-week struggle to get over far more hurdles than I ever expected

Michael Jacobson
Level Up Coding

--

Hurdles lined up on a running track
Image by Alohaflaminggo on shutterstock

I’m writing this story partly for my future self — so I’ll have this recipe at my fingertips if I ever need to do it again — and partly for others, to hopefully spare some poor souls some of the pain that I just went through.

Prologue

I inherited a task to implement a file upload feature in our Angular 8 app.

What made this task a bit different than a typical file upload was that we weren’t uploading directly to the file storage server, we were uploading to our intermediate Node server (owned by the Frontend team, my team) which then POSTed to the final destination file storage microservice (owned by the Backend team).

This was new to me, as well as to the rest of the team, so research was required.

If I had been looking to POST form-data from the browser directly to the file storage server, I would have been on Easy Street, as I discovered a sea of relevant and helpful resources.

Happy sporty man with raised up arms standing on mountaintop looking out at the sea
Photo by Denis Belitsky on shutterstock

Alas, I was taking the road less traveled by.

Dense forest
Photo by Family Way Studio on shutterstock

And that made all the difference.

The information I needed was scattered among many different sources, and gathering them all together into something that actually worked took a lot of time, research, and failed attempts.

The Ultimate Goal: POST JSON from Node to Backend Microservice

Ultimately what we needed to arrive at was a JSON POST from our Node server to our backend endpoint structured like this:

{
"filename": "important-info.csv",
"data": "QXMgSSBzYXQgaW4gdGhlIGJhdGggdHViLCBzb2FwaW5nIGE...",
"md5Hash: "a0450857392c612ea0f5369864b60194"
}

where data is the base64-encoded file data, and md5Hash is an MD5 hash value of the data to be used as a checksum to validate its integrity when it reaches the backend.

Task #1: POST from Browser (Angular) to Node

The POST from Angular would be just a typical JSON POST, not a form-data POST.

So we needed to:

  1. Extract the file’s raw data (ArrayBuffer to JavaScript folks, byte array to others)
  2. Generate a base64-encoded string of that data
  3. Generate an MD5 hash of that data

We happened to be using PrimeNG’s FileUpload component, but there are many ways for your JavaScript code to get a hold of files being uploaded.

What we were getting from the FileUpload component was an array of File objects.

Task 1.1: Extract File’s Raw Data

To get the file’s ArrayBuffer data, I used the FileReader API, which provides an interface for asynchronously reading the data from a File object.

Task 1.2: Get Base64-Encoded String of File Data

Once I got a hold of the ArrayBuffer, I encoded it to a base64 string using the handy base64-arraybuffer library, which encodes and decodes base64 to and from ArrayBuffers. Perfect!

Here’s the code that completes Tasks 1.1 and 1.2, getting us from our File object to a base64-encoded string of the file’s data (wrapped in an Observable because, like all good-hearted Angular developers, we handle our asynchronous events as streams):
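A minimal sketch of those two steps, assuming the encode export from base64-arraybuffer (the function name fileToBase64 is my placeholder):

```typescript
import { Observable } from 'rxjs';
import { encode } from 'base64-arraybuffer';

// Task 1.1 + 1.2: read the File's bytes as an ArrayBuffer with FileReader,
// then base64-encode them.
function fileToBase64(file: File): Observable<string> {
  return new Observable<string>(subscriber => {
    const reader = new FileReader();
    reader.onload = () => {
      // result is an ArrayBuffer because we call readAsArrayBuffer below
      subscriber.next(encode(reader.result as ArrayBuffer));
      subscriber.complete();
    };
    reader.onerror = () => subscriber.error(reader.error);
    reader.readAsArrayBuffer(file);
  });
}
```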

Task 1.3: Get MD5 Hash of File Data

The next thing we needed was an MD5 hash of the file data.

To get that, I used the SparkMD5 library in combination with this clever usage of the library written by Michael Monerau that reads the file in chunks to avoid loading the whole file into memory at once. (You’ll notice that I wrapped his Promise-based solution as an Observable because…good-hearted Angular developer.)

His solution provides two formatting options for the final hash. I needed the hexdigest form, so that’s the one I used.
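The shape of that chunked approach looks roughly like this (the function name fileToMd5Hash and the 2MB chunk size are my choices, not from the original):

```typescript
import { Observable } from 'rxjs';
import * as SparkMD5 from 'spark-md5';

// Hash the file in 2MB slices so the whole file never has to sit in memory.
function fileToMd5Hash(file: File, chunkSize = 2 * 1024 * 1024): Observable<string> {
  return new Observable<string>(subscriber => {
    const spark = new SparkMD5.ArrayBuffer();
    const reader = new FileReader();
    let cursor = 0;

    const readNextChunk = () =>
      reader.readAsArrayBuffer(file.slice(cursor, cursor + chunkSize));

    reader.onload = () => {
      spark.append(reader.result as ArrayBuffer);
      cursor += chunkSize;
      if (cursor < file.size) {
        readNextChunk();
      } else {
        subscriber.next(spark.end()); // end() returns the hexdigest string
        subscriber.complete();
      }
    };
    reader.onerror = () => subscriber.error(reader.error);

    readNextChunk();
  });
}
```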

Task 1.4: Gather Pieces Together for POST(s) to Node

With the utility methods in place to generate the data we needed, we then just had to gather it all together in our FileUploadService.

Here’s the service method that our component calls, passing in the array of File objects obtained from the FileUpload component:
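The method looked something along these lines (the names getPostBodyForFileUpload and uploadFiles come from the text below; the rest is a sketch, not the original implementation):

```typescript
import { Observable, forkJoin } from 'rxjs';
import { switchMap } from 'rxjs/operators';

// In FileUploadService. Build a POST body per file, then upload in batches.
saveFiles(files: File[]): Observable<unknown> {
  const bodies$ = files.map(file => this.getPostBodyForFileUpload(file));
  return forkJoin(bodies$).pipe(
    switchMap(bodies => this.uploadFiles(bodies))
  );
}
```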

That method calls two other methods: one to build a POST body for each file, and the other to perform the actual upload of the files in batches.

Construct the POST Bodies

The first method, getPostBodyForFileUpload, pulls the pieces together for each file. Here’s what that method looks like:
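A sketch of that method, assuming the two utility functions from Tasks 1.1–1.3 are named fileToBase64 and fileToMd5Hash (illustrative names):

```typescript
import { Observable, forkJoin } from 'rxjs';
import { map } from 'rxjs/operators';

// Hypothetical shape of the POST body expected by the backend
interface FileUploadBody {
  filename: string;
  data: string;    // base64-encoded file contents
  md5Hash: string; // hexdigest MD5 of the file contents
}

// In FileUploadService. Run both async tasks, then shape the POST body.
getPostBodyForFileUpload(file: File): Observable<FileUploadBody> {
  return forkJoin([fileToBase64(file), fileToMd5Hash(file)]).pipe(
    map(([data, md5Hash]) => ({ filename: file.name, data, md5Hash }))
  );
}
```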

Getting the encoded file data and generating the MD5 hash are both asynchronous tasks that we wrapped as Observables, so we just use forkJoin to perform the two tasks and map the results to an object with the structure we need for our POST body.

Perform the POSTs

The second method called by saveFiles, uploadFiles, takes the array of POST bodies and performs the actual calls to our HttpService. Here’s that method:
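Roughly, it looked like this (the URL and the FileUploadBody type are placeholders of mine; the grouping helper is described just below):

```typescript
import { Observable, forkJoin } from 'rxjs';

// In FileUploadService. Batch the bodies under the per-request size limit,
// then POST each batch through our HttpService.
uploadFiles(bodies: FileUploadBody[]): Observable<unknown[]> {
  const uploadGroups = this.getGroupedFileUploadBodies(bodies);
  const requests = uploadGroups.map(group =>
    this.httpService.post('/api/files', group) // placeholder endpoint
  );
  return forkJoin(requests);
}
```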

Each POST could include an array of files, but because of the size limitation for each request (see Infrastructure/Architecture Hurdles below), I created a method called getGroupedFileUploadBodies that iterates over all of the files and groups them into collections that are within the size limit. So I ended up with an array of arrays (uploadGroups), which I then mapped to an array of POST requests (requests).
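The grouping logic itself is plain greedy bin-filling. A self-contained sketch (using the base64 string length as the size estimate is my simplification):

```typescript
// Minimal stand-in for a POST body; data is the base64-encoded file contents.
interface SizedBody {
  data: string;
}

// Greedily pack bodies into groups whose combined approximate payload size
// stays within the per-request limit. A single oversized body still gets its
// own group; the caller is expected to reject those earlier.
function getGroupedFileUploadBodies<T extends SizedBody>(
  bodies: T[],
  maxGroupSize: number
): T[][] {
  const groups: T[][] = [];
  let current: T[] = [];
  let currentSize = 0;

  for (const body of bodies) {
    const size = body.data.length; // one base64 char ≈ one byte on the wire
    if (current.length > 0 && currentSize + size > maxGroupSize) {
      groups.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(body);
    currentSize += size;
  }
  if (current.length > 0) {
    groups.push(current);
  }
  return groups;
}
```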

Infrastructure/Architecture Hurdles

With the uploads finally POSTing from the browser, we entered this next phase of our journey: infrastructure hurdles.

Hurdle #1: Express Middleware Request Body Size

The first one was an obvious one that I had already planned for.

We use Express body-parser middleware on our Node server, and the default max request body size is 100KB, so I had to increase that to account for the 2GB file size limit requested by the Product Owner:
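With body-parser that’s a one-line limit option (the exact limit string is whatever your product requires):

```javascript
const express = require('express');
const bodyParser = require('body-parser');

const app = express();
// Raise the default 100KB JSON body limit to cover 2GB uploads
app.use(bodyParser.json({ limit: '2gb' }));
```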

Easy peasy!

Unfortunately, all subsequent hurdles were much less obvious.

Hurdle #2: Kubernetes with NGINX Ingress Controller

We were finding that uploads larger than about 1MB were triggering error responses, specifically 413 Request Entity Too Large error response pages from what looked like an nginx server.

We also found that the requests were not reaching our Node server, as the Node server logs had no record of them.

Know Your Architecture!

We use Kubernetes to manage our services, and those of you familiar with it and with the NGINX Ingress Controller often used in conjunction with it probably knew right away where that 413 response was coming from.

Unfortunately, I was woefully ignorant of it at the time. It took several sessions with the Backend folks and the SRE folks, explaining exactly what we were doing from the front end, before they figured out that the ingress controller was intercepting these requests from the browser to our Node server inside the Kubernetes cluster, and that the controller has a default max request body size of 1MB.

Thus the 413 response and no record of the request in our Node server logs.

Getting over that hurdle required adding the annotation nginx.ingress.kubernetes.io/proxy-body-size: "0" to our app’s YAML file to prevent the ingress controller from imposing a body size limit on requests to our Node server:
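In the Ingress resource for the app, the annotation looks like this (the metadata names are placeholders, and the apiVersion may differ for your cluster):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend-node
  annotations:
    # "0" disables the ingress controller's body size check (default is 1m)
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
```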

Hurdle #3: Our Noisy Node Logs

Once past the 1MB limit, our next hurdle was 10MB.

Requests larger than 1MB were finally reaching our Node server, but requests larger than 10MB were erroring out on the POST from Node to our Backend service.

That request was an internal, intra-cluster request and therefore not passing through the ingress controller, so we needed to look elsewhere.

I quickly discovered that troubleshooting these errors would be impossible with our Node logs at the time.

We had logging in place on our Node server that logged the POST body being sent on each request from Node, as well as the error object received when an error occurred. The error object, in turn, included the request object. So each POST body was being written to the logs at least twice.

Since we had never done file uploads before, this was never an issue. But now that our POST bodies could include, say, a 25MB base64 string, logging that full POST body — and multiple times if an error occurred — turned our logs into nothing but noise.

Poltergeist movie image: little girl touching TV screen that’s all static

So before I could even begin to troubleshoot effectively, I had to find everywhere in our code where a request body or an error object was being written to the log, and truncate that log message to a reasonable size.
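The truncation itself can be as simple as this (the helper name and the 1KB cap are my choices):

```typescript
// Cap a log message at maxLength characters, noting how much was dropped.
function truncateForLog(message: string, maxLength = 1024): string {
  if (message.length <= maxLength) {
    return message;
  }
  const dropped = message.length - maxLength;
  return `${message.slice(0, maxLength)}... [truncated ${dropped} chars]`;
}
```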

Hurdle #4 (The Final Hurdle!): Axios

After getting our noisy logs under control and useful again, I found that the 10MB restriction was coming from axios, our Node HTTP client.

A POST larger than 10MB from Node to the Backend service was triggering this error message from axios:

Request body larger than maxBodyLength limit

Research revealed that axios had a default max body size of 10MB that could be overridden with the following options added to the axios request config object:

maxContentLength: Infinity,
maxBodyLength: Infinity

Based on the documentation, I thought I would need to set maxBodyLength, but that didn’t fix it — setting maxContentLength is what fixed it. So, to be safe, I just set both. (I was comfortable setting them both to Infinity because we control the request size in other places.)
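Putting it together, the Node-side POST looked something like this (the URL and function name are placeholders of mine):

```javascript
const axios = require('axios');

// Hypothetical upload call; the two max* options lift axios's 10MB default.
function postToFileStorage(body) {
  return axios.post('https://file-storage.internal/upload', body, {
    maxContentLength: Infinity,
    maxBodyLength: Infinity
  });
}
```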

Working to Perfection!

With that axios config change, we finally made it over our final hurdle, and the app was working to perfection.

Summary

Here’s a summary of the journey that got us to a working file upload feature.

Constructing the POST Bodies

  1. We obtained a File object for each file being uploaded using PrimeNG’s FileUpload component.
  2. We extracted the file’s ArrayBuffer data using the FileReader API.
  3. We base64-encoded the ArrayBuffer data using the base64-arraybuffer library.
  4. We generated an MD5 hash of the ArrayBuffer data using the SparkMD5 library and Michael Monerau’s Promise-based, chunked-reading implementation.

Getting the POSTs Through the Pipeline

Constructing the properly-formatted requests, it turned out, was only half the battle. Getting those requests through the pipeline proved to be at least as challenging.

Here’s a diagram showing the various gatekeepers along the way whom we needed to placate to keep the upload moving forward to its final destination:

Diagram showing where the infrastructure hurdles occurred
Infrastructure Hurdles for File Upload (cartoon character from Dilbert by Scott Adams)

Conclusion

This was one of the most challenging tasks I’ve had in a long time because there were so many unknowns for me.

I think about how much more smoothly it would have gone if I had known two weeks ago what I know now about all these pieces of the puzzle, from how to construct the POST bodies to how to get those POST requests through the various gatekeepers in the pipeline.

At times I felt like Indiana Jones trying to get past the protective traps on the path to the Holy Grail: as soon as I solved one challenge, there was another one waiting right behind it.

Indiana Jones and the Last Crusade: Only the penitent man will pass

Parting Thought

So I hope this article helps to gather together in one place the pieces required to complete this task, and maybe spares at least one person a little time and stress.

--

Frontend Developer working with Angular for 10+ years. I love solving problems and building cool stuff. I sweat the details because…I love the details.