The Context
I'm building Plan Swiftly, which generates management plans for engineering projects based on the user-defined characteristics and preferences for the project. Once the user has tweaked their approach, they press "Generate" and hey-presto, they have something they can adjust, review and eventually download as a Word version for sharing in the office. It would be very nice for this not to need a Word version, but I think we all know that's unavoidable in a B2B setting.
The app is built on Laravel (so PHP). It creates a representation of the document in a database, and then generates a Word file from that when required.
Below is the journey I went on to do this, what I learnt, and what I ended up settling on.
First, I tried in PHP
I really wanted to find a solution that kept the stack clean and leveraged my existing skills.
Unfortunately, the libraries for generating Word documents from PHP aren't very feature-rich or mature (some have unpredictable behaviour and constraints).
So I had to give that up. But what to do...?
Finding an alternative library
So I looked across a number of languages to see how Word generation looked.
I think the best candidate was always going to be using COM/OLE, or macros. But that meant setting up a Windows or Mono environment, which means bigger servers and the potential for licensing issues. I decided to back away from this thinking, saving it for the scenario where I absolutely needed it.
There weren't many other competitors after that, just a few libraries in various languages. The one that looked most promising was DOCX, a JavaScript library, so I started there.
Initial prototype
So first of all, I wrote a basic JS script that depended on DOCX to generate a Word document. The API was lovely, and it was easy to get rolling. It was reliable and feature-rich too.
I pushed the envelope a bit and got the JS script to take an argument at launch and generate a document from it, querying the database directly for the records relating to that document. This all worked beautifully.
There were two niggles: calling the script, and having confidence that it obeyed the multitenancy scopes.
At this point in the design, I decided that a Laravel Job would call it directly as a process, passing in the document ID as an argument. Simple enough.
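As a rough sketch of that first design (not the actual script), the JS side only has to pull the document ID out of its arguments before doing anything else. The function name `parseDocumentId`, the script name `generate.js` and the assumption that IDs are positive integers are all made up for illustration.

```javascript
// Hypothetical helper: read the document ID passed by the Laravel Job,
// e.g. `node generate.js 42`. Assumes IDs are positive integers.
function parseDocumentId(argv) {
  // argv[0] is the node binary and argv[1] the script path,
  // so the first real argument sits at index 2.
  const raw = argv[2];
  if (raw === undefined || !/^[1-9]\d*$/.test(raw)) {
    throw new Error(`Expected a positive integer document ID, got: ${raw}`);
  }
  return Number(raw);
}

// Example with a hand-built argv; a real script would pass process.argv.
console.log(parseDocumentId(["node", "generate.js", "42"]));
```

Failing loudly on a bad argument matters here because the calling Job only sees the process exit code and output.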
But the multitenancy issue bugged me. In Laravel, I can build up the protections that ensure automatic scoping of database calls. I didn't want to rebuild all that in JS, and even if I had, the risk of the two implementations drifting apart and leaking data between tenants was too big.
I decided I needed another way.
Using Express
So I quickly learnt Express. Credit to its creators, amending my script to become an Express server with one route was a few minutes' work.
So now, after a bit of tinkering, I could call my script as an endpoint on localhost at a predefined port. This was still using the database.
Without much further effort, I got to a place where Laravel did all the database queries, and serialised the whole document (document and child objects), before passing it as part of a POST request to the Express app. Much cleaner: all the data manipulation is now done in Laravel, using its guardrails and the ones I'm able to create there.
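To make the hand-off concrete, here is a minimal sketch of what such a serialised document might look like, and how the receiving side could walk it without touching the database. The field names (`title`, `sections`, `heading`, `body`, `children`) are assumptions for illustration; the real schema isn't shown here.

```javascript
// Hypothetical payload: the shape Laravel might POST after doing all the
// tenant-scoped queries itself. Field names are illustrative only.
const examplePayload = {
  title: "Project Management Plan",
  sections: [
    {
      heading: "Scope",
      body: "What the project covers.",
      children: [
        { heading: "Exclusions", body: "What it does not cover.", children: [] },
      ],
    },
    { heading: "Schedule", body: "Key dates.", children: [] },
  ],
};

// Walk the nested sections depth-first and return a flat list of
// { level, heading, body } entries, ready to be turned into Word
// headings and paragraphs by whatever library does the rendering.
function flattenSections(sections, level = 1) {
  const flat = [];
  for (const section of sections) {
    flat.push({ level, heading: section.heading, body: section.body });
    flat.push(...flattenSections(section.children ?? [], level + 1));
  }
  return flat;
}

console.log(
  flattenSections(examplePayload.sections).map((s) => `${s.level} ${s.heading}`)
);
```

Because everything the renderer needs is in the payload, the JS side never has to know which tenant it is working for.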
I had a great MVP solution. But I was worried about deployment. And documents include images, which originate as files on disk.
Using Express and Serverless
I extracted the app as a new project, and put it into AWS Lambda using Serverless. That didn't take long.
The Laravel and Express apps share access to an S3 bucket, which Express can get all its files from (provided in the serialised object as path strings) to include in the Word document. That worked nicely, and wasn't hard to set up.
Well, actually, it was a pain. Async vs sync caused me no end of grief, as a) my Express route handlers were built around synchronous calls and b) I was using a lot of forEach and map, neither of which waits for async callbacks. Not wanting to rewrite it all, I was grateful for a second reason to ditch this approach: tenancy again.
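For anyone hitting the same wall, the forEach problem looks roughly like this. `fetchImage` is a made-up stand-in for any async call, such as reading a file from S3; the point is only that `forEach` doesn't wait for async callbacks, whereas `map` combined with `Promise.all` does.

```javascript
// Stand-in for any async work, e.g. fetching an image from a bucket.
// Hypothetical: resolves with a fake "loaded" marker on a later tick.
function fetchImage(path) {
  return new Promise((resolve) => setTimeout(() => resolve(`loaded:${path}`), 0));
}

// Broken: forEach ignores the promises returned by the async callback,
// so this function returns before any image has been fetched.
async function loadWithForEach(paths) {
  const images = [];
  paths.forEach(async (path) => {
    images.push(await fetchImage(path));
  });
  return images.slice(); // snapshot taken now: still empty
}

// Working: map collects one promise per path, and Promise.all waits
// for all of them, preserving the original order.
async function loadWithPromiseAll(paths) {
  return Promise.all(paths.map((path) => fetchImage(path)));
}

async function demo() {
  const paths = ["a.png", "b.png"];
  console.log(await loadWithForEach(paths));
  console.log(await loadWithPromiseAll(paths));
}

demo();
```

If the calls need to run one after another rather than in parallel, a plain `for...of` loop with `await` inside does the job.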
So I tried something out. Since I was in control of image sizes, could I pass the images in as part of the serialised object? It turns out, yes. With some tweaking of Express' config, I was able to accept large POST requests, and using Base64 encoding, I could include images in the JSONified document.
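As a sketch of the Base64 approach on the receiving end: each image arrives as a string inside the JSON and is turned back into bytes before being handed to the document library. The `{ name, data }` shape is an assumption for illustration. One way to allow larger bodies in Express is the `limit` option of `express.json()`; the exact setting used here isn't stated.

```javascript
// Hypothetical image entry as it might appear in the JSON payload.
// The data is the eight-byte PNG file signature, Base64-encoded,
// standing in for a full image.
const exampleImage = { name: "logo.png", data: "iVBORw0KGgo=" };

// Decode one Base64-encoded image back into raw bytes.
function decodeImage(image) {
  return { name: image.name, bytes: Buffer.from(image.data, "base64") };
}

// Encoding is the mirror image; this is roughly what the sending side does.
function encodeImage(name, bytes) {
  return { name, data: Buffer.from(bytes).toString("base64") };
}

const decoded = decodeImage(exampleImage);
console.log(decoded.bytes.length, decoded.bytes.toString("hex"));
```

Base64 turns every 3 bytes into 4 characters, so images grow by about a third in transit, which is why the request size limit needs raising.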
So now, Laravel wraps up the whole document in a JSON object, images included, and POSTs it to Express. Express makes a Word document out of it and streams that back to Laravel, which saves it to disk or streams it to the user as a downloadable attachment.
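Returning the result as a download mostly comes down to the response headers. A minimal sketch, assuming a hypothetical helper named `wordDownloadHeaders`; the MIME type is the standard one for .docx files, while the filename handling is deliberately simplistic.

```javascript
// Standard MIME type for Word .docx files.
const DOCX_MIME =
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

// Hypothetical helper: build the headers for returning a generated
// document as a downloadable attachment.
function wordDownloadHeaders(filename, byteLength) {
  // Keep only characters that are safe in a quoted filename
  // (a simplification; real code may want RFC 6266 encoding).
  const safeName = filename.replace(/[^A-Za-z0-9._-]/g, "_");
  return {
    "Content-Type": DOCX_MIME,
    "Content-Disposition": `attachment; filename="${safeName}"`,
    "Content-Length": String(byteLength),
  };
}

console.log(wordDownloadHeaders("Management Plan.docx", 1024));
```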
But then Serverless failed me, and I got nervous about trusting it.
Using Express and going server-ful
Serverless kept failing to establish a connection during development, with no errors or reasons. So I removed a whole load of complexity and abandoned it.
Instead, I deploy the Express app as a project to the same Forge server as my Laravel app. A daemon boots it and watches it, and it runs happily on localhost. My Laravel app can reach it, and that's all that matters. It removes a load of unnecessary network latency too.
If I ever need to pivot back, it won't be hard to do using some sort of edge function workers, or similar. But for now, this is a lean, fast solution.
So where are we?
I have a Laravel app that can directly, or as part of a job, make POST requests to an Express app on the same server to get some work done. Laravel and Express make this very easy. The Express server can be deployed, and even scaled, elsewhere if necessary at a later date.
The requests to the Express app are entirely stateless and self-contained, which is a nice win and lends itself to scaling later. Did I accidentally make a microservice? I feel dirty.
I also found out that Forge was more than happy with this slightly odd config, and has had my back like it was all Laravel.
I've since explored DOCX further and achieved quite a lot with it. So I think it's here to stay.
My plan, if the requests and objects become too large, is to zip the objects into a single file as they are streamed. That file will never need to be stored on disk; instead, the Express app can just unzip it on the fly (say, a JSON file and a set of image files in a flat zip file) and otherwise work as it did before.
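As a rough illustration of the unzip-on-the-fly idea, the in-memory part can be sketched with Node's built-in `zlib` module. Note that this swaps in gzip compression of a single JSON payload rather than a true multi-file zip archive, which would need a third-party library. The payload shape is made up, and `packPayload` only mirrors in JS what the sending side would do.

```javascript
const zlib = require("node:zlib");

// Sending side (conceptually): compress the serialised document in memory.
function packPayload(payload) {
  return zlib.gzipSync(Buffer.from(JSON.stringify(payload), "utf8"));
}

// Receiving side: decompress straight from the request bytes,
// without writing anything to disk.
function unpackPayload(compressed) {
  return JSON.parse(zlib.gunzipSync(compressed).toString("utf8"));
}

// Hypothetical payload with a repetitive body, so compression has an effect.
const bigPayload = { title: "Plan", body: "lorem ipsum ".repeat(1000) };
const packed = packPayload(bigPayload);
console.log(JSON.stringify(bigPayload).length, "->", packed.length);
console.log(unpackPayload(packed).title);
```

Text compresses well this way, but images in already-compressed formats such as PNG or JPEG will shrink very little, so the gain depends on what dominates the payload.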