Do That Chat Backend Design

Starting in February 2024, we challenged ourselves to build dothatchat.com, a fully functional chat aggregation system, within 3 months. To accomplish this, we needed a backend framework that was easy to work on, came with a number of first-party features, and could scale.

We also needed it to handle complex, long-running requests to our LLM vendors. Since we don't want to tie up our web servers with massive 45-second requests, we designed the system to queue requests to LLMs and handle them asynchronously. We also needed it to provide the browser client with a powerful API to query our tree-structured messaging system.

Given those requirements, we decided on the following stack for our backend.

The Framework: Laravel

Laravel is a proven, popular backend framework written in PHP. It emphasizes ease of development and has a wide ecosystem of packages that deliver many capabilities out of the box. From a previous project, we were already familiar with Laravel setup. Out of the box, Laravel gives us easy-to-use auth, database modeling and migrations, and tools for generating services and modules.

Some stand-out packages and capabilities are listed below.

Laravel Reverb

Laravel released Reverb, its very own first-party websocket server, in early 2024, conveniently just as we were starting DoThatChat. We never liked the idea of websockets-as-a-service from providers like Pusher, because we expected to dispatch a relatively high volume of websocket messages per user, even hundreds per minute, since every streamed LLM response runs through our websocket system. Laravel Reverb lets us host our own websocket server easily, keeping costs far below what the equivalent message volume would cost us on Pusher.

Laravel Horizon

Laravel Horizon is a step up from the traditional cron-driven approach available in Laravel. It's a first-party package that can spin up and auto-scale PHP workers on different queues based on developer-defined configuration settings. It also publishes a UI on the API server that lets developers view current jobs, job failures, and other useful metrics.

DoThatChat depends on this queueing system to fulfill chat requests. Users submit an initial chat message, which is then entered into a high-priority queue for fulfillment by a backend worker. That worker dispatches websocket events to update the client browser on the chat's fulfillment status. Horizon adds auto-scaling and visualization to this queueing system, which helps us balance the high-priority LLM requests with lower-priority jobs that we need to run from time to time.
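To make the flow above concrete, here is a minimal sketch of a priority queue whose worker reports status events as it goes. It is written in Python purely for illustration and is not our Laravel code: the names Job, JobQueue, dispatch, work, and broadcast, and the HIGH/LOW priority values, are all hypothetical, and the broadcast callback stands in for the websocket events that Horizon workers send through Reverb.

```python
import itertools
import queue
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical priority levels: lower numbers are served first.
HIGH, LOW = 0, 10


@dataclass(order=True)
class Job:
    priority: int
    seq: int  # tie-breaker so equal-priority jobs run in submission order
    name: str = field(compare=False)
    run: Callable[[], object] = field(compare=False)


class JobQueue:
    """Toy stand-in for a Redis-backed queue with priorities."""

    def __init__(self) -> None:
        self._q: "queue.PriorityQueue[Job]" = queue.PriorityQueue()
        self._seq = itertools.count()

    def dispatch(self, name: str, run: Callable[[], object], priority: int = LOW) -> None:
        self._q.put(Job(priority, next(self._seq), name, run))

    def work(self, broadcast: Callable[[str, str], None]) -> None:
        # Drain the queue, announcing each status change to the client.
        while not self._q.empty():
            job = self._q.get()
            broadcast(job.name, "processing")
            try:
                job.run()
                broadcast(job.name, "completed")
            except Exception:
                broadcast(job.name, "failed")


def demo() -> List[Tuple[str, str]]:
    events: List[Tuple[str, str]] = []
    jobs = JobQueue()
    jobs.dispatch("nightly-cleanup", lambda: "ok")                # low priority
    jobs.dispatch("chat-42", lambda: "LLM reply", priority=HIGH)  # jumps the line
    jobs.work(lambda name, status: events.append((name, status)))
    return events
```

Calling demo() shows the chat job finishing before the cleanup job even though it was submitted second, which is the balancing behavior we rely on.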

Laravel Octane

Laravel Octane is a first-party Laravel package that allows a Laravel application to be served from a RoadRunner, Swoole, or FrankenPHP server, significantly reducing the per-request 'bootstrap' overhead that Laravel incurs when it's served via nginx/PHP-FPM workers. We were excited to use this technology given how much faster it was able to serve requests.

The Infrastructure: Digital Ocean

Digital Ocean is a great platform for small teams. Our application is deployed as a Docker container on their container service and uses their managed PostgreSQL and Redis databases. Their UI is intuitive and lets developers perform many configuration tasks through the interface, and they also offer in-browser CLI access to containers when needed. Amazon's roughly equivalent offering, Lightsail, was slower to deploy, less intuitive, and less feature-rich than Digital Ocean Container Service when we tested it.

Database: PostgreSQL

Digital Ocean offers Postgres as a managed database in both dev/experimental and production-level tiers. Postgres is a great choice of SQL database in general, for a few reasons.

Firstly, Postgres supports recursive queries, which have been essential in allowing us to write queries that traverse or mass-modify our message tree system. This allows us to scale chat threads to hundreds of thousands of messages with little impact on performance.
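As an illustration of the kind of query this refers to, here is a small sketch of a recursive common table expression that walks a message subtree. To keep it self-contained it runs against an in-memory SQLite database from Python; the WITH RECURSIVE syntax is the same in Postgres, though a Postgres driver would typically use a different placeholder style. The messages table, its columns, and the sample rows are assumptions made up for the example, not our actual schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny, hypothetical message tree keyed by parent_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES messages(id),"
        " body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [
            (1, None, "root prompt"),
            (2, 1, "reply A"),
            (3, 1, "reply B"),
            (4, 2, "follow-up to A"),
            (5, 4, "deeper follow-up"),
        ],
    )
    return conn


# Start from one message, then repeatedly join in its children.
SUBTREE_SQL = """
WITH RECURSIVE subtree(id, parent_id, body, depth) AS (
    SELECT id, parent_id, body, 0 FROM messages WHERE id = ?
    UNION ALL
    SELECT m.id, m.parent_id, m.body, s.depth + 1
    FROM messages m
    JOIN subtree s ON m.parent_id = s.id
)
SELECT id, depth FROM subtree ORDER BY depth, id
"""


def subtree(conn: sqlite3.Connection, root_id: int) -> list:
    """Return (id, depth) for root_id and every descendant."""
    return conn.execute(SUBTREE_SQL, (root_id,)).fetchall()
```

With the sample rows, subtree(build_demo_db(), 2) returns message 2 and its two descendants in a single round trip, rather than one query per level of the tree.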

Secondly, Postgres materialized views allow us to store compute-intensive values and refresh them asynchronously, again allowing us to deliver more advanced features without a penalty to system performance.
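The idea behind a materialized view can be sketched as a stored snapshot that reads hit directly and that is rebuilt on its own schedule. SQLite has no materialized views, so the Python sketch below emulates one with an ordinary snapshot table; in Postgres the same roles are played by CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW. The table names, columns, and the per-thread token totals are hypothetical and chosen only for illustration.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        " id INTEGER PRIMARY KEY,"
        " thread_id INTEGER NOT NULL,"
        " tokens INTEGER NOT NULL)"
    )
    # Snapshot table standing in for a materialized view.
    conn.execute(
        "CREATE TABLE thread_stats ("
        " thread_id INTEGER PRIMARY KEY,"
        " message_count INTEGER NOT NULL,"
        " total_tokens INTEGER NOT NULL)"
    )
    return conn


def refresh_thread_stats(conn: sqlite3.Connection) -> None:
    """Rebuild the snapshot; a background job would call this periodically."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM thread_stats")
        conn.execute(
            "INSERT INTO thread_stats "
            "SELECT thread_id, COUNT(*), SUM(tokens) "
            "FROM messages GROUP BY thread_id"
        )


def read_stats(conn: sqlite3.Connection, thread_id: int):
    """Cheap read path: no aggregation happens at request time."""
    return conn.execute(
        "SELECT message_count, total_tokens FROM thread_stats WHERE thread_id = ?",
        (thread_id,),
    ).fetchone()
```

The trade-off is staleness: reads return whatever the last refresh computed, so new messages only show up in the totals after the next refresh runs.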

Cache: Redis

Redis is also available as a managed service from Digital Ocean. It serves as our cache and as queue storage for our Horizon jobs. It's blazing fast and reliable.

The Monitoring Service: Sentry.io

Sentry is an error monitoring and logging service with an SDK allowing for easy Laravel integration. Sentry has given us invaluable insights into hidden inefficiencies in our requests as well as rapid notifications of site failures and downtime.

Conclusion

Our tech stack reflects our need for quality frameworks and components at a relatively low cost and high development speed, and our willingness to rely on out-of-the-box packages for things like websockets and queues. While Laravel is not the best for every use case, it has a very strong offering for businesses like ours.

One thing we had to think about during this project was whether we wanted to fully swallow the Laravel pill and become one with the Matrix by way of Envoyer or Laravel Forge. We see those as strong offerings, but we require more control over our environment because we need to host a Python service in each container for token calculations. Every app is different and has different requirements, but hopefully it's helpful to see how we chose our stack.