(x mins read)
How I host libkakashi.dev (WIP)
A Xiao ESP32-S3 Sense at my place running an IPv6-only web server (IPv4 is behind
CGNAT) with DDNS and an overly huge battery (pls no ddos, you might physically
burn that thing)
The full code of the website is available at: https://github.com/libkakashi/website
BACKGROUND
I made this website back in Oct 2021 (see /original.html) mostly for my college
applications (I never went to one), and I never really updated it from then to now
(July 2025). The outdated version is pretty much all I've ever needed as a resume/
CV/whatever (I've never needed a resume/CV/whatever)
Some time ago, I thought of hosting it on my own hardware, on the smallest thing I
could get this thing to run on. Why? Because I'm not sane. This page covers how I
now host it, plus a little networking and embedded programming 101
HISTORY
Oct 2021 - Sep 2023 - Replit (Replit removed their free hosting, bad)
Oct 2023 - Jul 2025 - Vercel (Got bored of it)
Jul 2025 - now - A Xiao ESP32-S3 Sense home server
LIVE STATUS
(updates every 5 seconds, if javascript is enabled)
Raw: /metrics.json
Files: /files.json
Uptime: uptime.libkakashi.dev
THE BOARD
The Xiao ESP32 S3 Sense is primarily meant for computer vision related use cases,
but throwing the camera and mic aside, it's also the most convenient board of its
size (21x17mm) for hosting a website like this.
Here's why I chose it:
I wanted to host this website on the smallest hardware possible, both in terms of
physical size and computational resources. I started my way up from the tiniest of
the boards to the first one that would fit well.
The original plan was to use an ESP8266, but it has poor IPv6 support, and its 80KB
of usable RAM is too little for SSL.
The second choice was the ESP32 C3 Super Mini, which was almost perfect for my use
case: 280KB of usable RAM, 4MB of flash memory, IPv6 support, and it's tiny. I had a
full working version of this website running on it for almost a week. The only
problem was that I needed a microSD card adapter (more on it in media). That is not
difficult at all to add, but it needs cables, and cables drastically increase the
overall size and fragility of the setup.
The third choice was the ESP32-CAM: it has a dedicated microSD card slot, slightly
higher specs, and a similar size. It would've been amazing if it had not gotten rid
of the USB port. I like the convenience of being able to use a regular powerbank and
USB-C cable (both of which I'm always carrying) for powering it. (The programmer
board for it with USB-C is ugly and big.)
And so finally, we get to the Xiao ESP32 S3 Sense, another camera module, but it
has everything I'll ever need, and way more. The specs are quite an overkill: 512KB
SRAM (more than enough), 8MB PSRAM (pointless), 8MB of flash memory (great, 4MB is
all I'd ever need). The extra specs, I suppose, will only ever motivate me to think
of more creative ways of utilizing them.
This server is IPv6-only. I do not have a static IPv4 address and could not be
bothered to ask my ISP for one. Apart from being dynamic, it's also behind CGNAT,
which means that not only does the router not know what its public address is, it
also has no control over the port forwarding rules outside of its LAN.
CGNAT exists because of the shortage of IPv4 addresses. While you don't have to deal
with that when it comes to IPv6, and you can configure your router to assign static
IPv6 addresses to devices using SLAAC, your router still has to lease an IPv6 prefix
from your ISP, and that lease only lasts for a limited time.
Whenever that lease expires, the router has to create another, which resets the
prefix. This makes it hard to have a simple DNS setup pointing to a static IP
address. In my case, the prefix changes roughly once every two days.
There're multiple ways to detect that change of IPv6 prefix. The one I'm using is
listening to netif events, which I believe is the most efficient way to do it. The
router signals all connected devices whenever they're assigned a new IP address,
and you can capture that signal to update the AAAA record for a domain.
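A minimal sketch of the "did the prefix actually change?" check that such an event handler could run before touching DNS. This is not the code from the repo. It assumes a /64 prefix, and the names `should_update_aaaa` and `published_prefix` are made up for illustration. The ESP-IDF event wiring and the DNS API call are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PREFIX_BYTES 8 /* assumption: the LAN gets a /64 prefix */

/* True for global unicast addresses (2000::/3), so link-local is ignored. */
static bool is_global_unicast(const uint8_t addr[16]) {
    return (addr[0] & 0xE0) == 0x20;
}

/* Last prefix that was published to DNS; all zeros until the first update. */
static uint8_t published_prefix[PREFIX_BYTES];

/*
 * Meant to be called from the "got an IPv6 address" event handler.
 * Returns true when the AAAA record needs updating, and remembers the
 * new prefix in that case.
 */
static bool should_update_aaaa(const uint8_t addr[16]) {
    assert(addr != NULL);
    if (!is_global_unicast(addr)) return false;
    if (memcmp(published_prefix, addr, PREFIX_BYTES) == 0) return false;
    memcpy(published_prefix, addr, PREFIX_BYTES);
    return true;
}
```

Comparing only the first 8 bytes means a new interface identifier under the same prefix would not trigger an update here, which is a simplification.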
I was trying out dedicated DDNS services like noip.com for a while but decided to
stick with regular DNS (porkbun) and use their REST API to update the domain with
a very low TTL (300 seconds). There's some potential downtime here that can't be
helped.
HTTP ROUTING
The server tries to act like a static file server, though it actually isn't one. It
doesn't even have an internal file system at all.
The ESP32 allows read-only memory-mapped flash, which means you can access the flash
memory via regular memory addresses without first needing to copy data into the
RAM. It also uses this by default for the instruction and data address spaces to
save RAM.
Since my server is mostly read-only, and I already know all static files to serve
at compile-time, I can store the files in the data section and dynamically generate
a file, during the build process, that contains the pathnames, sizes, and virtual
memory addresses of all of the files. This allows me to serve most pathnames on
this website with near zero CPU and memory overhead apart from the networking.
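Roughly, the generated file could look like the sketch below. It is an illustration under assumptions, not the real build output: the struct name, the two tiny stand-in files, and the sorted-table binary search are all made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char    *path; /* request path, e.g. "/index.html" */
    const uint8_t *data; /* address of the bytes (memory-mapped flash on the board) */
    size_t         size; /* length in bytes */
} static_file_t;

/* Stand-ins for embedded file contents; a real build would embed the files. */
static const uint8_t index_html[] = "<h1>hi</h1>";
static const uint8_t style_css[]  = "body{margin:0}";

/* Hypothetical output of the build step, sorted by path for binary search. */
static const static_file_t FILES[] = {
    { "/index.html", index_html, sizeof index_html - 1 },
    { "/style.css",  style_css,  sizeof style_css - 1 },
};
static const size_t FILE_COUNT = sizeof FILES / sizeof FILES[0];

/* Returns the table entry for path, or NULL if there is no such file. */
static const static_file_t *find_file(const char *path) {
    assert(path != NULL);
    size_t lo = 0, hi = FILE_COUNT;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(path, FILES[mid].path);
        if (cmp == 0) return &FILES[mid];
        if (cmp < 0) hi = mid; else lo = mid + 1;
    }
    return NULL;
}
```

A lookup hands back a pointer and a length, so the request handler can pass them straight to the socket write without copying the file anywhere first.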
I just thought preserving extensions in pathnames gave it more brutalism vibe. It's
ironic how it has to do that in order to appear more brutal, while actually already
being a lot more than that just without a good way of conveying it.
Edit:
It is in fact a regular static file server now. I made another trade-off between
convenience and efficiency where I don't want to unplug it, connect to my laptop,
and reflash it every time I update any page. I now have a (protected) /write
endpoint to dynamically write files to the board. On my laptop, I've a script that
listens to file changes while I type and uses that endpoint to update the website
live.
The /write endpoint also simplifies SD card writes a lot (see media for more on SD
card). With ESP32, you cannot write to the SD card directly while flashing: the
board needs to boot and initialize SPI before it can detect and access the SD card.
My previous approach was to let it boot and use serial I/O to write files to the SD
card. Now, the board just knows, based on the file extension in the /write
endpoint, whether to write to the SD card or the flash.
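The extension check itself can be tiny. Here is a sketch, assuming that only video extensions go to the SD card; the exact list (`.webm`, `.mp4`) and the names `storage_t` and `storage_for_path` are assumptions for illustration, not taken from the repo.

```c
#include <assert.h>
#include <string.h>

typedef enum { STORE_FLASH, STORE_SD_CARD } storage_t;

/* Assumption: only these video extensions are routed to the SD card. */
static const char *const SD_EXTENSIONS[] = { ".webm", ".mp4" };

/* Picks the storage target for an uploaded file from its extension. */
static storage_t storage_for_path(const char *path) {
    assert(path != NULL);
    const char *ext = strrchr(path, '.');
    if (ext == NULL) return STORE_FLASH; /* no extension: default to flash */
    for (size_t i = 0; i < sizeof SD_EXTENSIONS / sizeof SD_EXTENSIONS[0]; i++) {
        if (strcmp(ext, SD_EXTENSIONS[i]) == 0) return STORE_SD_CARD;
    }
    return STORE_FLASH;
}
```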
Even with all the added overhead of a file system, the file serving is actually
more efficient now. I mentioned earlier that I had 8MB of PSRAM which was of no use
to me. I now use that as a cache to bypass the file system entirely, which gives me
a setup similar to the previous memory-mapped flash, except with the much faster
PSRAM instead.
HTML/CSS
All HTML/CSS files are pre-compressed with brotli, as part of the build process,
before they're stored in the board's flash memory. This is the full command I use for
that:
brotli -f -q 11 --lgwin=24 -o ${COMPRESSED_FILE} ${WEB_FILE}
This gives me, on average, over 3x compression ratio for HTML files.
When sending the HTTP responses, the server simply adds a Content-Encoding: br header
to the response and sends over the pre-compressed files. Not only does it save
storage space, it also drastically reduces the time and bandwidth required to send
these files over.
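As a sketch of what that looks like on the wire, here is one way to build the response headers for a pre-compressed file. The function name and the exact set of headers are assumptions, and a stricter server would also check the request's Accept-Encoding before answering with brotli.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Writes a 200 response header block into buf.
 * body_len is the size of the bytes actually sent (the compressed size).
 * Returns the header length, or -1 if buf is too small.
 */
static int build_headers(char *buf, size_t cap, const char *content_type,
                         size_t body_len, bool precompressed) {
    assert(buf != NULL && content_type != NULL);
    int n = snprintf(buf, cap,
                     "HTTP/1.1 200 OK\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Length: %zu\r\n"
                     "%s"
                     "\r\n",
                     content_type, body_len,
                     precompressed ? "Content-Encoding: br\r\n" : "");
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The body that follows is just the stored .br bytes as they are, so the board never runs a compressor at request time.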
MEDIA
This website uses AV1 format for videos, and AVIF for images (they're both the same
format, an AVIF image is just a single frame AV1 video).
AV1 is very modern and royalty-free. While most active devices out there probably
don't have dedicated hardware decoding for it yet, I'm sure it has good software
support for everyone I imagine opening this website.
These are the ffmpeg commands I use for encoding and compressing the media files:
ffmpeg -i image.jpg -vf "scale=1200:-2" -c:v libsvtav1 -crf 35 -preset 0 \
    -svtav1-params "avif=1" -frames:v 1 image.avif
ffmpeg -i video.mp4 -vf "scale=1920:-2" -c:v libsvtav1 -crf 35 -r 30 -preset 0 \
    -movflags +faststart -an video.webm
This website, as of now, has two clips (1.2 MB and 1.4 MB) on the
/structured-output.html page and one image (11.4 KB) here on this page. For the two
clips, this gets me around 50x and 14x compression while outputting fairly good FHD
videos. For the image, it's around 43x.
ESP32-S3 has 8MB of flash memory, which is more than enough for both the clips and
hundreds of images, but it's not too future-proof. I do not want to update this
website's code or hardware, ever.
Therefore, I've a microSD card on the board. I plan on storing all images in flash
forever (I'm sure I'll never run out of it), and all videos on the SD card.
The card is significantly slower than flash memory, but the primary performance
bottleneck is the networking anyways, and since people are likely to take at least
a moment before they press the play button on any of the clips, that's all the time
I need to preload them.
LOGGING
There's no client-side tracking code of any kind on this website. I do, however,
log the IP addresses of all requests on the server, among other things. I've a very
neat but also very pointlessly over-engineered logging system.
To start, this is the data I store, in two separate tables:
typedef struct {
    uint32_t conn_id;
    uint64_t start_timestamp;
    uint64_t end_timestamp;
    uint32_t bytes_sent;
    uint32_t bytes_received;
    uint32_t tcp_handshake_us;        /* microseconds */
    uint32_t tls_handshake_us;        /* microseconds */
    tcp_close_reason_t close_reason;
    uint8_t ip_address[16];           /* raw IPv6 address */
} __attribute__((packed)) tcp_log_entry_t;

typedef struct {
    uint32_t tcp_conn_id;
    uint64_t start_timestamp;
    uint64_t end_timestamp;
    uint16_t status_code;
    http_method_t method;
    uint32_t path_addr;               /* where the pathname string is stored */
    uint32_t user_agent_addr;         /* where the user agent string is stored */
} __attribute__((packed)) http_log_entry_t;
This is everything I need to track the traffic volume, latencies, geolocations
(vaguely), some user behavior, and basic device info. I don't even need or use that
data but whatever.
I've a 1MB partition in the flash memory for storing logs, split across the two
tables. There's no file system or DBMS; the logs are just pushed at the end of the
tables with a header maintained at the start. The table layout looks like this:
| 32-bit total logs count | 32-bit flash logs count | row 1 | row 2 | ...
Very straightforward so far. The TCP table starts at the top of the partition and
grows downwards to the bottom, while the HTTP table starts at the bottom and grows
upwards. They meet when the 1MB partition is full. When that happens, the tables
are flushed into the SD card where they stay forever, and the flash partition is
reset.
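Here is a small sketch of that two-ended layout, with a RAM buffer standing in for the flash partition. It simplifies things on purpose: the per-table headers are replaced by two counters, the partition is shrunk to 1 KB for the demo, and the names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PARTITION_SIZE 1024u /* 1 MB on the board; small here for the demo */

typedef struct {
    uint8_t bytes[PARTITION_SIZE]; /* RAM stand-in for the flash partition */
    size_t  front_used;            /* bytes taken by the TCP table, from offset 0 */
    size_t  back_used;             /* bytes taken by the HTTP table, from the end */
} log_partition_t;

static size_t free_bytes(const log_partition_t *p) {
    return PARTITION_SIZE - p->front_used - p->back_used;
}

/* Appends a TCP row after the previous one. False means the tables would meet. */
static bool push_tcp(log_partition_t *p, const void *row, size_t len) {
    assert(p != NULL && row != NULL);
    if (len > free_bytes(p)) return false;
    memcpy(p->bytes + p->front_used, row, len);
    p->front_used += len;
    return true;
}

/* Appends an HTTP row just before the previous one, growing toward offset 0. */
static bool push_http(log_partition_t *p, const void *row, size_t len) {
    assert(p != NULL && row != NULL);
    if (len > free_bytes(p)) return false;
    p->back_used += len;
    memcpy(p->bytes + PARTITION_SIZE - p->back_used, row, len);
    return true;
}

/* After a failed push: a real version would copy both tables to the SD card first. */
static void reset_partition(log_partition_t *p) {
    p->front_used = 0;
    p->back_used = 0;
}
```

The nice part of growing from both ends is that neither table needs a fixed share of the partition; whichever one gets more traffic simply takes more of the space.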
TCP log entries are fixed length and need nothing more. You might've noticed,
though, that I've path_addr and user_agent_addr fields in the http logs. The paths
and user agents are not fixed length, which makes storing them while allowing
efficient random access difficult.
This is a very standard problem in computer science. Postgres, for example, stores
rows in 8KB pages (configurable). Each page header contains the offsets of all rows
inside that page, in sequence, and in all the indexes the rows are identified by
(page_number, item_number) tuples. This is what allows postgres to support types
like TEXT while allowing efficient random access.
Postgres' approach is not too complicated, but it is a very generalized approach.
Given all we know about our logging system, we can build something both much
simpler and more efficient.
I allocate 10MB of memory on the SD card to store all pathnames and user agents,
null-terminated, and simply appended one after the other with no metadata. Whenever
the device boots, all of that is loaded into the PSRAM and converted into a hashmap
mapping the pathnames/user agents to their addresses in the SD card.
Since this website only has a few pages, and the potential user agents are also
limited, there's almost always a pre-allocated copy for both of them in the SD
card. The hashmap helps find those and they are reused in the new log entries. If
they're not found, they're added to the hashmap and the SD card.
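A sketch of that scheme, with a RAM buffer standing in for the SD card area. The hash function (FNV-1a), the open-addressing table, the sizes, and all names are assumptions picked for illustration, not what the repo necessarily uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AREA_SIZE  4096u /* 10 MB on the SD card in the post; small for the demo */
#define SLOT_COUNT 256u  /* hash table slots; must be a power of two */
#define NOT_STORED UINT32_MAX

typedef struct {
    char     area[AREA_SIZE];   /* null-terminated strings, one after the other */
    uint32_t used;              /* bytes of the area filled so far */
    uint32_t count;             /* number of distinct strings indexed */
    uint32_t slots[SLOT_COUNT]; /* offset + 1 of a stored string, 0 = empty */
} string_store_t;

static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) { h ^= (uint8_t)*s; h *= 16777619u; }
    return h;
}

/* Finds the slot holding s, or the empty slot where it belongs (linear probing). */
static uint32_t *find_slot(string_store_t *st, const char *s) {
    uint32_t i = fnv1a(s) & (SLOT_COUNT - 1);
    while (st->slots[i] != 0 && strcmp(st->area + st->slots[i] - 1, s) != 0)
        i = (i + 1) & (SLOT_COUNT - 1);
    return &st->slots[i];
}

/* Returns the offset of s in the area, appending it first if it is new. */
static uint32_t intern(string_store_t *st, const char *s) {
    assert(st != NULL && s != NULL);
    uint32_t *slot = find_slot(st, s);
    if (*slot != 0) return *slot - 1;   /* seen before: reuse the stored copy */
    size_t len = strlen(s) + 1;         /* include the terminator */
    if (len > AREA_SIZE - st->used || st->count + 1 >= SLOT_COUNT) return NOT_STORED;
    memcpy(st->area + st->used, s, len);
    *slot = st->used + 1;
    st->used += (uint32_t)len;
    st->count++;
    return *slot - 1;
}

/* Boot-time step: rebuild the hash table by scanning the stored strings. */
static void rebuild_index(string_store_t *st) {
    memset(st->slots, 0, sizeof st->slots);
    st->count = 0;
    for (uint32_t off = 0; off < st->used; off += (uint32_t)strlen(st->area + off) + 1) {
        *find_slot(st, st->area + off) = off + 1;
        st->count++;
    }
}
```

The returned offset is what would go into `path_addr` or `user_agent_addr`, which keeps every HTTP log row the same size.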
There's a /logs.json endpoint for fetching the logs. It supports some filtering by
timestamps and indexes, given a secret key. As much as I'd like to make the logs
public, I'm not sure if I should, even if I hid the IP addresses.
MONITORING
I've an UptimeRobot monitor that sends a couple of HEAD requests to the website
every 5 minutes and drops me an email if they fail. There's a bit of extra code I
had to write to handle HEAD requests instead of just GET but it works well now.
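The gist of HEAD handling is "same headers, no body". A minimal sketch of that rule, writing into a plain buffer instead of a socket; the enum and function names are hypothetical and not from the repo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_GET, DEMO_HEAD } demo_method_t;

/*
 * Copies the response into out: headers always, body only for GET.
 * The headers (including Content-Length) are the same for both methods.
 * Returns the number of bytes written, or 0 if out is too small.
 */
static size_t write_response(char *out, size_t cap, demo_method_t method,
                             const char *headers, const char *body, size_t body_len) {
    assert(out != NULL && headers != NULL);
    size_t header_len = strlen(headers);
    size_t total = header_len + (method == DEMO_HEAD ? 0 : body_len);
    if (total > cap) return 0;
    memcpy(out, headers, header_len);
    if (method != DEMO_HEAD) memcpy(out + header_len, body, body_len);
    return total;
}
```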
Here's the status and uptime page for this website: uptime.libkakashi.dev.
There's also /metrics.json which returns the live resource usage stats at the time
of request.