How I built natural language querying for a SQL database

A writeup of the early learnings from building my first GPT-based application

Yakko Majuri


For what felt like the longest time, I pretty much ignored the developments happening in the generative AI/LLM (large language model) space.

I knew there was a lot of cool stuff happening, but I just hadn’t gotten around to trying much of it out.

Then I had the chance to look over how a friend was using ChatGPT in his day-to-day work and decided it was time to build a little project of my own.

I’ve spent the last few weeks building an open-source tool for monitoring and managing ClickHouse clusters called HouseWatch, and I wondered how I could incorporate GPT into it to build a new feature.

What I landed on was natural language querying.

HouseWatch has a lot of features aimed at giving you an overview of how your ClickHouse node/cluster is doing, but it also has a built-in query editor so you can dig deeper than the information already provided. Unlike a lot of other databases, ClickHouse gives you a ton of metadata about the system in its system tables, so when debugging issues, I’m often writing ClickHouse SQL to extract the data I need from these tables.
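To give a sense of what that kind of query looks like, here is a minimal sketch in Python that builds a ClickHouse SQL statement listing the slowest finished queries from the `system.query_log` table. The helper name `slowest_queries_sql` and its parameters are hypothetical, made up for illustration; this is not HouseWatch code, and it only produces the SQL text rather than running it against a cluster.

```python
def slowest_queries_sql(limit: int = 10, lookback_hours: int = 24) -> str:
    """Build ClickHouse SQL for the slowest finished queries.

    Hypothetical helper for illustration only. It reads from
    system.query_log, one of ClickHouse's built-in system tables.
    """
    if limit <= 0 or lookback_hours <= 0:
        raise ValueError("limit and lookback_hours must be positive")

    # Casting to int keeps the interpolated values strictly numeric.
    return (
        "SELECT query, query_duration_ms, read_rows, memory_usage\n"
        "FROM system.query_log\n"
        "WHERE type = 'QueryFinish'\n"
        f"  AND event_time > now() - INTERVAL {int(lookback_hours)} HOUR\n"
        "ORDER BY query_duration_ms DESC\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Print the SQL so it can be pasted into a query editor.
    print(slowest_queries_sql(limit=5))
```

Knowing that `system.query_log` exists, and that `type = 'QueryFinish'` filters to completed queries, is exactly the sort of detail that takes a while to pick up.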

From years of managing large ClickHouse clusters, I have a lot of in-depth knowledge about what information is stored, and more importantly, where it is stored and how to use it. However, not everyone involved in managing a given cluster will have this knowledge, particularly early on.

So I thought it might be cool to build something that could help users get the information they need without spending a long time inspecting schemas and reading docs.

This was the result:

The feature is still in alpha and I’ll be shipping further improvements to it in the coming weeks, but it already does a reasonably good job!

I always try to write up my little projects (e.g. here and here) so that I better cement the knowledge I’ve gained, and, given…


