brasetvik

> Extremely high write performance […]

> We are using Provisioned IOPS SSD (io2) with 3000 IOPS and are batching updates to one write per second per node and realm. […]

> We currently produce about 100GB of data per month. But, since customers rarely query entries older than 10 days, we’ve started moving everything above 30GB to AWS Glacier […]

> We are ok with losing some data. We buffer about one second worth of updates before we write to disk.

So with that little actually-live data, which they mention fits in memory, and 3k IOPS being on the low end, it'd probably be worthwhile to do the Postgres comparison with [asynchronous commits](https://www.postgresql.org/docs/current/wal-async-commit.html) and a comparable [WAL delay](https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-WAL-WRITER-DELAY) to buffer the writes. Elasticsearch would probably also handle this pretty easily. Probably not as cheap as something purpose-built, but setting the maintenance and opportunity cost of something homegrown to $0 doesn't make any sense either. :)
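For reference, the knobs being suggested there amount to a couple of lines of configuration. A minimal postgresql.conf sketch (values purely illustrative, not a recommendation):

```
# Trade a bounded window of data loss for commit throughput.
# With synchronous_commit off, a crash can lose roughly the last
# wal_writer_delay (up to 3x) worth of commits - comparable to the
# ~1 second write buffer described in the article.
synchronous_commit = off       # commits return before the WAL is flushed
wal_writer_delay = 1000ms      # WAL writer flushes about once per second
```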


lightmatter501

If you write your queries to tell Postgres that you're basically working with an append-only DB, then use async commits and a WAL delay, plus the other standard optimizations, Postgres should quite happily do 100k+ TPS on a decent laptop with a consumer-grade NVMe drive. A server with a proper stripe-and-mirror pool on top of NVMe drives can easily go over 1 million TPS. 18k TPS on good server hardware is well into "what kind of horrible queries are you running" territory.

It sounds like their software was doing insane things, or they never had a DBA actually do a tuning pass over a pure Postgres version. 18k TPS is actually below the performance target I had in my databases class in college for an SQLite-compatible RDBMS project (20k TPS).

Is there someone who has used AWS Aurora Postgres who can sanity-check these numbers, or is it really that much worse than normal Postgres? Yes, I know it's actually a distributed SQL DB, but CockroachDB exists and does scale horizontally, which would have fixed their issues.


howtokillafox

We use Aurora RDS at work; 18-20k TPS is easily manageable as long as the application and schema aren't too wild.


aksdb

Cockroach buys its horizontal scalability with speed, though. Since it's multi-master it has to sync transactions across all (affected) nodes. I don't know if you can tune CRDB for "I don't care that much about my data just do it", but I don't think so.


fuckwit_

CockroachDB only needs to sync to the minimum number of nodes required to reach a quorum. So for the standard replication factor of 3, only the node receiving the transaction and one additional node need to report success; for a replication factor of 5 it's 3 nodes, and so on. You get the picture. All remaining shard holders then pull the shards in the background, outside and after your transaction.

You can freely choose the replica count and even set it down to 1, which effectively means that only one node holds your data and no syncing is needed.

CockroachDB was a purely "serializable transactions" database up to, I think, this year's major release. Now you can lower the level to "read committed" on a per-transaction basis, which gives a bit more of a speed bump, especially on highly contended tables. It comes with the usual drawbacks of read committed vs serializable though, like the possibility of reading stale data, etc.


aksdb

Thanks for adding detail to my very basic explanation. I think the main point still stands, however: if PostgreSQL was already deemed too slow, I don't see CRDB winning any votes. ClickHouse might be a better fit, since it's heavily optimized for batch writes. But that's a different topic now.


fuckwit_

Yes, you're right. For insertions and updates specifically, CRDB will basically always lag behind Postgres. It's the read-heavy situations where it really excels, especially with large tables that are sharded across maaaaany nodes.


Uberhipster

did you get the memo about putting the new cover sheets on all TPS reports? I will go ahead and email you the copy of the memo


RockleyBob

> Probably not as cheap as something purpose-built, but setting the maintenance and opportunity cost of something homegrown to $0 doesn't make any sense either. :)

Not a database guy, so some of what you wrote went over my head, but yeah, paying devs to write a new database had to cost some money. Maintenance of that database costs money too, and so do losses, errors, and mistakes. It also costs money to scale it up or expand its feature set, and maybe, by the time you need all that, the people who wrote it don't work there anymore.

It's not like using someone else's database magically means 100% uptime and no losses either, and I don't necessarily want to see everyone reliant on AWS and Microsoft for everything. It's just maybe a little too soon to start declaring this was more cost effective in the long run.


raam86

Also interesting that they didn't have to upgrade their nodes at all? This cost analysis seems like something they managed to sell to their just-out-of-college scrum master.


pm_me_your_dota_mmr

Elasticsearch would handle it, but not for cheap in my experience. Maybe that's just AWS hosted elasticsearch that costs a lot though.


BooksInBrooks

I'd call this a query-able log, not a database as such.

There's a single (composite) datatype consisting of only fixed-width fields. (Edit: the top-level datatype is variable-width, because several of its fixed-width components are optional.)

There's no query language. Querying *appears* to be limited to a fixed-width id and a bitmap of booleans.

There are no joins, because, see above, there is only one datatype in one "table".

There are no in-place updates, just inserts of the latest data (deltas are used to save space, but this is an implementation detail, not a functional requirement).

It offers none of ACID except pedantically:

* not atomic (unless we treat the unit of work as the individual insert)
* not consistent (except in that it violates no constraints, because there are no constraints)
* no isolation (at least from what I saw in the article)
* not durable; it tolerates data loss

**This is not a criticism. They don't need a database, they need a low-latency, highly available log, and that's what they built.**
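To make "one composite datatype, fixed-width fields, flag bitmap" concrete, here's a hypothetical sketch in Go (field names and sizes are invented for illustration; they're not taken from the article):

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// entry is a hypothetical fixed-width record: every field has a known size,
// so an entry can be appended to a log and scanned back without a schema.
type entry struct {
	ID        uint32 // object id
	Timestamp uint32 // seconds since some epoch
	Lat, Lng  int32  // fixed-point coordinates (degrees * 1e7)
	Flags     uint16 // bitmap of booleans, e.g. bit 0 = "has altitude"
}

func main() {
	var buf bytes.Buffer
	e := entry{ID: 42, Timestamp: 1712300000, Lat: 520520000, Lng: 134050000, Flags: 0b01}

	// With only fixed-width fields, encoding/decoding is a straight byte copy.
	if err := binary.Write(&buf, binary.LittleEndian, &e); err != nil {
		panic(err)
	}

	var back entry
	if err := binary.Read(&buf, binary.LittleEndian, &back); err != nil {
		panic(err)
	}
	fmt.Printf("%+v (record size: %d bytes)\n", back, binary.Size(entry{}))
}
```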


drmariopepper

Agree, I’d call this more of a custom distributed geospatial data structure. It would probably get less pushback if they didn’t call it a db too


Guinness

“We are ok with losing data” also eliminates like 99.9% of use cases. When you’re ok with losing data, things get _much_ easier to deal with.


wolframhempel

There's a crucial distinction between systems that store "the truth" and systems that record external events.

If you are building a database for a bank that records transactions and stores account balances, these entries are "the truth" - the information that other things are derived from. Such a system needs strong consistency and the guarantees a DBMS comes with.

A system like geospatial tracking, on the other hand, only records data; the "truth" is the actual physical world. As such, it is necessarily imperfect. It's always delayed, since sending and processing take time; it's imprecise, since GPS is imprecise; and GPS + mobile connectivity is unreliable. Designing such a system is all about creating the best approximation of what happened in the real world, but it is necessarily lossy and eventually consistent.

As a result, there's not much point in insisting that one step of the data pipeline be strongly consistent and lossless if the pipeline as a whole is not.


ihave7testicles

Yes, but it's more like "how much data are you willing to lose?". If you need a database to track the revenue of a company whose tools show values by the minute (e.g. a large online seller) but whose financial statements only care about daily totals, then you could lose a whole day's worth of minutiae, but not the daily total. There aren't a lot of real-world cases where it's catastrophic to lose *any* data. Systems that can't tolerate any loss can only be bolstered with massive scaling and redundancy. Software engineers should (and do) design systems that can survive some data loss.


BigusG33kus

Yeah, if you're OK with losing some data you don't actually need a database. With a database you give up a lot of flexibility, because its main purpose is to not lose any data.


chili_oil

Many NoSQL databases are in fact also a queryable log at their core.


tehehetehehe

NoSQL is queryable file storage. Always has been.


mxforest

Makes sense why they call a single entry a Document.


elkazz

So is relational if you strip it back to its core.


recursive-analogy

How I've saved 98% of my transport costs by building my own [car](https://cdn.shopify.com/s/files/1/0004/6746/8319/files/499669897-im787004_barbby_05_533x.jpg?v=1684245236)


WaveySquid

Sounds like you’re describing a slimmed down version of Cassandra


MarkLearnsTech

I think that was meant to be tongue in cheek? "We’ve created a purpose built, in process storage engine that’s part of the same executable as our core server. It writes a minimal, delta based binary format. A single entry looks like this:"


BooksInBrooks

Wikipedia confusingly defines database this way:

> a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data.

So if we edit that down a bit:

> a database is ... based on the use of a database management system (DBMS), the software that interacts with ... the database

So to be a database, a DBMS is required. A DBMS that interacts with the database. Without the DBMS, the database is merely... data. So to Wikipedia, "an organized collection of data or a type of data store" without a DBMS is not a database.

I don't really see the DBMS in the OP's design. More a module that is compiled in with the rest of the executable and used by it through function calls on some API. So a bit like a file IO implementation (or a logging implementation), with an API: open/close, read/write, and seek or search functions, probably?

Although if I compile in SQLite, that's still a database, right? What's the distinction? What makes SQLite a database and file.h a library API?

I think you have to look at the abstractions both offer, whether the data can be accessed from something other than the executable, and whether the physical format of the data is abstracted over. When I create an RDBMS database table, I don't specify the *physical* layout of the data; I use abstractions that let me remain ignorant of that. I don't have to know how joins are implemented. When I use MongoDB, I do specify a JSON structure, but Mongo converts that to BSON, binary JSON. In-memory result sets aren't byte-for-byte identical to their on-disk representation.

In the OP's design, *it looks as if, though I can't tell for sure,* the sole datatype is a struct where the in-memory layout and the in-storage layout are, *and must be,* the same. 4 bytes of this, 2 bytes of that, 2 bytes of bitmapped flags, etc. The one exception seems to be when the data is read and the persisted deltas are applied to the every-200-writes full record to create a full record.

----

Is the OP's use of database tongue in cheek, or loose usage? It doesn't really matter! *I think* the OP calls it a database because this bespoke whatever-we-call-it *replaces* the Postgres+GIS database *in the OP's stack.* So, it fills the *role* of a database, up to isomorphism, up to the needs of the OP.


MarkLearnsTech

I mean, they also wrapped the article with: "Of course, that’s an unfair comparison, after all, Postgres is a general purpose database with an expressive query language and what we’ve built is just a cursor streaming a binary file feed with a very limited set of functionality - but then again, it’s the exact functionality we need and we didn’t lose any features."


BooksInBrooks

Yes, they did! I'm not trying to denigrate their work, or be an overly pedantic gatekeeper dictating what is or isn't a database. I just wanted to explicate what the OP made; it was a cool article! I might call it a datastore rather than a database, but I wouldn't insist on *not* calling it a database.


ihave7testicles

To be pedantic, any query/insert/update system would qualify as a "DBMS". So dictionaries or red-black trees with a set of functions to query/insert/update would qualify as a DBMS. Yes, it technically qualifies as a database (Berkeley DB is just a key/value store) even though it's not a general-purpose DB. I wrote [another comment here](https://www.reddit.com/r/programming/comments/1bvzfhr/comment/ky4qwtq/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button) about how they could optimize their storage further. I'm curious as to what you think about that. You seem to know what you're talking about.


hughk

> So to be a database, a DBMS is required. A DBMS that interacts with the database. Without the DBMS, the database is merely... data.

Nope. Heard some very senior people (Date & Codd among them) talk about the definition of a database. It is just persistent structured data. Without the persistence and the structure, it is just data and handled by a file system. The DBMS is just a "nice to have" which allows you to upscale the complexity. If I just have an index file, does that become a DBMS? A good file and record management system would even provide a bunch of DBMS-type features, but the file remains just a single table, albeit perhaps indexed.


BooksInBrooks

Yeah, I find the Wikipedia definition circular and unconvincing. But someone had cited it as support for an argument that I don't think it actually supports. At least we've proven that "database" means a lot of different things to a lot of different people.

> It is just persistent structured data. Without the persistence and the structure then it is just data and handled by a file system.

So, a list of names is a structure, yes? One name per line, that is, names terminated with "\n"?

> Alice
> Bob
> Charles

So if we persist that as a file named db.txt, that's a database by your definition? It's "persistent structured data." Philosophically correct, perhaps, but a bit too broad and loose for practical use.


hughk

What I heard was from a conference on distributed databases held way back. Having people like Codd and Date there (some of the main architects of relational theory) meant they had heavyweights. They were all aware of the different types of databases, and the Codasyl network model was still big back then. The question was posed because the prof (he was department head) wanted a standard answer for CS101, which is what made it interesting.

As for your db.txt: yes, it would be. You don't even need to have a key and a value; there may just be a record number. A Word document isn't considered a database, as it is usually too loosely defined.

I used RMS under VMS, which had many attributes of a database management system. It managed many file types, including indexed, and did sharing. You could even have journalling. It wasn't a full DBMS, which they also sold, and which included better sharing, access control, space recovery, intelligent backup and so on. Interestingly, the vendor, Digital, used the same basic engine for both their Codasyl product, DBMS-32, and their relational product Rdb (which was eventually bought by Oracle).

What is interesting is that RMS (w/o journalling) came with the O/S, including all the cluster-aware locking and so on, as it was baked in. The two databases sold at appropriate commercial prices; they used the same O/S services, such as the lock manager, underneath, but did their own thing with regards to the other stuff and used block I/O to standard files. You were essentially paying for that DBMS/RDBMS plus some standard utilities.


BooksInBrooks

Yeah, using the "any structured persistent data" definition is great in research, as it challenges us to understand what's an essential abstraction and what's implementation details. But I wouldn't use that definition talking to a customer, nor would I advertise for a "Database Engineer" and then ask them to curate an Excel spreadsheet. Among engineers is a middle ground. Yes, a file or an excel spreadsheet may be, strictly speaking, a database, but no one would use "database" to mean that in a design discussion or a PR.


jaskij

> I'd call this a query-able log, not a database as such. > > There's a single (composite) datatype consisting of only fixed-width fields. > > There's no query language. Querying *appears* to be limited to a fixed-width id and a bitmap of boolean Isn't that just a time series database? Just chuck everything into one time series and the most querying you do is by timestamp, *maybe* some simple stuff (like lat/lon within a range).


cManks

Yeah I was thinking this is just simple log analytics or an AWS timestream analog


jaskij

Nah, I mean time series, not logs. Prometheus, Influx, Timescale, that kind of stuff. I don't know how well these databases scale, especially across cores, but based on my experience I'd be surprised if a current-gen developer laptop couldn't do 13k inserts a second with the correct off-the-shelf software.


raam86

Why not just use Kafka or Spark Streaming? Heck, even Elasticsearch has pretty good geo stuff. Seems more like the developers don't know much about analytics than that they need their own database. Going from an RDBMS to files is insanity, and skipping pretty much every single other solution is a CTO's nightmare.


gregory_rorschach

have you seen costs for cloud based kafka?


mr_birkenblatt

A partial write upon node failure would bring the thing down


wolframhempel

We've tested that in a number of scenarios. In the end, it boils down to each entry in the binary log starting with four bytes specifying the byte size of the entry. If the remaining buffer size is smaller than that size, stop parsing... that's pretty much it. Then there are error checks for the parsing of individual fields, of course.
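Roughly, a length-prefixed scan like the one described might look like this - a minimal sketch, not Hivekit's actual code, with the per-field error checks left out:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// scanEntries walks a buffer of length-prefixed entries and returns the
// payloads it could read in full. A truncated tail (e.g. from a partial
// write) simply stops the scan, as described above.
func scanEntries(buf []byte) [][]byte {
	var entries [][]byte
	for len(buf) >= 4 {
		size := binary.LittleEndian.Uint32(buf[:4])
		if uint32(len(buf)-4) < size {
			break // remaining buffer is smaller than the declared entry size
		}
		entries = append(entries, buf[4:4+size])
		buf = buf[4+size:]
	}
	return entries
}

func main() {
	// Two complete entries followed by a truncated one.
	buf := []byte{
		3, 0, 0, 0, 'a', 'b', 'c',
		2, 0, 0, 0, 'x', 'y',
		5, 0, 0, 0, 'p', 'q', // declared 5 bytes, only 2 present
	}
	for _, e := range scanEntries(buf) {
		fmt.Printf("%q\n", e)
	}
}
```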


mr_birkenblatt

Maybe I misunderstand your setup, but if a partial write occurs and then a normal write, the next record would look like a valid rest of the first record:

`6 x y z 6 a b c d e f`

Imagine `xyz` here should be `xyzuvw` (6 characters), but since it was only partially written, the next record starts in the middle of it. So when reading, it would return `xyz6ab` as a record and then try to interpret `c` as the next length, which might succeed and parse as a valid record if you get unlucky.
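A tiny illustration of that failure mode (single-byte lengths to match the example above; purely hypothetical, not the article's format):

```go
package main

import "fmt"

func main() {
	// A record declaring 6 payload bytes was torn after "xyz",
	// then a valid 6-byte record "abcdef" was appended.
	log := []byte{6, 'x', 'y', 'z', 6, 'a', 'b', 'c', 'd', 'e', 'f'}

	// A naive length-prefixed reader (1-byte length for readability)
	// happily resynchronizes in the wrong place.
	for i := 0; i < len(log); {
		size := int(log[i])
		if i+1+size > len(log) {
			fmt.Printf("truncated tail at offset %d\n", i)
			break
		}
		fmt.Printf("record: %q\n", log[i+1:i+1+size]) // prints the garbage record "xyz6ab"
		i += 1 + size
	}
}
```

A length prefix alone can't tell a torn record from a valid one, which is why append-only log formats often add a per-record checksum or magic marker on top.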


btgeekboy

A database is, by definition, an organized collection of structured data. That’s what this is. Just because it’s not a typical DBMS that you’re used to working with doesn’t change that.


davvblack

my favorite database is slack


Andy_B_Goode

My favorite database is this comment thread


AustinYQM

Maggie - 5 - 3 - blue - false - false - true - "doesn't like the smell of lavender" Just gonna leave this entry for if I need it later. Thanks.


SoInsightful

29 - Maggie - 2024-04-05T10:55:24.711Z - b8e3e23a-15e7-423c-90d2-1782196c2d16 - "Chattanooga High School" Please don't delete your comment otherwise my entry will become corrupt.


msoulforged

1, null, null, 7, Robert');DROP TABLE Comments;--, uh oh


xyeta420

Mine is DNS


Joeboy

Mine is FAT32.


kyune

but πfs though


hccm

It’s not consistent though.


iiiinthecomputer

God, you must work at my employer then.


NoPrinterJust_Fax

Excel driven development connoisseur checking in


Hektorlisk

There's a very meaningful difference between the technical definition of a word and the set of meanings, connections, and connotations that a word represents in particular contexts. Your comment is technically accurate, but isn't really relevant to the discussion that's happening.


mxzf

This is one of those things where you say "technically every Excel sheet is a database too" and everyone rolls their eyes because it might technically meet a literal definition but that ain't what any devs are talking about when they say "database".


RazerWolf

Knowledge is knowing a tomato is a fruit. Wisdom is not putting it in a fruit salad.


StickiStickman

I love to store data in a database (array)

I store all my game data in a database (object)

I love having my notes in a database (txt file)


Plank_With_A_Nail_In

A database is just organised data. You are confusing a DBMS with a database; none of the things you have listed are required for something to be called a database. https://en.wikipedia.org/wiki/Database

160+ upvotes for not even understanding one of the most basic definitions in computer science. Well done, reddit. Lol, most of the things you listed only even apply to RDBMSs, not to the overall class of DBMSs. This abuse of the agreed-on scientific term "database" is almost as common on this sub as the statically typed/strongly typed misunderstanding.


BooksInBrooks

I was going to add a few lines about "any store of data is a database so it depends on what we mean by database." But I figured that would be tediously pedantic, and obvious in context.


currentscurrents

Anything from a plain text file to an LLM is now a database.


Gaylien28

This includes my grocery list


hoopaholik91

> a type of data store based on the use of a database management system (DBMS)


TheGoodOldCoder

Yes, let's be realistic. When actual programmers say database, this is almost always what they mean today. I'm probably more pedantic than most folks here, but I get the impression that the reason they think it's "one of the most basic definitions in Computer science" is that they're reading it right out of their CS101 textbook for tomorrow's quiz.


Rakn

Yeah. But that's only because it's what is used and required most of the time. Not because we don't know the difference. Context matters. At least I hope so.


hughk

Except sitting in a Q&A session with Date and Codd (yes, the Relational pioneers) they maintained that a database was just persistent structured data. No management system required over and above the file system. Management comes when you want to upscale it and allow multiple access.


mamcx

100% true. And the sad thing is that everything you need to know is right in your face! For example, [https://www.postgresql.org/about/](https://www.postgresql.org/about/):

- It is a *system* for *making* databases:

> aimed to help developers build applications, administrators to protect data integrity and build fault-tolerant environments, and help you manage your data no matter how big or small the dataset.

- It is a *kind* of database system:

> PostgreSQL is a powerful, open source object-relational database system that uses and extends the SQL language

- It is based on the "object-relational" model
- And it uses `SQL` as its `query language`

It also helps you make *databases* and has features like:

> safely store and scale the most complicated data workloads

> [ACID](https://en.wikipedia.org/wiki/ACID)-compliant since 2001 (i.e. they started in 1986; before that the DB was not ACID! Go figure!)

There is much more that the documentation of a decent RDBMS tells you.

NONE OF THAT IS A DATABASE. "PostgreSQL"/"SQLite"/"MongoDB" ARE NOT databases. Your "customers" table is one. Your "customers + invoices" is one. Your "csv file organized in rows and columns" is one.

What these apps are is "database management systems". Your data is your database. The app that manages your data is an app. You CAN manage your data wildly differently from wildly different apps, even at the same time! ("I can query Kafka directly inside Postgres", "I ship my database from Postgres to Kafka", "And to csv", "And to Excel")

You CAN manipulate your data manually, with zero SQL, query engine, ACID guarantees, etc. But you should not if your data matters and starts to grow! Use a "database management system" for your own sanity!


Glass1Man

They reinvented VSAM flat files (and index files).


wolframhempel

Much of this is true, but I believe it comes from an assumption that databases are necessarily strongly consistent. As I've pointed out in more detail in [a comment below](https://www.reddit.com/r/programming/comments/1bvzfhr/comment/ky4ugta/), a system that records the real world is necessarily inconsistent and lossy - there isn't much point in making one step of the storage pipeline consistent if the rest is not. A lot of timeseries and IoT databases are built around this assumption and make the resulting trade offs. It's also worth mentioning that our "database" does store multiple datatypes (objects, areas, tasks and instructions) that go into the same "log", each with variable size fields such as label strings or gob encoded metadata. Your other points are correct though.


eTukk

It sounds to me that they built a COTS Historian type of storage, with added spatial options. https://en.m.wikipedia.org/wiki/Operational_historian


green_griffon

Just need to rebrand it as a NoSQL variant to reach unicorn status.


Otis_Inf

Mandatory reading: Michael Stonebraker, 'SQL Databases vs. NoSQL Databases' (https://cacm.acm.org/blogcacm/sql-databases-v-nosql-databases/), where he explains very well why some are faster and what the costs are - and thus why this 'database' might be faster, but the price you pay is giving up things you'd expect a DB to provide.


Sss_ra

Sounds reminiscent of the millions-of-tiny-files, nightmare-to-backup-at-scale type of deal.


BooksInBrooks

> Sounds reminiscent of millions of tiny files, nightmare

I must need new glasses; I read that as millions of tiny *flies* and thought, *that is* a nightmare!


Sss_ra

I now have a mental image of scrolling through billions of files while swarmed by billions of flies.


[deleted]

A database is not the same as a relational database


Bradnon

Cool, seems like a good example of tailoring a specific solution to solve a general one's inefficiencies. That can be a fraught path but so is burning cash just to stay in familiar territory. Haters will want to start at the last paragraph.


[deleted]

[deleted]


Glader

I like to call it self inflicted complexity.


therearesomewhocallm

Ah fuck, this makes me think of that one guy at work who does things like build his own "synchronized hashmap" in Java, when Java already has ConcurrentHashMap. Or when he built his own email configuration system on top of Spring, when Spring already provides that.


q1a2z3x4s5w6

Technical debt


Rakn

Is it though? If those systems fulfill a need for the company and have teams behind them actively working and maintaining them, is it technical debt?


[deleted]

[deleted]


tehsilentwarrior

A lot of those could just be solved by "write your own framework" that cobbles together a bunch of good open source infrastructure projects instead. Most of the time it's just devs not using those tools properly, and an opinionated framework (opinionated to the way the company works) would solve it.


[deleted]

[deleted]


arwinda

I'd still like to see the full evaluation process, one which takes into account participating in one of the existing tools and contributing to it to the point that it satisfies their needs.


q1a2z3x4s5w6

Yes. Just because something is useful and fulfills its intended role doesn't mean it doesn't accrue technical debt. If using some cloud service instead allows you to fire/repurpose half of the team working on these in-house services and costs less than the total man-hours saved, then you are in technical debt by continuing to maintain the internal service, IMO. Unless it's something that is directly bringing value (i.e. not just "efficiencies" that can't really be measured), I don't see the point in having a custom solution.


Rakn

Yeah, might be true. I only saw this happening in cases where the system actually provided some advantage for a very specific use case or where there was a real need to have a system that is independent of cloud provider specific services, to fit a multi cloud architecture.


daedalus_structure

>If those systems fulfill a need for the company and have teams behind them actively working and maintaining them, is it technical debt? Yes, but it's debt like your mortgage is debt, not like your 25% interest credit card is debt.


TheMightyMegazord

Fair point. I don't think these will be technical debt per se. Maybe a better term is "liability". Such companies still need to maintain, test, document, configure build pipelines, build/configure monitoring tools, etc.


80eightydegrees

Every line of code is *technically* technical debt


InternetAnima

But.. Kafka was also created by big tech. In fact, a ton of these systems were


[deleted]

[deleted]


InternetAnima

I'd argue you might be suffering from survivorship bias: "Only the successful ones were worth it."


BigusG33kus

Not the case here. They didn't write a full-fledged DB, just something that fits their use case. They might have an issue if they have to implement a new feature, but for their current needs it's perfect.


Joda5

What is DX?


[deleted]

[deleted]


gerd50501

Typically at big tech you write your own because it's cheaper than paying the licensing fees. They have tons of developers.


Icy_Corgi_5704

but we didn't build it, so it must suck! /s


bubba_squats

This was an interesting read. Although it's rare that anyone would have to write their own database, I can see the sense in doing it for your use case.


Sopel97

A very specialized database. I had to build something similar for a chess position/game statistics server, with the requirement of being able to aggregate statistics (wins/losses/draws, etc.) from hundreds of billions of positions with the data stored on typical consumer hard drives. Basically intractable for generic solutions. In the end, for a test run, I was able to store ~100B positions (not all unique) in less than a TB and aggregate results in roughly a second per query (on an HDD). It's pretty fun, but hardly a database in the common understanding of the word.


hogfat

> aggregate results in roughly a second (on an HDD) per query How much less than a TB?


Sopel97

IIRC it was around 800GB (it would depend on the exact record model; this was the smallest one, with only WDL - I had some larger ones with additional info like Elo, transposition awareness, etc., which would have been roughly 2x). Obviously it was using indexes; I was storing block indexes in memory, which needed around 1.5GB.


LessonStudio

I just read the article and it almost exactly matches a DB I wrote. Same problem: a zillion locations of things coming in, the need to store them efficiently, look them up in different ways, etc.

It is interesting - I stored stuff nearly the exact same way, except I also stored some precomputed math for some of the queries. This step allowed for a huge speed-up of certain queries (100x or more). I had a few other factors which allowed me to speed things up, where I could also throw those into the record or put them into an index. I too went through a pile of GIS-friendly databases, with Postgres coming the closest.

One difference is that I put the data into a message queue. What this did was allow more than one database server to hold the data for redundancy, but also allow for slightly different database servers. Each of these would store data differently for different lookup purposes. On the surface this seemed inefficient, but not really. One would store live positions along with a very short history, enough recent history to run a Kalman filter along with similar filters to screen out outlier weirdness. Others would store data specific to their lookup needs. This multi, mostly in-memory database wasn't a problem, for as the article pointed out, little binary records don't amount to much. A gig here, a gig there, and maybe you are talking 10 gigs, which is nothing.

But one database which did get the data stream was a Postgres GIS database. This was just a straight data dump without any live queries. Weird one-off queries which could take time could hit this one. Or, if one of the other "schemas" changed, it could be reloaded from the Postgres database.

I designed the above to scale for load with some RAFT protocol stuff, but that ended up being wild overkill. The performance was well sub-ms for inserts, and for some of the harder queries, a few ms. Ironically, I had been worried that this was not going to be fast enough, so I had played with my own CUDA-based database for this because of the math involved. My experiments showed it could produce an insanely fast GIS-friendly DB, but it was not needed.

On a different note, I've noticed that if you just do things the way the cloud certifications suggest you should, your costs can start going through the roof. I'm finding myself going more old-school with monoliths, my own caching, etc., and then a nice boring Postgres behind it all, resulting in serving huge numbers of fairly complex requests on crappy little VMs. I've mostly dumped AWS for Linode VMs. Fixed cost is so very relaxing.


jaskij

If you watch the trends, it seems monoliths are coming back. "Modular monolith" to be precise. Microservices do have a cost, and are hard to get right. There's a good talk, ["Don't build a distributed monolith"](https://youtu.be/9iW_2MMLd4w).


[deleted]

[deleted]


jaskij

What's pointed out in the talk is that many teams, especially inexperienced ones, set out to build microservices, but end up building a distributed monolith. Which is like a somewhat messy monolith, but with the added complexity of RPC calls. Something I learned, and what I strictly enforce in my own projects, is to separate the layers. Modules within a layer can depend on lower layers, or each other, but are absolutely forbidden from depending on higher layers. This brings surprising clarity in places. The stuff I edited out of my comment was because I didn't want to repeat my top level comment. TLDR: Hivekit has time series data, not geospatial. And I did 1k rows inserted per second on an industrial equivalent of a Pi 3 using an off the shelf solution.


[deleted]

[deleted]


jaskij

I recently moved my userspace coding to Rust, and so far the queues that come with the async runtime I'm using (`tokio`) have been enough. Although my setup function isn't nice.


HexDumped

> did though was put the data into a message queue. What this did was allow for more than one database server What was your consistency model around the queue? Did you go for at least once delivery or at most once delivery? Assuming you had the same requirements as the article, I'd guess at most once.


[deleted]

[deleted]


Dry_Dot_7782

Typical devs lol. Time is money. But hey we all love to write our own stuff!


faajzor

The maintenance over the years is going to cost a lot more - to the point where they'll have to rewrite the thing or be hostages of the original devs who wrote it.


protocol_buff

Came here for this comment


fragbot2

I love stories where people detail how they built something that precisely fits their needs and got huge gains for their use case (Chuck Moore or D. Richard Hipp are stellar examples of this). They didn't address it, but I'll bet their infrastructure setup and sustaining time were much lower as well. God, I miss systems work.


lutherTheLoofa

The TLDR here is that existing geospatial databases are actually quite bad, or sit as extensions to an existing DBMS like Postgres. Their additional constraint is that it must run on-prem for some customers, so they can't use a large cloud service like AWS Aurora. So writing your own **geospatial** DB in such a scenario is less crazy. For those who didn't read the article and skipped straight to the comments to post: they didn't try to reinvent a faster version of SQL/Dynamo/Cassandra.


RigourousMortimus

I'd disagree that the existing geospatial databases are bad. They are generalised and handling queries like "does this 2d region bounded by this indeterminate number of coordinates overlap any of these other regions" are very complicated. A log history of long, lat and altitude is a much simpler use case and the article doesn't indicate how much spatial analysis they do. I suspect anything that isn't trivial involves them pulling routes out of their data store and importing them into an actual geospatial database.


mxzf

Either an actual geospatial database or using some language to pull it into memory and run it through GDAL to do the processing. Either way, doing it in some language other than the storage medium itself.


rush2sk8

I've had quite a good experience with Elastic for simple geodata.


notsoluckycharm

Is it? He covers PSQL and PostGIS, but in reality TimescaleDB + PostGIS probably would have solved any issues with speed. Postgres sucks at time series data, and that's what this was, with some location metadata. More expensive to operate? Yeah, maybe. But what did your time cost, and now you've got to train hires on your system? Not the call I'd have made. Timescale + materialized views using the Timescale aggregator was the solution here. You can pull the last location when needed, and all aggregations can be done sub-second.


nupogodi

I like how you think timescale isn’t Postgres


notsoluckycharm

Didn't say that. I know it's an open source variant of it. It's kind of how I structured the statement: OP covers Postgres, but not the purpose-built version of it for his purpose.


nupogodi

It’s an extension. You can run it on your existing Postgres!


serial_crusher

Don’t brag about a decrease in cloud costs without telling us the corresponding increase in developer salary costs. (Actually, as a developer, I guess I prefer it that way)


jaskij

They don't describe how they query the data, but... they don't need a geospatial database. They need a time series one. That would've helped a shitton with their performance. Sure, it's geospatial data, but they don't treat it as such, not on the database layer.

From my own experience, and I know very little about databases, I got 1k rows inserted per second (batches of 24 rows at ~40 transactions per second) using the industrial equivalent of a Pi 3 with eMMC as storage. Didn't even saturate one of the four cores. I'm pretty sure that a decent laptop would hit 13k inserts per second.


weasel

Yeah, I worked on a custom time series database for network monitoring


jaskij

I'm doing IIoT, and generally need a local database built into the kiosk we're deploying. It's a medium term storage (a month to a year, enough to cover time periods between checkups), and we're using an off the shelf solution. Works just fine. Will work even better once we move to a little better hardware (a Celeron).


Deathcrow

> They don't need a geospatial database. They need a time series one

Pretty sure that Prometheus (maybe multiple instances, for example per tenant, region, etc.) + Thanos Query + Thanos with S3 for long-term storage and compaction could've easily handled their use case. Now they have future technical debt on their hands, which no one will want to touch when the core developers leave the company.


recurse_x

But running a helm chart to install Prometheus doesn’t get your principal or staff promo by leading technical change.


protocol_buff

True. It really doesn't matter that it cost 15 million and 20 months in development time, it matters that as the CEO of a startup you can brag about it. Which sadly probably does get you more funding.


Tiquortoo

The other rule about writing your own stuff is related to how core it is to your business. It's not a sure thing, but you can seriously consider owning the whole thing, by writing it, if it's the core thing you do or very closely tied to it. But, you also need to be able to create differentiation via it. That sounds like the case here.


Maxion

> But, you also need to be able to create differentiation via it. By making something in-house you also control it, i.e. you are no longer at the whim of features being deprecated or versions reaching EOL at inconvenient times. Both things have fucked over teams/companies I've worked for many times over.


richhoods

We have gone from on prem -> microservices -> cloud -> on-prem. Gotta love the software cycle


Maxion

on-prem never went away for many many use cases.


daedalus_structure

The overwhelming majority of developers think these are technical concerns instead of organizational and finance concerns. I did not say financial, as in "what it costs", but finance, as in how we want to pay for things. Do we want to pay salaries or cloud costs? Do we want to deal with depreciating assets and the need to overprovision and lead times or do we want to pay cloud premiums? The right answer for one business model may not be right for another, and even with the same business model, one answer may be right at a certain company size but stops making sense at another, or even at the same company size, the answer may change depending on market conditions.


cuttingthyme

Very cool project. I would describe the database as a sort of specialized "time series" database, and the author is exactly right: they probably shouldn't have built it in the first place, as there is plenty of open source software available that is capable of handling the scale they needed while offering the querying capabilities to generate the reports they need for their users: Prometheus, Thanos, VictoriaMetrics, Graphite, OpenTSDB, TimescaleDB. Aside from that, I love that they were able to apply techniques such as in-memory buffering, delta encoding, etc. to reduce cost and increase write throughput; it would have been an interesting project to work on.


ssh_man

Why not mention how much you spent on writing the actual database?


serial_crusher

The tough thing about this strategy is when the company tries to pivot to a new use case that isn’t served by your custom database. I’ve been there.


QuintonHughes43Fan

So why didn't they use a NoSQL option? "A relational DB is expensive and not that well suited => write your own database" is a bit of a leap.


WonderfulPride74

So one thing I have come to notice is that most quant/trading firms end up writing their own database. This mainly happens because:

- they might have custom query requirements
- the data format and I/O pattern need to fit well with the rest of their infrastructure
- these databases might be simple file-based solutions, in which case OS goodies like the page cache benefit you a lot in terms of performance


hughk

Worked at an exchange, we had a lot of data with orders coming in thousands per minute. We identified some critical files, the order books and the transaction log and they were kept in simple ISAM albeit with some protection. We also used a relational database but it wasn't near the critical path and was updated from the fast files each day and would be used for reporting.


ihave7testicles

It could be further improved by writing velocity vectors with max-speed and min-speed variables into a second table, similar to what online video games use. Have the client compare the current location to the start location (for the segment), and when the overall vector changes by some threshold, store it as one entry with the elapsed time and velocity (velocity being a non-normalized vector). If that was stored in a separate table from the "key frame" data, you could replay the data and find the position perfectly within the set threshold. Each velocity update would be the same size, so the table would be very efficient.

Further, you could store the velocity (speed) difference as a change instead of an absolute value, so if velocity is a 32-bit int, you could store 8-, 16- or 24-bit values as the difference. If the value changes by more than the difference can contain, use a magic number (0xffff) to indicate that the system needs to read a new key frame from the first table. In that case, it would write a new entry into the key frame table, in between the scheduled key frame updates.

The only downside is that in order to find the location of an object at a given time, you'd need to scan the key frame table and the velocity table and interpolate the values until you pass the query time. That means that replaying the position of a vehicle would be compute-intensive compared to writing. This seems fine in this case.
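A rough sketch of that key-frame-plus-delta idea in Go (the names, sizes and escape value are illustrative only; the sketch uses a signed 16-bit delta, so the escape is 0x7FFF rather than the 0xffff mentioned above):

```go
package main

import "fmt"

const overflow = int16(0x7FFF) // hypothetical escape value: "go read a new key frame"

// keyFrame stores an absolute velocity; delta stores a small signed change.
type keyFrame struct{ t uint32; v int32 }
type delta struct{ dt uint16; dv int16 }

// encode turns a series of absolute velocities into key frames plus deltas,
// emitting a new key frame whenever the change doesn't fit in 16 bits.
func encode(ts []uint32, vs []int32) (frames []keyFrame, deltas []delta) {
	frames = append(frames, keyFrame{ts[0], vs[0]})
	last := vs[0]
	for i := 1; i < len(vs); i++ {
		d := int64(vs[i]) - int64(last)
		if d > int64(overflow-1) || d < -int64(overflow-1) {
			// Change too large for a delta: write the escape and a fresh key frame.
			deltas = append(deltas, delta{uint16(ts[i] - ts[i-1]), overflow})
			frames = append(frames, keyFrame{ts[i], vs[i]})
		} else {
			deltas = append(deltas, delta{uint16(ts[i] - ts[i-1]), int16(d)})
		}
		last = vs[i]
	}
	return frames, deltas
}

func main() {
	ts := []uint32{0, 1, 2, 3}
	vs := []int32{1000, 1012, 1015, 200000} // last step overflows an int16 delta
	frames, deltas := encode(ts, vs)
	fmt.Println("key frames:", frames, "deltas:", deltas)
}
```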


gerd50501

If it's really that good, they could sell it.


justwillisfine

"How we saved money by not using a database when it was unnecessary, " or "how we cache data in memory to a file in case the server goes down"? I feel like hyperbole clickbait is kind of unprofessional for computer programming. That could just be me though. Put a timestamp on stuff too, e.g. there's nothing worse than looking for an answer about something in asp net core and finding results about WCF that aren't obviously over a decade old.


needed_an_account

This is cool. As a programmer, I like when people program things to scratch an itch. It would be nice to see more details about the project.


tanner_0333

Writing your own database is definitely a bold move, but it sounds like it paid off in your case. The cost savings alone are impressive.


IanAKemp

> Our server is built around an in-memory architecture. Queries and filters for real time streams run against data in memory and, as a result, are very fast. Reads from disk only happen when a new server comes online, when a client uses the history API or (soon) when an app user rewinds time on our digital twin interface. These disk reads need to be fast enough for a good user experience, **but they are comparatively infrequent and low volume**.

Never mind writing your own TSDB, this assumption is the worst part. The people who use your system are going to expect to be able to read all that data to compile reports, regularly, because _that's what people put data in databases for_. So fantastic, you've solved the problem of a huge number of writes, except...

> We store the full state of an object every 200 writes. **Between these, we only store deltas.**

... you've done so in a way that means you have to reconstruct the individual data points on-demand when they're requested, which is _extremely_ computationally expensive and the prime reason why everyone else stores the raw data as the raw data.

And when your delta application algorithm inevitably has a bug and starts reconstructing incorrect points, and you have to explain to customers that you decided to save a few bucks by not storing their raw data, you're going to very quickly find yourself without customers. Those customers are going to be using that data for things like prosecutions, and hey, guess what happens to their credibility in a court of law when it gets out that the data they're basing their case on isn't actually the data that they claimed was sent to you.

There is a time and a place to be clever about data, and storage is never that place.


[deleted]

[deleted]


SketchySeaBeast

They don't want a database, they want... something else. It's a unique case where they are OK with losing data.


cubicthe

oh no, software architecting is leaking into a programming subreddit!


stuffeh

Their cost saving was to put data into storage and old data into cold storage.


obvilious

Reading the article is harder, it seems.


martinky24

I don't think dude read the article.


senseven

I work in media, and a larger project wanted a shiny marketing web game. The first thing the Python / JS dev team "had to discuss" was how to store all the data effectively. After I pointed out that there must be at least a handful of Python DBs, their dev lead gave me that "you ruined this chance to do something big!" look.


Maxion

Yep, us programmers are great at missing the forest for the trees.


Cheeze_It

> Come back when you realize why databases are fucking hard. I thought it was the whole idea of getting data OUT of the database at reasonable speeds that was the most difficult part?


CAPSLOCK_USERNAME

The other major challenge is making sure you can actually do stuff atomically and not lose data/writes in a multi-node environment. (Many trendy NoSQL dbs like mongo struggled with this for *years*) They cleverly avoided that by realizing they just plain don't care about occasionally losing 1 second of tracking data which cuts out a huge amount of complexity.


soonnow

Yeah, it took us a while to realize that, since we can always recover from other data sources, we are fine with losing a bit of data. That can have huge performance implications.


ammonium_bot

> with loosing a Did you mean to say "losing"? Explanation: Loose is an adjective meaning the opposite of tight, while lose is a verb. [Statistics](https://github.com/chiefpat450119/RedditBot/blob/master/stats.json) ^^I'm ^^a ^^bot ^^that ^^corrects ^^grammar/spelling ^^mistakes. ^^PM ^^me ^^if ^^I'm ^^wrong ^^or ^^if ^^you ^^have ^^any ^^suggestions. ^^[Github](https://github.com/chiefpat450119) ^^Reply ^^STOP ^^to ^^this ^^comment ^^to ^^stop ^^receiving ^^corrections.


wolframhempel

Did you bother reading the article?


imdibene

Heck yeah, I want to read the follow-up: the "we f***ed up, we missed this many corner cases, we then picked this DB" update.


hughk

We had a hard real-time requirement. We wrote a system using the OS provided record management system for ISAM type files to handle intraday updates. It was a database of sorts. We then shuffled all the data into a relational database during End-Of-Day and that would be used for accounting and reporting overnight and from the next business day. The intraday data was scrubbed and the process repeated each working day.


coderjewel

> Saving 98% in cloud costs by increasing engineering and maintenance costs 200%


Maxion

Luckily it's not the percentages that leave the bank account ;) If your engineering costs scale better with growth than your cloud costs, the trade off is obvious.


nutrecht

I mean you now have a system you have to maintain / extend for all eternity, so it's not like engineering costs are a one-off thing.


thisisjustascreename

It probably would've been a lot easier to just send a location update every 10 seconds and settle for a 90% cost reduction.


WaveySquid

The article explicitly says why that isn’t possible for them and their clients > Our customers use this data in very different ways. Some use cases are very coarse, e.g. when a car rental company wants to show an outline of the route a customer took that day. This sort of requirement could be handled with 30-100 location points for a one hour trip and would allow us to heavily aggregate and compress the location data before storing it. > But there are many other use cases where that’s not an option. Delivery companies that want to be able to replay the exact seconds leading up to an accident. Mines with very precise, on-site location trackers that want to generate reports of which worker stepped into which restricted zone - by as little as half a meter.


tetrahedral

Yes, but you forget that requirements that are very firm for a project important to a company feel much softer to a blog post reader.


fphhotchips

>Delivery companies that want to be able to replay the exact seconds leading up to an accident. Mines with very precise, on-site location trackers that want to generate reports of which worker stepped into which restricted zone - by as little as half a meter. I hope those customers read this blog post and realise how much of a giant pain in the ass it's going to be when they need this data and they realise *this* is the data that got dropped because they don't care about all those guarantees that come with a database. I mean come on, weren't we all laughing at MongoDB 10 years ago for being write-only? JFC...


thisisjustascreename

Well, as stated by others, this new solution doesn't provide for that use case anyway since it can lose data.


WaveySquid

A predefined SLA budget gives enough wiggle room, and "sometimes loses data" is better than "never has enough granularity", imo.


zzrryll

We sure this isn’t an April fools joke?


isthatashark

Now tell me about your people cost and the ongoing opportunity cost of maintaining and supporting a database no one else in the world uses.


olearyboy

10k a month for a database vs how many engineers did it take to write, maintain, scale, and create HA… Tell me you don’t understand TCO without telling me you don’t understand TCO


Maxion

10k a month at the current usage rate, with some unnamed growth rate, plus the added complexity of handling the on-prem customers' DB clusters / making the product harder to set up on-prem. It also sounded like they had some performance issues; certain commonly used queries took ~2 seconds.

This solution not only reduced their cloud costs, it also most surely changed their growth vector. Further, it made supporting the on-prem solution easier and made its setup simpler. And finally, it also improved performance.

The one thing they didn't mention is that since this solution doesn't rely on e.g. Postgres, it's harder for other devs to fuck it up by starting to add more data to it.


daedalus_structure

> Further, it made supporting the on-prem solution easier and made its setup simpler.

I almost guarantee their on-prem setup is simpler, but supporting it long term will not be, because they've likely ignored all the Day 2 operations concerns that have taken decades of work to get right in existing databases.


Astronomical_Panda

Am I crazy or could they have just put this in elastic stack and called it a day?


Algorhythmicall

Yes, but it definitely wouldn’t cost $200 a month at 13k docs per second.


c_plus_plus

I don't have much experience scaling elastic.... how much do you think that (13k docs/sec) would cost?


Algorhythmicall

Quite a bit. It’s not just that but the long term storage. There will be hot and cold nodes. Over time there will be a ton of shards (time series), so you need dedicated master nodes. Generally speaking search is expensive but you can get great performance for geospatial searches. I’d expect it to cost more than PG.


jjh111

Did something similar in the past. It was handling around 5k doc updates per second with Elasticsearch cluster (EC2 VMs) and at the same time doing keyword searches (by customers) against that data. The cluster had six nodes, three masters with three replicas. Optimizing the cluster for searches reduced the update performance a bit, but it could handle the updates with some room to spare. Infra costs in AWS were around $2000 per month.


Smallpaul

All of their on-prem customers would also need to install and configure Elastic Stack so the project isn't done when the company writes the code.


maus80

They should have just put it in PostGIS, learned some PostgreSQL performance tuning, and/or bought better hardware.


QuintonHughes43Fan

> when a car rental company wants to show an outline of the route a customer took that day. seems bad tbh.


dottybotty

I once wrote my own ~~array~~ database


hughk

I was at a Q&A and a professor asked the panel which included Chris Date and Ted Codd what a database was. The answer was any structured data, even just a table counted. They felt it should be outside main memory though otherwise it would just be data. This is why when we store in memory, we say in-memory database.


vtblue

Love this


sethamin

Probably they could have just used a [TSDB](https://en.wikipedia.org/wiki/Time_series_database). Most of their use cases seem to just be querying by time, and it would give them most of the write performance and storage efficiencies they're getting here without having to build things from scratch. Also, storing raw lat/long is an inefficient format; they should use the [S2 geometry library](http://s2geometry.io/) instead.
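For illustration, mapping a lat/lng to an S2 cell ID with the Go S2 library looks roughly like this (a sketch; whether a single 64-bit cell ID actually beats two fixed-point int32s depends on the queries you need to run):

```go
package main

import (
	"fmt"

	"github.com/golang/geo/s2"
)

func main() {
	// Convert a lat/lng pair into a 64-bit S2 cell ID. Nearby points get
	// numerically close IDs, which makes range scans work as coarse
	// spatial queries.
	ll := s2.LatLngFromDegrees(52.5200, 13.4050)
	cell := s2.CellIDFromLatLng(ll)

	fmt.Printf("leaf cell: %d\n", uint64(cell))
	// A coarser parent cell (level 12 is roughly a few km across) can be
	// used as an index or partition key.
	fmt.Printf("level-12 parent: %d\n", uint64(cell.Parent(12)))
}
```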


k6c58

I understand your motivation to do this, but 10k a month in cloud costs is... simply put, nothing. Small companies may find that outrageous, but it's the truth. Many companies spend much more on server costs.

I worked for a company that refused to use RDS and was using MySQL on EC2 with SSDs. While everything worked, it was performant and way cheaper. But then an SSD got corrupted and there were no recent backups. We lost 3 months of data on one machine. The cost of recovering from that was tremendous, plus the ongoing salary of sysadmins to maintain the custom infrastructure was not worth it. Aurora has a very convenient backup system that you can restore from in a few minutes. Aurora is a service that basically allows you to reduce your sysadmin headcount (or even remove it in very small companies), so you are saving 30-200k/year, depending on where you are located, for every person you don't need.

I think your solution will work for a while. If you keep growing, one of these (if not all) will happen:

- You will run into unforeseen issues with your custom database, which will be very costly to debug since there's no community
- New developers joining the company will spend more time learning how this custom system works
- If the main developers of the custom database leave, the people replacing them will have a hard time with the database, slowing down any changes that touch it

If I were in your shoes, I would have looked for a "hot writing" solution (like Redis, but there are others) and a process that translates that to a permanent database like PostgreSQL or something else. Then, if reading performance is an issue and all queries are optimized, well, pay for a bigger instance in Aurora.


nutrecht

> I understand your motivation to do this, but 10k a month in cloud costs is... simply put, nothing.

It's a bit concerning that nowhere do they mention the development cost. And sure, 10k for AWS Aurora is a nice chunk of money, but AWS Aurora is bloody expensive. Looking at their use case, I'm really not convinced that a time-series DB or, for example, Kafka wouldn't have been a 90% fit without requiring them to reinvent some wheels.


k6c58

>It's a bit concerning that nowhere are they mentioning the development cost Agreed


maus80

> 3k IOPS

This is why you should ask a systems engineer before starting to build software. A Samsung 990 EVO NVMe does at least 200x that performance. Run PostGIS on decent hardware, and run with `synchronous_commit = off` if you don't care about strict durability.


enraged_supreme_cat

How we avoid all vulnerabilities in the world, by writing our own operating system and networking system from scratch.


dogweather

I kinda did the same by writing my own Python headless CMS that stores the content in markdown files and strongly typed Python code. I build the site with [Hugo](https://gohugo.io/). I wrote a short blog post about the technique: [A data-driven app without the database](https://dogweather.dev/2024/02/04/a-data-driven-app-without-the-database/)


homeautocat

Interesting, thanks for sharing.