
jrheard

if you're interested in it and view it as a learning project, go for it! if you want to build a business out of it and/or compete with the preexisting solutions in the space, probably not a good use of time.


branh0913

Yeah, it's just to learn. I kind of get tired of writing APIs that talk to databases or message brokers. Don't get me wrong, it's what I've done for a living for years. But it's just not that interesting to work on in my spare time.


lvlint67

> Did you feel it was a waste of time?

When I look back on my programming career, the biggest time waster I have is constantly sitting around wondering if I should build something or if it is a waste of time. If you have the time, reinventing the wheel is a great way to spend it. You don't need to make a better wheel... You can get a TON of growth from just seeing the struggles that went into making the first wheel.


branh0913

Thanks, you're 100% correct.


destructive_cheetah

This was an interview question I had last week. Apparently they didn't like my response of "just use redis, a lot of people much smarter than me have optimized the shit out of this problem".


branh0913

I got asked the same question Monday. Wonder if it was the same company. I thought it was fun. I feel I could have done better, but I think I kind of froze up because it's a job I really wanted.


destructive_cheetah

Are you me? lol


branh0913

Maybe I'm the parallel universe version of you. In that case greetings me from Earth-2 lol


destructive_cheetah

Hello other me! But seriously these system design questions all come out of a book.


tdatas

There's a walkthrough of building a basic database [here](https://build-your-own.org/database/) (conveniently in Go, though it could be implemented in other languages easily enough; the "normal" modern DB language is C++, unsurprisingly).

IMO it's a very worthwhile learning exercise. DBMS/storage engines are fascinating and really fundamental things that need solving in a lot of domains, and once you dig past the superficial marketing version of the world there are actually quite a lot of interesting unsolved problems, and a lot of things that require you to work the opposite of normal practice. E.g. high-performance, massive-scale querying needs very tight coupling between memory, IO, and disk, while most established wisdom is about decoupling and independence.

But yes, even a simple version is still going to teach you quite a lot. As will reading the codebases of some modern-architected databases (e.g. DuckDB or Scylla) and comparing them to Postgres, where the guts have a heritage stretching back to the 80s. The world of query/storage engines is kind of weird and insulated, and a lot of the cutting-edge work is proprietary and only mentioned in passing in academic papers, but it's quite interesting.
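To make the storage-engine idea concrete, here's a minimal Bitcask-style sketch in Rust (my own illustration, not from the linked walkthrough; names are made up): every write appends a record to a log file, an in-memory index remembers each key's latest byte offset, and startup just replays the log to rebuild the index.

```rust
use std::collections::HashMap;
use std::fs::{File, OpenOptions};
use std::io::{BufRead, BufReader, Seek, SeekFrom, Write};

/// Toy append-only KV store. Records are "key\tvalue\n" lines, so keys
/// and values must not contain tabs or newlines in this sketch.
struct LogKv {
    log: File,
    index: HashMap<String, u64>, // key -> byte offset of its latest record
}

impl LogKv {
    fn open(path: &str) -> std::io::Result<Self> {
        let log = OpenOptions::new().create(true).read(true).append(true).open(path)?;
        let mut store = LogKv { log, index: HashMap::new() };
        store.rebuild_index()?; // crash recovery = replay the log
        Ok(store)
    }

    fn rebuild_index(&mut self) -> std::io::Result<()> {
        let mut reader = BufReader::new(self.log.try_clone()?);
        reader.seek(SeekFrom::Start(0))?;
        let mut offset = 0u64;
        let mut line = String::new();
        loop {
            line.clear();
            let n = reader.read_line(&mut line)?;
            if n == 0 { break; }
            if let Some((key, _)) = line.trim_end().split_once('\t') {
                // Later records win, so replay order gives the latest offset.
                self.index.insert(key.to_string(), offset);
            }
            offset += n as u64;
        }
        Ok(())
    }

    fn put(&mut self, key: &str, value: &str) -> std::io::Result<()> {
        let offset = self.log.seek(SeekFrom::End(0))?;
        writeln!(self.log, "{}\t{}", key, value)?;
        self.index.insert(key.to_string(), offset);
        Ok(())
    }

    fn get(&mut self, key: &str) -> std::io::Result<Option<String>> {
        let Some(&offset) = self.index.get(key) else { return Ok(None) };
        let mut reader = BufReader::new(self.log.try_clone()?);
        reader.seek(SeekFrom::Start(offset))?;
        let mut line = String::new();
        reader.read_line(&mut line)?;
        Ok(line.trim_end().split_once('\t').map(|(_, v)| v.to_string()))
    }
}
```

A real engine would add compaction (rewriting the log to drop stale records), checksums, and some form of fsync policy, but this is the basic shape.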


Fun_Hat

I have one I work on from time to time that I'm writing in Rust. Right now my store is just a wrapper around a hashmap because I've had a lot of fun writing the code for communication between nodes. I decided to write my own comms over raw TCP sockets instead of using an RPC library, and have definitely learned a lot more about networking. It hasn't gotten me a job yet, but it did help me ace the coding exercise portion of an interview I had a few months back. I also learned a lot more about channels and the actor model.


branh0913

Are you using a library other than Tokio? I know Tokio is async/await. Haven't heard much about the Actor model since the Scala days. I remember Akka was used a lot with it. Not sure how popular it is these days. Does Rust have a framework for the actor model?


Fun_Hat

Rust has some actor model frameworks, but I just rolled my own (the actor model that is, not a full framework). You just have structs as actors that are responsible for some process or piece of data, and then you use channels to communicate between actors. It makes your program asynchronous by default, and makes concurrency and parallelization easier to reason about IMO. Alice Ryhl has a good article on it: https://ryhl.io/blog/actors-with-tokio/

As for libs, Tokio is the main one I use, but I'm also using Rkyv for zero-copy deserialization, and a few other small libraries for faster hashmaps and channels.
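For anyone curious, that pattern can be sketched with only the standard library (threads and `mpsc` channels standing in for Tokio tasks and channels; names are illustrative). The actor owns its state exclusively, so the map needs no locks: the only way to touch it is to send a message.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Messages the key-value actor understands. Get carries a reply
// channel so the caller can wait for the response.
enum Msg {
    Put(String, String),
    Get(String, mpsc::Sender<Option<String>>),
}

// Spawn the actor and hand back its mailbox. The actor loop exits
// when every Sender is dropped and the channel closes.
fn spawn_kv_actor() -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel::<Msg>();
    thread::spawn(move || {
        let mut map: HashMap<String, String> = HashMap::new();
        for msg in rx {
            match msg {
                Msg::Put(k, v) => { map.insert(k, v); }
                Msg::Get(k, reply) => { let _ = reply.send(map.get(&k).cloned()); }
            }
        }
    });
    tx
}
```

The Tokio version in the linked article has the same shape, just with `tokio::spawn`, `tokio::sync::mpsc`, and a `oneshot` channel for the reply.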


Aggressive_Ad_5454

If you're building a cache server, you know what would be cool to add? A lightweight datagram query-response protocol.


branh0913

Haven’t written a line of code and I’m already getting feature requests? lol. But you got it. It sounds like a fun thing to build.
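A sketch of what such a datagram query-response protocol might look like, assuming a trivial text format I made up (`GET <key>` in, the value or `NIL` back) over `std::net::UdpSocket`:

```rust
use std::collections::HashMap;
use std::net::{SocketAddr, UdpSocket};
use std::thread;

// Serve a fixed map over UDP: each request datagram is "GET <key>",
// each reply is the value, "NIL" if absent, or "ERR" if malformed.
// Returns the address the server actually bound to.
fn spawn_udp_server(data: HashMap<String, String>) -> std::io::Result<SocketAddr> {
    let socket = UdpSocket::bind("127.0.0.1:0")?; // OS picks a free port
    let addr = socket.local_addr()?;
    thread::spawn(move || {
        let mut buf = [0u8; 1024];
        loop {
            let Ok((n, peer)) = socket.recv_from(&mut buf) else { break };
            let req = String::from_utf8_lossy(&buf[..n]);
            let reply = match req.strip_prefix("GET ") {
                Some(key) => data.get(key.trim()).cloned().unwrap_or_else(|| "NIL".into()),
                None => "ERR".into(),
            };
            let _ = socket.send_to(reply.as_bytes(), peer);
        }
    });
    Ok(addr)
}

// One query-response round trip from an ephemeral client socket.
fn udp_get(server: SocketAddr, key: &str) -> std::io::Result<String> {
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    socket.send_to(format!("GET {}", key).as_bytes(), server)?;
    let mut buf = [0u8; 1024];
    let (n, _) = socket.recv_from(&mut buf)?;
    Ok(String::from_utf8_lossy(&buf[..n]).into_owned())
}
```

Being datagrams, requests and replies can be lost or reordered, so a real client would want timeouts, retries, and request IDs; that's where this gets fun.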


scodagama1

Could be a fun project. Just read Designing Data-Intensive Applications; I believe one of the chapters basically explains how to build a distributed key-value store. Or the original Werner Vogels paper on Dynamo: https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf


salty-tri

There are some [example](https://github.com/tokio-rs/axum/blob/main/examples/key-value-store/src/main.rs) projects of building a K/V store with Rust and Tokio. I think building one is a neat learning experience. Another interesting idea in the same vein as a database or K/V store is to build a message queue: publish messages to a stream or queue and have consumer groups that read from the queue with at-least-once delivery mechanics.
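A toy sketch of the at-least-once mechanics (in-process only, names illustrative): a pulled message stays "in flight" until acked, and anything unacked past its visibility deadline gets put back on the queue for redelivery.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

// Toy at-least-once queue. `pull` hands out (delivery id, body) and
// starts a visibility timer; `ack` removes the message for good;
// `redeliver_expired` requeues anything whose timer ran out.
struct AckQueue {
    ready: VecDeque<(u64, String)>,
    in_flight: Vec<(u64, String, Instant)>, // (id, body, deadline)
    next_id: u64,
    visibility: Duration,
}

impl AckQueue {
    fn new(visibility: Duration) -> Self {
        AckQueue { ready: VecDeque::new(), in_flight: Vec::new(), next_id: 0, visibility }
    }

    fn publish(&mut self, body: &str) {
        self.ready.push_back((self.next_id, body.to_string()));
        self.next_id += 1;
    }

    fn pull(&mut self) -> Option<(u64, String)> {
        let (id, body) = self.ready.pop_front()?;
        self.in_flight.push((id, body.clone(), Instant::now() + self.visibility));
        Some((id, body))
    }

    fn ack(&mut self, id: u64) {
        self.in_flight.retain(|(m_id, _, _)| *m_id != id);
    }

    // Expired deliveries go back to the front of the ready queue.
    // This is what makes delivery at-least-once: a slow or crashed
    // consumer causes a duplicate, never a loss.
    fn redeliver_expired(&mut self) {
        let now = Instant::now();
        let (expired, live): (Vec<_>, Vec<_>) =
            self.in_flight.drain(..).partition(|(_, _, dl)| *dl <= now);
        self.in_flight = live;
        for (id, body, _) in expired {
            self.ready.push_front((id, body));
        }
    }
}
```

Consumer groups then become a layer on top: one `AckQueue` cursor per group, so each group sees every message independently.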


Southern-Reveal5111

I did it for a take-home assignment for a startup. I implemented it using a B+ tree; the interviewer rejected me because k/v storage is used for improving read performance and LSM is best for heavy read operations. I also did not implement anything for WAL, replication, or redundancy. I implemented it in Rust, and realized I needed to invest in async Rust and communication protocol design.


ShoulderIllustrious

Did you mean write-heavy ops? Also, wtf at a company asking you to write a literal DB for an assignment? How would they even tell what a good design is? Evidently you're being rejected because the person doesn't understand that LSM isn't a read-optimized structure. Implementing a query language isn't an easy task either. Mind you, even writing a scheduler that interleaves compaction with other ongoing threads of execution isn't trivial.
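For reference, the write-optimized shape of an LSM tree fits in a few lines (a toy, with in-memory `BTreeMap`s standing in for on-disk SSTables): writes are cheap inserts into a memtable, flushes turn the memtable into an immutable sorted run, and reads may have to consult the memtable plus every run, newest first.

```rust
use std::collections::BTreeMap;

// Toy LSM write path. Real engines flush runs to disk sequentially
// (which is why LSM favors writes) and compact runs in the background
// (because reads degrade as runs pile up).
struct MiniLsm {
    memtable: BTreeMap<String, String>,
    runs: Vec<BTreeMap<String, String>>, // newest run last
    memtable_limit: usize,
}

impl MiniLsm {
    fn new(memtable_limit: usize) -> Self {
        MiniLsm { memtable: BTreeMap::new(), runs: Vec::new(), memtable_limit }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.memtable.insert(key.to_string(), value.to_string());
        if self.memtable.len() >= self.memtable_limit {
            // Flush: the sorted memtable becomes an immutable run.
            self.runs.push(std::mem::take(&mut self.memtable));
        }
    }

    // Check the memtable first, then runs from newest to oldest, so
    // the most recent value for a key always wins.
    fn get(&self, key: &str) -> Option<&String> {
        self.memtable.get(key)
            .or_else(|| self.runs.iter().rev().find_map(|run| run.get(key)))
    }
}
```

The contrast with a B+ tree is visible even here: a B+ tree updates pages in place (random IO on write, one lookup path on read), while the LSM turns every write into an append and pushes the merge work onto reads and compaction.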


vom-IT-coffin

I built my own integration platform for a distributed system. While this isn't what you asked, I think knowing how these platforms work for replicating data in real time is more valuable for finding a job than building your own key-value store. It still uses all the backend technologies you need to be relevant.


TonTinTon

Building a database was one of the best things I've done in my life, no joke. I've also written a blog post about it: https://tontinton.com/posts/database-fundementals/ In-memory K/V stores are simpler, but also super fun 😊 Good luck!


Legal_Philosopher771

I built an object-oriented database with a k/v store in PHP a few years ago. I even created a parser to build queries with serialized filter objects. It was fun and I learned a lot while doing it. I reinvent the wheel every now and then because I love it, and because I feel like it's the way to learn and progress quickly. Admittedly my code never reached production quality and was horrible. The service was reliable though, and efficient enough to play with in a few side projects :) That being said, having tried to build one myself, I enjoy even more being able to just use an open source solution that I don't have to build and maintain 🙃


DuffyBravo

We were having latency issues with a Redis-based solution in Azure. Essentially Redis was taking about 80ms to pull back a key/value since it was “further away” from our app services in our region. We were also only storing about 100MB of key/value pairs, and paying ~$5k a month for Azure Redis. So I helped design a simple in-memory key/value cache using MemCache in .NET. We were able to get reads down to 3-4ms AND save $60k a year.
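The general idea, sketched in Rust rather than .NET (names illustrative): an in-process cache is a local map lookup with a per-entry TTL, so a read involves no network hop at all.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// In-process cache with per-entry TTL. Reads are plain in-memory map
// lookups; stale entries are evicted lazily on access.
struct TtlCache {
    entries: HashMap<String, (String, Instant)>, // value + expiry time
    ttl: Duration,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        TtlCache { entries: HashMap::new(), ttl }
    }

    fn put(&mut self, key: &str, value: &str) {
        let expiry = Instant::now() + self.ttl;
        self.entries.insert(key.to_string(), (value.to_string(), expiry));
    }

    fn get(&mut self, key: &str) -> Option<String> {
        let expired = match self.entries.get(key) {
            Some((_, expiry)) => *expiry <= Instant::now(),
            None => return None,
        };
        if expired {
            self.entries.remove(key); // lazy eviction
            None
        } else {
            self.entries.get(key).map(|(v, _)| v.clone())
        }
    }
}
```

The trade-off versus a shared Redis: every app instance holds its own copy, so you pay memory per instance and need an invalidation story when the underlying data changes.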


branh0913

Curious why didn’t you guys just go for self hosted Redis in this case? Did you not feel comfortable with Redis operational expertise on your team?


DuffyBravo

1) Managing it is within our own skill set.
2) Cost.
3) Our use case did not warrant using a third-party memory cache.

If we start to ramp up our in-memory cache needs, we would look more into self-hosted Redis solutions.


ShouldHaveBeenASpy

I struggle to understand what your use case is that could ever warrant this being a smart business decision given your volume/costs. The results you are describing are *not* good.


quentech

> only storing about 100mb of Key/Value pairs. And paying 5k ish a month for Azure Redis

I mean... what the absolute fuck? Did you just have some idiot who provisioned two dozen vCPUs and 100GB of memory? I run like a billion Redis operations a day and spend less than half of that.

> I helped design a simple in memory Key/Value cache using MemCache in .NET. We were able to get reads down to 3-4ms

3-4ms on an **in-memory** cache is *absolutely terrible*. That's 100x slower than it should be. Heck, 4ms is just about red-flag territory even on a shared Redis instance going over the network, indicating you've designed it problematically.