What are skiplists good for?

(antithesis.com)

48 points | by mfiguiere 1 day ago

10 comments

teiferer 40 minutes ago
In the age of agentic programming and the ever increasing pressure to ship faster, I'm afraid this kind of knowledge will become more and more fringe, even moreso than it is today. Who has the time to think through the intricacies of parallel data structures? Clearly we'll just throw more hardware at problems, write yet another service/api/http endpoint and move on to the next hype. The LLM figures out the algorithms and we soon lose the skills to develop new ones. And tell each other the scifi BS myth that "AI" will invent new data structures in the future so we don't even beed humans in the loop.
[-]
- dgb23 31 minutes ago
  I think the opposite is the case. We increasingly need to care more about performance and know how to leverage hardware better.
  The market is telling us that through increased hardware prices.
  LLMs being very powerful means that we need to start being smarter about allocating resources. Should chat apps really eat up gigabytes of RAM and be entitled to cores, when we could use that for inference?
bob1029 28 minutes ago
On practical machines they aren't good for much. To access a value in a skip list you have to dereference way more pointers than in a b+ tree. On paper they're about the same, but in practice the binary tree will tend to outperform. You get way more work done per IO operation.
winwang 1 hour ago
Only somewhat related but there is supposedly a SIMD/GPU-friendly skiplist algo written about here: https://csaws.cs.technion.ac.il/~erez/Papers/GPUSkiplist.pdf
josephg 38 minutes ago
For this problem, I’d consider a different approach. You have a fuzzer, and based on some seed it’s generating lots of records. You then need to query a specific record (or set of records) based on the leaf.
I’d just store a table of records with the leaf, associated with the seed. A good fuzzer is entirely deterministic. So you should be able to regenerate the entire run from simply knowing the seed. Just store a table of {leaf, seed}. Then gather all the seeds which generated the leaf you’re interested in and rerun the fuzzer for those seeds at query time to figure out what choices were made.
tooltower 1 hour ago
In my personal projects, I've used it to insert/delete transactions in a ledger. I wanted to be able to update/query the account balance fast. Like the article says, "fold operations".
ahartmetz 2 hours ago
Skiplists have some nice properties - the code is fairly short and easy to understand, for one. Qt's QMap used to be skip list based, here's the rationale given for it: https://doc.qt.io/archives/qq/qq19-containers.html#associati...
mrjn 2 hours ago
skiplists form the basis of in-memory tables used by LSM trees, which are themselves the basis of most modern DBs (written post 2005).
locknitpicker 2 hours ago
FTA:
> Skiplists to the rescue! Or rather, a weird thing we invented called a “skiptree”…
I can't help but wonder. The article makes no mention of b-trees if any kind. To me, this sounded like the obvious first step.
If their main requirement was to do sequential access to load data, and their problem was how to speed up tree traversal on an ad-hoc tree data structure that was too deep, then I wonder if their problem was simply having tree nodes with too few children. A B+ tree specifically sounds to be right on the nose for the task.
feverzsj 1 hour ago
If you need a graph db, use a graph db.
linzhangrun 25 minutes ago
[dead]