When I was a senior at Swarthmore College, Herb Wilf came over from U Penn to teach a course in combinatorial algorithms. I was encouraged to attend.
He claimed that choosing a subset of k integers at random from {1..n} should have a log in its complexity, because one needs to sort to detect duplicates. I realized that if one divided [1..n] into k bins, one could detect duplicates within each bin, for a linear algorithm. I chose bubble sort because the average occupancy was 1, so bubble sort gave the best constant.
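(In C, the idea looks roughly like the sketch below. This is a reconstruction, not the original code, and it checks for a duplicate at insertion time rather than bubble-sorting each bin at the end; either way, each check touches one bin of expected occupancy 1, so the whole thing is linear on average.)

    #include <stdlib.h>

    /* Draw k distinct values from {1..n}. A value x lands in bin
       (x-1)*k/n, so duplicates always share a bin, and the average
       bin occupancy is 1. */
    void sample_k_of_n(int k, long n, long *out /* length k */)
    {
        int *head = malloc(k * sizeof *head); /* first entry of each bin, -1 = empty */
        int *next = malloc(k * sizeof *next); /* chains bin members through out[] */
        for (int b = 0; b < k; b++) head[b] = -1;
        for (int i = 0; i < k; i++) {
            for (;;) {
                long x = 1 + rand() % n;        /* candidate (modulo bias is fine for a sketch) */
                int b = (int)((x - 1) * k / n); /* its bin */
                int dup = 0;
                for (int j = head[b]; j != -1; j = next[j]) /* scan the (tiny) bin */
                    if (out[j] == x) { dup = 1; break; }
                if (dup) continue;              /* duplicate: redraw */
                out[i] = x;
                next[i] = head[b];
                head[b] = i;
                break;
            }
        }
        free(head);
        free(next);
    }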
I described this algorithm to him around 5pm, end of his office hours as he was facing horrendous traffic home. I looked like George Harrison post-Beatles, and probably smelled of pot smoke. Understandably, he didn't recognize a future mathematician.
Around 10pm the dorm hall phone rang, one of my professors relaying an apology for brushing me off. He got it, and credited me with many ideas in the next edition of his book.
Of course, I eventually found all of this and more in Knuth's books. I was disillusioned, imagining that adults read everything. Later I came to understand that this was unrealistic.
Love anecdotes like this! But admittedly I feel a bit lost, so please forgive my ignorance when I ask: why does choosing a subset of k integers at random require deduplication? My naive intuition is that sampling without replacement can be done in linear time (hash table to track chosen elements?). I’m probably not understanding the problem formulation here.
your random number function might return the same number multiple times? So to choose k random but unique numbers you may have to call the random number function more than k times?
Of course my intuition would be that you can do a random shuffle and then take the first k, which is O(N). So I might be misunderstanding.
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
1977. And I didn't know what a hash table was, though I can't explain now why they didn't think of using a hash table. I was effectively using a dumb hash function.
Their 1978 second edition works in exactly the memory needed to store the answer, by simulating my algorithm in a first pass but only saving the occupancy counts.
Oh, and thanks (I guess). I really didn't expect to ever be reading FORTRAN code again. One learned to program at Swarthmore that year by punching cards, crashing our IBM 1130, and bringing the printout to me on my supervisor shift. I'd find the square brackets and explain how you'd overwritten your array. I even helped economics professor Frederic Pryor (the grad student in the "Bridge of Spies" Cold War spy swap) this way, when I made an ill-advised visit to the computer center on a Saturday night. Apparently I could still find square brackets.
The article, and many of the responses, are hinting at the fact that bubblesort is an example of an anytime algorithm. This is a wide class of algorithms which provide a correct answer after some amount of time, but provide an increasingly good answer in increasing amounts of time short of the completion time. This is a super valuable property for real time systems (and many of the comments about games and animations discuss that). The paper that introduced me to the category is "Anytime Dynamic A*" [0], and I think it's both a good paper and a good algorithm to know.
[0] https://cdn.aaai.org/ICAPS/2005/ICAPS05-027.pdf
Anytime algorithms are great for robotics planning, for example. A plan does not have to be perfect to be useful, especially when it can be refined further in the next timestep. And the robot cannot act out the plan instantaneously, so by the time it is close to the point where a non-ideal segment would be, there have been many timesteps in which to refine/optimize it. But the robot can start moving right away.
An anytime algorithm monotonically improves some evaluation metric. For a sort, the evaluation metric is usually the number of inversions in the list. At completion, there will be zero inversions. If at time 0 there are N inversions, then at time 0 < t < completion time there will be ≤ N inversions; that is, the list is "more sorted" than it was before. As the various examples about games and animation elsewhere in the comments show, this can be interpreted as "somewhat smoothly moving towards sorted over time," which is an (occasionally) ((rarely)) useful property.
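To make that concrete, here is a small C sketch (mine, for illustration; not from the paper): a brute-force inversion counter, plus one bubble pass. Each adjacent swap removes exactly one inversion, so the metric never gets worse.

    #include <stddef.h>

    /* Number of pairs (i, j) with i < j and a[i] > a[j]; 0 iff sorted.
       O(n^2) brute force is fine for illustration. */
    size_t inversions(const int *a, size_t n)
    {
        size_t count = 0;
        for (size_t i = 0; i + 1 < n; i++)
            for (size_t j = i + 1; j < n; j++)
                if (a[i] > a[j]) count++;
        return count;
    }

    /* One bubble pass; returns nonzero if any swap happened. Each swap
       fixes one adjacent out-of-order pair, so inversions(a, n) is
       non-increasing across interrupted runs. */
    int bubble_pass(int *a, size_t n)
    {
        int swapped = 0;
        for (size_t i = 0; i + 1 < n; i++)
            if (a[i] > a[i + 1]) {
                int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                swapped = 1;
            }
        return swapped;
    }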
Okay, but it gives you a mostly good answer! Unlike many other sorts where if you interrupt it before the last step, you get total nonsense.
It's basically asymptotically approaching the correct (sorted) list instead of shuffling the list in weird ways until it's all magically correct in the end.
> Unlike many other sorts where if you interrupt it before the last step, you get total nonsense.
which ones do you have in mind? and doesn't "nonsense" depend on the scoring criteria?
selection sort would give you a sorted beginning, cocktail shaker would have sorted both ends
quick sort would give you broad value separation ("small values on one side, big on the other"), and block-merge algorithms create sorted subarrays
in my view those qualities are much more useful for partial state than the "number of pairs of elements out of order" metric, which smells of CS-complexity talk
I used bubblesort on purpose in a game project. Specifically, to sort sprites in an NES game back to front, lazily, spending as few CPU cycles as possible. Bubblesort on the very small list (a dozen objects max), and early exit after the first swap. It eventually completes, and that was just fine. It's tiny, incredibly simple, and somewhat resilient to the list changing from frame to frame as objects spawn and despawn. Each partial sort makes some progress no matter what.
A few other algorithms would have fit the bill just as well, but bubblesort is perfectly adequate, so that's what will likely ship. More complex algorithms end up losing out due to greater initial overhead or larger ROM size.
The primary complication is that objects move around between frames, so the list doesn't stay sorted. Insertion requires a second scan once I've found an object that is out of position, whereas bubble can swap right away. (The candidate object is always N-1, after all.)
My goal is to stop working as soon as possible; this is a 1.7 MHz CPU and it has many other tasks to perform on a given frame. It's hard to appreciate unless you sit down to actually do it, but on a processor this slow, the scanning of the list and the comparison of depth is not cheap. It's computed on the fly as we go, there isn't spare RAM to cache it, and it changes every frame anyway. All of that makes the scanning and comparisons way more expensive than the eventual swap.
Insertion would be ideal here for a freshly spawned object, since we already know it is out of position and merely need to find its destination. (The work to shift the rest of the list down isn't super expensive, thankfully.) In practice this is already a big enough visual change that the player doesn't have much time to notice a depth error before it corrects itself.
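In C rather than the actual 6502, the per-frame step is roughly this (a sketch; depth_of is a hypothetical stand-in for the on-the-fly depth computation described above):

    /* One bounded bubble step per frame over the sprite draw order,
       exiting as soon as one out-of-order pair has been fixed. */
    extern int depth_of(int sprite_id); /* hypothetical: recomputed per object, per frame */

    void sort_step(int *sprites, int n)
    {
        for (int i = 0; i + 1 < n; i++) {
            if (depth_of(sprites[i]) > depth_of(sprites[i + 1])) {
                int t = sprites[i];
                sprites[i] = sprites[i + 1];
                sprites[i + 1] = t;
                return; /* early exit: enough progress for this frame */
            }
        }
        /* no swap found: the list is currently sorted */
    }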
There is "early exit after the first swap", which actually makes his algorithm closer to insertion sort than bubble sort. If the list couldn't change between each pass, it would be a very inefficient insertion sort in O(n^3) as you are constantly scanning the part of the list that is already sorted.
But this is a case where theory doesn't count as much as having an acceptable result. That's typical in video games: if it plays well and looks fine, who cares if it is incorrect? Here I'd guess that sometimes a sprite appears on top of another sprite when it should have been behind it, because of the early exit; but while playing, it turned out not to be noticeable enough to change the algorithm.
I've used bubblesort when simulating LEO satellite constellations, calculating which satellite is closest to a location. I used one single backwards pass of bubblesort, so O(n) every k timesteps to bring the closest to the head of the array, then every timestep just do one backwards bubblesort pass over the first few in the array. Given satellites move smoothly, if you initialize right (a few full passes at the start to get the closest few at the front) and get the constants right so a satellite outside the front few in the array can't have moved far enough to become closest without being promoted to the front few by a periodic full pass, then you always maintain the closest at the front of the array very cheaply. And this has the advantage of also being very simple to code.
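The per-timestep work is then just a short backward pass. Roughly (a sketch; the names are made up, and FRONT and the full-pass period k are the constants to tune):

    /* A backward bubble pass over the first limit+1 entries moves the
       smallest distance seen so far toward the head. Assumes dist[] is
       refreshed each timestep and n >= FRONT. */
    #define FRONT 4 /* how many "near" satellites to keep at the head */

    static void backward_pass(double *dist, int *sat, int limit)
    {
        for (int i = limit; i >= 1; i--) {
            if (dist[i] < dist[i - 1]) { /* closer satellite bubbles forward */
                double d = dist[i]; dist[i] = dist[i - 1]; dist[i - 1] = d;
                int    s = sat[i];  sat[i]  = sat[i - 1];  sat[i - 1]  = s;
            }
        }
    }

    void maintain_closest(double *dist, int *sat, int n, long t, long k)
    {
        if (t % k == 0)
            backward_pass(dist, sat, n - 1);     /* full O(n) pass every k timesteps */
        else
            backward_pass(dist, sat, FRONT - 1); /* cheap pass over the head only */
    }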
On 8-bit and 32-bit microcontrollers (e.g. 8-bit AVR, 32-bit ESP8266/ESP32), Insertion sort is 6X faster than Bubble Sort on random data. I have tested this up to about N=1000.
Both Insertion sort and Bubble sort are O(N^2). Both are stable sorts. Insertion sort consumes only about 10-20 bytes more flash memory than Bubble sort. It's hard to think of situations where Bubble sort would be preferred over Insertion sort.
Shell sort is vastly superior if you can afford an extra 40-100 bytes of flash memory. (It's not too much more complicated than Insertion sort, but sometimes, we don't have 100 extra bytes.) It is O(N^k), where k ≈ 1.3 to 1.5. As soon as N ⪆ 30, Shell sort will start clobbering Insertion sort. For N ≈ 1000, Shell sort is 10X faster than Insertion sort, which in turn is 6X faster than Bubble sort. Unfortunately Shell sort is not stable.
Comb sort has a similar O(N^k) runtime complexity as Shell sort. But it seems slower than Shell sort by a constant factor. Comb sort is also not stable. I cannot think of any reason to use Comb sort over Shell sort.
Quick sort is not much faster than Shell until about N ≈ 300. Above that, the O(N*log(N)) of Quick sort wins over the O(N^k) of Shell sort. But Quick sort is not stable.
Merge sort is stable and runs in O(N*log(N)), but it consumes an extra O(N) of RAM, which may be impossible on a microcontroller. You may be forced back to Insertion sort for a stable sort.
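For reference, Shell sort really is only a little more code than Insertion sort. A minimal sketch with the simple gap-halving sequence (tuned gap sequences like Ciura's are what push the exponent down toward the ~1.3 end):

    #include <stddef.h>

    /* Shell sort: insertion sort on gap-strided subsequences, with the
       gap shrinking to 1. A dozen lines over plain insertion sort. */
    void shell_sort(int *a, size_t n)
    {
        for (size_t gap = n / 2; gap > 0; gap /= 2) {
            for (size_t i = gap; i < n; i++) {
                int v = a[i];
                size_t j = i;
                while (j >= gap && a[j - gap] > v) { /* shift larger elements right */
                    a[j] = a[j - gap];
                    j -= gap;
                }
                a[j] = v;
            }
        }
    }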
There is a stable in-place merge sort; it runs in O(n*log(n)^2). It is about 3 times more complex than shell sort. I implemented it here https://github.com/thomasmueller/bau-lang/blob/main/src/test...
(most sort algos you mentioned above are in the same directory btw)
You didn't mention heap sort. A simple implementation that doesn't do any method calls, just like shell sort (it is also next to the merge sort above), is about twice as complex as shell sort.
Thanks for the reference to in-place merge sort. GitHub is having a lot of problems right now and I cannot see your code. I will look at it when I get a chance.
I think I ignored Heap sort because it uses O(N) extra RAM, which is precious on a resource-constrained microcontroller.
Heap sort is in-place (and does not even need recursion, unlike quicksort). But yes, it is not stable, and usually slower than even shell sort (except for very large arrays).
For me it's LSB radix. Yeah, I know it only works on numbers, but much younger me independently invented it when slinging 3480 mainframe tape as a graveyard-shift operator. The company had invested in mainframes early, and by the time I got there it was slightly dysfunctional: they still had mainframe operators, and we would run the nightly batch jobs to process orders. While they had a hard drive (the RAMAC), they never liked to update their programs to use it, so every major step of the process would read a tape and write a new tape (they used the tapes sort of like a massively inefficient version control, so the process could be restarted at any point). At the end of the night we would have to file a couple hundred tapes back in the library. As I hated randomly seeking through the library and was bad at ad hoc sorting, I put together a manual sorting routine so the numbered tapes could go back in order. What ended up working best for me, I found out later, was the good ol' LSB radix sort, and I have a soft spot for it to this day.
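For anyone who hasn't seen the in-memory version, it's tiny. A sketch in C for non-negative integers, base 10 to match decimal tape labels (real code would use base 256 and handle sign); tmp is scratch space of the same length:

    #include <string.h>

    /* LSD radix sort: a stable counting sort on the ones digit, then
       the tens, and so on, until the largest value runs out of digits. */
    void lsd_radix_sort(unsigned *a, unsigned *tmp, int n, unsigned max)
    {
        for (unsigned long exp = 1; max / exp > 0; exp *= 10) {
            int count[10] = {0};
            for (int i = 0; i < n; i++)      /* histogram of this digit */
                count[(a[i] / exp) % 10]++;
            for (int d = 1; d < 10; d++)     /* prefix sums -> end offsets */
                count[d] += count[d - 1];
            for (int i = n - 1; i >= 0; i--) /* backward scan keeps it stable */
                tmp[--count[(a[i] / exp) % 10]] = a[i];
            memcpy(a, tmp, n * sizeof *a);
        }
    }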
I read this all the time from other people, but for me, Selection sort is the easiest to remember and implement. My next easiest would be Insertion sort.
Bubble sort doesn't click for me easily. I think it's because the terminating condition seems uglier than Selection sort or Insertion sort. I always have a little voice in the back of my mind, "Is this outer loop guaranteed to terminate?"
Selection sort was the first sorting algorithm that I ever created: Find the smallest element for slot #1. Find the next smallest for slot #2. And so on. I've recreated it from scratch many times. The double for-loop means that it is guaranteed to finish, and it is obviously O(N^2).
I did not learn about Bubble sort until my university class, where I thought, this is a terrible algorithm: It's not completely obvious that it will finish, and it's not obvious what the runtime complexity is. For example, if the inner loop used ">=" instead of ">", it would work for test data sets with unique elements, then hang forever with a real data set that contained duplicates.
I don't believe that insertion/selection sort is easier to grasp. Can you type it from memory, without looking it up? Here's bubble sort:
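    #include <stddef.h>

    void bubble_sort(int *a, size_t n)
    {
        int swapped = 1;
        while (swapped) {                    /* repeat until a pass makes no swaps */
            swapped = 0;
            for (size_t i = 0; i + 1 < n; i++) {
                if (a[i] > a[i + 1]) {       /* strictly '>': '>=' would loop forever on duplicates */
                    int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                    swapped = 1;
                }
            }
        }
    }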
> Can you type it from memory, without looking it up?
Well, yeah, but that’s not a very useful heuristic for people who are already aware of the algorithms in question. If you ask people to come up with a way from scratch – probably without explicitly asking, in order to get better success, like by watching how they sort cards – I bet they won’t tend to come up with bubble sort. Maybe that says something about the relative intuitiveness of the approaches? Or not.
As others have said, it is easy enough for a child in the 80s, with only a BASIC manual, to come up with it. Been there, done that. Didn't even have a name for it.
Later I read a magazine explaining several algorithms and found the name of what I had implemented.
For the curious, the ZX Spectrum microdrive listed files on the cartridges in the order they were found on tape. I wanted to display them in alphabetical order like the "big" computers did.
I felt this comment in my soul. I’ll never understand it: I’ve written thousands of lines of code (as a hobbyist) to solve all sorts of problems I’ve run into and yet always seem to struggle to wrap my mind around the core algorithms any real developer should be able to handle easily. This is why I’ve never pursued programming as a career.
It took computer scientists of the past a lot of effort to come up with these complicated algorithms. They are not easy or trivial. They are complicated, and it's OK that you can't just quickly understand them. Your imaginary "real developer" at best memorised the algorithms, but that hardly differs from a smart monkey, so probably not something to be very proud of.
It is your choice which career to pursue, but in my experience, the vast majority of programmers don't know algorithms and data structures beyond the very shallow understanding required to pass some popular interview questions. Maybe you've put up artificial barriers that were higher than necessary.
To be a professional software developer, you need to write code that solves real-life tasks. These tasks are mostly super-primitive in terms of algorithms. You just glue together libraries and write so-called "business logic" as an incomprehensible tree of if-s which nobody truly understands. People love it and pay money for it.
Thanks for your kind comment! I do not have any systematic learning in computer science; I often feel confused when reading textbooks on algorithms hahaha.
Should I be familiar with every step of Dijkstra’s search algorithm and remember the pseudocode at all times? Why don’t the textbooks explain why the algorithm is correct?
> Should I be familiar with every step of Dijkstra’s search algorithm and remember the pseudocode at all times?
Somehow, I think you already know the answer to that is "no".
I've been working as a software engineer for over 8 years, with no computer science education. I don't know what Dijkstra's search algorithm is, let alone have memorised the pseudocode. I flicked through a book of data structures and algorithms once, but that was after I got my first software job. Unless you're only aiming for Google etc, you don't really need any of this.
You should know the trade-offs of different algorithms, though. Many libraries let you choose the implementation for a specific problem, for instance tree vs. hash map, where you trade memory for speed.
> Why don’t the textbooks explain why the algorithm is correct?
The good ones do!
> Should I be familiar with every step of Dijkstra’s search algorithm and remember the pseudocode at all times?
If it’s the kind of thing you care to be familiar with, then being able to rederive every step of the usual-suspect algorithms is well within reach, yes. You don’t need to remember things in terms of pseudocode as such, more just broad concepts.
Something that helps, I think, is to make something practical that demands it.
I used to think the same (I've been programming for 30+ years) until I decided to build https://tablam.org, and making a "database" is the kind of project where all this stuff suddenly becomes practical and worthwhile.
I haven't investigated selection sort. You have me curious though. Is that a step down from bubble sort? If so, maybe it's just as well I figured out bubble.
The only use I've seen is incrementally sorting large arrays during brute-force search of said arrays, since that is approximately free and brute-force search is pretty efficient and fast on modern CPUs. Set a "sorted" flag if/when the array is eventually sorted.
The idea was that the vast majority of arrays in a large set are not searched often enough to justify the cost of sorting them and sorting is an expensive operation if you are computing on a deadline. You also don't always know which ones will be heavily searched ahead of time. Using bubblesort, only the heavily accessed arrays end up sorted but as a side-effect of search rather than having separate heuristics to decide when/what to sort.
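Simplified, it looked roughly like this (a sketch; the binary-search branch and most bookkeeping are omitted):

    #include <stddef.h>

    typedef struct {
        int   *data;
        size_t len;
        int    sorted; /* set once a scan completes with no swaps */
    } Array;

    /* Linear search that performs one bubble pass as a side effect, so
       heavily searched arrays drift toward sorted essentially for free.
       Once a pass is swap-free, later searches could go binary. */
    int contains(Array *a, int key)
    {
        int found = 0, swapped = 0;
        for (size_t i = 0; i + 1 < a->len; i++) {
            if (a->data[i] > a->data[i + 1]) { /* bubble as we scan */
                int t = a->data[i];
                a->data[i] = a->data[i + 1];
                a->data[i + 1] = t;
                swapped = 1;
            }
            if (a->data[i] == key)             /* check after the slot settles */
                found = 1;
        }
        if (a->len && a->data[a->len - 1] == key) found = 1;
        if (!swapped) a->sorted = 1;           /* clean pass: fully sorted */
        return found;
    }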
Yeah, the article beat me to the gamedev example. Bubble sort being able to always "soft sort" on every iteration makes it the easiest to suspend and resume when you have a lot of other work to do, and when sorting is low priority.
Also, general wisdom: be mindful of data sizes and cache coherency. O(N*log(N)) vs. O(N^2) doesn't mean much when you're only sorting a few dozen items. Meanwhile, O(N) extra space can cause drastic performance hitches when reallocating memory.
And every new hire got taken to the whiteboard to learn about sort algorithm performance: bubblesort is O(n) in the best case.
And in our codebase, the data being sorted fit that best case (the data was already sorted or almost sorted).
Not only in the best case. I haven't seen this elsewhere, and know only a few people who know it, so, a kind of puzzle: what are the conditions under which bubblesort is always O(n)?
When the array is almost sorted. Bubble sort's complexity is linear plus the number of inversions, so if the inversion count is low (the more sorted the array, the fewer the inversions), bubble sort is close to a linear pass.
For small sets, or small-ish sets when you are coding quick, don't have a convenient standard library sort to hand, and are prioritising correctness over absolute performance.
Though in reality almost never: you almost always have a convenient built-in sort that is as quick & easy to use (likely quicker & easier), and in circumstances where the set is small enough for bubblesort to be just fine, the speed, memory use, or other properties of what-ever other sort your standard library uses aren't going to be a problem either.
As others have pointed out, sometimes it is useful for partial sorts due to the “always no less sorted than before at any point in the process (assuming no changes due to external influence)” property.
wrt:
> If you make each frame of the animation one pass of bubblesort, the particles will all move smoothly into the right positions. I couldn't find any examples in the wild,
There are hundreds of sort demos out there, both live running and on publicly hosted videos, that show the final positions by hue, getting this effect. Seems odd that they couldn't find a single one.
EDIT: actually, I can't find any of the rainbow based sort demos I was thinking of, a lot of promising links seem dead. I take back my little moan!
Reminds me of an interview I had a while ago. The interviewer in all seriousness asked me to code up a sorting algorithm on the whiteboard. He was more of a business person than technical so was probably thinking of insertion, selection and bubblesort.
I said sure, quicksort, mergesort or radixsort?
He just said "okay, let's skip to the next question". :)
I'd be tempted to ask under what circumstances they'd expect me to be coding a sorting routine... Seems a bit like asking an accountant to add two 10 digit numbers.
When sorting eigenpairs of a dense matrix, usually you end up with a Schur decomposition. The basic operation you can do is swap two adjacent eigenvalues on the diagonal, so bubblesort is a natural candidate.
For the downvoters, he's referring to this instance when (then) Senator Obama jokingly referenced bubble sort during this Google event: https://www.youtube.com/watch?v=koMpGeZpu4Q
It was one of the many viral moments during Obama's original campaign where he seemed cool and in touch.
Of course, being able to call out a sorting algorithm is the kind of thing that makes you very "cool and in touch" in the grander scheme of things, unlike, say, playing sports or lifting 500lbs.
A: For small arrays. I would add: particularly if you need a stable sort algorithm, which is either complex (Block Sort) or uses O(n) space (Merge Sort).
There is stable in-place merge sort [1], which is O(n*log(n)^2) and not that complex or hard to implement (about 80 lines, and that includes the ~15 lines of binarySearch, which you might need anyway).
[1] https://github.com/thomasmueller/bau-lang/blob/main/src/test...
Bubble sort is sometimes used in information retrieval for reranking the top k results based on some signals, especially ones specific to a user profile. I feel heap sort comes up as well, yet neither is necessarily the most efficient.
If you need a stable sort, can't be bothered finding a massive oversize library to link to, and only need to sort a relatively small number of objects on a system that's resource-constrained, I'm guessing?
I'm surprised that the simple, ~80-line version of stable in-place merge sort (see the link in the comments above) is not more widely known. It is O(n*log(n)^2) and not all that hard to implement.
In all your big-O analyses, remember: n = 3 more often than you think. n = 12 a lot more often than you think. If that's your case, there's nothing wrong with bubble sort unless you have very tight performance constraints.
Worse, big-O always hides a constant factor. What's bubblesort's constant? What's quicksort's? It wouldn't surprise me if, for small enough n (2 or 3, and maybe a bit higher), bubblesort is actually faster.
Note well: I have not actually benchmarked this.
Also note well: Determine what your n is; don't assume that it's either large or small.
Shell sort is sooo much faster than Bubble sort for tiny microcontrollers, for only a little bit more flash memory, like 40-100 bytes. If that's too much, then Insertion sort is 6X faster than Bubble sort, for only 10-20 bytes of extra flash.
When I was playing The Farmer Was Replaced and needed to implement sorting, I just wrote a bubble sort. Worked first time.
https://en.wikipedia.org/wiki/Shellsort
Shellsort can be regarded as an improvement over either Bubble Sort or Insertion Sort.
And while I've never hit such a case, I would think it would have merit with data known to be pretty close to properly sorted.
You can also do something like a calendar queue with bubble sort for each bin.