I wonder whether Raymond Hettinger has an opinion on this PEP. A long time ago, he wrote: "freezing dicts is a can of worms and not especially useful".

https://mail.python.org/pipermail/python-dev/2006-February/0...
This is why I love how Rust approached this; almost by accident to make borrow checking work. Every reference is either mutable or not, and (with safe code), you can't use an immutable reference to get a mutable reference anywhere down the chain. So you can slowly construct a map through a mutable reference, but then return it out of a function as immutable, and that's the end of it. It's no longer ever mutable, and no key or value is either. There's no need to make a whole new object called FrozenHashMap, and then FrozenList, and FrozenSet, etc. You don't need a StringBuilder because String is mutable, unless you don't want it to be. It's all just part of the language.
Kotlin _kinda_ does this as well, but if you have a reference to an immutable map in Kotlin, you are still free to mutate the values (and even keys!) as much as you like.
You cannot return an immutable version. You can return it owned (in which case you can assign/reassign it to a mut variable at any point) or you can take a mut reference and return an immutable reference - but whoever is the owner can almost always access it mutably.
Arg, you’re right. Not sure what I was thinking there. I still think my point stands, because you get the benefits of immutability, but yeah, I didn’t explain it well.
Only if you're returning a reference or wrapping it in something that will only ever return a reference. If you return an object by value ('owned'), then you can do what you like with it and 'mut' is just a light guardrail on that particular name for it.

If you in fact return e.g. an Rc::new(thing), that's forever (though of course you can unwrap the last reference!)
It explains a lot about the design of Python container classes, and the boundaries of polymorphism / duck typing with them, and mutation between them.
I don't always agree with the choices made in Python's container APIs...but I always want to understand them as well as possible.
Also worth noting that understanding changes over time. Remember when GvR and the rest of the core developers argued adamantly against ordered dictionaries? Haha! Good times! Thank goodness their first wave of understanding wasn't their last. Concurrency and parallelism in Python was a TINY issue in 2006, but at the forefront of Python evolution these days. And immutability has come a long way as a design theme, even for languages that fully embrace stateful change.
> Also worth noting that understanding changes over time. Remember when GvR and the rest of the core developers argued adamantly against ordered dictionaries? Haha! Good times!
The new implementation has saved space, but there are opportunities to save more space (specifically after deleting keys) that they've now denied themselves by offering the ordering guarantee.
Ordering, like stability in sorting, is an incredibly useful property. If it costs a little, then so be it.
This is optimizing for the common case, where memory is generally plentiful and dicts grow more than they shrink. Python has so many memory inefficiencies that occasional tombstones in the dict's internal structure are unlikely to be a major effect. If you're really concerned, do `d = dict(d)` after aggressive deletion.
> Ordering, like stability in sorting, is an incredibly useful property.
I can't say I've noticed any good reasons to rely on it. Didn't reach for `OrderedDict` often back in the day either. I've had more use for actual sorting than for preserving the insertion order.

I don't often care about a specific order, only that I get the same order every time.
Thinking about this up front, I'm actually wondering why this is useful outside of equality comparisons.
Granted, I live and work in TypeScript, where I can't `===` two objects but I could see this deterministic behavior making it easier for a language to compare two objects, especially if equality comparison is dependent on a generated hash.
The other is guaranteed iteration order, if you rely on the index-contents relationship of an iterable. We're talking about dicts, which are keyed, but extending this idea to lists, I can see this being useful in some scenarios.
Beyond that, I'm not sure it matters, but I also realize I could simply not have enough imagination at the moment to think of other benefits.

But maybe it does all just come down to equality comparisons. Just not always within your own code.
Personally, I find lots of reasons to prefer an ordered dict to an unordered one. Even small effects like "the debugging output will appear in a consistent order, making it easier to compare" can be motivation enough in many use cases.
It seems like opinions really differ on this item then. I love insertion ordering in mappings, and Python adopting it was a big revelation. The main reason is that keys need some order, and insertion order -> iteration order is a lot better than a pseudorandom (hash-based) order.
For me, it creates more reproducible programs and scripts, even simple ones.
Same. Recently I saw interview feedback where someone complained that the candidate used OrderedDict instead of the built-in dict that is now ordered, but they'll let it slide... As if writing code that will silently do different things depending on minor Python version is a good idea.
Well it's been guaranteed since 3.7 which came out in 2018, and 3.6 reached end-of-life in 2021, so it's been a while. I could see the advantage if you're writing code for the public (libraries, applications), but for example I know at my job my code is never going to be run with Python 3.6 or older.
This morning, for example, I tested an object serialized through a JSON API. My test data would never match the next run.
After a while, I realized one of the objects was using a set of objects, which in the API was turned into a JSON array, but the order of said array would change depending on the initial Python VM state.
3 days ago, I used itertools.groupby to group a bunch of things. But itertools.groupby only works on iterables that are sorted by the grouping key.
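For illustration, a minimal sketch of that pitfall (the records here are hypothetical): itertools.groupby only merges *adjacent* equal keys, so unsorted input silently yields split groups.

    from itertools import groupby

    records = [('a', 1), ('b', 2), ('a', 3)]

    # Without sorting first, 'a' comes out as two separate groups:
    split = [(k, [v for _, v in g]) for k, g in groupby(records, key=lambda r: r[0])]
    # -> [('a', [1]), ('b', [2]), ('a', [3])]

    # Sorting by the grouping key first gives the intended result:
    records.sort(key=lambda r: r[0])
    merged = [(k, [v for _, v in g]) for k, g in groupby(records, key=lambda r: r[0])]
    # -> [('a', [1, 3]), ('b', [2])]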
Now granted, none of those recent examples are related to dicts, but dict is not a special case. And it's iterated over regularly.

I would expect to use a different data structure if I needed an ordered set.
This was 19 (almost 20) years ago.
As stated in the lwn.net article, a lot of concurrency has been added to python, and it might now be time for something like a frozendict.
Things that were not useful in 2006 might be totally useful in 2026 ;P
Still, like you, I'm curious whether he has anything to say about it.
I think Raymond Hettinger is called out specially here because he did a well known talk called [Modern Dictionaries](https://youtu.be/p33CVV29OG8) where around 32:00 to 35:00 in he makes the quip about how younger developers think they need new data structures to handle new problems, but eventually just end up recreating / rediscovering solutions from the 1960s.
“What has been is what will be, and what has been done is what will be done; there is nothing new under the sun.”

Still well before the talk.
It's interesting that he concludes that freezing dicts is "not especially useful" after addressing only a single motivation: the use of a dictionary as a key.
He doesn't address the reason that most of us in 2025 immediately think of, which is that it's easier to reason about code if you know that certain values can't change after they're created.

What a change in culture over the last 20 years!
You can't really tell though. Maybe the dict is frozen but the values inside aren't. C++ tried to handle this with constness, but that has its own caveats that make some people argue against using it.
Indeed. So I don't really understand what this proposal tries to achieve. It even explicitly says that dict → frozendict will be an O(n) shallow copy, and the contention is only about the O(n) part. So… yeah, I'm sure they are useful for some cases, but as Raymond has said — it doesn't seem to be especially useful, and I don't understand what people ITT are getting excited about.
I agree, same with frozenset. If you really want to use one of those as a key, convert to a tuple. There might be niche use cases for all this, but it's not something that the language or even the standard lib need to support.
Problem being that sets aren't consistently ordered, and conversion to a tuple can result in a combinatorial (specifically, factorial) explosion in the number of possible keys associated with a single set. Nor can you sort all objects. Safe conversion of sets to tuples for use as keys is possible, but the only technique I know requires an auxiliary store of objects (mapping objects to the order in which they were first observed), which doesn't parallelize well.
`tuple(sorted(s))`, and if you can't even sort the values, they're probably not hashable. I get that this involves a copy, but so does frozenset, and you can cross that bridge in various ways if it's ever a problem.
I'm not saying the problem with tuple doesn't exist, but that there doesn't need to be a built-in way to deal with it. If for some unfortunate reason you've got a mixed-type set that you also want to use as a dict key, you can write a helper.

I have crossed that bridge, and I'm telling you (again) that a sorted tuple is not a generic solution.
> Another PEP 351 world view is that tuples can serve as frozenlists; however, that view represents a Liskov violation (tuples don't support the same methods). This idea resurfaces and has been shot down again every few months.
... Well, yes; it doesn't support the methods for mutation. Thinking of ImmutableFoo as a subclass of Foo is never going to work. And, indeed, `set` and `frozenset` don't have an inheritance relationship.
I normally find Hettinger very insightful so this one is disappointing. But nobody's perfect, and we change over time (and so do the underlying conditions). I've felt like frozendict was missing for a long time, though. And really I think the language would have been better with a more formal concept of immutability (e.g. linking it more explicitly to hashability; having explicit recognition of "cache" attributes, ...), even if it didn't go the immutable-by-default route.
Apple (or perhaps NeXT) has solved this problem already in Objective-C. Look at NSArray and NSMutableArray, or NSData and NSMutableData. It’s intuitive and Liskov-correct to make the mutable version a subclass of the immutable version. And it’s clearly wrong to have the subclass relationship the other way around.
Given how dynamic Python is, such a subclass relationship need not be evident at the C level. You can totally make one class whose implementation is independent of another class a subclass of the other, using PEP 3119. This gives implementations complete flexibility in how to implement the class while retaining the ontological subclass relationship.
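For instance, a sketch of that PEP 3119 mechanism; the FrozenMapping ABC here is hypothetical, but `ABCMeta.register` is the real machinery:

    from abc import ABCMeta

    class FrozenMapping(metaclass=ABCMeta):
        """Hypothetical immutable-mapping interface with no implementation."""

    # dict becomes a *virtual* subclass: no shared code, no C-level relationship.
    FrozenMapping.register(dict)

    print(issubclass(dict, FrozenMapping))  # True
    print(isinstance({}, FrozenMapping))    # True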
> ImmutableFoo as a subclass of Foo is never going to work. And, indeed, `set` and `frozenset` don't have an inheritance relationship.
Theoretically, could `set` be a subclass of `frozenset` (and `dict` of `frozendict`)? Do other languages take that approach?
> linking [immutability] more explicitly to hashability
AFAIK immutability and hashability are equivalent for the language's "core" types. Would it be possible to enforce that equivalence for user-defined types, given that mutability and the implementation of `__hash__` are entirely controlled by the programmer?
> Theoretically, could `set` be a subclass of `frozenset` (and `dict` of `frozendict`)?
At one extreme: sure, anything can be made a subclass of anything else, if we wanted to.
At the other extreme: no, since Liskov substitution is an impossibly-high bar to reach; especially in a language that's as dynamic/loose as Python. For example, consider an expression like '"pop" in dir(mySet)'
I never said it's a problem (and I never said it's not!). I was specifically addressing two things:

- The "theoretical" nature of the question I quoted (i.e. ignoring other aspects like subjectivity, practicality, convention, etc.)

- The reasoning about "Liskov violation", which was quoted further up this thread.

For context, here's Liskov's definition of their principle (from https://en.wikipedia.org/wiki/Liskov_substitution_principle ):

> Barbara Liskov and Jeannette Wing described the principle succinctly in a 1994 paper as follows:[1]
> > Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.
My expression `"pop" in dir(mySet)` gives an explicit example of how `set` and `frozenset` are not subtypes of each other (regardless of how they're encoded in the language, with "subclasses" or whatever). In this case `ϕ(x)` would be a property like `'"pop" in dir(x)' = 'False'`, which holds for objects x of type frozenset. Yet it does not hold for objects y of type set.
Your example of `hasattr(mySet, 'pop')` gives another property that would be violated.
My point is that avoiding "Liskov violations" is ("theoretically") impossible, especially in Python (which allows programs to introspect/reflect on values, using facilities like 'dir', 'hasattr', etc.).

(FYI I became rather jaded on the Liskov substitution principle after reading https://okmij.org/ftp/Computation/Subtyping )
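Concretely, in any current CPython:

    s, fs = set(), frozenset()
    print(hasattr(fs, 'pop'))  # False: frozenset exposes no mutators
    print(hasattr(s, 'pop'))   # True
    print('pop' in dir(s))     # True: an observable difference between the two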
The root of the issue here is that the Liskov substitution principle simply takes ϕ(x) to be some property satisfied by objects of a class. It does not distinguish between properties that are designed by the author of the class to be satisfied and properties that happen to be satisfied in this particular implementation. But Hyrum's Law also states that properties that are accidentally true can become relied upon and, as time passes, become an intrinsic property. This to me suggests that the crux of the problem is that people don't communicate sufficiently about the invariants and non-invariants of their code.
> > Subtype Requirement: Let ϕ(x) be a property provable about objects x of type T. Then ϕ(y) should be true for objects y of type S where S is a subtype of T.
This says "if hasattr(parent, 'pop') == True then hasattr(child, 'pop') must be True". This is not violated in this case, since hasattr(parent, 'pop') is False. If you want to extend the above definition so that negative proofs concerning the parent should also hold true for the child, then subtyping becomes impossible since all parent and child types must be identical, by definition.
The property in question is `hasattr(x, "pop") is False`.
> If you want to extend the above definition so that negative proofs concerning the parent should also hold true for the child, then subtyping becomes impossible since all parent and child types must be identical, by definition.
The distinction isn’t “negative proofs”, but yes, that’s their point. In Python, you have to draw a line as to which observable properties are eligible.
ImmutableFoo can't be a subclass of Foo, since it loses the mutator methods. But nor can Foo be a subclass of ImmutableFoo, since it loses the axiom of immutability (e.g. thread-safety) that ImmutableFoo has.
When you interpret Liskov substitution properly, it's very rare that anything Liskov-substitutes anything, making the entire property meaningless. So just do things based on what works best in the real world and aim for as much Liskov-substitution as is reasonable. Python is duck-typed anyway.
It's a decent guiding principle - Set and ImmutableSet are more substitutable than Set and Map, so Set deriving from ImmutableSet makes more sense than Set deriving from Map. It's just not something you can ever actually achieve.
Python discovers immutable data, and gets it wrong. Frozendict is a blunt instrument - instead of a mutable free-for-all, it just locks the data for the entire lifecycle. Brace for the wave of code littered with deep copies, proudly proclaiming how functional it all is.
If you want real immutable data structures, not a cheap imitation, check out pyrsistent.
Look up persistent data structures [0].

The main trick is to store everything in a tree with a large branching factor. The elements are on the leaves. When you want to make an edit, you only need to create the nodes from the root to the actual leaf, then return the new root. Everything else can be shared.

This is a log-time operation, normally implemented with a branching factor of 32.

It works for vectors as well as dicts. Creating a copy is constant time, and since everything is immutable it can be shared freely (especially in GC'd languages).

Insert/delete are O(log32(n)) instead of the O(n) they'd be with a full copy.

[0] https://en.wikipedia.org/wiki/Persistent_data_structure#Pers...
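For a concrete taste of that structural sharing, a small sketch with the third-party pyrsistent library mentioned above:

    from pyrsistent import pmap

    m1 = pmap({'a': 1, 'b': 2})
    m2 = m1.set('c', 3)   # builds only the path to the changed leaf

    print(m1)                # pmap({'a': 1, 'b': 2}) -- original untouched
    print(m2['a'], m2['c'])  # 1 3 -- unchanged entries are shared, not copied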
And, yet, somehow, functional languages have been doing this since forever? I jest, I jest! There's an entire field of study focused on this! I'm no expert, but imagine you've got a singly-linked list, and we're going to allow only push_back() and pull_back() (no insert()). Let's go ahead and let multiple owners have at this list. If A does "pull_back()" what really happens is A has a local "tail" pointer that moves back along the end of the list and has a new belief about the end of the list. When A does "push_back()" it begins attaching links at its new tail. Since the links just "point backwards" without modifying the old links, this doesn't mutate the list "as is", it only mutates A's version of the list. So, if B is also looking at the list, it could just start doing "push_back()" and add its own "tail". The result is a data-structure that's a "bunch of shared singly linked lists" in a bush structure that's more akin to a tree than a singly linked list.
You can do the same thing with binary trees and other structures. It's more fiddly, and there's definitely places where single-ownership allows mutability "under the hood" for real performance gains, but that's the basic idea.
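A minimal sketch of that sharing idea, with tuples standing in for the list links:

    def push(lst, value):
        return (value, lst)   # new head cell pointing at the old list; no copying

    base = push(push(None, 1), 2)   # 2 -> 1
    a = push(base, 3)               # 3 -> 2 -> 1
    b = push(base, 4)               # 4 -> 2 -> 1

    assert a[1] is b[1]   # both "versions" literally share the same tail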
If your dicts are frozen, you shouldn't need to deep-copy. The point of immutability is that if you want a new frozendict based on another one, you just rebuild the indirection data structure up top and leave the values it references alone.
You're absolutely right: an "indirection data structure" is necessary. Freezing the data is the least interesting part - it doesn't give you any of the benefits typically associated with immutable data structures in functional languages. That's my point - Python is shipping a half solution that's being mistaken for a proper one.
You think Python developers are going to roll their own HAMT on top of frozendicts? Or are they just gonna make copies? Personally, I'd just use pyrsistent which seems to get it right.
It's a little early to say that "Python is shipping" anything related to this. The PEP is still in draft status and it may be modified (as it was three hours ago) or even rejected. The article says it is likely to be accepted, but that has not happened yet. That also means there is time to comment on the PEP in the discussion it links to if you'd like.

https://peps.python.org/pep-0814/
Except at 10k+ items, batch updates on 100K+ items, or inserting 100 keys.
This is rarely the case in practice; most dictionaries and dict operations are small. If you have a huge dict, you probably should be chunking your load or delegating that to infra.
Not to mention pyrsistent's API is incompatible with dicts, so you can't pass it to external code without conversion.
You'd better have an incredible ROI to justify that.

Since when is Python about speed?
> Just ran a quick benchmark

Where's the code? Have you observed the bottleneck call?
> Except at 10k+ items, batch updates on 100K+ items, or inserting 100 keys.
> This is rarely the case in practice
Where are the stats on the actual practice?
> You'd better have an incredible ROI to justify that.
The ROI being: fearless API design where 1) multiple instances of high-level components are truly independent and can easily parallelize, 2) calling sites know that they keep the original data intact and that callees behave within the immutability constraints, 3) default func inputs and global-scope objects are immutable without having to implement another PEP, 4) collections are hashable in general.

I won't waste more of your time.

Only if you deny access to the underlying real dict.
Yes, this only prevents the callee from mutating it; it can't provide a strong guarantee that the underlying mapping won't be changed upstream (and hence MappingProxyType can't be hashable).
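A quick demonstration of that shallow guarantee with the stdlib's MappingProxyType:

    from types import MappingProxyType

    d = {'a': 1}
    view = MappingProxyType(d)   # a read-only *view*, not a frozen copy

    # view['b'] = 2   # TypeError: 'mappingproxy' object does not support item assignment
    d['b'] = 2
    print(view['b'])   # 2 -- upstream mutation shows through, hence no hashability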
Concurrency is a good motivation, but this is super useful even in straight line code. There’s a huge difference between functions that might mutate a dictionary you pass in to them and functions that definitely won’t. Using Mapping is great, but it’s a shallow guarantee because you can violate it at run time.
Shoutout to pyrsistent, a personal favorite library for frozen/immutable/hashable data structures whenever I need to use lists, sets, or dicts as dict keys, or make sets of such items: https://github.com/tobgu/pyrsistent
Can someone ELI5 the core difference between this and named tuples, for someone who is not deep into Python? ChatGPT's answer boiled down to: unordered (this) vs ordered (NTs), "arbitrary keys, decided at runtime" vs "fixed set of fields decided at definition time" (can't an NT's keys also be interpolated from runtime values?), and a different API (`.keys()`, `.items()`), etc (I'm just giving this as context btw, no idea if there's inaccuracies in these).
So could this also have been approached from the other side, as in making unordered NamedTuples with support for the Mapping API? The line between dictionaries and named tuples and structs (across various languages) has always seemed a bit blurry to me, so I'm trying to get a better picture of it all through this.
> "arbitrary keys, decided at runtime" vs "fixed set of fields decided at definition time" (can't an NT's keys also be interpolated from runtime values?)
If you want to create a named tuple with arbitrary field names at runtime then you need to create a new named tuple type before you create the instance. That is possible, since Python is a very dynamic language, but it's not particularly efficient and, more importantly, just feels a bit wrong. And are you going to somehow cache the types and reuse them if the field names match? It's all a bit of a mess compared to just passing the entries to the frozendict type.
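A sketch of what that runtime dance looks like (the field names here are invented):

    from collections import namedtuple

    fields = ['alpha', 'beta']        # only known at runtime
    Row = namedtuple('Row', fields)   # a brand-new type for this field set

    r = Row(alpha=1, beta=2)
    print(r.alpha)      # 1
    print(r._asdict())  # {'alpha': 1, 'beta': 2}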
The key difference is what the GPT outlined: tuples have an order for the fields, and named tuples are not indexed by the name; instead a field accessor is used, i.e. foo.bar vs foo["bar"]. In addition, namedtuples can be indexed using that order like tuples can (foo[0]), which clearly isn't possible with dicts and would be quite confusing if a dict had integer keys.
So I think the differences aren't great, but they are sufficient. A frozendict is not going to be indexable by an integer. Python already has an abstract type for this, mostly for the use of type checking I imagine: https://docs.python.org/3/glossary.html#term-mapping

Documentation for namedtuple: https://docs.python.org/3/library/collections.html#collectio...
On top of generating types dynamically being slow and bad as quietbritishjim said, “str values that also happen to be valid identifiers” is a very limiting dict key requirement.
The values in tuples cannot change. The values that keys point to in a frozen dict can?
But yeah I'd be in favour of something that looked a lot like a named tuple but with mutable values and supporting [name] access too. And of course some nice syntactic sugar rather like dicts and sets have with curly brackets today.
> The values in tuples cannot change. The values that keys point to in a frozen dict can?
The entries of a tuple cannot be assigned to, but the values can be mutated. The same is true for a `frozendict` (according to the PEP they don't support `__setitem__`, but "values can be mutable").
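Concretely:

    t = ([1], [2])
    # t[0] = [9]    # TypeError: 'tuple' object does not support item assignment
    t[0].append(9)  # fine: the *entry* is fixed, the referenced list is not
    print(t)        # ([1, 9], [2])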
This place hates laziness and imprecision. Using ChatGPT for editing or inspiration is okay as long as you personally review the results for accuracy and completeness, at which point people care about it as much as you announcing that you used a spell checker.
Fair, because they’re not your words. I’ll edit my comment for what I had in mind: that it can be helpful for that like a spell checker - for example, I know non-native English speakers who find them useful for editing but they completely understand the topic intellectually.

I don't see anything wrong with your asking to understand

Edit: Of course, I get downvoted as I predicted I would. Lol.

always has been even before GPT

https://news.ycombinator.com/item?id=46206457
you made a non-comment, that's why you're downvoted
comments don't have to be substantial, but they should be adding something that might merit more responses, or could. yours does not. what kind of reply could you even give to "this place [and so, its users] hates [thing]"
I'd ask you to qualify it, but you can't really qualify such a statement
> so having a safe way to share dictionaries between threads will be a boon
Since only the keys are const and the values are not, frozendict is not thread-safe per se. There would need to be a small lock around the value getter and setter.
It's thread-safe for operations on the dict but not on the values. The same applies to other immutable structures like tuples. A lock will not help here, because the unsafety comes from operating on a value after it is obtained.
I’m curious why the dict->frozendict operation is not an O(1) operation when there’s a refcnt of 1 on the dict - that resolves the spooky action at a distance problem raised and solves the performance for the most common usage pattern (build up the dict progressively and convert to a frozendict for concurrent reads).
Personally, I’ve been using a wrapper around `collections.namedtuple` as an underlying data structure to create frozen dictionaries when I’ve needed something like that for a project.
That works if you're dealing with a known set of keys (i.e. what most statically-typed languages would call a struct). It falls down if you need something where the keys are unknowable until runtime, like a lookup table.
I do like dataclasses, though. I find them sneaking into my code more and more as time goes on. Having a declared set of properties is really useful, and it doesn't hurt either that they're syntactically nicer to use.

There was already a rejected PEP for a frozendict builtin in 2012:

https://peps.python.org/pep-0416/

Also one in 2019 for a "frozenmap":

https://peps.python.org/pep-0603/

https://pypi.org/project/immutabledict/
For example, I often write classes that do cacheable analysis that results in a dict (e.g. the class stores a list of tiles defined by points and users want a point-to-tiles mapping for convenience). It's worth caching those transformations, e.g. using @functools.cached_property, but this introduces a risk where any caller could ruin the returned cached value by editing it. Currently, I take the safety hit (cache a dict) instead of the performance hit (make a new copy for each caller). Caching a frozendict would be a better tradeoff.

It's optimized for fast reads in exchange for expensive creation.
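A minimal sketch of that hazard; the Tiling class and its data layout are invented for illustration:

    from functools import cached_property

    class Tiling:
        def __init__(self, tiles):
            self._tiles = tiles   # e.g. {tile_name: [points]}

        @cached_property
        def point_to_tiles(self):
            mapping = {}
            for name, points in self._tiles.items():
                for p in points:
                    mapping.setdefault(p, []).append(name)
            return mapping   # every caller receives the *same* cached dict

    plan = Tiling({'t1': [(0, 0)], 't2': [(0, 0), (1, 1)]})
    plan.point_to_tiles[(0, 0)].clear()   # one careless caller poisons the cache
    print(plan.point_to_tiles[(0, 0)])    # [] -- for every later caller too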
Maybe you should take a look at pyrsistent. That allows you to make frozen “maps”. You can “evolve” them into new versions of the objects and it keeps the references to the unchanged keys and values under the hood so it’s fast and memory efficient.
Next step: Auto-inferring the correct (most narrow) TypedDict type from a frozendict. (Think `const foo = { … } as const` in TypeScript.) I miss this TS feature in Python on the daily.
What would that do exactly? Auto-generate a TypedDict class?

What I meant was that a frozendict literal should be inferred as its corresponding (most narrow) TypedDict type.
Bringing up TypedDict is sort of interesting, because it seems like a frozen dictionary could have been implemented by PEP 705, if typing.ReadOnly were enforced at runtime, and not just a hint to a static type checker.

I expect this is not a very big ask and the various typecheckers will have versions of this soon after release.
After what release? Type checkers already support the various TypedDict PEPs, even those that haven't been accepted yet. Yet, none of them infers TypedDict types.
Not sure we're talking about the same thing. Inline TypedDicts are already in the process of being formalized, see https://peps.python.org/pep-0764/ , and have experimental support in e.g. Pyright.
Ah, I didn't know about PEP 764. I don't really know if I like it though. It looks like something someone invented because they don't want to write classes or specifically want to be able to access a data structure like a dict, for whatever reason.
It basically provides data types, but also ensures that you have no easy way to reuse them, and it will miss cases where two data structures happen to have the same signature.
It's getting a little messy when we have class, namedtuple and TypedDict, which can all do much the same thing.

IMHO TypedDicts in Python are essentially broken/useless as is.

What is needed is TS-style structural matching, like a Protocol for dicts.
It would be great to have **kwargs be a frozendict by default. It could help caching decorators speed up key construction. For example, the natural key is simply `(args, kwargs)`.
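A sketch of such a decorator as it can be written today; the sorted-items conversion is exactly the step that a hashable **kwargs would make unnecessary:

    def memoize(fn):
        cache = {}
        def wrapper(*args, **kwargs):
            # kwargs is a plain (unhashable) dict today, so it has to be
            # converted; a frozendict **kwargs could go into the key as-is.
            key = (args, tuple(sorted(kwargs.items())))
            if key not in cache:
                cache[key] = fn(*args, **kwargs)
            return cache[key]
        return wrapper

    @memoize
    def add(a, b=0):
        return a + b

    print(add(1, b=2))  # computed
    print(add(1, b=2))  # served from the cache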
If this gets wide enough use, they could add an optimization for code like this:
    n = 1000
    a = {}
    for i in range(n):
        a[i] = str(i)
    a = frozendict(a)  # O(n) operation can be turned to O(1)
It is relatively easy for the JIT to detect the `frozendict` constructor, the `dict` input, and the single reference immediately overwritten. Not sure if this would ever be worth it.
I think GP was talking about optimizing away the O(N) call on the last line. The GC will take care of removing the reference to the old (mutable) dict, but constructing a new frozendict from a mutable dict would, in the current proposal, be an O(N) shallow copy.
There are also potentially other optimizations that could be applied (not specific to dict/frozendict) to reduce the memory overhead on operations like "a = f(a)" for selected values of "f".

But actually, I suspect it can't do this optimization simply because the name `frozendict` could be shadowed.
Indeed, the Tcl implementation does this so e.g. `set d [dict create] ; dict set d key value` can modify d in place instead of creating a copy (since everything is immutable).
Agree it's about time to make this built in. Other functional packages like JAX [0] are already using the concept, but they build it into their library from scratch.

[0] https://flax.readthedocs.io/en/latest/api_reference/flax.cor...
I wish Python could move away from types-as-mutability-determiners. Or paper over them; all could have mutable and non-mutable variants, with "to_mut()" and "to_immut()" methods. And in the same vein, explicit control over whether functions mutate a variable in place, or make a local copy. Python is my first lang, and I still am unable to consistently reason about these.
SQLAlchemy has its own frozendict which we've had in use for many years, we have it as a pretty well performing cython implementation these days, and I use it ubiquitously. Would be a very welcome addition to the stdlib.
This proposal is important enough that I chimed in on the thread with a detailed example of how SQLAlchemy uses this pattern:

https://discuss.python.org/t/pep-814-add-frozendict-built-in...
This subject always seems to get bogged down in discussions about ordered vs. unordered keys, which to me seems totally irrelevant. No-one seems to mention the glaring shortcoming which is that, since dictionary keys are required to be hashable, Python has the bizarre situation where dicts cannot be dict keys, as in...
{{'foo': 'bar'}: 1, {3:4, 5:6}: 7}
...and there is no reasonable builtin way to get around this!
You may ask: "Why on earth would you ever want a dictionary with dictionaries for its keys?"
More generally, sometimes you have an array, and for whatever reason, it is convenient to use its members as keys. Sometimes, the array in question happens to be an array of dicts. Bang, suddenly it's impossible to use said array's elements as keys! I'm not sure what infuriates me more: said impossibility, or the python community's collective attitude that "that never happens or is needed, therefore no frozendict for you"
Turning a dictionary into a tuple of tuples `((k1, v1), (k2, v2), ...)`; isn't that a reasonable way?
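That works whenever the keys are mutually sortable, for instance:

    d = {'b': 2, 'a': 1}
    key = tuple(sorted(d.items()))   # (('a', 1), ('b', 2)) -- insertion-order independent
    index = {key: 'payload'}

    same = {'a': 1, 'b': 2}          # built in a different order
    print(index[tuple(sorted(same.items()))])   # 'payload'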
If you want to have hash map keys, you need to think about how to hash them and how to compare for equality, it's just that. There will be complications to that such as floats, which have a tricky notion of equality, or in Python mutable collections which don't want to be hashable.
/? "Existing threads re: consts and applications thereof"
I wasn't able to find a URL to this post (2021) from the python-ideas mailing list archives using a double-quoted search term today; I had to use the python mailing list search engine? Did something break crawling of mailing lists? Old mailman HTML archives were very simple to crawl. ENH: pypa: add a sitemap.xml for/of mailing list archives, forums; @pypa: ask for search engine indexing advice: "How do we make sure that the python mailing list archives will be search indexed?" (as they traditionally were)
How to find the .txt of mailing list archives posts these days?

From "[Python-ideas] Re: Introduce constant variables in Python" (2021) https://mail.python.org/archives/list/python-ideas@python.or... :

- pyrsistent: PMap, PVector
I'm out of time for this; (reformatting this for HN so that URLs will be auto-linkified but newlines won't be eliminated) here's the full email as .txt, the mailing list archive has a hyperlinkified version with newlines preserved. GH Markdown and CommonMark Markdown also preserve newlines and auto-linkify:
From: [@westurner]
Date: Thu, Jun 17, 2021, 10:43 AM
Subject: Re: [Python-ideas] Re: Introduce constant variables in Python
Cc: python-ideas <python-ideas@python.org>
On Mon, May 24, 2021 at 5:43 PM Chris Angelico <rosuav@gmail.com> wrote:
Requiring that a name not be rebound is well-defined and testable.
Requiring that an object not change is either trivial (in the case of,
say, an integer) or virtually impossible (in the case of most
objects).
What would be the advantage of such a declaration?
ChrisA
## Existing threads re: consts and applications thereof
So, `/? from:me pyrsistent` I found a few results:
- "[Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)" 2016-04
https://mail.python.org/pipermail/python-dev/2016-April/143958.html
- ~Sandboxing python within python is nontrivial to impossible; consts might help a bit
- https://mail.python.org/pipermail/python-dev/2016-April/143958.html
- "Proposal to add const-ness type hints to PEP-484"
https://mail.python.org/archives/list/python-ideas@python.org/thread/OVPF5I6IOVF6GOJQRH5UGCCU3R7PQHUF/
- https://github.com/python/typing/issues/242
- "Final names and attributes" https://github.com/python/mypy/pull/5522
This is where `typing.Final` comes from.
- "[Python-ideas] "Immutable Builder" Pattern and Operator"
https://mail.python.org/pipermail/python-ideas/2017-January/044374.html
- [pyrsistent] and "fn.py [do] immutables:
https://github.com/kachayev/fn.py/blob/master/README.rst#persistent-data-structures "
- "[Python-ideas] Add recordlcass to collections module"
https://groups.google.com/g/python-ideas/c/9crHfcCBgYs/m/6_EEaWJAAgAJ
- ORMs (e.g. Django, SQLAlchemy) require "dirty state" checking to know which object attributes have changed and need an SQL statement to be executed to synchronize the state; this is relevant because when we're asking for mutable namedtuple we're often trying to do exactly this pattern.
- "[Python-ideas] Suggestions: dict.flow_update and dict.__add__"
https://www.google.com/search?q=%22%5BPython-ideas%5D+Suggestions%3A+dict.flow_update+and+dict.__add__%22
> dicttoolz has functions for working with these objects; including dicttoolz.merge (which returns a reference to the merged dicts but does not mutate the arguments passed).
>
> https://toolz.readthedocs.io/en/latest/api.html#dicttoolz
> https://toolz.readthedocs.io/en/latest/api.html#toolz.dicttoolz.merge
>
> pyrsistent has a PRecord class with invariants and type checking that precedes dataclasses. pyrsistent also has 'freeze' and 'thaw' functions for immutability. PRecord extends PMap, which implements __add__ as self.update(arg) (which does not mutate self)
https://github.com/tobgu/pyrsistent/blob/master/README.rst#precord
>
> https://github.com/tobgu/pyrsistent/blob/master/pyrsistent/_pmap.py
- "[Python-ideas] How to prevent shared memory from being corrupted ?"
https://www.google.com/search?q=%22How+to+prevent+shared+memory+from+being+corrupted+%3F%22
> PyArrow Plasma object ids, "sealing" makes an object immutable, pyristent
>
> https://arrow.apache.org/docs/python/plasma.html#object-ids
> https://arrow.apache.org/docs/python/plasma.html#creating-an-object-buffer
> > Objects are created in Plasma in two stages. First, they are created, which allocates a buffer for the object. At this point, the client can write to the buffer and construct the object within the allocated buffer. [...]
- [Python-ideas] Experimenting with dict performance, and an immutable dict
https://mail.python.org/archives/list/python-ideas@python.org/message/DNBGUJHDH4UTPSETMFFWMJHNXQXIWX4I/
> https://pyrsistent.readthedocs.io/en/latest/intro.html#pyrsistent :
>
>> Pyrsistent is a number of persistent collections (by some referred to as functional data structures). Persistent in the sense that they are immutable.
>>
>> All methods on a data structure that would normally mutate it instead return a new copy of the structure containing the requested updates. The original structure is left untouched.
>>
>> This will simplify the reasoning about what a program does since no hidden side effects ever can take place to these data structures. You can rest assured that the object you hold a reference to will remain the same throughout its lifetime and need not worry that somewhere five stack levels below you in the darkest corner of your application someone has decided to remove that element that you expected to be there.
>>
>> Pyrsistent is influenced by persistent data structures such as those found in the standard library of Clojure. The data structures are designed to share common elements through path copying. It aims at taking these concepts and make them as pythonic as possible so that they can be easily integrated into any python program without hassle.
> What would be the advantage of such a declaration?
Constants don't need to be locked or unlocked; which is advantageous for parallelism and reasoning about program correctness.
True consts (wherein everything referred to in that object is 'frozen' and immutable, or at least only modifiable with e.g. copy-on-write) wouldn't require locks, which would be post-GIL advantageous.
You could do consts by never releasing a threading.Lock (or similar):
- https://docs.python.org/3/library/asyncio-sync.html#locks
- https://docs.python.org/3/library/threading.html#lock-objects
- This from https://docs.python.org/2/library/sets.html?highlight=immutable#immutable-transforms re ImmutableSet/FrozenSet is not present in the python 3 docs: https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset
Though - even if Python enforced normal consts in the language - all of the other code objects would still be mutable, so you still have the impossibility of sandboxing python.
Functional and contracts coding styles rely upon invariance, which can be accomplished with various third-party packages that enforce const-ness throughout what may be an object tree behind that reference that would otherwise need to be copy.deepcopy()'d.
## pyrsistent
Src: https://github.com/tobgu/pyrsistent
> - PVector, similar to a python list
> - PMap, similar to dict
> - PSet, similar to set
> - PRecord, a PMap on steroids with fixed fields, optional type and invariant checking and much more
> - PClass, a Python class fixed fields, optional type and invariant checking and much more
> - Checked collections, PVector, PMap and PSet with optional type and invariance checks and more
> - PBag, similar to collections.Counter
> - PList, a classic singly linked list
> - PDeque, similar to collections.deque
> - Immutable object type (immutable) built on the named tuple
> - freeze and thaw functions to convert between python standard collections and pyrsistent collections.
> - Flexible transformations of arbitrarily complex structures built from PMaps and PVectors.
## icontract
Src: https://github.com/Parquery/icontract
> icontract provides design-by-contract to Python3 with informative violation messages and inheritance.
>
> It also gives a base for a flourishing of a wider ecosystem:
>
> - A linter pyicontract-lint,
> - A sphinx plug-in sphinx-icontract,
> - A tool icontract-hypothesis for automated testing and ghostwriting test files which infers Hypothesis strategies based on the contracts,
together with IDE integrations such as icontract-hypothesis-vim, icontract-hypothesis-pycharm, and icontract-hypothesis-vscode,
> - Directly integrated into CrossHair, a tool for automatic verification of Python programs,
together with IDE integrations such as crosshair-pycharm and crosshair-vscode, and
> - An integration with FastAPI through fastapi-icontract to enforce contracts on your HTTP API and display them in OpenAPI 3 schema and Swagger UI.

https://en.wikipedia.org/wiki/Design_by_contract
https://en.wikipedia.org/wiki/Invariant_(mathematics)#Invari... [ https://en.wikipedia.org/wiki/Class_invariant ]

> What is the difference between "invariant" and "constant" and "final"?
Regarding the spooky-action-at-a-distance concerns of a `.freeze()` method on dict:
`.freeze()` should probably just return a frozendict instead of in-place mutating the dict, and they should be separate types. Under the hood, you'll have to build the hashtable anyway to make the frozendict; as long as you're doing that work, you may as well build an object to contain the hashtable and just have that object be separate from the dict that birthed it.
The values referenced by both the original dict and the frozendict can be the same values; no need to clone them.
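A sketch of those shallow semantics (using `dict(d)` as a stand-in, since `frozendict` is not a builtin yet):

    d = {'k': [1, 2]}
    fd = dict(d)               # frozendict(d) would likewise copy only the table
    assert fd['k'] is d['k']   # the value objects are shared, not cloned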
Can someone give a strong rationale for a separate built-in class? Because "it prevents any unintended modifications" is a bit weak.

Not really, C# has ConcurrentDictionary.
If you have fixed keys, a frozen dataclass will do. If you don't, you can always start with a normal dict `d`, then store `tuple(sorted(d.items()))` to get immutability and efficient lookups (binary search), then throw away `d`.
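A sketch of the lookup half of that suggestion, using the stdlib bisect module:

    from bisect import bisect_left

    frozen = tuple(sorted({'b': 2, 'a': 1, 'c': 3}.items()))

    def lookup(items, key):
        i = bisect_left(items, (key,))   # (key,) sorts just before (key, value)
        if i < len(items) and items[i][0] == key:
            return items[i][1]
        raise KeyError(key)

    print(lookup(frozen, 'b'))   # 2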
this is likely in reference to the fact that dicts have maintained insertion order since Python ~3.6 as a property of the language. Mathematically there's no defined order to a set, and a dict is really just a set in disguise, but it's very convenient for determinism to "add" this invariant to the language.
Sets use a different implementation intentionally (i.e. they are not "a dict without values") exactly because it's expected that they have different use cases (e.g. union/intersection operations).
Debugging is a completely different and better animal when collections have a predictable ordering. Else, every dict needs ordering before printing, studying, or comparing. Needlessly onerous, even if philosophically justifiable.
Not related to Python, but one of the possible implementations of a set, i.e. of an equivalence class on sequences, is as a sorted array (with duplicates eliminated, unless it is a multiset, where non-unique elements are allowed in the sorted array), as opposed to the unsorted array that can store an arbitrary sequence.
So sets can be viewed as implicitly sorted, which is why the order of the elements cannot be used to differentiate two sets.
Being sorted internally to enforce the equivalence between sets with elements provided in different orders does not imply anything about the existence of an operation that would retrieve elements in a desired order or which would return subsets less or greater than a threshold. When such operations are desired, an order relation must be externally defined on the set.
So a possible definition of sets and multisets is as sorted arrays with or without unicity of elements, while sequences are unsorted arrays (which may also have the constraint of unique elements). However the standard set operations do not provide external access to the internal order, which is an order between arbitrary identifiers attached to the elements of the set, which have no meaning externally.
A stable order does not imply sorted order. If you convert the same set to a list twice you'll probably get the same order both times but it isn't guaranteed anywhere, and that order may change between python implementations and versions. The order may also change as the set grows or shrinks.
So are dictionary keys, but Python decided to make them insertion ordered (after having them be unordered just like set elements for decades). There is no fundamental reason sets couldn't have a defined order. That's what languages like JavaScript have done too.
Python's decision to make dict keys ordered in the spec was a mistake. It may be the best implementation so far, but it eliminates potential improvements in the future.
Agreed. The only reason to make them sorted is because people would wrongly assume that they were. You can argue that a programming language should not have unexpected behaviors, and apparently unsorted dictionary keys were a surprise to many; on the other hand, I feel like it's a failure of education.
The problem was that assuming that keys would be sorted was frequently true, but not guaranteed. An alternative solution would have been to randomize them more, but that would probably break a lot of old code. Sorting the keys makes no difference if you don't expect them to be, but it will now be a greater surprise if you switch language.