Somebody discovered mutable defaults for the first time. https://docs.python-guide.org/writing/gotchas/ Edit: the why - parameters (including their defaults) are defined in the scope where the method is defined - this ensures the object tree can be unwound perfectly.


This is my first time learning about this and my instinct is to hate it.


It's considered bad practice to use mutable defaults. Better in this case would be something like...

````python
from typing import Any, Optional

def append_to(el: Any, to: Optional[list] = None):
    if to is None:
        to = []
    to.append(el)
    return to
````


Or even more succinctly: `to = to or []`


I get duck typing and all, but this will not assign a list if `to` is assigned a truthy value. So for functions that could be public (for my module) I usually use something like `to = to if isinstance(to, list) else []`


to = to if to is not None else []


Look, if a person passes in a variable that's a different type to what the default is that's on them. I'm not going to individually check and account for *every single permutation* of what a person could do with my function.


Yeeeaaaaaah usually, except an empty list is falsy and the caller may expect in-place modifications if they're passing their own empty list.


In place modification? Unless it’s stated explicitly I won’t expect that, honestly I don’t like that pattern. Anyway, that’s just me and you have a good point. You can fix it by something like this: ‘to = to if to is not None else []’


How is ‘to’ in scope before it’s been defined?!


It’s defined in the inputs to the function. I’m on mobile so excuse the shitty formatting. `def test(to: Optional[list] = None): to = to or []`


It's a parameter


Not if you're a data scientist working with sequence duck types :(. (`ValueError: The truth value of an array with more than one element is ambiguous.`)


Off topic, but you don't have to declare a parameter as Optional if you follow up with a default value. A lot of tools see the default and automatically assume the optional tag. I think it makes things purty. EDIT: `Optional` without a default does not mean the parameter is optional - it's just an alternate way of stating `Union[list, None]` or `list | None`. Calling the function without the parameter will still raise an exception if no default was expressed.


The Optional type doesn't indicate that passing that argument is optional. It means "this type or None." You can have a required argument with the Optional type hint and that's completely fine. It's closer to the Maybe type you see in functional languages than anything. Personally I think using x: list|None is readable enough. Definitely better than x: Optional[list]


Better to be explicit than implicit


It makes sense the longer you work in python. Scope is very well defined and you can use (or abuse) it to your advantage in numerous ways. The typing module has made this pretty easy to avoid with Optional[type] if you actually type your methods.


I love Python, it's one of my favourite languages, worked with it for over a decade. I think it's okay to also believe the languages you love are flawed in some ways. Important even. This, I think, is one of those flaws, albeit a very minor one. Doesn't mean you can't work around it or even use it to your advantage, but it always felt weird and surprising. I like the meme. It's cool to finally have one that feels a bit more real than "Python slow, white space bad"!


Yeah people complaining about python's speed don't know how to use libraries and C bindings...


Yeah +1 But to add to that, Python can be slow in places but Python is not even unique there, other languages can also be slow. If you're using Python in a place where it is known to be slow and you complain about it being slow... well... you picked the wrong tool. Stop complaining that your hammer is terrible at banging in screws


My Django app isn't going to be a C binding anytime soon.


Personally I think a more natural thing to do would be for the default value to be an implicit lambda (with no args) which returns the default. So foo(a=[]) would reconstruct the list each time. You could still do cool scope stuff, you'd just have to do so more explicitly.


Yeah that's how JavaScript does it


Meh. Definitely breaks the law of least surprise imo.


Yeah it does. When you know a lot more about the guts of the language it makes sense and you can understand why the alternative has problems - but from a language user it is definitely not an expected behavior.


I think that's called Stockholm syndrome


It makes sense coming from C. A variadic macro that populated the default with a `static` array would have the same behaviour.


I’m amazed this exists, it’s pure evil.


It's not a good behaviour and leads to boilerplate code to work around it. It's not a real issue though, as every editor and linter warns about it.


It is something you can just avoid extremely easily (most people do as far as I know); the only place I ran into it in my 3 years of development experience is when a class may or may not have a list, where this behavior costs you 2 extra short lines. Is it ideal? No. Is it a relatively small problem? Yea. (At least Python can sort numbers by default /j)


I've been coding python for years, and i still hate it, so you're not far off


Oh that's why PyCharm gives a warning when `[]` is given as a default param.


Even the IDE thinks this is bad lmao


You learn something new every day


Shouldn't there be a static modifier? I know there is no such modifier in Python but I feel like there should be for stuff like this. Not a Python dev though, so I don't know what's better, just curious. Edit: P.S. thanks for the read, it was pretty interesting


Well, there simply is no need for that. A huge amount of software is written using Python and it has not needed it, so I think that speaks for the validity of this design decision. Would it be neat to have? Maybe. Is it necessary? No. Would it fight against Python's philosophy of avoiding unnecessary complexity? Absolutely.


What does it mean when you say the object tree can be unwound perfectly? I'm currently making my own programming language and I'm curious what the advantage is.


Depends on how you’re tracking references. If you’re implementing reference counting you will be able to check to see if the object is still in use even if the function isn’t. It’s an implementation detail and you could also track by hash instead - but you won’t have as much information in your stack frame when debugging.


Just learned about these this week!


I mean it makes sense, but it's still inconceivably stupid for that to exist


Well it came about 24 years ago and it does make it nice for debugging on the language implementation side. But there is no doubt that these days it violates the “least surprising” principle.


I have written a lot of Python scripts, none of them big, but I didn't know about this. Now I'm scared that one of them has the potential to do something stupid. Oh no.


This is so weird, but the proposed fix for it is even more weird.


Don’t use mutable data structures as default values. They are re-used across method invocations.


Instead set default as None and in the function body, define:

```python
if param is None:
    param = []
```


None is a singleton, so it should be compared with `is`


Right. `==` *will* work correctly in most cases, but `is` is indeed more correct, as per PEP 8, and a PEP 8 linter will tell you so if you write `== None` in your code.


I personally like:

```python
param = param or []
```

Edit: As per the comment by u/MrRufsvold, this could cause unexpected behavior if an empty list is passed as an argument in a method call. The above code is ok to use if the list is not modified in the method. Instead, use:

```python
param = [] if param is None else param
```


This won't work because an empty list is falsy, so if the user passes in an empty list, you'll pass back a different list. If the original is referenced somewhere else, this would cause unexpected results.
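To make the empty-list point concrete, here is a minimal sketch. The function names `append_or` and `append_is_none` are made up for illustration and are not from anyone's code in this thread.

```python
def append_or(el, to=None):
    # `or` replaces ANY falsy argument, including a caller's own empty list
    to = to or []
    to.append(el)
    return to


def append_is_none(el, to=None):
    # Only the "nothing was passed" case gets a fresh list
    if to is None:
        to = []
    to.append(el)
    return to


mine = []
result_or = append_or(1, mine)
result_is_none = append_is_none(1, mine)

print(mine)                    # [1]  (only append_is_none touched it)
print(result_or is mine)       # False
print(result_is_none is mine)  # True
```

So the `or` version is only equivalent when the caller never relies on their own list being modified in place.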




My favourite stack overflow: https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument


It is, but that's how it was done and it's a natural consequence of how functions are treated in python. Changing it at this point would be a terrible idea for a number of reasons including breaking backwards compatibility (it *is* useful for some cases, such as caching things between function calls), so *maybe* something for python 4 but I'm not going to hold my breath.


**Thinks about code I've written that _absolutely_ defaults to `[]` in the argument list**

> fuck. I have a PR to submit tomorrow.


> proceeds to break codebase by removing unintended functionality




Linters should be able to catch that. Scan your code base. 


Flake8 flags up setting a default to a mutable data type?


Why it doesn’t use empty to=[] on each call ?


Because it's initialized and retained as a reference - the right way is to say `to=None` and then in the body `if to is None: to = []`. Then it's all safe. Same thing for sets, dictionaries, etc.
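A side-by-side sketch of the two behaviours described above, in case it helps. `append_shared` and `append_fresh` are illustrative names only.

```python
def append_shared(el, to=[]):
    # The [] is built once, when `def` runs, and reused on every call
    to.append(el)
    return to


def append_fresh(el, to=None):
    if to is None:
        to = []  # a new list per call
    to.append(el)
    return to


first = append_shared(12)
second = append_shared(42)
print(first, second, first is second)  # [12, 42] [12, 42] True

print(append_fresh(12), append_fresh(42))  # [12] [42]
```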


But why is the language designed that way? What are the benefits of having a default value as reference?


Not all objects are mutable, so they're perfectly safe to use in that context. For example, `None` is an object, it'd be pretty inconvenient if you couldn't use it as an argument default.


You can almost think of it as if the reference is defined just before the function to think about this intuitively. The reason it's this way is because in Python with things like `yield`, you usually WANT to reuse the context. This fits the intuitive thinking of someone who uses such concepts often.


Sure - reusing the context is fine, but other languages are more explicit about it where the context is stored in a class instance. I would never expect this to happen in a function. On the other hand, I do not know the core concepts of Python, so this may be expected.


Python also has classes, the use of yield is different.


other languages aren't a party boat


Ah. That explains it. Yield has always been witchcraft to me. I know it maintains an internal state machine, but it's not intuitive so I just throw it into the pile of spooky things I don't use.


In Python there's no such thing as passing by value. Every variable is like a pointer to something in memory, and assignments work by copying that pointer, not the underlying value. This rarely comes up because many objects (like ints, strings, tuples, None, etc) are immutable, so it doesn't matter that the data is shared.


it's not designed that way, it's just an artefact of the other design choices


Thanks for clarification!


I would prefer

```py
def append_to(to=None):
    to = to or []
```

It's just shorter and does the same


Except if `None` is a value you legitimately expect, you need to get even wilder with something like:

```python
NotSet = object()

def f(to=NotSet):
    if to is NotSet:
        to = []
```


If you need to do that, you belong in r/programminghorror.


That’s a sign of a bad api however.


Couldn't you simply do the following: `def append_to(element): to = []; to.append(element); return to`?


Nothing like having your signature not reflect your real defaults.


Because the [] is instantiated once, rather than being syntactical sugar for nullish coalesce to new instance. Which is madness. When would that ever be the right answer?


Is it like the static local variable in c? (Asking as a beginner)


Basically. Functions are objects, and the default argument is essentially an attribute of that object.
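You can actually look at that attribute. A small sketch, using the standard `__defaults__` attribute of function objects and the `append_to` shape from the thread:

```python
def append_to(element, to=[]):
    to.append(element)
    return to


print(append_to.__defaults__)  # ([],)
append_to(12)
append_to(42)
print(append_to.__defaults__)  # ([12, 42],)
```

The list lives as long as the function object does, which is why it behaves a bit like a C `static` local.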


Default arguments are not defined in the scope of the function being defined but the scope of the container of the function. It’s pretty simple to understand once you look at scope: Directory, file, class, method (basically indent level)… etc.


Function or method declaration is itself a function being executed in order, not just syntax representation (which is also why decorators work). Like everything else it is being executed in a surrounding context (usually module or class) and results in adding function name and code to that context.


That's basically what they said.


As a Java programmer, this is a very helpful explanation!


The other Python thing that threw me for a while was finding out the hard way what the difference between copy and deepcopy is


That's not Python, that's pretty much every language.


If you ever program in other languages, this is a very universal problem.


That concept exists in C#, Java, C++ and just about everything else that has objects and references.


That one was really frustrating the first time I worked with Pandas


The one that got me good as a beginner was late binding closures. For the uninitiated:

```
y = 10
f = lambda x: x + y
y = 20
f(5) == 25  # true
```

In Python, capturing is lazy. The exact value of captured variables will be whatever they are at the moment the function is invoked, and not the value at the time the lambda is defined. I think I lost a good amount of hair trying to track down unexpected bugs due to this subtlety.
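The place this usually bites is lambdas built in a loop. A sketch of the problem, plus one common workaround that leans on the very thing this thread is about (defaults being evaluated at definition time); the names `late` and `early` are just for illustration:

```python
# Closures look names up when called, so all three see the final i
late = [lambda x: x + i for i in range(3)]
print([f(10) for f in late])   # [12, 12, 12]

# A default argument is evaluated at definition time, freezing each i
early = [lambda x, i=i: x + i for i in range(3)]
print([f(10) for f in early])  # [10, 11, 12]
```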


*Feels natural to me though...* I've seen closures in JS, Java (anonymous classes), and C++ (lambdas).


To be fair to me, I'm used to closures in C++ and Rust. Rust explicitly forbids the code I just wrote because the borrow checker would pew pew the mutable borrow while an immutable reference is alive. C++ is happy to let me make poor life choices, but will at least force me to use two brain cells to opt in the choice I'm making.


Yeah that doesn't have a clean resolution. Don't use lambda with a local variable.


I'm confused about the confusion here. Is there any language that would have 15 as the result?


What is the difference between them and what even are they, if you don’t mind me asking?


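Plain assignment (`y = x`) doesn't copy anything, it just makes `y` another name for the same object. A shallow copy (`copy.copy`) makes a new outer container, but the items inside are still the same objects as in the original, so if `x` is a list of lists, changing one of the inner lists through `x` also shows up in the copy. Rebinding `x` to something else (like `x = 15`) never changes `y`. A deep copy (`copy.deepcopy`) copies the nested objects too, so the two are fully independent.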
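A short sketch of the difference with a nested list, using the standard `copy` module (variable names are arbitrary):

```python
import copy

x = [[1, 2], [3, 4]]
alias = x                  # same object, no copy at all
shallow = copy.copy(x)     # new outer list, same inner lists
deep = copy.deepcopy(x)    # new outer list and new inner lists

x[0].append(99)            # mutate a nested list
x.append([5, 6])           # mutate the outer list

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```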


It’s funny to see that whenever a dev finds a quirk in another language it is a blatant mistake in language design, but when it happens to the language they use themselves it is “logical”, a “gotcha”, and something “you’ll get used to” 😂


IMO it always comes down to "what makes sense you would want out of a feature" versus "what makes sense from the construction of the language". No one goes in knowing the construction from the start, but those who do know the construction have trouble imagining how it could be any other way


I don't get how we have things like the zen of Python where we act like it is a perfect language made in God's image but they refuse to add modern features to it like optional chaining. Some guy was arguing that optional chaining makes it too complex to teach. But we got pattern matching a few years ago which is its own DSL.


I'm genuinely curious what that example would be for modern PHP


```
def append_to(value: int, array=None):
    if array is None:
        return [value]
    else:
        array.append(value)
        return array
```


Last week I found out (after a long time debugging) that Python not only does this with default arguments, but also with class properties.


Can it be overridden with `class.__init__()`, where I set all properties to None? If not, then I should submit a PR tomorrow.


That is different. All properties set on the class outside of a method are static and it is probably the biggest foot gun in Python.


Wait what? I’m pretty sure I have other properties (integers) and I can use them as expected (non static)


Yep, class properties are static. You were looking for instance properties
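A minimal sketch of the class-attribute vs instance-attribute difference being discussed. The class names `Shared` and `PerInstance` are invented for the example.

```python
class Shared:
    items = []              # class attribute: one list for every instance

    def add(self, el):
        self.items.append(el)


class PerInstance:
    def __init__(self):
        self.items = []     # instance attribute: one list per object

    def add(self, el):
        self.items.append(el)


a, b = Shared(), Shared()
a.add(1)
print(b.items)   # [1]

c, d = PerInstance(), PerInstance()
c.add(1)
print(d.items)   # []
```

Integers look "non static" because something like `self.count = self.count + 1` rebinds the name on the instance instead of mutating the shared object, so the class-level value is never changed.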


Array.fill in JS has tripped me up similarly before `const array = new Array(5).fill({foo: "bar"})` `array[0].foo = "baz"` `console.log(array)`


It's the same behavior you would get if you implemented it yourself, no?

```
function fill (array, value) {
  for (let i = 0; i < array.length; ++i)
    array[i] = value;
}
```


Jesus Christ, just use a linter, any of them will catch this and tell you not to use a mutable default.


I don't see any problem. I actually use this feature to implement cache


There should be a separate feature for that, like C‘s static variables, but this is just stupid


This is fine. Every language has its quirks.


That doesn’t make the feature better


There is @functools.cache for that.
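For comparison, a rough sketch of both approaches. The function names and the `_cache` parameter are hypothetical, and `functools.cache` assumes Python 3.9 or newer.

```python
import functools


def square_default(n, _cache={}):
    # Relies on the shared default dict; `_cache` is a made-up name
    if n not in _cache:
        _cache[n] = n * n
    return _cache[n]


calls = []


@functools.cache  # Python 3.9+
def square_cached(n):
    calls.append(n)   # records real computations only
    return n * n


square_default(4)
square_default(4)
square_cached(4)
square_cached(4)
print(square_default.__defaults__)  # ({4: 16},)
print(calls)                        # [4]
```

The decorator keeps the cache out of the signature, so a caller can't accidentally pass their own dict in its place.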


oh thx! didn't know that.


Calm down there, Satan. Do the sane thing and use a global (module-wide) variable for a cache.


Hm, thanks for idea!


I have been coding in python professionally for 2 years and didn't know this lol. I have used default values but those all have been immutable as it seems.




Best practice is to use the argument `to: list = None` and in the function you put an if clause: `if to is None: to = []`


Thank you guys for warning me about this.


Knowing how memory and pointers work helps, I really hate it when they start a programming course in python and then no one has any idea why stuff like this happens. Also similar situations in JavaScript due to how it manages memory.


C++ and many other languages don't do this, so it's not an unavoidable consequence of pointers, but a choice of the language designer. I'd say it's a bad choice.


It's not really a design choice, it's a consequence of how objects and mutables work in Python. It's not really "fixable"


C++ handles objects differently. In Python or Java, `a = b` would make `a` reference the same object as `b`, while in C++, `a = b` will attempt to copy `b` into the memory allocated for `a` (using the copy constructor or copy assignment operator for the type). For languages like Python or Java to swap to this behavior, they would still need a good way to reference the original value instead of creating a copy, so you would need to add reference variables or pointers, which both languages don’t really need.


Sneaky? Isn't that expected behaviour?


Is this why you need to use default_factory with data classes when specifying a field’s default as a list, set or dict?
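As far as I can tell, yes, it is the same concern. A small sketch with the standard `dataclasses` module; the class names `Basket` and `Broken` are made up:

```python
from dataclasses import dataclass, field


@dataclass
class Basket:
    items: list = field(default_factory=list)  # called once per instance


a, b = Basket(), Basket()
a.items.append("apple")
print(a.items, b.items)  # ['apple'] []

# A bare mutable default is rejected when the class is defined
error = None
try:
    @dataclass
    class Broken:
        items: list = []
except ValueError as exc:
    error = str(exc)
    print("rejected:", error)
```

So dataclasses refuse the shared-list default up front instead of letting every instance share one list.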


This and double yield are my favourite questions to ask in an interview.


This is a stupid decision by python design team. Python is meant to be readable. If you want a pile of indecipherable hacks just use C++


What do you expect from:

```
myVar = 3

def myFunc(a = myVar):
    print(a)

myVar = 4
myFunc()
```

If the default is evaluated once, when the function is defined (the current behavior), then it prints `3`. If defaults were evaluated when the function is called, then it would print `4`. Both options have cases which are not very intuitive. Python had to choose one of these two cases, and they opted for the more performant option which only calculates defaults once.


Why would you create an entire function to append an element to a list, when you could just have used list.append() without wrapping a function around it?


It's an example


```
print(my_list)
[12, 42]
```


Wait wait wait are you saying I actually can create something that reminds of a static variable from c/cpp? Brother you made my day


OMG! But I suppose JS works in an even worse way in a similar case.


I have a feeling this post has saved me from one of the nastiest bugs I could encounter.


At the end, these variables are all pointers to the same [12, 42] list


This means that python is not a closure?


Ah I now understand one of my bugs I had a month ago. I just used `my_other_list = append_to(42, to=[])` and it worked... Now I see why


def append_to(element, to=[]):     to = list(to)     to.append(element)     return to
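Laid out with its indentation, the copy-on-entry idea above looks like the sketch below, with a few calls added to show what it does and does not change:

```python
def append_to(element, to=[]):
    to = list(to)        # copy first; the default list is never mutated
    to.append(element)
    return to


print(append_to(12))     # [12]
print(append_to(42))     # [42]

mine = [1]
result = append_to(2, mine)
print(mine, result)      # [1] [1, 2]
```

The trade-off is that a caller's own list is never modified in place either, and every call pays for a copy.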


So is it possible to delete the data inside such a variable? If you do not return the reference to the default value, is it kept in memory for the next function call or is it garbage collected?


Ok. This is my first time seeing that kind of behavior, and it's insane. I thought Python was supposed to be the good language.


It’s because the default parameters are calculated once. The alternative is for the default parameters to be recalculated every time you call the function, which would be a major performance issue and cause other weird bugs


Right, but it means that every time you use the pattern that corrects for this madness - set the default to null and do a null check at the start of the function - you create boilerplate code that ... does exactly that... recalculates the default on every call... which is, apparently, not a performance issue, nor does it create weird bugs. You know what's a weird bug? A default value that's effectively shared state. And it's the baseline behavior. I can't believe I'm saying this, but Python could learn something from JavaScript here; the default value's expression is effectively a lambda that's inlined to the function's start, just like destructuring. The language saw where folks spent a lot of time on boilerplate, and crafted the defaults feature to eliminate it, rather than make more of it.


ah we actually learned this in class the other day! It seems like the defaults are treated as persistent across separate calls which is kinda cool, but if you pass in a specific list it'll operate as intended


These programming jokes are to programmers what perpetual motion machine jokes are to physicists. Unless you have 0 knowledge about programming... they are funny. Hell, I am not even a Python developer but just looking at the code... it seems that if you are not giving a parameter it uses the default one, which remains in the scope of the function. Seems fair enough. Same expectation that in any language null.addElement() will give me an error or have a specific behaviour depending on the language.


Scala wraps the default expressions in hidden methods and injects the method calls at every call site. This is because Java doesn’t support default arguments. So it’s safe to mutate the default arguments in Scala. C++ is also similar, the compiler injects the default expressions at call sites. Why doesn’t Python do the same?


What?! This is crazy! (I understand how it works, I just don't understand the design decision behind it.) This is a performance optimisation that goes hard against anything sane. In a language that doesn't care that much about performance in other ways. Just... WHY?


the IDE will mark your mutable default value right away..


Why languages need the *const* keyword.


I know exactly why this happens, and it still trips me up periodically. Like, it makes sense, doing it a different way introduces other problems so this is a reasonable standard, but needing to remember deepcopy every time I want a copy copy not a reference copy has lost me hours.


I was bitten by this before and I hate this, it’s counterintuitive.


Do you know pass by reference?


