Hacker Read

Geminidog · 2020-12-22 09:29:12

I am not talking about leaky code. I am talking about code that is not modular.

Rest assured, I know you’re talking about a perceived isomorphism between a function with a struct as a parameter and the same struct with a method. There are some flaws with this direction of thought.

It is the usage of implicit ‘this’ that breaks modularity. When a method is used outside of a class the ‘this’ is no longer implicit thereby preventing the method from ever being moved outside of the context of the class. This breaks modularity. Python does not suffer from this issue.

Couple this with mutation. Often methods rely on temporal phenomena (aka mutations) to work, meaning that a method cannot be used until after a constructor or setter has been called. This ties the method to the constructor or setter rendering the method less modular as the method cannot be moved or used anywhere without moving or using the constructor with it.
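
A minimal sketch of that temporal coupling, using a hypothetical `Report` class (the names here are illustrative assumptions, not taken from any real codebase). The method only works after a setter has mutated the object, while the free function takes everything it needs as an argument:

```python
class Report:
    """Hypothetical class whose method only works after a setter has run."""

    def __init__(self):
        self.rows = None  # not usable until load() is called

    def load(self, rows):
        self.rows = list(rows)

    def total(self):
        # Depends on load() having mutated self first.
        return sum(self.rows)


def total(rows):
    """Free-function version: everything it needs arrives as an argument."""
    return sum(rows)


report = Report()
try:
    report.total()  # sum(None) raises TypeError
except TypeError as exc:
    print("method used too early:", exc)

report.load([1, 2, 3])
print(report.total())    # 6
print(total([1, 2, 3]))  # 6
```

Moving `Report.total` somewhere else means moving `load` (or an equivalent) along with it; the free function can be moved on its own.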

My claim is that combinators can be reorganized without dragging context around, thereby eliminating technical debt related to organizing, repurposing, and reusing logic.

Note that when I say combinator, I am not referring to a pure function.

Geminidog | karma 309 | avg karma 0.79 · | 2021-01-28 04:38:10+00:00

This guy makes a simple concept too complicated.

The main thing that's different about python OO is the method.

    class SomeClass:
        def someMethod(self: "SomeClass", someVar: int) -> int:
            return self.x + someVar

The explicit passing of context through "self" gives python methods combinator-like properties. Because of this, you can copy and paste the method and run it outside of the context of the class:

    def someMethod(self: SomeClass, someVar: int) -> int:
        return self.x + someVar

The above will work in global context. This makes python code significantly more refactorable than other OO languages. For example C++:

   class SomeClass {
       int x;
       int someMethod(int someVar) {
           return this->x + someVar;
       }
   };

If you copy and paste the method outside of the class:

       int someMethod(int someVar) {
           return this->x + someVar;
       }

It won't work.

The Python way allows the method to be easily refactored to operate on a generic type (caveat: you have to ignore the type annotations, which the Python interpreter does anyway at runtime). Given a self.x, any value for self will be valid so long as it has an x. Python does this automatically and that is the main difference.
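
A small sketch of that claim. The `some_method` and `SomeClass` names mirror the example above, and `SimpleNamespace` is just a convenient stand-in for "any object with an x":

```python
from types import SimpleNamespace


class SomeClass:
    def __init__(self, x: int) -> None:
        self.x = x

    def some_method(self, some_var: int) -> int:
        return self.x + some_var


# The same body, pasted at module level, unchanged apart from indentation.
def some_method(self, some_var: int) -> int:
    return self.x + some_var


obj = SomeClass(10)
other = SimpleNamespace(x=100)  # not a SomeClass, but it has an x

print(obj.some_method(1))     # 11
print(some_method(obj, 1))    # 11
print(some_method(other, 1))  # 101
```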

The main problem with OO is unfortunately still not resolved under either example. OO promotes context to follow logic around preventing the logic from being reused in other contexts. Ultimately, this is what's needed for the greatest modularity:

     def someMethod(x: int, someVar: int) -> int:
         return x + someVar

And although you can write the above with OO, the style ultimately promotes practices that steer programmers away from it.

nendroid | karma 10 | avg karma 0.05 · | 2020-10-28 17:16:48

It's not just pure functions. Two things break modularity: Free variables and mutation.

The problem with OOP is that no method is truly pure, no method is a combinator.

   class Thing
      var memberVar
      def addOne():
          return memberVar+1;

The above is an example of your typical class. addOne is not modular because it cannot be used outside of the context of Thing.

You can use static functions but the static keyword defeats the purpose of a class and makes the class equivalent to a namespace containing functions.

   class Thing
      static def add(x):
           return x + 1;

The above is pointless. Just do the below:

   namespace Thing:
       def add(x):
           return x + 1;

The point of OOP is for methods to operate on internal state. If you remove this feature from OOP you're left with something that is identical to namespaces and functions.

Use namespaces and functions when all you have are pure functions and use classes when you need internal state... there is literally no point for OOP if you aren't using internal state with your classes.

dlsspy | karma 1374 | avg karma 2.76 · | 2010-10-20 01:40:23

This is mostly a java disease, though it affects a few other languages. In python, you wouldn't design this because you can change the implementation of the class without having to go and rewrite all the callers.

pvg | karma 14880 | avg karma 1.9 · | 2010-02-04 04:24:25+00:00

Right. It doesn't happen by rooting around objects looking for methods by name. That's usually a sign of some misdesign that's making you re-invent the polymorphic mechanisms of the language yourself. True both in Java and Python.

TTPrograms | karma 1676 | avg karma 2.89 · | 2017-03-30 20:25:16+00:00

I don't really see how it's a problem that it's "leaking" a tuple API - it's not like there's private methods in Python. Everything can see everything. If anything having the state of an object available explicitly as a tuple seems like a good idea to me, as opposed to trying to figure out what's a stateful variable and what's a method etc.

pdpi | karma 11546 | avg karma 4.83 · | 2014-06-09 15:05:53

You still have to explicitly pass a `this` to that function, which is kind of my earmark for telling methods and functions apart. (In this regard, Python treads a fine line where it has an explicit `self` parameter that is passed in automatically).
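
A short sketch of the fine line Python treads, using a hypothetical `Counter` class: attribute access on an instance yields a bound method with `self` already filled in, while access on the class yields the plain function that takes `self` explicitly.

```python
class Counter:
    def __init__(self, start: int) -> None:
        self.value = start

    def add(self, n: int) -> int:
        return self.value + n


c = Counter(5)
bound = c.add        # bound method: self is already filled in with c
plain = Counter.add  # in Python 3 this is just the underlying function

print(bound(2))                 # 7
print(plain(c, 2))              # 7
print(bound.__self__ is c)      # True
print(bound.__func__ is plain)  # True
```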

contravariant | karma 8428 | avg karma 2.4 · | 2024-04-12 00:29:57

It's always possible to 'deconstruct' objects with methods into just dumb structs with functions acting on them. This is the direction Julia takes for instance.

One advantage of this is that it exposes that 'inheriting' a method is really just applying the same base function to all classes that support a particular interface. You can override a method by specialising the function for a subinterface. This interface can just be a marker that says 'this struct has these fields, and semantically is an IMyInterface'.

Of course object oriented programming languages tend to encourage private and protected properties and such, which force all of this to take place inside the class. At first I thought that that was the best way to avoid a mess, but it prevents you from doing this, possibly leading to more code duplication. And after some more experience with python there's something to be said for python's approach of just using name conventions to point out when to be careful.
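
A rough Python approximation of that structs-plus-functions style, as a sketch only. It uses `functools.singledispatch`, which dispatches on the first argument alone (unlike Julia's multiple dispatch); the `Shape`, `Circle`, `Square` and `describe` names are hypothetical.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Shape:  # plays the role of the marker interface
    name: str


@dataclass
class Circle(Shape):
    radius: float


@dataclass
class Square(Shape):
    side: float


@singledispatch
def describe(shape) -> str:
    raise NotImplementedError(f"no describe() for {type(shape).__name__}")


@describe.register
def _(shape: Shape) -> str:
    # Base function: applies to every struct that is a Shape.
    return f"shape {shape.name}"


@describe.register
def _(shape: Square) -> str:
    # "Override" by specialising the function for a narrower type.
    return f"square {shape.name} with side {shape.side}"


print(describe(Circle("c", 1.0)))  # shape c
print(describe(Square("s", 2.0)))  # square s with side 2.0
```

`Circle` never mentions `describe`, yet it "inherits" the base behaviour simply because the base function accepts any `Shape`.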

saghm | karma 7579 | avg karma 2.55 · | 2019-11-08 16:46:30

That still doesn't preclude the possibility that within the module the function gets called with a field from another instance. I think the idea is that by making it a method that just takes a reference to self, it's impossible to accidentally mutate a field on a different instance, while taking a reference to the field itself doesn't prevent the programmer from accidentally calling it with the wrong instance of the field.

overgard | karma 10309 | avg karma 4.9 · | 2015-01-04 18:10:20+00:00

Python is my favorite language, but I think that explicitly having to pass self to methods was a huge mistake. It's a gotcha that catches even experienced programmers, the error message is confusing and unhelpful, it's verbose, and the one case it allows (calling the class directly instead of the instance?) is not generally that useful. It makes some metaprogramming stuff slightly cleaner sometimes, but that's not a great justification.
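
For readers who have not hit the gotcha, a minimal reproduction with a hypothetical `Greeter` class. The exact wording of the message varies a little between Python versions, so the sketch only captures it rather than promising specific text:

```python
class Greeter:
    def greet(name):  # bug: the self parameter was forgotten
        return f"hello {name}"


g = Greeter()
error = None
try:
    g.greet("world")  # the instance is passed as the first argument anyway
except TypeError as exc:
    error = str(exc)

print(error)
# Calling through the class happens to work, because nothing is bound:
print(Greeter.greet("world"))  # hello world
```

The error counts one more argument than the caller wrote, which is the confusing part: the call site shows one argument, the message says two were given.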

falsedan | karma 2127 | avg karma 1.33 · | 2017-09-04 10:50:40+00:00

Most other OO languages can pass self implicitly, and don't distinguish between initialization & construction.

> There is something of a culture in Python of not being overly clever

This is the part I disagree with. So much python I read in apps, libraries, web frameworks, and pythonista tweets is needlessly clever.

WesolyKubeczek | karma 3085 | avg karma 2.17 · | 2021-05-15 15:28:32+00:00

I can't get used to the fact that for builtin types to grow methods, you need to import the corresponding modules everywhere.

You import module X, which gets you objects as defined in the module Y. To use methods defined for that type, you need to import Y yourself. Python, of course, by virtue of binding objects to methods, doesn't need it.

There are some more namespacing quirks that may require one to give up on that sweet syntactic sugar to disambiguate things (two unrelated modules defining methods on a single type, with the same signature but different behaviors, and you need both modules imported for some reason?), but this is, again, something one would have to get used to. It's not a different world like Rust or Prolog.

It's not a complete showstopper, just something I keep bumping my head into now and then.

eatonphil | karma 21581 | avg karma 5.52 · | 2015-03-16 18:51:34+00:00

Then isn't this pattern the epitome of the philosophy?

At some point you trade an api for being explicit. Why isn't memory management explicit in python? I just don't agree with the (seemingly arbitrary) choice of making self explicit. It really does not feel like the "Python" way.

the_mitsuhiko | karma 13985 | avg karma 3.97 · | 2013-02-12 15:00:20+00:00

> Classes hold state. More state adds more complexity, and typical OO design add lots of layers of indirection which also add complexity. Most of the time, you don't need the flexibility that the extra indirection gives you. So you get complexity for little benefit.

Functions hold state as well, they just encapsulate it. You can still break up your algorithm into a class with multiple template methods and then encapsulate it in a function if you're afraid of leaking state elsewhere.

What I see instead is that people can't get rid of state and put it as global variables into Python modules. Case in point: pickle. It uses sys.modules and pickle.dispatch_table to store some of its state.

> If you don't need to hold state, then a collection of functions will usually suffice. In Python, you can use modules for this.

Classes have another advantage: virtual method calls. If one function in a module calls into another function in the module I can only do two things: a) copy/paste the library and change one of the functions or b) a monkeypatch which modifies a shared resource someone else might want to use with the old semantics.

A class I can subclass and change the behavior for each method separately.
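
A minimal sketch of that point with hypothetical `Exporter` classes: because `export` calls `self.format_row`, a subclass can swap out one step without copying the library or patching shared state.

```python
class Exporter:
    def export(self, records):
        # Calls through self, so subclasses can replace format_row.
        return "\n".join(self.format_row(r) for r in records)

    def format_row(self, record):
        return ",".join(str(v) for v in record)


class TabExporter(Exporter):
    def format_row(self, record):
        return "\t".join(str(v) for v in record)


rows = [(1, "a"), (2, "b")]
print(Exporter().export(rows))     # comma-separated rows
print(TabExporter().export(rows))  # tab-separated rows, export() untouched
```

With two module-level functions, the inner call would be fixed at the module's global name, so changing it means rebinding that global for every caller.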

jakobnissen | karma 1828 | avg karma 6.44 · | 2022-01-25 00:21:32

Right, that's true. I think structs are inevitable, and classes are the way to get structs in Python.

But yes, the entire OOP-classes with namespaced methods and "self", that's not inevitable.

im3w1l | karma 8737 | avg karma 1.51 · | 2014-05-09 21:23:43+00:00

I recently made a class implementing an interface from a library in python. That interface had a "canonical" implementation, in that most instances of that interface were of that class. When I tried to call some functions from that library with my own class, they tried to call methods from the "canonical class", which were neither present in the interface nor my implementation.

Result: program crashed when trying to call not present method. Had to clear up a few of these before it would work. If the type system hadn't been so forgiving the library implementers would have realized how their abstractions were leaking, and it would have been a lot easier for me to implement the interface correctly.
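
A sketch of the kind of leak described, with entirely hypothetical names (`Store`, `DictStore`, `MyStore`, `dump`), since the actual library is not named. The library function relies on a method that only the "canonical" class has:

```python
from typing import Protocol


class Store(Protocol):
    """The documented interface: only get() is promised."""

    def get(self, key: str) -> str: ...


class DictStore:
    """Stand-in for the 'canonical' implementation."""

    def __init__(self) -> None:
        self._data = {"a": "1"}

    def get(self, key: str) -> str:
        return self._data[key]

    def keys(self) -> list:  # extra method, not part of Store
        return list(self._data)


class MyStore:
    """A third-party implementation that follows the interface exactly."""

    def get(self, key: str) -> str:
        return key.upper()


def dump(store: Store) -> list:
    # Leaky: relies on keys(), which Store never promised.
    return [store.get(k) for k in store.keys()]


print(dump(DictStore()))  # ['1']
try:
    dump(MyStore())
except AttributeError as exc:
    print("crash at runtime:", exc)
```

At runtime nothing complains until `dump` meets `MyStore`. A static checker run over the library would be expected to reject `store.keys()` inside `dump`, since `Store` declares no such method, which is the earlier feedback the comment is asking for.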

p4wnc6 | karma 2251 | avg karma 2.53 · | 2016-03-27 20:47:59+00:00

This is nothing new at all, it is called the "Fluent Interface" design. Many Python libraries implement this design, for instance the Pandas library is a popular example.

I hate this design approach deep in my soul. It makes for very brittle code that creates lots of backward compatibility issues. If you're working on some legacy code that has some nonsense like

    foo.get_status().dispatch_handler().log_error().close()

it is maddening! You have to untangle just what exactly gets returned by every step of the chain, so that you can ensure you're in the right context to know exactly what the next call of the chain is doing.

In that example, say someone changes `foo.get_status()` to return some new kind of "status" object, and it alters the `dispatch_handler` and so on. Of course one can implement this in a way where the chain of downstream calls doesn't break, but the point isn't so much that, through huge engineering effort it is possible, but rather that it is extremely brittle and adds a layer of complexity that's not needed.

It's just so much better to write something like:

    dispatch_result = run_dispatcher(foo.get_status())
    log_error(dispatch_result)

When the intermediate points of the chain are just functions, instead of member functions of a class, it means you can easily experiment with them and figure out what's going on without needing to recreate the entire set of context along the whole chain.

`run_dispatcher` in my example would be a hell of a lot easier to unit test and throw some mocked example class into for debugging or refactoring than if it is `some_class.run_dispatcher` ... and then if `some_class` has child classes that specialize the behavior, you're just hosed.

The problem is composability. People think that the fluent interface makes things composable because from some arbitrary point in the middle of the chain of calls, they have easy attribute-like access to the next operation they want to do. This artificially feels easy and convenient.

But contrast this to a functional language like Haskell, where none of these things need to be member functions of an object, and hence the context of the object doesn't have to be created at any point in the fluent chain. Then you can write something even better:

    (close . logError . dispatchHandler . getStatus) foo

We can even easily refer to this whole chain of events with a single function name:

    let statusDispatchLog = (close . logError . dispatchHandler . getStatus)

(And, of course, we get lots of nice type checking in statically typed languages to ensure that the composition actually makes sense -- which not only protects you at run time, but is also a huge help to clue you in to your design flaws. If you're trying to shoehorn some stuff into a fluent interface and it's not working, it probably means you haven't thought clearly about how the methods should "flow" in the call chain.)

To do the same thing in a fluent interface, we need a horrible lambda or a whole new function definition, exactly because the fluent interface is only sweeping the composability issues under the rug.

    statusDispatchLog = lambda x: x.get_status().dispatch_handler().log_error().close()

The difference is subtle, but important. Instead of making a new function that is explicitly the composition of other functions, you are making a function that just happens to access other functions as attributes, and if you set it up correctly then it acts as a sequence of composition.

In Python this is particularly a shame because functions are first class objects. Of course, you can write helper functions / decorators that sort of do function composition (if you're willing to throw away useful argument signatures), or you can use flaky hacks like the common Infix pattern in Python, and then live with ugly "<< . >>" or "|.|" misleading syntax.

It always makes me sad that Python lacks an extremely short function composition infix operator that provides some information about the function signatures of the functions being composed.

Because not even a comprehension can help you when you need to do the fluent interface stuff in Python.

    [x.h().f().g() for x in some_iterator]

This is so much worse than

    map (g . f . h) someIterator

or

    [(g.f.h) x | x <- someIterator]

or even

    [g(f(h(x))) for x in some_iterator]
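
For reference, a minimal sketch of the kind of helper mentioned above. `compose` is a hypothetical name, not a standard-library function, and as noted it throws away the argument signatures of the functions it combines:

```python
from functools import reduce


def compose(*funcs):
    """Right-to-left composition: compose(g, f, h)(x) == g(f(h(x)))."""

    def composed(x):
        # Apply the last function first, then work back towards the first.
        return reduce(lambda acc, fn: fn(acc), reversed(funcs), x)

    return composed


def h(x):
    return x + 1


def f(x):
    return x * 2


def g(x):
    return x - 3


pipeline = compose(g, f, h)
print(pipeline(4))                     # g(f(h(4))) = (5 * 2) - 3 = 7
print(list(map(pipeline, [1, 2, 3])))  # [1, 3, 5]
```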

regularfry | karma 8115 | avg karma 1.96 · | 2012-09-08 18:23:50+00:00

Class methods needing decorators really bugs me too. The problem goes back to explicit self. The explicit self, as currently implemented, carries no information: yes, you get to pick "self" as the instance name, but everyone picks "self" for that and There's One Way To Do It in Python anyway. If you could declare a class method like this:

    class Foo(object):
      def class_method(Foo):
        pass

then suddenly explicit self gains a meaning (Hey! Instance method!) and class method declaration doesn't go through a completely unrelated mechanism.

>> methods only become class methods when a decorator is used

> That's because the decorator is basically a flag to tell the attribute-resolution process to handle this method differently from the normal method-resolution process.

Do you see how that's just special-casing method declaration in a similar way to Ruby's mechanism, just relying on the programmer to drive the mechanism himself? To a not-so-casual observer it looks like Guido wasn't willing or able to change enough semantics to make a syntactic fix work for class method declaration, despite there being an obvious space in the syntax for it to fit in, so leaned on a convenient implementation detail - a hack - to get the same effect with more work for the programmer.

Of course, it's equally possible that he believes that class methods are a code smell and should be explicitly made ugly, which I could have a certain sympathy for, but to me the current situation just looks really inelegant.
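
A small sketch of the current behaviour being criticised, with a hypothetical `Foo`: the name of the first parameter carries no meaning, and only the decorator changes what gets bound.

```python
class Foo:
    def instance_method(self):
        return self

    @classmethod
    def class_method(cls):
        return cls

    def undecorated(cls):  # naming the parameter "cls" changes nothing
        return cls


foo = Foo()
print(foo.instance_method() is foo)  # True
print(foo.class_method() is Foo)     # True: the decorator binds the class
print(foo.undecorated() is foo)      # True: still receives the instance
# The decorator stores a wrapper object in the class namespace:
print(type(Foo.__dict__["class_method"]).__name__)  # classmethod
```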

AnkhMorporkian | karma 1518 | avg karma 3.86 · | 2015-03-16 19:15:11+00:00

Memory wise it's certainly much better in classes than in a naive implementation via these. Every time you create a new pseudo-class you'll be attaching all methods over and over again, eating up way more memory than the shared methods between python classes. You can get around this by defining the functions outside the class definition itself, but at that point it feels like it's defeating the purpose.
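
A sketch of the difference, assuming "these" refers to closure-based pseudo-classes (the parent comment is not shown, so that reading is an assumption). `make_counter` and `Counter` are hypothetical names:

```python
def make_counter(start):
    """Closure-based pseudo-class: new function objects on every call."""
    state = {"value": start}

    def increment():
        state["value"] += 1
        return state["value"]

    def get():
        return state["value"]

    return {"increment": increment, "get": get}


class Counter:
    def __init__(self, start):
        self.value = start

    def increment(self):
        self.value += 1
        return self.value


a, b = make_counter(0), make_counter(0)
print(a["increment"] is b["increment"])  # False: one function object each

c, d = Counter(0), Counter(0)
print(c.increment.__func__ is d.increment.__func__)  # True: shared function
```

The per-object cost in the closure version is the function objects and their closure cells; the compiled code object behind each inner function is still shared between `a` and `b`.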

mehrdadn | karma 286 | avg karma 0.06 · | 2020-08-24 00:42:47

No I'm not saying I would rather it be implicit. That can have other adverse repercussions. My goal here was just to identify the problem precisely, because (as you can clearly tell from the thread) it is anything but obvious, even for people who've been using Python for years. I can imagine other ways to mitigate it that don't have dangerous side-effects (like warnings), but there are lots of approaches, and how to solve it is a bit more subjective and warrants a longer discussion.

The pitfall is only part of the problem. The problem is much deeper than that. In Python, the solution lies in the base class, which another author may have written in a totally different time/place. Yet the problem only manifests itself when you, another poor soul, try to derive from it later. This kind of spooky action at a distance pierces abstractions, which is just about the last thing you should want from a programming language... and in a sense it literally violates causality (for the lack of a better word). The poor soul that notices this in multiple-inheritance will have to go very much out of their way to work around it: after spending a while trying to track it down (and understanding what's even going on, which is not easy), they'll have to either monkey-patch the original class at runtime, or introduce superfluous wrappers. That is a very high price to pay, and that's on top of the silent failure that led your code to crash and burn.

But anyway, I was just illustrating that Python can be quite complicated (even more so than C++ in some ways) and has flaws despite its simple and straightforward appearance, and it can catch even experienced developers completely off-guard. Just answering the question you asked, basically.
