* The Mill build tool looks a lot better than SBT, but it seems everyone is still using SBT
* Scala minor versions are binary incompatible, so maintaining Scala projects is a big pain. Upgrading Spark from Scala 2.11 to Scala 2.12, for example, was a massive undertaking.
* Scala has tons of language features and lets people do crazy things in the code. Hard to win technical arguments with Scala geniuses that like using complicated language features.
* Scalatest is still used by most projects and is annoying to use, as described here: https://github.com/lihaoyi/utest#why-utest. The overuse of DSLs in Scala is really annoying. Too many DSLs is another example of something I consider an antipattern, but there is no Scala community consensus on the responsible use of DSLs.
I'm optimistic about Scala. There are some folks that love the language and are continuously improving the ecosystem. Scala 3 will have to sell a better story about ditching legacy tooling and giving users a better default stack if it wants to compete with modern Go/Rust/Python.
> Hard to win technical arguments with Scala geniuses that like using complicated language features.
I was on a team building good old CRUD apps using monads, monoids, categories, combinators, cats-effect, shapeless, and a bunch of other nonsense that I've now purged from my brain.
You could've easily mistaken our team for a programming language research group at a university.
This is literally the number one reason I would never choose Scala. There is no upper bound to the amount of stupidity one can indulge in. Our code was so convoluted that even IntelliJ had trouble understanding what the hell was going on and would spit out compilation errors when there were none.
I now work with a Clojure team, and I feel a sense of relief, a weight taken off my head, from all the useless stuff I had to learn and use.
I'm on exactly that kind of team now. They've been working on a very simple problem for several years now and have an enormous, sophisticated, but still buggy and unstable solution.
They're really smart people, and they had to be extremely capable programmers to get this far, but the embarrassing question is, how would a team of mediocre developers have tackled this problem? They would have picked a mediocre language and probably written a naive, straightforward solution, and after a few months of squashing bugs and developing production workarounds, it would have been a workable, tolerable system. They would have spent the last year working on something else.
The naive solution would have more inherent unreliability, but our vast and intricate solution, designed to scale to the moon and be more self-healing than the T-1000, is never going to approach the same reliability as a naive solution, because it's too complex (and the code is too hard for mortals to read and reason about) for us ever to iron out the bugs.
They make a huge deal out of how much safer and easier to reason about our code is thanks to the FP discipline, but why do we have just as many concurrency-related production issues as a typical team using blocking operations and threadpools? Why do we have _more_ incorrectness caused by swallowed errors than I came to expect in projects that relied on exceptions?
I'm still keen on mastering this style of Scala, because I think the benefits can be had, but it annoys me that some of my teammates are happy to use those stated benefits to justify their programming adventures without seeming to care if they actually materialize.
"I'm on exactly that kind of team now. They've been working on a very simple problem for several years now and have an enormous, sophisticated, but still buggy and unstable solution.
They're really smart people, and they had to be extremely capable programmers to get this far, but the embarrassing question is, how would a team of mediocre developers have tackled this problem? They would have picked a mediocre language"
I don't mean to be rude to your colleagues, but almost by implication in what you've said, they're not good programmers. Being a good programmer has nothing to do with leveraging a fancy FP language. It's about delivering working, maintainable, testable code. Whether you write buggy spaghetti code in Go or misuse Scala, it's the same root problem.
I've been a Scala dev for five years, worked on some massive projects, and it's been a dream. As long as you agree on a style, don't go crazy with the language, and use it with discernment, it's a joy to work with. It's a sharp tool, though, and it takes judgment to know how to wield it appropriately. I don't know if that's a downside; it's more a warning.
I think people can be good programmers and poor software architects at the same time. Sounds like that's what has happened in this case: brilliant abstractions that are misused and result in a buggy product and less productive team.
"I think people can be good programmers and poor software architects at the same time. Sounds like that's what has happened in this case: brilliant abstractions that are misused and result in a buggy product and less productive team."
That's fair. I guess if by good programmer we mean fluency with the language and algorithms/abstractions/types etc., then perhaps they qualify. I really meant something more like "software engineer", i.e. someone who can architect a simple, sane, working solution adhering to the usual best practices of good software construction. Maybe there's a missing piece between being a good coder and scaling this knowledge up to the application level. The latter is indeed a much rarer skill, and too often, people without it influence major decisions.
I would guess that it's mainly a question of experience. When you have smart kids fresh out of school, they are going to be eager to use every tool in their toolbox. Eventually (hopefully) they will develop the wisdom to know when to deploy what.
Maybe the issue is that Scala requires a bit more wisdom than some languages. The problem is that the profession is filled with inexperienced people because of the rate at which new programmers are appearing. In that world you either want safe and, dare I say it, lowest common denominator languages, or you need more hierarchy, where senior people take a stronger lead and set the rules. In other words, if you're going to use Rust, Scala, Haskell etc., acknowledge the tradeoffs and potential footguns and insist that someone who knows what they're doing leads the project in a very hands-on way. Properly mentored juniors or mid-level devs will have that moment of enlightenment where they see the point of these languages and don't just succumb to the blub paradox https://en.wiktionary.org/wiki/Blub_paradox.
The issue I've seen is that the help venues are overwhelmingly filled with language-"extremists" that have too much time and lead beginners and those looking for help into the rabbit hole. A bit like people that spend more time telling people to TDD-everything than coding...
Frankly, they are bad programmers. They are highly intelligent people who lack the knowledge, the aptitude, or the willingness to be good programmers.
My pet peeve in this business is supposedly good programmers who get praised despite never having actual results. And it's not like they're rare.
Unfortunately the leetcode whiteboard interviewing strongly encourages this type. Even almost requires it. If all energy is spent on that, it is time not spent learning how to build stable products which can be maintained long term.
Because it turns out that popping one algorithm exercise after another into source control doesn't produce a good product. But that's what the current interview fad measures, so that's what it produces.
True, and with people switching jobs every 12 to 24 months (at least in the Bay Area) many of them never get to see the outcome of their choices, and don't learn from experience.
The companies in turn don't realize the outcome of their choices either and just consider the inefficiencies to be the normal state of things... I've never seen productivity as low as in large companies with a lot of churn... Thankfully they have good people who can hold up and patch the walls, the real unrecognized "heroes"...
I have the same feeling with Kotlin. If you stay away from overusing DSLs and over-FP-ing things (translation: using Arrow, or most of it), it is beautiful and easy to maintain and understand, even by junior devs.
And my feeling is the same with Clojure and Scala, I'm not a specialist, but code can be written in such a way that I can even contribute to it.
I am fairly certain the verdict is in on this question:
It's always better to write simple, almost dumb code that anyone can understand than to use advanced abstractions from category theory or whatever.
Why? Because every single study or anecdote I've ever heard is like yours: adding complex abstractions to code does NOT make it more reliable. But it does make it much harder to modify, understand, and fix.
FP taught us that unconstrained mutable state is very bad, and that has helped us immensely in the mundane languages of the world (Java, C#, C++, Go). But I think it's time to learn the other lesson from FP: overly complex abstractions are also very bad and should NOT be used willy-nilly, as they simply have a very low or negative cost-benefit ratio.
Finding the right level of abstraction is basically the one and only skill that matters for a software engineer IMO. I've got about 15 years of experience in programming and it's still something I'm actively working on.
I disagree with the general take that "adding complex abstractions to code do NOT make it more reliable". Good use of abstraction makes code easier to write, easier to maintain and easier to extend. Good use of abstraction can make code more concise and more regular while at the same time allowing better diagnostics when you do something wrong.
As a quick example: a generic JSON serialization library that can work with any type is probably a lot nicer to use than one that requires manual reimplementation for every class in your program. It'll be more complicated to write, but it's probably well worth it in the end. Similarly, a logging system that can abstract over several backends and log levels depending on the environment probably beats having a bunch of if/else every time you want to log a message.
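The generic-serialization point can be sketched with a typeclass. Everything below (JsonEncoder, toJson) is an illustrative name, not any particular library's API:

```scala
// A tiny typeclass: one generic entry point can serialize any type that
// has an instance, instead of a hand-written serializer per class.
trait JsonEncoder[A] {
  def encode(a: A): String
}

object JsonEncoder {
  implicit val intEncoder: JsonEncoder[Int] = i => i.toString
  implicit val stringEncoder: JsonEncoder[String] = s => "\"" + s + "\""
  // Instances compose: a List encoder falls out of the element encoder.
  implicit def listEncoder[A](implicit e: JsonEncoder[A]): JsonEncoder[List[A]] =
    xs => xs.map(e.encode).mkString("[", ",", "]")
}

def toJson[A](a: A)(implicit e: JsonEncoder[A]): String = e.encode(a)

// toJson(List(1, 2, 3)) == "[1,2,3]"
```

The abstraction costs a little up front, but every new type only needs a local instance rather than its own serialization routine.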
But one needs to remember the old saying: debugging code is harder than writing it, so if you write code that's as smart as you're capable of producing then by definition you're not smart enough to debug it.
You are completely right: finding "THE RIGHT" level of abstraction is extremely important. Without abstractions we would be writing code in Assembly.
This is the reason I used the term "overly complex abstractions", not just "abstractions" in general, of course. Overly complex meaning a failure to use the right abstraction: you instead go for something much more complex than the problem called for... this is what I understand most people in this thread are referring to when they describe previous Scala projects they've worked on.
When I was young I thought that good writers are the ones who write sentences so complex that nobody can understand them. Later I learned that good writers are clear and understandable.
I used to write complex code because I found it elegant. Now I write simple code because I enjoy breaking down a complex problem into its simplest form, which for me means I fully understand the problem.
I agree 100%, and I always bring it back to one simple observation that has been true of almost all the projects I've worked on: the limiting resource that programmers work with is their own time and brainpower. To my mind, much of the discipline of programming is centered around making frugal use of that scarce resource.
I still think programmers should work with powerful tools, because "dumb" tools often force you to express things in convoluted ways, and you don't need a powerful language to make a huge mess. (Java proves that you only need single inheritance to produce virtually unlimited amounts of spaghettiness in real-world projects. The difference between Java and Scala is not that Scala enables teams to produce epic piles of FP crap, it's that we long ago stopped being shocked when teams produce epic piles of OO crap.)
I think the programmers doing terrible things with Scala would be doing terrible things in any language, given the chance. What they need is hands-on technical management. When they have a wild idea, they need to be walked through an appropriate engineering decision-making process. They need somebody with the authority to tell them, "Ha ha, yeah, that design with etcd and Kafka and the robot space lasers would friggin' rock, how cool would that be, but on the other hand, we could just stick a REST API in front of a database and you'd be done in two weeks, so let's compare pros and cons."
It does not matter how "simple" each line of code is.
What matters is how "simple" the whole application/solution is.
You can argue that you can write very simple code in Assembler - all instructions are very clearly defined and very simple (well, maybe not anymore in CISC...). Someone can learn them all in a day. The problem is just that now you end up with a lot of code and overall that will be hard to maintain.
Compare it to JavaScript: certainly not the language where code is most maintainable or simple, but the end result is easier to maintain than assembler.
In the end, the choice of programming language is like the choice of a compression tool such as gzip. The more complex the tool, the more difficult it is to learn and use (setting library size and various compression settings), but the output will be smaller compared to a simple tool or no compression (assembler), while containing the same amount of information.
I write lots of F#, and I find that computation expressions (a fancy version of do-notation) make my code more "dumb" and "simple", despite being an advanced FP feature. This is because they push the complexity out of my business logic and into library code.
With computation expressions:
    async {
        let! foos = fetchFoos
        for foo in foos do
            do! launchFoo foo
        do! clearFoos
    }
Without computation expressions, you'd have to write the same logic as explicitly nested binds and callbacks: more noise, same meaning.
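A hedged Scala analogue of the same contrast: a for-comprehension (Scala's rough equivalent of a computation expression) versus its hand-desugared flatMap form. fetchFoos, launchFoo, and clearFoos are stand-in functions:

```scala
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Stand-in effects for illustration only:
def fetchFoos: Future[List[String]] = Future(List("a", "b"))
def launchFoo(foo: String): Future[Unit] = Future(())
def clearFoos: Future[Unit] = Future(())

// With the sugar: business logic reads top to bottom.
def run: Future[Unit] =
  for {
    foos <- fetchFoos
    _    <- Future.traverse(foos)(launchFoo)
    _    <- clearFoos
  } yield ()

// Desugared: the same logic as nested flatMap/map calls.
def runDesugared: Future[Unit] =
  fetchFoos.flatMap { foos =>
    Future.traverse(foos)(launchFoo).flatMap { _ =>
      clearFoos.map(_ => ())
    }
  }
```

The sugar doesn't add power; it just moves the plumbing out of sight so the remaining code can stay dumb.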
You should always reach for the simplest solution first (and Haoyi has a great set of guidelines about that: https://www.lihaoyi.com/post/StrategicScalaStylePrincipleofL... ). The point of those complex FP abstractions is to let you keep writing dumb code even when you want to do something fiddly (such as async operations with error recovery). But don't cargo-cult the complex solution if you don't have the complex problem! A plain function whose result is a plain value should be expressed that way.
> adding complex abstractions to code do NOT make it more reliable.
Sometimes false. If the (more) complex abstractions are in well-tested libraries that haven't been written solely for your code.
> But [complex abstractions] do make [your code] much harder to modify,
Not necessarily. You can separate concerns; you can isolate changes; and you hopefully reduce the amount of code you've written yourself, significantly even.
> [complex abstractions] do make [your code] much harder to understand
This is as likely false as it is true.
> [complex abstractions] do make [your code] much harder to fix
Again, this very much depends. If you're using a well-tested, widely-used external library with those abstractions, it may well be easier to fix your own code.
> the other lesson from FP: overly complex abstractions are also very bad
Lisp, in its persistent failure to achieve world domination, has been teaching us this lesson for over 60 years now!
Infinitely powerful languages are fun and, yes, powerful. But as soon as the project has more than one or two people in it, they also result in runaway complexity that will hurt the product far more than any gains in expressibility.
From the client's point of view that may be true; abstraction should be the process of pushing complexity down. But in this case I think we're talking about code that has inadvertently introduced unnecessary complexity in an attempt to be hyper-generic.
I think it depends on whether you need it. Rust, for example, is quite complex, but it solves a class of issues that other languages have trouble solving, while maintaining speed.
Reading this thread is giving me some catharsis. I worked on a large project in a big tech company where the org’s leadership was ideologically determined to build everything in Scala. We ended up with dozens of engineers and failed to deliver basic functionality. The leadership eventually left the company and now they are running a startup with a few of the Scala savants from the old team. They got lots of funding and they invited me to visit to get me to join. They showed me the awesome type system they had implemented and tried to demo it but the product was broken in five ways.
I don't think it is a Scala problem but more of a leadership/organizational problem. If you look at Jane Street, they do everything in OCaml, which is an interesting choice to say the least, but their stuff works, evolves, and they can demonstrate it...
Yeah, I don’t think Scala itself caused the problem. There was this mindset: “if we just properly model the universe in the type system, the solution will just fall out!”
Because that was the obsession, they ended up spending all their time iterating on this perfect type system and not enough on the product. I think people with that issue are more likely to want to use Scala.
I am continually perplexed at the swallowing of exceptions in certain FP communities.
Result types, Either, et cetera make it extremely easy to swallow errors by flatMapping thoughtlessly, losing the context of where they occurred - you typically /want/ the call stack when you hit an exceptional flow.
As I am slowly acquiring the habits of thought that make it possible to read and write FP code at a reasonable speed, I'm horrified by how much idiomatic Scala FP code relies on projecting certain assumptions onto types and operations that seem, at first glance, to be neutral mathematical abstractions.
For example, I find myself hating the Either type, because I feel like there is a socially established convention that one half of the Either is the type that matters, the value that you want, the value that is the point of the computation you're doing, and the other half is a garbage value that should never be directly handled. So I really feel like I should conform to the convention and reserve Either for cases where one of the possible types doesn't matter. But how often is it true that one side of the Either doesn't matter? People want me to encode success/failure in an Either type, but if I do that, are they going to treat failure with the care it deserves?
I often handle Either (and Option) using pattern matching when I feel it's important to give both code paths equal importance and equal visibility in the code, but people change it because flatMap is supposedly more idiomatic, and they believe that eliminating pattern matching from their code is a sign of sophistication.
I feel like this stems from a strong desire among FP folks for the happy path to be the only one visible in the code, and the non-happy path to work by invisible magic. Maybe there are some brilliant programmers who achieve this by careful programming, but there are people mimicking them who seem to rely more on faith than logical analysis. They just flatMap their way through everything and trust that this results in correct behavior for the "less important" cases.
I'm sorry that this turned into a bit of a rant, but I'm entirely fed up with it, and it accounts for a lot of what I dislike about the code I work with on a daily basis.
This was great, actually.
I don't program in Scala, but it was very interesting to hear about the difference between types as abstractions vs types as they are used.
For unfamiliar topics or when presented with uncommon insight, I believe rants, monologues, even diatribes are actually some of the best things to read.
> For example, I find myself hating the Either type, because I feel like there is a socially established convention that one half of the Either is the type that matters, the value that you want, the value that is the point of the computation you're doing, and the other half is a garbage value that should never be directly handled. So I really feel like I should conform to the convention and reserve Either for cases where one of the possible types doesn't matter. But how often is it true that one side of the Either doesn't matter? People want me to encode success/failure in an Either type, but if I do that, are they going to treat failure with the care it deserves?
There's always a tradeoff between making the happy path clear and making the error handling explicit. The whole point of Either is to be a middle ground between "both cases are equal weight and you handle them by pattern matching" (custom ADTs) and "only the happy path is visible, the error path is completely invisible magic" (exceptions). Given that people in Python or Java tend to use exceptions a lot more than they use datatypes, I'd argue that a typical Scala codebase puts more emphasis on actually handling errors than a typical codebase in other languages.
Where each case really is of equal weight, consider using a custom datatype (it's only a couple of lines: sealed trait A, case class B(...) extends A, case class C(...) extends A) rather than Either.
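Spelled out, the custom-datatype alternative really is just a couple of lines; Found and Missing are illustrative names:

```scala
// A closed ADT where both outcomes carry equal weight, so callers
// pattern match and must handle both paths with equal visibility.
sealed trait LookupResult
final case class Found(user: String)     extends LookupResult
final case class Missing(reason: String) extends LookupResult

def describe(r: LookupResult): String = r match {
  case Found(user)     => s"found $user"
  case Missing(reason) => s"not found: $reason"
}
```

Unlike Either, nothing here suggests that one constructor is the "real" result and the other is garbage to be threaded through.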
I've been working through 'Haskell From First Principles', and it turned on a lightbulb: the 'right' half of Either is the 'important' one because its type variable is free to change.
    instance Functor (Either a) where  -- a is fixed here!
        fmap :: (b -> c) -> Either a b -> Either a c
        fmap _ (Left l)  = Left l
        fmap f (Right r) = Right (f r)
As a general rule, the last type parameter of a type carries special significance: the same applies to Tuples.
You _can_ trivially construct a type where the two labels are swapped; it's just labels, Left and Right aren't intrinsically important, except insofar as they reflect the positions of the type arguments in written text.
This is also the reason we have the convention in Scala as well - the inference to partially apply the type works in a certain way. But I agree with the parent post, a more descriptive name would be better.
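The convention is visible directly in Scala too: since 2.12, Either is right-biased, so map and flatMap act on the Right and pass a Left through untouched. A minimal illustration:

```scala
// Right-biased Either (Scala 2.12+): map transforms the Right,
// while a Left short-circuits and is returned unchanged.
val ok: Either[String, Int]  = Right(2)
val err: Either[String, Int] = Left("boom")

val doubledOk  = ok.map(_ * 2)   // Right(4)
val doubledErr = err.map(_ * 2)  // Left("boom") - the Left is untouched
```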
Right; ostensibly you could create a language that lets you easily poke holes in any slot of a type, but I'm not sure you necessarily /gain/ a lot in doing so except for confusion. It would take a lot more convolution to specify types and instances for every function application.
> I often handle Either (and Option) using pattern matching when I feel it's important to give both code paths equal importance and equal visibility in the code, but people change it because flatMap is supposedly more idiomatic, and they believe that eliminating pattern matching from their code is a sign of sophistication.
Isn't this the same as letting exceptions bubble up in a non-FP language?
You don't necessarily lose the stack trace. Typically the left side of an either is an Exception (or an error ADT that wraps one). When you want to handle the left case, you can log out the full trace as you would without Either.
The Monad instance for Either means that chaining them together with flatMap has a short-circuiting effect and the first failure will stop the rest of the chain from being evaluated. I find this actually makes it easier to know where your errors are happening, and also allows you to centralise your error handling logic.
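A small sketch of that short-circuiting; parse and halve are made-up helpers:

```scala
// Each step can fail with a Throwable on the Left.
def parse(s: String): Either[Throwable, Int] =
  try Right(s.toInt)
  catch { case e: NumberFormatException => Left(e) }

def halve(n: Int): Either[Throwable, Int] =
  if (n % 2 == 0) Right(n / 2)
  else Left(new ArithmeticException(s"$n is odd"))

val good = parse("42").flatMap(halve)   // Right(21)
val bad  = parse("oops").flatMap(halve) // Left(NumberFormatException): halve never runs
```

The first Left wins, so a single recovery point at the end of the chain sees whichever failure occurred first.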
Sure - you can use an implicit srcloc or capture the Exception, both of which preserve it; but that's not the default behaviour, and it's not what we recommend to beginners.
If you go onto the scaladoc for Either today, you see a stringly-typed Either where they discard the Exception.
Hmm, when I first learnt Scala I didn't have very advanced FP knowledge, so I'm yet to have first-hand experience with this sort of exception handling, and I'm yet to decide how good it is.
Compared to Haskell, it is probably better in some ways because you have the proper stack trace; but it "feels" a bit impure...
In a way, Java's exceptions are already an Either type over the result type and the thrown Exception (with "auto-decomposition", unless you use checked exceptions) -- are the advantages, like manual control over when mapping/flatMapping happens, worth it in your opinion?
Nonetheless thanks for the heads up, I might try out Scala again with the exception handling model you mentioned!
It's supposed to work like that, but it's a lot easier to screw up. Smart people screw it up all the time, and it's hard to spot in code review, whereas average programmers have no problem avoiding swallowing exceptions once they realize it's important, and if they do mess up it stands out like a sore thumb in the code.
You shouldn't swallow errors like that in FP either. You can trap the error in an effect type and throw it at the top level of your app when your effect type gets run, after all of the code that processes that error value has had a chance to recover. See something like ZIO for an example, though you can do similar things without ZIO.
If the team uses ZIO, that is - or maybe they've gotten lost in the religious war of Scalaz vs Cats vs ZIO... One of the major issues I have with Scala is how fragmented the community is. Everyone has an opinion on how something should be done, and everyone thinks that everyone else is wrong.
Well, you have extremists who believe that you have to cull the community over a few bad (but still contributing) apples. Additionally, you have a big company, Lightbend, that produces some questionable contributions to the community as if they were the "Scala-endorsed way".
Don't get me started on the recent push of "let's be Python-like" initiatives.
We use error types that extend Exception specifically for the purpose of getting a call trace. A lot of people hate doing this, but it's been extremely helpful on our project.
We do this too - but the other problem you get is that Exception subtypes aren't treated specially (the way Any and Nothing are) in the inference hierarchy, so errors are liable to "collapse" to Throwable or Exception. I like wrapping them in a new type where possible to stop this behaviour.
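One hedged way to combine the two ideas - stack traces from Exception plus a closed hierarchy that resists collapsing toward Throwable - is a sealed base error class. All names here are illustrative:

```scala
// Errors extend Exception (so they carry a stack trace when thrown or
// logged), but share a sealed base, so inferred error types meet at
// AppError rather than widening all the way to Throwable.
sealed abstract class AppError(msg: String) extends Exception(msg)
final case class NotFound(id: String)   extends AppError(s"not found: $id")
final case class Invalid(field: String) extends AppError(s"invalid: $field")

def lookup(id: String): Either[AppError, String] =
  if (id.isEmpty) Left(NotFound(id)) else Right(s"user-$id")
```

Sealing also means pattern matches on AppError get exhaustiveness checking, which plain Exception subclasses don't.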
It makes the code impure (that is, with side effects). One can also argue that if you can encode the potential errors in the type, they are not exceptional.
It's not that bad though, because you get much less of these kind of errors than in a typical Java program (for example).
Yes, it is a side-effect - and does make the code impure; but it's an /exceptional/ flow like OOM, infinite looping, stack overflows which also break equational reasoning.
My argument is that as a programmer you /choose/ the base lemmas you're comfortable with - with an exception you're saying that a large swathe of your code will assume a lemma. When it /does/ break, you get to point the finger at it with a stack trace saying: here is where the lemma was broken. As an aside, there are nicer ways to do this a la contracts, but exceptions aren't a /bad/ way.
The equivalent of Try(...).toOption is catching everything and discarding it. FlatMapping without adding surrounding context is equivalent to catch-and-rethrow. Both of these are /much/ too common in FP codebases.
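Concretely, in scala.util.Try terms:

```scala
import scala.util.{Failure, Success, Try}

// The swallowing pattern: toOption throws away the exception,
// its message, and its stack trace entirely.
val n: Option[Int] = Try("oops".toInt).toOption  // None - the cause is gone

// Keeping the Failure preserves the exception for logging or rethrowing:
Try("oops".toInt) match {
  case Success(v) => println(v)
  case Failure(e) => println(s"parse failed: ${e.getMessage}")
}
```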
This is spot on! Way too often in Scala codebases, which tend to wrap IO calls in Future or similar, you'll have chains of side-effecting futures, one flatMapped into the next.
And this all gets passed up the call stack to one generic thing that basically does nothing in the case of error - maybe logging it, and that's it. It's tempting to write code like this because it's so easy and looks so clean and readable. But in reality it is not how you build reliable software, because what one must do in response to a failure from the first call is not the same as what one must do for a failure of the second call, or third, or fourth. The flatMap().flatMap().flatMap() style seduces the programmer into thinking they've handled errors, when really they've just lumped the entire workflow into one big chain that can fail at any point, and in practice usually just ignores all error cases.
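A hedged sketch of the kind of chain being described, with hypothetical service calls:

```scala
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Made-up side-effecting calls, named for illustration only:
def callServiceA(x: Int): Future[Int]  = Future(x + 1)
def callServiceB(x: Int): Future[Int]  = Future(x * 2)
def writeToDb(x: Int):    Future[Unit] = Future(())

// Reads cleanly, but a failure at any step surfaces only as one
// undifferentiated failed Future at the end of the chain:
def workflow(x: Int): Future[Unit] =
  callServiceA(x)
    .flatMap(callServiceB)
    .flatMap(writeToDb)
```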
One strategy I find does slightly mitigate this is to use future.transformWith[S](f: Try[T] => Future[S]). This at least presents to you the opportunity to think about error handling a bit more consciously, but I still find that the chaining operator approach nudges you away from thinking about error handling because it becomes quickly difficult to read.
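For reference, transformWith hands you the full Try, so both branches are in view at that step of the chain; callService here is a made-up stand-in:

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
import ExecutionContext.Implicits.global

def callService(x: Int): Future[Int] = Future(x + 1)

// transformWith forces an explicit decision for both outcomes:
def step(x: Int): Future[Int] =
  callService(x).transformWith {
    case Success(v) => Future.successful(v)
    case Failure(e) => Future.failed(new RuntimeException(s"service failed for $x", e))
  }
```

Wrapping the cause when re-failing at least records which step of the workflow the error passed through.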
It's probably to do with how limited the control structures are when dealing in Futures. You don't have while/if-else/etc; you must lift all your control flow into flatMaps, and iteration is only doable via recursive calls with explicit accumulator state (yuck!) and nesting. So whereas a properly thoughtful treatment of error cases in synchronous code may take 20 or 30 lines of legible code, in the async style this gets transformed into 20 or 30 levels of nested callbacks and recursive calls, which becomes unreadable.
Amen. I saw exactly the same problem with over-reliance on flatMap, either explicitly or in for-expressions, in Future-heavy Akka code that I now see in cats and cats-effect code.
People love the unifying abstractions underneath these types, and they love developing instincts about how to write code based on them, but from an application programming point of view, their instincts are often counterproductive. I don't think people take a mathematical enough viewpoint. The abstractions can't tell you what is important and unimportant. They can't tell you how your code should be shaped. They can't tell you which values should be transformed further and which should be short-circuited. They can't tell you which values deserve to be given a name for readability and which values should be anonymous.
"All these values are monads, so I can combine them with a for expression" is a meaningless statement of a trivial mathematical fact, not a clue about how you should write your code.
I think error handling code can be written in a straightforward style, but it isn't as pretty as people would like. I think the trap they fall into is holding onto elegance while they reach for correctness, instead of holding fast to correctness while they reach for elegance.
That's a great observation about for-comprehensions. They're another one of these cute niceties that can come in handy in the most basic cases but again nudge users away from consciously thinking through what they're doing (good luck handling the error path in a for-comprehension over Futures).
I dunno. You see the same pattern with chained class methods in Python and it suffers from the same exception specificity problems, but doing it is still a fair judgment call. Sometimes all you need to know is that there was an error in the chain. Like if an exception is raised by dataframe.transpose().to_dict().values() I don’t ask myself where in the chain the error occurred, I just think, “oh that’s weird why couldn’t I turn that into a list of dicts? Did I call those methods on the wrong object?”
Your point's well taken. I'd say that there is a bigger issue in the case of flatMapping side-effectful monads because they're typically dealing with things like writing/reading from databases and the error cases should be thoughtfully considered (do I have to roll something back or perform compensating actions, or clean up anything?)
Doesn't Scala have transformer libraries like ExceptT in Haskell? As an aside, much of what this comment thread is about is similar in Haskell-land as well.
Anyway, a pattern I've started to use in Haskell is ExceptT with targeted error handling along the way.
So the code still remains legible (minus the plethora of Haskell operators, data wrappers, and unwrappers :)), I can trap and handle individual exceptions along the way, and at the end unhandled exceptions are returned by the function as a Left.
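A rough Scala analogue of this ExceptT pattern, sketched with cats' EitherT over cats-effect's IO; AppError, Timeout, and the step functions are all illustrative:

```scala
import cats.data.EitherT
import cats.effect.IO

sealed trait AppError
case object Timeout  extends AppError
case object BadInput extends AppError

// Made-up steps; stepTwo always "fails" with Timeout for illustration.
def stepOne: EitherT[IO, AppError, Int] =
  EitherT.rightT[IO, AppError](1)
def stepTwo(n: Int): EitherT[IO, AppError, Int] =
  EitherT.leftT[IO, Int](Timeout: AppError)

// Individual errors can be trapped mid-chain with recover; anything
// left unhandled comes back as a Left when the IO is eventually run.
val program: EitherT[IO, AppError, Int] =
  for {
    a <- stepOne
    b <- stepTwo(a).recover { case Timeout => 0 }  // handle this one here
  } yield a + b
```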
where each method just returns Unit or throws an exception? The version with IO/Future is superior since it at least explicitly states that you can get an error here. If you want to say that Go-style error handling is better because it forces error handling, I can kind of buy it, but it also has some cost.
>You don't have while/if-else/etc
Maybe you are looking for ifM/whileM functions from e.g. here: typelevel.org/cats/api/cats/Monad.html.
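For readers without cats handy, a minimal sketch of what `ifM` does, specialized to Option (cats defines it generically for any Monad):

```scala
// ifM: a monadic if-else. The condition itself lives inside the
// effect (here Option), so branching composes with flatMap instead
// of requiring a bare Boolean.
def ifM[A](cond: Option[Boolean])(ifTrue: => Option[A], ifFalse: => Option[A]): Option[A] =
  cond.flatMap(b => if (b) ifTrue else ifFalse)

println(ifM(Some(true))(Some("yes"), Some("no")))            // Some(yes)
println(ifM(None: Option[Boolean])(Some("yes"), Some("no"))) // None
```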
The call stack is "magic" built into the language, so as soon as you're not using the "blessed" way of error handling, you lose it and need to rebuild the same functionality "by hand".
I agree that some kind of logical call stack is a very useful thing to have, and I'd recommend implementing something along the lines of https://github.com/lancewalton/treelog that provides it.
How do you mean? If your Either turned Left, you should be stuck with that exact Left, and no amount of flatMapping should affect your stack; it simply doesn't kick in unless you added something like leftMap that swallows the exception.
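A quick demonstration of that short-circuiting:

```scala
// Once an Either is Left, flatMap does not run: the first error
// is carried through the rest of the chain untouched.
val start: Either[String, Int] = Left("boom at step 1")

val end = start
  .flatMap(n => Right(n + 1))
  .flatMap(n => Right(n * 2))

println(end) // Left(boom at step 1)
```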
> how would a team of mediocre developers have tackled this problem?
In my experience, mediocre developers would stick to a simple language like Python and only use simple Python features... and write a completely impenetrable, 100% coupled and wholly unreadable mess that would make you wish for over-engineered Scala. Don't even get me started on what happens with "simple" distributed computing tools like Hive...
I see people complain about languages like Scala leading to over-engineering and I just don't get it: have you not worked with sub-par Python or JavaScript code? Sure, poorly designed complex solutions have problems, but so do under-abstracted codebases! Bad code is bad code, and I don't think it's meaningful to say that "over-engineering" is categorically worse than "under-engineering".
I actually worked with somebody who wrote pathologically over-complicated Haskell. Had some friction between us over that. His Python code? Somehow even worse.
No serious language I've seen at either end of the abstraction-friendly spectrum (not Haskell/Scala nor Go/Java/Python/etc) puts a meaningful floor on code quality. An inexpressive language can have a low ceiling for how good code is, but restrict or simplify the language however you like and people will still happily write utterly unworkable code.
I've found that languages—especially less common languages thanks to the "nobody gets fired for IBM" effect—get used as scapegoats for not talking about cultural issues. If people on the team are writing clearly poor, over-complicated and bug-riddled code, it means there is some sort of technical leadership shortcoming, and it would not be any better if they were using Java instead. (I mean, have you seen Enterprise™ Java™ codebases?) And, at the same time, it's clear that a Scala codebase with tasteful technical direction and leadership—conveyed through team culture, code reviews, shared expectations... etc—can be absolutely great to work in.
At this point, I'm leery of blaming languages for programming problems without a clear mechanism. "The language lets people do bad things" isn't a real mechanism, because all languages let people do bad things. "The language attracts people who like complexity" seems specious too. (Again, let's not blame languages for hiring and cultural problems!) "The language's defaults incentivize poor code" is a better argument, although it can be a bit fuzzy. And arguments like "language A doesn't have capability B that we need" or "language C allows classes of bugs other languages don't, which empirically cause problems" are the most compelling of all.
>I was on a team building good old crud apps using monads, monoids, categories
Here's the thing, if you look at a basic Spring crud app, you are also using Monads, Monoids, Categories, Traverse, etc. but you aren't expressing it in the type system. Seriously go look at modern Spring's flux stuff, it's all there minus the type classes.
I've seen teams that tried to over engineer Spring, teams that tried to over engineer Node applications, and yes there are teams that over engineer Functional style Scala.
All the teams I worked with (and managed) at Verizon leaned pretty heavily into FP style Scala without much over engineering and the experience was extremely pleasant. The only production issues in my three years there I remember were performance related, finding out how to get more throughput or lower latency out of FP Scala. I literally can't remember any 'bugs' that made it to production.
Yeah, I'd say that Spring is the Java equivalent of the same problem (over-engineering), just using reflection and runtime bytecode generation instead.
I agree that using Monads, Monoids, etc. isn't necessarily indicative of over engineering in itself. If used well they can make the code clearer/simpler.
I'm postulating that any modern CRUD app framework has Monads, Monoids, and other categorically inspired structure, even if they don't call it that or have a way to abstract over it.
If it isn't called that and doesn't have a way to abstract over it, there's a very good chance that many of the logical Monads/Applicatives/Monoids/etc. are subtly not implemented as those things, resulting in equally subtle bugs.
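A toy illustration of the kind of subtle bug: a "combine" that type-checks like a Monoid operation but breaks the associativity law, so anything that assumes it can regroup operations (parallel folds, incremental aggregation) silently gets different answers:

```scala
// Averaging two averages is NOT associative, so this is not a
// lawful Monoid combine, even though the signature looks like one.
def combine(a: Double, b: Double): Double = (a + b) / 2

val leftFold  = combine(combine(1.0, 2.0), 4.0) // (1.5 + 4) / 2 = 2.75
val rightFold = combine(1.0, combine(2.0, 4.0)) // (1 + 3) / 2   = 2.0
println(leftFold == rightFold) // false: regrouping changes the result
```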
In the Spark world, you can use a tiny subset of the Scala features and enjoy huge productivity gains over the other language APIs (Java & Python). Those productivity gains are wiped out as more crazy language features get used.
I don't think the super complex language features should be removed. Li's libs do some crazy stuff under the hood, but provide a clean, Python-like public interface. Most devs aren't that good and complex underlying implementations leak and yield complex public interfaces.
Lots of folks would love Scala codebases that only use 10% of the available language features and none of the complex frameworks. But, like you mentioned, it's a hard language to use responsibly.
I think the trick to using Scala is to read the high level/key features of the language (case classes/pattern matching/using Option/Some/None, type aliases etc) and basically stick to them. A good yardstick for what are those features is the Scala Book[0]. This is what I read before I started to actually code in Scala.
That coupled with side-effect free functional style (as much possible without overstretching) uses the power of Scala and keeps the code concise and readable. That is how I use it and find it very pleasurable to write in Scala compared to Java.
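A sketch of what that subset looks like in practice (names illustrative): case classes for data, pattern matching for control flow, Option instead of null:

```scala
// A small, boring slice of Scala.
case class User(name: String, email: Option[String])

def contact(u: User): String = u.email match {
  case Some(addr) => s"${u.name} <$addr>"
  case None       => s"${u.name} (no email)"
}

println(contact(User("Ada", Some("ada@example.com")))) // Ada <ada@example.com>
println(contact(User("Bob", None)))                    // Bob (no email)
```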
> Lots of folks would love Scala codebases that only use 10% of the available language features and none of the complex frameworks. But, like you mentioned, it's a hard language to use responsibly.
Does this have the same problem in large languages such as C++ though? Where everyone thinks there's an optimal subset of the language, but no one agrees on what that optimal subset is?
For me, Scala is a multi-paradigm language with tons of features. The community just needs different types of style guides for different apps. vars are horrifying for the functional crowd, but cool with me for example.
On the other hand Kotlin devs can also be really conservative in what they let creep into the language. But that's what I like about the language, it doesn't try to please everybody and that create a somewhat more rigid frame in which I'm more comfortable expressing myself.
This is my biggest gripe with Scala as well. I'm sure it's a great language, but I've often seen it used to add complexity where none is necessary. It's abused to give developers a sense of accomplishment and intellectual superiority.
In fact, in one project my former employer was involved in, one main reason they picked Scala was to, on the one hand, weed out the chaff from their existing team of .net developers (in a "shape up or ship out" kind of fashion), and on the other to weed out the 95% of mediocre Java developers.
>...I've often seen it used to add complexity where none is necessary.
This reminded me of a great blurb about Niklaus Wirth's approach to languages:
>Wirth’s philosophy of programming languages is that a complex language is not required to solve a complex problem. On the contrary, he believes that languages containing complex features whose purposes are to solve complex problems actually hinder the problem solving effort. The idea is that it is difficult enough for a programmer to solve a complex problem without having to also cope with the complexity of the language. Having a simple programming language as a tool reduces the total complexity of the problem to be solved. [0]
Scala is a very simple language at its heart. Unfortunately that empowers developers to write really complicated libraries, leaving application developers in much the same place as if they were using a language with complicated features. (Most application developers can't even tell the difference - most complaints about "Scala is a complex language" turn out to be "I was using a complex library in Scala")
> You could've easily mistaken our team for a programming language research group at a university.
I had the misfortune of working with basically this same team, on scala (of course).
I argued for maintainable simple technology but scala was too hip to pass, for many.
Still remember one meeting trying to decipher a bug and while reviewing the code in question the lead scala fan said "It is not reasonable to expect to understand what code does by looking at it. I'll need to go research this for a few days."
I hope to never see scala (or its ilk) ever again. Give me the most stable language and environment, where every question is a FAQ and every odd behavior is documented in every book and I never need to fall into the rabbit hole of language traps. Then I can just work on building the product, which is the whole point.
> I was on a team building good old crud apps using monads, monoids, categories, combinators, effects cats, seamless and bunch of other nonsense that i've now purged from my brain.
Ah I had the same experience.
That this is allowed to happen just shows that putting naive non-technical managers in charge of coders is a disaster.
I don’t know of anyone at those companies using clojure and I haven’t seen any tech blogposts by those companies about using clojure. If clojure exists at those companies I imagine it’s a small very niche team.
One very public example of a Clojure shop through and through is NuBank out of Brazil, who employs some 700 "Clojure developers" according to them. In fact they are so committed to Clojure they bought Cognitect last year.
I have no gripe with Clojure, I take issue with someone claiming that Amazon, Apple, and Netflix have systems that operate at scale because of it. Clojure is a fine niche language, but it’s not the secret sauce to big tech web scale.
I was not claiming that, just that big companies also use it. It may not be used for the core of the business, but I'm convinced the services that are built with Clojure are reliable, because the language is solid.
> Hard to win technical arguments with Scala geniuses that like using complicated language features.
This is an interesting point. Perhaps part of the reason my current employer has been successful with Scala has been that we never were integrated into that part of the community.
We hire non-Scala programmers and they write Scala without any training, and it generally turns out OK and ends up converging in a pretty boring style without any fanciness
Yep, the Spark codebase is a great example of what a big Scala codebase should look like. Nothing crazy, not too many traits, mainly just "regular functions".
This is a false dichotomy. You'll run into all sorts of pain and frustration as soon as Spark touches your codebase, no matter what kind of Scala you're writing.
I've used almost every big data processing system out there, and Spark causes no more pain than any other. Less than most. It's also a data processing system, not a library, so if you're integrating it into existing code (rather than writing code for it and running code on top of it) I'd argue you're doing it wrong.
One big reason, maybe the biggest, to write Spark jobs in Scala and (not move to pyspark) is code reuse between different components. My team maintains a fairly sizeable library that is common to our Spark jobs and several web services. Spark is a library dependency if you do anything remotely complex with it and it can easily creep up everywhere if you aren't careful.
Decoupling modules isn't always obvious in a codebase that's grown organically for 6-7 years now (long before I joined), and the cohabitation with Spark is inevitably going to cause some pain. A couple of examples:
- Any library that depends on Jackson is likely to cause binary compatibility issues due to the ancient versions shipped with Spark. Guava can be a problem too. Soon enough you'll need to shade a bunch of libraries in your Spark assembly.
- We have a custom sparse matrix implementation that fits our domain well, it was completely broken by the new collections in Scala 2.13. It makes cross-publishing complicated if I don't want to be stuck to Scala 2.12 because of Spark.
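For the Jackson/Guava clashes, sbt-assembly shade rules are the usual workaround; a sketch along these lines (the target package prefixes are illustrative, and this requires the sbt-assembly plugin):

```scala
// build.sbt: rename clashing packages inside the fat jar so the
// old copies shipped with Spark don't collide with yours.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shaded.jackson.@1").inAll,
  ShadeRule.rename("com.google.common.**"     -> "shaded.guava.@1").inAll
)
```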
After a number of years with Python, I started a position working in Scala about 6 months ago. I really wanted to learn something new, and become acquainted with functional programming concepts.
I agree with most of the above. A couple of additional thoughts:
* sbt -- I still have a lot of coming up to speed to do here, but the manual is like 500 pages and it's somewhat overwhelming. There are tons of little oddities, like why can't I run `sbt --version` and instead have to do `sbt sbtVersion`?
* The functional side is fascinating -- I'm still studying the cats library. I can almost describe a Monad! It's a pretty big mountain, though, and there are times where I have doubts whether the benefits will be worth it. Would love to hear some re-assurance! ;)
* The ecosystem for microservices seems pretty closely tied to akka & lagom. These are quite complex in their own right and we've been having trouble with the latter in particular. Curious to learn about alternatives. ZIO?
* Re: DSLs. Also not a huge fan of DSLs. One refreshing thing about python is that often configuration can just be in Python itself (as in Django, for example). See also:
> sbt -- I still have a lot of coming up to speed to do here, but the manual is like 500 pages and it's somewhat overwhelming.
SBT is not worth it. Ignore it and use Maven.
> The functional side is fascinating -- I'm still studying the cats library. I can almost describe a Monad! It's a pretty big mountain, though, and there are times where I have doubts whether the benefits will be worth it. Would love to hear some re-assurance! ;)
You shouldn't use these things unless and until you need them. Cats is basically a library of techniques for letting you accomplish things that seem like they might need language features by instead writing plain old functions that return plain old values. If you use the fancy technique for the sake of using the fancy technique, you're putting the cart before the horse. You should use them where you'd otherwise have to use some weird language feature (exceptions, magic async, mutable variables...).
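A tiny illustration of "plain old functions returning plain old values" replacing one of those features, the mutable variable:

```scala
// Mutable-variable version: state lives outside the expression.
def sumSquaresVar(xs: List[Int]): Int = {
  var acc = 0
  for (x <- xs) acc += x * x
  acc
}

// Plain-value version: the accumulation is just a fold, a function
// from inputs to a value, with nothing to mutate.
def sumSquares(xs: List[Int]): Int =
  xs.foldLeft(0)((acc, x) => acc + x * x)

println(sumSquares(List(1, 2, 3))) // 14
```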
> * The ecosystem for microservices seems pretty closely tied to akka & lagom. These are quite complex in their own right and we've been having trouble with the latter in particular. Curious to learn about alternatives. ZIO?
Mostly you don't need anything too complex. I'd recommend using akka-http to start with, but don't use any akka proper - stick to the routing DSL level and use futures rather than actors. When you're more comfortable with the functional abstractions you can switch to http4s.
I learnt Scala before diving deeper into FP, but I think properly learning/understanding Monads and some other FP concepts greatly helps every sort of programming I do (even though it is mostly OOP nowadays). You come to notice them everywhere, and even if you can't abstract over them (there is no single `bind` or `join` function, so it may be called something different for each Monad instance), after understanding them you will gain better reasoning power over them.
> It's a pretty big mountain, though, and there are times where I have doubts whether the benefits will be worth it. Would love to hear some re-assurance! ;)
It’s not worth it, the entire language is a mountain of documentation and hard to understand concepts that just gets in the way of actually delivering product features for the business. It will help you to think differently about programming problems though.
Let's clarify some points for folks not so familiar with Scala.
> * Scala minor version are binary incompatible, so maintaining Scala projects is a big pain. Upgrading Spark from Scala 2.11 to Scala 2.12 was a massive undertaking for example.
Scala just chose a strange naming scheme; other languages would have increased their major version instead. The Scala minor version is bumped every few years, not every month or so.
> * Scala has tons of language features and lets people do crazy things in the code.
Actually, that's not true. Or rather: compared to what language?
Scala has surprisingly few language features, but the ones it has are very flexible and powerful. Take Kotlin for example. It has method extensions as a dedicated feature. Scala just has implicits which can be used for method extension.
> * Scalatest is still used by most projects and is annoying to use, as described here: https://github.com/lihaoyi/utest#why-utest. The overuse of DSLs in Scala is really annoying.
I agree with the overuse of DSLs. Luckily that got much better, but older libraries like scalatest still suffer from that.
> * Li's libs (os-lib, upickle, utest) have clean public interfaces, but most Scala ecosystem libs are hard to use, see the JSON alternatives for examples
I think that just comes from using the library in a non-idiomatic way. In most applications, you will need to use the whole json anyways, and then you use (or can use) circe like that:
{
  "id": "c730433b-082c-4984-9d66-855c243266f0",
  "name": "Foo",
  "counts": [1, 2, 3],
  "values": {
    "bar": true,
    "baz": 100.001,
    "qux": ["a", "b"]
  }
}

case class Data(
  id: String,
  name: String,
  counts: List[Int],
  values: DataValue
)

case class DataValue(
  bar: Boolean,
  baz: Double,
  qux: List[String]
)

import io.circe.generic.auto._
import io.circe.parser._
import io.circe.syntax._

// decode returns an Either; .toOption.get is just for brevity here
val data = decode[Data]("...").toOption.get
data.copy(name = data.name.reverse).asJson
Yeah, that is more code, but as I said, in the vast majority of projects you need all or most of the fields anyway, so all the structure definition is a one-time thing.
The advantage is that the last line is plain Scala code. You don't even need to understand the json-library to do transformations and re-encode into json.
> Scala just has implicits which can be used for method extension.
I don't think it's fair to say it like this. Scala's implicits can mean different things, depending on where they're used. Scala 3 even divides 'implicit' to multiple keywords.
(kind of 'static' in C++ I guess, only more complicated)
Scala has one feature (implicits) but it can be used ("mean") for different things.
Essentially, you can mark definitions as implicit and you can mark parameters as implicit. Yes, Scala 3 uses different keywords to make it easier to understand which is which, but both are still just the concept of things being implicit.
Think about it: one without the other is completely useless. If you cannot define implicit parameters, then marking any value as implicit will not have any effect. The other way around too: you can mark your parameters as implicit as much as you want, if you can't define implicit values, you will always be forced to pass all parameters manually.
Even implicit classes (extensions) are just syntactic sugar for regular methods that are marked implicit.
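The desugaring in question, roughly (class and method names are illustrative):

```scala
object Syntax {
  // An extension method via implicit class...
  implicit class StringShout(private val s: String) {
    def shout: String = s.toUpperCase + "!"
  }
  // ...is roughly sugar for a plain wrapper class plus an implicit
  // conversion method the compiler inserts at call sites:
  //   class StringShout(s: String) { def shout = ... }
  //   implicit def toStringShout(s: String): StringShout = new StringShout(s)
}

import Syntax._
println("hello".shout) // HELLO!
```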
I’m not a Scala programmer so I don’t know who is more right here, but _ai_ was saying that calling three different features by one name does not mean there’s really one feature. Which is different than saying one feature can be used in three different ways.
The C++ static example was used because in that case the same keyword was used for several literally different features to avoid adding additional reserved words.
> Scala has tons of language features and lets people do crazy things in the code.
In many ways, it's even worse than that. Scala has exactly two features (implicits and punctuation-free method calls) that allow you to build massively complicated libraries that pretend to be language features, and which work worse than equivalent features in languages that support them explicitly. A good example of this is typeclasses which are core features of haskell and rust, but are implemented by explicitly passing implicits (hah) in Scala.
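The encoding being described, in miniature: a typeclass in Scala is an ordinary trait, instances are implicit values, and constraints are implicit parameters the compiler passes for you (whereas Haskell's `Show a =>` or Rust's `T: Display` are dedicated language features):

```scala
// The "typeclass": an ordinary trait.
trait Show[A] { def show(a: A): String }

object Show {
  // An instance: an implicit value, found via the companion's
  // implicit scope.
  implicit val showInt: Show[Int] = new Show[Int] {
    def show(a: Int): String = s"Int($a)"
  }
}

// The "constraint": an implicit parameter the compiler fills in.
def describe[A](a: A)(implicit s: Show[A]): String = s.show(a)

println(describe(42)) // Int(42)
```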
Right. We don't have a full production 3.0 release yet, so I tend to think of "Scala" as Scala 2 unless it's specifically qualified as Scala 3.
Scala 3 seems to be a really good step in the right direction for me. There's a clear desire for many of the features that are currently achieved through burdensome hacks on top of implicits, and actually reifying several of them into core language features is a counterintuitive way to make things simpler rather than more complicated.
> Scala has tons of language features and lets people do crazy things in the code. Hard to win technical arguments with Scala geniuses that like using complicated language features.
I also like(d) a lot about Scala, but this is what ultimately led me to other languages. There are too many language features cranked together. This a) makes for a steep learning curve and b) exposes you to experts' code that is so dense and 'smart' that it's a PITA to decipher what it does.
b) you'll encounter especially when incorporating libraries that are not fully stable, and you have to track down bugs in them (if only to know if your code is at fault, or the library's code).
I had this with Akka when it was just released. One line of code and 2hrs of debugging to determine what it did, and if it was faulty or not.
I can agree with all your points (I wrote a fair bit of Scala, although I wouldn't identify as a Scala programmer).
Scala is a weird case, I would normally be the perfect fanboy for it, I like functional programming, did my fair share of SML, Haskell, Clojure. Also I am not at all dogmatic and can even find joy in writing Java.
However, Scala and I never got along well. It is - I think - too magic. I'd even prefer pure Java over it tbh.
And then, the community is fairly toxic, which also made me not want to stick around.
> The overuse of DSLs in Scala is really annoying. Too many DSLs is another example of something I consider to be an antipattern, but there is no Scala community consensus on the responsible use of DSLs.
Domain-specific languages are heavy users of the implicit keyword and implicit conversions; maybe that makes the code more concise, but it doesn't quite help with reading or understanding it. For me that's the greatest problem with Scala: figuring out what the code in front of me does.
The way I tend to think about it is that there are different axes of "complexity" in a language/ecosystem. Looking only at the language itself then yes Scala is quite complicated with a lot of language features (implicits, higher-kinded types, macros, ...) that are uncommon in other popular languages. But another axis to consider is "how many abstractions do I need to understand in order to grok a large codebase?" And that I think is where Scala shines. There are different ecosystems but if you look at something like cats/cats-effect then there is a core set of type classes on which everything else is built, and if you understand those core abstractions then understanding any codebase which uses them is very easy. And those abstractions are extremely flexible, to the point where they can be applied to an astonishing variety of problems.
Contrast that with something like Java which, while quite simple syntactically, has a huge and varied super-structure of incongruent abstractions on top of it. Every time I dive into a largish codebase of "enterprise Java" it feels like the engineers has a running bet on who could use the most GoF patterns in any given module.
I think you have something here. Scala is complex, but it lets you "push down" a lot of application-level complexity into the language, where it can be better managed and understood by tooling.
For example,
- Scala singleton objects add complexity footprint to the language, but reify a super common pattern in Java and ensure everyone does it the same way.
- Case classes add yet more complexity footprint, but again reify a common pattern that in Java-land is fragmented between Beans and POJOs and other things.
- Named and optional parameters add complexity over Java's simple method calling style, but subsume a whole zoo of builders, overloading, telescoping and other patterns used to work around their absence
Each of these features certainly makes the language more complex, but arguably at the same time they make user code more simple and boring
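The three features side by side, as a sketch (the Config/Request names are made up):

```scala
// Singleton object: no static-holder class, no getInstance().
object Config {
  val defaultRetries = 3
}

// Case class: equality, toString and copy for free -- no Bean boilerplate.
case class Request(
  url: String,
  retries: Int = Config.defaultRetries,
  verbose: Boolean = false
)

// Named and default parameters replace the builder/telescoping zoo.
val r = Request(url = "https://example.com", verbose = true)
println(r)                   // Request(https://example.com,3,true)
println(r.copy(retries = 5)) // Request(https://example.com,5,true)
```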
Good point! On the other hand, Java got some functional features with Java 8 (lambdas/streams/Optional chaining) that allow for less bloated code; I think nowadays it is (at least) possible to have a less design-pattern-heavy code style in Java.
True. Java has improved substantially since Java 8 (including something kinda sorta like case classes with record types in JDK 15) but I still think Scala is miles better. I don't think it is generally appreciated how essential higher-kinded types and typeclasses are to making functional programming tractable and safe. Perhaps that's because they are generally hidden away in libraries, and developers working on actual application code (i.e. 99% of Scala devs) never use them directly.
* Li's libs (os-lib, upickle, utest) have clean public interfaces, but most Scala ecosystem libs are hard to use, see the JSON alternatives for examples: https://www.lihaoyi.com/post/uJsonfastflexibleandintuitiveJS...