
Do you mean that these scaling problems happened because the ESB was misused? Or would they occur with any architecture or architectures (if different teams used different instruments), just because of the sheer volume of data that needs to be processed by business logic?

Maybe I'm bad at phrasing this, but the point is, sometimes you have performance problems because you could write better code, or use better tech, or tweak requirements a little bit, or do something else a little smarter. But sometimes you have performance problems simply because you do a lot of actual work.



I think these problems occurred because an ESB is too easy to over-use.

Once you have it, it's very tempting to use it for everything. But of course the more you use it, the more features it needs to have.

Let me illustrate that with an example compiling some of the patterns I have come to see at companies that overused their ESBs:

Say Bob is working at a trading company and sets up an ESB where he publishes the daily trades of the company.

Soon, the trading desk learns about that and decides it would actually be cool to use this for trading: they just have to listen on the ESB and send the trades accordingly. That's error pattern 1: making the ESB business critical.

The next day, the risk team learns about that ESB thing, and decides it's very handy to perform post-trading checks by just listening to the trades flowing on the ESB. So they set up a system that listens for trades on the ESB, checks that each trade is compliant with some limits, and sends another message on the ESB to let everyone know that this trade is validated. This is error pattern 2: cascading messages triggered from other messages.
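
To make error pattern 2 concrete, here is a minimal sketch, assuming a toy in-process bus. The `ToyBus` class, the topic names, the limit, and the extra `settlement` listener are all made up for illustration; none of this is a real ESB's API.

```python
from collections import defaultdict


class ToyBus:
    """Hypothetical in-process stand-in for an ESB (illustration only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of handlers
        self.published = []                   # (topic, message) in publish order

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        self.published.append((topic, message))
        # Deliver synchronously; handlers may publish again (the cascade).
        for handler in list(self.subscribers[topic]):
            handler(message)


bus = ToyBus()
LIMIT = 1_000_000  # assumed notional limit used by the risk check


def risk_check(trade):
    # Pattern 2: a message handler that reacts by publishing another message.
    status = "validated" if trade["notional"] <= LIMIT else "rejected"
    bus.publish("trades.checked", {"id": trade["id"], "status": status})


def settlement(check):
    # A second, hypothetical listener that chains off the risk team's message.
    if check["status"] == "validated":
        bus.publish("trades.settle", {"id": check["id"]})


bus.subscribe("trades", risk_check)
bus.subscribe("trades.checked", settlement)

bus.publish("trades", {"id": 1, "notional": 250_000})

topics = [topic for topic, _ in bus.published]
print(topics)  # ['trades', 'trades.checked', 'trades.settle']
```

One published trade turned into three messages, and neither Bob nor the risk team can see the whole chain from where they sit. That is roughly how the volume and the hidden coupling grow.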

The week after that, the security team learns about the ESB, and decides it's very insecure to let anyone see the trades of the company, so they start implementing access control on the ESB. This is error pattern 3: now you have an overly complex layer on top of the ESB to decide who can see what and who can publish what.

Rinse and repeat patterns 1, 2, and 3 for 5 years and here is the situation you end up with:

- The ESB is not the easy and handy system it was in the beginning. Since it has become the de facto standard for publishing information in the whole company, it has to support features for _all_ the company's use cases. There is access control to publish per topic, access control to listen per topic, and multiple bindings of varying quality for each technological stack/language that each team in the company is using. The company of course is not capable of / prepared for maintaining software of this scope, so the ESB is crippled by bugs that nobody can fix, because, you know, the infrastructure team cannot fix their Groovy scripts using the ESB because the guy who wrote them left. And the marketing team has some interns using the Excel plug-in, but they don't have time to rewrite them this year. The ESB is now partly unmaintained, because the company relied totally on it without having the capacity / willingness / foresight to understand how intricate it can be to update something that everyone uses.

- The ESB is now very slow, because it was so tempting to publish anything of any interest on it that everyone did. The problem is that the ESB is also critical for the company, so the whole flow of messages is now slow and overflowing, requiring endless tuning and attempts at scaling it better. Of course 80% of the messages on the ESB are actually not listened to by anyone, but since nobody really knows who listens to the published messages, it's very tempting to just _not_ stop programs from publishing, ever, because god knows if some random team at the other end of the company might have a program reading these messages.

- You most likely now have an IT team dedicated to maintaining the ESB. They are squeezed and pressured by the business teams to keep the ESB fast and stable without requiring them to recode all the crap they plugged into it. On the other side, the other IT teams are pressuring them to update the ESB to support <place your language/stack here>. Of course the ESB team has no incentive to make any improvement whatsoever to the ESB, because that would definitely crash most of the crap the less technical teams of the company plugged into it. But the ESB team is the de facto guardian of the temple of the ESB, so everyone ends up frustrated by the situation.

---

I'm not sure I did a good job of explaining the various problems here, but basically, the one-size-fits-all approach that ESBs promote is often not a future-proof choice.

The reality is that you don't want your whole company coupled to a single system like that. Otherwise your system will only be as good as its worst user.


FWIW, I worked at a trading firm that had a very tidy and long-lived ESB that I found to be a joy to work with.

-but-

My experience there 100% confirms what you say about it not being something that scales well.

It was a relatively modestly sized firm by headcount, as trading firms go, and there was a corporate culture of absolutely intense inter-team and inter-departmental communication. One absolutely would not dream of subscribing to an event stream without first talking to the team that maintained the program that published it. Any changes to the event stream - both the grotty little details about what was being published and how, and the grotty little details about what was being consumed and how - would be preceded by a discussion among all the people who were working with it, to make sure that everyone involved continued to have a complete picture of the interactions involving it.

That level of communication, which I do believe was essential to the bus's long term success, just wouldn't scale to a large company. Nor could it have been maintained at a company that had a more relaxed attitude in general. Nor could it have been maintained at a company where programmers are allowed to believe that most of their time at the office can be spent with hands on a keyboard.


> That level of communication, which I do believe was essential to the bus's long term success, just wouldn't scale to a large company

What would though? An ESB could be a way to enforce a standard and slow down gung-ho devs & teams using ad hoc solutions for every problem. But say an ESB is not the solution -- what is? As the company grows, overhead and friction become more important. I'm not sure "every piece connecting to every other piece in whatever way" would help with scale, rather than compounding the problem...


IMO, there is no software solution, because it's not a software problem, it's a social problem.

As far as specific things to try go, domain-driven design is my personal favorite off-the-shelf mental framework for dealing with these sorts of things. Especially the concept of bounded contexts. Embrace Conway's Law; recognize it's not a criticism, per se, it's also a scaling strategy.


Fair enough. I agree partially: it's a social problem. Thing is, software engineering deals with social problems too: those related to development.

I've never seen DDD successfully used in any company I've worked in, but that's probably a shortcoming of my own experience. (Likewise, I've never seen TDD or Agile or lots of things people often mention in their blogs successfully used. Again, this is probably my own problem!).

addendum: to be fair I've never seen a completely working ESB either. Always a plan to build or deploy one, never the finished thing ;)


I doubt it's your own problem. The thing about methodologies like DDD and Agile is that they're not just a development practice. They're also (in my opinion much more importantly) frameworks for how the entire company interacts with the dev team.

I see them fail more often than not, too, and one thing that's consistent about every failed implementation I've witnessed first-hand is that non-developers who are involved in or influence product development weren't engaged with, bought into, or properly trained in the framework.


You did an extremely good job of explaining what the risks are.

But if everybody in the company is using it, maybe it is worth having.

It must be well thought out from the beginning, with proper ACLs and notification of reception (so we know which messages other people listen to), but overall it does not seem so bad.

In your example, the trading desk would need either an intern to communicate the trades or some software developed ad hoc. Similarly for the second example: the data needs to be moved from the trading desk to risk analysis somehow. Either custom software or some human needs to do it.

For the third pattern, I agree that it should be baked into the ESB, but you need ACLs anyway. Either you let all the people in the company see the trades or you have ACLs implemented somewhere.
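
As a rough sketch of what "proper ACL and notification of reception" could look like, assuming a toy in-process bus: the `AuditedBus` class, its method names, and the topic and team names below are invented for illustration and do not correspond to any real ESB product.

```python
from collections import defaultdict


class AuditedBus:
    """Hypothetical bus with per-topic read ACLs and a subscriber registry."""

    def __init__(self):
        self.handlers = defaultdict(dict)       # topic -> {subscriber: handler}
        self.readers = defaultdict(set)         # topic -> names allowed to listen
        self.publish_counts = defaultdict(int)  # topic -> messages published

    def allow(self, topic, name):
        self.readers[topic].add(name)

    def subscribe(self, topic, name, handler):
        # ACL check: subscribers must be named and explicitly allowed.
        if name not in self.readers[topic]:
            raise PermissionError(f"{name} may not listen to {topic}")
        self.handlers[topic][name] = handler

    def publish(self, topic, message):
        self.publish_counts[topic] += 1
        for handler in list(self.handlers[topic].values()):
            handler(message)

    def listeners(self, topic):
        # "Notification of reception": who actually listens to this topic?
        return sorted(self.handlers[topic])

    def unheard_topics(self):
        # Topics that receive messages but have no listener at all.
        return sorted(t for t in self.publish_counts if not self.handlers[t])


bus = AuditedBus()
bus.allow("trades", "risk")

received = []
bus.subscribe("trades", "risk", received.append)

bus.publish("trades", {"id": 1})
bus.publish("metrics.cpu", 0.93)

print(bus.listeners("trades"))  # ['risk']
print(bus.unheard_topics())     # ['metrics.cpu']
```

Because every subscriber registers under a name, the question "can we stop publishing this?" has an answer, which is the part that tends to be missing once nobody knows who listens to what.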

I definitely can see the problems, but I can also see the benefits...


Thanks for the detailed explanation!

I'm not sure I understand why making the ESB business critical is an error. I mean, it's a key piece in the architecture of your system (say, like a database would be for an app that uses one). It makes sense that it's critical if it's central to your solution. It's also unsurprising if it requires a dedicated team to maintain and monitor, much like any piece of critical infrastructure would. Am I missing something?

What's the alternative? Multiple ad hoc, p2p, uncontrolled connections and streams between arbitrary components of your solution, of varying quality, many of which require maintenance of different kinds. This works for smaller software with fewer connections, but how does the effort scale as the system grows?


Don't get me wrong, buses, queues, pub/sub systems, etc. are definitely a good thing, and can quite elegantly and efficiently solve cross-process and cross-team needs.

But in the case of this discussion, we should not forget the "E" in ESB. These aim at being company-wide.

In the example I wrote above, we could have a bus for flowing the trades, the risk department could use a database to store their check results, the devops could use an ad-hoc time series database for their metrics, the interns with excel could just read plain files exported from other systems, etc.

The key here is that technological diversity inside a company is not a bad thing. Sure, it does not seem as "unified", but in the long run, it gives each team its own little realm of responsibility, room for standalone technological improvements and tech stack switches, etc.

I guess overall there is a fine equilibrium to find between having a totally uniform stack used by everyone, and having each team using drastically different tools. ESBs can make you fall into an extreme without realising it, because at first, it may look like a grandiose unification plan.


> I guess overall there is a fine equilibrium to find between having a totally uniform stack used by everyone, and having each team using drastically different tools. ESBs can make you fall into an extreme without realising it, because at first, it may look like a grandiose unification plan.

This totally hits the nail on the head. The problem with enterprise service buses is the same problem that every major unification plan runs into. It is usually executives looking for one ring to rule them all. This misses all the complexity that is going on in the business and deprives the teams that are actually solving those problems of the independence to solve them the best way.


> The key here is that technological diversity inside a company is not a bad thing

Agreed! I love working in companies where there is margin for teams to choose their own tech (within reason). It's just that it's hard to know where to draw the line. At one company I worked in, the infrastructure team had a terribly difficult time developing company-wide solutions within acceptable deadlines because the tech and standards were all over the place, the result of a policy of "anything goes, as long as it keeps the company running". This works until you reach a point where some rule must be enforced company-wide... and then chaos ensues. Rules can be anything company-wide: tests, privacy, backups, automated checks, disaster recovery, any kind of compliance, monitoring, etc.

Agreed about your overall point though, and that the ESB is an extreme.


Why not both though? What makes an ESB more onerous than, say, providing a microservice for other departments to use?


Implicit vs. explicit interactions/contracts. Both between consumers & producers but also...

The features, constraints, goals, etc. expected out of the ESB substrate by the different use cases become a nightmare of responsibility/blame shifting. Durability and replay? That's the ESB's problem. Load balancing? Yep, ESB. But it's screwing up the ordering guarantees! Well, make the ESB 'smarter' and we can just keep punting all of our problems to someone else. Etc.


Agreed, but that's one side of the coin. The other is that these guarantees and constraints are clearly located in one place (and one team), instead of spread across many teams and implemented with varying maturity and seriousness. I've often seen this happen when separate teams are responsible for separate microservices, implemented in random technologies.

For this to work, the ESB must be acknowledged as the critical piece of the architecture, and the team responsible must be empowered enough.


Thanks for such a detailed answer! I've never worked in enterprise, so this is very interesting.

However my question remains: is the ESB the source of the problem here? If your company works "like that" (as in, with all the circumstances and limitations you mentioned), would using different systems really bring a better outcome than an ESB?


I wonder whether this can explain many characteristics and issues of democratic, judicial, and totalitarian processes alike. In real life it is always a mix. The USA has a totally open public bus (Congress, the Senate, and the courts) but also a closed one (the three-letter agencies). China is mostly closed but open in its own way (with heavy sanctions). Just wondering.



