
Nothing particularly notable here. A lot of it seems to be 'We have something in-house designed for our use cases, use that instead of the standard lib equivalent'.

The rest looks very reasonable, like avoiding locale-hell.

Some of it is likely options that sand rough edges off of the standard lib, which is reasonable.





> We have something in-house designed for our use cases, use that instead of the standard lib equivalent

Yea, you encounter this a lot at companies with very old codebases. Don't use "chrono" because we have our own date/time types that were made before chrono even existed. Don't use standard library containers because we have our own containers that date back to before the STL was even stable.

I wonder how many of these (or the Google style guide rules) would make sense for a project starting today from a blank .cpp file. Probably not many of them.


For the containers in particular this makes a lot of sense because the C++ stdlib containers are just not very good. Some of this is because C++ inherited types conceived as pedagogic tools. If you're teaching generic programming you might want both (single and double) extrusive linked list types for your explanation. But for a C++ programmer asking "Which of these do I want?" the answer is almost always neither.

The specification over-specifies std::unordered_map so that no good modern hash table type could implement this specification, but then under-specifies std::deque so that the MSVC std::deque is basically useless in practice. It requires (really, in the standard) that std::vector<bool> is a bitset, even though that makes no sense. It sometimes feels as though nobody on WG21 has any idea what they're doing, which is wild.
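
To make the unordered_map point concrete, here's a minimal sketch of one such over-specification: the standard guarantees that references into the table survive rehashing, which effectively mandates node-based storage and rules out the open-addressing layouts that fast modern tables use.

    #include <unordered_map>

    int main() {
        std::unordered_map<int, int> m{{1, 10}};
        int& ref = m[1];    // reference into the table
        m.reserve(100000);  // may force a rehash...
        ref = 42;           // ...yet `ref` is guaranteed to remain valid,
                            // so elements can't be moved during rehash
    }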


Linked lists used to be more efficient than dynamic arrays — 40 years ago, before processors had caches.

Intrusive linked lists still firmly have a place in modern code, for reasons other than performance. I don’t know many good reasons for extrusive linked lists, even before caches. There might be a few, but a dynamic array is (and has always been?) usually preferable to an extrusive list.
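
A sketch of the intrusive pattern I mean (hand-rolled here for illustration; Boost.Intrusive offers production-quality versions):

    // The links live inside the element itself: no per-node allocation,
    // and an element can unlink itself in O(1) without knowing its list.
    struct Task {
        Task* prev = nullptr;
        Task* next = nullptr;
        int payload = 0;

        void unlink() {
            if (prev) prev->next = next;
            if (next) next->prev = prev;
            prev = next = nullptr;
        }
    };
    // An extrusive list (e.g. std::list<Task>) instead allocates a separate
    // node per element, which is one reason a contiguous std::vector<Task>
    // usually wins even before you consider caches.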

> I don’t know many good reasons for extrusive linked lists

for one, its iterators won't be invalidated when other elements are inserted or removed


I haven't benchmarked them myself yet, but the C++23 flat map containers are supposed to finally have fixed this. Chrome lists them as TBD: https://chromium.googlesource.com/chromium/src/+/main/styleg... .

When you say "fixed this", which "this" do you think they fixed? Are you imagining this is a hash table? It's not.

It's an adaptor which uses two other containers (typically std::vector) to manage the sorted keys and their associated values. The keys are kept sorted, and each value is stored at the corresponding position in its own separate std::vector. If you already have sorted data, or close enough, this type can be created almost for free, yet it has similar affordances to std::map. If you don't, it's likely you will find the performance unacceptable.
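
Roughly, the adoption path looks like this (std::flat_map is C++23, header <flat_map>; stdlib support may still be spotty, so treat this as a sketch):

    #include <flat_map>
    #include <string>
    #include <vector>

    int main() {
        std::vector<int> keys{1, 2, 3};  // already sorted and unique
        std::vector<std::string> vals{"one", "two", "three"};

        // Adopting pre-sorted data is almost free: two vector moves,
        // no per-node allocation, no tree rebalancing.
        std::flat_map<int, std::string> m(std::sorted_unique,
                                          std::move(keys), std::move(vals));

        auto it = m.find(2);  // lookup: binary search over contiguous keys
        (void)it;

        // Mid-container insertion shifts elements in both vectors; this is
        // where the performance can become unacceptable.
        m.emplace(0, "zero");
    }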


Don't use standard library containers because we have our own containers that date back to before the STL was even stable.

Flashback to last job. Wrote their own containers. Opaque.

You ask for an item from it, and you get back a void pointer to the item. You ask for the previous or the next item, and you hand back that void pointer (it then walks through the data to find that item again, so it knows where you want the next or previous from) and get a different void pointer. No random access. You had to start with a special function that gave you the first item and go from there.

They screwed up the end, or the beginning, depending on what you were doing, so you wouldn't get back a null pointer if there was no next or previous. You had to separately check for that.

It was called an iterator, but it wasn't an iterator; an iterator is something for iterating over containers, and this thing didn't have actual iterators either.

When I opened it up, inside there was an actual container. Templated, so you could choose the real inside container. The default was a QList (as in Qt 4.7.4). The million-line codebase contained no other uses; it was always just the default. They took a QList and wrapped it inside a machine that only dealt in void pointers and stripped away almost all functionality, safety, and any ability to use the standard algorithms.
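
A hypothetical reconstruction of the interface, for flavor (all names invented, not from the actual codebase):

    #include <cstddef>
    #include <vector>

    class OpaqueList {
    public:
        void Add(int v) { items_.push_back(v); }
        void* First() { return items_.empty() ? &end_ : &items_.front(); }
        void* Next(void* item) {  // linearly re-finds `item` before advancing
            for (std::size_t i = 0; i < items_.size(); ++i)
                if (&items_[i] == item)
                    return i + 1 < items_.size()
                               ? static_cast<void*>(&items_[i + 1])
                               : &end_;
            return &end_;
        }
        // Next() never returns nullptr; callers must check this separately.
        bool IsEnd(const void* item) const { return item == &end_; }

    private:
        std::vector<int> items_;  // the real container hiding inside
        int end_ = 0;             // sentinel standing in for "no more items"
    };

    int main() {
        OpaqueList list;
        list.Add(1);
        list.Add(2);
        // Typical traversal: O(n^2), type-erased, unusable with <algorithm>.
        for (void* p = list.First(); !list.IsEnd(p); p = list.Next(p)) {
            int value = *static_cast<int*>(p);  // caller must know the type
            (void)value;
        }
    }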

I suspect but cannot prove that the person who did this was a heavy C programmer in the 1980s. I do not know but suspect that this person first encountered variable data type containers that did this sort of thing (a search for "generic linked list in C" gives some ideas, for example) and when they had to move on to C++, learned just enough C++ to recreate what they were used to. And then made it the fundamental container class in millions of lines of code.


time to refactor the code base so this tumor can be deleted?

The complete refactor, bringing it forwards from VS2008 to VS2022, and from a home-built, source-code edited Qt 4.7.4 to Qt 6.something, took about two years from start to finish.

> home-built, source-code edited Qt 4.7.4

That's scarier than the container craziness you mention


> I wonder how many of these (or the Google style guide rules) would make sense for a project starting today from a blank .cpp file. Probably not many of them.

The STL makes you pay for ABI stability whether you want it or not. For some use cases this doesn't matter, and there are some "proven" parts of the STL that need a lot of justification to substitute, yada yada std::vector and std::string.

But it's not uncommon to see unordered_map substituted with, say, sparsehash or robin_map, and in C++ libraries creating interfaces that allow for API-compatible alternatives to the STL is considered polite, if not necessarily ubiquitous.
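
The polite interface usually amounts to templating on the map type, something like this sketch (absl::flat_hash_map is just a stand-in for any API-compatible alternative):

    #include <string>
    #include <unordered_map>

    // Callers can swap in an API-compatible hash map without any
    // change to this library's interface.
    template <template <class...> class Map = std::unordered_map>
    class SymbolTable {
    public:
        int Intern(const std::string& s) {
            auto result = ids_.try_emplace(s, static_cast<int>(ids_.size()));
            return result.first->second;
        }

    private:
        Map<std::string, int> ids_;
    };

    SymbolTable<> stl_table;  // default: std::unordered_map
    // SymbolTable<absl::flat_hash_map> fast_table;  // drop-in alternative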


The majority of things Chromium bans would still get banned in green-field use.

Some notable exceptions: we'd have allowed std::shared_ptr<T> and <chrono>. We might also have allowed <thread> and friends.


> I wonder how many of these (or the Google style guide rules) would make sense for a project starting today from a blank .cpp file. Probably not many of them.

That also depends on how standalone the project is. Self-contained projects may be better off with depending on standard library and popular third-party libraries, but if a project integrates with other internal components, it's better to stick to internal libraries, as they likely have workarounds and special functionality specific to the company and its development workflow.


I'd argue that the long-run optimum is to migrate to the standard version that everyone (e.g. new employees) knows, replacing the usually idiosyncratic (or even weird) home-grown flavour.

I know, I know, the long run does not exist in today's investor-dominated scenarios. Code modernization is a fairytale. So far I have seen no exception in my limited experience (across various codebases going back to the early 90's with patchy upgrades here and there, looking like an old coat fixed many, many times with patches of diverse sizes, materials, and colours).


When I led C++ style/modernization for Chromium, I made this argument frequently: we should prefer the stdlib version of something unless we have reason not to, because incoming engineers will know it, you can find advice on the internet about it, clang-tidy passes will be written for it, and it will receive optimizations and maintenance your team doesn't have to pay for.

There are cases, however, when the migration costs are significant enough that even those benefits aren't really enough. Migrating our date/time stuff to <chrono> seemed like one of those.


[flagged]


Look, I even share your language preference but this is still unnecessary.

Are there really any good reasons to start a brand new project in C++ though? No one who can write modern C++ has any trouble with Rust in my experience, and all the other common options are even quicker to pick up. Creating bindings isn't hard anymore if your niche library doesn't have any yet. Syntactic preference, I guess, but neither C++ nor Rust is generally considered an elegant or aesthetic choice.

Because "brand new" doesn't mean devoid of context. Within your domain, there will still be common libraries, interfaces, and tools.

C++ is very flexible, with a lot of very mature tooling and incredibly broad platform support. If you're writing some web server to run on the hardware of your choosing, then sure, that doesn't matter. But if you're writing something deeply integrated with platform/OS interfaces, or graphics, or needs to support some less common platforms, then C++ is often your only practical option for combining expressiveness and performance.


This is the sort of info I was trolling for, but what are those platforms and OSes? For targets LLVM doesn't handle, yeah, C++ makes sense, or C. A sibling mentions Xcode, which makes sense. Graphics seems questionable; Vulkan support is fine. Windows support has seemed fine too; the same GUI we wrote for Linux has worked there.

Dependencies. There are billions of lines of C++ out there that have been optimized and production-hardened over decades that you might want to reuse. Rust's interoperability with anything but C sucks in practice.

Unreal, Godot, CryEngine, DirectX, PlayStation, Switch, XBox, CUDA, SYCL, LLVM, GCC, V8,...

Yes, there are plenty of domains where Rust has zero ecosystem.

Not to mention that Rust advocates keep forgetting their compiler is partially written in C++ (LLVM/GCC).


Maybe, maybe not. But either way it's just plain rude to charge into a C++ thread to drop a comment saying how the language sucks and you should use (insert other language) instead.

Rust becomes a significant burden if you need a GUI or hardware-accelerated graphics.

C++ isn't much better for GUI.

C++ was the GUI king during the 1990's, and none of the Rust toolkits is half as good as the surviving frameworks, like C++ Builder, Qt, wxWidgets, heck even MFC has better tooling.

I assume most of them are just grabbing Qt

In addition to other reasons given: If you have a team of C++ developers, let them use the language they know best.

Yes. If you're targeting Apple platforms and want to allow clients to use your product in Xcode (the common case) or even need Swift/ObjC interop yourself, using rust or anything not explicitly supported by Apple and Xcode is just too fiddly.

Why not pick Swift in this situation over C++?

(Shrug) If I want Rust, I'll feed my C++ to an LLM and tell it to port it to Rust. Since we've been assured that Rust magically fixes everything that's wrong, bad, or unsafe about C++, this seems like a sound approach.

We probably aren't that far off, actually. Even taking asm with no symbols back into Rust works well: you have ground truth, so just have the agent iterate until the asm matches. It doesn't work on giant codebases, but on a few functions it absolutely does. And while the LLM may get the algorithm wrong, the type system does seem to help it generate useful Rust code as a starting place.

Yeah, but then just let the agent generate proper C++ code; unlike a human, it doesn't forget about best practices or how ownership is supposed to be handled.

Except the llm forgets about that in rust too, then the agent looks at the ownership errors from the previous iteration and fixes them.

You missed the other take: with AI-assisted coding, you can stay in C++, as it will take care that everything is coded with enough care.

Or why bother with Rust, when the LLM gets to generate C++ code with best practices.

While I like Rust, I think AI as the next abstraction step in programming has kind of taken its relevance away, when computer assisted programming is part of the workflow.


Yeah, good point, I don't know how I missed that possibility.

/s of course... for now, but not for long.


Strange. I wouldn’t trust the output of a coding agent and I would want stronger review of its output. If it passes a strict compiler that gives me more confidence than if it passed a lax one.

But sure, if you trust it to have written C++ to a higher standard than the experts, then go for it.


So not into the vibe coding hype taking over all our jobs?

It is what it is, I accept that’s where the industry is heading.

But if I have to produce reams of code I’d much rather have it be reviewed by rustc than clang. rustc may take longer to satisfy, but it’ll be worth it because I won’t be responsible for horrors in production.

You’re happy to be responsible for buffer overflows written by an LLM? I’m not, which is why I prefer a language where it’s not possible.


It's weird to me, as the former lead maintainer of this page for ten years or so, that this got submitted to both r/c++ and HN on the same day. Like... what's so exciting about it? Was there something on the page that caught someone's eye?

Somewhat notable is that `char8_t` is banned with very reasonable motivation that applies to most codebases:

> Use char and unprefixed character literals. Non-UTF-8 encodings are rare enough in Chromium that the value of distinguishing them at the type level is low, and char8_t* is not interconvertible with char* (what ~all Chromium, STL, and platform-specific APIs use), so using u8 prefixes would obligate us to insert casts everywhere. If you want to declare at a type level that a block of data is string-like and not an arbitrary binary blob, prefer std::string[_view] over char*.
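
A small illustration of the cast tax described there (names hypothetical):

    #include <string>

    void TakesCString(const char* s) { /* a typical platform-style API */ }

    void Demo() {
        const char8_t* u = u8"héllo";  // since C++20, u8 literals are char8_t
        // TakesCString(u);            // error: char8_t* won't convert to char*
        TakesCString(reinterpret_cast<const char*>(u));  // the obligated cast
        std::string s = "héllo";       // what ~all Chromium/STL APIs accept
    }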


`char8_t` is probably one of the more baffling blunders of the standards committee.

there is no guarantee `char` is 8 bits, nor that it represents text, or even a particular encoding.

If your codebase has those guarantees, go ahead and use it.


> there is no guarantee `char` is 8 bits, nor that it represents text, or even a particular encoding.

True, but sizeof(char) is defined to be 1. In section 7.6.2.5:

"The result of sizeof applied to any of the narrow character types is 1"

In fact, char and associated types are the only types in the standard where the size is not implementation-defined.

So the only way that a C++ implementation can conform to the standard and have a char type that is not 8 bits is if the size of a byte is not 8 bits. There are historical systems that meet that constraint but no modern systems that I am aware of.

[1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n49...
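
In code form, the guarantee and the non-guarantee:

    #include <climits>

    static_assert(sizeof(char) == 1, "guaranteed: char is exactly one byte");
    static_assert(CHAR_BIT >= 8, "guaranteed: a byte is at least 8 bits");
    // CHAR_BIT == 8 is NOT guaranteed by the standard itself; it merely
    // holds on every mainstream platform today (and POSIX requires it).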


Don't some modern DSPs still have 32 bits as the minimum addressable unit of memory? Or is that a thing of the past?

If you're on such a system, and you write code that uses char, then perhaps you deserve whatever mess that causes you.

char8_t also isn't guaranteed to be 8 bits, because sizeof(char) == 1 and sizeof(char8_t) == 1 by definition. On a platform where char is 16 bits, char8_t will be 16 bits as well

The C++ standard explicitly says that it has the same size, signedness, and alignment as unsigned char, but it's a distinct type. So it's pretty useless, and badly named.


Wouldn't it rather be the case that char8_t just wouldn't exist on that platform? At least that's the case with the uintN_t types; they are just not available everywhere. If you want something that is always available you need to use uint_leastN_t or uint_fastN_t.


It is pretty consistent. It is part of the C standard and a feature meant to make string handling better; it would be crazy if it wasn't a complete clusterfuck.

There's no guarantee char8_t is 8 bits either, it's only guaranteed to be at least 8 bits.

> There's no guarantee char8_t is 8 bits either, it's only guaranteed to be at least 8 bits.

Have you read the standard? It says: "The result of sizeof applied to any of the narrow character types is 1." Here, "narrow character types" includes char and char8_t. So technically they aren't guaranteed to be 8 bits, but they are guaranteed to be one byte.


Yes, but a byte is not guaranteed to be 8 bits, because on many ancient computers it wasn't.

The poster you replied to has read the standard correctly.


What platforms have char8_t as more than 8 bits?

Well, platforms with CHAR_BIT != 8. In C and C++, char (and therefore a byte) is at least 8 bits, not exactly 8 bits. POSIX does force CHAR_BIT == 8. I think the only place you still see anything else is embedded, and there only on some DSP- or ASIC-like devices. So in practice most code will break on those platforms, and they are very rare, but they are still technically supported by the C and C++ standards. Similarly, C still supported non-two's-complement architectures until 2023.

How many non-8-bit-char platforms are there with char8_t support, and how many do we expect in the future?

TI C2000 is one example

Thank you. I assume you're correct, though for some reason I can't find references claiming C++20 support in some cursory searches.

Mostly DSPs

Is there a single esoteric DSP in active use that supports C++20? This is the umpteenth time I've seen DSPs brought up in casual conversations about C/C++ standards, so I did a little digging:

Texas Instruments' compiler seems to be celebrating C++14 support: https://www.ti.com/tool/C6000-CGT

CrossCore Embedded Studio apparently supports C++11 if you pass a switch in requesting it, though this FAQ answer suggests the underlying standard library is still C++03: https://ez.analog.com/dsp/software-and-development-tools/cce...

Everything I've found CodeWarrior related suggests that it is C++03-only: https://community.nxp.com/pwmxy87654/attachments/pwmxy87654/...

Aside from that, from what I can tell, those esoteric architectures are being phased out in favor of running DSP workloads on Cortex-M, which is just ARM.

I'd love it if someone who was more familiar with DSP workloads would chime in, but it really does seem that trying to be the language for all possible and potential architectures might not be the right play for C++ in 202x.

Besides, it's not like those old standards or compilers are going anywhere.


Cadence DSPs have a C++17-compatible compiler and will have C++20 soon; new CEVA cores also (both are Clang-based). TI C7x is still C++14 (C6000 is an ancient core, yet it still got C++14 support, as you mentioned). AFAIR the Cadence ASIP generator will give you a C++17 toolchain, and C++20 is on the roadmap, but I'm not 100% sure.

But for those devices you use a limited subset of language features, and you would be better off not linking the C++ stdlib, or even the C stdlib at all (so junior developers don't have space for doing stupid things ;))


Green Hills Software's compiler supports more recent versions of C++ (it uses the EDG frontend) and targets some DSPs.

Back when I worked in the embedded space, chips like ZSP were around that used 16-bit bytes. I am twenty years out of date on that space though.


How common is it to use Green Hills compilers for those DSP targets? I was under the impression that their bread was buttered by more-familiar-looking embedded targets, and more recently ARM Cortex.

Dunno! My last project there was to add support for one of the TI DSPs, but as I said, that's decades past now.

Anyway, I think there are two takeaways:

1. There probably do exist non-8-bit-byte architectures targeted by compilers that provide support for at-least-somewhat-recent C++ versions

2. Such cases are certainly rare

Where that leaves things, in terms of what the C++ standard should specify, I don't know. IIRC JF Bastien or one of the other Apple folks that's driven things like "twos complement is the only integer representation C++ supports" tried to push for "bytes are 8 bits" and got shot down?


> but it really does seem that trying to be the language for all possible and potential architectures might not be the right play for C++ in 202x.

Portability was always a selling point of C++. I'd personally advise those who find it uncomfortable to choose a different PL, perhaps Rust.


> Portability was always a selling point of C++.

Judging by the lack of modern C++ in these crufty embedded compilers, maybe modern C++ is throwing too much good effort after bad. C++03 isn't going away, and it's not like these compilers always stuck to the standard anyway in terms of runtime type information, exceptions, and full template support.

Besides, I would argue that the selling point of C++ wasn't portability per se, but the fact that it was largely compatible with existing C codebases. It was embrace, extend, extinguish in language form.


> Judging by the lack of modern C++ in these crufty embedded compilers,

Being conservative with features and deliberately not implementing them are two different things. Some embedded compilers go through certification to be allowed for use in producing mission-critical software. Chasing features is prohibitively expensive, for no obvious benefit. I'd bet in the 2030s most embedded compilers will support C++14 or even 17. Good enough for me.


> Being conservative with features and deliberately not implementing them are two different things.

There is no version of the C++ standard that lacks features like exceptions, RTTI, and fully functional templates.

If the compiler isn't implementing all of a particular standard then it's not standard C++. If an implementation has no interest in standard C++, why give those implementations a seat at the table in the first place? Those implementations can continue on with their C++ fork without imposing requirements on anyone else.


Non-8-bit-char DSPs would have char8_t support? Definitely not something I expected, links would be cool.

Why not? Except that it is the same as `unsigned char` and can be larger than 8 bits

ISO/IEC 9899:2024 section 7.30

> char8_t which is an unsigned integer type used for 8-bit characters and is the same type as unsigned char;


> Why not?

Because "it supports Unicode" is not an expected use case for a non-8-bit DSP?

Do you have a link to a single one that does support it?


That's where the standard should come in and say something like "starting with C++26, char is always 1 byte and signed, and std::string is always UTF-8." Done, fixed Unicode in C++.

But instead we get this mess. I guess it's because there's too much Microsoft in the standard and they are the only ones not having UTF-8 everywhere in Windows yet.


char is always 1 byte. What it's not always is 1 octet.

you're right. What I meant was that it should always be 8 bits, too.

std::string is not UTF-8 and can't be made UTF-8. It's encoding-agnostic; its API is in terms of bytes, not codepoints.

Of course it can be made UTF-8. Just add a codepoints_size() method and other helpers.

But it isn't really needed anyway: I'm using it for UTF-8 (with helper functions for the 1% of cases where I need codepoints) and it works fine. But starting with C++20 it's getting annoying because I have to reinterpret_cast to the useless u8 versions.
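
For reference, the kind of helper I mean, as a sketch (assuming well-formed UTF-8; the name is made up):

    #include <cstddef>
    #include <string_view>

    // Counts code points by skipping UTF-8 continuation bytes (10xxxxxx).
    std::size_t CodepointCount(std::string_view utf8) {
        std::size_t n = 0;
        for (unsigned char c : utf8)
            n += (c & 0xC0) != 0x80;
        return n;
    }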


Related: in C at least (C++ standards are tl;dr), type names like `int32_t` are not required to exist. Most uses, in portable code, should be `int_least32_t`, which is required.
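
For example:

    #include <cstdint>

    // int32_t is optional: it exists only where the implementation has a
    // type of exactly 32 bits with no padding. int_least32_t always exists.
    int_least32_t total = 0;  // portable
    // int32_t n = 0;         // may not compile on exotic targets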

char on Linux ARM is unsigned, which makes for fun surprises when you've only ever dealt with x86 and assumed char is signed everywhere.
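
A minimal repro of the surprise:

    #include <cstdio>

    int main() {
        char c = '\xFF';
        // Prints -1 where char is signed (e.g. x86 Linux) and 255 where it
        // is unsigned (e.g. ARM Linux); -funsigned-char forces the latter.
        std::printf("%d\n", static_cast<int>(c));
    }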

This bit us in Chromium. We at least discussed forcing the compiler to use unsigned char on all platforms; I don't recall if that actually happened.

I recall that google3 switched to -funsigned-char for x86-64 a long time ago.

A cursory Chromium code search does not find anything outside third_party/ forcing either signed or unsigned char.

I suspect if I dug into the archives, I'd find a discussion on cxx@ with some comments about how doing this would result in some esoteric risk. If I was still on the Chrome team I'd go looking and see if it made sense to reraise the issue now; I know we had at least one stable branch security bug this caused.


Isn't the real reason to use char8_t over char that char8_t* is subject to the same strict aliasing rules as all other non-char pointer types? (I.e., the compiler doesn't have to worry that a char8_t* could point to any random piece of memory, as it does for char*.)

At least in Chromium that wouldn't help us, because we disable strict aliasing (and have to, as there are at least a few core places where we violate it and porting to an alternative looks challenging; some of our core string-handling APIs that presume that wchar_t* and char16_t* are actually interconvertible on Windows, for example, would have to begin memcpying, which rules out certain API shapes and adds a perf cost to the rest).
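
For the curious, here's the kind of optimization strict aliasing would otherwise enable (a sketch, not Chromium code):

    #include <cstddef>

    // Because char* may alias anything, the compiler must assume the store
    // through `dst` can modify *n, and reload *n on every iteration.
    void ZeroBytes(char* dst, const std::size_t* n) {
        for (std::size_t i = 0; i < *n; ++i)
            dst[i] = 0;
    }
    // With char8_t* dst, strict aliasing says dst[i] cannot alias a
    // std::size_t, so *n could be hoisted out of the loop. Chromium compiles
    // with -fno-strict-aliasing, though, so it forgoes that benefit anyway.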

> using u8 prefixes would obligate us to insert casts everywhere.

Unfortunately, casting a char8_t* to char* (and then accessing the data through the char* pointer) is undefined behavior.


Yes, reading the actual data would still be UB. Hopefully will be fixed in C++29: https://github.com/cplusplus/papers/issues/592

In a lot of places, they point out the std implementation is strictly inferior to theirs in some way, so it's not always organizational inertia; it's that the C++ standard types could have been designed strictly better, with no tradeoff.

Not a Googler, but my (probably way too romanticized) understanding of Google was that they never ask you about specific tech because for everything there's an in-house version.

The problem is that too many people drank too much kool-aid and try to parrot everything to the letter without understanding the bigger picture.

The best example would be Kubernetes. Employed by many orgs that have 20 devs and 50 services.


> for everything there's an in-house version.

Reasonable summary. There's some massive NIH syndrome going on.

Another piece is that a lot of stuff that makes sense in the open source world does not make sense in the context of the giant google3 monorepo with however many billions of lines of code all in one pile.


> Nothing particularly notable here. A lot of it seems to be 'We have something in-house designed for our use cases, use that instead of the standard lib equivalent'.

The bulk of the restrictions are justified as "Banned in the Google Style Guide."

In turn, the Google Style Guide bans most of the features because they can't/won't refactor most of their legacy code to catch up with post-C++0x idioms.

So even then these guidelines are just a reflection of making sure things stay safe for upstream and downstream consumers of Google's largely unmaintained codebase.


I don't think that's an accurate representation. There are a few features like that, but the majority of things banned in the Google style guide are banned for safety, clarity, or performance concerns. Usually in such cases Google and/or Chromium have in-house replacements that choose different tradeoffs.

That's different from an inability to refactor.



