Or even better, don't use C for string handling.

It's too much trouble for too little gain.

I think the areas where C is a good and effective solution are shrinking, and safer languages are becoming more common and faster (even C++).

The C syntax, not its library, doesn't allow for good string handling. String handling should be built deeper into the language, and, yes, even though it is possible to have safe C code, it's very hard. So try it only when it's worth it.
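
To make the bookkeeping concrete, here is a minimal sketch of joining two strings into a fixed-size buffer without overflowing it. The helper name `join_checked` and its return convention are assumptions made for illustration; only `snprintf` comes from the standard library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a followed by b into dst, which holds cap bytes.
 * Returns 0 on success, or -1 if the result (plus the terminating NUL)
 * would not fit. dst must not overlap a or b. */
int join_checked(char *dst, size_t cap, const char *a, const char *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    if (cap == 0)
        return -1;

    /* snprintf never writes more than cap bytes and returns the length
     * the full result would have had, which lets us detect truncation. */
    int n = snprintf(dst, cap, "%s%s", a, b);
    if (n < 0 || (size_t)n >= cap) {
        dst[0] = '\0'; /* do not leave a silently truncated string behind */
        return -1;
    }
    return 0;
}
```

Even in this small case the caller has to track the buffer capacity and check the return value, which is the kind of manual work the comment is describing.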



> safer languages are becoming more common and faster

> though it is possible to have safe C code, it's very hard

A dull knife is pretty safe for most people. It's still possible to shove it into your eye and blind yourself, but other than those extreme cases it won't cause much injury when used in the regular manner. However, it is also extremely inefficient at the purpose it was designed for: cutting things.

There are, of course, safer alternatives to a knife. EMTs use special tools designed to fit a seatbelt or cloth into a small slot and slice through without any risk to a person; they also have specially designed shears which make it difficult to cut flesh, but easily cut through nylon and leather. Utility knives have retractable blades to reduce injury, and other tools are designed to fit specific materials into slots and make cutting people impossible.

All of those are purpose-driven and application-specific solutions, however. For the most high performance and general purpose application, a really sharp fixed-blade knife is still the most precise and efficient tool for the job. When wielded correctly it is still safe and efficient. But the practitioner is not protected from harming themselves; it's expected that they know what they're doing. And really, it's not that hard to learn how to use it correctly.

But I totally get that it's easier to use a dull knife or scissors than learn all about knives, and it gets the job done.


"A dull knife is pretty safe for most people."

While your point is true, I want to object to your metaphor. I object for safety purposes. In fact, a dull knife is more dangerous than a sharp knife for most anyone who needs to cut things.

You need to press much harder with a dull knife and sometimes even to saw down into the object. You may even need to get a firmer grip on the object you're cutting to counter all that force. Those are very dangerous behaviors. A sharp knife that cuts easily is much, much safer.


A dull knife is safer at rest. A sharp knife is safer in use.


Not true. A dull knife is dull; it's not dangerous at all because it basically can't cut anything. A half-dull or half-sharp knife is dangerous. It can cut, but you don't know how much, and variations in the blade make it unpredictable, in addition to the behaviors you defined.


I would describe a typical kitchen table knife as a "dull knife", and trying to use that for tasks to which it is not suited can certainly lead to injury.


"And really, it's not that hard to learn how to use it correctly."

Patronizing and 100% wrong.

Or do you think there are only idiots developing C code?

How do things like OpenBSD get so secure? With a lot of code revisions by people who are good at catching problems. And even they have problems sometimes.

Your comparison with a knife is false. I can have multiple languages on my development machine without a big burden, as opposed to carrying a lot of specialized cutting equipment.

So is Java/Python/Go a dull knife? Let's try something then: create a web application in C as quickly as you could with these languages, and make it as safe as theirs.


Your analogy suggests that "safer languages" are like crappier versions of C. I don't think that assertion is well supported.


I think I qualified my analogy correctly. For general purpose, high-performance efficiency, you can't beat a knife. However, there are tons of specific applications where a tool other than a knife is preferred. That's the idea behind the phrase "the right tool for the job".

I wouldn't say a hatchet or a machete is a "crappier version" of a knife. Surely a hatchet is much better suited for chopping down a tree, for example. On the other hand, a knife would be very inefficient at the job, even if it could get the job done eventually. However, a hatchet is arguably less safe than a knife for many kinds of jobs, and a knife for hatchet-jobs, etc.

So really I guess my point was the idea of a "safer" language is dumb, because not every tool is "safe" for every job, and not every job is suitable for a "safe" tool.


> the purpose it was designed for: cutting things.

The point of the parent is to use the correct tool. Most of the time people don't need to cut things, they just need to spread some butter.


> However, it is also extremely inefficient at the purpose it was designed for: cutting things.

... So you're saying that C was designed for string handling?

> When wielded correctly it is still safe and efficient

You do realize that even the most skilled chefs have 'battle-scars', right? Your analogy falls on its face in this respect.


Saying C is "very hard" is useful to dissuade the unprepared from writing buggy, vulnerable software, but for the experienced developer with modern tools, C is at worst "tedious". Some simple habits, like always writing your malloc() and free() calls at the same time in balanced pairs, can make C quite manageable. Add valgrind, unit tests, and static analysis, and I'd have much more confidence in a good C program than an average program in a weakly typed language.
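
One way to read the "balanced pairs" habit is to write a create function and its matching destroy function together, so every malloc() has an obvious free(). The sketch below assumes a made-up `struct buffer` type; the names are hypothetical and not taken from any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, used only for illustration. */
struct buffer {
    char  *data;
    size_t len;
};

/* Allocation half: two malloc() calls, both released in buffer_destroy(). */
struct buffer *buffer_create(const char *text)
{
    assert(text != NULL);
    struct buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;

    b->len = strlen(text);
    b->data = malloc(b->len + 1);
    if (b->data == NULL) {
        free(b); /* undo the first allocation on the failure path */
        return NULL;
    }
    memcpy(b->data, text, b->len + 1); /* copy including the NUL */
    return b;
}

/* Release half, written at the same time as buffer_create(). */
void buffer_destroy(struct buffer *b)
{
    if (b == NULL)
        return;
    free(b->data);
    free(b);
}
```

Callers then only have to pair one buffer_create() with one buffer_destroy(), which is easier to audit than scattered malloc() and free() calls.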


For people not familiar with it, Valgrind will run your program in a VM and trace memory accesses. It detects when you read from uninitialized/unallocated memory, don't free your memory, etc. Almost every time I have had a non-logic bug in C code, it had a corresponding warning in Valgrind.


"malloc() and free() calls at the same time in balanced pairs"

There is no stable correspondence between number of malloc calls and number of free calls. I might allocate things in 10 places that get freed in one, or vice-versa. A simple example would be "parse a packet and send the built packet (through a message queue) to handling code."
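
The packet example can be sketched as an ownership hand-off: the parser allocates, the queue carries the pointer, and the handler frees. Everything here (`struct packet`, the fixed-size `struct queue`, the function names) is a hypothetical single-threaded illustration, not a real message queue.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAYLOAD_MAX 16
#define QUEUE_CAP   4

/* Hypothetical types, used only for illustration. */
struct packet { size_t len; unsigned char payload[PAYLOAD_MAX]; };
struct queue  { struct packet *slots[QUEUE_CAP]; size_t head, count; };

/* Allocates a packet; the caller owns the result. */
struct packet *parse_packet(const unsigned char *raw, size_t len)
{
    if (raw == NULL || len > PAYLOAD_MAX)
        return NULL;
    struct packet *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->len = len;
    memcpy(p->payload, raw, len);
    return p;
}

/* On success (0) the queue owns p; on failure (-1) the caller still does. */
int queue_push(struct queue *q, struct packet *p)
{
    if (q->count == QUEUE_CAP)
        return -1;
    q->slots[(q->head + q->count) % QUEUE_CAP] = p;
    q->count++;
    return 0;
}

/* Transfers ownership of the oldest packet to the caller, or returns NULL. */
struct packet *queue_pop(struct queue *q)
{
    if (q->count == 0)
        return NULL;
    struct packet *p = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return p;
}

/* Handling code: the free() for every packet lives here, far from malloc(). */
size_t handle_all(struct queue *q)
{
    size_t handled = 0;
    struct packet *p;
    while ((p = queue_pop(q)) != NULL) {
        free(p);
        handled++;
    }
    return handled;
}
```

The malloc() in parse_packet() and the free() in handle_all() are in different functions, so the pairing has to be documented as an ownership rule rather than enforced by code layout.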


While it's true that there are times when you can't achieve it, I intend to suggest limiting oneself to design patterns that facilitate simpler memory management, and implementing both halves of the memory management equation at the same time.

Naturally there's a tradeoff; if redesigning your code to allow one malloc() to one free() would introduce more bugs in logic than it would solve in memory issues, then it's not worth it.


I'm pretty sure the "it's very hard" refers to string handling in C, and I firmly agree that, unless you have a very, very good reason, it is one of the places most people are better off staying away from (especially once you start playing with C Unicode constructs).

(but your point is totally valid with regard to C in general~)


Is there a good string handling library for C?


Other HNers have recommended bstring. I haven't tried it myself.

http://bstring.sourceforge.net/


BString relies on undefined behavior for security:

  The reason is it is resilient to length overflows is that
  bstring lengths are bounded above by INT_MAX, instead of
  ~(size_t)0.  So length addition overflows cause a wrap
  around of the integer value making them negative causing
  balloc() to fail before an erroneous operation can occur.

I wouldn't touch this library.


> Some simple habits, like always writing your malloc() and free() calls at the same time in balanced pairs, can make C quite manageable.

One can also use the Boehm GC if they feel it necessary.


I'm curious as to who uses a garbage collector in C. It seems like if you're using C, you probably are in a situation where you want as close control of that kind of stuff as possible (short of assembly).


GCC and Mono use it internally, I believe.

It's a very conservative garbage collector at the end of the day. It can be tweaked so as to be completely bare bones, and the performance impact is very benign. Rather, memory consumption is its weakness.


The same thing can be said about malloc. It's a matter of degree, and different projects have (sometimes subtly) different requirements.


Expanding - in particular, Boehm seems to only run collections on allocations. If your code is structured such that you don't allocate during periods when you need more precise control (already a good idea, if you are using malloc!) then this won't have an impact.


This is the point people often misunderstand when discussing different programming languages. Few languages prevent you from doing anything. No one claims they do.

The point is that some languages encourage you to do things correctly; others make it an uphill struggle.



