
I really like the way documentation works in Rust: You basically write markdown in a special type of comment over the module, function, datatype or method you wanna document and then you can convert that into documentation automatically.

Even better: if you have examples in code blocks in these docstrings, they get tested by default as well, so if you don't update them, the tests will fail and you will notice.
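A minimal sketch of what that can look like, assuming a library crate named `my_crate` and a function `area` (both names are made up for illustration). With a layout like this, `cargo test` compiles and runs the example inside the comment as a doctest:

```rust
/// Returns the area of a rectangle with the given width and height.
///
/// # Examples
///
/// ```
/// // `my_crate` is a hypothetical crate name; use your own crate's name.
/// assert_eq!(my_crate::area(3, 4), 12);
/// ```
pub fn area(width: u32, height: u32) -> u32 {
    width * height
}
```

If `area` later changes its signature or behaviour, the example in the comment stops compiling or its assertion fails, and the test run reports it.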

In my eyes one of the biggest problems with keeping documentation up to date is that over time the mapping between the piece of code you are documenting and the place where you find it in the documentation becomes more complex, to a point where missing something is not unlikely. Rust's documentation-in-code-approach addresses this problem neatly.



I don't have experience with this in Rust but have come to passionately hate this kind of documentation in other languages. I think pydoc, javadoc, and doxygen are all garbage. If one could apply them sensibly it would not be so much of a problem, but then you have documentation nazis who force you to document every method and every parameter. This leads to highly enlightening prose documentation explaining that the get_height method "gets the height", and that its return value is the height. A more high-level problem with this is that you get documentation that is just as fragmented as the code and where the high-level usage of things is not explained at all. Also it clutters the code with many highly trivial remarks.


I am a documentation nazi. I hate it when people skip over documentation because something is obvious or trivial to them. Stuff isn't obvious or trivial to people who have to use your code.

get_height gets which height, outer or inner? Are there error values, e.g. 0 as "don't know any height"? Does it have side effects? Is it a stable and reliable part of the API or bound to change soon? Is it thread safe? Will it change any of its parameters? Who deallocates the return value? Do you need to hold a lock somewhere?

Of course it might be a good idea to group together get_height, get_width, get_diagonal and get_depth if the above is all the same for those. But having no documentation just because you think it is trivial that get_height gets some height from somewhere just means that you are sloppy and didn't think of all of the above. So your code shouldn't be touched with a 10-foot-pole imho.

My solution, which I personally hate but know of no alternative to, is a documentation template for each function asking the above questions (depending on runtime and language of course) that I give people to fill in. Until they learn...
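As a rough illustration of such a template filled in for one function, here is a sketch in Rust. The `Widget` and `HeightError` types are hypothetical. The `# Errors` heading is a common rustdoc convention; the other section headings are simply one possible template, not a standard:

```rust
/// Errors that `get_height` can report (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum HeightError {
    /// The widget has not been laid out yet, so no height is known.
    NotLaidOut,
}

/// A widget whose layout may or may not have been computed (hypothetical).
pub struct Widget {
    /// Outer height in pixels, or `None` before the first layout pass.
    pub outer_height_px: Option<u32>,
}

/// Returns the outer height of `widget` in pixels, border included.
///
/// # Errors
///
/// Returns [`HeightError::NotLaidOut`] if no layout pass has run yet.
/// A height of `0` is a real measurement, not an error marker.
///
/// # Side effects
///
/// None. The widget is only read, never modified.
///
/// # Thread safety
///
/// Safe to call from several threads, since it only reads through `&Widget`.
///
/// # Stability
///
/// Assumed stable for the purposes of this sketch.
pub fn get_height(widget: &Widget) -> Result<u32, HeightError> {
    widget.outer_height_px.ok_or(HeightError::NotLaidOut)
}
```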


Most of what you described as needing documentation could be expressed as code (mixing multiple languages here to express the point more clearly)

const fn get_outer_height() -> Result<WeakReference<Number>, SomeErrorType>

- `const` makes it clear this doesn't mutate

- the function name says exactly what it does

- The return type makes it clear it can return an error

- The return value is typed in a way that makes it clear what the ownership is

Throw in a language like Rust that gives guarantees about thread safety and now the only thing left is if the API is stable or not. Which I would argue doesn't matter much at all since people will still end up depending on it regardless of the comment saying "This API might not be stable"

And the best part? My definition will never get outdated. If the assumptions change, the definition will also need to change (well, except for maybe the name)
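For what it's worth, a Rust-only approximation of that signature might look like the sketch below. The `Widget` and `HeightError` types are hypothetical, `std::rc::Weak` stands in for the "WeakReference", and `f64` stands in for "Number". One caveat: in Rust, `const fn` means "can be evaluated at compile time", so the no-mutation promise is carried here by taking `&self` instead.

```rust
use std::rc::{Rc, Weak};

/// Why a height could not be returned (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum HeightError {
    NotLaidOut,
}

/// A widget that owns its measured height once layout has run (hypothetical).
pub struct Widget {
    pub outer_height: Option<Rc<f64>>,
}

impl Widget {
    /// `&self`: the widget is borrowed immutably, so this call cannot change it
    /// (interior mutability aside).
    /// `Result<_, HeightError>`: the caller has to handle the failure case.
    /// `Weak<f64>`: the caller gets a non-owning handle; the widget keeps ownership.
    pub fn get_outer_height(&self) -> Result<Weak<f64>, HeightError> {
        self.outer_height
            .as_ref()
            .map(Rc::downgrade)
            .ok_or(HeightError::NotLaidOut)
    }
}
```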


You are absolutely right, and one should prefer languages that can give such guarantees wherever possible.

But often one doesn't have a choice. People still write software in inferior languages such as JavaScript or Python, where you cannot even be sure about a return or parameter data type.


To add to what you're saying:

Quite often something is not obvious or trivial to someone who is examining a piece of code or using a library for the first time because it assumes the person already understands the context.

An example of what I mean: perhaps it is because I have a background in the sciences, but I assume that most properties have units. A property such as height certainly does have units. So is get_height() returning the height in pixels, inches, meters, or something else? I have also been bitten by graphics libraries that measure distances in unexpected (to me) ways. Is the radius of the arc with line thickness 'n' using the inside radius, outside radius, center line, or something else? The person writing the original code may think the developer using their code down the road can test different assumptions, yet the reality is the number of combinations to test will rarely be trivial (and that is assuming they identify the correct parameters to test).

It's at the point where I refuse to even consider using libraries that leave out documentation for obvious things. Even comments like "gets the height" raise red flags since they demonstrate that the author did not put any thought into what they are documenting.
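In a typed language, one way to make the unit hard to miss is to put it in the return type. A small sketch in Rust, where `Pixels`, `Millimeters`, and `Image` are hypothetical names chosen for illustration:

```rust
/// A length measured in device pixels (hypothetical newtype).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pixels(pub f64);

/// A length measured in millimeters (hypothetical newtype).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Millimeters(pub f64);

impl Pixels {
    /// Converts to millimeters given the display density in pixels per inch.
    /// One inch is 25.4 mm.
    pub fn to_millimeters(self, pixels_per_inch: f64) -> Millimeters {
        Millimeters(self.0 / pixels_per_inch * 25.4)
    }
}

/// An image with a known height (hypothetical type).
pub struct Image {
    pub height: Pixels,
}

impl Image {
    /// Returns the image height. The unit is part of the return type,
    /// so the caller cannot mistake it for inches or meters.
    pub fn get_height(&self) -> Pixels {
        self.height
    }
}
```

This does not answer questions like "inside or outside radius", which still need prose, but it removes one class of guesswork.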


I don't agree with this answer. I fully agree that documentation is important, needs to be correct and maintained. However, I do stand by the original poster saying that it is often a bad idea to enforce javadoc-style comments to autogenerate documentation. This often leads to low quality documentation.

Like you say, get_height is a trivial function, but still requires attention. Enforcing in-code docs is not going to help to have higher quality docs, quite the contrary. You often get low effort stuff, just to make the code checker happy.

And if you put in a peer review process to validate the in-code comment, it loses all power because you might as well use that step for decent documentation. get_height should be documented in a logical place, where it makes sense indeed, like grouped with get_width. But now you have the logical place to put your documentation, and the forced javadoc comment. That's double work, and one of them will be bad quality as a result of it.

Nobody is arguing for no documentation, but I am arguing for avoiding javadoc enforcements. My solution is much simpler: have a documentation check as part of the peer review process. Sure, have a template for the documentation, but don't make it strict. Ours is simple: all juniors are on documentation peer review as part of their onboarding. If they don't get it, it needs to be fixed.

Our peer review process is quite simple: is the documentation adapted, is there a relevant unit test (we actually have low UT coverage, we only do them for critical code and as part of bug fixing, so enforced frameworks make no sense for us), and naturally is the code quality itself ok.

But many companies don't do those; yeah, well, that's how you get shit.


> get_height gets which height, outer or inner? Are there error values, e.g. 0 as "don't know any height"? Does it have side effects?

In my experience, if your documentation covers all these aspects it's guaranteed to be either wrong, misleading, or out of date on any of these details, and you better read the actual code to be sure.

In particular, the answer will often be "it depends on what the rest of the system does". Perhaps it delegates the actual calculation to an API, and this doc won't change when the API changes in a _mostly_ compatible way.

I mean, even with the best efforts given, code has bugs, words are vague, there's no way you should trust the dev who wrote the code to properly convey what it actually does.


Wouldn't it be more efficient to just read the source code? Also, the point the guy you're responding to is making is not that code shouldn't be documented, but that in-line documentation of this variety is not that great. You seem to be interpreting him to be saying that the code shouldn't be documented at all, which is not what he is saying.


It's true that there can be gotchas in code which should be documented, but I don't think forcing a template on people is the solution. In fact, I wouldn't expect people to be better at documenting them with that in place.


Well, yes, you need to have manual or automatic checks for the presence of the template, and the correctness of the contents is often uncheckable by automated tests. But if things break and e.g. the filled-in documentation template incorrectly said "thread-safe: yes", it will be very easy to 'git blame' the culprit. That way you can at least slowly weed out the sloppy documenters, but I admit that this is tedious.

And, yes, I don't like this either and would like a better solution. But so far I didn't get any viable suggestions.


Yeah this is why I love literate programming. Being able to read a program with "narration" is so much nicer than just reading documentation piecemeal. Maintaining literate programs though is quite difficult because you have to figure out where new code or changes fit in the overall narrative. I have a scraper I wrote in literate style and I only have to change it yearly. Each year I forget what I wrote and then I reread the program and make the necessary changes.


Literate programming is great until your code base becomes unwieldy. Then you need a README.md file which acts like a pointer to the correct entry points.

Then as the code grows, you need to document the architecture, add small gotchas, etc.

At the end of the day, documentation wins.


I am very code literate, but depending on the size of the codebase and how much smoke and mirrors are used, I might take longer to grok how things are interconnected by reading the code than it would take me if someone gave a high-level overview in a paragraph or two.


I understand the sentiment, because I hate this in other languages too. But somehow in Rust it works. Of course there is also bad documentation of the kind you describe, but that is a different problem.

Bridging the gap between a prosaic high-level explanation (how do the parts work together?) and a fine-grained explanation of each part (what does that part do?) can be a challenge. Rust solves this somewhat by allowing you to do module-level documentation (that can essentially look like a blog post, only that the examples get checked when the code is tested) and it lets you link to different entities.
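A small sketch of what module-level documentation with links can look like. The crate name `my_crate`, the module `geometry`, and the `Rect` type are all made up for illustration. Lines starting with `//!` document the enclosing module, and the bracketed names are rustdoc intra-doc links to other items:

```rust
pub mod geometry {
    //! Shapes and their measurements.
    //!
    //! Start by building a [`Rect`], then ask it for its [`Rect::area`].
    //! The example below is compiled and run together with the tests.
    //!
    //! ```
    //! // `my_crate` is a hypothetical crate name.
    //! use my_crate::geometry::Rect;
    //!
    //! let r = Rect { width: 3, height: 4 };
    //! assert_eq!(r.area(), 12);
    //! ```

    /// An axis-aligned rectangle with integer sides.
    pub struct Rect {
        /// Width of the rectangle.
        pub width: u32,
        /// Height of the rectangle.
        pub height: u32,
    }

    impl Rect {
        /// Returns the width multiplied by the height.
        pub fn area(&self) -> u32 {
            self.width * self.height
        }
    }
}
```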

I would always prefer a well written blog post, if people were able to keep the examples working and the code up to date. But experience shows they are not.

I'd rather have generated documentation that is true than a blog post where half of the examples won't work, because nobody bothered to update the post after the code changed. The first might at times be barren if done badly, the latter is downright misleading.


I am firmly of the opinion that every bit of public interface should get a documentation string.

You say "gets the height" is trivial, but that text tells not only what the function does, but also that the author couldn't think of anything else important to say. This is very different from no text at all, where you can't be certain if the author even thought about it.

IMO, enforcing an internal structure (AKA "you must document each parameter and the return value") is counter-productive, but enforcing the existence of the comment is very productive.
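In Rust, for example, that kind of rule can be enforced with the built-in `missing_docs` lint, which checks that public items have a doc comment but says nothing about its structure. A minimal sketch, with a hypothetical `shapes` module (the lint is more commonly enabled crate-wide with `#![deny(missing_docs)]` at the top of the crate root):

```rust
/// Geometry helpers (hypothetical module for illustration).
#[deny(missing_docs)]
pub mod shapes {
    /// A rectangle measured in pixels.
    pub struct Rect {
        /// Outer width in pixels.
        pub width: u32,
        /// Outer height in pixels.
        pub height: u32,
    }

    impl Rect {
        /// Returns the outer height in pixels.
        // Deleting the doc comment above should turn the lint into a
        // compile error for this public method.
        pub fn get_height(&self) -> u32 {
            self.height
        }
    }
}
```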

> Also it clutters the code with many highly trivial remarks.

I'd say that it "clutters" the code with markers for public elements and hard to understand ones. Those are actually valuable, and not clutter at all.

Anyway, if those markers are a large share of your lines, you may need to rethink your architecture. It's usually not valuable to have a lot of interface for trivial things.


You've put your finger on exactly why I don't like this kind of documentation.

Also, I don't tend to trust it as much, because it isn't actually what the program executes.

When I'm trying to read the source code I don't want to read about the source code -- I want to read the actual source code -- and if I keep coming across long multi-line idiotic comments then it breaks my flow and concentration.

I like the source code itself to be extraordinarily readable, with long and descriptive variable and method names, but I want it to be dense and packed into paragraphs of sense.

To the extent there are any comments at all they should be extremely short and completely clarifying -- they should not even partially overlap with information conveyed through function or variable names, for instance.


> Also, I don't tend to trust it as much, because it isn't actually what the program executes.

Maybe you don't know Rust, but it is a strongly typed language. If you go to a Rust project and run `cargo doc --open` you will see exactly what the type system lays out. Sure, if someone writes "changes the window's height and returns a new Window" on a change_window_height(height: Pixels) -> Window and it changes the window's width instead, that is still wrong. But (A) you can click on "source" and see the actual code and (B) you are guaranteed that the function takes Pixels and spits out a Window.

That being said I hate that kind of documentation in most other languages, not in Rust. In Rust it is basically an alternative view onto the same code with a (needed) emphasis on the relations of the entities the type system describes. If there is prose that can be a nice extra, but cargo doc is even useful without a single comment.
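To make the change_window_height example concrete, here is a sketch with hypothetical `Pixels` and `Window` types. It is written as a method taking `self`, which is an assumption; the signature quoted above does not say where the old window comes from:

```rust
/// A length in pixels (hypothetical newtype).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pixels(pub u32);

/// A window with a size in pixels (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Window {
    pub width: Pixels,
    pub height: Pixels,
}

impl Window {
    /// Changes the window's height and returns a new `Window`.
    ///
    /// The sentence above is only a promise made in prose. The signature is
    /// what the compiler checks: the argument is `Pixels`, the result is a
    /// `Window`.
    pub fn change_window_height(self, height: Pixels) -> Window {
        // Keep every other field from the old window, replace only the height.
        Window { height, ..self }
    }
}
```

The generated page would show the signature and the prose side by side; only the signature is verified on every build.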


This is API reference documentation. What you're missing is conceptual documentation and use case examples.

Conceptual documentation, the big picture, is important to convey the mental model implemented by an API. In applications, you can generally infer it from using the app, but it's not always easy, and it's indirect.

Use case examples string together multiple APIs, multiple domain objects, to achieve a high level business objective. When working on a project, you can sometimes get away without this - the existing code can be example enough to copy. You can end up with cargo culting, people copying things without understanding why. But if you have an API for third-party use, which needs documenting, you need to have either a well-seeded set of open source users, or a great set of examples.


Rustdoc supports conceptual documentation quite well, including testing the example code. A specialized page is better, but the Pareto principle applies.

See the Clap documentation:

https://docs.rs/clap/latest/clap/


I often call this narrative documentation. It's the antidote to Chesterton's Fence and explains the thinking of the programmer who created the thing. How did that programmer expect people to use their tool?


Most popular languages have a version of this, including Java. This is only one level of documentation, and one must know a bit about what they don't know to use it effectively. In other words, if you already know that you need to use the foo function, then the foo function's documentation is great. If you don't even know which function or class/module to use, it isn't a great starting point.


I built a testing library on top of pytest based upon the idea of doing this mapping at an application level instead of a method/function/class level.

If you write tests in a strongly typed, non-Turing-complete markup (in this case, StrictYAML), you can then use it with a template and test artefacts (e.g. app screenshots) to generate readable how-to/tutorial docs which are guaranteed to stay up to date.

https://github.com/hitchdev/hitchstory

This isn't a new idea, but I find that people are often skeptical because there's a history of people getting their fingers burned by Gherkin's language design or YAML's weak typing (both of which are completely valid).



