Jq Internals: Backtracking (2017) (github.com/stedolan)
124 points by mshockwave on Oct 5, 2022 | 47 comments


For a tool I only use very occasionally, I find jq's syntax too idiosyncratic, with too steep a learning curve. Every time I reach for it I have to spend a bunch of time reading the manual and trying to think the way it does.

I wish there was something for querying json that had more of a xpath or even a sql-like syntax, or GraphQL with the addition of wildcards.


If you're used to S-expressions/know a bit of Clojure, `jet` is a similar tool that not only allows you to convert between JSON <> EDN, but also has a tiny query language built-in, that is just normal Clojure functions.

Clojure really excels at data querying/manipulation, so `jet` is very ergonomic to use.

One example:

    $ echo '{"anApple": 1}' | jet -i json -k '#(-> % csk/->kebab-case keyword)' -o edn
    {:an-apple 1}
Since starting to use jet, I haven't found anything jq could do that jet couldn't also do but with the additional feature of actually being able to read what I've done with it days later.

https://github.com/borkdude/jet


I have a hard time suggesting such a thing, because I find JMESPath incredibly inferior to jq's expressiveness, but if you're in the AWS ecosystem much, you may enjoy https://github.com/jmespath/jp#readme which uses the same query language as does awscli (https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-f...). That may at least pay more dividends than keeping jq's language in your head where it will only ever be used by jq



> dsq registers go-sqlite3-stdlib so you get access to numerous statistics, url, math, string, and regexp functions that aren't part of the SQLite base. (https://github.com/multiprocessio/dsq#standard-library)

Ah, I wondered if they rolled their own SQL parser, but no, I now see the sqlite.go in the repo and all is made clear


There are some limitations in dsq, which mean it doesn't query sub-objects that have varying fields; they just remain as JSON strings.


jqp, a text user interface for playing about with jq, might help a bit.

You get your input JSON in the left pane, the jq output in the right pane, and can interactively edit your jq script above them.

( based upon the Go implementation of jq )

See: https://github.com/noahgorstein/jqp


Columnq is pretty amazing, fully SQL, you can even join them to CSV etc


The trick is to use it daily. The learning curve isn't worse than vim


To use it daily you need a reason, or have the time & will to practice it.


Fair. I guess I think anyone using jq occasionally could probably be using it daily as well. Dealing with JSON data is pretty much standard for anyone developing or consuming a REST API.


> I guess I think anyone using jq occasionally could probably be using it daily as well.

No. Occasionally: "at infrequent or irregular intervals; now and then."


Does anyone know why the last release was 4 years ago? The tool is so popular that this kind of release cadence seems weird.

https://github.com/stedolan/jq/releases/tag/jq-1.6


We, the maintainers, ran out of energy. We had big PRs to work on that sucked a lot of energy, then covid happened, and lots of distractions, and now we have no active maintainers. The community at times got unruly and required bursts of our energy. Will and I felt burned out on jq.

That said, I'm trying to bring on a new maintainer, and I'll try to find energy during the holidays, which is when I usually find energy for jq. Also, I can't reach Stephen, which means the only way to add new maintainers is to fork the project, which I don't want to do without Stephen's approval. I've created a `jqlang` org in GitHub just to squat on the name, but before I ever fork it, I want Stephen's approval, and I want him to accept ownership in the org -- it's his baby!


You guys did a great job with jq so thank you for everything you've put into it. It's a tool I use multiple times a day, every day. Also, I love your documentation page!

When I saw the comment "why is the latest release so old" my thought was "because it's perfect".


It needs a bunch of love. It really needs a way to define new builtins externally, and it needs co-routines. And it needs a way to do binary (my idea: treat binary as an array of small integers, and if you ever insert any other kind of value, then you either get an error or you get an array that's mostly of small integers and no longer "binary" in any way).


If you can't reach Stephen directly maybe you can try to reach him via Yaron Minsky.


Thanks for the clue. I'll try that.


I've always been curious, what do maintainers that are trying to pass the torch look for? Surely there are a huge number of people that offer but they mostly get rejected. What makes a good candidate?


Contribution quality. When Stephen made me a maintainer, he did it because I kept sending him PRs he liked, and he was losing energy for jq. When I asked Stephen to make Will and David maintainers, it was because their contributions were of very good quality. Problem is I've fallen way behind on keeping up with issues and PRs, so right now I don't know who would be a good candidate. The other problem is, as I mentioned elsewhere, that I cannot make someone a maintainer -- only Stephen can. If Stephen agrees to a `jqlang` org, then it will be possible for him to delegate authority to add new maintainers.


Thank you for producing such an excellent language! I'll always recommend it; it's come in handy so many times and fits nicely in a lot of workflows.


Thank Stephen Dolan.


Damn, sorry to hear that the cycles became so draining.

It's such a powerful tool at surface level in this api/k8s interfacing world that looking under the hood and seeing all the work just knocked me over.

You all have done something so powerful for the community regardless of the hiatus.

THANK YOU!


Thank you so much for JQ. It has been my secret weapon in more than one instance.


If there are no planned changes or unreleased work in the repo, then there's not much need for more releases. Jq is a pretty small tool and isn't obviously incomplete.

Though it's funny, because I do agree that in the past jq's committed code wasn't being released often enough. There was a popular Github issue[1] with arguments and 70+ reactions about a confusing feature in jq that turned out to have already been fixed and committed in 2015, before the Github issue was even opened, but the fix didn't actually make it into a released version until 2018. Definitely one of the more frustrating cases of investigating a software issue I ran into. Everyone should make sure their project's release process can promptly release updates!

[1] https://github.com/stedolan/jq/issues/1110


It isn't obviously complete either, and definitely still has bugs. And there are significant changes on master that haven't made it into a release yet. I think it is more that the maintainer(s) no longer have a lot of time to put into it. Which is fine and understandable. Maybe eventually a fork will gain enough traction to take over, or maybe one of the alternatives will become more popular, or maybe the maintainer will come back, or pass it on to someone else.


I've also wondered this. Afaics the project appears effectively unmaintained: CI appears to have been broken for a while, and both pull requests and issues are piling up with not much response. It does appear there are several other folks with commit access to the repo :shrug:


It works, and does what it is supposed to. Why does it need a new version?


To compete with the newer, blazingly fast tools


What are the competitors in the jq space?


At least jaq[1], gojq[2], and jp[3] (this last of which is jmespath rather than the jq query language)

[1]: https://github.com/01mf02/jaq

[2]: https://github.com/itchyny/gojq

[3]: https://github.com/jmespath/jp


I especially love gojq thanks to this change:

    $ echo '{"alpha":"beta"}' | jq -r '"hello \(.alpha\)"'
    jq: error: syntax error, unexpected INVALID_CHARACTER (Unix shell quoting issues?) at <top-level>, line 1:
    "hello \(.alpha\)"
    jq: 1 compile error

    $ echo '{"alpha":"beta"}' | gojq -r '"hello \(.alpha\)"'
    gojq: invalid query: "hello \(.alpha\)"
        "hello \(.alpha\)"
                       ^  unexpected token "\\"
because (a) muscle memory (b) sure, it's easy to spot that mistake in an 18 character expression, but for bigger ones, getouttahere


There's a particularly complicated system I built a few years ago that heavily leaned on jq for configuring some backend systems. It had lots of variables piped in and so that particular error was common. From your snippet, I'm pretty sure gojq would have saved me many hours of debugging.


Yeah, the ergonomics and error messaging of jq are complicated

I'm not a big fan of the language, but it works


jp --help

"It's complicated, you know"


Xidel for XPath on JSON, XML, and HTML: https://github.com/benibela/xidel


Quite impressive sophistication level for such a 'small' utility.

Very inspiring read :)


Does anyone know of a visual representation of backtracking? Either video or images, I'm not fussed.


Backtracking is a stone's throw from the mathematical concept of permutations so it might be better to look up visualizations of that and work from there. This one seems good: https://pebreo.github.io/combinations-visualization/

The number of permutations of n items is n!, because at every position you can place one item from all the items that haven't already been placed. Backtracking basically recursively produces all n! solutions (where every recursion depth is one position) similar to DFS brute force enumeration with the only difference that it also continuously evaluates partial solutions and short-circuits if it detects a partial solution that is guaranteed to make any complete solution containing it invalid.

E.g. if you want to enumerate the permutations of 4 people standing in a row, there are 4!=24 ways to do it. However, if you add the condition that Alice can't stand next to Bob, you can short-circuit the enumeration early as soon as you detect that Alice and Bob have been placed together (such as Alice-Bob-?-?)

You could also have more complex scenarios where every position has its own disjoint set of items to choose from.

What you do with the enumeration is up to you. Maybe you just want to enumerate and display all possible permutations. Maybe you just want to produce the best or k best as determined by some fitness function.
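The Alice-and-Bob example above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything taken from jq; the function name `arrangements` and the extra people Carol and Dave are made up for the example.

```python
def arrangements(people, forbidden_pair):
    """Yield every ordering of `people` in which the two names in
    `forbidden_pair` never stand next to each other."""
    banned = frozenset(forbidden_pair)

    def extend(partial, remaining):
        # A complete row: every position has been filled.
        if not remaining:
            yield tuple(partial)
            return
        for i, person in enumerate(remaining):
            # Short-circuit: this partial row already breaks the rule,
            # so none of its completions are explored.
            if partial and frozenset((partial[-1], person)) == banned:
                continue
            partial.append(person)
            yield from extend(partial, remaining[:i] + remaining[i + 1:])
            partial.pop()  # backtrack and try the next candidate

    yield from extend([], list(people))


if __name__ == "__main__":
    people = ["Alice", "Bob", "Carol", "Dave"]
    valid = list(arrangements(people, ("Alice", "Bob")))
    print(len(valid), "of 24 orderings keep Alice and Bob apart")
    for row in valid:
        print("-".join(row))
```

Because the check happens before recursing, a prefix like Alice-Bob-?-? is dropped in one step instead of being completed in two ways and rejected afterwards.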


This blog post has a nice visualization of sudoku solving using backtracking. https://medium.com/analytics-vidhya/sudoku-backtracking-algo...

Although the case they use is almost perversely simple. Imagine in a 'real' sudoku that each number might need to be backtracked multiple times, such that the algorithm is regularly going almost back to the start.

Edit: here's a classic one trying to solve the 8 queens problem: https://commons.wikimedia.org/wiki/File:Eight-queens-animati...
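For anyone who prefers code to animations, here is a rough Python sketch of the eight queens search. It is my own minimal version, not the code behind the linked animation; `queens` and `place` are names chosen for the example.

```python
def queens(n):
    """Yield every way to place n non-attacking queens on an n x n board.
    A solution is a tuple where index = row and value = column."""

    def place(cols):
        row = len(cols)
        if row == n:
            yield tuple(cols)
            return
        for col in range(n):
            # Reject the square if an earlier queen shares its column
            # or one of its diagonals.
            if all(col != c and abs(col - c) != row - r
                   for r, c in enumerate(cols)):
                cols.append(col)
                yield from place(cols)
                cols.pop()  # backtrack: lift the queen, try the next column

    yield from place([])


if __name__ == "__main__":
    print(sum(1 for _ in queens(8)), "solutions for 8 queens")
```

When a row has no safe column, the loop simply ends and control returns to the previous row, which is the "going back" step you see in the animation.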


Thanks! I really liked this tree from the blog, it made it quite clear.

https://miro.medium.com/max/786/1*Q-DyKa25eozOeMdN5YQONA.png


Marty Stepp from Stanford had the best videos on this but they took it down from YouTube ¯\_(ツ)_/¯



Don't have one to hand, but it's basically walking a tree depth first, where sibling nodes are the choice points.
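In lieu of a picture, a tiny Python sketch of that tree walk. The tree shape and the name `leaf_paths` are invented for illustration; each node is a `(label, children)` pair.

```python
def leaf_paths(node, path=()):
    """Yield the path from the root to every leaf, depth first."""
    label, children = node
    path = path + (label,)
    if not children:
        yield path
        return
    # The siblings are the alternatives at this choice point. When one
    # subtree is exhausted, the loop moves on to the next sibling, which
    # is the backtracking step.
    for child in children:
        yield from leaf_paths(child, path)


if __name__ == "__main__":
    tree = ("root", [("a", [("a1", []), ("a2", [])]), ("b", [])])
    for p in leaf_paths(tree):
        print(" > ".join(p))
```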


I wish jq had a spec.


I hear you, but OTOH in this thread are two alternative implementations, one which seems especially focused on "bolt tightening" some of the edge cases: https://github.com/01mf02/jaq#assignments

Isn't the adage to only build a framework after the 3rd or 4th implementation? That seems to apply to writing an RFC, too.


w jq



