Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nth/2 semantic is strange #1867

Closed
muhmuhten opened this issue Mar 22, 2019 · 5 comments · Fixed by #2674
Closed

nth/2 semantic is strange #1867

muhmuhten opened this issue Mar 22, 2019 · 5 comments · Fixed by #2674
Labels

Comments

@muhmuhten
Copy link
Contributor

def nth(n; g): last(limit(n + 1; g)); does not match my intuition for what "the nth output of g" means when there are fewer than n+1 outputs of g.

Expected behavior

nth($n; exp) should probably be analogous to [exp][$n] (i.e. ([exp]|.[$n])), except less expensive and without evaluating the $n+1th and subsequent outputs of exp.

One thing to note is that $array[$n] != $array[:$n][-1], but more closely matches $array[$n:][0]... If $n is greater than the number of outputs, I'd expect to get back either empty or null. This implies something more like the following:

def drop($n; g): foreach g as $_ ($n; .-1; . < 0 or empty|$_);
def nth($n; g): first(drop($n; g), null);

Additional context

I thought I'd found a more efficient implementation of nth in the following, but well, the above:

diff --git a/src/builtin.jq b/src/builtin.jq
index a6cdabe..509047c 100644
--- a/src/builtin.jq
+++ b/src/builtin.jq
@@ -165,7 +165,9 @@ def any(condition): any(.[]; condition);
 def all: all(.[]; .);
 def any: any(.[]; .);
 def last(g): reduce . as $_ (.; g|[.]) | .[]?;
-def nth($n; g): if $n < 0 then error("nth doesn't support negative indices") else last(limit($n + 1; g)) end;
+def nth($n; g):
+  if $n < 0 then error("nth doesn't support negative indices")
+  else label $out | foreach g as $_ ($n; .-1; . < 0 or empty|$_, break $out) end;
 def first: .[0];
 def last: .[-1];
 def nth($n): .[$n];

This would be kind of a gratuitous incompatibility but might be nice for 2.0.
(The above first/drop implementation runs just as fast but reads nicer IMO.)

@nicowilliams
Copy link
Contributor

I agree.

With builtin modules we could have a "jq/2.0" module which, if you include it, will cause 1.x builtins not to be included, so we can feel free to fix mistakes.

For example, I think allowing multiple functions with the same names but different arities turns out to be a mistake because the , operator is easily confused with the ; syntactic argument separator. We don't get to remove this feature, but in 2.x we could simply not make use of it.

@pkoppstein
Copy link
Contributor

@muhmuhten - In my opinion, the current def of nth/2 is simply wrong.

@nicowilliams - I don't really see a connection between muhmuten's observations (which I take to be that nth/2 is either misnamed or incorrectly implemented) and your remarks regarding ,/; confusability.
By the way, regarding ,/; confusability, my experience has been that users quickly catch on, and usually tolerate the inconvenience once they appreciate the glory of "," in jq and the beauty of jq's polymorphism. You should give yourself more credit!

@nicowilliams
Copy link
Contributor

@pkoppstein Hmm, well, maybe. I've been worried that it was a mistake. Good to know someone doesn't agree with that.

@nicowilliams
Copy link
Contributor

I think we can just fix nth.

@muhmuhten
Copy link
Contributor Author

@pkoppstein I think first/0, last/0, and nth/1 operating on arrays is a mistake and probably shouldn't have existed in the first place, but too late...

Some notion of multiple arguments is clearly necessary, and it's hard to get around the confusion for something like limit without flat-out having explicit syntax for expressions that can be multi-valued. It's also decent substitute for having default parameters; the main pain points are
(a) the regex functions, where an accidentally commaed-in flags looks just like a regex
(b) the handful of functions that do totally different thing at different arities: split, all/any, first/last/nth?

Most of the latter are gratuitous array functions... (e.g. imo the right any/1 is def any(g): isempty(g or empty)|not; which is basically what any(g; .) does now but with an extra call, see the current definitions of IN)

On the other hand, the only really salient one is range/1, which felt like the #1 builtin I missed during that stint of C-builtins-only jq despite the workaround being trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants