Skip to content

fix: fix non-POSIX sed expressions#1564

Merged
scop merged 9 commits intoscop:mainfrom
akinomyoga:posix-sed
Feb 2, 2026
Merged

fix: fix non-POSIX sed expressions#1564
scop merged 9 commits intoscop:mainfrom
akinomyoga:posix-sed

Conversation

@akinomyoga
Copy link
Collaborator

No description provided.

Existing codes using | and \| in POSIX BRE is non-portable.  In POSIX
BRE, selection by | or \| is not available.  Selection \| in BRE is
extension by implementations such as GNU sed.  Selection is available
in POSIX ERE, and POSIX sed started to support "sed -E" for ERE in
POSIX.1-2024, but the standard is still new.  We have been using BRE
in the codebase anyway, and we have been fine with BRE.  These were
not detected by test/runLint because they are on lines different from
"sed".

References: 9b531a0 (apt-get), 334b47c (ebtables), 0ad9c24
(iptables), b33a3c5 (mplayer), 3a69bd3 (service).

* Note: To confirm that "|"s used in "ebtables" indeed intend to mean
  selection, an example output from "ebtables" is this:

  ```
  # ebtables -t nat -L
  Bridge table: nat

  Bridge chain: PREROUTING, entries: 0, policy: ACCEPT

  Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

  Bridge chain: POSTROUTING, entries: 0, policy: ACCEPT
  ```

  where one observes "PREROUTING", "OUTPUT", and "POSTROUTING" appear
  separately.
In the first place, we do not have to use "sed" to create the regular
expression, and we do not have to use "sed" to filter the lines using
a regular expression either.
In POSIX BRE used in sed, the character escape sequences like "\t" and
"\r" and perlre(1)-origin character class "\w" cannot be used.  They
are GNU extensions.  The only available escape sequence is "\n" in the
replacement of "s///".

References: 98dee7d (apt-get), 116db06 (gpg), e1f39bd (gpg2),
622abf3 (ipmitool), b7937bf (screen).
POSIX requires ";" or <newline> before "}", though GNU grep and
probably other implementations are permissive when the context is
unambiguous.  These were introduced in commit c10a272.
@akinomyoga akinomyoga force-pushed the posix-sed branch 3 times, most recently from 64cba46 to 92d8b42 Compare January 31, 2026 09:59
local muttcmd=${words[0]}

local querycmd="$("$muttcmd" -Q query_command 2>/dev/null | command sed -e 's|^query_command=\"\(.*\)\"$|\1|' -e 's|%s|'"$cur"'|')"
local querycmd="$("$muttcmd" -Q query_command 2>/dev/null | command sed -e 's|^query_command="\(.*\)"$|\1|' -e 's|%s|'"$cur"'|')"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe these backslashed double quotes were there to escape them from the surrounding quoted "$( ... )". But IDK if that's necessary in the first place, or further if actually quoting the surrounding thing is ever necessary (i.e. if it could be written just as $( ... ) instead of "$( ... )"). But if so, let's look into addressing/cleaning up that in another PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inside of $(...) is an independent syntactic context and it's unaffected by the quoting outside of it. I guess the command substitution was originally in the old style `...`, where the quoting was necessary, but the quoting was left after switching to $(...). I've identified the commit: 755041c, where it seems to have been POSIX'ly-incorrect from the beginning, or it might have been copied from an even older code.

Yeah, the outside double-quoting "$()" is also unnecessary on the right-hand side of assignments.

The word-boundary anchors \<\> are used in sed to exclude already
specified items from a list.  However, they are non-POSIX.  Actually,
we can use `compgen -X` to exclude items from the generated list, so
we do not have to use "sed" in the first place.
Currently, in POSIX-bracket expressions [...] in the POSIX sed regular
expressions, a backslash is treated literally, i.e., a single
backslash is enough to include it in the set of characters.  However,
according to a newly added paragraph "APPLICATION USAGE/p5 - sed -
POSIX Issue 8 XCU", POSIX may allow implementations to interpret
backslashes in bracket expressions bracket in the future (in a similar
way as POSIX awk processes).  It recommends using two backslashes in
bracket expressions for the future compatibility.

https://pubs.opengroup.org/onlinepubs/9799919799/utilities/sed.html#tag_20_109_16
@scop
Copy link
Owner

scop commented Feb 2, 2026

Thanks!

@scop scop merged commit e999091 into scop:main Feb 2, 2026
8 checks passed
@akinomyoga akinomyoga deleted the posix-sed branch February 2, 2026 22:06
@akinomyoga
Copy link
Collaborator Author

Thank you!

akinomyoga added a commit to akinomyoga/bash-completion that referenced this pull request Feb 2, 2026
In the current implementation, we try to filter out the elements of
"${options[@]}" that already appear in "${words[@]}".  The strategy is
to generate

a. "${options[@]}" and
b. the intersect of "${words[@]}" and "${options[@]}"

in a single list and to pick up only unique elements.  However, the
current implementation does not correctly generate "b. the intersect".
It also contains elements of words that contain any of "${options[@]}"
as a substring.  This possibly results in generating some words in the
command line as options.

scop#1564 (comment)

In this patch, we rather implement the filtering using `_compgen -X`.
akinomyoga added a commit to akinomyoga/bash-completion that referenced this pull request Feb 2, 2026
The filtering attempted in the iptables completion have not taken
effect because `s/.../.../p` (contained in "$chain") has been
performed before the filtering by "s/<chains-to-remove>//".  In this
perspective, the ebtables completion did the correct thing by first
performing the filtering and then performing `s/.../.../p`.

However, even with the above issue fixed, both the iptables and
ebtables completion may wrongly reduce a part of chain names, such as
`DOCKER-FORWARD` to `DOCKER-`.  The filtering should be done by the
exact match.

One solution is to split the "sed" invocation into "sed" and "grep -v"
and perform the filtering of the exact match in the "grep" command as
"grep -vEx 'INPUT|OUTPUT|...'".  In this patch, we instead use the
option `-X '@(INPUT|OUTPUT|...)'` of `_comp_compgen_split`.

scop#1564 (comment)

Co-authored-by: Ville Skyttä <ville.skytta@iki.fi>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants