Edgewall Software

Ticket #1153 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

Broken display of changeset numbers when using URL-style format

Reported by: haui at haumacher.de Owned by: cboos
Priority: normal Milestone: 0.9
Component: wiki system Version: 0.8
Severity: normal Keywords: changeset, link, question mark.
Cc:

Description

The following log message

Add-on to changeset:123:
 * Some change.
 * Yet another change.

is displayed as:

Add-on to changeset:123:

  • Some change.
  • Yet another change.

In wiki mode, the reference to the changeset is rendered as

changeset:123:?

with an additional (magic?) question mark and with the link expanded around the closing colon.

When displaying the log message of a changeset, the link is rendered as

changeset:123:[[BR]]

May this problem be introduced in changeset:776?

Attachments

shref_fixes_r2399.patch (3.8 KB) - added by cboos 3 years ago.
Combining fixes for this ticket and the related #2240
dont_escape_wiki_text_beforehand.patch (7.8 KB) - added by cboos 3 years ago.
Keep "<", ">" and "&" in the Wiki text being parsed, and escape them during the parsing. This enables an easier manipulation of those characters in regexps (instead of having to handle the corresponding substituted strings). Patch made on r2403.

Change History

Changed 3 years ago by cboos

The behavior is now slightly better, as the question mark is not displayed anymore. However, the ":" will be still part of the link.

One way to correctly write such content would be:

Add-on to changeset:"123":
 * Some change.
 * Yet another change.

which displays as:

Add-on to changeset:"123":

  • Some change.
  • Yet another change.

I'm not sure if we can close this as a "worksforme" or try to improve the situation, to make the "123" quoting unecessary.

However, as I'd rather not exclude the ":" or any other character from the target part in the shref/lhref regexps, I don't see how this could be done.

Changed 3 years ago by moschny at ipd uni-karlsruhe de

What about this patch (against 0.9b2):

  • trac/wiki/formatter.py

    old new  
    151151                  r"(?P<htmlescapeentity>!?&#\d+;)"] 
    152152    _post_rules = [(r"(?P<shref>!?((?P<sns>%s):" % LINK_SCHEME + 
    153153                    r"(?P<stgt>'[^']+'|\"[^\"]+\"|" 
    154                     r"((\|(?=[^| ])|[^| ])*[^|'~_\., \)]))))"), 
     154                    r"((\|(?=[^| ])|[^| ])*[^|'~_\., \)" "\n" r":]))))"), 
    155155                   (r"(?P<lhref>!?\[(?:(?P<lns>%s):" % LINK_SCHEME + 
    156156                    r"(?P<ltgt>'[^']+'|\"[^\"]+\"|[^\] ]*)" 
    157157                    r"|(?P<rel>[/.][^ [\]]*))" 

It fixes the problem of Bernhards original bug report + one additional problem:

The formatting of the said changelog entry is even more broken in the timeline view, because newlines are not stripped by the one-line flavor of the wiki parser.

The patch addresses both problems by disallowing ':' and '\n' as the last character in the target part.

Changed 3 years ago by cboos

  • owner changed from jonas to cboos
  • milestone set to 0.9

This is somehow related to both #2172 and #2240.

Changed 3 years ago by cboos

You might be interested to test the following attachment:ticket:2172:oneliner_fixes_and_tests.patch

Changed 3 years ago by moschny at ipd uni-karlsruhe de

The patch can be applied to r2391 (and not to 0.9b2). While it seems to fix the main problem described in this ticket, some problems remain. See for example the last sentence of the main ticket text:

May this problem be introduced in changeset:776?

The question mark after the changeset number becomes part of the (thus broken) link. Not sure, but can a qm really be the last character of the target part?

Changed 3 years ago by cboos

  • status changed from new to closed
  • resolution set to fixed

Fixed in r2395, which also exclude "?" and "!" from the allowed SHREF_TARGET_LAST_CHAR.

Changed 3 years ago by moschny at ipd uni-karlsruhe de

  • status changed from closed to reopened
  • resolution fixed deleted

Well, I hate to nit-pick, but there are still some problems left:

  • monospace changeset:123 or changeset:123 don't work (no link generated) - is "monospace font-style" to be understood as "verbatim", i.e. without further processing?
  • subscript changeset:123: The link is broken, it contains a caret (and a colon) at the end
  • backslash, ampersand, gt, double quote as last characters of the target:
    • changeset:123\
    • changeset:123>
    • changeset:123&
    • ... there might be more problems with non-alphanumerical last characters, so maybe instead of a black-list, a white-list regexp should be used (utilizing \w of course)?

Changed 3 years ago by cboos

No problem :)

Yes, "monospace font-style" is to be understood as "verbatim", i.e. without further processing. It provides exactly the same quoting as {{{...}}} does.

For the rest, please test the following:

Index: formatter.py
===================================================================
--- formatter.py        (revision 2396)
+++ formatter.py        (working copy)
@@ -138,7 +138,7 @@
     QUOTED_STRING = r"'[^']+'|\"[^\"]+\""
 
     SHREF_TARGET_MIDDLE = r"(?:\|(?=[^|\s])|&(?!lt;)|[^|&\s])"
-    SHREF_TARGET_LAST_CHAR = r"[^|'~_\.,&\s\)\]:?!]"
+    SHREF_TARGET_LAST_CHAR = r"[a-zA-Z0-9/=]"
 
     LHREF_RELATIVE_TARGET = r"[/.][^\s[\]]*"

That supports all the current unit-tests, but I'm sure that we miss something with that...

Also, "\w" can't be used, as it contains "_" and therefore will make fail the following: __ticket:1__.

We're going to need something similar for the first char (see #2240).

Changed 3 years ago by moschny at ipd uni-karlsruhe de

Almost working now, but > and & make problems:

are rendered wrongly: The links contain a trailing ">" resp. "&", and additionally a semicolon is printed after the link. The link itself (the href attribute) contains a trailing "&gt" resp. "&amp" (without semicolon).

It seems to me that the raw wiki text contains &gt; and &amp;, which is wrong (?), but otoh parsed correctly by the shref rule.

Regarding {{{..}}}: I would consider this at least a documentation bug. The one-liner version is clearly placed in the font-style section.

Changed 3 years ago by cboos

Combining fixes for this ticket and the related #2240

Changed 3 years ago by cboos

The attachment:shref_fixes_r2399.patch should now also handle the case of changeset:123>,

changeset:123& can't be easily fixed as this would break the query: and search: syntax, where we need to have "&" in the target.

I think the way to go would be to not HTML escape the wiki text beforehand, but do that while doing the parsing.

I'll have a try, but I suspect this would be a too big change for 0.9, now.

Changed 3 years ago by moschny at ipd uni-karlsruhe de

The patch looks good, thanks.

Changed 3 years ago by cboos

  • status changed from reopened to closed
  • resolution set to fixed

Ok, I applied attachment:shref_fixes_r2399.patch in r2401.

I also experimented with the idea of not escaping the "<", ">" and "&" before doing the Wiki parsing. This allows to handle the changeset:123& situation easily.

While I succeeded in having all the unit-tests pass, this is quite a big change and I suspect it would make the Wiki engine less robust than it currently is. At that point near the 0.9 release, it's not the right moment to introduce that change.

However, one might consider the idea later, so I'll drop the patch here (attachment:dont_escape_wiki_text_beforehand.patch).

Changed 3 years ago by cboos

Keep "<", ">" and "&" in the Wiki text being parsed, and escape them during the parsing. This enables an easier manipulation of those characters in regexps (instead of having to handle the corresponding substituted strings). Patch made on r2403.

Changed 3 years ago by moschny at ipd uni-karlsruhe de

Shall I create a new ticket for the changeset:123& case so it doesn't get lost?

Changed 3 years ago by cboos

Yes, please. That would be an ER for milestone:1.0.

Changed 3 years ago by cboos

See the ticket #2455 for the changeset:123& case.

Add/Change #1153 (Broken display of changeset numbers when using URL-style format)

Author



Change Properties
<Author field>
Action
as closed
Next status will be 'reopened'
to The owner will change from cboos. Next status will be 'closed'
 
Note: See TracTickets for help on using tickets.