ad_dom_sanitize_html: fixed 2 resource leaks

In case of parsing errors in the input string, the following structures leaked:

- dom tree

- struct::tree

Always use "--" in "dom parse" when document is interpolated

This is a safety measure to make sure that the parsed document is never confused with an option when the document starts with a "-". In the best case, the error message provided by "dom parse" might be misleading. This might be a problem for user-contributed documents (passed as variables or as return values from functions).

The double dash is supported in tdom since version 0.9.0.

  1. … 18 more files in changeset.
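
A minimal sketch of the "--" guard described in the changeset above, assuming tdom 0.9.0 or later is installed. The helper name parse_user_html is hypothetical and not part of OpenACS; it also frees the DOM tree explicitly, in the spirit of the leak fix mentioned in the same changeset.

```tcl
package require tdom

# parse_user_html is a hypothetical helper, for illustration only.
# It returns 1 when the input could be parsed and 0 otherwise.
proc parse_user_html {html} {
    # "--" ends option processing, so $html is always taken as the
    # document, even when it begins with "-".
    if {[catch {dom parse -html -- $html} doc]} {
        # Parsing failed; no DOM tree was created, nothing to free.
        return 0
    }
    # Free the DOM tree explicitly so it does not leak.
    $doc delete
    return 1
}
```

Without the "--", an interpolated document starting with "-" could be interpreted as an option to "dom parse" rather than as data.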
Fixed regression test and make more use of "aa_test_start" and "aa_test_end"

This change reduces the errors in the log file and lets the regression test run (on my site) without errors reported by acs_kernel__server_startup_ok.

  1. … 3 more files in changeset.
Complete the URL earlier, so that protocol-relative URLs can be correctly determined as external or not

Rework URL validation in ad_dom_sanitize_html

We now prefer the higher-level API to determine:

- if a URL is external

- what protocol should be assumed for a URL when it is relative or protocol-relative

Made the test for valid protocols case-insensitive

In the end we do phase out the util_expand_entities* procs for being too lame

Good riddance

  1. … 1 more file in changeset.
Reimplement util_expand_entities_ie_style

This proc turned out to be long broken. We could consider phasing it out, but as it is a public interface used in a few places we prefer to keep it around and try to fix it.

The intended behavior has been reconstructed from the documentation. The new approach uses a single regexp to extract entities, so it cannot loop indefinitely as the previous implementation could.

fixed broken indentation and broken nesting

    • -1149
    • +1148
    ./text-html-procs.tcl
fix incorrect nesting in switch statements

improved spelling

  1. … 5 more files in changeset.
Untangle if logics

Reject URLs displaying multiple protocols

Strengthen validation against smarter attempts to disguise the javascript: protocol

Manually replace the entity for ":" to prevent attempts at disguising "javascript:" links

When using ad_dom_sanitize_html to validate markup, treat failure to parse as a normal validation failure, rather than an error

reduce verbosity

Use a better regexp to reimplement ad_looks_like_html_p, and use the improved API to port a downstream feature: an ad_form datatype validator that does not allow markup to be inserted

Many thanks to Günter Ernst

  1. … 1 more file in changeset.
Deprecate trivial wrappers for ad_html_text_convert

  1. … 2 more files in changeset.
improve robustness

  1. … 1 more file in changeset.
make parsing more robust

fix typo

reduce verbosity

  1. … 2 more files in changeset.
Small improvements:

- use "string is space" instead of trimming the string and checking if empty, at least 2x faster on development, wherever we don't need the trimmed value

- modernize leftover foreach trick with lassign

  1. … 6 more files in changeset.
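
The two idiom changes in the changeset above can be sketched as follows. This is an illustration under assumptions, not code taken from the changeset: the sample values and variable names are hypothetical.

```tcl
# Hypothetical sample values, for illustration only.
set value "  \t\n"
set pair  {width 80}

# Older idiom: build a trimmed copy just to compare it with "".
set blank_old [expr {[string trim $value] eq ""}]

# Newer idiom: test the character class directly; no copy is made.
set blank_new [string is space $value]

# Older idiom: use foreach with break to unpack a list into variables.
foreach {key_old size_old} $pair break

# Newer idiom: lassign (Tcl 8.5+) states the intent directly.
lassign $pair key_new size_new
```

The two blank checks are interchangeable only where the trimmed value itself is not needed afterwards, as the commit message points out.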
if truncate_len is provided we have to call util_close_html_tags for truncating the string

Rename proc according to convention enforced in acs-tcl: naming__proc_naming

comment tags, which are NOT supported by HTML5, allow "abbr" in enhanced text

Reduce verbosity

Fix typo

implemented ad_html_security_check based on ns_parsehtml