[pmg-devel] [PATCH pmg-api v2] fix #3734: scrub 'url' from style tags/attributes

Fri Nov 26 08:28:41 CET 2021

On 25.11.21 15:14, Dominik Csapak wrote:
> if 'view images' for the quarantine is disabled, it is expected that
> *no* images will be loaded. but in addition to img (src/href/etc.)
> also css can load external images via the 'url' directive
> 
> since html scrubber does not parse/iterate over css, we simply remove
> the url+protocol part of those tags/attributes. this technically leaves behind
> invalid css, but the browsers should cope with that.
> (we cannot 'cleanly' remove without much more effort because of quoting)
> 
> also we have to scrub the style tags in 'dump_html' since HTML::Scrubber
> does not have a way to modify the *content* of a tag, only the
> attributes...
> 
> Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
> ---
> changes from v1:
> * replace url with ___ and protocol:// with _ instead of removing
> * move sub out and use the reference
> * always pass $cid_hash and only use it in the function when
>   $view_images is set
> * improve comment to show what 'dump_html' does
> 
> @thomas: a note to our off-list discussion regarding url-encoding the
> protocol: you *could* do it, but the browser does not recognize it as
> a protocol and interprets it as a relative url, so we're safe on
> this regard
> 

Another option: setting a Content Security Policy (CSP).

For this call we could use:
$resp->header("Content-Security-Policy", "default-src 'self'; style-src 'self' 'unsafe-inline';"); 

Maybe even:
"Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline';"

That works out quite well here.
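
To make the two variants above easier to compare, here is a minimal sketch that
assembles the policy string from directive/source pairs. Note that build_csp is
a hypothetical helper, not something that exists in pmg-api, and the commented
$resp->header call assumes the same response object as in the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of pmg-api): serialize an ordered list of
# 'directive' => [sources] pairs into a Content-Security-Policy value.
sub build_csp {
    my (@directives) = @_;

    my @parts;
    while (my ($name, $sources) = splice(@directives, 0, 2)) {
        push @parts, join(' ', $name, @$sources);
    }

    return join('; ', @parts) . ';';
}

# The stricter variant: block everything except inline styles.
my $quarantine_csp = build_csp(
    'default-src' => ["'none'"],
    'style-src'   => ["'unsafe-inline'"],
);

print "$quarantine_csp\n";

# Assumed usage, with $resp being the response object of the API call:
# $resp->header('Content-Security-Policy', $quarantine_csp);
```

With 'default-src' set to 'none', every fetch directive that is not listed
explicitly (img-src, font-src, ...) falls back to "block", so only the inline
styles of the mail remain usable.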

In the long run, a CSP is something we could evaluate in general, at least for API
calls, as only (mostly?) those return dynamic, sometimes user- or foreign-controlled input.

If we would like to set a CSP for everything we'd need something like:

$resp->header("Content-Security-Policy", "default-src 'self'; style-src 'self' 'unsafe-inline'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; img-src 'self' data:;");

to cover it, and ideally switch from 'unsafe-inline' to a nonce- or hash-based
allow list. In any case, that is nothing we'd want to rush out now.
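
For reference, a rough sketch of what the nonce variant could look like. This is
only an illustration under assumptions: generate_nonce is a hypothetical helper,
reading /dev/urandom assumes a Linux host, and the policy simply swaps
'unsafe-inline' for a per-response nonce while leaving everything else from the
header above untouched.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: return a fresh, base64 encoded 128 bit nonce.
# Assumes /dev/urandom is available (Linux).
sub generate_nonce {
    open(my $fh, '<:raw', '/dev/urandom')
        or die "cannot open /dev/urandom: $!\n";
    my $read = read($fh, my $bytes, 16);
    close($fh);
    die "short read from /dev/urandom\n" if !defined($read) || $read != 16;

    return encode_base64($bytes, ''); # '' avoids the trailing newline
}

# A new nonce must be generated for every response.
my $nonce = generate_nonce();

my $csp = "default-src 'self'; "
    . "style-src 'self' 'nonce-$nonce'; "
    . "script-src 'self' 'nonce-$nonce' 'unsafe-eval'; "
    . "img-src 'self' data:;";

print "$csp\n";

# Inline elements are only accepted if they carry the same nonce:
print "<script nonce=\"$nonce\">/* inline code */</script>\n";
```

The catch is that every inline script and style element we generate would have to
carry the nonce attribute, and style="..." attributes cannot be covered by a nonce.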

Also, doing both is an option: the scrubbing avoids the requests in the first place,
so there are no scary errors in the browser console, and the CSP, set for just this
quarantine/content API call, acts as a safety net.