blob: render binary blob summary information rather than raw content

Binary blobs are currently rendered as raw data directly into the HTML
output, looking much like "line noise". This is rarely, if ever,
meaningful, and consumes considerable storage space since the entire raw
blob content is embedded in the generated HTML file.

Address this issue by instead emitting summary information about the
blob, such as its classification ("binary") and its size. Other
information can be added as needed.

As in Git itself, a blob is considered binary if a NUL byte is present
in its first ~8KB (the first 8192 bytes in this implementation).
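
The heuristic described above can be sketched in isolation as follows. This is
a minimal illustration, not the code from this commit: it assumes Python 3 and
a `bytes` input (the committed helper tests `'\0'` against a string), and the
`window` parameter is a hypothetical addition to make the cutoff explicit.

```python
def is_binary(data: bytes, window: int = 8192) -> bool:
    """Return True if a NUL byte occurs within the first `window` bytes.

    Assumption: `data` is a bytes object (Python 3). `window` is a
    hypothetical parameter; the commit hard-codes 8192.
    """
    # Only the leading slice is inspected, so a NUL that appears later
    # in a large blob does not affect the result.
    return b"\0" in data[:window]


if __name__ == "__main__":
    print(is_binary(b"plain text\n"))
    print(is_binary(b"\x89PNG\r\n\x1a\n\0\0\0\rIHDR"))
```

Limiting the scan to a fixed prefix keeps the check cheap for large blobs, at
the cost of misclassifying a file whose first NUL byte appears after the
window.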

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Eric Sunshine 2015-01-13 04:57:13 -05:00 committed by Alberto Bertogli
parent 58037e57c5
commit 09c2f33f5a
3 changed files with 12 additions and 0 deletions

@@ -187,6 +187,7 @@ def with_utils(f):
     'markdown_blob': utils.markdown_blob,
     'can_embed_image': utils.can_embed_image,
     'embed_image_blob': utils.embed_image_blob,
+    'is_binary': utils.is_binary,
     'abort': bottle.abort,
     'smstr': git.smstr,
 }

@@ -103,3 +103,6 @@ def embed_image_blob(fname, image_data):
     return '<img style="max-width:100%;" src="data:{0};base64,{1}" />'.format( \
             mimetype, base64.b64encode(image_data))
+def is_binary(s):
+    # Git considers a blob binary if NUL in first ~8KB, so do the same.
+    return '\0' in s[:8192]

@@ -41,6 +41,14 @@
 % if can_embed_image(repo, fname.unicode):
 {{!embed_image_blob(fname.raw, blob.raw_content)}}
+% elif is_binary(blob.raw_content):
+<table class="nice">
+  <tr>
+    <td>
+      binary &mdash; {{'{:,}'.format(len(blob.raw_content))}} bytes
+    </td>
+  </tr>
+</table>
 % elif can_markdown(repo, fname.unicode):
 {{!markdown_blob(blob.utf8_content)}}
 % elif can_colorize(blob.utf8_content):
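
The summary text emitted by the new template branch can be reproduced outside
the template as a small sketch. The function name `binary_summary` is
hypothetical and not part of this commit; it only mirrors the
`'{:,}'.format(len(...))` expression used in the template, where the `,`
format option inserts thousands separators.

```python
def binary_summary(raw_content: bytes) -> str:
    """Build the summary line shown in place of raw binary content.

    Hypothetical helper: the commit inlines this expression in the
    template rather than defining a function for it.
    """
    # '{:,}' formats the byte count with comma thousands separators.
    return "binary &mdash; {:,} bytes".format(len(raw_content))


if __name__ == "__main__":
    print(binary_summary(b"\0" * 1234567))
```

Because only the length of the blob is rendered, the size of the generated
HTML no longer grows with the size of the binary content.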