Commit Graph

19 Commits

Author SHA1 Message Date
Alberto Bertogli
20b99ee568 Introduce type annotations
This patch introduces type annotations, which can be checked with mypy.

The coverage is not very comprehensive for now, but it is a starting
point and will be expanded in later patches.
2020-05-24 16:04:24 +01:00
Alberto Bertogli
ad950208bf Auto-format the code with black
This patch applies auto-formatting of the source code using black
(https://github.com/psf/black).

This makes the code style more uniform and simplifies editing.

Note I also tried yapf, and IMO produced nicer output and handled some
corner cases much better, but unfortunately it doesn't yet support type
annotations, which will be introduced in later commits.

So in the future we might switch to yapf instead.
2020-05-24 16:04:04 +01:00
Alberto Bertogli
1183d6f817 Move to Python 3
Python 3 was released more than 10 years ago, and support for Python 2
is going away, with many Linux distributions starting to phase it out.

This patch migrates git-arr to Python 3.

The generated output is almost exactly the same, there are some minor
differences such as HTML characters being quoted more aggresively, and
handling of paths with non-utf8 values.
2020-05-24 04:50:39 +01:00
Alberto Bertogli
722d765973 markdown: Handle local links
By default, the markdown generator creates links for local files
transparently. For example, "[text](link.md)" will generate
"<a href=link.md>text</a>".

This works fine for absolute and external links, but breaks for links
relative to the repository itself, as git-arr links are of the form
"dir/f=file.ext.html".

So this patch adds a markdown extension to rewrite the links. It uses a
heuristic to detect them, which should work for the vast majority of
common cases.
2018-03-04 20:53:35 +00:00
Alberto Bertogli
53155e566a markdown: Enable table and fenced code extensions
This patch enables the table and fenced code extensions in markdown
processing.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2016-11-03 01:45:46 +00:00
Eric Sunshine
6f3942ce38 blob: render hexdump(1)-style binary blob content
Raw binary blob content tends to look like "line noise" and is rarely,
if ever, meaningful. A hexdump(1)-style rendering (specifically,
"hexdump -C"), on the other hand, showing runs of hexadecimal byte
values along with an ASCII representation of those bytes can sometimes
reveal useful information about the data.

(A subsequent patch will add the ability to cap the amount of data
rendered in order to reduce storage space requirements.)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2015-01-13 19:51:44 +00:00
Eric Sunshine
09c2f33f5a blob: render binary blob summary information rather than raw content
Binary blobs are currently rendered as raw data directly into the HTML
output, looking much like "line noise". This is rarely, if ever,
meaningful, and consumes considerable storage space since the entire raw
blob content is embedded in the generated HTML file.

Address this issue by instead emitting summary information about the
blob, such as its classification ("binary") and its size. Other
information can be added as needed.

As in Git itself, a blob is considered binary if a NUL is present in the
first ~8KB.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2015-01-13 19:51:44 +00:00
Eric Sunshine
50c004f8a5 embed_image_blob: retire reload of image blob
Historically, the 'blob' view was unconditionally handed cooked
(utf8-encoded) blob content, so embed_image_blob(), which requires raw
blob content, has been forced to reload the blob in raw form, which is
ugly and expensive. However, now that the Blob returned by Repo.blob()
is able to vend raw or cooked content, it is no longer necessary for
embed_image_blob() to reload the blob to gain access to the raw content.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2015-01-13 19:51:44 +00:00
Eric Sunshine
1d79988228 Blob: vend raw or cooked content
Some blob representations require raw blob content, however, the 'blob'
view is unconditionally handed cooked (utf8-encoded) content, thus
representations which need raw content are forced to reload the blob in
raw form, which is ugly and expensive.

The ultimate goal is to eliminate the wasteful blob reloading when raw
content is needed. Toward that end, teach Blob how to vend raw or cooked
content.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2015-01-13 19:51:44 +00:00
Eric Sunshine
0ba89d75e6 git.py: introduce Blob abstraction
Some blob representations (such as embedded images) require raw blob
content, however, the 'blob' view is unconditionally handed cooked
(utf8-encoded) content, thus representations which need raw content are
forced to reload the blob in raw form, which is ugly and expensive (due
to shelling out to git-cat-file a second time).

The ultimate goal is to eliminate the wasteful blob reloading when raw
content is needed. As a first step, introduce a Blob abstraction to be
returned by Repo.blob() rather than the cooked content. A subsequent
change will flesh out Blob, allowing it to return raw or cooked content
on demand without the client having to specify one or the other when
invoking Repo.blob().

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2015-01-13 19:51:44 +00:00
Vanya Sergeev
eb7cadd64f Enable line number anchors when using pygments HtmlFormatter
Signed-off-by: Vanya Sergeev <vsergeev@gmail.com>
2014-07-03 00:56:19 +01:00
Alberto Bertogli
54026b7585 Make embedding markdown and images configurable per-repo
This patch introduces the embed_markdown and embed_images configuration
options, so users can enable and disable those features on a per-repository
basis.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2013-11-02 21:12:50 +00:00
Alberto Bertogli
a42d7da6a4 utils: Make the embedded image code use mimetypes
This patch makes minor changes to the code that handles embedded images,
mostly to make it use mimetypes, and to remove SVG support (at least for now)
due to security concerns.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2013-11-02 19:14:36 +00:00
Vanya Sergeev
21522f8a3a Add embed data URI image blob support 2013-11-02 19:07:59 +00:00
Vanya Sergeev
f62ca211eb Add markdown blob support 2013-11-02 19:03:59 +00:00
Alberto Bertogli
9ec2bde5c4 Only guess the lexer if the file starts with "#!"
The lexer guesser based on content is often wrong; to minimize the chances of
that happening, we only use it on files that start with "#!", for which it
usually has smarter rules.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2012-11-27 02:57:31 +00:00
Alberto Bertogli
bad8c52ef2 Fall back to guess the lexer by content
If we can't guess the lexer by the file name, try to guess based on the
content.

This allows pygments to colorize extension-less files, usually scripts.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2012-11-18 14:55:27 +00:00
Alberto Bertogli
62da3ebc08 Use heuristics to decide what to colorize
In practise pygments seems to have a very hard time processing large files and
files with long lines, so try to avoid using it in those cases.

Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2012-11-18 14:55:22 +00:00
Alberto Bertogli
80ef0017d4 Initial commit
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
2012-11-10 17:49:54 +00:00