This patch adds support for the main branch having a name other than
"master".
It will use "master" or "main" if available, and fall back to the first
branch name if they're both missing.
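A minimal sketch of the selection rule described above. The function
name is hypothetical, and the preference of "master" over "main" when
both exist is an assumption, not something stated here:

```python
from typing import Optional, Sequence


def pick_main_branch(branches: Sequence[str]) -> Optional[str]:
    """Return the branch to treat as the main one, or None if there are none."""
    # Prefer the conventional names (the order between them is an assumption).
    for name in ("master", "main"):
        if name in branches:
            return name
    # Otherwise fall back to the first branch name available.
    return branches[0] if branches else None
```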
Pygments 2.12 changes the element layout slightly, adding a wrapper
<div> that was accidentally removed before:
https://github.com/pygments/pygments/issues/632.
This patch adds a workaround, so the styling is consistent on both 2.11
and 2.12.
This patch updates Pygments' syntax highlight CSS to support dark mode.
It uses two themes from Pygments: `default` for light (same as before),
and `native` for dark.
This patch extends our CSS to introduce dark mode, so the style shown
matches the user media preference.
It is very similar to the previous one; only minor adjustments have
been made so that the contrast levels pass the accessibility standards.
No changes have been made to the pygments CSS. It works surprisingly
well as-is, but there are some minor changes that may be needed. Those
will be done in subsequent patches.
This patch memoizes some of the functions to help speed up execution.
The speedup is quite variable, but ~30% is normal when generating a
medium size repository, and the output is byte-for-byte identical.
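As an illustration of the technique (not the actual functions that were
memoized), functools.lru_cache can wrap a pure helper so that repeated
calls with the same arguments are served from a cache. The helper name
and the call counter below are hypothetical:

```python
import functools

calls = 0  # Counts how many times the body really runs.


@functools.lru_cache(maxsize=None)
def shorten(commit_id: str) -> str:
    """Hypothetical helper standing in for a more expensive function."""
    global calls
    calls += 1
    return commit_id[:7]


shorten("0123456789abcdef")
shorten("0123456789abcdef")  # Same argument: answered from the cache.
```

This only preserves the output if the memoized functions are pure, that
is, their result depends on nothing but their arguments.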
The Markdown extension for rewriting local links was using an API that
is now deprecated, and as of python-markdown 3.4 it is no longer
available.
This patch adjusts the code to use the new API, which should be
available from 3.0 onwards.
When a <pre> section (commit message, blob, diff) has a very long line,
today it makes the entire page very wide, causing usability issues.
This patch makes <pre> have a horizontal scroll in those cases, which is
easier to use.
The commit message has very large left and right padding, which doesn't
improve readability and might make the commit message more difficult to
read on smaller screens.
This patch shortens the padding.
With the Python 2 to 3 migration and the type checking, we can be
fairly confident that smstr objects are always constructed from strings,
not bytes.
This allows the code to be simplified, as we no longer need to carry
the dual raw/unicode representation.
This patch introduces type annotations, which can be checked with mypy.
The coverage is not very comprehensive for now, but it is a starting
point and will be expanded in later patches.
This patch applies auto-formatting of the source code using black
(https://github.com/psf/black).
This makes the code style more uniform and simplifies editing.
Note I also tried yapf, which IMO produced nicer output and handled some
corner cases much better, but unfortunately it doesn't yet support type
annotations, which will be introduced in later commits.
So in the future we might switch to yapf instead.
Python 3 was released more than 10 years ago, and support for Python 2
is going away, with many Linux distributions starting to phase it out.
This patch migrates git-arr to Python 3.
The generated output is almost exactly the same; there are some minor
differences, such as HTML characters being quoted more aggressively, and
the handling of paths with non-utf8 values.
By default, the markdown generator creates links for local files
transparently. For example, "[text](link.md)" will generate
"<a href=link.md>text</a>".
This works fine for absolute and external links, but breaks for links
relative to the repository itself, as git-arr links are of the form
"dir/f=file.ext.html".
So this patch adds a markdown extension to rewrite the links. It uses a
heuristic to detect them, which should work for the vast majority of
common cases.
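A rough sketch of such a heuristic, using only the standard library
rather than the markdown extension API. The function name and the exact
rules (what counts as "local") are assumptions made for illustration:

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def rewrite_local_link(href: str) -> str:
    """Rewrite "dir/file.ext" into "dir/f=file.ext.html"; leave others alone."""
    parts = urlsplit(href)
    # External ("https://...", "mailto:...") and absolute ("/x") links, as
    # well as pure fragments ("#section"), are returned untouched.
    if parts.scheme or parts.netloc or not parts.path or parts.path.startswith("/"):
        return href
    directory, filename = posixpath.split(parts.path)
    if not filename:
        # A link to a directory ("docs/"); this sketch does not handle those.
        return href
    new_path = posixpath.join(directory, "f=" + filename + ".html")
    # Keep any query string and fragment the original link carried.
    return urlunsplit(("", "", new_path, parts.query, parts.fragment))
```

For example, rewrite_local_link("link.md") gives "f=link.md.html", while
"https://example.com/a.md" is returned as-is.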
The default CSS is not very comfortable for markdown; for example, the
links are hidden.
This patch makes the markdown CSS tunable by wrapping it into a div, and
then adjusting the default styles to increase readability.
As an experiment, make the sections of the summary toggleable. This
can help readability, although it's unclear whether it's worth the
additional complexity, so it could be removed later.
Including the tree as part of the summary gives a bit more information
and provides an easy path into the tree.
It does clutter things a bit, so this is an experiment and may be
removed later.
An empty string as a pathspec element matches all paths, but git has
recently started complaining about it, as it could be problematic for
some operations like rm. In the future, it will be considered an error.
So this patch uses "." instead of the empty pathspec, as recommended.
d426430e6e
We display the location of the repository, but the entire row is not
convenient for copy-pasting.
This patch changes the wording to "git clone" so the entire row can be
copied and pasted into a terminal.
There's a trick, because if we just changed the wording to:
<td>git clone</td> <td>https://example.com/repo</td>
that would get copied as:
git clone\thttps://example.com/repo
which does not work well when pasted into a terminal (as the \t gets
"eaten" in most cases).
So this patch changes the HTML to have a space after "clone":
<td>git clone </td> <td>https://example.com/repo</td>
and the CSS to preserve the space, so the following gets copied:
git clone \thttps://example.com/repo
which works when pasting on a terminal.
There is a significant number of overrides to make the font sizes
smaller, but that can hurt readability in some cases. We should try to
use the "natural" sizes as much as possible.
This patch does that, removing a lot of the font size overrides and
basing the remaining ones on the normal sizes.
It also changes listings to use monospace, for readability.
When using pygments, make the line numbers grey.
This was the intention all along, but the <a> style overrides the <div>
style, so the grey colour does not take effect.
This patch fixes the problem by applying the style specifically to <a>
elements within the line numbers div.
This patch adds a "prefix" configuration option, so repositories created
with recursion are named with a prefix.
This can be useful to disambiguate between repositories that are named
the same but live in different directories.
This patch moves the pages to HTML5, and adds some simple meta tags and CSS media
constraints so things render better on mobile browsers, while leaving the
desktop unaffected.
It's still not ideal, though.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
When there are symbolic links to the same repositories (e.g. if you have
"repo" and a "repo.git" linking to it), it can be useful to ignore some of
them based on regular expressions, to avoid duplicates in the output.
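A small sketch of the idea. The function name is hypothetical, and
whether the expression is applied with search() or match() semantics is
an assumption:

```python
import re
from typing import Iterable, List


def filter_repos(names: Iterable[str], ignore_pattern: str) -> List[str]:
    """Drop the repository names that match the given regular expression."""
    ignore = re.compile(ignore_pattern)
    # search() matches anywhere in the name (an assumption of this sketch).
    return [name for name in names if not ignore.search(name)]


kept = filter_repos(["repo", "repo.git", "other"], r"\.git$")
```

With the pattern above, "repo.git" is skipped and "repo" is kept, so the
repository shows up only once.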
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
The top level index contains a "last updated" field, but it doesn't get
updated when the --only option is used, which is very common in post-update
hooks, and that causes the date to go stale.
This patch fixes that by always generating the top level index, even if --only
was given.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
The page title in a root tree displays as "git >> repo >> branch >>",
which looks odd and fails to convey the fact that the page represents a
tree. Appending a '/' (for example "git >> repo >> branch >> /") makes
it more obvious that the page shows a tree, in general, and the root
tree, in particular.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
For blobs in subdirectories, the page title always includes a double
slash between the final directory component and the filename (for
example, "git >> repo >> branch >> doc//readme.txt"). This is unsightly.
git-arr:blob() ensures that the directory passed to views/blob always
has a trailing slash, so we can drop the slash inserted by views/blob
between the directory and the filename.
As a side-effect, this also changes the page title for blobs in the root
directory. Instead of "git >> repo >> branch >> /readme.txt", the title
becomes "git >> repo >> branch >> readme.txt", which is slightly more
aesthetically pleasing.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Binding (or "pegging") a Repo at a particular branch via new_in_branch()
increases the cognitive burden since the reader must maintain a mental
model of which Repo instances are pegged and which are not. This burden
outweighs whatever minor convenience (if any) is gained by pegging the
Repo at a particular branch. It is easier to reason about the code when
the branch name is passed to clients directly rather than indirectly via
a pegged Repo.
Preceding patches retired all callers of new_in_branch(), therefore
remove it.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Passing the branch name into the view indirectly via
Repo.new_in_branch() increases cognitive burden, thus outweighing
whatever minor convenience (if any) is gained by doing so. The code is
easier to reason about when the branch name is passed to the view
directly.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Passing the branch name into the view indirectly via
Repo.new_in_branch() increases cognitive burden, thus outweighing
whatever minor convenience (if any) is gained by doing so. The code is
easier to reason about when the branch name is passed to the view
directly.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Passing the branch name into the view indirectly via
Repo.new_in_branch() increases cognitive burden, thus outweighing
whatever minor convenience (if any) is gained by doing so. The code is
easier to reason about when the branch name is passed to the view
directly.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Empty (zero-length) blobs are currently rendered by 'pygments'
misleadingly as a single empty line, or, when 'pygments' is unavailable,
as "nothingness" preceding a horizontal rule. In either case, it is
somewhat difficult to glean concrete information about the blob.
Address this by instead rendering summary information about the blob: in
particular, its classification ("empty") and its size ("0 bytes"). This
is analogous to the summary information rendered for binary blobs
("binary" and size).
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Although hexdump(1)-style rendering of binary blob content may reveal
some meaningful information about the data, it wastes even more storage
space than embedding the raw data itself. However, many binary files
have a "magic number" or other signature near the beginning of the file,
so it is often possible to glean useful information from just the
initial chunk of the file without having the entire content available.
Thus, limiting the rendered data to just an initial chunk saves storage
space while still potentially presenting useful information about the
binary content.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Raw binary blob content tends to look like "line noise" and is rarely,
if ever, meaningful. A hexdump(1)-style rendering (specifically,
"hexdump -C"), on the other hand, showing runs of hexadecimal byte
values along with an ASCII representation of those bytes can sometimes
reveal useful information about the data.
(A subsequent patch will add the ability to cap the amount of data
rendered in order to reduce storage space requirements.)
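For illustration, a "hexdump -C"-like rendering can be sketched in a few
lines. This is not necessarily how the patch implements it, the function
name is hypothetical, and the output is only meant to resemble the tool's
layout, not to match it byte for byte in every corner case:

```python
def hexdump(data: bytes) -> str:
    """Render data as offset, 16 hex byte values and their ASCII form."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hex_bytes = ["%02x" % b for b in chunk]
        # Two groups of eight byte values, separated by an extra space and
        # padded so the ASCII column lines up on short final rows.
        hex_col = (" ".join(hex_bytes[:8]) + "  " + " ".join(hex_bytes[8:])).ljust(48)
        # Non-printable bytes are shown as "." in the ASCII column.
        ascii_col = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%08x  %s  |%s|" % (offset, hex_col, ascii_col))
    # Like the tool, finish with the total length as the last offset.
    lines.append("%08x" % len(data))
    return "\n".join(lines)
```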
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Binary blobs are currently rendered as raw data directly into the HTML
output, looking much like "line noise". This is rarely, if ever,
meaningful, and consumes considerable storage space since the entire raw
blob content is embedded in the generated HTML file.
Address this issue by instead emitting summary information about the
blob, such as its classification ("binary") and its size. Other
information can be added as needed.
As in Git itself, a blob is considered binary if a NUL is present in the
first ~8KB.
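The check can be sketched as follows. The function name is hypothetical,
and the probe size of 8000 bytes is an assumption standing in for the
"~8KB" mentioned above:

```python
def is_binary(data: bytes, probe_size: int = 8000) -> bool:
    """Consider the content binary if a NUL appears in the first bytes."""
    # Only the beginning of the blob is inspected, so large blobs stay cheap.
    return b"\0" in data[:probe_size]
```

A NUL that appears only after the probed region does not make the blob
binary under this rule.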
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Batch output of git-cat-file has the form:
<sha1> SP <type> SP <size> LF <contents> LF
It unconditionally includes a trailing line-feed which Repo.blob()
incorrectly returns as part of blob content. For textual blobs, this
extra character is often benign; however, for binary blobs, it can
easily change the meaning of the data in unexpected or disastrous ways.
Fix this by respecting the blob size reported by git-cat-file.
(The alternate approach of unconditionally dropping the final LF also
works; however, respecting the reported size is perhaps a bit more
robust and "correct".)
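A sketch of reading one entry while respecting the reported size. The
function name is hypothetical, the stream is assumed to be a buffered
binary file object, and error cases (such as missing objects) are not
handled:

```python
import io
from typing import BinaryIO, Tuple


def read_batch_entry(stream: BinaryIO) -> Tuple[str, str, bytes]:
    """Read one "<sha1> SP <type> SP <size> LF <contents> LF" entry."""
    header = stream.readline().decode("ascii").rstrip("\n")
    sha1, objtype, size = header.split(" ")
    # Read exactly as many bytes as the header announces...
    contents = stream.read(int(size))
    # ...then consume the separator LF, which is not part of the contents.
    stream.read(1)
    return sha1, objtype, contents


# A fake entry whose 4-byte content itself ends in line feeds.
fake = io.BytesIO(b"a" * 40 + b" blob 4\n\x00\x01\n\n" + b"\n")
entry = read_batch_entry(fake)
```

Because the size is honoured, the two line feeds that belong to the
content are kept, and only the separator is dropped.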
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Historically, the 'blob' view was unconditionally handed cooked
(utf8-encoded) blob content, so embed_image_blob(), which requires raw
blob content, has been forced to reload the blob in raw form, which is
ugly and expensive. However, now that the Blob returned by Repo.blob()
is able to vend raw or cooked content, it is no longer necessary for
embed_image_blob() to reload the blob to gain access to the raw content.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Some blob representations require raw blob content; however, the 'blob'
view is unconditionally handed cooked (utf8-encoded) content, thus
representations which need raw content are forced to reload the blob in
raw form, which is ugly and expensive.
The ultimate goal is to eliminate the wasteful blob reloading when raw
content is needed. Toward that end, teach Blob how to vend raw or cooked
content.
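One possible shape for such an abstraction is sketched below. The
attribute names and the choice of replacing undecodable bytes are
assumptions made for illustration, not a description of the actual
class:

```python
from typing import Optional


class Blob:
    """Holds the raw bytes once and derives the cooked text on demand."""

    def __init__(self, raw_content: bytes) -> None:
        self.raw_content = raw_content
        self._utf8_content: Optional[str] = None

    @property
    def utf8_content(self) -> str:
        # Decode lazily and only once; invalid sequences are replaced
        # instead of raising (an assumption of this sketch).
        if self._utf8_content is None:
            self._utf8_content = self.raw_content.decode("utf8", errors="replace")
        return self._utf8_content
```

A view that needs text reads utf8_content, while one that needs bytes
(for example to embed an image) reads raw_content, without reloading the
blob.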
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Some blob representations (such as embedded images) require raw blob
content; however, the 'blob' view is unconditionally handed cooked
(utf8-encoded) content, thus representations which need raw content are
forced to reload the blob in raw form, which is ugly and expensive (due
to shelling out to git-cat-file a second time).
The ultimate goal is to eliminate the wasteful blob reloading when raw
content is needed. As a first step, introduce a Blob abstraction to be
returned by Repo.blob() rather than the cooked content. A subsequent
change will flesh out Blob, allowing it to return raw or cooked content
on demand without the client having to specify one or the other when
invoking Repo.blob().
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Sneakily extracting the raw 'fd' from the utf8-encoding wrapper
returned by GitCommand.run() is ugly and fragile. Instead, take
advantage of the new formal API for requesting raw command output.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Currently, clients which want the raw output from a Git command must
sneakily extract the raw 'fd' from the utf8-encoding wrapper returned
by GitCommand.run(). This is ugly and fragile. Instead, provide a
formal mechanism for requesting raw output.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Currently, clients which want the raw output from a Git command must
sneakily extract the raw 'fd' from the utf8-encoding wrapper returned
by run_git(). This is ugly and fragile. Instead, provide a formal
mechanism for requesting raw output.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
The 'max_pages' default value of 5 is quite low. Coupled with
'commits_per_page' default 50, this allows for only 250 commits, which
is likely unsuitable for even relatively small projects. Options are to
remove the cap altogether or to raise the default limit. At this time,
choose the latter, which should be friendlier to larger projects, in
general, while still guarding against run-away storage space
consumption.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Branch names in Git may be hierarchical (for example, "wip/parser/fix");
however, git-arr's Bottle routing rules do not take this into account.
Fix this shortcoming.
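The underlying problem can be illustrated with plain regular
expressions instead of Bottle itself. The URL layout and pattern names
below are hypothetical; they only show why a wildcard that stops at "/"
cannot capture a hierarchical branch name:

```python
import re

# A branch wildcard that cannot cross "/" (hypothetical URL layout).
flat = re.compile(r"^/r/(?P<repo>[^/]+)/b/(?P<branch>[^/]+)/$")
# A branch wildcard that is allowed to contain "/".
hierarchical = re.compile(r"^/r/(?P<repo>[^/]+)/b/(?P<branch>.+)/$")

url = "/r/git/b/wip/parser/fix/"
flat_match = flat.match(url)  # None: the branch contains slashes.
branch = hierarchical.match(url).group("branch")
```

A wildcard that accepts slashes can become ambiguous when further path
components follow the branch name, so the real routes may need more care
than this sketch shows.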
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>