7 Commits
0.15 ... 0.30

Author SHA1 Message Date
Alberto Bertogli
78f1b6def0 Update README
This patch updates README, converting it to markdown, adding more links
and references, and explicitly mentioning the Python 3 dependency.
2020-05-25 02:22:53 +01:00
Alberto Bertogli
4cb2f59dd4 Remove TODO
The TODO includes many obsolete entries and is generally not kept up to
date; remove it to avoid confusion.
2020-05-25 02:22:53 +01:00
Alberto Bertogli
e2155f6b33 Remove unused/unnecessary code
This patch removes some code that is unused and/or unnecessary. Most of
it is left over from previous situations, but is no longer needed.
2020-05-25 02:04:55 +01:00
Alberto Bertogli
aee18d0edd Simplify smstr
With the Python 2 to 3 migration and the type checking, we can be
fairly confident that smstr are always constructed from strings, not
bytes.

This allows the code to be simplified, as we no longer need to carry
the dual raw/unicode representation.
2020-05-24 16:05:18 +01:00
Alberto Bertogli
20b99ee568 Introduce type annotations
This patch introduces type annotations, which can be checked with mypy.

The coverage is not very comprehensive for now, but it is a starting
point and will be expanded in later patches.
2020-05-24 16:04:24 +01:00
Alberto Bertogli
ad950208bf Auto-format the code with black
This patch applies auto-formatting of the source code using black
(https://github.com/psf/black).

This makes the code style more uniform and simplifies editing.

Note I also tried yapf, and IMO it produced nicer output and handled some
corner cases much better, but unfortunately it doesn't yet support type
annotations, which will be introduced in later commits.

So in the future we might switch to yapf instead.
2020-05-24 16:04:04 +01:00
Alberto Bertogli
1183d6f817 Move to Python 3
Python 3 was released more than 10 years ago, and support for Python 2
is going away, with many Linux distributions starting to phase it out.

This patch migrates git-arr to Python 3.

The generated output is almost exactly the same; there are some minor
differences, such as HTML characters being quoted more aggressively, and
the handling of paths with non-utf8 values.
2020-05-24 04:50:39 +01:00
11 changed files with 578 additions and 486 deletions
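
The "handling of paths with non-utf8 values" mentioned in the Python 3 commit shows up in the `unquote()` changes in `git.py` and in the `blob()` route in `git-arr`. The following is a minimal sketch of that decoding chain, not the project's code: the function names `unquote_git_path` and `escaped_to_bytes` are hypothetical, and it assumes git escapes every non-ASCII byte in quoted names (its default behaviour). Names that contain literal backslashes are not handled here.

```python
def unquote_git_path(s: str) -> str:
    """Undo git's quoting of a file name; always returns a str."""
    if len(s) < 2 or not (s[0] == '"' and s[-1] == '"'):
        # Unquoted names are printed by git as-is.
        return s
    inner = s[1:-1]
    # Interpret the backslash escapes. latin1 maps code points 0-255 to the
    # bytes 0x00-0xff one to one, so this yields one code point per byte.
    per_byte = inner.encode("latin1").decode("unicode-escape")
    # Reinterpret those bytes as utf8; bytes that are not valid utf8 stay
    # visible as backslash escapes instead of raising an error.
    return per_byte.encode("latin1").decode("utf8", errors="backslashreplace")


def escaped_to_bytes(path: str) -> bytes:
    """Recover the original bytes from a str returned by unquote_git_path()."""
    return path.encode("utf8").decode("unicode-escape").encode("latin1")


if __name__ == "__main__":
    # The three sample names come from the comment in the git.py diff.
    for quoted in (
        r'"with\ttwo\ttabs"',
        r'"\303\261aca-utf8"',
        r'"\361aca-latin1"',
    ):
        name = unquote_git_path(quoted)
        print(repr(name), escaped_to_bytes(name))
```

The second helper mirrors the `encode("utf8").decode("unicode-escape").encode("latin1")` step in `blob()`, which turns a backslash-escaped name back into the bytes git expects.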

3
.gitignore

@@ -1,3 +1,4 @@
*.pyc
__pycache__
.*.swp
.*
!.gitignore

56
README

@@ -1,56 +0,0 @@
git-arr - A git repository browser
----------------------------------
git-arr is a git repository browser that can generate static HTML instead of
having to run dynamically.
It is smaller, with less features and a different set of tradeoffs than
other similar software, so if you're looking for a robust and featureful git
browser, please look at gitweb or cgit instead.
However, if you want to generate static HTML at the expense of features, then
it's probably going to be useful.
It's open source under the MIT licence, please see the LICENSE file for more
information.
Getting started
---------------
You will need Python, and the bottle.py framework (the package is usually
called python-bottle in most distributions).
If pygments is available, it will be used for syntax highlighting, otherwise
everything will work fine, just in black and white.
First, create a configuration file for your repositories. You can start by
copying sample.conf, which has the list of the available options.
Then, to generate the output to "/var/www/git-arr/" directory, run:
$ ./git-arr --config config.conf generate --output /var/www/git-arr/
That's it!
The first time you generate, depending on the size of your repositories, it
can take some time. Subsequent runs should take less time, as it is smart
enough to only generate what has changed.
You can also use git-arr dynamically, although it's not its intended mode of
use, by running:
$ ./git-arr --config config.conf serve
That can be useful when making changes to the software itself.
Contact
-------
If you want to report bugs, send patches, or have any questions or comments,
just let me know at albertito@blitiri.com.ar.

65
README.md Normal file

@@ -0,0 +1,65 @@
# git-arr - A git repository browser
[git-arr] is a [git] repository browser that can generate static HTML.
It is smaller, with less features and a different set of tradeoffs than
other similar software, so if you're looking for a robust and featureful git
browser, please look at [gitweb] or [cgit] instead.
However, if you want to generate static HTML at the expense of features, then
it's probably going to be useful.
It's open source under the MIT licence, please see the `LICENSE` file for more
information.
[git-arr]: https://blitiri.com.ar/p/git-arr/
[git]: https://git-scm.com/
[gitweb]: https://git-scm.com/docs/gitweb
[cgit]: https://git.zx2c4.com/cgit/about/
## Getting started
You will need [Python 3], and the [bottle.py] framework (the package is usually
called `python3-bottle` in most distributions).
If [pygments] is available, it will be used for syntax highlighting, otherwise
everything will work fine, just in black and white.
First, create a configuration file for your repositories. You can start by
copying `sample.conf`, which has the list of the available options.
Then, to generate the output to `/var/www/git-arr/` directory, run:
```sh
./git-arr --config config.conf generate --output /var/www/git-arr/
```
That's it!
The first time you generate, depending on the size of your repositories, it
can take some time. Subsequent runs should take less time, as it is smart
enough to only generate what has changed.
You can also use git-arr dynamically, although it's not its intended mode of
use, by running:
```
./git-arr --config config.conf serve
```
That can be useful when making changes to the software itself.
[Python 3]: https://www.python.org/
[bottle.py]: https://bottlepy.org/
[pygments]: https://pygments.org/
## Contact
If you want to report bugs, send patches, or have any questions or comments,
just let me know at albertito@blitiri.com.ar.
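
The README says that subsequent runs only generate what has changed. A rough sketch of how such a check can work, based on the modification-time handling visible in `write_to()` in the `git-arr` diff: the helper name `write_if_stale` and the exact comparison are assumptions, since the diff elides that part of the function.

```python
import os
import tempfile
from typing import Callable


def write_if_stale(path: str, render: Callable[[], str], mtime: float) -> bool:
    """Write render() to path unless the file is already as new as mtime.

    Returns True if the file was (re)written, False if it was skipped.
    """
    if os.path.exists(path) and os.stat(path).st_mtime >= mtime:
        # The existing file is up to date; skip the expensive render.
        return False
    dirname = os.path.dirname(path)
    if dirname and not os.path.exists(dirname):
        os.makedirs(dirname)
    with open(path, "w") as f:
        f.write(render())
    # Stamp the file with the source's time so the next run can compare.
    os.utime(path, (mtime, mtime))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out:
        target = os.path.join(out, "r", "repo", "index.html")
        print(write_if_stale(target, lambda: "<html></html>", 1000))
        print(write_if_stale(target, lambda: "<html></html>", 1000))
```

Passing the renderer as a callable, rather than a string, is what makes the skip worthwhile: the page is only produced when the file is actually out of date.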

13
TODO

@@ -1,13 +0,0 @@
In no particular order.
- Atom/RSS.
- Nicer diff:
- Better stat section, with nicer handling of filenames. We should switch to
--patch-with-raw and parse from that.
- Nicer output, don't use pygments but do our own.
- Anchors in diff sections so we can link to them.
- Short symlinks to commits, with configurable length.
- Handle symlinks properly.
- "X hours ago" via javascript (only if it's not too ugly).

431
git-arr

@@ -1,22 +1,17 @@
#!/usr/bin/env python
#!/usr/bin/env python3
"""
git-arr: A git web html generator.
"""
from __future__ import print_function
import configparser
import math
import optparse
import os
import re
import sys
from typing import Union
try:
import configparser
except ImportError:
import ConfigParser as configparser
import bottle
import bottle # type: ignore
import git
import utils
@@ -26,12 +21,13 @@ import utils
# Note this assumes they live next to the executable, and that is not a good
# assumption; but it's good enough for now.
bottle.TEMPLATE_PATH.insert(
0, os.path.abspath(os.path.dirname(sys.argv[0])) + '/views/')
0, os.path.abspath(os.path.dirname(sys.argv[0])) + "/views/"
)
# The path to our static files.
# Note this assumes they live next to the executable, and that is not a good
# assumption; but it's good enough for now.
static_path = os.path.abspath(os.path.dirname(sys.argv[0])) + '/static/'
static_path = os.path.abspath(os.path.dirname(sys.argv[0])) + "/static/"
# The list of repositories is a global variable for convenience. It will be
@@ -46,39 +42,39 @@ def load_config(path):
as configured.
"""
defaults = {
'tree': 'yes',
'rootdiff': 'yes',
'desc': '',
'recursive': 'no',
'prefix': '',
'commits_in_summary': '10',
'commits_per_page': '50',
'max_pages': '250',
'web_url': '',
'web_url_file': 'web_url',
'git_url': '',
'git_url_file': 'cloneurl',
'embed_markdown': 'yes',
'embed_images': 'no',
'ignore': '',
'generate_patch': 'yes',
"tree": "yes",
"rootdiff": "yes",
"desc": "",
"recursive": "no",
"prefix": "",
"commits_in_summary": "10",
"commits_per_page": "50",
"max_pages": "250",
"web_url": "",
"web_url_file": "web_url",
"git_url": "",
"git_url_file": "cloneurl",
"embed_markdown": "yes",
"embed_images": "no",
"ignore": "",
"generate_patch": "yes",
}
config = configparser.SafeConfigParser(defaults)
config = configparser.ConfigParser(defaults)
config.read(path)
# Do a first pass for general sanity checking and recursive expansion.
for s in config.sections():
if config.getboolean(s, 'recursive'):
root = config.get(s, 'path')
prefix = config.get(s, 'prefix')
if config.getboolean(s, "recursive"):
root = config.get(s, "path")
prefix = config.get(s, "prefix")
for path in os.listdir(root):
fullpath = find_git_dir(root + '/' + path)
fullpath = find_git_dir(root + "/" + path)
if not fullpath:
continue
if os.path.exists(fullpath + '/disable_gitweb'):
if os.path.exists(fullpath + "/disable_gitweb"):
continue
section = prefix + path
@@ -86,58 +82,60 @@ def load_config(path):
continue
config.add_section(section)
for opt, value in config.items(s, raw = True):
for opt, value in config.items(s, raw=True):
config.set(section, opt, value)
config.set(section, 'path', fullpath)
config.set(section, 'recursive', 'no')
config.set(section, "path", fullpath)
config.set(section, "recursive", "no")
# This recursive section is no longer useful.
config.remove_section(s)
for s in config.sections():
if config.get(s, 'ignore') and re.search(config.get(s, 'ignore'), s):
if config.get(s, "ignore") and re.search(config.get(s, "ignore"), s):
continue
fullpath = find_git_dir(config.get(s, 'path'))
fullpath = find_git_dir(config.get(s, "path"))
if not fullpath:
raise ValueError(
'%s: path %s is not a valid git repository' % (
s, config.get(s, 'path')))
"%s: path %s is not a valid git repository"
% (s, config.get(s, "path"))
)
config.set(s, 'path', fullpath)
config.set(s, 'name', s)
config.set(s, "path", fullpath)
config.set(s, "name", s)
desc = config.get(s, 'desc')
if not desc and os.path.exists(fullpath + '/description'):
desc = open(fullpath + '/description').read().strip()
desc = config.get(s, "desc")
if not desc and os.path.exists(fullpath + "/description"):
desc = open(fullpath + "/description").read().strip()
r = git.Repo(fullpath, name = s)
r = git.Repo(fullpath, name=s)
r.info.desc = desc
r.info.commits_in_summary = config.getint(s, 'commits_in_summary')
r.info.commits_per_page = config.getint(s, 'commits_per_page')
r.info.max_pages = config.getint(s, 'max_pages')
r.info.commits_in_summary = config.getint(s, "commits_in_summary")
r.info.commits_per_page = config.getint(s, "commits_per_page")
r.info.max_pages = config.getint(s, "max_pages")
if r.info.max_pages <= 0:
r.info.max_pages = sys.maxint
r.info.generate_tree = config.getboolean(s, 'tree')
r.info.root_diff = config.getboolean(s, 'rootdiff')
r.info.generate_patch = config.getboolean(s, 'generate_patch')
r.info.max_pages = sys.maxsize
r.info.generate_tree = config.getboolean(s, "tree")
r.info.root_diff = config.getboolean(s, "rootdiff")
r.info.generate_patch = config.getboolean(s, "generate_patch")
r.info.web_url = config.get(s, 'web_url')
web_url_file = fullpath + '/' + config.get(s, 'web_url_file')
r.info.web_url = config.get(s, "web_url")
web_url_file = fullpath + "/" + config.get(s, "web_url_file")
if not r.info.web_url and os.path.isfile(web_url_file):
r.info.web_url = open(web_url_file).read()
r.info.git_url = config.get(s, 'git_url')
git_url_file = fullpath + '/' + config.get(s, 'git_url_file')
r.info.git_url = config.get(s, "git_url")
git_url_file = fullpath + "/" + config.get(s, "git_url_file")
if not r.info.git_url and os.path.isfile(git_url_file):
r.info.git_url = open(git_url_file).read()
r.info.embed_markdown = config.getboolean(s, 'embed_markdown')
r.info.embed_images = config.getboolean(s, 'embed_images')
r.info.embed_markdown = config.getboolean(s, "embed_markdown")
r.info.embed_images = config.getboolean(s, "embed_images")
repos[r.name] = r
def find_git_dir(path):
"""Returns the path to the git directory for the given repository.
@@ -147,25 +145,26 @@ def find_git_dir(path):
An empty string is returned if the given path is not a valid repository.
"""
def check(p):
"""A dirty check for whether this is a git dir or not."""
# Note silent stderr because we expect this to fail and don't want the
# noise; and also we strip the final \n from the output.
return git.run_git(p,
['rev-parse', '--git-dir'],
silent_stderr = True).read()[:-1]
return git.run_git(
p, ["rev-parse", "--git-dir"], silent_stderr=True
).read()[:-1]
for p in [ path, path + '/.git' ]:
for p in [path, path + "/.git"]:
if check(p):
return p
return ''
return ""
def repo_filter(unused_conf):
"""Bottle route filter for repos."""
# TODO: consider allowing /, which is tricky.
regexp = r'[\w\.~-]+'
regexp = r"[\w\.~-]+"
def to_python(s):
"""Return the corresponding Python object."""
@@ -179,8 +178,9 @@ def repo_filter(unused_conf):
return regexp, to_python, to_url
app = bottle.Bottle()
app.router.add_filter('repo', repo_filter)
app.router.add_filter("repo", repo_filter)
bottle.app.push(app)
@@ -191,18 +191,18 @@ def with_utils(f):
templates.
"""
utilities = {
'shorten': utils.shorten,
'can_colorize': utils.can_colorize,
'colorize_diff': utils.colorize_diff,
'colorize_blob': utils.colorize_blob,
'can_markdown': utils.can_markdown,
'markdown_blob': utils.markdown_blob,
'can_embed_image': utils.can_embed_image,
'embed_image_blob': utils.embed_image_blob,
'is_binary': utils.is_binary,
'hexdump': utils.hexdump,
'abort': bottle.abort,
'smstr': git.smstr,
"shorten": utils.shorten,
"can_colorize": utils.can_colorize,
"colorize_diff": utils.colorize_diff,
"colorize_blob": utils.colorize_blob,
"can_markdown": utils.can_markdown,
"markdown_blob": utils.markdown_blob,
"can_embed_image": utils.can_embed_image,
"embed_image_blob": utils.embed_image_blob,
"is_binary": utils.is_binary,
"hexdump": utils.hexdump,
"abort": bottle.abort,
"smstr": git.smstr,
}
def wrapped(*args, **kwargs):
@@ -216,89 +216,108 @@ def with_utils(f):
return wrapped
@bottle.route('/')
@bottle.view('index')
@bottle.route("/")
@bottle.view("index")
@with_utils
def index():
return dict(repos = repos)
return dict(repos=repos)
@bottle.route('/r/<repo:repo>/')
@bottle.view('summary')
@bottle.route("/r/<repo:repo>/")
@bottle.view("summary")
@with_utils
def summary(repo):
return dict(repo = repo)
return dict(repo=repo)
@bottle.route('/r/<repo:repo>/c/<cid:re:[0-9a-f]{5,40}>/')
@bottle.view('commit')
@bottle.route("/r/<repo:repo>/c/<cid:re:[0-9a-f]{5,40}>/")
@bottle.view("commit")
@with_utils
def commit(repo, cid):
c = repo.commit(cid)
if not c:
bottle.abort(404, 'Commit not found')
bottle.abort(404, "Commit not found")
return dict(repo = repo, c=c)
return dict(repo=repo, c=c)
@bottle.route('/r/<repo:repo>/c/<cid:re:[0-9a-f]{5,40}>.patch')
@bottle.view('patch',
# Output is text/plain, don't do HTML escaping.
template_settings={"noescape": True})
@bottle.route("/r/<repo:repo>/c/<cid:re:[0-9a-f]{5,40}>.patch")
@bottle.view(
"patch",
# Output is text/plain, don't do HTML escaping.
template_settings={"noescape": True},
)
def patch(repo, cid):
c = repo.commit(cid)
if not c:
bottle.abort(404, 'Commit not found')
bottle.abort(404, "Commit not found")
bottle.response.content_type = 'text/plain; charset=utf8'
bottle.response.content_type = "text/plain; charset=utf8"
return dict(repo = repo, c=c)
return dict(repo=repo, c=c)
@bottle.route('/r/<repo:repo>/b/<bname:path>/t/f=<fname:path>.html')
@bottle.route('/r/<repo:repo>/b/<bname:path>/t/<dirname:path>/f=<fname:path>.html')
@bottle.view('blob')
@bottle.route("/r/<repo:repo>/b/<bname:path>/t/f=<fname:path>.html")
@bottle.route(
"/r/<repo:repo>/b/<bname:path>/t/<dirname:path>/f=<fname:path>.html"
)
@bottle.view("blob")
@with_utils
def blob(repo, bname, fname, dirname = ''):
if dirname and not dirname.endswith('/'):
dirname = dirname + '/'
def blob(repo, bname, fname, dirname=""):
if dirname and not dirname.endswith("/"):
dirname = dirname + "/"
dirname = git.smstr.from_url(dirname)
fname = git.smstr.from_url(fname)
path = dirname.raw + fname.raw
# Handle backslash-escaped characters, which are not utf8.
# This matches the generated links from git.unquote().
path = path.encode("utf8").decode("unicode-escape").encode("latin1")
content = repo.blob(path, bname)
if content is None:
bottle.abort(404, "File %r not found in branch %s" % (path, bname))
return dict(repo = repo, branch = bname, dirname = dirname, fname = fname,
blob = content)
return dict(
repo=repo, branch=bname, dirname=dirname, fname=fname, blob=content
)
@bottle.route('/r/<repo:repo>/b/<bname:path>/t/')
@bottle.route('/r/<repo:repo>/b/<bname:path>/t/<dirname:path>/')
@bottle.view('tree')
@bottle.route("/r/<repo:repo>/b/<bname:path>/t/")
@bottle.route("/r/<repo:repo>/b/<bname:path>/t/<dirname:path>/")
@bottle.view("tree")
@with_utils
def tree(repo, bname, dirname = ''):
if dirname and not dirname.endswith('/'):
dirname = dirname + '/'
def tree(repo, bname, dirname=""):
if dirname and not dirname.endswith("/"):
dirname = dirname + "/"
dirname = git.smstr.from_url(dirname)
return dict(repo = repo, branch = bname, tree = repo.tree(bname),
dirname = dirname)
return dict(
repo=repo, branch=bname, tree=repo.tree(bname), dirname=dirname
)
@bottle.route('/r/<repo:repo>/b/<bname:path>/')
@bottle.route('/r/<repo:repo>/b/<bname:path>/<offset:int>.html')
@bottle.view('branch')
@bottle.route("/r/<repo:repo>/b/<bname:path>/")
@bottle.route("/r/<repo:repo>/b/<bname:path>/<offset:int>.html")
@bottle.view("branch")
@with_utils
def branch(repo, bname, offset = 0):
return dict(repo = repo, branch = bname, offset = offset)
def branch(repo, bname, offset=0):
return dict(repo=repo, branch=bname, offset=offset)
@bottle.route('/static/<path:path>')
@bottle.route("/static/<path:path>")
def static(path):
return bottle.static_file(path, root = static_path)
return bottle.static_file(path, root=static_path)
#
# Static HTML generation
#
def is_404(e):
"""True if e is an HTTPError with status 404, False otherwise."""
# We need this because older bottle.py versions put the status code in
@@ -309,17 +328,19 @@ def is_404(e):
else:
return e.status_code == 404
def generate(output, only = None):
def generate(output: str, only=None):
"""Generate static html to the output directory."""
def write_to(path, func_or_str, args = (), mtime = None):
path = output + '/' + path
def write_to(path: str, func_or_str, args=(), mtime=None):
path = output + "/" + path
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.makedirs(dirname)
if mtime:
path_mtime = 0
path_mtime: Union[float, int] = 0
if os.path.exists(path):
path_mtime = os.stat(path).st_mtime
@@ -339,7 +360,7 @@ def generate(output, only = None):
else:
# Otherwise, be lazy if we were given a function to run, or write
# always if they gave us a string.
if isinstance(func_or_str, (str, unicode)):
if isinstance(func_or_str, str):
print(path)
s = func_or_str
else:
@@ -348,71 +369,99 @@ def generate(output, only = None):
print(path)
s = func_or_str(*args)
open(path, 'w').write(s.encode('utf8', errors = 'xmlcharrefreplace'))
open(path, "w").write(s)
if mtime:
os.utime(path, (mtime, mtime))
def link(from_path, to_path):
from_path = output + '/' + from_path
from_path = output + "/" + from_path
if os.path.lexists(from_path):
return
print(from_path, '->', to_path)
print(from_path, "->", to_path)
os.symlink(to_path, from_path)
def write_tree(r, bn, mtime):
t = r.tree(bn)
def write_tree(r: git.Repo, bn: str, mtime):
t: git.Tree = r.tree(bn)
write_to('r/%s/b/%s/t/index.html' % (r.name, bn),
tree, (r, bn), mtime)
write_to("r/%s/b/%s/t/index.html" % (r.name, bn), tree, (r, bn), mtime)
for otype, oname, _ in t.ls('', recursive = True):
for otype, oname, _ in t.ls("", recursive=True):
# FIXME: bottle cannot route paths with '\n' so those are sadly
# expected to fail for now; we skip them.
if '\n' in oname.raw:
print('skipping file with \\n: %r' % (oname.raw))
if "\n" in oname.raw:
print("skipping file with \\n: %r" % (oname.raw))
continue
if otype == 'blob':
if otype == "blob":
dirname = git.smstr(os.path.dirname(oname.raw))
fname = git.smstr(os.path.basename(oname.raw))
write_to(
'r/%s/b/%s/t/%s%sf=%s.html' %
(str(r.name), str(bn),
dirname.raw, '/' if dirname.raw else '', fname.raw),
blob, (r, bn, fname.url, dirname.url), mtime)
"r/%s/b/%s/t/%s%sf=%s.html"
% (
str(r.name),
str(bn),
dirname.raw,
"/" if dirname.raw else "",
fname.raw,
),
blob,
(r, bn, fname.url, dirname.url),
mtime,
)
else:
write_to('r/%s/b/%s/t/%s/index.html' %
(str(r.name), str(bn), oname.raw),
tree, (r, bn, oname.url), mtime)
write_to(
"r/%s/b/%s/t/%s/index.html"
% (str(r.name), str(bn), oname.raw),
tree,
(r, bn, oname.url),
mtime,
)
# Always generate the index, to keep the "last updated" time fresh.
write_to('index.html', index())
write_to("index.html", index())
# We can't call static() because it relies on HTTP headers.
read_f = lambda f: open(f).read()
write_to('static/git-arr.css', read_f, [static_path + '/git-arr.css'],
os.stat(static_path + '/git-arr.css').st_mtime)
write_to('static/git-arr.js', read_f, [static_path + '/git-arr.js'],
os.stat(static_path + '/git-arr.js').st_mtime)
write_to('static/syntax.css', read_f, [static_path + '/syntax.css'],
os.stat(static_path + '/syntax.css').st_mtime)
write_to(
"static/git-arr.css",
read_f,
[static_path + "/git-arr.css"],
os.stat(static_path + "/git-arr.css").st_mtime,
)
write_to(
"static/git-arr.js",
read_f,
[static_path + "/git-arr.js"],
os.stat(static_path + "/git-arr.js").st_mtime,
)
write_to(
"static/syntax.css",
read_f,
[static_path + "/syntax.css"],
os.stat(static_path + "/syntax.css").st_mtime,
)
rs = sorted(repos.values(), key = lambda r: r.name)
rs = sorted(list(repos.values()), key=lambda r: r.name)
if only:
rs = [r for r in rs if r.name in only]
for r in rs:
write_to('r/%s/index.html' % r.name, summary(r))
write_to("r/%s/index.html" % r.name, summary(r))
for bn in r.branch_names():
commit_count = 0
commit_ids = r.commit_ids('refs/heads/' + bn,
limit = r.info.commits_per_page * r.info.max_pages)
commit_ids = r.commit_ids(
"refs/heads/" + bn,
limit=r.info.commits_per_page * r.info.max_pages,
)
for cid in commit_ids:
write_to('r/%s/c/%s/index.html' % (r.name, cid),
commit, (r, cid))
write_to(
"r/%s/c/%s/index.html" % (r.name, cid), commit, (r, cid)
)
if r.info.generate_patch:
write_to('r/%s/c/%s.patch' % (r.name, cid), patch, (r, cid))
write_to(
"r/%s/c/%s.patch" % (r.name, cid), patch, (r, cid)
)
commit_count += 1
# To avoid regenerating files that have not changed, we will
@@ -421,65 +470,83 @@ def generate(output, only = None):
# write.
branch_mtime = r.commit(bn).committer_date.epoch
nr_pages = int(math.ceil(
float(commit_count) / r.info.commits_per_page))
nr_pages = int(
math.ceil(float(commit_count) / r.info.commits_per_page)
)
nr_pages = min(nr_pages, r.info.max_pages)
for page in range(nr_pages):
write_to('r/%s/b/%s/%d.html' % (r.name, bn, page),
branch, (r, bn, page), branch_mtime)
write_to(
"r/%s/b/%s/%d.html" % (r.name, bn, page),
branch,
(r, bn, page),
branch_mtime,
)
link(from_path = 'r/%s/b/%s/index.html' % (r.name, bn),
to_path = '0.html')
link(
from_path="r/%s/b/%s/index.html" % (r.name, bn),
to_path="0.html",
)
if r.info.generate_tree:
write_tree(r, bn, branch_mtime)
for tag_name, obj_id in r.tags():
try:
write_to('r/%s/c/%s/index.html' % (r.name, obj_id),
commit, (r, obj_id))
write_to(
"r/%s/c/%s/index.html" % (r.name, obj_id),
commit,
(r, obj_id),
)
except bottle.HTTPError as e:
# Some repos can have tags pointing to non-commits. This
# happens in the Linux Kernel's v2.6.11, which points directly
# to a tree. Ignore them.
if is_404(e):
print('404 in tag %s (%s)' % (tag_name, obj_id))
print("404 in tag %s (%s)" % (tag_name, obj_id))
else:
raise
def main():
parser = optparse.OptionParser('usage: %prog [options] serve|generate')
parser.add_option('-c', '--config', metavar = 'FILE',
help = 'configuration file')
parser.add_option('-o', '--output', metavar = 'DIR',
help = 'output directory (for generate)')
parser.add_option('', '--only', metavar = 'REPO', action = 'append',
default = [],
help = 'generate/serve only this repository')
parser = optparse.OptionParser("usage: %prog [options] serve|generate")
parser.add_option(
"-c", "--config", metavar="FILE", help="configuration file"
)
parser.add_option(
"-o", "--output", metavar="DIR", help="output directory (for generate)"
)
parser.add_option(
"",
"--only",
metavar="REPO",
action="append",
default=[],
help="generate/serve only this repository",
)
opts, args = parser.parse_args()
if not opts.config:
parser.error('--config is mandatory')
parser.error("--config is mandatory")
try:
load_config(opts.config)
except (configparser.NoOptionError, ValueError) as e:
print('Error parsing config:', e)
print("Error parsing config:", e)
return
if not args:
parser.error('Must specify an action (serve|generate)')
parser.error("Must specify an action (serve|generate)")
if args[0] == 'serve':
bottle.run(host = 'localhost', port = 8008, reloader = True)
elif args[0] == 'generate':
if args[0] == "serve":
bottle.run(host="localhost", port=8008, reloader=True)
elif args[0] == "generate":
if not opts.output:
parser.error('Must specify --output')
generate(output = opts.output, only = opts.only)
parser.error("Must specify --output")
generate(output=opts.output, only=opts.only)
else:
parser.error('Unknown action %s' % args[0])
parser.error("Unknown action %s" % args[0])
if __name__ == '__main__':
if __name__ == "__main__":
main()
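
The `load_config()` changes above replace `SafeConfigParser` with `ConfigParser` and `sys.maxint` with `sys.maxsize`. The following is a small sketch of how per-repository options fall back to defaults with the Python 3 `configparser` module. It is illustrative only: `read_repo_options`, the trimmed `DEFAULTS` table, and the sample section are assumptions, not part of git-arr.

```python
import configparser
import sys

# A subset of the defaults shown in load_config(); values are strings
# because configparser stores everything as text.
DEFAULTS = {
    "tree": "yes",
    "recursive": "no",
    "commits_per_page": "50",
    "max_pages": "250",
}

SAMPLE = """
[myrepo]
path = /srv/git/myrepo.git
tree = no
max_pages = 0
"""


def read_repo_options(text: str) -> dict:
    """Parse a config string and return typed options for each section."""
    config = configparser.ConfigParser(DEFAULTS)
    config.read_string(text)
    options = {}
    for section in config.sections():
        max_pages = config.getint(section, "max_pages")
        if max_pages <= 0:
            # Zero or a negative value means "no limit".
            max_pages = sys.maxsize
        options[section] = {
            "path": config.get(section, "path"),
            "tree": config.getboolean(section, "tree"),
            "commits_per_page": config.getint(section, "commits_per_page"),
            "max_pages": max_pages,
        }
    return options


if __name__ == "__main__":
    print(read_repo_options(SAMPLE))
```

Options missing from a section, such as `commits_per_page` here, are served from the defaults table, while `getboolean()` and `getint()` convert the stored text to the expected types.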

371
git.py

@@ -12,101 +12,86 @@ import subprocess
from collections import defaultdict
import email.utils
import datetime
import urllib
from cgi import escape
import urllib.request, urllib.parse, urllib.error
from html import escape
from typing import Any, Dict, IO, Iterable, List, Optional, Tuple, Union
# Path to the git binary.
GIT_BIN = "git"
class EncodeWrapper:
"""File-like wrapper that returns data utf8 encoded."""
def __init__(self, fd, encoding = 'utf8', errors = 'replace'):
self.fd = fd
self.encoding = encoding
self.errors = errors
def __iter__(self):
for line in self.fd:
yield line.decode(self.encoding, errors = self.errors)
def read(self):
"""Returns the whole content."""
s = self.fd.read()
return s.decode(self.encoding, errors = self.errors)
def readline(self):
"""Returns a single line."""
s = self.fd.readline()
return s.decode(self.encoding, errors = self.errors)
def run_git(repo_path, params, stdin = None, silent_stderr = False, raw = False):
def run_git(
repo_path: str, params, stdin: bytes = None, silent_stderr=False, raw=False
) -> Union[IO[str], IO[bytes]]:
"""Invokes git with the given parameters.
This function invokes git with the given parameters, and returns a
file-like object with the output (from a pipe).
"""
params = [GIT_BIN, '--git-dir=%s' % repo_path] + list(params)
params = [GIT_BIN, "--git-dir=%s" % repo_path] + list(params)
stderr = None
if silent_stderr:
stderr = subprocess.PIPE
if not stdin:
p = subprocess.Popen(params,
stdin = None, stdout = subprocess.PIPE, stderr = stderr)
p = subprocess.Popen(
params, stdin=None, stdout=subprocess.PIPE, stderr=stderr
)
else:
p = subprocess.Popen(params,
stdin = subprocess.PIPE, stdout = subprocess.PIPE,
stderr = stderr)
p = subprocess.Popen(
params,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=stderr,
)
assert p.stdin is not None
p.stdin.write(stdin)
p.stdin.close()
assert p.stdout is not None
if raw:
return p.stdout
# We need to wrap stdout if we want to decode it as utf8, subprocess
# doesn't support us telling it the encoding.
if sys.version_info.major == 3:
return io.TextIOWrapper(p.stdout, encoding = 'utf8',
errors = 'replace')
else:
return EncodeWrapper(p.stdout)
return io.TextIOWrapper(
p.stdout, encoding="utf8", errors="backslashreplace"
)
class GitCommand (object):
class GitCommand(object):
"""Convenient way of invoking git."""
def __init__(self, path, cmd, *args, **kwargs):
def __init__(self, path: str, cmd: str):
self._override = True
self._path = path
self._cmd = cmd
self._args = list(args)
self._kwargs = {}
self._stdin_buf = None
self._args: List[str] = []
self._kwargs: Dict[str, str] = {}
self._stdin_buf: Optional[bytes] = None
self._raw = False
self._override = False
for k, v in kwargs:
self.__setattr__(k, v)
def __setattr__(self, k, v):
if k == '_override' or self._override:
if k == "_override" or self._override:
self.__dict__[k] = v
return
k = k.replace('_', '-')
k = k.replace("_", "-")
self._kwargs[k] = v
def arg(self, a):
def arg(self, a: str):
"""Adds an argument."""
self._args.append(a)
def raw(self, b):
def raw(self, b: bool):
"""Request raw rather than utf8-encoded command output."""
self._override = True
self._raw = b
self._override = False
def stdin(self, s):
def stdin(self, s: bytes):
"""Sets the contents we will send in stdin."""
self._override = True
self._stdin_buf = s
@@ -116,46 +101,37 @@ class GitCommand (object):
"""Runs the git command."""
params = [self._cmd]
for k, v in self._kwargs.items():
dash = '--' if len(k) > 1 else '-'
for k, v in list(self._kwargs.items()):
dash = "--" if len(k) > 1 else "-"
if v is None:
params.append('%s%s' % (dash, k))
params.append("%s%s" % (dash, k))
else:
params.append('%s%s=%s' % (dash, k, str(v)))
params.append("%s%s=%s" % (dash, k, str(v)))
params.extend(self._args)
return run_git(self._path, params, self._stdin_buf, raw = self._raw)
return run_git(self._path, params, self._stdin_buf, raw=self._raw)
class SimpleNamespace (object):
class SimpleNamespace(object):
"""An entirely flexible object, which provides a convenient namespace."""
def __init__(self, **kwargs):
self.__dict__.update(kwargs)
class smstr:
"""A "smart" string, containing many representations for ease of use.
"""A "smart" string, containing many representations for ease of use."""
This is a string class that contains:
.raw -> raw string, authoritative source.
.unicode -> unicode representation, may not be perfect if .raw is not
proper utf8 but should be good enough to show.
.url -> escaped for safe embedding in URLs, can be not quite
readable.
.html -> an HTML-embeddable representation.
"""
def __init__(self, raw):
if not isinstance(raw, str):
raise TypeError("The raw string must be instance of 'str'")
self.raw = raw
self.unicode = raw.decode('utf8', errors = 'replace')
self.url = urllib.pathname2url(raw)
raw: str # string, probably utf8-encoded, good enough to show.
url: str # escaped for safe embedding in URLs (not human-readable).
html: str # HTML-embeddable representation.
def __init__(self, s: str):
self.raw = s
self.url = urllib.request.pathname2url(s)
self.html = self._to_html()
def __cmp__(self, other):
return cmp(self.raw, other.raw)
# Note we don't define __repr__() or __str__() to prevent accidental
# misuse. It does mean that some uses become more annoying, so it's a
# tradeoff that may change in the future.
@@ -163,11 +139,11 @@ class smstr:
@staticmethod
def from_url(url):
"""Returns an smstr() instance from an url-encoded string."""
return smstr(urllib.url2pathname(url))
return smstr(urllib.request.url2pathname(url))
def split(self, sep):
"""Like str.split()."""
return [ smstr(s) for s in self.raw.split(sep) ]
return [smstr(s) for s in self.raw.split(sep)]
def __add__(self, other):
if isinstance(other, smstr):
@@ -176,10 +152,10 @@ class smstr:
def _to_html(self):
"""Returns an html representation of the unicode string."""
html = u''
for c in escape(self.unicode):
if c in '\t\r\n\r\f\a\b\v\0':
esc_c = c.encode('ascii').encode('string_escape')
html = ""
for c in escape(self.raw):
if c in "\t\r\n\r\f\a\b\v\0":
esc_c = c.encode("unicode-escape").decode("utf8")
html += '<span class="ctrlchr">%s</span>' % esc_c
else:
html += c
@@ -187,17 +163,26 @@ class smstr:
return html
def unquote(s):
def unquote(s: str):
"""Git can return quoted file names, unquote them. Always return a str."""
if not (s[0] == '"' and s[-1] == '"'):
# Unquoted strings are always safe, no need to mess with them; just
# make sure we return str.
s = s.encode('ascii')
# Unquoted strings are always safe, no need to mess with them
return s
# Get rid of the quotes, we never want them in the output, and convert to
# a raw string, un-escaping the backslashes.
s = s[1:-1].decode('string-escape')
# The string will be of the form `"<escaped>"`, where <escaped> is a
# backslash-escaped representation of the name of the file.
# Examples: "with\ttwo\ttabs" , "\303\261aca-utf8", "\361aca-latin1"
# Get rid of the quotes, we never want them in the output.
s = s[1:-1]
# Un-escape the backslashes.
# latin1 is ok to use here because in Python it just maps the code points
# 0-255 to the bytes 0x00-0xff, which is what we expect.
s = s.encode("latin1").decode("unicode-escape")
# Convert to utf8.
s = s.encode("latin1").decode("utf8", errors="backslashreplace")
return s
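The two-step decoding in the new `unquote()` can be tried in isolation. The sketch below copies the steps shown in the diff into a standalone function and runs them on the three example names from the comments. It is an illustration, not a replacement for the module code.

```python
def unquote(s: str) -> str:
    """Undo git's quoting of a file name, following the steps in the diff."""
    if not (s[0] == '"' and s[-1] == '"'):
        # Unquoted names are returned unchanged.
        return s

    # Drop the surrounding quotes.
    s = s[1:-1]

    # Step 1: interpret the backslash escapes. latin1 maps code points
    # 0-255 one-to-one onto bytes 0x00-0xff, so nothing is lost here.
    s = s.encode("latin1").decode("unicode-escape")

    # Step 2: turn the code points back into bytes and decode them as
    # UTF-8, keeping undecodable bytes visible as \xNN sequences.
    s = s.encode("latin1").decode("utf8", errors="backslashreplace")
    return s


for quoted in ['"with\\ttwo\\ttabs"', '"\\303\\261aca-utf8"', '"\\361aca-latin1"']:
    print(repr(unquote(quoted)))
```

The UTF-8 name decodes to `ñaca-utf8`. The latin1 name is not valid UTF-8, so its first byte comes through as the literal text `\xf1`.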
@@ -205,18 +190,18 @@ def unquote(s):
class Repo:
"""A git repository."""
def __init__(self, path, name = None, info = None):
def __init__(self, path: str, name=None, info=None):
self.path = path
self.name = name
self.info = info or SimpleNamespace()
self.info: Any = info or SimpleNamespace()
def cmd(self, cmd):
"""Returns a GitCommand() on our path."""
return GitCommand(self.path, cmd)
def for_each_ref(self, pattern = None, sort = None, count = None):
def for_each_ref(self, pattern=None, sort=None, count=None):
"""Returns a list of references."""
cmd = self.cmd('for-each-ref')
cmd = self.cmd("for-each-ref")
if sort:
cmd.sort = sort
if count:
@@ -228,61 +213,57 @@ class Repo:
obj_id, obj_type, ref = l.split()
yield obj_id, obj_type, ref
def branches(self, sort = '-authordate'):
def branches(self, sort="-authordate"):
"""Get the (name, obj_id) of the branches."""
refs = self.for_each_ref(pattern = 'refs/heads/', sort = sort)
refs = self.for_each_ref(pattern="refs/heads/", sort=sort)
for obj_id, _, ref in refs:
yield ref[len('refs/heads/'):], obj_id
yield ref[len("refs/heads/") :], obj_id
def branch_names(self):
"""Get the names of the branches."""
return ( name for name, _ in self.branches() )
return (name for name, _ in self.branches())
def tags(self, sort = '-taggerdate'):
def tags(self, sort="-taggerdate"):
"""Get the (name, obj_id) of the tags."""
refs = self.for_each_ref(pattern = 'refs/tags/', sort = sort)
refs = self.for_each_ref(pattern="refs/tags/", sort=sort)
for obj_id, _, ref in refs:
yield ref[len('refs/tags/'):], obj_id
yield ref[len("refs/tags/") :], obj_id
def tag_names(self):
"""Get the names of the tags."""
return ( name for name, _ in self.tags() )
def commit_ids(self, ref, limit = None):
def commit_ids(self, ref, limit=None):
"""Generate commit ids."""
cmd = self.cmd('rev-list')
cmd = self.cmd("rev-list")
if limit:
cmd.max_count = limit
cmd.arg(ref)
cmd.arg('--')
cmd.arg("--")
for l in cmd.run():
yield l.rstrip('\n')
yield l.rstrip("\n")
def commit(self, commit_id):
"""Return a single commit."""
cs = list(self.commits(commit_id, limit = 1))
cs = list(self.commits(commit_id, limit=1))
if len(cs) != 1:
return None
return cs[0]
def commits(self, ref, limit = None, offset = 0):
def commits(self, ref, limit=None, offset=0):
"""Generate commit objects for the ref."""
cmd = self.cmd('rev-list')
cmd = self.cmd("rev-list")
if limit:
cmd.max_count = limit + offset
cmd.header = None
cmd.arg(ref)
cmd.arg('--')
cmd.arg("--")
info_buffer = ''
info_buffer = ""
count = 0
for l in cmd.run():
if '\0' in l:
pre, post = l.split('\0', 1)
if "\0" in l:
pre, post = l.split("\0", 1)
info_buffer += pre
count += 1
@@ -301,11 +282,11 @@ class Repo:
def diff(self, ref):
"""Return a Diff object for the ref."""
cmd = self.cmd('diff-tree')
cmd = self.cmd("diff-tree")
cmd.patch = None
cmd.numstat = None
cmd.find_renames = None
if (self.info.root_diff):
if self.info.root_diff:
cmd.root = None
# Note we intentionally do not use -z, as the filename is just for
# reference, and it is safer to let git do the escaping.
@@ -316,13 +297,13 @@ class Repo:
def refs(self):
"""Return a dict of obj_id -> ref."""
cmd = self.cmd('show-ref')
cmd = self.cmd("show-ref")
cmd.dereference = None
r = defaultdict(list)
for l in cmd.run():
l = l.strip()
obj_id, ref = l.split(' ', 1)
obj_id, ref = l.split(" ", 1)
r[obj_id].append(ref)
return r
@@ -333,39 +314,49 @@ class Repo:
def blob(self, path, ref):
"""Returns a Blob instance for the given path."""
cmd = self.cmd('cat-file')
cmd = self.cmd("cat-file")
cmd.raw(True)
cmd.batch = '%(objectsize)'
cmd.batch = "%(objectsize)"
if isinstance(ref, unicode):
ref = ref.encode('utf8')
cmd.stdin('%s:%s' % (ref, path))
# Format: <ref>:<path>
# Construct it in binary since the path might not be utf8.
cmd.stdin(ref.encode("utf8") + b":" + path)
out = cmd.run()
head = out.readline()
if not head or head.strip().endswith('missing'):
if not head or head.strip().endswith(b"missing"):
return None
return Blob(out.read()[:int(head)])
return Blob(out.read()[: int(head)])
def last_commit_timestamp(self):
"""Return the timestamp of the last commit."""
refs = self.for_each_ref(pattern = 'refs/heads/',
sort = '-committerdate', count = 1)
refs = self.for_each_ref(
pattern="refs/heads/", sort="-committerdate", count=1
)
for obj_id, _, _ in refs:
commit = self.commit(obj_id)
return commit.committer_epoch
return -1
class Commit (object):
class Commit(object):
"""A git commit."""
def __init__(self, repo,
commit_id, parents, tree,
author, author_epoch, author_tz,
committer, committer_epoch, committer_tz,
message):
def __init__(
self,
repo,
commit_id,
parents,
tree,
author,
author_epoch,
author_tz,
committer,
committer_epoch,
committer_tz,
message,
):
self._repo = repo
self.id = commit_id
self.parents = parents
@@ -378,28 +369,30 @@ class Commit (object):
self.committer_tz = committer_tz
self.message = message
self.author_name, self.author_email = \
email.utils.parseaddr(self.author)
self.author_name, self.author_email = email.utils.parseaddr(
self.author
)
self.committer_name, self.committer_email = \
email.utils.parseaddr(self.committer)
self.committer_name, self.committer_email = email.utils.parseaddr(
self.committer
)
self.subject, self.body = self.message.split('\n', 1)
self.subject, self.body = self.message.split("\n", 1)
self.author_date = Date(self.author_epoch, self.author_tz)
self.committer_date = Date(self.committer_epoch, self.committer_tz)
# Only get this lazily when we need it; most of the time it's not
# required by the caller.
self._diff = None
def __repr__(self):
return '<C %s p:%s a:%s s:%r>' % (
self.id[:7],
','.join(p[:7] for p in self.parents),
self.author_email,
self.subject[:20])
return "<C %s p:%s a:%s s:%r>" % (
self.id[:7],
",".join(p[:7] for p in self.parents),
self.author_email,
self.subject[:20],
)
@property
def diff(self):
@@ -411,57 +404,68 @@ class Commit (object):
@staticmethod
def from_str(repo, buf):
"""Parses git rev-list output, returns a commit object."""
if '\n\n' in buf:
if "\n\n" in buf:
# Header, commit message
header, raw_message = buf.split('\n\n', 1)
header, raw_message = buf.split("\n\n", 1)
else:
# Header only, no commit message
header, raw_message = buf.rstrip(), ' '
header, raw_message = buf.rstrip(), " "
header_lines = header.split('\n')
header_lines = header.split("\n")
commit_id = header_lines.pop(0)
header_dict = defaultdict(list)
for line in header_lines:
k, v = line.split(' ', 1)
k, v = line.split(" ", 1)
header_dict[k].append(v)
tree = header_dict['tree'][0]
parents = set(header_dict['parent'])
author, author_epoch, author_tz = \
header_dict['author'][0].rsplit(' ', 2)
committer, committer_epoch, committer_tz = \
header_dict['committer'][0].rsplit(' ', 2)
tree = header_dict["tree"][0]
parents = set(header_dict["parent"])
authorhdr = header_dict["author"][0]
author, author_epoch, author_tz = authorhdr.rsplit(" ", 2)
committerhdr = header_dict["committer"][0]
committer, committer_epoch, committer_tz = committerhdr.rsplit(" ", 2)
# Remove the first four spaces from the message's lines.
message = ''
for line in raw_message.split('\n'):
message += line[4:] + '\n'
message = ""
for line in raw_message.split("\n"):
message += line[4:] + "\n"
return Commit(
repo,
commit_id=commit_id,
tree=tree,
parents=parents,
author=author,
author_epoch=author_epoch,
author_tz=author_tz,
committer=committer,
committer_epoch=committer_epoch,
committer_tz=committer_tz,
message=message,
)
return Commit(repo,
commit_id = commit_id, tree = tree, parents = parents,
author = author,
author_epoch = author_epoch, author_tz = author_tz,
committer = committer,
committer_epoch = committer_epoch, committer_tz = committer_tz,
message = message)
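The header parsing in `Commit.from_str()` can be sketched on its own. The function name `parse_header` and the sample header are made up for illustration. Only the splitting logic mirrors the diff.

```python
from collections import defaultdict


def parse_header(header: str):
    """Split a rev-list style header into the commit id and its fields."""
    lines = header.split("\n")
    commit_id = lines.pop(0)

    # A key can repeat (e.g. several "parent" lines), hence the list values.
    fields = defaultdict(list)
    for line in lines:
        k, v = line.split(" ", 1)
        fields[k].append(v)

    # "Name <email> epoch tz": split from the right so that spaces in the
    # name do not matter.
    author, author_epoch, author_tz = fields["author"][0].rsplit(" ", 2)
    return commit_id, set(fields["parent"]), (author, author_epoch, author_tz)


# Illustrative input only; ids and identity are invented.
sample = "\n".join(
    [
        "aaaa111",
        "tree bbbb222",
        "parent cccc333",
        "parent dddd444",
        "author Some One <someone@example.com> 1590332664 +0100",
    ]
)
print(parse_header(sample))
```

Using `rsplit(" ", 2)` is what keeps an author name with several words intact, since only the last two space-separated fields are the epoch and the timezone.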
class Date:
"""Handy representation for a datetime from git."""
def __init__(self, epoch, tz):
self.epoch = int(epoch)
self.tz = tz
self.utc = datetime.datetime.utcfromtimestamp(self.epoch)
self.tz_sec_offset_min = int(tz[1:3]) * 60 + int(tz[4:])
if tz[0] == '-':
if tz[0] == "-":
self.tz_sec_offset_min = -self.tz_sec_offset_min
self.local = self.utc + datetime.timedelta(
minutes = self.tz_sec_offset_min)
minutes=self.tz_sec_offset_min
)
self.str = self.utc.strftime('%a, %d %b %Y %H:%M:%S +0000 ')
self.str += '(%s %s)' % (self.local.strftime('%H:%M'), self.tz)
self.str = self.utc.strftime("%a, %d %b %Y %H:%M:%S +0000 ")
self.str += "(%s %s)" % (self.local.strftime("%H:%M"), self.tz)
def __str__(self):
return self.str
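The timezone arithmetic in `Date` can be sketched as two small helpers. The names `tz_offset_minutes` and `local_time` are hypothetical. The sketch assumes git's usual `+HHMM` / `-HHMM` offset format and reads the minutes from `tz[3:5]`. The code shown above slices `tz[4:]` instead, which as far as I can tell picks up only the last digit, so an offset such as `+0530` would come out as 300 minutes rather than 330.

```python
import datetime


def tz_offset_minutes(tz: str) -> int:
    """Convert a "+HHMM" or "-HHMM" string into a signed minute count."""
    minutes = int(tz[1:3]) * 60 + int(tz[3:5])
    return -minutes if tz[0] == "-" else minutes


def local_time(epoch: int, tz: str) -> datetime.datetime:
    """Shift a UTC timestamp by the commit's own offset."""
    utc = datetime.datetime.fromtimestamp(epoch, datetime.timezone.utc)
    return utc + datetime.timedelta(minutes=tz_offset_minutes(tz))


print(tz_offset_minutes("+0530"))
print(local_time(0, "+0100").strftime("%H:%M"))
```

The sketch uses a timezone-aware `fromtimestamp()` call rather than `utcfromtimestamp()`, which newer Python versions deprecate. The result still carries a UTC `tzinfo`, so it is only meant for formatting the wall-clock time.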
@@ -469,6 +473,7 @@ class Date:
class Diff:
"""A diff between two trees."""
def __init__(self, ref, changes, body):
"""Constructor.
@@ -488,23 +493,23 @@ class Diff:
ref_id = next(lines)
except StopIteration:
# No diff; this can happen in merges without conflicts.
return Diff(None, [], '')
return Diff(None, [], "")
# First, --numstat information.
changes = []
l = next(lines)
while l != '\n':
l = l.rstrip('\n')
added, deleted, fname = l.split('\t', 2)
added = added.replace('-', '0')
deleted = deleted.replace('-', '0')
while l != "\n":
l = l.rstrip("\n")
added, deleted, fname = l.split("\t", 2)
added = added.replace("-", "0")
deleted = deleted.replace("-", "0")
fname = smstr(unquote(fname))
changes.append((int(added), int(deleted), fname))
l = next(lines)
# And now the diff body. We just store as-is, we don't really care for
# the contents.
body = ''.join(lines)
body = "".join(lines)
return Diff(ref_id, changes, body)
@@ -512,13 +517,15 @@ class Diff:
class Tree:
""" A git tree."""
def __init__(self, repo, ref):
def __init__(self, repo: Repo, ref: str):
self.repo = repo
self.ref = ref
def ls(self, path, recursive = False):
def ls(
self, path, recursive=False
) -> Iterable[Tuple[str, smstr, Optional[int]]]:
"""Generates (type, name, size) for each file in path."""
cmd = self.repo.cmd('ls-tree')
cmd = self.repo.cmd("ls-tree")
cmd.long = None
if recursive:
cmd.r = None
@@ -532,17 +539,17 @@ class Tree:
for l in cmd.run():
_mode, otype, _oid, size, name = l.split(None, 4)
if size == '-':
if size == "-":
size = None
else:
size = int(size)
# Remove the quoting (if any); will always give us a str.
name = unquote(name.strip('\n'))
name = unquote(name.strip("\n"))
# Strip the leading path, the caller knows it and it's often
# easier to work with this way.
name = name[len(path):]
name = name[len(path) :]
# We use a smart string for the name, as it's often tricky to
# manipulate otherwise.
@@ -552,12 +559,12 @@ class Tree:
class Blob:
"""A git blob."""
def __init__(self, raw_content):
def __init__(self, raw_content: bytes):
self.raw_content = raw_content
self._utf8_content = None
@property
def utf8_content(self):
if not self._utf8_content:
self._utf8_content = self.raw_content.decode('utf8', 'replace')
self._utf8_content = self.raw_content.decode("utf8", "replace")
return self._utf8_content

3
pyproject.toml Normal file

@@ -0,0 +1,3 @@
[tool.black]
line-length = 79
include = "(git-arr|git.py|utils.py)$"

106
utils.py

@@ -5,16 +5,16 @@ These are mostly used in templates, for presentation purposes.
"""
try:
import pygments
from pygments import highlight
from pygments import lexers
from pygments.formatters import HtmlFormatter
import pygments # type: ignore
from pygments import highlight # type: ignore
from pygments import lexers # type: ignore
from pygments.formatters import HtmlFormatter # type: ignore
except ImportError:
pygments = None
try:
import markdown
import markdown.treeprocessors
import markdown # type: ignore
import markdown.treeprocessors # type: ignore
except ImportError:
markdown = None
@@ -23,12 +23,16 @@ import mimetypes
import string
import os.path
def shorten(s, width = 60):
import git
def shorten(s: str, width=60):
if len(s) < 60:
return s
return s[:57] + "..."
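Note that `shorten()` as shown accepts a `width` argument but compares against a literal 60 and slices at 57, so the parameter has no effect. A variant that honors `width` might look like the sketch below. It is not part of the commit, and it keeps the same `<` comparison as the original.

```python
def shorten(s: str, width: int = 60) -> str:
    """Truncate s to at most `width` characters, ending in "..." if cut."""
    if len(s) < width:
        return s
    # Reserve three characters for the ellipsis.
    return s[: width - 3] + "..."


print(shorten("x" * 100, width=10))
```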
def can_colorize(s):
def can_colorize(s: str):
"""True if we can colorize the string, False otherwise."""
if pygments is None:
return False
@@ -41,7 +45,7 @@ def can_colorize(s):
# If any of the first 5 lines is over 300 characters long, don't colorize.
start = 0
for i in range(5):
pos = s.find('\n', start)
pos = s.find("\n", start)
if pos == -1:
break
@@ -51,7 +55,8 @@ def can_colorize(s):
return True
def can_markdown(repo, fname):
def can_markdown(repo: git.Repo, fname: str):
"""True if we can process file through markdown, False otherwise."""
if markdown is None:
return False
@@ -61,73 +66,86 @@ def can_markdown(repo, fname):
return fname.endswith(".md")
def can_embed_image(repo, fname):
"""True if we can embed image file in HTML, False otherwise."""
if not repo.info.embed_images:
return False
return (('.' in fname) and
(fname.split('.')[-1].lower() in [ 'jpg', 'jpeg', 'png', 'gif' ]))
return ("." in fname) and (
fname.split(".")[-1].lower() in ["jpg", "jpeg", "png", "gif"]
)
def colorize_diff(s):
lexer = lexers.DiffLexer(encoding = 'utf-8')
formatter = HtmlFormatter(encoding = 'utf-8',
cssclass = 'source_code')
def colorize_diff(s: str) -> str:
lexer = lexers.DiffLexer(encoding="utf-8")
formatter = HtmlFormatter(encoding="utf-8", cssclass="source_code")
return highlight(s, lexer, formatter)
def colorize_blob(fname, s):
def colorize_blob(fname, s: str) -> str:
try:
lexer = lexers.guess_lexer_for_filename(fname, s, encoding = 'utf-8')
lexer = lexers.guess_lexer_for_filename(fname, s, encoding="utf-8")
except lexers.ClassNotFound:
# Only try to guess lexers if the file starts with a shebang,
# otherwise it's likely a text file and guess_lexer() is prone to
# make mistakes with those.
lexer = lexers.TextLexer(encoding = 'utf-8')
if s.startswith('#!'):
lexer = lexers.TextLexer(encoding="utf-8")
if s.startswith("#!"):
try:
lexer = lexers.guess_lexer(s[:80], encoding = 'utf-8')
lexer = lexers.guess_lexer(s[:80], encoding="utf-8")
except lexers.ClassNotFound:
pass
formatter = HtmlFormatter(encoding = 'utf-8',
cssclass = 'source_code',
linenos = 'table',
anchorlinenos = True,
lineanchors = 'line')
formatter = HtmlFormatter(
encoding="utf-8",
cssclass="source_code",
linenos="table",
anchorlinenos=True,
lineanchors="line",
)
return highlight(s, lexer, formatter)
def markdown_blob(s):
def markdown_blob(s: str) -> str:
extensions = [
"markdown.extensions.fenced_code",
"markdown.extensions.tables",
RewriteLocalLinksExtension(),
]
return markdown.markdown(s, extensions = extensions)
return markdown.markdown(s, extensions=extensions)
def embed_image_blob(fname, image_data):
def embed_image_blob(fname: str, image_data: bytes) -> str:
mimetype = mimetypes.guess_type(fname)[0]
return '<img style="max-width:100%;" src="data:{0};base64,{1}" />'.format( \
mimetype, base64.b64encode(image_data))
b64img = base64.b64encode(image_data).decode("ascii")
return '<img style="max-width:100%;" src="data:{0};base64,{1}" />'.format(
mimetype, b64img
)
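The added `.decode("ascii")` matters because `base64.b64encode()` returns `bytes` in Python 3, and formatting `bytes` into a `str` embeds its `repr`. The following sketch shows the difference. The function name `embed_image` is hypothetical, and the two-byte payload is just a stand-in for real image data.

```python
import base64
import mimetypes


def embed_image(fname: str, image_data: bytes) -> str:
    """Build an <img> tag with the image inlined as a data: URI."""
    mimetype = mimetypes.guess_type(fname)[0]
    # Decode to str first; base64 output is always plain ASCII.
    b64img = base64.b64encode(image_data).decode("ascii")
    return '<img style="max-width:100%;" src="data:{0};base64,{1}" />'.format(
        mimetype, b64img
    )


# Without the decode, the bytes repr leaks into the output.
print("{}".format(base64.b64encode(b"hi")))
print(embed_image("x.png", b"hi"))
```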
def is_binary(s):
def is_binary(b: bytes):
# Git considers a blob binary if NUL in first ~8KB, so do the same.
return '\0' in s[:8192]
return b"\0" in b[:8192]
def hexdump(s):
graph = string.ascii_letters + string.digits + string.punctuation + ' '
def hexdump(s: bytes):
graph = string.ascii_letters + string.digits + string.punctuation + " "
b = s.decode("latin1")
offset = 0
while s:
t = s[:16]
hexvals = ['%.2x' % ord(c) for c in t]
text = ''.join(c if c in graph else '.' for c in t)
yield offset, ' '.join(hexvals[:8]), ' '.join(hexvals[8:]), text
while b:
t = b[:16]
hexvals = ["%.2x" % ord(c) for c in t]
text = "".join(c if c in graph else "." for c in t)
yield offset, " ".join(hexvals[:8]), " ".join(hexvals[8:]), text
offset += 16
s = s[16:]
b = b[16:]
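The new `hexdump()` decodes the input as latin1 so that the old character-based loop keeps working. As an alternative sketch (not what the commit does), the same rows can be produced directly from `bytes`, since iterating over `bytes` in Python 3 yields integers. The name `hexdump_rows` is hypothetical.

```python
import string


def hexdump_rows(data: bytes):
    """Yield (offset, first 8 hex values, last 8 hex values, text) per 16 bytes."""
    graph = string.ascii_letters + string.digits + string.punctuation + " "
    printable = set(graph.encode("ascii"))

    for offset in range(0, len(data), 16):
        chunk = data[offset : offset + 16]
        # Each item of a bytes object is already an int in Python 3.
        hexvals = ["%.2x" % v for v in chunk]
        text = "".join(chr(v) if v in printable else "." for v in chunk)
        yield offset, " ".join(hexvals[:8]), " ".join(hexvals[8:]), text


for row in hexdump_rows(b"git\x00\xff"):
    print(row)
```

Both approaches give the same tuples. The latin1 route in the diff has the advantage of leaving the loop body untouched.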
if markdown:
class RewriteLocalLinks(markdown.treeprocessors.Treeprocessor):
"""Rewrites relative links to files, to match git-arr's links.
@@ -137,6 +155,7 @@ if markdown:
Note that we're already assuming a degree of sanity in the HTML, so we
don't re-check that the path is reasonable.
"""
def run(self, root):
for child in root:
if child.tag == "a":
@@ -157,9 +176,8 @@ if markdown:
new_target = os.path.join(head, "f=" + tail + ".html")
tag.set("href", new_target)
class RewriteLocalLinksExtension(markdown.Extension):
def extendMarkdown(self, md, md_globals):
md.treeprocessors.add(
"RewriteLocalLinks", RewriteLocalLinks(), "_end")
"RewriteLocalLinks", RewriteLocalLinks(), "_end"
)


@@ -10,7 +10,7 @@
% relroot = reltree + '../' * (len(branch.split('/')) - 1)
<title>git &raquo; {{repo.name}} &raquo;
{{branch}} &raquo; {{dirname.unicode}}{{fname.unicode}}</title>
{{branch}} &raquo; {{dirname.raw}}{{fname.raw}}</title>
<link rel="stylesheet" type="text/css"
href="{{relroot}}../../../../../static/git-arr.css"/>
<link rel="stylesheet" type="text/css"
@@ -33,7 +33,7 @@
% if not c.raw:
% continue
% end
<a href="{{base.url}}{{c.url}}/">{{c.unicode}}</a> /
<a href="{{base.url}}{{c.url}}/">{{c.raw}}</a> /
% base += c + '/'
% end
<a href="">{{!fname.html}}</a>
@@ -45,7 +45,7 @@
<td>empty &mdash; 0 bytes</td>
</tr>
</table>
% elif can_embed_image(repo, fname.unicode):
% elif can_embed_image(repo, fname.raw):
{{!embed_image_blob(fname.raw, blob.raw_content)}}
% elif is_binary(blob.raw_content):
<table class="nice blob-binary">
@@ -72,12 +72,12 @@
</tr>
% end
</table>
% elif can_markdown(repo, fname.unicode):
% elif can_markdown(repo, fname.raw):
<div class="markdown">
{{!markdown_blob(blob.utf8_content)}}
</div>
% elif can_colorize(blob.utf8_content):
{{!colorize_blob(fname.unicode, blob.utf8_content)}}
{{!colorize_blob(fname.raw, blob.utf8_content)}}
% else:
<pre class="blob-body">
{{blob.utf8_content}}


@@ -1,5 +1,5 @@
<table class="nice toggable ls" id="ls">
% key_func = lambda (t, n, s): (t != 'tree', n.raw)
% key_func = lambda x: (x[0] != 'tree', x[1].raw)
% for type, name, size in sorted(tree.ls(dirname.raw), key = key_func):
<tr class="{{type}}">
% if type == "blob":
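The `key_func` change above is needed because Python 3 removed tuple unpacking in function and lambda parameters (PEP 3113), so `lambda (t, n, s): ...` is a syntax error. A minimal sketch of the replacement form follows. It uses plain strings for the names instead of `smstr` objects, and the sample entries are invented.

```python
# (type, name, size) tuples, shaped like the ones Tree.ls() yields.
entries = [
    ("blob", "b.txt", 10),
    ("tree", "src", None),
    ("blob", "a.txt", 5),
]

# Index into the tuple instead of unpacking it in the parameter list.
# Trees sort first because False < True; ties are broken by name.
key_func = lambda x: (x[0] != "tree", x[1])

ordered = sorted(entries, key=key_func)
print(ordered)
```

The size is left out of the key on purpose: it can be `None` for trees, and comparing `None` with an `int` raises `TypeError` in Python 3.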


@@ -10,7 +10,7 @@
% relroot = reltree + '../' * (len(branch.split('/')) - 1)
<title>git &raquo; {{repo.name}} &raquo;
{{branch}} &raquo; {{dirname.unicode if dirname.unicode else '/'}}</title>
{{branch}} &raquo; {{dirname.raw if dirname.raw else '/'}}</title>
<link rel="stylesheet" type="text/css"
href="{{relroot}}../../../../../static/git-arr.css"/>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
@@ -31,7 +31,7 @@
% if not c.raw:
% continue
% end
<a href="{{base.url}}{{c.url}}/">{{c.unicode}}</a> /
<a href="{{base.url}}{{c.url}}/">{{c.raw}}</a> /
% base += c + '/'
% end
</h3>