How I Created Pacman Digest

It all started while I was upgrading the operating system on my portable road warrior, a ThinkPad X250 running Arch Linux, and pacman told me that I didn’t have enough space for the upgrade. This was the first time I’d ever run into that, so I decided to write a program that would help me (and other pacman users, I guess) better understand space usage on the root partition, where packages are installed. It wasn’t too difficult with a bit of Python and HTML/CSS/JS.

You can check out the completed project here: GitHub


Oh, and in case you didn’t know, pacman is the default package manager for Arch Linux.

The Problem and Goal

The first step in any engineering scenario is to identify the problem, obviously.

Pacman users may have difficulty understanding space usage on their root partition, and because of this they may have trouble managing space on their system. Something that could help is a program that (1) visualizes space usage on their root partition, (2) tells them what their largest packages are, and (3) tells them how much space those packages take up.

Right off the bat, we’ve clearly defined our minimum viable product: a program that

  1. visualizes space usage on a user’s root partition;
  2. tells them what their largest packages are; and
  3. tells them how much space these packages take up.

Some totally arbitrary additions I decided to push on:

Getting Data

On systems using pacman, you can get a list of all installed packages and their information using pacman -Qi. The information for a single package may look something like:

Name            : aircrack-ng
Version         : 1.6-6
Description     : Key cracker for the 802.11 WEP and WPA-PSK protocols
Architecture    : x86_64
URL             : https://www.aircrack-ng.org
Licenses        : GPL2
Groups          : None
Provides        : aircrack-ng-scripts
Depends On      : openssl  sqlite  iw  net-tools  wireless_tools  ethtool  pcre  libpcap  python  zlib  libnl  hwloc  usbutils
Optional Deps   : None
Required By     : None
Optional For    : None
Conflicts With  : aircrack-ng-scripts
Replaces        : aircrack-ng-scripts
Installed Size  : 2.19 MiB
Packager        : Felix Yan <felixonmars@archlinux.org>
Build Date      : Wed Dec 1 13:53:36 2021
Install Date    : Sun Feb 6 18:48:26 2022
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : Signature

I wrote a parser function that reads a single package information entry and puts the relevant data into a list, which it returns; you can read it in the source code if you’re interested. To run the command itself, I used the check_output function provided by the subprocess library:

from subprocess import check_output

# "pacman -Qi" prints an info entry for every installed package;
# entries are separated by blank lines.
def cmd():
	return check_output(["pacman", "-Qi"]).decode("utf-8").split("\n\n")

# Compile a list of packages from the output of "pacman -Qi",
# skipping empty chunks. parse() is the parser function described above.
packages = [parse(e) for e in cmd() if len(e) > 0]
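
For readers who don’t want to dig through the source, here is a minimal sketch of what such a parser could look like. It is not the project’s actual parse function: the names parse_entry and size_to_bytes, the dictionary return type (the real parser returns a list), and the assumption that pacman prints sizes in the English "2.19 MiB" form are all made up for illustration.

```python
# Hypothetical sketch; the project's real parser returns a list instead.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3}


def size_to_bytes(text):
    """Convert a string such as "2.19 MiB" into a number of bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def parse_entry(entry):
    """Turn one "pacman -Qi" entry into a dict of field name -> value."""
    fields = {}
    for line in entry.splitlines():
        # Field lines look like "Name            : aircrack-ng".
        # Lines without the " : " separator (e.g. wrapped values) are skipped.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    if "Installed Size" in fields:
        fields["Installed Size"] = size_to_bytes(fields["Installed Size"])
    return fields


sample = "Name            : aircrack-ng\nInstalled Size  : 2.19 MiB"
print(parse_entry(sample)["Name"])
```

Storing the size as a plain number of bytes keeps later sorting and summing free of unit handling.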

The parse function also converts “KiB,” “MiB,” and “GiB” sizes into bytes, which simplifies the calculations later. The sum of these sizes is the total disk space used by installed packages. To get the rest of the information about the disk, we can use the disk_usage function provided by the shutil library:

import shutil

# Get the total, used, and free space (in bytes) of the root partition.
total, used, free = shutil.disk_usage("/")
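
With the package sizes and the partition figures in hand, the split between package data and everything else is simple arithmetic. The sketch below is an illustration only: usage_breakdown and its argument names are hypothetical, the sample package sizes are made up, and the project does its own version of this math later, in the browser.

```python
import shutil


def usage_breakdown(package_sizes, total, used):
    """Return the percentage of the partition taken by packages, other data, and unused space."""
    pkg_bytes = sum(package_sizes)
    # Guard against a negative value if the reported sizes exceed "used".
    other_bytes = max(used - pkg_bytes, 0)
    return {
        "packages": 100 * pkg_bytes / total,
        "other": 100 * other_bytes / total,
        "unused": 100 * (total - used) / total,
    }


if __name__ == "__main__":
    total, used, free = shutil.disk_usage("/")
    # Made-up package sizes in bytes, purely for demonstration.
    print(usage_breakdown([2_296_381, 1_048_576], total, used))
```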

Interfacing Python and HTML

Now that Python has the relevant data, we can conveniently package it into a single JSON object and write it into the HTML. I wrote a template.html with {{ SUBSTITUTIONS }} placeholders that Python can simply replace in the HTML string. After reading the HTML and substituting the data, we rinse and repeat for the CSS and JS:

import json

# Wrap up all data into a single JSON string.
data = json.dumps({"packages": packages, "total": total, "used": used})

# Read the template and substitute the data.
with open(ROOT + "/template.html", "r") as f:
	html = f.read().replace("{{ DATA }}", data)

# Read the stylesheet and substitute the CSS.
with open(ROOT + "/css/style.css", "r") as f:
	html = html.replace("{{ CSS }}", f.read())

# Read the jQuery and substitute the JS.
with open(ROOT + "/js/jquery-3.6.0.slim.min.js", "r") as f:
	html = html.replace("{{ JQUERY }}", f.read())

# Read the Chart.js and substitute the JS.
with open(ROOT + "/js/chart.min.js", "r") as f:
	html = html.replace("{{ CHARTJS }}", f.read())

# Read the JavaScript and substitute the JS.
with open(ROOT + "/js/script.js", "r") as f:
	html = html.replace("{{ JAVASCRIPT }}", f.read())

# Write to "digest.html"
with open(ROOT + "/digest.html", "w") as f:
	f.write(html)

script.js is where we write our own JavaScript; jQuery and Chart.js are just external libraries used to make life easier. By including their JavaScript directly in the HTML file, we make the output portable (it’s independent of any file structure) and viewable offline.
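
The five nearly identical read-and-replace steps can also be expressed as one loop over a table of placeholders. This is a sketch of that idea rather than the project’s code: the render function and the in-memory template are hypothetical, while the {{ NAME }} placeholder style follows the template described above.

```python
def render(template, substitutions):
    """Replace every "{{ NAME }}" placeholder with its substitution."""
    html = template
    for name, content in substitutions.items():
        html = html.replace("{{ " + name + " }}", content)
    return html


# A tiny stand-in for template.html, kept in memory for demonstration.
template = "<style>{{ CSS }}</style><script>const data = {{ DATA }};</script>"
page = render(template, {"CSS": "body { margin: 0; }", "DATA": '{"total": 1}'})
print(page)
```

One caveat: because the replacements run in sequence, content inserted early that happens to contain a later placeholder would be substituted as well.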

The Embedded Fonts and Graphics Problem

I’m not really going to go super in-depth into the HTML/CSS since it’s honestly not very interesting and rather trivial. If you’re interested, you can check out the source code linked earlier. One problem that I did run into, however, was what I’ll call the embedded fonts and graphics problem.

Traditionally, fonts are imported via a URL like so:

@font-face {
	font-family: fontname;
	src: url("url/to/fontname.ttf");
}

This presented a problem, since portability was something I wanted. I needed to import a custom font, Public Pixel, for the arcade-like styling, but I also wanted the program to generate a single HTML file with everything it needs already wrapped up inside. The solution was a little trick that not many people may know is possible:

@font-face {
    font-family: Arcade;
    src: url(data:font/ttf;base64,AAEAAAAMAIAAAwBAT1MvMkg4OMIAAAFIA...) format('truetype');
}

I cropped the full base64 string out of the snippet above, as otherwise it’d take up the whole page. You can simply base64-encode the font file, and as long as you specify the format and that the data is base64, you can embed it directly in place of a traditional URL.
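
Producing such a data URI takes only a few lines of Python. This is a sketch under assumptions: to_data_uri is a hypothetical helper, and the font path in the usage comment is a placeholder rather than the project’s actual layout.

```python
import base64


def to_data_uri(raw, mime_type):
    """Encode raw bytes as a base64 data URI with the given MIME type."""
    encoded = base64.b64encode(raw).decode("ascii")
    return "data:" + mime_type + ";base64," + encoded


# Usage with a real font file (hypothetical path):
#     with open("fonts/arcade.ttf", "rb") as f:
#         uri = to_data_uri(f.read(), "font/ttf")
print(to_data_uri(b"hello", "font/ttf"))  # data:font/ttf;base64,aGVsbG8=
```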

The same concept was applied for the favicon image:

<link rel="shortcut icon" type="image/x-icon" href="data:image/ico;base64,AAABAAEAEBAQAAEABAAoAQAA...">

Generating Analytics in JS

Generating the analytics with JavaScript is trivial now that the data has been conveniently packaged into a single JSON object and is available directly in the code. Sorting packages by installed size from largest to smallest, for example, is a simple one-liner:

data["packages"].sort((a, b) => b[10] - a[10]);

Finding out how many packages there are total and writing the number to the page is likewise a simple one-liner:

$("#pkgs-count").text(data["packages"].length);

The total space used by all packages? A for loop:

let pkgUsage = 0.0;

/* Yes I use Allman braces. You can @ me because I'm not a coward. */
for (let i = 0; i < data["packages"].length; i++)
{
	pkgUsage += data["packages"][i][10];
}

In the above code snippet, note that I used indices to access specific data (such as size). I used indices instead of more traditional key:value pairs because I knew specific kinds of data would always sit at specific indices, which cuts down on file size (not like anyone’s counting). It’s an assumption I can guarantee, because I wrote the program.
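
To get a feel for the saving, compare the JSON produced for the same record stored as a list and as a dictionary. The record below is a shortened, made-up example, not the project’s actual layout.

```python
import json

# The same package record, once as a positional list and once with keys.
as_list = ["aircrack-ng", "1.6-6", 2296381]
as_dict = {"name": "aircrack-ng", "version": "1.6-6", "size": 2296381}

# Compact separators remove optional whitespace so only the keys differ.
compact = json.dumps(as_list, separators=(",", ":"))
verbose = json.dumps(as_dict, separators=(",", ":"))

print(len(compact), len(verbose))
```

The difference per record is small, but it repeats for every installed package.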

Generating the percentages and the partition chart took just some basic math plus the Chart.js library, whose documentation I consulted to see how to create a pie chart. For the usage table, I take at most 50 packages and generate table entries for them with a simple for loop, building up raw HTML in a variable and then writing it to the page:

let rows = "";

for (let i = 0; i < Math.min(data["packages"].length, 50); i++)
{
	rows += "<tr><td>" + data["packages"][i][0] + "</td>";
	rows += "<td class='table-data-right'><span class='orange'>" + (data["packages"][i][10]/1048576).toFixed(0) + "</span>";
	rows += " MiB</td></tr>";
}

$("#usage-table tr").first().after(rows);
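
The same table could also be rendered on the Python side before the page is written, which makes it easy to escape package names with html.escape. This is an alternative sketch, not what the project does: build_rows and its parameters are hypothetical, and the record layout (name first, size in bytes second) is simplified for the example.

```python
from html import escape

MIB = 1024 ** 2


def build_rows(packages, limit=50):
    """Build <tr> rows for the largest packages; each package is [name, size_in_bytes]."""
    largest = sorted(packages, key=lambda pkg: pkg[1], reverse=True)[:limit]
    rows = ""
    for name, size in largest:
        rows += "<tr><td>" + escape(name) + "</td>"
        rows += "<td class='table-data-right'><span class='orange'>"
        rows += format(size / MIB, ".0f") + "</span> MiB</td></tr>"
    return rows


print(build_rows([["small", 1 * MIB], ["big", 3 * MIB]], limit=1))
```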

Putting all the pieces together, I created a tool that helps me better understand and maintain my system! It was a fun one-day project that helped stave off boredom. That’s one summer project done.

Until next time,

Happy hacking!