Flawless APKBUILD parsing concept
This is a draft. Feedback welcome, as well as people who want to actually implement this. (Since the current parsing is still working surprisingly well, there are a lot more urgent issues that I'm looking into, so I won't touch this for a while.)
Motivation and justification
APKBUILDs are shell scripts. Right now we are parsing them with a few lines of Python code, and this works surprisingly well for practically all APKBUILDs that we are using.
However, there will always be cases where it does not work, such as #1800 (closed) currently, or when loops or other shell magic are involved. I think that instead of always adjusting the Python-based parser, maybe it is time to retire it and just execute the shell scripts.
The reason why I preferred the Python approach is that it is fast and safe. I've changed my mind, and now I think we can get equally fast or even faster with a proper cache and parallelization during parsing with make. Regarding safety: right now the shell scripts that are APKBUILDs only get executed when we build packages, and with this change, they would get executed whenever we read them. But we need to trust them either way, and the APKBUILDs come from pmaports.git, not from some random untrusted source on the internet. So I don't see how security would be weakened.
Caching dir
Directory structure, as usual in $WORK (see pmbootstrap config work):
cache_apkbuild/
    version
    full.ini
    dir_main.ini
    dir_coreapps.ini
    dir_kde.ini
    main/
        hello-world.ini
        fbdebug.ini
    kde/...
Implementation idea
- one function that parses all apkbuilds at once (see below)
- that function gets called as soon as pmbootstrap needs to access any apkbuild
- in most cases (no APKBUILDs were changed), Python can directly parse full.ini without starting a shell to parse anything.
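As a rough illustration of that fast path, reading full.ini from Python could look like the sketch below. It assumes the ini layout from the APKBUILD2ini.sh draft further down (one section per $pkgname, values wrapped in double quotes); the function name load_full_ini is hypothetical and not part of pmbootstrap.

```python
import configparser


def _unquote(value):
    """Remove one pair of surrounding double quotes, if present."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def load_full_ini(path):
    """Return {pkgname: {key: value}} parsed from a combined ini file.

    Hypothetical helper: assumes one [pkgname] section per package.
    """
    # interpolation=None keeps '%' characters in values literal
    parser = configparser.ConfigParser(interpolation=None)
    with open(path, encoding="utf-8") as handle:
        parser.read_file(handle)
    return {
        pkgname: {key: _unquote(value)
                  for key, value in parser[pkgname].items()}
        for pkgname in parser.sections()
    }
```

No shell is started here, so the cost is one file read plus the ini parsing.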
Parsing all apkbuilds
- implement logic similar to make:
  - for each $cache/$dir/$pkgname.ini:
    - if $dir/$pkgname/APKBUILD does not exist anymore:
      - remove $pkgname.ini
      - remove dir_*.ini
  - for each $dir/$pkgname/APKBUILD (e.g. main/hello-world/APKBUILD):
    - if $cache/$dir/$pkgname.ini does not exist or its timestamp is older than the APKBUILD:
      - create $pkgname.ini (see below)
  - for each $cache/$dir/$pkgname.ini:
    - if any of these ini files is newer than dir_$dir.ini or dir_$dir.ini does not exist:
      - create dir_$dir.ini by appending all of $cache/$dir/$pkgname.ini in alphabetical order
  - for each $cache/dir_$dir.ini:
    - if any of these ini files is newer than full.ini or full.ini does not exist:
      - create full.ini by appending all of $cache/dir_$dir.ini in alphabetical order
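To make the last two steps more concrete, here is a minimal Python sketch of the make-like aggregation (the draft itself suggests a Makefile for this, so Python is only used for illustration). The function names is_stale, concat_if_stale and update_aggregates are made up. It covers only the creation of dir_$dir.ini and full.ini, not the removal of stale files or the generation of $pkgname.ini.

```python
import os


def is_stale(target, sources):
    """make-like check: target is missing or older than any source."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def concat_if_stale(target, sources, force=False):
    """Rebuild target by appending sources in alphabetical order.

    Returns True if the target was (re)written.
    """
    sources = sorted(sources)
    if not force and not is_stale(target, sources):
        return False
    with open(target, "w", encoding="utf-8") as out:
        for src in sources:
            with open(src, encoding="utf-8") as handle:
                out.write(handle.read())
    return True


def update_aggregates(cache):
    """Refresh every dir_$dir.ini, then full.ini. Returns True if full.ini
    was rewritten."""
    dir_inis = []
    any_rebuilt = False
    for entry in sorted(os.listdir(cache)):
        subdir = os.path.join(cache, entry)
        if not os.path.isdir(subdir):
            continue  # skip version, full.ini and the dir_*.ini files
        pkg_inis = [os.path.join(subdir, name)
                    for name in os.listdir(subdir) if name.endswith(".ini")]
        dir_ini = os.path.join(cache, "dir_" + entry + ".ini")
        any_rebuilt = concat_if_stale(dir_ini, pkg_inis) or any_rebuilt
        dir_inis.append(dir_ini)
    # Force the rebuild when a dir ini changed, so that coarse file
    # timestamps cannot hide the change from the mtime comparison.
    full_ini = os.path.join(cache, "full.ini")
    return concat_if_stale(full_ini, dir_inis, force=any_rebuilt)
```

When nothing changed, this only stats the files and writes nothing.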
Just like make, if none of the files were changed, it is enough to check all the timestamps, and therefore it should definitely be faster than the current approach when there is nothing to parse.
Regarding whether to use make or not, I would do the timestamp checks with Python first. So we don't even need to run make if we find out that the files are up to date. In case they are not up to date, I would run make with a Makefile that implements the above logic (should be pretty readable in a Makefile). We would run make in the native Alpine chroot.
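A minimal sketch of such a Python-side check, under the assumption that the aports tree has the flat $dir/$pkgname/APKBUILD layout used in this draft; cache_is_up_to_date is a hypothetical name. It only decides whether make needs to run at all and does not rebuild anything.

```python
import os


def cache_is_up_to_date(aports, cache):
    """Return True if every APKBUILD has a cached ini that is not older,
    no cached ini is left over from a removed package, and full.ini
    exists. Hypothetical helper, not part of pmbootstrap."""
    if not os.path.exists(os.path.join(cache, "full.ini")):
        return False
    expected = set()
    for dirname in os.listdir(aports):
        dirpath = os.path.join(aports, dirname)
        if not os.path.isdir(dirpath):
            continue
        for pkgname in os.listdir(dirpath):
            apkbuild = os.path.join(dirpath, pkgname, "APKBUILD")
            if not os.path.isfile(apkbuild):
                continue
            ini = os.path.join(cache, dirname, pkgname + ".ini")
            if not os.path.exists(ini):
                return False  # new package, never parsed
            if os.path.getmtime(ini) < os.path.getmtime(apkbuild):
                return False  # APKBUILD changed after it was parsed
            expected.add(ini)
    # A cached ini without a matching APKBUILD means a package was removed
    for dirname in os.listdir(cache):
        dirpath = os.path.join(cache, dirname)
        if not os.path.isdir(dirpath):
            continue
        for name in os.listdir(dirpath):
            if not name.endswith(".ini"):
                continue
            if os.path.join(dirpath, name) not in expected:
                return False
    return True
```

This sketch trusts that dir_$dir.ini and full.ini were completed by the previous run; a stricter version could compare their timestamps as well.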
Create $pkgname.ini
I'm choosing ini as intermediate format, because it is simple to create from shell scripts and Python can parse it natively.
- put an APKBUILD2ini.sh script into the native chroot, which looks like this:
#!/bin/sh -e
# Source the APKBUILD passed as parameter to this script
. "$1"
# strip new lines from each variable
pkgdesc="$(echo "$pkgdesc" | tr ...)"
...
echo "[$pkgname]"
echo "pkgdesc=\"$pkgdesc\""
...
- run APKBUILD2ini.sh /path/to/APKBUILD > $cache/$dir/$pkgname.ini
EDIT: @MartijnBraam proposed in #postmarketOS that we could also use JSON as intermediate format. Basically apk add jq (~2 MB), then:
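#!/bin/sh -e
# Export every variable the APKBUILD assigns, because jq's env only
# contains exported environment variables
set -a
. "$1"
jq -n env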
This way we would not have problems with encoding special characters, and since Python can parse JSON natively, it should still be pretty fast.
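On the Python side, consuming that JSON output could look roughly like the sketch below. Since the dump contains the whole environment (PATH, HOME and so on), it picks out the APKBUILD variables. The selection in APKBUILD_KEYS and the function name apkbuild_from_env_json are assumptions for illustration; the real list would depend on what pmbootstrap needs.

```python
import json

# Hypothetical selection of APKBUILD variables to keep
APKBUILD_KEYS = ("pkgname", "pkgver", "pkgrel", "pkgdesc", "arch",
                 "depends", "makedepends", "subpackages")
# Variables that hold whitespace-separated lists in APKBUILDs
LIST_KEYS = ("arch", "depends", "makedepends", "subpackages")


def apkbuild_from_env_json(text):
    """Turn the JSON environment dump into a dict of APKBUILD variables."""
    env = json.loads(text)
    result = {}
    for key in APKBUILD_KEYS:
        value = env.get(key, "")
        # split() without arguments also handles newlines and tabs
        result[key] = value.split() if key in LIST_KEYS else value
    return result
```

Newlines and quotes inside values arrive already escaped by jq, so no manual stripping is needed.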
Notes
- Python also has its own "pickle" format for serialized data, which it can possibly parse even faster than ini, so maybe it is even worth going the extra mile of creating a full.pickle file. Creating the cache would take a bit longer though, so it might not be worth it, or it may not make much of a difference, in which case I'd rather go with having less code.
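For comparison, a full.pickle variant would only need a few lines on top of the parsed data. This is a sketch with made-up function names; whether it is actually faster than parsing the ini would have to be measured.

```python
import pickle


def write_pickle_cache(packages, path):
    """Serialize the parsed {pkgname: {key: value}} dict to path."""
    with open(path, "wb") as handle:
        pickle.dump(packages, handle, protocol=pickle.HIGHEST_PROTOCOL)


def read_pickle_cache(path):
    """Load the dict back. Only use this on files that were generated
    locally, since unpickling untrusted data can execute code."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Usage would be write_pickle_cache(packages, "full.pickle") after full.ini has been rebuilt, and read_pickle_cache("full.pickle") on later runs.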