[Feature] ARM/aarch64 support

Hey,

Since Tutor can run entirely on Kubernetes, it would be a great idea to support multi-architecture images. MySQL and Mongo already have ARM variants on Docker Hub.

There are some instructions we can add to the Tutor build steps to make sure all the images are built for multiple architectures. That way, Tutor can be used on these platforms (including Mac M1) as smoothly as possible, given that you’re using Docker Desktop or minikube with a VM.

The following article explains further how to build it all on a single machine:

I would love to help; I’m just not familiar with the codebase.

Thank you.

Check out "Running Tutor on ARM-based systems" in the Tutor documentation.
Key point: MySQL does not support ARM right now.

I think that the official BuildKit approach is much easier, and I don’t have to rebuild all the containers on my machine.
Regarding MySQL: 8.0 officially supports ARM, so a Helm chart or a specific Tutor configuration can probably solve that.

And how is it supposed to work with Kubernetes?

Yes, it’s easier for you, but not for me, as I would have to run a CI on arm64 to build the images.

Open edX does not support MySQL 8.0.

Please do check the conversation here: Support for ARM64 · Issue #510 · overhangio/tutor · GitHub

I’ve attached the guide. Using BuildKit is just a small addition to the process, and it produces as many architectures as needed. You don’t need extra hardware for that, just extra settings in the same CI.
Regarding Kubernetes, I meant using a Helm chart that is aware of this limitation.

I’ll try to see if I can produce a MySQL arm64 Docker image on GitHub, because MySQL 5.7 is available as an arm64 deb file that we should be able to use. If it works and you would like me to transfer the repository to you, I’m OK with that.

Please do read the thread I mentioned above. I already attempted to cross-build arm64 images and it failed.

I’m all about sharing and open source. I’ve managed to cross-compile it and I’ll post my solution in the next couple of days. Sorry for the delay.

Thank you :slight_smile:

BTW, I wasn’t aware that you were talking about Open edX and not MySQL; I’ll get to it once I’m done with MySQL. I managed to cross-compile MySQL with deb files that I downloaded manually from Launchpad: the files were compiled in Launchpad, but I couldn’t find them in the Ubuntu repository for arm64.

I found a file that looks like the one you were trying to build with Docker, but it has many Jinja templates in it. What am I supposed to do with it? Is there a file that is ready for building?

Thanks!

EDIT: This is the ARM image implementation, based on the original file with some modifications. It won’t cross-compile, as I had to download the deb files for ARM manually:

@regis Like me or hate me, but your changes work: after several hours of building on my 2015 MacBook Pro (Intel Core i7 2.2 GHz), the image was finally built. I’m guessing it has something to do with the pyenv 2.2.2 that you mentioned in the post.
I want to repeat this process for several other architectures, but it will be a long time before I can show results.
In the meantime, I’m thinking about setting up GitHub Actions to try and reproduce it, to take the load off your personal PC or wherever you are building from.

This is my repo, with GitHub Actions running for several hours (72 hours is the maximum). It should build a Docker image and push it to ghcr; I can modify it to fit your favorite registry:

Sadly there’s no spoiler tag, but these are the results. The total time was 24407 seconds (approximately 6 hours and 47 minutes):

yaron@Yarons-MacBook-Pro openedx % docker buildx build --platform=linux/arm64 .
WARN[0000] No output specified for docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
[+] Building 14349.1s (34/62)
[+] Building 14349.2s (34/62)
[+] Building 24407.3s (62/62) FINISHED
 => [internal] load build definition from Dockerfile                                               0.0s
 => => transferring dockerfile: 10.63kB                                                            0.0s
 => [internal] load .dockerignore                                                                  0.0s
 => => transferring context: 2B                                                                    0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                    0.8s
 => [internal] load build context                                                                  0.1s
 => => transferring context: 13.37kB                                                               0.0s
 => [minimal 1/2] FROM docker.io/library/ubuntu:20.04@sha256:b5a61709a9a44284d88fb12e5c48db0409cf  4.7s
 => => resolve docker.io/library/ubuntu:20.04@sha256:b5a61709a9a44284d88fb12e5c48db0409cfad5b69d4  0.0s
 => => sha256:5f3d23ccb99f6c9462a15efcf35aef0863858073a06d56df671d0e791b26222a 27.17MB / 27.17MB   2.7s
 => => extracting sha256:5f3d23ccb99f6c9462a15efcf35aef0863858073a06d56df671d0e791b26222a          2.0s
 => [minimal 2/2] RUN apt update &&     apt install -y build-essential curl git language-pack-e  636.0s
 => [python 1/4] RUN apt update &&     apt install -y libssl-dev zlib1g-dev libbz2-dev     libr  370.2s
 => [locales 1/1] RUN cd /tmp     && curl -L -o openedx-i18n.tar.gz https://github.com/openedx/op  7.5s
 => [production  1/27] RUN apt update &&     apt install -y gettext gfortran graphviz graphviz-  284.1s
 => [code 1/8] RUN mkdir -p /openedx/edx-platform &&     git clone https://github.com/edx/edx-pl  42.8s
 => [dockerize 1/1] RUN curl -L -o /tmp/dockerize.tar.gz https://github.com/jwilder/dockerize/rel  6.2s
 => [code 2/8] WORKDIR /openedx/edx-platform                                                       0.2s
 => [code 3/8] RUN git config --global user.email "tutor@overhang.io"   && git config --global us  0.4s
 => [code 4/8] RUN git fetch --depth=2 https://github.com/regisb/edx-platform 51e0ec3b97ae5badbf9  4.8s
 => [code 5/8] RUN git fetch --depth=2 https://github.com/open-craft/edx-platform/ 03731f19459e5  31.2s
 => [code 6/8] RUN git fetch --depth=2 https://github.com/overhangio/edx-platform/ 78da3d86b79e80  8.0s
 => [code 7/8] RUN git fetch --depth=2 https://github.com/overhangio/edx-platform/ b63c01fb38a60f  5.9s
 => [code 8/8] RUN git fetch --depth=2 https://github.com/edx/edx-platform/ 85eb44445b8a6207b967b  7.6s
 => [production  2/27] RUN useradd --home-dir /openedx --create-home --shell /bin/bash --uid 1000  0.7s
 => [production  3/27] COPY --chown=app:app --from=dockerize /usr/local/bin/dockerize /usr/local/  0.1s
 => [production  4/27] COPY --chown=app:app --from=code /openedx/edx-platform /openedx/edx-platfo  5.3s
 => [production  5/27] COPY --chown=app:app --from=locales /openedx/locale /openedx/locale         0.5s
 => [python 2/4] RUN git clone https://github.com/pyenv/pyenv /opt/pyenv --branch v2.2.2 --depth   3.4s
 => [python 3/4] RUN /opt/pyenv/bin/pyenv install 3.8.12                                       10215.6s
 => [python 4/4] RUN /opt/pyenv/versions/3.8.12/bin/python -m venv /openedx/venv                  39.1s
 => [python-requirements  1/10] RUN apt update && apt install -y software-properties-common lib  251.5s
 => [nodejs-requirements 1/5] RUN pip install nodeenv==1.6.0                                      13.7s
 => [production  6/27] COPY --chown=app:app --from=python /opt/pyenv /opt/pyenv                    3.6s
 => [nodejs-requirements 2/5] RUN nodeenv /openedx/nodeenv --node=12.13.0 --prebuilt              36.5s
 => [nodejs-requirements 3/5] COPY --from=code /openedx/edx-platform/package.json /openedx/edx-pl  0.3s
 => [nodejs-requirements 4/5] WORKDIR /openedx/edx-platform                                        0.0s
 => [nodejs-requirements 5/5] RUN npm install --verbose --registry=https://registry.npmjs.org/  3354.4s
 => [python-requirements  2/10] COPY --from=code /openedx/edx-platform /openedx/edx-platform       7.4s
 => [python-requirements  3/10] WORKDIR /openedx/edx-platform                                      0.3s
 => [python-requirements  4/10] RUN pip install setuptools==44.1.0 pip==20.0.2 wheel==0.34.2      48.7s
 => [python-requirements  5/10] RUN pip install -r ./requirements/edx/base.txt                  6507.9s
 => [python-requirements  6/10] RUN pip install django-redis==4.12.1                              30.4s
 => [python-requirements  7/10] RUN pip install uwsgi==2.0.20                                    286.6s
 => [python-requirements  8/10] COPY ./requirements/ /openedx/requirements                         0.1s
 => [python-requirements  9/10] RUN cd /openedx/requirements/   && touch ./private.txt   && pip   23.8s
 => [python-requirements 10/10] RUN pip install 'openedx-scorm-xblock<14.0.0,>=13.0.0'            39.3s
 => [production  7/27] COPY --chown=app:app --from=python-requirements /openedx/venv /openedx/ven  8.5s
 => [production  8/27] COPY --chown=app:app --from=python-requirements /openedx/requirements /ope  0.5s
 => [production  9/27] COPY --chown=app:app --from=nodejs-requirements /openedx/nodeenv /openedx/  3.0s
 => [production 10/27] COPY --chown=app:app --from=nodejs-requirements /openedx/edx-platform/nod  34.2s
 => [production 11/27] WORKDIR /openedx/edx-platform                                               0.9s
 => [production 12/27] RUN pip install -r requirements/edx/local.in                              172.1s
 => [production 13/27] RUN mkdir -p /openedx/config ./lms/envs/tutor ./cms/envs/tutor              0.2s
 => [production 14/27] COPY --chown=app:app revisions.yml /openedx/config/                         0.0s
 => [production 15/27] COPY --chown=app:app settings/lms/*.py ./lms/envs/tutor/                    0.0s
 => [production 16/27] COPY --chown=app:app settings/cms/*.py ./cms/envs/tutor/                    0.0s
 => [production 17/27] RUN mkdir /openedx/locale/user                                              0.2s
 => [production 18/27] COPY --chown=app:app ./locale/ /openedx/locale/user/locale/                 0.0s
 => [production 19/27] RUN cd /openedx/locale/user &&     django-admin.py compilemessages -v1      2.1s
 => [production 20/27] RUN ./manage.py lms --settings=tutor.i18n compilejsi18n                   108.6s
 => [production 21/27] RUN ./manage.py cms --settings=tutor.i18n compilejsi18n                    99.8s
 => [production 22/27] COPY --chown=app:app ./bin /openedx/bin                                     0.1s
 => [production 23/27] RUN chmod a+x /openedx/bin/*                                                0.3s
 => [production 24/27] RUN openedx-assets xmodule     && openedx-assets npm     && openedx-ass  3975.1s
 => [production 25/27] COPY --chown=app:app ./themes/ /openedx/themes/                             0.1s
 => [production 26/27] RUN openedx-assets themes     && openedx-assets collect --settings=tuto  1528.2s
 => [production 27/27] RUN mkdir /openedx/data                                                     0.7s
yaron@Yarons-MacBook-Pro openedx %
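
For anyone who wants to see where the time goes in a log like the one above, here is a rough Python sketch that extracts the per-step durations and ranks them. The sample lines are copied from the log; the regular expression and the function name `slowest_steps` are my own and only handle the `=> [stage n/m] ... 123.4s` lines, not the `[internal]` or sub-step lines. Note that stages run in parallel under BuildKit, so step durations do not add up to the wall-clock total.

```python
import re

# A handful of step lines copied verbatim from the build log above.
LOG = """\
 => [python 3/4] RUN /opt/pyenv/bin/pyenv install 3.8.12                                       10215.6s
 => [python-requirements  5/10] RUN pip install -r ./requirements/edx/base.txt                  6507.9s
 => [production 24/27] RUN openedx-assets xmodule     && openedx-assets npm     && openedx-ass  3975.1s
 => [nodejs-requirements 5/5] RUN npm install --verbose --registry=https://registry.npmjs.org/  3354.4s
 => [production 26/27] RUN openedx-assets themes     && openedx-assets collect --settings=tuto  1528.2s
 => [code 2/8] WORKDIR /openedx/edx-platform                                                       0.2s
"""

# Matches "=> [stage n/m] <command> <seconds>s" (assumed layout, based on the log above).
STEP = re.compile(
    r"^\s*=> \[(?P<stage>[\w-]+)\s+\d+/\d+\]\s+(?P<cmd>.*?)\s+(?P<secs>\d+(?:\.\d+)?)s\s*$"
)


def slowest_steps(log, top=3):
    """Return up to `top` (seconds, stage, command) tuples, slowest first."""
    steps = []
    for line in log.splitlines():
        match = STEP.match(line)
        if match:
            steps.append((float(match["secs"]), match["stage"], match["cmd"]))
    return sorted(steps, reverse=True)[:top]


if __name__ == "__main__":
    for secs, stage, cmd in slowest_steps(LOG):
        print(f"{secs / 60:7.1f} min  [{stage}]  {cmd[:50]}")
```

On this log the Python compilation by pyenv comes out on top, followed by the installation of the base Python requirements.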

EDIT: I can try and make some conditionals as explained here (TARGETPLATFORM and BUILDPLATFORM) in order to create a multiarch package:

Great news (sorry, but I can’t edit after 24 hours): I have a working GitHub Actions pipeline that builds the images and pushes them to ghcr.io.

This is the resulting Docker image:

This is the working pipeline that created this:

So we now have 2 architectures supported. I tried adding some others, but the Python compilation is a bit too much for this pipeline.
I can try working on MySQL a bit further to provide arm64 support as promised, but it’s very hacky and there’s a lot to remove before it’s ready for production use.

Hey @yaron, thanks for your investigation, and sorry about the delay in replying. I need to work on a couple of other things first, but this will be the next big item on my tutor-TODO list. I will need to figure out why buildx succeeds on GitHub Actions but not on my laptop.

@regis is there any way I can help? Can we have an online session to try and figure that out?

I think that we’re also missing mongo3 on ARM. If that’s the case, I’ll start working on it. As far as I can tell, ARM Docker images are mostly needed for Kubernetes and Mac M1.

What makes you say that @yaron? Did you face an issue running MongoDB on arm64?

Correct me if I’m wrong, but Open edX uses Mongo 3. Built-in arm64 support, for both Docker images and binaries, only started with Mongo 4; that’s not the case for Mongo 3. I hope you know something I don’t about this: although it’s a relatively easy migration, I don’t want to create something that already exists.

Thank you.

In Maple, Open edX uses MongoDB 4.2, so there should not be any problem on this side, right?

I see that the docker.io/mongo images are available for arm64: Docker Hub

Great, I’m always prepared for the worst and get excited when my theories fail, thank you :slight_smile:

OK I have made some progress on this item. The good news is that I managed to cross-build the “openedx” image with buildx. What I was missing was a proper installation of binfmt :sweat_smile: typically a case of rtfm

The bad news is that cross-platform compilation takes a very long time. And this is not just on my computer: the GitHub action built by @yaron (thanks again!) takes 3 hours to complete the build. This means that cross-platform building should not be part of our continuous integration (CI) workflow, but only of continuous deployment (CD). Also, the build should be run asynchronously, so that the regular Linux (amd64) build is not blocked by the arm64 build.
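
To make the "not blocked" part concrete, here is a minimal Python sketch, not how Tutor's CI actually works: each platform gets its own `docker buildx build` invocation running in its own thread, and results are reported as soon as each one finishes. The function names, the per-architecture tag suffix and the `run` parameter are assumptions for illustration; only `--platform`, `--tag` and `--push` are real buildx flags.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def build_command(platform, tag):
    """Build the buildx command line for one platform.

    The "<tag>-<arch>" naming scheme is hypothetical.
    """
    arch = platform.split("/")[-1]
    return [
        "docker", "buildx", "build",
        f"--platform={platform}",
        "--tag", f"{tag}-{arch}",
        "--push",
        ".",
    ]


def build_all(platforms, tag, run=subprocess.run):
    """Run one build per platform concurrently.

    Yields (platform, result) pairs in completion order, so a fast amd64
    build is reported without waiting for a slow emulated arm64 build.
    `run` is injectable so the sketch can be exercised without Docker;
    with subprocess.run, a failed build does not raise (check returncode).
    """
    with ThreadPoolExecutor(max_workers=len(platforms)) as pool:
        futures = {
            pool.submit(run, build_command(platform, tag)): platform
            for platform in platforms
        }
        for future in as_completed(futures):
            yield futures[future], future.result()


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for platform, _ in build_all(["linux/amd64", "linux/arm64"], "example/openedx:dev", run=print):
        print("finished", platform)
```

In a real CD setup the same effect is more likely achieved with two independent CI jobs; the point is only that neither build waits on the other.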

This is mostly a brain dump at this point. We are still actively working to find the best way to support Mac M1 users.

LOL, glad you fixed the binfmt using the good old rtfm method :slight_smile:

Well, what I was thinking is allowing cross-platform builds as part of Tutor, but building for the native platform by default (so if you have an M1, it will successfully build for you with the respective parameters automatically). This way there’s no reason for me to build for an architecture I can’t use, but I can still do it if I have my own repository and want to be able to push to it (ECR and an internal CI/CD process, for example).
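
Roughly what I have in mind, as a Python sketch (this is not existing Tutor code; the function names and the mapping table are my own): detect the host architecture with `platform.machine()` and translate it to a Docker platform string, unless the user explicitly asks for other platforms.

```python
import platform

# Host architecture names as reported by platform.machine(), mapped to
# Docker platform strings. Only the two architectures discussed in this
# thread are covered.
_ARCH_TO_DOCKER = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "aarch64": "linux/arm64",  # Linux on ARM
    "arm64": "linux/arm64",    # macOS on M1
}


def default_platform(machine=None):
    """Return the Docker platform matching the host (or the given machine name)."""
    machine = (machine or platform.machine()).lower()
    try:
        return _ARCH_TO_DOCKER[machine]
    except KeyError:
        raise ValueError(f"unsupported architecture: {machine}") from None


def build_platforms(requested=None):
    """Use the explicitly requested platforms if any, else the native one."""
    return list(requested) if requested else [default_platform()]


if __name__ == "__main__":
    print("docker buildx build --platform=" + ",".join(build_platforms()) + " .")
```

With this, someone on an M1 gets `linux/arm64` without passing anything, while a CI/CD server can still request both platforms explicitly.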

Keeping the CD option (so that the CI build server builds for both architectures and pushes to the Docker registry) is pretty convenient: I can simply pull the tag and Docker will know exactly which image to pull, so I won’t have to build it myself if I don’t want to.

It looks like there’s a way to build simultaneously on 2 different nodes (one for each platform) using a dedicated BuildKit configuration:

BTW, are you using GitHub Actions? Do you want me to try running those separately?

FYI @yaron, the matter of getting Open edX to run on arm64 is getting a lot of traction at 2U, as they are keen to migrate their developers to Tutor. The matter is discussed here: Ensure Tutor works out-of-the-box on ARM64 · Issue #35 · overhangio/2u-tutor-adoption · GitHub
tl;dr: in the near future, it is very likely that I will migrate my CI to a multi-platform build cluster which will build multi-platform images for all plugins.

Well, apparently there are numerous issues and it doesn’t run out of the box, but I finally have a working Tutor on arm64.

The issues I ran into are the ones I’ve opened on GitHub here and here.

After fixing dockerize and the permissions container, it loads perfectly. Since it’s not on my budget, I had to turn this machine off, but we will keep using this machine and these insights for documentation and bug fixing as much as needed, until we have a version that works out of the box.

Another thing worth mentioning: the minimum instance type for Tutor on AWS is c6g.large (2 vCPUs and 4 GB of RAM), and it’s better to configure some swap space. Otherwise every memory spike causes the machine to hang for a couple of minutes; I restarted several times until I figured that out.
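
If it helps anyone, here is a small Python sketch of the kind of check that would have saved me those restarts: it reads the `MemTotal` and `SwapTotal` fields from `/proc/meminfo` (Linux only) and warns when no swap is configured. The function names and the sample values are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    For MemTotal and SwapTotal the value is expressed in kB.
    """
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_warning(text):
    """Return a warning string if no swap is configured, else None."""
    info = parse_meminfo(text)
    if info.get("SwapTotal", 0) == 0:
        mem_gb = info.get("MemTotal", 0) / (1024 * 1024)
        return f"no swap configured ({mem_gb:.1f} GB of RAM): memory spikes may hang the machine"
    return None


if __name__ == "__main__":
    # Hypothetical sample resembling a 4 GB instance without swap.
    sample = "MemTotal:        3928000 kB\nSwapTotal:             0 kB\n"
    print(swap_warning(sample))
    # On a real Linux host:
    # with open("/proc/meminfo") as f:
    #     print(swap_warning(f.read()))
```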

BTW, are you sure about managing permissions with a separate container? It looks like overkill. Managing the permissions internally, in Python inside Tutor, may have more potential: if the permissions container fails, everything fails, and we don’t even know why, because we assume that the permissions are correct without Tutor knowing what is actually going on.

Thanks.