We have been using Tutor/Open edX for about 3 months now and so far so good. We had some initial challenges setting it up with EKS on AWS using Terraform and S3, especially signed URLs, but we managed to iron out the issues we had. We were hoping to look at blue/green deployments. We are using a Continuous Delivery pipeline which deploys changes pretty quickly (~1 to 2 minutes), but as expected it does take the pods down and incurs downtime (somewhere between 3 and 5 minutes). Has anybody gotten blue/green working? Or tried it?
We have a few different ideas on how we might get this working, but I am really interested in whether anybody has any experience (and a working setup).
Can you describe more precisely which steps cause downtime?
It has been things like adding an XBlock, making a change to the LMS and CMS files, and any theme changes.
I suspect these will become less frequent as we stop finding out about new bits to turn on and use.
No, I meant: which steps in your CI cause downtime? Do you run tutor k8s quickstart or anything like that?
Those steps should not be causing downtime, as it’s just a matter of adding/removing running containers. As far as I understand, the only things that should cause downtime are backward-incompatible database migrations.
Sorry - the bit that we do that causes a restart is the k8s reboot:
- if kubectl get namespace openedx; then tutor k8s reboot; else tutor k8s quickstart -I; fi
Should we be looking at doing it differently?
I went back to look at the documentation
Could we run tutor k8s start if it's running kubectl apply under the hood? I don't know why I thought we would have to run tutor k8s stop and then start. I thought it was imperative, and that this would take longer.
Yes, absolutely. The quickstart command was designed for basic deployments, but it should not be used for more advanced scenarios, such as zero-downtime deployments.
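For anyone landing on this thread later, here is a minimal sketch of how the CI step quoted earlier might look with tutor k8s start swapped in for tutor k8s reboot. It assumes the namespace is called openedx, as in the original step, and that tutor k8s start applies the manifests in place, as discussed above. The helper names (choose_deploy_cmd, namespace_exists) and the RUN_DEPLOY switch are made up for illustration and are not part of Tutor. Treat it as a starting point, not a tested zero-downtime recipe.

```shell
#!/bin/sh
# Sketch of a CI deploy step that avoids `tutor k8s reboot`.
# Assumption: the namespace is "openedx", as in the step quoted in this thread.
NAMESPACE="${NAMESPACE:-openedx}"

# Print the tutor command to run.
# $1 is "yes" when the namespace already exists, anything else otherwise.
# (Hypothetical helper, not a Tutor feature.)
choose_deploy_cmd() {
    if [ "$1" = "yes" ]; then
        # Existing deployment: update in place instead of stopping and starting.
        echo "tutor k8s start"
    else
        # First deployment: bootstrap the platform non-interactively.
        echo "tutor k8s quickstart -I"
    fi
}

# Print "yes" if the namespace exists in the cluster, "no" otherwise.
namespace_exists() {
    if kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}

# Only touch the cluster when RUN_DEPLOY=1 (hypothetical switch), so the
# functions above can be sourced and checked without kubectl or tutor.
if [ "${RUN_DEPLOY:-0}" = "1" ]; then
    cmd=$(choose_deploy_cmd "$(namespace_exists)")
    echo "Running: $cmd"
    $cmd
fi
```

In the pipeline this would be invoked with RUN_DEPLOY=1 set. Keep in mind the point made earlier in the thread: backward-incompatible database migrations can still cause downtime regardless of which command is used.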