TSYS Group Web Application Runtime Layer Q2 2021 Project Plan

Introduction

The TSYS Group needs a web application runtime layer for its myriad applications.

Broad Requirements for runtime layer

  • No single point of failure
  • High availability/auto recovery for containers
  • Distributed/replicated persistent storage for containers

Delivery schedule and compensation

  • Maximum equity offered: 2.5% upon completion of all milestones by deadline
  • Targeted completion of this project is July 4th, 2021
  • All equity will be fully vested at grant time.
  • The only consequence of non-completion is that no equity will be granted (but you would keep any equity already granted for completed milestones).
  • Contractor is expected to work independently. TSYS Technical Operations Team is available in Discord for any requirements/access granting/architecture questions.
  • TSYS Technical Operations Team is not available to assist with implementation, hence the equity offer to an outside contractor.

Project milestones / deliverables / major areas

Storage

0.5% equity

Replicated storage that fulfills the persistent volume claims of containers.

Deployed on db1/2/3 virtual machines.

Using something such as Longhorn, but we are open to anything that is production stable.
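As a sketch of what this milestone could look like with Longhorn (the chart repo and namespace follow the Longhorn docs; the claim name and size below are illustrative assumptions, not requirements):

```shell
# Install Longhorn onto the cluster via its Helm chart.
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace

# A persistent volume claim backed by the replicated "longhorn" storage class:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data        # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 5Gi
EOF
```

A claim like this surviving the loss of any one of db1/2/3 is essentially the acceptance test for the milestone.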

Container runtime, control plane, control panel

0.5% equity

  • Kubernetes load balancer: something such as MetalLB, but open to other options. Only TCP load balancing is needed; all intelligence (certs/layer 7, etc.) is handled at the routing/network layer already.
  • Kubernetes runtime environment (workers and control plane): something like k3s from Rancher Labs.
  • Kubernetes control panel authenticating to LDAP: something like Rancher.

Control plane will be deployed on db1/2/3

Workers will be deployed on www2/3 and www1 (www1 is currently the production server, so it would be added last).
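If MetalLB is chosen, its layer-2 mode (in the 0.9.x releases current at this writing) is driven by a ConfigMap. A minimal sketch, assuming an address range carved out of the www subnet seen later in this document:

```
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.251.50.200-10.251.50.220
```

Note that k3s ships its own service load balancer (Klipper); it would need to be disabled (k3s `--disable servicelb`) for MetalLB to own LoadBalancer services.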

Core container functionality (running as containers on the platform):

0.5% equity

  • Docker registry
  • IAM
  • API gateway
  • Jenkins
  • All of the above installed as containers running on the Kubernetes runtime.
  • All of the above configured for LDAP authentication.
  • No other configuration of the above components is in scope.

PaaS

1% equity

  • Blue/green and other standard deployment methodologies
  • Able to auto-deploy from CI/CD
  • Orchestrate all of the primitives (load balancer, port assignment, etc.) (docker-compose target? Helm chart? Is Rancher suitable?)

This milestone is the most complex and will require discussion and further clarification. We can do so when we reach this point, based on how far along the contractor is and the time remaining.
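For orientation only, the simplest blue/green primitive on Kubernetes is two deployments behind one Service, with cutover as a selector patch (the names here are illustrative, not part of the spec):

```shell
# Two deployments carry track=blue / track=green labels; the Service
# selector decides which one receives traffic. Cutover to green:
kubectl patch service myapp \
  -p '{"spec":{"selector":{"app":"myapp","track":"green"}}}'
# Rollback is the same patch with "track":"blue".
```

Whatever tooling is chosen would need to automate this switch plus the surrounding primitives (VIP, ports, health checks).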

Things not in scope

LDAP backend

Known Element Enterprises LLC utilizes Univention Corporate Server to provide Active Directory compatible services to the TSYS Group. This is up and running in production and all applications and systems utilize it for AAA.

You will have access to the UCS control plane and are expected to create and document any service accounts, groups, etc. needed for the services you deploy.

Data backend (RDS)

Known Element Enterprises LLC utilizes its own proprietary database-as-a-service solution to provide an HA cluster of:

  • MySQL
  • Redis
  • memcached
  • PostgreSQL
  • etcd
  • MQTT
  • MongoDB
  • Elasticsearch

If the above isn't sufficient (we don't have ZooKeeper, for example), you would work with the Technical Operations Platform Team to deploy whatever additional clustered data store may be required.

You’ll be granted access to the database as a service systems and be expected to create and document any databases you need along with any needed accounts.

Applications running on the platform

TSYS Group Technical Operations team will deploy all applications onto the platform.

You are responsible for providing a demo of the whiteboard application showing storage and node redundancy.

Secrets store

Known Element Enterprises LLC utilizes bitwarden/envwarden for all secret storage. It provides a REST API, and we have existing wrapper code to populate environment variables as needed. You may (if needed) deploy a secrets store for the deliverables (such as Ansible Vault, HashiCorp Vault, etc.) if bitwarden/envwarden isn't sufficient.
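For reference, the existing wrapper approach presumably amounts to something like the following Bitwarden CLI usage (the item name is an illustrative assumption):

```shell
# Unlock the vault once and reuse the session token.
export BW_SESSION="$(bw unlock --raw)"

# Populate environment variables from named vault items at deploy time.
export REGISTRY_PASSWORD="$(bw get password tsys-registry)"
```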

General notes

  • It is up to the contractor how to do infrastructure as code for the deliverables. Ansible might have the best coverage; Terraform is a solid contender.
  • All work must be put into Gitea repositories and mirrored to GitHub. You can use the mirror script found at: https://github.com/ReachableCEO/notes-public/blob/master/code/utils/gitMirror.sh with aliases (modify as desired, of course):

    ```
    lpom='git add -A :/ ; git commit -va'
    gpom='git push all master'
    tesla='lpom;gpom'
    ```

  • All work may, at the contractor's discretion, be live screencast, recorded, blogged about, put on GitHub, talked about in any format, etc. We actively support and encourage it! Also feel free to build out in parallel on any other cloud provider.
  • All IP must be licensed AGPL v3, with copyright dual-assigned to both the contractor and Known Element Enterprises LLC.
  • On day 1, the contractor will have privileged access to:

    • opnsense
    • UCS
    • www 1/2/3
    • db 1/2/3

    so the contractor can be completely self-sufficient.

A suggested prescriptive technical stack / Work done so far

We followed some of this howto: https://rene.jochum.dev/rancher-k3s-with-galera/

Enough to get k3s control plane and workers deployed:

root@db1:/var/log/maxscale# kubectl get nodes -o wide
NAME   STATUS   ROLES                  AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
db2    Ready    control-plane,master   30d   v1.20.4+k3s1   10.251.51.2   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   containerd://1.4.3-k3s3
db3    Ready    control-plane,master   30d   v1.20.4+k3s1   10.251.51.3   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   containerd://1.4.3-k3s3
db1    Ready    control-plane,master   30d   v1.20.4+k3s1   10.251.51.1   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   containerd://1.4.3-k3s3
www1   Ready    <none>                 30d   v1.20.4+k3s1   10.251.50.1   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   containerd://1.4.3-k3s3
www2   Ready    <none>                 30d   v1.20.4+k3s1   10.251.50.2   <none>        Ubuntu 20.04.2 LTS   5.4.0-70-generic   containerd://1.4.3-k3s3
root@db1:/var/log/maxscale# 
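For anyone reproducing the node layout above, the k3s quick-start commands are roughly as follows. The `--disable servicelb` flag is an assumption, needed only if MetalLB is adopted; `<node-token>` comes from /var/lib/rancher/k3s/server/node-token on the first server, and the multi-master arrangement in the howto above additionally requires an external datastore (Galera):

```shell
# On the first server (db1):
curl -sfL https://get.k3s.io | sh -s - server --disable servicelb

# On each worker (www1/2/3), join against the first server:
curl -sfL https://get.k3s.io | K3S_URL=https://10.251.51.1:6443 K3S_TOKEN=<node-token> sh -
```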

and a bit of load balancing setup going:

fenixpi% kubectl get pods -A -o wide
NAMESPACE        NAME                                        READY   STATUS             RESTARTS   AGE   IP            NODE   NOMINATED NODE   READINESS GATES
metallb-system   speaker-7nsvs                               1/1     Running            10         30d   10.251.51.2   db2    <none>           <none>
kube-system      metrics-server-86cbb8457f-64ckz             1/1     Running            18         16d   10.42.2.23    db1    <none>           <none>
kube-system      local-path-provisioner-5ff76fc89d-kcg7k     1/1     Running            34         16d   10.42.2.22    db1    <none>           <none>
metallb-system   controller-fb659dc8-m2tlk                   1/1     Running            12         30d   10.42.0.42    db3    <none>           <none>
metallb-system   speaker-vfh2p                               1/1     Running            17         30d   10.251.51.3   db3    <none>           <none>
kube-system      coredns-854c77959c-59kpz                    1/1     Running            13         30d   10.42.0.41    db3    <none>           <none>
kube-system      ingress-nginx-controller-7fc74cf778-qxdpr   1/1     Running            15         30d   10.42.0.40    db3    <none>           <none>
metallb-system   speaker-7bzlw                               1/1     Running            3          30d   10.251.50.2   www2   <none>           <none>
metallb-system   speaker-hdwkm                               0/1     CrashLoopBackOff   4633       30d   10.251.51.1   db1    <none>           <none>
metallb-system   speaker-nhzf6                               0/1     CrashLoopBackOff   1458       30d   10.251.50.1   www1   <none>           <none>

Beyond that, it's greenfield.

Phase 1 (core capabilities)

  • Storage for persistent volume claims: https://longhorn.io/
  • Control plane / workers for k8s: https://k3s.io/
  • Control panel for k8s: https://rancher.com/products/rancher/
  • Network load balancer: https://metallb.universe.tf/

Phase 2 (container and application support infrastructure)

Phase 3 (application deployment support infrastructure)

Need to research this more.

Some kind of PaaS that would orchestrate storage, HA, and network IP/port assignment.

Known Element Enterprises LLC already has haproxy and Let's Encrypt set up and in production use. All DNS is wildcarded to the HAProxy IP, so any service can be spun up just by provisioning a cert and a VIP/ACL.
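Concretely, spinning up a new service at the routing layer is then a haproxy config fragment of roughly this shape (the hostname, backend name, and VIP are illustrative assumptions):

```
frontend https-in
    acl host_app hdr(host) -i app.example.com
    use_backend app_backend if host_app

backend app_backend
    server app1 10.251.50.200:80 check
```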

Some possibilities: