Installing ERPNext on Kubernetes has advantages in scalability, stability, and elasticity. However, installing on a VM can also be beneficial:
- easier debugging with “ad-hoc” techniques
- potential use of AWS Arm Graviton2 instances (too many issues, Frappe will not support it)
- remote debugging using Code Server or Gitpod
Launch Lightsail or EC2 Instance
You can use EC2 (t3a.medium) or Lightsail with 4 GB (bundle medium_2_0). Using Pulumi is recommended. With 3.7 GiB of RAM, ERPNext uses 44% once running, though it may have used more during initial installation. A running production ERPNext with nginx, MariaDB uninstalled, and the system redis-server disabled (using only the supervisor-managed Redis) uses 623 MiB, so it may be possible to run it on a t3a.micro.
Note about arm64 / Graviton2: Hendy tried to use Arm (m6g.medium 4 GiB or t4g) but ran into several issues. Notably, v13.10.0 breaks due to pypika.
For EBS, 8 GiB is too small and won’t even complete the initial bench installation; allocate at least 12 GiB. After installation, usage of the / partition is 8.7 GiB.
For security groups, select “default” and “ssh-server”. No need to expose HTTP/HTTPS ports because we’ll use an ALB.
Prepare Ubuntu System Configuration
Set hostname:
sudo hostnamectl set-hostname erp-staging-sg02.soluvas.com
Enable byobu and make sure LC_ALL is set by the server even when connecting remotely:
#byobu-enable
sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
Then re-login using ssh.
Update Ubuntu packages:
sudo apt update && sudo apt -y full-upgrade
To enable VSCode to watch files, please apply the following system tweak to the server.
Troubleshooting: UnhandledPromiseRejectionWarning: Error: ENOSPC: System limit for number of file watchers reached
Solution:
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
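The one-liner above appends a new line every time it is run. A minimal sketch of an idempotent variant follows; the helper name ensure_sysctl_line is made up for this example, and it is demonstrated on a scratch file rather than on /etc/sysctl.conf.

```shell
#!/bin/sh
# Hypothetical helper: append "KEY=VALUE" to a sysctl-style file
# unless KEY is already set there.
# Usage: ensure_sysctl_line FILE KEY VALUE
ensure_sysctl_line() {
    file=$1 key=$2 value=$3
    # Escape dots so the key is matched literally by grep.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$file" 2>/dev/null; then
        echo "$key already present in $file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Demonstration on a scratch file: the second call does not add a duplicate.
demo=$(mktemp)
ensure_sysctl_line "$demo" fs.inotify.max_user_watches 524288
ensure_sysctl_line "$demo" fs.inotify.max_user_watches 524288
cat "$demo"
rm -f "$demo"
```

On the server, the target file would be /etc/sysctl.conf (written as root), followed by sudo sysctl -p as in the original command.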
Install ERPNext version-13 and Create a Site Manually using Bench in Production Mode
References:
- Frappe Bench README – Manual Installation
- Frappe Bench – Installation
- The Hitchhiker’s Guide to Installing Frappe on Linux
Note: This will install MariaDB locally, which is OK. Later you can dump & restore the data from local MariaDB to AWS MariaDB, then purge the local MariaDB installation.
- [If you install with Pulumi and/or use Lightsail User Data to do initial update] Wait until apt-get -y full-upgrade completes:
sudo ps -ef | grep apt
Make sure no apt* process is running.
- Reboot the server: sudo reboot
- Install Prerequisites.
sudo apt-get -y install git python3-setuptools python3-pip python3-virtualenv
- Install MariaDB 10.5 locally. Make sure to store the root password in a safe place.
sudo apt-get -y install software-properties-common
sudo apt-key adv --fetch-keys 'https://mariadb.org/mariadb_release_signing_key.asc'
sudo add-apt-repository 'deb [arch=amd64] http://mariadb.mirror.globo.tech/repo/10.5/ubuntu focal main'
sudo apt-get update
sudo apt-get -y install mariadb-server mariadb-client
sudo mysql_secure_installation
Check status
sudo systemctl status mariadb
Install libmysqlclient-dev
sudo apt-get -y install libmysqlclient-dev
Edit /etc/mysql/my.cnf:
sudo nano /etc/mysql/my.cnf
And append this to the end of file
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
[mysql]
default-character-set = utf8mb4
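If you would rather not edit the file by hand, the same settings can be written by a script. This is only a sketch: write_utf8mb4_conf is a made-up helper name, and the demonstration writes to a scratch file instead of the real configuration.

```shell
#!/bin/sh
# Hypothetical helper: write the utf8mb4 settings from this guide to FILE.
write_utf8mb4_conf() {
    cat > "$1" <<'EOF'
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[mysql]
default-character-set = utf8mb4
EOF
}

# Demonstration on a scratch file.
demo=$(mktemp)
write_utf8mb4_conf "$demo"
cat "$demo"
rm -f "$demo"
```

On a real server you would point it at a file that MariaDB actually reads (a drop-in under /etc/mysql/mariadb.conf.d/ is one possibility, but treat that path as an assumption and check your own my.cnf includes), then restart MariaDB with sudo systemctl restart mariadb so the settings take effect.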
Install Redis:
sudo apt-get -y install redis-server
Install NodeJS 14:
sudo apt-get -y install curl
curl -sL https://deb.nodesource.com/setup_14.x | sudo -E bash -
sudo apt-get -y install nodejs
Install yarn Classic:
sudo npm install -g yarn
Install wkhtmltopdf (check the latest version on their website):
wget https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.focal_amd64.deb
sudo apt install -y ./wkhtmltox*.deb
rm ./wkhtmltox*.deb
Create frappe user:
sudo adduser --disabled-password frappe
sudo usermod -aG sudo frappe
# https://phpraxis.wordpress.com/2016/09/27/enable-sudo-without-password-in-ubuntudebian/
sudo visudo
# At the END of the file, add this line (it must be last, because later directives override earlier ones):
# frappe ALL=(ALL) NOPASSWD:ALL
# Switch to frappe user
sudo su - frappe
Add the lovia-production public key into ~/.ssh/authorized_keys.
Install frappe-bench (as root, because it will be needed by “sudo bench setup production”):
sudo -H pip3 install frappe-bench
Log out (Ctrl+D) then log back in as frappe.
Check bench version:
bench --version
Initialize bench:
cd
bench init --frappe-branch version-13 --python /usr/bin/python3 frappe-bench
Enable site based multi-tenancy:
cd frappe-bench
bench config dns_multitenant on
echo -n '' > ~/frappe-bench/sites/currentsite.txt
Get ERPNext app:
bench get-app --branch version-13 erpnext
Get custom apps:
bench get-app --branch version-13 lovia https://gitlab.com/lovia/lovia.git
Create site:
bench new-site erpdemo.tmra.io
# Install erpnext first!
bench --site erpdemo.tmra.io install-app erpnext
Setup Production (supervisor and nginx, Cloudflare SSL)
sudo apt-get -y install nginx supervisor
sudo rm -f /etc/nginx/sites-enabled/*
sudo bench setup production frappe
To use Cloudflare SSL, you need to enable nginx SSL with a snake-oil (self-signed) certificate.
bench setup nginx --yes
# Edit ~/frappe-bench/config/nginx.conf
# Add after listen 80:
listen 443 ssl http2;
listen [::]:443 ssl http2 ipv6only=on;
include snippets/snakeoil.conf;
ssl_ciphers EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:RSA+AES256:EECDH+3DES:RSA+3DES:!MD5;
# Then:
sudo apt-get -y install ssl-cert
sudo make-ssl-cert generate-default-snakeoil
# /etc/ssl/certs/ssl-cert-snakeoil.pem and /etc/ssl/private/ssl-cert-snakeoil.key.
sudo systemctl restart nginx
Check HTTP/2 support:
curl -k -I -L --header 'Host: erp.lovia.life' https://localhost/
# Should get: HTTP/2 200
Moving an ERPNext Site to Another Server (Backup & Restore Database and Files)
Reference:
- https://docs.erpnext.com/docs/v13/user/manual/en/setting-up/data/download-backup
- https://github.com/frappe/erpnext/wiki/Restoring-From-ERPNext-Backup
- https://discuss.erpnext.com/t/transfer-erpnext-from-one-server-to-another/8657
Backup first: https://docs.erpnext.com/docs/v13/user/manual/en/setting-up/data/download-backup
bench --site erp.lovia.life backup --with-files
# or:
bench backup-all-sites --with-files
You will get a report listing 4 files. (Note: if the file names look confusing, it’s because the timestamps use the site’s time zone, not the server’s time zone.)
Backup Summary for erp.lovia.life at 2021-10-07 02:10:25.957338
Config : ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-site_config_backup.json 661.0B
Database: ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-database.sql.gz 38.3MiB
Public : ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-files.tar 771.0MiB
Private : ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-private-files.tar 406.0MiB
Backup for Site erp.lovia.life has been successfully completed with files
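The file names in the summary follow a visible pattern: timestamp, then the site name with dots turned into underscores, then a fixed suffix. The sketch below reproduces that pattern so you can predict what to copy. The rule is inferred from the sample output above only, and backup_file_names is a made-up helper, so treat it as an assumption for other site names.

```shell
#!/bin/sh
# Hypothetical helper: print the four file names that
# `bench backup --with-files` appears to produce, given the site name
# and the timestamp prefix shown in the backup summary.
backup_file_names() {
    site=$1 stamp=$2
    # Pattern inferred from the sample: dots in the site name become underscores.
    slug=$(printf '%s' "$site" | tr '.' '_')
    for suffix in site_config_backup.json database.sql.gz files.tar private-files.tar; do
        printf '%s-%s-%s\n' "$stamp" "$slug" "$suffix"
    done
}

backup_file_names erp.lovia.life 20211007_020946
```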
IMPORTANT: You also need to back up site_config.json, which contains the encryption key! If you lose it:
- For each user that has an API key, regenerate the API key
- Go to Email Account List and re-enter all IMAP/SMTP passwords
- The ERPNext mobile app fails to log in with Error 417. -> Use the Frappe mobile app instead (source).
Later, you will need to transfer these 4 files to the new server. But first, in the target server, you’ll need to create a new site.
bench new-site erp.lovia.life
# Install erpnext first!
bench --site erp.lovia.life install-app erpnext
# Now make sure that you have already get-app the custom apps, e.g. "lovia"
Edit site_config.json and set the encryption key to match the old site’s.
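To see which value needs to be copied, you can print the key from the old site’s config. A minimal sketch, assuming the key is stored under "encryption_key" on a single line of site_config.json (to my knowledge that is the name Frappe uses, but verify in your own file); get_encryption_key is a made-up helper and the demo uses a fake config.

```shell
#!/bin/sh
# Hypothetical helper: print the value of "encryption_key" from a
# site_config.json file. Uses sed, so it assumes key and value share one line.
get_encryption_key() {
    sed -n 's/.*"encryption_key"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Demonstration with a fake config file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
{
 "db_name": "_example",
 "encryption_key": "EXAMPLE-KEY-NOT-REAL"
}
EOF
get_encryption_key "$demo"
rm -f "$demo"
```

On the old server this would be something like get_encryption_key ~/frappe-bench/sites/erp.lovia.life/site_config.json. Treat the output as a secret.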
Now you’ll need to transfer the actual backup files to the new server. On the new server, create an SSH keypair using ssh-keygen -o -a 100 -t ed25519. Then add the public key to the old server’s ~/.ssh/authorized_keys.
Then you can copy the files:
cd ~/frappe-bench/sites/erp.lovia.life/private/backups/
rsync -Pa frappe@OLD_SERVER:frappe-bench/sites/erp.lovia.life/private/backups/20211007* .
Then do the restore: (prepare your MariaDB root password)
bench --site [HOSTNAME] --force restore [path to database backup file] --with-private-files [relative-path-to-private-files-backup-file] --with-public-files [relative-path-to-public-files-backup-file]
# e.g.
# bench --site erp.lovia.life --force restore ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-database.sql.gz --with-private-files ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-private-files.tar --with-public-files ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life-files.tar
# You will be asked for root password.
# You should get:
# *** Scheduler is disabled ***
# Site erp.lovia.life has been restored with files
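Since the restore command is long and easy to mistype, here is a small pre-flight sketch. It checks that the three files for a given backup prefix exist and then prints (does not run) the restore command. print_restore_command is a made-up helper, not part of bench.

```shell
#!/bin/sh
# Hypothetical helper: verify the database, private-files and public-files
# backups exist for PREFIX, then print the matching restore command.
# Usage: print_restore_command SITE PREFIX
print_restore_command() {
    site=$1 prefix=$2
    for f in "$prefix-database.sql.gz" "$prefix-private-files.tar" "$prefix-files.tar"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
    done
    echo "bench --site $site --force restore $prefix-database.sql.gz" \
         "--with-private-files $prefix-private-files.tar" \
         "--with-public-files $prefix-files.tar"
}

# Example, run from ~/frappe-bench/sites:
#   print_restore_command erp.lovia.life \
#       ./erp.lovia.life/private/backups/20211007_020946-erp_lovia_life
```

Review the printed command, then run it yourself; you will still be asked for the MariaDB root password.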
After the restore has completed successfully, you can remove the public key from the old server’s ~/.ssh/authorized_keys.
Development Configuration
The problem with AWS NLB/ALB health checks is that you can’t set the Host header; it’s always the private IP (e.g. 172.30.1.228). (ALB documentation)
Solution for development: A better way is to mimic the production configuration by installing nginx and using bench setup nginx (see below). Now you can use bench start --no-dev, and the frappe.socketio client will use the standard HTTPS port instead of 9000.
bench setup nginx --yes
sudo ln -sv /home/frappe/frappe-bench/config/nginx.conf /etc/nginx/conf.d/frappe-bench.conf
sudo systemctl reload nginx
Workaround for development setup:
- Install nginx-light: sudo apt install nginx-light
- Set the health check to port 80
Production Configuration
Reference: https://frappeframework.com/docs/user/en/bench/guides/setup-production#nginx
Production configuration uses a combination of supervisor and nginx (nginx reverse-proxies the /socket.io endpoint as well):
frappe@erp-production-sg02:~/frappe-bench$ ls -l /etc/nginx/conf.d/
total 0
lrwxrwxrwx 1 root root 43 Oct 4 09:39 frappe-bench.conf -> /home/frappe/frappe-bench/config/nginx.conf
Enable DNS/Host based multitenancy.
bench config dns_multitenant on
With Host-based multitenancy, you need to “restore” the sites-enabled nginx configuration style that was replaced by the Bench Easy Install script. Add to the appropriate block in /etc/nginx/nginx.conf:
http {
...
include /etc/nginx/sites-enabled/*;
}
Then enable the default site:
sudo ln -sv /etc/nginx/sites-available/default /etc/nginx/sites-enabled/
Without the above, you’ll fail ALB health checks because all requests will be forwarded to frappe-web, in addition to spamming logs/frappe.web.logs with 404 errors.
To regenerate the nginx configuration in ~/frappe-bench/config/nginx.conf (which is symlinked from /etc/nginx/conf.d/frappe-bench.conf):
bench setup nginx --yes
sudo systemctl reload nginx
Troubleshooting: BunnyCDN does not support WebSockets
As of October 2020, BunnyCDN does not support WebSockets. That makes it unusable for ERPNext in default configuration.
We can use AWS CloudFront or Cloudflare instead.
Troubleshooting 1: LC_ALL
ubuntu@ip-172-30-0-218:~$ sudo python3 install.py --production
Logs are saved under /tmp/logs/easy-install__2020-10-03__06-43.log
curl already installed!
wget already installed!
git already installed!
Installing pip3...
pip3 installed!
ERROR: launchpadlib 1.10.13 requires testresources, which is not installed.
pip3 already installed!
Checking System Compatibility...
ubuntu 20 is compatible!
Bench's CLI needs these to be defined!
Run the following commands in shell:
export LC_ALL=C.UTF-8
The “Run the following commands in shell: export LC_ALL=C.UTF-8” warning is caused by the SSH client sending a (conflicting) LC_ALL environment variable to the remote server. What you can do:
sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
Then re-login SSH.
For the launchpadlib error, you can try reinstalling manually by:
sudo -H python3 -m pip install --upgrade launchpadlib
That is actually a bug since Easy Install should have stopped on that error, instead of continuing.
Troubleshooting 4: SocketIO Connects to Port 9000
Problem: In a development environment, the Desk UI tries to connect to https://erp-staging.lovia.life:9000/socket.io/?EIO=3&transport=polling&t=NJoyMs0 instead of the regular HTTPS port.
You can check the config with: bench --site erp-staging.lovia.life show-config
In a properly working Kubernetes production environment, common_site_config.json is:
{
  "db_host": "*.ap-southeast-1.rds.amazonaws.com",
  "db_port": 3306,
  "maintenance_mode": 0,
  "pause_scheduler": 0,
  "redis_cache": "redis://frappe-bench-0001-erpnext-redis-cache:13000",
  "redis_queue": "redis://frappe-bench-0001-erpnext-redis-queue:12000",
  "redis_socketio": "redis://frappe-bench-0001-erpnext-redis-socketio:11000",
  "socketio_port": 9000
}
This happens in the JavaScript client code: desk.js -> socketio_client.js -> frappe.socketio.init() -> frappe.socketio.get_host()
In both dev and prod (Kubernetes), the port argument in this socketio_client.js code defaults to 3000.
However, the implementation of “frappe.socketio.get_host(port)” changes the actual host when window.dev_server==true and frappe.boot.socketio_port==9000:
get_host: function(port = 3000) {
    var host = window.location.origin;
    if(window.dev_server) {
        var parts = host.split(":");
        port = frappe.boot.socketio_port || port.toString() || '3000';
        if(parts.length > 2) {
            host = parts[0] + ":" + parts[1];
        }
        host = host + ":" + port;
    }
    return host;
},
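To make the effect of that function concrete, here is a shell rendition of the same logic. It is an illustration only, not Frappe code: socketio_host is a made-up name, the dev_server flag is passed as 0 or 1, and the third argument stands in for frappe.boot.socketio_port.

```shell
#!/bin/sh
# Illustrative shell rendition of frappe.socketio.get_host().
# Usage: socketio_host ORIGIN DEV_SERVER(0|1) [SOCKETIO_PORT]
socketio_host() {
    origin=$1 dev_server=$2 port=${3:-3000}
    host=$origin
    if [ "$dev_server" = 1 ]; then
        case $origin in
            *:*:*)
                # Origin already carries a port: keep scheme and hostname only.
                scheme=${origin%%:*}
                rest=${origin#*:}
                host="$scheme:${rest%%:*}"
                ;;
        esac
        host="$host:$port"
    fi
    echo "$host"
}

# Dev server with socketio_port 9000: the port is appended to the origin.
socketio_host https://erp-staging.lovia.life 1 9000
# Not a dev server: the origin is returned unchanged.
socketio_host https://erp-staging.lovia.life 0 9000
```

The first call prints the origin with :9000 appended, which is the host seen in the failing socket.io request above. The second call shows why production is unaffected.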
Although you can run bench start --no-dev, this results in Error 404 because the bench server does not reverse-proxy to the socketio server. So the least annoying workaround is:
1. Open port 9000, and
2. Configure the load balancer for that too. Don’t forget to set the ALB’s security group to include “socketio-9000-server”.
3. This means you can’t use a CDN for the ERPNext development/staging environment.
If it goes well, you’ll get a 101 Switching Protocols response in your browser.
Migrating Sites using Bench Backup & Restore
Reference: https://discuss.erpnext.com/t/backup-restore/5675
Two ways:
- Bench Backup & Restore
- Manually rsync the sites/SITE_NAME and MariaDB backup. If the source is using Kubernetes, you can do the rsync from the running Kubernetes container. You may need to “apt install openssh-client rsync” first.
Then, for development:
# stop bench first before re-running:
bench start
Or for production:
sudo supervisorctl restart all
Code Server
- Add frappe to sudoers using visudo: frappe ALL=(ALL) NOPASSWD:ALL
- Switch to frappe
- Install code-server
- Edit ~/.config/code-server/config.yaml, change bind-addr to 0.0.0.0
- sudo systemctl enable --now code-server@$USER
- Now visit http://127.0.0.1:8080. Your password is in ~/.config/code-server/config.yaml
- Configure CNAME, target group and application load balancer. Technically, you can use nginx & LetsEncrypt directly instead of using a load balancer, but it may be more work, and offloading SSL increases performance. Use /login as health check endpoint.
Troubleshooting: ALB thinks it’s unhealthy.
Try: curl -v http://0.0.0.0:8080/login and make sure it works. By using 0.0.0.0 instead of localhost, you check for the external bound IP address.
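If you only care about the status code the ALB would see, a short script can pull that out of curl. This is a sketch assuming the target group expects 200, which I believe is the ALB default success code; is_healthy_code and probe_status are made-up helper names.

```shell
#!/bin/sh
# Hypothetical helper: succeed when CODE equals the expected status (default 200).
is_healthy_code() {
    [ "$1" = "${2:-200}" ]
}

# Hypothetical helper: print only the HTTP status code returned by URL
# (curl prints 000 if the connection fails).
probe_status() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1"
}

# On the server one might run:
#   is_healthy_code "$(probe_status http://0.0.0.0:8080/login)" && echo healthy

# Offline demonstration of the comparison itself.
is_healthy_code 200 && echo "200 counts as healthy"
is_healthy_code 502 || echo "502 counts as unhealthy"
```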
Tip: Install Iosevka or Fira Code on your OS, then set Editor Font & ligatures in user settings. (reference: issue #1374)
"editor.fontFamily": "'Iosevka', 'Droid Sans Mono', 'monospace', monospace, 'Droid Sans Fallback'",
"editor.fontLigatures": true
Install Custom App
To install the lovia app, you do it the “normal” (non-Kubernetes) way:
- bench get-app
- install app into desired site
e.g.
# You'll need to enter GitLab credentials here
bench get-app --branch version-13 lovia https://gitlab.com/lovia/lovia.git
bench --site erp-staging.lovia.life install-app lovia
# Optional?
bench --site erp-staging.lovia.life migrate
Switch to Remote AWS RDS MariaDB
- Edit common_site_config.json and add “db_host” and “rds_db”: 1. Optionally: “db_port”: 3306, “db_type”: “mariadb”.
- Make sure sites/SITE_NAME/site_config.json has proper db_name, db_password
- Purge local MariaDB:
sudo apt purge mariadb-server-core-10.5
You can still connect to MariaDB shell by using bench:
bench --site erp-staging.lovia.life mariadb
Updating ERPNext or Custom App
Before updating ERPNext/Frappe/custom app, it’s recommended to update bench CLI first.
#If you're using bench from git, branch develop:
cd ~/.bench
git pull
pip3 install -e ~/.bench
# If you're using stable bench version from pip3:
pip3 install --upgrade frappe-bench
To update just the “lovia” custom app, you can run the command below. It will pull from the app’s git repository, then run bench migrate.
cd ~/frappe-bench
bench update --reset --pull --build --patch --apps lovia
To update all apps including frappe and erpnext, omit the “--apps lovia”. It will update only from the current branch (same major version).
Example:
frappe@erp-production-sg02:~/frappe-bench$ bench update --pull --build --patch --apps lovia
WARN: bench is installed in editable mode!
This is not the recommended mode of installation for production. Instead, install the package from PyPI with: `pip install frappe-bench`
Backing up sites...
$ git pull upstream version-13-beta
Username for 'https://gitlab.com': ceefour
Password for 'https://ceefour@gitlab.com':
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 7 (delta 5), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (7/7), 2.14 KiB | 1.07 MiB/s, done.
From https://gitlab.com/lovia/lovia
* branch version-13-beta -> FETCH_HEAD
bfe0b1b..46ded12 version-13-beta -> upstream/version-13-beta
Updating bfe0b1b..46ded12
Fast-forward
lovia/talentiva/doctype/talent/talent.json | 149 ++++++++++++++++++++++++++++-
1 file changed, 147 insertions(+), 2 deletions(-)
$ find . -name "*.pyc" -delete
Patching sites...
Migrating erp.lovia.life
Updating DocTypes for frappe : [====================] 100%
Updating DocTypes for erpnext : [====================] 100%
Updating DocTypes for lovia : [====================] 100%
Updating Dashboard for frappe
Updating Dashboard for erpnext
Updating Dashboard for lovia
Updating customizations for Address
Updating customizations for Contact
Update global search for all web pages...
Building search index for all web routes...
Compiling Python Files...
$ supervisorctl restart frappe-bench-workers: frappe-bench-web:
frappe-bench-workers:frappe-bench-frappe-schedule: stopped
frappe-bench-workers:frappe-bench-frappe-default-worker-0: stopped
frappe-bench-workers:frappe-bench-frappe-short-worker-0: stopped
frappe-bench-workers:frappe-bench-frappe-long-worker-0: stopped
frappe-bench-web:frappe-bench-node-socketio: stopped
frappe-bench-web:frappe-bench-frappe-web: stopped
frappe-bench-workers:frappe-bench-frappe-schedule: started
frappe-bench-workers:frappe-bench-frappe-default-worker-0: started
frappe-bench-workers:frappe-bench-frappe-short-worker-0: started
frappe-bench-workers:frappe-bench-frappe-long-worker-0: started
frappe-bench-web:frappe-bench-frappe-web: started
frappe-bench-web:frappe-bench-node-socketio: started
________________________________________________________________________________
Bench: Deployment tool for Frappe and Frappe Applications (https://frappe.io/bench).
Open source depends on your contributions, so do give back by submitting bug reports, patches and fixes and be a part of the community :)
Increase Maximum Attachment File Upload Size
By default, the attachment file size limit is 5 MB. Increase this by editing frappe-bench/sites/SITE_NAME/site_config.json and adding (for a 100 MiB limit):
"max_file_size": 104857600
Reference: https://discuss.erpnext.com/t/maximum-file-size/4885/4?u=hendy
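The value is in bytes, so 100 MiB is 100 × 1024 × 1024. A tiny sketch for computing other limits (mib_to_bytes is a made-up helper name):

```shell
#!/bin/sh
# Hypothetical helper: convert a size in MiB to bytes for max_file_size.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 100   # prints 104857600
```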
Troubleshooting: Sometimes 502 Bad Gateway
This has happened since ERPNext v13.2.0 with no clear explanation why. Tailing the logs with tail -f ~/frappe-bench/logs/*.log doesn’t show any relevant messages. CPU utilization, free drive space, and memory utilization seem normal (even abundant).
/var/log/nginx/error.log does give errors:
2021/05/02 12:39:26 [error] 598#598: *278 connect() failed (111: Connection refused) while connecting to upstream, client: 172.31.43.238, server: erp.dirgatama.id, request: "GET /socket.io/?EIO=3&transport=polling&t=Naj5pSe HTTP/1.1", upstream: "http://127.0.0.1:9000/socket.io/?EIO=3&transport=polling&t=Naj5pSe", host: "erp.lovia.life", referrer: "https://erp.lovia.life/app/account/view/Tree"
...
==> logs/web.error.log <==
/usr/lib/python3.8/os.py:1023: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, *args, **kwargs)
[2021-05-02 12:36:16 +0000] [14336] [INFO] Booting worker with pid: 14336
/usr/lib/python3.8/os.py:1023: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, *args, **kwargs)
[2021-05-02 12:39:24 +0000] [14292] [INFO] Handling signal: term
[2021-05-02 12:39:24 +0000] [14336] [INFO] Worker exiting (pid: 14336)
[2021-05-02 12:39:25 +0000] [14292] [INFO] Shutting down: Master
[2021-05-02 12:39:27 +0000] [22804] [INFO] Starting gunicorn 19.10.0
[2021-05-02 12:39:27 +0000] [22804] [INFO] Listening at: http://127.0.0.1:8000 (22804)
[2021-05-02 12:39:27 +0000] [22804] [INFO] Using worker: sync
/usr/lib/python3.8/os.py:1023: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, *args, **kwargs)
[2021-05-02 12:39:27 +0000] [22861] [INFO] Booting worker with pid: 22861
If you’re wondering why server is erp.dirgatama.id but host is erp.lovia.life, it’s because the complete clause in nginx.conf is: server_name erp.dirgatama.id erp.lovia.life ;
So the problem is not with the ALB or nginx, but with web & socketio:
upstream frappe-bench-frappe {
server 127.0.0.1:8000 fail_timeout=0;
}
upstream frappe-bench-socketio-server {
server 127.0.0.1:9000 fail_timeout=0;
}
Check using curl:
# to frappe-web directly
curl --header 'Host: erp.lovia.life' http://127.0.0.1:8000/app
# to nginx
curl --header 'Host: erp.lovia.life' http://127.0.0.1/app
Sometimes access via the ELB gives 502 Bad Gateway, yet curl to frappe-web or nginx works? The erp CNAME has been verified to route directly to the ALB.
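Because the failure is intermittent, a single curl may not catch it. One way to probe is to repeat the request many times and count failures. The sketch below is generic: count_failures is a made-up helper that runs any command N times, demonstrated here with true and false so it works offline.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs
# exited non-zero.
# Usage: count_failures N COMMAND [ARGS...]
count_failures() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Offline demonstration.
count_failures 3 true    # prints 0
count_failures 3 false   # prints 3
```

Against the server, something like count_failures 100 curl -sf --header 'Host: erp.lovia.life' http://127.0.0.1/app would count failed responses from nginx (curl -f exits non-zero on HTTP errors such as 502). Running the same loop against port 8000 and against the ALB hostname may help narrow down which hop drops the request.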