Redwerk’s dev team had to find and eliminate every gap and even the smallest leaks in memory and CPU usage. The actual project load consisted of millions of actions, each containing both @usernames and @things in a single line. Objects required more complex logic than users: each one had to be parsed, checked against the database, created as a new entry if it did not exist yet, and then have its score, leaderboard position, and global trending updated. With millions of such actions, most of them occurring within a single message, the database was overloaded very quickly, as were memory and CPU.
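To make the cost of per-mention processing concrete, here is a minimal sketch, not the project's actual code, of collapsing many mentions into one score delta per name before touching the database. The function name `collapse_actions`, the `known_users` set used to tell users from things, and the "+1 per mention" scoring are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches "@name" tokens; the exact mention syntax is an assumption.
MENTION = re.compile(r"@(\w+)")


def collapse_actions(messages, known_users):
    """Collapse all mentions in a batch into one delta per user and per thing.

    known_users is a hypothetical set of lowercase usernames; any other
    mention is treated as a thing (object).
    """
    user_deltas = Counter()
    thing_deltas = Counter()
    for message in messages:
        for name in MENTION.findall(message):
            key = name.lower()
            if key in known_users:
                user_deltas[key] += 1   # assumed scoring: +1 per mention
            else:
                thing_deltas[key] += 1
    return user_deltas, thing_deltas


batch = ["@alice loves @pizza and @pizza", "@bob likes @Pizza"]
users, things = collapse_actions(batch, {"alice", "bob"})
print(dict(users), dict(things))
```

With the deltas collapsed this way, a batch that mentions the same thing many times would need one existence check and one update for it, for example as part of a single bulk upsert, instead of one round trip per mention.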
Another high-load task was optimizing score updates in the leaderboards, since every action had to affect the positions of users and objects in the global leaderboards. The sheer number of items that had to be parsed, checked, and updated created additional development challenges.
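One common way to keep leaderboard updates cheap is to apply accumulated score deltas in a batch and recompute the top positions once per batch rather than once per action. The sketch below shows that idea with an in-memory dictionary; the names `apply_deltas` and `top_n` are hypothetical, and the source does not say how the real leaderboards were stored or ranked.

```python
import heapq


def apply_deltas(scores, deltas):
    """Add each accumulated delta to the stored score, creating new entries."""
    for name, delta in deltas.items():
        scores[name] = scores.get(name, 0) + delta
    return scores


def top_n(scores, n):
    """Return the n highest (name, score) pairs, highest score first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


scores = {"alice": 5, "pizza": 2}
apply_deltas(scores, {"pizza": 4, "bob": 1})
print(top_n(scores, 2))
```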
To optimize all these actions, our team developed functionality that uses multiple CPU cores on each server, along with a dedicated Bot Manager that routes each bot to a specific node (server + CPU core), a kind of task the Amazon load balancer couldn’t handle. After that, we rebuilt the request logic to reduce the number of requests.
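The source does not describe how the Bot Manager chooses a node, so the following is only a sketch of one possible approach: hashing the bot identifier so that the same bot always lands on the same (server, core) pair. The function names, the server names, and the core count are hypothetical.

```python
import hashlib


def build_nodes(servers, cores_per_server):
    """Enumerate every node as a (server, core index) pair."""
    return [(server, core) for server in servers for core in range(cores_per_server)]


def route_bot(bot_id, nodes):
    """Map a bot to a fixed node using a stable hash of its identifier."""
    digest = hashlib.sha256(bot_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = build_nodes(["server-a", "server-b"], cores_per_server=4)
print(route_bot("bot-42", nodes))
```

A stable hash is used here instead of Python's built-in `hash()`, which is randomized per process for strings and would route the same bot differently after a restart. Note that plain modulo routing reshuffles most bots when the node list changes, so a real manager might keep an explicit assignment table instead.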
We also switched from Ubuntu to CentOS. Ubuntu didn’t allow us to create additional connections to the database. CentOS, in turn, provided a more stable environment for working with MongoDB and sustained the excessive action load. It also allowed us to reconfigure the AWS server and update OS kernel settings on the nodes to raise the file descriptor and thread limits.
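File descriptor limits matter here because every database connection and socket consumes a descriptor. As an illustration only, the sketch below uses Python's standard `resource` module (Unix only) to inspect the per-process open-file limit and raise the soft limit up to the hard limit. The function name and the target value are assumptions, and the system-wide settings the team actually changed are not specified in the source.

```python
import resource


def raise_open_file_limit(target):
    """Raise the soft RLIMIT_NOFILE toward target, never above the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unlimited hard limit places no cap on the requested value.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    # Only ever raise the limit; leave an unlimited soft limit alone.
    if soft != resource.RLIM_INFINITY and new_soft > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


# Hypothetical target value, chosen only for the example.
print(raise_open_file_limit(4096))
```

Raising the hard limit itself, or the thread limits, requires administrator-level configuration of the operating system and cannot be done from an unprivileged process like this one.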