Bottlenecks in the cloud: the story of Pokémon Go and Trivia Crack

Lesson: "A system that works with two million users might not be able to cope with ten million."

After its release in the United States in July 2016, Pokémon Go became the most popular augmented reality game of its time. The game was the product of many years of cooperation between Niantic and Google (before Niantic stood on its own feet, it was an internal Google startup). As a result, the Pokémon Go infrastructure depends heavily on Google's cloud platform and application services. (Nintendo and The Pokémon Company also took part in creating the addictive gameplay of raising little monsters on mobile platforms.)

It was not Niantic's first augmented reality game. The company had previously created Ingress, a game about an alien invasion, released in 2013 for Android devices. But Pokémon Go was a game on a completely different level: Pokémon has long been a cultural phenomenon, and its audience had been waiting for a mobile game for many years. As a result, the number of installations grew rapidly. Within half a day, the game ranked first in revenue on the iPhone. To some extent, it was the most ambitious mobile game release in the world.

But its success increased the load on the platform: two days after the release, Niantic CEO John Hanke announced that the company was postponing the worldwide release of Pokémon Go because of overloaded servers. At the same time, privacy problems arose from the way Niantic worked with Google's identity and location services. The company had to fix many bugs while also solving its server capacity problems.

In theory, the cloud should cope with peak periods. The simplified application management and the services offered by cloud providers were supposed to simplify the development of all kinds of mobile apps (not only games). And the capabilities of the cloud really have made it easier to use new features that require high computational capacity (such as augmented reality).

But, as with other networked platforms of the past, developers found that all these facilities do not matter if you cannot connect to them. The more interactive a mobile application is, the harder the exchange of data between mobile devices and cloud infrastructure becomes. Add in the differences in data transfer speeds among mobile network operators around the world, and you get a system in which many parameters must be taken into account to give users the speed they need.

Few applications go viral on a global scale the way Pokémon Go did. Developers who want to scale their games and applications will find it useful to study how Niantic and other game developers coped with the unexpected success that fell on them. If hit mobile games can deal with the obstacles that arise when testing and debugging the performance of device-to-cloud interaction, then corporations are also capable of solving problems with unexpected peaks in user activity.

will be processed


To learn the lessons of Pokémon Go, Ars Technica recently spoke with Niantic CTO Phil Keslin. We talked about the complex interactions between the public parts of the Google cloud and internal data.

Pokémon Go uses Google Compute Engine, Google's cloud storage, and a full stack of network technologies, including infrastructure, data processing, and queries. Of course, the game uses Google Maps to determine the player's location. According to Keslin, every change in gameplay state requires the mobile client to make calls to Niantic's data store. "Every time you change the state of the game, whether by throwing a Poké Ball, catching a Pokémon, or taking some other action, you interact with the data store."

When the first large peak came, according to Keslin, "Google didn't even notice, but the game at least doubled the amount of data being processed." However, this did not overload the systems. "The easiest way to put it: we had forecast a worst-case scenario, and the game surpassed even that." On release day there was a real explosion. "We found bottlenecks that slowed performance. After eliminating them, we would run into a new bottleneck."

Some of the bottlenecks were in Niantic's own code, "but we had problems with a couple of open source libraries that we did not expect, and those were the most difficult to solve." In total, Niantic found five or six bottlenecks, each of which took one to two days to eliminate.

But some of the faults were on Google's side. Pokémon Go ran into problems with the cloud infrastructure: the container engine consists of subsystems that had never been tested under such a load. There were also a couple of problems with the network stack.

Eliminating the bottlenecks required a lot of work from a team of five people: Keslin, a team leader, and three engineers. "In the first two weeks we hardly slept," says Keslin. "The guys from Google were working flat out too."

Another factor that influenced performance in different regions, according to Keslin, was the difference between mobile operators in different parts of the world. "We developed Pokémon Go so that the game could work on mobile devices with low bandwidth. The problems that arose had more to do with the marketing programs of telecom operators." For example, a large mobile operator in the Philippines gave all its subscribers free access to Pokémon Go, so Niantic had to handle connecting those users and then disconnecting them once the promotion ended.

Despite the initial chaos, Keslin said that Niantic did not have to change the architecture of Pokémon Go after the game's release. (The company continues to steadily improve the application and is preparing to launch the second generation of Pokémon Go, which it hopes will give the game a second wind now that the wave of Pokémania has subsided.) "The infrastructure was designed with the full set of Pokémon in mind," he says. "The core of the system remains the same; we just add new gameplay. We were lucky that we managed to create a scalable system, because we had not tested it. Fortunately, the architecture we built scales reliably."

What can Keslin advise other developers seeking to create a new augmented reality phenomenon? "Think about scaling from the start. Our game's development team was focused on performance. Because of that, we were able to maximize performance at low cost and were able to scale the system."

Addicted to Trivia Crack



Trivia Crack logo (Etermax)

Other companies have also taken their gaming infrastructure through the public cloud, with mixed results. Two years before Pokémon Go hit the jackpot, the Argentine company Etermax created its own mobile gaming hit: Trivia Crack, one of a family of competitive games, running on Amazon Web Services.

According to Etermax CTO Gonzalo Garcia, Trivia Crack's huge success came in two waves. The first, in March 2014, marked the game's success in some South American countries: traffic grew from 100 thousand to 10 million daily active users. The second wave came when the game became popular in the United States in October of the same year, increasing the number of daily active users to 25 million.

"We didn't see it coming," says Garcia. "According to our estimates and tests, we knew that we could cope with one million players, and we had planned for two million. But we did not expect such growth; we hadn't even invested that much in advertising! Without cloud server infrastructure, we would never have coped."

"It felt like we were releasing another company's game," adds Etermax director of information technology Martin Dominguez. "Previously, the limit was one million users, but what works for two million is not always enough for ten million."

This load put stress on the company's Agile development process. "We believed that we were not bound to Scrum, and we always worked in an Agile style," says Dominguez. "The problem was that with this jump in the number of users, sprints could not be completed in two weeks; we had to work day by day."

To cope with the popularity, Etermax first had to give up some features. It also had to change some of its databases and adapt its processes to improve efficiency. In particular, Etermax changed the way it used Amazon Relational Database Service (RDS): it implemented sharding, secondary servers, and data exchange among the secondary servers. It also worked with AWS to resolve a problem with the number of packets per second.
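
The database changes described above (splitting data into fragments, or shards, and adding secondary servers) can be illustrated with a minimal routing sketch. This is not Etermax's implementation: the server names, the `SHARDS` layout, and the functions `shard_index` and `pick_server` are all hypothetical, and the sketch assumes writes go to a shard's primary while reads may be served by a secondary.

```python
import hashlib

# Hypothetical cluster layout (an assumption for illustration only):
# each shard has one primary server and a few secondary servers.
SHARDS = [
    {"primary": "db-0-primary", "replicas": ["db-0-replica-a", "db-0-replica-b"]},
    {"primary": "db-1-primary", "replicas": ["db-1-replica-a", "db-1-replica-b"]},
    {"primary": "db-2-primary", "replicas": ["db-2-replica-a", "db-2-replica-b"]},
]


def shard_index(user_id: str, shard_count: int) -> int:
    """Map a user ID to a shard number with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted
    per process and would route the same user differently on each run.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def pick_server(user_id: str, write: bool, shards=SHARDS) -> str:
    """Return the server that should handle a request for this user."""
    shard = shards[shard_index(user_id, len(shards))]
    # Writes always go to the primary; so do reads if there are no secondaries.
    if write or not shard["replicas"]:
        return shard["primary"]
    # Spread reads across the secondaries, deterministically per user.
    digest = hashlib.sha256(("read:" + user_id).encode("utf-8")).digest()
    return shard["replicas"][digest[0] % len(shard["replicas"])]


if __name__ == "__main__":
    for player in ("player-1", "player-2", "player-3"):
        print(player, pick_server(player, write=True), pick_server(player, write=False))
```

Simple modulo routing like this keeps each user's data on one shard, but adding a shard later moves most users to a different one, which is one reason changing a database layout under live load is hard.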

To support advanced network functionality, Etermax had to migrate from Amazon's public network to a virtual private cloud. "It was quite difficult," recalls Dominguez. "At the second peak, when the number of users grew to 25 million, the AWS staff said they had never seen a company that used RDS at that level."

Garcia said that one feature helped Etermax handle the explosive growth in Trivia Crack's popularity. "Almost all of Trivia Crack's interface lived on the mobile device, so only small amounts of information had to be transmitted." Because of this, synchronous connections were less of a problem for Trivia Crack than for other Etermax games, such as the mobile bingo game Bingo Crack: "We constantly need to transmit information about the bingo balls, to know who is first, second, or third to shout 'Bingo!'"

What did the success of Trivia Crack teach Etermax? According to Dominguez, when it comes to infrastructure, you need to think fast, be proactive in making changes, and anticipate which problems will arise. "Pokémon Go had the same problem: every day we had to survive." To prepare for testing and be ready for a new hit, Etermax grew its core development team from three to ten engineers.

Managing success


Patric Palm is the technical director and founder of the Swedish company Hansoft. The company created Favro, a collaborative work tool used by many game developers, including Ubisoft and id Software. Palm has had the opportunity to watch the company's customers solve their problems in the mobile gaming industry.

Considering the problems Pokémon Go faced, Palm highlights the differences in connection speeds across regions of the world. However, he stressed that thanks to cloud computing, scalability has become a smaller obstacle.

"A few years ago, scalability was a much more serious problem, one that required a lot more people." Since Niantic was able to hand those problems off to Google Cloud, it could focus on implementation and registration in different countries. "Cloud systems solve one of the biggest business problems," said Palm.

Now that a solution to the scalability problem has been found, attention turns to other, more minor problems. In the case of Pokémon Go, one of them is rapid battery drain, which forces players to run around town with external batteries. "Niantic's game drains the battery heavily," says Palm. "Developers of cloud games now have to think hard about energy consumption."

Another related problem is the data limits on different tariffs. "Not every user has a good mobile tariff." This is another incentive to move more of the load to the server side. Games running in the background consume not only energy but also data.

"We need the cloud more."

Rules of the game


If Pokémon Go and Trivia Crack were able to cope with the scaling issues of cloud computing on mobile platforms, then you can succeed too. Here are some tips for assessing your own task:

  1. Consider a scenario "worse than the worst." Of course, when it comes to estimates, you need to choose some parameters. But you also need to plan how to scale above the maximum limit. The main advantage of cloud computing is elasticity, so think about what will happen if the scale grows faster than you have tested.
  2. Discuss emergency situations with your service provider. When the traffic increased, Niantic and Etermax had to work closely with Google and Amazon. When choosing a cloud provider, specifically discuss what services it can provide if your needs increase dramatically compared to expectations.
  3. Study the network operators and mobile devices. If your cloud-based technology will run on users' own mobile devices (perhaps in different parts of the world), think about how many connections will be made from device to cloud, and evaluate the tariff plans, the slowest connections, and the most problematic operators your product will encounter.
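
The tips on planning beyond the worst case and on estimating device-to-cloud traffic can be turned into a rough back-of-the-envelope calculation. The sketch below is only an illustration: the user counts come from the Etermax figures quoted in this article, while the actions per user, the peak factor, and the bytes per action are invented assumptions that you would replace with your own measurements.

```python
SECONDS_PER_DAY = 86_400


def peak_requests_per_second(daily_active_users, actions_per_user_per_day, peak_factor):
    """Average request rate over a day, scaled by a peak-to-average factor."""
    average_rps = daily_active_users * actions_per_user_per_day / SECONDS_PER_DAY
    return average_rps * peak_factor


def headroom(tested_users, actual_users):
    """How many times the real audience exceeds the load-tested audience."""
    return actual_users / tested_users


def monthly_data_mb(actions_per_user_per_day, bytes_per_action, days=30):
    """Rough mobile data used by one player per month, in megabytes."""
    return actions_per_user_per_day * bytes_per_action * days / 1_000_000


if __name__ == "__main__":
    planned = 2_000_000  # the audience Etermax had planned for
    # The two growth waves described in the article.
    for actual in (10_000_000, 25_000_000):
        # Assumptions: 200 actions per user per day, peaks at 3x the average.
        rps = peak_requests_per_second(actual, 200, 3)
        print(f"{actual:>11,} users: {headroom(planned, actual):.1f}x the plan, "
              f"about {rps:,.0f} requests/s at peak")
    # Assumption: about 2,000 bytes exchanged per action.
    print(f"about {monthly_data_mb(200, 2_000):.0f} MB per player per month")
```

Even with made-up inputs, a calculation like this makes it easier to ask a cloud provider concrete questions about what happens at five or ten times the tested load.
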
Article based on information from habrahabr.ru
