📜 ⬆️ ⬇️

Netflix Best Practices for Moving from Oracle DB to Amazon SimpleDB

This is a partial translation of the Netflix article; it concerns only problematic transitions from Oracle DB to Amazon SimpleDB and how the company resolves them.

Around the end of 2008, Netflix had only one data center. And this DC has put before us a few questions. As the only point of failure, for example, due to problems with electricity, it could lead to dissatisfaction of our users with the service. In addition, with the simultaneous growth of streaming traffic and subscriptions to services, Netflix would soon outgrow this data center - we saw an inevitable need for electricity, better cooling, we needed more space and more equipment.

Alternatively, it was possible to build new data centers. However, besides the high costs, this effort would have resulted in our technical staff not being able to deal with new products, so they would be busy expanding the DC. In addition, we understood that managing multiple data centers is a difficult task. Building and maintaining multiple data centers seemed like a dangerous distraction from our core business.

Instead of taking this path, we chose a more radical one. We switched to the IAAS (Infrastructure as a Service) solution offered at that time by the Amazon web service. With a multitude of data centers already operating, multiple redundancy levels of various services (such as S3 and SimpleDB), AWS promised better availability and scalability in a relatively short time.
')
Putting various networking and background tasks into outsourcing, Netflix focused on its core business, delivering movies and TV shows.

In the process of transition to the AWS infrastructure, we formulated a set of best practices needed to work with AP systems such as SimpleDB.

Leaving the DBMS Behind


Partial or no SQL support at all. In general, SimpleDB supports submultiply SQL


Lack of connections between domains


No transaction


No triggers


There is no schema support — and besides, this is not obvious. A request with an invalid attribute name does not cause an error


Lack of sequence support

Sequences are often used as primary keys.


Sequences are also often used for order numbers.


No operations to work with time


There is no support for restrictions, in particular, there are no restrictions on the uniqueness of the field, no control of the foreign key, no restrictions on the integrity.


New challenges posed by SimpleDB


In addition, there were features that we encountered that were specific to SimpleDB. Here are some of them:

SimpleDB domains provide the maximum write speed, if you split the data into several domains.


No support for native data types. All data is stored and processed as strings.

All comparisons (i.e. WHERE conditions) and sorting occur only with strings.


Two separate API calls for DeleteAtributes and PutAttributes.

How to perform an atomic operation that requires both deleting one attribute and updating another attribute in the same line?


Using case-sensitive domain names and attributes

In many DBMS, the names of tables and columns are case-insensitive, and in SimpleDB, on the contrary, in this connection, the put, delete, and select operations may not work correctly without even reporting an error. The programmer’s task is to detect all inconsistencies in the name register.


Operations with typos in select, put, or delete attribute names can end without error notification.

Unlike case sensitivity, this problem occurs due to the lack of schema checking. Simple DB is not only a sparse database, but also without schema support.


If you forget to specify LIMIT in select, you may need several queries in order to get all the data.

Multiple requests reduce the likelihood that the site will receive all the data for the required time interval.


It is also necessary to bear in mind the problems of “integrity ultimately” ( eventual consistency ).

Anti-pattern reading should be avoided immediately after writing.


Non-indexed queries can be very expensive.

Anti-pattern: SELECT * FROM MY_DOMAIN WHERE MY_ATTR_1 IS NULL


Some queries are slow, although indexed.

Index selectivity affects speed just like other SQL engines.


As with any other multi-user system, significant jumps in response time may appear.


Deciding to switch to SimpleDB and S3, Netflix very quickly migrated to the cloud infrastructure, fully armed with new aggressive product launch plans and traffic growth. There were problems, but we coped with almost everyone.

Source: https://habr.com/ru/post/163961/


All Articles