How I wrote my own "bicycle" for rotating tables in Oracle and PostgreSQL

Greetings, Habr!

In this article I will tell how we dealt with the problem of rapidly growing tables in the database of a busy EMS system. An extra twist is that the problem had to be solved for two databases at once: Oracle and PostgreSQL. Interested readers are welcome under the cut.

Initial conditions


So, there is a certain EMS system that receives and processes messages from network elements. A record of each message is logged in a database table. According to the customer's requirements, the number of incoming messages (and therefore the number of records in the table) averages 100 per second, and under peak load it may rise to 1500.
It is easy to calculate that, on average, more than 8 million records are added to the table per day (100 messages per second × 86,400 seconds ≈ 8.6 million). The problem appeared when it turned out that at volumes above 20 million rows some of the system's queries begin to slow down and exceed the response times requested by the customer.

Task


Thus, I had to figure out what to do with the data so that no information was lost and the queries stayed fast. The system initially ran on PostgreSQL, but a migration to Oracle was planned soon, and we wanted that migration to cause as few problems as possible when porting the functionality.
The option of using partitioning was dropped immediately: it was known that Oracle Partitioning would almost certainly not be included in the license, and rewriting from one partitioning mechanism to the other held little appeal, so we started thinking about implementing a "bicycle" of our own.
The task was made easier by the fact that logs older than a couple of days do not need to be displayed in the system: to investigate the vast majority of problems, the messages of the last two days should be enough. But they still have to be kept "just in case". And so the idea was born to implement procedures for periodic "rotation" of the data in the tables, i.e. moving it from the display tables into historical tables.

Solution implementation


It was decided to keep two tables with the most relevant data for display (let's call them table, the main one, and table_secondary, the auxiliary one). The view table_view is hung over these two tables, and the UI takes its data from it: this way, at the moment of the transfer no records abruptly disappear from the UI. Older records are moved into historical tables with names of the form H$table_NUM, where NUM is the number of the historical table (the higher the number, the older the data). The historical tables, so as not to clog the main tablespace, are also periodically dragged into a "cold" tablespace whose tables can be stored on slower disks. This operation is rather heavy, so it is performed rarely, by a separate procedure. In addition, that procedure removes tables that are too old from the "cold" tablespace.
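For illustration, the view over the two display tables might look roughly like this (a sketch only: table_main stands in for the article's table, since TABLE is a reserved word; the actual DDL is in the project source):

    CREATE OR REPLACE VIEW table_view AS
    SELECT * FROM table_main
    UNION ALL
    SELECT * FROM table_secondary;
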
A word about how exactly the data is migrated: because of the large number of indexes on the tables, moving the records directly with INSERT worked slowly, so we chose the approach of renaming the tables and recreating the indexes and triggers.
Schematically, the work of the procedures is presented in the figure below:
(figure: schematic diagram of the rotation procedures)

So, the algorithm works roughly like this (the algorithm and sample code are given for the Oracle procedures; the PostgreSQL versions can be viewed on GitHub):
Procedure rotate_table(primary_table_name). Runs, say, every hour.
  1. Check that the number of rows in the main table has exceeded a certain limit (a sketch of this check is given after the list);
  2. Check that the "cold" tablespace exists:

    SELECT COUNT(*) INTO if_cold_ts_exists FROM USER_TABLESPACES WHERE tablespace_name = 'EMS_HISTORICAL_DATA';
    

  3. Create an empty auxiliary table new_table based on the current main table. For this, PostgreSQL has the convenient CREATE TABLE ... (LIKE ... INCLUDING ALL) syntax; for Oracle we had to write its counterpart, the procedure create_tbl_like_including_all(primary_table_name, new_table_name, new_idx_trg_postfix, new_idx_trg_prefix), which creates a similar empty table:
    -- take the main table's DDL and substitute the new table name into it
    SELECT replace(dbms_metadata.get_ddl('TABLE', primary_table_name), primary_table_name, new_table_name)
    INTO ddl_query
    FROM dual;
    -- cut off the trailing character and execute the resulting DDL
    ddl_query := substr(ddl_query, 1, length(ddl_query) - 1);
    EXECUTE IMMEDIATE ddl_query;
    

    As well as the triggers and indexes for it (the index loop is shown below; triggers are recreated the same way):

    -- recreate each index of the main table on the new table, under a temporary name
    FOR idx IN (SELECT idxs.index_name FROM user_indexes idxs WHERE idxs.table_name = primary_table_name)
    LOOP
        ddl_query := REPLACE(
            REPLACE(dbms_metadata.get_ddl('INDEX', idx.index_name), primary_table_name, new_table_name),
            idx.index_name, new_idx_trg_prefix || idx.index_name || new_idx_trg_postfix);
        ddl_query := substr(ddl_query, 1, length(ddl_query) - 1);
        EXECUTE IMMEDIATE ddl_query;
    END LOOP;
    

  4. Rename the tables in a chain: the auxiliary table becomes historical, the main one becomes auxiliary, and the freshly created empty table becomes the main one:

    EXECUTE IMMEDIATE 'alter table ' || secondary_table_name || ' rename to ' || history_table_name;
    EXECUTE IMMEDIATE 'alter table ' || primary_table_name || ' rename to ' || secondary_table_name;
    EXECUTE IMMEDIATE 'alter table ' || new_table_name || ' rename to ' || primary_table_name;
    

  5. Rename the triggers and indexes on them accordingly (sketched after the list);
  6. If the "cold" tablespace does not exist, we assume that historical data need not be kept, and drop the corresponding table:

    EXECUTE IMMEDIATE 'drop table ' || history_table_name || ' cascade constraints';
    

  7. Rebuild the views (Oracle only): selecting from a view that became invalid after the renames makes Oracle recompile it:

    EXECUTE IMMEDIATE 'select * from ' || view_name || ' where 1=0';
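
Steps 1 and 5 have no listing above; a minimal sketch of how they might look (rows_cnt and rotation_limit are illustrative variables assumed to be declared in the procedure, the rest uses its parameters):

    -- step 1: check whether the main table has grown past the rotation threshold
    EXECUTE IMMEDIATE 'select count(*) from ' || primary_table_name INTO rows_cnt;
    IF rows_cnt < rotation_limit THEN
        RETURN; -- nothing to rotate yet
    END IF;

    -- step 5: strip the temporary prefix/postfix from the index names after the
    -- renames, so the indexes of the new main table get their usual names back
    FOR idx IN (SELECT index_name FROM user_indexes WHERE table_name = primary_table_name)
    LOOP
        EXECUTE IMMEDIATE 'alter index ' || idx.index_name || ' rename to '
            || replace(replace(idx.index_name, new_idx_trg_prefix), new_idx_trg_postfix);
    END LOOP;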
    



Procedure move_history_logs_to_cold_ts(primary_table_name). Executed, e.g., once a day.
  1. If the "cold" tablespace exists, find all the historical tables that are not yet in it:

    EXECUTE IMMEDIATE
        'select table_name
         from user_tables
         where table_name like ''' || history_table_pattern || '''
           and (tablespace_name != ''EMS_HISTORICAL_DATA'' or tablespace_name is null)'
        BULK COLLECT INTO history_tables;
    

  2. Move each such table into the "cold" tablespace:

    EXECUTE IMMEDIATE 'ALTER TABLE ' || history_tables(i) || ' MOVE TABLESPACE ems_historical_data';
    

  3. Drop the triggers and indexes of the moved tables (sketched below);
  4. Drop tables that are too old from the "cold" tablespace (also sketched below).
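
Steps 3 and 4 likewise have no listing; roughly, they might look like this (a sketch: max_history_tables is an assumed retention setting, and the table number is parsed out of the H$table_NUM name):

    -- step 3: historical tables need neither triggers nor indexes, drop them
    FOR trg IN (SELECT trigger_name FROM user_triggers WHERE table_name = history_tables(i))
    LOOP
        EXECUTE IMMEDIATE 'drop trigger ' || trg.trigger_name;
    END LOOP;
    FOR idx IN (SELECT index_name FROM user_indexes WHERE table_name = history_tables(i))
    LOOP
        EXECUTE IMMEDIATE 'drop index ' || idx.index_name;
    END LOOP;

    -- step 4: drop the historical tables whose number exceeds the retention limit
    FOR tbl IN (SELECT table_name FROM user_tables
                WHERE tablespace_name = 'EMS_HISTORICAL_DATA'
                  AND table_name LIKE history_table_pattern
                  AND to_number(substr(table_name, instr(table_name, '_', -1) + 1)) > max_history_tables)
    LOOP
        EXECUTE IMMEDIATE 'drop table ' || tbl.table_name || ' cascade constraints';
    END LOOP;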


Launching the procedures on a schedule was implemented with Quartz Scheduler in the case of PostgreSQL and with Oracle Scheduler in the case of Oracle; the configuration scripts are also in the source code.
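
In Oracle, for example, the hourly rotation job might be created roughly like this (a minimal sketch: the job name and the EMS_LOG table name are illustrative; the actual configuration scripts are in the repository):

    BEGIN
        DBMS_SCHEDULER.CREATE_JOB(
            job_name        => 'ROTATE_TABLE_JOB',  -- illustrative name
            job_type        => 'PLSQL_BLOCK',
            job_action      => 'BEGIN rotate_table(''EMS_LOG''); END;',
            start_date      => SYSTIMESTAMP,
            repeat_interval => 'FREQ=HOURLY',
            enabled         => TRUE);
    END;
    /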

Conclusion


The full source code of the procedures and the scheduler configuration scripts can be viewed on GitHub.

Thank you for your attention!