MySQL 5.0s Pluggable Storage Engine Architecture, Part 1: An Overview
MySQL 5.0’s Pluggable Storage Engine Architecture, Part 1: An Overview
The reasons for the huge popularity and increasing adoption of MySQL as a serious database platform can many times be boiled down to two, but surprisingly contrasting, considerations:
- How much MySQL is like other database platforms.
- How much MySQL is different than other database platforms.
Of course, many companies are turning to MySQL because of the database server’s high performance, rock-solid reliability, and uncomplicated mode of operation. But in addition, modern enterprises migrating from proprietary databases to MySQL find the transition very easy as MySQL sports ANSI standard SQL, a familiar stored procedure/function language, standard relational tables and indexes, and many other features that resemble most standard relational database characteristics. In this fashion, MySQL mirrors other proprietary databases, which means that the migration from other systems and the ramp-up time of training database personnel in MySQL is very quick and painless.
On the other hand, many large enterprises are choosing MySQL because it offers a new and different paradigm of database management. Perhaps the one key differentiator between MySQL and other database platforms — whether they are proprietary or open source — is the pluggable storage engine architecture of MySQL.
What exactly is the MySQL pluggable storage engine architecture and what benefits does it offer to today’s modern enterprises? This paper addresses these questions and more by outlining what the MySQL pluggable storage engine architecture is, giving practical examples of how it works, and showcasing the many benefits that come from using it.
Overview of the MySQL Pluggable Storage Engine Architecture
The MySQL pluggable storage engine architecture allows a database professional to select a specialized storage engine for a particular application need while being completely shielded from the need to manage any specific application coding requirements. The MySQL server architecture encapsulates the application programmer and DBA from all of the low-level implementation details at the storage level providing a consistent and easy application model and API. So while there are different capabilities across different storage engines, the application is shielded from these.
Graphically depicted, the MySQL pluggable storage engine architecture looks like Figure 1.

Figure 1 — MySQL Pluggable Storage Engine Architecture is both flexible and modular
The pluggable storage engine architecture provides a standard set of management and support services that are common among all underlying storage engines. The storage engines themselves are the components of the database server that actually perform actions on the underlying data that is maintained at the physical server level.
This efficient and modular architecture provides huge benefits for those wishing to specifically target a particular application need — such as data warehousing, transaction processing, high availability situations, etc. — while enjoying the advantage of utilizing a set of interfaces and services that are independent of any one storage engine.
The application programmer and DBA interact with the MySQL database through Connector APIs and service layers that are above the storage engines. If application changes bring about requirements that demand the underlying storage engine change, or that one or more additional storage engines be added to support new needs, no significant coding or process changes are required to make things work. The MySQL server architecture shields the application from the underlying complexity of the storage engine by presenting a consistent and easy to use API that applies across storage engines.
Let’s first take a look at the common layer of the MySQL database server and then examine what a storage engine actually is and how they compare to one another in terms of functionality and use.
The Common MySQL Database Server Layer
A MySQL pluggable storage engine is the component in the MySQL database server that is responsible for performing the actual data I/O operations for a database as well as enabling and enforcing certain feature sets that target a specific application need. A major benefit of using specific storage engines is that you are only delivered the features needed for a particular application, and therefore you have less system overhead in the database, with the end result being more efficient and higher database performance. This is one of the reasons that MySQL has always been known to have such high performance, matching or beating proprietary monolithic databases in industry standard benchmarks.
From a technical perspective, what are some of the unique supporting infrastructure components that are in a storage engine? Some of the key differentiations include:
- Concurrency — some applications have more granular lock requirements (such as row-level locks) than others. Choosing the right locking strategy can reduce overhead and therefore help with overall performance. This area also includes support for capabilities like multi-version concurrency control or ’snapshot’ read.
- Transaction Support - not every application needs transactions, but for those that do, there are very well defined requirements like ACID compliance and more.
- Referential Integrity - the need to have the server enforce relational database referential integrity through DDL defined foreign keys.
- Physical Storage - this involves everything from the overall page size for tables and indexes as well as the format used for storing data to physical disk.
- Index Support - different application scenarios tend to benefit from different index strategies, and so each storage engine generally has its own indexing methods, although some (like B-tree indexes) are common to nearly all engines.
- Memory Caches - different applications respond better to some memory caching strategies than others, so while some memory caches are common to all storage engines (like those used for user connections, MySQL’s high-speed Query Cache, etc.), others are uniquely defined only when a particular storage engine is put in play.
- Performance Aids - includes things like multiple I/O threads for parallel operations, thread concurrency, database checkpointing, bulk insert handling, and more.
- Miscellaneous Target Features - this may include things like support for geospatial operations, security restrictions for certain data manipulation operations, and other like items.
Each set of the pluggable storage engine infrastructure components are designed to offer a selective set of benefits for a particular application. Conversely, avoiding a set of component features helps steer clear of unnecessary overhead. So it stands to reason that understanding a particular application’s set of requirements and selecting the proper MySQL storage engine can have a dramatic impact on overall system efficiency and performance.
Let’s now take a look at some of the more prominently used MySQL storage engines and how they compare to one another to understand the benefits each provides.
Comparing Different Storage Engines
As can be seen in Figure 1, there are a number of MySQL pluggable storage engines that can be used with the MySQL database server. Some of the more common engines include:
- MyISAM - the default MySQL pluggable storage engine and the one that is used the most in Web, data warehousing, and other application environments. Note that a MySQL server’s default storage engine can easily be changed by altering the STORAGE_ENGINE configuration variable.
- InnoDB - used for transaction processing applications, and sports a number of features including ACID transaction support.
- BDB - an alternative transaction engine to InnoDB that supports COMMIT, ROLLBACK, and other transactional features.
- Memory - stores all data in RAM for extremely fast access in environments that require quick look ups of reference and other like data.
- Merge - allows a MySQL DBA or developer to logically group together a series of identical MyISAM tables and reference them as one object. Good for VLDB environments like data warehousing.
- Archive - provides the perfect solution for storing and retrieving large amounts of seldom-referenced historical, archived, or security audit information.
- Federated - offers the ability to link together separate MySQL servers to create one logical database from many physical servers. Very good for distributed or data mart environments.
- Cluster/NDB - the Clustered database engine of MySQL that is particularly suited for applications with high performance lookup needs that also require the highest possible degree of uptime and availability.
- Other - other storage engines include CSV (references comma-separated files as database tables), Blackhole (for temporarily disabling application input to the database) and an Example engine that helps jump start the process of creating custom pluggable storage engines.
While the above brief descriptions will give you a general idea of what type of application might benefit from a particular storage engine, a more detailed look at various common database tasks and needs across the various engines may help delineate the differences a little more. Keep in mind that the grid on the next page is not exhaustive by any means; for a more detailed analysis of each storage engine’s feature set, please see the MySQL Reference Manual.
Of course, you can use multiple storage engines in a single application; you are not limited to using only one storage engine in a particular database. So, you can easily mix and match storage engines for the given application need. This is often the best way to achieve optimal performance for truly demanding applications: use the right storage engine for the right job.
Because you have such flexibility and choice with MySQL, you should carefully weigh your application’s requirements before selecting a particular storage engine for use. For example, let’s say you have a new world-wide Business Intelligence (BI) application with the following requirements:
- Heavy amounts of nightly data loads with a small time window for the loads to complete.
- Other than load activity, the only other operations are read-only in nature, which implies no need for high levels of data concurrency as only shared locks are used for reads.
- Application will be hosted from a web interface that requires full-text search capabilities.
- Transaction support is not necessary
- Data referential integrity is assured via the source transactional system.
- Data will be replicated to various geographical sites for performance benefits.
For this application, the natural choice would likely be MyISAM. Again, however, there may be more detailed parts of the application that are suited for a different storage engine. For example, maybe the BI application described above has both current and seldom-referenced historical data that must be kept online for government compliance reasons. In that case, a mixture of the MyISAM and Archive storage engines would be recommended.
In Part II of this series we will examine the MySQL pluggable storage engine architecture in action to see how easy it is to make use of it and switch between various storage engines during an application design and testing phase.