May 2, 2022

Modifying Magento core database tables?

Yireo Blog Post

When learning to build modules on top of Magento 2 (v2.0.0 alpha 42 or so), I learned one thing: Do not hack the database. At least, that's what I thought would always be the correct thing to do. But nowadays, I'm undergoing a mind shift. Is it ok to modify the Magento core database tables anyway?

Been there, done that

With Magento 1, I experienced first hand what it could mean to alter core tables. Hacks were made and then - with a major Magento upgrade - the core table structure was dramatically changed. With the EAV changes in Magento 1.3 and 1.4, things simply broke big time. After that, newer Magento 1.X releases never changed the database structure that much, so hacking your changes in there caused less issues.

Until that moment that you wanted to upgrade from Magento 1 and Magento 2. Even though the Data Migration Tool allowed you to add your own PHP functionality to map and transform fields, this was so troublesome that I always recommended people in practice to move data into custom tables, undoing the hacks made in core tables.

No core hacks in the core tables

In short, when the core database structure changes frequently, adding your own customizations to that core database structure is not so smart. I compare it to core hacks - modifying core code in vendor/ might work but only upto a certain point. However even core hacks could be a proven way to modify things: You could document where a certain adjustment in vendor code is made. You could even automate that with composer patches. And if that procedure saves you time, perhaps it is better than to implement the official solution. It depends. And it requires some wisdom.

Let's talk about what it means to modify core tables. And let's talk about what would be the wisest solution.

What does "modifying core tables" mean?

An important question is: What does modifying core tables mean? Obviously, shortening the numeric length of an entity ID of the product table is simply evil. Likewise, it is not a smart move to change the type of a SKU from varchar into blob - even though you could raise the question how disastrous this actually is (increasing does less/no harm compared to decreasing).

Is adding a new custom column to an existing table bad? It will (most likely) not break existing functionality. And perhaps, if the table itself is changed (renamed, dropped) so that your column is gone too, you were intending for that to happen anyway.

In this light, I would say modifying core tables could mean various things. But let's mainly consider the scenario of adding new columns to existing tables. And let's see if we can do this in a smarter new way. But before we get there, let's first discuss the old way: Using extension attributes, as suggested by Magento.

Not adding a custom column to an existing table

The database of Magento could be divided into 3 groups of things: EAV entities, non-EAV entities (flat entities) and other tables (mappings, indexes, etc). When dealing with EAV entities, you could opt for adding a new EAV attribute: It requires a database setup (with the new Database Schemas that would be a data patch using the EavSetup class) and then you can implement things. Unfortunately, your attribute will always be a dynamic field. It will not be part of interfaces, it will not be hard-coded.

Because of this, the more recommended approach is to use an extension attributes. With a bit of XML (extension_attributes.xml), interfaces are generated for you. But then ... the procedure requires you to setup your own DI plugin to hook into all relevant repository methods, so that whenever the original entity is loaded or modified, your own logic is called upon as well. Usually, it takes me about 2-3 hours to build all of this from scratch. Compared to the EAV attribute procedure only taking me 30 minutes.

With the extension attributes, a more common scenario is that you would also add your own custom table to save your own attribute. And this means setting up your own db_schema.xml, DTO model, resource model, collection and perhaps a repository. And aweful lot of work, when you only wanted to add a plain text field to an entity.

The end justifies the means

So perhaps we need to take a more realistic approach: We should not endlessly spend time on custom code to just follow along with a less efficient standard. Perhaps we should focus upon the end goal. If the end goal is to add a plain text field to an entity and an EAV attribute requires the least amount of time - on the short term and the long term - then what is the problem?

Obviously, EAV attributes will not always work. For instance, when you are dealing with a non-EAV entity (duh). And exactly there, we could use a simple db_schema.xml to extend the existing table with a new column. Read on.

Mixing custom columns with extension attributes

Such a new column will normally be picked up without issues by the current model, resource model and collection class. It could only be that a repository is selecting only specific fields, excluding our new field. For this, you could add a DI plugin to intercept the relevant repository method.

And while we're at it, we could also implement the extension_attributes.xml file after all: Copying the data value ($model->getData('foobar')) into the extension attribute value ($model->getExtensionAttributes()->getFoobar()) so that it complies to interfaces, APIs, etcetera. In short, we can still use extension attributes, but skip the creation of a separate table with model, resource model and collection, which saves a huge amount of effort.

Look mam, 10.000 columns in a table

Ok, ok, so everyone is going to do this, tables might end up containing 10.000s of columns. Yes but no. First of all, MySQL has a limit of 4096 columns per table - so 10.000 columns is out of the question. And InnoDB even downsizes this to a maximum of 1017 columns per table. Still, 1.000 columns in a table would cause a bad performance, right?

I agree. But is that performance worse than an EAV entity with 1.000 attributes? I would say that these scenarios require custom tuning anyway, so why talk about this scenario while the majority of shops are not going to lead to this scenario either.

Look mam, the core database is refactored

Another caveat would be that if the core database tables are going to be refactored (renamed, removed, etc), then this could issues with custom fields added to them. True, but this is also a danger with EAV and even with extension attributes. And fact is that the database structure did not change heavily with Magento 2 releases. The codebase felt less reliable. Plus, isn't it easier to worry about this when the time is there, instead of making the current implementations far too complicated to avoid this potential issue.

It's not that bad

When I began training Magento developers, I always told them not to modify core tables. But nowadays, I would say it's not that bad. Try out, see if you like it.

Posted on May 2, 2022

About the author

Author Jisse Reitsma

Jisse Reitsma is the founder of Yireo, extension developer, developer trainer and 3x Magento Master. His passion is for technology and open source. And he loves talking as well.

Sponsor Yireo

Looking for a training in-house?

Let's get to it!

We don't write too commercial stuff, we focus on the technology (which we love) and we regularly come up with innovative solutions. Via our newsletter, you can keep yourself up to date on all of this coolness. Subscribing only takes seconds.

Do not miss out on what we say

This will be the most interesting spam you have ever read

We don't write too commercial stuff, we focus on the technology (which we love) and we regularly come up with innovative solutions. Via our newsletter, you can keep yourself up to date on all of this coolness. Subscribing only takes seconds.