I have a monthly sales table like the following:
create table ventas (
id int NOT NULL AUTO_INCREMENT,
year int,
month int,
monto numeric(15,2),
PRIMARY KEY (id)
);
insert into ventas (year, month, monto) values (2018, 1, 100);
insert into ventas (year, month, monto) values (2018, 1, 300);
insert into ventas (year, month, monto) values (2018, 3, 340);
insert into ventas (year, month, monto) values (2018, 5, 200);
insert into ventas (year, month, monto) values (2018, 5, 100);
insert into ventas (year, month, monto) values (2018, 7, 100);
insert into ventas (year, month, monto) values (2018, 8, 100);
insert into ventas (year, month, monto) values (2018, 9, 200);
insert into ventas (year, month, monto) values (2018,11, 350);
insert into ventas (year, month, monto) values (2018,12, 440);
I am trying to make a report of these sales per month, I tried it like this:
select year,
month,
sum(monto) as total
from ventas
group by year,
month
And I get something like this:
| year | month | total |
|------|-------|-------|
| 2018 | 1 | 400 |
| 2018 | 3 | 340 |
| 2018 | 5 | 300 |
| 2018 | 7 | 100 |
| 2018 | 8 | 100 |
| 2018 | 9 | 200 |
| 2018 | 11 | 350 |
| 2018 | 12 | 440 |
Which is correct, but as can be seen there are "gaps", that is, months without values, I would like to have a report, but with the 12 months and complete those that have not had sales with a 0, that is to say something like this
| year | month | total |
|------|-------|-------|
| 2018 | 1 | 400 |
| 2018 | 2 | 0 |
| 2018 | 3 | 340 |
| 2018 | 4 | 0 |
| 2018 | 5 | 300 |
| 2018 | 6 | 0 |
| 2018 | 7 | 100 |
| 2018 | 8 | 100 |
| 2018 | 9 | 200 |
| 2018 | 10 | 0 |
| 2018 | 11 | 350 |
| 2018 | 12 | 440 |
Important
It may not be sales, or it may be some other type of data, it is important to understand the conceptual problem, which is, what do we do when we are missing information in a table? When we do not have sensor readings for all hours, when there are accounting accounts that do not register movements in certain months, when we want to list the sales of all the branches, but there are branches that have not had sales, when we want to know how many people occupied a room, but there are rooms that have never been occupied, etc.
Before trying anything weird, be clear that we are missing data, we don't have data for months 2, 4, 6 and 10. So there is no magic possible to fix this. Whatever solution we face, the first thing to do is "invent" these rows that we don't have. And then how do we "invent" the missing rows? . In my opinion, the best alternative is:
Manage a period table
You do not have it? It is a good time to create it, believe me these tables are tremendously useful. For example, a fairly flexible period table form would be:
PeriodId
: will be ourid
of the period, the important thing is that it respects the order of the periods, in such a way that by ordering by this column, we obtain the natural order of the periodsTipo
: A "luxury" that we can allow ourselves, to manage different types of periods, for exampleM
for monthly periods, could beY
to carry annual periods,S
to carry weekly periods,F
for fiscal periods, etc.Nombre
is simply the description that we are going to showFechaDesde
andFechaHasta
they are useful when, for example, we have to obtain a period given a date, whichFechaHasta
has one more day, it is simply a help so that theBETWEEN
Now that we have this table of periods, it is extremely easy to solve this problem. Actually, what we have to think about is that the conceptual problem is: (a) list all the periods (b) obtain the total sales of each one:
And now if:
ifnull(sum(v.monto),0)
to put a 0 in those periods that we do not have sales.join
we doon P.PeriodoId = (v.year*100) + v.month
to generate from a year and month the id with which we have defined the period. It is a way, you could also handle a simpler table of periods only with years and monthsImportant : You must generate the period tables to cover the full reasonable universe of cases. For example, what is your first date registered in the system? What is the last? to those limits add a few more periods up and down, enough to not have future problems (or at least not be alive when the claims come).
sql fiddle
This other alternative is aimed at:
We already said that we are missing months, also that the best way is to have a table of periods, right? If we can't do this, we will still have to generate the periods in some dynamic way that works for us. Let's look at some possibilities:
Simple
Just for something simple, where we list a full year, we can hard-create a query that returns the 12 months we need:
And for more than a year?
This is where you will regret not being able or wanting to create the period table. Generating these periods dynamically, when they are more than the previous twelve months, requires generating a dynamic sequence for the years, which is not something natural, or let's say it is not part of the SQL specification. A way that is conceptually compatible with all the bases (with slight differences) is to make Cartesian products between tables of 10 elements and thus multiply from 10 to ten, for example, to generate 1000 years, from 1900 to 2900, it could be done like this :
And now yes, we put everything together, the months, the years and the sales:
The output should be identical to the previous answer using the period table, but clearly much more complex and difficult to read.
Note: sqlfiddle