Suppose we have a graph showing one day of energy consumption for "n" meters, with consecutive points 15 minutes apart. If some data is missing from the container that holds the energy consumed by each meter, how does the graph look?
- Option 1: Is there a break in the plotted line?
- Option 2: Are the point before the gap and the point after the gap joined by a straight line?
- Option 3: Does the line drop to the value 0 across the gap?
I have made a rough drawing to make my question clearer.
I understand that the result could be something else entirely, and that it may vary depending on how the dataset is defined, but any more precise information would be appreciated.
The user abulafia answered this question in a comment. I am posting his explanation here as an answer so that the topic can be closed.
It depends on what you mean by "missing data". Suppose that a sample is taken every second and that the X-axis is labeled with the instant of each sample. If certain samples exist but their value is NaN, you would get the first graph (a break in the line). If the value in certain samples is 0, you would obviously get the third. You would also get the third if you call fillna(0) in the first case, which replaces the NaNs with zeros. If there is a "jump" in the index itself, for example it goes from instant 100 straight to 150 and the intermediate samples are simply not there, you would get the middle graph (a straight line across the gap).
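The three situations can be reproduced with a small pandas sketch. The timestamps and readings below are made up for illustration, and the variable names are hypothetical. It assumes plotting is done with the usual matplotlib-backed `Series.plot()`, which leaves a gap where a value is NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical 15-minute readings for one meter (values are invented).
index = pd.date_range("2024-01-01 00:00", periods=8, freq="15min")
values = [1.0, 1.2, 1.1, 1.3, 1.4, 1.2, 1.0, 0.9]

# Case 1: the samples exist but hold NaN -> the line is drawn with a break.
with_nan = pd.Series(values, index=index)
with_nan.iloc[3:5] = np.nan

# Case 2: the rows are absent from the index -> a straight line joins
# the last point before the gap and the first point after it.
with_jump = with_nan.dropna()

# Case 3: NaN replaced by 0 -> the line drops to zero across the gap.
with_zero = with_nan.fillna(0)

# To compare visually (requires matplotlib):
# with_nan.plot(); with_jump.plot(); with_zero.plot()
```

Note that `with_nan` and `with_zero` still have all 8 rows, while `with_jump` has only 6, which is why the plotting library has nothing to tell it a gap exists there.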
If you want to avoid the middle plot, you can resample the data with pandas. Pandas then builds a "continuous" index with no missing instants, aggregating whatever samples fall in each period or putting NaN where a period has none. That way you get the first graph, or the last one if you follow it with .fillna(0).
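A minimal sketch of that resampling step, again with invented readings and hypothetical variable names. It assumes the data is a Series with a DatetimeIndex and uses `mean()` as the aggregation, which is one choice among several.

```python
import pandas as pd

# Hypothetical readings with a hole: 00:45 and 01:00 are absent from the index.
times = pd.to_datetime([
    "2024-01-01 00:00", "2024-01-01 00:15", "2024-01-01 00:30",
    "2024-01-01 01:15", "2024-01-01 01:30",
])
readings = pd.Series([1.0, 1.2, 1.1, 1.2, 1.0], index=times)

# resample builds a regular 15-minute index; with mean(), bins that
# contain no samples come out as NaN, so the plotted line breaks there.
regular = readings.resample("15min").mean()

# Optional: turn the gap into zeros so the line drops to 0 instead.
zero_filled = regular.fillna(0)
```

The choice of aggregation matters here: `mean()` leaves empty periods as NaN, whereas `sum()` returns 0 for them by default, which would silently produce the third graph.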