The truth is rarely pure and never simple

pgfplots: being careful with samples and domains

As usual, a new day in the lab requires new function plots. But now that you have a different data source, you may observe some potential pitfalls with pgfplots.

Let’s assume you’re interested in x/sin(x). With the default settings, what you get is

defaults: \addplot[draw=red] {x/sin(deg(x))};

The LaTeX source is shown at the end of the post. As only the addplot line differs, I’m only going to show this particular line.

This does not look like what you expected. First of all, the curve does not span the whole canvas, because pgfplots only samples its default domain of -5:5. The excellent documentation tells you to use the domain keyword.

\addplot[domain={-7:7},draw=red] {x/sin(deg(x))};

You may be tempted to think that specifying a domain larger than the plotted coordinate range does no harm. Unfortunately, it does: pgfplots uses a fixed number of samples (default: 25) that are equally spaced within the specified domain. A larger domain therefore gives you coarser sampling of the region you are actually displaying.
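To put a number on this, here is a short worked calculation. It assumes that the N samples include both endpoints of the domain, so that there are N - 1 intervals between them:

```latex
% Spacing between neighboring samples for a domain [a, b] with N samples
\[
  \Delta x = \frac{b - a}{N - 1}
\]
% Default domain -5:5 with the default N = 25
\[
  \Delta x = \frac{5 - (-5)}{25 - 1} = \frac{10}{24} \approx 0.42
\]
% Domain -7:7, still with N = 25
\[
  \Delta x = \frac{7 - (-7)}{25 - 1} = \frac{14}{24} \approx 0.58
\]
```

So widening the domain from -5:5 to -7:7 without touching the sample count makes the sampling 40% coarser.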

In order to make the curve look smooth, you have to add more samples. What pgfplots basically does is calculate a certain number of function values at equally spaced positions and then draw straight lines between them. (In fact, ‘equally spaced’ refers to the drawing canvas and not to the axis: a logarithmic x axis will give you logarithmically spaced sample points.) The result is then cropped to the drawing canvas. As you can see in the next plot, data points outside the diagram are calculated as well. This behavior may be desired for some functions, but for a function with poles, like this one, it simply produces a wrong picture. The following diagram uses four times as many samples as the default:

\addplot[domain={-7:7},draw=red,samples=100] {x/sin(deg(x))};

Now there are several things you should keep in mind.

  1. LaTeX is for typesetting, not for calculations. Hence it is very slow at them, it may produce wrong results, and it has a very limited value range. In principle, LaTeX only calculates lengths. pgfplots does a lot under the hood to circumvent many of these problems, but it is still very slow. (Besides that, LaTeX works with a fixed amount of allocated memory.) So you should avoid very high sample counts.
  2. Every data point has to be stored somewhere. Placing samples=10000 everywhere just to have one thing less to worry about won’t make you happy when you look at the file size either. By the way: each nth point is a valid pgfplots option as well.
  3. No values for samples or domain will help you with poles. To overcome this limitation, you either have to restrict the y range (solution) or plot the segments separately, restricting the x range of each (kludge).
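For illustration, the kludge from item 3 could look roughly like the sketch below. Within the range used here, x/sin(x) has poles at ±π and ±2π. The segment boundaries and the per-segment sample counts are assumptions picked for this example; they are not taken from the plots in this post:

```latex
% One \addplot per pole-free segment (poles at x = ±pi ≈ ±3.14 and x = ±2*pi ≈ ±6.28).
% Each boundary stays about 0.05 away from the nearest pole; that margin is an arbitrary choice.
\addplot[domain={-6.23:-3.19},draw=red,samples=30] {x/sin(deg(x))};
% Even sample count on a symmetric segment, so no sample lands on x = 0.
\addplot[domain={-3.09:3.09},draw=red,samples=30] {x/sin(deg(x))};
\addplot[domain={3.19:6.23},draw=red,samples=30] {x/sin(deg(x))};
```

The curve is steep close to the poles, so the margin and the sample counts will probably need some tuning by hand.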

In order to account for these three issues, you can try something along the lines of this:

\addplot[draw=red,line width=1pt,samples at={-6,-5.9,...,-3, -3,-2.5,...,3,3,3.1,...,6},restrict y to domain={-70:70}] {x/sin(deg(x))};
\addplot[domain={-7:7},draw=blue,samples=100,restrict y to domain={-70:70}] {x/sin(deg(x))};

which gives the final curve in blue and the distorted one in red.

see text

Now there are several things to take away from this last plot:

  1. You can make the sample spacing irregular within a single plot with samples at. The list uses the \foreach syntax: if you only specify x,…,y, the step width is 1. But you can define a custom step width by writing x,x+s,…,y. This way you can make pgfplots calculate only the sample points necessary for your function.
  2. Using samples at means that you can’t use samples any more. While this might not be that surprising, note that you can’t use domain either. The domain is taken to run from the lowest to the highest sample point defined in your samples at directive.
  3. Using restrict y to domain helps you avoid the pole issue. However, you will have to search for a good threshold by hand. A threshold that is too low (and the canvas limits are too low) will give you gaps near the edges of the canvas. A threshold that is too high does not solve the problem at all. But at least you do not have to draw everything yourself.
  4. Obviously, there is a sample point at (0, 0) for the red curve, but the function itself does not approach 0 there (x/sin(x) tends to 1 as x goes to 0). As the function is not defined at this point, pgfplots assumes the value to be zero. Now you may wonder why the blue curve has no problems with the singularity. The reason is its even number of sampling points: with a domain that is symmetric around 0, no sample falls on x = 0. Just choose an odd number and you will have the dip for the blue curve as well.
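The even/odd effect from item 4 follows from where the samples fall. Here is a short derivation, assuming that the N equally spaced samples include both endpoints of the domain -7:7:

```latex
% Position of the k-th sample on the domain [-7, 7]
\[
  x_k = -7 + k\,\frac{14}{N - 1}, \qquad k = 0, 1, \dots, N - 1
\]
% A sample hits the singularity exactly when
\[
  x_k = 0 \iff k = \frac{N - 1}{2}
\]
```

This k is an integer only for odd N. With samples=100 it would have to be 49.5, so the blue curve skips the singularity. With samples=101, the sample k = 50 sits on x = 0 (up to rounding in TeX’s arithmetic).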

Of course, this is only an example, and the actual tricks and values needed for your graph depend heavily on the function you want to plot. Please keep in mind that most of the keywords described above do not apply to scattered data. Happy plotting.

The template

      \addplot[draw=red] {x/sin(deg(x))};
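Only the \addplot line of the template is reproduced above. For orientation, a minimal wrapper around it might look like the following sketch. The document class, the axis limits, and the axis labels are all assumptions made for illustration; they are not the values used for the plots in this post:

```latex
\documentclass{article}
\usepackage{pgfplots}

\begin{document}
\begin{tikzpicture}
  \begin{axis}[
      % Assumed axis limits; pick values that suit your own plot.
      xmin=-6, xmax=6,
      ymin=-10, ymax=10,
      % Assumed axis labels.
      xlabel={$x$},
      ylabel={$x/\sin(x)$},
    ]
    % The only line that changes between the plots in this post.
    \addplot[draw=red] {x/sin(deg(x))};
  \end{axis}
\end{tikzpicture}
\end{document}
```

Swapping in any of the other \addplot lines from the text should reproduce the corresponding variant, apart from the assumed axis settings.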
