It is natural to assume that memory representing an image or screen has its data laid out as one row of pixels followed directly by the next. However, many APIs (e.g. Direct3D) that return a pointer to image data add an extra block of data at the end of each row.
Important: this block is often used for caching purposes, so it should not be overwritten by us.
When this extra block of memory is present, the width of a row of pixels in memory is not bufferWidth * bytesPerPixel; it is another value known as the pitch or stride (the two terms mean the same thing, I will use pitch from now on). The pitch is the number of bytes to offset by in order to move down one row. For example, a 100-pixel-wide 32-bit buffer might have a pitch of 416 bytes rather than the 400 you would expect from width * bytesPerPixel.
It is important to take this into account when doing any image buffer operations. For example, calculating an offset into a buffer for an arbitrary x, y pixel might be approached like this:
int offset = (x + y * bufferWidth) * bytesPerPixel;
This will not work, though, if there is padding at the end of each row. Instead we need to advance by pitch bytes for each row, so the calculation becomes:
int offset = x * bytesPerPixel + y * pitch;
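As a minimal sketch of using this offset, here is a hypothetical helper that writes one 32-bit pixel into a locked surface. The function name and parameters are made up for the example; the data pointer and pitch are assumed to come from whatever API provided the buffer:

#include <cstdint>
#include <cstring>

// Hypothetical helper: writes one 32-bit pixel into an image buffer.
// 'data' is the pointer returned by the API, 'pitch' is the number of
// bytes from the start of one row to the start of the next.
void SetPixel(uint8_t* data, int pitch, int x, int y, uint32_t colour)
{
    constexpr int bytesPerPixel = 4;
    int offset = x * bytesPerPixel + y * pitch;   // pitch, not bufferWidth * bytesPerPixel
    std::memcpy(data + offset, &colour, sizeof(colour));
}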
Note that some APIs (like HAPI) do not add padding at the end of each row, so pitch always equals bufferWidth * bytesPerPixel.
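As a further illustration (again with made-up names), copying a tightly packed source image into a destination buffer that may have padding has to be done one row at a time, because the destination rows can be further apart than the source rows:

#include <cstdint>
#include <cstring>

// Hypothetical example: 'src' is tightly packed (row size = width * bytesPerPixel),
// 'dest' came from an API and uses 'destPitch' bytes per row.
void CopyImage(const uint8_t* src, uint8_t* dest, int width, int height,
               int bytesPerPixel, int destPitch)
{
    const int rowBytes = width * bytesPerPixel;
    for (int y = 0; y < height; ++y)
        std::memcpy(dest + y * destPitch, src + y * rowBytes, rowBytes);
}

When the pitch happens to equal bufferWidth * bytesPerPixel the whole image could be copied with a single memcpy, but the per-row loop above works correctly in either case.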
The DirectX help describes this issue here: MSDN Width vs. Pitch