• 什么是字节对齐,为什么要对齐?


    Computer Systems: A Programmer's Perspective:

    3.9.3 Data Alignment

    Many computer systems place restrictions on the allowable addresses for the primitive data types, requiring that the address for some type of object must be a multiple of some value K (typically 2, 4, or 8). Such alignment restrictions simplify the design of the hardware forming the interface between the processor and the memory system. For example, suppose a processor always fetches 8 bytes from memory with an address that must be a multiple of 8. If we can guarantee that any double will be aligned to have its address be a multiple of 8, then the value can be read or written with a single memory operation. Otherwise, we may need to perform two memory accesses, since the object might be split across two 8-byte memory blocks.

    The IA32 hardware will work correctly regardless of the alignment of data. However, Intel recommends that data be aligned to improve memory system performance. Linux follows an alignment policy where 2-byte data types (e.g., short) must have an address that is a multiple of 2, while any larger data types (e.g., int, int *, float, and double) must have an address that is a multiple of 4. Note that this requirement means that the least significant bit of the address of an object of type short must equal zero. Similarly, any object of type int, or any pointer, must be at an address having the low-order 2 bits equal to zero.

    Aside: A case of mandatory alignment

    For most IA32 instructions, keeping data aligned improves efficiency, but it does not affect program behavior. On the other hand, some of the SSE instructions for implementing multimedia operations will not work correctly with unaligned data. These instructions operate on 16-byte blocks of data, and the instructions that transfer data between the SSE unit and memory require the memory addresses to be multiples of 16. Any attempt to access memory with an address that does not satisfy this alignment will lead to an exception, with the default behavior for the program to terminate.

    This is the motivation behind the IA32 convention of making sure that every stack frame is a multiple of 16 bytes long (see the aside of page 226). The compiler can allocate storage within a stack frame in such a way that a block can be stored with a 16-byte alignment.

    Aside: Alignment with Microsoft Windows

    Microsoft Windows imposes a stronger alignment requirement—any primitive object of K bytes, for K = 2, 4, or 8, must have an address that is a multiple of K. In particular, it requires that the address of a double or a long long be a multiple of 8. This requirement enhances the memory performance at the expense of some wasted space. The Linux convention, where 8-byte values are aligned on 4-byte boundaries was probably good for the i386, back when memory was scarce and memory interfaces were only 4 bytes wide. With modern processors, Microsoft’s alignment is a better design decision. Data type long double, for which gcc generates IA32 code allocating 12 bytes (even though the actual data type requires only 10 bytes) has a 4-byte alignment requirement with both Windows and Linux.
     
    Alignment is enforced by making sure that every data type is organized and allocated in such a way that every object within the type satisfies its alignment restrictions. The compiler places directives in the assembly code indicating the desired alignment for global data. For example, the assembly-code declaration of the jump table beginning on page 217 contains the following directive on line 2:

    .align 4

    This ensures that the data following it (in this case the start of the jump table) will start with an address that is a multiple of 4. Since each table entry is 4 bytes long, the successive elements will obey the 4-byte alignment restriction.

    Library routines that allocate memory, such as malloc, must be designed so that they return a pointer that satisfies the worst-case alignment restriction
    for the machine it is running on, typically 4 or 8. For code involving structures, the compiler may need to insert gaps in the field allocation to ensure that each structure element satisfies its alignment requirement. The structure then has some required alignment for its starting address.

    For example, consider the following structure declaration:
    struct S1 {
      int  i;
      char c;
      int  j;
    };

    Suppose the compiler used the minimal 9-byte allocation, diagrammed as follows:
    Then it would be impossible to satisfy the 4-byte alignment requirement for both fields i (offset 0) and j (offset 5). Instead, the compiler inserts a 3-byte gap (shown
    here as shaded in blue) between fields c and j:
    As a result, j has offset 8, and the overall structure size is 12 bytes. Furthermore, the compiler must ensure that any pointer p of type struct S1* satisfies
    a 4-byte alignment. Using our earlier notation, let pointer p have value xp. Then xp must be a multiple of 4. This guarantees that both p->i (address xp)
    and p->j (address xp + 8) will satisfy their 4-byte alignment requirements.

    In addition, the compiler may need to add padding to the end of the structure so that each element in an array of structures will satisfy its alignment requirement.
    For example, consider the following structure declaration:
    struct S2 {
      int  i;
      int  j;
      char c;
    };
    If we pack this structure into 9 bytes, we can still satisfy the alignment requirements for fields i and j by making sure that the starting address of the structure satisfies
    a 4-byte alignment requirement. Consider, however, the following declaration:

    struct S2 d[4];

    With the 9-byte allocation, it is not possible to satisfy the alignment requirement for each element of d, because these elements will have addresses xd, xd + 9,
    xd + 18, and xd + 27. Instead, the compiler allocates 12 bytes for structure S2, with the final 3 bytes being wasted space:
    That way the elements of d will have addresses xd, xd + 12, xd + 24, and xd + 36. As long as xd is a multiple of 4, all of the alignment restrictions will be satisfied.
     
  • 相关阅读:
    浅浅的分析下es6箭头函数
    css实现背景半透明文字不透明的效果
    五星评分,让我告诉你半颗星星怎么做
    微信小程序--成语猜猜看
    微信小程序开发中如何实现侧边栏的滑动效果?
    强力推荐微信小程序之简易计算器,很适合小白程序员
    swing _JFileChooser文件选择窗口
    file类简单操作
    序列化对象
    MessageBox_ swt
  • 原文地址:https://www.cnblogs.com/skullboyer/p/7762710.html
Copyright © 2020-2023  润新知