LLVM libLLVMCore source code analysis 01 - Type class

Source path

llvm\include\llvm\IR\Type.h

llvm\include\llvm\IR\DerivedTypes.h

LLVM type system

The type system is one of the most important features of LLVM IR. It is a key difference between LLVM IR and ordinary three-address code, and it is the basis for a series of IR-based optimizations. Take the following source code as an example:

// type.cpp

int add1(int a, int b) {
	return a + b;
}

Use the command "clang -S -emit-llvm type.cpp -o type.ll" to generate the corresponding LLVM IR. In the output, every operand is annotated with its type, here i32.

; ModuleID = 'type.cpp'
source_filename = "type.cpp"
target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-w64-windows-gnu"

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @_Z4add1ii(i32 %a, i32 %b) #0 {
entry:
  %a.addr = alloca i32, align 4
  %b.addr = alloca i32, align 4
  store i32 %a, i32* %a.addr, align 4
  store i32 %b, i32* %b.addr, align 4
  %0 = load i32, i32* %a.addr, align 4
  %1 = load i32, i32* %b.addr, align 4
  %add = add nsw i32 %0, %1
  ret i32 %add
}

LLVM Type class

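Every type in LLVM IR is represented by an instance of llvm::Type or of one of its subclasses, and each instance carries a TypeID that identifies its kind. Not every kind needs its own subclass: the primitive types are represented by Type itself, while the derived types are implemented by the subclasses declared in DerivedTypes.h. The TypeID enumeration in Type.h is as follows: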

  enum TypeID {
    // PrimitiveTypes - make sure LastPrimitiveTyID stays up to date.
    VoidTyID = 0,    ///<  0: type with no size
    HalfTyID,        ///<  1: 16-bit floating point type
    FloatTyID,       ///<  2: 32-bit floating point type
    DoubleTyID,      ///<  3: 64-bit floating point type
    X86_FP80TyID,    ///<  4: 80-bit floating point type (X87)
    FP128TyID,       ///<  5: 128-bit floating point type (112-bit mantissa)
    PPC_FP128TyID,   ///<  6: 128-bit floating point type (two 64-bits, PowerPC)
    LabelTyID,       ///<  7: Labels
    MetadataTyID,    ///<  8: Metadata
    X86_MMXTyID,     ///<  9: MMX vectors (64 bits, X86 specific)
    TokenTyID,       ///< 10: Tokens

    // Derived types... see DerivedTypes.h file.
    // Make sure FirstDerivedTyID stays up to date!
    IntegerTyID,     ///< 11: Arbitrary bit width integers
    FunctionTyID,    ///< 12: Functions
    StructTyID,      ///< 13: Structures
    ArrayTyID,       ///< 14: Arrays
    PointerTyID,     ///< 15: Pointers
    VectorTyID       ///< 16: SIMD 'packed' format, or other vector type
  };
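The split between primitive and derived IDs in the listing above can be sketched with a small stand-alone model. This is only an illustration: ToyTypeID, toTypeID, isDerivedTypeID and isPrimitiveTypeID are hypothetical names that do not exist in LLVM, and the sketch assumes the numbering shown in the listing, where IntegerTyID is the first derived ID.

```cpp
#include <cassert>

// Toy copy of a few TypeID values from the listing above (not the LLVM header).
enum ToyTypeID : unsigned {
  VoidTyID = 0,
  HalfTyID = 1,
  TokenTyID = 10,    // last primitive ID in the listing
  IntegerTyID = 11,  // first derived ID in the listing
  FunctionTyID = 12,
  VectorTyID = 16    // last ID in the listing
};

// Hypothetical helper: converts a raw number to a ToyTypeID after a range check.
inline ToyTypeID toTypeID(unsigned Raw) {
  assert(Raw <= VectorTyID && "ID is outside the listing above");
  return static_cast<ToyTypeID>(Raw);
}

// Hypothetical helper: derived types are the IDs from IntegerTyID upwards.
inline bool isDerivedTypeID(ToyTypeID ID) { return ID >= IntegerTyID; }

// Hypothetical helper: everything before IntegerTyID is a primitive type.
inline bool isPrimitiveTypeID(ToyTypeID ID) { return !isDerivedTypeID(ID); }
```

With this model, isPrimitiveTypeID(toTypeID(10)) is true for the token type, while isDerivedTypeID(toTypeID(11)) is true for the integer type.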

The inheritance tree of the Type class is shown in the following figure:

[Figure: inheritance tree of the llvm::Type class]

llvm::Type

The base class of all types. It cannot be instantiated directly because its constructor is declared protected.
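The effect of a protected constructor can be sketched with a toy class hierarchy. The classes ToyType and ToyIntegerType below are hypothetical and are not the LLVM classes. The sketch assumes a design in which objects are handed out by a static get() function that returns the same object for the same bit width, which is similar in spirit to how LLVM exposes factory functions such as IntegerType::get.

```cpp
#include <cassert>
#include <map>

// Toy base class (not llvm::Type): the protected constructor prevents
// direct instantiation from outside the class hierarchy, so a statement
// such as `ToyType T(0);` in user code does not compile.
class ToyType {
public:
  virtual ~ToyType() = default;
  unsigned getKind() const { return Kind; }

protected:
  explicit ToyType(unsigned K) : Kind(K) {}

private:
  unsigned Kind;
};

// Toy subclass: instances are only created through the static get() factory.
class ToyIntegerType : public ToyType {
public:
  static const ToyIntegerType *get(unsigned Bits) {
    assert(Bits >= 1 && "bit width must be at least 1");
    static std::map<unsigned, ToyIntegerType> Cache; // one object per width
    auto It = Cache.find(Bits);
    if (It == Cache.end())
      It = Cache.emplace(Bits, ToyIntegerType(Bits)).first;
    return &It->second;
  }
  unsigned getBitWidth() const { return Bits; }

private:
  // 11 matches IntegerTyID in the TypeID listing above.
  explicit ToyIntegerType(unsigned B) : ToyType(11), Bits(B) {}
  unsigned Bits;
};
```

Because get() caches one object per bit width, two requests for the same width return the same pointer, so types can be compared by address in this toy model.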

llvm::IntegerType

An integer type of arbitrary bit width. The bit width must lie in the range [IntegerType::MIN_INT_BITS (1), IntegerType::MAX_INT_BITS].
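To make "arbitrary bit width" concrete, the following minimal sketch computes the largest value that an iN type can hold. The helpers maxUnsignedValue and maxSignedValue are hypothetical and not part of LLVM. The sketch is limited to widths from 1 to 64 bits so that the result fits into a 64-bit C++ integer, and it assumes two's complement for the signed case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): largest unsigned value of an iN type.
// This sketch only supports widths from 1 to 64 bits.
inline std::uint64_t maxUnsignedValue(unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "sketch supports i1 to i64 only");
  return Bits == 64 ? UINT64_MAX : ((std::uint64_t{1} << Bits) - 1);
}

// Hypothetical helper: largest signed (two's complement) value of an iN type.
inline std::int64_t maxSignedValue(unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "sketch supports i1 to i64 only");
  return static_cast<std::int64_t>(maxUnsignedValue(Bits) >> 1);
}
```

For example, maxUnsignedValue(7) describes the unusual but valid type i7, and maxSignedValue(1) is 0 because a signed i1 can only hold 0 and -1.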

llvm::FunctionType

A function signature, consisting of one return type and a list of parameter types.

Syntax: <returntype> (<parameter list>)

Example:

i32 (i32)              A function that takes one i32 and returns an i32
float (i16, i32 *) *   A pointer to a function that takes an i16 and a pointer to i32 and returns a float
i32 (i8*, ...)         A variadic function that takes at least one i8* and returns an i32; this is the signature of printf in LLVM
{i32, i32} (i32)       A function that takes one i32 and returns a structure of two i32 values
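The textual syntax of function types can be reproduced with a few lines of string handling. The helper formatFunctionType below is hypothetical and not an LLVM API. It only assembles the text from a return type, a parameter list and a variadic flag, and it does not check that the type names are valid IR types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): builds the textual form
// "<returntype> (<parameter list>)" of a function type.
inline std::string formatFunctionType(const std::string &ReturnType,
                                      const std::vector<std::string> &Params,
                                      bool IsVarArg) {
  assert(!ReturnType.empty() && "a function type needs a return type");
  std::string Result = ReturnType + " (";
  for (std::size_t I = 0; I < Params.size(); ++I) {
    if (I != 0)
      Result += ", ";
    Result += Params[I];
  }
  // A variadic function type ends its parameter list with "...".
  if (IsVarArg)
    Result += Params.empty() ? "..." : ", ...";
  Result += ")";
  return Result;
}
```

Calling formatFunctionType("i32", {"i8*"}, true) yields the printf-style signature from the examples above.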

llvm::PointerType

A pointer type refers to an object in memory, i.e. it denotes a memory location. PointerType has an optional address space attribute, which is 0 by default. The semantics of non-zero address spaces are target-specific.

Syntax:

<type> *

ptr

Example:

[4 x i32]*           A pointer to an array of four i32 values
i32 (i32*) *         A pointer to a function that takes an i32* and returns an i32
i32 addrspace(5)*    A pointer to an i32 in address space 5
ptr                  An opaque pointer
ptr addrspace(5)     An opaque pointer in address space 5
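The two pointer spellings and the optional address space can be illustrated by a short string-building sketch. The helpers formatTypedPointer and formatOpaquePointer are hypothetical and not LLVM APIs. The sketch assumes that address space 0 is omitted from the text, which matches the examples above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an LLVM API): textual form of a typed pointer,
// e.g. "i32*" or "i32 addrspace(5)*". Address space 0 is the default and
// is not printed.
inline std::string formatTypedPointer(const std::string &Pointee,
                                      unsigned AddrSpace = 0) {
  assert(!Pointee.empty() && "a typed pointer needs a pointee type");
  if (AddrSpace == 0)
    return Pointee + "*";
  return Pointee + " addrspace(" + std::to_string(AddrSpace) + ")*";
}

// Hypothetical helper: textual form of an opaque pointer, which carries no
// pointee type.
inline std::string formatOpaquePointer(unsigned AddrSpace = 0) {
  if (AddrSpace == 0)
    return "ptr";
  return "ptr addrspace(" + std::to_string(AddrSpace) + ")";
}
```

The difference between the two helpers mirrors the difference between the two syntaxes: only the typed form needs to know what the pointer points to.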

llvm::StructType

A collection of data members laid out together in memory. Each data member must be a type that has a size.

Structures in memory are accessed by obtaining a pointer to a data member with the "getelementptr" instruction and then applying load and store instructions to that pointer.

Structures in registers are accessed with the extractvalue and insertvalue instructions.

A structure can be "packed", in which case it is aligned to 1 byte and has no padding between its data members, or non-packed, in which case the padding between data members is determined by the DataLayout string of the module.
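The difference between packed and non-packed structures can be sketched with a simplified layout computation. The names ToyMember, ToyLayout, roundUp and computeLayout are hypothetical and not LLVM APIs. The sketch assumes that the size and alignment of each member are supplied by hand (in LLVM they would come from the DataLayout) and that the total size is rounded up to the largest member alignment, as in the usual C-style layout rule.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Size and alignment of one member in bytes. In LLVM the alignment comes
// from the DataLayout; here it is supplied by hand.
struct ToyMember {
  std::uint64_t Size;
  std::uint64_t Align;
};

struct ToyLayout {
  std::vector<std::uint64_t> Offsets; // byte offset of each member
  std::uint64_t Size;                 // total size including tail padding
};

// Rounds Value up to the next multiple of Align (Align must be non-zero).
inline std::uint64_t roundUp(std::uint64_t Value, std::uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical layout routine: packed structures use 1-byte alignment for
// every member; otherwise padding is inserted to satisfy each alignment.
inline ToyLayout computeLayout(const std::vector<ToyMember> &Members,
                               bool Packed) {
  ToyLayout Layout{{}, 0};
  std::uint64_t MaxAlign = 1;
  for (const ToyMember &M : Members) {
    std::uint64_t Align = Packed ? 1 : M.Align;
    if (Align > MaxAlign)
      MaxAlign = Align;
    Layout.Size = roundUp(Layout.Size, Align); // padding before the member
    Layout.Offsets.push_back(Layout.Size);
    Layout.Size += M.Size;
  }
  Layout.Size = roundUp(Layout.Size, MaxAlign); // tail padding
  return Layout;
}
```

For a structure with a 1-byte member followed by a 4-byte member aligned to 4 bytes, the non-packed layout places the second member at offset 4 and has a total size of 8 bytes, while the packed layout places it at offset 1 and has a total size of 5 bytes.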

A structure can be "literal", i.e. defined inline, for example {i32, i32}*. It can also be "identified", i.e. defined with a name, for example:

%T1 = type { <type list> }     ; Define the non-packed identified structure T1
%T2 = type <{ <type list> }>   ; Define the packed identified structure T2

A structure can also be opaque, which is typically used for forward declarations. The syntax is as follows:

%X = type opaque   ; Define the named opaque structure X
%52 = type opaque  ; Define the numbered opaque structure 52

llvm::ArrayType

ArrayType arranges elements sequentially in memory. It has two attributes: the number of elements and the element type.

Syntax:

[<# elements> x <elementtype>]

where elementtype is any type that has a size.

Example:

[40 x i32]               Array of 40 32-bit integers
[41 x i32]               Array of 41 32-bit integers
[4 x i8]                 Array of 4 8-bit integers
[3 x [4 x i32]]          3x4 array of 32-bit integers
[12 x [10 x float]]      12x10 array of single-precision floating-point values
[2 x [3 x [4 x i16]]]    2x3x4 array of 16-bit integers
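Because nested arrays are stored sequentially, the size of a two-dimensional array and the position of an element in it follow from simple arithmetic. The helpers arraySizeInBytes and elementOffset are hypothetical and not LLVM APIs. The sketch assumes that the elements are stored back to back without padding, which holds for element types such as i32 and float whose size is a multiple of their alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: total size in bytes of [Rows x [Cols x <element>]],
// assuming the elements are stored back to back without padding.
inline std::uint64_t arraySizeInBytes(std::uint64_t Rows, std::uint64_t Cols,
                                      std::uint64_t ElemBytes) {
  return Rows * Cols * ElemBytes;
}

// Hypothetical helper: byte offset of element [Row][Col] in the same array.
// The inner array (one row) is contiguous, so rows follow each other.
inline std::uint64_t elementOffset(std::uint64_t Rows, std::uint64_t Cols,
                                   std::uint64_t ElemBytes,
                                   std::uint64_t Row, std::uint64_t Col) {
  assert(Row < Rows && Col < Cols && "index out of bounds");
  return (Row * Cols + Col) * ElemBytes;
}
```

For [3 x [4 x i32]] with 4-byte elements, the whole array occupies 48 bytes and element [1][2] starts at byte offset 24.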

llvm::VectorType

A VectorType represents a vector of elements and is used for SIMD (single instruction, multiple data) operations. It has three attributes: the number of elements, the element type (which must be a primitive type) and whether the vector is scalable. If the VectorType is scalable, the actual number of elements is N * size, where the multiplier N (written vscale in the IR) is a positive integer that is unknown at compile time.

When the elements of a VectorType are byte-sized, the vector is laid out in memory in the same way as an ArrayType. When the elements are not byte-sized, the layout can be described by bitcasting the vector to an IntegerType of the same total bit width and then storing that integer. The example is as follows (big-endian):

%val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

; Bitcasting from a vector to an integral type can be seen as
; concatenating the values:
;   %val now has the hexadecimal value 0x1235.

store i16 %val, i16* %ptr

; In memory the content will be (8-bit addressing):
;
;    [%ptr + 0]: 00010010  (0x12)
;    [%ptr + 1]: 00110101  (0x35)
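The big-endian example above can be replayed in plain C++. The helpers bitcastV4I4ToI16BigEndian and storeI16BigEndian are hypothetical and not LLVM APIs. The sketch only models the big-endian case from the example, in which element 0 of the vector ends up in the most significant bits of the integer.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: models `bitcast <4 x i4> ... to i16` on a big-endian
// target, where element 0 ends up in the most significant bits.
inline std::uint16_t bitcastV4I4ToI16BigEndian(
    const std::array<std::uint8_t, 4> &Elems) {
  std::uint16_t Value = 0;
  for (std::uint8_t E : Elems) {
    assert(E <= 0xF && "each element must fit in 4 bits");
    // Shift the bits collected so far and append the next 4-bit element.
    Value = static_cast<std::uint16_t>((Value << 4) | E);
  }
  return Value;
}

// Hypothetical helper: models a big-endian store of an i16, most
// significant byte first.
inline std::array<std::uint8_t, 2> storeI16BigEndian(std::uint16_t Value) {
  return {static_cast<std::uint8_t>(Value >> 8),
          static_cast<std::uint8_t>(Value & 0xFF)};
}
```

With the elements 1, 2, 3 and 5, the first helper returns 0x1235 and the second one produces the bytes 0x12 and 0x35, which matches the comments in the IR example.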

Syntax:

< <# elements> x <elementtype> >          ; Fixed-length vector
< vscale x <# elements> x <elementtype> > ; Scalable vector

elementtype can only be an integer, floating-point or pointer type.
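For the scalable form in the syntax above, the element count written in the type is only a minimum that is multiplied by vscale at run time. The following minimal sketch makes that arithmetic explicit. The helpers actualElementCount and actualSizeInBits are hypothetical and not LLVM APIs, and the vscale values passed to them are made-up inputs, since the real value depends on the hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of elements of <vscale x MinElems x T> for a
// given runtime value of vscale.
inline std::uint64_t actualElementCount(std::uint64_t VScale,
                                        std::uint64_t MinElems) {
  assert(VScale >= 1 && "vscale is a positive integer");
  return VScale * MinElems;
}

// Hypothetical helper: total size in bits of the same vector.
inline std::uint64_t actualSizeInBits(std::uint64_t VScale,
                                      std::uint64_t MinElems,
                                      std::uint64_t ElemBits) {
  return actualElementCount(VScale, MinElems) * ElemBits;
}
```

If vscale happened to be 2, a <vscale x 4 x i32> vector would hold 8 elements and occupy 256 bits in this model.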

Example:

<4 x i32>             Vector of 4 32-bit integers
<8 x float>           Vector of 8 32-bit single-precision floating-point values
<2 x i64>             Vector of 2 64-bit integers
<4 x i64*>            Vector of 4 pointers to 64-bit integers
<vscale x 4 x i32>    Vector with a multiple of 4 32-bit integers

Reference materials

LLVM Language Reference Manual — LLVM 15.0.0git documentation

LLVM Programmer's Manual — LLVM 15.0.0git documentation

Keywords: llvm

Added by hothientuan on Tue, 15 Feb 2022 11:22:46 +0200