Discussion on running speed of Java loop

public static void LoopTestList(){
        int i = 0 ;
        int j = 0;
        ArrayList<Integer> ai  = new ArrayList<Integer>();
        for(i =0;i<10000000;i++){
            ai.add(i);
        }
        long time1 = System.currentTimeMillis();
        for(i=0;i< ai.size();i++){
            j=ai.get(i);
        }
        long time2 = System.currentTimeMillis();
        long aa = time2 - time1;
        System.out.println("The execution time of the first cycle is:"+aa);

        long time3 = System.currentTimeMillis();
        for(i=0;i< ai.size();i=i+16){
            j=ai.get(i);
        }
        long time4 = System.currentTimeMillis();
        System.out.println("The execution time of the second cycle is:"+(time4 - time3));
    }

    public static void LoopTestArray(){
       int arr[] = new int[10000000];
       long start1 = System.currentTimeMillis();
       for(int i =0;i<arr.length;i++) arr[i]*=3;
       System.out.println("The execution time of the first cycle is:"+(System.currentTimeMillis()-start1));
       long start2 = System.currentTimeMillis();
       for(int i =0;i<arr.length;i+=16) arr[i]*=3;
       System.out.println("The execution time of the second cycle is:"+(System.currentTimeMillis()-start2));

    }

    public static void main(String args[]){
        for(int i =0;i<10;i++) {
            System.out.println("The first"+i+"Test times");
            System.out.println("List test");
            LoopTestList();
            System.out.println("Array test");
            LoopTestArray();
            System.out.println("-----------------------");
        }
    }

I saw a discussion about the execution rate of circular code blocks on the Internet before. I thought it was very helpful, so I studied more and wrote it down. The above is the program test code, which uses the array and list to test the cycle time respectively. The two cycles use successive cycles and 16 cycles at an interval. After repeated tests, it can be found that when using the array, the execution time of the two cycles does not differ, and when using the list, the execution time of the two cycles is about twice the difference.

Analyze the reason. For arrays, due to the fast reading speed of arrays, in actual operation, the time for CPU to read array data from cache can be ignored, and the time consumption mainly focuses on the interaction between cache and data in memory. At present, the cache lines of mainstream CPUs are mainly 64 bytes. Int data takes up 4 bytes in Java, so each cache line can store exactly 16 int data. Therefore, in the loop test of the first array type, each interaction between cache and memory data is 16 data once, so the time difference between the two is not large. It can be inferred that if we expand the cycle interval to 32 times, the data interaction will be updated to 32 data interactions once, so it will reduce the time of all accesses by half. Back to the list, the data in the list is stored in an Object array, and the Object is stored as an Object in Java. Taking 64 bit CPU as an example, the pointer address used to find the Object address is 64 bits, that is, 8 bytes. Therefore, the data that can be stored in each cache will be half less than that of int type, so its execution time will be half of that of all traversals.

Postscript: it is found in the actual measurement that after the interval is expanded, the list cycle time consumption is the same as the expected result, and the array result fluctuates greatly, and the 64 interval is only half the time gap. It is speculated that the array memory and cache exchange strategy may be changed, or there are other basic time consumption for array access, resulting in no obvious 32 interval time gap.

Note: after that, I tried to test the content in C language, and found that the difference between all traversal and interval 16 bit traversal in C language is more than 6 times. The cache content should be caused by the participation of java virtual machine and cannot be applied to other languages. The C language code is as follows

    int n = 64 * 1024 * 1024;
	int* arr = (int*)malloc(sizeof(int) * n);
	long start1, end1, start2, end2;
	long dur1, dur2;
	int i;
	if (arr == NULL) {
		return 0;
	}
	for (i = 0; i < n; i++) {
		arr[i] = i + 1;
	}
	start1 = clock();
	for (i = 0; i < n; i++) arr[i] *= 3;
	end1 = clock();
	dur1 = end1 - start1;
	printf("The execution time of the first cycle is: %d" ,dur1);
	start2 = clock();
	for (i = 0; i < n; i += 16) arr[i] *= 3;
	end2 = clock();
	dur2 = end2 - start2;
	printf("The execution time of the second cycle is: %d", dur2);

Keywords: Java

Added by php_user13 on Fri, 21 Jan 2022 19:13:59 +0200