Home arrow C++ arrow Page 2 - More Tricks to Gain Speed in Programming Contests
C++

More Tricks to Gain Speed in Programming Contests


As much as we would like to encourage smart thinking in programming contests, in the end, what makes the difference most of the time is who knows more tips and tricks to gain the extra advantage required. Today I am going to present another collection of them, believing that maybe someday, one of these will help you win.

Author Info:
By: Gabor Bernat
Rating: 4 stars4 stars4 stars4 stars4 stars / 14
July 14, 2009
TABLE OF CONTENTS:
  1. · More Tricks to Gain Speed in Programming Contests
  2. · Maintain the cache intact
  3. · Using the for loop
  4. · Breaking out with smart solutions

print this article
SEARCH DEVARTICLES

More Tricks to Gain Speed in Programming Contests - Maintain the cache intact
(Page 2 of 4 )

"What cache?" you may ask. Well most people do not know, but at read time the processor reads a cache line, not only a single value. The length of this cache varies, depending on a few factors (though most of the time it is 32 bytes). However, in the end, it is always a number that is a power of two. Take this into account when you read your arrays.

The same is true when your construct structures. To not force the Central Processing Unit to read a value from the memory twice, construct the structures so they will use an amount of memory that is a power of two. To visualize this problem, I have written a short program that will just present and measure the difference.

#include "stdio.h"

#include "stdlib.h"

 

// using Windows interface QueryPerformanceCounter( )

// to High Resolution CPU timer

// compute CPU time in microseconds of f():

 

#include <time.h>

#include <stdio.h>

#include <stdlib.h>

#include <windows.h>

#include <winbase.h>

 

#define maxN 1000

#define maxM 1000

 

int t[maxN][maxM];

 

int f(void)

{

int i,j;

int s = 0;

for(i = 0; i < maxM; ++ i)

for(j = 0; j < maxN; ++ j)

s += t[j][i];

return s;

}

 

int main()

{

LARGE_INTEGER ticksPerSecond;

LARGE_INTEGER tick; // A point in time

LARGE_INTEGER start_ticks, end_ticks, cputime;

 

// get the high resolution counter's accuracy

QueryPerformanceFrequency(&ticksPerSecond);

// what time is it?

QueryPerformanceCounter(&tick);

 

double s = 0;

double temp;

for ( int x = 0; x< 10; ++x)

{

 

QueryPerformanceCounter(&start_ticks);

/* start f() */

 

for(int k = 0; k < 100; ++ k)

{

f();

}

 

/* end f( ) */

 

QueryPerformanceCounter(&end_ticks);

cputime.QuadPart = end_ticks.QuadPart- start_ticks.QuadPart;

 

printf ("tElapsed CPU time test: %.9f secn",

temp = ((float)cputime.QuadPart/(float)ticksPerSecond.QuadPart));

 

s += temp;

 

}

 

printf("ntAverage with global variables : %.9fn

Catch maintained", s/10);

return 0;

}

The code of interest is inside the f function. It is about the line: s += t[j][i];. Here you can observe that we calculate the sum of the matrix by breaking the cache line after every read. When the CPU reads t[i][j] it will also read the t[i][j+1], t[i][j+2] and t[i][j+3]. However, if we calculate the sum this way at every read, it will have to reread the values, as the ones inside the catch are no longer usable. The time required to complete the process on my machine, with an E4300 Intel Dual Core at 1.8 GHz, is:

Now if we just switch up the two variables and write:

s += t[i][j];

The speed of the program now is:

That is double. Talk about a huge difference! In the future, remember that the CPU is still much faster than the memory, so avoid breaking the cache. Also, avoid using the global variables for local purpose; first, because this can lead to strange problems, and second, because this can slow down the application. Now this will have very little influence, but the influence is still there. 

The local memory pool (stack) is much faster than the global one. This is limited to 4MB, though, so avoid exceeding this amount. In contests the limit is set at 1MB, so really, do not overuse it.


blog comments powered by Disqus
C++ ARTICLES

- Intel Threading Building Blocks
- Threading Building Blocks with C++
- Video Memory Programming in Text Mode
- More Tricks to Gain Speed in Programming Con...
- Easy and Efficient Programming for Contests
- Preparing For Programming Contests
- Programming Contests: Why Bother?
- Polymorphism in C++
- Overview of Virtual Functions
- Inheritance in C++
- Extending the Basic Streams in C++
- Using Stringstreams in C++
- Custom Stream Manipulation in C++
- General Stream Manipulation in C++
- Serialize Your Class into Streams in C++

Watch our Tech Videos 
Dev Articles Forums 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Contact Us 
Site Map 
Privacy Policy 
Support 

Developer Shed Affiliates

 




© 2003-2017 by Developer Shed. All rights reserved. DS Cluster - Follow our Sitemap
Popular Web Development Topics
All Web Development Tutorials