Stockfish - chess engines UCI
Previous version chess engine Stockfish
Author compilation - Lucasart
Fix TT comment and static_assert()
Comment is based on a misunderstanding of what unaligned memory access is. Here is an article that explains it very clearly:
https://www.kernel.org/doc/Documentation/unaligned-memory-access.txt
No matter how we define TTEntry or TTCluster, there will never be any unaligned memory access. This is because the compiler knows the alignment rules, and makes the necessary adjustments to ensure that unaligned memory access does not occur.
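The compiler's adjustments can be observed directly. The following is a minimal sketch using a hypothetical struct named Sample (an assumption for illustration, not Stockfish's TTEntry); the helper names is_aligned and all_members_aligned are likewise invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration only; this is NOT Stockfish's TTEntry.
struct Sample {
    std::uint8_t  a;
    std::uint16_t b;  // the compiler pads after 'a' so that 'b' is 2-byte aligned
    std::uint8_t  c;
};

// Every member sits at an offset that is a multiple of its own alignment.
static_assert(offsetof(Sample, b) % alignof(std::uint16_t) == 0,
              "member b is naturally aligned");

// sizeof is a multiple of alignof, so every element of an array stays aligned.
static_assert(sizeof(Sample) % alignof(Sample) == 0,
              "array elements stay aligned");

// Returns true if the pointer satisfies the alignment requirement of T.
template <typename T>
bool is_aligned(const T* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Checks member b of every element in a small array.
bool all_members_aligned() {
    Sample arr[4] = {};
    for (const Sample& s : arr)
        if (!is_aligned(&s.b))
            return false;
    return true;
}
```

Both static_assert checks hold for any ordinary (non-packed) struct, which is why the layout of TTEntry or TTCluster cannot by itself cause an unaligned access.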
The issue being addressed here has nothing to do with unaligned memory access. It is about cache performance. To achieve the best cache performance:
- we prefetch the cacheline as soon as possible.
- we ensure that TT clusters do not spread across two cachelines. If they did, we would need to prefetch 2 cachelines, which could hurt cache performance.
Therefore the true conditions to achieve this are:
1/ The start address of the TT is cache line aligned. void TranspositionTable::resize() enforces this.
2/ The TT cluster size should *divide* the cache line size. Currently, we pack 2 clusters per cache line. It used to be 1 before "TT sardines". The ratio does not matter; all we want is to fit an integer number of clusters per cache line.
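The two conditions can be sketched as follows. This is an illustration under stated assumptions, not the actual Stockfish source: a 64-byte cache line is assumed, the Entry and Cluster layouts are hypothetical (chosen so that 2 clusters fit per line), and Table, allocate and within_one_line are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

constexpr std::size_t CacheLineSize = 64;  // assumption: 64-byte cache lines

// Hypothetical 10-byte entry and 32-byte cluster, for illustration only.
struct Entry {
    std::uint16_t key16;
    std::uint16_t move16;
    std::int16_t  value16;
    std::int16_t  eval16;
    std::uint8_t  genBound8;
    std::int8_t   depth8;
};

struct Cluster {
    Entry entry[3];
    char  padding[2];  // pads the cluster up to 32 bytes
};

// Condition 2: the cluster size divides the cache line size.
static_assert(CacheLineSize % sizeof(Cluster) == 0,
              "Cluster size must divide the cache line size");

struct Table {
    void*       mem = nullptr;    // raw allocation, the pointer to pass to free()
    Cluster*    table = nullptr;  // cache-line-aligned start of the clusters
    std::size_t clusterCount = 0;
};

// Condition 1: over-allocate by CacheLineSize - 1 bytes, then round the start
// address up to the next cache line boundary.
bool allocate(Table& t, std::size_t clusterCount) {
    std::free(t.mem);
    t.table = nullptr;
    t.clusterCount = 0;
    t.mem = std::calloc(clusterCount * sizeof(Cluster) + CacheLineSize - 1, 1);
    if (!t.mem)
        return false;
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(t.mem);
    addr = (addr + CacheLineSize - 1) & ~std::uintptr_t(CacheLineSize - 1);
    t.table = reinterpret_cast<Cluster*>(addr);
    t.clusterCount = clusterCount;
    return true;
}

// True if the first and last byte of the cluster lie in the same cache line.
bool within_one_line(const Cluster* c) {
    std::uintptr_t first = reinterpret_cast<std::uintptr_t>(c);
    std::uintptr_t last  = first + sizeof(Cluster) - 1;
    return first / CacheLineSize == last / CacheLineSize;
}
```

With both conditions in place, cluster i starts at offset i * sizeof(Cluster) from an aligned base, so no cluster straddles a line boundary and a single prefetch covers it. If the size did not divide the line size (say, a 48-byte cluster), every other cluster would span two cache lines.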
No functional change.
Resolves #506
JCER=3256
Information on the compilation:
Timestamp: 1448090633