Skip to content

contrib/cityhash: namespace config.h as city_config.h for vendored builds#515

Merged
slabko merged 1 commit into
ClickHouse:masterfrom
iliaal:cityhash-config-h-guard
Jun 12, 2026
Merged

contrib/cityhash: namespace config.h as city_config.h for vendored builds#515
slabko merged 1 commit into
ClickHouse:masterfrom
iliaal:cityhash-config-h-guard

Conversation

@iliaal

@iliaal iliaal commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

city.cc does an unguarded #include "config.h". The generic name collides with (or is shadowed by) a consumer's own autoconf config.h when the tree is vendored, and breaks outright on builds without an autoconf step that prune the generated header:

fatal error C1083: Cannot open include file: 'config.h'

Rename the checked-in header to city_config.h and CITY_-prefix every macro it defines, updating the two usages in city.cc (CITY_WORDS_BIGENDIAN, CITY_HAVE_BUILTIN_EXPECT) and the Bazel srcs entry. The header is static (the #if WIN32 || WIN64 HAVE_BUILTIN_EXPECT block is a hand edit, not autoconf output), so there is no configure step to keep in sync.

Inert #undef placeholders for standard types (size_t, uint*_t, ...) stay unprefixed: they would define standard symbols on deficient platforms, not CityHash macros. The pre/post object file for city.cc is byte-identical, so no runtime or wire-hash change.

@iliaal iliaal requested review from mzitnik and slabko as code owners June 10, 2026 20:31
@iliaal iliaal force-pushed the cityhash-config-h-guard branch from dbca1d0 to f03227f Compare June 10, 2026 20:32
@slabko

slabko commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Hi @iliaal,

Thank you for your contribution! Could you please explain the steps that led you to the problem you're trying to solve? config.h is included in the library along with the other CityHash source files. It seems like you have a specific use case that I hadn't considered before.

@iliaal

iliaal commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

You're right that config.h is checked in next to city.cc and resolves fine in this repo's own CMake build. The breakage hits consumers that vendor the CityHash sources, like my PHP ClickHouse extension (iliaal/php_clickhouse), which vendors clickhouse-cpp and builds it through PHP's own build system (config.w32 on Windows, config.m4 on Unix).

config.h is a generated autoconf artifact ("Generated from config.h.in by configure"). Build systems that vendor the sources but prune generated headers (mine does, which is common) don't carry it, and MSVC then fails: C1083: Cannot open include file: 'config.h'. Guarding the include behind HAVE_CONFIG_H is the standard autoconf convention here: it makes the generated config optional instead of a hard dependency on a committed build artifact.

city.cc reads only two macros from it. It takes the little-endian path either way, since the shipped config.h leaves WORDS_BIGENDIAN undef. And HAVE_BUILTIN_EXPECT is already 0 on MSVC, identical to the header being absent, so on the platform where the include fails the file changes nothing.

The only build the bare guard would affect is GCC/Clang, which gets HAVE_BUILTIN_EXPECT 1 today and would lose the __builtin_expect hints. So I've added target_compile_definitions(cityhash PRIVATE HAVE_CONFIG_H) to cityhash/CMakeLists.txt: this build keeps including config.h exactly as before, and only consumers that neither define the macro nor ship the header fall back to the defaults.

@slabko

slabko commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@iliaal, I see the issue now.

Unfortunately, it is a bit more complicated because, in addition to CMake, we also have Bazel. What do you think about renaming CityHash's config.h to something more specific, such as city_config.h, and prefixing all definitions in it, as well as their usages in CityHash's source code, so that they never overlap with your autoconf use?

…ilds

city.cc did an unguarded `#include "config.h"`. The generic name collides
with (or is shadowed by) a consumer's own autoconf config.h when these
sources are vendored, and breaks outright (MSVC C1083: cannot open include
file 'config.h') when the consumer prunes the generated header.

Rename the checked-in header to city_config.h and CITY_-prefix every macro
it defines, updating the two usages in city.cc (CITY_WORDS_BIGENDIAN,
CITY_HAVE_BUILTIN_EXPECT) and the Bazel srcs entry. The header is static
(the `#if WIN32 || WIN64` HAVE_BUILTIN_EXPECT block is a hand edit, not
autoconf output), so there is no configure step to keep in sync.

Inert `#undef` placeholders for standard types (size_t, uint*_t, ...) stay
unprefixed: they would define standard symbols on deficient platforms, not
CityHash macros. The pre/post object file for city.cc is byte-identical, so
no runtime or wire-hash change.
@iliaal iliaal force-pushed the cityhash-config-h-guard branch from 07d83d0 to db6bdc5 Compare June 11, 2026 14:12
@iliaal iliaal changed the title contrib/cityhash: guard #include "config.h" for vendored builds contrib/cityhash: namespace config.h as city_config.h for vendored builds Jun 11, 2026
@iliaal

iliaal commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Reworked to the rename. city_config.h with every macro CITY_-prefixed, the include unconditional and uniquely named, so it resolves the same way under CMake, Bazel, and any consumer that vendors the sources. No per-build-system HAVE_CONFIG_H define, which also closes the gap my earlier CMakeLists approach left under Bazel.

Details:

  • city.cc's two usages updated to CITY_WORDS_BIGENDIAN / CITY_HAVE_BUILTIN_EXPECT, and the Bazel srcs entry points at the renamed file.
  • The header is already static, not regenerated: the committed copy carries a hand-edited #if WIN32 || WIN64 block around HAVE_BUILTIN_EXPECT that isn't autoconf output, so there's no configure step to keep in sync after the rename.
  • Inert #undef placeholders for standard types (size_t, uint*_t, ...) are left unprefixed on purpose: they would define standard symbols on deficient platforms, not CityHash macros.
  • Compiled city.cc before and after the rename: the object file is byte-identical, so the CityHash128 wire variant is unchanged.

@slabko slabko left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works!

@slabko slabko merged commit a67036d into ClickHouse:master Jun 12, 2026
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants