-
Notifications
You must be signed in to change notification settings - Fork 835
Fix host.db load failure due to incorrect object_version compatibility check #12395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thank you for the review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Waiting for additional commit. Marking as draft can also prevent accidental merge in the future.
ProblemA bug was discovered where the cache loaded from host.db is not being used. trafficserver/iocore/hostdb/HostDB.cc Lines 567 to 570 in 9f6abab
CauseThe cause of the stale determination was identified in trafficserver/iocore/hostdb/I_HostDBProcessor.h Lines 180 to 183 in 9f6abab
This code first populates trafficserver/iocore/hostdb/I_HostDBProcessor.h Lines 322 to 339 in 9f6abab
Debug logs added before and after the constructor call showed that fields such as
The patch used to add the debug logs is shown below: patchdiff --git a/iocore/hostdb/I_HostDBProcessor.h b/iocore/hostdb/I_HostDBProcessor.h
index 0be0a81b4..2356337d7 100644
--- a/iocore/hostdb/I_HostDBProcessor.h
+++ b/iocore/hostdb/I_HostDBProcessor.h
@@ -178,10 +178,22 @@ struct HostDBInfo : public RefCountObj {
HostDBInfo *ret = HostDBInfo::alloc(size - sizeof(HostDBInfo));
int buf_index = ret->_iobuffer_index;
memcpy((void *)ret, buf, size);
+
+ Debug("hostdb", "HostDBInfo before new: key=%lu hostname_offset=%u, "
+ "ip_timestamp=%u, ip_timeout_interval=%u, _iobuffer_index=%d, m_refcount=%d",
+ ret->key, ret->hostname_offset, ret->ip_timestamp, ret->ip_timeout_interval,
+ ret->_iobuffer_index, ret->refcount());
+
// Reset the refcount back to 0, this is a bit ugly-- but I'm not sure we want to expose a method
// to mess with the refcount, since this is a fairly unique use case
ret = new (ret) HostDBInfo();
ret->_iobuffer_index = buf_index;
+
+ Debug("hostdb", "HostDBInfo after new: key=%lu hostname_offset=%u, "
+ "ip_timestamp=%u, ip_timeout_interval=%u, _iobuffer_index=%d, m_refcount=%d",
+ ret->key, ret->hostname_offset, ret->ip_timestamp, ret->ip_timeout_interval,
+ ret->_iobuffer_index, ret->refcount());
+
return ret;
}
diff --git a/iocore/hostdb/RefCountCache.cc b/iocore/hostdb/RefCountCache.cc
index af9cf0fd2..8f58c3767 100644
--- a/iocore/hostdb/RefCountCache.cc
+++ b/iocore/hostdb/RefCountCache.cc
@@ -49,5 +49,5 @@ RefCountCacheHeader::operator==(RefCountCacheHeader const &that) const
bool
RefCountCacheHeader::compatible(RefCountCacheHeader *that) const
{
- return this->magic == that->magic && this->version == that->version && this->object_version == that->version;
+ return this->magic == that->magic && this->version == that->version && this->object_version == that->object_version;
}; FixTo fix this issue, the construction of
I considered ways to initialize the reference counter without calling the constructor. trafficserver/iocore/hostdb/I_HostDBProcessor.h Lines 180 to 183 in 9f6abab
As a result, I chose the following workaround: |
Thank you for waiting. |
Thanks for the additional commit. Your description of the hostdb issue is very helpful.
[approve ci autest] |
Problem
When loading host.db, it is incorrectly determined as "incompatible," and the file fails to load. This issue does not occur on the master branch.
An example of how to reproduce the issue is shown below:
proxy.config.cache.hostdb.sync_frequency
in records.config:Cause
Compatibility is checked when loading host.db using the following logic:
https://ghe.corp.yahoo.co.jp/cdn/trafficserver/blob/16f04bd6a849998d1db9348c5504164328541803/iocore/hostdb/P_RefCountCache.h#L593-L597
The
RefCountCacheHeader::compatible
function checks whether the following member values ofRefCountCacheHeader
are equal:magic
version
object_version
https://ghe.corp.yahoo.co.jp/cdn/trafficserver/blob/16f04bd6a849998d1db9348c5504164328541803/iocore/hostdb/RefCountCache.cc#L49-L53
However, the current implementation mistakenly compares
object_version
withversion
, and since the values differ, the cache is deemed incompatible:object_version
: 1.1version
: 1.0object_version
was updated to 1.1 in the following PR, which likely introduced the issue starting from ATS 9.1.0:#7264
Fix
This patch updates
RefCountCacheHeader::compatible
to compareobject_version
values correctly.