Hello Community,
We are running Tutor on our private network, deployed on Kubernetes (Tutor + K8s).
In the LMS logs, we very frequently see pymongo.errors.ServerSelectionTimeoutError: mongodb:27017: [Errno -2] Name or service not known. This error sometimes results in a 502 error for end users.
Traceback from the LMS logs:
Traceback (most recent call last):
  File "/openedx/venv/lib/python3.8/site-packages/mongodb_proxy.py", line 55, in wrapper
    return func(*args, **kwargs)
  File "/openedx/edx-platform/common/lib/xmodule/xmodule/contentstore/mongo.py", line 134, in find
    fp = self.fs.get(content_id)
  File "/openedx/venv/lib/python3.8/site-packages/gridfs/__init__.py", line 153, in get
    gout._ensure_file()
  File "/openedx/venv/lib/python3.8/site-packages/gridfs/grid_file.py", line 486, in _ensure_file
    self._file = self.__files.find_one({"_id": self.__file_id},
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/collection.py", line 1273, in find_one
    for result in cursor.limit(-1):
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1156, in next
    if len(self.__data) or self._refresh():
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/cursor.py", line 1050, in _refresh
    self.__session = self.__collection.database.client._ensure_session()
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1810, in _ensure_session
    return self.__start_session(True, causal_consistency=False)
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1763, in __start_session
    server_session = self._get_server_session()
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/mongo_client.py", line 1796, in _get_server_session
    return self._topology.get_server_session()
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/topology.py", line 482, in get_server_session
    self._select_servers_loop(
  File "/openedx/venv/lib/python3.8/site-packages/pymongo/topology.py", line 208, in _select_servers_loop
    raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: mongodb:27017: [Errno -2] Name or service not known
We also noticed something odd about CONTENTSTORE. This is its value on the Tutor deployment:
>>> settings.CONTENTSTORE
{
    "ENGINE": "xmodule.contentstore.mongo.MongoContentStore",
    "ADDITIONAL_OPTIONS": {},
    "DOC_STORE_CONFIG": {
        "host": "mongodb",
        "port": 27017,
        "user": None,
        "password": None,
        "db": "openedx",
    },
}
On the native installation, CONTENTSTORE has different values:
>>> settings.CONTENTSTORE
{
    "ADDITIONAL_OPTIONS": {},
    "DOC_STORE_CONFIG": {
        "authsource": "",
        "collection": "modulestore",
        "connectTimeoutMS": 2000,
        "db": "edxapp",
        "host": "172.42.21.109",
        "password": "TEST",
        "port": 27017,
        "read_preference": "SECONDARY_PREFERRED",
        "replicaSet": "",
        "socketTimeoutMS": 3000,
        "ssl": False,
        "user": "edxapp",
    },
    "ENGINE": "xmodule.contentstore.mongo.MongoContentStore",
    "OPTIONS": {
        "auth_source": "",
        "db": "edxapp",
        "host": "172.42.21.109",
        "password": "TEST",
        "port": 27017,
        "ssl": False,
        "user": "edxapp",
    },
}
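To make the difference easier to see, here is a minimal sketch that lists the DOC_STORE_CONFIG keys present only on the native installation. The two dictionaries are copied from the outputs above; the variable names (tutor_doc_store, native_doc_store, only_native) are ours and purely illustrative. We do not know whether any of these missing keys is related to the error.

```python
# DOC_STORE_CONFIG as reported on the Tutor deployment (copied from above).
tutor_doc_store = {
    "host": "mongodb",
    "port": 27017,
    "user": None,
    "password": None,
    "db": "openedx",
}

# DOC_STORE_CONFIG as reported on the native installation (copied from above).
native_doc_store = {
    "authsource": "",
    "collection": "modulestore",
    "connectTimeoutMS": 2000,
    "db": "edxapp",
    "host": "172.42.21.109",
    "password": "TEST",
    "port": 27017,
    "read_preference": "SECONDARY_PREFERRED",
    "replicaSet": "",
    "socketTimeoutMS": 3000,
    "ssl": False,
    "user": "edxapp",
}

# Keys that exist only in the native configuration.
only_native = sorted(set(native_doc_store) - set(tutor_doc_store))
print(only_native)
```

Besides these extra keys, the native installation has a top-level OPTIONS block that the Tutor deployment lacks, and it addresses MongoDB by IP address, whereas Tutor uses the hostname "mongodb", which has to be resolved through DNS.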
Whenever this MongoDB error occurs, we check the status of the MongoDB pod: it is up and running, has never restarted, and its logs show no errors.
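Since "[Errno -2] Name or service not known" is a hostname resolution failure rather than an error from MongoDB itself, one way to narrow this down might be to probe DNS resolution from inside the LMS pod. The following is only a sketch under that assumption: probe_dns is a hypothetical helper of ours, not part of Tutor, edx-platform, or pymongo, and it uses only the Python standard library.

```python
import socket
import time


def probe_dns(hostname, port, attempts=5, delay_seconds=1.0):
    """Resolve `hostname` repeatedly and collect the failures.

    Returns a list of (attempt_number, error_message) tuples, one per
    attempt that raised socket.gaierror. An empty list means every
    attempt resolved successfully.
    """
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            # Same kind of lookup a client performs before connecting.
            socket.getaddrinfo(hostname, port)
        except socket.gaierror as exc:
            failures.append((attempt, str(exc)))
        # Pause between attempts, but not after the last one.
        if attempt < attempts:
            time.sleep(delay_seconds)
    return failures
```

For example, calling probe_dns("mongodb", 27017, attempts=60) from a Python shell in the LMS pod while the errors are occurring would show whether the "mongodb" service name intermittently fails to resolve. If it does, the problem is more likely in cluster DNS than in the MongoDB pod, which would be consistent with the pod staying healthy.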
Tutor version:
tutor, version 12.2.0
Looking for help on this,
Thanks.