Problem background:
In our project, the gateway handles traffic splitting for microservices, that is, it controls the proportion of traffic routed to different instances of a microservice. For that reason, the gateway contains many methods that call the Nacos API.
When we deployed a new environment, the following error was reported. Our servers run on k8s and every environment uses the same image.
2021-11-23 16:53:54.568 ERROR [***-gateway,,,] 1 --- [ main] com.alibaba.nacos.client.naming : [NA] failed to write cache for dom:DEFAULT_GROUP@@***-****
java.lang.IllegalStateException: failed to create cache dir: /root/nacos/naming/753378b3-d4ad-4f1a-859b-f9d57df33c9f
    at com.alibaba.nacos.client.naming.cache.DiskCache.makeSureCacheDirExists(DiskCache.java:154) ~[nacos-client-1.1.4.jar:na]
    at com.alibaba.nacos.client.naming.cache.DiskCache.write(DiskCache.java:45) ~[nacos-client-1.1.4.jar:na]
    at com.alibaba.nacos.client.naming.core.HostReactor.processServiceJSON(HostReactor.java:184) [nacos-client-1.1.4.jar:na]
Troubleshooting process:
The error message is clear: the client tried to write a cache file to disk and could not create the cache directory. Following the stack trace, we located the class that raised the error in nacos-client-1.1.4.jar:
package com.alibaba.nacos.client.naming.cache;
public class DiskCache {
    public static void write(ServiceInfo dom, String dir) {
        try {
            makeSureCacheDirExists(dir);
            File file = new File(dir, dom.getKeyEncoded());
            if (!file.exists()) {
                // add another !file.exists() to avoid conflicted creating-new-file from multi-instances
                if (!file.createNewFile() && !file.exists()) {
                    throw new IllegalStateException("failed to create cache file");
                }
            }
            StringBuilder keyContentBuffer = new StringBuilder("");
            String json = dom.getJsonFromServer();
            if (StringUtils.isEmpty(json)) {
                json = JSON.toJSONString(dom);
            }
            keyContentBuffer.append(json);
            //Use the concurrent API to ensure the consistency.
            ConcurrentDiskUtil.writeFileContent(file, keyContentBuffer.toString(), Charset.defaultCharset().toString());
        } catch (Throwable e) {
            NAMING_LOGGER.error("[NA] failed to write cache for dom:" + dom.getName(), e);
        }
    }
    *******
    private static File makeSureCacheDirExists(String dir) {
        File cacheDir = new File(dir);
        if (!cacheDir.exists() && !cacheDir.mkdirs()) {
            throw new IllegalStateException("failed to create cache dir: " + dir);
        }
        return cacheDir;
    }
}
The write method calls makeSureCacheDirExists. In makeSureCacheDirExists, if the cache directory does not exist and creating it with mkdirs() fails, an IllegalStateException is thrown. The write method catches it and logs it, which is the ERROR line we saw.
Following the call chain to see who calls DiskCache.write, I found HostReactor. The cache path cacheDir is passed in through its constructor.
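The failing condition can be reproduced outside the gateway. The following is a minimal sketch, not Nacos code: the class CacheDirProbe and its method are hypothetical names, and the method only mirrors the `!cacheDir.exists() && !cacheDir.mkdirs()` check so that a candidate path can be tested under the same account that runs the application.

```java
import java.io.File;

// Hypothetical diagnostic helper, not part of nacos-client.
final class CacheDirProbe {

    // Mirrors the check in DiskCache.makeSureCacheDirExists:
    // true if the path already exists or could be created with mkdirs().
    static boolean canEnsureDir(String dir) {
        File cacheDir = new File(dir);
        return cacheDir.exists() || cacheDir.mkdirs();
    }

    public static void main(String[] args) {
        // Without an argument, probe the default base path under user.home
        // (the namespace suffix is left out here).
        String dir = args.length > 0
                ? args[0]
                : System.getProperty("user.home") + "/nacos/naming";
        // Note: a successful probe creates the directory as a side effect.
        System.out.println(dir + " usable as cache dir: " + canEnsureDir(dir));
    }
}
```

Running it inside the container with the path from the error message should print false when the process lacks permission to create that directory.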
package com.alibaba.nacos.client.naming.core;
public class HostReactor {
    public HostReactor(EventDispatcher eventDispatcher, NamingProxy serverProxy, String cacheDir, boolean loadCacheAtStart, int pollingThreadCount) {
        ......
    }
}
Tracing further up the call chain, I found that HostReactor is constructed when NacosNamingService is instantiated:
package com.alibaba.nacos.client.naming;
@SuppressWarnings("PMD.ServiceOrDaoClassShouldEndWithImplRule")
public class NacosNamingService implements NamingService {
    private HostReactor hostReactor;
    public NacosNamingService(String serverList) {
        Properties properties = new Properties();
        properties.setProperty(PropertyKeyConst.SERVER_ADDR, serverList);
        init(properties);
    }
    public NacosNamingService(Properties properties) {
        init(properties);
    }
    private void init(Properties properties) {
        namespace = InitUtils.initNamespaceForNaming(properties);
        initServerAddr(properties);
        InitUtils.initWebRootContext();
        initCacheDir();
        initLogName(properties);
        eventDispatcher = new EventDispatcher();
        serverProxy = new NamingProxy(namespace, endpoint, serverList);
        serverProxy.setProperties(properties);
        beatReactor = new BeatReactor(serverProxy, initClientBeatThreadCount(properties));
        hostReactor = new HostReactor(eventDispatcher, serverProxy, cacheDir, isLoadCacheAtStart(properties), initPollingThreadCount(properties));
    }
    private void initCacheDir() {
        cacheDir = System.getProperty("com.alibaba.nacos.naming.cache.dir");
        if (StringUtils.isEmpty(cacheDir)) {
            cacheDir = System.getProperty("user.home") + "/nacos/naming/" + namespace;
        }
    }
    ......
}
Both constructors of NacosNamingService call the init method. The init method calls initCacheDir() to assign the cacheDir field and then passes that value to the HostReactor constructor.
The body of initCacheDir shows that the Nacos cache path is determined in one of two ways:
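1. Explicitly, through the com.alibaba.nacos.naming.cache.dir parameter. The client reads it with System.getProperty, so the value has to be visible as a JVM system property.
2. By default, the home directory of the user running the process (user.home) + /nacos/naming/ + the namespace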
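The lookup order can be checked with a small stand-alone program. This is a sketch that repeats the logic of initCacheDir() under stated assumptions: CacheDirResolver is a hypothetical class name, a plain null-or-empty check stands in for StringUtils.isEmpty, and "example-namespace" is a placeholder for the real namespace id.

```java
// Hypothetical helper that repeats the lookup order of
// NacosNamingService.initCacheDir() in nacos-client 1.1.4.
final class CacheDirResolver {

    static final String CACHE_DIR_PROPERTY = "com.alibaba.nacos.naming.cache.dir";

    static String resolve(String namespace) {
        // 1. An explicit JVM system property wins.
        String cacheDir = System.getProperty(CACHE_DIR_PROPERTY);
        // 2. Otherwise fall back to <user.home>/nacos/naming/<namespace>.
        if (cacheDir == null || cacheDir.isEmpty()) {
            cacheDir = System.getProperty("user.home") + "/nacos/naming/" + namespace;
        }
        return cacheDir;
    }

    public static void main(String[] args) {
        // "example-namespace" is a placeholder, not a value from the original setup.
        String namespace = args.length > 0 ? args[0] : "example-namespace";
        System.out.println(resolve(namespace));
    }
}
```

When the process runs as root and no property is set, the result starts with /root/nacos/naming/, which matches the path in the error message.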
Solution:
1. If the server has only the root account, ask the operations team to grant write permission on the /root/nacos/naming/ directory.
2. Writing under root's home directory is generally not allowed. Instead, you can start the application under another account on the server and grant write permission on the nacos/naming/ directory under that account's home directory.
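3. Set com.alibaba.nacos.naming.cache.dir so that the cache is written to a directory the process is allowed to write to. Judging by initCacheDir() above, the client reads this value with System.getProperty, so an entry in the program's YML file only takes effect if it also reaches the JVM as a system property.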
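Because initCacheDir() reads the value with System.getProperty, one way to apply the third option is to set the property before any Nacos client object is created. The sketch below is an assumption about how that could look, not the original fix: NacosCacheDirSetup and useCacheDir are hypothetical names, and the directory under java.io.tmpdir is only an example of a location that is usually writable.

```java
import java.io.File;

// Hypothetical startup helper; call it before the Nacos client is initialized.
final class NacosCacheDirSetup {

    static final String CACHE_DIR_PROPERTY = "com.alibaba.nacos.naming.cache.dir";

    // Creates the directory if needed, checks that it is writable, and
    // publishes it as the system property that initCacheDir() reads.
    static void useCacheDir(String dir) {
        File cacheDir = new File(dir);
        if (!cacheDir.isDirectory() && !cacheDir.mkdirs()) {
            throw new IllegalStateException("cannot create cache dir: " + dir);
        }
        if (!cacheDir.canWrite()) {
            throw new IllegalStateException("cache dir is not writable: " + dir);
        }
        System.setProperty(CACHE_DIR_PROPERTY, cacheDir.getAbsolutePath());
    }

    public static void main(String[] args) {
        // Example location only; choose a directory that fits the deployment.
        File example = new File(System.getProperty("java.io.tmpdir"), "nacos/naming");
        useCacheDir(example.getPath());
        System.out.println(System.getProperty(CACHE_DIR_PROPERTY));
        // The application itself would be started after this point.
    }
}
```

The same effect can be had without code by passing the standard JVM option -Dcom.alibaba.nacos.naming.cache.dir=<writable directory> in the container's start command. Note that when this property is set, initCacheDir() uses it as-is and does not append the namespace.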