A small collection of common Kafka errors

This article is a summary of common errors encountered when using Kafka. I hope it helps you.

1. UnknownTopicOrPartitionException

org.apache.kafka.common.errors.UnknownTopicOrPartitionException:
This server does not host this topic-partition

Error content: partition data is not available

Cause analysis: the producer is sending messages to a topic that does not exist. Check whether the topic exists, or set the auto.create.topics.enable parameter on the broker.
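For example, a minimal server.properties line (whether to enable auto-creation in production is a trade-off; this is just an illustration):

auto.create.topics.enable=true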

2. LEADER_NOT_AVAILABLE

WARN Error while fetching metadata with correlation id 0 : {test=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

Error content: the leader is unavailable

Cause analysis: topics are being deleted, or a leader election is in progress. Use the kafka-topics script to check the leader information,

then check whether the brokers are alive and try restarting them.
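For example, assuming a local zookeeper, the Leader column of the describe output shows which broker (if any) currently leads each partition (-1 means no leader):

./kafka-topics.sh --describe --zookeeper localhost:2181 --topic test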

3. NotLeaderForPartitionException

org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition

Error content: the broker is no longer the leader of the corresponding partition

Cause analysis: the leader has changed, i.e. leadership of the partition switched from one broker to another; analyze what caused the leader switch.

4. TimeoutException

org.apache.kafka.common.errors.TimeoutException: Expiring 5 record(s) for test-0: 30040 ms has passe

Error content: request timeout

Cause analysis: check where the exception is thrown and whether the network is reachable from there. If it is, consider increasing the value of request.timeout.ms.
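A minimal Java producer sketch (a sketch only: the broker address, topic, and timeout value are placeholder assumptions, and a kafka-clients 2.x dependency is assumed):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerTimeoutSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("request.timeout.ms", "60000"); // raised from the 30 s default for a slow network
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test", "key", "value"));
        } // close() flushes any pending records
    }
}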

5. RecordTooLargeException

WARN async.DefaultEventHandler: Produce request with correlation id 92548048 failed due to [TopicName,1]: org.apache.kafka.common.errors.RecordTooLargeException

Error content: the message is too large

Cause analysis: a record is larger than the producer can handle. You can increase request.timeout.ms and reduce batch.size; for genuinely large records, the producer's max.request.size and the broker's message.max.bytes generally also need to be raised.
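For reference, these are the size-related settings involved (the values are illustrative assumptions, not recommendations):

# producer side
batch.size=16384           # shrink this if batches grow too large
max.request.size=5242880   # must exceed your largest record
# broker side (server.properties)
message.max.bytes=5242880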

6. Closing socket connection

Closing socket connection to /127.0.0.1 (kafka.network.Processor)

Error content: connection closed

Cause analysis: if a newer-version Java API producer writes to the cluster and an older-version client consumer then connects, the broker keeps reporting errors because it cannot recognize the client's messages.

7. ConcurrentModificationException

java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access

Error content: thread unsafe

Cause analysis: KafkaConsumer is not thread safe; a single consumer instance must not be shared across threads.
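A common pattern is one consumer per thread. A minimal sketch (assumptions: kafka-clients 2.x, placeholder broker, group, and topic names):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OneConsumerPerThread {
    public static void main(String[] args) {
        for (int i = 0; i < 2; i++) {
            new Thread(OneConsumerPerThread::runConsumer).start(); // each thread gets its own consumer
        }
    }

    static void runConsumer() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "my-group");                // placeholder
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // The KafkaConsumer is created, polled, and closed entirely inside this thread,
        // so no instance is ever touched by two threads.
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (!Thread.currentThread().isInterrupted()) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.println(r.value()));
            }
        }
    }
}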

8. NetworkException

[kafka-producer-network-thread | producer-1] o.apache.kafka.common.network.Selector : [Producer clientId=producer-1] Connection with / disconnected

Error content: network exception

Cause analysis: the network connection is interrupted. Check the network condition of the broker

9. ILLEGAL_GENERATION

ILLEGAL_GENERATION occurred while committing offsets for group

Error content: invalid "generation"

Cause analysis: the consumer missed a rebalance because it spent too long processing data.

You need to reduce max.poll.records or increase max.poll.interval.ms, or try to speed up message processing.
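For example, in the consumer configuration (illustrative starting points; tune to your processing speed):

max.poll.records=100            # hand fewer records to each poll
max.poll.interval.ms=600000     # allow up to 10 minutes between polls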

10. advertised.listeners misconfigured at startup

java.lang.IllegalArgumentException: requirement failed: advertised.listeners cannot use the nonroutable meta-address 0.0.0.0. Use a routable IP address.
    at scala.Predef$.require(Predef.scala:277)
    at kafka.server.KafkaConfig.validateValues(KafkaConfig.scala:1203)
    at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1170)
    at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:881)
    at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:878)
    at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:28)
    at kafka.Kafka$.main(Kafka.scala:82)
    at kafka.Kafka.main(Kafka.scala)

Solution: modify server.properties:

advertised.listeners=PLAINTEXT://{ip}:9092  # the ip can be an intranet IP, an extranet IP, 127.0.0.1, or a domain name

Resolution:

server.properties has two listener settings. listeners: the IP and port the kafka service listens on at startup; it can listen on an intranet IP or on 0.0.0.0 (but not on an Internet IP), and it defaults to the IP obtained from java.net.InetAddress.getCanonicalHostName(). advertised.listeners: the address producers and consumers connect to; kafka registers this address in zookeeper, so it must be a legal IP or domain name other than 0.0.0.0, and it defaults to the same value as listeners.
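A typical working combination might look like this (the intranet IP is a placeholder):

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://192.168.1.10:9092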

11. PrintGCDateStamps exception at startup

[0.004s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/data/service/kafka_2.11-0.11.0.2/bin/../logs/kafkaServer-gc.log instead.
Unrecognized VM option 'PrintGCDateStamps'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Solution: use JDK 1.8.x, or a Kafka version >= 1.0.x.

Resolution:

This happens only when the JDK is version 9 or later and the Kafka version is earlier than 1.0.x, because JDK 9 removed the PrintGCDateStamps option that the older startup scripts pass.
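If upgrading is inconvenient, one workaround (an assumption; verify the variable name against your own bin/kafka-run-class.sh) is to switch the GC logging options in the startup script to the JDK 9+ unified logging form:

# bin/kafka-run-class.sh: replace the pre-JDK9 flags
# (-Xloggc:... -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps)
# with something like:
KAFKA_GC_LOG_OPTS="-Xlog:gc*:file=$LOG_DIR/kafkaServer-gc.log:time,tags:filecount=10,filesize=102400"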

12. The producer fails to send messages or the consumer cannot consume (kafka1.0.1)

#(java)org.apache.kafka warning
Connection to node 0 could not be established. Broker may not be available.


# (nodejs) Kafka node exception (exception after executing producer.send)
{ TimeoutError: Request timed out after 30000ms
    at new TimeoutError (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\errors\TimeoutError.js:6:9)
    at Timeout.setTimeout [as _onTimeout] (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\kafkaClient.js:737:14)
    at ontimeout (timers.js:466:11)
    at tryOnTimeout (timers.js:304:5)
    at Timer.listOnTimeout (timers.js:264:5) message: 'Request timed out after 30000ms' }

Solution: check the advertised.listeners configuration (if there are multiple brokers, check the configuration of the corresponding broker node), and confirm that the address can be reached from the current network (telnet, etc.)
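For example (host and port are placeholders):

telnet 192.168.1.10 9092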

13. The num.partitions value is too small, resulting in an error (kafka1.0.1)

#(java) org.apache.kafka (thrown on producer.send)
Exception in thread "main" org.apache.kafka.common.KafkaException: Invalid partition given with record: 1 is not in the range [0...1).
    at org.apache.kafka.clients.producer.KafkaProducer.waitOnMetadata(KafkaProducer.java:908)
    at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:778)
    at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:768)
    at com.wenshao.dal.TestProducer.main(TestProducer.java:36)


# (nodejs) Kafka node exception (exception after executing producer.send)
{ BrokerNotAvailableError: Could not find the leader
    at new BrokerNotAvailableError (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\errors\BrokerNotAvailableError.js:11:9)
    at refreshMetadata.error (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\kafkaClient.js:831:16)
    at D:\project\node\kafka-test\src\node_modules\kafka-node\lib\client.js:514:9
    at KafkaClient.wrappedFn (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\kafkaClient.js:379:14)
    at KafkaClient.Client.handleReceivedData (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\client.js:770:60)
    at Socket.<anonymous> (D:\project\node\kafka-test\src\node_modules\kafka-node\lib\kafkaClient.js:618:10)
    at Socket.emit (events.js:159:13)
    at addChunk (_stream_readable.js:265:12)
    at readableAddChunk (_stream_readable.js:252:11)
    at Socket.Readable.push (_stream_readable.js:209:10) message: 'Could not find the leader' }

Solution: modify the value of num.partitions. num.partitions is the default number of partitions for newly created topics, and it only takes effect for new topics, so try to set a reasonable value during project planning. You can also expand an existing topic dynamically from the command line:

./bin/kafka-topics.sh --zookeeper  localhost:2181 --alter --partitions 2 --topic  foo

14. Kafka topic operation

  • Add: create a new Kafka topic "mobilePhone" with one partition and one replica, on the local zookeeper. The address can be master:2181,slave1:2181,slave3:2181/kafka, or simply localhost:2181/kafka.

cd /usr/kafka/bin
./kafka-topics.sh --create --zookeeper master:2181,slave1:2181,slave3:2181/kafka --replication-factor 1 --partitions 1 --topic mobilePhone

When creating, the following error is reported: Error while executing topic command: replication factor: 1 larger than available brokers: 0

Solution: most likely the zookeeper.connect value in the server.properties configuration file is inconsistent with the zookeeper directory used when executing the command. The --zookeeper value must include the chroot directory, otherwise this error is reported. For example, if the configuration file has zookeeper.connect=master:2181,slave1:2181,slave3:2181/kafka but the command omits the kafka directory and is written as

--zookeeper master:2181,slave1:2181,slave3:2181, the above error will be reported. Therefore, make sure the zookeeper directory is consistent.

When the topic is created successfully, Created topic "mobilePhone" is printed.

Note: the replication factor cannot be greater than the number of brokers.

  • Query: query the information of a topic, here mobilePhone

cd /usr/kafka/bin
./kafka-topics.sh --describe --zookeeper master:2181,slave1:2181,slave3:2181/kafka --topic mobilePhone

You can query all topics and their information without specifying a topic name, that is, by executing the command without the --topic parameter.

  • Modify: modify a topic's parameters, for example change mobilePhone to 5 partitions; --alter modifies the number of partitions (it can only be increased)

cd /usr/kafka/bin
./kafka-topics.sh --alter --zookeeper master:2181,slave1:2181,slave3:2181/kafka  --partitions 5 --topic mobilePhone

Note: the following error occurred just after entering the command:

Error while executing topic command : Topic mobilePhone does not exist on ZK path master:2181,slave1:2181,slave3:2181
[2018-11-29 16:14:02,100] ERROR java.lang.IllegalArgumentException: Topic mobilePhone does not exist on ZK path master:2181,slave1:2181,slave3:2181
    at kafka.admin.TopicCommand$.alterTopic(TopicCommand.scala:124)
    at kafka.admin.TopicCommand$.main(TopicCommand.scala:65)
    at kafka.admin.TopicCommand.main(TopicCommand.scala)
(kafka.admin.TopicCommand$)

The reason for this error is that the zookeeper address in the command was written without the chroot root directory hostname:port/kafka, i.e. just the host name and port number, so zookeeper cannot find the topic's path.

  • Delete: delete some unnecessary topics

cd /usr/kafka/bin
./kafka-topics.sh --delete --zookeeper master:2181,slave1:2181,slave3:2181/kafka  --topic mobilePhone

When you execute the delete-topic command, you will be prompted that the topic cannot be deleted, because in the server.properties configuration file kafka defaults delete.topic.enable to false. You therefore need to set delete.topic.enable=true in the configuration file of each node; after that the topic can be deleted normally (it is marked for deletion).

However, this deletion only marks the topic as deleted; it is not removed in a real sense, and re-creating a topic with the same name will still report that the topic already exists. To delete a topic completely, go to zookeeper's bin directory, run ./zkCli.sh to enter the zookeeper command line, and delete three paths:

rmr /kafka/brokers/topics/mobilePhone
rmr /kafka/admin/delete_topics/mobilePhone
rmr /kafka/config/topics/mobilePhone

At this point the topic is completely deleted. If re-creating a topic with the same name still reports that it exists, restart the kafka service.

15. Kafka producer operation

Before running the producer and consumer commands, create a topic named newPhone following the method above, and set its partitions to 2. The zookeeper address set here is localhost:2181/kafka, no different from the above except that the topic is created on the current machine. This form is recommended: it is concise.

  • Create producer

cd /usr/kafka/bin
./kafka-console-producer.sh --broker-list localhost:9092 --topic newPhone

--broker-list: Kafka's service address (comma-separated if there are several). The default port is 9092; if you do not want to use this port, edit the server.properties configuration file under config to change it.

--topic newPhone: the producer is bound to this topic and will produce data to it. After the command executes correctly, you can start entering data.

16. Kafka consumer operation

  • Create consumer

./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic newPhone [--from-beginning]

--bootstrap-server: kafka's service address; --topic newPhone: bind the topic and start consuming (fetching) data from it; [--from-beginning]: read data from the beginning of the topic, not only from the moment the consumer connects.

In the whole procedure, first use the producer to produce several pieces of data.

Then, in the ssh tool (the author uses SecureCRT, which is very handy), clone a session and execute ./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic newPhone. Without the --from-beginning option, the consumer cannot read the four pieces of data the producer has already written to the topic.

After adding the --from-beginning option, the consumer consumes the data in the topic from the beginning.

Tip: if the following error occurs when the producer is producing data:

[root@master bin]# ./kafka-console-producer.sh --broker-list localhost:9092 --topic newPhone
>hisdhodsa        
[2018-11-29 17:28:16,926] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 1 : {newPhone=LEADER_NOT_AVAILABLE}          
(org.apache.kafka.clients.NetworkClient)         
[2018-11-29 17:28:17,080] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 5 : {newPhone=LEADER_NOT_AVAILABLE}           
(org.apache.kafka.clients.NetworkClient)

Solution: port 9092 is not being listened on. In the server.properties configuration file, uncomment listeners=PLAINTEXT://:9092.

17. kafka startup error

First error

[2017-02-17 17:25:29,224] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.KafkaException: Failed to acquire lock on file .lock in /var/log/kafka-logs. A Kafka instance in another process or thread is using this directory.
    at kafka.log.LogManager$$anonfun$lockLogDirs$1.apply(LogManager.scala:100)
    at kafka.log.LogManager$$anonfun$lockLogDirs$1.apply(LogManager.scala:97)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.log.LogManager.lockLogDirs(LogManager.scala:97)
    at kafka.log.LogManager.<init>(LogManager.scala:59)
    at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:609)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:183)
    at io.confluent.support.metrics.SupportedServerStartable.startup(SupportedServerStartable.java:100)
    at io.confluent.support.metrics.SupportedKafka.main(SupportedKafka.java:49)

Solution: "Failed to acquire lock on file .lock in /var/log/kafka-logs" means another process is already using kafka's log directory. Run ps -ef | grep kafka and kill the process that is using the directory.

The second error: no permission on the index files.

Change the ownership of the files to the correct user name and user group. The directory is /var/log/kafka-logs/, where __consumer_offsets-29 holds the offsets.

The third type, a production/consumption error: a jaas connection problem.

There is a problem with the kafka_client_jaas.conf file configuration. On the DCP16 environment the file lives under /opt/dataload/filesource_wangjuan/conf as kafka_client_jaas.conf:

KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/home/client/keytabs/client.keytab"
    serviceName="kafka"
    principal="client/dcp@DCP.COM";
};

18. kafka producer errors

First: the producer failed to send a message to topic:

[2017-03-09 09:16:00,982] [ERROR] [startJob_Worker-10] [DCPKafkaProducer.java line:62] producer exception when sending message to topic topicdf02211
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
        at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:335)

The reason is the configuration file: there is a problem with the configuration in kafka_client_jaas.conf, caused by a wrong keyTab path.

Second: production and consumption error: Failed to construct kafka producer

Key error information: Failed to construct kafka producer

Solution: a configuration file problem: the serviceName in KafkaClient should be kafka, but it had been configured as zookeeper. Fix it and restart.

The configuration file is as follows:

KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    useTicketCache=false
    serviceName=kafka
    keyTab="/etc/security/keytabs/kafka.service.keytab"
    principal="kafka/dcp16@DCP.COM";
};

KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    serviceName=kafka
    keyTab="/etc/security/keytabs/kafka.service.keytab"
    principal="kafka/dcp16@DCP.COM";
};

Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    useTicketCache=false
    serviceName=zookeeper
    keyTab="/etc/security/keytabs/kafka.service.keytab"
    principal="kafka/dcp16@DCP.COM";
};

Problem Description:

[kafka@DCP16 bin]$ ./kafka-console-producer   --broker-list DCP16:9092 --topic topicin050511  --producer.config ../etc/kafka/producer.properties
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:335)
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:188)
    at kafka.producer.NewShinyProducer.<init>(BaseProducer.scala:40)
    at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:45)
    at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)
Caused by: org.apache.kafka.common.KafkaException: java.lang.IllegalArgumentException: Conflicting serviceName values found in JAAS and Kafka configs value in JAAS file zookeeper, value in Kafka config kafka
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:86)
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:70)
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:83)
    at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:277)
    ... 4 more
Caused by: java.lang.IllegalArgumentException: Conflicting serviceName values found in JAAS and Kafka configs value in JAAS file zookeeper, value in Kafka config kafka
    at org.apache.kafka.common.security.kerberos.KerberosLogin.getServiceName(KerberosLogin.java:305)
    at org.apache.kafka.common.security.kerberos.KerberosLogin.configure(KerberosLogin.java:103)
    at org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:45)
    at org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:68)
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:78)
    ... 7 more


Error when consuming: ERROR Unknown error when running consumer:  (kafka.tools.ConsoleConsumer$)

[root@DCP16 bin]# ./kafka-console-consumer --zookeeper dcp18:2181,dcp16:2181,dcp19:2181/kafkakerberos --from-beginning --topic topicout050511 --new-consumer --consumer.config ../etc/kafka/consumer.properties --bootstrap-server DCP16:9092
[2017-05-07 22:24:37,479] ERROR Unknown error when running consumer:  (kafka.tools.ConsoleConsumer$)

org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:702)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:587)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:569)
    at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:53)
    at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:64)
    at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:51)
    at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
Caused by: org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner  authentication information from the user
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:86)
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:70)
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:83)
    at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:623)
    ... 6 more
Caused by: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner  authentication information from the user
    at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:899)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:719)
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
    at org.apache.kafka.common.security.authenticator.AbstractLogin.login(AbstractLogin.java:69)
    at org.apache.kafka.common.security.kerberos.KerberosLogin.login(KerberosLogin.java:110)
    at org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:46)
    at org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:68)
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:78)

A derivative problem:

kafka reports an error when producing messages:

[2017-05-07 23:17:16,240] ERROR Error when sending message to topic topicin050511 with key: null, value: 0 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.

Change KafkaClient to the following configuration:

KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useTicketCache=true;
};

19. kafka consumer errors

First error:

replication factor: 1 larger than available brokers: 0

Error when consuming:

Error while executing topic command : replication factor: 1 larger than available brokers: 0

Solution:

Restart the daemon under /confluent-3.0.0/bin:

./kafka-server-stop  -daemon   ../etc/kafka/server.properties
./kafka-server-start  -daemon   ../etc/kafka/server.properties

Then restart zk: sh zkCli.sh -server ai186;

/usr/hdp/2.4.2.0-258/zookeeper/bin/zkCli.sh  # directory of the script

If an error is reported, you can view the following configuration in the configuration file:

zookeeper.connect=dcp18:2181/kafkakerberos  # /kafkakerberos is the chroot (group) name

Second error: TOPIC_AUTHORIZATION_FAILED

./bin/kafka-console-consumer --zookeeper DCP185:2181,DCP186:2181,DCP187:2181/kafka --from-beginning --topic wangjuan_topic1 --new-consumer --consumer.config ./etc/kafka/consumer.properties --bootstrap-server DCP187:9092
[2017-03-02 13:44:38,398] WARN The configuration zookeeper.connection.timeout.ms = 6000 was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
[2017-03-02 13:44:38,575] WARN Error while fetching metadata with correlation id 1 : {wangjuan_topic1=TOPIC_AUTHORIZATION_FAILED} (org.apache.kafka.clients.NetworkClient)
[2017-03-02 13:44:38,677] WARN Error while fetching metadata with correlation id 2 : {wangjuan_topic1=TOPIC_AUTHORIZATION_FAILED} (org.apache.kafka.clients.NetworkClient)
[2017-03-02 13:44:38,780] WARN Error while fetching metadata with correlation id 3 : {wangjuan_topic1=TOPIC_AUTHORIZATION_FAILED} (org.apache.kafka.clients.NetworkClient)

Solution: the U of User in the following parameter in the configuration file must be uppercase:

super.users=User:kafka

Or the IP address of advertised.listeners in server.properties is incorrect; it may need to match the IP address used in the code.

Possible solutions to the third kind of error:

If consumption is not possible, check the error message in kafka's startup log: the log files may belong to the wrong group; they should belong to hadoop.

Or check whether the zookeeper chroot suffix configured for kafka has been changed; if it has, the topic needs to be regenerated.

The fourth error: tomcat error reports during consumption:

[2017-04-01 06:37:21,823] [INFO] [Thread-5] [AbstractCoordinator.java line:542] Marking the coordinator DCP187:9092 (id: 2147483647 rack: null) dead for group test-consumer-group

[2017-04-01 06:37:21,825] [WARN] [Thread-5] [ConsumerCoordinator.java line:476] Auto offset commit failed for group test-consumer-group: Commit offsets failed with retriable exception. You should retry committing offsets.

The code was changed to adjust tomcat's heartbeat timeout; the affected class is:

./webapps/web/WEB-INF/classes/com/ai/bdx/dcp/hadoop/service/impl/DCPKafkaConsumer.class;

After restart, the log shows:

[2017-04-01 10:14:56,167] [INFO] [Thread-5] [AbstractCoordinator.java line:542] Marking the coordinator DCP187:9092 (id: 2147483647 rack: null) dead for group test-consumer-group

[2017-04-01 10:14:56,286] [INFO] [Thread-5] [AbstractCoordinator.java line:505] Discovered coordinator DCP187:9092 (id: 2147483647 rack: null) for group test-consumer-group.

20. kafka error when creating a topic

Error when creating topic:

[2017-04-10 10:32:23,776] WARN SASL configuration failed: javax.security.auth.login.LoginException: Checksum failed Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it. (org.apache.zookeeper.ClientCnxn)
Exception in thread "main" org.I0Itec.zkclient.exception.ZkAuthFailedException: Authentication failure
    at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:946)
    at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:923)
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1230)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:156)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:130)
    at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
    at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
    at kafka.admin.TopicCommand$.main(TopicCommand.scala:54)
    at kafka.admin.TopicCommand.main(TopicCommand.scala)

Problem location: there is a problem with the jaas file:

Solution: super.users in the server.properties file should be consistent with the principal of the keytab in the jaas file;

server.properties: super.users=User:client

The kafka_server_jaas.conf file is changed to:

KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    useTicketCache=false
    serviceName=kafka
    keyTab="/data/data1/confluent-3.0.0/kafka.keytab"
    principal="kafka@DCP.COM";
};

KafkaClient {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/home/client/client.keytab"
    principal="client/DCP187@DCP.COM";
};

Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    useTicketCache=false
    serviceName=zookeeper
    keyTab="/home/client/client.keytab"
    principal="client/DCP187@DCP.COM";
};

21. Datacaptain -> kafka component error

The error message:

Not has broker can connection metadataBrokerList

java.lang.IllegalArgumentException: requirement failed: advertised.listeners cannot use the nonroutable meta-address 0.0.0.0. Use a routable IP address.
    at scala.Predef$.require(Predef.scala:277)
    at kafka.server.KafkaConfig.validateValues(KafkaConfig.scala:1203)
    at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1170)
    at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:881)
    at kafka.server.KafkaConfig$.fromProps(KafkaConfig.scala:878)
    at kafka.server.KafkaServerStartable$.fromProps(KafkaServerStartable.scala:28)
    at kafka.Kafka$.main(Kafka.scala:82)
    at kafka.Kafka.main(Kafka.scala)

Solution: modify server.properties:

advertised.listeners=PLAINTEXT://{ip}:9092  # the ip can be an intranet IP, an extranet IP, 127.0.0.1, or a domain name

Resolution:
There are two listener settings in server.properties.

listeners: the IP and port the kafka service listens on at startup. It can listen on an intranet IP or on 0.0.0.0 (but not on an Internet IP); the default is the IP obtained from java.net.InetAddress.getCanonicalHostName().

advertised.listeners: the address producers and consumers connect to. kafka registers this address in zookeeper, so it must be a legal IP or domain name other than 0.0.0.0; the default configuration is the same as listeners.

22. An error "TimeoutException(Java)" or "run out of brokers(Go)" or "Authentication failed for user(Python)" is reported

First, make sure the servers are configured correctly, then rule out network problems with ping and telnet. Assuming the network is fine, Kafka on the cloud authenticates the client when a connection is established. There are two authentication methods (sasl_mechanism):

  • ONS: only for the Java language; you need to configure your own AccessKey and SecretKey.

  • PLAIN: available for all languages; configure the AccessKey and the last 10 characters of the SecretKey.

If authentication fails, Kafka on the cloud will cut off the connection.
In addition, please carefully refer to the readme of each demo to configure it correctly.

23. The error "leader is not available" or "leader is in election" is reported

First, check whether the Topic has been created; Secondly, check whether the Topic type is "Kafka message".

24. Errors like "TOPIC_AUTHORIZATION_FAILED" or "Topic or group not authorized" are reported

This kind of error usually indicates that the permission is wrong, that is, your AccessKey does not have permission to access the corresponding Topic or Consumer ID (also known as group or consumer group).

The permission rules for Topic and Consumer ID are as follows:

  • A Topic must be created by the primary account; it can then be used by the primary account itself or authorized to a sub-account.

  • The Consumer ID belongs only to its creator; a Consumer ID created by the primary account cannot be used by a sub-account, and vice versa.

Note: please carefully check which account AccessKey and SecretKey come from to avoid misuse.

25. The Java client (including the Spring framework) reports an error "Failed to send SSL close message"

This error is usually followed by "connection reset by peer" or "broken pipe". The main reason is that the server side is a VIP network environment that actively closes idle connections. When such errors occur, it is recommended to retry sending; the Java client has an internal retry mechanism, which can be configured by referring to the Producer best practices. For other language clients, refer to the relevant documentation. You can also suppress this log by raising the log level; taking log4j as an example, add the following line of configuration:

log4j.logger.org.apache.kafka.common.network.SslTransportLayer=ERROR

26. The Spring Cloud Stream consumer reports "ArrayIndexOutOfBoundsException"

This error occurs because the Spring Cloud parses the message content in its own format. If you use Spring Cloud to send and consume at the same time, there will be no problem. This is also the recommended way to use it.

If you use other methods to send, such as calling Kafka's native Java client, you need to set the headerMode to raw when consuming with Spring Cloud, that is, disable parsing the message content. See the Spring Cloud official website for details.

27. Error "No worthy mechs found"

This error is reported by the C++ client, or by clients that wrap C++. It indicates that a system library is missing: cyrus-sasl-plain. On yum-managed systems, install it with: yum install cyrus-sasl{,-plain}.

28. What are CID, Consumer ID, Consumer Group and Group ID

These names all refer to the same concept: Kafka's consumer group. CID is short for Consumer ID, which is equivalent to Group ID (short for Consumer Group ID, identifying a specific consumer group). Each consumer group can contain multiple consumer instances that consume the subscribed topics with load balancing. The relationship between CID and Topic can be summarized as follows: each CID can subscribe to multiple Topics, and each Topic can be subscribed to by multiple CIDs; each CID is independent and does not affect the others.

Assuming CID 1 subscribes to Topic 1, each message of Topic 1 is delivered to exactly one instance within CID 1. If CID 2 also subscribes to Topic 1, each message of Topic 1 is likewise delivered to exactly one instance within CID 2.
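This is easy to reproduce with a recent console consumer: two different --group values each receive a full copy of the topic, while two instances sharing one value split its partitions between them (the group names and topic are placeholders):

./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --group cid-1
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --group cid-2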

Note: the Producer ID seen in the console is a concept from Aliware MQ and is not used by Kafka; this will be improved and optimized later.

29. How to view consumption progress

To view the consumption progress of a specific subscription consumer, please follow the steps below:

  • On the left side of the ONS console, click Publish and Subscribe Management - Subscription Management.

  • Enter the topic or Consumer ID in the search box and click Search to find the Consumer ID whose consumption progress you want to view.

  • After finding the Consumer ID, click Consumer Status in the operation column, and view the total accumulation on the pop-up page.

Total accumulation = total number of messages - number of messages consumed
Note: at present the consumer status may show as offline; this will be optimized later. Apart from the total accumulation, other information is for reference only.

30. What if the messages accumulate

Message accumulation is generally caused by slow consumption or blocked consumer threads. It is recommended to print the stack of the consuming threads and check the thread execution.

Note: Java processes can use jstack.
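For example (the pid is a placeholder; find it with jps or ps):

jstack 12345 > consumer-threads.txt    # inspect the dump for BLOCKED or long-running consumer threads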

31. java.lang.OutOfMemoryError: Map failed

The problem with the above error is that the OOM crashes the kafka process outright, so the only remedy is to restart the broker node. To make the broker node start faster, you can increase the parameter num.recovery.threads.per.data.dir (for example to 30): this thread pool is responsible for log recovery when the broker stops and starts, so a larger value means faster startup. Since this was a 32-core server, 30 threads were allocated; increase this parameter as much as you reasonably can so the broker node rejoins the ISR list faster.

First, following the tips above, restoring the service is the first thing to do. Next, analyze why this happened: the kafka cluster had been allocated 20G of memory.

After checking the monitoring charts for nearly 2 weeks, available memory was found to decrease continuously, and a memory leak was initially suspected.

This was only a suspicion, because the JVM had not been monitored before the error occurred. Having learned the lesson, we quickly set up zabbix monitoring of kafka's JVM.

After that, the following parameter was adjusted, to be observed for a period of time:

sysctl vm.max_map_count=262144

vm.max_map_count is the maximum number of memory map areas a process may have. Memory map areas are created when malloc is called, when mmap and mprotect are called directly, and when shared libraries are loaded. Although most programs need fewer than 1000, specific programs, especially malloc debuggers, may need many more: each breakpoint can create one or two map areas. The default value is 65536.

Solution:

First: kafka's heap allocation should not exceed 6G. Kafka does not consume much heap memory, so the default 1G is unreasonable and 6G is the recommended setting. The server has 32G of memory and had been given a 22G heap for kafka; after consulting the notes in Kafka: The Definitive Guide and Apache Kafka实战, both of which recommend a 5G or 6G heap, kafka's heap was finally set to 6G.

Second: adjust the vm.max_map_count parameter, the one mentioned above. In a real production environment, if a single broker hosts too many topics, you may hit the serious "java.lang.OutOfMemoryError: Map failed" error and the kafka node crashes, because creating a large number of topics greatly consumes the operating system's memory for memory-mapping operations. In this case, adjust the vm.max_map_count parameter with the command sysctl vm.max_map_count=N. The default value of this parameter is 65535; consider a larger value for the production environment, such as 262144 or even larger.
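To apply the change immediately and keep it after a reboot (conventional sysctl usage):

sysctl -w vm.max_map_count=262144
echo "vm.max_map_count=262144" >> /etc/sysctl.conf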
