At Badoo we have a distributed system that uses ssh on start to connect to each node and exchange some data.
Sometimes I’ve seen ssh fail with this error:
ssh_exchange_identification: read: Connection reset by peer
Usually, if you see this error, it means that the server didn’t like your certificate or that you were banned from accessing the server.
The best way to debug this problem is to run the ssh client with -vvv to increase its verbosity, and to increase verbosity on the server as well.
If your Linux distribution uses systemd, the best way to look into the sshd logs is the journalctl -f command, or more specifically journalctl -u sshd.service -f.
To increase sshd verbosity, edit /etc/ssh/sshd_config, add LogLevel DEBUG3, and then restart sshd with sudo systemctl restart sshd.service.
Unfortunately, the logs didn’t show anything useful for me.
But I had a hunch that it was somehow connected to the fact that our distributed system opens a lot of ssh connections in very quick succession.
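One way to check a hunch like this is to open many connections at once and count how many never receive the server's SSH banner. The following is only a rough sketch, not what we actually ran: the names probe, count_outcomes and hold_seconds are made up for illustration, and it treats anything other than a reply starting with "SSH-" as a dropped connection.

```python
import socket
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def probe(host, port, hold_seconds=2.0):
    """Open one TCP connection and report whether an SSH banner arrived."""
    try:
        with socket.create_connection((host, port), timeout=5) as sock:
            banner = sock.recv(64)
            # Stay connected without authenticating, so that parallel
            # probes pile up as pending connections on the server.
            time.sleep(hold_seconds)
            return "ok" if banner.startswith(b"SSH-") else "dropped"
    except OSError:
        # Covers connection resets, refusals and timeouts.
        return "dropped"


def count_outcomes(host, port, connections, hold_seconds=2.0):
    """Run `connections` probes in parallel and tally the results."""
    with ThreadPoolExecutor(max_workers=connections) as pool:
        futures = [
            pool.submit(probe, host, port, hold_seconds)
            for _ in range(connections)
        ]
        return Counter(future.result() for future in futures)
```

For example, count_outcomes("node1.example.com", 22, connections=50) (the hostname is a placeholder) returns a Counter of "ok" and "dropped" results. If the number of dropped connections grows as the number of parallel connections grows, the server is probably limiting unauthenticated connections. Only point this at servers you operate yourself.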
While browsing through the sshd_config man page, I found a config variable that could be to blame. And I was right.
If you happen to have the same problem I did, try increasing the MaxStartups variable and restarting the sshd service.
I increased it to 100:30:600 from the default 10:30:60, and the connection resets vanished.
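To see why this helps: according to the sshd_config man page, the three MaxStartups values are start:rate:full. Once start unauthenticated connections are pending, sshd refuses new ones with a probability of rate percent, and that probability rises linearly until every connection is refused at full. Here is a small sketch of that rule. It follows the man page description rather than sshd's actual source, and the function names are mine.

```python
def parse_max_startups(value):
    """Split a 'start:rate:full' string into three integers."""
    start, rate, full = (int(part) for part in value.split(":"))
    return start, rate, full


def drop_probability(pending, value):
    """Approximate chance (0.0 to 1.0) that a new connection is refused
    while `pending` unauthenticated connections are already open."""
    start, rate, full = parse_max_startups(value)
    if pending < start:
        return 0.0
    if pending >= full:
        return 1.0
    # Linear ramp from rate% at `start` up to 100% at `full`.
    return (rate + (100 - rate) * (pending - start) / (full - start)) / 100


# Compare the old and new settings for a few load levels.
for pending in (5, 10, 35, 60, 100):
    old = drop_probability(pending, "10:30:60")
    new = drop_probability(pending, "100:30:600")
    print(f"{pending:>3} pending: old {old:.2f}, new {new:.2f}")
```

With the old setting, a burst of only ten simultaneous unauthenticated connections is already enough to start losing some of them. With the new one, nothing is refused until a hundred are pending.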
It is unfortunate that ssh and sshd logs gave me no clue about the problem.