[PVE-User] PVE, Ceph, OSD in stop/out state: how to restart from commandline?
Marco Gaiarin
gaio at sv.lnf.it
Wed Jun 7 11:17:09 CEST 2017
Hi Fabian Grünbichler!
In that message you wrote:
> OSDs are supposed to be enabled by UDEV rules automatically. This does
> not work on all systems, so PVE installs a ceph.service which triggers a
> scan for OSDs on all available disks.
> Either calling "systemctl restart ceph.service" or "ceph-disk
> activate-all" should start all available OSDs which haven't been started
> yet.
Good. I've missed that.
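So, if I understand correctly, the commandline recipe on the node with the
stopped OSDs boils down to (both commands taken from your mail; I have not
re-tested them here yet):

  systemctl restart ceph.service   # re-run the PVE-installed scan for OSDs
  # or, equivalently:
  ceph-disk activate-all           # activate every detected OSD partition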
> The reason why you are not seeing ceph-osd@X systemd units for OSDs
> which haven't been available on this boot is that these units are
> purposely lost on a reboot, and only re-enabled for the current boot
> when ceph-disk starts the OSD (in systemd speech, they are "runtime"
> enabled). This kind of makes sense, since a OSD service can only be
> started if its disk is there, and if the disk is there it is supposed to
> have already been started via the UDEV rule.
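OK. So if a disk is there but its unit was never runtime-enabled on this
boot, I suppose a single OSD could also be brought up by hand with something
like the following (assuming the ceph-osd@ template unit you mention; I have
not verified this on my Hammer setup):

  systemctl list-units 'ceph-osd@*'    # per-OSD units enabled for this boot
  systemctl start ceph-osd@5.service   # start one OSD manually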
> Which PVE and Ceph versions are you on? Is there anything out of the
> ordinary about your setup? Could you provide a log of the boot where the
> OSDs failed to start? The ceph.service should catch all the OSDs missed
> by UDEV on boot..
I'm using the latest PVE 4.4 with Ceph Hammer.
As I said, the power went off several times, at least twice. Looking at the
logs (syslog):
1) Main power outage, server shut down:
Jun 2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 2 19:07:29 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 2 19:07:29 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
2) Power came back just long enough to boot the server, then the server went
down again. But from here on, only one OSD started.
[ I don't know whether the power came back more than once here; if it did, it
was not on long enough to write any logs. ]
Jun 3 22:10:51 vedovanera ceph[1893]: === mon.1 ===
Jun 3 22:10:51 vedovanera ceph[1893]: Starting Ceph mon.1 on vedovanera...
Jun 3 22:10:51 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 3 22:10:51 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 3 22:10:51 vedovanera ceph[1893]: Running as unit ceph-mon.1.1496520651.342882341.service.
Jun 3 22:10:51 vedovanera ceph[1893]: Starting ceph-create-keys on vedovanera...
Jun 3 22:10:51 vedovanera kernel: [ 12.777843] ip6_tables: (C) 2000-2006 Netfilter Core Team
Jun 3 22:10:51 vedovanera kernel: [ 12.788040] ip_set: protocol 6
Jun 3 22:10:51 vedovanera kernel: [ 12.949441] XFS (sdc1): Mounting V4 Filesystem
Jun 3 22:10:52 vedovanera kernel: [ 13.114011] XFS (sdc1): Ending clean mount
Jun 3 22:10:52 vedovanera ceph[1893]: === osd.3 ===
Jun 3 22:10:52 vedovanera ceph[1893]: 2017-06-03 22:10:52.370824 7f5e9c313700 0 -- :/3217267806 >> 10.27.251.8:6789/0 pipe(0x7f5e98061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e9805c1f0).fault
Jun 3 22:10:53 vedovanera bash[2122]: starting mon.1 rank 1 at 10.27.251.8:6789/0 mon_data /var/lib/ceph/mon/ceph-1 fsid 8794c124-c2ec-4e81-8631-742992159bd6
Jun 3 22:10:56 vedovanera ceph[1893]: 2017-06-03 22:10:56.541384 7f5e9c111700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.9:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ef0).fault
Jun 3 22:11:02 vedovanera ceph[1893]: 2017-06-03 22:11:02.541210 7f5e9c313700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.7:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c006470).fault
Jun 3 22:11:08 vedovanera ceph[1893]: 2017-06-03 22:11:08.541237 7f5e9c212700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.9:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ea0).fault
Jun 3 22:11:11 vedovanera ceph[1893]: 2017-06-03 22:11:11.541285 7f5e9c313700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.11:6789/0 pipe(0x7f5e8c008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c00c5f0).fault
Jun 3 22:11:14 vedovanera ceph[1893]: 2017-06-03 22:11:14.541215 7f5e9c212700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.12:6789/0 pipe(0x7f5e8c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ea0).fault
Jun 3 22:11:17 vedovanera ceph[1893]: 2017-06-03 22:11:17.541237 7f5e9c313700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.7:6789/0 pipe(0x7f5e8c008350 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c00c5f0).fault
Jun 3 22:11:20 vedovanera ceph[1893]: 2017-06-03 22:11:20.541254 7f5e9c212700 0 -- 10.27.251.8:0/3217267806 >> 10.27.251.11:6789/0 pipe(0x7f5e8c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f5e8c004ea0).fault
Jun 3 22:11:22 vedovanera ceph[1893]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.3 --keyring=/var/lib/ceph/osd/ceph-3/keyring osd crush create-or-move -- 3 1.82 host=vedovanera root=default'
Jun 3 22:11:22 vedovanera ceph[1893]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.3']' returned non-zero exit status 1
Jun 3 22:11:22 vedovanera kernel: [ 43.350784] XFS (sdd1): Mounting V4 Filesystem
Jun 3 22:11:22 vedovanera kernel: [ 43.447046] XFS (sdd1): Ending clean mount
Jun 3 22:11:22 vedovanera ceph[1893]: === osd.5 ===
Jun 3 22:11:26 vedovanera ceph[1893]: 2017-06-03 22:11:26.541307 7fb7b7fff700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.9:6789/0 pipe(0x7fb7b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ef0).fault
Jun 3 22:11:32 vedovanera ceph[1893]: 2017-06-03 22:11:32.541205 7fb7bc2a8700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.7:6789/0 pipe(0x7fb7b0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0006470).fault
Jun 3 22:11:37 vedovanera ceph[1893]: 2017-06-03 22:11:37.570726 7fb7bc1a7700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.9:6789/0 pipe(0x7fb7b0000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ea0).fault
Jun 3 22:11:41 vedovanera ceph[1893]: 2017-06-03 22:11:41.541195 7fb7bc2a8700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.11:6789/0 pipe(0x7fb7b0008350 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b000c5f0).fault
Jun 3 22:11:44 vedovanera ceph[1893]: 2017-06-03 22:11:44.541230 7fb7bc1a7700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.12:6789/0 pipe(0x7fb7b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ea0).fault
Jun 3 22:11:47 vedovanera ceph[1893]: 2017-06-03 22:11:47.541209 7fb7bc2a8700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.7:6789/0 pipe(0x7fb7b0008350 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b000c5f0).fault
Jun 3 22:11:50 vedovanera ceph[1893]: 2017-06-03 22:11:50.541203 7fb7bc1a7700 0 -- 10.27.251.8:0/279741970 >> 10.27.251.11:6789/0 pipe(0x7fb7b0000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb7b0004ea0).fault
Jun 3 22:11:52 vedovanera ceph[1893]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.5 --keyring=/var/lib/ceph/osd/ceph-5/keyring osd crush create-or-move -- 5 0.91 host=vedovanera root=default'
Jun 3 22:11:52 vedovanera ceph[1893]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.5']' returned non-zero exit status 1
Jun 3 22:11:52 vedovanera kernel: [ 73.636441] XFS (sdb1): Mounting V4 Filesystem
Jun 3 22:11:52 vedovanera kernel: [ 73.734257] XFS (sdb1): Ending clean mount
Jun 3 22:11:52 vedovanera ceph[1893]: === osd.4 ===
Jun 3 22:11:53 vedovanera ceph[1893]: 2017-06-03 22:11:53.541240 7f3b187db700 0 -- :/416680248 >> 10.27.251.11:6789/0 pipe(0x7f3b14061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b1405c1f0).fault
Jun 3 22:12:02 vedovanera ceph[1893]: 2017-06-03 22:12:02.541217 7f3b186da700 0 -- 10.27.251.8:0/416680248 >> 10.27.251.7:6789/0 pipe(0x7f3b08000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b08005040).fault
Jun 3 22:12:11 vedovanera ceph[1893]: 2017-06-03 22:12:11.541200 7f3b187db700 0 -- 10.27.251.8:0/416680248 >> 10.27.251.11:6789/0 pipe(0x7f3b08006e20 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b0800b0c0).fault
Jun 3 22:12:14 vedovanera ceph[1893]: 2017-06-03 22:12:14.541196 7f3b186da700 0 -- 10.27.251.8:0/416680248 >> 10.27.251.12:6789/0 pipe(0x7f3b08000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b08005040).fault
Jun 3 22:12:16 vedovanera ceph[1893]: 2017-06-03 22:12:16.860344 7f3b187db700 0 -- 10.27.251.8:0/416680248 >> 10.27.251.7:6789/0 pipe(0x7f3b08006e20 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b0800b0c0).fault
Jun 3 22:12:18 vedovanera systemd[1]: Stopping /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 3 22:12:18 vedovanera systemd[1]: Stopped /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 3 22:12:20 vedovanera ceph[1893]: 2017-06-03 22:12:20.541149 7f3b186da700 0 -- 10.27.251.8:0/416680248 >> 10.27.251.11:6789/0 pipe(0x7f3b08000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f3b08005040).fault
3) Power came back:
Jun 4 15:36:15 vedovanera ceph[1901]: === mon.1 ===
Jun 4 15:36:15 vedovanera ceph[1901]: Starting Ceph mon.1 on vedovanera...
Jun 4 15:36:15 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 4 15:36:15 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-mon -i 1 --pid-file /var/run/ceph/mon.1.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 4 15:36:15 vedovanera ceph[1901]: Running as unit ceph-mon.1.1496583375.326412133.service.
Jun 4 15:36:15 vedovanera ceph[1901]: Starting ceph-create-keys on vedovanera...
Jun 4 15:36:15 vedovanera bash[2118]: starting mon.1 rank 1 at 10.27.251.8:6789/0 mon_data /var/lib/ceph/mon/ceph-1 fsid 8794c124-c2ec-4e81-8631-742992159bd6
Jun 4 15:36:15 vedovanera kernel: [ 11.701050] XFS (sda1): Mounting V4 Filesystem
Jun 4 15:36:15 vedovanera kernel: [ 11.818819] ip6_tables: (C) 2000-2006 Netfilter Core Team
Jun 4 15:36:15 vedovanera kernel: [ 11.828320] ip_set: protocol 6
Jun 4 15:36:15 vedovanera kernel: [ 11.831759] XFS (sda1): Ending clean mount
Jun 4 15:36:15 vedovanera ceph[1901]: === osd.2 ===
Jun 4 15:36:18 vedovanera ceph[1901]: 2017-06-04 15:36:18.541236 7f60ac3d7700 0 -- :/1951895097 >> 10.27.251.7:6789/0 pipe(0x7f60a8061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f60a805c1f0).fault
Jun 4 15:36:21 vedovanera ceph[1901]: 2017-06-04 15:36:21.541317 7f60ac2d6700 0 -- :/1951895097 >> 10.27.251.9:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c004ef0).fault
Jun 4 15:36:27 vedovanera ceph[1901]: 2017-06-04 15:36:27.541272 7f60ac1d5700 0 -- 10.27.251.8:0/1951895097 >> 10.27.251.7:6789/0 pipe(0x7f609c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c006470).fault
Jun 4 15:36:33 vedovanera ceph[1901]: 2017-06-04 15:36:33.541255 7f60ac3d7700 0 -- 10.27.251.8:0/1951895097 >> 10.27.251.9:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c0057d0).fault
Jun 4 15:36:36 vedovanera ceph[1901]: 2017-06-04 15:36:36.541283 7f60ac1d5700 0 -- 10.27.251.8:0/1951895097 >> 10.27.251.11:6789/0 pipe(0x7f609c008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c00c5f0).fault
Jun 4 15:36:39 vedovanera ceph[1901]: 2017-06-04 15:36:39.541286 7f60ac3d7700 0 -- 10.27.251.8:0/1951895097 >> 10.27.251.12:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c005bb0).fault
Jun 4 15:36:42 vedovanera ceph[1901]: 2017-06-04 15:36:42.541222 7f60ac1d5700 0 -- 10.27.251.8:0/1951895097 >> 10.27.251.7:6789/0 pipe(0x7f609c008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c00c5f0).fault
Jun 4 15:36:45 vedovanera ceph[1901]: 2017-06-04 15:36:45.541256 7f60ac3d7700 0 -- 10.27.251.8:0/1951895097 >> 10.27.251.11:6789/0 pipe(0x7f609c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f609c00fbf0).fault
Jun 4 15:36:45 vedovanera ceph[1901]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.2 --keyring=/var/lib/ceph/osd/ceph-2/keyring osd crush create-or-move -- 2 1.82 host=vedovanera root=default'
Jun 4 15:36:45 vedovanera ceph[1901]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.2']' returned non-zero exit status 1
Jun 4 15:36:46 vedovanera kernel: [ 42.122846] XFS (sdc1): Mounting V4 Filesystem
Jun 4 15:36:46 vedovanera kernel: [ 42.281840] XFS (sdc1): Ending clean mount
Jun 4 15:36:46 vedovanera ceph[1901]: === osd.3 ===
Jun 4 15:36:48 vedovanera ceph[1901]: 2017-06-04 15:36:48.541288 7f368c4dc700 0 -- :/1031853535 >> 10.27.251.7:6789/0 pipe(0x7f3688061590 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f368805c1f0).fault
Jun 4 15:36:51 vedovanera ceph[1901]: 2017-06-04 15:36:51.541359 7f368c3db700 0 -- :/1031853535 >> 10.27.251.9:6789/0 pipe(0x7f367c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c004ef0).fault
Jun 4 15:36:57 vedovanera ceph[1901]: 2017-06-04 15:36:57.541267 7f368c2da700 0 -- 10.27.251.8:0/1031853535 >> 10.27.251.7:6789/0 pipe(0x7f367c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c006470).fault
Jun 4 15:37:06 vedovanera ceph[1901]: 2017-06-04 15:37:06.541330 7f368c3db700 0 -- 10.27.251.8:0/1031853535 >> 10.27.251.11:6789/0 pipe(0x7f367c0080e0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c00e0d0).fault
Jun 4 15:37:09 vedovanera ceph[1901]: 2017-06-04 15:37:09.541277 7f368c2da700 0 -- 10.27.251.8:0/1031853535 >> 10.27.251.12:6789/0 pipe(0x7f367c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c00eb30).fault
Jun 4 15:37:12 vedovanera ceph[1901]: 2017-06-04 15:37:12.541280 7f368c3db700 0 -- 10.27.251.8:0/1031853535 >> 10.27.251.7:6789/0 pipe(0x7f367c0080e0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c0103e0).fault
Jun 4 15:37:15 vedovanera ceph[1901]: 2017-06-04 15:37:15.541249 7f368c2da700 0 -- 10.27.251.8:0/1031853535 >> 10.27.251.11:6789/0 pipe(0x7f367c000c00 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f367c012120).fault
Jun 4 15:37:16 vedovanera ceph[1901]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.3 --keyring=/var/lib/ceph/osd/ceph-3/keyring osd crush create-or-move -- 3 1.82 host=vedovanera root=default'
Jun 4 15:37:16 vedovanera ceph[1901]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.3']' returned non-zero exit status 1
Jun 4 15:37:16 vedovanera kernel: [ 72.458631] XFS (sdd1): Mounting V4 Filesystem
Jun 4 15:37:16 vedovanera kernel: [ 72.538849] XFS (sdd1): Ending clean mount
Jun 4 15:37:16 vedovanera ceph[1901]: === osd.5 ===
Jun 4 15:37:27 vedovanera ceph[1901]: 2017-06-04 15:37:27.545272 7fc898df7700 0 -- 10.27.251.8:0/3502424686 >> 10.27.251.7:6789/0 pipe(0x7fc88c000da0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c005040).fault
Jun 4 15:37:36 vedovanera ceph[1901]: 2017-06-04 15:37:36.545352 7fc898ef8700 0 -- 10.27.251.8:0/3502424686 >> 10.27.251.11:6789/0 pipe(0x7fc88c006e20 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c00e020).fault
Jun 4 15:37:39 vedovanera ceph[1901]: 2017-06-04 15:37:39.545309 7fc898df7700 0 -- 10.27.251.8:0/3502424686 >> 10.27.251.12:6789/0 pipe(0x7fc88c000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c00eb90).fault
Jun 4 15:37:42 vedovanera ceph[1901]: 2017-06-04 15:37:42.545271 7fc898ef8700 0 -- 10.27.251.8:0/3502424686 >> 10.27.251.7:6789/0 pipe(0x7fc88c006e20 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c0104a0).fault
Jun 4 15:37:45 vedovanera ceph[1901]: 2017-06-04 15:37:45.545256 7fc898df7700 0 -- 10.27.251.8:0/3502424686 >> 10.27.251.11:6789/0 pipe(0x7fc88c000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7fc88c00cbc0).fault
Jun 4 15:37:46 vedovanera ceph[1901]: failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.5 --keyring=/var/lib/ceph/osd/ceph-5/keyring osd crush create-or-move -- 5 0.91 host=vedovanera root=default'
Jun 4 15:37:46 vedovanera ceph[1901]: ceph-disk: Error: ceph osd start failed: Command '['/usr/sbin/service', 'ceph', '--cluster', 'ceph', 'start', 'osd.5']' returned non-zero exit status 1
Jun 4 15:37:46 vedovanera kernel: [ 102.738714] XFS (sdb1): Mounting V4 Filesystem
Jun 4 15:37:46 vedovanera kernel: [ 102.814995] XFS (sdb1): Ending clean mount
Jun 4 15:37:46 vedovanera ceph[1901]: === osd.4 ===
Jun 4 15:37:55 vedovanera ceph[1901]: 2017-06-04 15:37:55.921892 7f0c142f7700 0 -- 10.27.251.8:0/398750406 >> 10.27.251.7:6789/0 pipe(0x7f0c00000da0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f0c00005040).fault
Jun 4 15:38:01 vedovanera ceph[1901]: 2017-06-04 15:38:01.215940 7f0c154fb700 -1 monclient: _check_auth_rotating possible clock skew, rotating keys expired way too early (before 2017-06-04 14:38:01.215935)
Jun 4 15:38:01 vedovanera ceph[1901]: create-or-move updated item name 'osd.4' weight 0.91 at location {host=vedovanera,root=default} to crush map
Jun 4 15:38:01 vedovanera ceph[1901]: Starting Ceph osd.4 on vedovanera...
Jun 4 15:38:01 vedovanera ceph[1901]: Running as unit ceph-osd.4.1496583466.847479891.service.
Jun 4 15:38:01 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f...
Jun 4 15:38:01 vedovanera ceph[1901]: ceph-disk: Error: One or more partitions failed to activate
Jun 4 15:38:01 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 4 --pid-file /var/run/ceph/osd.4.pid -c /etc/ceph/ceph.conf --cluster ceph -f.
Jun 4 15:38:01 vedovanera bash[4907]: starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4 /var/lib/ceph/osd/ceph-4/journal
Jun 4 15:38:03 vedovanera systemd[1]: Startup finished in 2.081s (kernel) + 1min 57.357s (userspace) = 1min 59.439s.
4) I started the faulty OSDs via the web interface:
Jun 4 16:02:30 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/pve/ceph.conf --cluster ceph -f...
Jun 4 16:02:30 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 5 --pid-file /var/run/ceph/osd.5.pid -c /etc/pve/ceph.conf --cluster ceph -f.
Jun 4 16:02:30 vedovanera bash[16353]: starting osd.5 at :/0 osd_data /var/lib/ceph/osd/ceph-5 /var/lib/ceph/osd/ceph-5/journal
Jun 4 16:03:34 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/pve/ceph.conf --cluster ceph -f...
Jun 4 16:03:34 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 2 --pid-file /var/run/ceph/osd.2.pid -c /etc/pve/ceph.conf --cluster ceph -f.
Jun 4 16:03:34 vedovanera bash[17125]: starting osd.2 at :/0 osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
Jun 4 16:04:43 vedovanera systemd[1]: Starting /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/pve/ceph.conf --cluster ceph -f...
Jun 4 16:04:43 vedovanera systemd[1]: Started /bin/bash -c ulimit -n 32768; /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/pve/ceph.conf --cluster ceph -f.
Jun 4 16:04:43 vedovanera bash[18009]: starting osd.3 at :/0 osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
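For the record, the commandline equivalent I would try next time (instead of
the web interface) is what ceph-disk itself invokes according to the
failed-start log above, or the activate-all variant you suggested:

  service ceph --cluster ceph start osd.5   # per-OSD, as seen in the log
  ceph-disk activate-all                    # or re-activate every OSD partition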
Looking at the logs for osd.2, for example, there are no ''intermediate''
entries between the shutdown and the manual start, e.g.:
2017-06-02 19:07:20.970337 7f4065e40700 0 log_channel(cluster) log [INF] : 3.fe deep-scrub starts
2017-06-02 19:07:22.474508 7f4065e40700 0 log_channel(cluster) log [INF] : 3.fe deep-scrub ok
2017-06-02 19:07:27.283142 7f4060f29700 0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6805/12122 pipe(0x26581000 sd=37 :40028 s=2 pgs=65 cs=1 l=0 c=0x3be1700).fault with nothing to send, going to standby
2017-06-02 19:07:27.286888 7f405c3e5700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6807/12122 pipe(0x27da2000 sd=47 :0 s=1 pgs=0 cs=0 l=1 c=0x307343c0).fault
2017-06-02 19:07:27.286934 7f405c4e6700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6806/12122 pipe(0x27d90000 sd=149 :0 s=1 pgs=0 cs=0 l=1 c=0x1c611760).fault
2017-06-02 19:07:27.287907 7f406142e700 0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6821/11698 pipe(0x26541000 sd=27 :47418 s=2 pgs=66 cs=1 l=0 c=0x3be1180).fault with nothing to send, going to standby
2017-06-02 19:07:27.290979 7f405e809700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6822/11698 pipe(0x199c3000 sd=152 :0 s=1 pgs=0 cs=0 l=1 c=0x1c612100).fault
2017-06-02 19:07:27.291636 7f405f213700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6823/11698 pipe(0x27a50000 sd=153 :0 s=1 pgs=0 cs=0 l=1 c=0x1c612ec0).fault
2017-06-02 19:07:27.301191 7f405b9db700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6826/12444 pipe(0x13bec000 sd=152 :0 s=1 pgs=0 cs=0 l=1 c=0x1c6123c0).fault
2017-06-02 19:07:27.301287 7f4060e28700 0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6825/12444 pipe(0x266a6000 sd=65 :43868 s=2 pgs=63 cs=1 l=0 c=0x3be1860).fault with nothing to send, going to standby
2017-06-02 19:07:27.301317 7f405b5d7700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6827/12444 pipe(0x27c94000 sd=153 :0 s=1 pgs=0 cs=0 l=1 c=0x1bce9860).fault
2017-06-02 19:07:27.329514 7f406122c700 0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6817/11123 pipe(0x5166000 sd=29 :33652 s=2 pgs=72 cs=1 l=0 c=0x3be0ec0).fault, initiating reconnect
2017-06-02 19:07:27.331358 7f407e0f5700 0 -- 10.27.251.8:6801/3278 >> 10.27.251.7:6817/11123 pipe(0x5166000 sd=29 :33652 s=1 pgs=72 cs=2 l=0 c=0x3be0ec0).fault
2017-06-02 19:07:27.334450 7f405bee0700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6818/11123 pipe(0x27deb000 sd=154 :0 s=1 pgs=0 cs=0 l=1 c=0x1bce6f20).fault
2017-06-02 19:07:27.334506 7f405bddf700 0 -- 10.27.251.8:0/3278 >> 10.27.251.7:6819/11123 pipe(0x27de1000 sd=155 :0 s=1 pgs=0 cs=0 l=1 c=0x1bce8c00).fault
2017-06-04 16:03:34.410880 7feecc646880 0 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af), process ceph-osd, pid 17130
2017-06-04 16:03:34.503807 7feecc646880 0 filestore(/var/lib/ceph/osd/ceph-2) backend xfs (magic 0x58465342)
2017-06-04 16:03:34.791701 7feecc646880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: FIEMAP ioctl is supported and appears to work
2017-06-04 16:03:34.791712 7feecc646880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-06-04 16:03:34.807974 7feecc646880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-06-04 16:03:34.808049 7feecc646880 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_feature: extsize is disabled by conf
2017-06-04 16:03:36.686401 7feecc646880 0 filestore(/var/lib/ceph/osd/ceph-2) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-06-04 16:03:42.786238 7feecc646880 1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 20: 49964625920 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-06-04 16:03:42.901178 7feecc646880 1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 20: 49964625920 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-06-04 16:03:42.942584 7feecc646880 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello
2017-06-04 16:03:43.034138 7feecc646880 0 osd.2 4002 crush map has features 1107558400, adjusting msgr requires for clients
2017-06-04 16:03:43.034150 7feecc646880 0 osd.2 4002 crush map has features 1107558400 was 8705, adjusting msgr requires for mons
2017-06-04 16:03:43.034162 7feecc646880 0 osd.2 4002 crush map has features 1107558400, adjusting msgr requires for osds
2017-06-04 16:03:43.034173 7feecc646880 0 osd.2 4002 load_pgs
2017-06-04 16:04:15.169521 7feecc646880 0 osd.2 4002 load_pgs opened 253 pgs
2017-06-04 16:04:15.178051 7feecc646880 -1 osd.2 4002 log_to_monitors {default=true}
2017-06-04 16:04:15.201447 7feeb6454700 0 osd.2 4002 ignoring osdmap until we have initialized
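For completeness, the obvious sanity check after the manual start (plain Ceph
CLI, nothing PVE-specific):

  ceph osd tree   # all OSDs should be listed as 'up' again
  ceph -s         # overall cluster health / recovery progress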
I hope this is useful. Thanks.
--
dott. Marco Gaiarin GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797
Donate your 5 PER MILLE to LA NOSTRA FAMIGLIA!
http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
(tax code 00307430132, category ONLUS or RICERCA SANITARIA)