Post by Durval Menezes
Hi Gordan,
On Wed, Aug 14, 2013 at 4:07 PM, Durval Menezes wrote:
Hi Gordan,
On Wed, Aug 14, 2013 at 11:44 AM, Gordan Bobic wrote:
On Wed, Aug 14, 2013 at 3:32 PM, Durval Menezes wrote:
On Wed, Aug 14, 2013 at 11:02 AM, Gordan Bobic wrote:
On Wed, Aug 14, 2013 at 2:53 PM, Durval Menezes wrote:
On Wed, Aug 14, 2013 at 10:35 AM, Gordan Bobic wrote:
You're welcome. It seems to support fs <=
v5 and pool <= v26, so it should be able to
handle zfs send|receive to/from the other
implementations.
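(For the archives: the kind of cross-implementation transfer being talked
about is just a pipe between the two zfs binaries, e.g. something like the
following -- pool, snapshot and host names are made up:)

    # send a snapshot from a pool managed by one implementation into a
    # pool managed by the other, here over ssh to a second machine
    zfs send tank/data@snap1 | ssh otherhost zfs receive backup/data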
I can confirm I was able to import a v26 pool
and mount its v5 FS, both created (and the FS
populated) using ZoL 0.6.1; right now I'm
running a "tar c | tar d" to verify whether any
data/metadata differences show up on this pool
(which is mounted using zfs-fuse) when compared
against the snapshot in the other pool I populated
it from (which is mounted at the same time, on
the same machine, using ZoL 0.6.1).
How cool is that, huh? :-)
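For anyone wanting to run the same kind of check, it boils down to this
(the directory paths below are just illustrative, not my real mountpoints):

    # stream the zfs-fuse-mounted copy through tar and diff ("d") it,
    # byte by byte, against the snapshot mounted via ZoL
    (cd /poolfuse/data && tar cf - .) | \
        (cd /poolzol/data/.zfs/snapshot/orig && tar df -)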
Are you saying you have one mounted using ZoL and
the other using Z-F?
Yep :-)
I have to say I hadn't really considered that option...
I just installed the Z-F binaries (zpool, etc.) in a
different, out-of-PATH directory; it's just a matter of
running the binaries from there when one wants to do
anything with the Z-F pools/FSs, and running the ZoL
binaries (from PATH) when one wants to handle the ZoL
pools/FSs.
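In case it helps anyone trying the same setup, day-to-day usage is roughly
this (the /opt/zfs-fuse prefix and the pool names are just examples, not
necessarily what I actually have here):

    # operate on a zfs-fuse pool with the out-of-PATH binaries
    /opt/zfs-fuse/bin/zpool import poolfuse
    /opt/zfs-fuse/bin/zfs mount poolfuse/somefs

    # operate on a ZoL pool with the normal binaries found via PATH
    zpool import poolzol
    zfs mount poolzol/somefs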
I'd be concerned about cross-importing pools by accident.
That is likely to lead to corruption as they are being
accessed simultaneously. Be careful.
Humrmrmr.... doesn't importing a pool mark it on disk as busy,
in such a way as to require "-f" for a subsequent import? That
is, I should be safe as long as I take enough care when
specifying -f on import, right?
IIRC there was a thread on the ZoL list a few days ago started by
somebody who wanted such a feature, but I don't think it exists
(supposedly ext4 has it).
IIRC, that thread reflects a different situation: the OP wanted to be
able to *automatically* distinguish, at "zpool import -f" time, whether
the other host that had imported the pool was still alive or whether it
had just crashed, so as to be able to import -f it safely on another host
being brought up to replace the first. There are similarities with
accidentally using both Z-F and ZoL on the same host to access the same
pool, but the similarity ends at the "automatic" requirement: my imports
will be done manually and (in case -f is needed) very, very, *very*
carefully :-)
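Just to illustrate the kind of manual care I mean, the flow is roughly
this (pool name made up, and assuming the import behaves the way I
described above):

    # a plain import attempt; if the pool looks like it's still imported
    # elsewhere (or by the other implementation), it should refuse and
    # tell me to use -f
    zpool import mypool
    # only after double-checking that nothing else has the pool imported:
    zpool import -f mypool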
Seriously now, I don't really expect to find any
differences nor hit any issues, but I will post
a reply to this message when it's finished (and
cross-post to the ZoL list so our friends there
know there's a reliable v26/v5 zfs-fuse available).
Great. Of course the real test will be when
somebody's pool breaks due to an implementation bug
- something that may not happen for months/years
(and hopefully never will, but it's nice to have
some level of insurance like this).
Nahh.... :-) I think I found a really easy way to break
a ZoL pool (and in fact anything that depends on flush
for consistency): just run it in a VirtualBox VM with
host disk caching turned on, then power off the VM
without exporting the pool... I know it shouldn't be
so, but that was how I managed to screw up a
perfectly good pool (in fact, the one I'm now diff'ing)
while testing imports with multiple OSs: the very first
time I shut down the VM without exporting it (while
running OpenIndiana, FWIW), the pool wouldn't import
anymore until Nils made me aware of the -X complement to
the -F switch...
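For the archives, what Nils pointed me at was along these lines (pool name
made up; -X makes the -F rewind far more aggressive, so it really is a
last resort):

    # first try a normal recovery import, rewinding to an earlier txg
    zpool import -F mypool
    # if that still fails, the extreme-rewind variant
    zpool import -F -X mypool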
That isn't a ZoL-specific issue, or even a ZFS-specific issue
- it's just that other file systems won't notice as much
wrong, even if some of your data is trashed.
I fully agree. If I were testing other filesystems, I would have
them checked via the "tar c|tar d" method no matter what...
Tripwire is technically the tool specifically designed for that job...
Humrmrmr.... technically tripwire is for intrusion detection, no? I agree
it would be more convenient for catching data/metadata corruption in a
production FS, but in my test case (as I still have the exact original
data in a snapshot on the other ZFS pool from which I populated the pool
in question via a "zfs send|receive"), I think a "tar c|tar d" is not only
much simpler, but also better guaranteed to catch any differences (after
all, tripwire checksum collisions do happen, and using more checksum
algorithms only reduces -- not eliminates -- their probability). I agree
it's a very low probability, but with "tar c | tar d" we have *certainty*,
as it compares byte-by-byte (or at least as much certainty as we are
allowed to have in a Universe ruled by Quantum Physics and Murphy's Law
:-)).
Post by Durval Menezes
In fact, when the diff is over I think I'm going to try
and crash that pool again just to see whether this
latest zfs-fuse can import it.
On a totally unrelated note, I'm now re-pondering
switching even my rootfs-es to ZFS - yesterday I hit
what seem to be multiple disk failures manifesting as
massive silent data corruption, and preliminary
investigation suggests that MD RAID1+ext4 has
disintegrated pretty thoroughly. Most inconvenient.
Ouch.
FWIW, if we were betting, I would put all my cash on
ext4 being the culprit: in my experience the whole extX
family sucks in terms of robustness/reliability. I've
not used extX on machines under my direct supervision
at least since 2001, when ReiserFS became solid enough
for production use, but at a client that insists on
running only plain-bland-boring EL6, and which is
subjected to frequent main power/UPS SNAFUs, they used
to lose at least 2 or 3 ext4 hosts every time they lost
power, and that in a universe of 20-30 machines.... by
contrast, an especially critical machine of theirs, which
I personally insisted on configuring with ReiserFS running
on top of Linux md, just kept running for years through
that selfsame power hell, never missing a beat. This (and
many other experiences) led me to have a lot of faith
in Linux md and ReiserFS, and a lot of doubt about extX.
I'm more inclined to suspect the fundamental limitation of
MD RAID1 here. If the disk is silently feeding back duff
data, all bets are off,
This sure can happen, but in my experience it's very rare... of
the many dozens of disks that have been under my direct care
over the last few years, only once did I find one silently
producing duff data... and I'm very careful; even on the ones
where I wasn't running ZFS, I think I would have detected any
issues (see below).
Yeah, it's rare (for some definition of rare) but it's now happened
to me twice in the past 4 years. Both times with no early warning signs.
And, with the current trend of "consolidation" (read: duopoly) in the
hard disk business, I think this only tends to get worse...
Triopoly. Don't forget Toshiba.
Post by Durval Menezes
and unlike ZFS, traditional checksumless RAID has no hope of
guessing which disk (if not both) is feeding back duff data
for each block.
This is indeed the case. But it can detect the duff (unless of
course all disks in the array are returning exactly the same
duff, which I find extremely improbable): it's just a matter of
issuing a "check" command to the md device, which I do every day
on every md array.
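Concretely, the daily check boils down to this (md0 is just an example;
many distros also ship a raid-check cron script that does the same thing):

    # tell md to read and compare all copies/parity on the array
    echo check > /sys/block/md0/md/sync_action
    # once it finishes, a non-zero count here means the copies disagree
    cat /sys/block/md0/md/mismatch_cnt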
Sure, that'll tell you that some of the data is duff, but it is
still non-repairable.
Not *automatically* repairable, I agree. OTOH, one can always manually
restore from the last backup as soon as one knows it's corrupted... for
some cases, this can be a reasonable tradeoff re: performance...
Depends on the type of file, I guess. If it's an OS or a user file that
doesn't change much, great. If it's an InnoDB database file, you are well
and truly out of luck.